History log of /src/sys/kern/sys_descrip.c |
Revision | | Date | Author | Comments |
1.52 |
| 16-Jul-2025 |
kre | Kernel part of O_CLOFORK implementation (plus kernel revbump)
This is Ricardo Branco's implementation of O_CLOFORK (and associated fcntl, etc) for NetBSD (with a few minor changes by me).
For now, the header file symbols that should be exposed to userland are hidden inside temporary #ifdef _KERNEL blocks, just to prevent random userland apps, or config scripts, from seeing any of this before it is better tested.
Userland parts of this will follow soon.
This also bumps the kernel version to 10.99.15 (changes to data structs, and the signature of fd_dup()).
|
1.51 |
| 20-May-2024 |
martin | branches: 1.51.2; Fix a few oversights from the renaming of dup3110 to dup3100
|
1.50 |
| 19-May-2024 |
christos | version dup3
|
1.49 |
| 19-May-2024 |
christos | PR/58266: Collin Funk: Fail if from == to, like FreeBSD and Linux. The test is done in dup3 before any other tests, so even if a bad descriptor is passed we return EINVAL rather than EBADFD, as Linux does.
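The check ordering described in 1.49 can be sketched as below. This is a hypothetical userland model, not the kernel code: `model_dup3`, `fd_is_open`, and the `model_open` table are names invented for illustration, and the sketch uses the standard EBADF for a bad descriptor.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_NFILES 8	/* hypothetical table size for this sketch */

/* Hypothetical stand-in for a per-process descriptor table. */
static bool model_open[MODEL_NFILES] = { true, true, true };

static bool
fd_is_open(int fd)
{
	return fd >= 0 && fd < MODEL_NFILES && model_open[fd];
}

/*
 * Returns 0 on success or an errno value.  The from == to test runs
 * before any descriptor validation, so equal arguments yield EINVAL
 * even when the descriptor itself is invalid.
 */
int
model_dup3(int from, int to)
{
	if (from == to)
		return EINVAL;
	if (!fd_is_open(from))
		return EBADF;
	if (to < 0 || to >= MODEL_NFILES)
		return EBADF;
	model_open[to] = true;
	return 0;
}
```

With this ordering, `model_dup3(-1, -1)` reports EINVAL, while `model_dup3(5, 1)` on a closed descriptor 5 reports EBADF.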
|
1.48 |
| 10-Jul-2023 |
christos | Add memfd_create(2) from GSoC 2023 by Theodore Preduta
|
1.47 |
| 14-May-2023 |
riastradh | kern/sys_descrip.c: Nix trailing whitespace.
|
1.46 |
| 22-Apr-2023 |
riastradh | fcntl(2), flock(2): Assert FHASLOCK is clear if no fo_advlock.
|
1.45 |
| 22-Apr-2023 |
riastradh | fcntl(2), flock(2): Unify error branches.
Let's make this a bit less error-prone by having everything converge in the same place instead of multiple returns in different contexts.
|
1.44 |
| 22-Apr-2023 |
riastradh | fcntl(2), flock(2): Fix missing fd_putfile in error branch.
Oops!
|
1.43 |
| 22-Apr-2023 |
riastradh | file(9): New fo_posix_fadvise operation.
XXX kernel revbump -- changes struct fileops API and ABI
|
1.42 |
| 22-Apr-2023 |
riastradh | file(9): New fo_fpathconf operation.
XXX kernel revbump -- struct fileops API and ABI change
|
1.41 |
| 22-Apr-2023 |
riastradh | file(9): New fo_advlock operation.
This moves the vnode-specific logic from sys_descrip.c into vfs_vnode.c, like we did for fo_seek.
XXX kernel revbump -- struct fileops API and ABI change
|
1.40 |
| 16-Apr-2022 |
hannken | Lock vnode for VOP_PATHCONF().
|
1.39 |
| 15-Mar-2022 |
riastradh | posix_fadvise(2): Detect arithmetic overflow without UB.
Reported-by: syzbot+18f01abff11bd527c464@syzkaller.appspotmail.com
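One common way to detect this kind of overflow without undefined behaviour is to compare against the remaining headroom before adding. The sketch below assumes a 64-bit signed offset type; `range_end` is a hypothetical helper, and the actual change in 1.39 may be structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Store offset + len in *endp without ever evaluating a signed
 * addition that could overflow.  Returns false for negative inputs
 * or when the sum would not fit in int64_t.
 */
bool
range_end(int64_t offset, int64_t len, int64_t *endp)
{
	if (offset < 0 || len < 0)
		return false;
	if (len > INT64_MAX - offset)	/* offset + len would overflow */
		return false;
	*endp = offset + len;
	return true;
}
```

The subtraction `INT64_MAX - offset` cannot overflow because `offset` is known to be non-negative at that point.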
|
1.38 |
| 11-Sep-2021 |
riastradh | sys/kern: Avoid fp->f_offset without the object (here, vnode) lock.
|
1.37 |
| 23-Feb-2020 |
ad | UVM locking changes, proposed on tech-kern:
- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
|
1.36 |
| 01-Feb-2020 |
riastradh | Load struct filedesc::fd_dt with atomic_load_consume.
Exceptions: when fd_refcnt <= 1, or when holding fd_lock.
While here:
- Restore KASSERT(mutex_owned(&fdp->fd_lock)) in fd_unused.
  => This is used only in fd_close and fd_abort, where it holds.
- Move bounds check assertion in fd_putfile to where it matters.
- Store fd_dt with atomic_store_release.
- Move load of fd_dt under lock in knote_fdclose.
- Omit membar_consumer in fdesc_readdir.
  => atomic_load_consume serves the same purpose now.
  => Was needed only on alpha anyway.
|
1.35 |
| 15-Sep-2019 |
christos | branches: 1.35.2; Add F_GETPATH, presented to tech-kern.
|
1.34 |
| 26-Aug-2019 |
maxv | Reject negative offsets, to prevent panics later in genfs_getpages().
|
1.33 |
| 21-May-2019 |
christos | branches: 1.33.2; provide more info about who is getting ERESTART.
|
1.32 |
| 03-Feb-2019 |
mrg | - add or adjust /* FALLTHROUGH */ where appropriate
- add __unreachable() after functions that can return but won't in this case, and thus can't be marked __dead easily
|
1.31 |
| 26-Dec-2017 |
kamil | branches: 1.31.4; Refactor pipe1() and correct a bug in sys_pipe2() (SYS_pipe2)
sys_pipe2() returns two integers (values), the 2nd one is a copy of the 2nd file descriptor that lands in fildes[2]. This is a side effect of reusing the code for sys_pipe() (SYS_pipe) and not cleaning it up.
The first returned value is (on success) 0.
Introduced a small refactoring in pipe1() so that it no longer operates on retval[], but on an int[2] array. A user sets retval[] for pipe() when desired and needed.
This refactoring touches compat code: netbsd32, linux, linux32.
Before the changes on NetBSD/amd64:
$ ktruss -i ./a.out
[...]
 15131      1 a.out    pipe2(0x7f7fff2e62b8, 0) = 0, 4
[...]
After the changes:
$ ktruss -i ./a.out
[...]
   782      1 a.out    pipe2(0x7f7fff97e850, 0) = 0
[...]
There should not be a visible change for current users.
Sponsored by <The NetBSD Foundation>
|
1.30 |
| 05-Sep-2014 |
matt | Try not to use f_data, use f_{vnode,socket,pipe,mqueue,kqueue,ksem} to get a correctly typed pointer.
|
1.29 |
| 05-Sep-2014 |
matt | Don't nest structure and enum definitions. Don't use C++ keywords new, try, class, private, etc.
|
1.28 |
| 08-Apr-2013 |
skrll | Remove some set but unused variables
|
1.27 |
| 05-Aug-2012 |
riastradh | branches: 1.27.2; Force sys_close not to restart by returning ERESTART.
Print a diagnostic message if we ever get ERESTART out of fd_close and convert it to EINTR instead.
Even if fd_close fails, it has already closed the file descriptor, so restarting the system call is a mistake, with dangerous consequences for multithreaded programs.
Should probably turn the message into a kassert eventually, and maybe add one deeper in fd_close in order to more easily debug it before all the data structures are destroyed.
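The conversion described in 1.27 amounts to a small error-mapping step after the close has been performed. The sketch below is illustrative only: `model_close_error` is a hypothetical helper, and `MODEL_ERESTART` is a stand-in constant assumed for this example rather than the kernel's own definition.

```c
#include <errno.h>

/* Stand-in for the kernel-internal restart code; value assumed here. */
#define MODEL_ERESTART (-3)

/*
 * Map the result of a close operation to what the caller should see.
 * The descriptor is already gone by the time an error comes back, so
 * a restart request is turned into EINTR instead of being honoured.
 */
int
model_close_error(int error)
{
	if (error == MODEL_ERESTART)
		return EINTR;
	return error;
}
```

Restarting would re-run close on a descriptor number that another thread may have reused in the meantime, which is the danger for multithreaded programs that the commit message points out.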
|
1.26 |
| 11-Feb-2012 |
martin | Add a posix_spawn syscall, as discussed on tech-kern. Based on the summer of code project by Charles Zhang, heavily reworked later by me - all bugs are likely mine. Ok: core, releng.
|
1.25 |
| 25-Jan-2012 |
christos | Add locking, requested by yamt. Note that locking is not used everywhere for these.
|
1.24 |
| 25-Jan-2012 |
christos | As discussed in tech-kern, provide the means to prevent delivery of SIGPIPE on EPIPE for all file descriptor types:
- provide O_NOSIGPIPE for open,kqueue1,pipe2,dup3,fcntl(F_{G,S}ETFL) [NetBSD]
- provide SOCK_NOSIGPIPE for socket,socketpair [NetBSD]
- provide SO_NOSIGPIPE for {g,s}etsockopt [NetBSD/FreeBSD/MacOSX]
- provide F_{G,S}ETNOSIGPIPE for fcntl [MacOSX]
|
1.23 |
| 31-Oct-2011 |
christos | branches: 1.23.2; 1.23.6; PR/45545 Yui NARUSE: pipe2's return value is wrong
|
1.22 |
| 26-Jun-2011 |
christos | * Arrange for interfaces that create new file descriptors to be able to set close-on-exec on creation (http://udrepper.livejournal.com/20407.html).
- Add F_DUPFD_CLOEXEC to fcntl(2).
- Add MSG_CMSG_CLOEXEC to recvmsg(2) for unix file descriptor passing.
- Add dup3(2) syscall with a flags argument for O_CLOEXEC, O_NONBLOCK.
- Add pipe2(2) syscall with a flags argument for O_CLOEXEC, O_NONBLOCK.
- Add flags SOCK_CLOEXEC, SOCK_NONBLOCK to the socket type parameter for socket(2) and socketpair(2).
- Add new paccept(2) syscall that takes an additional sigset_t to alter the sigmask temporarily and a flags argument to set SOCK_CLOEXEC, SOCK_NONBLOCK.
- Add new mode character 'e' to fopen(3) and popen(3) to open pipes and file descriptors for close on exec.
- Add new kqueue1(2) syscall with a new flags argument to open the kqueue file descriptor with O_CLOEXEC, O_NONBLOCK.
* Fix the system calls that take socklen_t arguments to actually do so.
* Don't include userland header files (signal.h) from system header files (rump_syscallargs.h).
* Bump libc version for the new syscalls.
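From userland, the point of these interfaces is that the close-on-exec flag is set at the moment the descriptor is created, with no window between creation and a later fcntl(F_SETFD). A minimal sketch using F_DUPFD_CLOEXEC follows; `dup_cloexec` and `has_cloexec` are hypothetical helper names, and the feature-test macro is an assumption about what the build environment needs.

```c
#define _POSIX_C_SOURCE 200809L	/* assumed sufficient to expose F_DUPFD_CLOEXEC */
#include <fcntl.h>
#include <unistd.h>

/*
 * Duplicate fd onto the lowest free descriptor, with close-on-exec
 * set as part of the same operation.  Returns the new descriptor,
 * or -1 with errno set.
 */
int
dup_cloexec(int fd)
{
	return fcntl(fd, F_DUPFD_CLOEXEC, 0);
}

/* Returns 1 if fd has FD_CLOEXEC set, 0 if not, -1 on error. */
int
has_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags == -1)
		return -1;
	return (flags & FD_CLOEXEC) != 0;
}
```

A descriptor obtained from plain pipe(2) reports 0 from `has_cloexec`, while its `dup_cloexec` copy reports 1.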
|
1.21 |
| 12-Jun-2011 |
rmind | Welcome to 5.99.53! Merge rmind-uvmplock branch:
- Reorganize locking in UVM and provide extra serialisation for pmap(9). New lock order: [vmpage-owner-lock] -> pmap-lock.
- Simplify locking in some pmap(9) modules by removing P->V locking.
- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).
- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner. Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.
- Unify /dev/mem et al in MI code and provide required locking (removes kernel-lock on some ports). Also, avoid cache-aliasing issues.
Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches formed the core changes of this branch.
|
1.20 |
| 10-Apr-2011 |
christos | branches: 1.20.2; - Add O_CLOEXEC to open(2)
- Add fd_set_exclose() to encapsulate uses of FIO{,N}CLEX, O_CLOEXEC, F{G,S}ETFD
- Add a pipe1() function to allow passing flags to the fd's that pipe(2) opens to ease implementation of linux pipe2(2)
- Factor out fp handling code from open(2) and fhopen(2)
|
1.19 |
| 18-Dec-2010 |
rmind | branches: 1.19.2; do_posix_fadvise: fix and improve previous change - add a comment with some rationale and handle few range overflows.
Per report/discussion with yamt@.
|
1.18 |
| 27-Oct-2010 |
rmind | do_posix_fadvise: check for a negative length; truncate the offset and round the end-offset, not vice-versa.
Thanks to jakllsch@ for debug info.
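The range adjustment described in 1.18 (truncate the start, round up the end, reject a negative length) can be sketched as follows. This is an illustration under stated assumptions, not the kernel routine: `model_fadvise_range` is a hypothetical name, the 4096-byte page size is assumed, and any special meaning of a zero length is not modelled.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096	/* assumed page size for this sketch */
#define MODEL_PAGE_MASK ((uint64_t)MODEL_PAGE_SIZE - 1)

/*
 * Expand [offset, offset + len) outward to page boundaries.
 * Returns 0 and fills *startp / *endp, or EINVAL for a bad range.
 */
int
model_fadvise_range(int64_t offset, int64_t len, int64_t *startp,
    int64_t *endp)
{
	if (offset < 0 || len < 0)
		return EINVAL;
	/* Refuse ranges whose rounded-up end would not fit in int64_t. */
	if (len > INT64_MAX - (int64_t)MODEL_PAGE_MASK - offset)
		return EINVAL;
	/* Truncate the start down to a page boundary. */
	*startp = (int64_t)((uint64_t)offset & ~MODEL_PAGE_MASK);
	/* Round the end up to a page boundary. */
	*endp = (int64_t)(((uint64_t)(offset + len) + MODEL_PAGE_MASK) &
	    ~MODEL_PAGE_MASK);
	return 0;
}
```

Doing it the other way round (rounding the start up and truncating the end down) would shrink the range and could leave part of the requested bytes uncovered.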
|
1.17 |
| 28-Oct-2009 |
njoly | branches: 1.17.2; 1.17.4; Make flock(2) more robust to invalid operation, such as (LOCK_EX|LOCK_SH).
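A validation step of the kind 1.17 describes might look like the sketch below, which accepts exactly one of LOCK_SH, LOCK_EX, or LOCK_UN, optionally combined with LOCK_NB. `model_flock_check` is a hypothetical name, and the kernel's actual check may be written differently.

```c
#include <errno.h>
#include <sys/file.h>

/*
 * Return 0 if 'how' names exactly one lock operation (optionally
 * with LOCK_NB), or EINVAL for anything else, including the
 * contradictory LOCK_EX|LOCK_SH.
 */
int
model_flock_check(int how)
{
	switch (how & ~LOCK_NB) {
	case LOCK_SH:
	case LOCK_EX:
	case LOCK_UN:
		return 0;
	default:
		return EINVAL;
	}
}
```

Matching the masked value against the three legal operations rejects any combination of them, as well as an empty request.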
|
1.16 |
| 10-Jun-2009 |
yamt | do_posix_fadvise:
- deactivate pages on POSIX_FADV_DONTNEED.
- more sanity checks. fix a panic in genfs_getpages introduced by the previous (rev.1.15).
|
1.15 |
| 10-Jun-2009 |
yamt | do_posix_fadvise: on POSIX_FADV_WILLNEED, start prefetching of object's pages.
|
1.14 |
| 31-May-2009 |
yamt | do_posix_fadvise: turn some KASSERTs into CTASSERTs.
|
1.13 |
| 24-May-2009 |
ad | More changes to improve kern_descrip.c.
- Avoid atomics in more places.
- Remove the per-descriptor mutex, and just use filedesc_t::fd_lock. It was only being used to synchronize close, and in any case we needed to take fd_lock to free the descriptor slot.
- Optimize certain paths for the <NDFDFILE case.
- Sprinkle more comments and assertions.
- Cache more stuff in filedesc_t.
- Fix numerous minor bugs spotted along the way.
- Restructure how the open files array is maintained, for clarity and so that we can eliminate the membar_consumer() call in fd_getfile(). This is mostly syntactic sugar; the main functional change is that fd_nfiles now lives alongside the open file array.
Some measurements with libmicro:
- simple file syscalls like close() are between 1 and 10% faster.
- some nice improvements, e.g. poll(1000) which is ~50% faster.
|
1.12 |
| 28-Mar-2009 |
rmind | sys_fcntl: use FD_CLOEXEC, instead of magic number '1'.
|
1.11 |
| 04-Mar-2009 |
skrll | Fix the posix_fadvise return value... finally.
Tested by martin on sparc64/m68k and me on hppa.
|
1.10 |
| 22-Jan-2009 |
yamt | branches: 1.10.2; malloc -> kmem_alloc
|
1.9 |
| 11-Jan-2009 |
christos | merge christos-time_t
|
1.8 |
| 21-Dec-2008 |
ad | Prevent a potential deadlock from a multithreaded process doing:
t1	dup2(0, 1)
t2	dup2(1, 0)
|
1.7 |
| 15-Sep-2008 |
rmind | branches: 1.7.2; 1.7.4; Replace intptr_t with uintptr_t in few more places. OK by <matt>.
|
1.6 |
| 31-Aug-2008 |
njoly | Make dup(2) return the correct error value, not 0.
|
1.5 |
| 02-Jul-2008 |
matt | branches: 1.5.2; Change {ff,fd}_exclose and ff_allocated to bool. Change exclose arg to fd_dup to bool. Switch assignments from 1/0 to true/false.
This makes alpha kernels compile. Bump kern to 4.99.69 since structure changed.
|
1.4 |
| 23-Jun-2008 |
ad | sys_fcntl: use l_fd, not p_fd.
|
1.3 |
| 28-Apr-2008 |
martin | branches: 1.3.2; 1.3.4; Remove clause 3 and 4 from TNF licenses
|
1.2 |
| 24-Apr-2008 |
ad | branches: 1.2.2; Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since we no longer need to guard against access from hardware interrupt handlers.
Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the child process share the parent's lock so that signal state may be kept in sync. Partially addresses PR kern/37437.
|
1.1 |
| 21-Mar-2008 |
ad | branches: 1.1.2; 1.1.4; 1.1.6; 1.1.8; File descriptor changes, discussed on tech-kern:
- Redo reference counting to be sane. LWPs accessing files take a short term reference on the local file descriptor. This is the most common case. While a file is in a process descriptor table, a reference is held to the file. The file reference count only changes during control operations like open() or close(). Code that comes at files from an unusual direction (i.e. foreign to the process) like procfs or sysctl takes a reference on the file (f_count), and not on a descriptor.
- Remove knowledge of reference counting and locking from most code that deals with files.
- Make the usual case of file descriptor lookup lockless.
- Make kqueue MP and MT safe. PR kern/38098, PR kern/38137.
- Fix numerous file handling bugs, and bugs in the descriptor code that affected multithreaded processes.
- Split descriptor system calls out into sys_descrip.c.
- A few stylistic changes: KNF, remove unused casts now that caddr_t is gone. Replace dumb gotos with loop control in a few places.
- Don't do redundant pointer passing (struct proc, lwp, filedesc *) unless the routine is likely to be inlined. Most of the time it's about the current process.
|
1.1.8.1 |
| 18-May-2008 |
yamt | sync with head.
|
1.1.6.7 |
| 17-Jan-2009 |
mjf | Sync with HEAD.
|
1.1.6.6 |
| 28-Sep-2008 |
mjf | Sync with HEAD.
|
1.1.6.5 |
| 02-Jul-2008 |
mjf | Sync with HEAD.
|
1.1.6.4 |
| 29-Jun-2008 |
mjf | Sync with HEAD.
|
1.1.6.3 |
| 02-Jun-2008 |
mjf | Sync with HEAD.
|
1.1.6.2 |
| 03-Apr-2008 |
mjf | Sync with HEAD.
|
1.1.6.1 |
| 21-Mar-2008 |
mjf | file sys_descrip.c was added on branch mjf-devfs2 on 2008-04-03 12:43:04 +0000
|
1.1.4.3 |
| 27-Dec-2008 |
christos | merge with head.
|
1.1.4.2 |
| 01-Nov-2008 |
christos | Sync with head.
|
1.1.4.1 |
| 29-Mar-2008 |
christos | Welcome to the time_t=long long dev_t=uint64_t branch.
|
1.1.2.2 |
| 24-Mar-2008 |
yamt | sync with head.
|
1.1.2.1 |
| 21-Mar-2008 |
yamt | file sys_descrip.c was added on branch yamt-lazymbuf on 2008-03-24 09:39:02 +0000
|
1.2.2.4 |
| 11-Mar-2010 |
yamt | sync with head
|
1.2.2.3 |
| 20-Jun-2009 |
yamt | sync with head
|
1.2.2.2 |
| 04-May-2009 |
yamt | sync with head.
|
1.2.2.1 |
| 16-May-2008 |
yamt | sync with head.
|
1.3.4.2 |
| 03-Jul-2008 |
simonb | Sync with head.
|
1.3.4.1 |
| 27-Jun-2008 |
simonb | Sync with head.
|
1.3.2.2 |
| 24-Sep-2008 |
wrstuden | Merge in changes between wrstuden-revivesa-base-2 and wrstuden-revivesa-base-3.
|
1.3.2.1 |
| 18-Sep-2008 |
wrstuden | Sync with wrstuden-revivesa-base-2.
|
1.5.2.1 |
| 19-Oct-2008 |
haad | Sync with HEAD.
|
1.7.4.2 |
| 06-Nov-2012 |
riz | Pull up following revision(s) (requested by he in ticket #1815):
sys/kern/sys_descrip.c: revision 1.11
Fix the posix_fadvise return value... finally.
Tested by martin on sparc64/m68k and me on hppa.
|
1.7.4.1 |
| 02-Feb-2009 |
snj | Pull up following revision(s) (requested by ad in ticket #341):
sys/kern/sys_descrip.c: revision 1.8
Prevent a potential deadlock from a multithreaded process doing:
t1	dup2(0, 1)
t2	dup2(1, 0)
|
1.7.2.3 |
| 28-Apr-2009 |
skrll | Sync with HEAD.
|
1.7.2.2 |
| 03-Mar-2009 |
skrll | Sync with HEAD.
|
1.7.2.1 |
| 19-Jan-2009 |
skrll | Sync with HEAD.
|
1.10.2.2 |
| 23-Jul-2009 |
jym | Sync with HEAD.
|
1.10.2.1 |
| 13-May-2009 |
jym | Sync with HEAD.
Commit is split, to avoid a "too many arguments" protocol error.
|
1.17.4.3 |
| 21-Apr-2011 |
rmind | sync with head
|
1.17.4.2 |
| 05-Mar-2011 |
rmind | sync with head
|
1.17.4.1 |
| 16-Mar-2010 |
rmind | Change struct uvm_object::vmobjlock to be dynamically allocated with mutex_obj_alloc(). It allows us to share the locks among UVM objects.
|
1.17.2.1 |
| 06-Nov-2010 |
uebayasi | Sync with HEAD.
|
1.19.2.1 |
| 06-Jun-2011 |
jruoho | Sync with HEAD.
|
1.20.2.1 |
| 23-Jun-2011 |
cherry | Catchup with rmind-uvmplock merge.
|
1.23.6.1 |
| 18-Feb-2012 |
mrg | merge to -current.
|
1.23.2.3 |
| 22-May-2014 |
yamt | sync with head.
for a reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
1.23.2.2 |
| 30-Oct-2012 |
yamt | sync with head
|
1.23.2.1 |
| 17-Apr-2012 |
yamt | sync with head
|
1.27.2.3 |
| 03-Dec-2017 |
jdolecek | update from HEAD
|
1.27.2.2 |
| 23-Jun-2013 |
tls | resync from head
|
1.27.2.1 |
| 12-Sep-2012 |
tls | Initial snapshot of work to eliminate 64K MAXPHYS. Basically works for physio (I/O to raw devices); needs more doing to get it going with the filesystems, but it shouldn't damage data.
All work's been done on amd64 so far. Not hard to add support to other ports. If others want to pitch in, one very helpful thing would be to sort out when and how IDE disks can do 128K or larger transfers, and adjust the various PCI IDE (or at least ahcisata) drivers and wd.c accordingly -- it would make testing much easier. Another very helpful thing would be to implement a smart minphys() for RAIDframe along the lines detailed in the MAXPHYS-NOTES file.
|
1.31.4.3 |
| 13-Apr-2020 |
martin | Mostly merge changes from HEAD up to 20200411
|
1.31.4.2 |
| 08-Apr-2020 |
martin | Merge changes from current as of 20200406
|
1.31.4.1 |
| 10-Jun-2019 |
christos | Sync with HEAD
|
1.33.2.1 |
| 20-Nov-2024 |
martin | Pull up following revision(s) (requested by riastradh in ticket #1921):
sys/kern/kern_event.c: revision 1.106
sys/kern/sys_select.c: revision 1.51
sys/kern/subr_exec_fd.c: revision 1.10
sys/kern/sys_aio.c: revision 1.46
sys/kern/kern_descrip.c: revision 1.244
sys/kern/kern_descrip.c: revision 1.245
sys/ddb/db_xxx.c: revision 1.72
sys/ddb/db_xxx.c: revision 1.73
sys/miscfs/fdesc/fdesc_vnops.c: revision 1.132
sys/kern/uipc_usrreq.c: revision 1.195
sys/kern/sys_descrip.c: revision 1.36
sys/kern/uipc_usrreq.c: revision 1.196
sys/kern/uipc_socket2.c: revision 1.135
sys/kern/uipc_socket2.c: revision 1.136
sys/kern/kern_sig.c: revision 1.383
sys/kern/kern_sig.c: revision 1.384
sys/compat/netbsd32/netbsd32_ioctl.c: revision 1.107
sys/miscfs/procfs/procfs_vnops.c: revision 1.208
sys/kern/subr_exec_fd.c: revision 1.9
sys/kern/kern_descrip.c: revision 1.252
(all via patch)
Load struct filedesc::fd_dt with atomic_load_consume.
Exceptions: when fd_refcnt <= 1, or when holding fd_lock.
While here:
- Restore KASSERT(mutex_owned(&fdp->fd_lock)) in fd_unused.
  => This is used only in fd_close and fd_abort, where it holds.
- Move bounds check assertion in fd_putfile to where it matters.
- Store fd_dt with atomic_store_release.
- Move load of fd_dt under lock in knote_fdclose.
- Omit membar_consumer in fdesc_readdir.
  => atomic_load_consume serves the same purpose now.
  => Was needed only on alpha anyway.
Load struct fdfile::ff_file with atomic_load_consume. Exceptions: when we're only testing whether it's there, not about to dereference it.
Note: We do not use atomic_store_release to set it because the preceding mutex_exit should be enough.
(That said, it's not clear the mutex_enter/exit is needed unless refcnt > 0 already, in which case maybe it would be a win to switch from the membar implied by mutex_enter to the membar implied by atomic_store_release -- which I would generally expect to be much cheaper. And a little clearer without a long comment.)
kern_descrip.c: Fix membars around reference count decrement.
In general, the `last one out hit the lights' style of reference counting (as opposed to the `whoever's destroying must wait for pending users to finish' style) requires memory barriers like so:
	... usage of resources associated with object ...
	membar_release();
	if (atomic_dec_uint_nv(&obj->refcnt) != 0)
		return;
	membar_acquire();
	... freeing of resources associated with object ...
This way, all usage happens-before all freeing.
This fixes several errors:
- fd_close failed to ensure whatever its caller did would happen-before the freeing, in the case where another thread is concurrently trying to close the fd (ff->ff_file == NULL).
  Fix: Add membar_release before atomic_dec_uint(&ff->ff_refcnt) in that branch.
- fd_close failed to ensure all loads its caller had issued will have happened-before the freeing, in the case where the fd is still in use by another thread (fdp->fd_refcnt > 1 and ff->ff_refcnt-- > 0).
  Fix: Change membar_producer to membar_release before atomic_dec_uint(&ff->ff_refcnt).
- fd_close failed to ensure that any usage of fp by other callers would happen-before any freeing it does.
  Fix: Add membar_acquire after atomic_dec_uint_nv(&ff->ff_refcnt).
- fd_free failed to ensure that any usage of fdp by other callers would happen-before any freeing it does.
  Fix: Add membar_acquire after atomic_dec_uint_nv(&fdp->fd_refcnt).
While here, change membar_exit -> membar_release. No semantic change, just updating away from the legacy API.
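For readers more familiar with C11 atomics than with the kernel's membar_ops(3) interface, the "last one out hit the lights" pattern above can be approximated as below. The mapping is an assumption made for illustration (a release-ordered decrement standing in for membar_release plus atomic_dec_uint_nv, an acquire fence standing in for membar_acquire); `struct model_obj` and `model_obj_release` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object for this sketch. */
struct model_obj {
	atomic_uint refcnt;
	int payload;
};

/*
 * Drop one reference.  Returns true only for the caller that dropped
 * the last reference, which may then free the object's resources.
 */
bool
model_obj_release(struct model_obj *obj)
{
	/*
	 * Release ordering: everything this thread did with obj
	 * happens-before the decrement becomes visible.
	 */
	if (atomic_fetch_sub_explicit(&obj->refcnt, 1,
	    memory_order_release) != 1)
		return false;
	/*
	 * Acquire fence: every other thread's prior use of obj
	 * happens-before the freeing that follows.
	 */
	atomic_thread_fence(memory_order_acquire);
	return true;
}
```

Note that `atomic_fetch_sub_explicit` returns the value before the decrement, so the comparison is against 1, whereas the kernel's `atomic_dec_uint_nv` returns the new value and is compared against 0.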
|
1.35.2.1 |
| 29-Feb-2020 |
ad | Sync with head.
|
1.51.2.1 |
| 02-Aug-2025 |
perseant | Sync with HEAD
|