History log of /src/tests/fs/vfs/t_renamerace.c |
Revision | | Date | Author | Comments |
1.44 |
| 31-Jan-2022 |
ryo | Extend the time to wait for the thread to quit.
It seems that alarm(1) is not enough time for the thread to actually exit after quittingtime = 1. It randomly failed with "Test program received signal 14" on a slow environment.
|
1.43 |
| 27-Nov-2021 |
gson | Force failure of the nfs_renamerace_cycle, p2k_ffs_renamerace_cycle, and puffs_renamerace_cycle test cases as they fail only randomly or only on some systems.
|
1.42 |
| 23-Oct-2021 |
hannken | After converting msdosfs_rename() to use genfs_sane_rename() the MSDOS tests should pass.
Tested on QEMU/nvmm archs i386 and amd64.
Should resolve PR kern/43626 (directory renaming more than a little racy)
|
1.41 |
| 16-Jun-2021 |
riastradh | tests/fs/vfs: Mark udf_renamerace_cycle flaky, PR kern/56253.
|
1.40 |
| 05-Sep-2020 |
riastradh | Revert "ufs: Prevent mkdir from choking on deleted directories."
This change made no sense and should not have been committed.
|
1.39 |
| 05-Sep-2020 |
riastradh | ufs: Prevent mkdir from choking on deleted directories.
Fix some missing uvm_vnp_setsize in screw cases while here.
|
1.38 |
| 05-Sep-2020 |
riastradh | genfs_rename: Fix deadlocks in cross-directory cyclic rename.
Reproducer:
A: for (;;) {
       mkdir("c", 0600);
       mkdir("c/d", 0600);
       mkdir("c/d/e", 0600);
       rmdir("c/d/e");
       rmdir("c/d");
   }
B: for (;;) {
       mkdir("c", 0600);
       mkdir("c/d", 0600);
       mkdir("c/d/e", 0600);
       rename("c", "c/d/e");
   }
C: for (;;) {
       mkdir("c", 0600);
       mkdir("c/d", 0600);
       mkdir("c/d/e", 0600);
       rename("c/d/e", "c");
   }
Deadlock:
- A holds c and wants to lock d; and either
  - B holds . and d and wants to lock c, or
  - C holds . and d and wants to lock c.
The problem with these is that genfs_rename_enter_separate in B or C tried lock order .->d->c->e (in A/B, fdvp->tdvp->fvp->tvp; in A/C, tdvp->fdvp->tvp->fvp) which violates the ancestor->descendant order .->c->d->e.
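The ordering violation described above can be checked mechanically. The following is a minimal sketch, not code from genfs_rename: it models the tree as a hypothetical parent-index array and tests whether a lock sequence ever takes a node after one of its descendants. All function and variable names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model (hypothetical): parent[i] is the index of node i's parent,
 * and the root is its own parent.
 */

/* True if a is a proper ancestor of d. */
bool
is_proper_ancestor(const int *parent, int a, int d)
{
	/* Walk up from d until the root; succeed if we pass through a. */
	while (parent[d] != d) {
		d = parent[d];
		if (d == a)
			return true;
	}
	return false;
}

/* True if no node in seq is locked after one of its own descendants. */
bool
order_respects_ancestry(const int *parent, const int *seq, size_t n)
{
	for (size_t i = 0; i < n; i++)
		for (size_t j = i + 1; j < n; j++)
			if (is_proper_ancestor(parent, seq[j], seq[i]))
				return false;
	return true;
}

void
ancestry_demo(void)
{
	/* 0 = ".", 1 = "c", 2 = "d", 3 = "e", i.e. ./c/d/e */
	const int parent[] = { 0, 0, 1, 2 };
	const int old_order[] = { 0, 2, 1, 3 };	/* .->d->c->e */
	const int new_order[] = { 0, 1, 2, 3 };	/* .->c->d->e */

	assert(!order_respects_ancestry(parent, old_order, 4));
	assert(order_respects_ancestry(parent, new_order, 4));
}
```

Calling ancestry_demo() exercises both orders on the ./c/d/e tree from the reproducer: the sequence that locks d before c is rejected, the ancestor-first sequence is accepted.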
The resolution is to change B to do fdvp->fvp->tdvp->tvp and C to do tdvp->tvp->fdvp->fvp. But there's an edge case: tvp and fvp might be the same (hard links), and we can't detect that until after we've looked them both up -- and in some file systems (I'm looking at you, ufs), there is no mere lookup operation, only lookup-and-lock, so we can't even hold the lock on one of tvp or fvp when we look up the other one if there's a chance they might be the same.
Fortunately the cases (a) tvp = fvp (b) tvp or fvp is a directory are mutually exclusive as long as directories cannot be hard-linked. In case (a) we can just defer locking {tvp, fvp} until the end, because it can't possibly have {fdvp or fvp, tdvp or tvp} as descendants. In case (b) we can just lock them in the order fdvp->fvp->tdvp->tvp or tdvp->tvp->fdvp->fvp if the first one of {fvp, tvp} is a directory, because it can't possibly coincide with the second one of {fvp, tvp}.
With this change, we can now prove that the locking order is consistent with the ancestor->descendant partial ordering. Where two nodes are incommensurate under that partial ordering, they are only ever locked by rename and there is only ever one rename at a time.
Proof:
- For same-directory renames, genfs_rename_enter_common locks the directory first and then the children. The order directory->child[i] is consistent with ancestor->descendant and child[0]/child[1] are incommensurate.
- For cross-directory renames:
. While a rename is in progress and the fs-wide rename lock is held, directories can be created or removed but not changed, so the outcome of gro_genealogy -- which, given fdvp and tdvp, returns the node N relating fdvp/N/.../tdvp or null if there is none -- can only transition from finding N to not finding N, if one of the directories is removed while any of the vnodes are unlocked. Merely creating directories cannot change the ancestry of tdvp, and concurrent renames are not possible.
Thus, if a gro_genealogy determined the operation to have the form fdvp/N/.../tdvp, then it might cease to have that form, but only because tdvp was removed which will harmlessly cause the rename to fail later on. Similarly, if gro_genealogy determined the operation _not_ to have the form fdvp/N/.../tdvp then it can't begin to have that form until after the rename has completed.
The lock order is,
=> for fdvp/.../tdvp:
   1. lock fdvp
   2. lookup(/lock/unlock) fvp (consistent with fdvp->fvp)
   3. lock fvp if a directory (consistent with fdvp->fvp)
   4. lock tdvp (consistent with fdvp->tdvp and possibly fvp->tdvp)
   5. lookup(/lock/unlock) tvp (consistent with tdvp->tvp)
   6. lock fvp if a nondirectory (fvp->t* or fvp->fdvp is impossible)
   7. lock tvp if not fvp (tvp->f* is impossible unless tvp=fvp)
=> for incommensurate fdvp & tdvp, or for tdvp/.../fdvp:
   1. lock tdvp
   2. lookup(/lock/unlock) tvp (consistent with tdvp->tvp)
   3. lock tvp if a directory (consistent with tdvp->tvp)
   4. lock fdvp (either incommensurate with tdvp and/or tvp, or consistent with tdvp(->tvp)->fdvp)
   5. lookup(/lock/unlock) fvp (consistent with fdvp->fvp)
   6. lock tvp if a nondirectory (tvp->f* or tvp->tdvp is impossible)
   7. lock fvp if not tvp (fvp->t* is impossible unless fvp=tvp)
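The two seven-step sequences above are mirror images of each other, which a small planning function can make explicit. This is a sketch under stated assumptions, not the genfs_rename code: the enum, the function name, and its parameters are hypothetical, the lookup steps are omitted, and both fvp and tvp are assumed to exist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the four vnodes involved in a rename. */
enum vn { FDVP, FVP, TDVP, TVP };

/*
 * Write the order in which the vnodes are locked into out[] and return
 * the number of entries.  fdvp_above_tdvp selects the fdvp/.../tdvp
 * case; otherwise the tdvp-first order is used.
 */
size_t
plan_lock_order(bool fdvp_above_tdvp, bool fvp_is_dir, bool tvp_is_dir,
    bool same_node, enum vn out[4])
{
	/* The "first" side is the one whose directory is locked first. */
	enum vn dvp1 = fdvp_above_tdvp ? FDVP : TDVP;
	enum vn vp1 = fdvp_above_tdvp ? FVP : TVP;
	enum vn dvp2 = fdvp_above_tdvp ? TDVP : FDVP;
	enum vn vp2 = fdvp_above_tdvp ? TVP : FVP;
	bool vp1_is_dir = fdvp_above_tdvp ? fvp_is_dir : tvp_is_dir;
	size_t n = 0;

	/* Cases (a) and (b) are exclusive: a directory has no hard links. */
	assert(!(same_node && (fvp_is_dir || tvp_is_dir)));

	out[n++] = dvp1;		/* step 1 */
	if (vp1_is_dir)
		out[n++] = vp1;		/* step 3: directory child early */
	out[n++] = dvp2;		/* step 4 */
	if (!vp1_is_dir)
		out[n++] = vp1;		/* step 6: nondirectory child late */
	if (!same_node)
		out[n++] = vp2;		/* step 7: skipped if fvp == tvp */
	return n;
}
```

For example, with fdvp above tdvp and fvp a directory, the plan is fdvp, fvp, tdvp, tvp; with a hard-linked pair in the tdvp-first case it is tdvp, fdvp, tvp, and the single shared node is locked only once.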
Deadlocks found by hannken@; resolution worked out with dholland@.
XXX I think we could improve concurrency somewhat -- with a likely big win for applications like tar and rsync that create many files with temporary names and then rename them to the permanent one in the same directory -- by making vfs_renamelock a reader/writer lock: any number of same-directory renames, or exactly one cross-directory rename, at any one time.
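The admission rule proposed in the XXX note (any number of same-directory renames, or exactly one cross-directory rename) can be illustrated with a single-threaded model. This is only a sketch of the policy, not a lock implementation and not how vfs_renamelock works today: the struct and function names are hypothetical, and a real version would need atomic state and a way to block waiters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a reader/writer rename gate. */
struct rename_gate {
	unsigned same_dir;	/* same-directory renames in flight (readers) */
	bool cross_dir;		/* a cross-directory rename is in flight (writer) */
};

/* Try to admit a rename; returns false where a real lock would block. */
bool
gate_try_enter(struct rename_gate *g, bool is_cross_dir)
{
	if (g->cross_dir)
		return false;		/* the writer excludes everyone */
	if (is_cross_dir) {
		if (g->same_dir > 0)
			return false;	/* the writer needs the gate empty */
		g->cross_dir = true;
	} else {
		g->same_dir++;		/* readers share the gate */
	}
	return true;
}

/* Leave the gate; must match an earlier successful gate_try_enter(). */
void
gate_exit(struct rename_gate *g, bool is_cross_dir)
{
	if (is_cross_dir) {
		assert(g->cross_dir);
		g->cross_dir = false;
	} else {
		assert(g->same_dir > 0);
		g->same_dir--;
	}
}
```

Under this rule two same-directory renames are admitted together, a cross-directory rename is refused until both have left, and once it is admitted nothing else gets in until it exits.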
|
1.37 |
| 05-Sep-2020 |
riastradh | tests/fs/vfs/t_renamerace: Test a screw case hannken@ found.
|
1.36 |
| 17-Aug-2019 |
gson | The udf_renamerace test case no longer fails due to PR kern/49046, but it does fail due to PR kern/53865 on real hardware.
|
1.35 |
| 13-Jan-2019 |
gson | branches: 1.35.2; Mark the fs/vfs/t_renamerace:udf_renamerace_dirs test case as an expected failure referencing PR kern/53865, and force failure to avoid reports of unexpected success as it does not reliably fail under qemu. This makes the treatment of udf_renamerace_dirs the same as that of udf_renamerace, only with a different PR. Also, make whitespace consistent between the two.
|
1.34 |
| 13-Jan-2017 |
christos | branches: 1.34.12; 1.34.14; Don't play with "../.." in includes for h_macros.h; deal with it centrally. Minor fixes.
|
1.33 |
| 04-May-2016 |
dholland | branches: 1.33.2; Cite a relevant PR for msdos_renamerace instead of one that was fixed several years ago.
|
1.32 |
| 29-Jul-2014 |
gson | Mark the udf_renamerace test case (but not udf_renamerace_dirs) as an expected failure again, now with a reference to PR kern/49046. Since the test only fails part of the time, force failure to avoid failure reports due to unexpected success.
|
1.31 |
| 25-Jul-2014 |
pgoyette | Remove atf_tc_expect_fail() calls for udf file-system. These tests are currently passing. As discussed on current-users. Any new failures should be reported via send-pr.
|
1.30 |
| 09-Jan-2014 |
hannken | branches: 1.30.2; Operation sysvbfs_remove() destructs inodes attached to active vnodes. Defer the destruction to sysvbfs_reclaim().
Disable test t_renamerace:sysvbfs_renamerace as it will exhaust the inode table (sysvbfs has space for 8 inodes only).
Ok: Izumi Tsutsui <tsutsui@netbsd.org>
|
1.29 |
| 10-Jul-2013 |
reinoud | Update test cases for UDF now udf_rename() uses the genfs_rename framework
|
1.28 |
| 08-Jul-2013 |
reinoud | Cover the last failing UDF test cases with a reference to PR kern/47986, i.e. all renames fail until UDF switches over to the new rename framework, which solves the locking mechanism.
|
1.27 |
| 17-Mar-2013 |
jmmv | Fix the t_renamerace:lfs_renamerace_dirs test on fast machines.
This test was failing on my machine when run natively but not causing any problems when run within qemu, and the failure was "mkdir: No space left on device".
My understanding of the issue is that this test overflowed the temporary disk image due to its high rate of file churn and the lfs_cleanerd not being able to keep up. Note that this test is capped by time, not number of operations, so this is why the problem does not show up in a slow emulated system.
To fix this, just bump the test file system image limit a little bit. (I tried increasing the frequency at which lfs_cleanerd does its thing, but it wasn't enough.)
|
1.26 |
| 09-May-2012 |
riastradh | branches: 1.26.2; Adjust t_renamerace now that ext2fs and ffs have good rename.
|
1.25 |
| 16-Feb-2012 |
perseant | Pass t_renamerace and t_rmdirrace tests.
Adapt dholland@'s fix to ufs_rename to fix PR kern/43582. Address several other MP locking issues discovered during the course of investigating the same problem.
Removed extraneous vn_lock() calls on the Ifile, since the Ifile writes are controlled by the segment lock.
Fix PR kern/45982 by deemphasizing the estimate of how much metadata will fill the empty space on disk when the disk is nearly empty (t_renamerace creates a lot of inode blocks on a tiny empty disk).
|
1.24 |
| 08-Oct-2011 |
njoly | branches: 1.24.2; 1.24.4; Slightly adjust skipped messages, makes output more consistent.
|
1.23 |
| 18-Jul-2011 |
dholland | ffs and ffslog are no longer xfail.
|
1.22 |
| 14-Mar-2011 |
pooka | Apparently this way of triggering the msdosfs rename vnode leak does not bite every time (most commonly observed on the amd64/qemu runs), so add a race condition catcher.
|
1.21 |
| 06-Mar-2011 |
pooka | Add a race catcher for p2k_ffs renamerace -- it seems like the problem doesn't trigger always especially in a qemu env (but triggers 100% of the time on my desktop).
|
1.20 |
| 03-Mar-2011 |
pooka | The re-enabled renamerace test also triggers the recent msdosfs vnode leak. xfail this under the blanket of PR kern/44661.
|
1.19 |
| 03-Mar-2011 |
pooka | Apparently my last commit to msdosfs_vnops.c fixed the (harmless?) buffer overrun in rename (>15 years old bug), so re-enable other msdosfs rename tests too.
|
1.18 |
| 11-Jan-2011 |
pooka | branches: 1.18.2; need unrace-catcher for ffslog
|
1.17 |
| 07-Jan-2011 |
pooka | xfail PR kern/44336
|
1.16 |
| 07-Jan-2011 |
pooka | ffs -o log dies in renamerace_dirs just like the rest.
|
1.15 |
| 02-Jan-2011 |
pooka | + rump_lwproc_newproc -> rump_lwproc_rfork()
+ add a test for rump_lwproc_rfork()
|
1.14 |
| 11-Nov-2010 |
pooka | skip tests which use features which rumpfs does not support (namely: vop_rename and a file system size limit)
|
1.13 |
| 01-Nov-2010 |
pooka | Create the process we use later in the test. Otherwise cwd doesn't go right and the test fails because of attempting to create files in the wrong directory.
|
1.12 |
| 01-Sep-2010 |
pooka | update to new rump lwp/proc interfaces
|
1.11 |
| 26-Aug-2010 |
pooka | chdir() once per process is enough, no need to do it for every thread (and doing so would cause occasional failures when some thread would cd out of the test mountpoint while another thread was still running in there).
|
1.10 |
| 26-Aug-2010 |
pooka | Put the workaround for PR kern/43799 into the common nfs unmount routine.
|
1.9 |
| 25-Aug-2010 |
pooka | Start many more threads for the renamerace since it seems to catch more errors.
Add a sleepkludge to deal with NFS's sillyrename brokenness.
|
1.8 |
| 16-Jul-2010 |
pooka | Some of the msdosfs tests are killed by SSP due to stack limit being exceeded. I cannot figure out what is going on by code reading, nor repeat this either on my desktop or in qemu, so skip those tests for msdosfs until I can get to the bottom of it.
|
1.7 |
| 16-Jul-2010 |
pooka | skip directory test on sysvbfs
|
1.6 |
| 16-Jul-2010 |
pooka | Fix typo in comment. comment tested by wizd.
|
1.5 |
| 16-Jul-2010 |
pooka | Fill in PR kern/43626 now that it exists.
|
1.4 |
| 16-Jul-2010 |
pooka | Do the famous renamerace test using directories. Uh oh, bad idea. PR coming soon.
|
1.3 |
| 16-Jul-2010 |
pooka | This test does not always fail for LFS, so apply same kludge as elsewhere while waiting for atf to grow support for these cases.
|
1.2 |
| 14-Jul-2010 |
pooka | xfail test on lfs. It goes badaboom faster than you can find your multipass. Borrow PR kern/43582 used earlier for rmdirrace, as it looks pretty much like the same problem.
|
1.1 |
| 14-Jul-2010 |
pooka | Convert "The Original" rename race test to vfs and retire the ffs/tmpfs versions. The only difference is that the origamical one mounted ffs with MNT_LOG (and therein actually lay the bug).
|
1.18.2.1 |
| 05-Mar-2011 |
bouyer | Sync with HEAD
|
1.24.4.1 |
| 17-Mar-2012 |
bouyer | Pull up following revision(s) (requested by perseant in ticket #116):
sys/ufs/lfs/lfs_alloc.c: revision 1.112
tests/fs/vfs/t_rmdirrace.c: revision 1.9
tests/fs/vfs/t_renamerace.c: revision 1.25
sys/ufs/lfs/lfs_vnops.c: revision 1.240
sys/ufs/lfs/lfs_segment.c: revision 1.224
sys/ufs/lfs/lfs_bio.c: revision 1.122
sys/ufs/lfs/lfs_vfsops.c: revision 1.294
sbin/newfs_lfs/make_lfs.c: revision 1.19
sys/ufs/lfs/lfs.h: revision 1.136
Pass t_renamerace and t_rmdirrace tests.
Adapt dholland@'s fix to ufs_rename to fix PR kern/43582. Address several other MP locking issues discovered during the course of investigating the same problem.
Removed extraneous vn_lock() calls on the Ifile, since the Ifile writes are controlled by the segment lock.
Fix PR kern/45982 by deemphasizing the estimate of how much metadata will fill the empty space on disk when the disk is nearly empty (t_renamerace creates a lot of inode blocks on a tiny empty disk).
|
1.24.2.3 |
| 22-May-2014 |
yamt | sync with head.
for a reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
1.24.2.2 |
| 23-May-2012 |
yamt | sync with head.
|
1.24.2.1 |
| 17-Apr-2012 |
yamt | sync with head
|
1.26.2.2 |
| 20-Aug-2014 |
tls | Rebase to HEAD as of a few days ago.
|
1.26.2.1 |
| 23-Jun-2013 |
tls | resync from head
|
1.30.2.1 |
| 10-Aug-2014 |
tls | Rebase.
|
1.33.2.1 |
| 20-Mar-2017 |
pgoyette | Sync with HEAD
|
1.34.14.2 |
| 13-Apr-2020 |
martin | Mostly merge changes from HEAD up to 20200411
|
1.34.14.1 |
| 10-Jun-2019 |
christos | Sync with HEAD
|
1.34.12.1 |
| 18-Jan-2019 |
pgoyette | Synch with HEAD
|
1.35.2.1 |
| 13-Sep-2020 |
martin | Pull up following revision(s) (requested by riastradh in ticket #1083):
sys/miscfs/genfs/genfs_rename.c: revision 1.5 tests/fs/vfs/t_renamerace.c: revision 1.37 tests/fs/vfs/t_renamerace.c: revision 1.38
tests/fs/vfs/t_renamerace: Test a screw case hannken@ found.
genfs_rename: Fix deadlocks in cross-directory cyclic rename.
Reproducer:
A: for (;;) {
       mkdir("c", 0600);
       mkdir("c/d", 0600);
       mkdir("c/d/e", 0600);
       rmdir("c/d/e");
       rmdir("c/d");
   }
B: for (;;) {
       mkdir("c", 0600);
       mkdir("c/d", 0600);
       mkdir("c/d/e", 0600);
       rename("c", "c/d/e");
   }
C: for (;;) {
       mkdir("c", 0600);
       mkdir("c/d", 0600);
       mkdir("c/d/e", 0600);
       rename("c/d/e", "c");
   }
Deadlock:
- A holds c and wants to lock d; and either
  - B holds . and d and wants to lock c, or
  - C holds . and d and wants to lock c.
The problem with these is that genfs_rename_enter_separate in B or C tried lock order .->d->c->e (in A/B, fdvp->tdvp->fvp->tvp; in A/C, tdvp->fdvp->tvp->fvp) which violates the ancestor->descendant order .->c->d->e.
The resolution is to change B to do fdvp->fvp->tdvp->tvp and C to do tdvp->tvp->fdvp->fvp. But there's an edge case: tvp and fvp might be the same (hard links), and we can't detect that until after we've looked them both up -- and in some file systems (I'm looking at you, ufs), there is no mere lookup operation, only lookup-and-lock, so we can't even hold the lock on one of tvp or fvp when we look up the other one if there's a chance they might be the same.
Fortunately the cases (a) tvp = fvp (b) tvp or fvp is a directory are mutually exclusive as long as directories cannot be hard-linked.
In case (a) we can just defer locking {tvp, fvp} until the end, because it can't possibly have {fdvp or fvp, tdvp or tvp} as descendants. In case (b) we can just lock them in the order fdvp->fvp->tdvp->tvp or tdvp->tvp->fdvp->fvp if the first one of {fvp, tvp} is a directory, because it can't possibly coincide with the second one of {fvp, tvp}.
With this change, we can now prove that the locking order is consistent with the ancestor->descendant partial ordering. Where two nodes are incommensurate under that partial ordering, they are only ever locked by rename and there is only ever one rename at a time.
Proof:
- For same-directory renames, genfs_rename_enter_common locks the directory first and then the children. The order directory->child[i] is consistent with ancestor->descendant and child[0]/child[1] are incommensurate.
- For cross-directory renames:
. While a rename is in progress and the fs-wide rename lock is held, directories can be created or removed but not changed, so the outcome of gro_genealogy -- which, given fdvp and tdvp, returns the node N relating fdvp/N/.../tdvp or null if there is none -- can only transition from finding N to not finding N, if one of the directories is removed while any of the vnodes are unlocked. Merely creating directories cannot change the ancestry of tdvp, and concurrent renames are not possible.
Thus, if a gro_genealogy determined the operation to have the form fdvp/N/.../tdvp, then it might cease to have that form, but only because tdvp was removed which will harmlessly cause the rename to fail later on. Similarly, if gro_genealogy determined the operation _not_ to have the form fdvp/N/.../tdvp then it can't begin to have that form until after the rename has completed.
The lock order is,
=> for fdvp/.../tdvp:
   1. lock fdvp
   2. lookup(/lock/unlock) fvp (consistent with fdvp->fvp)
   3. lock fvp if a directory (consistent with fdvp->fvp)
   4. lock tdvp (consistent with fdvp->tdvp and possibly fvp->tdvp)
   5. lookup(/lock/unlock) tvp (consistent with tdvp->tvp)
   6. lock fvp if a nondirectory (fvp->t* or fvp->fdvp is impossible)
   7. lock tvp if not fvp (tvp->f* is impossible unless tvp=fvp)
=> for incommensurate fdvp & tdvp, or for tdvp/.../fdvp:
   1. lock tdvp
   2. lookup(/lock/unlock) tvp (consistent with tdvp->tvp)
   3. lock tvp if a directory (consistent with tdvp->tvp)
   4. lock fdvp (either incommensurate with tdvp and/or tvp, or consistent with tdvp(->tvp)->fdvp)
   5. lookup(/lock/unlock) fvp (consistent with fdvp->fvp)
   6. lock tvp if a nondirectory (tvp->f* or tvp->tdvp is impossible)
   7. lock fvp if not tvp (fvp->t* is impossible unless fvp=tvp)
Deadlocks found by hannken@; resolution worked out with dholland@.
XXX I think we could improve concurrency somewhat -- with a likely big win for applications like tar and rsync that create many files with temporary names and then rename them to the permanent one in the same directory -- by making vfs_renamelock a reader/writer lock: any number of same-directory renames, or exactly one cross-directory rename, at any one time.
|