CHANGES revision 1.6
# $NetBSD: CHANGES,v 1.6 2024/02/09 22:08:38 andvar Exp $

kernel:

- Instead of blindly continuing when it encounters an inode that is
  locked by another process, lfs_markv will process the rest of the
  inodes passed to it and then return EAGAIN. The cleaner will
  recognize this and not mark the segment clean. When the cleaner runs
  again, the segment containing the (formerly) locked inode will sort
  high for cleaning, since it is now almost entirely empty.

- A beginning has been made to test keeping atime information in the
  Ifile, instead of on the inodes. This should make read-mostly
  filesystems significantly faster, since the inodes will then remain
  close to the data blocks on disk; but of course the ifile will be
  somewhat larger. This code is not enabled, as it makes the format of
  IFILEs change.

- The superblock has been broken into two components: an on-disk
  superblock using fixed-size types, exactly 512 bytes regardless of
  architecture (or could be enlarged in multiples of the media block
  size up to LFS_SBPAD); and an in-memory superblock containing the
  information only useful to a running LFS, including segment pointers,
  etc. The superblock checksumming code has been modified to make
  future changes to the superblock format easier.

- Because of the way that lfs_writeseg works, buffers are freed before
  they are really written to disk: their contents are copied into large
  buffers which are written async.
  Because the buffer cache does not serve to throttle these writes,
  and malloced memory is used to hold them, there is a danger of running
  out of kmem_map. To avoid this, a new compile-time parameter,
  LFS_THROTTLE, is used as an upper bound for the number of
  partial-segments allowed to be in progress writing at any given time.

- If the system crashes between the point that a checkpoint is scheduled
  for writing and the time that the write completes, the filesystem
  could be left in an inconsistent state (no valid checkpoints on
  disk). To avoid this, we toggle between the first two superblocks
  when checkpointing, and (if it is indicated that no roll-forward agent
  exists) do not allow one checkpoint to occur before the last one has
  completed. When the filesystem is mounted, it uses the *older* of the
  first two superblocks.

- DIROPs:

  The design of the LFS includes segregating vnodes used in directory
  operations, so that they can be written at the same time during a
  checkpoint, avoiding filesystem inconsistency after a crash. Code for
  this was partially written for 4.4BSD, but was not complete or enabled.

  In particular, vnodes marked VDIROP could be flushed by getnewvnode at
  any time, negating the usefulness of marking a vnode VDIROP, since if
  the filesystem then crashed it would be inconsistent. Now, when a
  vnode is first marked VDIROP it is also referenced. To avoid running
  out of vnodes, an attempt to mark more than LFS_MAXDIROP vnodes with
  VDIROP will sleep, and trigger a partial-segment write when no dirops
  are active.
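
The checkpoint item above (toggle between the first two superblocks when
writing, use the older one at mount) can be sketched roughly as follows.
This is only an illustration under stated assumptions: the structure, its
`serial' and `valid' fields, and both function names are hypothetical and
are not taken from the LFS sources; a simple counter stands in for whatever
the real code uses to order checkpoints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one of the first two superblock slots. */
struct sb_slot {
	uint64_t serial;	/* assumed checkpoint ordering counter */
	int	 valid;		/* assumed: nonzero if the checksum matched */
};

/* Slot to overwrite at the next checkpoint: alternate 0, 1, 0, 1, ... */
int
next_checkpoint_slot(int last_slot)
{
	return last_slot == 0 ? 1 : 0;
}

/*
 * Slot to use at mount time: the older of the two valid slots.
 * Falls back to the only valid slot, or returns -1 if neither is valid.
 */
int
mount_slot(const struct sb_slot sb[2])
{
	assert(sb != NULL);

	if (!sb[0].valid && !sb[1].valid)
		return -1;
	if (!sb[0].valid)
		return 1;
	if (!sb[1].valid)
		return 0;
	return sb[0].serial <= sb[1].serial ? 0 : 1;
}
```

With slots carrying serials 7 and 8, both valid, mount_slot picks slot 0.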

- LFS maintains a linked list of free inode numbers in the Ifile;
  accesses to this list are now protected by a simple lock.

- lfs_vfree is not allowed to run while an inode has blocks scheduled
  for writing, since that could trigger a miscounting in lfs_truncate.

- lfs_balloc now correctly extends fragments, if a block is written
  beyond the current end-of-file.

- Blocks which have already been gathered into a partial-segment are not
  allowed to be extended, since if they were, any blocks following them
  would either be written in the wrong place, or overwrite other blocks.

- The LFS buffer-header accounting, which triggers a partial-segment
  write if too many buffer-headers are in use by the LFS subsystem, has
  been expanded to include *bytes* used in LFS buffers as well.

- Reads of the Ifile, which almost always come from the cleaner, can no
  longer trigger a partial-segment write, since this could cause a
  deadlock.

- Support has been added (but not tested, and currently disabled by
  default) for true read-only filesystems. Currently, if a filesystem
  is mounted read-only the cleaner can still operate on it, but this
  obviously would not be true for read-only media. (I think the
  original plan was for the roll-forward agent to operate using this
  "feature"?)

- If a fake buffer is created by lfs_markv and another process draws the
  same block in and changes it, the fake buffer is now discarded and
  replaced by the "real" buffer containing the new data.
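
The buffer accounting item above (trigger a partial-segment write when
either too many buffer-headers or too many bytes are held) might look
something like the sketch below. The limits, the structure, and the
function names are all hypothetical; the text does not give the real
thresholds or identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limits, chosen only for illustration. */
#define MAX_HEADERS	64
#define MAX_BYTES	(1024 * 1024)

/* Hypothetical running totals for buffers held by the LFS subsystem. */
struct buf_accounting {
	size_t headers;		/* buffer-headers currently in use */
	size_t bytes;		/* bytes held in those buffers */
};

/* Record one more buffer of nbytes bytes. */
void
account_buffer(struct buf_accounting *a, size_t nbytes)
{
	assert(a != NULL);
	a->headers++;
	a->bytes += nbytes;
}

/* Nonzero if either limit is exceeded and a flush should be triggered. */
int
needs_flush(const struct buf_accounting *a)
{
	assert(a != NULL);
	return a->headers > MAX_HEADERS || a->bytes > MAX_BYTES;
}
```

Checking both counters means a few very large buffers trip the limit just
as many small ones would.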

- An inode which has blocks gathered no longer has IN_MODIFIED set, but
  still does in fact have dirty blocks attached. lfs_update will now
  wait for such an inode's writes to complete before it runs,
  suppressing a panic in vinvalbuf.

- Many filesystem operations now update the Ifile's mtime, allowing the
  cleaner to detect when the filesystem is idle, and clean more
  vigorously during such times (cf. Blackwell et al., 1995).

- When writing a partial-segment, make sure that the current segment is
  still marked ACTIVE afterward (otherwise the cleaner might try to
  clean it, since it might well be mostly empty).

- Don't trust the cleaner so much. Sort the blocks during gathering,
  even if they came from the cleaner; verify the location of on-disk
  inodes, even if the cleaner says it knows where they came from.

- The cleaning code (lfs_markv in particular) has been entirely
  rewritten, and the partial-segment writing code changed to match.
  Lfs_markv no longer uses its own implementation of lfs_segwrite, but
  marks inodes with IN_CLEANING to differentiate them from the
  non-cleaning inodes. This change fixes numerous problems with the old
  cleaner, including a buffer overrun, and lost extensions in active
  fragments. lfs_bmapv looks up and returns the addresses of inode
  blocks, so the cleaner can do something intelligent with them.

  If IN_CLEANING is set on an inode during partial-segment write, only fake
  buffers will be written, and IN_MODIFIED will not be cleared, saving
  us from a panic in vinvalbuf.
  The addition of IN_CLEANING also allows dirops to be active while
  cleaning is in progress; since otherwise buffers engaged in active
  dirops might be written ahead of schedule, and cause an inconsistent
  checkpoint to be written to disk.

  (XXX - even now, DIROP blocks can sometimes be written to disk, if we
  are cleaning the same blocks as are active? Grr, I don't see a good
  solution for this!)

- Added sysctl entries for LFS. In particular, `writeindir' controls
  whether indirect blocks are written during non-checkpoint writes.
  (Since there is no roll-forward agent as yet, there is no penalty in
  not writing indirect blocks.)

- Wake up the cleaner at fs-unmount time, so it can die (if we unmount
  and then remount, we could conceivably get more than one cleaner
  operating at once).

newfs_lfs:

- The ifile inode is now created with the schg flag set, since nothing
  ever modifies it. This could be a pain for the roll-forward agent,
  but since that should really run *before* the filesystem is mounted,
  I don't care.

- For large disks, it may be necessary to write one or more indirect
  blocks when the ifile inode is created. Newlfs has been changed to
  write the first indirect block, if necessary. It should instead just
  build a set of inodes and blocks, and then use the partial-segment
  writing routine mentioned above to write an ifile of whatever size is
  desired.

lfs_cleanerd:

- Now writes information to the syslog.

- Can now deal properly with fragments.

- Sometimes, the cleaner can die. (Why?) If this happens and we don't
  notice, we're screwed, since the fs will overfill. So, the invoked
  cleaner now spawns itself repeatedly, a la init(8), to ensure that a
  cleaner is always present to clean the fs.

- Added a flag to clean more actively, based not on low load average
  but on filesystem inactivity; a la Blackwell et al., 1995.

fsck_lfs:

- Exists, although it currently cannot actually fix anything (it is a
  diagnostic tool only at this point).
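
The respawning behaviour described for lfs_cleanerd above can be
illustrated with a minimal fork/waitpid loop. This is a sketch only, not
the cleaner's actual code: `supervise' and `sample_work' are hypothetical
names, and the restart count is bounded here so the example terminates,
whereas a real supervisor would presumably loop for as long as the
filesystem is mounted.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical worker standing in for one run of the cleaner. */
int
sample_work(void)
{
	return 1;	/* pretend the cleaner died with exit status 1 */
}

/*
 * Run `work' in a child process; each time the child exits, start a
 * fresh one, up to max_restarts children in total.
 * Returns the number of children reaped, or -1 if fork or waitpid fails.
 */
int
supervise(int (*work)(void), int max_restarts)
{
	int reaped = 0;
	int status;
	pid_t pid;

	assert(work != NULL && max_restarts >= 0);

	while (reaped < max_restarts) {
		pid = fork();
		if (pid == -1)
			return -1;
		if (pid == 0)
			_exit(work());	/* child: do the work, then exit */
		if (waitpid(pid, &status, 0) == -1)
			return -1;
		reaped++;		/* child is gone; loop to respawn */
	}
	return reaped;
}
```

The parent never does the work itself, so a crash in the worker cannot
take the supervisor down with it.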