Home | History | Annotate | Download | only in fsck_lfs
History log of /src/sbin/fsck_lfs/pass0.c
RevisionDateAuthorComments
 1.43  03-Apr-2020  joerg Avoid common symbols for fsck_lfs.
 1.42  01-Sep-2015  dholland branches: 1.42.16;
Make the inode fields in the 64-bit superblock 64 bits wide.
Reasoning as before.

Note that I am not going through and checking for 64->32 truncations
in inode numbers; I'm sure there are quite a few, but that's a project
for later.
 1.41  23-Aug-2015  christos swap the formats too, not just the args.
 1.40  23-Aug-2015  dholland Fix reversed arguments to a print. nice and confusing...
 1.39  12-Aug-2015  dholland Add IFILE32 and IFILE64 structures for the on-disk ifile entries.
Add and use accessors. There are also a bunch of places that cast and
I hope I've found them all...
 1.38  12-Aug-2015  dholland Make 32-bit and 64-bit versions of CLEANERINFO.

XXX: while this is written to disk, it seems like much of it would
XXX: be better set up as a commpage shared with the cleaner.
 1.37  28-Jul-2015  dholland Add a new lfs header file: lfs_accessors.h.

This contains all the accessor functions and macros out of lfs.h.
Add an include of lfs_accessors.h after all uses of lfs.h... except
for code that wants to define its own struct lfs-alike that the
accessors are supposed to play along with. For these, set STRUCT_LFS
and include lfs_accessors.h after the necessary structure has been
defined, so that lfs_accessors.h can emit functions in terms of it.
 1.36  24-Jul-2015  dholland Switch to accessor functions for elements of the LFS on-disk
superblock. This will allow switching between 32/64 bit forms on the
fly; it will also allow handling LFS_EI reasonably tidily. (That
currently doesn't work on the superblock.)

It also gets rid of cpp abuse in the form of fake structure member
macros.

Also, instead of doing sleep/wakeup on &lfs_avail and &lfs_nextseg
inside the on-disk superblock, add extra elements to the in-memory
struct lfs for this. (XXX: these should be changed to condvars, but
not right now)

XXX: this migrates a structure needed by the lfs code in libsa (struct
salfs) into lfs.h, where it doesn't belong, but for the time being
this is necessary in order to allow the accessors (and the various
lfs macros and other goop that relies on them) to compile.
 1.35  08-Jun-2013  dholland Tidy up the LFS userland build hacks.
Don't use -I${NETBSDSRCDIR}/sys; don't include files other than the
exported LFS headers, which are lfs.h, lfs_inode.h, and (for now)
lfs_extern.h.
 1.34  06-Jun-2013  dholland Cleanups and hacks to make lfs userland stuff build:
- lfs_cksum.c doesn't actually need ulfs_inode.h any more.
- neither does lfs_itimes.c.
- add hacks to fsck_lfs to make it compile.
- add hacks to newfs_lfs to make it compile.
- fix warning in ulfs_quota.c when quotas are fully disabled
(as I guess is happening with the rumpity version)

XXX: This commit adds -I${NETBSDSRCDIR}/sys to the Makefiles for
XXX: fsck_lfs, newfs_lfs, and lfs_cleanerd. This needs to be cleaned
XXX: up ASAP; but I consider this less problematic in the short term
XXX: than spewing ulfs_*.h into /usr/include.
 1.33  06-Jun-2013  dholland ufs -> ulfs for fsck_lfs.
 1.32  22-Jan-2013  dholland Stuff UFS_ in front of a few of ufs's symbols to reduce namespace
pollution. Specifically:
ROOTINO -> UFS_ROOTINO
WINO -> UFS_WINO
NXADDR -> UFS_NXADDR
NDADDR -> UFS_NDADDR
NIADDR -> UFS_NIADDR
MAXSYMLINKLEN -> UFS_MAXSYMLINKLEN
MAXSYMLINKLEN_UFS[12] -> UFS[12]_MAXSYMLINKLEN (for consistency)

Sort out ext2fs's misuse of NDADDR and NIADDR; fortunately, these have
the same values in ext2fs and ffs.

No functional change intended.
 1.31  28-Apr-2008  martin branches: 1.31.20; 1.31.26;
Remove clause 3 and 4 from TNF licenses
 1.30  08-Oct-2007  ad branches: 1.30.8; 1.30.10;
Give brelse() a second argument so that it matches the kernel.
fsck_lfs now compiles again.
 1.29  09-Nov-2006  christos branches: 1.29.8;
Fix malloc/realloc/calloc issues: always check and exit, use EEXIT instead
of 8.
 1.28  18-Jul-2006  perseant Various improvements to fsck_lfs, to wit:

* Add lfs_balloc capability to the lfs library.
* Extend the Ifile if we run out of free inodes when creating lost+found.
* Don't roll forward if we have allocated a lost+found, to avoid
conflicts when adding new files in roll-forward.
* Make some messages slightly more verbose (e.g. include inode number,
and use pwarn() instead of printf() so the messages include the device
name when preening).
* Change superblock detection/avoidance to use the offset table in the
primary superblock, rather than looking at the contents.
* Be more verbose about various operations when passed the -d flag,
especially roll-forward.
* Be more careful about dirops during roll forward, since the cleaner can
sometimes write blocks from dirop vnodes. Detect and avoid this problem.
* Always check the free list, even if given -i; if we're going to write
it we have to check it first.
* Mark inodes dirty when blocks are found during roll forward, so the
inodes are written with the new block locations.
* Update size of inodes if blocks beyond EOF are found during roll
forward.
* Fix segment accounting for blocks and inodes found during roll
forward.
* Report statistics on roll forward: how many new/deleted/moved files
and how many updated blocks (or "nothing new").
* Don't care if the device being checked is really a device, if we have
been passed the -f flag (to facilitate automated testing).
* When writing to the disk, use the current time in the segment headers
rathern than time 0.
* When passed the -i flag, locate the partial segment containing the
Ifile inode and use that to calculate lfs_offset, lfs_curseg,
lfs_nextseg. (Again for automated testing.)
 1.27  17-Apr-2006  perseant Remove the free list ordering/disordering code, since the kernel now keeps
the list in order (ordering it on mount).

Regularize error messages: these are now all in ALL CAPS, with all hex
numbers (not reported in caps) prefixed by 0x. (The non-fsck-specific
messages are an exception to this all-caps rule.)
 1.26  17-Mar-2006  perseant Make it compile again.
 1.25  17-Mar-2006  rumble Check for allocation failures in malloc, calloc, realloc, asprintf, and
vasprintf and try to handle them.
 1.24  13-Sep-2005  christos rename lfs.h to lfs_user.h so that it does not conflict.
 1.23  20-Aug-2005  kent fix a compilation problem on LP64
 1.22  19-Aug-2005  christos 64 bit inode changes
 1.21  08-Jun-2005  perseant Use the correct method to create a new inode, when we allocate lost+found.

Correct uninitialized variable issues in pass6.c and dir.c (PR#30411 and
PR#30394, respectively).
 1.20  11-Apr-2005  perseant Be more efficient with the hash tables for the buffer and vnode caches.

Note that roll-forward can add more inodes to the filesystem; don't overflow
the tables but reallocate them.
 1.19  25-Mar-2005  perseant Make fsck_lfs optimize the inode free list, if it appears to have become
too disordered. This should improve file creation speed on aged filesystems.
Include code to disorder the list for debugging purposes, though this is
of course not compiled in by default.
 1.18  26-Feb-2005  perseant branches: 1.18.2;
Various minor LFS improvements:

* Extend the lfs library from fsck_lfs(8) so that it can be used with a
not-yet-existent LFS. Make newfs_lfs(8) use this library, so it can
create LFSs whose Ifile is larger than one segment.
* Make newfs_lfs(8) use strsuftoi64() for its arguments, a la newfs(8).
* Make fsck_lfs(8) respect the "file system is clean" flag.
* Don't let fsck_lfs(8) think it has dirty blocks when invoked with the
-n flag.
 1.17  19-Jan-2005  xtraeme ANSIfy, WARNS=2
 1.16  07-Aug-2003  agc branches: 1.16.4;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22308, verified by myself.
 1.15  31-Mar-2003  perseant Check inode free list tail pointer as well as head pointer, and write both
into the CLEANERINFO block of the Ifile as well as into the superblock.
Make preen update both superblocks.
 1.14  28-Mar-2003  perseant Add working writing ability to fsck_lfs, including roll-forward, based on
a partial-segment writer ported from the kernel.
 1.13  17-Feb-2003  perseant Add code to UBCify LFS. This is still behind "#ifdef LFS_UBC" for now
(there are still some details to work out) but expect that to go
away soon. To support these basic changes (creation of lfs_putpages,
lfs_gop_write, mods to lfs_balloc) several other changes were made, to
wit:

* Create a writer daemon kernel thread whose purpose is to handle page
writes for the pagedaemon, but which also takes over some of the
functions of lfs_check(). This thread is started the first time an
LFS is mounted.

* Add a "flags" parameter to GOP_SIZE. Current values are
GOP_SIZE_READ, meaning that the call should return the size of the
in-core version of the file, and GOP_SIZE_WRITE, meaning that it
should return the on-disk size. One of GOP_SIZE_READ or
GOP_SIZE_WRITE must be specified.

* Instead of using malloc(...M_WAITOK) for everything, reserve enough
resources to get by and use malloc(...M_NOWAIT), using the reserves if
necessary. Use the pool subsystem for structures small enough that
this is feasible. This also obsoletes LFS_THROTTLE.

And a few that are not strictly necessary:

* Moves the LFS inode extensions off onto a separately allocated
structure; getting closer to LFS as an LKM. "Welcome to 1.6O."

* Unified GOP_ALLOC between FFS and LFS.

* Update LFS copyright headers to correct values.

* Actually cast to unsigned in lfs_shellsort, like the comment says.

* Keep track of which segments were empty before the previous
checkpoint; any segments that pass two checkpoints both dirty and
empty can be summarily cleaned. Do this. Right now lfs_segclean
still works, but this should be turned into an effectless
compatibility syscall.
 1.12  24-Jan-2003  fvdl Bump daddr_t to 64 bits. Replace it with int32_t in all places where
it was used on-disk, so that on-disk formats remain the same.
Remove ufs_daddr_t and ufs_lbn_t for the time being.
 1.11  04-Feb-2002  perseant Use the correct size for inode blocks. This caused false data checksum
mispatches to be reported on v2 filesystems.
 1.10  02-Nov-2001  lukem fix -Wshadow warning
 1.9  13-Jul-2001  perseant Merge the short-lived perseant-lfsv2 branch into the trunk.

Kernels and tools understand both v1 and v2 filesystems; newfs_lfs
generates v2 by default. Changes for the v2 layout include:

- Segments of non-PO2 size and arbitrary block offset, so these can be
matched to convenient physical characteristics of the partition (e.g.,
stripe or track size and offset).

- Address by fragment instead of by disk sector, paving the way for
non-512-byte-sector devices. In theory fragments can be as large
as you like, though in reality they must be smaller than MAXBSIZE in size.

- Use serial number and filesystem identifier to ensure that roll-forward
doesn't get old data and think it's new. Roll-forward is enabled for
v2 filesystems, though not for v1 filesystems by default.

- The inode free list is now a tailq, paving the way for undelete (undelete
is not yet implemented, but can be without further non-backwards-compatible
changes to disk structures).

- Inode atime information is kept in the Ifile, instead of on the inode;
that is, the inode is never written *just* because atime was changed.
Because of this the inodes remain near the file data on the disk, rather
than wandering all over as the disk is read repeatedly. This speeds up
repeated reads by a small but noticeable amount.

Other changes of note include:

- The ifile written by newfs_lfs can now be of arbitrary length, it is no
longer restricted to a single indirect block.

- Fixed an old bug where ctime was changed every time a vnode was created.
I need to look more closely to make sure that the times are only updated
during write(2) and friends, not after-the-fact during a segment write,
and certainly not by the cleaner.
 1.8  05-Jan-2001  lukem branches: 1.8.2;
use %ll_ instead of the less standard %q_
 1.7  14-Jun-2000  perseant Add "-i" flag to specify the location of the index file inode, to
examine alternate checkpoints. Regularize usage of maxino. Remove olf
debugging cruft.
 1.6  30-May-2000  perseant Check for cycles in the inode free list, and for free inodes not on the free
list.
 1.5  23-May-2000  perseant branches: 1.5.2;
Convert to NetBSD source code style
 1.4  16-May-2000  perseant fsck_lfs can now write to the filesystem, allowing it to correct most
(though still not all) errors in a damaged lfs. Segment byte accounting
is corrected in pass 5. "fsck_lfs -p" will do a partial roll-forward,
verifying the checkpoint from the newer superblock. fscknames[] is
updated so that fsck knows about fsck_lfs.
 1.3  03-Jul-1999  kleink RCS Id police.
 1.2  24-Mar-1999  nathanw printf format fixes for Alpha.
 1.1  18-Mar-1999  perseant Initial checkin of fsck_lfs. This version cannot do any repair (-p flag
does nothing, and one of -p or -n is required) but can be useful as a
diagnostic tool.
 1.5.2.1  22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.8.2.2  02-Jul-2001  perseant Change disk addressing unit to be the fragment, instead of the disk sector.
All quantities in the superblock, inodes, indirect blocks, etc. refer now
to this abstract unit (called "fsb" as it is in FFS) instead of disk sectors;
as a consequence segment summary blocks have to be multiples of a fragment in
size. In v1 filesystems, compatibility code ensures that 1 fsb == 1 sector,
regardless of fragment size.

Fragments can now range in size between 512 and 32k; in the event that
LFS_LABELPAD (8k) is smaller than the disk address unit size, an extra
proto-superblock is kept at 8k from the beginning of the disk, to be used
*only* to locate the real superblocks. (Not all of the userland knows about
this yet.)

Almost all of this was done not by me, but by joff.
 1.8.2.1  27-Jun-2001  perseant Import of what I've been calling "LFSv2", that is, LFS with some features
added that require changes to the on-disk data structures. These include:

- 64-bit time in everything but inodes
- User-specified segment offset, and segment size no longer
restricted to PO2.
- Serial number on segment summaries in addition to timestamp, and
a new volume identifier, to make roll-forward feasible without
fear of finding old data and thinking it was new.

Although I think this version works at least as well as what's on the trunk,
we're not done yet; hence this commit is going in on a branch and not on
the trunk. Enhancements that are not here yet include fragment addressing,
like FFS does, instead of block addressing.
 1.16.4.1  10-May-2005  riz Pull up the following revisions (requested by perseant in ticket #1281):

1.8 sys/ufs/lfs/TODO
1.75 sys/ufs/lfs/lfs.h (via patch)
1.74 sys/ufs/lfs/lfs_alloc.c (via patch)
1.49, 1.51 sys/ufs/lfs/lfs_balloc.c (1.51 via patch)
1.78 sys/ufs/lfs/lfs_bio.c
1.62 sys/ufs/lfs/lfs_extern.h (via patch)
1.156 sys/ufs/lfs/lfs_segment.c (via patch)
1.48 sys/ufs/lfs/lfs_subr.c
1.101 sys/ufs/lfs/lfs_syscalls.c
1.163 sys/ufs/lfs/lfs_vfsops.c (via patch)
1.134 sys/ufs/lfs/lfs_vnops.c (via patch)
1.61 sys/ufs/ufs/ufs_readwrite.c (via patch)

1.20 libexec/lfs_cleanerd/clean.h (via patch)
1.52 libexec/lfs_cleanerd/cleanerd.c (via patch)
1.41 libexec/lfs_cleanerd/library.c (via patch)

1.4 regress/sys/fs/lfs/newfs_fsck/Makefile
1.2 regress/sys/fs/lfs/newfs_fsck/mkfs_mount
1.2 regress/sys/fs/lfs/newfs_fsck/smallfiles
1.3 sbin/fsck_lfs/bufcache.c
1.3 sbin/fsck_lfs/bufcache.h
1.3 sbin/fsck_lfs/lfs.h
1.8 sbin/fsck_lfs/lfs.c (via patch)
1.8 sbin/fsck_lfs/pass3.c (via patch)
1.18 sbin/fsck_lfs/pass0.c (via patch)
1.18 sbin/fsck_lfs/utilities.c (via patch)
1.7 sbin/fsck_lfs/segwrite.c
1.19 sbin/fsck_lfs/setup.c (via patch)
1.3 sbin/newfs_lfs/Makefile
0 sbin/newfs_lfs/lfs.c (yes, remove it)
1.1 sbin/newfs_lfs/make_lfs.c
1.15 sbin/newfs_lfs/newfs.c (via patch)

Various minor LFS improvements.

Kernel:

* Note when lfs_putpages(9) thinks it is not going to be writing any
pages before calling genfs_putpages(9). This prevents a situation in
which blocks can be queued for writing without a segment header.
* Correct computation of NRESERVE(), though it is still a gross
overestimate in most cases. Note that if NRESERVE() is too high, it
may be impossible to create files on the filesystem. We catch this
case on filesystem mount and refuse to mount r/w.
* Allow filesystems to be mounted whose block size is == MAXBSIZE.
* Somewhere along the line, ufs_bmaparray(9) started mangling UNWRITTEN
entries in indirect blocks again, triggering a failed assertion "daddr
<= LFS_MAX_DADDR". Explicitly convert to and from int32_t to correct
this. Should fix PR #29045.
* Add a high-water mark for the number of dirty pages any given LFS can
hold before triggering a flush. This is settable by sysctl, but off
(zero) by default.
* Be more careful about the MAX_BYTES and MAX_BUFS computations so we
shouldn't see "please increase to at least zero" messages.
* Note that VBLK and VCHR vnodes can have nonzero values in di_db[0]
even though their v_size == 0. Don't panic when we see this.
Fixes PR #26680.
* Change lfs_bfree to a signed quantity. The manner in which it is
processed before being passed to the cleaner means that sometimes it
may drop below zero, and the cleaner must be aware of this.
* Never report bfree < 0 (or higher than lfs_dsize) through
lfs_statfs(9). This prevents df(1) from ever telling us that our full
filesystems have 16TB free.
* Account space allocated through lfs_balloc(9) that does not have
associated buffer headers, so that the pagedaemon doesn't run us out
of segments.
* Return ENOSPC from lfs_balloc(9) when bfree drops to zero.
* Address a deadlock in lfs_bmapv/lfs_markv when the filesystem is being
unmounted. Because vfs_busy() is a shared lock, and
lfs_bmapv/lfs_markv mark the filesystem vfs_busy(), the cleaner can be
holding the lock that umount() is blocking on, then try to vfs_busy()
again in getnewvnode().

cleaner:

* Adapt lfs_cleanerd to use the fcntl call to get the Ifile filehandle,
so it need not be in the namespace.
* Make lfs_cleanerd be more careful when there are very few available
segments.
* Make lfs_cleanerd less verbose when the filesystem is unmounted.

newfs_lfs, fsck_lfs, and regression:

* Extend the lfs library from fsck_lfs(8) so that it can be used with a
not-yet-existent LFS. Make newfs_lfs(8) use this library, so it can
create LFSs whose Ifile is larger than one segment. Addresses PR #11110.
* Make newfs_lfs(8) use strsuftoi64() for its arguments, a la newfs(8).
* Make fsck_lfs(8) respect the "file system is clean" flag.
* Don't let fsck_lfs(8) think it has dirty blocks when invoked with the
-n flag.
* Remove the Ifile from the filesystem namespace. The cleaner now uses
a fcntl call on the root inode to find the Ifile filehandle. (As a
side-effect, addresses PR #29144.)
 1.18.2.1  07-May-2005  tron Apply patch (requested by perseant in ticket #242):
* fsck_lfs buffer cache fixes, including PR #29151
* Change fsck_lfs phase 0 message to reflect reality
* fsck_lfs: check phase 5 (cleanerinfo accounting) even on
roll-forward
* Keep better track of the free list during roll-forward, avoiding
a core dump
* Improve hash table use for fsck_lfs buffer and vnode cache
* Document fsck_lfs flag -f, and implement -q
* Add resize_lfs, including kernel support
* Add LFS to mountd's list of exportable filesystem types
* Make the LFS lkm work again [christos@]
* Add MP locking to the LFS kernel subsystem
* Fix pager_map deadlock in lfs_putpages()
* Avoid incomplete file extension that looks like "partial
truncation" to fsck
* Use lfs_malloc for cleaner malloc, since the cleaner often runs
in low-memory conditions.
* Use splay trees, not hash table, to track page allocation for
write.
* Fix mkdir panic on full fs
* Fix page accounting leak by counting differently.
* Use rightly named structure for lfs_getattr [skrll@]
* Cosmetic changes for readability.
 1.29.8.1  06-Nov-2007  matt sync with HEAD
 1.30.10.1  18-May-2008  yamt sync with head.
 1.30.8.1  02-Jun-2008  mjf Sync with HEAD.
 1.31.26.2  23-Jun-2013  tls resync from head
 1.31.26.1  25-Feb-2013  tls resync with head
 1.31.20.2  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was splitted into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.31.20.1  23-Jan-2013  yamt sync with head
 1.42.16.1  08-Apr-2020  martin Merge changes from current as of 20200406

RSS XML Feed