Home | History | Annotate | Download | only in fsck_lfs
History log of /src/sbin/fsck_lfs/pass1.c
RevisionDateAuthorComments
 1.47  03-Apr-2020  joerg Avoid common symbols for fsck_lfs.
 1.46  23-Feb-2020  riastradh Fix userland references to LFS_ORPHAN_NEXTFREE.

Forgot to grep for these or do a full distribution build, oops!
 1.45  03-Oct-2015  dholland branches: 1.45.16; 1.45.18;
The per-inode state 'id_entryno' is used by pass1 for a block count,
so widen it to 'long long'. pass2 uses it for the number of entries in
a directory (IIUC) which does not need to be wider than int, but for
now let's not try to split into two fields. FUTURE...
 1.44  01-Sep-2015  dholland Use daddr_t, not ulfs_daddr_t, as the latter's 32 bits wide.
Don't use either for on-disk items.
Declare external data in header files.
Part 3 of 3.
 1.43  01-Sep-2015  dholland The ifile's inode number is constant. (it is always 1)

Therefore, storing the value in the superblock and reading it out
again is silly and offers the opportunity for it to become corrupted.
So, don't do that (most of the code already didn't) and use the
existing constant instead. Initialize new 32-bit superblocks with
the value for the sake of old userland programs, but don't keep the
value in the 64-bit superblock at all.

(approved by Margo Seltzer)
 1.42  12-Aug-2015  dholland Hack up dinode usage to be 64 vs. 32 as needed. Part 1.

(This part changes the native lfs code; the ufs-derived code already
has 64 vs. 32 logic, but as aspects of it are unsafe, and don't
entirely interoperate cleanly with the lfs 64/32 stuff, pass 2 will be
rehashing that.)
 1.41  12-Aug-2015  dholland Add IFILE32 and IFILE64 structures for the on-disk ifile entries.
Add and use accessors. There are also a bunch of places that cast and
I hope I've found them all...
 1.40  28-Jul-2015  dholland Add a new lfs header file: lfs_accessors.h.

This contains all the accessor functions and macros out of lfs.h.
Add an include of lfs_accessors.h after all uses of lfs.h... except
for code that wants to define its own struct lfs-alike that the
accessors are supposed to play along with. For these, set STRUCT_LFS
and include lfs_accessors.h after the necessary structure has been
defined, so that lfs_accessors.h can emit functions in terms of it.
 1.39  24-Jul-2015  dholland More lfs superblock accessors.
(This changes the rest of the code over; all the accessors were
already added.)

The difference between this commit and the previous one is arbitrary,
but the previous one passed the regression tests on its own so I'm
keeping it separate to help with any bisections that might be needed
in the future.
 1.38  24-Jul-2015  dholland Switch to accessor functions for elements of the LFS on-disk
superblock. This will allow switching between 32/64 bit forms on the
fly; it will also allow handling LFS_EI reasonably tidily. (That
currently doesn't work on the superblock.)

It also gets rid of cpp abuse in the form of fake structure member
macros.

Also, instead of doing sleep/wakeup on &lfs_avail and &lfs_nextseg
inside the on-disk superblock, add extra elements to the in-memory
struct lfs for this. (XXX: these should be changed to condvars, but
not right now)

XXX: this migrates a structure needed by the lfs code in libsa (struct
salfs) into lfs.h, where it doesn't belong, but for the time being
this is necessary in order to allow the accessors (and the various
lfs macros and other goop that relies on them) to compile.
 1.37  18-Jun-2013  christos Prefix most of the cpp macros with lfs_ and LFS_ to avoid conflicts with ffs.
This was done so that boot blocks that want to compile both FFS and LFS in
the same file work.
 1.36  08-Jun-2013  dholland Tidy up the LFS userland build hacks.
Don't use -I${NETBSDSRCDIR}/sys; don't include files other than the
exported LFS headers, which are lfs.h, lfs_inode.h, and (for now)
lfs_extern.h.
 1.35  08-Jun-2013  dholland DIRBLKSIZ -> LFS_DIRBLKSIZ
DIRECTSIZ -> LFS_DIRECTSIZ
DIRSIZ -> LFS_DIRSIZ
OLDDIRFMT -> LFS_OLDDIRFMT
NEWDIRFMT -> LFS_NEWDIRFMT
IFTODT -> LFS_IFTODT
DTTOIF -> LFS_DTTOIF
 1.34  08-Jun-2013  dholland Stick LFS_ in front of IFMT, IFIFO, IFREG, etc. so as not to conflict
with the UFS copies of these symbols. (Which themselves ought to have
UFS_ stuck on.)
 1.33  06-Jun-2013  dholland Cleanups and hacks to make lfs userland stuff build:
- lfs_cksum.c doesn't actually need ulfs_inode.h any more.
- neither does lfs_itimes.c.
- add hacks to fsck_lfs to make it compile.
- add hacks to newfs_lfs to make it compile.
- fix warning in ulfs_quota.c when quotas are fully disabled
(as I guess is happening with the rumpity version)

XXX: This commit adds -I${NETBSDSRCDIR}/sys to the Makefiles for
XXX: fsck_lfs, newfs_lfs, and lfs_cleanerd. This needs to be cleaned
XXX: up ASAP; but I consider this less problematic in the short term
XXX: than spewing ulfs_*.h into /usr/include.
 1.32  06-Jun-2013  dholland ufs -> ulfs for fsck_lfs.
 1.31  22-Jan-2013  dholland Stuff UFS_ in front of a few of ufs's symbols to reduce namespace
pollution. Specifically:
ROOTINO -> UFS_ROOTINO
WINO -> UFS_WINO
NXADDR -> UFS_NXADDR
NDADDR -> UFS_NDADDR
NIADDR -> UFS_NIADDR
MAXSYMLINKLEN -> UFS_MAXSYMLINKLEN
MAXSYMLINKLEN_UFS[12] -> UFS[12]_MAXSYMLINKLEN (for consistency)

Sort out ext2fs's misuse of NDADDR and NIADDR; fortunately, these have
the same values in ext2fs and ffs.

No functional change intended.
 1.30  16-Feb-2010  mlelstv branches: 1.30.6; 1.30.12;
Three changes in a single commit.

- drop the notion of frags (LFS fragments) vs fsb (FFS fragments)
The code uses a complicated unity function that just makes the
code difficult to understand.

- support larger sector sizes. Fix disk address computations
to use DEV_BSIZE in the kernel as required by device drivers
and to use sector sizes in userland.

- Fix several locking bugs in lfs_bio.c and lfs_subr.c.
 1.29  08-Oct-2007  ad Give brelse() a second argument so that it matches the kernel.
fsck_lfs now compiles again.
 1.28  09-Nov-2006  christos branches: 1.28.8;
Fix malloc/realloc/calloc issues: always check and exit, use EEXIT instead
of 8.
 1.27  16-Oct-2006  christos comment out/delete impossible code
 1.26  01-Sep-2006  perseant Several fixes to improve the reliability of the roll-forward agent.
Also, note "properly orphaned" files as distinct from corrupted files.
 1.25  18-Jul-2006  perseant Various improvements to fsck_lfs, to wit:

* Add lfs_balloc capability to the lfs library.
* Extend the Ifile if we run out of free inodes when creating lost+found.
* Don't roll forward if we have allocated a lost+found, to avoid
conflicts when adding new files in roll-forward.
* Make some messages slightly more verbose (e.g. include inode number,
and use pwarn() instead of printf() so the messages include the device
name when preening).
* Change superblock detection/avoidance to use the offset table in the
primary superblock, rather than looking at the contents.
* Be more verbose about various operations when passed the -d flag,
especially roll-forward.
* Be more careful about dirops during roll forward, since the cleaner can
sometimes write blocks from dirop vnodes. Detect and avoid this problem.
* Always check the free list, even if given -i; if we're going to write
it we have to check it first.
* Mark inodes dirty when blocks are found during roll forward, so the
inodes are written with the new block locations.
* Update size of inodes if blocks beyond EOF are found during roll
forward.
* Fix segment accounting for blocks and inodes found during roll
forward.
* Report statistics on roll forward: how many new/deleted/moved files
and how many updated blocks (or "nothing new").
* Don't care if the device being checked is really a device, if we have
been passed the -f flag (to facilitate automated testing).
* When writing to the disk, use the current time in the segment headers
rathern than time 0.
* When passed the -i flag, locate the partial segment containing the
Ifile inode and use that to calculate lfs_offset, lfs_curseg,
lfs_nextseg. (Again for automated testing.)
 1.24  17-Mar-2006  rumble Check for allocation failures in malloc, calloc, realloc, asprintf, and
vasprintf and try to handle them.
 1.23  13-Sep-2005  christos rename lfs.h to lfs_user.h so that it does not conflict.
 1.22  19-Aug-2005  christos 64 bit inode changes
 1.21  27-Jun-2005  christos constify
 1.20  06-Feb-2005  perry remove obsolete "register" declarations.
 1.19  19-Jan-2005  xtraeme ANSIfy, WARNS=2
 1.18  18-Jul-2004  yamt zero-out dinode is not a proper way to 'clear' an lfs inode.
 1.17  07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22308, verified by myself.
 1.16  12-Jul-2003  yamt fix a null dereference on stale inode.
 1.15  02-Apr-2003  fvdl Add support for UFS2. UFS2 is an enhanced FFS, adding support for
64 bit block pointers, extended attribute storage, and a few
other things.

This commit does not yet include the code to manipulate the extended
storage (for e.g. ACLs), this will be done later.

Originally written by Kirk McKusick and Network Associates Laboratories for
FreeBSD.
 1.14  28-Mar-2003  perseant Add working writing ability to fsck_lfs, including roll-forward, based on
a partial-segment writer ported from the kernel.
 1.13  24-Jan-2003  fvdl Bump daddr_t to 64 bits. Replace it with int32_t in all places where
it was used on-disk, so that on-disk formats remain the same.
Remove ufs_daddr_t and ufs_lbn_t for the time being.
 1.12  25-Sep-2001  wiz Add some \n to error messages.
 1.11  13-Jul-2001  perseant Merge the short-lived perseant-lfsv2 branch into the trunk.

Kernels and tools understand both v1 and v2 filesystems; newfs_lfs
generates v2 by default. Changes for the v2 layout include:

- Segments of non-PO2 size and arbitrary block offset, so these can be
matched to convenient physical characteristics of the partition (e.g.,
stripe or track size and offset).

- Address by fragment instead of by disk sector, paving the way for
non-512-byte-sector devices. In theory fragments can be as large
as you like, though in reality they must be smaller than MAXBSIZE in size.

- Use serial number and filesystem identifier to ensure that roll-forward
doesn't get old data and think it's new. Roll-forward is enabled for
v2 filesystems, though not for v1 filesystems by default.

- The inode free list is now a tailq, paving the way for undelete (undelete
is not yet implemented, but can be without further non-backwards-compatible
changes to disk structures).

- Inode atime information is kept in the Ifile, instead of on the inode;
that is, the inode is never written *just* because atime was changed.
Because of this the inodes remain near the file data on the disk, rather
than wandering all over as the disk is read repeatedly. This speeds up
repeated reads by a small but noticeable amount.

Other changes of note include:

- The ifile written by newfs_lfs can now be of arbitrary length, it is no
longer restricted to a single indirect block.

- Fixed an old bug where ctime was changed every time a vnode was created.
I need to look more closely to make sure that the times are only updated
during write(2) and friends, not after-the-fact during a segment write,
and certainly not by the cleaner.
 1.10  06-Jan-2001  joff branches: 1.10.2;
Fixed blockmap handling to properly use disk blocks rather than fragments.
Fixes an issue with fsck_lfs not detecting all duplicate blocks that may
exist in a corrupted filesystem.
 1.9  05-Jan-2001  lukem use %ll_ instead of the less standard %q_
 1.8  14-Jun-2000  perseant Add "-i" flag to specify the location of the index file inode, to
examine alternate checkpoints. Regularize usage of maxino. Remove olf
debugging cruft.
 1.7  30-May-2000  perseant Check for cycles in the inode free list, and for free inodes not on the free
list.
 1.6  23-May-2000  perseant branches: 1.6.2;
Convert to NetBSD source code style
 1.5  16-May-2000  perseant fsck_lfs can now write to the filesystem, allowing it to correct most
(though still not all) errors in a damaged lfs. Segment byte accounting
is corrected in pass 5. "fsck_lfs -p" will do a partial roll-forward,
verifying the checkpoint from the newer superblock. fscknames[] is
updated so that fsck knows about fsck_lfs.
 1.4  20-Jan-2000  perseant Rename lfs_ifind so that it does not conflict with new kernel prototype.
Addresses PR #9253.
 1.3  03-Jul-1999  kleink RCS Id police.
 1.2  24-Mar-1999  nathanw branches: 1.2.2;
printf format fixes for Alpha.
 1.1  18-Mar-1999  perseant Initial checkin of fsck_lfs. This version cannot do any repair (-p flag
does nothing, and one of -p or -n is required) but can be useful as a
diagnostic tool.
 1.2.2.1  21-Jan-2000  he Pull up revision 1.4 (requested by perseant):
Fix name collision error due to recent kernel prototype updates.
Fixes PR#9253.
 1.6.2.1  22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.10.2.1  02-Jul-2001  perseant Change disk addressing unit to be the fragment, instead of the disk sector.
All quantities in the superblock, inodes, indirect blocks, etc. refer now
to this abstract unit (called "fsb" as it is in FFS) instead of disk sectors;
as a consequence segment summary blocks have to be multiples of a fragment in
size. In v1 filesystems, compatibility code ensures that 1 fsb == 1 sector,
regardless of fragment size.

Fragments can now range in size between 512 and 32k; in the event that
LFS_LABELPAD (8k) is smaller than the disk address unit size, an extra
proto-superblock is kept at 8k from the beginning of the disk, to be used
*only* to locate the real superblocks. (Not all of the userland knows about
this yet.)

Almost all of this was done not by me, but by joff.
 1.28.8.1  06-Nov-2007  matt sync with HEAD
 1.30.12.2  23-Jun-2013  tls resync from head
 1.30.12.1  25-Feb-2013  tls resync with head
 1.30.6.2  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was splitted into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.30.6.1  23-Jan-2013  yamt sync with head
 1.45.18.1  17-Aug-2020  martin Pull up following revision(s) (requested by riastradh in ticket #1050):

sys/ufs/lfs/lfs_subr.c: revision 1.101
sys/ufs/lfs/lfs_subr.c: revision 1.102
sys/ufs/lfs/lfs_inode.c: revision 1.158
sys/ufs/lfs/lfs_inode.h: revision 1.25
sys/ufs/lfs/lfs_balloc.c: revision 1.95
sys/ufs/lfs/lfs_pages.c: revision 1.21
sys/ufs/lfs/lfs_vnops.c: revision 1.330
sys/ufs/lfs/lfs_alloc.c: revision 1.140 (patch)
sys/ufs/lfs/lfs_alloc.c: revision 1.141 (patch)
lib/libp2k/p2k.c: revision 1.72
sys/ufs/lfs/lfs.h: revision 1.205
sys/ufs/lfs/lfs.h: revision 1.206
sys/ufs/lfs/lfs_segment.c: revision 1.284
sys/ufs/lfs/lfs.h: revision 1.207
sys/ufs/lfs/lfs_segment.c: revision 1.285
sys/ufs/lfs/lfs_debug.c: revision 1.55
sys/ufs/lfs/lfs_rename.c: revision 1.23
usr.sbin/dumplfs/dumplfs.c: revision 1.65
sys/ufs/lfs/lfs_vfsops.c: revision 1.371
sys/arch/i386/stand/efiboot/bootx64/Makefile: revision 1.3
sys/ufs/lfs/lfs_vfsops.c: revision 1.372
sys/ufs/lfs/lfs_vfsops.c: revision 1.373
sbin/fsck_lfs/pass1.c: revision 1.46
sys/ufs/lfs/lfs_vnops.c: revision 1.326
sys/ufs/lfs/lfs_vnops.c: revision 1.327
sys/ufs/lfs/lfs_vfsops.c: revision 1.375 (patch)
sys/ufs/lfs/lfs_vnops.c: revision 1.328
sys/ufs/lfs/lfs_subr.c: revision 1.98
sys/ufs/lfs/lfs_extern.h: revision 1.116
sys/ufs/lfs/lfs_vnops.c: revision 1.329
sys/ufs/lfs/lfs_subr.c: revision 1.99
sys/ufs/lfs/lfs_extern.h: revision 1.117
sys/ufs/lfs/lfs_accessors.h: revision 1.49
sys/ufs/lfs/lfs_extern.h: revision 1.118
sys/rump/fs/lib/liblfs/Makefile: revision 1.15
sys/ufs/lfs/lfs_bio.c: revision 1.146 (patch)
sys/ufs/lfs/lfs_bio.c: revision 1.147
sys/ufs/lfs/lfs_subr.c: revision 1.100

Fix kassert in lfs by initializing vp first.

Use a marker node to iterate lfs_dchainhd / i_lfs_dchain.

I believe elements can be removed while the lock is dropped,
including the next node we're hanging on to.

Just use VOP_BWRITE for lfs_bwrite_log.
Hope this doesn't cause trouble with vfs_suspend.

Teach lfs to transition ro<->rw.

Prevent new dirops while we issue lfs_flush_dirops.

lfs_flush_dirops assumes (by KASSERT((ip->i_state & IN_ADIROP) == 0))
that vnodes on the dchain will not become involved in active dirops
even while holding no other locks (lfs_lock, v_interlock), so we must
set lfs_writer here. All other callers already set lfs_writer.

We set fs->lfs_writer++ without explicitly doing lfs_writer_enter
because
(a) we already waited for the dirops to drain, and
(b) we hold lfs_lock and cannot drop it before setting lfs_writer.

Assert lfs_writer where I think we can now prove it.

Serialize access to the splay tree with lfs_lock.

Change some cheap KDASSERT into KASSERT.

Take a reference and fix assertions in lfs_flush_dirops.
Fixes panic:
KASSERT((ip->i_state & IN_ADIROP) == 0) at lfs_vnops.c:1670
lfs_flush_dirops
lfs_check
lfs_setattr
VOP_SETATTR
change_mode
sys_fchmod
syscall

This assertion -- and the assertion that vp->v_uflag has VU_DIROP set
-- is valid only until we release lfs_lock, because we may race with
lfs_unmark_dirop which will remove the nodes and change the flags.

Further, vp itself is valid only as long as it is referenced, which it
is as long as it's on the dchain, but lfs_unmark_dirop drops the
dchain's reference.

Don't lfs_writer_enter while holding v_interlock.

There's no need to lfs_writer_enter at all here, as far as I can see.
lfs_flush_fs will do it for us.

Break deadlock in PR kern/52301.

The lock order is lfs_writer -> lfs_seglock. The problem in 52301 is
that lfs_segwrite violates this lock order by sometimes doing
lfs_seglock -> lfs_writer, either (a) when doing a checkpoint or (b),
opportunistically, when there are no dirops pending. Both cases can
deadlock, because dirops sometimes take the seglock (lfs_truncate,
lfs_valloc, lfs_vfree):
(a) There may be dirops pending, and they may be waiting for the
seglock, so we can't wait for them to complete while holding the
seglock.
(b) The test for fs->lfs_dirops == 0 happens unlocked, and the state
may change by the time lfs_writer_enter acquires lfs_lock.

To resolve this in each case:
(a) Do lfs_writer_enter before lfs_seglock, since we will need it
unconditionally anyway. The worst performance impact of this should
be that some dirops get delayed a little bit.
(b) Create a new lfs_writer_tryenter to use at this point so that the
test for fs->lfs_dirops == 0 and the acquisition of lfs_writer happen
atomically under lfs_lock.

Initialize/destroy lfs_allclean_wakeup in modcmd, not lfs_mountfs.

Fixes reloading lfs.kmod.

In lfs_update, hold lfs_writer around lfs_vflush.

Otherwise, we might do
lfs_vflush
-> lfs_seglock
-> lfs_segwait(SEGM_CKP)
-> lfs_writer_enter
which is the reverse of the lfs_writer -> lfs_seglock ordering.

Call lfs_orphan in lfs_rename while we're still in the dirop.
lfs_writer_enter can't fail; keep it simple and don't pretend it can.

Assert that mtsleep can't fail either -- it doesn't catch signals and
there's no timeout.

Teach LFS_ORPHAN_NEXTFREE about lfs64.

Dust off the orphan detection code and try to make it work.

Fix !DIAGNOSTIC compile

Fix userland references to LFS_ORPHAN_NEXTFREE.

Forgot to grep for these or do a full distribution build, oops!

Fix missing <sys/evcnt.h> by removing the evcnts instead.

Just wanted to confirm that a race might happen, and indeed it did.
These serve little diagnostic value otherwise.

OR into bp->b_cflags; don't overwrite.

CTASSERT lfs on-disk structure sizes.

Avoid misaligned access to lfs64 on-disk records in memory.
lfs64 directory entries are only 32-bit aligned in order to conserve
space in directory blocks, and we had a hack to stuff a 64-bit inode
in them. This replaces the hack by __aligned(4) __packed, and goes
further:

1. It's not clear that all the other lfs64 data structures are 64-bit
aligned on disk to begin with. We can go through these later and
upgrade them from
struct foo64 {
...
} __aligned(4) __packed;
union foo {
struct foo64 f64;
...
};
to
struct foo64 {
...
};
union foo {
struct foo64 f64 __aligned(8);
...
} __aligned(4) __packed;
if we really want to take advantage of 64-bit memory accesses.
However, the __aligned(4) __packed must remain on the union
because:
2. We access even the lfs32 data structures via a union that has
lfs64 members, and it turns out that compilers will assume access
through a union with 64-bit aligned members implies the whole
union has 64-bit alignment, even if we're only accessing a 32-bit
aligned member.

Fix clang build after packed lfs64 accessor change.

Suppress spurious address-of-packed error in rump lfs too.
 1.45.16.1  08-Apr-2020  martin Merge changes from current as of 20200406

RSS XML Feed