History log of /src/sbin/fsck_ffs/setup.c
Revision  Date  Author  Comments
 1.111  23-Jun-2025  christos join lines
 1.110  19-Jun-2025  mlelstv Don't truncate bitmap size to unsigned int, avoids crashes on filesystems
with more than 2^32 blocks.
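
The truncation described in 1.110 can be illustrated with a small sketch. The helper names are hypothetical and are not taken from setup.c; the only assumption is that a block bitmap needs one bit per block.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed for a bitmap with one bit per block. */
static uint64_t
bitmap_bytes(uint64_t nblocks)
{
	assert(nblocks <= UINT64_MAX - 7);	/* keep the rounding overflow-free */
	return (nblocks + 7) / 8;
}

/* The buggy pattern: the block count is squeezed through 32 bits first. */
static uint64_t
bitmap_bytes_truncated(uint64_t nblocks)
{
	uint32_t n = (uint32_t)nblocks;	/* drops bits 32 and above */

	return bitmap_bytes(n);
}
```

With 2^32 + 8 blocks the truncated variant sizes the bitmap for only 8 blocks, so any later access indexed by a real block number runs off the end of the allocation.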

 1.109  05-Jul-2023  riastradh Revert "fsck_ffs(8): Ensure A divides S before aligned_alloc(A, S)."

C17 lifted this restriction.
 1.108  04-Jul-2023  riastradh fsck_ffs(8): Fix whitespace issues.

- Nix trailing whitespace.
- Omit excessive blank lines.
- Insert missing blank lines between $NetBSD$ and copyright.

No functional change intended.
 1.107  04-Jul-2023  riastradh fsck_ffs(8): Ensure A divides S before aligned_alloc(A, S).

Required by C11 Sec. 7.22.3.1 The aligned_alloc function, para. 2,
p. 348:

The value of alignment shall be a valid alignment supported by the
implementation and the value of size shall be an integral multiple
of alignment.

XXX pullup-10
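
A minimal sketch of the kind of padding the C11 wording calls for. The helper names are hypothetical; this is not the code that was committed here and later reverted in 1.109, and it does not handle a request so large that rounding would overflow beyond the assertion shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round size up to the next multiple of align (align must be nonzero). */
static size_t
round_up(size_t size, size_t align)
{
	assert(align != 0);
	assert(size <= SIZE_MAX - (align - 1));
	return (size + align - 1) / align * align;
}

/*
 * Hypothetical wrapper: pad the request so that the size passed to
 * aligned_alloc() is an integral multiple of the alignment.
 */
static void *
aligned_alloc_padded(size_t align, size_t size)
{
	return aligned_alloc(align, round_up(size, align));
}
```

For example, a 1000-byte buffer aligned to 512 bytes would be requested as 1024 bytes.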
 1.106  08-Jan-2023  chs ufs: more signed/unsigned fixes

Fix the previous signed/unsigned fixes to build on 32-bit,
including applying this commit from FreeBSD:

commit 2d34afcd04207cf3fa3d5b7f467a890eae75da41
Author: Kirk McKusick <mckusick@FreeBSD.org>
Date: Sun Oct 25 21:04:07 2020 +0000

Use proper type (ino_t) for inode numbers to avoid improper sign extension
in the Pass 5 checks. The manifestation was fsck_ffs exiting with this error:

** Phase 5 - Check Cyl groups
fsck_ffs: inoinfo: inumber 18446744071562087424 out of range

The error only manifests itself for filesystems bigger than about 100Tb.

Reported by: Nikita Grechikhin <ngrechikhin at yandex.ru>
MFC after: 2 weeks
Sponsored by: Netflix
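
The failure mode can be reproduced in isolation. The sketch below uses hypothetical function names; the input 0x80004C00 is an assumption, chosen because sign-extending it yields exactly the inumber printed in the error message above. The log does not show which variable was involved.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Buggy pattern: an inode number held in a signed 32-bit variable is
 * sign-extended when widened to a 64-bit, ino_t-like type.
 */
static uint64_t
widen_signed(uint32_t on_disk)
{
	/* Implementation-defined conversion; wraps on two's complement. */
	int32_t ino = (int32_t)on_disk;

	return (uint64_t)(int64_t)ino;
}

/* Fixed pattern: keep the value unsigned all the way through. */
static uint64_t
widen_unsigned(uint32_t on_disk)
{
	return (uint64_t)on_disk;
}

/* Self-check: the signed path reproduces the out-of-range inumber. */
static void
check_widening(void)
{
	assert(widen_signed(0x80004C00u) == 18446744071562087424ULL);
	assert(widen_unsigned(0x80004C00u) == 2147503104ULL);
}
```

Any inode number at or above 2^31 takes the bad path, which is why the problem only appears on very large file systems.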
 1.105  07-Jan-2023  chs ufs: fixed signed/unsigned bugs affecting large file systems

Apply these commits from FreeBSD:

commit e870d1e6f97cc73308c11c40684b775bcfa906a2
Author: Kirk McKusick <mckusick@FreeBSD.org>
Date: Wed Feb 10 20:10:35 2010 +0000

This fix corrects a problem in the file system that treats large
inode numbers as negative rather than unsigned. For a default
(16K block) file system, this bug began to show up at a file system
size above about 16Tb.

To fully handle this problem, newfs must be updated to ensure that
it will never create a filesystem with more than 2^32 inodes. That
patch will be forthcoming soon.

Reported by: Scott Burns, John Kilburg, Bruce Evans
Followup by: Jeff Roberson
PR: 133980
MFC after: 2 weeks

commit 81479e688b0f643ffacd3f335b4b4bba460b769d
Author: Kirk McKusick <mckusick@FreeBSD.org>
Date: Thu Feb 11 18:14:53 2010 +0000

One last pass to get all the unsigned comparisons correct.


In addition to the changes from FreeBSD, this commit includes quite a few
related changes to appease -Wsign-compare.
 1.104  17-Nov-2022  chs branches: 1.104.2;
Restore backward compatibility of UFS2 with previous NetBSD releases by
disabling support in UFS2 for extended attributes (including ACLs).
Add a new variant of UFS2 called "UFS2ea" that does support extended attributes.
Add new fsck_ffs operations "-c ea" and "-c no-ea" to convert file systems
from UFS2 to UFS2ea and vice-versa (both of which delete all existing extended
attributes in the process).
 1.103  17-Apr-2020  jdolecek align buffers used for I/O to DEV_BSIZE so it's executed more optimally
when run for xbd(4) raw (character) device
 1.102  05-Oct-2018  hannken branches: 1.102.2;
Add a test for duplicate inodes on the persistent snapshot list.
 1.101  08-Feb-2017  rin branches: 1.101.4; 1.101.10; 1.101.12;
Add smaller versions of fsck_ffs(8) and newfs(8) for install media, where
support for Endian-Independent FFS and Apple UFS is disabled unless FFS_EI=1
and APPLE_UFS=1 are added to CRUNCHENV, respectively.

This reduces the size of ramdisk image for atari by over 15KB.

Thanks tsutsui and christos for their useful comments.
 1.100  23-Jun-2013  dholland branches: 1.100.10; 1.100.14;
Stick ffs_, ext2_, chfs_, filecore_, cd9660_, or mfs_ in front of
the following symbols so as to disambiguate fully. (Christos already
did the lfs ones.)

lblkno
lblktosize
lfragtosize
numfrags
blkroundup
fragroundup
 1.99  23-Jun-2013  dholland fsbtodb() -> FFS_FSBTODB(), EXT2_FSBTODB(), or MFS_FSBTODB()
dbtofsb() -> FFS_DBTOFSB() or EXT2_DBTOFSB()

(Christos already did the lfs ones a few days back)
 1.98  19-Jun-2013  dholland Rename ambiguous macros:
MAXDIRSIZE -> UFS_MAXDIRSIZE or LFS_MAXDIRSIZE
NINDIR -> FFS_NINDIR, EXT2_NINDIR, LFS_NINDIR, or MFS_NINDIR
INOPB -> FFS_INOPB, LFS_INOPB
INOPF -> FFS_INOPF, LFS_INOPF
blksize -> ffs_blksize, ext2_blksize, or lfs_blksize
sblksize -> ffs_blksize

These are not the only ambiguously defined filesystem macros, of
course, there's a pile more. I may not have found all the ambiguous
definitions of blksize(), too, as there are a lot of other things
called 'blksize' in the system.
 1.97  09-Jun-2013  dholland Stick UFS_ in front of these symbols:
DIRBLKSIZ
DIRECTSIZ
DIRSIZ
OLDDIRFMT
NEWDIRFMT

Part of PR 47909.
 1.96  22-Jan-2013  dholland Stuff UFS_ in front of a few of ufs's symbols to reduce namespace
pollution. Specifically:
ROOTINO -> UFS_ROOTINO
WINO -> UFS_WINO
NXADDR -> UFS_NXADDR
NDADDR -> UFS_NDADDR
NIADDR -> UFS_NIADDR
MAXSYMLINKLEN -> UFS_MAXSYMLINKLEN
MAXSYMLINKLEN_UFS[12] -> UFS[12]_MAXSYMLINKLEN (for consistency)

Sort out ext2fs's misuse of NDADDR and NIADDR; fortunately, these have
the same values in ext2fs and ffs.

No functional change intended.
 1.95  29-Jan-2012  nonaka branches: 1.95.6;
use FS_UFS[12]_MAGIC_SWAPPED instead of bswap32(FS_UFS[12]_MAGIC).
 1.94  14-Aug-2011  christos branches: 1.94.2;
WARNS=4
 1.93  09-Jun-2011  christos share more code.
 1.92  20-Mar-2011  bouyer branches: 1.92.2;
initialise memory allocated for uquot_user_hash & uquot_group_hash.
Pointed out by Nicolas Joly.
 1.91  06-Mar-2011  bouyer merge the bouyer-quota2 branch. This adds a new on-disk format
to store disk quota usage and limits, integrated with ffs
metadata. Usage is checked by fsck_ffs (no more quotacheck)
and is covered by the WAPBL journal. Enabled with kernel
option QUOTA2 (added where QUOTA was enabled in kernel config files),
turned on with tunefs(8) on a per-filesystem
basis. mount_mfs(8) can also turn quotas on.

See http://mail-index.netbsd.org/tech-kern/2011/02/19/msg010025.html
for details.
 1.90  31-Jan-2010  mlelstv branches: 1.90.2;
Skip handling of APPLEUFS_LABEL if it is smaller than a device block.
In particular:

- newfs will not try to erase the label
- fsck_ffs will not try to validate the label

This lets newfs and fsck work on 2048-byte-per-sector media.

Does Apple UFS support such media and how?
 1.89  27-Sep-2009  bouyer Restore changes from 1.86 and 1.87 after commit of 1.88.
 1.88  13-Sep-2009  bouyer Do some basic checks of the WAPBL journal, to abort the boot before the
kernel refuses to mount a filesystem read-write (booting a system
multiuser with critical filesystems read-only is bad):
Add a check_wapbl() which will check some WAPBL values in the superblock,
and try to read the journal via wapbl_replay_start() if there is one.
pfatal() if one of these fail (abort boot if in preen mode,
as "CONTINUE" otherwise). In non-preen mode the bogus journal will
be cleared.
check_wapbl() is always called if the superblock supports WAPBL.
Even if FS_DOWAPBL is not there, there could be flags asking the
kernel to clear or create a log with bogus values which would cause the
kernel to refuse to mount the filesystem.
Discussed in
http://mail-index.netbsd.org/tech-kern/2009/08/17/msg005896.html
and followups.
 1.87  07-Apr-2009  mrg fix a logic error in the previous, as pointed out by frank kardel.
 1.86  25-Mar-2009  mrg don't ignore "fsck -f" when given with "-p" on a wapbl filesystem.
ie, "fsck -fp" actually forces the check in preen mode now.
 1.85  25-Feb-2009  christos don't copy the address of a pointer. Noticed by Anon Ymous
 1.84  30-Aug-2008  bouyer branches: 1.84.2; 1.84.4; 1.84.8;
Add fss(4) snapshot support to fsck_ffs(8) (via -x or -X options, like
dump(8)). This allows fsck_ffs -n to work on a snapshot of a R/W mounted
filesystem, and avoid errors related to filesystem activity.
 1.83  31-Jul-2008  simonb Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.
 1.82  23-Feb-2008  christos branches: 1.82.4; 1.82.6;
Make sure that the exit values are always sane, and use symbolic instead
of magic constants. Reviewed by go@
 1.81  22-Aug-2007  christos branches: 1.81.2; 1.81.8; 1.81.10;
Avoid zero-divides from Anon Ymous
 1.80  26-Aug-2006  christos - Deal with wedges and the new disk geometry structures, instead of using
struct disklabel.

Functionality lost:
1. struct disklabel used to be updated to contain bsize, fsize, cpg.
This information was used to locate the alternative superblock in
the filesystem if the primary superblock was corrupted. We need
to find a new place to store this information if we need this
functionality.
2. On vax SMD drives that contained bad sector lists, the newfs program
knew how to get the offset and skip to the correct location in order
to place the label.
 1.79  17-Mar-2006  rumble Check for allocation failures in malloc, calloc, realloc, asprintf, and
vasprintf and try to handle them.
 1.78  27-Jun-2005  christos sprinkle const.
 1.77  02-Jun-2005  dbj for ufs2, assume FS_44INODEFMT
this is necessary for freebsd compatibility, since they do not initialize
the old field.
 1.76  02-Jun-2005  lukem appease gcc -Wuninitialized
 1.75  19-Jan-2005  xtraeme Kill __P(), ANSIfy and WARNS=2
 1.74  29-Oct-2004  dsl Rewrite getdisklabelpart() to avoid problems with isdigit(*ch_ptr) and
an incorrect check for a (probably impossible) empty string.
Add comments to avoid confusion...
 1.73  14-Apr-2004  dbj add support for downgrading a filesystem fslevel from 4 to 3
 1.72  14-Apr-2004  dbj set fs_old_nrpos to 1 when doing -c4 upgrade.
This isn't used by kernel, but does affect cg layout slightly
 1.71  12-Apr-2004  dbj fix whitespace in debug printf
 1.70  21-Mar-2004  dsl branches: 1.70.2;
Don't use an ffsv1 superblock from 64k (SBLOCK_UFS2) when looking
for the main filesystem superblock.
64k is the offset of the first alternate if the blocksize is 64k.
Fixes part of PR kern/24809
 1.69  20-Jan-2004  dbj don't calculate fake superblock used for finding alternate superblocks
if the disklabel is missing the cpg parameter. Also print a warning
if this is skipped because of a missing fsize, frag or cpg disklabel parameter
this fixes a divide by zero error reported by martin@
 1.68  12-Jan-2004  dbj change the message "COVERTING TO FFSv2 SUPERBLOCK" to
"CONVERT TO NEW SUPERBLOCK LAYOUT" to help avoid confusion
 1.67  10-Jan-2004  mrg - some KNF (80 cols)
- fix a printf format issue
 1.66  09-Jan-2004  dbj use %#llx instead of %llx when printing incorrect qfmask or qbmask
 1.65  09-Jan-2004  dbj do not upgrade superblock or set FS_FLAGS_UPDATED unless -c 4 option
is provided.
add compatibility for filesystems before FFSv2 integration
these patches are from pr port-macppc/23925 and should also
fix problems discussed in pr kern/21404 and pr kern/21283
 1.64  02-Jan-2004  dbj add uuid field to apple ufs volume label
 1.63  20-Oct-2003  dsl Add a -q (quiet) option to print nothing for clean filesystems.
Support in fsck_ffs and stub in fsck_xxx.
Push a few more messages through pwarn() instead of printf() to ensure
disk name is shown.
 1.62  07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22308, verified by myself.
 1.61  11-Apr-2003  enami Correctly detect a UFS1 file system of non-native endian.
 1.60  06-Apr-2003  fvdl Update some old fields when writing the superblock, similar to
ffs_oldfscompat_write() in the kernel. Use the old totals when
time < old_time (i.e. an old kernel or fsck wrote the filesystem last).
When setting the date back on a new kernel, that works out ok, since
new kernels always update both fields.
 1.59  05-Apr-2003  fvdl Skip checks for old 4.2BSD filesystem; as it stands, we can't deal with
writing them. Could be fixed, but doesn't have a high priority.
 1.58  02-Apr-2003  fvdl Add support for UFS2. UFS2 is an enhanced FFS, adding support for
64 bit block pointers, extended attribute storage, and a few
other things.

This commit does not yet include the code to manipulate the extended
storage (for e.g. ACLs), this will be done later.

Originally written by Kirk McKusick and Network Associates Laboratories for
FreeBSD.
 1.57  21-Feb-2003  fvdl Only check relevant fields when comparing the superblock to an alternate
superblock. Avoids false positives should fsck_ffs be run on a filesystem
that was created after the UFS2 code has been merged.

This commit is mostly a forward compatibility patch that can be pulled
up in to the 1.6 branch.

From Kirk Mckusick in FreeBSD (setup.c rev. 1.30). Original commit message:
 1.56  24-Jan-2003  fvdl Bump daddr_t to 64 bits. Replace it with int32_t in all places where
it was used on-disk, so that on-disk formats remain the same.
Remove ufs_daddr_t and ufs_lbn_t for the time being.
 1.55  05-Nov-2002  dbj check that a disklabel is valid before trying to extract partition
information from it when checking for apple ufs filesystems
 1.54  28-Sep-2002  dbj Add support for the Apple UFS variation on ffs
This is the bulk of PR #17345

The general approach is to use a run time determinable value
for DIRBLKSIZ. Additional allowances are included for using
MAXSYMLINKLEN with FS_42INODEFMT and a shift in the cylinder group
cluster summary count array. Support is added for managing
the Apple UFS volume label.
 1.53  30-Jun-2002  dbj commit fix from pr bin/15449
this fixes FS_42POSTBLFMT compatibility
 1.52  19-Dec-2001  fvdl branches: 1.52.2;
Don't use the pendinginodes and pendingblocks fields in alternate
superblock comparison.
 1.51  16-Nov-2001  lukem - changes to -F semantics:
- remove the restriction that filesystem must be a regular file
- don't try and read a disklabel
- use `p' (instead of `h') as the index of the last partition
 1.50  18-Sep-2001  lukem add comments to make it clearer what cmpsblks() is doing
 1.49  06-Sep-2001  lukem Incorporate the enhanced ffs_dirpref() by Grigoriy Orlov, as found in
FreeBSD (three commits; the initial work, man page updates, and a fix
to ffs_reload()), with the following differences:
- Be consistent between newfs(8) and tunefs(8) as to the options which
set and control the tuning parameters for this work (avgfilesize & avgfpdir)
- Use u_int16_t instead of u_int8_t to keep track of the number of
contiguous directories (suggested by Chuck Silvers)
- Work within our FFS_EI framework
- Ensure that fs->fs_maxclusters and fs->fs_contigdirs don't point to
the same area of memory

The new algorithm has a marked performance increase, especially when
performing tasks such as untarring pkgsrc.tar.gz, etc.

The original FreeBSD commit messages are attached:

=====
mckusick 2001/04/10 01:39:00 PDT
Directory layout preference improvements from Grigoriy Orlov <gluk@ptci.ru>.
His description of the problem and solution follow. My own tests show
speedups on typical filesystem intensive workloads of 5% to 12% which
is very impressive considering the small amount of code change involved.

------

One day I noticed that some file operations run much faster on
small file systems than on big ones. I've looked at the ffs
algorithms, thought about them, and redesigned the dirpref algorithm.

First I want to describe the results of my tests. These results are old
and I have improved the algorithm after these tests were done. Nevertheless
they show how big the performance speedup may be. I have done two file/directory
intensive tests on two OpenBSD systems with the old and new dirpref algorithms.
The first test is "tar -xzf ports.tar.gz", the second is "rm -rf ports".
The ports.tar.gz file is the ports collection from the OpenBSD 2.8 release.
It contains 6596 directories and 13868 files. The test systems are:

1. Celeron-450, 128Mb, two IDE drives, the system at wd0, file system for
test is at wd1. Size of test file system is 8 Gb, number of cg=991,
size of cg is 8m, block size = 8k, fragment size = 1k OpenBSD-current
from Dec 2000 with BUFCACHEPERCENT=35

2. PIII-600, 128Mb, two IBM DTLA-307045 IDE drives at i815e, the system
at wd0, file system for test is at wd1. Size of test file system is 40 Gb,
number of cg=5324, size of cg is 8m, block size = 8k, fragment size = 1k
OpenBSD-current from Dec 2000 with BUFCACHEPERCENT=50

You can get more info about the test systems and methods at:
http://www.ptci.ru/gluk/dirpref/old/dirpref.html

Test Results

          tar -xzf ports.tar.gz                 rm -rf ports
mode      old dirpref  new dirpref  speedup     old dirpref  new dirpref  speedup
First system
normal    667          472          1.41        477          331          1.44
async     285          144          1.98        130          14           9.29
sync      768          616          1.25        477          334          1.43
softdep   413          252          1.64        241          38           6.34
Second system
normal    329          81           4.06        263.5        93.5         2.81
async     302          25.7         11.75       112          2.26         49.56
sync      281          57.0         4.93        263          90.5         2.9
softdep   341          40.6         8.4         284          4.76         59.66

The "old dirpref" and "new dirpref" columns give the test time in seconds.
"speedup" is the ratio of the two times, i.e. old dirpref / new dirpref.

------

Algorithm description

The old dirpref algorithm is described in comments:

/*
 * Find a cylinder to place a directory.
 *
 * The policy implemented by this algorithm is to select from
 * among those cylinder groups with above the average number of
 * free inodes, the one with the smallest number of directories.
 */

A new directory is allocated in a different cylinder group than its
parent directory, resulting in a directory tree that is spread across
all the cylinder groups. This spreading out results in non-optimal
access to the directories and files. When we have a small filesystem
it is not a problem, but when the filesystem is big the performance
degradation becomes very apparent.

What do I mean by a big file system?

1. A big filesystem is a filesystem which occupies 20-30 or more percent
of total drive space, i.e. first and last cylinder are physically
located relatively far from each other.
2. It has a relatively large number of cylinder groups, for example
more cylinder groups than 50% of the buffers in the buffer cache.

The first results in long access times, while the second results in
many buffers being used by metadata operations. Such operations use
cylinder group blocks and on-disk inode blocks. The cylinder group
block (fs->fs_cblkno) contains struct cg, inode and block bit maps.
It is 2k in size for the default filesystem parameters. If new and
parent directories are located in different cylinder groups then the
system performs more input/output operations and uses more buffers.
On filesystems with many cylinder groups, lots of cache buffers are
used for metadata operations.

My solution for this problem is very simple. I allocate many directories
in one cylinder group. I also do some things, so that the new allocation
method does not cause excessive fragmentation and all directory inodes
will not be located at a location far from its file's inodes and data.
The algorithm is:
/*
 * Find a cylinder group to place a directory.
 *
 * The policy implemented by this algorithm is to allocate a
 * directory inode in the same cylinder group as its parent
 * directory, but also to reserve space for its files inodes
 * and data. Restrict the number of directories which may be
 * allocated one after another in the same cylinder group
 * without intervening allocation of files.
 *
 * If we allocate a first level directory then force allocation
 * in another cylinder group.
 */

My early versions of dirpref gave me good results for a wide range of
file operations and different filesystem capacities except one case:
those applications that create their entire directory structure first
and only later fill this structure with files.

My solution for such and similar cases is to limit a number of
directories which may be created one after another in the same cylinder
group without intervening file creations. For this purpose, I allocate
an array of counters at mount time. This array is linked to the superblock
fs->fs_contigdirs[cg]. Each time a directory is created the counter
increases and each time a file is created the counter decreases. A 60Gb
filesystem with 8mb/cg requires 10kb of memory for the counters array.
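
The counter scheme in the paragraph above can be sketched as follows. The struct and function names are hypothetical, and saturating at the counter's limits is an assumption of this sketch rather than something the message states. One-byte counters follow the original description; the NetBSD import uses u_int16_t instead, as noted earlier in this entry.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-cylinder-group fs_contigdirs[] array. */
struct cg_counters {
	uint8_t *contigdirs;	/* one counter per cylinder group */
	uint32_t ncg;		/* number of cylinder groups */
	uint8_t maxcontigdirs;	/* limit on back-to-back directory creations */
};

/* A directory was created in cg: bump the counter, saturating at the top. */
static void
note_dir_created(struct cg_counters *c, uint32_t cg)
{
	assert(cg < c->ncg);
	if (c->contigdirs[cg] < UINT8_MAX)
		c->contigdirs[cg]++;
}

/* A file was created in cg: relax the counter, never going below zero. */
static void
note_file_created(struct cg_counters *c, uint32_t cg)
{
	assert(cg < c->ncg);
	if (c->contigdirs[cg] > 0)
		c->contigdirs[cg]--;
}

/* May another directory be placed in cg without an intervening file? */
static int
cg_accepts_dir(const struct cg_counters *c, uint32_t cg)
{
	assert(cg < c->ncg);
	return c->contigdirs[cg] < c->maxcontigdirs;
}
```

An allocator built on this would fall back to another cylinder group once cg_accepts_dir() returns 0 for the parent's group.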

The maxcontigdirs is a maximum number of directories which may be created
without an intervening file creation. I found in my tests that the best
performance occurs when I restrict the number of directories in one cylinder
group such that all its files may be located in the same cylinder group.
There may be some deterioration in performance if all the file inodes
are in the same cylinder group as its containing directory, but their
data partially resides in a different cylinder group. The maxcontigdirs
value is calculated to try to prevent this condition. Since there is
no way to know how many files and directories will be allocated later
I added two optimization parameters in superblock/tunefs. They are:

int32_t fs_avgfilesize; /* expected average file size */
int32_t fs_avgfpdir; /* expected # of files per directory */

These parameters have reasonable defaults but may be tweaked for special
uses of a filesystem. They are only necessary in rare cases like better
tuning a filesystem being used to store a squid cache.

I have been using this algorithm for about 3 months. I have done
a lot of testing on filesystems with different capacities, average
filesize, average number of files per directory, and so on. I think
this algorithm has no negative impact on filesystem performance. It
works better than the default one in all cases. The new dirpref
will greatly improve untarring/removing/copying of big directories,
decrease load on cvs servers and much more. The new dirpref doesn't
speed up a compilation process, but also doesn't slow it down.

Obtained from: Grigoriy Orlov <gluk@ptci.ru>
=====

=====
iedowse 2001/04/23 17:37:17 PDT
Pre-dirpref versions of fsck may zero out the new superblock fields
fs_contigdirs, fs_avgfilesize and fs_avgfpdir. This could cause
panics if these fields were zeroed while a filesystem was mounted
read-only, and then remounted read-write.

Add code to ffs_reload() which copies the fs_contigdirs pointer
from the previous superblock, and reinitialises fs_avgf* if necessary.

Reviewed by: mckusick
=====

=====
nik 2001/04/10 03:36:44 PDT
Add information about the new options to newfs and tunefs which set the
expected average file size and number of files per directory. Could do
with some fleshing out.
=====
 1.48  03-Sep-2001  lukem no need to assign asb->fs_state twice in cmpsblks()
 1.47  03-Sep-2001  lukem deprecate fs_fscktime; we never used it.

in an effort to maintain compatibility with freebsd/openbsd/whatever,
i'm attempting to get the superblock format in sync, and freebsd uses
the int32_t at this position for `fs_pendinginodes'.

if we ever decide to implement fscktime functionality, we'll:
a) make sure to liaise with the other projects to reserve the same
spare field
b) actually implement the code this time ...

(this is also preparing us for other changes, like the new dirpref code)
 1.46  02-Sep-2001  lukem Incorporate fix by iedowse @ FreeBSD to allow disks with large numbers of
cylinder groups to work correctly, with minor modifications by me to work
with our FFS_EI code. From the FreeBSD commit message:

The ffs superblock includes a 128-byte region for use by temporary
in-core pointers to summary information. An array in this region
(fs_csp) could overflow on filesystems with a very large number of
cylinder groups (~16000 on i386 with 8k blocks). When this happens,
other fields in the superblock get corrupted, and fsck refuses to
check the filesystem.

Solve this problem by replacing the fs_csp array in 'struct fs'
with a single pointer, and add padding to keep the length of the
128-byte region fixed. Update the kernel and userland utilities
to use just this single pointer.

With this change, the kernel no longer makes use of the superblock
fields 'fs_csshift' and 'fs_csmask'. Add a comment to newfs/mkfs.c
to indicate that these fields must be calculated for compatibility
with older kernels.

Reviewed by: mckusick
 1.45  17-Aug-2001  lukem remove third argument (`int ns') from ffs_sb_swap(), and let ffs_sb_swap()
determine the endianness of the `struct fs *o' superblock from o->fs_magic
and set needswap as necessary, rather than trusting the caller to get
it right. invariably, almost every caller of ffs_sb_swap() was calling it
with ns set to the wrong value for ns anyway!
ansi KNF ffs_bswap.c declarations whilst here.

this fixes all sorts of problems when trying to use other-endian file systems,
notably the kernel trying to access memory *way* off, possibly corrupting or
panicking, and userland programs SEGVing and/or corrupting things (e.g,
"fsck_ffs -B" to swap a file system endianness).

whilst the previous rev of ffs_bswap.c (1.10, 2000/12/23) made this problem
worse, i suspect that the problem was always there and previous versions
just happened not to trash things at the wrong time.

FFS_EI should now be a lot more stable.
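
Deriving the byte order from the magic number, rather than trusting a caller-supplied flag, might look like the sketch below. The helper names are hypothetical. The magic values are given as I understand the current <ufs/ffs/fs.h> to define them and should be treated as assumptions; at the time of this commit only the UFS1 magic existed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, mirroring the definitions in <ufs/ffs/fs.h>. */
#define FS_UFS1_MAGIC		0x011954
#define FS_UFS2_MAGIC		0x19540119
#define FS_UFS1_MAGIC_SWAPPED	0x54190100
#define FS_UFS2_MAGIC_SWAPPED	0x19015419

/* Portable 32-bit byte swap (stand-in for bswap32()). */
static uint32_t
swap32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
	    ((x << 8) & 0x00ff0000u) | (x << 24);
}

/*
 * Hypothetical helper: returns 1 if the superblock is other-endian,
 * 0 if it is native, and -1 if the magic number is not recognised.
 */
static int
sb_needswap(uint32_t fs_magic)
{
	switch (fs_magic) {
	case FS_UFS1_MAGIC:
	case FS_UFS2_MAGIC:
		return 0;
	case FS_UFS1_MAGIC_SWAPPED:
	case FS_UFS2_MAGIC_SWAPPED:
		return 1;
	default:
		return -1;
	}
}

/* Self-check: the _SWAPPED constants are the byte-swapped magics. */
static void
check_magic_constants(void)
{
	assert(swap32(FS_UFS1_MAGIC) == FS_UFS1_MAGIC_SWAPPED);
	assert(swap32(FS_UFS2_MAGIC) == FS_UFS2_MAGIC_SWAPPED);
}
```

Comparing against precomputed swapped constants also avoids calling a swap routine at all in the detection path, which is the change rev 1.95 made.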
 1.44  15-Aug-2001  lukem - implement -F; treat provided filesystems as images in regular files
- replace "filesystem" with "file system" as appropriate
- grammar fixes
 1.43  04-Jul-2001  hubertf EVEN IF YOU SCREAM, THE COMMAND IS STILL CALLED fsck_ffs !
 1.42  04-Feb-2001  christos remove redundant declarations
 1.41  26-Jan-2001  thorpej In pass 5, check alternate superblocks for consistency with
the current in-core master superblock, and fix them up if
they're incorrect. Move the code that writes the alternate
superblocks if (cvtlevel || doswap) into pass 5 for efficiency.

Reviewed by Charles Hannum, and used by me to fix up a curdled
file system.
 1.40  09-Jan-2001  mycroft Remove a bogus piece of code that was never used.
 1.39  09-Jan-2001  mycroft Try to cope with cs_ndir being wacky (too large or, particularly when using -b,
too damn small) by setting a minimum (1024) and maximum (maxino + 1). This
prevents certain operations getting REALLY slow when -b is used, and also
avoids overallocating memory if the superblock is hosed.
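
The clamping described in 1.39 amounts to the following sketch. The function and macro names are hypothetical, and applying the lower bound before the upper bound is an assumption about ordering that the log does not spell out.

```c
#include <assert.h>
#include <stdint.h>

#define MIN_DIR_ESTIMATE 1024	/* hypothetical name for the minimum */

/* Clamp a possibly bogus directory count to [1024, maxino + 1]. */
static uint64_t
clamp_ndir(int64_t cs_ndir, uint64_t maxino)
{
	uint64_t n, upper;

	assert(maxino < UINT64_MAX);	/* maxino + 1 must not wrap */
	upper = maxino + 1;

	/* Too small (or negative, from a damaged superblock): use the floor. */
	n = cs_ndir < MIN_DIR_ESTIMATE ? MIN_DIR_ESTIMATE : (uint64_t)cs_ndir;

	/* Too large: there cannot be more directories than inodes. */
	if (n > upper)
		n = upper;
	return n;
}
```

The floor keeps tables sized from this value from being tiny when the count read via -b is unreliable; the ceiling bounds memory use when the superblock is damaged.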
 1.38  05-Jan-2001  lukem use %ll_ instead of the less standard %q_
 1.37  15-Nov-1999  fvdl branches: 1.37.4;
Changes for softdep code.
 1.36  01-May-1999  is branches: 1.36.2; 1.36.6;
Fix typo.
 1.35  12-Nov-1998  christos Adjust for DKTYPENAME changes.
 1.34  26-Jul-1998  mycroft const poisoning.
 1.33  18-Mar-1998  bouyer Add support for non-native byteorder FFS, and converting byteorder.
Also, be a bit more conservative with the clean flag: don't mark the FS
clean when we know there may still be errors (user answered 'n' to
a question, or fsck says "you must rerun fsck").
 1.32  24-Sep-1997  lukem for now, #ifdef out a couple of chunks that were added in the lite2 merge
 1.31  20-Sep-1997  lukem - don't indiscriminately include <stdlib.h> and <unistd.h> in "fsck.h"
- explicitly pull in <stdio.h>, <stdlib.h> and <unistd.h> in *.c as necessary
 1.30  16-Sep-1997  lukem resolve conflicts from lite-2 merge.
 1.29  16-Sep-1997  mrg make these compile on the alpha after WARNS=1.
 1.28  14-Sep-1997  lukem * cleanup for WARNS=1
* deprecate register
* cleanup manpage
* remove unused docheck() func
* prefix hex numbers with '0x'
* getopt returns -1 not EOF
 1.27  27-Sep-1996  christos - util.h -> fsutil.h
 1.26  23-Sep-1996  christos - fixed all printf formats [there were a lot of %l? <-> %? mistakes]
- added missing prototypes, and made local functions static
- removed parallel preening code; this is part of fsck(8)
- use printing utilities from fsck(8)
- Makefile does not make links to fsck and fsck.8
- removed -l maxparallel option. It has no meaning anymore.
 1.25  21-May-1996  mycroft Oops; use %x to print out masks, not %d.
 1.24  21-May-1996  mycroft Check fs_[bf]mask, fs_maxfilesize, fs_maxsymlinklen, and fs_q[bf]mask,
since incorrect values may cause the kernel to malfunction.
 1.23  05-Apr-1996  cgd check in changes proposed in PR 2006 (approved by J.T.), to rename fsck
to fsck_ffs, so that in the future 'fsck' can be a wrapper that invokes
appropriate filesystem-specific checker programs. For now, the only
user-visible change is that the names have changed in the manual page
and in error messages; fsck and fsck.8 are now links to fsck_ffs and
fsck_ffs.8, until the rest of the transition is complete.
 1.22  12-Jul-1995  cgd implement a 'force check' flag, '-f'. I used the SunOS name, but the Digital
semantics. now:
(1) dirty file systems will always be checked; nothing new there.
(2) if not '-f' clean file systems will _NEVER_ be checked,
i.e. they won't be checked even if -p isn't specified. This
allows one to 'fsck -p ; fsck' to preen, then clean up
anything that 'fsck -p' barfs on, without waiting for the
clean file systems to be checked again.
(3) if '-f' clean file systems will ALWAYS be checked. This
allows people to put 'fsck -fp' into /etc/rc on systems
where they're leery of the FS clean flag state, need
the extra reliability, and can afford time 'wasted'
in checks.
The assumption made here is that if a file system is marked clean, it
_IS CLEAN_, really, and shouldn't be checked unless fsck is explicitly
told to (with -f). This should be a valid assumption, but may not be in
the presence of file system bugs. Documentation updated to note '-f'.
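
The three rules above reduce to a small decision function. This is a sketch with a hypothetical name, not the code in setup.c; note that under these rules the -p flag plays no part in the decision.

```c
#include <assert.h>
#include <stdbool.h>

/* Decide whether a file system gets checked under the '-f' semantics. */
static bool
should_check(bool fs_clean, bool force)
{
	if (!fs_clean)
		return true;	/* (1) dirty: always checked */
	return force;		/* (2) clean, no -f: never; (3) clean, -f: always */
}

/* Self-check covering all four combinations. */
static void
check_force_semantics(void)
{
	assert(should_check(false, false));
	assert(should_check(false, true));
	assert(!should_check(true, false));
	assert(should_check(true, true));
}
```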
 1.21  12-Apr-1995  mycroft Set the clean flag if necessary. If preening, don't check `clean' file
systems.
 1.20  21-Mar-1995  cgd type sizes
 1.19  18-Mar-1995  cgd convert to new RCS Id conventions; reduce my headache
 1.18  28-Dec-1994  mycroft Mostly sync with CSRG.
 1.17  27-Dec-1994  mycroft Copy fs_maxcluster when comparing superblocks.
 1.16  18-Dec-1994  cgd light clean, and make it compile against new header files.
 1.15  05-Dec-1994  cgd more cleanups from Jim Jegers, passed over by me.
 1.14  28-Oct-1994  mycroft Use the S_IS*() macros, and make this compile again after Chris's changes to ufs.
 1.13  23-Sep-1994  mycroft Remove some more uses of obsolete functions.
 1.12  23-Sep-1994  mycroft Eliminate uses of some obsolete functions.
 1.11  29-Jun-1994  ws Reads on raw disks are only guaranteed in multiples of the block size
 1.10  08-Jun-1994  mycroft Update from 4.4-Lite, with local changes.
 1.9  25-Apr-1994  cgd oops; changed comparison, but not field!
 1.8  25-Apr-1994  cgd need <sys/time.h>
 1.7  14-Apr-1994  cgd fs type names will soon be strings
 1.6  09-Apr-1994  deraadt from <dean@fsa.ca>: let "fsck /usr" work. also, if the user does
"fsck /dev/sd0a" attempt to map to the raw device name.
 1.5  01-Oct-1993  mycroft Skip check if filesystem is marked clean and isn't too dusty, only with -p.
Set clean flag after checking a filesystem.
 1.4  01-Aug-1993  mycroft Add RCS identifiers.
 1.3  23-Mar-1993  cgd changed "Id" to "Header" for rcsids
 1.2  22-Mar-1993  cgd added rcs ids to all files
 1.1  21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3  16-Sep-1997  lukem imported from lite-2
 1.1.1.2  13-Jun-1994  mycroft Import 4.4-Lite version.
 1.1.1.1  21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.36.6.1  27-Dec-1999  wrstuden Pull up to last week's -current.
 1.36.2.2  26-Oct-1999  fvdl Fix some merge mistakes.
 1.36.2.1  19-Oct-1999  fvdl Bring in Kirk McKusick's FFS softdep code on a branch.
 1.37.4.5  25-Nov-2001  he Pull up revision 1.49 (requested by lukem):
Pull in enhanced ffs_dirpref() algorithm, which provides a
substantial performance improvement through better locality
between parent/child directories and their files, and by easing
the pressure on the buffer cache for metadata operations.
 1.37.4.4  25-Nov-2001  he Pull up revision 1.47 (requested by lukem):
Deprecate unused fs_fscktime.
 1.37.4.3  25-Nov-2001  he Pull up revision 1.46 (requested by lukem):
Change fs_csp[] from being a fixed size to being an array sized
as required. This allows file systems with more than about 15500
cylinder groups (on 32-bit systems) to be used.
 1.37.4.2  25-Nov-2001  he Pull up revision 1.45 (requested by lukem):
Call ffs_sb_swap() with the correct arguments. Fixes problems
with using other-endian file systems.
 1.37.4.1  24-Nov-2001  he Pull up revisions 1.39-1.43 (requested by lukem):
Jumbo pullup for fsck_ffs:
o fix incorrect error message
o mark initialized globals with ``extern''
o make reconnect algorithm O(n) instead of O(n^4)
o remove dead code
o don't swap cg_clustersum(cg)[0], it's a bitmap
o ensure rotor values are positive
o some code restructuring
o fix byte swapping bug
o pass5: check alternate superblocks for consistency with in-core master
o fix usage message
 1.52.2.1  23-Feb-2003  jmc Pullup rev 1.57 (requested by fvdl in ticket #1180)
Only check relevant fields when comparing the superblock to an alternate
superblock. Avoids false positives should fsck_ffs be run on a filesystem
that was created after the UFS2 code has been merged.
 1.70.2.1  27-Apr-2004  jdc Pull up revisions 1.72-1.73 (requested by dbj in ticket #185)

Fix problems related to superblock upgrade issues which may be
experienced by -current users from 2003.
 1.81.10.2  28-Sep-2008  mjf Sync with HEAD.
 1.81.10.1  03-Apr-2008  mjf Sync with HEAD.
 1.81.8.1  24-Mar-2008  keiichi sync with head.
 1.81.2.1  23-Mar-2008  matt sync with HEAD
 1.82.6.1  10-Jun-2008  simonb Initial commit of Wasabi Systems' WAPBL (Write Ahead Physical Block
Logging) journaling code. Originally written by Darrin B. Jewell
while at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

Still a number of issues - look in doc/BRANCHES for "simonb-wapbl"
for more info.
 1.82.4.1  18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.84.8.1  21-Apr-2010  matt sync to netbsd-5
 1.84.4.1  13-May-2009  jym Sync with HEAD.

Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
 1.84.2.1  03-Oct-2009  snj Pull up following revision(s) (requested by bouyer in ticket #1036):
sbin/fsck_ffs/extern.h: revision 1.25 via patch
sbin/fsck_ffs/setup.c: revision 1.88 via patch
sbin/fsck_ffs/wapbl.c: revision 1.4 via patch
sbin/tunefs/tunefs.c: revision 1.41 via patch
sys/ufs/ffs/ffs_vfsops.c: revision 1.252 via patch
sys/ufs/ffs/ffs_wapbl.c: revision 1.13 via patch
Allow tunefs to clear any type of WAPBL log, not only in-filesystem
ones. Discussed in
http://mail-index.netbsd.org/tech-kern/2009/08/17/msg005896.html
and followups.
--
Do some basic checks of the WAPBL journal, to abort the boot before the
kernel refuses to mount a filesystem read-write (booting a system
multiuser with critical filesystems read-only is bad):
Add a check_wapbl() which will check some WAPBL values in the superblock,
and try to read the journal via wapbl_replay_start() if there is one.
pfatal() if one of these fails (abort boot if in preen mode,
ask "CONTINUE" otherwise). In non-preen mode the bogus journal will
be cleared.
check_wapbl() is always called if the superblock supports WAPBL.
Even if FS_DOWAPBL is not there, there could be flags asking the
kernel to clear or create a log with bogus values, which would cause the
kernel to refuse to mount the filesystem.
Discussed in
http://mail-index.netbsd.org/tech-kern/2009/08/17/msg005896.html
and followups.
--
If the WAPBL journal can't be read (ffs_wapbl_replay_start() fails),
mount the filesystem anyway if MNT_FORCE is present.
This still allows a system with a corrupted WAPBL on / to boot
single-user, and so gives a chance to run fsck to fix it.
http://mail-index.netbsd.org/tech-kern/2009/08/17/msg005896.html
and followups.
 1.90.2.2  17-Feb-2011  bouyer Move quota2_check_doquota() call so that an unclean, wapbl filesystem
will still be checked if a quota inode needs to be created.
 1.90.2.1  20-Jan-2011  bouyer Snapshot of work in progress on a modernised disk quota system:
- new quotactl syscall (versioned for backward compat), which takes
as parameter a path to a mount point, and a prop_dictionary
(in plistref format) describing commands and arguments.
For each command, status and data are returned as a prop_dictionary.
quota commands features will be added to take advantage of this,
exporting quota data or getting quota commands as plists.

- new on disk-format storage (all 64bit wide), integrated to metadata for
ffs (and playing nicely with wapbl).
Quotas are enabled on a ffs filesystem via superblock flags.
tunefs(8) can enable or disable quotas.
On a quota-enabled filesystem, fsck_ffs(8) will track per-uid/gid
block and inode usages, and will check and update quotas in Pass 6.
Quota usage and limits are stored in unlinked files (one for users,
one for groups); fsck_ffs(8) will create the files if needed, or
free them if needed. This means that after enabling or disabling
quotas on a filesystem, a fsck_ffs(8) run is required.
quotacheck(8) is not needed any more; on an unclean shutdown
fsck or journal replay will take care of fixing quotas.
newfs(8) can create a ready-to-mount quota-enabled filesystem
(superblock flags are set and quota inodes are created).
Other new features or semantic changes:
- default quota data, applied to users or groups which don't already
have a quota entry
- per-user/group grace time (instead of a filesystem global one)
- 0 really means "nothing allowed at all", not "no limit".
If you want "no limit", set the limit to UQUAD_MAX (tools will
understand "unlimited" and "-")

A quota file is structured as follows:
it starts with a header, containing a few per-filesystem values,
and the default quota limits.
Quota entries are linked together as a simple list, each entry has a
pointer (as an offset within the file) to the next.
The header has a pointer to a list of free quota entries, and
a hash table of in-use entries. The size of the hash table depends
on the filesystem block size (header+hash table should fit in the
first block). The file is not sparse and is a multiple of
filesystem block size (when the free quota entry list is empty a new
filesystem block is allocated). Quota entries do not cross
filesystem block boundaries.
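
As an illustration of the "entries linked by file offset" layout described above, here is a minimal sketch in C. The names sketch_qentry and sketch_find, the field sizes, and the use of offset 0 as an end-of-chain marker are all assumptions made for this example; they are not the on-disk quota format or any function from the tree, and byte order is ignored.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical entry layout; names and sizes are illustrative only. */
struct sketch_qentry {
	uint64_t next;	/* file offset of the next entry, 0 = end of chain */
	uint32_t id;	/* uid or gid this entry describes */
	uint32_t pad;	/* keeps the struct a fixed 16 bytes */
};

/*
 * Walk one chain of entries stored in an in-memory copy of the file,
 * starting at offset `off`.  Returns the offset of the entry whose id
 * matches, or 0 if none is found or an offset points outside the file.
 * Entries are read in host byte order (an assumption of this sketch).
 */
uint64_t
sketch_find(const unsigned char *file, uint64_t size, uint64_t off,
    uint32_t id)
{
	struct sketch_qentry e;
	/* Bound the walk so that a corrupted, cyclic chain terminates. */
	uint64_t steps = size / sizeof(e);

	while (off != 0 && steps-- > 0) {
		if (size < sizeof(e) || off > size - sizeof(e))
			return 0;
		memcpy(&e, file + off, sizeof(e));
		if (e.id == id)
			return off;
		off = e.next;
	}
	return 0;
}
```

A hash table of in-use entries, as the text describes, would select the starting offset for such a walk; the step bound is the kind of defensive check a consistency checker might want when chains can be damaged.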

In memory, the kernel keeps a cache of recently used quota entries
as a reference to the block number, and offset within the block.
The quota entry itself is kept in the buf cache.

fsck_ffs(8), tunefs(8) and newfs(8) support is complete (with
related atf tests :)
The kernel can update disk usage and report it via quotactl(2).

Todo: enforce quota limits (limits are not checked by kernel yet)
update repquota, edquota and rpc.rquotad to the new world
implement compat_50_quotactl ioctl.
update quotactl(2) man page

fsck_ffs required fixes so that allocating new blocks or inodes will
properly update the superblock and cg summaries. This was not an issue up
to now because superblock and cg summaries check happened last, but now
allocations or frees can happen in pass 6.
 1.92.2.1  23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.94.2.3  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.94.2.2  23-Jan-2013  yamt sync with head
 1.94.2.1  17-Apr-2012  yamt sync with head
 1.95.6.3  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.95.6.2  23-Jun-2013  tls resync from head
 1.95.6.1  25-Feb-2013  tls resync with head
 1.100.14.1  21-Apr-2017  bouyer Sync with HEAD
 1.100.10.1  20-Mar-2017  pgoyette Sync with HEAD
 1.101.12.2  21-Apr-2020  martin Sync with HEAD
 1.101.12.1  10-Jun-2019  christos Sync with HEAD
 1.101.10.1  20-Oct-2018  pgoyette Sync with head
 1.101.4.1  09-Oct-2018  martin Pull up following revision(s) (requested by hannken in ticket #1051):

sbin/fsck_ffs/setup.c: revision 1.102

Add a test for duplicate inodes on the persistent snapshot list.
 1.102.2.1  12-Jul-2025  martin Pull up following revision(s) (requested by mlelstv in ticket #1964):

sbin/fsck_ffs/setup.c: revision 1.110 (patch)

Don't truncate bitmap size to unsigned int, avoids crashes on filesystems
with more than 2^32 blocks.
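
The class of bug named in this pullup can be sketched as follows. This is only an illustration of how narrowing a 64-bit block count to 32 bits yields a far too small bitmap; the function names are hypothetical, and where exactly the narrowing happened in setup.c is not stated in the log.

```c
#include <stdint.h>

#define SKETCH_NBBY 8	/* bits per byte; hypothetical local name */

/*
 * Buggy pattern: the block count is narrowed to 32 bits before the
 * bitmap size is computed, so the high bits are silently dropped.
 */
uint64_t
bitmap_bytes_narrow(uint64_t nblocks)
{
	uint32_t n = (uint32_t)nblocks;	/* truncation happens here */

	return ((uint64_t)n + SKETCH_NBBY - 1) / SKETCH_NBBY;
}

/* Fixed pattern: keep the full 64-bit width throughout. */
uint64_t
bitmap_bytes_wide(uint64_t nblocks)
{
	return (nblocks + SKETCH_NBBY - 1) / SKETCH_NBBY;
}
```

For a block count of 2^32 + 8, the narrow version computes a 1-byte bitmap while the wide version computes 536870913 bytes; any later access indexed by the real block number would then run far past the undersized allocation.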
 1.104.2.3  12-Jul-2025  martin Pull up following revision(s) (requested by mlelstv in ticket #1135):

sbin/fsck_ffs/setup.c: revision 1.110

Don't truncate bitmap size to unsigned int, avoids crashes on filesystems
with more than 2^32 blocks.
 1.104.2.2  13-May-2023  martin Pull up following revision(s) (requested by chs in ticket #161):

sbin/fsck_ffs/setup.c: revision 1.106
sbin/fsck_ffs/pass5.c: revision 1.57

ufs: more signed/unsigned fixes

Fix the previous signed/unsigned fixes to build on 32-bit,
including applying this commit from FreeBSD:

commit 2d34afcd04207cf3fa3d5b7f467a890eae75da41
Author: Kirk McKusick <mckusick@FreeBSD.org>
Date: Sun Oct 25 21:04:07 2020 +0000
Use proper type (ino_t) for inode numbers to avoid improper sign extension
in the Pass 5 checks. The manifestation was fsck_ffs exiting with this error:
** Phase 5 - Check Cyl groups
fsck_ffs: inoinfo: inumber 18446744071562087424 out of range
The error only manifests itself for filesystems bigger than about 100Tb.
Reported by: Nikita Grechikhin <ngrechikhin at yandex.ru>
MFC after: 2 weeks
Sponsored by: Netflix
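
The sign-extension effect described in the quoted FreeBSD commit can be reproduced with a short sketch. The function names are hypothetical, and the 32-bit input 0x80004C00 is inferred from the number in the error message above rather than taken from the source.

```c
#include <stdint.h>

/*
 * Buggy pattern: a 32-bit inode number passes through a signed 32-bit
 * type before being widened, so bit 31 is copied into the upper half.
 * (Converting an out-of-range value to int32_t is implementation-defined
 * before C23; this sketch assumes the usual two's-complement wraparound.)
 */
uint64_t
ino_sign_extended(uint32_t raw)
{
	int32_t as_signed = (int32_t)raw;

	return (uint64_t)(int64_t)as_signed;
}

/* Fixed pattern: widen as unsigned, so the upper half stays zero. */
uint64_t
ino_zero_extended(uint32_t raw)
{
	return (uint64_t)raw;
}
```

With raw = 0x80004C00, the sign-extended result is 18446744071562087424, matching the inumber printed in the error, while the zero-extended result is 2147503104.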
 1.104.2.1  13-May-2023  martin Pull up following revision(s) (requested by chs in ticket #160):

usr.sbin/makefs/ffs/ffs_alloc.c: revision 1.31
sbin/tunefs/tunefs.c: revision 1.58
sbin/fsck_ffs/setup.c: revision 1.105
sbin/fsck_ffs/pass5.c: revision 1.56
usr.sbin/makefs/ffs.c: revision 1.74
usr.sbin/makefs/ffs/mkfs.c: revision 1.42
usr.sbin/makefs/Makefile: revision 1.40
sys/ufs/ffs/fs.h: revision 1.71
sbin/fsdb/fsdb.c: revision 1.54
sbin/resize_ffs/resize_ffs.c: revision 1.58
sbin/fsck_ffs/pass4.c: revision 1.29
usr.sbin/makefs/ffs/ffs_extern.h: revision 1.9
sbin/newfs/mkfs.c: revision 1.133
sys/ufs/ffs/ffs_alloc.c: revision 1.172
sbin/fsck_ffs/pass1b.c: revision 1.24
usr.sbin/dumpfs/dumpfs.c: revision 1.68
sys/ufs/ffs/ffs_extern.h: revision 1.88
usr.sbin/quotacheck/quotacheck.c: revision 1.51
sys/ufs/ffs/ffs_subr.c: revision 1.54
sbin/fsck_ffs/main.c: revision 1.91
sbin/fsck_ffs/pass1.c: revision 1.63

ufs: fixed signed/unsigned bugs affecting large file systems

Apply these commits from FreeBSD:
commit e870d1e6f97cc73308c11c40684b775bcfa906a2
Author: Kirk McKusick <mckusick@FreeBSD.org>
Date: Wed Feb 10 20:10:35 2010 +0000
This fix corrects a problem in the file system that treats large
inode numbers as negative rather than unsigned. For a default
(16K block) file system, this bug began to show up at a file system
size above about 16Tb.
To fully handle this problem, newfs must be updated to ensure that
it will never create a filesystem with more than 2^32 inodes. That
patch will be forthcoming soon.
Reported by: Scott Burns, John Kilburg, Bruce Evans
Followup by: Jeff Roberson
PR: 133980
MFC after: 2 weeks

commit 81479e688b0f643ffacd3f335b4b4bba460b769d
Author: Kirk McKusick <mckusick@FreeBSD.org>
Date: Thu Feb 11 18:14:53 2010 +0000
One last pass to get all the unsigned comparisons correct.

In addition to the changes from FreeBSD, this commit includes quite a few
related changes to appease -Wsign-compare.