History log of /src/usr.bin/indent
Revision  Date  Author  Comments
 1.7 18-May-2023  rillig indent: switch to standard code style

Taken from share/misc/indent.pro.

Indent does not wrap code to fit into the line width, it only does so
for comments. The 'INDENT OFF' sections and too long lines will be
addressed in a follow-up commit.

No functional change.
 1.6 16-May-2023  rillig indent: allow comments in column 1 to be formatted
 1.5 15-May-2023  rillig indent: clean up local indentation profile

The -eei option now works, the type hints are no longer necessary.
 1.4 15-May-2023  rillig indent: let indent format its own code

With manual corrections, as indent does not properly indent multi-line
'?:' expressions nor multi-line controlling expressions.
 1.3 26-Oct-2021  rillig indent: run indent on its own source code

With manual corrections afterwards, to compensate for the remaining bugs
in indent.

Without the type definitions in .indent.pro, the opening braces of the
functions kw_name and lexi_alnum would not be at the beginning of the
line.
 1.2 05-Oct-2021  rillig indent: run indent on lexi.c, with manual corrections

The variables 'keywords' and 'typenames' were indented using 8 spaces,
even though -di0 was in effect, which should result in a single space,
and -ut was in effect, which should result in a single tab instead of 8
spaces.

The option -eei does not work as advertised, the controlling expressions
are only indented by the normal amount, which easily leads to confusion
as to whether the code belongs to the condition or the following
statement.
 1.1 26-Sep-2021  rillig indent: add .indent.pro that almost matches the source code

One might expect that the code of indent is properly indented according
to its own capabilities, but that's not the case, there are many
deviations.

This indentation profile comes close to the existing code. Maybe someday
indent's own source code can be formatted using this profile, but before
attempting that, its remaining bugs have to be fixed.

Development of indent has essentially stopped somewhere around 1990, as
demonstrated by the wrong formatting of '...' that has only been fixed a
few minutes ago. The '...' is an invention of C90. Indent's parser still
considers '...' as consisting of the 3 tokens period-period-period, but
that's OK since the effect is the same.

Another feature that had been missing for a long time was C99 comments
that span from '//' to the next newline. Before March 2021, these were
parsed as a binary operator, which produced lots of funny side effects.

Since indent's code makes use of several C99 features, as soon as it can
properly indent its own code, the worst of these bugs will have been
fixed.
 1.15 13-May-2023  rillig indent: move debugging code to separate file

No functional change.
 1.14 07-Oct-2021  rillig indent: raise WARNS from the default 5 up to 6
 1.13 25-Sep-2021  rillig indent: prepare for lint's strict bool mode

Before C99, C had no boolean type. Instead, indent used int for that,
just like many other programs. Even with C99, bool and int can be used
interchangeably in many situations, such as querying '!i' or '!ptr' or
'cond == 0'.

Since January 2021, lint provides the strict bool mode, which makes bool
a non-arithmetic type that is incompatible with any other type. Having
clearly separate types helps in understanding the code.

To migrate indent to strict bool mode, the first step is to apply all
changes that keep the resulting binary the same. Since sizeof(bool) is
1 and sizeof(int) is 4, the type ibool serves as an intermediate type.
For now it is defined to int, later it will become bool.

The current code compiles cleanly in C99 and C11 mode, as well as in
lint's strict bool mode. There are a few tricky places:

In args.c in 'struct pro', there are two types of options: boolean and
integer. Boolean options point to a bool variable, integer options
point to an int variable. To keep the current structure of the code,
the pointer has been changed to 'void *'. To ensure type safety, the
definition of the options is done via preprocessor magic, which in C11
mode ensures the correct pointer types. (Add CFLAGS+=-std=gnu11 at the
very bottom of the Makefile.)

In indent.c in process_preprocessing, a boolean variable is
post-incremented. That variable is only assigned to another variable,
and that variable is only used in a boolean context. To provoke a
different behavior between the '++' and the '= true', the source code
to be indented would need 1 << 32 preprocessing directives, which is
unlikely to happen in practice.

In io.c in dump_line, the variables ps.in_stmt and ps.in_decl only ever
get the values 0 and 1. For these values, the expressions 'a & ~b' and
'a && !b' are equivalent, in all versions of C. The compiler may
generate different code for them, though.

In io.c in parse_indent_comment, the assignment to inhibit_formatting
takes place in integer context. If the compiler is smart enough to
detect the possible values of on_off, it may generate the same code
before and after the change, but that is rather unlikely.

The second step of the migration will be to replace ibool with bool,
step by step, just in case there are any hidden gotchas in the code,
such as sizeof or pointer casts.

No change to the resulting binary.
 1.12 26-Mar-2021  rillig indent: remove workaround for array initialization bug in lint

The bug has been fixed in init.c 1.133 from 2021-03-25.
 1.11 14-Mar-2021  rillig indent: fix lint warnings

No functional change.
 1.10 12-Mar-2021  rillig indent: add helper functions for doing the actual output

This allows adding debug logging to these few functions instead of to
all the other places that might output something.

Reducing the possible output formats to a few primitives makes dump_line
simpler, especially the fprintf calls. It also removes the non-constant
printf string.

The call to output_int may be meant for debugging, as the character 0x80
is unlikely to appear in any real-world code.

No functional change.
 1.9 08-Mar-2021  rillig indent: make it easy to compile indent in debug mode
 1.8 07-Mar-2021  rillig indent: for the token types, use enum instead of #define

This makes it easier to step through the code in a debugger.

No functional change.
 1.7 04-Apr-2019  kamil Upgrade indent(1)

Merge all the changes from the recent FreeBSD HEAD snapshot
into our local copy.

FreeBSD actively maintains this program in their sources and their
repository contains over 100 commits with changes.

Keep the delta between the FreeBSD and NetBSD versions to an absolute
minimum, mostly RCS Id and compatibility fixes.

Major changes in this import:

- Added an option -ldi<N> to control indentation of local variable names.
- Added option -P for loading user-provided files as profiles
- Added -tsn for setting tabsize
- Rename -nsac/-sac ("space after cast") to -ncs/-cs
- Added option -fbs, which enables (disables) splitting the function declaration and opening brace across two lines.
- Respect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)
- Group global option variables into an options structure
- Use bsearch() for looking up type keywords.
- Don't produce unneeded space character in function declarators
- Don't unnecessarily add a blank before a comment ends.
- Don't ignore newlines after comments that follow braces.

Merge the FreeBSD indent(1) tests with our ATF framework.
All tests pass.

Upgrade prepared by Manikishan Ghantasala.
Final polishing by myself.
 1.6 08-Oct-2006  peter branches: 1.6.82;
WFORMAT is no more...
 1.5 11-Oct-2000  is More format string cleanup by sommerfeld.
 1.4 18-Oct-1997  mrg branches: 1.4.4; 1.4.12;
merge lite-2.
 1.3 09-Jan-1997  tls RCS ID police
 1.2 31-Jul-1993  mycroft Add RCS identifiers.
 1.1 09-Apr-1993  cgd branches: 1.1.1;
added, from net/2 (patch 124).
 1.1.1.2 04-Apr-2019  kamil FreeBSD indent r340138
 1.1.1.1 06-Jun-1993  mrg 4.4BSD-Lite2
 1.4.12.1 18-Oct-2000  tv Pullup usr.bin string format fixes [is].
See "cvs log" for explicit revision numbers per file, from sommerfeld.
 1.4.4.1 19-Oct-2000  he Pull up revision 1.5 (requested by he):
Format string cleanup.
 1.6.82.1 10-Jun-2019  christos Sync with HEAD
 1.4 18-Nov-2021  rillig indent: replace old license discussion with a brief history section
 1.3 26-Mar-2021  rillig indent: don't claim that indent is "the nicest C pretty printer around"

That statement may have been true in 1993, but definitely is not true
anymore, as of 2021.

The part about "needs to be completely redone" is still true though
since indent cannot even format its own source code in an acceptable
way.
 1.2 04-Apr-2019  kamil Upgrade indent(1)

Merge all the changes from the recent FreeBSD HEAD snapshot
into our local copy.

FreeBSD actively maintains this program in their sources and their
repository contains over 100 commits with changes.

Keep the delta between the FreeBSD and NetBSD versions to an absolute
minimum, mostly RCS Id and compatibility fixes.

Major changes in this import:

- Added an option -ldi<N> to control indentation of local variable names.
- Added option -P for loading user-provided files as profiles
- Added -tsn for setting tabsize
- Rename -nsac/-sac ("space after cast") to -ncs/-cs
- Added option -fbs, which enables (disables) splitting the function declaration and opening brace across two lines.
- Respect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)
- Group global option variables into an options structure
- Use bsearch() for looking up type keywords.
- Don't produce unneeded space character in function declarators
- Don't unnecessarily add a blank before a comment ends.
- Don't ignore newlines after comments that follow braces.

Merge the FreeBSD indent(1) tests with our ATF framework.
All tests pass.

Upgrade prepared by Manikishan Ghantasala.
Final polishing by myself.
 1.1 09-Apr-1993  cgd branches: 1.1.1; 1.1.120;
added, from net/2 (patch 124).
 1.1.120.1 10-Jun-2019  christos Sync with HEAD
 1.1.1.2 04-Apr-2019  kamil FreeBSD indent r340138
 1.1.1.1 06-Jun-1993  mrg 4.4BSD-Lite2
 1.3 04-Jun-2023  rillig indent: remove trailing whitespace from README
 1.2 27-Nov-2021  rillig indent: reword README
 1.1 18-Nov-2021  rillig indent: replace old license discussion with a brief history section
 1.88 12-Dec-2024  rillig indent: add error handling for I/O errors

Suggested by lint2.
 1.87 10-Dec-2023  rillig branches: 1.87.2;
indent: be strict about options from profile files

Previously, the "option" 'xdi0' was treated the same as '-xdi0'.
 1.86 03-Dec-2023  rillig indent: inline input-related macros

No binary change.
 1.85 15-Jun-2023  rillig indent: miscellaneous cleanups, more tests for edge cases
 1.84 14-Jun-2023  rillig indent: reduce number of relocations

Since all command line options modify a member of struct options, there
is no need to encode that relocation 38 times.

No functional change.
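
The relocation argument above can be sketched as follows. This is a minimal illustration, not the code from args.c: the member names, the table layout and the helper set_option are assumptions. Because every option variable lives in one struct, the table can store a small offset instead of a pointer, and the option name is kept as a char array for the same reason.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct options {
    bool extra_expr_indent;     /* hypothetical member names */
    int tabsize;
    int decl_indent;
};

struct options opt;

/* An offset needs no relocation entry, a pointer does. */
struct pro {
    const char name[5];         /* array instead of pointer */
    bool is_bool;
    unsigned short opt_offset;
};

const struct pro pro_table[] = {
    { "eei", true, offsetof(struct options, extra_expr_indent) },
    { "ts", false, offsetof(struct options, tabsize) },
    { "di", false, offsetof(struct options, decl_indent) },
};

/* Stores 'value' in the member of 'opt' that the table entry refers to. */
void
set_option(const struct pro *p, int value)
{
    assert(p->opt_offset < sizeof opt);
    void *var = (char *)&opt + p->opt_offset;

    if (p->is_bool)
        *(bool *)var = value != 0;
    else
        *(int *)var = value;
}
```

Calling set_option(&pro_table[1], 4) sets opt.tabsize to 4 without the table ever holding the address of opt.tabsize.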
 1.83 10-Jun-2023  rillig indent: miscellaneous cleanups
 1.82 05-Jun-2023  rillig indent: do not report broken lines, report configuration on stderr
 1.81 05-Jun-2023  rillig indent: rename variables, clean up comments

No binary change.
 1.80 18-May-2023  rillig indent: rename a few functions

No functional change.
 1.79 18-May-2023  rillig indent: manually wrap overly long lines

No functional change.
 1.78 18-May-2023  rillig indent: switch to standard code style

Taken from share/misc/indent.pro.

Indent does not wrap code to fit into the line width, it only does so
for comments. The 'INDENT OFF' sections and too long lines will be
addressed in a follow-up commit.

No functional change.
 1.77 14-May-2023  rillig indent: remove foreign RCS IDs
 1.76 14-May-2023  rillig indent: miscellaneous cleanups
 1.75 13-May-2023  rillig indent: improve names of option variables

No functional change.
 1.74 13-May-2023  rillig indent: don't try to read from the file '(null)/.indent.pro'
 1.73 20-Jan-2023  rillig indent: fix misleading comment
 1.72 25-Nov-2021  rillig indent: make error message for missing command line arguments clearer
 1.71 19-Nov-2021  rillig indent: reduce casts to unsigned char for character classification

No functional change.
 1.70 07-Nov-2021  rillig indent: parse special options strictly
 1.69 05-Nov-2021  rillig indent: clean up argument parsing

In struct pro, place the dependent member below its dependency.

In load_profile, consistently use -1 when outside a comment.

No functional change.
 1.68 31-Oct-2021  rillig indent: clean up

Initialize buffers in reading order, make comments more expressive,
rename add_typename to register_typename, remove unused macro.

No functional change.
 1.67 29-Oct-2021  rillig indent: parse options in a platform-independent way

Previously, on an ILP32 platform, the option '-ts30000000000000000'
resulted in the error message 'must be an integer', on LP64 platforms it
resulted in the error message 'must be between 1 and 80'. Remove this
unnecessary difference.
 1.66 28-Oct-2021  rillig indent: clean up indentation, comments, reduce

No functional change.
 1.65 28-Oct-2021  rillig indent: fix error message for buffer overflow during option parsing

At this early time, the input file has not been opened yet, so there is
no reason to output either the input file name or the line number.
 1.64 28-Oct-2021  rillig indent: make error messages for option parsing more precise
 1.63 28-Oct-2021  rillig indent: parse option '-cli' strictly
 1.62 28-Oct-2021  rillig indent: topologically sort functions

No functional change.
 1.61 28-Oct-2021  rillig indent: change product name, update version number

NetBSD's indent has deviated enough from FreeBSD's indent to warrant a
different product name. When indent was copied from FreeBSD in 2019,
that update introduced several new bugs, some of which have been fixed
in the NetBSD version.

NetBSD indent, unlike FreeBSD indent, supports C99 comments and C99
initializer designators.
 1.60 26-Oct-2021  rillig indent: run indent on its own source code

With manual corrections afterwards, to compensate for the remaining bugs
in indent.

Without the type definitions in .indent.pro, the opening braces of the
functions kw_name and lexi_alnum would not be at the beginning of the
line.
 1.59 24-Oct-2021  rillig indent: run indent on its own source code

With manual corrections afterwards. Indent still does not get
extra_expr_indent correctly, it also indents global variables after
tagged declarations too deep.

No functional change.
 1.58 24-Oct-2021  rillig indent: rename nitems to array_length
 1.57 24-Oct-2021  rillig indent: sort includes
 1.56 17-Oct-2021  rillig indent: parse int command line options strictly

On i386 and other platforms where LONG_MAX == INT_MAX, the test
t_errors/option_tabsize_very_large failed since the behavior on integer
overflow differs between ILP32 and LP64 platforms. Noticed by gson@.

Avoid this unintended difference by adding reasonable limits for each of
the integer options and by replacing atoi with strtol.
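
A minimal sketch of strict integer parsing with strtol and explicit limits, assuming a hypothetical helper parse_int_option; the actual function in args.c may look different. Treating ERANGE the same as "outside the limits" gives the same result on ILP32 and LP64.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Parses a decimal option argument.  Returns true and stores the value
 * only if the whole string is a number in the range [min, max].
 */
bool
parse_int_option(const char *arg, int min, int max, int *out)
{
    char *end;
    long num;

    assert(arg != NULL && out != NULL && min <= max);
    errno = 0;
    num = strtol(arg, &end, 10);
    if (end == arg || *end != '\0')
        return false;           /* empty or trailing garbage */
    if (errno == ERANGE || num < min || num > max)
        return false;           /* out of range on every platform */
    *out = (int)num;
    return true;
}
```

With limits 1 and 80, the argument "30000000000000000" is rejected for the same reason whether or not it fits into a long.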
 1.55 13-Oct-2021  rillig indent: check command line options more strictly

Previously, bool options were allowed to have trailing garbage. For
example, the option '-bacc' could be spelled '-bacchus' as well.

Check that the exact option name is given in the command line, to
prevent typos in the configuration files and to reduce surprises just in
case a future option is a prefix of an existing option, or vice versa.

Add a new test program for error handling. Most of these tests are so
simple that it would be overkill to create 3 files for each test.
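
The difference between the two matching rules can be sketched like this. Both function names are hypothetical, and the prefix comparison is only an assumption about how the old behavior could come about, not a copy of the previous code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Lenient rule (sketch): anything that starts with the name is accepted. */
bool
matches_prefix(const char *arg, const char *name)
{
    return strncmp(arg, name, strlen(name)) == 0;
}

/* Strict rule: the argument must be exactly the option name. */
bool
matches_exact(const char *arg, const char *name)
{
    return strcmp(arg, name) == 0;
}

void
demo_option_matching(void)
{
    assert(matches_prefix("bacchus", "bacc"));  /* typo slips through */
    assert(!matches_exact("bacchus", "bacc"));  /* typo is rejected */
    assert(matches_exact("bacc", "bacc"));
}
```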
 1.54 08-Oct-2021  rillig indent: clean up argument handling

Sort the macros, remove redundancy from comment.

Remove redundant lint comment. Lint still does not recognize
__attribute__((__noreturn__)), but it also doesn't perform advanced
control flow analysis, so there is no point in having the comment, as it
doesn't suppress any warnings.

No functional change.
 1.53 08-Oct-2021  rillig indent: unexport add_typedefs_from_file

No functional change.
 1.52 08-Oct-2021  rillig indent: run indent on indent.h

The formatting looks mostly OK.

Some struct members had excessively long names, leaving no space for
their corresponding comments. Renamed some of them using well-known
abbreviations.

The formatting for debug_vis_range is messed up, no idea why. It is
clearly a function declaration, not a function definition, so there is
no need to place the function name in column 1.

No functional change.
 1.51 07-Oct-2021  rillig indent: rename opt.btype_2 to brace_same_line

No functional change.
 1.50 07-Oct-2021  rillig indent: fix wrong or outdated comments

No functional change.
 1.49 07-Oct-2021  rillig indent: remove global variable option_source

It is only needed at startup, while parsing the options. The string "?"
was not needed at all.

No functional change.
 1.48 07-Oct-2021  rillig indent: raise WARNS from the default 5 up to 6
 1.47 07-Oct-2021  rillig tests/indent: test parsing of command line options in profile file
 1.46 07-Oct-2021  rillig indent: complain if the profile from the command line does not exist
 1.45 07-Oct-2021  rillig indent: allow long comments in profile files

When reading a comment in a profile file, don't store the characters of
the comment in the buffer, just skip them. This allows for long comments
without triggering overflow errors.
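
A minimal sketch of the idea, working on a string instead of a file to keep it short. The function name and the buffer handling are assumptions; only the principle is taken from the message above: characters inside a comment are skipped and never count against the buffer size.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copies 'src' to 'buf', dropping block comments.  Returns the number
 * of bytes stored, or -1 if the remaining text does not fit.
 */
int
strip_profile_comments(const char *src, char *buf, size_t bufsize)
{
    size_t len = 0;
    int in_comment = 0;

    assert(src != NULL && buf != NULL);
    if (bufsize == 0)
        return -1;
    for (const char *p = src; *p != '\0'; p++) {
        if (in_comment) {
            if (p[0] == '*' && p[1] == '/') {
                in_comment = 0;
                p++;            /* skip the '/' as well */
            }
            continue;           /* comment text is not stored */
        }
        if (p[0] == '/' && p[1] == '*') {
            in_comment = 1;
            p++;                /* skip the '*' as well */
            continue;
        }
        if (len + 1 >= bufsize)
            return -1;          /* only real text can overflow */
        buf[len++] = *p;
    }
    buf[len] = '\0';
    return (int)len;
}
```

A comment much longer than the buffer is accepted, while an overly long option word is still reported as an overflow.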
 1.44 07-Oct-2021  rillig indent: prevent buffer overflow when reading profile
 1.43 03-Oct-2021  rillig indent: clean up load_profile

No functional change.
 1.42 03-Oct-2021  rillig indent: reduce duplicate code in load_profiles

No functional change.
 1.41 03-Oct-2021  rillig indent: rename functions

There was no good reason for using the different verbs 'scan' and 'set'
for two functions that essentially do the same.

No functional change.
 1.40 03-Oct-2021  rillig indent: fix content of profile_name

Previously, profile_name included the leading "-P", which was confusing.
 1.39 26-Sep-2021  rillig indent: unexport keyword table, clean up

No functional change.
 1.38 26-Sep-2021  rillig indent: force all option variables to be in struct options

No functional change.
 1.37 26-Sep-2021  rillig indent: reduce memory usage of the options table

Almost all boolean options are negatable, so model this directly instead
of saving each option twice. This saves memory, is faster and more
directly models reality.

No functional change.
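
One way to model negatable options directly is to list each option once and derive the negated form by stripping a leading 'n'. The sketch below assumes a hypothetical table and variable names; it uses the option pairs -ce/-nce and -ut/-nut as examples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct bool_option {
    const char *name;           /* positive form, without leading '-' */
    bool *var;
};

bool opt_cuddle_else;           /* hypothetical option variables */
bool opt_use_tabs = true;

const struct bool_option bool_options[] = {
    { "ce", &opt_cuddle_else },
    { "ut", &opt_use_tabs },
};

/* Applies 'arg' (given without the leading '-'); false if unknown. */
bool
set_bool_option(const char *arg)
{
    size_t n = sizeof bool_options / sizeof bool_options[0];

    assert(arg != NULL);
    for (size_t i = 0; i < n; i++) {
        const struct bool_option *o = &bool_options[i];

        if (strcmp(arg, o->name) == 0) {
            *o->var = true;
            return true;
        }
        if (arg[0] == 'n' && strcmp(arg + 1, o->name) == 0) {
            *o->var = false;
            return true;
        }
    }
    return false;
}
```

The table has one entry per option instead of two, at the cost of one extra comparison per entry.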
 1.36 26-Sep-2021  rillig indent: list options in the same order as in the manual page

No functional change.
 1.35 26-Sep-2021  rillig indent: reduce code for listing the options

After this change, the few options that do not follow the standard
scheme become more visible. They are '-bl', '-br' and '-ta'.

No functional change.
 1.34 26-Sep-2021  rillig indent: negate and rename option.leave_comma

The old name did not mirror the description in the manual page, and it
was the only option that is negated. Inverting it allows the options
table to be compressed.
 1.33 26-Sep-2021  rillig indent: let indent format its own code -- in supervised mode

After running indent on the code, I manually selected each change that
now looks better than before. The remaining changes are left for later.
All in all, indent did a pretty good job, except for syntactic additions
from after 1990, but that was to be expected. Examples for such
additions are GCC's __attribute__ and C99 designated initializers.

Indent has only few knobs to tune the indentation. The knob for the
continuation indentation applies to function declarations as well as to
expressions. The knob for indentation of local variable declarations
applies to struct members as well, even if these are members of a
top-level struct.

Several code comments crossed the right margin in column 78. Several
other code comments were correctly broken though. The cause for this
difference was not obvious.

No functional change.
 1.32 26-Sep-2021  rillig indent: handle special options separately

Handling the special options separately removes the need for several
macro definitions. It saves a bit of memory since without the option
'--version', the option names are shorter.

No functional change.
 1.31 25-Sep-2021  rillig indent: reduce abstraction layer for defining boolean options

When initializing a boolean option, the most natural values are true and
false. Replace the previous values ON and OFF with them.

No functional change.
 1.30 25-Sep-2021  rillig indent: clean up argument handling

No functional change.
 1.29 25-Sep-2021  rillig indent: clean up argument handling

No functional change.
 1.28 25-Sep-2021  rillig indent: reduce binary size

No functional change.
 1.27 25-Sep-2021  rillig indent: rename option variable to be more expressive

No functional change.
 1.26 25-Sep-2021  rillig indent: convert options from ibool to bool

No functional change intended.
 1.25 25-Sep-2021  rillig indent: prepare for lint's strict bool mode

Before C99, C had no boolean type. Instead, indent used int for that,
just like many other programs. Even with C99, bool and int can be used
interchangeably in many situations, such as querying '!i' or '!ptr' or
'cond == 0'.

Since January 2021, lint provides the strict bool mode, which makes bool
a non-arithmetic type that is incompatible with any other type. Having
clearly separate types helps in understanding the code.

To migrate indent to strict bool mode, the first step is to apply all
changes that keep the resulting binary the same. Since sizeof(bool) is
1 and sizeof(int) is 4, the type ibool serves as an intermediate type.
For now it is defined to int, later it will become bool.

The current code compiles cleanly in C99 and C11 mode, as well as in
lint's strict bool mode. There are a few tricky places:

In args.c in 'struct pro', there are two types of options: boolean and
integer. Boolean options point to a bool variable, integer options
point to an int variable. To keep the current structure of the code,
the pointer has been changed to 'void *'. To ensure type safety, the
definition of the options is done via preprocessor magic, which in C11
mode ensures the correct pointer types. (Add CFLAGS+=-std=gnu11 at the
very bottom of the Makefile.)
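
The "preprocessor magic" is not shown in this log. One way to get such a check in C11 is a _Generic selection that has no association for the wrong pointer type, as in this sketch; the macro names, variable names and table layout are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pro {
    const char *name;
    bool is_bool;
    void *var;          /* points to a bool or an int, see is_bool */
};

/* A pointer of the wrong type has no association and fails to compile. */
#define bool_option(name, var) \
    { name, true, _Generic((var), bool *: (var)) }
#define int_option(name, var) \
    { name, false, _Generic((var), int *: (var)) }

bool opt_extra_expr_indent;     /* hypothetical option variables */
int opt_tabsize = 8;

const struct pro pro_table[] = {
    bool_option("eei", &opt_extra_expr_indent),
    int_option("ts", &opt_tabsize),
    /* int_option("ts", &opt_extra_expr_indent) would be rejected */
};

void
set_option(const struct pro *p, int value)
{
    assert(p != NULL && p->var != NULL);
    if (p->is_bool)
        *(bool *)p->var = value != 0;
    else
        *(int *)p->var = value;
}
```

The 'void *' member keeps the table uniform, while the macros restore the type check that the 'void *' would otherwise lose.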

In indent.c in process_preprocessing, a boolean variable is
post-incremented. That variable is only assigned to another variable,
and that variable is only used in a boolean context. To provoke a
different behavior between the '++' and the '= true', the source code
to be indented would need 1 << 32 preprocessing directives, which is
unlikely to happen in practice.
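
The wrap-around can be demonstrated on a smaller scale. This sketch replaces the 32-bit int with an 8-bit unsigned counter, so the flag falls back to "false" after 256 increments instead of 1 << 32; the function names are made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A counter used as a flag: wraps around to 0 after 256 increments. */
bool
flag_after_increments(unsigned n)
{
    uint8_t flag = 0;

    for (unsigned i = 0; i < n; i++)
        flag++;
    return flag != 0;
}

/* A real boolean flag: stays true no matter how often it is set. */
bool
flag_after_assignments(unsigned n)
{
    bool flag = false;

    for (unsigned i = 0; i < n; i++)
        flag = true;
    return flag;
}

void
demo_wraparound(void)
{
    assert(flag_after_increments(255));
    assert(!flag_after_increments(256));    /* wrapped around */
    assert(flag_after_assignments(256));
}
```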

In io.c in dump_line, the variables ps.in_stmt and ps.in_decl only ever
get the values 0 and 1. For these values, the expressions 'a & ~b' and
'a && !b' are equivalent, in all versions of C. The compiler may
generate different code for them, though.
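
The claimed equivalence is small enough to check exhaustively. The helper names below are hypothetical; the check covers only the values 0 and 1, since the two expressions do differ for other values.

```c
#include <assert.h>
#include <stdbool.h>

int
bitwise_form(int a, int b)
{
    return a & ~b;
}

bool
logical_form(bool a, bool b)
{
    return a && !b;
}

/* Verifies that both forms agree for every combination of 0 and 1. */
void
check_forms_agree_on_01(void)
{
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            assert((bitwise_form(a, b) != 0) == logical_form(a, b));

    /* Outside of 0 and 1, the forms disagree: 2 & ~1 is 2, 2 && !1 is 0. */
    assert(bitwise_form(2, 1) != 0);
    assert(!logical_form(2, 1));
}
```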

In io.c in parse_indent_comment, the assignment to inhibit_formatting
takes place in integer context. If the compiler is smart enough to
detect the possible values of on_off, it may generate the same code
before and after the change, but that is rather unlikely.

The second step of the migration will be to replace ibool with bool,
step by step, just in case there are any hidden gotchas in the code,
such as sizeof or pointer casts.

No change to the resulting binary.
 1.24 25-Sep-2021  rillig indent: clean up initialization of options

The default values in 'struct pro' were redundant but all consistent,
even with the commented defaults in main_parse_command_line.

No functional change.
 1.23 25-Sep-2021  rillig indent: remove ifdef for lint

NetBSD lint does not need them anymore, FreeBSD does not have lint.
 1.22 14-Mar-2021  rillig indent: fix lint warnings

No functional change.
 1.21 13-Mar-2021  rillig indent: distinguish between 'column' and 'indentation'

column == 1 + indentation.

In addition, indentation is a relative distance while column is an
absolute position. Therefore, don't confuse these two concepts, to
prevent off-by-one errors.

No functional change.
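
The relation between the two concepts, written as code. The function names are hypothetical; the point is that the conversion happens in exactly one place, and that tab arithmetic is simpler on the 0-based indentation.

```c
#include <assert.h>

/* Indentation is a 0-based distance, a column is a 1-based position. */
int
column_from_indentation(int ind)
{
    return 1 + ind;
}

/* Indentation reached by a tab that starts at indentation 'ind'. */
int
next_tab_indentation(int ind, int tabsize)
{
    assert(ind >= 0 && tabsize > 0);
    return ind - ind % tabsize + tabsize;
}
```

For example, with a tab size of 8, a tab at indentation 3 leads to indentation 8, which is column 9.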
 1.20 13-Mar-2021  rillig indent: fix confusing variable names

The word 'col' should only be used for the 1-based column number. This
name is completely inappropriate for a line length since that provokes
off-by-one errors. The name 'cols' would be acceptable although
confusing since it sounds so similar to 'col'.

Therefore, rename variables that are related to the maximum line length
to 'line_length' since that makes for obvious code and nicely relates to
the description of the option in the manual page.

No functional change.
 1.19 12-Mar-2021  rillig indent: use consistent indentation for 'else'

Half of the code used -ce, the other half the opposite -nce.

No functional change.
 1.18 09-Mar-2021  rillig indent: manually indent comments

It's strange that indent's own code is not formatted by indent itself,
which would be a good demonstration of its capabilities.

In its current state, I don't trust indent to get even the tokenization
correct, therefore the only safe way is to format the code manually.
 1.17 07-Mar-2021  rillig indent: sprinkle a few const

No functional change.
 1.16 07-Mar-2021  rillig indent: remove redundant parentheses around return value

No functional change.
 1.15 07-Mar-2021  rillig indent: use all headers in all files

This is a prerequisite for converting the token types to an enum instead
of a preprocessor define, since the return type of lexi will become
token_type. Having the enum will make debugging easier.

There was a single naming collision, which forced the variable in
scan_profile to be renamed. All other token names are used nowhere
else.

No change to the resulting binary.
 1.14 04-Apr-2019  kamil Upgrade indent(1)

Merge all the changes from the recent FreeBSD HEAD snapshot
into our local copy.

FreeBSD actively maintains this program in their sources and their
repository contains over 100 commits with changes.

Keep the delta between the FreeBSD and NetBSD versions to an absolute
minimum, mostly RCS Id and compatibility fixes.

Major changes in this import:

- Added an option -ldi<N> to control indentation of local variable names.
- Added option -P for loading user-provided files as profiles
- Added -tsn for setting tabsize
- Rename -nsac/-sac ("space after cast") to -ncs/-cs
- Added option -fbs, which enables (disables) splitting the function declaration and opening brace across two lines.
- Respect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)
- Group global option variables into an options structure
- Use bsearch() for looking up type keywords.
- Don't produce unneeded space character in function declarators
- Don't unnecessarily add a blank before a comment ends.
- Don't ignore newlines after comments that follow braces.

Merge the FreeBSD indent(1) tests with our ATF framework.
All tests pass.

Upgrade prepared by Manikishan Ghantasala.
Final polishing by myself.
 1.13 22-Feb-2016  ginsbach branches: 1.13.16;
Stray '\n' in errx(3) format.
 1.12 22-Feb-2016  ginsbach Use errx(3).
 1.11 04-Sep-2014  mrg port the -ut / -nut options from freebsd. -ut (default) enables tabs
in output, the -nut uses spaces.
 1.10 12-Apr-2009  lukem branches: 1.10.24;
Fix WARNS=4 issues (-Wshadow -Wcast-qual -Wsign-compare)
 1.9 07-Aug-2003  agc branches: 1.9.42;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22365, verified by myself.
 1.8 14-Jul-2003  itojun use bounded string op
 1.7 26-May-2002  wiz Remove #ifndef'd __STDC__ code. ANSIfy.
 1.6 19-Dec-1998  christos char -> unsigned char, braces for gcc-2.8.1
 1.5 19-Oct-1997  lukem WARNSify, fix .Nm usage, deprecate register, use <err.h>, KNFify (with indent!;)
 1.4 18-Oct-1997  mrg merge lite-2.
 1.3 09-Jan-1997  tls RCS ID police
 1.2 01-Aug-1993  mycroft Add RCS identifiers.
 1.1 09-Apr-1993  cgd branches: 1.1.1;
added, from net/2 (patch 124).
 1.1.1.2 04-Apr-2019  kamil FreeBSD indent r340138
 1.1.1.1 06-Jun-1993  mrg 4.4BSD-Lite2
 1.9.42.1 13-May-2009  jym Sync with HEAD.

Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
 1.10.24.1 21-Sep-2014  snj Pull up following revision(s) (requested by mrg in ticket #110):
usr.bin/indent/io.c: revision 1.15
usr.bin/indent/indent_globs.h: revision 1.10
usr.bin/indent/args.c: revision 1.11
usr.bin/indent/indent.1: revision 1.23
usr.bin/indent/indent.c: revision 1.19
port the -ut / -nut options from freebsd. -ut (default) enables tabs
in output, the -nut uses spaces.
 1.13.16.1 10-Jun-2019  christos Sync with HEAD
 1.87.2.1 02-Aug-2025  perseant Sync with HEAD
 1.74 04-Jan-2025  rillig indent: make debug log more uniform
 1.73 04-Jan-2025  rillig indent: make debug output easier to read

The previous format had the values of the parser state on the left side
and the corresponding names on the right side. While it looked nicely
aligned, it was not suitable for focusing on the actual data. Replace
this format with the more common "key: value" format.

Use the names of the enum constants in the debug log, instead of the
previous "nice" names that needed one more level of mental translation
and in some cases contained unbalanced punctuation such as '{'.
 1.72 03-Jan-2025  rillig indent: fix line breaks in else-if sequences

The flag ps.want_newline did not adequately model the conditions under
which a line break should be inserted, thus the redesign.

A welcome side effect is that in statements like 'if (cond);', the
semicolon is now placed on a separate line, thus becoming more visible.
 1.71 12-Dec-2024  rillig indent: add error handling for I/O errors

Suggested by lint2.
 1.70 27-Jun-2023  rillig branches: 1.70.2;
indent: fix 'blank line above first statement in function body'
 1.69 26-Jun-2023  rillig indent: implement 'blank line above first statement in function body'
 1.68 23-Jun-2023  rillig indent: properly store parser state in debug mode

The stacks in the parser state are allocated now and need to be copied
individually.

The test whether two paren stacks are equal was broken since 2023-06-14
14:11:28.
 1.67 17-Jun-2023  rillig indent: miscellaneous cleanups

No binary change.
 1.66 16-Jun-2023  rillig indent: merge lexer symbols for type in/outside parentheses
 1.65 16-Jun-2023  rillig indent: add debug output for typedef declarations
 1.64 16-Jun-2023  rillig indent: don't force a blank line between '}' and preprocessing line
 1.63 16-Jun-2023  rillig indent: rename a field of the parser state

The previous name 'comment_in_first_line' was misleading, as it could
mean that there was a comment in the first line of the file.

No functional change.
 1.62 15-Jun-2023  rillig indent: rename state variable to be more accurate

No binary change.
 1.61 14-Jun-2023  rillig indent: clean up the code, add a few tests
 1.60 14-Jun-2023  rillig indent: clean up array indexing for parser symbols

With 'top' pointing to the actual top element, the array was indexed in
the closed range from 0 to top. All other arrays are indexed by the
usual half-open interval from 0 to len.

No functional change.
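
A sketch of the half-open convention for a small stack. The struct and function names are assumptions, not the parser's actual types: 'len' counts the elements, the valid indexes are 0 to len - 1, and the top element is item[len - 1].

```c
#include <assert.h>
#include <stddef.h>

enum { STACK_CAP = 8 };         /* hypothetical capacity */

struct sym_stack {
    int item[STACK_CAP];
    size_t len;                 /* valid elements: item[0 .. len - 1] */
};

/* Returns 1 on success, 0 if the stack is full. */
int
stack_push(struct sym_stack *st, int sym)
{
    if (st->len == STACK_CAP)
        return 0;
    st->item[st->len++] = sym;
    return 1;
}

int
stack_top(const struct sym_stack *st)
{
    assert(st->len > 0);
    return st->item[st->len - 1];
}
```

An empty stack is simply len == 0, which a 'top' index in a closed range cannot express without a sentinel value.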
 1.59 14-Jun-2023  rillig indent: allow more than 20 nested parentheses or brackets
 1.58 14-Jun-2023  rillig indent: clean up debugging code
 1.57 14-Jun-2023  rillig indent: clean up handling of comments

One less moving part in the parser state.

No functional change.
 1.56 14-Jun-2023  rillig indent: remove another flag from parser state

When processing a comment, the flag ps.next_col_1 was not used for the
next token, but for a line within a comment. As its scope was limited
to a single comment, there is no need to store it any longer than that.

No functional change.
 1.55 14-Jun-2023  rillig indent: remove a redundant flag from the parser state

No functional change.
 1.54 14-Jun-2023  rillig indent: merge parser symbols for stmt and stmt_list

They were handled in exactly the same way.
 1.53 10-Jun-2023  rillig indent: rename misleading variable

The name started with 'line_start', but the value is not always the
value from the beginning of the line.

No functional change.
 1.52 10-Jun-2023  rillig indent: fix debug output

When the parser state was first printed, there were unintended diff
markers. Treat the previous lexer symbol like the other parts of the
parser state, as omitting it from the diff output is confusing.
 1.51 10-Jun-2023  rillig indent: fix line break between semicolon and brace
 1.50 10-Jun-2023  rillig indent: miscellaneous cleanups
 1.49 10-Jun-2023  rillig indent: clean up function names, fix blank lines in debug output
 1.48 10-Jun-2023  rillig indent: distinguish blank lines from newline characters
 1.47 10-Jun-2023  rillig indent: clean up debug output

In diff mode, don't print a diff of the very first parser state, instead
print its full state.

Don't print headings for empty sections of the parser state.
 1.46 10-Jun-2023  rillig indent: clean up function and variable names
 1.45 10-Jun-2023  rillig indent: explain right-aligned code
 1.44 10-Jun-2023  rillig indent: rename and sort variables in parser state

No functional change.
 1.43 09-Jun-2023  rillig indent: sync debug information for lexer symbols
 1.42 09-Jun-2023  rillig indent: don't treat function call expressions as cast expressions
 1.41 09-Jun-2023  rillig indent: when an indentation is ambiguous, indent one level further

The '-eei' mode now applies whenever the indentation from a multi-line
expression could be confused with a following statement.
 1.40 08-Jun-2023  rillig indent: remove fragile heuristic for detecting cast expressions

The assumption that in an expression of the form '(a * anything)', the
'*' marks a pointer type was too simple-minded.

For now, fix the obvious cases and leave the others for later. If
needed, they can be worked around using the '-T' option.
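The ambiguity behind revision 1.40 can be seen in a small example. The names below are hypothetical; the point is only that '(a * b)' and '(a_t *)' look alike to a tool that does not know which identifiers name types.

```c
/* Hypothetical type name, used only for this illustration. */
typedef int a_t;

/* Here '(a * b)' is a multiplication of two variables. */
static int
product(int a, int b)
{
	return (a * b);
}

/* Here '(a_t *)' is a cast, because a_t names a type. */
static int
first(void *p)
{
	return *(a_t *)p;
}
```

For a case like the second function, declaring the name as a type with the '-T' option mentioned above (for example '-Ta_t') is the workaround.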
 1.39 07-Jun-2023  rillig indent: extract the stack of parser symbols to a separate struct

No functional change.
 1.38 07-Jun-2023  rillig indent: send all debug output to the same stream
 1.37 06-Jun-2023  rillig indent: sort functions in call order

No functional change.
 1.36 05-Jun-2023  rillig indent: improve layout of debug output
 1.35 05-Jun-2023  rillig indent: sync debug output with parser state
 1.34 04-Jun-2023  rillig indent: remove read pointer from buffers that don't need it

The only buffer that needs a read pointer is the current input line in
'inp'.

No functional change.
 1.33 04-Jun-2023  rillig indent: track the kind of '{' on the parser stack
 1.32 04-Jun-2023  rillig indent: fix debug output of the parser symbol stack

Even though the stack always contains a stmt_list as first element,
print it nevertheless to avoid confusion about starting at index 1, and
to provide the full picture.
 1.31 04-Jun-2023  rillig indent: rename struct field, for better symmetry

No binary change outside debug mode.
 1.30 04-Jun-2023  rillig indent: use separate lexer symbols for 'case' and 'default'

It's not strictly necessary since these tokens behave in the same way,
still, the code is more straight-forward when there are separate tokens.
 1.29 04-Jun-2023  rillig indent: classify 'inline' as a modifier rather than a word
 1.28 04-Jun-2023  rillig indent: use separate lexer symbols for the different kinds of ':'
 1.27 04-Jun-2023  rillig indent: handle the indentation of 'case' in a simpler way
 1.26 04-Jun-2023  rillig indent: separate code for handling parentheses and brackets

Handling parentheses is more complicated than for brackets.
 1.25 02-Jun-2023  rillig indent: clean up

Only print the 'token' buffer in debug mode if it is interesting, group
the blocks in handling of '(' tokens by topic, remove obsolete comment
from test.
 1.24 02-Jun-2023  rillig indent: fix formatting of declarations with preprocessing lines
 1.23 23-May-2023  rillig indent: fix indentation of struct declarations
 1.22 23-May-2023  rillig indent: split debug output into paragraphs

The paragraphs separate the different processing steps: getting a token
from the lexer, processing the token, updating the parser state, sending
a finished line to the output.
 1.21 23-May-2023  rillig indent: fix spacing in declarations in for loops
 1.20 22-May-2023  rillig indent: implement suppressing optional blank lines
 1.19 20-May-2023  rillig indent: extract the output state from the parser state

The parser state depends on the preprocessing lines, the output state
shouldn't.
 1.18 20-May-2023  rillig indent: implement blank line above block comment
 1.17 20-May-2023  rillig indent: implement blank line after function body
 1.16 20-May-2023  rillig indent: implement blank lines around conditional compilation
 1.15 20-May-2023  rillig indent: add debug logging for brace indentation

No functional change outside debug mode, as the initialization of
di_stack[0] was redundant.
 1.14 18-May-2023  rillig indent: manually wrap overly long lines

No functional change.
 1.13 18-May-2023  rillig indent: switch to standard code style

Taken from share/misc/indent.pro.

Indent does not wrap code to fit into the line width, it only does so
for comments. The 'INDENT OFF' sections and too long lines will be
addressed in a follow-up commit.

No functional change.
 1.12 17-May-2023  rillig indent: fix indentation in preprocessor line

No binary change.
 1.11 16-May-2023  rillig indent: allow comments in column 1 to be formatted
 1.10 16-May-2023  rillig indent: remove support for form feed characters inside a line

Form feeds are occasionally used to split code into pages, and this use
is still supported. Having a form feed in the middle of a line is
exotic.
 1.9 15-May-2023  rillig indent: clean up detection of whether parentheses form a cast

No functional change.
 1.8 15-May-2023  rillig indent: indent multi-line conditions

No functional change.
 1.7 15-May-2023  rillig indent: document feature toggle for debugging output
 1.6 15-May-2023  rillig indent: move debugging code to separate file

No functional change.
 1.5 15-May-2023  rillig indent: clean up memory and buffer management

Remove the need to explicitly initialize the buffers. To avoid
subtracting null pointers or comparing them using '<', migrate the
buffers from the (start, end) form to the (start, len) form. This form
also avoids inconsistencies in whether 'buf.e == buf.s' or 'buf.s ==
buf.e' is used.

Make buffer.st const, to avoid accidental modification of the buffer's
content.

Replace '*buf.e++ = ch' with buf_add_char, to avoid having to keep track
of how much unwritten space is left in the buffer. Remove all safety
margins, that is, no more unchecked access to buf.st[-1] or appending
using '*buf.e++'.

Fix line number counting in lex_word for words that contain line breaks.

No functional change.
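A minimal sketch of a buffer in the (start, len) form described in revision 1.5, assuming a simple doubling growth strategy. Only the name buf_add_char comes from the commit message; the field names, the growth policy and buf_is_empty are assumptions made for illustration and do not reproduce indent's actual code.

```c
#include <stdlib.h>

/*
 * Hypothetical buffer in (start, len) form. A zero-initialized struct
 * is a valid empty buffer, so no explicit initialization is needed.
 */
struct buffer {
	char *mem;	/* start of the allocation, NULL while empty */
	size_t len;	/* number of bytes in use */
	size_t cap;	/* number of bytes allocated */
};

/* Append one character, growing the allocation when it is full. */
static void
buf_add_char(struct buffer *buf, char ch)
{
	if (buf->len == buf->cap) {
		size_t new_cap = buf->cap == 0 ? 16 : 2 * buf->cap;
		char *new_mem = realloc(buf->mem, new_cap);

		if (new_mem == NULL)
			abort();
		buf->mem = new_mem;
		buf->cap = new_cap;
	}
	buf->mem[buf->len++] = ch;
}

/* Emptiness is a single comparison on len, with no pointer arithmetic. */
static int
buf_is_empty(const struct buffer *buf)
{
	return buf->len == 0;
}
```

Because callers never touch an end pointer, there is no null pointer to subtract or compare, and the capacity check lives in exactly one place.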
 1.4 13-May-2023  rillig indent: implement 'blank after declarations'
 1.3 13-May-2023  rillig indent: use enum instead of magic numbers for tracking declarations

No functional change.
 1.2 13-May-2023  rillig indent: add debug logging for enum token classification
 1.1 13-May-2023  rillig indent: move debugging code to separate file

No functional change.
 1.70.2.1 02-Aug-2025  perseant Sync with HEAD
 1.33 09-Jun-2023  rillig indent: indent multi-line expressions according to parentheses

This reverts the FreeBSD change from 2004-02-12 that had been imported
on 2019-04-04.
 1.32 05-Jun-2023  rillig indent: do not report broken lines, report configuration on stderr
 1.31 12-May-2023  rillig indent: sync manual page with recent changes
 1.30 26-Sep-2021  rillig indent: fix definition of -cli in manual page

See io.c, compute_label_indent.
 1.29 06-Mar-2021  rillig indent.1: sort options alphabetically
 1.28 04-Apr-2019  wiz New sentence, new line. Whitespace fixes.
 1.27 04-Apr-2019  kamil Upgrade indent(1)

Merge all the changes from the recent FreeBSD HEAD snapshot
into our local copy.

FreeBSD actively maintains this program in their sources and their
repository contains over 100 commits with changes.

Keep the delta between the FreeBSD and NetBSD versions to an absolute
minimum, mostly RCS Id and compatibility fixes.

Major changes in this import:

- Added an option -ldi<N> to control indentation of local variable names.
- Added option -P for loading user-provided files as profiles
- Added -tsn for setting tabsize
- Rename -nsac/-sac ("space after cast") to -ncs/-cs
- Added option -fbs, which enables (disables) splitting the function declaration and opening brace across two lines.
- Respect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)
- Group global option variables into an options structure
- Use bsearch() for looking up type keywords.
- Don't produce unneeded space character in function declarators
- Don't unnecessarily add a blank before a comment ends.
- Don't ignore newlines after comments that follow braces.

Merge the FreeBSD indent(1) tests with our ATF framework.
All tests pass.

Upgrade prepared by Manikishan Ghantasala.
Final polishing by myself.
 1.26 25-Feb-2016  wiz branches: 1.26.16;
Remove trailing whitespace.
 1.25 24-Feb-2016  ginsbach Remove double space before [.,:] in macro arguments.
 1.24 24-Feb-2016  ginsbach Add the [n]ei and [n]eei options to the synopsis; already documented in
description.
 1.23 04-Sep-2014  mrg port the -ut / -nut options from freebsd. -ut (default) enables tabs
in output, the -nut uses spaces.
 1.22 13-Oct-2012  njoly branches: 1.22.8;
Remove a few unneeded Pp macros.
 1.21 08-Apr-2012  wiz branches: 1.21.2;
Remove unnecessary Bk/Ek pairs from SYNOPSIS.
No effective change except where I used the opportunity to sort options
and/or option descriptions.
 1.20 12-Jan-2011  wiz branches: 1.20.6;
Spell out parenthesis. From Ryo HAYASAKA in PR 44372.
 1.19 24-Mar-2009  joerg Remove physical markup.
 1.18 11-Sep-2005  wiz branches: 1.18.30;
When marking up "C", use .Tn consistently. From YOMURA Masanori in private mail.
 1.17 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22365, verified by myself.
 1.16 25-Feb-2003  wiz .Nm does not need a dummy argument ("") before punctuation or
for correct formatting of the SYNOPSIS any longer.
 1.15 01-Dec-2001  wiz .Pp not necessary before or after .Ss/.Sh.
 1.14 20-Jul-2001  kristerw Correct a minor nroff nit.
This closes PR bin/9220.
 1.13 23-Mar-2001  fair Correct one typo in the patch from PR 9220.
 1.12 16-Mar-2001  fair Commit patch from PR 9220 to document all options, and consistently
document defaults. Also, clean up nroff nits.
 1.11 24-Mar-1999  mycroft Remove a spurious .ne.
 1.10 22-Mar-1999  garbled More and more .Os cleanups. .Os is defined in the tmac.doc-common file,
so we shouldn't override it with versions in the manpages. Many more to
come.
 1.9 07-Mar-1999  mycroft Clean up SYNOPSIS formatting.
 1.8 19-Oct-1997  lukem WARNSify, fix .Nm usage, deprecate register, use <err.h>, KNFify (with indent!;)
 1.7 18-Oct-1997  mrg merge lite-2.
 1.6 09-Jan-1997  tls RCS ID police
 1.5 27-Sep-1995  jtc fix formatting of example; PR #1535
 1.4 11-Jan-1994  jtc Fix spelling errors.
 1.3 07-Aug-1993  cgd do block commenting, if comment begins with slash-star-newline.
 1.2 01-Aug-1993  mycroft Add RCS identifiers.
 1.1 09-Apr-1993  cgd branches: 1.1.1;
added, from net/2 (patch 124).
 1.1.1.2 04-Apr-2019  kamil FreeBSD indent r340138
 1.1.1.1 30-Dec-1993  mrg 4.4BSD-Lite2
 1.18.30.1 13-May-2009  jym Sync with HEAD.

Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
 1.20.6.2 30-Oct-2012  yamt sync with head
 1.20.6.1 17-Apr-2012  yamt sync with head
 1.21.2.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.22.8.1 21-Sep-2014  snj Pull up following revision(s) (requested by mrg in ticket #110):
usr.bin/indent/io.c: revision 1.15
usr.bin/indent/indent_globs.h: revision 1.10
usr.bin/indent/args.c: revision 1.11
usr.bin/indent/indent.1: revision 1.23
usr.bin/indent/indent.c: revision 1.19
port the -ut / -nut options from freebsd. -ut (default) enables tabs
in output, the -nut uses spaces.
 1.26.16.1 10-Jun-2019  christos Sync with HEAD
 1.396 07-Jan-2025  rillig indent: condense and simplify parsing code
 1.395 04-Jan-2025  rillig indent: fix indentation of adjacent multi-line initializers

The main topic of this change is parse.c:66, which makes the indentation
of statements uniform with the indentation of other parser symbols.

That change had the side effect of messing up the indentation of files
whose first line does not start in column 1, such as in ps_ind_level.c.
To fix this side effect, the initial indentation must be determined
before pushing the placeholder token psym_stmt during initialization.
 1.394 04-Jan-2025  rillig indent: make debug log more uniform
 1.393 04-Jan-2025  rillig indent: make debug output easier to read

The previous format had the values of the parser state on the left side
and the corresponding names on the right side. While it looked nicely
aligned, it was not suitable for focusing on the actual data. Replace
this format with the more common "key: value" format.

Use the names of the enum constants in the debug log, instead of the
previous "nice" names that needed one more level of mental translation
and in some cases contained unbalanced punctuation such as '{'.
 1.392 03-Jan-2025  rillig indent: fix line breaks in else-if sequences

The flag ps.want_newline did not adequately model the conditions under
which a line break should be inserted, thus the redesign.

A welcome side effect is that in statements like 'if (cond);', the
semicolon is now placed on a separate line, thus becoming more visible.
 1.391 12-Dec-2024  rillig indent: add error handling for I/O errors

Suggested by lint2.
 1.390 03-Dec-2023  rillig branches: 1.390.2;
indent: inline input-related macros

No binary change.
 1.389 03-Dec-2023  rillig indent: group input-related variables into a struct

No functional change.
 1.388 03-Dec-2023  rillig indent: use line number of the token start in diagnostics

Previously, the line number of the end of the token was used, which was
confusing in debug mode.
 1.387 27-Jun-2023  rillig indent: fix 'blank line above first statement in function body'
 1.386 26-Jun-2023  rillig indent: implement 'blank line above first statement in function body'
 1.385 26-Jun-2023  rillig indent: in -bad mode, don't add a blank line above a comment or '}'
 1.384 25-Jun-2023  rillig indent: move cast detection from the lexer to the main processor

It is not the job of the lexer to modify the parser state.
 1.383 25-Jun-2023  rillig indent: fix formatting of parenthesized name in function definition
 1.382 23-Jun-2023  rillig indent: properly store parser state in debug mode

The stacks in the parser state are allocated now and need to be copied
individually.

The test whether two paren stacks are equal had been broken since
2023-06-14 14:11:28.
 1.381 18-Jun-2023  rillig indent: remove support for backspace in code and comments

The C code in the whole tree does not contain a single literal
backspace.
 1.380 17-Jun-2023  rillig indent: miscellaneous cleanups

No binary change.
 1.379 16-Jun-2023  rillig indent: merge lexer symbols for type in/outside parentheses
 1.378 16-Jun-2023  rillig indent: fix spacing between postfix operator and left parenthesis
 1.377 16-Jun-2023  rillig indent: improve heuristics for cast expressions
 1.376 16-Jun-2023  rillig indent: improve heuristics for cast expressions
 1.375 16-Jun-2023  rillig indent: improve heuristics for casts
 1.374 16-Jun-2023  rillig indent: fix indentation and linebreaks in typedef declarations
 1.373 16-Jun-2023  rillig indent: don't force a blank line between '}' and preprocessing line
 1.372 15-Jun-2023  rillig indent: consolidate handling of statement continuations
 1.371 15-Jun-2023  rillig indent: rename state variable to be more accurate

No binary change.
 1.370 15-Jun-2023  rillig indent: fix indentation of multi-line enum constant initializers
 1.369 15-Jun-2023  rillig indent: miscellaneous cleanups, more tests for edge cases
 1.368 15-Jun-2023  rillig indent: fix alignment of multi-line declarations
 1.367 14-Jun-2023  rillig indent: clean up the code, add a few tests
 1.366 14-Jun-2023  rillig indent: allow more than 128 brace levels
 1.365 14-Jun-2023  rillig indent: clean up array indexing for parser symbols

With 'top' pointing to the actual top element, the array was indexed in
the closed range from 0 to top. All other arrays are indexed by the
usual half-open interval from 0 to len.

No functional change.
 1.364 14-Jun-2023  rillig indent: allow more than 20 nested parentheses or brackets
 1.363 14-Jun-2023  rillig indent: merge duplicate code
 1.362 14-Jun-2023  rillig indent: fix formatting of comment after 'switch (expr)'
 1.361 14-Jun-2023  rillig indent: use correct preprocessing directive in error message
 1.360 14-Jun-2023  rillig indent: allow more than 5 levels of #if/#endif
 1.359 14-Jun-2023  rillig indent: remove another flag from parser state

When processing a comment, the flag ps.next_col_1 was not used for the
next token, but for a line within a comment. As its scope was limited
to a single comment, there is no need to store it any longer than that.

No functional change.
 1.358 14-Jun-2023  rillig indent: merge parser symbols for stmt and stmt_list

They were handled in exactly the same way.
 1.357 10-Jun-2023  rillig indent: rename misleading variable

The name started with 'line_start', but the value is not always the
value from the beginning of the line.

No functional change.
 1.356 10-Jun-2023  rillig indent: fix debug output

When the parser state was first printed, there were unintended diff
markers. Treat the previous lexer symbol like the other parts of the
parser state, as omitting it from the diff output is confusing.
 1.355 10-Jun-2023  rillig indent: fix line break between semicolon and brace
 1.354 10-Jun-2023  rillig indent: miscellaneous cleanups
 1.353 10-Jun-2023  rillig indent: in debug mode, null-terminate buffers
 1.352 10-Jun-2023  rillig indent: fix indentation of continuation lines in initializers
 1.351 10-Jun-2023  rillig indent: clean up function and variable names
 1.350 10-Jun-2023  rillig indent: fix token classification in declarations

As a side effect, indent handles _Generic from C11 properly now, at
least in -nlp mode.
 1.349 10-Jun-2023  rillig indent: rename and sort variables in parser state

No functional change.
 1.348 09-Jun-2023  rillig indent: trim trailing blank lines
 1.347 09-Jun-2023  rillig indent: group lexer symbols by topic, sort processing functions

No functional change.
 1.346 09-Jun-2023  rillig indent: support C99 compound literals
 1.345 09-Jun-2023  rillig indent: don't treat function call expressions as cast expressions
 1.344 09-Jun-2023  rillig indent: eliminate unused variable

No functional change.
 1.343 09-Jun-2023  rillig indent: when an indentation is ambiguous, indent one level further

The '-eei' mode now applies whenever the indentation from a multi-line
expression could be confused with a following statement.
 1.342 09-Jun-2023  rillig indent: format its own code
 1.341 08-Jun-2023  rillig indent: remove fragile heuristic for detecting cast expressions

The assumption that in an expression of the form '(a * anything)', the
'*' marks a pointer type was too simple-minded.

For now, fix the obvious cases and leave the others for later. If
needed, they can be worked around using the '-T' option.
 1.340 08-Jun-2023  rillig indent: fix indentation of initializer lists with designators
 1.339 08-Jun-2023  rillig indent: clean up and condense code

No functional change.
 1.338 07-Jun-2023  rillig indent: extract the stack of parser symbols to a separate struct

No functional change.
 1.337 06-Jun-2023  rillig indent: compute indentation of 'case' labels on-demand

One less moving part to keep track of.

No functional change.
 1.336 05-Jun-2023  rillig indent: in 'if (expr)', the parentheses do not form a cast expression

No functional change. When stepping through the code in debug mode, it
was just too confusing that indent would log an 'unknown cast' in this
situation.
 1.335 05-Jun-2023  rillig indent: format own source code
 1.334 05-Jun-2023  rillig indent: don't remove blank line after 'if (expr) {'
 1.333 05-Jun-2023  rillig indent: do not report broken lines, report configuration on stderr
 1.332 05-Jun-2023  rillig indent: fix formatting of 'do' statements
 1.331 05-Jun-2023  rillig indent: make heuristics for '*' pointer types simpler

Previously, a '}' token did not reset the state machine, but it should.
 1.330 05-Jun-2023  rillig indent: fix trailing whitespace after comment
 1.329 05-Jun-2023  rillig indent: rename variables, clean up comments

No binary change.
 1.328 05-Jun-2023  rillig indent: clean up handling of whitespace

No functional change.
 1.327 04-Jun-2023  rillig indent: remove read pointer from buffers that don't need it

The only buffer that needs a read pointer is the current input line in
'inp'.

No functional change.
 1.326 04-Jun-2023  rillig indent: track the kind of '{' on the parser stack
 1.325 04-Jun-2023  rillig indent: ensure that the 'block init level' never goes negative

No functional change.
 1.324 04-Jun-2023  rillig indent: rename struct field, for better symmetry

No binary change outside debug mode.
 1.323 04-Jun-2023  rillig indent: fix formatting of compound expressions, at least partially
 1.322 04-Jun-2023  rillig indent: use separate lexer symbols for 'case' and 'default'

It's not strictly necessary since these tokens behave in the same way,
still, the code is more straight-forward when there are separate tokens.
 1.321 04-Jun-2023  rillig indent: classify 'inline' as a modifier rather than a word
 1.320 04-Jun-2023  rillig indent: use separate lexer symbols for the different kinds of ':'
 1.319 04-Jun-2023  rillig indent: handle the indentation of 'case' in a simpler way
 1.318 04-Jun-2023  rillig indent: separate code for handling parentheses and brackets

Handling parentheses is more complicated than for brackets.
 1.317 03-Jun-2023  rillig indent: fix indentation of adjacent '{'
 1.316 03-Jun-2023  rillig indent: clean up handling of brace indentation

No functional change.
 1.315 02-Jun-2023  rillig indent: force each statement on a new line

Previously, '{} while (cond)' was kept on a single line, even though the
'while' was independent of the '{}'.
 1.314 02-Jun-2023  rillig indent: remove newline between 'switch' and '{'
 1.313 02-Jun-2023  rillig indent: improve heuristics of classifying '*' as pointer or operator
 1.312 02-Jun-2023  rillig indent: clean up

Only print the 'token' buffer in debug mode if it is interesting, group
the blocks in handling of '(' tokens by topic, remove obsolete comment
from test.
 1.311 02-Jun-2023  rillig indent: fix formatting of declarations with preprocessing lines
 1.310 23-May-2023  rillig indent: separate code for handling enums from the lexer

The lexer's responsibility is to generate tokens, it's not supposed to
update the parser state. Centralize the state transitions that control
indentation of enum constants to keep the lexer code clean.

Skip comments, newlines and preprocessing lines when updating the parser
state for enum constants and for '*' in declarations.
 1.309 23-May-2023  rillig indent: fix indentation of struct declarations
 1.308 23-May-2023  rillig indent: split debug output into paragraphs

The paragraphs separate the different processing steps: getting a token
from the lexer, processing the token, updating the parser state, sending
a finished line to the output.
 1.307 23-May-2023  rillig indent: extract processing of a single token to separate function

No functional change.
 1.306 23-May-2023  rillig indent: fix spacing around '*' in declarations
 1.305 23-May-2023  rillig indent: fix spacing in declarations in for loops
 1.304 22-May-2023  rillig indent: fix spacing between block braces
 1.303 22-May-2023  rillig indent: implement suppressing optional blank lines
 1.302 21-May-2023  rillig indent: don't read out-of-bounds memory in preprocessing lines

(The bug had existed for only a few minutes.)

If a line '#if 0' was followed by an unlikely line '#', the second line
was interpreted as '#if' as well.

To detect this bug automatically, a dynamic analysis tool would need to
know that only the memory between lab.mem and lab.mem + lab.len has
defined content. This constraint, in turn, would throw up at the bottom
of copy_comment_wrap, which for a brief moment intentionally violates
this constraint.
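The kind of length-aware comparison that avoids the bug from revision 1.302 might look like the sketch below. The function name and parameters are hypothetical; the commit message only states that the defined content lies between lab.mem and lab.mem + lab.len. The sketch assumes that the caller passes the directive word together with its length.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical check: only the first 'len' bytes of 'mem' have defined
 * content, so stale bytes beyond 'len' must never take part in the
 * comparison.
 */
static bool
directive_is(const char *mem, size_t len, const char *name)
{
	size_t name_len = strlen(name);

	return len == name_len && memcmp(mem, name, name_len) == 0;
}
```

With a check of this shape, a line consisting of only '#' yields a directive word of length 0 and cannot match 'if', even when the bytes left over from a previous '#if 0' line are still in the buffer.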
 1.301 21-May-2023  rillig indent: don't error out on unrecognized preprocessor directives

This allows indent to be used on the GCC preprocessor output.
 1.300 20-May-2023  rillig indent: remove redundant checks in processing of '}'

No functional change.
 1.299 20-May-2023  rillig indent: extract the output state from the parser state

The parser state depends on the preprocessing lines, the output state
shouldn't.
 1.298 20-May-2023  rillig indent: implement blank line after function body
 1.297 20-May-2023  rillig indent: implement blank lines around conditional compilation
 1.296 20-May-2023  rillig indent: add debug logging for brace indentation

No functional change outside debug mode, as the initialization of
di_stack[0] was redundant.
 1.295 18-May-2023  rillig indent: remove detailed rules for blank before comment
 1.294 18-May-2023  rillig indent: rename a few functions

No functional change.
 1.293 18-May-2023  rillig indent: manually wrap overly long lines

No functional change.
 1.292 18-May-2023  rillig indent: switch to standard code style

Taken from share/misc/indent.pro.

Indent does not wrap code to fit into the line width, it only does so
for comments. The 'INDENT OFF' sections and too long lines will be
addressed in a follow-up commit.

No functional change.
 1.291 18-May-2023  rillig indent: remove unnecessary variable size optimization

Due to the enum that follows in the struct, the short variable was
padded to 4 bytes anyway.

No functional change.
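The padding effect mentioned in revision 1.291 can be checked with offsetof. The structs below are hypothetical and not taken from indent. The result depends on the ABI: on common platforms where an enum has the size and alignment of a 4-byte int, both functions return 4, so the short saves no space.

```c
#include <stddef.h>

/* Hypothetical types; the member names are illustrative only. */
enum kind { KIND_A, KIND_B };

struct with_short {
	short level;
	enum kind kind;
};

struct with_int {
	int level;
	enum kind kind;
};

/* Bytes occupied by the short 'level' plus any padding before 'kind'. */
static size_t
short_slot_size(void)
{
	return offsetof(struct with_short, kind);
}

/* Bytes occupied by the int 'level' plus any padding before 'kind'. */
static size_t
int_slot_size(void)
{
	return offsetof(struct with_int, kind);
}
```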
 1.290 16-May-2023  rillig indent: directly access the input buffer

No functional change.
 1.289 16-May-2023  rillig indent: remove support for form feed characters inside a line

Form feeds are occasionally used to split code into pages, and this use
is still supported. Having a form feed in the middle of a line is
exotic.
 1.288 16-May-2023  rillig indent: remove blank between comment and parentheses or brackets

Finally, indent formats its own source code without messing up the
layout.
 1.287 16-May-2023  rillig indent: fix handling of INDENT OFF/ON comments

Previously, the 'INDENT OFF' comments were interpreted when the newline
token from the line above the comment was processed, which was earlier
than could be reasonably expected.

The 'INDENT ON' comments were interpreted equally early, which led to
the situation that the 'INDENT OFF' comments were preserved literally
but the 'INDENT ON' comments weren't.
 1.286 15-May-2023  rillig indent: clean up detection of whether parentheses form a cast

No functional change.
 1.285 15-May-2023  rillig indent: fix cast detection

In process_lparen_or_lbracket, ps.paren[...].maybe_cast was not
initialized, which may have been the cause for seemingly random spacing
around binary operators.

While here, clean up the code by reducing the number of accesses to the
parser state.
 1.284 15-May-2023  rillig indent: fix detection of casts

A word followed by a '(' does not start a cast expression.
 1.283 15-May-2023  rillig indent: fix type cast in function definition
 1.282 15-May-2023  rillig indent: fix duplicate space between comment and binary operator
 1.281 15-May-2023  rillig indent: format its own code, extend some comments

With manual corrections, as there are still some bugs left.

No functional change.
 1.280 15-May-2023  rillig indent: improve type guessing, fix formatting of declarations
 1.279 15-May-2023  rillig indent: fix spacing between function prototype and attributes
 1.278 15-May-2023  rillig indent: fix indentation of struct member names
 1.277 15-May-2023  rillig indent: indent multi-line conditions

No functional change.
 1.276 15-May-2023  rillig indent: fix indentation of statements after controlling expression
 1.275 15-May-2023  rillig indent: fix indentation of expressions in -nlp -eei mode
 1.274 15-May-2023  rillig indent: fix indentation of multi-line '?:' expressions in functions
 1.273 15-May-2023  rillig indent: let indent format its own code

With manual corrections, as indent does not properly indent multi-line
'?:' expressions nor multi-line controlling expressions.
 1.272 15-May-2023  rillig indent: fix spacing in for loop with declaration (since 2022-02-13)
 1.271 15-May-2023  rillig indent: remove redundant include lines
 1.270 15-May-2023  rillig indent: clean up memory allocation

No functional change.
 1.269 15-May-2023  rillig indent: move debugging code to separate file

No functional change.
 1.268 15-May-2023  rillig indent: clean up memory and buffer management

Remove the need to explicitly initialize the buffers. To avoid
subtracting null pointers or comparing them using '<', migrate the
buffers from the (start, end) form to the (start, len) form. This form
also avoids inconsistencies in whether 'buf.e == buf.s' or 'buf.s ==
buf.e' is used.

Make buffer.st const, to avoid accidental modification of the buffer's
content.

Replace '*buf.e++ = ch' with buf_add_char, to avoid having to keep track
of how much unwritten space is left in the buffer. Remove all safety
margins, that is, no more unchecked access to buf.st[-1] or appending
using '*buf.e++'.

Fix line number counting in lex_word for words that contain line breaks.

No functional change.
 1.267 14-May-2023  rillig indent: only null-terminate the buffers if necessary

The only case where a buffer is used as a C-style string is when looking
up a keyword.

No functional change.
 1.266 14-May-2023  rillig indent: remove foreign RCS IDs
 1.265 14-May-2023  rillig indent: miscellaneous cleanups
 1.264 13-May-2023  rillig indent: prevent undefined behavior on unbalanced parentheses
 1.263 13-May-2023  rillig indent: do not add a blank at the beginning of a line

Most calls to output_line did already reset the variable. There may be
some untested edge cases in or after comments, but these should be fine
as well.
 1.262 13-May-2023  rillig indent: do not add a space before a comment that starts a line
 1.261 13-May-2023  rillig indent: replace __dead functions with return statements

No functional change.
 1.260 13-May-2023  rillig indent: use enum instead of magic numbers for tracking declarations

No functional change.
 1.259 13-May-2023  rillig indent: rename struct fields for buffers

No binary change except for assertion line numbers.
 1.258 13-May-2023  rillig indent: clean up a condition, add comments

No functional change.
 1.257 13-May-2023  rillig indent: preserve indentation of preprocessor directives
 1.256 12-May-2023  rillig indent: rename placeholder symbol for parser stack

No functional change outside debug mode.
 1.255 12-May-2023  rillig indent: remove code for parsing declarations without semicolon

The statement from the comment that declarations do not need semicolons
is wrong. A possible input that matched this rule is 'void f(void) { int
a }'.
 1.254 12-May-2023  rillig indent: remove statistics

The numbers from the statistics were wrong.
 1.253 12-May-2023  rillig indent: condense code for handling spaced expressions

No functional change outside debug mode.
 1.252 11-May-2023  rillig indent: don't touch comments in preprocessing lines

The indentation of multi-line comments was wrong, and the code for
handling them was too complicated.
 1.251 11-May-2023  rillig indent: remove broken code for handling blank lines

This fixes several bugs where blank lines were erroneously added or
removed, trading these old bugs for new bugs in different places.
These new bugs are expected to be easier to fix, as the old bugs will
not interfere anymore.
 1.250 11-May-2023  rillig indent: move parser state variables to the parser_state struct

Include the variables in the debug output.
 1.249 11-May-2023  rillig indent: eliminate a local variable for else-if handling

No functional change intended.
 1.248 11-May-2023  rillig indent: move force_nl into the parser state

This way, it is included in the debug output.

No functional change.
 1.247 11-May-2023  rillig indent: remove unnecessary assignments to last_else

No functional change intended.
 1.246 11-May-2023  rillig indent: remove buggy code for swapping tokens

It is not the job of an indenter to swap tokens, even if it's only about
placing comments elsewhere. The code that swapped the tokens was
complicated, buggy and impossible to understand.

In -br (brace right) mode, indent no longer moves a '{' from the
beginning of a line to the end of the previous line, as that was handled
by the token swapping code as well. This change is unintended, but it
will be easier to re-add that now that the code is simpler.
 1.245 09-May-2022  rillig indent: clean up control flow, remove Capsicum

No functional change.
 1.244 23-Apr-2022  rillig indent: group global variables related to output control

No functional change.
 1.243 23-Apr-2022  rillig indent: remove Capsicum support

NetBSD doesn't have Capsicum.
 1.242 13-Feb-2022  rillig indent: rename parser_state.p_l_follow and paren_level

The previous variable names were misleading.

Paren_level is not the current level of parentheses but the one from the
beginning of the current output line. For better accuracy, rename it to
line_start_paren_level.

P_l_follow is not the level of parentheses that will be active at some
point in the future, as the previous name suggested. Instead, it is the
level of parentheses right now. For better accuracy, rename it to
nparen. This nicely matches its main usage, which is as index to the
parser_state.paren array.

No binary change.
 1.241 13-Feb-2022  rillig indent: replace bitmasking code with struct

The struct directly represents the properties of a pair of parentheses,
without forcing the human reader to decode any bitset. This makes it
easier to find the remaining bugs in the heuristic for determining the
kind of parentheses.

No functional change outside debug mode.
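
As a rough illustration of the change described above, the following sketch contrasts a bitmask representation with a per-level struct. The names masks, not_cast_mask, maybe_cast and no_cast are assumptions made for this example and are not taken from the commit itself.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PAREN 20		/* assumed nesting limit for this sketch */

/* Before (hypothetical): one bit per nesting level, spread over two masks. */
struct masks {
	int cast_mask;
	int not_cast_mask;
};

static bool
masks_is_cast(const struct masks *m, int level)
{
	assert(level >= 0 && level < MAX_PAREN);
	/* The reader has to decode the bit arithmetic by hand. */
	return (m->cast_mask & (1 << level) & ~m->not_cast_mask) != 0;
}

/* After (hypothetical): one struct per pair of parentheses. */
struct paren_level {
	bool maybe_cast;
	bool no_cast;
};

static bool
paren_is_cast(const struct paren_level *paren, int level)
{
	assert(level >= 0 && level < MAX_PAREN);
	/* The properties can be read directly, without decoding a bitset. */
	return paren[level].maybe_cast && !paren[level].no_cast;
}
```

Both functions answer the same question; the struct variant also shows up field by field in a debugger or in debug output.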
 1.240 13-Feb-2022  rillig indent: change parser_state.cast_mask to 0-based indexing

Having 1-based indexing was completely unexpected, and it didn't match
the 0-based indexing of parser_state.paren_indents.

No functional change.
 1.239 28-Nov-2021  rillig indent: treat L"string" as a single token

There is never whitespace between the 'L' and the string literal or the
character constant. There might be a backslash-newline between them, but
that case was not handled before either.

No functional change.
 1.238 28-Nov-2021  rillig indent: clean up and document input handling

The transformation of moving comments from after an 'if (expr)' to after
the following brace has a large implementation cost (about 300 lines of
code) and makes input handling quite complicated. Document the overall
idea to save future readers some time.

No functional change.
 1.237 27-Nov-2021  rillig indent: accept a few formatting suggestions from indent

The remaining issues are still that the conditions look ambiguous even
with -eei, and that __attribute__ is broken into a separate line.

No functional change.
 1.236 27-Nov-2021  rillig indent: rename dump functions to output

No functional change.
 1.235 27-Nov-2021  rillig indent: inline switch_buffer

The function name was not accurate all the time. Now that
inp_from_comment is a separate function, it doesn't make sense anymore
to offload the 3 simple statements to a separate function.

No functional change.
 1.234 26-Nov-2021  rillig indent: add buf_add_range for adding characters to a buffer

No functional change.
 1.233 26-Nov-2021  rillig indent: move ind_add from io.c to indent.c

It's a general-purpose function that is not directly related to input or
output.
 1.232 25-Nov-2021  rillig indent: rename ps.in_function_parameters to match reality

This flag is only set while parsing the parameters of a function
definition, but not for a function declaration. See buffer_add in the
test fmt_decl.

No functional change.
 1.231 25-Nov-2021  rillig indent: rename ps.in_stmt to in_stmt_or_decl

The previous name didn't match reality.

No functional change.
 1.230 25-Nov-2021  rillig indent: rename ps.ind_stmt to in_stmt_cont

This makes a comment redundant.

No functional change.
 1.229 25-Nov-2021  rillig indent: clean up style

No functional change.
 1.228 19-Nov-2021  rillig indent: reduce casts to unsigned char for character classification

No functional change.
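
A minimal sketch of how such casts can be confined to one place: small wrapper functions take a plain char and do the conversion to unsigned char internally. The wrapper name ch_isalnum and the helper word_prefix_len are assumptions for this example, not necessarily the names used in the commit.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * The <ctype.h> functions require an argument that is either EOF or
 * representable as unsigned char.  Passing a negative plain char is
 * undefined behavior, so the cast is done once, inside the wrapper.
 */
static bool
ch_isalnum(char ch)
{
	return isalnum((unsigned char)ch) != 0;
}

/* Example caller: the call site needs no cast anymore. */
static size_t
word_prefix_len(const char *s)
{
	size_t n = 0;

	assert(s != NULL);
	while (ch_isalnum(s[n]) || s[n] == '_')
		n++;
	return n;
}
```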
 1.227 19-Nov-2021  rillig indent: fix included headers
 1.226 19-Nov-2021  rillig indent: replace ps.procname with ps.is_function_definition

Only the first character of ps.procname was ever read, and it was only
compared to '\0'. Using a bool for this means simpler code, less
memory and fewer wasted CPU cycles due to the removed strncpy.

No functional change.
 1.225 19-Nov-2021  rillig indent: remove all references to inbuf from indent.c

No functional change.
 1.224 19-Nov-2021  rillig indent: move character input handling from indent.c to io.c

No functional change.
 1.223 19-Nov-2021  rillig indent: move character input from indent.c to io.c

No functional change.
 1.222 19-Nov-2021  rillig indent: replace direct access to the input buffer

This is a preparation for abstracting away all the low-level details of
handling the input. The goal is to fix the current bugs regarding line
number counting, out of bounds memory access, and generally unreadable
code.

No functional change.
 1.221 19-Nov-2021  rillig indent: add debug logging for input buffer handling
 1.220 19-Nov-2021  rillig indent: rename input buffer variables

From reading the names 'save_com' and 'sc_end', it was not obvious
enough that these two variables are the limits of the same buffer, the
names were just too unrelated.

No functional change.
 1.219 19-Nov-2021  rillig indent: group variables for input handling

No functional change.
 1.218 07-Nov-2021  rillig indent: fix handling of C99 comments after 'if (expr)'
 1.217 07-Nov-2021  rillig indent: demonstrate disappearing form feed
 1.216 07-Nov-2021  rillig indent: various cleanups

Make several comments more precise.

Rename process_end_of_file to process_eof to match the token name.

Change the order of assignments in analyze_comment to keep the com_ind
computations closer together.

In copy_comment_wrap, use pointer difference instead of pointer addition
to stay away from undefined behavior.

No functional change.
 1.215 07-Nov-2021  rillig indent: rename ps.decl_nest to decl_level

This better matches the comment.

No functional change.
 1.214 07-Nov-2021  rillig indent: reduce negations in process_else, clean up comments

No functional change.
 1.213 07-Nov-2021  rillig indent: only access buffer data in the range [buf.s, buf.e)

No functional change.
 1.212 07-Nov-2021  rillig indent: rename type_at_paren_level_0 to type_outside_parentheses

For symmetry with type_in_parentheses.

No functional change.
 1.211 07-Nov-2021  rillig indent: distinguish between typename in parentheses and other words

This gets rid of two members of parser_state. No functional change for
well-formed programs. The sequence of '++int' or '--size_t' may be
formatted differently than before, but no program is expected to contain
that sequence.

Rename lsym_ident to lsym_word since 'ident' was too specific. This
token type is used for constants and string literals as well. Strictly
speaking, a string literal is not a word, but at least it's better than
before.
 1.210 07-Nov-2021  rillig indent: rename 'inbuf' functions to 'inp'

The variable 'inp' used to be named 'inbuf'. Make the function names
correspond to the variable name again.

No functional change.
 1.209 05-Nov-2021  rillig indent: rename process_keyword_do to process_do, same for 'else'

Before the symbols from the tokenizer had the prefix 'lsym', the symbols
could not be simply called 'else' and 'do'. The functions for processing
the tokens followed that naming scheme.

When the prefix 'lsym' was introduced, the word 'keyword' was no longer
needed, neither in the constants nor in the function names.

No functional change.
 1.208 05-Nov-2021  rillig indent: rename ps.curr_newline to next_col_1

For symmetry with ps.curr_col_1.

No functional change.
 1.207 04-Nov-2021  rillig indent: split process_comment_in_code into separate functions

No functional change.
 1.206 04-Nov-2021  rillig indent: fix joining of adjacent unary '+' operators
 1.205 03-Nov-2021  rillig indent: inline indentation_after, shorten function name to ind_add

There were only few calls to indentation_after, so inlining it spares
the need to look at yet another function definition. Another effect is
that code.s and code.e appear in the code as a pair now, instead of a
single code.s, making the scope of the function call obvious.

In ind_add, there is no need to check for '\0' anymore since none of the
buffers can ever contain a null character, these are filtered out by
inbuf_read_line.

No functional change.
 1.204 01-Nov-2021  rillig indent: fix missing blank after 'return' (since 2021-10-31)

In indent.c 1.200 from 2021-10-31, the subtypes of identifier tokens
were removed since they were redundant. An unintended side effect was
that a parenthesized expression after 'return' was no longer separated
by a blank.

Before that change, 'return' was tokenized as an lsym_ident with subtype
kw_other, and want_space_before_lparen handled this case in the last
line. After the change, 'return' was treated as an ordinary identifier,
and unless the option '-pcs' (blank after function call) was given, the
blank was removed.

The other keywords that had kw_other are not affected since they do not
expect a '(' afterwards. These keywords are 'break', 'continue', 'goto',
'inline' and 'restrict'.

Curiously, there was not a single test case that covered 'return(expr)'.

While here, remove the trailing ',' from the enum lexer_symbol, which is
not allowed in standard C, it is a GNU extension. Lint doesn't complain
about this since the default LINTFLAGS include '-g' for GCC mode.
 1.203 31-Oct-2021  rillig indent: clean up

Initialize buffers in reading order, make comments more expressive,
rename add_typename to register_typename, remove unused macro.

No functional change.
 1.202 31-Oct-2021  rillig indent: for '-pcs', add blank between function and '('

Before indent-2021.09.30.21.48.12, the blank had always been added, even
in '-npcs' mode. Since then, the blank had never been added.

Now, add the blank in '-pcs' mode and omit it in '-npcs' mode.
 1.201 31-Oct-2021  rillig indent: replace kw_tag with lsym_tag

This leaves only one special type of token, which is lsym_ident, which
in some cases represents a type name and in other cases an identifier,
constant or string literal.

No functional change.
 1.200 31-Oct-2021  rillig indent: replace simple cases of keyword_kind with lexer_symbol

The remaining keyword kinds 'tag' and 'type' require a bit more thought,
so do them in a separate step.

No functional change.
 1.199 31-Oct-2021  rillig indent: rename lsym_type to better reflect reality

Type names that occur in parentheses are parsed as lsym_ident having the
subtype kw_type instead.

No functional change.
 1.198 31-Oct-2021  rillig indent: add separate lexer symbol for offsetof

No functional change.
 1.197 31-Oct-2021  rillig indent: add separate lexer symbol for sizeof

The plan is to get rid of the type keyword_kind, which largely overlaps
with lexer_symbol.

No functional change.
 1.196 30-Oct-2021  rillig indent: push down variable comment_buffered

No functional change.
 1.195 30-Oct-2021  rillig indent: rename prev_newline and prev_col_1 to curr

These two flags describe the token that is currently processed.

In process_binary_op, curr_newline can never be true since newline is
not a binary operator, so remove that condition.

No functional change.
 1.194 30-Oct-2021  rillig indent: reorder assignments in switch_buffer

No functional change.
 1.193 30-Oct-2021  rillig indent: move buffer functions further up

No functional change.
 1.192 30-Oct-2021  rillig indent: group variables by topic

No functional change.
 1.191 30-Oct-2021  rillig indent: prevent buffer overflow in search_stmt_comment

printf '{ if (%010000d) /*comment*/ ; }' '0' | indent
 1.190 30-Oct-2021  rillig indent: add debug logging for save_com

This will help in finding the proper fix for the assertion failure in
search_stmt_comment.

Add an assertion in search_stmt_lbrace to prevent the previous,
incomplete fix from being applied again.
 1.189 30-Oct-2021  rillig indent: prevent buffer overflows in 'if (expr) ... stmt'
 1.188 30-Oct-2021  rillig indent: revert previous fix of assertion failure

The strange code with the out of bounds memory access is needed to
transform 'if (expr) /* comment */ {' to 'if (expr) { /* comment */',
that is, to move the comment to the right.

Add a test that prevents "repairing" this code again.
 1.187 30-Oct-2021  rillig indent: fix assertion failure in search_stmt_comment

I have no idea why the code was written in such a convoluted way before.
By removing all the code that didn't make sense, everything just works
as expected, and the existing tests all pass, especially those in
token_comment.c that mention search_stmt_comment.
 1.186 30-Oct-2021  rillig indent: replace tabsize with hardcoded 8 in process_comma

On 2018-07-25, FreeBSD added the option '-ts' to make the tabulator size
configurable, replacing several constants 7, 8, 9 with tabsize. The 8 in
the expression 'max_col - 8' was not related to the tabulator size but
instead represents the typical width of a variable name. Subtracting a
tab from the right margin doesn't make sense since the right margin need
not be aligned on a tabstop.

See the test fmt_decl.c, where the declaration 'struct s0 a,b;' is split
into several lines because the estimate for the variable name following
the comma is too high. There would have been plenty of space to the
right to keep the whole declaration in a single line.

No functional change.
 1.185 30-Oct-2021  rillig indent: don't risk a buffer overflow in code_add_decl_indent

The buffers have a safety margin of 5 characters, so the bounds check is
not strictly necessary. It makes the code more uniform though.

No functional change.
 1.184 30-Oct-2021  rillig indent: clean up code_add_decl_indent

In layout computations, it is helpful for human readers to list the
summands in logical order. In this case, the expression 'code_len +
base_ind' was rather confusing, so replace it with 'base_ind +
code_len'. This makes the code straight-forward enough that it doesn't
need any comments anymore.

No functional change.
 1.183 30-Oct-2021  rillig indent: remove confusing modulo from code_add_decl_indent

The only effects of the modulo operation were to make indent slower and
to confuse human readers.

During the computation of the indentation, the main focus is on the
difference between the current indentation, as computed from the base
indentation and the current code, and the target indentation. All these
computations take opt.tabsize into account. When looking only at the
difference, whether or not a multiple of opt.tabsize is added does not
matter.

No functional change.
 1.182 30-Oct-2021  rillig indent: inline bloated call to 'parse' during initialization

No functional change.
 1.181 30-Oct-2021  rillig indent: condense code for parsing command line arguments

Previously, the cascade of 'if' statements suggested that there were 6
different cases to be handled when in reality there are only 3: no
arguments, 1 argument, 2 arguments. Let the code express this directly.

No functional change.
 1.180 30-Oct-2021  rillig indent: extract main_load_profiles from main_parse_command_line

No functional change.
 1.179 29-Oct-2021  rillig indent: remove redundant comments, remove punctuation from debug log

The comment about 'null stmt' between braces probably meant 'no
statements between braces'.

The comments at psym_switch_expr only repeated what the code says or had
been outdated 29 years ago already since opt.case_indent does not have
to be 'one level down'.

In the debug log, the quotes around the symbol names are not necessary
after a ':'. The parse stack also does not need this much punctuation.

Reducing a do-while loop to nothing instead of a statement saves a few
CPU cycles. It works because after each lbrace, a stmt is pushed to the
parser stack. This stmt can only ever be reduced to a stmt_list but
never be removed.
 1.178 29-Oct-2021  rillig indent: fix missing blank before binary operator
 1.177 29-Oct-2021  rillig indent: merge isblank and is_hspace into ch_isblank

No functional change.
 1.176 29-Oct-2021  rillig indent: replace segmentation fault with assertion
 1.175 29-Oct-2021  rillig indent: initialize 'ps' via code

This saves 3 kB of binary size since the parser state is rather large
and only very few members are initialized to non-zero values.

No functional change.
 1.174 29-Oct-2021  rillig indent: clean up main_init_globals

No functional change.
 1.173 29-Oct-2021  rillig indent: fix undefined behavior in buffer handling

Adding an arbitrary integer to a pointer may result in an out of bounds
pointer, so replace the addition with a pointer subtraction.

In the buffer handling functions, handle 'buf' and 'l' before 's' and
'e', since they are pairs.

In inbuf_read_line, use 's' instead of 'buf' to make the code easier to
understand for human readers.

No functional change.
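
The difference between the two forms of bounds check can be sketched as follows. The struct layout and the function name buffer_has_room are assumptions for this example; only the idea of comparing a pointer difference instead of forming 'pointer + n' comes from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer: [s, e) holds the data, l is the end of the allocation. */
struct buffer {
	char *s;
	char *e;
	char *l;
};

/*
 * Risky form: 'buf->e + n' may point far outside the allocation, which is
 * undefined behavior even if the pointer is never dereferenced:
 *
 *	if (buf->e + n >= buf->l) ...
 *
 * Safe form: both pointers belong to the same allocation, so their
 * difference is well defined and can be compared to n.
 */
static bool
buffer_has_room(const struct buffer *buf, size_t n)
{
	assert(buf->e <= buf->l);
	return n <= (size_t)(buf->l - buf->e);
}
```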
 1.172 29-Oct-2021  rillig indent: mark obviously broken code
 1.171 29-Oct-2021  rillig indent: use prev/curr/next to refer to the current token

The word 'last' just didn't match with 'next'.

No functional change.
 1.170 29-Oct-2021  rillig indent: rename ps.dumped_decl_indent and indent_declaration

The word 'dump' in 'ps.dumped_decl_indent' was too close to dump_line,
which led to confusion since the variable controls whether the
indentation has been added to the code buffer, which happens way before
actually dumping the current line to the output file.

The function name 'indent_declaration' was too unspecific, it did not
reveal where the indentation of the declaration actually happened.

No functional change.
 1.169 29-Oct-2021  rillig indent: keep p_l_follow nonnegative, use consistent comparison

No functional change.
 1.168 29-Oct-2021  rillig indent: spell 'parentheses' properly in messages and comments
 1.167 28-Oct-2021  rillig indent: clean up indentation, comments, reduce

No functional change.
 1.166 28-Oct-2021  rillig indent: remove unused local variable in lexi

Since the previous commit, lexi is always called with the same argument,
so remove that parameter.

The previous commit broke the debug logging by not printing "transient
state" anymore. Replace this with "rolled back parser state" at the
caller's site.

No functional change.
 1.165 28-Oct-2021  rillig indent: reduce negations in search_stmt_lookahead

No functional change.
 1.164 28-Oct-2021  rillig indent: clean up comments and function names

Having accurate names for the lexer symbols and the parser symbols makes
most of the comments redundant. Remove these.

Rename process_decl to process_type, to match the name of the
corresponding lexer symbol. In this phase, it's just a single type
token, not a whole declaration.

No functional change.
 1.163 28-Oct-2021  rillig indent: make error messages for option parsing more precise
 1.162 26-Oct-2021  rillig indent: clean up process_comment

There is no undefined behavior since the compared characters are always
from the basic execution character set. All other cases are covered by
the condition above for now_len.

Fix debug logging for non-ASCII characters, previously a character was
output as \xffffffc3.
 1.161 26-Oct-2021  rillig indent: make ps.keyword easier to understand

Previously, ps.keyword did not have any documentation and was not
straight-forward. In some cases it was reset to kw_0, in others it was
set to an interesting value. The idea behind it was to remember the kind
of word of the previous token, to decide whether to have a space between
sizeof or offsetof and a following '('.

No functional change.
 1.160 26-Oct-2021  rillig indent: run indent on its own source code

With manual corrections afterwards, to compensate for the remaining bugs
in indent.

Without the type definitions in .indent.pro, the opening braces of the
functions kw_name and lexi_alnum would not be at the beginning of the
line.
 1.159 25-Oct-2021  rillig indent: improve debug logging

Output the various details in chronological order.
 1.158 25-Oct-2021  rillig indent: rename search_brace to search_stmt

No functional change.
 1.157 25-Oct-2021  rillig indent: rename local variable sp_sw to spaced_expr

The 'sp' probably meant 'space-enclosed'; no idea what 'sw' was meant to
mean. Maybe 'switch', but that would have been rather ambiguous when
talking about control flow statements.

No functional change.
 1.156 25-Oct-2021  rillig indent: split type token_type into 3 separate types

Previously, token_type was used for 3 different purposes:

1. symbol types from the lexer
2. symbol types on the parser stack
3. kind of control statement for 'if (expr)' and similar statements

Splitting the 41 constants into separate types makes it immediately
clear that the parser stack never handles comments, preprocessing lines,
newlines, form feeds, the inner structure of expressions.

Previously, the constant switch_expr was especially confusing since it
was used for 3 different purposes: when returned from lexi, it
represented the keyword 'switch', in the parser stack it represented
'switch (expr)', and it was used for a statement head as well.

The only overlap between the lexer symbols and the parser symbols are
'{' and '}', and the keywords 'do' and 'else'. To increase confusion,
the constants of the previous token_type were in apparently random
order and before 2021, they had cryptic, highly abbreviated names.

No functional change.
 1.155 24-Oct-2021  rillig indent: rename form_feed to tt_lex_form_feed

No functional change.
 1.154 24-Oct-2021  rillig indent: split kw_for_or_if_or_while into separate constants

No functional change.
 1.153 24-Oct-2021  rillig indent: split kw_do_or_else into separate constants

It was unnecessarily confusing to have the token types keyword_do_else,
keyword_do and keyword_else at the same time, without any hint in what
they differed.

Some of the token types seem to be used by the lexer while others are
used in the parse stack. Maybe all token types can be partitioned into
these groups, which would suggest using two different types for them.
And if not, it's still clearer to have this distinction in the names of
the constants.

No functional change.
 1.152 24-Oct-2021  rillig indent: rename seen_quest to quest_level

The new name aligns with other similar variables like ind_level,
case_ind_level and ifdef_level. The old name 'seen' is mainly used for
bool variables.

No functional change.
 1.151 24-Oct-2021  rillig indent: fix indentation of ad-hoc tagged variables

Seen among others in usr.bin/indent/lexi.c, variable 'keywords'.
 1.150 24-Oct-2021  rillig indent: initialize variables in main_loop in declaration

No functional change.
 1.149 24-Oct-2021  rillig indent: run indent on its own source code

With manual corrections afterwards. Indent still does not get
extra_expr_indent correctly, it also indents global variables after
tagged declarations too deep.

No functional change.
 1.148 24-Oct-2021  rillig indent: clean up format of warnings and errors

Previously, warnings and errors had the form of C block comments. Before
NetBSD io.c 1.20 from 2019-10-19, this format made sense because the
diagnostics could end up in the same output stream as the formatted
output.

Since NetBSD io.c 1.20 from 2019-10-19, all diagnostics are redirected
to stderr. This change was not mentioned in the commit message back
then, it makes sense nevertheless. Since stdout and stderr now are
properly separated, there is no need anymore to keep the weird format
for warnings and errors. Switch to the standard 'error: file:line'
format.

Move the function 'diag' to indent.c to have access to the name of the
current input file.
 1.147 24-Oct-2021  rillig indent: fix line number counting at beginning of function body
 1.146 24-Oct-2021  rillig indent: rename nitems to array_length
 1.145 24-Oct-2021  rillig indent: replace global variable use_ff with function parameter
 1.144 20-Oct-2021  rillig indent: rename ps.last_u_d to match its comment

No functional change.
 1.143 20-Oct-2021  rillig indent: rename parser stack variables

No functional change.
 1.142 20-Oct-2021  rillig indent: rename blankline_requested variables

The words 'prefix' and 'postfix' sounded too much like horizontal
concepts, like in operators. The actual purpose of these variables is to
add blank lines before and after the current line, so use the same
wording as in the command line options.

No functional change.
 1.141 20-Oct-2021  rillig indent: invert condition in process_newline

It's hard to follow a condition that combines many negated terms with
'||'. Group the conditions by their origin.

The condition '!opt.break_after_comma && break_comma' still sounds like
a contradiction, more investigations to follow.

No functional change.
 1.140 20-Oct-2021  rillig indent: rename next_blank_lines to blank_lines_to_output

The previous name was already an improvement over the name before that
(n_real_blanklines), but didn't express the intended purpose clearly
enough, so try another name.

No functional change.
 1.139 19-Oct-2021  rillig indent: if a file ends with indent off, don't add space-newline
 1.138 17-Oct-2021  rillig indent: parse int command line options strictly

On i386 and other platforms where LONG_MAX == INT_MAX, the test
t_errors/option_tabsize_very_large failed since the behavior on integer
overflow differs between ILP32 and LP64 platforms. Noticed by gson@.

Avoid this unintended difference by adding reasonable limits for each of
the integer options and by replacing atoi with strtol.
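
A minimal sketch of strict integer parsing with strtol and per-option limits, under the assumption that an option value must be a complete decimal number. The function name parse_int_option and its signature are hypothetical; the actual code in indent may be organized differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Parse arg as a decimal integer in [min, max].  On success, store the
 * value in *out and return true; otherwise leave *out untouched.
 */
static bool
parse_int_option(const char *arg, int min, int max, int *out)
{
	char *end;
	long num;

	assert(min <= max);
	errno = 0;
	num = strtol(arg, &end, 10);
	if (end == arg || *end != '\0')
		return false;	/* empty string or trailing garbage */
	if (errno == ERANGE || num < min || num > max)
		return false;	/* same result on ILP32 and LP64 */
	*out = (int)num;
	return true;
}
```

Because the explicit limits are checked in addition to ERANGE, an overly large value is rejected in the same way regardless of whether long has 32 or 64 bits.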
 1.137 09-Oct-2021  rillig indent: condense code for calculating indentations

No functional change.
 1.136 09-Oct-2021  rillig indent: extract common code for advancing a single tab

No functional change.
 1.135 08-Oct-2021  rillig indent: clean up comments, parentheses, debug messages, boolean operator

No functional change.
 1.134 08-Oct-2021  rillig indent: rename in_or_st to init_or_struct

This makes a few comments redundant.

No functional change.
 1.133 08-Oct-2021  rillig indent: rename fill_buffer to inbuf_read_line

No functional change.
 1.132 08-Oct-2021  rillig indent: clean up process_decl, replace unnecessary strlen

No functional change.
 1.131 08-Oct-2021  rillig indent: remove unnecessary forward declarations

No functional change.
 1.130 08-Oct-2021  rillig indent: reduce negations in main_loop

No functional change.
 1.129 08-Oct-2021  rillig indent: fix parsing of preprocessor lines with comments and strings
 1.128 08-Oct-2021  rillig indent: run indent on indent.h

The formatting looks mostly OK.

Some struct members had excessively long names, leaving no space for
their corresponding comments. Renamed some of them using well-known
abbreviations.

The formatting for debug_vis_range is messed up, no idea why. It is
clearly a function declaration, not a function definition, so there is
no need to place the function name in column 1.

No functional change.
 1.127 08-Oct-2021  rillig indent: split process_keyword_do_else into separate functions

No functional change.
 1.126 08-Oct-2021  rillig indent: rename tokens lparen and rparen to be more precise

No functional change.
 1.125 07-Oct-2021  rillig indent: rename bp_save to saved_inp_s, be_save to saved_inp_e

Using the same naming convention makes it easier to relate the
variables.

No functional change.
 1.124 07-Oct-2021  rillig indent: group variables for the input buffer

The input buffer follows the same concept as the intermediate buffers
for label, code, comment and token, so use the same type for it.

No functional change.
 1.123 07-Oct-2021  rillig indent: move definition of bufsize from header to implementation

No functional change.
 1.122 07-Oct-2021  rillig indent: rename opt.btype_2 to brace_same_line

No functional change.
 1.121 07-Oct-2021  rillig indent: clean up code, remove outdated wrong comments

No functional change.
 1.120 07-Oct-2021  rillig indent: use braces around multi-line statements

No functional change.
 1.119 07-Oct-2021  rillig indent: let the code breathe a bit by inserting empty lines

No functional change.
 1.118 07-Oct-2021  rillig indent: clean up comments

No functional change.
 1.117 07-Oct-2021  rillig indent: fix wrong or outdated comments

No functional change.
 1.116 07-Oct-2021  rillig indent: remove redundant comments

No functional change.
 1.115 07-Oct-2021  rillig indent: reduce indentation

No functional change.
 1.114 07-Oct-2021  rillig indent: remove global variable option_source

It is only needed at startup, while parsing the options. The string "?"
was not needed at all.

No functional change.
 1.113 07-Oct-2021  rillig indent: clean up colon handling

No functional change.
 1.112 07-Oct-2021  rillig indent: add high-level API for working with buffers

This makes the code more boring to read, which is actually good. Less
fiddling with memcpy and pointer arithmetic.

Since indent is not a high-performance tool used for bulk operations on
terabytes of source code, there is no need to squeeze out every possible
CPU cycle.

No functional change.
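
To illustrate what such a high-level buffer API can look like, here is a small sketch. The struct layout and the helper buf_reserve are assumptions for this example; the real buffers in indent are laid out differently, and only the general idea of hiding memcpy and capacity handling behind small functions is taken from the message above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer, tracked by length and capacity. */
struct buffer {
	char *mem;
	size_t len;
	size_t cap;
};

/* Make sure that at least 'extra' more bytes fit into the buffer. */
static void
buf_reserve(struct buffer *buf, size_t extra)
{
	assert(buf->len <= buf->cap);
	if (extra <= buf->cap - buf->len)
		return;
	size_t cap = buf->cap == 0 ? 16 : buf->cap;
	while (cap - buf->len < extra)
		cap *= 2;
	char *mem = realloc(buf->mem, cap);
	if (mem == NULL)
		abort();
	buf->mem = mem;
	buf->cap = cap;
}

static void
buf_add_char(struct buffer *buf, char ch)
{
	buf_reserve(buf, 1);
	buf->mem[buf->len++] = ch;
}

/* Append the characters from the range [s, e). */
static void
buf_add_range(struct buffer *buf, const char *s, const char *e)
{
	size_t n = (size_t)(e - s);

	if (n == 0)
		return;
	buf_reserve(buf, n);
	memcpy(buf->mem + buf->len, s, n);
	buf->len += n;
}
```

Callers then only say what they want to append; the capacity checks and the copying live in one place.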
 1.111 07-Oct-2021  rillig indent: rename copy_id to copy_token

No functional change.
 1.110 07-Oct-2021  rillig indent: raise WARNS from the default 5 up to 6
 1.109 07-Oct-2021  rillig indent: prevent division by zero
 1.108 05-Oct-2021  rillig indent: rename n_real_blanklines

The word 'n' was not as helpful as possible, the word 'real' did not
give any clue at all about the variable's purpose.

No functional change.
 1.107 05-Oct-2021  rillig indent: fix off-by-one error for indented first line
 1.106 05-Oct-2021  rillig indent: make off-by-one error in main_prepare_parsing more visible

No functional change.
 1.105 05-Oct-2021  rillig indent: make variable names more expressive

The abbreviation 'dec' looked too much like 'decimal' instead of the
intended 'declaration'.

No functional change.
 1.104 05-Oct-2021  rillig indent: remove variable name prefix 'inout_'

This makes the variable names more readable. The prefix is not actually
needed to understand the code, it is rather distracting.

The compiler and lint will guard against any accidental mismatch between
pointer, integer and bool.

No functional change.
 1.103 05-Oct-2021  rillig indent: fix Clang-Tidy warnings, clean up bakcopy

The comment above and inside bakcopy had been outdated for at least the
last 28 years, the backup file is named "%s.BAK", not ".B%s".

Prevent buffer overflow for very long filenames (sprintf -> snprintf).
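
The sprintf-to-snprintf change can be sketched like this. The function name make_backup_name is hypothetical; only the "%s.BAK" pattern comes from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Build the backup file name.  snprintf never writes more than dst_size
 * bytes, and its return value tells whether the name was truncated.
 */
static bool
make_backup_name(char *dst, size_t dst_size, const char *in_name)
{
	int n;

	assert(dst_size > 0);
	n = snprintf(dst, dst_size, "%s.BAK", in_name);
	return n >= 0 && (size_t)n < dst_size;
}
```

A caller can treat a false return value as an error instead of silently working with a truncated file name.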
 1.102 05-Oct-2021  rillig indent: fix spelling in comments
 1.101 05-Oct-2021  rillig indent: merge duplicate code into is_hspace

No functional change.
 1.100 05-Oct-2021  rillig indent: clean up code for appending to buffers

Use *e++ for appending and e[-1] for testing the previously appended
character, like in other places in the code.

No functional change.
 1.99 05-Oct-2021  rillig indent: merge duplicate code for reading from input buffer

No functional change.
 1.98 03-Oct-2021  rillig indent: rename functions

There was no good reason for using the different verbs 'scan' and 'set'
for two functions that essentially do the same.

No functional change.
 1.97 03-Oct-2021  rillig indent: fix content of profile_name

Previously, profile_name included the leading "-P", which was confusing.
 1.96 30-Sep-2021  rillig indent: remove space between ')' and '(' in declarations
 1.95 30-Sep-2021  rillig indent: untangle want_blank_before_lparen

No functional change.
 1.94 30-Sep-2021  rillig indent: extract want_blank_before_lparen

No functional change.
 1.93 30-Sep-2021  rillig indent: add space between ',' and '[' in C99 initializations
 1.92 27-Sep-2021  rillig indent: let indent format the comments after previous refactoring

Before this refactoring, I had skipped this section of the code from
formatting since the 'default:' branch was enclosed in a block of its
own, and that block would have been indented one more level to the
right. Extracting that code into a separate function got rid of the
extra braces.

No functional change.
 1.91 27-Sep-2021  rillig indent: split search_brace into smaller functions

No functional change.
 1.90 27-Sep-2021  rillig indent: use binary instead of linear search when adding types

No functional change.
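
As a rough sketch of what binary search on insertion can look like, assuming the type names are kept in a sorted array: the names lower_bound and add_name, the fixed capacity and the duplicate handling are illustrative assumptions, not indent's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Index of the first element that is not less than name. */
static size_t
lower_bound(const char *const *names, size_t len, const char *name)
{
	size_t lo = 0, hi = len;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (strcmp(names[mid], name) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Insert name at its sorted position; returns false for duplicates. */
static bool
add_name(const char **names, size_t *len, size_t cap, const char *name)
{
	size_t pos = lower_bound(names, *len, name);

	if (pos < *len && strcmp(names[pos], name) == 0)
		return false;
	assert(*len < cap);
	/* Shift the tail one slot to the right to make room. */
	memmove(names + pos + 1, names + pos,
	    (*len - pos) * sizeof(names[0]));
	names[pos] = name;
	(*len)++;
	return true;
}
```

Finding the position takes O(log n) comparisons instead of O(n), although shifting the tail of the array is still linear.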
 1.89 27-Sep-2021  rillig indent: rename rwcode to keyword_kind, various cleanup

No idea what the 'rw' in 'rwcode' meant, it had been imported that way
28 years ago. Since rwcode specifies the kind of a keyword, the prefix
'kw_' makes sense.

No functional change.
 1.88 26-Sep-2021  rillig indent: unexport global variables

The variable match_state was write-only and was thus removed.

No functional change.
 1.87 26-Sep-2021  rillig indent: negate and rename option.leave_comma

The old name did not mirror the description in the manual page, and it
was the only option that is negated. Inverting it allows the options
table to be compressed.
 1.86 26-Sep-2021  rillig indent: let indent format its own code -- in supervised mode

After running indent on the code, I manually selected each change that
now looks better than before. The remaining changes are left for later.
All in all, indent did a pretty good job, except for syntactic additions
from after 1990, but that was to be expected. Examples of such
additions are GCC's __attribute__ and C99 designated initializers.

Indent has only a few knobs to tune the indentation. The knob for the
continuation indentation applies to function declarations as well as to
expressions. The knob for indentation of local variable declarations
applies to struct members as well, even if these are members of a
top-level struct.

Several code comments crossed the right margin in column 78. Several
other code comments were correctly broken though. The cause for this
difference was not obvious.

No functional change.
 1.85 26-Sep-2021  rillig indent: fix missing space between comma and ellipsis

According to lint's C grammar, in standard C an ellipsis only occurs
after a comma. There are GCC extensions that allow an ellipsis as the
only function parameter, as well as in 'case a ... b', but these are
rare.
 1.84 25-Sep-2021  rillig indent: misc cleanup

No functional change.
 1.83 25-Sep-2021  rillig indent: convert found_err to bool

That variable had slipped through the migration since it consistently
used int for the declaration, the definition and all assignments.

No functional change.
 1.82 25-Sep-2021  rillig indent: use strlen instead of own implementation

The two loops look similar but differ in a crucial detail that makes up
for a '+ 1'.

No functional change.
 1.81 25-Sep-2021  rillig indent: merge duplicate code for token buffers

No functional change.
 1.80 25-Sep-2021  rillig indent: clean up argument handling

No functional change.
 1.79 25-Sep-2021  rillig indent: un-abbreviate a few parser_state members, clean up comments

No functional change.
 1.78 25-Sep-2021  rillig indent: remove dead code for printing comments after empty lines

This code has been commented out for at least 29 years.

No functional change.
 1.77 25-Sep-2021  rillig indent: reduce code and data size for lexing of numbers

Instead of having a table of strings (121 pointers + 121 data
relocations), reduce that table to the actual character data and use a
secondary table for looking up the correct row in the main table.

No functional change.
 1.76 25-Sep-2021  rillig indent: rename option variable to be more expressive

No functional change.
 1.75 25-Sep-2021  rillig indent: convert remaining ibool to bool

No functional change intended.
 1.74 25-Sep-2021  rillig indent: convert parser_state from ibool to bool

indent.c:400:5: error: suggest parentheses around assignment used as
truth value
io.c:271:32: error: ‘~’ on a boolean expression

No functional change intended.
 1.73 25-Sep-2021  rillig indent: prepare for lint's strict bool mode

Before C99, C had no boolean type. Instead, indent used int for that,
just like many other programs. Even with C99, bool and int can be used
interchangeably in many situations, such as querying '!i' or '!ptr' or
'cond == 0'.

Since January 2021, lint provides the strict bool mode, which makes bool
a non-arithmetic type that is incompatible with any other type. Having
clearly separate types helps in understanding the code.

To migrate indent to strict bool mode, the first step is to apply all
changes that keep the resulting binary the same. Since sizeof(bool) is
1 and sizeof(int) is 4, the type ibool serves as an intermediate type.
For now it is defined to int, later it will become bool.

The current code compiles cleanly in C99 and C11 mode, as well as in
lint's strict bool mode. There are a few tricky places:

In args.c in 'struct pro', there are two types of options: boolean and
integer. Boolean options point to a bool variable, integer options
point to an int variable. To keep the current structure of the code,
the pointer has been changed to 'void *'. To ensure type safety, the
definition of the options is done via preprocessor magic, which in C11
mode ensures the correct pointer types. (Add CFLAGS+=-std=gnu11 at the
very bottom of the Makefile.)

In indent.c in process_preprocessing, a boolean variable is
post-incremented. That variable is only assigned to another variable,
and that variable is only used in a boolean context. To provoke a
different behavior between the '++' and the '= true', the source code
to be indented would need 1 << 32 preprocessing directives, which is
unlikely to happen in practice.

In io.c in dump_line, the variables ps.in_stmt and ps.in_decl only ever
get the values 0 and 1. For these values, the expressions 'a & ~b' and
'a && !b' are equivalent, in all versions of C. The compiler may
generate different code for them, though.

In io.c in parse_indent_comment, the assignment to inhibit_formatting
takes place in integer context. If the compiler is smart enough to
detect the possible values of on_off, it may generate the same code
before and after the change, but that is rather unlikely.

The second step of the migration will be to replace ibool with bool,
step by step, just in case there are any hidden gotchas in the code,
such as sizeof or pointer casts.

No change to the resulting binary.
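
The claim about 'a & ~b' and 'a && !b' in dump_line can be checked with a small sketch. The function names below are made up for illustration; the point is that the two expressions agree only as long as both operands are restricted to 0 and 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Old integer style: bitwise AND with the bitwise complement. */
static int
int_style(int a, int b)
{
	return a & ~b;
}

/* New bool style: logical AND with logical negation. */
static bool
bool_style(bool a, bool b)
{
	return a && !b;
}

/* Assert that both styles agree for all operands from {0, 1}. */
static void
check_equivalence(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
			assert((int_style(a, b) != 0)
			    == bool_style(a != 0, b != 0));
}
```

For other values the two forms diverge: with a = 2 and b = 1, the bitwise form yields 2 (true) while the logical form yields false, which is why the restriction to 0 and 1 matters.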
 1.72 25-Sep-2021  rillig indent: merge duplicate code for initializing buffers

No functional change.
 1.71 25-Sep-2021  rillig indent: clean up initialization of options

The default values in 'struct pro' were redundant but all consistent,
even with the commented defaults in main_parse_command_line.

No functional change.
 1.70 25-Sep-2021  rillig indent: remove ifdef for lint

NetBSD lint does not need them anymore, FreeBSD does not have lint.
 1.69 25-Sep-2021  rillig indent: move statistical values into a separate struct

No functional change.
 1.68 25-Sep-2021  rillig indent: add nonnull memory allocation functions

The only functional change is a single error message.
 1.67 25-Sep-2021  rillig indent: group global variables for token buffer

No functional change.
 1.66 25-Sep-2021  rillig indent: inline macro 'token'

No functional change.
 1.65 25-Sep-2021  rillig indent: group global variables for code buffer

No functional change.
 1.64 25-Sep-2021  rillig indent: rename variables of type token_type

The previous variable name 'code' conflicts with the buffer of the same
name.

No functional change.
 1.63 24-Sep-2021  rillig indent: group global variables for label buffer into struct

No functional change.
 1.62 24-Sep-2021  rillig indent: group global variables for the comment buffer

No functional change.
 1.61 25-Aug-2021  rillig indent: fix lint warnings about type conversions on ilp32

No functional change.
 1.60 26-Mar-2021  rillig indent: fix Clang build everywhere but on amd64

No idea why Clang didn't complain about this on amd64, only on all other
platforms.
 1.59 14-Mar-2021  rillig indent: fix lint warnings

No functional change.
 1.58 13-Mar-2021  rillig indent: add debug logging for switching the input buffer

No functional change outside debug mode.
 1.57 13-Mar-2021  rillig indent: distinguish between 'column' and 'indentation'

column == 1 + indentation.

In addition, indentation is a relative distance while column is an
absolute position. Therefore, don't confuse these two concepts, to
prevent off-by-one errors.

No functional change.
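
The relation column == 1 + indentation can be sketched as follows. The function names and the tab width of 8 are assumptions for illustration only, not indent's actual helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: tab stops every 8 character cells. */
enum { TABSIZE = 8 };

/* Width of the leading whitespace of a line, a 0-based distance. */
static int
indentation_of(const char *line)
{
	int ind = 0;

	assert(line != NULL);
	for (; *line == ' ' || *line == '\t'; line++) {
		if (*line == '\t')
			ind = (ind / TABSIZE + 1) * TABSIZE;
		else
			ind++;
	}
	return ind;
}

/* 1-based column of the first non-blank character. */
static int
column_of(const char *line)
{
	return 1 + indentation_of(line);
}
```

Keeping all arithmetic in terms of indentation and converting to a column only at the very end confines the '+ 1' to a single place.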
 1.56 13-Mar-2021  rillig indent: rename pr_comment to process_comment, clean up documentation

No functional change.
 1.55 13-Mar-2021  rillig indent: fix handling of '/*' in string literal in preprocessing line

Previously, the '/*' in the string literal had been interpreted as the
beginning of a comment, which was wrong. Because of that, the variable
declaration in the following line was still interpreted as part of the
comment. The comment even continued until the end of the file.

Due to indent's forgiving nature, it neither complained nor even
mentioned that anything had gone wrong. The decision to produce wrong
output rather than fail early is a dangerous one.

At least, there should have been an error message that at the end of the
file, the parser was still in a comment, expecting the closing '*/'.
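
A simplified sketch of the idea behind this fix, namely that '/*' only starts a comment outside of string and character literals. The function find_comment_start is hypothetical and much simpler than indent's real lexer.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return a pointer to the first comment opener that is outside a
 * string or character literal, or NULL if there is none.
 */
static const char *
find_comment_start(const char *p)
{
	char quote = '\0';	/* the open quote character, if any */

	assert(p != NULL);
	for (; *p != '\0'; p++) {
		if (quote != '\0') {
			if (*p == '\\' && p[1] != '\0')
				p++;	/* skip the escaped character */
			else if (*p == quote)
				quote = '\0';
		} else if (*p == '"' || *p == '\'')
			quote = *p;
		else if (p[0] == '/' && p[1] == '*')
			return p;
	}
	return NULL;
}
```

For a line such as `#define S "/*" /* c */`, this sketch skips the '/*' inside the string literal and reports the real comment that follows it.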
 1.54 13-Mar-2021  rillig indent: split 'main_loop' into several functions

No functional change.
 1.53 13-Mar-2021  rillig indent: split 'main' into manageable parts

For several years (maybe even decades), compilers have known how to inline
static functions that are only used once. Therefore there is no need to
have overly long functions anymore, especially not 'main', which is only
called a single time and thus does not add any noticeable performance
degradation.

No functional change.
 1.52 13-Mar-2021  rillig indent: remove redundant parentheses

No functional change.
 1.51 13-Mar-2021  rillig indent: fix confusing variable names

The word 'col' should only be used for the 1-based column number. This
name is completely inappropriate for a line length since that provokes
off-by-one errors. The name 'cols' would be acceptable although
confusing since it sounds so similar to 'col'.

Therefore, rename variables that are related to the maximum line length
to 'line_length' since that makes for obvious code and nicely relates to
the description of the option in the manual page.

No functional change.
 1.50 13-Mar-2021  rillig indent: inline calls to count_spaces and count_spaces_until

These two functions operated on column numbers instead of indentation,
which required adjustments of '+ 1' and '- 1'. Their names were
completely wrong since these functions did not count anything, instead
they computed the column.

No functional change.
 1.49 13-Mar-2021  rillig indent: replace compute_code_column with compute_code_indent

The goal is to only ever be concerned about the _indentation_ of a
token, never the _column_ it appears in. Having only one of these
avoids off-by-one errors.

No functional change.
 1.48 13-Mar-2021  rillig indent: add debug logging for actually writing to the output file

Together with the results of the tokenizer and the 4 buffers for token,
label, code and comment, the debug log now provides a good high-level
view on how the indentation happens and where to look for the many
remaining bugs.
 1.47 13-Mar-2021  rillig indent: replace pad_output with output_indent

Calculating the indentation is simpler than calculating the column,
since that saves the constant addition and subtraction of the 1.

No functional change.
 1.46 12-Mar-2021  rillig indent: replace 'target' with 'indent' in function names

The word 'target' was not as specific as possible.

No functional change.
 1.45 12-Mar-2021  rillig indent: use consistent indentation for 'else'

Half of the code used -ce, the other half the opposite -nce.

No functional change.
 1.44 12-Mar-2021  rillig indent: manually fix indentation

No functional change.
 1.43 11-Mar-2021  rillig indent: reduce indentation of check_size functions

No functional change.
 1.42 11-Mar-2021  rillig indent: remove redundant cast after allocation functions

No functional change.
 1.41 09-Mar-2021  rillig indent: extract search_brace from main

No functional change.
 1.40 09-Mar-2021  rillig indent: extract capsicum code out of the main function

No functional change.
 1.39 09-Mar-2021  rillig indent: rename a few more token types

The previous names were either too short or ambiguous.

No functional change.
 1.38 09-Mar-2021  rillig indent: make token names more precise

The previous 'casestmt' was wrong since a case label is not a statement
at all.

The previous 'swstmt' was overly short, and wrong as well, since it
represents only the 'switch (expr)' part, which is not a complete switch
statement. Same for 'ifstmt', 'whilestmt', 'forstmt'.

The previous word 'head' was not precise enough since it didn't specify
exactly where the head ends and the body starts. Especially for
handling the dangling else, this distinction is important.

No functional change.
 1.37 09-Mar-2021  rillig indent: rename a few tokens to be more obvious

For casual readers it is not obvious whether the 'sp' meant 'special' or
'space' or something entirely different.
 1.36 09-Mar-2021  rillig indent: manually indent comments

It's strange that indent's own code is not formatted by indent itself,
which would be a good demonstration of its capabilities.

In its current state, I don't trust indent to get even the tokenization
correct, therefore the only safe way is to format the code manually.
 1.35 08-Mar-2021  rillig indent: inline macro for backslash

No functional change.
 1.34 08-Mar-2021  rillig indent: convert big macros to functions

Each of these buffers is only modified in a single file. This makes it
unnecessary to declare the macros in the global header.
 1.33 08-Mar-2021  rillig indent: fix printing of uninitialized 'token' in debug output
 1.32 07-Mar-2021  rillig indent: sprinkle a few const

No functional change.
 1.31 07-Mar-2021  rillig indent: use named constants for the different types of keywords

This reduces the magic numbers in the code. Most of these had their
designated constant name written in a nearby comment anyway.

The one instance where arithmetic was performed on this new enum type
(in indent.c) was a bit tricky to understand.

The combination rw_continue_or_inline_or_restrict looks strange, the
'continue' should intuitively belong to the other control flow keywords
in rw_break_or_goto_or_return.

No functional change.
 1.30 07-Mar-2021  rillig indent: for the token types, use enum instead of #define

This makes it easier to step through the code in a debugger.

No functional change.
 1.29 07-Mar-2021  rillig indent: use all headers in all files

This is a prerequisite for converting the token types to an enum instead
of a preprocessor define, since the return type of lexi will become
token_type. Having the enum will make debugging easier.

There was a single naming collision, which forced the variable in
scan_profile to be renamed. All other token names are used nowhere
else.

No change to the resulting binary.
 1.28 06-Mar-2021  rillig indent: fix space-tab alignment in indent's own code

These parts are not fixed automatically by indent since they are in box
comments.

No functional change.
 1.27 23-Apr-2020  joerg Avoid common symbol declarations
 1.26 19-Oct-2019  christos use stdarg, annotate function as __printflike and fix broken formats.
 1.25 04-Apr-2019  kamil Upgrade indent(1)

Merge all the changes from the recent FreeBSD HEAD snapshot
into our local copy.

FreeBSD actively maintains this program in their sources and their
repository contains over 100 commits with changes.

Keep the delta between the FreeBSD and NetBSD versions to an absolute
minimum, mostly RCS Id and compatibility fixes.

Major changes in this import:

- Added an option -ldi<N> to control indentation of local variable names.
- Added option -P for loading user-provided files as profiles
- Added -tsn for setting tabsize
- Rename -nsac/-sac ("space after cast") to -ncs/-cs
- Added option -fbs, which enables (disables) splitting the function declaration and opening brace across two lines.
- Respect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)
- Group global option variables into an options structure
- Use bsearch() for looking up type keywords.
- Don't produce unneeded space character in function declarators
- Don't unnecessarily add a blank before a comment ends.
- Don't ignore newlines after comments that follow braces.

Merge the FreeBSD indent(1) tests with our ATF framework.
All tests pass.

Upgrade prepared by Manikishan Ghantasala.
Final polishing by myself.
 1.24 03-Feb-2019  mrg - add or adjust /* FALLTHROUGH */ where appropriate
- add __unreachable() after functions that can return but won't in
this case, and thus can't be marked __dead easily
 1.23 05-Sep-2016  sevan branches: 1.23.14;
Drop main() prototype.
 1.22 25-Feb-2016  ginsbach Fix obvious contraction spelling mistakes by adding missing apostrophes.
 1.21 22-Feb-2016  ginsbach Use warnx(3).
 1.20 22-Feb-2016  ginsbach Use errx(3).
 1.19 04-Sep-2014  mrg port the -ut / -nut options from freebsd. -ut (default) enables tabs
in output, the -nut uses spaces.
 1.18 12-Apr-2009  lukem branches: 1.18.24;
Fix WARNS=4 issues (-Wshadow -Wcast-qual -Wsign-compare)
 1.17 21-Jul-2008  lukem branches: 1.17.6;
Remove the \n and tabs from the __COPYRIGHT() strings.
Tweak to use a consistent format.
 1.16 30-Oct-2004  dsl branches: 1.16.28;
Add (unsigned char) cast to ctype functions
 1.15 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22365, verified by myself.
 1.14 19-Jun-2003  christos PR/21645: Mishka: Localized comments don't work with indent.
 1.13 26-May-2002  wiz Remove #ifndef'd __STDC__ code. ANSIfy.
 1.12 20-Aug-2001  wiz precede, not preceed.
 1.11 16-Jun-2001  kleink Handle a labeled statement at the beginning of a function correctly;
from Nagae Hidetake <nagae@tk.airnet.ne.jp> in PR bin/12781.
 1.10 19-Dec-1998  christos char -> unsigned char, braces for gcc-2.8.1
 1.9 08-Oct-1998  wsanchez Get rid of multiply defined common symbols
 1.8 06-Sep-1998  mellon Support indenting standard input. When indenting standard input, write output to standard output.
 1.7 25-Aug-1998  ross Add { and } to shut up egcs. Reformat the more questionable code.
 1.6 19-Oct-1997  lukem WARNSify, fix .Nm usage, deprecate register, use <err.h>, KNFify (with indent!;)
 1.5 18-Oct-1997  mrg merge lite-2.
 1.4 09-Jan-1997  tls RCS ID police
 1.3 07-May-1996  jtc Include appropriate header files to bring prototypes into scope.
Removed explicit errno declarations.
 1.2 01-Aug-1993  mycroft Add RCS identifiers.
 1.1 09-Apr-1993  cgd branches: 1.1.1;
added, from net/2 (patch 124).
 1.1.1.2 04-Apr-2019  kamil FreeBSD indent r340138
 1.1.1.1 08-Jun-1993  mrg 4.4BSD-Lite2
 1.16.28.1 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.17.6.1 13-May-2009  jym Sync with HEAD.

Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
 1.18.24.1 21-Sep-2014  snj Pull up following revision(s) (requested by mrg in ticket #110):
usr.bin/indent/io.c: revision 1.15
usr.bin/indent/indent_globs.h: revision 1.10
usr.bin/indent/args.c: revision 1.11
usr.bin/indent/indent.1: revision 1.23
usr.bin/indent/indent.c: revision 1.19
port the -ut / -nut options from freebsd. -ut (default) enables tabs
in output, the -nut uses spaces.
 1.23.14.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.23.14.1 10-Jun-2019  christos Sync with HEAD
 1.390.2.1 02-Aug-2025  perseant Sync with HEAD
 1.211 07-Jan-2025  rillig indent: condense and simplify parsing code
 1.210 04-Jan-2025  rillig indent: make debug log more uniform
 1.209 04-Jan-2025  rillig indent: make debug output easier to read

The previous format had the values of the parser state on the left side
and the corresponding names on the right side. While it looked nicely
aligned, it was not suitable for focusing on the actual data. Replace
this format with the more common "key: value" format.

Use the names of the enum constants in the debug log, instead of the
previous "nice" names that needed one more level of mental translation
and in some cases contained unbalanced punctuation such as '{'.
 1.208 03-Jan-2025  rillig indent: fix line breaks in else-if sequences

The flag ps.want_newline did not adequately model the conditions under
which a line break should be inserted, thus the redesign.

A welcome side effect is that in statements like 'if (cond);', the
semicolon is now placed on a separate line, thus becoming more visible.
 1.207 03-Dec-2023  rillig branches: 1.207.2;
indent: inline input-related macros

No binary change.
 1.206 03-Dec-2023  rillig indent: group input-related variables into a struct

No functional change.
 1.205 03-Dec-2023  rillig indent: use line number of the token start in diagnostics

Previously, the line number of the end of the token was used, which was
confusing in debug mode.
 1.204 26-Jun-2023  rillig indent: implement 'blank line above first statement in function body'
 1.203 23-Jun-2023  rillig indent: properly store parser state in debug mode

The stacks in the parser state are allocated now and need to be copied
individually.

The test whether two paren stacks are equal was broken since 2023-06-14
14:11:28.
 1.202 16-Jun-2023  rillig indent: merge lexer symbols for type in/outside parentheses
 1.201 16-Jun-2023  rillig indent: fix indentation and linebreaks in typedef declarations
 1.200 16-Jun-2023  rillig indent: don't force a blank line between '}' and preprocessing line
 1.199 16-Jun-2023  rillig indent: rename a field of the parser state

The previous name 'comment_in_first_line' was misleading, as it could
mean that there was a comment in the first line of the file.

No functional change.
 1.198 15-Jun-2023  rillig indent: rename state variable to be more accurate

No binary change.
 1.197 14-Jun-2023  rillig indent: clean up the code, add a few tests
 1.196 14-Jun-2023  rillig indent: allow more than 128 brace levels
 1.195 14-Jun-2023  rillig indent: clean up array indexing for parser symbols

With 'top' pointing to the actual top element, the array was indexed in
the closed range from 0 to top. All other arrays are indexed by the
usual half-open interval from 0 to len.

No functional change.
 1.194 14-Jun-2023  rillig indent: allow more than 20 nested parentheses or brackets
 1.193 14-Jun-2023  rillig indent: clean up debugging code
 1.192 14-Jun-2023  rillig indent: clean up handling of comments

One less moving part in the parser state.

No functional change.
 1.191 14-Jun-2023  rillig indent: remove another flag from parser state

When processing a comment, the flag ps.next_col_1 was not used for the
next token, but for a line within a comment. As its scope was limited
to a single comment, there is no need to store it any longer than that.

No functional change.
 1.190 14-Jun-2023  rillig indent: remove a redundant flag from the parser state

No functional change.
 1.189 14-Jun-2023  rillig indent: merge parser symbols for stmt and stmt_list

They were handled in exactly the same way.
 1.188 10-Jun-2023  rillig indent: rename misleading variable

The name started with 'line_start', but the value is not always the
value from the beginning of the line.

No functional change.
 1.187 10-Jun-2023  rillig indent: miscellaneous cleanups
 1.186 10-Jun-2023  rillig indent: in debug mode, null-terminate buffers
 1.185 10-Jun-2023  rillig indent: clean up function and variable names
 1.184 10-Jun-2023  rillig indent: rename and sort variables in parser state

No functional change.
 1.183 09-Jun-2023  rillig indent: trim trailing blank lines
 1.182 09-Jun-2023  rillig indent: group lexer symbols by topic, sort processing functions

No functional change.
 1.181 09-Jun-2023  rillig indent: don't treat function call expressions as cast expressions
 1.180 09-Jun-2023  rillig indent: when an indentation is ambiguous, indent one level further

The '-eei' mode now applies whenever the indentation from a multi-line
expression could be confused with a following statement.
 1.179 08-Jun-2023  rillig indent: remove fragile heuristic for detecting cast expressions

The assumption that in an expression of the form '(a * anything)', the
'*' marks a pointer type was too simple-minded.

For now, fix the obvious cases and leave the others for later. If
needed, they can be worked around using the '-T' option.
 1.178 08-Jun-2023  rillig indent: clean up and condense code

No functional change.
 1.177 07-Jun-2023  rillig indent: extract the stack of parser symbols to a separate struct

No functional change.
 1.176 06-Jun-2023  rillig indent: compute indentation of 'case' labels on-demand

One less moving part to keep track of.

No functional change.
 1.175 05-Jun-2023  rillig indent: sync debug output with parser state
 1.174 05-Jun-2023  rillig indent: format own source code
 1.173 05-Jun-2023  rillig indent: do not report broken lines, report configuration on stderr
 1.172 05-Jun-2023  rillig indent: rename variables, clean up comments

No binary change.
 1.171 04-Jun-2023  rillig indent: remove read pointer from buffers that don't need it

The only buffer that needs a read pointer is the current input line in
'inp'.

No functional change.
 1.170 04-Jun-2023  rillig indent: track the kind of '{' on the parser stack
 1.169 04-Jun-2023  rillig indent: fix debug output of the parser symbol stack

Even though the stack always contains a stmt_list as first element,
print it nevertheless to avoid confusion about starting at index 1, and
to provide the full picture.
 1.168 04-Jun-2023  rillig indent: rename struct field, for better symmetry

No binary change outside debug mode.
 1.167 04-Jun-2023  rillig lint: use separate lexer symbols for 'case' and 'default'

It's not strictly necessary since these tokens behave in the same way,
still, the code is more straightforward when there are separate tokens.
 1.166 04-Jun-2023  rillig indent: classify 'inline' as a modifier rather than a word
 1.165 04-Jun-2023  rillig indent: use separate lexer symbols for the different kinds of ':'
 1.164 04-Jun-2023  rillig indent: handle the indentation of 'case' in a simpler way
 1.163 04-Jun-2023  rillig indent: separate code for handling parentheses and brackets

Handling parentheses is more complicated than for brackets.
 1.162 02-Jun-2023  rillig indent: improve heuristics of classifying '*' as pointer or operator
 1.161 02-Jun-2023  rillig indent: clean up

Only print the 'token' buffer in debug mode if it is interesting, group
the blocks in handling of '(' tokens by topic, remove obsolete comment
from test.
 1.160 02-Jun-2023  rillig indent: fix formatting of declarations with preprocessing lines
 1.159 23-May-2023  rillig indent: split debug output into paragraphs

The paragraphs separate the different processing steps: getting a token
from the lexer, processing the token, updating the parser state, sending
a finished line to the output.
 1.158 23-May-2023  rillig indent: fix spacing in declarations in for loops
 1.157 22-May-2023  rillig indent: implement suppressing optional blank lines
 1.156 20-May-2023  rillig indent: extract the output state from the parser state

The parser state depends on the preprocessing lines, the output state
shouldn't.
 1.155 20-May-2023  rillig indent: implement blank line above block comment
 1.154 20-May-2023  rillig indent: implement blank line after function body
 1.153 20-May-2023  rillig indent: implement blank lines around conditional compilation
 1.152 20-May-2023  rillig indent: add debug logging for brace indentation

No functional change outside debug mode, as the initialization of
di_stack[0] was redundant.
 1.151 20-May-2023  rillig indent: separate detection of function definitions from lexing '*'

No functional change.
 1.150 18-May-2023  rillig indent: document the funcname token
 1.149 18-May-2023  rillig indent: rename a few functions

No functional change.
 1.148 18-May-2023  rillig indent: switch to standard code style

Taken from share/misc/indent.pro.

Indent does not wrap code to fit into the line width, it only does so
for comments. The 'INDENT OFF' sections and too long lines will be
addressed in a follow-up commit.

No functional change.
 1.147 18-May-2023  rillig indent: remove unnecessary variable size optimization

Due to the enum that follows in the struct, the short variable was
padded to 4 bytes anyway.

No functional change.
 1.146 16-May-2023  rillig indent: directly access the input buffer

No functional change.
 1.145 16-May-2023  rillig indent: remove support for form feed characters inside a line

Form feeds are occasionally used to split code into pages, and this use
is still supported. Having a form feed in the middle of a line is
exotic.
 1.144 16-May-2023  rillig indent: fix handling of INDENT OFF/ON comments

Previously, the 'INDENT OFF' comments were interpreted when the newline
token from the line above the comment was processed, which was earlier
than could be reasonably expected.

The 'INDENT ON' comments were interpreted equally early, which led to
the situation that the 'INDENT OFF' comments were preserved literally
but the 'INDENT ON' comments weren't.
 1.143 16-May-2023  rillig indent: move parsing of 'INDENT OFF/ON' comments to the lexer

No functional change.
 1.142 15-May-2023  rillig indent: clean up detection of whether parentheses form a cast

No functional change.
 1.141 15-May-2023  rillig indent: fix duplicate space between comment and binary operator
 1.140 15-May-2023  rillig indent: format its own code, extend some comments

With manual corrections, as there are still some bugs left.

No functional change.
 1.139 15-May-2023  rillig indent: fix indentation of statements after controlling expression
 1.138 15-May-2023  rillig indent: fix indentation of expressions in -nlp -eei mode
 1.137 15-May-2023  rillig indent: fix indentation of multi-line '?:' expressions in functions
 1.136 15-May-2023  rillig indent: group parser state by topic

No functional change.
 1.135 15-May-2023  rillig indent: clean up memory allocation

No functional change.
 1.134 15-May-2023  rillig indent: move debugging code to separate file

No functional change.
 1.133 15-May-2023  rillig indent: clean up memory and buffer management

Remove the need to explicitly initialize the buffers. To avoid
subtracting null pointers or comparing them using '<', migrate the
buffers from the (start, end) form to the (start, len) form. This form
also avoids inconsistencies in whether 'buf.e == buf.s' or 'buf.s ==
buf.e' is used.

Make buffer.st const, to avoid accidental modification of the buffer's
content.

Replace '*buf.e++ = ch' with buf_add_char, to avoid having to keep track
of how much unwritten space is left in the buffer. Remove all safety
margins, that is, no more unchecked access to buf.st[-1] or appending
using '*buf.e++'.

Fix line number counting in lex_word for words that contain line breaks.

No functional change.
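
The (start, len) buffer form might look roughly like the sketch below. Only the name buf_add_char and the general idea come from the log entry above; the struct layout, the field names and the growth policy are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed layout; a zero-initialized struct is a valid empty buffer. */
struct buffer {
	char *mem;	/* start of the allocated memory, may be NULL */
	size_t len;	/* number of bytes in use */
	size_t cap;	/* number of bytes allocated */
};

/* Append one character, growing the buffer on demand. */
static void
buf_add_char(struct buffer *buf, char ch)
{
	assert(buf != NULL);
	if (buf->len == buf->cap) {
		size_t new_cap = buf->cap > 0 ? 2 * buf->cap : 16;
		char *new_mem = realloc(buf->mem, new_cap);

		if (new_mem == NULL)
			abort();
		buf->mem = new_mem;
		buf->cap = new_cap;
	}
	buf->mem[buf->len++] = ch;
}
```

Because emptiness is tested as 'len == 0', no pointers are subtracted or compared, so a buffer whose memory is still NULL needs no special handling.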
 1.132 14-May-2023  rillig indent: only null-terminate the buffers if necessary

The only case where a buffer is used as a C-style string is when looking
up a keyword.

No functional change.
 1.131 14-May-2023  rillig indent: reduce code for scanning tokens

The input line is guaranteed to end with '\n', so there's no need to
carry another pointer around.

No functional change.
 1.130 14-May-2023  rillig indent: remove foreign RCS IDs
 1.129 14-May-2023  rillig indent: miscellaneous cleanups
 1.128 13-May-2023  rillig indent: implement 'blank after declarations'
 1.127 13-May-2023  rillig indent: use enum instead of magic numbers for tracking declarations

No functional change.
 1.126 13-May-2023  rillig indent: improve names of option variables

No functional change.
 1.125 13-May-2023  rillig indent: rename struct fields for buffers

No binary change except for assertion line numbers.
 1.124 13-May-2023  rillig indent: clean up a condition, add comments

No functional change.
 1.123 13-May-2023  rillig indent: move debugging code to separate file

No functional change.
 1.122 12-May-2023  rillig indent: rename placeholder symbol for parser stack

No functional change outside debug mode.
 1.121 12-May-2023  rillig indent: remove statistics

The numbers from the statistics were wrong.
 1.120 12-May-2023  rillig indent: condense code for handling spaced expressions

No functional change outside debug mode.
 1.119 11-May-2023  rillig indent: don't touch comments in preprocessing lines

The indentation of multi-line comments was wrong, and the code for
handling them was too complicated.
 1.118 11-May-2023  rillig indent: remove unused code
 1.117 11-May-2023  rillig indent: remove broken code for handling blank lines

This fixes several bugs where blank lines were erroneously added or
removed, trading these old bugs for new bugs in different places.
These new bugs are expected to be easier to fix, as the old bugs will
not interfere anymore.
 1.116 11-May-2023  rillig indent: add debug output for tracking comments and braces
 1.115 11-May-2023  rillig indent: move parser state variables to the parser_state struct

Include the variables in the debug output.
 1.114 11-May-2023  rillig indent: move force_nl into the parser state

This way, it is included in the debug output.

No functional change.
 1.113 11-May-2023  rillig indent: remove buggy code for swapping tokens

It is not the job of an indenter to swap tokens, even if it's only about
placing comments elsewhere. The code that swapped the tokens was
complicated, buggy and impossible to understand.

In -br (brace right) mode, indent no longer moves a '{' from the
beginning of a line to the end of the previous line, as that was handled
by the token swapping code as well. This change is unintended, but it
will be easier to re-add that now that the code is simpler.
 1.112 23-Apr-2022  rillig indent: group global variables related to output control

No functional change.
 1.111 13-Feb-2022  rillig indent: rename parser_state.p_l_follow and paren_level

The previous variable names were misleading.

Paren_level is not the current level of parentheses but the one from the
beginning of the current output line. For better accuracy, rename it to
line_start_paren_level.

P_l_follow is not the level of parentheses that will be active at some
point in the future, as the previous name suggested. Instead, it is the
level of parentheses right now. For better accuracy, rename it to
nparen. This nicely matches its main usage, which is as index to the
parser_state.paren array.

No binary change.
 1.110 13-Feb-2022  rillig indent: replace bitmasking code with struct

The struct directly represents the properties of a pair of parentheses,
without forcing the human reader to decode any bitset. This makes it
easier to find the remaining bugs in the heuristic for determining the
kind of parentheses.

No functional change outside debug mode.
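
The difference between the two representations can be sketched as
follows. The names is_cast_bitmask, struct paren_level, maybe_cast and
no_cast are illustrative assumptions and may differ from the actual
code of indent.

```c
#include <assert.h>
#include <stdbool.h>

/* Bitmask form: the reader has to decode one bit per nesting level. */
static bool
is_cast_bitmask(unsigned cast_mask, unsigned not_cast_mask, int level)
{
	assert(level >= 0 && level < 32);
	return (cast_mask & (1u << level)) != 0
	    && (not_cast_mask & (1u << level)) == 0;
}

/* Struct form: each pair of parentheses carries its own properties. */
struct paren_level {
	int indent;		/* indentation of the operand inside */
	bool maybe_cast;	/* the parentheses may form a type cast */
	bool no_cast;		/* the parentheses cannot form a type cast */
};

static bool
is_cast_struct(const struct paren_level *paren, int level)
{
	return paren[level].maybe_cast && !paren[level].no_cast;
}
```

In the struct form, a debugger or a debug log can show each property by
name instead of as a hexadecimal mask.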
 1.109 13-Feb-2022  rillig indent: change parser_state.cast_mask to 0-based indexing

Having 1-based indexing was completely unexpected, and it didn't match
the 0-based indexing of parser_state.paren_indents.

No functional change.
 1.108 12-Feb-2022  rillig indent: fix indentation of enum constants in typedef (since 2019-04-04)

The solution is not elegant since it adds a small state machine inside
the parser state, but at least these states only depend on the sequence
of token types and not on any other part of the parser state.

Reported in PR#55453.
 1.107 28-Nov-2021  rillig indent: treat L"string" as a single token

There is never whitespace between the 'L' and the string literal or the
character constant. There might be a backslash-newline between them, but
that case was not handled before either.

No functional change.
 1.106 28-Nov-2021  rillig indent: clean up and document input handling

The transformation of moving comments from after an 'if (expr)' to after
the following brace has a large implementation cost (about 300 lines of
code) and makes input handling quite complicated. Document the overall
idea to save future readers some time.

No functional change.
 1.105 27-Nov-2021  rillig indent: accept a few formatting suggestions from indent

The remaining issues are still that the conditions look ambiguous even
with -eei, and that __attribute__ is broken into a separate line.

No functional change.
 1.104 27-Nov-2021  rillig indent: rename dump functions to output

No functional change.
 1.103 26-Nov-2021  rillig indent: add buf_add_range for adding characters to a buffer

No functional change.
 1.102 25-Nov-2021  rillig indent: rename ps.in_function_parameters to match reality

This flag is only set while parsing the parameters of a function
definition, but not for a function declaration. See buffer_add in the
test fmt_decl.

No functional change.
 1.101 25-Nov-2021  rillig indent: fix comment for ps.in_decl

In C, there are no declaration statements. There are declarations and
statements, but no combination thereof.
 1.100 25-Nov-2021  rillig indent: rename ps.in_stmt to in_stmt_or_decl

The previous name didn't match reality.

No functional change.
 1.99 25-Nov-2021  rillig indent: rename ps.ind_stmt to in_stmt_cont

This makes a comment redundant.

No functional change.
 1.98 19-Nov-2021  rillig indent: reduce casts to unsigned char for character classification

No functional change.
 1.97 19-Nov-2021  rillig indent: replace ps.procname with ps.is_function_definition

Only the first character of ps.procname was ever read, and it was only
compared to '\0'. Using a bool for this means simpler code, less
memory and fewer wasted CPU cycles due to the removed strncpy.

No functional change.
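
A rough before-and-after sketch of this replacement. The struct names,
the helper functions and the array size of 100 are assumptions made for
the illustration; only procname and is_function_definition are taken
from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Before: a whole name is copied, but only its first byte is inspected. */
struct state_with_name {
	char procname[100];
};

static void
set_procname(struct state_with_name *ps, const char *name)
{
	strncpy(ps->procname, name, sizeof(ps->procname) - 1);
	ps->procname[sizeof(ps->procname) - 1] = '\0';
}

static bool
has_procname(const struct state_with_name *ps)
{
	return ps->procname[0] != '\0';
}

/* After: the same information fits into a single flag. */
struct state_with_flag {
	bool is_function_definition;
};
```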
 1.96 19-Nov-2021  rillig indent: unexport inbuf

No functional change.
 1.95 19-Nov-2021  rillig indent: use character input API from pr_comment.c

No functional change.
 1.94 19-Nov-2021  rillig indent: remove all references to inbuf from indent.c

No functional change.
 1.93 19-Nov-2021  rillig indent: move character input handling from indent.c to io.c

No functional change.
 1.92 19-Nov-2021  rillig indent: move character input from indent.c to io.c

No functional change.
 1.91 19-Nov-2021  rillig indent: use character input API from the tokenizer

No functional change.
 1.90 19-Nov-2021  rillig indent: move character input handling from lexi.c to io.c

No functional change.
 1.89 19-Nov-2021  rillig indent: replace direct access to the input buffer

This is a preparation for abstracting away all the low-level details of
handling the input. The goal is to fix the current bugs regarding line
number counting, out of bounds memory access, and generally unreadable
code.

No functional change.
 1.88 19-Nov-2021  rillig indent: rename input buffer variables

From reading the names 'save_com' and 'sc_end', it was not obvious
enough that these two variables are the limits of the same buffer, the
names were just too unrelated.

No functional change.
 1.87 19-Nov-2021  rillig indent: group variables for input handling

No functional change.
 1.86 07-Nov-2021  rillig indent: various cleanups

Make several comments more precise.

Rename process_end_of_file to process_eof to match the token name.

Change the order of assignments in analyze_comment to keep the com_ind
computations closer together.

In copy_comment_wrap, use pointer difference instead of pointer addition
to stay away from undefined behavior.

No functional change.
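
The general idea behind preferring a pointer difference over a pointer
addition can be sketched like this. The function fits is a generic
illustration, not the code from copy_comment_wrap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Check whether n more bytes fit between p and end.
 *
 * Writing 'p + n <= end' would be undefined as soon as 'p + n' points
 * further than one past the end of the array. The difference 'end - p'
 * is always defined as long as both pointers point into the same array
 * (or one past its end) and p <= end.
 */
static bool
fits(const char *p, const char *end, size_t n)
{
	assert(p <= end);
	return (size_t)(end - p) >= n;
}
```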
 1.85 07-Nov-2021  rillig indent: rename ps.decl_nest to decl_level

This better matches the comment.

No functional change.
 1.84 07-Nov-2021  rillig indent: reduce negations in process_else, clean up comments

No functional change.
 1.83 07-Nov-2021  rillig indent: document the comment buffer more accurately
 1.82 07-Nov-2021  rillig indent: split copy_comment into wrapping and non-wrapping

These two cases are processed in an almost entirely different way. In
particular, copy_comment_nowrap should copy the comment verbatim, which
is not obvious from the current code, due to the many conditions and the
complex control flow.

No functional change.
 1.81 07-Nov-2021  rillig indent: rename type_at_paren_level_0 to type_outside_parentheses

For symmetry with type_in_parentheses.

No functional change.
 1.80 07-Nov-2021  rillig indent: distinguish between typename in parentheses and other words

This gets rid of two members of parser_state. No functional change for
well-formed programs. The sequence of '++int' or '--size_t' may be
formatted differently than before, but no program is expected to contain
that sequence.

Rename lsym_ident to lsym_word since 'ident' was too specific. This
token type is used for constants and string literals as well. Strictly
speaking, a string literal is not a word, but at least it's better than
before.
 1.79 07-Nov-2021  rillig indent: rename 'inbuf' functions to 'inp'

The variable 'inp' used to be named 'inbuf'. Make the function names
correspond to the variable name again.

No functional change.
 1.78 05-Nov-2021  rillig indent: the '+ 1' in dump_line_code is not an off-by-one error
 1.77 05-Nov-2021  rillig indent: rename ps.curr_newline to next_col_1

For symmetry with ps.curr_col_1.

No functional change.
 1.76 03-Nov-2021  rillig indent: inline indentation_after, shorten function name to ind_add

There were only few calls to indentation_after, so inlining it spares
the need to look at yet another function definition. Another effect is
that code.s and code.e appear in the code as a pair now, instead of a
single code.s, making the scope of the function call obvious.

In ind_add, there is no need to check for '\0' anymore since none of the
buffers can ever contain a null character, these are filtered out by
inbuf_read_line.

No functional change.
 1.75 01-Nov-2021  rillig indent: fix missing blank after 'return' (since 2021-10-31)

In indent.c 1.200 from 2021-10-31, the subtypes of identifier tokens
were removed since they were redundant. An unintended side effect was
that a parenthesized expression after 'return' was no longer separated
by a blank.

Before that change, 'return' was tokenized as an lsym_ident with subtype
kw_other, and want_space_before_lparen handled this case in the last
line. After the change, 'return' was treated as an ordinary identifier,
and unless the option '-pcs' (blank after function call) was given, the
blank was removed.

The other keywords that had kw_other are not affected since they do not
expect a '(' afterwards. These keywords are 'break', 'continue', 'goto',
'inline' and 'restrict'.

Curiously, there was not a single test case that covered 'return(expr)'.

While here, remove the trailing ',' from the enum lexer_symbol, which is
not allowed in standard C, it is a GNU extension. Lint doesn't complain
about this since the default LINTFLAGS include '-g' for GCC mode.
 1.74 31-Oct-2021  rillig indent: clean up

Initialize buffers in reading order, make comments more expressive,
rename add_typename to register_typename, remove unused macro.

No functional change.
 1.73 31-Oct-2021  rillig indent: replace kw_tag with lsym_tag

This leaves only one special type of token, which is lsym_ident, which
in some cases represents a type name and in other cases an identifier,
constant or string literal.

No functional change.
 1.72 31-Oct-2021  rillig indent: replace simple cases of keyword_kind with lexer_symbol

The remaining keyword kinds 'tag' and 'type' require a bit more thought,
so do them in a separate step.

No functional change.
 1.71 31-Oct-2021  rillig indent: rename lsym_type to better reflect reality

Type names that occur in parentheses are parsed as lsym_ident having the
subtype kw_type instead.

No functional change.
 1.70 31-Oct-2021  rillig indent: add separate lexer symbol for offsetof

No functional change.
 1.69 31-Oct-2021  rillig indent: add separate lexer symbol for sizeof

The plan is to get rid of the type keyword_kind, which largely overlaps
with lexer_symbol.

No functional change.
 1.68 31-Oct-2021  rillig indent: clean up definition of keywords

Rename kw_struct_or_union_or_enum to the shorter kw_tag.

Merge kw_jump with kw_inline_or_restrict since they are handled in the
same way.

No functional change.
 1.67 30-Oct-2021  rillig indent: move debugging functions to a separate section
 1.66 30-Oct-2021  rillig indent: rename prev_newline and prev_col_1 to curr

These two flags describe the token that is currently processed.

In process_binary_op, curr_newline can never be true since newline is
not a binary operator, so remove that condition.

No functional change.
 1.65 30-Oct-2021  rillig indent: clean up lexical analyzer

Use traditional type for small unsigned numbers instead of uint8_t; the
required header was not included.

Remove assertion for debug mode; lint takes care of ensuring that the
enum constants match the length of the names array.

Constify a name array.

Move the comparison function for bsearch closer to its caller.

No functional change.
 1.64 30-Oct-2021  rillig indent: inline macro label_offset

No functional change.
 1.63 29-Oct-2021  rillig indent: fix missing blank before binary operator
 1.62 29-Oct-2021  rillig indent: merge isblank and is_hspace into ch_isblank

No functional change.
 1.61 29-Oct-2021  rillig indent: reorder global variables to be more intuitive

The buffer 'inp' comes first. From there, a single token is read into
the buffer 'token'. From there, it usually ends up in 'code'. The buffer
'token' does not belong to the group of the other 3 buffers, which
together make up a line of formatted output.

No functional change.
 1.60 29-Oct-2021  rillig indent: use prev/curr/next to refer to the current token

The word 'last' just didn't match with 'next'.

No functional change.
 1.59 29-Oct-2021  rillig indent: group members of parser_state by topic

No functional change.
 1.58 29-Oct-2021  rillig indent: rename ps.dumped_decl_indent and indent_declaration

The word 'dump' in 'ps.dumped_decl_indent' was too close to dump_line,
which led to confusion since the variable controls whether the
indentation has been added to the code buffer, which happens way before
actually dumping the current line to the output file.

The function name 'indent_declaration' was too unspecific, it did not
reveal where the indentation of the declaration actually happened.

No functional change.
 1.57 29-Oct-2021  rillig indent: keep p_l_follow nonnegative, use consistent comparison

No functional change.
 1.56 29-Oct-2021  rillig indent: spell 'parentheses' properly in messages and comments
 1.55 28-Oct-2021  rillig indent: clean up indentation, comments, reduce

No functional change.
 1.54 28-Oct-2021  rillig indent: reduce negations in search_stmt_lookahead

No functional change.
 1.53 28-Oct-2021  rillig indent: clean up comments and function names

Having accurate names for the lexer symbols and the parser symbols makes
most of the comments redundant. Remove these.

Rename process_decl to process_type, to match the name of the
corresponding lexer symbol. In this phase, it's just a single type
token, not a whole declaration.

No functional change.
 1.52 26-Oct-2021  rillig indent: make ps.keyword easier to understand

Previously, ps.keyword did not have any documentation and was not
straight-forward. In some cases it was reset to kw_0, in others it was
set to an interesting value. The idea behind it was to remember the kind
of word of the previous token, to decide whether to have a space between
sizeof or offsetof and a following '('.

No functional change.
 1.51 26-Oct-2021  rillig indent: run indent on its own source code

With manual corrections afterwards, to compensate for the remaining bugs
in indent.

Without the type definitions in .indent.pro, the opening braces of the
functions kw_name and lexi_alnum would not be at the beginning of the
line.
 1.50 25-Oct-2021  rillig indent: rename search_brace to search_stmt

No functional change.
 1.49 25-Oct-2021  rillig indent: split type token_type into 3 separate types

Previously, token_type was used for 3 different purposes:

1. symbol types from the lexer
2. symbol types on the parser stack
3. kind of control statement for 'if (expr)' and similar statements

Splitting the 41 constants into separate types makes it immediately
clear that the parser stack never handles comments, preprocessing lines,
newlines, form feeds or the inner structure of expressions.

Previously, the constant switch_expr was especially confusing since it
was used for 3 different purposes: when returned from lexi, it
represented the keyword 'switch', in the parser stack it represented
'switch (expr)', and it was used for a statement head as well.

The only overlap between the lexer symbols and the parser symbols are
'{' and '}', and the keywords 'do' and 'else'. To increase confusion,
the constants of the previous token_type were in apparently random
order and before 2021, they had cryptic, highly abbreviated names.

No functional change.
 1.48 24-Oct-2021  rillig indent: rename form_feed to tt_lex_form_feed

No functional change.
 1.47 24-Oct-2021  rillig indent: split kw_for_or_if_or_while into separate constants

No functional change.
 1.46 24-Oct-2021  rillig indent: split kw_do_or_else into separate constants

It was unnecessarily confusing to have the token types keyword_do_else,
keyword_do and keyword_else at the same time, without any hint in what
they differed.

Some of the token types seem to be used by the lexer while others are
used in the parse stack. Maybe all token types can be partitioned into
these groups, which would suggest using two different types for them.
And if not, it's still clearer to have this distinction in the names of
the constants.

No functional change.
 1.45 24-Oct-2021  rillig indent: rename nitems to array_length
 1.44 24-Oct-2021  rillig indent: replace global variable use_ff with function parameter
 1.43 20-Oct-2021  rillig indent: add reminder to make the code understandable for humans
 1.42 20-Oct-2021  rillig indent: rename ps.last_u_d to match its comment

No functional change.
 1.41 20-Oct-2021  rillig indent: rename parser stack variables

No functional change.
 1.40 20-Oct-2021  rillig indent: rename blankline_requested variables

The words 'prefix' and 'postfix' sounded too much like horizontal
concepts, like in operators. The actual purpose of these variables is to
add blank lines before and after the current line, so use the same
wording as in the command line options.

No functional change.
 1.39 20-Oct-2021  rillig indent: rename next_blank_lines to blank_lines_to_output

The previous name was already an improvement over the name before that
(n_real_blanklines), but didn't express the intended purpose clearly
enough, so try another name.

No functional change.
 1.38 09-Oct-2021  rillig indent: extract common code for advancing a single tab

No functional change.
 1.37 08-Oct-2021  rillig indent: rename in_or_st to init_or_struct

This makes a few comments redundant.

No functional change.
 1.36 08-Oct-2021  rillig indent: convert ps.box_com to local variable

This variable is only used in a single function, and that function does
not call any other function that could replace the parser state or
install a temporary parser state.

No functional change.
 1.35 08-Oct-2021  rillig indent: rename fill_buffer to inbuf_read_line

No functional change.
 1.34 08-Oct-2021  rillig indent: run indent on indent.h

The formatting looks mostly OK.

Some struct members had excessively long names, leaving no space for
their corresponding comments. Renamed some of them using well-known
abbreviations.

The formatting for debug_vis_range is messed up, no idea why. It is
clearly a function declaration, not a function definition, so there is
no need to place the function name in column 1.

No functional change.
 1.33 08-Oct-2021  rillig indent: replace column calculations with indent, part 4/4
 1.32 08-Oct-2021  rillig indent: rename tokens lparen and rparen to be more precise

No functional change.
 1.31 08-Oct-2021  rillig indent: merge headers into a single file

No functional change.
 1.30 07-Oct-2021  rillig indent: remove global variable option_source

It is only needed at startup, while parsing the options. The string "?"
was not needed at all.

No functional change.
 1.29 05-Oct-2021  rillig indent: merge duplicate code into is_hspace

No functional change.
 1.28 05-Oct-2021  rillig indent: merge duplicate code for reading from input buffer

No functional change.
 1.27 03-Oct-2021  rillig indent: rename functions

There was no good reason for using the different verbs 'scan' and 'set'
for two functions that essentially do the same.

No functional change.
 1.26 27-Sep-2021  rillig indent: use binary instead of linear search when adding types

No functional change.
 1.25 25-Sep-2021  rillig indent: merge duplicate code for token buffers

No functional change.
 1.24 25-Sep-2021  rillig indent: clean up argument handling

No functional change.
 1.23 25-Sep-2021  rillig indent: reduce code and data size for lexing of numbers

Instead of having a table of strings (121 pointers + 121 data
relocations), reduce that table to the actual character data and use a
secondary table for looking up the correct row in the main table.

No functional change.
 1.22 25-Sep-2021  rillig indent: convert remaining ibool to bool

No functional change intended.
 1.21 25-Sep-2021  rillig indent: prepare for lint's strict bool mode

Before C99, C had no boolean type. Instead, indent used int for that,
just like many other programs. Even with C99, bool and int can be used
interchangeably in many situations, such as querying '!i' or '!ptr' or
'cond == 0'.

Since January 2021, lint provides the strict bool mode, which makes bool
a non-arithmetic type that is incompatible with any other type. Having
clearly separate types helps in understanding the code.

To migrate indent to strict bool mode, the first step is to apply all
changes that keep the resulting binary the same. Since sizeof(bool) is
1 and sizeof(int) is 4, the type ibool serves as an intermediate type.
For now it is defined to int, later it will become bool.

The current code compiles cleanly in C99 and C11 mode, as well as in
lint's strict bool mode. There are a few tricky places:

In args.c in 'struct pro', there are two types of options: boolean and
integer. Boolean options point to a bool variable, integer options
point to an int variable. To keep the current structure of the code,
the pointer has been changed to 'void *'. To ensure type safety, the
definition of the options is done via preprocessor magic, which in C11
mode ensures the correct pointer types. (Add CFLAGS+=-std=gnu11 at the
very bottom of the Makefile.)
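
A sketch of how such a type check on a 'void *' member could look in
C11. The macro name assert_type, the struct layout and the two example
options are assumptions for this illustration and do not claim to match
args.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Compiles only if 'expr' has exactly the given type (C11 _Generic). */
#define assert_type(expr, type) _Generic((expr), type: (expr))

struct option {
	const char *name;
	bool is_bool;
	void *var;	/* points to a bool or an int, see is_bool */
};

static bool opt_verbose;
static int opt_tabsize = 8;

/* Swapping 'bool *' and 'int *' below results in a compile error. */
static const struct option options[] = {
	{ "v", true, assert_type(&opt_verbose, bool *) },
	{ "ts", false, assert_type(&opt_tabsize, int *) },
};

/* Store a value through the untyped pointer, based on the option kind. */
static void
set_option(const struct option *o, int value)
{
	if (o->is_bool)
		*(bool *)o->var = value != 0;
	else
		*(int *)o->var = value;
}
```

Before C11, the macro could simply expand to '(expr)', giving up the
check but keeping the code compilable.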

In indent.c in process_preprocessing, a boolean variable is
post-incremented. That variable is only assigned to another variable,
and that variable is only used in a boolean context. To provoke a
different behavior between the '++' and the '= true', the source code
to be indented would need 1 << 32 preprocessing directives, which is
unlikely to happen in practice.

In io.c in dump_line, the variables ps.in_stmt and ps.in_decl only ever
get the values 0 and 1. For these values, the expressions 'a & ~b' and
'a && !b' are equivalent, in all versions of C. The compiler may
generate different code for them, though.
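
The equivalence can be checked with a small sketch. The function names
are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Integer form, valid as a truth value only if a and b are 0 or 1. */
static int
and_not_int(int a, int b)
{
	return a & ~b;
}

/* Boolean form, as required by lint's strict bool mode. */
static bool
and_not_bool(bool a, bool b)
{
	return a && !b;
}
```

For the four combinations of 0 and 1, both functions agree. For other
values they differ: with a = 2 and b = 1, 'a & ~b' is 2, which counts as
true, while 'a && !b' is false.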

In io.c in parse_indent_comment, the assignment to inhibit_formatting
takes place in integer context. If the compiler is smart enough to
detect the possible values of on_off, it may generate the same code
before and after the change, but that is rather unlikely.

The second step of the migration will be to replace ibool with bool,
step by step, just in case there are any hidden gotchas in the code,
such as sizeof or pointer casts.

No change to the resulting binary.
 1.20 25-Sep-2021  rillig indent: use standard definition for bool, true, false
 1.19 25-Sep-2021  rillig indent: clean up initialization of options

The default values in 'struct pro' were redundant but all consistent,
even with the commented defaults in main_parse_command_line.

No functional change.
 1.18 25-Sep-2021  rillig indent: remove ifdef for lint

NetBSD lint does not need them anymore, FreeBSD does not have lint.
 1.17 25-Sep-2021  rillig indent: add nonnull memory allocation functions

The only functional change is a single error message.
 1.16 14-Mar-2021  rillig indent: give indent a try at formatting its own code

Formatting indent.h required the following manual corrections
afterwards:

The first tab in the comment in line 1 was replaced with a space but
shouldn't be.

The spacing around the '...' in function prototypes was completely
wrong. It looked like 'const char *,...)__printflike', without any
spaces.

The '*' of the return type 'const char *' was tied to the function name,
even though this declaration was only for a single function. In such a
case, it's more appropriate to line up the function names.

The function-like macros were not indented to -di. This is something
that I would not expect from indent, so it's ok to do that manually.
 1.15 13-Mar-2021  rillig indent: remove disabled duplicate RCS ID from header

By convention, headers don't record their RCS ID.
 1.14 13-Mar-2021  rillig indent: rename pr_comment to process_comment, clean up documentation

No functional change.
 1.13 13-Mar-2021  rillig indent: inline calls to count_spaces and count_spaces_until

These two functions operated on column numbers instead of indentation,
which required adjustments of '+ 1' and '- 1'. Their names were
completely wrong since these functions did not count anything, instead
they computed the column.

No functional change.
 1.12 13-Mar-2021  rillig indent: replace column computation with indentation computation

No functional change.
 1.11 13-Mar-2021  rillig indent: replace compute_code_column with compute_code_indent

The goal is to only ever be concerned about the _indentation_ of a
token, never the _column_ it appears in. Having only one of these
avoids off-by-one errors.

No functional change.
 1.10 13-Mar-2021  rillig indent: replace compute_label_column with compute_label_indent

Using the invariant 'column == 1 + indent'. This removes several overly
complicated '+ 1' from the code that are not needed conceptually.

No functional change.
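
A small sketch of the invariant 'column == 1 + indent'. The helpers
next_tab, ind_after and column_of are hypothetical names for this
illustration, not the functions from indent.

```c
#include <assert.h>
#include <stddef.h>

/* Advance a 0-based indentation to the next tab stop. */
static int
next_tab(int ind, int tabsize)
{
	assert(tabsize > 0);
	return ind - ind % tabsize + tabsize;
}

/* Compute the indentation after appending the first len bytes of s. */
static int
ind_after(int ind, const char *s, size_t len, int tabsize)
{
	for (size_t i = 0; i < len; i++) {
		if (s[i] == '\t')
			ind = next_tab(ind, tabsize);
		else
			ind++;
	}
	return ind;
}

/* The 1-based column is only needed at the very end, if at all. */
static int
column_of(int ind)
{
	return 1 + ind;
}
```

Since all intermediate computations use the 0-based indentation, the
'+ 1' occurs in a single place instead of being scattered over the code.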
 1.9 13-Mar-2021  rillig indent: add debug logging for actually writing to the output file

Together with the results of the tokenizer and the 4 buffers for token,
label, code and comment, the debug log now provides a good high-level
view on how the indentation happens and where to look for the many
remaining bugs.
 1.8 13-Mar-2021  rillig indent: replace pad_output with output_indent

Calculating the indentation is simpler than calculating the column,
since that saves the constant addition and subtraction of the 1.

No functional change.
 1.7 12-Mar-2021  rillig indent: add 'const', rename variables, reorder formula for tab width

Column counting starts at 1. This 1 should rather be at the beginning
of the formula since it is thought of being added at the very beginning
of the line, not at the end.

When adding a tab, the newly added tab is added at the end of the
string, therefore that '+ 1' should be at the end of the formula as
well.

No functional change.
 1.6 12-Mar-2021  rillig indent: replace 'target' with 'indent' in function names

The word 'target' was not as specific as possible.

No functional change.
 1.5 07-Mar-2021  rillig indent: in debug mode, output detailed token information

The main ingredient for understanding how indent works is the tokenizer
and the 4 buffers in which the text is collected.

Inspecting this debug log for the test comment-line-end makes it obvious
why indent messes up code that contains '//' comments. The cause is
that indent interprets '//' as an operator, just like '&&' or '||'. The
sequence '/////' is interpreted as a single operator as well, by the
way.

Since '//' is interpreted as an ordinary operator, any words following
it are plain identifiers, usually several of them in a row, which is a
syntax error. Depending on the context, the operator '//' is either a
unary operator (no space around) or a binary operator (space around).
This explains why the word 'line-end' is expanded to 'line - end'.

No functional change outside of debug mode.
 1.4 07-Mar-2021  rillig indent: for the token types, use enum instead of #define

This makes it easier to step through the code in a debugger.

No functional change.
 1.3 07-Mar-2021  rillig indent: use all headers in all files

This is a prerequisite for converting the token types to an enum instead
of a preprocessor define, since the return type of lexi will become
token_type. Having the enum will make debugging easier.

There was a single naming collision, which forced the variable in
scan_profile to be renamed. All other token names are used nowhere
else.

No change to the resulting binary.
 1.2 19-Oct-2019  christos use stdarg, annotate function as __printflike and fix broken formats.
 1.1 04-Apr-2019  kamil branches: 1.1.1; 1.1.2;
Upgrade indent(1)

Merge all the changes from the recent FreeBSD HEAD snapshot
into our local copy.

FreeBSD actively maintains this program in their sources and their
repository contains over 100 commits with changes.

Keep the delta between the FreeBSD and NetBSD versions to absolute
minimum, mostly RCS Id and compatibility fixes.

Major changes in this import:

- Added an option -ldi<N> to control indentation of local variable names.
- Added option -P for loading user-provided files as profiles
- Added -tsn for setting tabsize
- Rename -nsac/-sac ("space after cast") to -ncs/-cs
- Added option -fbs, which enables (disables) splitting the function declaration and opening brace across two lines.
- Respect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)
- Group global option variables into an options structure
- Use bsearch() for looking up type keywords.
- Don't produce unneeded space character in function declarators
- Don't unnecessarily add a blank before a comment ends.
- Don't ignore newlines after comments that follow braces.

Merge the FreeBSD indent(1) tests with our ATF framework.
All tests pass.

Upgrade prepared by Manikishan Ghantasala.
Final polishing by myself.

Part II, checkin new files.
 1.1.2.3 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.1.2.2 10-Jun-2019  christos Sync with HEAD
 1.1.2.1 04-Apr-2019  christos file indent.h was added on branch phil-wifi on 2019-06-10 22:10:20 +0000
 1.1.1.1 04-Apr-2019  kamil FreeBSD indent r340138
 1.207.2.1 02-Aug-2025  perseant Sync with HEAD
 1.13 08-Oct-2021  rillig indent: merge headers into a single file

No functional change.
 1.12 27-Sep-2021  rillig indent: rename rwcode to keyword_kind, various cleanup

No idea what the 'rw' in 'rwcode' meant, it had been imported that way
28 years ago. Since rwcode specifies the kind of a keyword, the prefix
'kw_' makes sense.

No functional change.
 1.11 09-Mar-2021  rillig indent: rename a few more token types

The previous names were either too short or ambiguous.

No functional change.
 1.10 09-Mar-2021  rillig indent: make token names more precise

The previous 'casestmt' was wrong since a case label is not a statement
at all.

The previous 'swstmt' was overly short, and wrong as well, since it
represents only the 'switch (expr)' part, which is not a complete switch
statement. Same for 'ifstmt', 'whilestmt', 'forstmt'.

The previous word 'head' was not precise enough since it didn't specify
exactly where the head ends and the body starts. Especially for
handling the dangling else, this distinction is important.

No functional change.
 1.9 09-Mar-2021  rillig indent: rename a few tokens to be more obvious

For casual readers it is not obvious whether the 'sp' meant 'special' or
'space' or something entirely different.
 1.8 07-Mar-2021  rillig indent: use named constants for the different types of keywords

This reduces the magic numbers in the code. Most of these had their
designated constant name written in a nearby comment anyway.

The one instance where arithmetic was performed on this new enum type
(in indent.c) was a bit tricky to understand.

The combination rw_continue_or_inline_or_restrict looks strange, the
'continue' should intuitively belong to the other control flow keywords
in rw_break_or_goto_or_return.

No functional change.
 1.7 07-Mar-2021  rillig indent: for the token types, use enum instead of #define

This makes it easier to step through the code in a debugger.

No functional change.
 1.6 04-Apr-2019  kamil Upgrade indent(1)

Merge all the changes from the recent FreeBSD HEAD snapshot
into our local copy.

FreeBSD actively maintains this program in their sources and their
repository contains over 100 commits with changes.

Keep the delta between the FreeBSD and NetBSD versions to an absolute
minimum, mostly RCS Id and compatibility fixes.

Major changes in this import:

- Added an option -ldi<N> to control indentation of local variable names.
- Added option -P for loading user-provided files as profiles
- Added -tsn for setting tabsize
- Rename -nsac/-sac ("space after cast") to -ncs/-cs
- Added option -fbs, which enables (disables) splitting the function declaration and opening brace across two lines.
- Respect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)
- Group global option variables into an options structure
- Use bsearch() for looking up type keywords.
- Don't produce unneeded space character in function declarators
- Don't unnecessarily add a blank before a comment ends.
- Don't ignore newlines after comments that follow braces.

Merge the FreeBSD indent(1) tests with our ATF framework.
All tests pass.

Upgrade prepared by Manikishan Ghantasala.
Final polishing by myself.
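
For illustration only, a minimal sketch of what a bsearch()-based lookup
of type keywords can look like. The table contents and the names
type_names, cmp_name and is_type_keyword are hypothetical and not taken
from the imported code; the only hard requirement is that the table is
sorted in the same order that the comparison function uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table; it must be sorted in strcmp order for bsearch(). */
static const char *const type_names[] = {
	"char", "double", "float", "int", "long", "short",
};

/* The key is the word itself, each element is a pointer to a name. */
static int
cmp_name(const void *key, const void *elem)
{
	return strcmp(key, *(const char *const *)elem);
}

/* Returns whether the word is one of the names in the sorted table. */
bool
is_type_keyword(const char *word)
{
	assert(word != NULL);
	return bsearch(word, type_names,
	    sizeof(type_names) / sizeof(type_names[0]),
	    sizeof(type_names[0]), cmp_name) != NULL;
}
```

Compared to a linear scan, the lookup needs only a logarithmic number
of comparisons, at the cost of having to keep the table sorted.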
 1.5 07-Aug-2003  agc branches: 1.5.98;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22365, verified by myself.
 1.4 18-Oct-1997  mrg merge lite-2.
 1.3 09-Jan-1997  tls RCS ID police
 1.2 01-Aug-1993  mycroft Add RCS identifiers.
 1.1 09-Apr-1993  cgd branches: 1.1.1;
added, from net/2 (patch 124).
 1.1.1.2 04-Apr-2019  kamil FreeBSD indent r340138
 1.1.1.1 06-Jun-1993  mrg 4.4BSD-Lite2
 1.5.98.1 10-Jun-2019  christos Sync with HEAD
 1.49 08-Oct-2021  rillig indent: merge headers into a single file

No functional change.
 1.48 07-Oct-2021  rillig indent: rename bp_save to saved_inp_s, be_save to saved_inp_e

Using the same naming convention makes it easier to relate the
variables.

No functional change.
 1.47 07-Oct-2021  rillig indent: group variables for the input buffer

The input buffer follows the same concept as the intermediate buffers
for label, code, comment and token, so use the same type for it.

No functional change.
 1.46 07-Oct-2021  rillig indent: move definition of bufsize from header to implementation

No functional change.
 1.45 07-Oct-2021  rillig indent: rename opt.btype_2 to brace_same_line

No functional change.
 1.44 07-Oct-2021  rillig indent: fix wrong or outdated comments

No functional change.
 1.43 07-Oct-2021  rillig indent: clean up colon handling

No functional change.
 1.42 05-Oct-2021  rillig indent: rename n_real_blanklines

The word 'n' was not as helpful as possible, the word 'real' did not
give any clue at all about the variable's purpose.

No functional change.
 1.41 27-Sep-2021  rillig indent: rename rwcode to keyword_kind, various cleanup

No idea what the 'rw' in 'rwcode' meant, it had been imported that way
28 years ago. Since rwcode specifies the kind of a keyword, the prefix
'kw_' makes sense.

No functional change.
 1.40 26-Sep-2021  rillig indent: fix documentation of opt.case_indent

See io.c, compute_label_indent.
 1.39 26-Sep-2021  rillig indent: unexport global variables

The variable match_state was write-only and was thus removed.

No functional change.
 1.38 26-Sep-2021  rillig indent: negate and rename option.leave_comma

The old name did not mirror the description in the manual page, and it
was the only option that is negated. Inverting it allows the options
table to be compressed.
 1.37 25-Sep-2021  rillig indent: misc cleanup

No functional change.
 1.36 25-Sep-2021  rillig indent: convert found_err to bool

That variable had slipped through the migration since it consistently
used int for the declaration, the definition and all assignments.

No functional change.
 1.35 25-Sep-2021  rillig indent: un-abbreviate a few parser_state members, clean up comments

No functional change.
 1.34 25-Sep-2021  rillig indent: remove dead code for printing comments after empty lines

This code has been commented out for at least 29 years.

No functional change.
 1.33 25-Sep-2021  rillig indent: rename option variable to be more expressive

No functional change.
 1.32 25-Sep-2021  rillig indent: convert remaining ibool to bool

No functional change intended.
 1.31 25-Sep-2021  rillig indent: convert parser_state from ibool to bool

indent.c:400:5: error: suggest parentheses around assignment used as
truth value
io.c:271:32: error: ‘~’ on a boolean expression

No functional change intended.
 1.30 25-Sep-2021  rillig indent: convert options from ibool to bool

No functional change intended.
 1.29 25-Sep-2021  rillig indent: prepare for lint's strict bool mode

Before C99, C had no boolean type. Instead, indent used int for that,
just like many other programs. Even with C99, bool and int can be used
interchangeably in many situations, such as querying '!i' or '!ptr' or
'cond == 0'.

Since January 2021, lint provides the strict bool mode, which makes bool
a non-arithmetic type that is incompatible with any other type. Having
clearly separate types helps in understanding the code.

To migrate indent to strict bool mode, the first step is to apply all
changes that keep the resulting binary the same. Since sizeof(bool) is
1 and sizeof(int) is 4, the type ibool serves as an intermediate type.
For now it is defined to int, later it will become bool.

The current code compiles cleanly in C99 and C11 mode, as well as in
lint's strict bool mode. There are a few tricky places:

In args.c in 'struct pro', there are two types of options: boolean and
integer. Boolean options point to a bool variable, integer options
point to an int variable. To keep the current structure of the code,
the pointer has been changed to 'void *'. To ensure type safety, the
definition of the options is done via preprocessor magic, which in C11
mode ensures the correct pointer types. (Add CFLAGS+=-std=gnu11 at the
very bottom of the Makefile.)

In indent.c in process_preprocessing, a boolean variable is
post-incremented. That variable is only assigned to another variable,
and that variable is only used in a boolean context. To provoke a
different behavior between the '++' and the '= true', the source code
to be indented would need 1 << 32 preprocessing directives, which is
unlikely to happen in practice.

In io.c in dump_line, the variables ps.in_stmt and ps.in_decl only ever
get the values 0 and 1. For these values, the expressions 'a & ~b' and
'a && !b' are equivalent, in all versions of C. The compiler may
generate different code for them, though.
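
The equivalence claimed above can be checked exhaustively, since only
the values 0 and 1 are involved. The following is a minimal sketch with
hypothetical helper names, not code from io.c:

```c
#include <assert.h>
#include <stdbool.h>

/* The integer form; only meaningful if a and b are 0 or 1. */
int
and_not_int(int a, int b)
{
	return a & ~b;
}

/* The form that is acceptable in lint's strict bool mode. */
bool
and_not_bool(bool a, bool b)
{
	return a && !b;
}

/* Compares both forms for all four combinations of 0 and 1. */
void
check_forms_agree(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
			assert((and_not_int(a, b) != 0)
			    == and_not_bool(a != 0, b != 0));
}
```

For any other integer values the two forms can differ, for example
'2 & ~1' is 2 while '2 && !1' is 0, which is why the restriction to 0
and 1 matters.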

In io.c in parse_indent_comment, the assignment to inhibit_formatting
takes place in integer context. If the compiler is smart enough to
detect the possible values of on_off, it may generate the same code
before and after the change, but that is rather unlikely.

The second step of the migration will be to replace ibool with bool,
step by step, just in case there are any hidden gotchas in the code,
such as sizeof or pointer casts.

No change to the resulting binary.
 1.28 25-Sep-2021  rillig indent: use standard definition for bool, true, false
 1.27 25-Sep-2021  rillig indent: move statistical values into a separate struct

No functional change.
 1.26 25-Sep-2021  rillig indent: group global variables for token buffer

No functional change.
 1.25 25-Sep-2021  rillig indent: inline macro 'token'

No functional change.
 1.24 25-Sep-2021  rillig indent: group global variables for code buffer

No functional change.
 1.23 24-Sep-2021  rillig indent: group global variables for label buffer into struct

No functional change.
 1.22 24-Sep-2021  rillig indent: group global variables for the comment buffer

No functional change.
 1.21 13-Mar-2021  rillig indent: fix documentation of parser_state.paren_indents

The column position is not the same as the indentation (off-by-one).
 1.20 13-Mar-2021  rillig indent: distinguish between 'column' and 'indentation'

column == 1 + indentation.

In addition, indentation is a relative distance while column is an
absolute position. Therefore, don't confuse these two concepts, to
prevent off-by-one errors.

No functional change.
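
A minimal sketch of the relation between the two concepts, assuming
hypothetical helper functions that are not part of indent's source.
Making the conversion explicit in one place is one way to avoid
scattering '+ 1' and '- 1' throughout the code.

```c
#include <assert.h>

/* An indentation is a relative distance, so 0 is a valid value. */
int
column_from_indentation(int indentation)
{
	assert(indentation >= 0);
	return 1 + indentation;
}

/* A column is an absolute, 1-based position. */
int
indentation_from_column(int column)
{
	assert(column >= 1);
	return column - 1;
}
```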
 1.19 13-Mar-2021  rillig indent: rename pr_comment to process_comment, clean up documentation

No functional change.
 1.18 13-Mar-2021  rillig indent: fix confusing variable names

The word 'col' should only be used for the 1-based column number. This
name is completely inappropriate for a line length since that provokes
off-by-one errors. The name 'cols' would be acceptable although
confusing since it sounds so similar to 'col'.

Therefore, rename variables that are related to the maximum line length
to 'line_length' since that makes for obvious code and nicely relates to
the description of the option in the manual page.

No functional change.
 1.17 08-Mar-2021  rillig indent: inline macro for backslash

No functional change.
 1.16 08-Mar-2021  rillig indent: convert big macros to functions

Each of these buffers is only modified in a single file. This makes it
unnecessary to declare the macros in the global header.
 1.15 07-Mar-2021  rillig lint: move keyword 'continue' over to the other control flow keywords

No functional change since neither rw_jump nor rw_inline_or_restrict is
mentioned in any switch statement, and lint didn't find any other
suspicious enum operations.
 1.14 07-Mar-2021  rillig indent: use named constants for the different types of keywords

This reduces the magic numbers in the code. Most of these had their
designated constant name written in a nearby comment anyway.

The one instance where arithmetic was performed on this new enum type
(in indent.c) was a bit tricky to understand.

The combination rw_continue_or_inline_or_restrict looks strange, the
'continue' should intuitively belong to the other control flow keywords
in rw_break_or_goto_or_return.

No functional change.
 1.13 07-Mar-2021  rillig indent: for the token types, use enum instead of #define

This makes it easier to step through the code in a debugger.

No functional change.
 1.12 23-Apr-2020  joerg Avoid common symbol declarations
 1.11 04-Apr-2019  kamil Upgrade indent(1)

Merge all the changes from the recent FreeBSD HEAD snapshot
into our local copy.

FreeBSD actively maintains this program in their sources and their
repository contains over 100 commits with changes.

Keep the delta between the FreeBSD and NetBSD versions to an absolute
minimum, mostly RCS Id and compatibility fixes.

Major changes in this import:

- Added an option -ldi<N> to control indentation of local variable names.
- Added option -P for loading user-provided files as profiles
- Added -tsn for setting tabsize
- Rename -nsac/-sac ("space after cast") to -ncs/-cs
- Added option -fbs, which enables (disables) splitting the function declaration and opening brace across two lines.
- Respect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)
- Group global option variables into an options structure
- Use bsearch() for looking up type keywords.
- Don't produce unneeded space character in function declarators
- Don't unnecessarily add a blank before a comment ends.
- Don't ignore newlines after comments that follow braces.

Merge the FreeBSD indent(1) tests with our ATF framework.
All tests pass.

Upgrade prepared by Manikishan Ghantasala.
Final polishing by myself.
 1.10 04-Sep-2014  mrg branches: 1.10.16;
port the -ut / -nut options from freebsd. -ut (default) enables tabs
in output, the -nut uses spaces.
 1.9 12-Apr-2009  lukem branches: 1.9.24;
Fix WARNS=4 issues (-Wshadow -Wcast-qual -Wsign-compare)
 1.8 07-Aug-2003  agc branches: 1.8.42;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22365, verified by myself.
 1.7 26-May-2002  wiz Remove #ifndef'd __STDC__ code. ANSIfy.
 1.6 08-Oct-1998  wsanchez Get rid of multiply defined common symbols
 1.5 19-Oct-1997  lukem WARNSify, fix .Nm usage, deprecate register, use <err.h>, KNFify (with indent!;)
 1.4 18-Oct-1997  mrg merge lite-2.
 1.3 09-Jan-1997  tls RCS ID police
 1.2 01-Aug-1993  mycroft Add RCS identifiers.
 1.1 09-Apr-1993  cgd branches: 1.1.1;
added, from net/2 (patch 124).
 1.1.1.2 04-Apr-2019  kamil FreeBSD indent r340138
 1.1.1.1 06-Jun-1993  mrg 4.4BSD-Lite2
 1.8.42.1 13-May-2009  jym Sync with HEAD.

Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
 1.9.24.1 21-Sep-2014  snj Pull up following revision(s) (requested by mrg in ticket #110):
usr.bin/indent/io.c: revision 1.15
usr.bin/indent/indent_globs.h: revision 1.10
usr.bin/indent/args.c: revision 1.11
usr.bin/indent/indent.1: revision 1.23
usr.bin/indent/indent.c: revision 1.19
port the -ut / -nut options from freebsd. -ut (default) enables tabs
in output, the -nut uses spaces.
 1.10.16.1 10-Jun-2019  christos Sync with HEAD
 1.237 04-Jan-2025  rillig indent: make debug output easier to read

The previous format had the values of the parser state on the left side
and the corresponding names on the right side. While it looked nicely
aligned, it was not suitable for focusing on the actual data. Replace
this format with the more common "key: value" format.

Use the names of the enum constants in the debug log, instead of the
previous "nice" names that needed one more level of mental translation
and in some cases contained unbalanced punctuation such as '{'.
 1.236 12-Dec-2024  rillig indent: add error handling for I/O errors

Suggested by lint2.
 1.235 03-Dec-2023  rillig branches: 1.235.2;
indent: inline input-related macros

No binary change.
 1.234 03-Dec-2023  rillig indent: group input-related variables into a struct

No functional change.
 1.233 03-Dec-2023  rillig indent: use line number of the token start in diagnostics

Previously, the line number of the end of the token was used, which was
confusing in debug mode.
 1.232 27-Jun-2023  rillig indent: fix 'blank line above first statement in function body'
 1.231 26-Jun-2023  rillig indent: implement 'blank line above first statement in function body'
 1.230 26-Jun-2023  rillig indent: in -bad mode, don't add a blank line above a comment or '}'
 1.229 17-Jun-2023  rillig indent: clean up

Extract duplicate code for handling line continuations.

Prevent theoretical undefined behavior in strspn, as inp.s is not
null-terminated.

Remove adding extra space characters when processing comments, as these
are not necessary to force a line of output.

No functional change.
 1.228 17-Jun-2023  rillig indent: miscellaneous cleanups

No binary change.
 1.227 16-Jun-2023  rillig indent: don't force a blank line between '}' and preprocessing line
 1.226 16-Jun-2023  rillig indent: rename a field of the parser state

The previous name 'comment_in_first_line' was misleading, as it could
mean that there was a comment in the first line of the file.

No functional change.
 1.225 15-Jun-2023  rillig indent: consolidate handling of statement continuations
 1.224 15-Jun-2023  rillig indent: rename state variable to be more accurate

No binary change.
 1.223 15-Jun-2023  rillig indent: fix indentation of multi-line enum constant initializers
 1.222 15-Jun-2023  rillig indent: miscellaneous cleanups, more tests for edge cases
 1.221 14-Jun-2023  rillig indent: clean up array indexing for parser symbols

With 'top' pointing to the actual top element, the array was indexed in
the closed range from 0 to top. All other arrays are indexed by the
usual half-open interval from 0 to len.

No functional change.
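
To illustrate the half-open convention, a minimal sketch of a stack
that is indexed from 0 to len. The struct and function names as well as
the fixed capacity are assumptions made for this example, not the
actual parser code.

```c
#include <assert.h>
#include <stddef.h>

/* Valid elements are sym[0] to sym[len - 1]; an empty stack has len 0. */
struct sym_stack {
	int sym[8];
	size_t len;
};

void
sym_push(struct sym_stack *st, int sym)
{
	assert(st->len < sizeof(st->sym) / sizeof(st->sym[0]));
	st->sym[st->len++] = sym;
}

/* The top element is at len - 1, not at a separate 'top' index. */
int
sym_top(const struct sym_stack *st)
{
	assert(st->len > 0);
	return st->sym[st->len - 1];
}

/* The usual loop over the half-open interval from 0 to len. */
int
sym_sum(const struct sym_stack *st)
{
	int sum = 0;
	for (size_t i = 0; i < st->len; i++)
		sum += st->sym[i];
	return sum;
}
```

With this form, an empty stack is represented by len == 0 and needs no
special 'top' value such as -1.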
 1.220 14-Jun-2023  rillig indent: allow more than 20 nested parentheses or brackets
 1.219 14-Jun-2023  rillig indent: clean up debugging code
 1.218 14-Jun-2023  rillig indent: clean up handling of comments

One less moving part in the parser state.

No functional change.
 1.217 10-Jun-2023  rillig indent: rename misleading variable

The name started with 'line_start', but the value is not always the
value from the beginning of the line.

No functional change.
 1.216 10-Jun-2023  rillig indent: miscellaneous cleanups
 1.215 10-Jun-2023  rillig indent: in debug mode, null-terminate buffers
 1.214 10-Jun-2023  rillig indent: distinguish blank lines from newline characters
 1.213 10-Jun-2023  rillig indent: fix indentation of continuation lines in initializers
 1.212 10-Jun-2023  rillig indent: extract output of an indented line to separate function
 1.211 10-Jun-2023  rillig indent: clean up function names and order in output
 1.210 10-Jun-2023  rillig indent: clean up function and variable names
 1.209 10-Jun-2023  rillig indent: rename and sort variables in parser state

No functional change.
 1.208 09-Jun-2023  rillig indent: trim trailing blank lines
 1.207 09-Jun-2023  rillig indent: when an indentation is ambiguous, indent one level further

The '-eei' mode now applies whenever the indentation from a multi-line
expression could be confused with a following statement.
 1.206 09-Jun-2023  rillig indent: format its own code
 1.205 09-Jun-2023  rillig indent: indent multi-line expressions according to parentheses

This reverts the FreeBSD change from 2004-02-12 that had been imported
on 2019-04-04.
 1.204 08-Jun-2023  rillig indent: fix indentation in multi-line else-if conditions
 1.203 08-Jun-2023  rillig indent: clean up and condense code

No functional change.
 1.202 07-Jun-2023  rillig indent: extract the stack of parser symbols to a separate struct

No functional change.
 1.201 06-Jun-2023  rillig indent: condense code for writing tabs

No functional change.
 1.200 06-Jun-2023  rillig indent: sort functions in call order

No functional change.
 1.199 06-Jun-2023  rillig indent: compute indentation of 'case' labels on-demand

One less moving part to keep track of.

No functional change.
 1.198 05-Jun-2023  rillig indent: clean up comments
 1.197 05-Jun-2023  rillig indent: don't remove blank line after 'if (expr) {'
 1.196 05-Jun-2023  rillig indent: fix formatting of 'do' statements
 1.195 05-Jun-2023  rillig indent: clean up handling of whitespace

No functional change.
 1.194 05-Jun-2023  rillig indent: let the output routines keep track of the indentation

No functional change.
 1.193 04-Jun-2023  rillig indent: remove read pointer from buffers that don't need it

The only buffer that needs a read pointer is the current input line in
'inp'.

No functional change.
 1.192 04-Jun-2023  rillig indent: force at least one space after the colon of a label
 1.191 04-Jun-2023  rillig indent: track the kind of '{' on the parser stack
 1.190 04-Jun-2023  rillig indent: ensure that the 'block init level' never goes negative

No functional change.
 1.189 04-Jun-2023  rillig indent: fix indentation of initializers in compound expressions
 1.188 04-Jun-2023  rillig indent: handle the indentation of 'case' in a simpler way
 1.187 23-May-2023  rillig indent: separate code for handling enums from the lexer

The lexer's responsibility is to generate tokens, it's not supposed to
update the parser state. Centralize the state transitions that control
indentation of enum constants to keep the lexer code clean.

Skip comments, newlines and preprocessing lines when updating the parser
state for enum constants and for '*' in declarations.
 1.186 23-May-2023  rillig indent: split debug output into paragraphs

The paragraphs separate the different processing steps: getting a token
from the lexer, processing the token, updating the parser state, sending
a finished line to the output.
 1.185 22-May-2023  rillig indent: implement suppressing optional blank lines
 1.184 20-May-2023  rillig indent: don't insert blank line between two closing lines
 1.183 20-May-2023  rillig indent: extract the output state from the parser state

The parser state depends on the preprocessing lines, the output state
shouldn't.
 1.182 20-May-2023  rillig indent: implement blank line above block comment
 1.181 20-May-2023  rillig indent: implement blank line after function body
 1.180 20-May-2023  rillig indent: ensure that no blank lines are inserted in INDENT OFF mode

No blank lines were inserted previously, but the code looked
suspicious as if that were possible.
 1.179 20-May-2023  rillig indent: implement blank lines around conditional compilation
 1.178 18-May-2023  rillig indent: manually wrap overly long lines

No functional change.
 1.177 18-May-2023  rillig indent: switch to standard code style

Taken from share/misc/indent.pro.

Indent does not wrap code to fit into the line width, it only does so
for comments. The 'INDENT OFF' sections and too long lines will be
addressed in a follow-up commit.

No functional change.
 1.176 18-May-2023  rillig indent: remove unnecessary variable size optimization

Due to the enum that follows in the struct, the short variable was
padded to 4 bytes anyway.

No functional change.
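
The padding effect can be observed with a small sketch. The struct and
enum names are hypothetical, and the result depends on the platform:
the sketch assumes a 2-byte short, a 4-byte int and an enum that has
the size and alignment of int, which holds for common ABIs but is not
guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>

enum hypothetical_kind { HK_A, HK_B };

/* The enum member forces padding after the short on common ABIs. */
struct with_short {
	short n;
	enum hypothetical_kind kind;
};

struct with_int {
	int n;
	enum hypothetical_kind kind;
};

/* Number of bytes that using 'short' saves in this layout. */
size_t
bytes_saved_by_short(void)
{
	assert(sizeof(struct with_int) >= sizeof(struct with_short));
	return sizeof(struct with_int) - sizeof(struct with_short);
}
```

Under the stated assumptions, both structs have the same size, so the
narrower type saves nothing.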
 1.175 16-May-2023  rillig indent: directly access the input buffer

No functional change.
 1.174 16-May-2023  rillig indent: remove support for form feed characters inside a line

Form feeds are occasionally used to split code into pages, and this use
is still supported. Having a form feed in the middle of a line is
exotic.
 1.173 16-May-2023  rillig indent: fix handling of INDENT OFF/ON comments

Previously, the 'INDENT OFF' comments were interpreted when the newline
token from the line above the comment was processed, which was earlier
than could be reasonably expected.

The 'INDENT ON' comments were interpreted equally early, which led to
the situation that the 'INDENT OFF' comments were preserved literally
but the 'INDENT ON' comments weren't.
 1.172 16-May-2023  rillig indent: move parsing of 'INDENT OFF/ON' comments to the lexer

No functional change.
 1.171 15-May-2023  rillig indent: fix cast detection

In process_lparen_or_lbracket, ps.paren[...].maybe_cast was not
initialized, which may have been the cause for seemingly random spacing
around binary operators.

While here, clean up the code by reducing the number of accesses to the
parser state.
 1.170 15-May-2023  rillig indent: indent multi-line conditions

No functional change.
 1.169 15-May-2023  rillig indent: fix indentation of statements after controlling expression
 1.168 15-May-2023  rillig indent: fix indentation of expressions in -nlp -eei mode
 1.167 15-May-2023  rillig indent: remove redundant include lines
 1.166 15-May-2023  rillig indent: clean up memory and buffer management

Remove the need to explicitly initialize the buffers. To avoid
subtracting null pointers or comparing them using '<', migrate the
buffers from the (start, end) form to the (start, len) form. This form
also avoids inconsistencies in whether 'buf.e == buf.s' or 'buf.s ==
buf.e' is used.

Make buffer.st const, to avoid accidental modification of the buffer's
content.

Replace '*buf.e++ = ch' with buf_add_char, to avoid having to keep track
of how much unwritten space is left in the buffer. Remove all safety
margins, that is, no more unchecked access to buf.st[-1] or appending
using '*buf.e++'.

Fix line number counting in lex_word for words that contain line breaks.

No functional change.
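
A minimal sketch of the (start, len) form, under the assumption that a
zero-initialized buffer is valid and grows on demand. Only the name
buf_add_char comes from the description above; the field names, the
growth strategy and the error handling are assumptions for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct buffer {
	char *st;	/* start of the memory, NULL while nothing is allocated */
	size_t len;	/* number of bytes in use */
	size_t cap;	/* number of bytes allocated */
};

/* Appends a single character, growing the buffer if it is full. */
void
buf_add_char(struct buffer *buf, char ch)
{
	assert(buf->len <= buf->cap);
	if (buf->len == buf->cap) {
		size_t new_cap = buf->cap == 0 ? 16 : 2 * buf->cap;
		char *new_st = realloc(buf->st, new_cap);
		if (new_st == NULL)
			abort();
		buf->st = new_st;
		buf->cap = new_cap;
	}
	buf->st[buf->len++] = ch;
}
```

Since the length is stored as a number, an empty buffer never requires
subtracting or comparing null pointers, and the caller does not need to
reserve any safety margin before appending.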
 1.165 14-May-2023  rillig indent: only null-terminate the buffers if necessary

The only case where a buffer is used as a C-style string is when looking
up a keyword.

No functional change.
 1.164 14-May-2023  rillig indent: reduce code for scanning tokens

The input line is guaranteed to end with '\n', so there's no need to
carry another pointer around.

No functional change.
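
A minimal sketch of why the trailing '\n' makes an end pointer
unnecessary. The function name is hypothetical; the point is that the
newline acts as a sentinel, since it does not match any of the
characters the loop is looking for.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Counts the leading blanks of a line. The line must end with '\n',
 * which stops the loop at the latest.
 */
size_t
skip_blanks(const char *line)
{
	assert(line != NULL);
	const char *p = line;
	while (*p == ' ' || *p == '\t')
		p++;
	return (size_t)(p - line);
}
```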
 1.163 14-May-2023  rillig indent: remove foreign RCS IDs
 1.162 14-May-2023  rillig indent: miscellaneous cleanups
 1.161 13-May-2023  rillig indent: do not add a blank at the beginning of a line

Most calls to output_line did already reset the variable. There may be
some untested edge cases in or after comments, but these should be fine
as well.
 1.160 13-May-2023  rillig indent: implement 'blank after declarations'
 1.159 13-May-2023  rillig indent: use enum instead of magic numbers for tracking declarations

No functional change.
 1.158 13-May-2023  rillig indent: improve names of option variables

No functional change.
 1.157 13-May-2023  rillig indent: rename struct fields for buffers

No binary change except for assertion line numbers.
 1.156 13-May-2023  rillig indent: move debugging code to separate file

No functional change.
 1.155 12-May-2023  rillig indent: remove statistics

The numbers from the statistics were wrong.
 1.154 11-May-2023  rillig indent: clean up input buffer handling

No functional change.
 1.153 11-May-2023  rillig indent: don't touch comments in preprocessing lines

The indentation of multi-line comments was wrong, and the code for
handling them was too complicated.
 1.152 11-May-2023  rillig tests/indent: add more tests for preprocessing directives
 1.151 11-May-2023  rillig indent: remove unused code
 1.150 11-May-2023  rillig indent: remove broken code for handling blank lines

This fixes several bugs where blank lines were erroneously added or
removed, trading these old bugs for new bugs in different places.
These new bugs are expected to be easier to fix, as the old bugs will
not interfere anymore.
 1.149 11-May-2023  rillig indent: add debug output for tracking comments and braces
 1.148 23-Apr-2022  rillig indent: group global variables related to output control

No functional change.
 1.147 13-Feb-2022  rillig indent: consistently use nparen for indexing parser_state.paren

No binary change.
 1.146 13-Feb-2022  rillig indent: rename parser_state.p_l_follow and paren_level

The previous variable names were misleading.

Paren_level is not the current level of parentheses but the one from the
beginning of the current output line. For better accuracy, rename it to
line_start_paren_level.

P_l_follow is not the level of parentheses that will be active at some
point in the future, as the previous name suggested. Instead, it is the
level of parentheses right now. For better accuracy, rename it to
nparen. This nicely matches its main usage, which is as index to the
parser_state.paren array.

No binary change.
 1.145 13-Feb-2022  rillig indent: replace bitmasking code with struct

The struct directly represents the properties of a pair of parentheses,
without forcing the human reader to decode any bitset. This makes it
easier to find the remaining bugs in the heuristic for determining the
kind of parentheses.

No functional change outside debug mode.
 1.144 12-Feb-2022  rillig indent: fix indentation of enum constants in typedef (since 2019-04-04)

The solution is not elegant since it adds a small state machine inside
the parser state, but at least these states only depend on the sequence
of token types and not on any other part of the parser state.

Reported in PR#55453.
 1.143 28-Nov-2021  rillig indent: clean up and document input handling

The transformation of moving comments from after an 'if (expr)' to after
the following brace has a large implementation cost (about 300 lines of
code) and makes input handling quite complicated. Document the overall
idea to save future readers some time.

No functional change.
 1.142 27-Nov-2021  rillig indent: accept a few formatting suggestions from indent

The remaining issues are still that the conditions look ambiguous even
with -eei, and that __attribute__ is broken into a separate line.

No functional change.
 1.141 27-Nov-2021  rillig indent: rename dump functions to output

No functional change.
 1.140 27-Nov-2021  rillig indent: add assertions for input handling

Just to document the invariants; the code is already OK.
 1.139 26-Nov-2021  rillig indent: enhance debug logging for input handling
 1.138 26-Nov-2021  rillig indent: replace inp_enlarge with inp_add

Previously, inbuf.inp.s was only updated at the very end of reading a
line from the input file, which meant that during debugging, it pointed
to invalid memory. Updating all fields in inbuf.inp after every
reallocation makes the code less tricky to understand.

No functional change.
 1.137 26-Nov-2021  rillig indent: split inp_read_line into smaller functions

No functional change.
 1.136 26-Nov-2021  rillig indent: extract inp_from_file from inp_read_line

No functional change.
 1.135 26-Nov-2021  rillig indent: remove code that fixes malformed preprocessor directives
 1.134 26-Nov-2021  rillig indent: move ind_add from io.c to indent.c

It's a general-purpose function that is not directly related to input or
output.
 1.133 25-Nov-2021  rillig indent: prevent undefined behavior in inp_line_start

No functional change.
 1.132 25-Nov-2021  rillig indent: update cross-reference comments for bug in comment handling

The function was renamed in io.c 1.122 from 2021-11-19.
 1.131 25-Nov-2021  rillig indent: rename ps.in_stmt to in_stmt_or_decl

The previous name didn't match reality.

No functional change.
 1.130 25-Nov-2021  rillig indent: rename ps.ind_stmt to in_stmt_cont

This makes a comment redundant.

No functional change.
 1.129 19-Nov-2021  rillig indent: reduce casts to unsigned char for character classification

No functional change.
 1.128 19-Nov-2021  rillig indent: keep inbuf.save_com_s and inbuf.save_com_e in sync

No functional change.
 1.127 19-Nov-2021  rillig indent: fix included headers
 1.126 19-Nov-2021  rillig indent: clean up io.c

No functional change.
 1.125 19-Nov-2021  rillig indent: replace ps.procname with ps.is_function_definition

Only the first character of ps.procname was ever read, and it was only
compared to '\0'. Using a bool for this means simpler code, less
memory and fewer wasted CPU cycles due to the removed strncpy.

No functional change.
 1.124 19-Nov-2021  rillig indent: unexport inbuf

No functional change.
 1.123 19-Nov-2021  rillig indent: use character input API from pr_comment.c

No functional change.
 1.122 19-Nov-2021  rillig indent: remove all references to inbuf from indent.c

No functional change.
 1.121 19-Nov-2021  rillig indent: move character input handling from indent.c to io.c

No functional change.
 1.120 19-Nov-2021  rillig indent: move character input from indent.c to io.c

No functional change.
 1.119 19-Nov-2021  rillig indent: use character input API from the tokenizer

No functional change.
 1.118 19-Nov-2021  rillig indent: move character input handling from lexi.c to io.c

No functional change.
 1.117 19-Nov-2021  rillig indent: group variables for input handling

No functional change.
 1.116 07-Nov-2021  rillig indent: rename 'inbuf' functions to 'inp'

The variable 'inp' used to be named 'inbuf'. Make the function names
correspond to the variable name again.

No functional change.
 1.115 05-Nov-2021  rillig indent: the '+ 1' in dump_line_code is not an off-by-one error
 1.114 04-Nov-2021  rillig indent: do not discard former error comments anymore

Since io.c 1.20 from 2019-10-19, indent has not placed error comments in
the code anymore. Since these comments are supposed to be cleaned up
immediately, there is no point in having code for handling them.
 1.113 04-Nov-2021  rillig indent: extract compute_code_indent_lineup into separate function

Having 9 different paths in a single function made it more complicated
to understand than necessary.

No functional change.
 1.112 04-Nov-2021  rillig indent: fix off-by-one confusion in paren_indent

The variable was called 'indent' but actually contained a 'column',
which was off by one.

No functional change.
 1.111 04-Nov-2021  rillig indent: replace column computation with indentation computation

No functional change.
 1.110 04-Nov-2021  rillig indent: group conditions in compute_code_indent by topic

No functional change.
 1.109 03-Nov-2021  rillig indent: inline indentation_after, shorten function name to ind_add

There were only few calls to indentation_after, so inlining it spares
the need to look at yet another function definition. Another effect is
that code.s and code.e appear in the code as a pair now, instead of a
single code.s, making the scope of the function call obvious.

In ind_add, there is no need to check for '\0' anymore since none of the
buffers can ever contain a null character, these are filtered out by
inbuf_read_line.

No functional change.
 1.108 30-Oct-2021  rillig indent: inline macro label_offset

No functional change.
 1.107 29-Oct-2021  rillig indent: merge isblank and is_hspace into ch_isblank

No functional change.
 1.106 29-Oct-2021  rillig indent: fix undefined behavior in buffer handling

Adding an arbitrary integer to a pointer may result in an out of bounds
pointer, so replace the addition with a pointer subtraction.

In the buffer handling functions, handle 'buf' and 'l' before 's' and
'e', since they are pairs.

In inbuf_read_line, use 's' instead of 'buf' to make the code easier to
understand for human readers.

No functional change.
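
The pointer arithmetic issue described in 1.106 can be sketched as follows. This is a minimal illustration, not indent's actual code: the struct layout and the function name buf_has_room are hypothetical, only the member names 'buf', 'l', 's' and 'e' are taken from the log message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer layout, modeled on the names in the log message. */
struct buffer {
	char *buf;	/* start of the allocated memory */
	char *l;	/* end of the usable memory (limit) */
	char *s;	/* start of the stored text */
	char *e;	/* end of the stored text */
};

/*
 * Risky form: 'b->e + n' may point far outside the allocation, and
 * merely computing such a pointer is undefined behavior.
 *
 *	return b->e + n < b->l;
 *
 * Safer form: 'l' and 'e' point into the same allocation, so their
 * difference is well-defined and can be compared to n.
 */
static bool
buf_has_room(const struct buffer *b, size_t n)
{
	assert(b->e <= b->l);
	return n < (size_t)(b->l - b->e);
}
```

With this form, even an absurdly large n is handled safely, since n is only compared and never added to a pointer.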
 1.105 29-Oct-2021  rillig indent: reorder global variables to be more intuitive

The buffer 'inp' comes first. From there, a single token is read into
the buffer 'token'. From there, it usually ends up in 'code'. The buffer
'token' does not belong to the group of the other 3 buffers, which
together make up a line of formatted output.

No functional change.
 1.104 29-Oct-2021  rillig indent: rename ps.dumped_decl_indent and indent_declaration

The word 'dump' in 'ps.dumped_decl_indent' was too close to dump_line,
which led to confusion since the variable controls whether the
indentation has been added to the code buffer, which happens way before
actually dumping the current line to the output file.

The function name 'indent_declaration' was too unspecific, it did not
reveal where the indentation of the declaration actually happened.

No functional change.
 1.103 27-Oct-2021  rillig indent: fix indentation of local variable declarations

This had been broken since the import of FreeBSD indent in 2019.
 1.102 24-Oct-2021  rillig indent: clean up format of warnings and errors

Previously, warnings and errors had the form of C block comments. Before
NetBSD io.c 1.20 from 2019-10-19, this format made sense because the
diagnostics could end up in the same output stream as the formatted
output.

Since NetBSD io.c 1.20 from 2019-10-19, all diagnostics are redirected
to stderr. This change was not mentioned in the commit message back
then, it makes sense nevertheless. Since stdout and stderr now are
properly separated, there is no need anymore to keep the weird format
for warnings and errors. Switch to the standard 'error: file:line'
format.

Move the function 'diag' to indent.c to have access to the name of the
current input file.
 1.101 24-Oct-2021  rillig indent: replace global variable use_ff with function parameter
 1.100 24-Oct-2021  rillig indent: sort includes
 1.99 20-Oct-2021  rillig indent: rename blankline_requested variables

The words 'prefix' and 'postfix' sounded too much like horizontal
concepts, like in operators. The actual purpose of these variables is to
add blank lines before and after the current line, so use the same
wording as in the command line options.

No functional change.
 1.98 20-Oct-2021  rillig indent: rename next_blank_lines to blank_lines_to_output

The previous name was already an improvement over the name before that
(n_real_blanklines), but didn't express the intended purpose clearly
enough, so try another name.

No functional change.
 1.97 19-Oct-2021  rillig indent: always keep next_blank_lines >= 0

No functional change.
 1.96 19-Oct-2021  rillig indent: use simpler code for copying the input buffer

In debug mode, this reduces the amount of debug output lines.

No functional change in default mode.
 1.95 19-Oct-2021  rillig indent: if a file ends with indent off, don't add space-newline
 1.94 11-Oct-2021  rillig indent: use bool for suppress_blanklines

It only ever got assigned the values 0 and 1.

No functional change.
 1.93 11-Oct-2021  rillig indent: remove dead code
 1.92 09-Oct-2021  rillig indent: condense code for calculating indentations

No functional change.
 1.91 09-Oct-2021  rillig indent: extract common code for advancing a single tab

No functional change.
 1.90 08-Oct-2021  rillig indent: improve local variable names

No functional change.
 1.89 08-Oct-2021  rillig indent: rename fill_buffer to inbuf_read_line

No functional change.
 1.88 08-Oct-2021  rillig indent: run indent on indent.h

The formatting looks mostly OK.

Some struct members had excessively long names, leaving no space for
their corresponding comments. Renamed some of them using well-known
abbreviations.

The formatting for debug_vis_range is messed up, no idea why. It is
clearly a function declaration, not a function definition, so there is
no need to place the function name in column 1.

No functional change.
 1.87 08-Oct-2021  rillig indent: fix formatting of C99 comments

The first attempt at formatting C99 comments was conceptually wrong. It
accessed the next token in dump_line, even though that function should
only ever look at the buffers for the label, the code and the current
comment. (Understanding that part of the code was difficult at that time
due to the sheer number of global variables.) The complicated and
ever-growing condition for whether to output the token was a hack and in
retrospect doesn't make sense at all, that's why it only came close to
the intended effect.

Some unintended side effects were that the C99 comments had an
additional space in front of them, and that in some cases an empty line
followed the comment, and that the comments were not aligned.

Previously, the newline that terminates the C99 comment was included in
the comment. Separating the newline from the comment fixed all these
unintended side effects. The only downside is that the multi-line
statement is not indented, but that should be easy to fix.
 1.86 08-Oct-2021  rillig indent: split dump_line into smaller functions

No functional change.
 1.85 08-Oct-2021  rillig indent: replace column calculations with indent, part 4/4
 1.84 08-Oct-2021  rillig indent: replace column calculations with indent, part 3

No functional change.
 1.83 08-Oct-2021  rillig indent: replace column calculations with indent, part 2

No functional change.
 1.82 08-Oct-2021  rillig indent: calculate indentation instead of column

This avoids constantly adding and subtracting 1.

No functional change.
 1.81 08-Oct-2021  rillig indent: reduce indentation in dump_line
 1.80 07-Oct-2021  rillig indent: rename bp_save to saved_inp_s, be_save to saved_inp_e

Using the same naming convention makes it easier to relate the
variables.

No functional change.
 1.79 07-Oct-2021  rillig indent: group variables for the input buffer

The input buffer follows the same concept as the intermediate buffers
for label, code, comment and token, so use the same type for it.

No functional change.
 1.78 07-Oct-2021  rillig indent: use braces around multi-line statements

No functional change.
 1.77 07-Oct-2021  rillig indent: let the code breathe a bit by inserting empty lines

No functional change.
 1.76 07-Oct-2021  rillig indent: remove redundant comments

No functional change.
 1.75 07-Oct-2021  rillig indent: clean up colon handling

No functional change.
 1.74 07-Oct-2021  rillig indent: raise WARNS from the default 5 up to 6
 1.73 05-Oct-2021  rillig indent: rewrite parse_indent_comment for human readers

The generated code is still very similar, GCC does a good job at
inlining strncmp.

No functional change.
 1.72 05-Oct-2021  rillig indent: rename n_real_blanklines

The word 'n' was not as helpful as possible, the word 'real' did not
give any clue at all about the variable's purpose.

No functional change.
 1.71 05-Oct-2021  rillig indent: rename local char variable, reduce scope of counters

No functional change.
 1.70 05-Oct-2021  rillig indent: use proper escape sequence for form feed

This escape sequence has been available since at least 1978.
 1.69 05-Oct-2021  rillig indent: merge duplicate code into is_hspace

No functional change.
 1.68 26-Sep-2021  rillig indent: unexport global variables

The variable match_state was write-only and was thus removed.

No functional change.
 1.67 26-Sep-2021  rillig indent: let indent format its own code -- in supervised mode

After running indent on the code, I manually selected each change that
now looks better than before. The remaining changes are left for later.
All in all, indent did a pretty good job, except for syntactic additions
from after 1990, but that was to be expected. Examples for such
additions are GCC's __attribute__ and C99 designated initializers.

Indent has only few knobs to tune the indentation. The knob for the
continuation indentation applies to function declarations as well as to
expressions. The knob for indentation of local variable declarations
applies to struct members as well, even if these are members of a
top-level struct.

Several code comments crossed the right margin in column 78. Several
other code comments were correctly broken though. The cause for this
difference was not obvious.

No functional change.
 1.66 25-Sep-2021  rillig indent: misc cleanup

No functional change.
 1.65 25-Sep-2021  rillig indent: convert found_err to bool

That variable had slipped through the migration since it consistently
used int for the declaration, the definition and all assignments.

No functional change.
 1.64 25-Sep-2021  rillig indent: un-abbreviate a few parser_state members, clean up comments

No functional change.
 1.63 25-Sep-2021  rillig indent: remove dead code for printing comments after empty lines

This code has been commented out for at least 29 years.

No functional change.
 1.62 25-Sep-2021  rillig indent: convert remaining ibool to bool

No functional change intended.
 1.61 25-Sep-2021  rillig indent: convert parser_state from ibool to bool

indent.c:400:5: error: suggest parentheses around assignment used as
truth value
io.c:271:32: error: ‘~’ on a boolean expression

No functional change intended.
 1.60 25-Sep-2021  rillig indent: prepare for lint's strict bool mode

Before C99, C had no boolean type. Instead, indent used int for that,
just like many other programs. Even with C99, bool and int can be used
interchangeably in many situations, such as querying '!i' or '!ptr' or
'cond == 0'.

Since January 2021, lint provides the strict bool mode, which makes bool
a non-arithmetic type that is incompatible with any other type. Having
clearly separate types helps in understanding the code.

To migrate indent to strict bool mode, the first step is to apply all
changes that keep the resulting binary the same. Since sizeof(bool) is
1 and sizeof(int) is 4, the type ibool serves as an intermediate type.
For now it is defined to int, later it will become bool.

The current code compiles cleanly in C99 and C11 mode, as well as in
lint's strict bool mode. There are a few tricky places:

In args.c in 'struct pro', there are two types of options: boolean and
integer. Boolean options point to a bool variable, integer options
point to an int variable. To keep the current structure of the code,
the pointer has been changed to 'void *'. To ensure type safety, the
definition of the options is done via preprocessor magic, which in C11
mode ensures the correct pointer types. (Add CFLAGS+=-std=gnu11 at the
very bottom of the Makefile.)

In indent.c in process_preprocessing, a boolean variable is
post-incremented. That variable is only assigned to another variable,
and that variable is only used in a boolean context. To provoke a
different behavior between the '++' and the '= true', the source code
to be indented would need 1 << 32 preprocessing directives, which is
unlikely to happen in practice.

In io.c in dump_line, the variables ps.in_stmt and ps.in_decl only ever
get the values 0 and 1. For these values, the expressions 'a & ~b' and
'a && !b' are equivalent, in all versions of C. The compiler may
generate different code for them, though.

In io.c in parse_indent_comment, the assignment to inhibit_formatting
takes place in integer context. If the compiler is smart enough to
detect the possible values of on_off, it may generate the same code
before and after the change, but that is rather unlikely.

The second step of the migration will be to replace ibool with bool,
step by step, just in case there are any hidden gotchas in the code,
such as sizeof or pointer casts.

No change to the resulting binary.
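
The claim in 1.60 that 'a & ~b' and 'a && !b' agree for the values 0 and 1 can be checked with a small sketch. The function names old_style and new_style are hypothetical and do not appear in indent's source.

```c
#include <assert.h>
#include <stdbool.h>

/* Pre-migration style: int flags that only ever hold 0 or 1. */
static int
old_style(int a, int b)
{
	assert((a == 0 || a == 1) && (b == 0 || b == 1));
	return a & ~b;
}

/*
 * Post-migration style: real bool operands.
 *
 * Note that '~' applied to a bool first promotes it to int, so ~true
 * is ~1, which is nonzero and therefore still "true". That is why the
 * bitwise form cannot simply be kept once the operands are bool.
 */
static bool
new_style(bool a, bool b)
{
	return a && !b;
}
```

For all four combinations of 0 and 1, both functions yield the same truth value; outside that value range the two expressions differ.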
 1.59 25-Sep-2021  rillig indent: remove ifdef for lint

NetBSD lint does not need them anymore, FreeBSD does not have lint.
 1.58 25-Sep-2021  rillig indent: move statistical values into a separate struct

No functional change.
 1.57 25-Sep-2021  rillig indent: add nonnull memory allocation functions

The only functional change is a single error message.
 1.56 25-Sep-2021  rillig indent: group global variables for token buffer

No functional change.
 1.55 25-Sep-2021  rillig indent: group global variables for code buffer

No functional change.
 1.54 24-Sep-2021  rillig indent: group global variables for label buffer into struct

No functional change.
 1.53 24-Sep-2021  rillig indent: extract parse_indent_comment from fill_buffer

No functional change.
 1.52 24-Sep-2021  rillig indent: group global variables for the comment buffer

No functional change.
 1.51 24-Sep-2021  rillig indent: rename local variable in fill_buffer

The local variable name 'com' prevented grouping the global variables
combuf, s_com, e_com and l_com into a struct named 'com'.

No functional change.
 1.50 24-Sep-2021  rillig indent: fix token duplication after C99 comment

The code that keeps blank lines after C99 comments still looks wrong,
but at least it's better than before.
 1.49 14-Mar-2021  rillig indent: make compute_code_indent more readable

The '?:' operator computing the factor was too hard to read. When
quickly scanning the code, the 1 in the expression looked too much like
it would be added to the indentation, which would turn the indentation
length into a column number, and that again would smell like an
off-by-one error.

No functional change.
 1.48 14-Mar-2021  rillig indent: fix lint warnings

No functional change.
 1.47 13-Mar-2021  rillig indent: add debug logging for switching the input buffer

No functional change outside debug mode.
 1.46 13-Mar-2021  rillig indent: align comments in indent's own code

No functional change.
 1.45 13-Mar-2021  rillig indent: rename local variable in dump_line

This clarifies that the variable names a column, not an indentation.
 1.44 13-Mar-2021  rillig indent: in dump_line, reduce scope of local variable

This allows the variable 'target' in the lower half of the function to
get a more specific name.

No functional change.
 1.43 13-Mar-2021  rillig indent: distinguish between 'column' and 'indentation'

column == 1 + indentation.

In addition, indentation is a relative distance while column is an
absolute position. Therefore, don't confuse these two concepts, to
prevent off-by-one errors.

No functional change.
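
The distinction from 1.43 can be sketched in a few lines of C. The helper names column_from_indent and indent_after_tab are hypothetical, chosen for illustration, and the tab width is assumed to be passed in explicitly.

```c
#include <assert.h>

/*
 * An indentation is a distance from the left margin, starting at 0.
 * A column is a position on the line, starting at 1.
 */
static int
column_from_indent(int indent)
{
	assert(indent >= 0);
	return 1 + indent;
}

/*
 * Indentation after appending a single tab. Working with indentations
 * keeps the formula free of '+ 1' and '- 1' adjustments.
 */
static int
indent_after_tab(int indent, int tabsize)
{
	assert(indent >= 0 && tabsize > 0);
	return indent - indent % tabsize + tabsize;
}
```

The conversion to a column then happens in exactly one place, at the point where a position rather than a distance is needed.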
 1.42 13-Mar-2021  rillig indent: fix confusing variable names

The word 'col' should only be used for the 1-based column number. This
name is completely inappropriate for a line length since that provokes
off-by-one errors. The name 'cols' would be acceptable although
confusing since it sounds so similar to 'col'.

Therefore, rename variables that are related to the maximum line length
to 'line_length' since that makes for obvious code and nicely relates to
the description of the option in the manual page.

No functional change.
 1.41 13-Mar-2021  rillig indent: inline calls to count_spaces and count_spaces_until

These two functions operated on column numbers instead of indentation,
which required adjustments of '+ 1' and '- 1'. Their names were
completely wrong since these functions did not count anything, instead
they computed the column.

No functional change.
 1.40 13-Mar-2021  rillig indent: replace column computation with indentation computation

No functional change.
 1.39 13-Mar-2021  rillig indent: replace compute_code_column with compute_code_indent

The goal is to only ever be concerned about the _indentation_ of a
token, never the _column_ it appears in. Having only one of these
avoids off-by-one errors.

No functional change.
 1.38 13-Mar-2021  rillig indent: replace compute_label_column with compute_label_indent

Using the invariant 'column == 1 + indent'. This removes several overly
complicated '+ 1' from the code that are not needed conceptually.

No functional change.
 1.37 13-Mar-2021  rillig indent: manually fix indentation in indent's own source code
 1.36 13-Mar-2021  rillig indent: add debug logging for actually writing to the output file

Together with the results of the tokenizer and the 4 buffers for token,
label, code and comment, the debug log now provides a good high-level
view on how the indentation happens and where to look for the many
remaining bugs.
 1.35 13-Mar-2021  rillig indent: remove strange debugging code that went in the output file

Whenever the code to be output contained the magic byte 0x80, instead of
writing this byte, indent wrote the column number at the beginning of
the code snippet, times 7. Especially the 'times 7' does not make any
sense at all.

In ISO-8859-1, this character position is not assigned. In Microsoft
Codepage 1252 it is the Euro sign. In UTF-8 (which was probably not on
the author's list when the code was originally written) it occurs as the
middle byte for code points like U+2026 (horizontal ellipsis) from the
block General Punctuation.

Remove this strange code, thereby fixing indent for UTF-8 code. The
code had been there since at least 1993-04-09, when it was first
imported to NetBSD.
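
To see why the byte 0x80 shows up in UTF-8 text, as mentioned in 1.35, here is a minimal sketch of the 3-byte UTF-8 encoding. The helper utf8_encode_3 is hypothetical and ignores the surrogate range for brevity.

```c
#include <assert.h>

/* Encode a code point from U+0800..U+FFFF as 3 UTF-8 bytes. */
static void
utf8_encode_3(unsigned int cp, unsigned char out[3])
{
	assert(cp >= 0x0800 && cp <= 0xFFFF);
	out[0] = (unsigned char)(0xE0 | (cp >> 12));
	out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
	out[2] = (unsigned char)(0x80 | (cp & 0x3F));
}
```

For U+2026 the middle 6 bits are all zero, so the second byte is exactly 0x80 and the full sequence is E2 80 A6.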
 1.34 13-Mar-2021  rillig indent: replace pad_output with output_indent

Calculating the indentation is simpler than calculating the column,
since that saves the constant addition and subtraction of the 1.

No functional change.
 1.33 13-Mar-2021  rillig indent: clean up verbose documentation comments from the 1970s

Since C90, there is no need to repeat the type of the function
parameters.

In the whole code of indent, there is a lot of confusion between the
concepts of a 'column' (which is a position on the screen, counting
starts at 1) and 'indentation' (which is a length, not a position). To
avoid this confusion, the code will be rewritten anyway very soon.

Repeatedly adding and subtracting 1 from the 'current column' is not
elegant, this should rather be done by consistently measuring only the
indentation from the left border (at offset 0), as a distance, not as an
absolute position.
 1.32 12-Mar-2021  rillig indent: add 'const', rename variables, reorder formula for tab width

Column counting starts at 1. This 1 should rather be at the beginning
of the formula since it is thought of being added at the very beginning
of the line, not at the end.

When adding a tab, the newly added tab is added at the end of the
string, therefore that '+ 1' should be at the end of the formula as
well.

No functional change.
 1.31 12-Mar-2021  rillig indent: replace 'target' with 'indent' in function names

The word 'target' was not as specific as possible.

No functional change.
 1.30 12-Mar-2021  rillig indent: use consistent indentation for 'else'

Half of the code used -ce, the other half the opposite -nce.

No functional change.
 1.29 12-Mar-2021  rillig indent: make output_string inline

GCC 9.3.0 didn't notice that the argument to this function is always a
string literal, which makes it worthwhile to inline the call.
 1.28 12-Mar-2021  rillig indent: add helper functions for doing the actual output

This allows adding debug logging to these few functions instead of all
other places that might output something.

Reducing the possible output formats to a few primitives makes dump_line
simpler, especially the fprintf calls. It also removes the non-constant
printf string.

The call to output_int may be meant for debugging, as the character 0x80
is unlikely to appear in any real-world code.

No functional change.
 1.27 08-Mar-2021  rillig indent: remove redundant initializer in dump_line

No functional change.
 1.26 08-Mar-2021  rillig indent: move comment about dump_line to column 1

It looked misplaced on the right side since that area is usually
reserved for small remarks, not long explanations.

No functional change.
 1.25 08-Mar-2021  rillig indent: always use braces in do-while loops

Having a 'while' at the beginning of a line looks as if it would start a
loop. It's confusing when it _ends_ a loop instead.
 1.24 07-Mar-2021  rillig indent: fix handling of '//' end-of-line comments
 1.23 07-Mar-2021  rillig indent: remove redundant parentheses around return value

No functional change.
 1.22 07-Mar-2021  rillig indent: use all headers in all files

This is a prerequisite for converting the token types to an enum instead
of a preprocessor define, since the return type of lexi will become
token_type. Having the enum will make debugging easier.

There was a single naming collision, which forced the variable in
scan_profile to be renamed. All other token names are used nowhere
else.

No change to the resulting binary.
 1.21 06-Mar-2021  rillig indent: fix space-tab alignment in indent's own code

These parts are not fixed automatically by indent since they are in box
comments.

No functional change.
 1.20 19-Oct-2019  christos use stdarg, annotate function as __printflike and fix broken formats.
 1.19 04-Apr-2019  kamil Upgrade indent(1)

Merge all the changes from the recent FreeBSD HEAD snapshot
into our local copy.

FreeBSD actively maintains this program in their sources and their
repository contains over 100 commits with changes.

Keep the delta between the FreeBSD and NetBSD versions to an absolute
minimum, mostly RCS Id and compatibility fixes.

Major changes in this import:

- Added an option -ldi<N> to control indentation of local variable names.
- Added option -P for loading user-provided files as profiles
- Added -tsn for setting tabsize
- Rename -nsac/-sac ("space after cast") to -ncs/-cs
- Added option -fbs, which enables (disables) splitting the function declaration and opening brace across two lines.
- Respect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)
- Group global option variables into an options structure
- Use bsearch() for looking up type keywords.
- Don't produce unneeded space character in function declarators
- Don't unnecessarily add a blank before a comment ends.
- Don't ignore newlines after comments that follow braces.

Merge the FreeBSD indent(1) tests with our ATF framework.
All tests pass.

Upgrade prepared by Manikishan Ghantasala.
Final polishing by myself.
 1.18 03-Feb-2019  mrg - add or adjust /* FALLTHROUGH */ where appropriate
- add __unreachable() after functions that can return but won't in
this case, and thus can't be marked __dead easily
 1.17 25-Feb-2016  ginsbach branches: 1.17.16;
Fix obvious contraction spelling mistakes by adding missing apostrophes.
 1.16 22-Feb-2016  ginsbach Use errx(3).
 1.15 04-Sep-2014  mrg port the -ut / -nut options from freebsd. -ut (default) enables tabs
in output, the -nut uses spaces.
 1.14 12-Apr-2009  lukem branches: 1.14.24;
Fix WARNS=4 issues (-Wshadow -Wcast-qual -Wsign-compare)
 1.13 16-Oct-2003  itojun branches: 1.13.42;
safer use of realloc
 1.12 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22365, verified by myself.
 1.11 26-May-2002  wiz Remove #ifndef'd __STDC__ code. ANSIfy.
 1.10 14-Oct-2000  is Due to infinite wisdom by the language designers, the difference of pointers
has a type of (int) on i386 and (long) on sparc, and I don't even want to
know what else on other cpu types.
 1.9 19-Dec-1998  christos branches: 1.9.2; 1.9.10;
char -> unsigned char, braces for gcc-2.8.1
 1.8 25-Aug-1998  ross Add { and } to shut up egcs. Reformat the more questionable code.
 1.7 30-Mar-1998  mrg use static int instead of static
 1.6 19-Oct-1997  mrg fix compile warnings on the alpha.
 1.5 19-Oct-1997  lukem WARNSify, fix .Nm usage, deprecate register, use <err.h>, KNFify (with indent!;)
 1.4 18-Oct-1997  mrg merge lite-2.
 1.3 09-Jan-1997  tls RCS ID police
 1.2 01-Aug-1993  mycroft Add RCS identifiers.
 1.1 09-Apr-1993  cgd branches: 1.1.1;
added, from net/2 (patch 124).
 1.1.1.2 04-Apr-2019  kamil FreeBSD indent r340138
 1.1.1.1 06-Jun-1993  mrg 4.4BSD-Lite2
 1.9.10.1 17-Oct-2000  tv Pullup 1.10 [is]:
Due to infinite wisdom by the language designers, the difference of pointers
has a type of (int) on i386 and (long) on sparc, and I don't even want to
know what else on other cpu types.
 1.9.2.1 19-Oct-2000  he Pull up revision 1.10 (requested by is):
The type of the difference between pointers is implementation-
defined (long on sparc and int on most others). Compensate when
printing.
 1.13.42.1 13-May-2009  jym Sync with HEAD.

Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
 1.14.24.1 21-Sep-2014  snj Pull up following revision(s) (requested by mrg in ticket #110):
usr.bin/indent/io.c: revision 1.15
usr.bin/indent/indent_globs.h: revision 1.10
usr.bin/indent/args.c: revision 1.11
usr.bin/indent/indent.1: revision 1.23
usr.bin/indent/indent.c: revision 1.19
port the -ut / -nut options from freebsd. -ut (default) enables tabs
in output, the -nut uses spaces.
 1.17.16.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.17.16.1 10-Jun-2019  christos Sync with HEAD
 1.235.2.1 02-Aug-2025  perseant Sync with HEAD
 1.242 03-Dec-2023  rillig indent: inline input-related macros

No binary change.
 1.241 03-Dec-2023  rillig indent: use line number of the token start in diagnostics

Previously, the line number of the end of the token was used, which was
confusing in debug mode.
 1.240 03-Dec-2023  rillig indent: fix line number counting in function definition

In a function definition that is split on two lines, if the first line
ends with a '*', the following line break didn't include the line
number.
 1.239 26-Jun-2023  rillig indent: improve heuristics for '*' as pointer in for loops
 1.238 26-Jun-2023  rillig indent: improve heuristics for '*' as a pointer type
 1.237 26-Jun-2023  rillig indent: clean up indentation
 1.236 25-Jun-2023  rillig indent: move cast detection from the lexer to the main processor

It is not the job of the lexer to modify the parser state.
 1.235 25-Jun-2023  rillig indent: treat 'complex' and 'imaginary' as type modifiers, not as types
 1.234 25-Jun-2023  rillig indent: fix formatting of parenthesized name in function definition
 1.233 25-Jun-2023  rillig indent: don't use strspn on inp_p, as it is not null-terminated

No functional change.
 1.232 17-Jun-2023  rillig indent: clean up

Extract duplicate code for handling line continuations.

Prevent theoretical undefined behavior in strspn, as inp.s is not
null-terminated.

Remove adding extra space characters when processing comments, as these
are not necessary to force a line of output.

No functional change.
 1.231 17-Jun-2023  rillig indent: miscellaneous cleanups

No binary change.
 1.230 16-Jun-2023  rillig indent: merge lexer symbols for type in/outside parentheses
 1.229 14-Jun-2023  rillig indent: clean up array indexing for parser symbols

With 'top' pointing to the actual top element, the array was indexed in
the closed range from 0 to top. All other arrays are indexed by the
usual half-open interval from 0 to len.

No functional change.
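
The indexing convention from 1.229 can be sketched with a small stack. The names sym_stack, sym_push and sym_top and the fixed capacity are hypothetical, they only illustrate the half-open interval from 0 to len.

```c
#include <assert.h>
#include <stddef.h>

#define SYM_STACK_CAP 8		/* arbitrary capacity for this sketch */

struct sym_stack {
	int item[SYM_STACK_CAP];
	size_t len;		/* valid indexes are 0 <= i < len */
};

static void
sym_push(struct sym_stack *st, int sym)
{
	assert(st->len < SYM_STACK_CAP);
	st->item[st->len++] = sym;
}

/* The top element sits at len - 1; an empty stack has len == 0. */
static int
sym_top(const struct sym_stack *st)
{
	assert(st->len > 0);
	return st->item[st->len - 1];
}
```

With 'len' instead of 'top', the empty stack needs no special value such as -1, and loops over the elements take the usual 'i < len' form.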
 1.228 14-Jun-2023  rillig indent: allow more than 20 nested parentheses or brackets
 1.227 14-Jun-2023  rillig indent: remove another flag from parser state

When processing a comment, the flag ps.next_col_1 was not used for the
next token, but for a line within a comment. As its scope was limited
to a single comment, there is no need to store it any longer than that.

No functional change.
 1.226 14-Jun-2023  rillig indent: remove a redundant flag from the parser state

No functional change.
 1.225 10-Jun-2023  rillig indent: miscellaneous cleanups
 1.224 10-Jun-2023  rillig indent: clean up function names, fix blank lines in debug output
 1.223 10-Jun-2023  rillig indent: in debug mode, null-terminate buffers
 1.222 10-Jun-2023  rillig indent: clean up function and variable names
 1.221 10-Jun-2023  rillig indent: rename and sort variables in parser state

No functional change.
 1.220 09-Jun-2023  rillig indent: clean up lexer

No functional change.
 1.219 09-Jun-2023  rillig indent: improve heuristics for function declaration vs. definition
 1.218 09-Jun-2023  rillig indent: format its own code
 1.217 08-Jun-2023  rillig indent: remove fragile heuristic for detecting cast expressions

The assumption that in an expression of the form '(a * anything)', the
'*' marks a pointer type was too simple-minded.

For now, fix the obvious cases and leave the others for later. If
needed, they can be worked around using the '-T' option.
 1.216 07-Jun-2023  rillig indent: extract the stack of parser symbols to a separate struct

No functional change.
 1.215 06-Jun-2023  rillig indent: sort functions in call order

No functional change.
 1.214 04-Jun-2023  rillig indent: do not parse '&&&&&&&' as a single binary operator
 1.213 04-Jun-2023  rillig indent: fix '*=' to be a binary operator, not a unary one
 1.212 04-Jun-2023  rillig indent: remove read pointer from buffers that don't need it

The only buffer that needs a read pointer is the current input line in
'inp'.

No functional change.
 1.211 04-Jun-2023  rillig indent: rename struct field, for better symmetry

No binary change outside debug mode.
 1.210 04-Jun-2023  rillig lint: use separate lexer symbols for 'case' and 'default'

It's not strictly necessary since these tokens behave in the same way,
still, the code is more straight-forward when there are separate tokens.
 1.209 04-Jun-2023  rillig indent: classify 'inline' as a modifier rather than a word
 1.208 04-Jun-2023  rillig indent: use separate lexer symbols for the different kinds of ':'
 1.207 04-Jun-2023  rillig indent: separate code for handling parentheses and brackets

Handling parentheses is more complicated than for brackets.
 1.206 23-May-2023  rillig indent: separate code for handling enums from the lexer

The lexer's responsibility is to generate tokens, it's not supposed to
update the parser state. Centralize the state transitions that control
indentation of enum constants to keep the lexer code clean.

Skip comments, newlines and preprocessing lines when updating the parser
state for enum constants and for '*' in declarations.
 1.205 23-May-2023  rillig indent: split debug output into paragraphs

The paragraphs separate the different processing steps: getting a token
from the lexer, processing the token, updating the parser state, sending
a finished line to the output.
 1.204 23-May-2023  rillig indent: fix spacing in declarations in for loops
 1.203 22-May-2023  rillig indent: adjust indentation in lexer

No binary change.
 1.202 20-May-2023  rillig indent: extract the output state from the parser state

The parser state depends on the preprocessing lines, the output state
shouldn't.
 1.201 20-May-2023  rillig indent: clean up lexing of word tokens

No functional change.
 1.200 20-May-2023  rillig indent: separate detection of function definitions from lexing '*'

No functional change.
 1.199 18-May-2023  rillig indent: manually wrap overly long lines

No functional change.
 1.198 18-May-2023  rillig indent: switch to standard code style

Taken from share/misc/indent.pro.

Indent does not wrap code to fit into the line width, it only does so
for comments. The 'INDENT OFF' sections and too long lines will be
addressed in a follow-up commit.

No functional change.
 1.197 16-May-2023  rillig indent: directly access the input buffer

No functional change.
 1.196 16-May-2023  rillig indent: allow comments in column 1 to be formatted
 1.195 16-May-2023  rillig indent: remove support for form feed characters inside a line

Form feeds are occasionally used to split code into pages, and this use
is still supported. Having a form feed in the middle of a line is
exotic.
 1.194 16-May-2023  rillig indent: fix handling of INDENT OFF/ON comments

Previously, the 'INDENT OFF' comments were interpreted when the newline
token from the line above the comment was processed, which was earlier
than could be reasonably expected.

The 'INDENT ON' comments were interpreted equally early, which led to
the situation that the 'INDENT OFF' comments were preserved literally
but the 'INDENT ON' comments weren't.
 1.193 16-May-2023  rillig indent: move parsing of 'INDENT OFF/ON' comments to the lexer

No functional change.
 1.192 15-May-2023  rillig indent: clean up detection of whether parentheses form a cast

No functional change.
 1.191 15-May-2023  rillig indent: improve type guessing, fix formatting of declarations
 1.190 15-May-2023  rillig indent: remove backslash line continuation outside preprocessing

The indenter did not handle these backslashes well, interpreting them as
unary operators, and they are an edge case anyway. Line continuations
in string literals and character constants are kept.
 1.189 15-May-2023  rillig indent: indent multi-line conditions

No functional change.
 1.188 15-May-2023  rillig indent: let indent format its own code

With manual corrections, as indent does not properly indent multi-line
'?:' expressions nor multi-line controlling expressions.
 1.187 15-May-2023  rillig indent: clean up memory allocation

No functional change.
 1.186 15-May-2023  rillig indent: move debugging code to separate file

No functional change.
 1.185 15-May-2023  rillig indent: clean up memory and buffer management

Remove the need to explicitly initialize the buffers. To avoid
subtracting null pointers or comparing them using '<', migrate the
buffers from the (start, end) form to the (start, len) form. This form
also avoids inconsistencies in whether 'buf.e == buf.s' or 'buf.s ==
buf.e' is used.

Make buffer.st const, to avoid accidental modification of the buffer's
content.

Replace '*buf.e++ = ch' with buf_add_char, to avoid having to keep track of
how much unwritten space is left in the buffer. Remove all safety
margins, that is, no more unchecked access to buf.st[-1] or appending
using '*buf.e++'.

Fix line number counting in lex_word for words that contain line breaks.

No functional change.
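
A rough sketch of a buffer in (start, len) form with an appending function,
to illustrate the idea described above. Apart from the name buf_add_char,
which the commit message mentions, the struct layout, field names, growth
strategy and the buf_add_str helper are assumptions and not indent's actual
code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer layout; a zeroed struct is a valid empty buffer. */
struct buffer {
	char *s;	/* start of the allocated memory, or NULL */
	size_t len;	/* number of bytes in use */
	size_t cap;	/* number of bytes allocated */
};

/* Append one character, growing the buffer when it is full. */
static void
buf_add_char(struct buffer *buf, char ch)
{
	if (buf->len == buf->cap) {
		size_t new_cap = buf->cap > 0 ? 2 * buf->cap : 16;
		char *new_s = realloc(buf->s, new_cap);
		if (new_s == NULL)
			abort();
		buf->s = new_s;
		buf->cap = new_cap;
	}
	buf->s[buf->len++] = ch;
}

/* Append a C string; the caller never tracks the remaining space. */
static void
buf_add_str(struct buffer *buf, const char *str)
{
	for (; *str != '\0'; str++)
		buf_add_char(buf, *str);
}
```

In this form, an empty buffer is simply 'len == 0', so no pointers are
subtracted or compared, and no explicit initialization beyond zeroing is
needed.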
 1.184 14-May-2023  rillig indent: only null-terminate the buffers if necessary

The only case where a buffer is used as a C-style string is when looking
up a keyword.

No functional change.
 1.183 14-May-2023  rillig indent: reduce code for scanning tokens

The input line is guaranteed to end with '\n', so there's no need to
carry another pointer around.

No functional change.
 1.182 14-May-2023  rillig indent: remove foreign RCS IDs
 1.181 14-May-2023  rillig indent: miscellaneous cleanups
 1.180 14-May-2023  rillig indent: reduce binary size

No functional change.
 1.179 13-May-2023  rillig indent: fix lexing of numbers that are spread over multiple lines
 1.178 13-May-2023  rillig indent: rename struct fields for buffers

No binary change except for assertion line numbers.
 1.177 13-May-2023  rillig indent: move debugging code to separate file

No functional change.
 1.176 12-May-2023  rillig indent: condense code for handling spaced expressions

No functional change outside debug mode.
 1.175 11-May-2023  rillig indent: move parser state variables to the parser_state struct

Include the variables in the debug output.
 1.174 11-May-2023  rillig indent: move force_nl into the parser state

This way, it is included in the debug output.

No functional change.
 1.173 11-May-2023  rillig indent: remove buggy code for swapping tokens

It is not the job of an indenter to swap tokens, even if it's only about
placing comments elsewhere. The code that swapped the tokens was
complicated, buggy and impossible to understand.

In -br (brace right) mode, indent no longer moves a '{' from the
beginning of a line to the end of the previous line, as that was handled
by the token swapping code as well. This change is unintended, but it
will be easier to re-add that now that the code is simpler.
 1.172 13-Feb-2022  rillig indent: rename parser_state.p_l_follow and paren_level

The previous variable names were misleading.

Paren_level is not the current level of parentheses but the one from the
beginning of the current output line. For better accuracy, rename it to
line_start_paren_level.

P_l_follow is not the level of parentheses that will be active at some
point in the future, as the previous name suggested. Instead, it is the
level of parentheses right now. For better accuracy, rename it to
nparen. This nicely matches its main usage, which is as index to the
parser_state.paren array.

No binary change.
 1.171 13-Feb-2022  rillig indent: replace bitmasking code with struct

The struct directly represents the properties of a pair of parentheses,
without forcing the human reader to decode any bitset. This makes it
easier to find the remaining bugs in the heuristic for determining the
kind of parentheses.

No functional change outside debug mode.
 1.170 13-Feb-2022  rillig indent: change parser_state.cast_mask to 0-based indexing

Having 1-based indexing was completely unexpected, and it didn't match
the 0-based indexing of parser_state.paren_indents.

No functional change.
 1.169 12-Feb-2022  rillig indent: fix indentation of enum constants in typedef (since 2019-04-04)

The solution is not elegant since it adds a small state machine inside
the parser state, but at least these states only depend on the sequence
of token types and not on any other part of the parser state.

Reported in PR#55453.
 1.168 12-Feb-2022  rillig indent: extend debug logging for the parser state

The member names in struct parser_state are not trustworthy, for example
in_decl does not correspond to the intuitive definition of "inside a
declaration". To cope with this uncertainty, output the full state of
the parser state to the debug log, not only the changes. This helps to
track the inner state for small differences in the input, such as
between 'typedef enum { TA, TB } TT' and 'enum { EA, EB } ET'.

This hopefully helps in fixing PR#55453.

No functional change outside debug mode.
 1.167 28-Nov-2021  rillig indent: treat L"string" as a single token

There is never whitespace between the 'L' and the string literal or the
character constant. There might be a backslash-newline between them, but
that case was not handled before either.

No functional change.
 1.166 27-Nov-2021  rillig indent: illustrate probably_looking_at_definition with examples

No functional change.
 1.165 27-Nov-2021  rillig indent: fix out of bounds memory access (since 2021-11-25)
 1.164 25-Nov-2021  rillig indent: rename ps.in_function_parameters to match reality

This flag is only set while parsing the parameters of a function
definition, but not for a function declaration. See buffer_add in the
test fmt_decl.

No functional change.
 1.163 25-Nov-2021  rillig indent: improve heuristic for spaces around '*' in declarations
 1.162 25-Nov-2021  rillig indent: eliminate 3 negations in tokenizer

No functional change.
 1.161 25-Nov-2021  rillig indent: extract lex_asterisk_unary into separate function

No functional change.
 1.160 25-Nov-2021  rillig indent: condense code for building tokens from characters

No functional change.
 1.159 25-Nov-2021  rillig indent: in lexi, assign lsym and next_unary in consistent order

No functional change.
 1.158 25-Nov-2021  rillig indent: fix heuristic for declaration/definition to post-1990 reality
 1.157 25-Nov-2021  rillig indent: fix space after function name for option '-pcs'
 1.156 25-Nov-2021  rillig indent: fix spacing for unknown type names in declarations
 1.155 25-Nov-2021  rillig indent: extract probably_looking_at_definition to separate function

This heuristic guesses wrong in many cases and will need some cleanups.

No functional change.
 1.154 25-Nov-2021  rillig indent: merge duplicate code for parsing 'struct s *'

No functional change.
 1.153 25-Nov-2021  rillig indent: fix formatting of a few declarations involving unknown types
 1.152 25-Nov-2021  rillig indent: rename ps.in_stmt to in_stmt_or_decl

The previous name didn't match reality.

No functional change.
 1.151 25-Nov-2021  rillig indent: rename ps.ind_stmt to in_stmt_cont

This makes a comment redundant.

No functional change.
 1.150 20-Nov-2021  rillig indent: clean up lint annotation and tests
 1.149 20-Nov-2021  rillig indent: fix tokenizing of word-like tokens (since 2019-04-04)

After a backslash-newline, the first character of the next line is only
part of the identifier if it is an identifier character.

indent-2000.10.11.14.46.04
| int var \
| +name = 4;
indent-2012.11.20.03.02.57

indent-2014.09.04.04.06.07
| int var \
| +name = 4;
indent-2019.02.03.03.19.29

indent-2019.04.04.15.27.35
| int var+name = 4;
indent-2021.11.19.20.23.17

indent
| int var + name = 4;
 1.148 19-Nov-2021  rillig indent: reduce casts to unsigned char for character classification

No functional change.
 1.147 19-Nov-2021  rillig indent: replace ps.procname with ps.is_function_definition

Only the first character of ps.procname was ever read, and it was only
compared to '\0'. Using a bool for this means simpler code, less
memory and fewer wasted CPU cycles due to the removed strncpy.

No functional change.
 1.146 19-Nov-2021  rillig indent: fix formatting of function definitions (since 2019-04-04)

In the definition of a function with a pointer return type, the
formatting depended on the name of the function. Function names
matching [A-Za-z+] were formatted correctly, those containing [$0-9_]
weren't.
 1.145 19-Nov-2021  rillig indent: merge duplicate code into is_identifier_part

No functional change.
 1.144 19-Nov-2021  rillig indent: fix lost function name (since 2019-04-04)

When indent searched for an identifier followed by a '(', to see whether
the identifier is a function name, it didn't care that the input buffer
could be resized due to a long line, which had made the pointer 'tp'
invalid. Fix this by stopping the search at the end of the line. A
better approach would be to have an unlimited lookahead buffer for
situations like these. The code that deals with character input has
already been extracted to io.c, so it's possible to implement that now.

While here, fix another access to undefined memory, after the loop.

There is still the issue of overwriting procname[0] with a blank, which
results in inconsistent formatting depending on the function name,
probably another case of accessing undefined memory, although the
results have been reproducible, but that may have been pure luck.

The formatted code looks clearly broken, but that's still better than
losing a token and destroying the whole file.
 1.143 19-Nov-2021  rillig indent: use character input API from the tokenizer

No functional change.
 1.142 19-Nov-2021  rillig indent: move character input handling from lexi.c to io.c

No functional change.
 1.141 19-Nov-2021  rillig indent: replace direct access to the input buffer

This is a preparation for abstracting away all the low-level details of
handling the input. The goal is to fix the current bugs regarding line
number counting, out of bounds memory access, and generally unreadable
code.

No functional change.
 1.140 19-Nov-2021  rillig indent: group variables for input handling

No functional change.
 1.139 18-Nov-2021  rillig indent: prevent use-after-free bug

Triggered by the following artificial program:

---- snip ----
int *
f
( void)
{
}
---- snap ----
 1.138 07-Nov-2021  rillig indent: various cleanups

Make several comments more precise.

Rename process_end_of_file to process_eof to match the token name.

Change the order of assignments in analyze_comment to keep the com_ind
computations closer together.

In copy_comment_wrap, use pointer difference instead of pointer addition
to stay away from undefined behavior.

No functional change.
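
The pointer-difference idiom can be sketched as follows. The helper 'fits'
is hypothetical and is not the code from copy_comment_wrap; it only shows
the general technique of comparing lengths instead of forming a pointer
that may lie outside the array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: do 'n' more bytes fit between 'pos' and 'end'?
 * Both pointers must point into, or one past the end of, the same array.
 */
static bool
fits(const char *pos, const char *end, size_t n)
{
	assert(pos <= end);
	/*
	 * Writing 'pos + n <= end' instead could compute a pointer beyond
	 * one past the end of the array, which is undefined behavior.
	 */
	return n <= (size_t)(end - pos);
}
```

The subtraction 'end - pos' is always defined under the stated
precondition, whereas 'pos + n' is only defined if the result stays within
the array or one past its end.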
 1.137 07-Nov-2021  rillig indent: rename ps.decl_nest to decl_level

This better matches the comment.

No functional change.
 1.136 07-Nov-2021  rillig indent: move ps.p_l_follow closer to lsym_type_outside_parentheses

This makes it easier to see the relation between these two.

No functional change.
 1.135 07-Nov-2021  rillig indent: rename type_at_paren_level_0 to type_outside_parentheses

For symmetry with type_in_parentheses.

No functional change.
 1.134 07-Nov-2021  rillig indent: distinguish between typename in parentheses and other words

This gets rid of two members of parser_state. No functional change for
well-formed programs. The sequence of '++int' or '--size_t' may be
formatted differently than before, but no program is expected to contain
that sequence.

Rename lsym_ident to lsym_word since 'ident' was too specific. This
token type is used for constants and string literals as well. Strictly
speaking, a string literal is not a word, but at least it's better than
before.
 1.133 07-Nov-2021  rillig indent: rename 'inbuf' functions to 'inp'

The variable 'inp' used to be named 'inbuf'. Make the function names
correspond to the variable name again.

No functional change.
 1.132 05-Nov-2021  rillig indent: consistently use token.e[-1] for the last added character

No functional change.
 1.131 05-Nov-2021  rillig indent: add debug output for remaining members of parser_status
 1.130 05-Nov-2021  rillig indent: rename ps.curr_newline to next_col_1

For symmetry with ps.curr_col_1.

No functional change.
 1.129 01-Nov-2021  rillig indent: fix missing blank after 'return' (since 2021-10-31)

In indent.c 1.200 from 2021-10-31, the subtypes of identifier tokens
were removed since they were redundant. An unintended side effect was
that a parenthesized expression after 'return' was no longer separated
by a blank.

Before that change, 'return' was tokenized as an lsym_ident with subtype
kw_other, and want_space_before_lparen handled this case in the last
line. After the change, 'return' was treated as an ordinary identifier,
and unless the option '-pcs' (blank after function call) was given, the
blank was removed.

The other keywords that had kw_other are not affected since they do not
expect a '(' afterwards. These keywords are 'break', 'continue', 'goto',
'inline' and 'restrict'.

Curiously, there was not a single test case that covered 'return(expr)'.

While here, remove the trailing ',' from the enum lexer_symbol, which is
not allowed in standard C, it is a GNU extension. Lint doesn't complain
about this since the default LINTFLAGS include '-g' for GCC mode.
 1.128 31-Oct-2021  rillig indent: clean up

Initialize buffers in reading order, make comments more expressive,
rename add_typename to register_typename, remove unused macro.

No functional change.
 1.127 31-Oct-2021  rillig indent: remove redundant keyword.is_type

It is still confusing that not all type keywords end up as lsym_type.
Those that occur inside parentheses end up as identifiers instead. To
see whether an identifier is a typename, query ps.curr_is_type and
ps.prev_is_type.

No functional change.
 1.126 31-Oct-2021  rillig indent: replace kw_tag with lsym_tag

This leaves only one special type of token, which is lsym_ident, which
in some cases represents a type name and in other cases an identifier,
constant or string literal.

No functional change.
 1.125 31-Oct-2021  rillig indent: replace simple cases of keyword_kind with lexer_symbol

The remaining keyword kinds 'tag' and 'type' require a bit more thought,
so do them in a separate step.

No functional change.
 1.124 31-Oct-2021  rillig indent: rename lsym_type to better reflect reality

Type names that occur in parentheses are parsed as lsym_ident having the
subtype kw_type instead.

No functional change.
 1.123 31-Oct-2021  rillig indent: remove support for pre-1978 variable initialization
 1.122 31-Oct-2021  rillig indent: in debug log, print token subtype in same line

The keyword 'void' is parsed as lsym_type in some cases and lsym_ident
in others. Its corresponding keyword is always kw_type though. Put the
subtype into the same line as the other token information.
 1.121 31-Oct-2021  rillig indent: add separate lexer symbol for offsetof

No functional change.
 1.120 31-Oct-2021  rillig indent: add separate lexer symbol for sizeof

The plan is to get rid of the type keyword_kind, which largely overlaps
with lexer_symbol.

No functional change.
 1.119 31-Oct-2021  rillig indent: clean up definition of keywords

Rename kw_struct_or_union_or_enum to the shorter kw_tag.

Merge kw_jump with kw_inline_or_restrict since they are handled in the
same way.

No functional change.
 1.118 31-Oct-2021  rillig indent: condense lexi_alnum

No functional change.
 1.117 30-Oct-2021  rillig indent: rename prev_newline and prev_col_1 to curr

These two flags describe the token that is currently processed.

In process_binary_op, curr_newline can never be true since newline is
not a binary operator, so remove that condition.

No functional change.
 1.116 30-Oct-2021  rillig indent: in debug output, list the new token first
 1.115 30-Oct-2021  rillig indent: clean up lexical analyzer

Use traditional type for small unsigned numbers instead of uint8_t; the
required header was not included.

Remove assertion for debug mode; lint takes care of ensuring that the
enum constants match the length of the names array.

Constify a name array.

Move the comparison function for bsearch closer to its caller.

No functional change.
 1.114 29-Oct-2021  rillig indent: remove redundant comments, remove punctuation from debug log

The comment about 'null stmt' between braces probably meant 'no
statements between braces'.

The comments at psym_switch_expr only repeated what the code says or had
been outdated 29 years ago already since opt.case_indent does not have
to be 'one level down'.

In the debug log, the quotes around the symbol names are not necessary
after a ':'. The parse stack also does not need this much punctuation.

Reducing a do-while loop to nothing instead of a statement saves a few
CPU cycles. It works because after each lbrace, a stmt is pushed to the
parser stack. This stmt can only ever be reduced to a stmt_list but
never be removed.
 1.113 29-Oct-2021  rillig indent: in debug mode, log only differences for most ps members
 1.112 29-Oct-2021  rillig indent: add detailed debug logging for the parser state
 1.111 29-Oct-2021  rillig indent: merge isblank and is_hspace into ch_isblank

No functional change.
 1.110 29-Oct-2021  rillig indent: use prev/curr/next to refer to the current token

The word 'last' just didn't match with 'next'.

No functional change.
 1.109 29-Oct-2021  rillig indent: keep p_l_follow nonnegative, use consistent comparison

No functional change.
 1.108 29-Oct-2021  rillig indent: spell 'parentheses' properly in messages and comments
 1.107 28-Oct-2021  rillig indent: remove unused local variable in lexi

Since the previous commit, lexi is always called with the same argument,
so remove that parameter.

The previous commit broke the debug logging by not printing "transient
state" anymore. Replace this with "rolled back parser state" at the
caller's site.

No functional change.
 1.106 28-Oct-2021  rillig indent: reduce negations in search_stmt_lookahead

No functional change.
 1.105 26-Oct-2021  rillig indent: make ps.keyword easier to understand

Previously, ps.keyword did not have any documentation and was not
straight-forward. In some cases it was reset to kw_0, in others it was
set to an interesting value. The idea behind it was to remember the kind
of word of the previous token, to decide whether to have a space between
sizeof or offsetof and a following '('.

No functional change.
 1.104 26-Oct-2021  rillig indent: fix debug logging

The parser state is not always 'ps', so the debug logging must use the
correct state as well.
 1.103 26-Oct-2021  rillig indent: run indent on its own source code

With manual corrections afterwards, to compensate for the remaining bugs
in indent.

Without the type definitions in .indent.pro, the opening braces of the
functions kw_name and lexi_alnum would not be at the beginning of the
line.
 1.102 26-Oct-2021  rillig indent: merge duplicate code in lexi_alnum
 1.101 25-Oct-2021  rillig indent: improve debug logging

Output the various details in chronological order.
 1.100 25-Oct-2021  rillig indent: split type token_type into 3 separate types

Previously, token_type was used for 3 different purposes:

1. symbol types from the lexer
2. symbol types on the parser stack
3. kind of control statement for 'if (expr)' and similar statements

Splitting the 41 constants into separate types makes it immediately
clear that the parser stack never handles comments, preprocessing lines,
newlines, form feeds or the inner structure of expressions.

Previously, the constant switch_expr was especially confusing since it
was used for 3 different purposes: when returned from lexi, it
represented the keyword 'switch', in the parser stack it represented
'switch (expr)', and it was used for a statement head as well.

The only overlap between the lexer symbols and the parser symbols are
'{' and '}', and the keywords 'do' and 'else'. To increase confusion,
the constants of the previous token_type were in apparently random
order and before 2021, they had cryptic, highly abbreviated names.

No functional change.
 1.99 24-Oct-2021  rillig indent: rename form_feed to tt_lex_form_feed

No functional change.
 1.98 24-Oct-2021  rillig indent: split kw_for_or_if_or_while into separate constants

No functional change.
 1.97 24-Oct-2021  rillig indent: split kw_do_or_else into separate constants

It was unnecessarily confusing to have the token types keyword_do_else,
keyword_do and keyword_else at the same time, without any hint in what
they differed.

Some of the token types seem to be used by the lexer while others are
used in the parse stack. Maybe all token types can be partitioned into
these groups, which would suggest to use two different types for them.
And if not, it's still clearer to have this distinction in the names of
the constants.

No functional change.
 1.96 24-Oct-2021  rillig indent: define lexi_end as function instead of macro
 1.95 24-Oct-2021  rillig indent: run indent on its own source code

With manual corrections afterwards. Indent still does not get
extra_expr_indent correctly, it also indents global variables after
tagged declarations too deep.

No functional change.
 1.94 24-Oct-2021  rillig indent: rename nitems to array_length
 1.93 24-Oct-2021  rillig indent: sort includes
 1.92 20-Oct-2021  rillig indent: rename ps.last_u_d to match its comment

No functional change.
 1.91 11-Oct-2021  rillig indent: use separate variables for lexi_alnum and lexi

These two uses of the variable are independent of each other.

No functional change.
 1.90 11-Oct-2021  rillig indent: clean up comments in lexi and lexi_alnum

No functional change.
 1.89 11-Oct-2021  rillig indent: extract lexi_alnum from lexi

No functional change.
 1.88 09-Oct-2021  rillig indent: fix lint warning about bsearch discarding 'const'

lexi.c(433): warning: call to 'bsearch' effectively discards 'const'
from argument [346]
 1.87 08-Oct-2021  rillig indent: merge duplicate code in lexer

No functional change.
 1.86 08-Oct-2021  rillig indent: rename in_or_st to init_or_struct

This makes a few comments redundant.

No functional change.
 1.85 08-Oct-2021  rillig indent: remove 'global' from the list of keywords

Since 1978, 'global' has not been a keyword in C. Moreover, it was
declared as a type while its name would rather suggest a storage class.

Removing the keyword fixes the formatting of variables named 'global'.
 1.84 08-Oct-2021  rillig indent: clean up typename handling

Unexport typenames list.

Replace standard binary search with custom binary search that returns
the inserting position.

In is_typename, take advantage of the buffer type instead of using
the standard C recipe for str_ends_with.

No functional change.
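
A binary search that reports the inserting position might look like the
sketch below. The function name, its signature and the use of strcmp are
assumptions for illustration; indent's own implementation may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Search a sorted array of names. Return whether 'name' was found; in
 * both cases, '*pos' is the index where 'name' is or would be inserted.
 */
static bool
find_name(const char *const *names, size_t n, const char *name, size_t *pos)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = strcmp(names[mid], name);
		if (cmp == 0) {
			*pos = mid;
			return true;
		}
		if (cmp < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	*pos = lo;
	return false;
}
```

Unlike bsearch, which only returns a match or NULL, this variant lets the
caller insert a missing name at '*pos' and keep the array sorted without a
second search.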
 1.83 08-Oct-2021  rillig indent: enhance comments for lex_number state machine

No functional change.
 1.82 08-Oct-2021  rillig indent: improve local variable names

No functional change.
 1.81 08-Oct-2021  rillig indent: rename fill_buffer to inbuf_read_line

No functional change.
 1.80 08-Oct-2021  rillig indent: constify detection of function names

No functional change.
 1.79 08-Oct-2021  rillig indent: rename tokens lparen and rparen to be more precise

No functional change.
 1.78 07-Oct-2021  rillig indent: group variables for the input buffer

The input buffer follows the same concept as the intermediate buffers
for label, code, comment and token, so use the same type for it.

No functional change.
 1.77 07-Oct-2021  rillig indent: clean up code, remove outdated wrong comments

No functional change.
 1.76 07-Oct-2021  rillig indent: use braces around multi-line statements

No functional change.
 1.75 07-Oct-2021  rillig indent: let the code breathe a bit by inserting empty lines

No functional change.
 1.74 07-Oct-2021  rillig indent: fix wrong or outdated comments

No functional change.
 1.73 07-Oct-2021  rillig indent: raise WARNS from the default 5 up to 6
 1.72 05-Oct-2021  rillig indent: use buffer type in debug_print_buf

That function had been created before 'struct buffer' was invented,
therefore it used two pointers as parameters. Remove this redundancy.

No functional change.
 1.71 05-Oct-2021  rillig indent: run indent on lexi.c, with manual corrections

The variables 'keywords' and 'typenames' were indented using 8 spaces,
even though -di0 was in effect, which should result in a single space,
and -ut was in effect, which should result in a single tab instead of 8
spaces.

The option -eei does not work as advertised, the controlling expressions
are only indented by the normal amount, which easily leads to confusion
as to whether the code belongs to the condition or the following
statement.
 1.70 05-Oct-2021  rillig indent: untangle complicated condition in probably_typedef

No functional change.
 1.69 05-Oct-2021  rillig indent: use proper escape sequence for form feed

This escape sequence has been available since at least 1978.
 1.68 05-Oct-2021  rillig indent: merge duplicate code into is_hspace

No functional change.
 1.67 05-Oct-2021  rillig indent: clean up code for appending to buffers

Use *e++ for appending and e[-1] for testing the previously appended
character, like in other places in the code.

No functional change.
 1.66 05-Oct-2021  rillig indent: merge duplicate code for reading from input buffer

No functional change.
 1.65 03-Oct-2021  rillig indent: fix lint warning about signed '>>'

Lint couldn't infer that indent's list of type names will practically
never contain more than 2 billion entries and that the result of '>>'
would be the same in all cases.
 1.64 27-Sep-2021  rillig indent: use binary instead of linear search when adding types

No functional change.
 1.63 27-Sep-2021  rillig indent: extract is_typename from lexi

No functional change.
 1.62 27-Sep-2021  rillig indent: rename rwcode to keyword_kind, various cleanup

No idea what the 'rw' in 'rwcode' meant, it had been imported that way
28 years ago. Since rwcode specifies the kind of a keyword, the prefix
'kw_' makes sense.

No functional change.
 1.61 26-Sep-2021  rillig indent: unexport global variables

The variable match_state was write-only and was thus removed.

No functional change.
 1.60 26-Sep-2021  rillig indent: unexport keyword table, clean up

No functional change.
 1.59 26-Sep-2021  rillig indent: let indent format its own code -- in supervised mode

After running indent on the code, I manually selected each change that
now looks better than before. The remaining changes are left for later.
All in all, indent did a pretty good job, except for syntactic additions
from after 1990, but that was to be expected. Examples for such
additions are GCC's __attribute__ and C99 designated initializers.

Indent has only few knobs to tune the indentation. The knob for the
continuation indentation applies to function declarations as well as to
expressions. The knob for indentation of local variable declarations
applies to struct members as well, even if these are members of a
top-level struct.

Several code comments crossed the right margin in column 78. Several
other code comments were correctly broken though. The cause for this
difference was not obvious.

No functional change.
 1.58 25-Sep-2021  rillig indent: merge duplicate code for token buffers

No functional change.
 1.57 25-Sep-2021  rillig indent: extract probably_typedef into separate function

This condition is complicated enough that it warrants being split into
several clauses, maybe even with an explanation.

No functional change.
 1.56 25-Sep-2021  rillig indent: reduce code and data size for lexing of numbers

Instead of having a table of strings (121 pointers + 121 data
relocations), reduce that table to the actual character data and use a
secondary table for looking up the correct row in the main table.

No functional change.
 1.55 25-Sep-2021  rillig indent: convert remaining ibool to bool

No functional change intended.
 1.54 25-Sep-2021  rillig indent: prepare for lint's strict bool mode

Before C99, C had no boolean type. Instead, indent used int for that,
just like many other programs. Even with C99, bool and int can be used
interchangeably in many situations, such as querying '!i' or '!ptr' or
'cond == 0'.

Since January 2021, lint provides the strict bool mode, which makes bool
a non-arithmetic type that is incompatible with any other type. Having
clearly separate types helps in understanding the code.

To migrate indent to strict bool mode, the first step is to apply all
changes that keep the resulting binary the same. Since sizeof(bool) is
1 and sizeof(int) is 4, the type ibool serves as an intermediate type.
For now it is defined to int, later it will become bool.

The current code compiles cleanly in C99 and C11 mode, as well as in
lint's strict bool mode. There are a few tricky places:

In args.c in 'struct pro', there are two types of options: boolean and
integer. Boolean options point to a bool variable, integer options
point to an int variable. To keep the current structure of the code,
the pointer has been changed to 'void *'. To ensure type safety, the
definition of the options is done via preprocessor magic, which in C11
mode ensures the correct pointer types. (Add CFLAGS+=-std=gnu11 at the
very bottom of the Makefile.)
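
One way such a compile-time type check can be expressed in C11 is with
_Generic, as sketched below. The macro names, the struct members and the
option variables are hypothetical and do not claim to reproduce the
definitions in args.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option descriptor with an untyped pointer to its variable. */
struct pro {
	const char *name;
	bool is_bool;
	void *var;
};

/* Hypothetical option variables. */
static bool opt_verbose;
static int opt_indent_size;

/* Each macro only compiles if '&(var)' has the expected pointer type. */
#define bool_option(name, var) \
	{ name, true, _Generic(&(var), bool *: &(var)) }
#define int_option(name, var) \
	{ name, false, _Generic(&(var), int *: &(var)) }

static const struct pro options[] = {
	bool_option("v", opt_verbose),
	int_option("i", opt_indent_size),
};

/* Set a boolean option through the untyped pointer. */
static void
set_bool(const struct pro *p, bool value)
{
	assert(p->is_bool);
	*(bool *)p->var = value;
}
```

Writing bool_option("i", opt_indent_size) by mistake would fail to compile,
since 'int *' matches no association in the generic selection.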

In indent.c in process_preprocessing, a boolean variable is
post-incremented. That variable is only assigned to another variable,
and that variable is only used in a boolean context. To provoke a
different behavior between the '++' and the '= true', the source code
to be indented would need 1 << 32 preprocessing directives, which is
unlikely to happen in practice.

In io.c in dump_line, the variables ps.in_stmt and ps.in_decl only ever
get the values 0 and 1. For these values, the expressions 'a & ~b' and
'a && !b' are equivalent, in all versions of C. The compiler may
generate different code for them, though.
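
The equivalence claimed above can be checked exhaustively, since there are only four combinations. The following sketch uses hypothetical helper names; it also makes it easy to see that the two forms differ as soon as a value other than 0 or 1 is involved.

```c
#include <stdbool.h>

/* The bitwise form, as it would be written for int-typed flags. */
static int
and_not_bitwise(int a, int b)
{
	return a & ~b;
}

/* The logical form, which is the natural one for bool-typed flags. */
static int
and_not_logical(int a, int b)
{
	return a && !b;
}

/* Compare both forms for all four combinations of 0 and 1. */
static bool
forms_agree_for_0_and_1(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
			if (and_not_bitwise(a, b) != and_not_logical(a, b))
				return false;
	return true;
}
```

For a = 1 and b = 2, the bitwise form yields 1 while the logical form yields 0, which is why the argument depends on the variables only ever holding 0 and 1.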

In io.c in parse_indent_comment, the assignment to inhibit_formatting
takes place in integer context. If the compiler is smart enough to
detect the possible values of on_off, it may generate the same code
before and after the change, but that is rather unlikely.

The second step of the migration will be to replace ibool with bool,
step by step, just in case there are any hidden gotchas in the code,
such as sizeof or pointer casts.

No change to the resulting binary.
 1.53 25-Sep-2021  rillig indent: remove ifdef for lint

NetBSD lint does not need them anymore, FreeBSD does not have lint.
 1.52 25-Sep-2021  rillig indent: make lex_char_or_string simpler

The previous code was so tricky that every second line needed a comment
that explains what's going on. Replace the complicated code with the
usual straight-forward string-copying patterns.

No functional change.
 1.51 25-Sep-2021  rillig indent: add nonnull memory allocation functions

The only functional change is a single error message.
 1.50 25-Sep-2021  rillig indent: group global variables for token buffer

No functional change.
 1.49 25-Sep-2021  rillig indent: inline macro 'token'

No functional change.
 1.48 25-Sep-2021  rillig indent: group global variables for code buffer

No functional change.
 1.47 25-Sep-2021  rillig indent: rename variables of type token_type

The previous variable name 'code' conflicts with the buffer of the same
name.

No functional change.
 1.46 24-Sep-2021  rillig indent: group global variables for label buffer into struct

No functional change.
 1.45 24-Sep-2021  rillig indent: group global variables for the comment buffer

No functional change.
 1.44 24-Sep-2021  rillig indent: fix space-tab in indentation
 1.43 26-Aug-2021  rillig indent: extract lex_number, lex_word, lex_char_or_string

No functional change.
 1.42 25-Aug-2021  rillig indent: fix lint warnings about type conversions on ilp32

No functional change.
 1.41 14-Mar-2021  rillig indent: fix lint warnings

No functional change.
 1.40 13-Mar-2021  rillig indent: remove redundant parentheses

No functional change.
 1.39 13-Mar-2021  rillig indent: add debug logging for actually writing to the output file

Together with the results of the tokenizer and the 4 buffers for token,
label, code and comment, the debug log now provides a good high-level
view on how the indentation happens and where to look for the many
remaining bugs.
 1.38 12-Mar-2021  rillig indent: use consistent indentation for 'else'

Half of the code used -ce, the other half the opposite -nce.

No functional change.
 1.37 12-Mar-2021  rillig indent: fix misleading indentation in indent's own code

No functional change.
 1.36 12-Mar-2021  rillig indent: move code for tokenizing numbers further up

Having it directly below the table makes it easier to understand.

I also tried to omit this function entirely by moving the code into the
initializer itself, but that made the code redundant and furthermore
increased the size of the resulting binary, probably because of the new
relocation records.

No functional change.
 1.35 11-Mar-2021  rillig indent: reduce indentation of check_size functions

No functional change.
 1.34 11-Mar-2021  rillig indent: remove redundant cast after allocation functions

No functional change.
 1.33 11-Mar-2021  rillig indent: use consistent array indexing

No functional change.
 1.32 11-Mar-2021  rillig indent: merge duplicate code for reading from the input buffer

No functional change.
 1.31 09-Mar-2021  rillig indent: rename a few more token types

The previous names were either too short or ambiguous.

No functional change.
 1.30 09-Mar-2021  rillig indent: make token names more precise

The previous 'casestmt' was wrong since a case label is not a statement
at all.

The previous 'swstmt' was overly short, and wrong as well, since it
represents only the 'switch (expr)' part, which is not a complete switch
statement. Same for 'ifstmt', 'whilestmt', 'forstmt'.

The previous word 'head' was not precise enough since it didn't specify
exactly where the head ends and the body starts. Especially for
handling the dangling else, this distinction is important.

No functional change.
 1.29 09-Mar-2021  rillig indent: rename a few tokens to be more obvious

For casual readers it is not obvious whether the 'sp' meant 'special' or
'space' or something entirely different.
 1.28 09-Mar-2021  rillig indent: manually indent comments

It's strange that indent's own code is not formatted by indent itself,
which would be a good demonstration of its capabilities.

In its current state, I don't trust indent to get even the tokenization
correct, therefore the only safe way is to format the code manually.
 1.27 08-Mar-2021  rillig indent: split bsearch comparison function

It may have been a clever trick to use the same memory layout for struct
templ and a string pointer, but it's not worth the extra comment and
difficulty in understanding the code.

No functional change.
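
As a rough illustration of a comparison function that does not rely on matching memory layouts, the sketch below gives bsearch a plain string as the key and a struct as the array element. The struct keyword, its contents and the function names are hypothetical stand-ins, not the actual struct templ code.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a keyword table entry. */
struct keyword {
	const char *name;
	int kind;
};

/* Must stay sorted by name, as bsearch requires. */
static const struct keyword keywords[] = {
	{ "break", 1 },
	{ "case", 2 },
	{ "do", 3 },
	{ "else", 4 },
	{ "if", 5 },
	{ "while", 6 },
};

/* bsearch passes the key first and the array element second. */
static int
cmp_name_to_keyword(const void *key, const void *elem)
{
	const char *name = key;
	const struct keyword *kw = elem;

	return strcmp(name, kw->name);
}

/* Return the matching table entry, or NULL if 'name' is not a keyword. */
static const struct keyword *
find_keyword(const char *name)
{
	return bsearch(name, keywords,
	    sizeof keywords / sizeof keywords[0], sizeof keywords[0],
	    cmp_name_to_keyword);
}
```

Because the key and the element have different types, the comparison function names both explicitly and needs no comment about layout tricks.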
 1.26 08-Mar-2021  rillig indent: inline macro for backslash

No functional change.
 1.25 08-Mar-2021  rillig indent: convert big macros to functions

Each of these buffers is only modified in a single file. This makes it
unnecessary to declare the macros in the global header.
 1.24 07-Mar-2021  rillig indent: fix handling of '//' end-of-line comments
 1.23 07-Mar-2021  rillig indent: remove redundant parentheses around return value

No functional change.
 1.22 07-Mar-2021  rillig lint: move keyword 'continue' over to the other control flow keywords

No functional change since neither rw_jump nor rw_inline_or_restrict is
mentioned in any switch statement, and lint didn't find any other
suspicious enum operations.
 1.21 07-Mar-2021  rillig indent: use named constants for the different types of keywords

This reduces the magic numbers in the code. Most of these had their
designated constant name written in a nearby comment anyway.

The one instance where arithmetic was performed on this new enum type
(in indent.c) was a bit tricky to understand.

The combination rw_continue_or_inline_or_restrict looks strange, the
'continue' should intuitively belong to the other control flow keywords
in rw_break_or_goto_or_return.

No functional change.
 1.20 07-Mar-2021  rillig indent: in debug mode, output detailed token information

The main ingredient for understanding how indent works is the tokenizer
and the 4 buffers in which the text is collected.

Inspecting this debug log for the test comment-line-end makes it obvious
why indent messes up code that contains '//' comments. The cause is
that indent interprets '//' as an operator, just like '&&' or '||'. The
sequence '/////' is interpreted as a single operator as well, by the
way.

Since '//' is interpreted as an ordinary operator, any words following
it are plain identifiers, usually several of them in a row, which is a
syntax error. Depending on the context, the operator '//' is either a
unary operator (no space around) or a binary operator (space around).
This explains why the word 'line-end' is expanded to 'line - end'.

No functional change outside of debug mode.
 1.19 07-Mar-2021  rillig indent: for the token types, use enum instead of #define

This makes it easier to step through the code in a debugger.

No functional change.
 1.18 07-Mar-2021  rillig indent: use all headers in all files

This is a prerequisite for converting the token types to an enum instead
of a preprocessor define, since the return type of lexi will become
token_type. Having the enum will make debugging easier.

There was a single naming collision, which forced the variable in
scan_profile to be renamed. All other token names are used nowhere
else.

No change to the resulting binary.
 1.17 19-Oct-2019  christos use stdarg, annotate function as __printflike and fix broken formats.
 1.16 04-Apr-2019  kamil Upgrade indent(1)

Merge all the changes from the recent FreeBSD HEAD snapshot
into our local copy.

FreeBSD actively maintains this program in their sources and their
repository contains over 100 commits with changes.

Keep the delta between the FreeBSD and NetBSD versions to an absolute
minimum, mostly RCS Id and compatibility fixes.

Major changes in this import:

- Added an option -ldi<N> to control indentation of local variable names.
- Added option -P for loading user-provided files as profiles
- Added -tsn for setting tabsize
- Rename -nsac/-sac ("space after cast") to -ncs/-cs
- Added option -fbs, which enables (disables) splitting the function declaration and opening brace across two lines.
- Respect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)
- Group global option variables into an options structure
- Use bsearch() for looking up type keywords.
- Don't produce unneeded space character in function declarators
- Don't unnecessarily add a blank before a comment ends.
- Don't ignore newlines after comments that follow braces.

Merge the FreeBSD indent(1) tests with our ATF framework.
All tests pass.

Upgrade prepared by Manikishan Ghantasala.
Final polishing by myself.
 1.15 03-Feb-2019  mrg - add or adjust /* FALLTHROUGH */ where appropriate
- add __unreachable() after functions that can return but won't in
this case, and thus can't be marked __dead easily
 1.14 05-Jun-2016  dholland branches: 1.14.16;
Fix CSRG-era typo: typedef, not typdef. Spotted by Piotr Stefaniak.
 1.13 12-Apr-2009  lukem Fix WARNS=4 issues (-Wshadow -Wcast-qual -Wsign-compare)
 1.12 07-Aug-2003  agc branches: 1.12.42;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22365, verified by myself.
 1.11 26-May-2002  wiz Remove #ifndef'd __STDC__ code. ANSIfy.
 1.10 22-Mar-2002  kristerw Recognize all C9x integer constants (ISO/IEC 9899:1999 section 6.4.4.1)
Patch taken from FreeBSD.

Fixes PR bin/9219.
 1.9 15-Mar-1999  kristerw Made indent recognize the [fF], [uU], [lL], [uU][lL], [lL][lL], and
[uU][lL][lL] constant suffixes. (PR bin/6516 by Brian Ginsbach)
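
The suffix patterns listed in this entry can be expressed as a small case-insensitive lookup. This is only a sketch with a hypothetical function name; it accepts exactly the six patterns named above and says nothing about how the lexer itself was changed.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return whether 'suffix' matches one of the six listed patterns. */
static bool
is_listed_suffix(const char *suffix)
{
	static const char *const listed[] = {
		"f", "u", "l", "ul", "ll", "ull"
	};
	char lower[4];
	size_t len = strlen(suffix);

	/* The longest listed suffix has 3 characters. */
	if (len == 0 || len >= sizeof lower)
		return false;
	for (size_t i = 0; i < len; i++)
		lower[i] = (char)tolower((unsigned char)suffix[i]);
	lower[len] = '\0';

	for (size_t i = 0; i < sizeof listed / sizeof listed[0]; i++)
		if (strcmp(lower, listed[i]) == 0)
			return true;
	return false;
}
```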
 1.8 19-Dec-1998  christos char -> unsigned char, braces for gcc-2.8.1
 1.7 25-Aug-1998  ross Add { and } to shut up egcs. Reformat the more questionable code.
 1.6 19-Oct-1997  lukem WARNSify, fix .Nm usage, deprecate register, use <err.h>, KNFify (with indent!;)
 1.5 18-Oct-1997  mrg merge lite-2.
 1.4 09-Sep-1997  agc Bump number of elements in specials array from 100 to 1000.
Typedefs are added to this array, and it silently ignores
any attempts to enter more elements when the array is full.
 1.3 09-Jan-1997  tls RCS ID police
 1.2 01-Aug-1993  mycroft Add RCS identifiers.
 1.1 09-Apr-1993  cgd branches: 1.1.1;
added, from net/2 (patch 124).
 1.1.1.2 04-Apr-2019  kamil FreeBSD indent r340138
 1.1.1.1 06-Jun-1993  mrg 4.4BSD-Lite2
 1.12.42.1 13-May-2009  jym Sync with HEAD.

Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
 1.14.16.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.14.16.1 10-Jun-2019  christos Sync with HEAD
 1.85 07-Jan-2025  rillig indent: condense and simplify parsing code
 1.84 07-Jan-2025  rillig indent: fix indentation of statement after deeply nested 'if'
 1.83 07-Jan-2025  rillig indent: fix indentation of comment above 'else'

Previously, indent assumed that no 'else' would follow.
 1.82 04-Jan-2025  rillig indent: fix indentation of adjacent multi-line initializers

The main topic of this change is parse.c:66, which makes the indentation
of statements uniform with the indentation of other parser symbols.

That change had the side effect of messing up the indentation of files
whose first line does not start in column 1, such as in ps_ind_level.c.
To fix this side effect, the initial indentation must be determined
before pushing the placeholder token psym_stmt during initialization.
 1.81 04-Jan-2025  rillig indent: make debug log more uniform
 1.80 04-Jan-2025  rillig indent: make debug output easier to read

The previous format had the values of the parser state on the left side
and the corresponding names on the right side. While it looked nicely
aligned, it was not suitable for focusing on the actual data. Replace
this format with the more common "key: value" format.

Use the names of the enum constants in the debug log, instead of the
previous "nice" names that needed one more level of mental translation
and in some cases contained unbalanced punctuation such as '{'.
 1.79 18-Jun-2023  rillig branches: 1.79.2;
indent: untangle code for handling the statement indentation

The expression 'psyms.level-- - 2' did too much in a single line, so
extract the '--' to a separate statement, to highlight the symmetry
between the 'sym' and 'ind_level' code.

No functional change.
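
To illustrate the kind of expression meant here, the sketch below contrasts a compact form with a split form. The array and function names are invented for the example and are not taken from parse.c; both functions read the element two below the current level and then decrement the level.

```c
#include <stddef.h>

/* Compact form: indexing and decrementing happen in one expression. */
static int
level_below_compact(const int *ind_level, size_t *level)
{
	return ind_level[(*level)-- - 2];
}

/* Split form: the read and the decrement are separate statements. */
static int
level_below_split(const int *ind_level, size_t *level)
{
	int result = ind_level[*level - 2];

	(*level)--;
	return result;
}
```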
 1.78 17-Jun-2023  rillig indent: miscellaneous cleanups

No binary change.
 1.77 14-Jun-2023  rillig indent: clean up the code, add a few tests
 1.76 14-Jun-2023  rillig indent: allow more than 128 brace levels
 1.75 14-Jun-2023  rillig indent: fix out-of-bounds read when reducing a statement

Since parse.c 1.73 from today. The parser symbol psym_stmt_list that was
removed in that commit acted as a stop symbol, so that psyms_reduce_stmt
would save a memory access.
 1.74 14-Jun-2023  rillig indent: clean up array indexing for parser symbols

With 'top' pointing to the actual top element, the array was indexed in
the closed range from 0 to top. All other arrays are indexed by the
usual half-open interval from 0 to len.

No functional change.
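
A minimal sketch of the half-open convention, assuming a fixed-capacity stack with hypothetical names: valid elements are sym[0] to sym[len - 1], the top element is sym[len - 1], and every loop uses the usual 'i < len' condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { STACK_CAP = 8 };

/* Hypothetical parser-symbol stack; valid elements are sym[0 .. len - 1]. */
struct sym_stack {
	int sym[STACK_CAP];
	size_t len;
};

/* Push a symbol; return false instead of writing out of bounds. */
static bool
stack_push(struct sym_stack *st, int sym)
{
	if (st->len == STACK_CAP)
		return false;
	st->sym[st->len++] = sym;
	return true;
}

/* The top element sits just below 'len'. */
static int
stack_top(const struct sym_stack *st)
{
	assert(st->len > 0);
	return st->sym[st->len - 1];
}

/* Count occurrences of 'sym' using the half-open interval [0, len). */
static size_t
stack_count(const struct sym_stack *st, int sym)
{
	size_t n = 0;

	for (size_t i = 0; i < st->len; i++)
		if (st->sym[i] == sym)
			n++;
	return n;
}
```

With this layout an empty stack is simply len == 0, so no sentinel value such as top == -1 is needed.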
 1.73 14-Jun-2023  rillig indent: merge parser symbols for stmt and stmt_list

They were handled in exactly the same way.
 1.72 10-Jun-2023  rillig indent: fix stack overflow, add more tests

For several parser symbols, 2 symbols are pushed in a row, which led to
an out-of-bounds write.
 1.71 10-Jun-2023  rillig indent: miscellaneous cleanups
 1.70 09-Jun-2023  rillig indent: format its own code
 1.69 07-Jun-2023  rillig indent: extract the stack of parser symbols to a separate struct

No functional change.
 1.68 06-Jun-2023  rillig indent: sort functions in call order

No functional change.
 1.67 06-Jun-2023  rillig indent: compute indentation of 'case' labels on-demand

One less moving part to keep track of.

No functional change.
 1.66 05-Jun-2023  rillig indent: rename variables, clean up comments

No binary change.
 1.65 04-Jun-2023  rillig indent: track the kind of '{' on the parser stack
 1.64 03-Jun-2023  rillig indent: clean up handling of brace indentation

No functional change.
 1.63 02-Jun-2023  rillig indent: fix formatting of declarations with preprocessing lines
 1.62 23-May-2023  rillig indent: split debug output into paragraphs

The paragraphs separate the different processing steps: getting a token
from the lexer, processing the token, updating the parser state, sending
a finished line to the output.
 1.61 18-May-2023  rillig indent: manually wrap overly long lines

No functional change.
 1.60 18-May-2023  rillig indent: switch to standard code style

Taken from share/misc/indent.pro.

Indent does not wrap code to fit into the line width, it only does so
for comments. The 'INDENT OFF' sections and too long lines will be
addressed in a follow-up commit.

No functional change.
 1.59 16-May-2023  rillig indent: allow comments in column 1 to be formatted
 1.58 15-May-2023  rillig indent: format its own code, extend some comments

With manual corrections, as there are still some bugs left.

No functional change.
 1.57 15-May-2023  rillig indent: remove redundant include lines
 1.56 15-May-2023  rillig indent: clean up memory and buffer management

Remove the need to explicitly initialize the buffers. To avoid
subtracting null pointers or comparing them using '<', migrate the
buffers from the (start, end) form to the (start, len) form. This form
also avoids inconsistencies in whether 'buf.e == buf.s' or 'buf.s ==
buf.e' is used.

Make buffer.st const, to avoid accidental modification of the buffer's
content.

Replace '*buf.e++ = ch' with buf_add_char, to avoid having to keep track
of how much unwritten space is left in the buffer. Remove all safety
margins, that is, no more unchecked access to buf.st[-1] or appending
using '*buf.e++'.

Fix line number counting in lex_word for words that contain line breaks.

No functional change.
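
The (start, len) form described in this entry might be sketched as follows. Only the name buf_add_char comes from the commit message; the field names, the growth policy and buf_add_str are assumptions made for the example. An all-zero struct is a valid empty buffer, so no explicit initialization function is needed.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical (start, len) buffer; all-zero is a valid empty buffer. */
struct buffer {
	char *s;	/* start of the allocated memory, or NULL */
	size_t len;	/* number of bytes in use */
	size_t cap;	/* number of bytes allocated */
};

/* Append one byte, growing the allocation when it is full. */
static void
buf_add_char(struct buffer *buf, char ch)
{
	if (buf->len == buf->cap) {
		size_t new_cap = buf->cap == 0 ? 16 : 2 * buf->cap;
		char *new_s = realloc(buf->s, new_cap);

		if (new_s == NULL)
			abort();
		buf->s = new_s;
		buf->cap = new_cap;
	}
	buf->s[buf->len++] = ch;
}

/* Append a NUL-terminated string, one byte at a time. */
static void
buf_add_str(struct buffer *buf, const char *str)
{
	for (; *str != '\0'; str++)
		buf_add_char(buf, *str);
}
```

Since every append goes through buf_add_char, callers never compare or subtract pointers and never need a safety margin at either end of the buffer.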
 1.55 14-May-2023  rillig indent: remove foreign RCS IDs
 1.54 13-May-2023  rillig indent: move debugging code to separate file

No functional change.
 1.53 12-May-2023  rillig indent: rename placeholder symbol for parser stack

No functional change outside debug mode.
 1.52 12-May-2023  rillig tests/indent: test pushing the placeholder symbol to the parser stack
 1.51 12-May-2023  rillig indent: condense code for handling spaced expressions

No functional change outside debug mode.
 1.50 11-May-2023  rillig indent: remove buggy code for swapping tokens

It is not the job of an indenter to swap tokens, even if it's only about
placing comments elsewhere. The code that swapped the tokens was
complicated, buggy and impossible to understand.

In -br (brace right) mode, indent no longer moves a '{' from the
beginning of a line to the end of the previous line, as that was handled
by the token swapping code as well. This change is unintended, but it
will be easier to re-add that now that the code is simpler.
 1.49 22-Apr-2022  rillig indent: remove FreeBSD IDs

Most of the IDs were empty anyway.
 1.48 07-Nov-2021  rillig indent: various cleanups

Make several comments more precise.

Rename process_end_of_file to process_eof to match the token name.

Change the order of assignments in analyze_comment to keep the com_ind
computations closer together.

In copy_comment_wrap, use pointer difference instead of pointer addition
to stay away from undefined behavior.

No functional change.
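
The pointer-difference idiom can be sketched as below. The function name and parameters are hypothetical and not taken from copy_comment_wrap; the sketch only shows why subtracting is safer than adding when checking whether n more bytes fit.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Does the range [p, p + n) fit before 'end'?  Both pointers must point
 * into the same array (or one past its end), with p <= end.
 *
 * Writing 'p + n <= end' would form a pointer beyond the array whenever
 * the answer is "no", which is undefined behavior.  The difference
 * 'end - p' is always well-defined under the precondition above.
 */
static bool
fits(const char *p, const char *end, size_t n)
{
	return n <= (size_t)(end - p);
}
```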
 1.47 29-Oct-2021  rillig indent: remove redundant comments, remove punctuation from debug log

The comment about 'null stmt' between braces probably meant 'no
statements between braces'.

The comments at psym_switch_expr only repeated what the code says or had
been outdated 29 years ago already since opt.case_indent does not have
to be 'one level down'.

In the debug log, the quotes around the symbol names are not necessary
after a ':'. The parse stack also does not need this much punctuation.

Reducing a do-while loop to nothing instead of a statement saves a few
CPU cycles. It works because after each lbrace, a stmt is pushed to the
parser stack. This stmt can only ever be reduced to a stmt_list but
never be removed.
 1.46 29-Oct-2021  rillig indent: remove redundant comments

The comments only repeated what the constants for the parser symbols
already express in their names. In the past, the names of these
constants were inconsistent and misleading; back then, it made sense to
make the comments express the actual meaning of the constants.
 1.45 29-Oct-2021  rillig indent: reduce indentation in parse, extract decl_level

No functional change.
 1.44 28-Oct-2021  rillig indent: clean up indentation, comments, reduce

No functional change.
 1.43 28-Oct-2021  rillig indent: clean up comments and function names

Having accurate names for the lexer symbols and the parser symbols makes
most of the comments redundant. Remove these.

Rename process_decl to process_type, to match the name of the
corresponding lexer symbol. In this phase, it's just a single type
token, not a whole declaration.

No functional change.
 1.42 26-Oct-2021  rillig indent: run indent on its own source code

With manual corrections afterwards, to compensate for the remaining bugs
in indent.

Without the type definitions in .indent.pro, the opening braces of the
functions kw_name and lexi_alnum would not be at the beginning of the
line.
 1.41 25-Oct-2021  rillig indent: do not output token in debug mode

When the parse stack is manipulated, the text of the token is not
relevant anymore and may even be confusing, for example when parsing
if_expr, the token may contain "}".
 1.40 25-Oct-2021  rillig indent: rename search_brace to search_stmt

No functional change.
 1.39 25-Oct-2021  rillig indent: split type token_type into 3 separate types

Previously, token_type was used for 3 different purposes:

1. symbol types from the lexer
2. symbol types on the parser stack
3. kind of control statement for 'if (expr)' and similar statements

Splitting the 41 constants into separate types makes it immediately
clear that the parser stack never handles comments, preprocessing lines,
newlines, form feeds, the inner structure of expressions.

Previously, the constant switch_expr was especially confusing since it
was used for 3 different purposes: when returned from lexi, it
represented the keyword 'switch', in the parser stack it represented
'switch (expr)', and it was used for a statement head as well.

The only overlap between the lexer symbols and the parser symbols are
'{' and '}', and the keywords 'do' and 'else'. To increase confusion,
the constants of the previous token_type were in apparently random
order and before 2021, they had cryptic, highly abbreviated names.

No functional change.
 1.38 24-Oct-2021  rillig indent: split kw_do_or_else into separate constants

It was unnecessarily confusing to have the token types keyword_do_else,
keyword_do and keyword_else at the same time, without any hint in what
they differed.

Some of the token types seem to be used by the lexer while others are
used in the parse stack. Maybe all token types can be partitioned into
these groups, which would suggest using two different types for them.
And if not, it's still clearer to have this distinction in the names of
the constants.

No functional change.
 1.37 24-Oct-2021  rillig indent: run indent on its own source code

With manual corrections afterwards. Indent still does not get
extra_expr_indent correctly, it also indents global variables after
tagged declarations too deep.

No functional change.
 1.36 20-Oct-2021  rillig indent: rename parser stack variables

No functional change.
 1.35 08-Oct-2021  rillig indent: clean up comments, parentheses, debug messages, boolean operator

No functional change.
 1.34 08-Oct-2021  rillig indent: clean up 'parse', add test for dangling else

No functional change.
 1.33 07-Oct-2021  rillig indent: rename opt.btype_2 to brace_same_line

No functional change.
 1.32 07-Oct-2021  rillig indent: let the code breathe a bit by inserting empty lines

No functional change.
 1.31 07-Oct-2021  rillig indent: clean up comments

No functional change.
 1.30 07-Oct-2021  rillig indent: remove redundant comments

No functional change.
 1.29 05-Oct-2021  rillig indent: fix Clang-Tidy warnings, clean up bakcopy

The comment above and inside bakcopy had been outdated for at least the
last 28 years, the backup file is named "%s.BAK", not ".B%s".

Prevent buffer overflow for very long filenames (sprintf -> snprintf).
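
A minimal sketch of the sprintf to snprintf change, using the "%s.BAK" pattern named above. The function name and its signature are hypothetical; the actual bakcopy code is not shown in this log.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build the backup file name in 'dst'.  snprintf never writes more than
 * 'dst_size' bytes; its return value is the length the full name would
 * have had, which reveals whether the result was truncated.
 */
static bool
make_backup_name(char *dst, size_t dst_size, const char *filename)
{
	int n = snprintf(dst, dst_size, "%s.BAK", filename);

	return n >= 0 && (size_t)n < dst_size;
}
```

A caller can then refuse to continue when the function returns false, instead of working with a truncated backup name.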
 1.28 05-Oct-2021  rillig indent: rename local char variable, reduce scope of counters

No functional change.
 1.27 26-Sep-2021  rillig indent: let indent format its own code -- in supervised mode

After running indent on the code, I manually selected each change that
now looks better than before. The remaining changes are left for later.
All in all, indent did a pretty good job, except for syntactic additions
from after 1990, but that was to be expected. Examples of such
additions are GCC's __attribute__ and C99 designated initializers.

Indent has only a few knobs to tune the indentation. The knob for the
continuation indentation applies to function declarations as well as to
expressions. The knob for indentation of local variable declarations
applies to struct members as well, even if these are members of a
top-level struct.

Several code comments crossed the right margin in column 78. Several
other code comments were correctly broken though. The cause for this
difference was not obvious.

No functional change.
 1.26 25-Sep-2021  rillig indent: un-abbreviate a few parser_state members, clean up comments

No functional change.
 1.25 25-Sep-2021  rillig indent: convert remaining ibool to bool

No functional change intended.
 1.24 25-Sep-2021  rillig indent: prepare for lint's strict bool mode

Before C99, C had no boolean type. Instead, indent used int for that,
just like many other programs. Even with C99, bool and int can be used
interchangeably in many situations, such as querying '!i' or '!ptr' or
'cond == 0'.

Since January 2021, lint provides the strict bool mode, which makes bool
a non-arithmetic type that is incompatible with any other type. Having
clearly separate types helps in understanding the code.

To migrate indent to strict bool mode, the first step is to apply all
changes that keep the resulting binary the same. Since sizeof(bool) is
1 and sizeof(int) is 4, the type ibool serves as an intermediate type.
For now it is defined to int, later it will become bool.

The current code compiles cleanly in C99 and C11 mode, as well as in
lint's strict bool mode. There are a few tricky places:

In args.c in 'struct pro', there are two types of options: boolean and
integer. Boolean options point to a bool variable, integer options
point to an int variable. To keep the current structure of the code,
the pointer has been changed to 'void *'. To ensure type safety, the
definition of the options is done via preprocessor magic, which in C11
mode ensures the correct pointer types. (Add CFLAGS+=-std=gnu11 at the
very bottom of the Makefile.)

In indent.c in process_preprocessing, a boolean variable is
post-incremented. That variable is only assigned to another variable,
and that variable is only used in a boolean context. To provoke a
different behavior between the '++' and the '= true', the source code
to be indented would need 1 << 32 preprocessing directives, which is
unlikely to happen in practice.

In io.c in dump_line, the variables ps.in_stmt and ps.in_decl only ever
get the values 0 and 1. For these values, the expressions 'a & ~b' and
'a && !b' are equivalent, in all versions of C. The compiler may
generate different code for them, though.

In io.c in parse_indent_comment, the assignment to inhibit_formatting
takes place in integer context. If the compiler is smart enough to
detect the possible values of on_off, it may generate the same code
before and after the change, but that is rather unlikely.

The second step of the migration will be to replace ibool with bool,
step by step, just in case there are any hidden gotchas in the code,
such as sizeof or pointer casts.

No change to the resulting binary.
 1.23 25-Sep-2021  rillig indent: remove ifdef for lint

NetBSD lint does not need them anymore, FreeBSD does not have lint.
 1.22 25-Sep-2021  rillig indent: group global variables for token buffer

No functional change.
 1.21 25-Sep-2021  rillig indent: inline macro 'token'

No functional change.
 1.20 25-Sep-2021  rillig indent: group global variables for code buffer

No functional change.
 1.19 25-Sep-2021  rillig indent: rename variables of type token_type

The previous variable name 'code' conflicts with the buffer of the same
name.

No functional change.
 1.18 12-Mar-2021  rillig indent: use consistent indentation for 'else'

Half of the code used -ce, the other half the opposite -nce.

No functional change.
 1.17 09-Mar-2021  rillig indent: make token names more precise

The previous 'casestmt' was wrong since a case label is not a statement
at all.

The previous 'swstmt' was overly short, and wrong as well, since it
represents only the 'switch (expr)' part, which is not a complete switch
statement. Same for 'ifstmt', 'whilestmt', 'forstmt'.

The previous word 'head' was not precise enough since it didn't specify
exactly where the head ends and the body starts. Especially for
handling the dangling else, this distinction is important.

No functional change.
 1.16 09-Mar-2021  rillig indent: extract reduce_stmt from reduce

This refactoring reduces the indentation of the code, as well as
removing any ambiguity as to which 'switch' statement a 'break' belongs,
as there are no more nested 'switch' statements.

No functional change.
 1.15 09-Mar-2021  rillig indent: manually indent comments

It's strange that indent's own code is not formatted by indent itself,
which would be a good demonstration of its capabilities.

In its current state, I don't trust indent to get even the tokenization
correct, therefore the only safe way is to format the code manually.
 1.14 07-Mar-2021  rillig indent: in debug mode, output detailed token information

The main ingredient for understanding how indent works is the tokenizer
and the 4 buffers in which the text is collected.

Inspecting this debug log for the test comment-line-end makes it obvious
why indent messes up code that contains '//' comments. The cause is
that indent interprets '//' as an operator, just like '&&' or '||'. The
sequence '/////' is interpreted as a single operator as well, by the
way.

Since '//' is interpreted as an ordinary operator, any words following
it are plain identifiers, usually several of them in a row, which is a
syntax error. Depending on the context, the operator '//' is either a
unary operator (no space around) or a binary operator (space around).
This explains why the word 'line-end' is expanded to 'line - end'.

No functional change outside of debug mode.
 1.13 07-Mar-2021  rillig indent: for the token types, use enum instead of #define

This makes it easier to step through the code in a debugger.

No functional change.
 1.12 07-Mar-2021  rillig indent: use all headers in all files

This is a prerequisite for converting the token types to an enum instead
of a preprocessor define, since the return type of lexi will become
token_type. Having the enum will make debugging easier.

There was a single naming collision, which forced the variable in
scan_profile to be renamed. All other token names are used nowhere
else.

No change to the resulting binary.
 1.11 06-Mar-2021  rillig indent: fix space-tab alignment in indent's own code

These parts are not fixed automatically by indent since they are in box
comments.

No functional change.
 1.10 19-Oct-2019  christos use stdarg, annotate function as __printflike and fix broken formats.
 1.9 04-Apr-2019  kamil Upgrade indent(1)

Merge all the changes from the recent FreeBSD HEAD snapshot
into our local copy.

FreeBSD actively maintains this program in their sources and their
repository contains over 100 commits with changes.

Keep the delta between the FreeBSD and NetBSD versions to an absolute
minimum, mostly RCS Id and compatibility fixes.

Major changes in this import:

- Added an option -ldi<N> to control indentation of local variable names.
- Added option -P for loading user-provided files as profiles
- Added -tsn for setting tabsize
- Rename -nsac/-sac ("space after cast") to -ncs/-cs
- Added option -fbs, which enables (disables) splitting the function declaration and opening brace across two lines.
- Respect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)
- Group global option variables into an options structure
- Use bsearch() for looking up type keywords.
- Don't produce unneeded space character in function declarators
- Don't unnecessarily add a blank before a comment ends.
- Don't ignore newlines after comments that follow braces.

Merge the FreeBSD indent(1) tests with our ATF framework.
All tests pass.

Upgrade prepared by Manikishan Ghantasala.
Final polishing by myself.
 1.8 03-Feb-2019  mrg - add or adjust /* FALLTHROUGH */ where appropriate
- add __unreachable() after functions that can return but won't in
this case, and thus can't be marked __dead easily
 1.7 07-Aug-2003  agc branches: 1.7.98;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22365, verified by myself.
 1.6 26-May-2002  wiz Remove #ifndef'd __STDC__ code. ANSIfy.
 1.5 19-Oct-1997  lukem WARNSify, fix .Nm usage, deprecate register, use <err.h>, KNFify (with indent!;)
 1.4 18-Oct-1997  mrg merge lite-2.
 1.3 09-Jan-1997  tls RCS ID police
 1.2 01-Aug-1993  mycroft Add RCS identifiers.
 1.1 09-Apr-1993  cgd branches: 1.1.1;
added, from net/2 (patch 124).
 1.1.1.2 04-Apr-2019  kamil FreeBSD indent r340138
 1.1.1.1 06-Jun-1993  mrg 4.4BSD-Lite2
 1.7.98.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.7.98.1 10-Jun-2019  christos Sync with HEAD
 1.79.2.1 02-Aug-2025  perseant Sync with HEAD
 1.174 04-Jan-2025  rillig indent: make debug output easier to read

The previous format had the values of the parser state on the left side
and the corresponding names on the right side. While it looked nicely
aligned, it was not suitable for focusing on the actual data. Replace
this format with the more common "key: value" format.

Use the names of the enum constants in the debug log, instead of the
previous "nice" names that needed one more level of mental translation
and in some cases contained unbalanced punctuation such as '{'.
 1.173 03-Dec-2023  rillig branches: 1.173.2;
indent: inline input-related macros

No binary change.
 1.172 03-Dec-2023  rillig indent: use line number of the token start in diagnostics

Previously, the line number of the end of the token was used, which was
confusing in debug mode.
 1.171 23-Jun-2023  rillig indent: fix scanning of no-wrap comments (since 2021.11.07.10.34.03)

The "refactoring" back then tried to be too clever.
 1.170 18-Jun-2023  rillig indent: only add blank lines before actual block comments
 1.169 18-Jun-2023  rillig indent: remove support for backspace in code and comments

The C code in the whole tree does not contain a single literal
backspace.
 1.168 17-Jun-2023  rillig indent: clean up

Extract duplicate code for handling line continuations.

Prevent theoretical undefined behavior in strspn, as inp.s is not
null-terminated.

Remove adding extra space characters when processing comments, as these
are not necessary to force a line of output.

No functional change.
 1.167 17-Jun-2023  rillig indent: miscellaneous cleanups

No binary change.
 1.166 16-Jun-2023  rillig indent: rename a field of the parser state

The previous name 'comment_in_first_line' was misleading, as it could
mean that there was a comment in the first line of the file.

No functional change.
 1.165 14-Jun-2023  rillig indent: allow more than 20 nested parentheses or brackets
 1.164 14-Jun-2023  rillig indent: clean up handling of comments

One less moving part in the parser state.

No functional change.
 1.163 14-Jun-2023  rillig indent: remove another flag from parser state

When processing a comment, the flag ps.next_col_1 was not used for the
next token, but for a line within a comment. As its scope was limited
to a single comment, there is no need to store it any longer than that.

No functional change.
 1.162 14-Jun-2023  rillig indent: remove a redundant flag from the parser state

No functional change.
 1.161 10-Jun-2023  rillig indent: miscellaneous cleanups
 1.160 10-Jun-2023  rillig indent: in debug mode, null-terminate buffers
 1.159 10-Jun-2023  rillig indent: rename and sort variables in parser state

No functional change.
 1.158 09-Jun-2023  rillig indent: format its own code
 1.157 09-Jun-2023  rillig indent: preserve block comments with delimiters
 1.156 09-Jun-2023  rillig indent: express more clearly how delimited and no-wrap comments relate

No functional change.
 1.155 06-Jun-2023  rillig indent: right-trim single-line comments
 1.154 06-Jun-2023  rillig indent: clean up formatting of comments

No functional change.
 1.153 06-Jun-2023  rillig indent: simplify handling of comments

No functional change.
 1.152 06-Jun-2023  rillig indent: split printing of comments into smaller functions

No functional change.
 1.151 04-Jun-2023  rillig indent: remove read pointer from buffers that don't need it

The only buffer that needs a read pointer is the current input line in
'inp'.

No functional change.
 1.150 04-Jun-2023  rillig indent: fix out-of-bounds read when reading a comment
 1.149 21-May-2023  rillig tests/indent: fix outdated or wrong comments
 1.148 20-May-2023  rillig indent: extract the output state from the parser state

The parser state depends on the preprocessing lines, the output state
shouldn't.
 1.147 20-May-2023  rillig indent: implement blank line above block comment
 1.146 18-May-2023  rillig indent: manually wrap overly long lines

No functional change.
 1.145 18-May-2023  rillig indent: switch to standard code style

Taken from share/misc/indent.pro.

Indent does not wrap code to fit into the line width, it only does so
for comments. The 'INDENT OFF' sections and too long lines will be
addressed in a follow-up commit.

No functional change.
 1.144 16-May-2023  rillig indent: directly access the input buffer

No functional change.
 1.143 16-May-2023  rillig indent: remove support for form feed characters inside a line

Form feeds are occasionally used to split code into pages, and this use
is still supported. Having a form feed in the middle of a line is
exotic.
 1.142 15-May-2023  rillig indent: fix line wrapping of comments to the right of code
 1.141 15-May-2023  rillig indent: let indent format its own code

With manual corrections, as indent does not properly indent multi-line
'?:' expressions nor multi-line controlling expressions.
 1.140 15-May-2023  rillig indent: clean up memory and buffer management

Remove the need to explicitly initialize the buffers. To avoid
subtracting null pointers or comparing them using '<', migrate the
buffers from the (start, end) form to the (start, len) form. This form
also avoids inconsistencies in whether 'buf.e == buf.s' or 'buf.s ==
buf.e' is used.

Make buffer.st const, to avoid accidental modification of the buffer's
content.

Replace '*buf.e++ = ch' with buf_add_char, to avoid having to keep track
how much unwritten space is left in the buffer. Remove all safety
margins, that is, no more unchecked access to buf.st[-1] or appending
using '*buf.e++'.

Fix line number counting in lex_word for words that contain line breaks.

No functional change.
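
The (start, len) form described in this commit can be illustrated with a minimal sketch. The names sbuf and sbuf_add_char, the field layout and the growth policy are assumptions made for this example only; they are not indent's actual definitions. The point is that a zero-initialized buffer needs no explicit setup and that appending never touches memory outside the allocated range.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffer in (start, len) form; all-zero means "empty". */
struct sbuf {
	char *s;	/* start of the storage, NULL while nothing is allocated */
	size_t len;	/* number of bytes in use */
	size_t cap;	/* number of bytes allocated */
};

/* Append one character, growing the storage when it is full. */
void
sbuf_add_char(struct sbuf *buf, char ch)
{
	assert(buf->len <= buf->cap);
	if (buf->len == buf->cap) {
		size_t new_cap = buf->cap == 0 ? 16 : 2 * buf->cap;
		/* realloc(NULL, n) acts like malloc(n), so no init step is needed */
		char *new_s = realloc(buf->s, new_cap);
		if (new_s == NULL)
			abort();
		buf->s = new_s;
		buf->cap = new_cap;
	}
	buf->s[buf->len++] = ch;
}
```

A caller would start from 'struct sbuf b = {0};' and only ever read b.s[0] up to b.s[b.len - 1], so there is no safety margin to keep track of.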
 1.139 14-May-2023  rillig indent: only null-terminate the buffers if necessary

The only case where a buffer is used as a C-style string is when looking
up a keyword.

No functional change.
 1.138 14-May-2023  rillig indent: shorten variable names for indenting comments

No functional change.
 1.137 14-May-2023  rillig indent: fix handling of multiple block comments in a line
 1.136 14-May-2023  rillig indent: in comments, keep a leading tab

This kind of comments is used for the CVS IDs at the top of files.
 1.135 14-May-2023  rillig indent: fix vertical spacing after declarations

A comment is not supposed to change the state of the 'blank line after
declaration', but it did. The initialization of saved_just_saw_decl was
wrong though since it tried to capture the previous value after it had
already been overwritten.
 1.134 14-May-2023  rillig indent: reduce code for scanning tokens

The input line is guaranteed to end with '\n', so there's no need to
carry another pointer around.

No functional change.
 1.133 14-May-2023  rillig indent: remove foreign RCS IDs
 1.132 13-May-2023  rillig indent: use enum instead of magic numbers for tracking declarations

No functional change.
 1.131 13-May-2023  rillig indent: rename struct fields for buffers

No binary change except for assertion line numbers.
 1.130 12-May-2023  rillig indent: remove statistics

The numbers from the statistics were wrong.
 1.129 11-May-2023  rillig indent: remove broken code for handling blank lines

This fixes several bugs where blank lines were erroneously added or
removed, trading these old bugs for new bugs in different places.
These new bugs are expected to be easier to fix, as the old bugs will
not interfere anymore.
 1.128 09-May-2022  rillig indent: clean up control flow, remove Capsicum

No functional change.
 1.127 23-Apr-2022  rillig indent: group global variables related to output control

No functional change.
 1.126 27-Nov-2021  rillig indent: rename dump functions to output

No functional change.
 1.125 26-Nov-2021  rillig indent: add buf_add_range for adding characters to a buffer

No functional change.
 1.124 25-Nov-2021  rillig indent: make fits_in_one_line independent of ps

This makes it easier to understand the function. Having the dependency
on the input line handling is already complicated enough.

No functional change.
 1.123 25-Nov-2021  rillig indent: fix accidentally joined and broken comments (since 2019-04-04)

The fixed version is not perfect as it gets the indentation of the last
line of the first comment wrong, but at least indent doesn't generate
malformed output anymore.
 1.122 25-Nov-2021  rillig tests/indent: demonstrate bugs in comment processing
 1.121 19-Nov-2021  rillig indent: reduce casts to unsigned char for character classification

No functional change.
 1.120 19-Nov-2021  rillig indent: use character input API from pr_comment.c

No functional change.
 1.119 19-Nov-2021  rillig indent: replace direct access to the input buffer

This is a preparation for abstracting away all the low-level details of
handling the input. The goal is to fix the current bugs regarding line
number counting, out of bounds memory access, and generally unreadable
code.

No functional change.
 1.118 19-Nov-2021  rillig indent: rename input buffer variables

From reading the names 'save_com' and 'sc_end', it was not obvious
enough that these two variables are the limits of the same buffer, the
names were just too unrelated.

No functional change.
 1.117 19-Nov-2021  rillig indent: group variables for input handling

No functional change.
 1.116 07-Nov-2021  rillig indent: various cleanups

Make several comments more precise.

Rename process_end_of_file to process_eof to match the token name.

Change the order of assignments in analyze_comment to keep the com_ind
computations closer together.

In copy_comment_wrap, use pointer difference instead of pointer addition
to stay away from undefined behavior.

No functional change.
 1.115 07-Nov-2021  rillig indent: remove code that accessed out-of-bounds data from buffer

Saving and restoring the exact buffer position had been necessary before
NetBSD pr_comment.c 1.114. The code accessed the buffer data out of the
bounds of [com.s, com.e), which was rather surprising. More
specifically, it accessed com.e[-1] in a case where com.e == com.s,
relying on com.e[-1] being ' ' in most cases and '*' in the case where
the comment delimiter was written to a separate output line.

Make the code easier to understand by only ever accessing the buffer
data in the bounds [buf.s, buf.e).

No functional change.
 1.114 07-Nov-2021  rillig indent: only access buffer data in the range [buf.s, buf.e)

No functional change.
 1.113 07-Nov-2021  rillig indent: do not expand comment buffer unnecessarily

This may have been a simple typo or a really tricky optimization that
isn't obvious even after studying the code for several months. Either of
these is bad, so use the standard form of resetting the buffer.

No functional change.
 1.112 07-Nov-2021  rillig indent: remove dead code in analyze_comment

The case of an otherwise empty line is already handled further above.
 1.111 07-Nov-2021  rillig indent: remove dead code from copy_comment_wrap

C99 comments are not wrapped, therefore there is no need to check for
them in copy_comment_wrap.

No functional change.
 1.110 07-Nov-2021  rillig indent: move documentation from process_comment to copy_comment_wrap
 1.109 07-Nov-2021  rillig indent: clean up process_comment

The assignments to the variables were redundant, they are already done
by analyze_comment.

No functional change.
 1.108 07-Nov-2021  rillig indent: remove redundant assignment

At that point, ps.next_col_1 is already false.

No functional change.
 1.107 07-Nov-2021  rillig indent: make end of comment detection of nowrap comments simpler

No functional change.
 1.106 07-Nov-2021  rillig indent: clean up overcomplicated conditions in copy_comment_nowrap

No functional change.
 1.105 07-Nov-2021  rillig indent: skip redundant conditions in copy_comment_nowrap

No functional change.
 1.104 07-Nov-2021  rillig indent: clean up copy_comment_nowrap

The action for '\f' was the same as the default action.

Replacing 'switch' with 'if' makes the code shorter.

No functional change.
 1.103 07-Nov-2021  rillig indent: make copy_comment_nowrap simpler

Since a nowrap comment is copied unmodified, it need not depend on any
maximum line length.

No functional change.
 1.102 07-Nov-2021  rillig indent: remove dead code from process_comment_nowrap

In comments that are preserved, no additional leading ' * ' is inserted.

No functional change.
 1.101 07-Nov-2021  rillig indent: remove dead code from copy_comment_wrap

No functional change.
 1.100 07-Nov-2021  rillig indent: remove dead code from copy_comment_nowrap

No functional change.
 1.99 07-Nov-2021  rillig indent: split copy_comment into wrapping and non-wrapping

These two cases are processed in an almost entirely different way. In
particular, copy_comment_nowrap should copy the comment verbatim, which
is not obvious from the current code, due to the many conditions and the
complex control flow.

No functional change.
 1.98 07-Nov-2021  rillig indent: rename 'inbuf' functions to 'inp'

The variable 'inp' used to be named 'inbuf'. Make the function names
correspond to the variable name again.

No functional change.
 1.97 05-Nov-2021  rillig indent: rename ps.curr_newline to next_col_1

For symmetry with ps.curr_col_1.

No functional change.
 1.96 04-Nov-2021  rillig indent: fix parsing of C99 comments containing '*/'
 1.95 04-Nov-2021  rillig indent: split process_comments into separate functions

No functional change.
 1.94 03-Nov-2021  rillig indent: inline indentation_after, shorten function name to ind_add

There were only few calls to indentation_after, so inlining it spares
the need to look at yet another function definition. Another effect is
that code.s and code.e appear in the code as a pair now, instead of a
single code.s, making the scope of the function call obvious.

In ind_add, there is no need to check for '\0' anymore since none of the
buffers can ever contain a null character, these are filtered out by
inbuf_read_line.

No functional change.
 1.93 30-Oct-2021  rillig indent: rename prev_newline and prev_col_1 to curr

These two flags describe the token that is currently processed.

In process_binary_op, curr_newline can never be true since newline is
not a binary operator, so remove that condition.

No functional change.
 1.92 30-Oct-2021  rillig indent: fix bounds check for sc_buf

Some years ago, save_com was an array of characters, used as temporary
storage. When sc_buf was added, this code was forgotten. The bounds
check must be on the array itself, not on an iterator that points
somewhere in that array.
 1.91 30-Oct-2021  rillig indent: fix assertion in fits_in_one_line
 1.90 29-Oct-2021  rillig indent: merge isblank and is_hspace into ch_isblank

No functional change.
 1.89 29-Oct-2021  rillig indent: fix undefined behavior in buffer handling

Adding an arbitrary integer to a pointer may result in an out of bounds
pointer, so replace the addition with a pointer subtraction.

In the buffer handling functions, handle 'buf' and 'l' before 's' and
'e', since they are pairs.

In inbuf_read_line, use 's' instead of 'buf' to make the code easier to
understand for human readers.

No functional change.
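
The difference between the two forms of the bounds check can be sketched as follows. The struct span and the function span_has_room are hypothetical names chosen for this example, not code from indent. The risky form would be 'sp->e + n <= sp->l', which first forms a pointer that may lie far outside the array; the sketch instead subtracts two pointers that are known to point into the same array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer: [s, e) is in use, l is one past the end of the storage. */
struct span {
	char *s;
	char *e;
	char *l;
};

/* Return whether n more bytes fit, without ever forming 'e + n'. */
bool
span_has_room(const struct span *sp, size_t n)
{
	assert(sp->s <= sp->e && sp->e <= sp->l);
	/* Both pointers belong to the same array, so the difference is defined. */
	return n <= (size_t)(sp->l - sp->e);
}
```

With this form, even an absurdly large n is handled safely, since it is only compared against the remaining space and never added to a pointer.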
 1.88 29-Oct-2021  rillig indent: use prev/curr/next to refer to the current token

The word 'last' just didn't match with 'next'.

No functional change.
 1.87 26-Oct-2021  rillig indent: clean up process_comment

There is no undefined behavior since the compared characters are always
from the basic execution character set. All other cases are covered by
the condition above for now_len.

Fix debug logging for non-ASCII characters, previously a character was
output as \xffffffc3.
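
An output such as \xffffffc3 is what typically happens when a negative plain char is promoted to int and then printed with %x. The following sketch shows the usual remedy; it assumes a platform where plain char is signed and int has 32 bits, and the function name format_byte_hex is made up for this example. The commit itself does not show how the fix was done.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write ch as "\xNN" into dst.
 * Converting to unsigned char first prevents sign extension.
 */
int
format_byte_hex(char *dst, size_t dst_size, char ch)
{
	assert(dst != NULL && dst_size > 0);
	return snprintf(dst, dst_size, "\\x%02x",
	    (unsigned int)(unsigned char)ch);
}
```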
 1.86 26-Oct-2021  rillig indent: reduce indentation in process_comment

No functional change.
 1.85 26-Oct-2021  rillig indent: make reformatting of comments simpler

No functional change.
 1.84 25-Oct-2021  rillig indent: split type token_type into 3 separate types

Previously, token_type was used for 3 different purposes:

1. symbol types from the lexer
2. symbol types on the parser stack
3. kind of control statement for 'if (expr)' and similar statements

Splitting the 41 constants into separate types makes it immediately
clear that the parser stack never handles comments, preprocessing lines,
newlines, form feeds, or the inner structure of expressions.

Previously, the constant switch_expr was especially confusing since it
was used for 3 different purposes: when returned from lexi, it
represented the keyword 'switch', in the parser stack it represented
'switch (expr)', and it was used for a statement head as well.

The only overlap between the lexer symbols and the parser symbols are
'{' and '}', and the keywords 'do' and 'else'. To increase confusion,
the constants of the previous token_type were in apparently random
order and before 2021, they had cryptic, highly abbreviated names.

No functional change.
 1.83 24-Oct-2021  rillig indent: run indent on its own source code

With manual corrections afterwards. Indent still does not get
extra_expr_indent correctly, it also indents global variables after
tagged declarations too deep.

No functional change.
 1.82 24-Oct-2021  rillig indent: replace global variable use_ff with function parameter
 1.81 24-Oct-2021  rillig indent: clean up comments
 1.80 20-Oct-2021  rillig indent: rename blankline_requested variables

The words 'prefix' and 'postfix' sounded too much like horizontal
concepts, like in operators. The actual purpose of these variables is to
add blank lines before and after the current line, so use the same
wording as in the command line options.

No functional change.
 1.79 14-Oct-2021  rillig indent: clarify that 25 is a magic number

The extra line width for comments to the very right is just that, an
arbitrarily chosen number. It neither has to be a multiple of 8, nor of
the tabsize nor of the indentation. Since 25 is neither of these, this
makes it a perfect choice, allowing these extreme comments to have 22
characters per line with -sc (leading asterisks in comment
continuations, the default) or 25 without.

No functional change.
 1.78 14-Oct-2021  rillig indent: turn ps.com_ind into local variable

This makes it immediately clear that ps.com_ind is computed in the first
part of process_comment and then only used from there.

No functional change.
 1.77 13-Oct-2021  rillig indent: extract fits_in_one_line from process_comment

No functional change.
 1.76 12-Oct-2021  rillig indent: in process_comment, migrate int variable to bool

No functional change.
 1.75 12-Oct-2021  rillig indent: in process_comment, negate box_com to may_wrap

In the new line 213, may_wrap could only be true and was therefore a
redundant condition.

No functional change.
 1.74 12-Oct-2021  rillig indent: negate a few conditions in process_comment

No functional change.
 1.73 12-Oct-2021  rillig indent: fix formatting of single-line comments (since today)

The change in pr_comment.c 1.70 from 3 hours ago did not cover all edge
cases correctly. Now it works for comments that are aligned with tabs.
 1.72 12-Oct-2021  rillig indent: replace unreachable code with assertion
 1.71 12-Oct-2021  rillig indent: use high-level buffer API for processing comments

Document the trickery of copying the last word from the previous line
since that is not obvious at all.

No functional change.
 1.70 12-Oct-2021  rillig indent: fix wrapping for comments in otherwise empty lines

The comment above the code was wrong. The leading 3 characters were
indeed ignored, but the first of them was '/', not ' '. Of the trailing
3 characters, 2 were not ignored. The start and end of the comment would
not cancel out, they would rather sum up.
 1.69 09-Oct-2021  rillig indent: extract common code for advancing a single tab

No functional change.
 1.68 08-Oct-2021  rillig indent: replace unreachable code with assertions

The input buffer is always supposed to be terminated with a newline. The
function inbuf_read_line silently skips null characters. Since the input
buffer is redirected to a temporary buffer in some cases, do not simply
remove this supposed dead code, but replace it with assertions.

In any case, if the code for calling inbuf_read_line had been reachable
and actually allocated a different buffer, continuing to use the pointer
p would have invoked undefined behavior anyway.
 1.67 08-Oct-2021  rillig indent: use standard error handling for unterminated comment

Just writing it to stdout is bad, especially when indent is used in
filter mode. Silently continuing after such an error is bad as well.

echo '/*' | indent
 1.66 08-Oct-2021  rillig indent: remove redundant conditions

No functional change.
 1.65 08-Oct-2021  rillig indent: convert ps.box_com to local variable

This variable is only used in a single function, and that function does
not call any other function that could replace the parser state or
install a temporary parser state.

No functional change.
 1.64 08-Oct-2021  rillig indent: rename fill_buffer to inbuf_read_line

No functional change.
 1.63 08-Oct-2021  rillig indent: run indent on indent.h

The formatting looks mostly OK.

Some struct members had excessively long names, leaving no space for
their corresponding comments. Renamed some of them using well-known
abbreviations.

The formatting for debug_vis_range is messed up, no idea why. It is
clearly a function declaration, not a function definition, so there is
no need to place the function name in column 1.

No functional change.
 1.62 08-Oct-2021  rillig indent: fix formatting of C99 comments

The first attempt at formatting C99 comments was conceptually wrong. It
accessed the next token in dump_line, even though that function should
only ever look at the buffers for the label, the code and the current
comment. (Understanding that part of the code was difficult at that time
due to the sheer number of global variables.) The complicated and
ever-growing condition for whether to output the token was a hack and in
retrospect doesn't make sense at all, that's why it only came close to
the intended effect.

Some unintended side effects were that the C99 comments had an
additional space in front of them, and that in some cases an empty line
followed the comment, and that the comments were not aligned.

Previously, the newline that terminates the C99 comment was included in
the comment. Separating the newline from the comment fixed all these
unintended side effects. The only downside is that the multi-line
statement is not indented, but that should be easy to fix.
 1.61 08-Oct-2021  rillig indent: reduce scope of t_ptr in process_comment

No functional change.
 1.60 08-Oct-2021  rillig indent: replace column calculations with indent, part 4/4
 1.59 08-Oct-2021  rillig indent: replace column calculations with indent, part 3

No functional change.
 1.58 07-Oct-2021  rillig indent: group variables for the input buffer

The input buffer follows the same concept as the intermediate buffers
for label, code, comment and token, so use the same type for it.

No functional change.
 1.57 07-Oct-2021  rillig indent: use braces around multi-line statements

No functional change.
 1.56 07-Oct-2021  rillig indent: let the code breathe a bit by inserting empty lines

No functional change.
 1.55 07-Oct-2021  rillig indent: clean up comments

No functional change.
 1.54 07-Oct-2021  rillig indent: remove redundant comments

No functional change.
 1.53 05-Oct-2021  rillig indent: fix outdated comment, add 'const'
 1.52 05-Oct-2021  rillig indent: fix spelling in comments
 1.51 05-Oct-2021  rillig indent: use proper escape sequence for form feed

This escape sequence has been available since at least 1978.
 1.50 05-Oct-2021  rillig indent: merge duplicate code into is_hspace

No functional change.
 1.49 05-Oct-2021  rillig indent: clean up code for appending to buffers

Use *e++ for appending and e[-1] for testing the previously appended
character, like in other places in the code.

No functional change.
 1.48 05-Oct-2021  rillig indent: merge duplicate code for reading from input buffer

No functional change.
 1.47 26-Sep-2021  rillig indent: let indent format its own code -- in supervised mode

After running indent on the code, I manually selected each change that
now looks better than before. The remaining changes are left for later.
All in all, indent did a pretty good job, except for syntactic additions
from after 1990, but that was to be expected. Examples for such
additions are GCC's __attribute__ and C99 designated initializers.

Indent has only few knobs to tune the indentation. The knob for the
continuation indentation applies to function declarations as well as to
expressions. The knob for indentation of local variable declarations
applies to struct members as well, even if these are members of a
top-level struct.

Several code comments crossed the right margin in column 78. Several
other code comments were correctly broken though. The cause for this
difference was not obvious.

No functional change.
 1.46 25-Sep-2021  rillig indent: merge duplicate code for token buffers

No functional change.
 1.45 25-Sep-2021  rillig indent: remove dead code for printing comments after empty lines

This code has been commented out for at least 29 years.

No functional change.
 1.44 25-Sep-2021  rillig indent: convert remaining ibool to bool

No functional change intended.
 1.43 25-Sep-2021  rillig indent: prepare for lint's strict bool mode

Before C99, C had no boolean type. Instead, indent used int for that,
just like many other programs. Even with C99, bool and int can be used
interchangeably in many situations, such as querying '!i' or '!ptr' or
'cond == 0'.

Since January 2021, lint provides the strict bool mode, which makes bool
a non-arithmetic type that is incompatible with any other type. Having
clearly separate types helps in understanding the code.

To migrate indent to strict bool mode, the first step is to apply all
changes that keep the resulting binary the same. Since sizeof(bool) is
1 and sizeof(int) is 4, the type ibool serves as an intermediate type.
For now it is defined to int, later it will become bool.

The current code compiles cleanly in C99 and C11 mode, as well as in
lint's strict bool mode. There are a few tricky places:

In args.c in 'struct pro', there are two types of options: boolean and
integer. Boolean options point to a bool variable, integer options
point to an int variable. To keep the current structure of the code,
the pointer has been changed to 'void *'. To ensure type safety, the
definition of the options is done via preprocessor magic, which in C11
mode ensures the correct pointer types. (Add CFLAGS+=-std=gnu11 at the
very bottom of the Makefile.)

In indent.c in process_preprocessing, a boolean variable is
post-incremented. That variable is only assigned to another variable,
and that variable is only used in a boolean context. To provoke a
different behavior between the '++' and the '= true', the source code
to be indented would need 1 << 32 preprocessing directives, which is
unlikely to happen in practice.

In io.c in dump_line, the variables ps.in_stmt and ps.in_decl only ever
get the values 0 and 1. For these values, the expressions 'a & ~b' and
'a && !b' are equivalent, in all versions of C. The compiler may
generate different code for them, though.

In io.c in parse_indent_comment, the assignment to inhibit_formatting
takes place in integer context. If the compiler is smart enough to
detect the possible values of on_off, it may generate the same code
before and after the change, but that is rather unlikely.

The second step of the migration will be to replace ibool with bool,
step by step, just in case there are any hidden gotchas in the code,
such as sizeof or pointer casts.

No change to the resulting binary.
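
The claim about 'a & ~b' and 'a && !b' in the io.c paragraph can be checked with a small sketch. The function names below are hypothetical and only serve to put both forms side by side; the sketch assumes two's complement integers, as on all platforms NetBSD supports.

```c
#include <assert.h>
#include <stdbool.h>

/* Integer form, as it could be written while the flags have type int. */
int
and_not_int(int a, int b)
{
	return a & ~b;
}

/* Boolean form, as required by a strict bool mode. */
bool
and_not_bool(bool a, bool b)
{
	return a && !b;
}
```

For a and b restricted to 0 and 1, both functions agree in all four cases. Outside that range they differ: 2 & ~1 is 2, whereas 2 && !1 is 0, which is why the equivalence depends on the flags only ever holding 0 or 1.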
 1.42 25-Sep-2021  rillig indent: remove ifdef for lint

NetBSD lint does not need them anymore, FreeBSD does not have lint.
 1.41 25-Sep-2021  rillig indent: move statistical values into a separate struct

No functional change.
 1.40 25-Sep-2021  rillig indent: add nonnull memory allocation functions

The only functional change is a single error message.
 1.39 25-Sep-2021  rillig indent: group global variables for token buffer

No functional change.
 1.38 25-Sep-2021  rillig indent: group global variables for code buffer

No functional change.
 1.37 24-Sep-2021  rillig indent: group global variables for label buffer into struct

No functional change.
 1.36 24-Sep-2021  rillig indent: group global variables for the comment buffer

No functional change.
 1.35 14-Mar-2021  rillig indent: clean up check_size_comment

The additional parameter last_bl_ptr was only necessary because the last
blank was stored as a pointer into the buffer. By storing the index in
the buffer instead, it doesn't need to be updated all the time.

No functional change.
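
Why an index is easier to maintain than a pointer can be sketched as follows. The struct text and the function text_add_char are hypothetical and not taken from indent. When the storage grows, realloc may move it; a stored pointer would then have to be recomputed, while a stored index stays valid without further bookkeeping.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growing text buffer that remembers its last blank. */
struct text {
	char *s;
	size_t len;
	size_t cap;
	size_t last_blank;	/* index of the last ' ', (size_t)-1 if none */
};

/* Append one character and keep track of the position of the last blank. */
void
text_add_char(struct text *t, char ch)
{
	assert(t->len <= t->cap);
	if (t->len == t->cap) {
		size_t new_cap = t->cap == 0 ? 4 : 2 * t->cap;
		char *new_s = realloc(t->s, new_cap);
		if (new_s == NULL)
			abort();
		t->s = new_s;	/* the storage may have moved; indices stay valid */
		t->cap = new_cap;
	}
	if (ch == ' ')
		t->last_blank = t->len;
	t->s[t->len++] = ch;
}
```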
 1.34 14-Mar-2021  rillig indent: remove trailing whitespace
 1.33 14-Mar-2021  rillig indent: clean up target column computation in process_comment

No functional change.
 1.32 14-Mar-2021  rillig indent: fix off-by-one error in comment wrapping

The manual page says that the default maximum length of a comment line
is 78. The test 'comments.0' wrongly assumed that this 78 would refer
to the maximum _column_ allowed, which is off by one.

Fix the wording in the test 'comments.0' and remove the (now satisfied)
expectation comments in the test 'token-comment.0'.

Several other tests just happened to hit that limit, fix these as well.
 1.31 14-Mar-2021  rillig indent: fix lint warnings

No functional change.
 1.30 13-Mar-2021  rillig indent: remove the '+ 1' from right margin calculation in comment

No functional change.
 1.29 13-Mar-2021  rillig indent: distinguish between 'column' and 'indentation'

column == 1 + indentation.

In addition, indentation is a relative distance while column is an
absolute position. Therefore, don't confuse these two concepts, to
prevent off-by-one errors.

No functional change.
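
The invariant 'column == 1 + indentation' can be illustrated with a minimal sketch. The helper names column_from_indent and next_tab_indent are assumptions for this example and do not appear in the commit. Working with the 0-based indentation keeps the tab stop arithmetic free of '+ 1' and '- 1' adjustments; the 1-based column is only derived at the very end.

```c
#include <assert.h>

/* Convert a 0-based indentation into the 1-based column it corresponds to. */
int
column_from_indent(int indent)
{
	assert(indent >= 0);
	return 1 + indent;
}

/* Advance an indentation to the next tab stop, staying in 0-based terms. */
int
next_tab_indent(int indent, int tabsize)
{
	assert(indent >= 0 && tabsize > 0);
	return indent - indent % tabsize + tabsize;
}
```

For example, with a tab size of 8, a tab at indentation 3 leads to indentation 8, which is column 9.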
 1.28 13-Mar-2021  rillig indent: rename pr_comment to process_comment, clean up documentation

No functional change.
 1.27 13-Mar-2021  rillig indent: remove redundant parentheses

No functional change.
 1.26 13-Mar-2021  rillig indent: fix confusing variable names

The word 'col' should only be used for the 1-based column number. This
name is completely inappropriate for a line length since that provokes
off-by-one errors. The name 'cols' would be acceptable although
confusing since it sounds so similar to 'col'.

Therefore, rename variables that are related to the maximum line length
to 'line_length' since that makes for obvious code and nicely relates to
the description of the option in the manual page.

No functional change.
 1.25 13-Mar-2021  rillig indent: document undefined behavior in processing of comments

No functional change.
 1.24 13-Mar-2021  rillig indent: inline calls to count_spaces and count_spaces_until

These two functions operated on column numbers instead of indentation,
which required adjustments of '+ 1' and '- 1'. Their names were
completely wrong since these functions did not count anything, instead
they computed the column.

No functional change.
 1.23 13-Mar-2021  rillig indent: replace compute_code_column with compute_code_indent

The goal is to only ever be concerned about the _indentation_ of a
token, never the _column_ it appears in. Having only one of these
avoids off-by-one errors.

No functional change.
 1.22 13-Mar-2021  rillig indent: replace compute_label_column with compute_label_indent

Using the invariant 'column == 1 + indent'. This removes several overly
complicated '+ 1' from the code that are not needed conceptually.

No functional change.
 1.21 13-Mar-2021  rillig indent: replace pad_output with output_indent

Calculating the indentation is simpler than calculating the column,
since that saves the constant addition and subtraction of the 1.

No functional change.
 1.20 12-Mar-2021  rillig indent: replace 'target' with 'indent' in function names

The word 'target' was not as specific as possible.

No functional change.
 1.19 12-Mar-2021  rillig indent: use consistent indentation for 'else'

Half of the code used -ce, the other half the opposite -nce.

No functional change.
 1.18 11-Mar-2021  rillig indent: reduce indentation of check_size functions

No functional change.
 1.17 11-Mar-2021  rillig indent: remove redundant cast after allocation functions

No functional change.
 1.16 11-Mar-2021  rillig indent: use consistent array indexing

No functional change.
 1.15 09-Mar-2021  rillig indent: manually indent comments

It's strange that indent's own code is not formatted by indent itself,
which would be a good demonstration of its capabilities.

In its current state, I don't trust indent to get even the tokenization
correct, therefore the only safe way is to format the code manually.
 1.14 08-Mar-2021  rillig indent: convert big macros to functions

Each of these buffers is only modified in a single file. This makes it
unnecessary to declare the macros in the global header.
 1.13 07-Mar-2021  rillig indent: fix handling of '//' end-of-line comments
 1.12 07-Mar-2021  rillig indent: use all headers in all files

This is a prerequisite for converting the token types to an enum instead
of a preprocessor define, since the return type of lexi will become
token_type. Having the enum will make debugging easier.

There was a single naming collision, which forced the variable in
scan_profile to be renamed. All other token names are used nowhere
else.

No change to the resulting binary.
 1.11 04-Apr-2019  kamil Upgrade indent(1)

Merge all the changes from the recent FreeBSD HEAD snapshot
into our local copy.

FreeBSD actively maintains this program in their sources and their
repository contains over 100 commits with changes.

Keep the delta between the FreeBSD and NetBSD versions to an absolute
minimum, mostly RCS Id and compatibility fixes.

Major changes in this import:

- Added an option -ldi<N> to control indentation of local variable names.
- Added option -P for loading user-provided files as profiles
- Added -tsn for setting tabsize
- Rename -nsac/-sac ("space after cast") to -ncs/-cs
- Added option -fbs (-nfbs) to enable (disable) splitting the function declaration and opening brace across two lines.
- Respect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)
- Group global option variables into an options structure
- Use bsearch() for looking up type keywords.
- Don't produce unneeded space character in function declarators
- Don't unnecessarily add a blank before a comment ends.
- Don't ignore newlines after comments that follow braces.
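
As an illustration of the bsearch() item in the list above, here is a
minimal sketch of a keyword lookup over a sorted table. The table
contents and the names type_keywords, cmp_keyword and is_type_keyword are
assumptions made for this example; they are not taken from the imported
code.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical keyword table; bsearch() requires strcmp() order. */
static const char *const type_keywords[] = {
	"char", "double", "float", "int", "long", "short", "unsigned",
	"void",
};

/* Compare the search key (a string) with a table element. */
static int
cmp_keyword(const void *key, const void *elem)
{
	return strcmp((const char *)key, *(const char *const *)elem);
}

/* Return whether 'word' is one of the type keywords in the table. */
static bool
is_type_keyword(const char *word)
{
	return bsearch(word, type_keywords,
	    sizeof(type_keywords) / sizeof(type_keywords[0]),
	    sizeof(type_keywords[0]), cmp_keyword) != NULL;
}
```

The binary search replaces a linear scan, at the price of having to keep
the table sorted.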

Merge the FreeBSD indent(1) tests with our ATF framework.
All tests pass.

Upgrade prepared by Manikishan Ghantasala.
Final polishing by myself.
 1.10 25-Feb-2016  ginsbach branches: 1.10.16;
Fix obvious contraction spelling mistakes by adding missing apostrophes.
 1.9 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22365, verified by myself.
 1.8 19-Jun-2003  christos PR/21645: Mishka: Localized comments don't work with indent.
 1.7 26-May-2002  wiz Remove #ifndef'd __STDC__ code. ANSIfy.
 1.6 19-Oct-1997  lukem WARNSify, fix .Nm usage, deprecate register, use <err.h>, KNFify (with indent!;)
 1.5 18-Oct-1997  mrg merge lite-2.
 1.4 09-Jan-1997  tls RCS ID police
 1.3 07-Aug-1993  cgd do block commenting, if comment begins with slash-star-newline.
 1.2 01-Aug-1993  mycroft Add RCS identifiers.
 1.1 09-Apr-1993  cgd branches: 1.1.1;
added, from net/2 (patch 124).
 1.1.1.2 04-Apr-2019  kamil FreeBSD indent r340138
 1.1.1.1 06-Jun-1993  mrg 4.4BSD-Lite2
 1.10.16.1 10-Jun-2019  christos Sync with HEAD
 1.173.2.1 02-Aug-2025  perseant Sync with HEAD
