History log of /src/usr.bin/indent/io.c
Revision  Date  Author  Comments
 1.237  04-Jan-2025  rillig indent: make debug output easier to read

The previous format had the values of the parser state on the left side
and the corresponding names on the right side. While it looked nicely
aligned, it was not suitable for focusing on the actual data. Replace
this format with the more common "key: value" format.

Use the names of the enum constants in the debug log, instead of the
previous "nice" names that needed one more level of mental translation
and in some cases contained unbalanced punctuation such as '{'.
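
As an illustration of the "key: value" format with enum constant names, here is a minimal sketch. The enum, its constants and the function name are assumptions made up for this example; they are not copied from indent's debug code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical enum for the sketch, not copied from indent's sources. */
enum line_kind {
    lk_other,
    lk_blank,
    lk_block_comment
};

/* The debug names are spelled exactly like the enum constants. */
static const char *const line_kind_name[] = {
    "lk_other",
    "lk_blank",
    "lk_block_comment"
};

/* Writes "key: value" into buf and returns the result of snprintf. */
static int
format_line_kind(char *buf, size_t size, const char *key, enum line_kind kind)
{
    assert((size_t)kind < sizeof line_kind_name / sizeof line_kind_name[0]);
    return snprintf(buf, size, "%s: %s", key, line_kind_name[kind]);
}
```

Because the logged value is the identifier itself, a reader can search the source for it directly, without translating a "nice" name back to the constant.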
 1.236  12-Dec-2024  rillig indent: add error handling for I/O errors

Suggested by lint2.
 1.235  03-Dec-2023  rillig branches: 1.235.2;
indent: inline input-related macros

No binary change.
 1.234  03-Dec-2023  rillig indent: group input-related variables into a struct

No functional change.
 1.233  03-Dec-2023  rillig indent: use line number of the token start in diagnostics

Previously, the line number of the end of the token was used, which was
confusing in debug mode.
 1.232  27-Jun-2023  rillig indent: fix 'blank line above first statement in function body'
 1.231  26-Jun-2023  rillig indent: implement 'blank line above first statement in function body'
 1.230  26-Jun-2023  rillig indent: in -bad mode, don't add a blank line above a comment or '}'
 1.229  17-Jun-2023  rillig indent: clean up

Extract duplicate code for handling line continuations.

Prevent theoretical undefined behavior in strspn, as inp.s is not
null-terminated.

Remove adding extra space characters when processing comments, as these
are not necessary to force a line of output.

No functional change.
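
The strspn problem can be avoided with a length-bounded scan. The following is a sketch under the assumption that the input is available as a (start, len) pair; the helper name span_of is hypothetical and this is not necessarily how indent solves it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Like strspn, but for a buffer of len bytes that need not be
 * null-terminated. Never reads beyond s[len - 1].
 */
static size_t
span_of(const char *s, size_t len, const char *accept)
{
    assert(accept != NULL);
    size_t i = 0;
    /* strchr also matches the terminating '\0', so exclude it explicitly. */
    while (i < len && s[i] != '\0' && strchr(accept, s[i]) != NULL)
        i++;
    return i;
}
```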
 1.228  17-Jun-2023  rillig indent: miscellaneous cleanups

No binary change.
 1.227  16-Jun-2023  rillig indent: don't force a blank line between '}' and preprocessing line
 1.226  16-Jun-2023  rillig indent: rename a field of the parser state

The previous name 'comment_in_first_line' was misleading, as it could
mean that there was a comment in the first line of the file.

No functional change.
 1.225  15-Jun-2023  rillig indent: consolidate handling of statement continuations
 1.224  15-Jun-2023  rillig indent: rename state variable to be more accurate

No binary change.
 1.223  15-Jun-2023  rillig indent: fix indentation of multi-line enum constant initializers
 1.222  15-Jun-2023  rillig indent: miscellaneous cleanups, more tests for edge cases
 1.221  14-Jun-2023  rillig indent: clean up array indexing for parser symbols

With 'top' pointing to the actual top element, the array was indexed in
the closed range from 0 to top. All other arrays are indexed by the
usual half-open interval from 0 to len.

No functional change.
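
A minimal sketch of the half-open convention, in which 'len' counts the items and the top element lives at index len - 1. The struct and function names as well as the fixed capacity are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>

enum { STACK_CAP = 8 };     /* arbitrary capacity for the sketch */

struct sym_stack {
    int item[STACK_CAP];
    size_t len;             /* valid items are item[0] .. item[len - 1] */
};

static void
stack_push(struct sym_stack *st, int sym)
{
    assert(st->len < STACK_CAP);
    st->item[st->len++] = sym;
}

static int
stack_top(const struct sym_stack *st)
{
    assert(st->len > 0);
    return st->item[st->len - 1];
}
```

With this layout, an empty stack is simply len == 0, and loops over the stack use the usual 'i < len' condition.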
 1.220  14-Jun-2023  rillig indent: allow more than 20 nested parentheses or brackets
 1.219  14-Jun-2023  rillig indent: clean up debugging code
 1.218  14-Jun-2023  rillig indent: clean up handling of comments

One less moving part in the parser state.

No functional change.
 1.217  10-Jun-2023  rillig indent: rename misleading variable

The name started with 'line_start', but the value is not always the
value from the beginning of the line.

No functional change.
 1.216  10-Jun-2023  rillig indent: miscellaneous cleanups
 1.215  10-Jun-2023  rillig indent: in debug mode, null-terminate buffers
 1.214  10-Jun-2023  rillig indent: distinguish blank lines from newline characters
 1.213  10-Jun-2023  rillig indent: fix indentation of continuation lines in initializers
 1.212  10-Jun-2023  rillig indent: extract output of an indented line to separate function
 1.211  10-Jun-2023  rillig indent: clean up function names and order in output
 1.210  10-Jun-2023  rillig indent: clean up function and variable names
 1.209  10-Jun-2023  rillig indent: rename and sort variables in parser state

No functional change.
 1.208  09-Jun-2023  rillig indent: trim trailing blank lines
 1.207  09-Jun-2023  rillig indent: when an indentation is ambiguous, indent one level further

The '-eei' mode now applies whenever the indentation from a multi-line
expression could be confused with a following statement.
 1.206  09-Jun-2023  rillig indent: format its own code
 1.205  09-Jun-2023  rillig indent: indent multi-line expressions according to parentheses

This reverts the FreeBSD change from 2004-02-12 that had been imported
on 2019-04-04.
 1.204  08-Jun-2023  rillig indent: fix indentation in multi-line else-if conditions
 1.203  08-Jun-2023  rillig indent: clean up and condense code

No functional change.
 1.202  07-Jun-2023  rillig indent: extract the stack of parser symbols to a separate struct

No functional change.
 1.201  06-Jun-2023  rillig indent: condense code for writing tabs

No functional change.
 1.200  06-Jun-2023  rillig indent: sort functions in call order

No functional change.
 1.199  06-Jun-2023  rillig indent: compute indentation of 'case' labels on-demand

One less moving part to keep track of.

No functional change.
 1.198  05-Jun-2023  rillig indent: clean up comments
 1.197  05-Jun-2023  rillig indent: don't remove blank line after 'if (expr) {'
 1.196  05-Jun-2023  rillig indent: fix formatting of 'do' statements
 1.195  05-Jun-2023  rillig indent: clean up handling of whitespace

No functional change.
 1.194  05-Jun-2023  rillig indent: let the output routines keep track of the indentation

No functional change.
 1.193  04-Jun-2023  rillig indent: remove read pointer from buffers that don't need it

The only buffer that needs a read pointer is the current input line in
'inp'.

No functional change.
 1.192  04-Jun-2023  rillig indent: force at least one space after the colon of a label
 1.191  04-Jun-2023  rillig indent: track the kind of '{' on the parser stack
 1.190  04-Jun-2023  rillig indent: ensure that the 'block init level' never goes negative

No functional change.
 1.189  04-Jun-2023  rillig indent: fix indentation of initializers in compound expressions
 1.188  04-Jun-2023  rillig indent: handle the indentation of 'case' in a simpler way
 1.187  23-May-2023  rillig indent: separate code for handling enums from the lexer

The lexer's responsibility is to generate tokens, it's not supposed to
update the parser state. Centralize the state transitions that control
indentation of enum constants to keep the lexer code clean.

Skip comments, newlines and preprocessing lines when updating the parser
state for enum constants and for '*' in declarations.
 1.186  23-May-2023  rillig indent: split debug output into paragraphs

The paragraphs separate the different processing steps: getting a token
from the lexer, processing the token, updating the parser state, sending
a finished line to the output.
 1.185  22-May-2023  rillig indent: implement suppressing optional blank lines
 1.184  20-May-2023  rillig indent: don't insert blank line between two closing lines
 1.183  20-May-2023  rillig indent: extract the output state from the parser state

The parser state depends on the preprocessing lines, the output state
shouldn't.
 1.182  20-May-2023  rillig indent: implement blank line above block comment
 1.181  20-May-2023  rillig indent: implement blank line after function body
 1.180  20-May-2023  rillig indent: ensure that no blank lines are inserted in INDENT OFF mode

No blank lines were inserted previously, but the code looked
suspicious as if that were possible.
 1.179  20-May-2023  rillig indent: implement blank lines around conditional compilation
 1.178  18-May-2023  rillig indent: manually wrap overly long lines

No functional change.
 1.177  18-May-2023  rillig indent: switch to standard code style

Taken from share/misc/indent.pro.

Indent does not wrap code to fit into the line width, it only does so
for comments. The 'INDENT OFF' sections and too long lines will be
addressed in a follow-up commit.

No functional change.
 1.176  18-May-2023  rillig indent: remove unnecessary variable size optimization

Due to the enum that follows in the struct, the short variable was
padded to 4 bytes anyway.

No functional change.
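
The padding effect can be observed with offsetof. This sketch assumes a typical ABI in which an enum has the size and alignment of int; the struct and member names are hypothetical.

```c
#include <stddef.h>

enum decl_kind { dk_none, dk_struct };  /* hypothetical enum */

struct with_short {
    short level;
    enum decl_kind kind;    /* its alignment forces padding after 'level' */
};

struct with_int {
    int level;
    enum decl_kind kind;
};

/* Number of padding bytes between 'level' and 'kind' in struct with_short. */
static size_t
padding_after_short(void)
{
    return offsetof(struct with_short, kind) - sizeof(short);
}
```

On such an ABI both structs have the same size, so the short saves no memory.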
 1.175  16-May-2023  rillig indent: directly access the input buffer

No functional change.
 1.174  16-May-2023  rillig indent: remove support for form feed characters inside a line

Form feeds are occasionally used to split code into pages, and this use
is still supported. Having a form feed in the middle of a line is
exotic.
 1.173  16-May-2023  rillig indent: fix handling of INDENT OFF/ON comments

Previously, the 'INDENT OFF' comments were interpreted when the newline
token from the line above the comment was processed, which was earlier
than could be reasonably expected.

The 'INDENT ON' comments were interpreted equally early, which led to
the situation that the 'INDENT OFF' comments were preserved literally
but the 'INDENT ON' comments weren't.
 1.172  16-May-2023  rillig indent: move parsing of 'INDENT OFF/ON' comments to the lexer

No functional change.
 1.171  15-May-2023  rillig indent: fix cast detection

In process_lparen_or_lbracket, ps.paren[...].maybe_cast was not
initialized, which may have been the cause for seemingly random spacing
around binary operators.

While here, clean up the code by reducing the number of accesses to the
parser state.
 1.170  15-May-2023  rillig indent: indent multi-line conditions

No functional change.
 1.169  15-May-2023  rillig indent: fix indentation of statements after controlling expression
 1.168  15-May-2023  rillig indent: fix indentation of expressions in -nlp -eei mode
 1.167  15-May-2023  rillig indent: remove redundant include lines
 1.166  15-May-2023  rillig indent: clean up memory and buffer management

Remove the need to explicitly initialize the buffers. To avoid
subtracting null pointers or comparing them using '<', migrate the
buffers from the (start, end) form to the (start, len) form. This form
also avoids inconsistencies in whether 'buf.e == buf.s' or 'buf.s ==
buf.e' is used.

Make buffer.st const, to avoid accidental modification of the buffer's
content.

Replace '*buf.e++ = ch' with buf_add_char, to avoid having to keep track
how much unwritten space is left in the buffer. Remove all safety
margins, that is, no more unchecked access to buf.st[-1] or appending
using '*buf.e++'.

Fix line number counting in lex_word for words that contain line breaks.

No functional change.
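
The (start, len) form with an appending function might look like the following sketch. The field names, the growth policy and the error handling are assumptions for illustration; only the name buf_add_char comes from the commit message.

```c
#include <assert.h>
#include <stdlib.h>

struct buffer {
    char *s;        /* start of the content, NULL until the first append */
    size_t len;     /* number of bytes in use */
    size_t cap;     /* number of bytes allocated */
};

/* Appends a single character, growing the buffer as needed. */
static void
buf_add_char(struct buffer *buf, char ch)
{
    assert(buf->len <= buf->cap);
    if (buf->len == buf->cap) {
        size_t new_cap = buf->cap == 0 ? 16 : 2 * buf->cap;
        char *new_s = realloc(buf->s, new_cap);
        if (new_s == NULL)
            abort();
        buf->s = new_s;
        buf->cap = new_cap;
    }
    buf->s[buf->len++] = ch;
}
```

A zero-initialized struct buffer is a valid empty buffer, so no explicit initialization is needed, and emptiness is tested with 'len == 0' instead of comparing two pointers.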
 1.165  14-May-2023  rillig indent: only null-terminate the buffers if necessary

The only case where a buffer is used as a C-style string is when looking
up a keyword.

No functional change.
 1.164  14-May-2023  rillig indent: reduce code for scanning tokens

The input line is guaranteed to end with '\n', so there's no need to
carry another pointer around.

No functional change.
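
The newline acts as a sentinel: a scan loop that only accepts certain characters stops at '\n' at the latest. A sketch, with a hypothetical function name; the length is only used to document the invariant at the entry point.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns a pointer to the first character that is neither a space nor
 * a tab. The loop needs no end pointer because the line ends with '\n'.
 */
static const char *
first_nonblank(const char *line, size_t len)
{
    assert(len > 0 && line[len - 1] == '\n');   /* the sentinel */
    const char *p = line;
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}
```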
 1.163  14-May-2023  rillig indent: remove foreign RCS IDs
 1.162  14-May-2023  rillig indent: miscellaneous cleanups
 1.161  13-May-2023  rillig indent: do not add a blank at the beginning of a line

Most calls to output_line did already reset the variable. There may be
some untested edge cases in or after comments, but these should be fine
as well.
 1.160  13-May-2023  rillig indent: implement 'blank after declarations'
 1.159  13-May-2023  rillig indent: use enum instead of magic numbers for tracking declarations

No functional change.
 1.158  13-May-2023  rillig indent: improve names of option variables

No functional change.
 1.157  13-May-2023  rillig indent: rename struct fields for buffers

No binary change except for assertion line numbers.
 1.156  13-May-2023  rillig indent: move debugging code to separate file

No functional change.
 1.155  12-May-2023  rillig indent: remove statistics

The numbers from the statistics were wrong.
 1.154  11-May-2023  rillig indent: clean up input buffer handling

No functional change.
 1.153  11-May-2023  rillig indent: don't touch comments in preprocessing lines

The indentation of multi-line comments was wrong, and the code for
handling them was too complicated.
 1.152  11-May-2023  rillig tests/indent: add more tests for preprocessing directives
 1.151  11-May-2023  rillig indent: remove unused code
 1.150  11-May-2023  rillig indent: remove broken code for handling blank lines

This fixes several bugs where blank lines were erroneously added or
removed, trading these old bugs for new bugs in different places.
These new bugs are expected to be easier to fix, as the old bugs will
not interfere anymore.
 1.149  11-May-2023  rillig indent: add debug output for tracking comments and braces
 1.148  23-Apr-2022  rillig indent: group global variables related to output control

No functional change.
 1.147  13-Feb-2022  rillig indent: consistently use nparen for indexing parser_state.paren

No binary change.
 1.146  13-Feb-2022  rillig indent: rename parser_state.p_l_follow and paren_level

The previous variable names were misleading.

Paren_level is not the current level of parentheses but the one from the
beginning of the current output line. For better accuracy, rename it to
line_start_paren_level.

P_l_follow is not the level of parentheses that will be active at some
point in the future, as the previous name suggested. Instead, it is the
level of parentheses right now. For better accuracy, rename it to
nparen. This nicely matches its main usage, which is as index to the
parser_state.paren array.

No binary change.
 1.145  13-Feb-2022  rillig indent: replace bitmasking code with struct

The struct directly represents the properties of a pair of parentheses,
without forcing the human reader to decode any bitset. This makes it
easier to find the remaining bugs in the heuristic for determining the
kind of parentheses.

No functional change outside debug mode.
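
To contrast the two representations, here is a sketch. The member maybe_cast is mentioned in a later commit message; the other member, the struct name and the function names are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Before: one bit per nesting level, decoded by shifting and masking. */
static bool
is_cast_bitmask(unsigned cast_mask, int level)
{
    assert(level >= 0 && level < 32);
    return (cast_mask & (1u << level)) != 0;
}

/* After: one record per pair of parentheses. */
struct paren_level {
    int indent;
    bool maybe_cast;
};

static bool
is_cast_struct(const struct paren_level *paren, int level)
{
    return paren[level].maybe_cast;
}
```

In a debugger or in the debug log, the struct form shows each property by name instead of a number that has to be decoded bit by bit.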
 1.144  12-Feb-2022  rillig indent: fix indentation of enum constants in typedef (since 2019-04-04)

The solution is not elegant since it adds a small state machine inside
the parser state, but at least these states only depend on the sequence
of token types and not on any other part of the parser state.

Reported in PR#55453.
 1.143  28-Nov-2021  rillig indent: clean up and document input handling

The transformation of moving comments from after an 'if (expr)' to after
the following brace has a large implementation cost (about 300 lines of
code) and makes input handling quite complicated. Document the overall
idea to save future readers some time.

No functional change.
 1.142  27-Nov-2021  rillig indent: accept a few formatting suggestions from indent

The remaining issues are still that the conditions look ambiguous even
with -eei, and that __attribute__ is broken into a separate line.

No functional change.
 1.141  27-Nov-2021  rillig indent: rename dump functions to output

No functional change.
 1.140  27-Nov-2021  rillig indent: add assertions for input handling

Just to document the invariants; the code is already OK.
 1.139  26-Nov-2021  rillig indent: enhance debug logging for input handling
 1.138  26-Nov-2021  rillig indent: replace inp_enlarge with inp_add

Previously, inbuf.inp.s was only updated at the very end of reading a
line from the input file, which meant that during debugging, it pointed
to invalid memory. Updating all fields in inbuf.inp after every
reallocation makes the code less tricky to understand.

No functional change.
 1.137  26-Nov-2021  rillig indent: split inp_read_line into smaller functions

No functional change.
 1.136  26-Nov-2021  rillig indent: extract inp_from_file from inp_read_line

No functional change.
 1.135  26-Nov-2021  rillig indent: remove code that fixes malformed preprocessor directives
 1.134  26-Nov-2021  rillig indent: move ind_add from io.c to indent.c

It's a general-purpose function that is not directly related to input or
output.
 1.133  25-Nov-2021  rillig indent: prevent undefined behavior in inp_line_start

No functional change.
 1.132  25-Nov-2021  rillig indent: update cross-reference comments for bug in comment handling

The function was renamed in io.c 1.122 from 2021-11-19.
 1.131  25-Nov-2021  rillig indent: rename ps.in_stmt to in_stmt_or_decl

The previous name didn't match reality.

No functional change.
 1.130  25-Nov-2021  rillig indent: rename ps.ind_stmt to in_stmt_cont

This makes a comment redundant.

No functional change.
 1.129  19-Nov-2021  rillig indent: reduce casts to unsigned char for character classification

No functional change.
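
The functions from <ctype.h> require an argument that is representable as unsigned char (or EOF), so a plain char must be converted first. One way to keep that cast in a single place is a small wrapper per classification. This is a sketch; the wrapper names are assumptions.

```c
#include <ctype.h>
#include <stdbool.h>

/* Takes a plain char, so callers do not need a cast. */
static bool
ch_isalnum(char ch)
{
    return isalnum((unsigned char)ch) != 0;
}

static bool
ch_isdigit(char ch)
{
    return isdigit((unsigned char)ch) != 0;
}
```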
 1.128  19-Nov-2021  rillig indent: keep inbuf.save_com_s and inbuf.save_com_e in sync

No functional change.
 1.127  19-Nov-2021  rillig indent: fix included headers
 1.126  19-Nov-2021  rillig indent: clean up io.c

No functional change.
 1.125  19-Nov-2021  rillig indent: replace ps.procname with ps.is_function_definition

Only the first character of ps.procname was ever read, and it was only
compared to '\0'. Using a bool for this means simpler code, less
memory and fewer wasted CPU cycles due to the removed strncpy.

No functional change.
 1.124  19-Nov-2021  rillig indent: unexport inbuf

No functional change.
 1.123  19-Nov-2021  rillig indent: use character input API from pr_comment.c

No functional change.
 1.122  19-Nov-2021  rillig indent: remove all references to inbuf from indent.c

No functional change.
 1.121  19-Nov-2021  rillig indent: move character input handling from indent.c to io.c

No functional change.
 1.120  19-Nov-2021  rillig indent: move character input from indent.c to io.c

No functional change.
 1.119  19-Nov-2021  rillig indent: use character input API from the tokenizer

No functional change.
 1.118  19-Nov-2021  rillig indent: move character input handling from lexi.c to io.c

No functional change.
 1.117  19-Nov-2021  rillig indent: group variables for input handling

No functional change.
 1.116  07-Nov-2021  rillig indent: rename 'inbuf' functions to 'inp'

The variable 'inp' used to be named 'inbuf'. Make the function names
correspond to the variable name again.

No functional change.
 1.115  05-Nov-2021  rillig indent: the '+ 1' in dump_line_code is not an off-by-one error
 1.114  04-Nov-2021  rillig indent: do not discard former error comments anymore

Since io.c 1.20 from 2019-10-19, indent has not placed error comments in
the code anymore. Since these comments are supposed to be cleaned up
immediately, there is no point in having code for handling them.
 1.113  04-Nov-2021  rillig indent: extract compute_code_indent_lineup into separate function

Having 9 different paths in a single function made it more complicated
to understand than necessary.

No functional change.
 1.112  04-Nov-2021  rillig indent: fix off-by-one confusion in paren_indent

The variable was called 'indent' but actually contained a 'column',
which was off by one.

No functional change.
 1.111  04-Nov-2021  rillig indent: replace column computation with indentation computation

No functional change.
 1.110  04-Nov-2021  rillig indent: group conditions in compute_code_indent by topic

No functional change.
 1.109  03-Nov-2021  rillig indent: inline indentation_after, shorten function name to ind_add

There were only few calls to indentation_after, so inlining it spares
the need to look at yet another function definition. Another effect is
that code.s and code.e appear in the code as a pair now, instead of a
single code.s, making the scope of the function call obvious.

In ind_add, there is no need to check for '\0' anymore since none of the
buffers can ever contain a null character, these are filtered out by
inbuf_read_line.

No functional change.
 1.108  30-Oct-2021  rillig indent: inline macro label_offset

No functional change.
 1.107  29-Oct-2021  rillig indent: merge isblank and is_hspace into ch_isblank

No functional change.
 1.106  29-Oct-2021  rillig indent: fix undefined behavior in buffer handling

Adding an arbitrary integer to a pointer may result in an out of bounds
pointer, so replace the addition with a pointer subtraction.

In the buffer handling functions, handle 'buf' and 'l' before 's' and
'e', since they are pairs.

In inbuf_read_line, use 's' instead of 'buf' to make the code easier to
understand for human readers.

No functional change.
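
The difference between the two forms of the bounds check, as a sketch. The names 'e' and 'l' follow the commit message (end of content, end of allocation); the function itself is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns whether n more bytes fit between e and l. */
static bool
has_room(const char *e, const char *l, size_t n)
{
    assert(e <= l);
    /*
     * Not 'e + n <= l': if n is too large, merely computing e + n is
     * undefined. The difference l - e is always well defined here.
     */
    return n <= (size_t)(l - e);
}
```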
 1.105  29-Oct-2021  rillig indent: reorder global variables to be more intuitive

The buffer 'inp' comes first. From there, a single token is read into
the buffer 'token'. From there, it usually ends up in 'code'. The buffer
'token' does not belong to the group of the other 3 buffers, which
together make up a line of formatted output.

No functional change.
 1.104  29-Oct-2021  rillig indent: rename ps.dumped_decl_indent and indent_declaration

The word 'dump' in 'ps.dumped_decl_indent' was too close to dump_line,
which led to confusion since the variable controls whether the
indentation has been added to the code buffer, which happens way before
actually dumping the current line to the output file.

The function name 'indent_declaration' was too unspecific, it did not
reveal where the indentation of the declaration actually happened.

No functional change.
 1.103  27-Oct-2021  rillig indent: fix indentation of local variable declarations

This had been broken since the import of FreeBSD indent in 2019.
 1.102  24-Oct-2021  rillig indent: clean up format of warnings and errors

Previously, warnings and errors had the form of C block comments. Before
NetBSD io.c 1.20 from 2019-10-19, this format made sense because the
diagnostics could end up in the same output stream as the formatted
output.

Since NetBSD io.c 1.20 from 2019-10-19, all diagnostics are redirected
to stderr. This change was not mentioned in the commit message back
then, it makes sense nevertheless. Since stdout and stderr now are
properly separated, there is no need anymore to keep the weird format
for warnings and errors. Switch to the standard 'error: file:line'
format.

Move the function 'diag' to indent.c to have access to the name of the
current input file.
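
A sketch of a diagnostic in the 'error: file:line' style. The exact layout after the line number, the function name and the parameters are assumptions; the text is formatted into a buffer here so that the result is easy to inspect, and a caller would then write it to stderr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Formats a diagnostic and returns the result of snprintf. */
static int
format_diag(char *buf, size_t size, bool is_error,
    const char *file, int line, const char *msg)
{
    assert(file != NULL && msg != NULL);
    return snprintf(buf, size, "%s: %s:%d: %s",
        is_error ? "error" : "warning", file, line, msg);
}
```

Since the message no longer shares a stream with the formatted code, it does not need to be wrapped in comment delimiters.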
 1.101  24-Oct-2021  rillig indent: replace global variable use_ff with function parameter
 1.100  24-Oct-2021  rillig indent: sort includes
 1.99  20-Oct-2021  rillig indent: rename blankline_requested variables

The words 'prefix' and 'postfix' sounded too much like horizontal
concepts, like in operators. The actual purpose of these variables is to
add blank lines before and after the current line, so use the same
wording as in the command line options.

No functional change.
 1.98  20-Oct-2021  rillig indent: rename next_blank_lines to blank_lines_to_output

The previous name was already an improvement over the name before that
(n_real_blanklines), but didn't express the intended purpose clearly
enough, so try another name.

No functional change.
 1.97  19-Oct-2021  rillig indent: always keep next_blank_lines >= 0

No functional change.
 1.96  19-Oct-2021  rillig indent: use simpler code for copying the input buffer

In debug mode, this reduces the amount of debug output lines.

No functional change in default mode.
 1.95  19-Oct-2021  rillig indent: if a file ends with indent off, don't add space-newline
 1.94  11-Oct-2021  rillig indent: use bool for suppress_blanklines

It only ever got assigned the values 0 and 1.

No functional change.
 1.93  11-Oct-2021  rillig indent: remove dead code
 1.92  09-Oct-2021  rillig indent: condense code for calculating indentations

No functional change.
 1.91  09-Oct-2021  rillig indent: extract common code for advancing a single tab

No functional change.
 1.90  08-Oct-2021  rillig indent: improve local variable names

No functional change.
 1.89  08-Oct-2021  rillig indent: rename fill_buffer to inbuf_read_line

No functional change.
 1.88  08-Oct-2021  rillig indent: run indent on indent.h

The formatting looks mostly OK.

Some struct members had excessively long names, leaving no space for
their corresponding comments. Renamed some of them using well-known
abbreviations.

The formatting for debug_vis_range is messed up, no idea why. It is
clearly a function declaration, not a function definition, so there is
no need to place the function name in column 1.

No functional change.
 1.87  08-Oct-2021  rillig indent: fix formatting of C99 comments

The first attempt at formatting C99 comments was conceptually wrong. It
accessed the next token in dump_line, even though that function should
only ever look at the buffers for the label, the code and the current
comment. (Understanding that part of the code was difficult at that time
due to the sheer number of global variables.) The complicated and
ever-growing condition for whether to output the token was a hack and in
retrospect doesn't make sense at all, that's why it only came close to
the intended effect.

Some unintended side effects were that the C99 comments had an
additional space in front of them, and that in some cases an empty line
followed the comment, and that the comments were not aligned.

Previously, the newline that terminates the C99 comment was included in
the comment. Separating the newline from the comment fixed all these
unintended side effects. The only downside is that the multi-line
statement is not indented, but that should be easy to fix.
 1.86  08-Oct-2021  rillig indent: split dump_line into smaller functions

No functional change.
 1.85  08-Oct-2021  rillig indent: replace column calculations with indent, part 4/4
 1.84  08-Oct-2021  rillig indent: replace column calculations with indent, part 3

No functional change.
 1.83  08-Oct-2021  rillig indent: replace column calculations with indent, part 2

No functional change.
 1.82  08-Oct-2021  rillig indent: calculate indentation instead of column

This avoids constantly adding and subtracting 1.

No functional change.
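
Why a 0-based indentation is simpler than a 1-based column shows best at tab stops. A sketch, assuming a tab width of 8; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { TABSIZE = 8 };   /* assumed tab width */

/* Returns the indentation after appending text[0 .. len - 1]. */
static int
ind_after(int ind, const char *text, size_t len)
{
    assert(ind >= 0);
    for (size_t i = 0; i < len; i++) {
        if (text[i] == '\t')
            ind = (ind / TABSIZE + 1) * TABSIZE;
        else
            ind++;
    }
    return ind;
}

/* The same tab computation with a 1-based column. */
static int
col_after_tab(int col)
{
    return ((col - 1) / TABSIZE + 1) * TABSIZE + 1;
}
```

The column variant has to subtract 1 before the division and add it back afterwards, which is the kind of adjustment the indentation form avoids.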
 1.81  08-Oct-2021  rillig indent: reduce indentation in dump_line
 1.80  07-Oct-2021  rillig indent: rename bp_save to saved_inp_s, be_save to saved_inp_e

Using the same naming convention makes it easier to relate the
variables.

No functional change.
 1.79  07-Oct-2021  rillig indent: group variables for the input buffer

The input buffer follows the same concept as the intermediate buffers
for label, code, comment and token, so use the same type for it.

No functional change.
 1.78  07-Oct-2021  rillig indent: use braces around multi-line statements

No functional change.
 1.77  07-Oct-2021  rillig indent: let the code breathe a bit by inserting empty lines

No functional change.
 1.76  07-Oct-2021  rillig indent: remove redundant comments

No functional change.
 1.75  07-Oct-2021  rillig indent: clean up colon handling

No functional change.
 1.74  07-Oct-2021  rillig indent: raise WARNS from the default 5 up to 6
 1.73  05-Oct-2021  rillig indent: rewrite parse_indent_comment for human readers

The generated code is still very similar, GCC does a good job at
inlining strncmp.

No functional change.
 1.72  05-Oct-2021  rillig indent: rename n_real_blanklines

The word 'n' was not as helpful as possible, the word 'real' did not
give any clue at all about the variable's purpose.

No functional change.
 1.71  05-Oct-2021  rillig indent: rename local char variable, reduce scope of counters

No functional change.
 1.70  05-Oct-2021  rillig indent: use proper escape sequence for form feed

This escape sequence has been available since at least 1978.
 1.69  05-Oct-2021  rillig indent: merge duplicate code into is_hspace

No functional change.
 1.68  26-Sep-2021  rillig indent: unexport global variables

The variable match_state was write-only and was thus removed.

No functional change.
 1.67  26-Sep-2021  rillig indent: let indent format its own code -- in supervised mode

After running indent on the code, I manually selected each change that
now looks better than before. The remaining changes are left for later.
All in all, indent did a pretty good job, except for syntactic additions
from after 1990, but that was to be expected. Examples for such
additions are GCC's __attribute__ and C99 designated initializers.

Indent has only few knobs to tune the indentation. The knob for the
continuation indentation applies to function declarations as well as to
expressions. The knob for indentation of local variable declarations
applies to struct members as well, even if these are members of a
top-level struct.

Several code comments crossed the right margin in column 78. Several
other code comments were correctly broken though. The cause for this
difference was not obvious.

No functional change.
 1.66  25-Sep-2021  rillig indent: misc cleanup

No functional change.
 1.65  25-Sep-2021  rillig indent: convert found_err to bool

That variable had slipped through the migration since it consistently
used int for the declaration, the definition and all assignments.

No functional change.
 1.64  25-Sep-2021  rillig indent: un-abbreviate a few parser_state members, clean up comments

No functional change.
 1.63  25-Sep-2021  rillig indent: remove dead code for printing comments after empty lines

This code has been commented out for at least 29 years.

No functional change.
 1.62  25-Sep-2021  rillig indent: convert remaining ibool to bool

No functional change intended.
 1.61  25-Sep-2021  rillig indent: convert parser_state from ibool to bool

indent.c:400:5: error: suggest parentheses around assignment used as
truth value
io.c:271:32: error: ‘~’ on a boolean expression

No functional change intended.
 1.60  25-Sep-2021  rillig indent: prepare for lint's strict bool mode

Before C99, C had no boolean type. Instead, indent used int for that,
just like many other programs. Even with C99, bool and int can be used
interchangeably in many situations, such as querying '!i' or '!ptr' or
'cond == 0'.

Since January 2021, lint provides the strict bool mode, which makes bool
a non-arithmetic type that is incompatible with any other type. Having
clearly separate types helps in understanding the code.

To migrate indent to strict bool mode, the first step is to apply all
changes that keep the resulting binary the same. Since sizeof(bool) is
1 and sizeof(int) is 4, the type ibool serves as an intermediate type.
For now it is defined to int, later it will become bool.

The current code compiles cleanly in C99 and C11 mode, as well as in
lint's strict bool mode. There are a few tricky places:

In args.c in 'struct pro', there are two types of options: boolean and
integer. Boolean options point to a bool variable, integer options
point to an int variable. To keep the current structure of the code,
the pointer has been changed to 'void *'. To ensure type safety, the
definition of the options is done via preprocessor magic, which in C11
mode ensures the correct pointer types. (Add CFLAGS+=-std=gnu11 at the
very bottom of the Makefile.)
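
One way such preprocessor magic can work is C11's _Generic, which selects an expression based on the type of its operand and fails to compile if no type matches. The following is a sketch of the idea, not a copy of args.c; the variable names, the struct layout and the macro names are assumptions.

```c
#include <stdbool.h>

/* Hypothetical option variables. */
static bool opt_blank_after_decl;
static int opt_indent_size;

struct option {
    const char *name;
    bool is_bool;
    void *var;      /* points to a bool or an int, depending on is_bool */
};

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
/* Compiles only if expr has exactly the given type. */
#define assert_type(expr, type) _Generic((expr), type: (expr))
#else
#define assert_type(expr, type) (expr)
#endif

#define bool_option(name, var) \
    { name, true, assert_type(&(var), bool *) }
#define int_option(name, var) \
    { name, false, assert_type(&(var), int *) }

static const struct option options[] = {
    bool_option("bad", opt_blank_after_decl),
    int_option("i", opt_indent_size),
};
```

In C11 mode, writing int_option("bad", opt_blank_after_decl) by mistake is rejected at compile time, even though the member itself is only a 'void *'.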

In indent.c in process_preprocessing, a boolean variable is
post-incremented. That variable is only assigned to another variable,
and that variable is only used in a boolean context. To provoke a
different behavior between the '++' and the '= true', the source code
to be indented would need 1 << 32 preprocessing directives, which is
unlikely to happen in practice.

In io.c in dump_line, the variables ps.in_stmt and ps.in_decl only ever
get the values 0 and 1. For these values, the expressions 'a & ~b' and
'a && !b' are equivalent, in all versions of C. The compiler may
generate different code for them, though.
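
The equivalence for the values 0 and 1 can be checked exhaustively, as there are only four combinations. A sketch with hypothetical function names:

```c
#include <assert.h>
#include <stdbool.h>

/* The form used while the variables were still of type int. */
static int
bitwise_form(int a, int b)
{
    assert((a == 0 || a == 1) && (b == 0 || b == 1));
    return a & ~b;
}

/* The form that is valid in lint's strict bool mode. */
static bool
logical_form(bool a, bool b)
{
    return a && !b;
}
```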

In io.c in parse_indent_comment, the assignment to inhibit_formatting
takes place in integer context. If the compiler is smart enough to
detect the possible values of on_off, it may generate the same code
before and after the change, but that is rather unlikely.

The second step of the migration will be to replace ibool with bool,
step by step, just in case there are any hidden gotchas in the code,
such as sizeof or pointer casts.

No change to the resulting binary.
 1.59  25-Sep-2021  rillig indent: remove ifdef for lint

NetBSD lint does not need them anymore, FreeBSD does not have lint.
 1.58  25-Sep-2021  rillig indent: move statistical values into a separate struct

No functional change.
 1.57  25-Sep-2021  rillig indent: add nonnull memory allocation functions

The only functional change is a single error message.
 1.56  25-Sep-2021  rillig indent: group global variables for token buffer

No functional change.
 1.55  25-Sep-2021  rillig indent: group global variables for code buffer

No functional change.
 1.54  24-Sep-2021  rillig indent: group global variables for label buffer into struct

No functional change.
 1.53  24-Sep-2021  rillig indent: extract parse_indent_comment from fill_buffer

No functional change.
 1.52  24-Sep-2021  rillig indent: group global variables for the comment buffer

No functional change.
 1.51  24-Sep-2021  rillig indent: rename local variable in fill_buffer

The local variable name 'com' prevented grouping the global variables
combuf, s_com, e_com and l_com into a struct named 'com'.

No functional change.
 1.50  24-Sep-2021  rillig indent: fix token duplication after C99 comment

The code that keeps blank lines after C99 comments still looks wrong,
but at least it's better than before.
 1.49  14-Mar-2021  rillig indent: make compute_code_indent more readable

The '?:' operator computing the factor was too hard to read. When
quickly scanning the code, the 1 in the expression looked too much like
it would be added to the indentation, which would turn the indentation
length into a column number, and that again would smell like an
off-by-one error.

No functional change.
 1.48  14-Mar-2021  rillig indent: fix lint warnings

No functional change.
 1.47  13-Mar-2021  rillig indent: add debug logging for switching the input buffer

No functional change outside debug mode.
 1.46  13-Mar-2021  rillig indent: align comments in indent's own code

No functional change.
 1.45  13-Mar-2021  rillig indent: rename local variable in dump_line

This clarifies that the variable names a column, not an indentation.
 1.44  13-Mar-2021  rillig indent: in dump_line, reduce scope of local variable

This allows the variable 'target' in the lower half of the function to
get a more specific name.

No functional change.
 1.43  13-Mar-2021  rillig indent: distinguish between 'column' and 'indentation'

column == 1 + indentation.

In addition, indentation is a relative distance while column is an
absolute position. Therefore, don't confuse these two concepts, to
prevent off-by-one errors.
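
As a small illustration of the invariant, the following sketch converts between the two concepts. The function names are hypothetical and do not appear in indent's source.

```c
/* An indentation is a distance from the left border, starting at 0. */
/* A column is an absolute position on the line, starting at 1. */

static int
column_from_indent(int indent)
{
	return 1 + indent;
}

static int
indent_from_column(int column)
{
	return column - 1;
}

/* Distances add up directly, without any '+ 1' or '- 1' corrections. */
static int
indent_after_text(int indent, int text_width)
{
	return indent + text_width;
}
```

Text that starts at indentation 8 and is 5 characters wide ends at indentation 13. The next character therefore goes into column 14.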

No functional change.
 1.42  13-Mar-2021  rillig indent: fix confusing variable names

The word 'col' should only be used for the 1-based column number. This
name is completely inappropriate for a line length since that provokes
off-by-one errors. The name 'cols' would be acceptable although
confusing since it sounds so similar to 'col'.

Therefore, rename variables that are related to the maximum line length
to 'line_length' since that makes for obvious code and nicely relates to
the description of the option in the manual page.

No functional change.
 1.41  13-Mar-2021  rillig indent: inline calls to count_spaces and count_spaces_until

These two functions operated on column numbers instead of indentation,
which required adjustments of '+ 1' and '- 1'. Their names were
completely wrong since these functions did not count anything, instead
they computed the column.

No functional change.
 1.40  13-Mar-2021  rillig indent: replace column computation with indentation computation

No functional change.
 1.39  13-Mar-2021  rillig indent: replace compute_code_column with compute_code_indent

The goal is to only ever be concerned about the _indentation_ of a
token, never the _column_ it appears in. Having only one of these
avoids off-by-one errors.

No functional change.
 1.38  13-Mar-2021  rillig indent: replace compute_label_column with compute_label_indent

Using the invariant 'column == 1 + indent'. This removes several overly
complicated '+ 1' from the code that are not needed conceptually.

No functional change.
 1.37  13-Mar-2021  rillig indent: manually fix indentation in indent's own source code
 1.36  13-Mar-2021  rillig indent: add debug logging for actually writing to the output file

Together with the results of the tokenizer and the 4 buffers for token,
label, code and comment, the debug log now provides a good high-level
view on how the indentation happens and where to look for the many
remaining bugs.
 1.35  13-Mar-2021  rillig indent: remove strange debugging code that went in the output file

Whenever the code to be output contained the magic byte 0x80, instead of
writing this byte, indent wrote the column number at the beginning of
the code snippet, times 7. Especially the 'times 7' does not make any
sense at all.

In ISO-8859-1, this character position is not assigned. In Microsoft
Codepage 1252 it is the Euro sign. In UTF-8 (which was probably not on
the author's list when the code was originally written) it occurs as the
middle byte for code points like U+2026 (horizontal ellipsis) from the
block General Punctuation.

Remove this strange code, thereby fixing indent for UTF-8 code. The
code had been there since at least 1993-04-09, when it was first
imported to NetBSD.
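
To see why the byte 0x80 occurs in UTF-8 text, the following sketch encodes a code point from the three-byte range. The function name utf8_encode_3 is made up for this illustration. The bit layout is the standard UTF-8 one.

```c
#include <stddef.h>

/*
 * Encode a code point from U+0800 to U+FFFF as three UTF-8 bytes of the
 * form 1110xxxx 10xxxxxx 10xxxxxx. Return the number of bytes written, or
 * 0 if the code point is outside that range or is a surrogate.
 */
static size_t
utf8_encode_3(unsigned long cp, unsigned char out[3])
{
	if (cp < 0x0800 || cp > 0xFFFF || (cp >= 0xD800 && cp <= 0xDFFF))
		return 0;
	out[0] = (unsigned char)(0xE0 | (cp >> 12));
	out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
	out[2] = (unsigned char)(0x80 | (cp & 0x3F));
	return 3;
}
```

For U+2026 this yields the bytes 0xE2 0x80 0xA6, so the middle byte is exactly the value that the removed code treated specially.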
 1.34  13-Mar-2021  rillig indent: replace pad_output with output_indent

Calculating the indentation is simpler than calculating the column,
since that saves the constant addition and subtraction of the 1.

No functional change.
 1.33  13-Mar-2021  rillig indent: clean up verbose documentation comments from the 1970s

Since C90, there is no need to repeat the type of the function
parameters.

In the whole code of indent, there is a lot of confusion between the
concepts of a 'column' (which is a position on the screen, counting
starts at 1) and 'indentation' (which is a length, not a position). To
avoid this confusion, the code will be rewritten anyway very soon.

Repeatedly adding and subtracting 1 from the 'current column' is not
elegant. This should rather be done by consistently measuring only the
indentation from the left border (at offset 0), as a distance, not as an
absolute position.
 1.32  12-Mar-2021  rillig indent: add 'const', rename variables, reorder formula for tab width

Column counting starts at 1. This 1 should rather be at the beginning
of the formula since it is thought of being added at the very beginning
of the line, not at the end.

When adding a tab, the newly added tab is added at the end of the
string, therefore that '+ 1' should be at the end of the formula as
well.
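
One possible reading of this ordering is sketched below. The function names and the exact shape of the formula are assumptions for illustration, not a copy of the code in io.c.

```c
/*
 * Column after appending a tab. The leading 1 accounts for column
 * counting starting at 1. The trailing '+ 1' accounts for the tab that is
 * appended at the end of the string.
 */
static int
column_after_tab(int column, int tabsize)
{
	return 1 + tabsize * ((column - 1) / tabsize + 1);
}

/* The same step expressed as an indentation, without the leading 1. */
static int
indent_after_tab(int indent, int tabsize)
{
	return tabsize * (indent / tabsize + 1);
}
```

With a tab size of 8, a tab appended at column 1 or column 8 leads to column 9, and a tab appended at column 9 leads to column 17.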

No functional change.
 1.31  12-Mar-2021  rillig indent: replace 'target' with 'indent' in function names

The word 'target' was not as specific as possible.

No functional change.
 1.30  12-Mar-2021  rillig indent: use consistent indentation for 'else'

Half of the code used -ce, the other half the opposite -nce.

No functional change.
 1.29  12-Mar-2021  rillig indent: make output_string inline

GCC 9.3.0 didn't notice that the argument to this function is always a
string literal, which makes it worthwhile to inline the call.
 1.28  12-Mar-2021  rillig indent: add helper functions for doing the actual output

This allows adding debug logging to these few functions instead of all
other places that might output something.

Reducing the possible output formats to a few primitives makes dump_line
simpler, especially the fprintf calls. It also removes the non-constant
printf string.

The call to output_int may be meant for debugging, as the character 0x80
is unlikely to appear in any real-world code.

No functional change.
 1.27  08-Mar-2021  rillig indent: remove redundant initializer in dump_line

No functional change.
 1.26  08-Mar-2021  rillig indent: move comment about dump_line to column 1

It looked misplaced on the right side since that area is usually
reserved for small remarks, not long explanations.

No functional change.
 1.25  08-Mar-2021  rillig indent: always use braces in do-while loops

Having a 'while' at the beginning of a line looks as if it would start a
loop. It's confusing when it _ends_ a loop instead.
 1.24  07-Mar-2021  rillig indent: fix handling of '//' end-of-line comments
 1.23  07-Mar-2021  rillig indent: remove redundant parentheses around return value

No functional change.
 1.22  07-Mar-2021  rillig indent: use all headers in all files

This is a prerequisite for converting the token types to an enum instead
of a preprocessor define, since the return type of lexi will become
token_type. Having the enum will make debugging easier.

There was a single naming collision, which forced the variable in
scan_profile to be renamed. All other token names are used nowhere
else.

No change to the resulting binary.
 1.21  06-Mar-2021  rillig indent: fix space-tab alignment in indent's own code

These parts are not fixed automatically by indent since they are in box
comments.

No functional change.
 1.20  19-Oct-2019  christos use stdarg, annotate function as __printflike and fix broken formats.
 1.19  04-Apr-2019  kamil Upgrade indent(1)

Merge all the changes from the recent FreeBSD HEAD snapshot
into our local copy.

FreeBSD actively maintains this program in their sources and their
repository contains over 100 commits with changes.

Keep the delta between the FreeBSD and NetBSD versions to an absolute
minimum, mostly RCS Id and compatibility fixes.

Major changes in this import:

- Added an option -ldi<N> to control indentation of local variable names.
- Added option -P for loading user-provided files as profiles
- Added -tsn for setting tabsize
- Rename -nsac/-sac ("space after cast") to -ncs/-cs
- Added option -fbs, which enables (disables) splitting the function declaration and opening brace across two lines.
- Respect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)
- Group global option variables into an options structure
- Use bsearch() for looking up type keywords.
- Don't produce unneeded space character in function declarators
- Don't unnecessarily add a blank before a comment ends.
- Don't ignore newlines after comments that follow braces.

Merge the FreeBSD indent(1) tests with our ATF framework.
All tests pass.

Upgrade prepared by Manikishan Ghantasala.
Final polishing by myself.
 1.18  03-Feb-2019  mrg - add or adjust /* FALLTHROUGH */ where appropriate
- add __unreachable() after functions that can return but won't in
this case, and thus can't be marked __dead easily
 1.17  25-Feb-2016  ginsbach branches: 1.17.16;
Fix obvious contraction spelling mistakes by adding missing apostrophes.
 1.16  22-Feb-2016  ginsbach Use errx(3).
 1.15  04-Sep-2014  mrg port the -ut / -nut options from freebsd. -ut (default) enables tabs
in output, the -nut uses spaces.
 1.14  12-Apr-2009  lukem branches: 1.14.24;
Fix WARNS=4 issues (-Wshadow -Wcast-qual -Wsign-compare)
 1.13  16-Oct-2003  itojun branches: 1.13.42;
safer use of realloc
 1.12  07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22365, verified by myself.
 1.11  26-May-2002  wiz Remove #ifndef'd __STDC__ code. ANSIfy.
 1.10  14-Oct-2000  is Due to infinite wisdom by the language designers, the difference of pointers
has a type of (int) on i386 and (long) on sparc, and I don't even want to
know what else on other cpu types.
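
In C99 and later, the difference of two pointers has the type ptrdiff_t, which printf-style functions accept via the 't' length modifier. The sketch below shows that approach. The function name format_distance is hypothetical, and this is not necessarily how the commit itself solved the problem in 2000.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format the distance between two pointers into the same array. The '%td'
 * conversion matches ptrdiff_t, whatever its underlying type is on the
 * platform. Return the value of snprintf.
 */
static int
format_distance(char *out, size_t outsize, const char *start, const char *end)
{
	ptrdiff_t distance = end - start;

	return snprintf(out, outsize, "%td", distance);
}
```

For code that must also build with pre-C99 compilers, casting the difference to long and printing it with '%ld' is a common alternative.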
 1.9  19-Dec-1998  christos branches: 1.9.2; 1.9.10;
char -> unsigned char, braces for gcc-2.8.1
 1.8  25-Aug-1998  ross Add { and } to shut up egcs. Reformat the more questionable code.
 1.7  30-Mar-1998  mrg use static int instead of static
 1.6  19-Oct-1997  mrg fix compile warnings on the alpha.
 1.5  19-Oct-1997  lukem WARNSify, fix .Nm usage, deprecate register, use <err.h>, KNFify (with indent!;)
 1.4  18-Oct-1997  mrg merge lite-2.
 1.3  09-Jan-1997  tls RCS ID police
 1.2  01-Aug-1993  mycroft Add RCS identifiers.
 1.1  09-Apr-1993  cgd branches: 1.1.1;
added, from net/2 (patch 124).
 1.1.1.2  04-Apr-2019  kamil FreeBSD indent r340138
 1.1.1.1  06-Jun-1993  mrg 4.4BSD-Lite2
 1.9.10.1  17-Oct-2000  tv Pullup 1.10 [is]:
Due to infinite wisdom by the language designers, the difference of pointers
has a type of (int) on i386 and (long) on sparc, and I don't even want to
know what else on other cpu types.
 1.9.2.1  19-Oct-2000  he Pull up revision 1.10 (requested by is):
The type of the difference between pointers is implementation-
defined (long on sparc and int on most others). Compensate when
printing.
 1.13.42.1  13-May-2009  jym Sync with HEAD.

Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
 1.14.24.1  21-Sep-2014  snj Pull up following revision(s) (requested by mrg in ticket #110):
usr.bin/indent/io.c: revision 1.15
usr.bin/indent/indent_globs.h: revision 1.10
usr.bin/indent/args.c: revision 1.11
usr.bin/indent/indent.1: revision 1.23
usr.bin/indent/indent.c: revision 1.19
port the -ut / -nut options from freebsd. -ut (default) enables tabs
in output, the -nut uses spaces.
 1.17.16.2  13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.17.16.1  10-Jun-2019  christos Sync with HEAD
 1.235.2.1  02-Aug-2025  perseant Sync with HEAD
