[//]: # ($NetBSD: README.md,v 1.2 2022/04/13 22:58:18 rillig Exp $)

# Introduction

To learn how a specific message is triggered, read the corresponding unit
test in `tests/usr.bin/xlint/lint1/msg_???.c`.

# Features

## Type checking

Lint has stricter type checking than most C compilers.
It warns about type conversions that may result in alignment problems,
see the test `msg_135.c` for examples.

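As a rough illustration of the kind of conversion meant here, the following
sketch casts a byte pointer to a pointer to a wider integer type.
The function names are hypothetical and the code is not taken from
`msg_135.c`; that test remains the reference for what lint actually reports.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical example: 'buf' is only guaranteed to be aligned for
 * 'unsigned char', so dereferencing it as 'uint32_t' may fault on
 * platforms with strict alignment requirements.
 */
uint32_t
read_u32_by_cast(const unsigned char *buf)
{
	return *(const uint32_t *)buf;
}

/* Copying the bytes avoids the pointer conversion altogether. */
uint32_t
read_u32_by_copy(const unsigned char *buf)
{
	uint32_t value;

	memcpy(&value, buf, sizeof(value));
	return value;
}
```

Only the second function is safe to call with an arbitrary offset into a
byte buffer, for example `read_u32_by_copy(bytes + 1)`.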
## Control flow analysis

Lint roughly tracks the control flow inside a single function.
It doesn't follow `goto` statements though.
See the test `msg_193.c` for examples.

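A minimal sketch of code that this kind of analysis is meant to catch is a
statement placed after an unconditional `return`.
The function below is hypothetical and not taken from `msg_193.c`.

```c
/* Hypothetical example: the last statement can never be executed. */
int
clamp_to_byte(int x)
{
	if (x > 255)
		return 255;
	return x < 0 ? 0 : x;
	x++;	/* follows an unconditional return */
}
```

The function still behaves as intended, for example
`clamp_to_byte(300)` yields 255; the trailing statement is merely dead code.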
## Error handling

Lint tries to continue parsing and checking even after seeing errors.
This part of lint is not robust though, so expect some crashes here,
as variables may not be properly initialized or may be null pointers.

# Fundamental types

Lint mainly analyzes expressions (`tnode_t`), which are formed from operators
(`op_t`) and their operands (`tnode_t`).
Each node has a type (`type_t`) and a few other properties.

## type_t

The basic types are `int`, `_Bool`, `unsigned long`, and so on.
A basic type is created by `gettyp(INT)`.
Derived types are created by `block_derive_pointer`,
`block_derive_array` and `block_derive_function`.
(See [below](#memory-management) for the meaning of the prefix `block_`.)

After a type has been created, it should not be modified anymore.
Ideally all references to types would be `const`, but that's a lot of work.
Until that is implemented, before modifying a type,
it needs to be copied using `block_dup_type` or `expr_dup_type`.

## tnode_t

When lint parses an expression,
it builds a tree of nodes representing the AST.
Each node has an operator, which defines which other members may be accessed.
The operators and their properties are defined in `ops.def`.
Some examples of operators:

| Operator | Meaning                                                 |
|----------|---------------------------------------------------------|
| CON      | compile-time constant in `tn_val`                       |
| NAME     | references the identifier in `tn_sym`                   |
| UPLUS    | the unary operator `+tn_left`                           |
| PLUS     | the binary operator `tn_left + tn_right`                |
| CALL     | a function call, typically CALL(LOAD(NAME("function"))) |
| CVT      | an implicit conversion or an explicit cast              |

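The idea that the operator decides which members of a node may be accessed
can be sketched with a toy model.
All names below (`toy_op`, `toy_node`, `toy_eval`) are hypothetical and much
simpler than the real `tnode_t`; this is not lint's actual definition.

```c
#include <assert.h>
#include <stddef.h>

/* Toy operators, loosely modeled on a few rows of the table above. */
enum toy_op { TOY_CON, TOY_UPLUS, TOY_PLUS };

/* Toy tree node: the operator decides which union member is valid. */
struct toy_node {
	enum toy_op op;
	union {
		long val;	/* valid for TOY_CON */
		struct {
			const struct toy_node *left;	/* TOY_UPLUS, TOY_PLUS */
			const struct toy_node *right;	/* TOY_PLUS only */
		} operands;
	} u;
};

/* Evaluate a toy tree by dispatching on the operator. */
long
toy_eval(const struct toy_node *tn)
{
	assert(tn != NULL);
	switch (tn->op) {
	case TOY_CON:
		return tn->u.val;
	case TOY_UPLUS:
		return +toy_eval(tn->u.operands.left);
	case TOY_PLUS:
		return toy_eval(tn->u.operands.left)
		    + toy_eval(tn->u.operands.right);
	}
	return 0;
}
```

In this model, the expression `2 + +3` corresponds to the tree
PLUS(CON(2), UPLUS(CON(3))).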
## sym_t

There is a single symbol table (`symtab`) for the whole translation unit.
This means that the same identifier may appear multiple times.
To distinguish the identifiers, each symbol has a block level.
Symbols from inner scopes are added to the beginning of the table,
so they are found first when looking for the identifier.

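The shadowing behavior described above can be sketched with a singly linked
list in which new symbols are pushed to the front.
This is a simplification under stated assumptions: the names `toy_sym`,
`toy_symtab_push` and `toy_symtab_find` are hypothetical, and the real
`symtab` may be organized differently.

```c
#include <stddef.h>
#include <string.h>

/* Toy symbol, much smaller than lint's sym_t. */
struct toy_sym {
	const char *name;
	int block_level;
	struct toy_sym *next;
};

/* Add a symbol at the beginning, so that it shadows older entries. */
void
toy_symtab_push(struct toy_sym **head, struct toy_sym *sym)
{
	sym->next = *head;
	*head = sym;
}

/* Return the first match, which comes from the innermost scope. */
struct toy_sym *
toy_symtab_find(struct toy_sym *head, const char *name)
{
	for (struct toy_sym *sym = head; sym != NULL; sym = sym->next)
		if (strcmp(sym->name, name) == 0)
			return sym;
	return NULL;
}
```

After pushing an `x` with block level 0 and then an `x` with block level 1,
a lookup of `x` returns the entry with block level 1.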
# Memory management

## Block scope

The memory that is allocated by the `block_*_alloc` functions is freed at the
end of analyzing the block, that is, after the closing `}`.
See `compound_statement_rbrace:` in `cgram.y`.

## Expression scope

The memory that is allocated by the `expr_*_alloc` functions is freed at the
end of analyzing the expression.
See `expr_free_all`.

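Both scopes follow the same pattern: allocations are collected in a pool, and
the whole pool is released at one well-defined point.
The sketch below illustrates that pattern only; the names `toy_pool`,
`toy_pool_alloc` and `toy_pool_free_all` are hypothetical, and lint's own
allocator is likely implemented differently.

```c
#include <stdlib.h>

/* One remembered allocation. */
struct toy_pool_item {
	void *mem;
	struct toy_pool_item *next;
};

/* A pool owns all memory that was allocated through it. */
struct toy_pool {
	struct toy_pool_item *items;
	size_t count;
};

/* Allocate zeroed memory that lives until toy_pool_free_all is called. */
void *
toy_pool_alloc(struct toy_pool *pool, size_t size)
{
	struct toy_pool_item *item = malloc(sizeof(*item));

	if (item == NULL)
		return NULL;
	item->mem = calloc(1, size);
	if (item->mem == NULL) {
		free(item);
		return NULL;
	}
	item->next = pool->items;
	pool->items = item;
	pool->count++;
	return item->mem;
}

/* Release everything at once, as at the end of a block or expression. */
void
toy_pool_free_all(struct toy_pool *pool)
{
	while (pool->items != NULL) {
		struct toy_pool_item *item = pool->items;

		pool->items = item->next;
		free(item->mem);
		free(item);
	}
	pool->count = 0;
}
```

The caller never frees individual objects; it only has to make sure that no
pointer into the pool is used after the pool has been emptied.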
# Null pointers

* Expressions can be null.
    * This typically happens in case of syntax errors or other errors.
* The subtype of a pointer, array or function is never null.

# Common variable names

| Name | Type      | Meaning                                              |
|------|-----------|------------------------------------------------------|
| t    | `tspec_t` | a simple type such as `INT`, `FUNC`, `PTR`           |
| tp   | `type_t`  | a complete type such as `pointer to array[3] of int` |
| stp  | `type_t`  | the subtype of a pointer, array or function          |
| tn   | `tnode_t` | a tree node, mostly used for expressions             |
| op   | `op_t`    | an operator used in an expression                    |
| ln   | `tnode_t` | the left-hand side operand of a binary operator      |
| rn   | `tnode_t` | the right-hand side operand of a binary operator     |
| sym  | `sym_t`   | a symbol from the symbol table                       |

# Abbreviations

| Abbr | Expanded |
|------|----------|
| l    | left     |
| r    | right    |
| st   | subtype  |
| op   | operator |

# Debugging

Useful breakpoints are:

| Location                      | Remarks                                              |
|-------------------------------|------------------------------------------------------|
| build_binary in tree.c        | Creates an expression for a unary or binary operator |
| initialization_expr in init.c | Checks a single initializer                          |
| expr in tree.c                | Checks a full expression                             |
| typeok in tree.c              | Checks two types for compatibility                   |
| vwarning_at in err.c          | Prints a warning                                     |
| verror_at in err.c            | Prints an error                                      |
| assert_failed in err.c        | Prints the location of a failed assertion            |

# Tests

The tests are in `tests/usr.bin/xlint`.
By default, each test is run with the lint flags `-g` for GNU mode,
`-S` for C99 mode and `-w` to report warnings as errors.

Each test can override the lint flags using comments of the following forms:

* `/* lint1-flags: -tw */` replaces the default flags.
* `/* lint1-extra-flags: -p */` adds to the default flags.

Most tests check the diagnostics that lint generates.
They do this by placing `expect` comments near the location of the diagnostic.
The comment `/* expect+1: ... */` expects a diagnostic to be generated for the
code 1 line below, `/* expect-5: ... */` expects a diagnostic to be generated
for the code 5 lines above.
Each `expect` comment must be on a single line.
There may be other code or comments on the same line.

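A test file using such a comment might look like the following sketch.
The function is hypothetical, and the message text in the `expect` comment is
an assumption about the wording of message 193; take the authoritative text
from lint's output or from `msg_193.c`.

```c
/* Hypothetical test, not taken from the lint test suite. */

int
first_nonzero(int a, int b)
{
	if (a != 0)
		return a;
	return b;
	/* expect+1: warning: statement not reached [193] */
	a = b;
}
```

The `+1` offset ties the expected diagnostic to the assignment on the
following line, so the comment can sit on its own line above the code.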
Each diagnostic has its own test `msg_???.c` that triggers the corresponding
diagnostic.
Most other tests focus on a single feature.

## Adding a new test

1. Run `make -C tests/usr.bin/xlint/lint1 add-test NAME=test_name`.
2. Run `cvs commit distrib/sets/lists/tests/mi tests/usr.bin/xlint`.