Home | History | Annotate | Download | only in make
History log of /src/usr.bin/make/hash.c
RevisionDateAuthorComments
 1.80  22-Apr-2025  rillig make: clean up

Replace 'unsigned int' with simply 'unsigned'.

In compat.c, skipping whitespace is not needed, as the loop above
already skips it.

In job.c, remove the unused header <sys/file.h>.

Inline the TMPPAT macro, as it is only needed in a single place.
 1.79  07-Jul-2024  rillig make: don't track hash table chain lengths during lookup

The chain lengths are only used for debugging purposes, so avoid the
extra cost at each lookup. Instead, calculate the maximum chain length
only when it is actually requested in -dh mode.

The reported number changes slightly: Before, it was the length of the
chain that was actually traversed to find an entry, up to that entry,
now it is the length of the largest chain in the table, no matter if it
was actually accessed or not.
 1.78  05-Jun-2024  rillig branches: 1.78.2;
make: sync comments with reality
 1.77  31-May-2024  rillig make: simplify expression in iteration over hash tables
 1.76  31-May-2024  rillig make: clean up API for iterating over hash tables
 1.75  24-May-2024  rillig make: remove dead code from HashTable_DeleteEntry
 1.74  19-Dec-2023  rillig make: clean up comments

No binary change, except for line numbers in assertions.
 1.73  17-Dec-2023  rillig make: clean up names of local variables

No binary change.
 1.72  09-Feb-2022  rillig make: fix mistakes, spelling and typos in comments and manual page

No binary change for -DNDEBUG.
 1.71  27-Jan-2022  rillig make: merge duplicate code for finding an entry in a hash table

No functional change.
 1.70  27-Jan-2022  rillig make: replace HashEntry_KeyEquals with strncmp

No functional change.
 1.69  27-Dec-2021  rillig make: replace __func__ with actual strings

Make is supposed to be C90-compatible, and __func__ is from C99.

No functional change.
 1.68  15-Dec-2021  rillig make: use consistent indentation for statements and continuations

No binary change, except for line numbers in assertions in suff.c.
 1.67  15-Dec-2021  rillig make: change return type of HashTable_Set to void

None of the callers needs the HashEntry for further manipulation.

No functional change.
 1.66  07-Dec-2021  rillig make: inline HashIter_Init

It is only used in non-critical code paths, but the generated code gets
smaller by inlining.

No functional change.
 1.65  12-Sep-2021  rillig make: fix build for DEBUG_HASH_LOOKUP
 1.64  11-Apr-2021  rillig make: avoid allocating memory for simple variable names

The main change is in ParseVarname, where a Buffer is replaced with the
newly introduced LazyBuf. LazyBuf is inspired by
https://golang.org/src/path/path.go.

In CanonicalVarname, the pre-comparison of the first letter of the
variable name is no longer necessary. GCC 9 optimizes a fixed-length
memcmp so well that the code can finally be written to target human
readers, leaving the optimization to the compiler.
 1.63  03-Apr-2021  rillig make: backport to C90

In the past few months I had accidentally used C99 features in the make
code. According to tools/README, tools that are used in the build
system should restrict themselves to C90.

This allows make to build with GCC's options "-pedantic
-Wno-system-headers -Dinline= -Wno-error=cast-qual".

I didn't notice anyone actively complaining though, I just wanted to see
how much work this backporting would be. The identifier __func__ is
still used, as in other tools.

No functional change.
 1.62  03-Apr-2021  rillig make: use C99 bool type instead of defining its own

No functional change.
 1.61  01-Feb-2021  rillig make: clean up comments in hash.c
 1.60  30-Dec-2020  rillig make(1): format multi-line comments
 1.59  15-Dec-2020  rillig make(1): clean up hash function for HashTable

Expressing a multiplication as a bit shifts and subtraction is the job
of the compiler. For humans, a multiplication is easier to read.
 1.58  23-Nov-2020  rillig make(1): use tabs for indentation in hash.h and hash.c
 1.57  14-Nov-2020  rillig make(1): replace a few HashTable_CreateEntry with HashTable_Set

Instead of HashTable_CreateEntry and HashEntry_Set, several places just
need the HashEntry for storing a value in it. This makes the calling
code simpler to understand.

These parts of the code are already hard enough to understand since they
are about memory management and aliasing. Having a too detailed API for
the HashTable only distracts from these topics.
 1.56  05-Nov-2020  rillig make(1): remove redundant parentheses from sizeof operator

The parentheses are only needed if the argument is a type, not an
expression.
 1.55  25-Oct-2020  rillig make(1): print hash in debug log with fixed width

This way all the keys are nicely aligned in the debug log.
 1.54  25-Oct-2020  rillig make(1): rename hash functions to identify the type name

This makes it easier to spot mismatches between the function name and
its first parameter, although the compiler should already catch most of
them. Except for void pointers.
 1.53  25-Oct-2020  rillig make(1): clean up comments in hash.c
 1.52  25-Oct-2020  rillig make(1): clean up hash table functions
 1.51  25-Oct-2020  rillig make(1): refactor Hash_DeleteTable
 1.50  25-Oct-2020  rillig make(1): refactor Hash_InitTable

First prepare all the data, then initialize the fields in declaration
order.
 1.49  25-Oct-2020  rillig make(1): clean up RebuildTable for hash tables

The previous code used ++ and -- a lot, it also reused variables a lot
for different purposes.
 1.48  25-Oct-2020  rillig make(1): reduce amount of string hashing

In pkgsrc, running "bmake show-all" in pkgtools/pkglint called the hash
function 249130 times before, and only 115502 times after.

Still, a single call to Var_Set hashes the same string 3 times.
 1.47  18-Oct-2020  rillig make(1): rename HashEntry.name to key
 1.46  18-Oct-2020  rillig make(1): remove underscore from Hash_Table and Hash_Entry

For consistency with the other type names, such as GNodeListNode.
 1.45  18-Oct-2020  rillig make(1): make API for iterating over hash tables simpler
 1.44  05-Oct-2020  rillig make(1): make dir.c, for.c and hash.c ready for WARNS=6

Some types have changed from int to unsigned int, size_t or time_t.

The variable i in hash.c has been kept as int since it counts down to
-1, which generates efficient machine code, at least on x86_64.
 1.43  05-Oct-2020  rillig make(1): revert previous commit

It had accidentally reverted all the work from the past few days.
 1.42  05-Oct-2020  rillig make(1): fix double-free bug in -DCLEANUP mode (since 2020-10-02)

The bug had been introduced with dir.c 1.155 on 2020-10-02 22:20:25. In
that commit, openDirectories was replaced with a combination of a list
with a hash table, for more efficient lookup by name.

Upon cleanup, OpenDirs_Done is called, which in turn called
Dir_ClearPath. Dir_ClearPath takes full ownership of the given list and
empties it. This was no problem before since afterwards the list was
empty and calling Lst_Free just frees the remaining list pointer.

With OpenDirs, this list was combined with a hash table, and the hash
table contains the list nodes, assuming that the OpenDirs functions have
full ownership of both the list and the hash table. This assumption was
generally correct, except for the one moment during cleanup where full
ownership of the list was passed to Dir_ClearPath, while the hash table
still contained pointers to the (now freed) list nodes. This by itself
was not a problem since the hash table would be freed afterwards. But
as part of Dir_ClearPath, OpenDirs_Remove was called, which looked up
the freed directory by name and now found the freed list node, trying to
free it again. Boom.

Fixed by replacing the call to Dir_ClearPath with code that only frees
the directories, without giving up control over the list.
 1.41  04-Oct-2020  rillig make(1): merge duplicate code in Hash_FindEntry and Hash_CreateEntry
 1.40  04-Oct-2020  rillig make(1): avoid forward declaration for RebuildTable
 1.39  04-Oct-2020  rillig make(1): remove dead code from Hash_FindEntry

All callers pass a properly initialized table.
 1.38  03-Oct-2020  rillig make(1): replace strcpy with memcpy in Hash_CreateEntry

The string length is already known, no need to recalculate it.
 1.37  03-Oct-2020  rillig make(1): remove unnecessary code from Hash_DeleteEntry

This function is only called 2 times, and both callers pass valid
arguments.
 1.36  03-Oct-2020  rillig make(1): clean up hash table implementation

Having the hash function potentially redefined is a bit too much
flexibility. If actually needed, this should be done using a patch, not
using the C preprocessor.

Converting the macro to a function made the control flow easier to
understand. It also revealed that the variable p was unnecessary in
both Hash_FindEntry and Hash_CreateEntry.
 1.35  28-Sep-2020  rillig make(1): make debugging code shorter
 1.34  27-Sep-2020  rillig make(1): normalize whitespace in source code

There is no more space tab. Either only tabs or only spaces or tabs
followed by spaces, but not spaces followed by tabs.
 1.33  26-Sep-2020  rillig make(1): add Hash_FindValue, for direct access to hash table data
 1.32  13-Sep-2020  rillig make(1): clean up RCSID blocks

These blocks mostly consisted of redundant structure, following the same
#ifndef pattern over and over, with only minimal variation.

It's easier to maintain if the common structure is only written once and
encapsulated in a macro.

To avoid "defined but unused" warnings from GCC in the case where
MAKE_NATIVE is not defined, I had to add volatile. Adding
MAKE_ATTR_UNUSED alone would not preserve the rcsid variable in the
resulting binary.
 1.31  05-Sep-2020  rillig make(1): remove initial size argument from Hash_InitTable

In all but one case this argument was set to auto-detect anyway. The
one case where it was set was not worth keeping this complicated API.
 1.30  05-Sep-2020  rillig make(1): make Hash_Table independent from -funsigned-char

This only makes a difference for Hash_Table keys outside the ASCII
character set, and these are barely used in practice, if at all.

The effects of this change can only be seen in debug mode, when printing
the full contents of the variable namespaces. In this output, the order
of the entries might change.

All other use cases stay the same as before.
 1.29  01-Sep-2020  rillig make(1): rename Hash_Table fields

Back in the 1980s it made sense to have the type information encoded in
the variable names. At the time when make was imported into the NetBSD
tree (1993-03-21), the functions did indeed not have prototypes, they
only had return types. The void type was already invented at that time.
Since the compiler could not verify the types of function parameters, it
made perfect sense to have each variable tell whether it was a pointer
or not.

Since ISO C90 this is no longer necessary since the compiler checks
this. The variable names can now focus on the application level and
their high-level meaning, expressing the relationship to other
variables instead of encoding redundant type information.
 1.28  28-Aug-2020  rillig make(1): remove redundant comments from hash.c
 1.27  26-Aug-2020  rillig make(1): remove header sprite.h

Make is independent of the Sprite operating system.
 1.26  01-Aug-2020  rillig make(1): use consistent indentation in source code

Tabs for multiples of 8, then spaces.

The usage string has been kept as-is since the spaces there are
indentional and do influence the output.
 1.25  20-Jul-2020  sjg Make DEBUG_HASH less of a fire-hose.

Reporting keys on every lookup is overkill unless
playing with a new HASH, so wrap in #ifdef DEBUG_HASH_LOOKUP
Also add some stats at the end so we can see
final size and max chain length - maxchain is a better
variable name than maxlen.
 1.24  19-Jul-2020  riastradh Nix trailing whitespace.
 1.23  18-Jul-2020  sjg Add -dh for DEBUG_HASH

Allow tracking of max chain length, to see how well the hash
tables are working.
Pull the actual hash operation into a marco so it can be
easily changed - for experimenting.

The current hash, is pretty good.

Reviewed by: christos
 1.22  03-Jul-2020  rillig make(1): add Hash_ForEach to avoid duplicate code
 1.21  03-Jul-2020  rillig make(1): remove redundant parentheses around return values
 1.20  14-Nov-2013  sjg Don't SEGV when Hash_Table is uninitialized
 1.19  24-Jan-2009  dsl branches: 1.19.8; 1.19.14;
Don't cast 'time_t' to 'void *' and back it will lose precision.
 1.18  13-Dec-2008  dsl Use NULL instead of -1 cast to the relavant type (usually via NIL).
This was a suggestion from christos - so blame him if there is a deep
reason for using -1 :-)
 1.17  06-Oct-2008  joerg Don't use emalloc and friends directly, but call them consistently
bmake_malloc and friends. Implement them via macros for the native case
and provide fallback implementations otherwise. Avoid polluting the
namespace by not defining enomem globally. Don't bother to provide
strdup and strndup, they were only used for the estrdup and estrndup
comapt code.

This addresses the presence of emalloc in system libraries on A/UX and
resulted strange issues as reported by Timothy E. Larson.
 1.16  04-Aug-2005  christos remove unnecessary casts to void * functions (Max Okumoto)
 1.15  25-Jul-2005  christos Whitespace KNF cleanup from Max Okumoto
 1.14  16-Feb-2005  christos PR/29203, PR/29204: Max Okumoto: KNF changes to make [no functional changes]
 1.13  07-May-2004  ross Simplify build, no functional changes.

Instead of adding MAKE_BOOTSTRAP for hosted environments, i.e., when
you want things simple, instead add MAKE_NATIVE to get those hugely
important features like __RCSID().

It's now possible to build make on some hosts with: cc *.c */*.c
 1.12  07-Aug-2003  agc branches: 1.12.2;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22365, verified by myself.
 1.11  14-Jul-2003  christos Pass WARNS=3
 1.10  15-Jun-2002  wiz Remove !__STDC__ stuff, de-__P(), ANSIfy, and de-register.
 1.9  11-Jun-2000  mycroft Use a lower threshold for rebuilding hash tables.
 1.8  28-Sep-1997  lukem branches: 1.8.10;
wrap #include <sys/cdefs.h>, __RCSID(...) stuff in #ifndef MAKE_BOOTSTRAP
 1.7  01-Jul-1997  christos Add WARNS=1
RCSID police
 1.6  06-Nov-1996  christos - Merge in FreeBSD and Lite2 changes.
- Fix bug where a non-archive target with a .a suffix would always
be considered to be out of date, since it does not have a TOC.
 1.5  14-Jun-1995  christos branches: 1.5.6;
- $NetBSD$ rcsids
- Fixed so that .[A-Z]* targets that do not match keywords are ignored as
Posix mandates
- Added .PHONY target keyword
 1.4  05-Mar-1994  cgd fixes/improvements from Christos Zoulas <christos@deshaw.com>.
 1.3  13-Jan-1994  jtc Include appropriate header files to bring prototypes into scope.
 1.2  01-Aug-1993  mycroft Add RCS identifiers.
 1.1  21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.2  28-Dec-1996  tls Import 4.4BSD-Lite2 sources onto CSRG branch (already merged at head)
 1.1.1.1  21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.5.6.1  26-Jan-1997  rat Update make(1) from trunk, by request from Christos Zoulas. Fixes many bugs.
 1.8.10.1  23-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.12.2.1  10-May-2004  tron Pull up revision 1.13 (requested by sjg in ticket #282):
Simplify build, no functional changes.
Instead of adding MAKE_BOOTSTRAP for hosted environments, i.e., when
you want things simple, instead add MAKE_NATIVE to get those hugely
important features like __RCSID().
It's now possible to build make on some hosts with: cc *.c */*.c
 1.19.14.1  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.19.8.1  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was splitted into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.78.2.1  02-Aug-2025  perseant Sync with HEAD

RSS XML Feed