/src/external/gpl3/gcc.old/dist/libcody
Name            Date         Size
buffer.cc       07-Sep-2025  8.1K
client.cc       07-Sep-2025  7.6K
cmake/          Today
CMakeLists.txt  07-Sep-2025  3.3K
CODING.md       07-Sep-2025  2.9K
cody.hh         07-Sep-2025  21.5K
config.h.in     07-Sep-2025  666
config.m4       07-Sep-2025  2.8K
configure       07-Sep-2025  117.4K
configure.ac    07-Sep-2025  1.9K
CONTRIB.md      07-Sep-2025  164
fatal.cc        07-Sep-2025  1,022
internal.hh     07-Sep-2025  3.1K
LICENSE         07-Sep-2025  11.1K
Makefile.in     07-Sep-2025  3.2K
netclient.cc    07-Sep-2025  2.6K
netserver.cc    07-Sep-2025  3K
packet.cc       07-Sep-2025  805
README.md       07-Sep-2025  18.1K
resolver.cc     07-Sep-2025  4.4K
server.cc       07-Sep-2025  6.9K

README.md

      1 # libCODY: COmpiler DYnamism<sup><a href="#1">1</a></sup>
      2 
      3 Copyright (C) 2020 Nathan Sidwell, nathan@acm.org
      4 
      5 libCODY is an implementation of a communication protocol between
      6 compilers and build systems.
      7 
      8 **WARNING:**  This is preliminary software.
      9 
     10 In addition to supporting C++ modules, this may also support LTO
     11 requirements and could also deal with generated #include files
     12 and feed the compiler with prepruned include paths and whatnot.  (The
     13 system calls involved in include searches can be quite expensive on
     14 some build infrastructures.)
     15 
     16 * Client and Server objects
     17 * Direct connection for in-process use
     18 * Testing with Joust (that means nothing to you, doesn't it!)
     19 
     20 
     21 ## Problem Being Solved
     22 
     23 The origin is in C++20 modules:
     24 ```
     25 import foo;
     26 ```
     27 
     28 At that import, the compiler needs<sup><a href="#2">2</a></sup> to
     29 load up the compiled serialization of module `foo`.  Where is that
     30 file?  Does it even exist?  Unless the build system already knows the
     31 dependency graph, this might be a completely unknown module.  Now, the
     32 build system knows how to build things, but it might not have complete
     33 information about the dependencies.  The ultimate source of
     34 dependencies is the source code being compiled, and specifying the
     35 same thing in multiple places is a recipe for build skew.
     36 
     37 Hence, a protocol by which a compiler can query a build system.  This
     38 was originally described in <a
     39 href="https://wg21.link/p1184r1">p1184r1:A Module Mapper</a>, along
     40 with a proof-of-concept hack in GNU Make, described in <a
     41 href="https://wg21.link/p1602">p1602:Make Me A Module</a>.  The current
     42 implementation has evolved and an update to p1184 will be forthcoming.
     43 
     44 ## Packet Encoding
     45 
     46 The protocol is turn-based.  The compiler sends a block of one or more
     47 requests to the builder, then waits for a block of responses to all of
     48 those requests.  If the builder needs to compile something to satisfy
     49 a request, there may be some time before the response.  A builder may
     50 service multiple compilers concurrently, each as a separate connection.
     51 
     52 When multiple requests are in a block, the responses are also in a
     53 block, and in corresponding order.  The responses must not be
     54 commenced eagerly -- they must wait until the incoming block has ended
     55 (as mentioned above, it is turn-based).  To do otherwise risks
     56 deadlock, as there is no requirement for a sending end of the
     57 communication to listen for incoming responses (or new requests) until
     58 it has completed sending its current block.
     59 
     60 Every request has a response.
     61 
     62 Requests and responses are user-readable text.  It is not intended as
     63 a transmission medium to send large binary objects (such as compiled
     64 modules).  It is presumed the builder and the compiler share a file
     65 system, for that kind of thing.<sup><a href="#3">3</a></sup>
     66 
     67 Message characters are encoded in UTF-8.
     68 
     69 Messages are a sequence of octets ending with a NEWLINE (0xa).  The lines
     70 consist of a sequence of words, separated by WHITESPACE (0x20 or 0x9).
     71 Words themselves do not contain WHITESPACE.  Lines consisting solely
     72 of WHITESPACE (or empty) are ignored.
     73 
     74 To encode a block of multiple messages, non-final messages end with a
     75 single word of SEMICOLON (0x3b), immediately before the NEWLINE.  Thus
     76 a serial connection can determine whether a block is complete without
     77 decoding the messages.
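
As a rough illustration of the continuation rule, here is a C++11 sketch that joins messages into a block and checks whether a received line ends one.  `FrameBlock` and `EndsBlock` are hypothetical names, not part of libcody, and the messages are assumed to be encoded already.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of libcody: join already-encoded
// messages into one block.  Every non-final message gets a trailing
// `;` word immediately before its NEWLINE.
std::string FrameBlock (std::vector<std::string> const &messages)
{
  assert (!messages.empty ());
  std::string block;
  for (std::size_t ix = 0; ix != messages.size (); ix++)
    {
      block += messages[ix];
      if (ix + 1 != messages.size ())
        block += " ;";
      block += '\n';
    }
  return block;
}

// Hypothetical helper: does LINE (NEWLINE already stripped) end a
// block?  Only the final word is inspected, nothing is decoded.
bool EndsBlock (std::string const &line)
{
  std::size_t last = line.find_last_not_of (" \t");
  if (last == std::string::npos)
    return false;  // Blank lines are ignored, so they end nothing.
  if (line[last] != ';')
    return true;
  // A continuation marker is a `;` that forms a word of its own.
  bool own_word = (last == 0 || line[last - 1] == ' '
                   || line[last - 1] == '\t');
  return !own_word;
}
```

A reader would keep collecting lines until `EndsBlock` returns true, and only then start responding.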
     78 
     79 Words containing only characters in the set [-+_/%.A-Za-z0-9] need not
     80 be quoted.  Words containing characters outside that set should be
     81 quoted.  A zero-length word may be achieved with `''`.
     82 
     83 Quoted words begin and end with APOSTROPHE (0x27).  Within the quoted
     84 word, BACKSLASH (0x5c) is used as an escape mechanism, with the
     85 following meanings:
     86 
     87 * \\n - NEWLINE (0xa)
     88 * \\t - TAB (0x9)
     89 * \\' - APOSTROPHE (')
     90 * \\\\ - BACKSLASH (\\)
     91 
     92 Characters in the range [0x00, 0x20) and 0x7f are encoded as `\`
     93 followed by one or two lowercase hex characters.  Octets in the range
     94 [0x80,0xff) are UTF-8 encodings of Unicode characters outside the
     95 traditional ASCII set and are passed as such.
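
The quoting rules can be sketched as a small C++11 function.  `EncodeWord` is a hypothetical name, not the library's own encoder.  It assumes control characters are written as a backslash followed by hex digits (matching the decoding rules), and it always emits two digits so that a following hex-like character cannot be misread.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of libcody: encode a single word.
std::string EncodeWord (std::string const &word)
{
  static char const safe[] = "-+_/%.ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                             "abcdefghijklmnopqrstuvwxyz0123456789";
  if (!word.empty () && word.find_first_not_of (safe) == std::string::npos)
    return word;  // Only safe characters, no quoting needed.

  static char const hex[] = "0123456789abcdef";
  std::string result = "'";
  for (unsigned char c : word)
    switch (c)
      {
      case '\n': result += "\\n"; break;
      case '\t': result += "\\t"; break;
      case '\'': result += "\\'"; break;
      case '\\': result += "\\\\"; break;
      default:
        if (c < 0x20 || c == 0x7f)
          {
            // Assumption: two lowercase hex digits after a backslash.
            result += '\\';
            result += hex[c >> 4];
            result += hex[c & 0xf];
          }
        else
          result += char (c);  // Includes UTF-8 octets of 0x80 and up.
      }
  result += '\'';
  return result;
}
```

For instance, `EncodeWord ("FOO bar")` gives `'FOO bar'`, and an empty string gives `''`.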
     96 
     97 Decoding should be more relaxed.  Unquoted words containing characters
     98 in the range [0x20,0xff] other than BACKSLASH or APOSTROPHE should be
     99 accepted.  In a quoted sequence, `\` followed by one or two lowercase
    100 hex characters decodes to that octet.  Further, words can be
    101 constructed from a mixture of abutted quoted and unquoted sequences.
    102 For instance `FOO' 'bar` would decode to the word `FOO bar`.
    103 
    104 Notice that the block continuation marker of `;` is not a valid
    105 encoding of the word `;`, which would be `';'`.
    106 
    107 It is recommended that words are separated by single SPACE characters.
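
A relaxed decoder along these lines might look like the following C++11 sketch.  `DecodeLine` and `HexValue` are hypothetical names, not libcody functions.  The sketch does not treat a trailing bare `;` specially; a caller would check for the continuation marker before decoding.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Value of a lowercase hex digit, or -1 if C is not one.
static int HexValue (char c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  return -1;
}

// Hypothetical helper, not part of libcody: split LINE (NEWLINE
// already stripped) into decoded words.  Returns false on bad input.
bool DecodeLine (std::string const &line, std::vector<std::string> &words)
{
  std::size_t pos = 0, len = line.size ();
  for (;;)
    {
      while (pos != len && (line[pos] == ' ' || line[pos] == '\t'))
        pos++;
      if (pos == len)
        return true;
      std::string word;
      while (pos != len && line[pos] != ' ' && line[pos] != '\t')
        {
          char c = line[pos++];
          if (c == '\\')
            return false;  // Escapes are only valid inside quotes.
          if (c != '\'')
            {
              word += c;  // Part of an unquoted sequence.
              continue;
            }
          for (;;)  // Quoted sequence, up to the closing APOSTROPHE.
            {
              if (pos == len)
                return false;  // Unterminated quote.
              c = line[pos++];
              if (c == '\'')
                break;
              if (c != '\\')
                {
                  word += c;
                  continue;
                }
              if (pos == len)
                return false;
              c = line[pos++];
              if (c == 'n')
                word += '\n';
              else if (c == 't')
                word += '\t';
              else if (c == '\'' || c == '\\')
                word += c;
              else
                {
                  int value = HexValue (c);  // One or two hex digits.
                  if (value < 0)
                    return false;
                  int low = pos != len ? HexValue (line[pos]) : -1;
                  if (low >= 0)
                    {
                      value = value * 16 + low;
                      pos++;
                    }
                  word += char (value);
                }
            }
        }
      words.push_back (word);
    }
}
```

Because quoted and unquoted sequences simply append to the same word, the abutted form `FOO' 'bar` falls out naturally.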
    108 
    109 ## Messages
    110 
    111 The message descriptions use `$metavariable` examples.
    112 
    113 The request messages are specific to a particular action.  The response
    114 messages are more generic, describing their value types, but not their
    115 meaning.  Message consumers need to know the request to decode them.
    116 Notice the `Packet::GetRequest()` method records in response packets
    117 what the request being responded to was.  Do not confuse this with the
    118 `Packet::GetCode ()` method.
    119 
    120 ### Responses
    121 
    122 The simplest response is a single:
    123 
    124 `OK`
    125 
    126 This indicates the request was successful.
    127 
    128 
    129 An error response is:
    130 
    131 `ERROR $message`
    132 
    133 The message is a human-readable string.  It indicates failure of the request.
    134 
    135 Pathnames are encoded with:
    136 
    137 `PATHNAME $pathname`
    138 
    139 Boolean responses use:
    140 
    141 `BOOL `(`TRUE`|`FALSE`)
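
To make the four generic responses concrete, here is a C++11 sketch that classifies an already-decoded response line.  The `Kind` and `Response` types and `ClassifyResponse` are hypothetical and far simpler than the library's `Packet` class.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical types, not libcody's Packet class.
enum class Kind { Ok, Error, Pathname, Bool, Malformed };

struct Response
{
  Kind kind;
  std::string text;  // ERROR message or PATHNAME pathname.
  bool flag;         // BOOL value.
};

// Classify the decoded words of one response message.
Response ClassifyResponse (std::vector<std::string> const &words)
{
  Response r {Kind::Malformed, "", false};
  if (words.size () == 1 && words[0] == "OK")
    r.kind = Kind::Ok;
  else if (words.size () == 2 && words[0] == "ERROR")
    {
      r.kind = Kind::Error;
      r.text = words[1];
    }
  else if (words.size () == 2 && words[0] == "PATHNAME")
    {
      r.kind = Kind::Pathname;
      r.text = words[1];
    }
  else if (words.size () == 2 && words[0] == "BOOL"
           && (words[1] == "TRUE" || words[1] == "FALSE"))
    {
      r.kind = Kind::Bool;
      r.flag = words[1] == "TRUE";
    }
  return r;
}
```

Note that the result says nothing about what a pathname or boolean means; that depends on the request it answers.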
    142 
    143 ### Handshake Request
    144 
    145 The first message is a handshake:
    146 
    147 `HELLO $version $compiler $ident`
    148 
    149 The `$version` is a numeric value, currently `1`.  `$compiler` identifies
    150 the compiler &mdash; builders may need to keep compiled modules from
    151 different compilers separate.  `$ident` is an identifier the builder
    152 might use to identify the compilation it is communicating with.
    153 
    154 Responses are:
    155 
    156 `HELLO $version $builder [$flags]`
    157 
    158 A successful handshake.  The communication is now connected and other
    159 messages may be exchanged.  An ERROR response indicates an unsuccessful
    160 handshake.  The communication remains unconnected.
    161 
    162 There is nothing restricting a handshake to its own message block.  Of
    163 course, if the handshake fails, subsequent non-handshake messages in
    164 the block will fail (producing error responses).
    165 
    166 The `$flags` word, if present, allows a server to control what requests
    167 might be given.  See below.
    168 
    169 ### C++ Module Requests
    170 
    171 A set of requests are specific to C++ modules:
    172 
    173 #### Flags
    174 
    175 Several requests and one response have an optional `$flags` word.
    176 These are the `Cody::Flags` value pertaining to that request.  If
    177 omitted the value 0 is implied.  The following flags are available:
    178 
    179 * `0`, `None`: No flags.
    180 
    181 * `1<<0`, `NameOnly`: The request is for the name only, and not the
    182   CMI contents.
    183 
    184 The `NameOnly` flag may be provided in a handshake response, and
    185 indicates that the server is interested in requests only for their
    186 implied dependency information.  It may be provided on a request to
    187 indicate that only the CMI name is required, not its contents (for
    188 instance, when preprocessing).  Note that a compiler may still make
    189 `NameOnly` requests even if the server did not ask for such.
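
A builder-side sketch of the optional flags word, in C++11.  The enum here only mirrors the two values listed above and is not the library's `Cody::Flags` definition.  `RequestFlags` and `IsNameOnly` are hypothetical names, and the sketch assumes the flags word is a decimal integer in the third position of the decoded request.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Local stand-in for the documented flag values.
enum Flags : unsigned { None = 0, NameOnly = 1u << 0 };

// Hypothetical helper: flags of a decoded request such as
// {"MODULE-IMPORT", "foo", "1"}.  An omitted word implies 0.
// Assumption: the word is a decimal integer (std::stoul throws
// if it is not a number).
unsigned RequestFlags (std::vector<std::string> const &words)
{
  if (words.size () < 3)
    return None;
  return unsigned (std::stoul (words[2]));
}

// Is only the CMI name wanted, not its contents?
bool IsNameOnly (unsigned flags)
{
  return (flags & NameOnly) != 0;
}
```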
    190 
    191 #### Repository
    192 
    193 All relative CMI file names are relative to a repository.  (There are
    194 usually no absolute CMI files).  The repository may be determined
    195 with:
    196 
    197 `MODULE-REPO`
    198 
    199 A PATHNAME response is expected.  The `$pathname` may be an empty
    200 word, which is equivalent to `.`.  When the response is a relative
    201 pathname, it must be relative to the client's current working
    202 directory (which might be a process on a different host to the
    203 server).  You may set the repository to `/`, if you wish to use paths
    204 relative to the root directory.
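
On the client side, combining the repository with a CMI name could be sketched as below (C++11, POSIX-style `/` separators only).  `CmiPath` is a hypothetical name, and the `.gcm` file names in the usage note are merely illustrative.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of libcody: the file to open for
// CMI, given the MODULE-REPO response REPO.
std::string CmiPath (std::string repo, std::string const &cmi)
{
  assert (!cmi.empty ());
  if (cmi[0] == '/')
    return cmi;  // Absolute CMI names do not use the repository.
  if (repo.empty ())
    repo = ".";  // An empty repository word is equivalent to `.`.
  if (repo.back () != '/')
    repo += '/';
  return repo + cmi;
}
```

So `CmiPath ("", "foo.gcm")` yields `./foo.gcm`, and `CmiPath ("/", "foo.gcm")` yields `/foo.gcm`.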
    205 
    206 #### Exporting
    207 
    208 A compilation of a module interface, partition or header unit can
    209 inform the builder with:
    210 
    211 `MODULE-EXPORT $module [$flags]`
    212 
    213 This will result in a PATHNAME response naming the Compiled Module
    214 Interface pathname to write.
    215 
    216 The `MODULE-EXPORT` request does not indicate the module has been
    217 successfully compiled.  At most one `MODULE-EXPORT` is to be made, and
    218 as the connection is for a single compilation, the builder may infer
    219 dependency relationships between the module being generated and import
    220 requests made.
    221 
    222 Named module names and header unit names are distinguished by making
    223 the latter unambiguously look like file names.  Firstly, they must be
    224 fully resolved according to the compiler's usual include path.  If
    225 that results in an absolute file name (beginning with `/`, or
    226 certain other OS-specific sequences), all is well.  Otherwise a
    227 relative file name must be prefixed by `./` to be distinguished from a
    228 similarly named named module.  This prefixing must occur, even if the
    229 header-unit's name contains characters that cannot appear in a named
    230 module's name.
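
The naming rule for header units can be sketched in C++11 as follows.  `HeaderUnitName` and `IsHeaderUnitName` are hypothetical names.  The sketch only handles the POSIX `/` form of absolute names, and it assumes a resolved name that already starts with `./` needs no second prefix.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of libcody: turn a fully resolved
// header file name into an unambiguous header-unit name.
std::string HeaderUnitName (std::string const &resolved)
{
  assert (!resolved.empty ());
  if (resolved[0] == '/')
    return resolved;  // Absolute names are already unambiguous.
  if (resolved.compare (0, 2, "./") == 0)
    return resolved;  // Assumption: already prefixed, leave alone.
  return "./" + resolved;
}

// Hypothetical helper: does NAME look like a header unit rather
// than a named module?
bool IsHeaderUnitName (std::string const &name)
{
  return !name.empty ()
         && (name[0] == '/' || name.compare (0, 2, "./") == 0);
}
```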
    231 
    232 It is expected that absolute header-unit names convert to relative CMI
    233 names, to keep all CMIs within the CMI repository.  This means that
    234 steps must be taken to distinguish the CMIs for `/here` from `./here`,
    235 and this can be achieved by replacing the leading `./` directory with
    236 `,/`, which is visually similar but does not have the self-reference
    237 semantics of dot.  Likewise, header-unit names containing `..`
    238 directories can be remapped to `,,`.  (When symlinks are involved
    239 `bob/dob/..` might not be `bob`, of course.)  C++ header-unit
    240 semantics are such that there is no need to resolve multiple ways of
    241 spelling a particular header-unit to a unique CMI file.
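
One possible mapping from header-unit names to relative CMI names, as a C++11 sketch.  `HeaderCmiName` is a hypothetical name.  Dropping the leading `/` of an absolute name is an assumption of this sketch (the text above only requires the result to be relative), and no CMI suffix is appended.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of libcody: map a header-unit name
// to a relative CMI name.
std::string HeaderCmiName (std::string const &name)
{
  std::string result;
  std::size_t pos = 0;
  if (name.compare (0, 2, "./") == 0)
    {
      result = ",/";  // Keeps `./here` distinct from `/here`.
      pos = 2;
    }
  else if (!name.empty () && name[0] == '/')
    pos = 1;  // Assumption: absolute names just lose the leading `/`.

  while (pos < name.size ())
    {
      std::size_t slash = name.find ('/', pos);
      if (slash == std::string::npos)
        slash = name.size ();
      std::string dir = name.substr (pos, slash - pos);
      result += dir == ".." ? ",," : dir;  // Remap `..` directories.
      if (slash != name.size ())
        result += '/';
      pos = slash + 1;
    }
  return result;
}
```

With this mapping `/here` becomes `here`, `./here` becomes `,/here`, and `./a/../b.h` becomes `,/a/,,/b.h`.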
    242 
    243 Successful compilation of an interface is indicated with a subsequent:
    244 
    245 `MODULE-COMPILED $module [$flags]`
    246 
    247 request.  This indicates the CMI file has been written to disk, so
    248 that any other compilations waiting on it may proceed.  Depending on
    249 compiler implementation, the CMI may be written before the compilation
    250 completes.  A single OK response is expected.
    251 
    252 Compilation failure can be inferred by lack of a `MODULE-COMPILED`
    253 request.  It is presumed the builder can determine this, as it is also
    254 responsible for launching and reaping the compiler invocations
    255 themselves.
    256 
    257 #### Importing
    258 
    259 Importation, including that of header-units, uses:
    260 
    261 `MODULE-IMPORT $module [$flags]`
    262 
    263 A PATHNAME response names the CMI file to be read.  Should the builder
    264 have to invoke a compilation to produce the CMI, the response should
    265 be delayed until that occurs.  If such a compilation fails, an error
    266 response should be provided to the requestor &mdash; which will then
    267 presumably fail in some manner.
    268 
    269 #### Include Translation
    270 
    271 Include translation can be determined with:
    272 
    273 `INCLUDE-TRANSLATE $header [$flags]`
    274 
    275 The header name, `$header`, is the fully resolved header name, in the
    276 above-mentioned unambiguous filename form.  The response will either
    277 be a BOOL response indicating textual inclusion, or a PATHNAME
    278 response naming the CMI for such translation.  The BOOL value is TRUE,
    279 if the header is known to be a textual header, and FALSE if nothing is
    280 known about it -- the latter might cause diagnostics about incomplete
    281 knowledge.
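
How a compiler might act on the response, as a C++11 sketch working on the decoded words of the response.  The `Inclusion` enum and `TranslateInclude` are hypothetical names, not libcody API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical outcome of an INCLUDE-TRANSLATE request.
enum class Inclusion
{
  KnownTextual,    // BOOL TRUE: include the header textually.
  UnknownTextual,  // BOOL FALSE: textual, but possibly diagnose.
  Import,          // PATHNAME: import the named CMI instead.
  Failed           // ERROR or anything unexpected.
};

// Decide what to do, storing the CMI name in CMI for an import.
Inclusion TranslateInclude (std::vector<std::string> const &words,
                            std::string &cmi)
{
  if (words.size () == 2 && words[0] == "BOOL")
    {
      if (words[1] == "TRUE")
        return Inclusion::KnownTextual;
      if (words[1] == "FALSE")
        return Inclusion::UnknownTextual;
    }
  else if (words.size () == 2 && words[0] == "PATHNAME")
    {
      cmi = words[1];
      return Inclusion::Import;
    }
  return Inclusion::Failed;
}
```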
    282 
    283 ### GCC LTO Messages
    284 
    285 This set of requests is used for GCC LTO jobserver integration with GNU Make.
    286 
    287 ## Building libCody
    288 
    289 Libcody is written in C++11.  (It's intended for compilers, so
    290 there'd be a bootstrapping problem if it used the latest and greatest.)
    291 
    292 ### Using configure and make
    293 
    294 It supports the usual `configure`, `make`, `make check` & `make install`
    295 sequence.  It does not support building in the source directory --
    296 that just didn't drop out, and it's not how I build things (because,
    297 again, for compilers).  Excitingly it uses my own `joust` test
    298 harness, so you'll need to build and install that somewhere, if you
    299 want the comfort of testing.
    300 
    301 The following configure options are available, in addition to the usual set:
    302 
    303 * `--enable-checking` Compile with assert-like checking.  Defaults to on.
    304 
    305 * `--with-tooldir=DIR` Prepend `DIR` to `PATH` when building (`DIR`
    306   need not already include the trailing `/bin`, and the right things
    307   happen).  Use this if you need to point to non-standard tools that
    308   you usually don't have in your path.  This path is also used when
    309   the configure script searches for programs.
    310 
    311 * `--with-toolinc=DIR`, `--with-toollib=DIR`, include path and library
    312   path variants of `--with-tooldir`.  If these are siblings of the
    313   tool bin directory, they'll be found automatically.
    314 
    315 * `--with-compiler=NAME` Specify a particular compiler to use.
    316   Usually what configure finds is sufficiently usable.
    317 
    318 * `--with-bugurl=URL` Override the bug-reporting URL.  Do this if
    319   you're providing libcody as part of a package that /you/ are
    320   supporting.
    321 
    322 * `--enable-maintainer-mode` Specify that rules to rebuild things like
    323   `configure` (with `autoconf`) should be enabled.  When not enabled,
    324   you'll get a message if these appear out of date, but that can
    325   happen naturally after an update or clone as `git`, in common with
    326   other VCSs, doesn't preserve the relative ordering of file
    327   modifications.  You can use `make MAINTAINER=touch` to shut make up,
    328   if this occurs (or manually execute the `autoconf` and related
    329   commands).
    330 
    331 When building, you can override the default optimization flags with
    332 `CXXFLAGS=$flags`.  I often build a debuggable library with `make
    333 CXXFLAGS=-g3`.
    334 
    335 The `Makefile` will also parallelize according to the number of CPUs,
    336 unless you specify explicitly with a `-j` option.  This is a little
    337 clunky, as it's not possible to figure out inside the makefile whether
    338 the user provided `-j`.  (Or at least I've not figured out how.)
    339 
    340 ### Using cmake and make
    341 
    342 #### In the clang/LLVM project
    343 
    344 The primary motivation for a cmake implementation is to allow building
    345 libcody "in tree" in clang/LLVM.  In that case, a checkout of libcody
    346 can be placed (or symbolically linked) into clang/tools.  This will
    347 configure and build the library along with other LLVM dependencies.
    348 
    349 *NOTE* This is not treated as an installable entity (it is present only
    350 for use by the project).
    351 
    352 *NOTE* The testing targets would not be appropriate in this configuration;
    353 it is expected that lit-based testing of the required functionality will be
    354 done by the code using the library.
    355 
    356 #### Stand-alone
    357 
    358 For use on platforms that don't support configure & make effectively, it
    359 is possible to use the cmake & make process in stand-alone mode (similar
    360 to the configure & make process above).
    361 
    362 An example use:
    363 ```
    364 cmake -DCMAKE_INSTALL_PREFIX=/path/to/installation -DCMAKE_CXX_COMPILER=clang++ /path/to/libcody/source
    365 make
    366 make install
    367 ```
    368 Supported flags (additions to the usual cmake ones):
    369 
    370 * `-DCODY_CHECKING=ON,OFF`: Compile with assert-like checking. (defaults ON)
    371 
    372 * `-DCODY_WITHEXCEPTIONS=ON,OFF`: Compile with C++ exceptions and RTTI enabled.
    373 (defaults OFF, to be compatible with GCC and LLVM).
    374 
    375 *TODO*: At present there is no support for `ctest` integration (this should be
    376 feasible, provided that `joust` is installed and can be discovered by `cmake`).
    377 
    378 ## API
    379 
    380 The library defines entities in the `::Cody` namespace.
    381 
    382 There are 4 user-visible classes:
    383 
    384 * `Packet`: Responses to requests are `Packets`.  These have a code,
    385   indicating the response kind, and a payload.
    386 
    387 * `Client`: The compiler-end of a connection.  Requests may be made
    388   and responses are returned.
    389 
    390 * `Server`: The builder-end of a connection.  Requests may be waited
    391   for, and responses made.  Builders that serve multiple concurrent
    392   connections and spawn compilations to resolve dependencies may need
    393   to derive from this class to provide response queuing.
    394 
    395 * `Resolver`: The processing engine of the builder side.  User code is
    396   expected to derive from this class and provide virtual function
    397   overriders to affect the semantics of the resolver.
    398 
    399 In addition there are a number of helpers to set up connections.
    400 
    401 Logically the Client and the Server communicate via a sequential
    402 channel.  The channel may be provided by:
    403 
    404 * two pipes, with different file descriptors for reading and writing
    405   at each end.
    406 
    407 * a socket, which will use the same file descriptor for reading and
    408   writing.  The socket can be created in a number of ways, including
    409   Unix domain and IPv6 TCP, for which helpers are provided.
    410 
    411 * a direct, in-process, connection, using buffer swapping.
    412 
    413 The communication channel is presumed reliable.
    414 
    415 Refer to the (currently very sparse) doxygen-generated documentation
    416 for details of the API.
    417 
    418 ## Examples
    419 
    420 To create an in-process resolver, use the following boilerplate:
    421 
    422 ```
    423 class MyResolver : public Cody::Resolver { ... stuff here ... };
    424 
    425 Cody::Client *MakeClient (char const *maybe_ident)
    426 {
    427   auto *r = new MyResolver (...);
    428   auto *s = new Cody::Server (r);
    429   auto *c = new Cody::Client (s);
    430 
    431   auto t = c->ConnectRequest ("ME", maybe_ident);
    432   if (t.GetCode () == Cody::Client::TC_CONNECT)
    433     ;// Yay!
    434   else if (t.GetCode () == Cody::Client::TC_ERROR)
    435     report_error (t.GetString ());
    436 
    437   return c;
    438 }
    439 
    440 ```
    441 
    442 For a remotely connecting client:
    443 ```
    444 Cody::Client *MakeClient (char const *name, int port, char const *maybe_ident)
    445 {
    446   char const *err = nullptr;
    447   int fd = Cody::OpenInet6 (&err, name, port);
    448   if (fd < 0)
    449     { ... error... return nullptr;}
    450 
    451   auto *c = new Cody::Client (fd);
    452 
    453   auto t = c->ConnectRequest ("ME", maybe_ident);
    454   if (t.GetCode () == Cody::Client::TC_CONNECT)
    455     ;// Yay!
    456   else if (t.GetCode () == Cody::Client::TC_ERROR)
    457     report_error (t.GetString ());
    458 
    459   return c;
    460 }
    461 ```
    462 
    463 ## Future Directions
    464 
    465 * Current Directory.  There is no mechanism to check the builder and
    466   the compiler have the same working directory.  Perhaps that should
    467   be addressed.
    468 
    469 * Include path canonization and/or header file lookup.  This can be
    470   expensive, particularly with many `-I` options, due to the system
    471   calls.  Perhaps using a common resource would be cheaper?
    472 
    473 * Generated header file lookup/construction.  This is essentially the
    474   same problem as importing a module, and build systems are crap at
    475   dealing with this.
    476 
    477 * Link-time compilations.  Another place the compiler would like to
    478   ask the build system to do things.
    479 
    480 * C++20 API entrypoints &mdash; std::string_view would be nice
    481 
    482 * Exception-safety audit.  Exceptions are not used, but memory
    483   exhaustion could happen.  And perhaps user's resolver code employs
    484   exceptions?
    485 
    486 <a name="1">1</a>: Or a small town in Wyoming
    487 
    488 <a name="2">2</a>: This describes one common implementation technique.
    489 The std itself doesn't require such serializations, but the ability to
    490 create them is kind of the point.  Also, 'compiler' is used where we
    491 mean any consumer of a module, and 'build system' where we mean any
    492 producer of a module.
    493 
    494 <a name="3">3</a>: Even when the builder is managing a distributed set
    495 of compilations, the builder must have a mechanism to get source files
    496 to, and object files from, the compilations.  That scheme can also
    497 transfer the CMI files.
    498