<section xmlns="http://docbook.org/ns/docbook" version="5.0"
         xml:id="appendix.porting.internals" xreflabel="Porting Internals">
<?dbhtml filename="internals.html"?>

<info><title>Porting to New Hardware or Operating Systems</title>
  <keywordset>
    <keyword>ISO C++</keyword>
    <keyword>internals</keyword>
  </keywordset>
</info>



<para>
</para>


<para>This document explains how to port libstdc++ (the GNU C++ library) to
a new target.
</para>

   <para>In order to make the GNU C++ library (libstdc++) work with a new
target, you must edit some configuration files and provide some new
header files.  Unless this is done, libstdc++ will use generic
settings which may not be correct for your target; even if they are
correct, they will likely be inefficient.
   </para>

   <para>Before you get started, make sure that you have a working C library on
your target.  The C library need not precisely comply with any
particular standard, but should generally conform to the requirements
imposed by the ANSI/ISO standard.
   </para>

   <para>In addition, you should try to verify that the C++ compiler generally
works.  It is difficult to test the C++ compiler without a working
library, but you should at least try some minimal test cases.
   </para>

   <para>(Note that what we think of as a "target," the library refers to as
a "host."  The comment at the top of <code>configure.ac</code> explains why.)
   </para>


<section xml:id="internals.os"><info><title>Operating System</title></info>


<para>If you are porting to a new operating system (as opposed to a new chip
using an existing operating system), you will need to create a new
directory in the <code>config/os</code> hierarchy.  For example, the IRIX
configuration files are all in <code>config/os/irix</code>.  There is no set
way to organize the OS configuration directory.  For example,
<code>config/os/solaris/solaris-2.6</code> and
<code>config/os/solaris/solaris-2.7</code> are used as configuration
directories for these two versions of Solaris.  On the other hand, both
Solaris 2.7 and Solaris 2.8 use the <code>config/os/solaris/solaris-2.7</code>
directory.  What matters is that there is a
directory under <code>config/os</code> to store the files for your operating
system.
</para>

   <para>You might have to change the <code>configure.host</code> file to ensure that
your new directory is activated.  Look for the switch statement that sets
<code>os_include_dir</code>, and add a pattern to handle your operating system
if the default will not suffice.  The switch statement switches on only
the OS portion of the standard target triplet; e.g., the <code>solaris2.8</code>
in <code>sparc-sun-solaris2.8</code>.  If the new directory is named after the
OS portion of the triplet (the default), then nothing needs to be changed.
   </para>

   <para>The first file to create in this directory should be called
<code>os_defines.h</code>.  This file contains basic macro definitions
that are required to allow the C++ library to work with your C library.
   </para>

   <para>Several libstdc++ source files unconditionally define the macro
<code>_POSIX_SOURCE</code>.  On many systems, defining this macro causes
large portions of the C library header files to be eliminated
at preprocessing time.  Therefore, you may have to <code>#undef</code> this
macro, or define other macros (like <code>_LARGEFILE_SOURCE</code> or
<code>__EXTENSIONS__</code>).  You won't know what macros to define or
undefine at this point; you'll have to try compiling the library and
seeing what goes wrong.  If you see errors about calling functions
that have not been declared, look in your C library headers to see if
the functions are declared there, and then figure out what macros you
need to define.  You will need to add them to the
<code>CPLUSPLUS_CPP_SPEC</code> macro in the GCC configuration file for your
target.  It will not work to simply define these macros in
<code>os_defines.h</code>.
   </para>
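
   <para>As a rough sketch of that last step, the fragment below shows what such
a definition could look like in a GCC target configuration header.  The
particular options in the string are assumptions chosen for illustration;
use whichever macros your C library headers actually turn out to require.
   </para>

<programlisting>
/* Hypothetical fragment of a GCC target header; this is not a libstdc++
   file.  The -D options listed here are illustrative assumptions.  */
#undef  CPLUSPLUS_CPP_SPEC
#define CPLUSPLUS_CPP_SPEC "-D_LARGEFILE_SOURCE -D__EXTENSIONS__"
</programlisting>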

   <para>At this time, there are a few libstdc++-specific macros which may be
defined:
   </para>

   <para><code>_GLIBCXX_USE_C99_CHECK</code> may be defined to 1 to check C99
function declarations (which are not covered by specialization below)
found in system headers against versions found in the library headers
derived from the standard.
   </para>

   <para><code>_GLIBCXX_USE_C99_DYNAMIC</code> may be defined to an expression that
yields 0 if and only if the system headers are exposing proper support
for C99 functions (which are not covered by specialization below).  If
defined, it must be 0 while bootstrapping the compiler/rebuilding the
library.
   </para>

   <para><code>_GLIBCXX_USE_C99_LONG_LONG_CHECK</code> may be defined to 1 to check
the set of C99 long long function declarations found in system headers
against versions found in the library headers derived from the
standard.
   </para>

   <para><code>_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC</code> may be defined to an
expression that yields 0 if and only if the system headers are
exposing proper support for the set of C99 long long functions.  If
defined, it must be 0 while bootstrapping the compiler/rebuilding the
library.
   </para>

   <para><code>_GLIBCXX_USE_C99_FP_MACROS_DYNAMIC</code> may be defined to an
expression that yields 0 if and only if the system headers
are exposing proper support for the related set of macros.  If defined,
it must be 0 while bootstrapping the compiler/rebuilding the library.
   </para>

   <para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_CHECK</code> may be defined
to 1 to check the related set of function declarations found in system
headers against versions found in the library headers derived from
the standard.
   </para>

   <para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_DYNAMIC</code> may be defined
to an expression that yields 0 if and only if the system headers
are exposing proper support for the related set of functions.  If defined,
it must be 0 while bootstrapping the compiler/rebuilding the library.
   </para>

   <para><code>_GLIBCXX_NO_OBSOLETE_ISINF_ISNAN_DYNAMIC</code> may be defined
to an expression that yields 0 if and only if the system headers
are exposing non-standard <code>isinf(double)</code> and
<code>isnan(double)</code> functions in the global namespace. Those functions
should be detected automatically by the <code>configure</code> script when
libstdc++ is built but if their presence depends on compilation flags or
other macros the static configuration can be overridden.
   </para>
   <para>Finally, you should bracket the entire file in an include-guard, like
this:
   </para>

<programlisting>
#ifndef _GLIBCXX_OS_DEFINES
#define _GLIBCXX_OS_DEFINES
...
#endif
</programlisting>

   <para>We recommend copying an existing <code>os_defines.h</code> to use as a
starting point.
   </para>
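
   <para>To make the shape of the file concrete, here is a minimal sketch of an
<code>os_defines.h</code> for an imaginary operating system.  The macro
<code>__MYOS_C99_VISIBLE</code> is invented for this example and stands in
for whatever your system headers use to signal that C99 declarations are
visible; which of the libstdc++ macros you define, if any, depends on your
C library.
   </para>

<programlisting>
#ifndef _GLIBCXX_OS_DEFINES
#define _GLIBCXX_OS_DEFINES 1

// Ask for the C99 declarations in the system headers to be checked
// against the versions in the library headers.
#define _GLIBCXX_USE_C99_CHECK 1

// Must yield 0 if and only if the system headers expose proper C99
// support.  __MYOS_C99_VISIBLE is a hypothetical system macro.
#ifdef __MYOS_C99_VISIBLE
# define _GLIBCXX_USE_C99_DYNAMIC 0
#else
# define _GLIBCXX_USE_C99_DYNAMIC 1
#endif

#endif
</programlisting>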
</section>


<section xml:id="internals.cpu"><info><title>CPU</title></info>


<para>If you are porting to a new chip (as opposed to a new operating system
running on an existing chip), you will need to create a new directory in the
<code>config/cpu</code> hierarchy.  Much like the <link linkend="internals.os">Operating system</link> setup,
there are no strict rules on how to organize the CPU configuration
directory, but careful naming choices will allow the configury to find your
setup files without explicit help.
</para>

   <para>We recommend that for a target triplet <code>&lt;CPU&gt;-&lt;vendor&gt;-&lt;OS&gt;</code>, you
name your configuration directory <code>config/cpu/&lt;CPU&gt;</code>.  If you do this,
the configury will find the directory by itself.  Otherwise you will need to
edit the <code>configure.host</code> file and, in the switch statement that sets
<code>cpu_include_dir</code>, add a pattern to handle your chip.
   </para>

   <para>Note that some chip families share a single configuration directory, for
example, <code>alpha</code>, <code>alphaev5</code>, and <code>alphaev6</code> all use the
<code>config/cpu/alpha</code> directory, and there is an entry in the
<code>configure.host</code> switch statement to handle this.
   </para>

   <para>The <code>cpu_include_dir</code> setting determines the default locations of
the files controlling <link linkend="internals.thread_safety">Thread safety</link> and <link linkend="internals.numeric_limits">Numeric limits</link>.  You can provide your
own versions of those files there if the defaults are not appropriate for
your chip.
   </para>

</section>


<section xml:id="internals.char_types"><info><title>Character Types</title></info>


<para>The library requires that you provide three header files to implement
character classification, analogous to that provided by the C library's
<code>&lt;ctype.h&gt;</code> header.  You can model these on the files provided in
<code>config/os/generic</code>.  However, these files will almost
certainly need some modification.
</para>

   <para>The first file to write is <code>ctype_base.h</code>.  This file provides
some very basic information about character classification.  The libstdc++
library assumes that your C library implements <code>&lt;ctype.h&gt;</code> by using
a table (indexed by character code) containing integers, where each of
these integers is a bit-mask indicating whether the character is
upper-case, lower-case, alphabetic, etc.  The <code>ctype_base.h</code>
file gives the type of the integer, and the values of the various bit
masks.  You will have to peer at your own <code>&lt;ctype.h&gt;</code> to figure out
how to define the values required by this file.
   </para>

   <para>The <code>ctype_base.h</code> header file does not need include guards.
It should contain a single <code>struct</code> definition called
<code>ctype_base</code>.  This <code>struct</code> should contain two type
declarations, and one enumeration declaration, like this example, taken
from the IRIX configuration:
   </para>

<programlisting>
  struct ctype_base
     {
       typedef unsigned int     mask;
       typedef int*             __to_type;

       enum
       {
         space = _ISspace,
         print = _ISprint,
         cntrl = _IScntrl,
         upper = _ISupper,
         lower = _ISlower,
         alpha = _ISalpha,
         digit = _ISdigit,
         punct = _ISpunct,
         xdigit = _ISxdigit,
         alnum = _ISalnum,
         graph = _ISgraph
       };
     };
</programlisting>

<para>The <code>mask</code> type is the type of the elements in the table.  If your
C library uses a table to map lower-case letters to upper-case letters,
and vice versa, you should define <code>__to_type</code> to be the type of the
elements in that table.  If you don't mind taking a minor performance
penalty, or if your library doesn't implement <code>toupper</code> and
<code>tolower</code> in this way, you can pick any pointer-to-integer type,
but you must still define the type.
</para>

   <para>The enumeration should give definitions for all the values in the above
example, using the values from your native <code>&lt;ctype.h&gt;</code>.  They can
be given symbolically (as above), or numerically, if you prefer.  You do
not have to include <code>&lt;ctype.h&gt;</code> in this header; it will always be
included before <code>ctype_base.h</code> is included.
   </para>
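
   <para>For comparison with the symbolic IRIX version, the following sketch gives
the values numerically.  The bit assignments and the choice of
<code>unsigned short</code> for <code>mask</code> are invented for a hypothetical
C library; yours must match the layout of the table your own
<code>&lt;ctype.h&gt;</code> uses.
   </para>

<programlisting>
// Sketch for a hypothetical C library; every bit value here is invented.
struct ctype_base
{
  typedef unsigned short mask;       // type of the classification table elements
  typedef const int*     __to_type;  // pointer type for the case-mapping tables

  enum
  {
    upper  = 1 &lt;&lt; 0,
    lower  = 1 &lt;&lt; 1,
    alpha  = 1 &lt;&lt; 2,
    digit  = 1 &lt;&lt; 3,
    xdigit = 1 &lt;&lt; 4,
    space  = 1 &lt;&lt; 5,
    print  = 1 &lt;&lt; 6,
    cntrl  = 1 &lt;&lt; 7,
    punct  = 1 &lt;&lt; 8,
    // Composite classes are built from the basic bits when the C
    // library has no dedicated bit for them.
    alnum  = alpha | digit,
    graph  = alpha | digit | punct
  };
};
</programlisting>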

   <para>The next file to write is <code>ctype_configure_char.cc</code>.
The first function that must be written is the <code>ctype&lt;char&gt;::ctype</code> constructor.  Here is the IRIX example:
   </para>

<programlisting>
ctype&lt;char&gt;::ctype(const mask* __table = 0, bool __del = false,
           size_t __refs = 0)
       : _Ctype_nois&lt;char&gt;(__refs), _M_del(__table != 0 &amp;&amp; __del),
         _M_toupper(NULL),
         _M_tolower(NULL),
         _M_ctable(NULL),
         _M_table(!__table
                  ? (const mask*) (__libc_attr._ctype_tbl-&gt;_class + 1)
                  : __table)
       { }
</programlisting>

<para>There are two parts of this that you might choose to alter. The first,
and most important, is the line involving <code>__libc_attr</code>.  That is
IRIX system-dependent code that gets the base of the table mapping
character codes to attributes.  You need to substitute code that obtains
the address of this table on your system.  The second is optional: if you
want to use your operating system's tables to map upper-case letters to
lower-case, and vice versa, you should initialize <code>_M_toupper</code> and
<code>_M_tolower</code> with those tables, in similar fashion.
</para>

   <para>Now, you have to write two functions to convert from upper-case to
lower-case, and vice versa.  Here are the IRIX versions:
   </para>

<programlisting>
     char
     ctype&lt;char&gt;::do_toupper(char __c) const
     { return _toupper(__c); }

     char
     ctype&lt;char&gt;::do_tolower(char __c) const
     { return _tolower(__c); }
</programlisting>

<para>Your C library should provide equivalents to IRIX's <code>_toupper</code> and
<code>_tolower</code>.  If you initialized <code>_M_toupper</code> and
<code>_M_tolower</code> above, then you could use those tables instead.
</para>
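
   <para>The table-based alternative can be sketched outside of
<code>ctype&lt;char&gt;</code> as follows.  The <code>case_tables</code> type and its
members are hypothetical stand-ins for <code>_M_toupper</code> and
<code>_M_tolower</code>, and the tables are filled from the standard
<code>toupper</code> and <code>tolower</code> only so that the example is
self-contained; a real port would point at the tables the C library
already maintains.
   </para>

<programlisting>
#include &lt;cctype&gt;

// Hypothetical stand-in for the _M_toupper / _M_tolower members.
struct case_tables
{
  int upper[256];
  int lower[256];

  // Fill both tables once from the C library's conversion functions.
  case_tables()
  {
    for (int c = 0; c &lt; 256; ++c)
      {
        upper[c] = std::toupper(c);
        lower[c] = std::tolower(c);
      }
  }

  // Index through unsigned char so negative char values stay in range.
  char to_upper(char c) const
  { return static_cast&lt;char&gt;(upper[static_cast&lt;unsigned char&gt;(c)]); }

  char to_lower(char c) const
  { return static_cast&lt;char&gt;(lower[static_cast&lt;unsigned char&gt;(c)]); }
};
</programlisting>

   <para>With tables like these, <code>do_toupper</code> and <code>do_tolower</code>
reduce to a single indexed load each.
   </para>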

   <para>Finally, you have to provide two utility functions that convert strings
of characters.  The versions provided here will always work, but you
could use specialized routines for greater performance if you have
machinery to do that on your system:
   </para>

<programlisting>
     const char*
     ctype&lt;char&gt;::do_toupper(char* __low, const char* __high) const
     {
       while (__low &lt; __high)
         {
           *__low = do_toupper(*__low);
           ++__low;
         }
       return __high;
     }

     const char*
     ctype&lt;char&gt;::do_tolower(char* __low, const char* __high) const
     {
       while (__low &lt; __high)
         {
           *__low = do_tolower(*__low);
           ++__low;
         }
       return __high;
     }
</programlisting>

   <para>You must also provide the <code>ctype_inline.h</code> file, which
contains a few more functions.  On most systems, you can just copy
<code>config/os/generic/ctype_inline.h</code> and use it on your system.
   </para>

   <para>In detail, the functions provided test characters for particular
properties; they are analogous to the functions like <code>isalpha</code> and
<code>islower</code> provided by the C library.
   </para>

   <para>The first function is implemented like this on IRIX:
   </para>

<programlisting>
     bool
     ctype&lt;char&gt;::
     is(mask __m, char __c) const throw()
     { return (_M_table)[(unsigned char)(__c)] &amp; __m; }
</programlisting>

<para>The <code>_M_table</code> is the table passed in above, in the constructor.
This is the table that contains the bitmasks for each character.  The
implementation here should work on all systems.
</para>

   <para>The next function is:
   </para>

<programlisting>
     const char*
     ctype&lt;char&gt;::
     is(const char* __low, const char* __high, mask* __vec) const throw()
     {
       while (__low &lt; __high)
         *__vec++ = (_M_table)[(unsigned char)(*__low++)];
       return __high;
     }
</programlisting>

<para>This function is similar; it copies the masks for all the characters
from <code>__low</code> up until <code>__high</code> into the vector given by
<code>__vec</code>.
</para>

   <para>The last two functions again are entirely generic:
   </para>

<programlisting>
     const char*
     ctype&lt;char&gt;::
     scan_is(mask __m, const char* __low, const char* __high) const throw()
     {
       while (__low &lt; __high &amp;&amp; !this-&gt;is(__m, *__low))
         ++__low;
       return __low;
     }

     const char*
     ctype&lt;char&gt;::
     scan_not(mask __m, const char* __low, const char* __high) const throw()
     {
       while (__low &lt; __high &amp;&amp; this-&gt;is(__m, *__low))
         ++__low;
       return __low;
     }
</programlisting>
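
   <para>To see how the table lookup and the scanning functions fit together, the
following standalone sketch reproduces the same logic outside of the
library.  The <code>char_table</code> type, the <code>mask</code> typedef and the
two bit constants are all hypothetical names introduced for this example,
and the table is filled from <code>isdigit</code> and <code>isspace</code> rather
than taken from a C library's internal table.
   </para>

<programlisting>
#include &lt;cctype&gt;

typedef unsigned char mask;        // hypothetical mask type
const mask digit_bit = 1 &lt;&lt; 0;     // hypothetical bit values
const mask space_bit = 1 &lt;&lt; 1;

struct char_table
{
  mask table[256];

  // Fill the table once, using the C library's own predicates.
  char_table()
  {
    for (int c = 0; c &lt; 256; ++c)
      {
        table[c] = 0;
        if (std::isdigit(c)) table[c] |= digit_bit;
        if (std::isspace(c)) table[c] |= space_bit;
      }
  }

  // Single table lookup; the cast keeps negative char values in range.
  bool is(mask m, char c) const
  { return (table[static_cast&lt;unsigned char&gt;(c)] &amp; m) != 0; }

  // Return the first position in [low, high) whose character matches m.
  const char* scan_is(mask m, const char* low, const char* high) const
  {
    while (low &lt; high &amp;&amp; !is(m, *low))
      ++low;
    return low;
  }

  // Return the first position in [low, high) whose character does not
  // match m.
  const char* scan_not(mask m, const char* low, const char* high) const
  {
    while (low &lt; high &amp;&amp; is(m, *low))
      ++low;
    return low;
  }
};
</programlisting>

   <para>Both scanning functions return <code>high</code> when no character in the
range qualifies, which mirrors the behaviour of the generic versions above.
   </para>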

</section>


<section xml:id="internals.thread_safety"><info><title>Thread Safety</title></info>


<para>The C++ library string functionality requires a couple of atomic
operations to provide thread-safety.  If you don't take any special
action, the library will use stub versions of these functions that are
not thread-safe.  They will work fine, unless your applications are
multi-threaded.
</para>

   <para>If you want to provide custom, safe versions of these functions, there
are two distinct approaches.  One is to provide a version for your CPU,
using assembly language constructs.  The other is to use the
thread-safety primitives in your operating system.  In either case, you
make a file called <code>atomicity.h</code>, and the variable
<code>ATOMICITYH</code> must point to this file.
   </para>

   <para>If you are using the assembly-language approach, put this code in
<code>config/cpu/&lt;chip&gt;/atomicity.h</code>, where chip is the name of
your processor (see <link linkend="internals.cpu">CPU</link>).  No additional changes are necessary to
locate the file in this case; <code>ATOMICITYH</code> will be set by default.
   </para>

   <para>If you are using the operating system thread-safety primitives approach,
you can also put this code in the same CPU directory, in which case no more
work is needed to locate the file.  For examples of this approach,
see the <code>atomicity.h</code> file for IRIX or IA64.
   </para>

   <para>Alternatively, if the primitives are more closely related to the OS
than they are to the CPU, you can put the <code>atomicity.h</code> file in
the <link linkend="internals.os">Operating system</link> directory instead.  In this case, you must
edit <code>configure.host</code>, and in the switch statement that handles
operating systems, override the <code>ATOMICITYH</code> variable to point to
the appropriate <code>os_include_dir</code>.  For examples of this approach,
see the <code>atomicity.h</code> file for AIX.
   </para>

   <para>With those bits out of the way, you have to actually write
<code>atomicity.h</code> itself.  This file should be wrapped in an
include guard named <code>_GLIBCXX_ATOMICITY_H</code>.  It should define one
type, and two functions.
   </para>

   <para>The type is <code>_Atomic_word</code>.  Here is the version used on IRIX:
   </para>

<programlisting>
typedef long _Atomic_word;
</programlisting>

<para>This type must be a signed integral type supporting atomic operations.
If you're using the OS approach, use the same type used by your system's
primitives.  Otherwise, use the type for which your CPU provides atomic
primitives.
</para>

   <para>Then, you must provide two functions.  The bodies of these functions
must be equivalent to those provided here, but using atomic operations:
   </para>

<programlisting>
     static inline _Atomic_word
     __attribute__ ((__unused__))
     __exchange_and_add (_Atomic_word* __mem, int __val)
     {
       _Atomic_word __result = *__mem;
       *__mem += __val;
       return __result;
     }

     static inline void
     __attribute__ ((__unused__))
     __atomic_add (_Atomic_word* __mem, int __val)
     {
       *__mem += __val;
     }
</programlisting>
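
   <para>As one possible way to meet that requirement, the sketch below uses the
<code>__atomic_fetch_add</code> built-in provided by recent versions of GCC and
compatible compilers.  It assumes that your compiler and target support this
built-in for the chosen type; the use of <code>int</code> for
<code>_Atomic_word</code> and the choice of memory order are also assumptions
made for the example, and the include guard is left out for brevity.
   </para>

<programlisting>
// Assumption: int is a type the target can update atomically.
typedef int _Atomic_word;

// Atomically add __val to *__mem and return the previous value.
static inline _Atomic_word
__attribute__ ((__unused__))
__exchange_and_add (_Atomic_word* __mem, int __val)
{ return __atomic_fetch_add (__mem, __val, __ATOMIC_ACQ_REL); }

// Atomically add __val to *__mem, discarding the previous value.
static inline void
__attribute__ ((__unused__))
__atomic_add (_Atomic_word* __mem, int __val)
{ __atomic_fetch_add (__mem, __val, __ATOMIC_ACQ_REL); }
</programlisting>

   <para>On a target without such built-ins, the same two functions would instead
be written with inline assembly or with the operating system's primitives,
as described above.
   </para>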

</section>


<section xml:id="internals.numeric_limits"><info><title>Numeric Limits</title></info>


<para>The C++ library requires information about the fundamental data types,
such as the minimum and maximum representable values of each type.
You can define each of these values individually, but it is usually
easiest just to indicate how many bits are used in each of the data
types and let the library do the rest.  For information about the
macros to define, see the top of <code>include/bits/std_limits.h</code>.
</para>

   <para>If you need to define any macros, you can do so in <code>os_defines.h</code>.
However, if all operating systems for your CPU are likely to use the
same values, you can provide a CPU-specific file instead so that you
do not have to provide the same definitions for each operating system.
To take that approach, create a new file called <code>cpu_limits.h</code> in
your CPU configuration directory (see <link linkend="internals.cpu">CPU</link>).
   </para>

</section>


<section xml:id="internals.libtool"><info><title>Libtool</title></info>


<para>The C++ library is compiled, archived and linked with libtool.
Explaining the full workings of libtool is beyond the scope of this
document, but there are a few particular bits that are necessary for
porting.
</para>

   <para>Some parts of the libstdc++ library are compiled with the libtool
<code>--tags CXX</code> option (the C++ definitions for libtool).  Therefore,
<code>ltcf-cxx.sh</code> in the top-level directory needs to have the correct
logic to compile and archive objects equivalent to the C version of libtool,
<code>ltcf-c.sh</code>.  Some libtool targets have definitions for C but not
for C++, or C++ definitions which have not been kept up to date.
   </para>

   <para>The C++ run-time library contains initialization code that needs to be
run as the library is loaded.  Often, that requires linking in special
object files when the C++ library is built as a shared library, or
taking other system-specific actions.
   </para>

   <para>The libstdc++ library is linked with the C version of libtool, even
though it is a C++ library.  Therefore, the C version of libtool needs to
ensure that the run-time library initializers are run.  The usual way to
do this is to build the library using <code>gcc -shared</code>.
   </para>

   <para>If you need to change how the library is linked, look at
<code>ltcf-c.sh</code> in the top-level directory.  Find the switch statement
that sets <code>archive_cmds</code>.  Here, adjust the setting for your
operating system.
   </para>


</section>

</section>