<section xmlns="http://docbook.org/ns/docbook" version="5.0"
         xml:id="appendix.porting.internals" xreflabel="Portin Internals">
<?dbhtml filename="internals.html"?>

<info><title>Porting to New Hardware or Operating Systems</title>
  <keywordset>
    <keyword>ISO C++</keyword>
    <keyword>internals</keyword>
  </keywordset>
</info>

<para>
</para>

<para>This document explains how to port libstdc++ (the GNU C++ library) to
a new target.
</para>

<para>In order to make the GNU C++ library (libstdc++) work with a new
target, you must edit some configuration files and provide some new
header files. Unless this is done, libstdc++ will use generic
settings which may not be correct for your target; even if they are
correct, they will likely be inefficient.
</para>

<para>Before you get started, make sure that you have a working C library on
your target. The C library need not precisely comply with any
particular standard, but should generally conform to the requirements
imposed by the ANSI/ISO standard.
</para>

<para>In addition, you should try to verify that the C++ compiler generally
works. It is difficult to test the C++ compiler without a working
library, but you should at least try some minimal test cases.
</para>

<para>(Note that what we think of as a "target," the library refers to as
a "host." The comment at the top of <code>configure.ac</code> explains why.)
</para>

<section xml:id="internals.os"><info><title>Operating System</title></info>

<para>If you are porting to a new operating system (as opposed to a new chip
using an existing operating system), you will need to create a new
directory in the <code>config/os</code> hierarchy. For example, the IRIX
configuration files are all in <code>config/os/irix</code>. There is no set
way to organize the OS configuration directory.
For example,
<code>config/os/solaris/solaris-2.6</code> and
<code>config/os/solaris/solaris-2.7</code> are used as configuration
directories for these two versions of Solaris. On the other hand, both
Solaris 2.7 and Solaris 2.8 use the <code>config/os/solaris/solaris-2.7</code>
directory. The important point is that there needs to be a
directory under <code>config/os</code> to store the files for your operating
system.
</para>

<para>You might have to change the <code>configure.host</code> file to ensure that
your new directory is activated. Look for the switch statement that sets
<code>os_include_dir</code>, and add a pattern to handle your operating system
if the default will not suffice. The switch statement switches on only
the OS portion of the standard target triplet; e.g., the <code>solaris2.8</code>
in <code>sparc-sun-solaris2.8</code>. If the new directory is named after the
OS portion of the triplet (the default), then nothing needs to be changed.
</para>

<para>The first file to create in this directory should be called
<code>os_defines.h</code>. This file contains basic macro definitions
that are required to allow the C++ library to work with your C library.
</para>

<para>Several libstdc++ source files unconditionally define the macro
<code>_POSIX_SOURCE</code>. On many systems, defining this macro causes
large portions of the C library header files to be eliminated
at preprocessing time. Therefore, you may have to <code>#undef</code> this
macro, or define other macros (like <code>_LARGEFILE_SOURCE</code> or
<code>__EXTENSIONS__</code>). You won't know what macros to define or
undefine at this point; you'll have to try compiling the library and
see what goes wrong.
If you see errors about calling functions
that have not been declared, look in your C library headers to see if
the functions are declared there, and then figure out what macros you
need to define. You will need to add them to the
<code>CPLUSPLUS_CPP_SPEC</code> macro in the GCC configuration file for your
target. It will not work to simply define these macros in
<code>os_defines.h</code>.
</para>

<para>At this time, there are a few libstdc++-specific macros which may be
defined:
</para>

<para><code>_GLIBCXX_USE_C99_CHECK</code> may be defined to 1 to check C99
function declarations (which are not covered by specialization below)
found in system headers against versions found in the library headers
derived from the standard.
</para>

<para><code>_GLIBCXX_USE_C99_DYNAMIC</code> may be defined to an expression that
yields 0 if and only if the system headers are exposing proper support
for C99 functions (which are not covered by specialization below). If
defined, it must be 0 while bootstrapping the compiler/rebuilding the
library.
</para>

<para><code>_GLIBCXX_USE_C99_LONG_LONG_CHECK</code> may be defined to 1 to check
the set of C99 long long function declarations found in system headers
against versions found in the library headers derived from the
standard.
</para>

<para><code>_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC</code> may be defined to an
expression that yields 0 if and only if the system headers are
exposing proper support for the set of C99 long long functions. If
defined, it must be 0 while bootstrapping the compiler/rebuilding the
library.
</para>

<para><code>_GLIBCXX_USE_C99_FP_MACROS_DYNAMIC</code> may be defined to an
expression that yields 0 if and only if the system headers
are exposing proper support for the related set of macros. If defined,
it must be 0 while bootstrapping the compiler/rebuilding the library.
</para>

<para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_CHECK</code> may be defined
to 1 to check the related set of function declarations found in system
headers against versions found in the library headers derived from
the standard.
</para>

<para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_DYNAMIC</code> may be defined
to an expression that yields 0 if and only if the system headers
are exposing proper support for the related set of functions. If defined,
it must be 0 while bootstrapping the compiler/rebuilding the library.
</para>

<para><code>_GLIBCXX_NO_OBSOLETE_ISINF_ISNAN_DYNAMIC</code> may be defined
to an expression that yields 0 if and only if the system headers
are exposing non-standard <code>isinf(double)</code> and
<code>isnan(double)</code> functions in the global namespace. Those functions
should be detected automatically by the <code>configure</code> script when
libstdc++ is built but if their presence depends on compilation flags or
other macros the static configuration can be overridden.
</para>

<para>Finally, you should bracket the entire file in an include-guard, like
this:
</para>

<programlisting>

#ifndef _GLIBCXX_OS_DEFINES
#define _GLIBCXX_OS_DEFINES
...
#endif
</programlisting>

<para>We recommend copying an existing <code>os_defines.h</code> to use as a
starting point.
</para>
</section>


<section xml:id="internals.cpu"><info><title>CPU</title></info>

<para>If you are porting to a new chip (as opposed to a new operating system
running on an existing chip), you will need to create a new directory in the
<code>config/cpu</code> hierarchy.
Much like the <link linkend="internals.os">Operating system</link> setup,
there are no strict rules on how to organize the CPU configuration
directory, but careful naming choices will allow the configury to find your
setup files without explicit help.
</para>

<para>We recommend that for a target triplet <code>&lt;CPU&gt;-&lt;vendor&gt;-&lt;OS&gt;</code>, you
name your configuration directory <code>config/cpu/&lt;CPU&gt;</code>. If you do this,
the configury will find the directory by itself. Otherwise you will need to
edit the <code>configure.host</code> file and, in the switch statement that sets
<code>cpu_include_dir</code>, add a pattern to handle your chip.
</para>

<para>Note that some chip families share a single configuration directory; for
example, <code>alpha</code>, <code>alphaev5</code>, and <code>alphaev6</code> all use the
<code>config/cpu/alpha</code> directory, and there is an entry in the
<code>configure.host</code> switch statement to handle this.
</para>

<para>The <code>cpu_include_dir</code> sets default locations for the files controlling
<link linkend="internals.thread_safety">Thread safety</link> and <link linkend="internals.numeric_limits">Numeric limits</link>, if the defaults are not
appropriate for your chip.
</para>

</section>


<section xml:id="internals.char_types"><info><title>Character Types</title></info>

<para>The library requires that you provide three header files to implement
character classification, analogous to that provided by the C library's
<code>&lt;ctype.h&gt;</code> header. You can model these on the files provided in
<code>config/os/generic</code>. However, these files will almost
certainly need some modification.
</para>

<para>The first file to write is <code>ctype_base.h</code>. This file provides
some very basic information about character classification.
The libstdc++
library assumes that your C library implements <code>&lt;ctype.h&gt;</code> by using
a table (indexed by character code) containing integers, where each of
these integers is a bit-mask indicating whether the character is
upper-case, lower-case, alphabetic, etc. The <code>ctype_base.h</code>
file gives the type of the integer, and the values of the various bit
masks. You will have to peer at your own <code>&lt;ctype.h&gt;</code> to figure out
how to define the values required by this file.
</para>

<para>The <code>ctype_base.h</code> header file does not need include guards.
It should contain a single <code>struct</code> definition called
<code>ctype_base</code>. This <code>struct</code> should contain two type
declarations, and one enumeration declaration, like this example, taken
from the IRIX configuration:
</para>

<programlisting>
  struct ctype_base
     {
       typedef unsigned int 	mask;
       typedef int* 		__to_type;

       enum
       {
	 space = _ISspace,
	 print = _ISprint,
	 cntrl = _IScntrl,
	 upper = _ISupper,
	 lower = _ISlower,
	 alpha = _ISalpha,
	 digit = _ISdigit,
	 punct = _ISpunct,
	 xdigit = _ISxdigit,
	 alnum = _ISalnum,
	 graph = _ISgraph
       };
     };
</programlisting>

<para>The <code>mask</code> type is the type of the elements in the table. If your
C library uses a table to map lower-case characters to upper-case characters,
and vice versa, you should define <code>__to_type</code> to be the type of the
elements in that table. If you don't mind taking a minor performance
penalty, or if your library doesn't implement <code>toupper</code> and
<code>tolower</code> in this way, you can pick any pointer-to-integer type,
but you must still define the type.
</para>

<para>The enumeration should give definitions for all the values in the above
example, using the values from your native <code>&lt;ctype.h&gt;</code>.
They can
be given symbolically (as above), or numerically, if you prefer. You do
not have to include <code>&lt;ctype.h&gt;</code> in this header; it will always be
included before <code>ctype_base.h</code> is included.
</para>

<para>The next file to write is <code>ctype_configure_char.cc</code>.
The first function that must be written is the <code>ctype&lt;char&gt;::ctype</code> constructor. Here is the IRIX example:
</para>

<programlisting>
ctype&lt;char&gt;::ctype(const mask* __table = 0, bool __del = false,
	    size_t __refs = 0)
  : _Ctype_nois&lt;char&gt;(__refs), _M_del(__table != 0 &amp;&amp; __del),
    _M_toupper(NULL),
    _M_tolower(NULL),
    _M_ctable(NULL),
    _M_table(!__table
	     ? (const mask*) (__libc_attr._ctype_tbl-&gt;_class + 1)
	     : __table)
  { }
</programlisting>

<para>There are two parts of this that you might choose to alter. The first,
and most important, is the line involving <code>__libc_attr</code>. That is
IRIX system-dependent code that gets the base of the table mapping
character codes to attributes. You need to substitute code that obtains
the address of this table on your system. If you want to use your
operating system's tables to map upper-case letters to lower-case, and
vice versa, you should initialize <code>_M_toupper</code> and
<code>_M_tolower</code> with those tables, in similar fashion.
</para>

<para>Now, you have to write two functions to convert from upper-case to
lower-case, and vice versa. Here are the IRIX versions:
</para>

<programlisting>
  char
  ctype&lt;char&gt;::do_toupper(char __c) const
  { return _toupper(__c); }

  char
  ctype&lt;char&gt;::do_tolower(char __c) const
  { return _tolower(__c); }
</programlisting>

<para>Your C library provides equivalents to IRIX's <code>_toupper</code> and
<code>_tolower</code>.
If you initialized <code>_M_toupper</code> and
<code>_M_tolower</code> above, then you could use those tables instead.
</para>

<para>Finally, you have to provide two utility functions that convert strings
of characters. The versions provided here will always work, but you
could use specialized routines for greater performance if your system
has the machinery to do that:
</para>

<programlisting>
  const char*
  ctype&lt;char&gt;::do_toupper(char* __low, const char* __high) const
  {
    while (__low &lt; __high)
      {
	*__low = do_toupper(*__low);
	++__low;
      }
    return __high;
  }

  const char*
  ctype&lt;char&gt;::do_tolower(char* __low, const char* __high) const
  {
    while (__low &lt; __high)
      {
	*__low = do_tolower(*__low);
	++__low;
      }
    return __high;
  }
</programlisting>

<para>You must also provide the <code>ctype_inline.h</code> file, which
contains a few more functions. On most systems, you can just copy
<code>config/os/generic/ctype_inline.h</code> and use it on your system.
</para>

<para>In detail, the functions provided test characters for particular
properties; they are analogous to the functions like <code>isalpha</code> and
<code>islower</code> provided by the C library.
</para>

<para>The first function is implemented like this on IRIX:
</para>

<programlisting>
  bool
  ctype&lt;char&gt;::
  is(mask __m, char __c) const throw()
  { return (_M_table)[(unsigned char)(__c)] &amp; __m; }
</programlisting>

<para>The <code>_M_table</code> is the table passed in above, in the constructor.
This is the table that contains the bitmasks for each character. The
implementation here should work on all systems.
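As a rough illustration of the lookup, the following standalone sketch builds
a toy table and tests characters against it. It is not taken from libstdc++:
the <code>sketch</code> namespace, its <code>mask</code> typedef, the bit values,
and the functions <code>build_table</code> and <code>is</code> are all
hypothetical stand-ins, and the table is filled from the standard
<code>&lt;cctype&gt;</code> functions only so that the example is self-contained.
<programlisting><![CDATA[
```cpp
#include <cassert>
#include <cctype>

namespace sketch
{
  // Hypothetical stand-ins for ctype_base::mask and its enumerators; the
  // bit values are arbitrary and chosen only for this illustration.
  typedef unsigned int mask;
  enum { upper = 1u << 0, lower = 1u << 1, digit = 1u << 2, space = 1u << 3 };

  // Toy table with one entry per unsigned char value.  A real port would
  // use the C library's own table rather than building one.
  static mask table[256];

  static void
  build_table()
  {
    for (int c = 0; c < 256; ++c)
      {
        mask m = 0;
        if (std::isupper(c)) m |= upper;
        if (std::islower(c)) m |= lower;
        if (std::isdigit(c)) m |= digit;
        if (std::isspace(c)) m |= space;
        table[c] = m;
      }
  }

  // Same shape as the is() member shown above: the cast to unsigned char
  // keeps a negative char value from indexing before the start of the table.
  static bool
  is(mask m, char c)
  { return (table[(unsigned char)(c)] & m) != 0; }
}
```
]]></programlisting>
After calling <code>sketch::build_table()</code>, a test such as
<code>sketch::is(sketch::upper, 'A')</code> reduces to one table load and one
bitwise AND, which is why the table-based form is attractive.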
</para>

<para>The next function is:
</para>

<programlisting>
  const char*
  ctype&lt;char&gt;::
  is(const char* __low, const char* __high, mask* __vec) const throw()
  {
    while (__low &lt; __high)
      *__vec++ = (_M_table)[(unsigned char)(*__low++)];
    return __high;
  }
</programlisting>

<para>This function is similar; it copies the masks for all the characters
from <code>__low</code> up until <code>__high</code> into the vector given by
<code>__vec</code>.
</para>

<para>The last two functions again are entirely generic:
</para>

<programlisting>
  const char*
  ctype&lt;char&gt;::
  scan_is(mask __m, const char* __low, const char* __high) const throw()
  {
    while (__low &lt; __high &amp;&amp; !this-&gt;is(__m, *__low))
      ++__low;
    return __low;
  }

  const char*
  ctype&lt;char&gt;::
  scan_not(mask __m, const char* __low, const char* __high) const throw()
  {
    while (__low &lt; __high &amp;&amp; this-&gt;is(__m, *__low))
      ++__low;
    return __low;
  }
</programlisting>

</section>


<section xml:id="internals.thread_safety"><info><title>Thread Safety</title></info>

<para>The C++ library string functionality requires a couple of atomic
operations to provide thread-safety. If you don't take any special
action, the library will use stub versions of these functions that are
not thread-safe. They will work fine, unless your applications are
multi-threaded.
</para>

<para>If you want to provide custom, safe versions of these functions, there
are two distinct approaches. One is to provide a version for your CPU,
using assembly language constructs. The other is to use the
thread-safety primitives in your operating system. In either case, you
make a file called <code>atomicity.h</code>, and the variable
<code>ATOMICITYH</code> must point to this file.
</para>

<para>If you are using the assembly-language approach, put this code in
<code>config/cpu/&lt;chip&gt;/atomicity.h</code>, where chip is the name of
your processor (see <link linkend="internals.cpu">CPU</link>). No additional changes are necessary to
locate the file in this case; <code>ATOMICITYH</code> will be set by default.
</para>

<para>If you are using the operating system thread-safety primitives approach,
you can also put this code in the same CPU directory, in which case no more
work is needed to locate the file. For examples of this approach,
see the <code>atomicity.h</code> file for IRIX or IA64.
</para>

<para>Alternatively, if the primitives are more closely related to the OS
than they are to the CPU, you can put the <code>atomicity.h</code> file in
the <link linkend="internals.os">Operating system</link> directory instead. In this case, you must
edit <code>configure.host</code>, and in the switch statement that handles
operating systems, override the <code>ATOMICITYH</code> variable to point to
the appropriate <code>os_include_dir</code>. For examples of this approach,
see the <code>atomicity.h</code> file for AIX.
</para>

<para>With those bits out of the way, you have to actually write
<code>atomicity.h</code> itself. This file should be wrapped in an
include guard named <code>_GLIBCXX_ATOMICITY_H</code>. It should define one
type, and two functions.
</para>

<para>The type is <code>_Atomic_word</code>. Here is the version used on IRIX:
</para>

<programlisting>
typedef long _Atomic_word;
</programlisting>

<para>This type must be a signed integral type supporting atomic operations.
If you're using the OS approach, use the same type used by your system's
primitives. Otherwise, use the type for which your CPU provides atomic
primitives.
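One way to sanity-check a candidate type is sketched below. It assumes a
GCC-compatible compiler that offers the <code>__atomic</code> built-ins, which
may not hold for every target; the names <code>word_candidate</code> and
<code>demo_fetch_add</code> are hypothetical and are not part of libstdc++.
<programlisting><![CDATA[
```cpp
#include <cassert>

// Hypothetical candidate for _Atomic_word on some target.
typedef int word_candidate;

// Signedness can be checked without any library headers.
static_assert(static_cast<word_candidate>(-1) < 0,
              "the atomic word type must be signed");

// GCC built-in: true when objects of this size are always handled with
// lock-free atomic instructions on the target.
static_assert(__atomic_always_lock_free(sizeof(word_candidate), 0),
              "no lock-free atomics for this type on this target");

// Atomically add val to *mem and return the value held before the add.
static inline word_candidate
demo_fetch_add(word_candidate* mem, int val)
{ return __atomic_fetch_add(mem, val, __ATOMIC_ACQ_REL); }
```
]]></programlisting>
If either <code>static_assert</code> fails for the type you had in mind, that
is a hint to pick a different width or to fall back on the operating system's
primitives.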
</para>

<para>Then, you must provide two functions. The bodies of these functions
must be equivalent to those provided here, but using atomic operations:
</para>

<programlisting>
     static inline _Atomic_word
     __attribute__ ((__unused__))
     __exchange_and_add (_Atomic_word* __mem, int __val)
     {
       _Atomic_word __result = *__mem;
       *__mem += __val;
       return __result;
     }

     static inline void
     __attribute__ ((__unused__))
     __atomic_add (_Atomic_word* __mem, int __val)
     {
       *__mem += __val;
     }
</programlisting>

</section>


<section xml:id="internals.numeric_limits"><info><title>Numeric Limits</title></info>

<para>The C++ library requires information about the fundamental data types,
such as the minimum and maximum representable values of each type.
You can define each of these values individually, but it is usually
easiest just to indicate how many bits are used in each of the data
types and let the library do the rest. For information about the
macros to define, see the top of <code>include/bits/std_limits.h</code>.
</para>

<para>If you need to define any macros, you can do so in <code>os_defines.h</code>.
However, if all operating systems for your CPU are likely to use the
same values, you can provide a CPU-specific file instead so that you
do not have to provide the same definitions for each operating system.
To take that approach, create a new file called <code>cpu_limits.h</code> in
your CPU configuration directory (see <link linkend="internals.cpu">CPU</link>).
</para>

</section>


<section xml:id="internals.libtool"><info><title>Libtool</title></info>

<para>The C++ library is compiled, archived and linked with libtool.
Explaining the full workings of libtool is beyond the scope of this
document, but there are a few particular bits that are necessary for
porting.
</para>

<para>Some parts of the libstdc++ library are compiled with the libtool
<code>--tags CXX</code> option (the C++ definitions for libtool). Therefore,
<code>ltcf-cxx.sh</code> in the top-level directory needs to have the correct
logic to compile and archive objects equivalent to the C version of libtool,
<code>ltcf-c.sh</code>. Some libtool targets have definitions for C but not
for C++, or C++ definitions which have not been kept up to date.
</para>

<para>The C++ run-time library contains initialization code that needs to be
run as the library is loaded. Often, that requires linking in special
object files when the C++ library is built as a shared library, or
taking other system-specific actions.
</para>

<para>The libstdc++ library is linked with the C version of libtool, even
though it is a C++ library. Therefore, the C version of libtool needs to
ensure that the run-time library initializers are run. The usual way to
do this is to build the library using <code>gcc -shared</code>.
</para>

<para>If you need to change how the library is linked, look at
<code>ltcf-c.sh</code> in the top-level directory. Find the switch statement
that sets <code>archive_cmds</code>. Here, adjust the setting for your
operating system.
</para>


</section>

</section>