theory.html revision 1.2
      1 <!DOCTYPE html>
      2 <html lang="en">
      3 <head>
      4   <title>Theory and pragmatics of the tz code and data</title>
      5   <meta charset="UTF-8">
      6 </head>
      7 
      8 <!-- The somewhat-unusual indenting style in this file is intended to
      9      shrink the output of the shell command 'diff Theory Theory.html',
     10      where 'Theory' was the plain text file that this file is derived
     11      from.  The 'Theory' file used leading white space to indent, and
     12      when possible that indentation is preserved here.  Eventually we
     13      may stop doing this and remove this comment.  -->
     14 
     15 <body>
     16   <h1>Theory and pragmatics of the tz code and data</h1>
     17   <h3>Outline</h3>
     18   <nav>
     19     <ul>
     20       <li><a href="#scope">Scope of the tz database</a></li>
     21       <li><a href="#naming">Names of time zone rules</a></li>
     22       <li><a href="#abbreviations">Time zone abbreviations</a></li>
     23       <li><a href="#accuracy">Accuracy of the tz database</a></li>
     24       <li><a href="#functions">Time and date functions</a></li>
     25       <li><a href="#stability">Interface stability</a></li>
     26       <li><a href="#calendar">Calendrical issues</a></li>
     27       <li><a href="#planets">Time and time zones on other planets</a></li>
     28     </ul>
     29   </nav>
     30 
     31 
     32   <section>
     33     <h2 id="scope">Scope of the tz database</h2>
     34 <p>
     35 The tz database attempts to record the history and predicted future of
     36 all computer-based clocks that track civil time.  To represent this
     37 data, the world is partitioned into regions whose clocks all agree
     38 about timestamps that occur after the somewhat-arbitrary cutoff point
     39 of the POSIX Epoch (1970-01-01 00:00:00 UTC).  For each such region,
     40 the database records all known clock transitions, and labels the region
     41 with a notable location.  Although 1970 is a somewhat-arbitrary
     42 cutoff, there are significant challenges to moving the cutoff earlier
     43 even by a decade or two, due to the wide variety of local practices
     44 before computer timekeeping became prevalent.
     45 </p>
     46 
     47 <p>
     48 Clock transitions before 1970 are recorded for each such location,
     49 because most systems support timestamps before 1970 and could
     50 misbehave if data entries were omitted for pre-1970 transitions.
     51 However, the database is not designed for and does not suffice for
     52 applications requiring accurate handling of all past times everywhere,
     53 as it would take far too much effort and guesswork to record all
     54 details of pre-1970 civil timekeeping.
     55 Although some information outside the scope of the database is
     56 collected in a file <code>backzone</code> that is distributed along
     57 with the database proper, this file is less reliable and does not
     58 necessarily follow database guidelines.
     59 </p>
     60 
     61 <p>
     62 As described below, reference source code for using the tz database is
     63 also available.  The tz code is upwards compatible with POSIX, an
     64 international standard for UNIX-like systems.  As of this writing, the
     65 current edition of POSIX is:
     66   <a href="http://pubs.opengroup.org/onlinepubs/9699919799/">
     67   The Open Group Base Specifications Issue 7</a>,
     68   IEEE Std 1003.1-2008, 2016 Edition.
     69 </p>
     70   </section>
     71 
     72 
     73 
     74   <section>
     75     <h2 id="naming">Names of time zone rules</h2>
     76 <p>
     77 Each of the database's time zone rules has a unique name.
     78 Inexperienced users are not expected to select these names unaided.
     79 Distributors should provide documentation and/or a simple selection
     80 interface that explains the names; for one example, see the 'tzselect'
     81 program in the tz code.  The
     82 <a href="http://cldr.unicode.org/">Unicode Common Locale Data
     83 Repository</a> contains data that may be useful for other
     84 selection interfaces.
     85 </p>
     86 
     87 <p>
     88 The time zone rule naming conventions attempt to strike a balance
     89 among the following goals:
     90 </p>
     91 <ul>
     92   <li>
     93    Uniquely identify every region where clocks have agreed since 1970.
     94    This is essential for the intended use: static clocks keeping local
     95    civil time.
     96   </li>
     97   <li>
     98    Indicate to experts where that region is.
     99   </li>
    100   <li>
    101    Be robust in the presence of political changes.  For example, names
    102    of countries are ordinarily not used, to avoid incompatibilities
    103    when countries change their name (e.g. Zaire&rarr;Congo) or when
    104    locations change countries (e.g. Hong Kong from UK colony to
    105    China).
    106   </li>
    107   <li>
    108    Be portable to a wide variety of implementations.
    109   </li>
    110   <li>
    111    Use consistent naming conventions over the entire world.
    112   </li>
    113 </ul>
    114 <p>
    115 Names normally have the
    116 form <var>AREA</var><code>/</code><var>LOCATION</var>,
    117 where <var>AREA</var> is the name of a continent or ocean,
    118 and <var>LOCATION</var> is the name of a specific
    119 location within that region.  North and South America share the same
    120 area, '<code>America</code>'.  Typical names are
    121 '<code>Africa/Cairo</code>', '<code>America/New_York</code>', and
    122 '<code>Pacific/Honolulu</code>'.
    123 </p>
    124 
    125 <p>
    126 Here are the general rules used for choosing location names,
    127 in decreasing order of importance:
    128 </p>
    129 <ul>
    130   <li>
    131 	Use only valid POSIX file name components (i.e., the parts of
    132 		names other than '<code>/</code>').  Do not use the file name
    133 		components '<code>.</code>' and '<code>..</code>'.
    134 		Within a file name component,
    135 		use only ASCII letters, '<code>.</code>',
    136 		'<code>-</code>' and '<code>_</code>'.  Do not use
    137 		digits, as that might create an ambiguity with POSIX
    138 		TZ strings.  A file name component must not exceed 14
    139 		characters or start with '<code>-</code>'.  E.g.,
    140 		prefer '<code>Brunei</code>' to
    141 		'<code>Bandar_Seri_Begawan</code>'.  Exceptions: see
    142 		the discussion
    143 		of legacy names below.
    144   </li>
    145   <li>
    146 	A name must not be empty, or contain '<code>//</code>', or
    147 	start or end with '<code>/</code>'.
    148   </li>
    149   <li>
    150 	Do not use names that differ only in case.  Although the reference
    151 		implementation is case-sensitive, some other implementations
    152 		are not, and they would mishandle names differing only in case.
    153   </li>
    154   <li>
    155 	If one name <var>A</var> is an initial prefix of another
    156 		name <var>AB</var> (ignoring case), then <var>B</var>
    157 		must not start with '<code>/</code>', as a
    158 		regular file cannot have
    159 		the same name as a directory in POSIX.  For example,
    160 		'<code>America/New_York</code>' precludes
    161 		'<code>America/New_York/Bronx</code>'.
    162   </li>
    163   <li>
    164 	Uninhabited regions like the North Pole and Bouvet Island
    165 		do not need locations, since local time is not defined there.
    166   </li>
    167   <li>
    168 	There should typically be at least one name for each ISO 3166-1
    169 		officially assigned two-letter code for an inhabited country
    170 		or territory.
    171   </li>
    172   <li>
    173 	If all the clocks in a region have agreed since 1970,
    174 		don't bother to include more than one location
    175 		even if subregions' clocks disagreed before 1970.
    176 		Otherwise these tables would become annoyingly large.
    177   </li>
    178   <li>
    179 	If a name is ambiguous, use a less ambiguous alternative;
    180 		e.g. many cities are named San José and Georgetown, so
    181 		prefer '<code>Costa_Rica</code>' to '<code>San_Jose</code>' and '<code>Guyana</code>' to '<code>Georgetown</code>'.
    182   </li>
    183   <li>
    184 	Keep locations compact.  Use cities or small islands, not countries
    185 		or regions, so that any future time zone changes do not split
    186 		locations into different time zones.  E.g. prefer
    187 		'<code>Paris</code>' to '<code>France</code>', since
    188 		France has had multiple time zones.
    189   </li>
    190   <li>
    191 	Use mainstream English spelling, e.g. prefer
    192 		'<code>Rome</code>' to '<code>Roma</code>', and prefer
    193 		'<code>Athens</code>' to the Greek
    194 		'<code>Αθήνα</code>' or the Romanized
    195 		'<code>Athína</code>'.
    196 		The POSIX file name restrictions encourage this rule.
    197   </li>
    198   <li>
    199 	Use the most populous among locations in a zone,
    200 		e.g. prefer '<code>Shanghai</code>' to
    201 		'<code>Beijing</code>'.  Among locations with
    202 		similar populations, pick the best-known location,
    203 		e.g. prefer '<code>Rome</code>' to '<code>Milan</code>'.
    204   </li>
    205   <li>
    206 	Use the singular form, e.g. prefer '<code>Canary</code>' to '<code>Canaries</code>'.
    207   </li>
    208   <li>
    209 	Omit common suffixes like '<code>_Islands</code>' and
    210 		'<code>_City</code>', unless that would lead to
    211 		ambiguity.  E.g. prefer '<code>Cayman</code>' to
    212 		'<code>Cayman_Islands</code>' and
    213 		'<code>Guatemala</code>' to
    214 		'<code>Guatemala_City</code>', but prefer
    215 		'<code>Mexico_City</code>' to '<code>Mexico</code>'
    216 		because the country
    217 		of Mexico has several time zones.
    218   </li>
    219   <li>
    220 	Use '<code>_</code>' to represent a space.
    221   </li>
    222   <li>
    223 	Omit '<code>.</code>' from abbreviations in names, e.g. prefer
    224 		'<code>St_Helena</code>' to '<code>St._Helena</code>'.
    225   </li>
    226   <li>
    227 	Do not change established names if they only marginally
    228 		violate the above rules.  For example, don't change
    229 		the existing name '<code>Rome</code>' to
    230 		'<code>Milan</code>' merely because
    231 		Milan's population has grown to be somewhat greater
    232 		than Rome's.
    233   </li>
    234   <li>
    235 	If a name is changed, put its old spelling in the
    236 		'<code>backward</code>' file.
    237 		This means old spellings will continue to work.
    238   </li>
    239 </ul>
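<p>
The purely syntactic rules in the list above (permitted characters,
component length, and slash placement) can be checked mechanically.
The following is a minimal Python sketch, not part of the tz code; the
function name <code>is_valid_zone_name</code> is hypothetical, and the
check deliberately ignores the judgment-based rules (population,
spelling, ambiguity) and the legacy-name exceptions.
</p>

```python
import re

# Assumption: this pattern paraphrases the first naming rule: ASCII letters,
# '.', '-' and '_' only, no digits, and at most 14 characters per component.
_COMPONENT = re.compile(r"[A-Za-z._-]{1,14}")


def is_valid_zone_name(name):
    """Return True if name passes the syntactic naming rules (illustrative)."""
    # A name must not be empty, contain '//', or start or end with '/'.
    if not name or name.startswith("/") or name.endswith("/") or "//" in name:
        return False
    for component in name.split("/"):
        # '.' and '..' are reserved file name components.
        if component in (".", ".."):
            return False
        # A component must not start with '-'.
        if component.startswith("-"):
            return False
        if not _COMPONENT.fullmatch(component):
            return False
    return True
```

<p>
With this sketch, '<code>America/New_York</code>' passes, while
'<code>Asia/Bandar_Seri_Begawan</code>' fails the 14-character limit and
the legacy name '<code>GMT0</code>' fails because it contains a digit.
</p>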
    240 
    241 <p>
    242 The file '<code>zone1970.tab</code>' lists geographical locations used
    243 to name time
    244 zone rules.  It is intended to be an exhaustive list of names for
    245 geographic regions as described above; this is a subset of the names
    246 in the data.  Although a '<code>zone1970.tab</code>' location's longitude
    247 corresponds to its LMT offset with one hour for every 15&deg; east
    248 longitude, this relationship is not exact.
    249 </p>
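<p>
The rule of one hour per 15&deg; of longitude works out to 240 seconds
of time per degree.  The following Python sketch shows the arithmetic;
the function name and the sample longitude are illustrative assumptions,
and the result is only an approximation of any LMT offset recorded in
the data.
</p>

```python
def approx_lmt_offset_seconds(longitude_degrees_east):
    """Approximate the LMT offset from UT, in seconds, for a longitude.

    One hour per 15 degrees is 3600 / 15 = 240 seconds per degree.
    Longitudes west of Greenwich are negative and give negative offsets.
    """
    return round(longitude_degrees_east * 240)


# Illustrative input: a longitude of 2 degrees 20 minutes east.
sample_offset = approx_lmt_offset_seconds(2 + 20 / 60)
```

<p>
Here <code>sample_offset</code> is 560 seconds, that is, 0:09:20.  A
recorded LMT offset for a nearby location may differ by some seconds,
which is why the relationship is described as inexact.
</p>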
    250 
    251 <p>
    252 Older versions of this package used a different naming scheme,
    253 and these older names are still supported.
    254 See the file '<code>backward</code>' for most of these older names
    255 (e.g., '<code>US/Eastern</code>' instead of '<code>America/New_York</code>').
    256 The other old-fashioned names still supported are
    257 '<code>WET</code>', '<code>CET</code>', '<code>MET</code>', and '<code>EET</code>' (see the file '<code>europe</code>').
    258 </p>
    259 
    260 <p>
    261 Older versions of this package defined legacy names that are
    262 incompatible with the first rule of location names, but which are
    263 still supported.  These legacy names are mostly defined in the file
    264 '<code>etcetera</code>'.  Also, the file '<code>backward</code>' defines the legacy names
    265 '<code>GMT0</code>', '<code>GMT-0</code>' and '<code>GMT+0</code>', and the file '<code>northamerica</code>' defines the
    266 legacy names '<code>EST5EDT</code>', '<code>CST6CDT</code>', '<code>MST7MDT</code>', and '<code>PST8PDT</code>'.
    267 </p>
    268 
    269 <p>
    270 Excluding '<code>backward</code>' should not affect the other data.  If
    271 '<code>backward</code>' is excluded, excluding '<code>etcetera</code>' should not affect the
    272 remaining data.
    273 </p>
    274 
    275 
    276   </section>
    277   <section>
    278     <h2 id="abbreviations">Time zone abbreviations</h2>
    279 <p>
    280 When this package is installed, it generates time zone abbreviations
    281 like '<code>EST</code>' to be compatible with human tradition and POSIX.
    282 Here are the general rules used for choosing time zone abbreviations,
    283 in decreasing order of importance:
    284 <ul>
    285   <li>
    286 	Use three to six characters that are ASCII alphanumerics or
    287 		'<code>+</code>' or '<code>-</code>'.
    288 		Previous editions of this database also used characters like
    289 		'<code> </code>' and '<code>?</code>', but these
    290 		characters have a special meaning to
    291 		the shell and cause commands like
    292 			'<code>set `date`</code>'
    293 		to have unexpected effects.
    294 		Previous editions of this rule required upper-case letters,
    295 		but the Congressman who introduced Chamorro Standard Time
    296 		preferred "ChST", so lower-case letters are now allowed.
    297 		Also, POSIX from 2001 on relaxed the rule to allow
    298 		'<code>-</code>', '<code>+</code>',
    299 		and alphanumeric characters from the portable character set
    300 		in the current locale.  In practice ASCII alphanumerics and
    301 		'<code>+</code>' and '<code>-</code>' are safe in all locales.
    302 
    303 		In other words, in the C locale the POSIX extended regular
    304 		expression <code>[-+[:alnum:]]{3,6}</code> should match
    305 		the abbreviation.
    306 		This guarantees that all abbreviations could have been
    307 		specified by a POSIX TZ string.
    308   </li>
    309   <li>
    310 	Use abbreviations that are in common use among English-speakers,
    311 		e.g. 'EST' for Eastern Standard Time in North America.
    312 		We assume that applications translate them to other languages
    313 		as part of the normal localization process; for example,
    314 		a French application might translate 'EST' to 'HNE'.
    315 
    316 <p><small>These abbreviations (for standard/daylight/etc. time) are:
    317 ACST/ACDT Australian Central,
    318 AST/ADT/APT/AWT/ADDT Atlantic,
    319 AEST/AEDT Australian Eastern,
    320 AHST/AHDT Alaska-Hawaii,
    321 AKST/AKDT Alaska,
    322 AWST/AWDT Australian Western,
    323 BST/BDT Bering,
    324 CAT/CAST Central Africa,
    325 CET/CEST/CEMT Central European,
    326 ChST Chamorro,
    327 CST/CDT/CWT/CPT/CDDT Central [North America],
    328 CST/CDT China,
    329 GMT/BST/IST/BDST Greenwich,
    330 EAT East Africa,
    331 EST/EDT/EWT/EPT/EDDT Eastern [North America],
    332 EET/EEST Eastern European,
    333 GST Guam,
    334 HST/HDT Hawaii,
    335 HKT/HKST Hong Kong,
    336 IST India,
    337 IST/GMT Irish,
    338 IST/IDT/IDDT Israel,
    339 JST/JDT Japan,
    340 KST/KDT Korea,
    341 MET/MEST Middle European (a backward-compatibility alias for Central European),
    342 MSK/MSD Moscow,
    343 MST/MDT/MWT/MPT/MDDT Mountain,
    344 NST/NDT/NWT/NPT/NDDT Newfoundland,
    345 NST/NDT/NWT/NPT Nome,
    346 NZMT/NZST New Zealand through 1945,
    347 NZST/NZDT New Zealand 1946&ndash;present,
    348 PKT/PKST Pakistan,
    349 PST/PDT/PWT/PPT/PDDT Pacific,
    350 SAST South Africa,
    351 SST Samoa,
    352 WAT/WAST West Africa,
    353 WET/WEST/WEMT Western European,
    354 WIB Waktu Indonesia Barat,
    355 WIT Waktu Indonesia Timur,
    356 WITA Waktu Indonesia Tengah,
    357 YST/YDT/YWT/YPT/YDDT Yukon</small>.</p>
    358   </li>
    359   <li>
    360 	For zones whose times are taken from a city's longitude, use the
    361 traditional <var>x</var>MT notation. The only abbreviation like this
    362 in current use is 'GMT'. The others are for timestamps before 1960,
    363 except that Monrovia Mean Time persisted until 1972. Typically,
    364 numeric abbreviations (e.g., '<code>-</code>004430' for MMT) would
    365 cause trouble here, as the numeric strings would exceed the POSIX length limit.
    366 
    367 <p><small>These abbreviations are:
    368 AMT Amsterdam, Asunción, Athens;
    369 BMT Baghdad, Bangkok, Batavia, Bern, Bogotá, Bridgetown, Brussels, Bucharest;
    370 CMT Calamarca, Caracas, Chisinau, Colón, Copenhagen, Córdoba;
    371 DMT Dublin/Dunsink;
    372 EMT Easter;
    373 FFMT Fort-de-France;
    374 FMT Funchal;
    375 GMT Greenwich;
    376 HMT Havana, Helsinki, Horta, Howrah;
    377 IMT Irkutsk, Istanbul;
    378 JMT Jerusalem;
    379 KMT Kaunas, Kiev, Kingston;
    380 LMT Lima, Lisbon, local, Luanda;
    381 MMT Macassar, Madras, Malé, Managua, Minsk, Monrovia, Montevideo, Moratuwa,
    382  Moscow;
    383 PLMT Phù Liễn;
    384 PMT Paramaribo, Paris, Perm, Pontianak, Prague;
    385 PMMT Port Moresby;
    386 QMT Quito;
    387 RMT Rangoon, Riga, Rome;
    388 SDMT Santo Domingo;
    389 SJMT San José;
    390 SMT Santiago, Simferopol, Singapore, Stanley;
    391 TBMT Tbilisi;
    392 TMT Tallinn, Tehran;
    393 WMT Warsaw</small>.</p>
    394 
    395 <p><small>A few abbreviations also follow the pattern that
    396 GMT/BST established for time in the UK. They are:
    397 
    398 CMT/BST for Calamarca Mean Time and Bolivian Summer Time
    399 1890&ndash;1932, DMT/IST for Dublin/Dunsink Mean Time and Irish Summer Time
    400 1880&ndash;1916, MMT/MST/MDST for Moscow 1880&ndash;1919, and RMT/LST
    401 for Riga Mean Time and Latvian Summer Time 1880&ndash;1926.
    402 An extra-special case is SET for Swedish Time (<em>svensk
    403 normaltid</em>) 1879&ndash;1899, 3&deg; west of the Stockholm
    404 Observatory.</small></p>
    405   </li>
    406   <li>
    407 	Use 'LMT' for local mean time of locations before the introduction
    408 		of standard time; see "<a href="#scope">Scope of the
    409 		tz database</a>".
    410   </li>
    411   <li>
    412 	If there is no common English abbreviation, use numeric offsets like
    413 		<code>-</code>05 and <code>+</code>0830 that are
    414 		generated by zic's <code>%z</code> notation.
    415   </li>
    416   <li>
    417 	Use current abbreviations for older timestamps to avoid confusion.
    418 		For example, in 1910 a common English abbreviation for UT +01
    419 		in central Europe was 'MEZ' (short for both "Middle European
    420 		Zone" and "Mitteleuropäische Zeit" in German).  Nowadays
    421 		'CET' ("Central European Time") is more common in English, and
    422 		the database uses 'CET' even for circa-1910 timestamps as this
    423 		is less confusing for modern users and avoids the need for
    424 		determining when 'CET' supplanted 'MEZ' in common usage.
    425   </li>
    426   <li>
    427 	Use a consistent style in a zone's history.  For example, if a zone's
    428 		history tends to use numeric abbreviations and a particular
    429 		entry could go either way, use a numeric abbreviation.
    430   </li>
    431   <li>
    432 	Use UT (with time zone abbreviation '<code>-</code>00') for
    433 		locations while uninhabited.  The leading
    434 		'<code>-</code>' is a flag that the time
    435 		zone is in some sense undefined; this notation is
    436 		derived from Internet RFC 3339.
    437   </li>
    438 </ul>
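<p>
Two of the rules above lend themselves to a short illustration: the
character and length restriction, and the numeric abbreviations.  The
Python sketch below is not taken from zic.  Python's <code>re</code>
module has no POSIX bracket classes, so <code>[:alnum:]</code> is
spelled out as ASCII letters and digits, matching its meaning in the C
locale.  The formatter is an assumption about the shape of numeric
abbreviations based on the examples in this section
(<code>-</code>05, <code>+</code>0830, <code>-</code>004430), and both
function names are hypothetical.
</p>

```python
import re

# ASCII equivalent of the POSIX ERE [-+[:alnum:]]{3,6} in the C locale.
_ABBREVIATION = re.compile(r"[-+A-Za-z0-9]{3,6}")


def is_posix_safe_abbreviation(abbreviation):
    """Return True if the whole abbreviation matches the pattern above."""
    return _ABBREVIATION.fullmatch(abbreviation) is not None


def numeric_abbreviation(offset_seconds):
    """Format a UT offset as sign plus hh, hhmm or hhmmss (shortest exact)."""
    sign = "-" if offset_seconds < 0 else "+"
    hours, remainder = divmod(abs(offset_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    text = f"{sign}{hours:02d}"
    if minutes or seconds:
        text += f"{minutes:02d}"
    if seconds:
        text += f"{seconds:02d}"
    return text
```

<p>
For example, <code>numeric_abbreviation(-18000)</code> yields
'<code>-05</code>' and <code>numeric_abbreviation(30600)</code> yields
'<code>+0830</code>', both of which pass the check.  An offset of
&minus;0:44:30 formats as '<code>-004430</code>', which is seven
characters long and therefore fails it; this is the length problem
noted above for Monrovia Mean Time.  The sketch does not handle the
special '<code>-00</code>' abbreviation for uninhabited locations.
</p>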
    439 <p>
    440 Application writers should note that these abbreviations are ambiguous
    441 in practice: e.g., 'CST' means one thing in China and something else
    442 in North America, and 'IST' can refer to time in India, Ireland or
    443 Israel. To avoid ambiguity, use numeric UT offsets like
    444 '<code>-</code>0600' instead of time zone abbreviations like 'CST'.
    445 </p>
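<p>
As a small illustration of that advice, the Python sketch below prints
timestamps with a numeric UT offset instead of an abbreviation.  It
uses fixed offsets from the standard <code>datetime</code> module
rather than the tz data; the variable names and the sample date are
assumptions made for the example.
</p>

```python
from datetime import datetime, timedelta, timezone


def format_with_numeric_offset(moment):
    """Render an aware datetime with a numeric UT offset such as -0600."""
    return moment.strftime("%Y-%m-%d %H:%M %z")


# Two fixed offsets that are both commonly abbreviated 'CST'.
north_america_central = timezone(timedelta(hours=-6))
china = timezone(timedelta(hours=8))

chicago_style = format_with_numeric_offset(
    datetime(2024, 1, 15, 12, 0, tzinfo=north_america_central))
shanghai_style = format_with_numeric_offset(
    datetime(2024, 1, 15, 12, 0, tzinfo=china))
```

<p>
The two results end in '<code>-0600</code>' and '<code>+0800</code>'
respectively, so a reader can tell them apart even though both might
otherwise have been labeled 'CST'.
</p>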
    446   </section>
    447 
    448 
    449   <section>
    450     <h2 id="accuracy">Accuracy of the tz database</h2>
    451 <p>
    452 The tz database is not authoritative, and it surely has errors.
    453 Corrections are welcome and encouraged; see the file <code>CONTRIBUTING</code>.
    454 Users requiring authoritative data should consult national standards
    455 bodies and the references cited in the database's comments.
    456 </p>
    457 
    458 <p>
    459 Errors in the tz database arise from many sources:
    460 </p>
    461 <ul>
    462   <li>
    463    The tz database predicts future timestamps, and current predictions
    464    will be incorrect after future governments change the rules.
    465    For example, if today someone schedules a meeting for 13:00 next
    466    October 1, Casablanca time, and tomorrow Morocco changes its
    467    daylight saving rules, software can mess up after the rule change
    468    if it blithely relies on conversions made before the change.
    469   </li>
    470   <li>
    471    The pre-1970 entries in this database cover only a tiny sliver of how
    472    clocks actually behaved; the vast majority of the necessary
    473    information was lost or never recorded.  Thousands more zones would
    474    be needed if the tz database's scope were extended to cover even
    475    just the known or guessed history of standard time; for example,
    476    the current single entry for France would need to split into dozens
    477    of entries, perhaps hundreds.  And in most of the world even this
    478    approach would be misleading due to widespread disagreement or
    479    indifference about what times should be observed.  In her 2015 book
    480    <cite>The Global Transformation of Time, 1870&ndash;1950</cite>, Vanessa Ogle writes
    481    "Outside of Europe and North America there was no system of time
    482    zones at all, often not even a stable landscape of mean times,
    483    prior to the middle decades of the twentieth century".  See:
    484    Timothy Shenk, <a
    485    href="https://www.dissentmagazine.org/blog/booked-a-global-history-of-time-vanessa-ogle">Booked:
    486    A Global History of Time</a>. <cite>Dissent</cite> 2015-12-17.
    487   </li>
    488   <li>
    489    Most of the pre-1970 data entries come from unreliable sources, often
    490    astrology books that lack citations and whose compilers evidently
    491    invented entries when the true facts were unknown, without
    492    reporting which entries were known and which were invented.
    493    These books often contradict each other or give implausible entries,
    494    and on the rare occasions when they are checked they are
    495    typically found to be incorrect.
    496   </li>
    497   <li>
    498    For the UK the tz database relies on years of first-class work done by
    499    Joseph Myers and others; see
    500    "<a href="https://www.polyomino.org.uk/british-time/">History of
    501    legal time in Britain</a>".
    502    Other countries are not done nearly as well.
    503   </li>
    504   <li>
    505    Sometimes, different people in the same city would maintain clocks
    506    that differed significantly.  Railway time was used by railroad
    507    companies (which did not always agree with each other),
    508    church-clock time was used for birth certificates, etc.
    509    Often this was merely common practice, but sometimes it was set by law.
    510    For example, from 1891 to 1911 the UT offset in France was legally
    511    0:09:21 outside train stations and 0:04:21 inside.
    512   </li>
    513   <li>
    514    Although a named location in the tz database stands for the
    515    containing region, its pre-1970 data entries are often accurate for
    516    only a small subset of that region.  For example, <code>Europe/London</code>
    517    stands for the United Kingdom, but its pre-1847 times are valid
    518    only for locations that have London's exact meridian, and its 1847
    519    transition to GMT is known to be valid only for the L&amp;NW and the
    520    Caledonian railways.
    521   </li>
    522   <li>
    523    The tz database does not record the earliest time for which a zone's
    524    data entries are thereafter valid for every location in the region.
    525    For example, <code>Europe/London</code> is valid for all locations in its
    526    region after GMT was made the standard time, but the date of
    527    standardization (1880-08-02) is not in the tz database, other than
    528    in commentary.  For many zones the earliest time of validity is
    529    unknown.
    530   </li>
    531   <li>
    532    The tz database does not record a region's boundaries, and in many
    533    cases the boundaries are not known.  For example, the zone
    534    <code>America/Kentucky/Louisville</code> represents a region around
    535    the city of
    536    Louisville, the boundaries of which are unclear.
    537   </li>
    538   <li>
    539    Changes that are modeled as instantaneous transitions in the tz
    540    database were often spread out over hours, days, or even decades.
    541   </li>
    542   <li>
    543    Even if the time is specified by law, locations sometimes
    544    deliberately flout the law.
    545   </li>
    546   <li>
    547    Early timekeeping practices, even assuming perfect clocks, were
    548    often not specified to the accuracy that the tz database requires.
    549   </li>
    550   <li>
    551    Sometimes historical timekeeping was specified more precisely
    552    than what the tz database can handle.  For example, from 1909 to
    553    1937 Netherlands clocks were legally UT +00:19:32.13, but the tz
    554    database cannot represent the fractional second.
    555   </li>
    556   <li>
    557    Even when all the timestamp transitions recorded by the tz database
    558    are correct, the tz rules that generate them may not faithfully
    559    reflect the historical rules.  For example, from 1922 until World
    560    War II the UK moved clocks forward the day following the third
    561    Saturday in April unless that was Easter, in which case it moved
    562    clocks forward the previous Sunday.  Because the tz database has no
    563    way to specify Easter, these exceptional years are entered as
    564    separate tz Rule lines, even though the legal rules did not change.
    565   </li>
    566   <li>
    567    The tz database models pre-standard time using the proleptic Gregorian
    568    calendar and local mean time (LMT), but many people used other
    569    calendars and other timescales.  For example, the Roman Empire used
    570    the Julian calendar, and had 12 varying-length daytime hours with a
    571    non-hour-based system at night.
    572   </li>
    573   <li>
    574    Early clocks were less reliable, and data entries do not represent
    575    clock error.
    576   </li>
    577   <li>
    578    The tz database assumes Universal Time (UT) as an origin, even
    579    though UT is not standardized for older timestamps.  In the tz
    580    database commentary, UT denotes a family of time standards that
    581    includes Coordinated Universal Time (UTC) along with other variants
    582    such as UT1 and GMT, with days starting at midnight.  Although UT
    583    equals UTC for modern timestamps, UTC was not defined until 1960,
    584    so commentary uses the more-general abbreviation UT for timestamps
    585    that might predate 1960.  Since UT, UT1, etc. disagree slightly,
    586    and since pre-1972 UTC seconds varied in length, interpretation of
    587    older timestamps can be problematic when subsecond accuracy is
    588    needed.
    589   </li>
    590   <li>
    591    Civil time was not based on atomic time before 1972, and we don't
    592    know the history of earth's rotation accurately enough to map SI
    593    seconds to historical solar time to more than about one-hour
    594    accuracy.  See: Stephenson FR, Morrison LV, Hohenkerk CY.
    595    <a href="http://dx.doi.org/10.1098/rspa.2016.0404">Measurement
    596    of the Earth's rotation: 720 BC to AD 2015</a>.
    597    <cite>Proc Royal Soc A</cite>. 2016 Dec 7;472:20160404.
    598    Also see: Espenak F. <a
    599    href="https://eclipse.gsfc.nasa.gov/SEhelp/uncertainty2004.html">Uncertainty
    600    in Delta T (ΔT)</a>.
    601   </li>
    602   <li>
    603    The relationship between POSIX time (that is, UTC but ignoring leap
    604    seconds) and UTC is not agreed upon after 1972.  Although the POSIX
    605    clock officially stops during an inserted leap second, at least one
    606    proposed standard has it jumping back a second instead; and in
    607    practice POSIX clocks more typically either progress glacially during
    608    a leap second, or are slightly slowed while near a leap second.
    609   </li>
    610   <li>
    611    The tz database does not represent how uncertain its information is.
    612    Ideally it would contain information about when data entries are
    613    incomplete or dicey.  Partial temporal knowledge is a field of
    614    active research, though, and it's not clear how to apply it here.
    615   </li>
    616 </ul>
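<p>
The first item in the list above, about predictions that later turn
out to be wrong, can be made concrete with a little arithmetic.  The
Python sketch below uses hypothetical UT offsets and a hypothetical
year; it does not reflect the actual rules of any location.  It shows
how a UTC instant computed before a rule change drifts away from the
intended local wall-clock time.
</p>

```python
from datetime import datetime, timedelta, timezone


def local_to_utc(local_wall_time, utc_offset):
    """Convert a naive local wall-clock time to UTC using a given offset."""
    return (local_wall_time - utc_offset).replace(tzinfo=timezone.utc)


# Hypothetical meeting at 13:00 local time on October 1.
meeting_local = datetime(2030, 10, 1, 13, 0)

# Hypothetical offsets: what the data predicted when the meeting was
# scheduled, and what applies after the government changes the rules.
offset_when_scheduled = timedelta(hours=1)
offset_after_change = timedelta(hours=0)

stored_utc = local_to_utc(meeting_local, offset_when_scheduled)
correct_utc = local_to_utc(meeting_local, offset_after_change)
error = correct_utc - stored_utc
```

<p>
Under these assumptions the stored instant is one hour early.  One way
to reduce this risk is to store the local wall-clock time together with
the zone name and to convert only when needed, using the data current
at that moment.
</p>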
    617 <p>
    618 In short, many, perhaps most, of the tz database's pre-1970 and future
    619 timestamps are either wrong or misleading.  Any attempt to pass the
    620 tz database off as the definition of time should be unacceptable to
    621 anybody who cares about the facts.  In particular, the tz database's
    622 LMT offsets should not be considered meaningful, and should not prompt
    623 creation of zones merely because two locations differ in LMT or
    624 transitioned to standard time at different dates.
    625 </p>
    626   </section>
    627 
    628 
    629   <section>
    630     <h2 id="functions">Time and date functions</h2>
    631 <p>
    632 The tz code contains time and date functions that are upwards
    633 compatible with those of POSIX.
    634 </p>
    635 
    636 <p>
    637 POSIX has the following properties and limitations.
    638 </p>
    639 <ul>
    640   <li>
    641     <p>
    642 	In POSIX, time display in a process is controlled by the
    643 	environment variable TZ.  Unfortunately, the POSIX TZ string takes
    644 	a form that is hard to describe and is error-prone in practice.
    645 	Also, POSIX TZ strings can't deal with other (for example, Israeli)
    646 	daylight saving time rules, or situations where more than two
    647 	time zone abbreviations are used in an area.
    648     </p>
    649     <p>
    650       The POSIX TZ string takes the following form:
    651     </p>
    652     <p>
    653       <var>std</var><var>offset</var>[<var>dst</var>[<var>offset</var>][<code>,</code><var>date</var>[<code>/</code><var>time</var>]<code>,</code><var>date</var>[<code>/</code><var>time</var>]]]
    654     </p>
    655     <p>
    656 	where:
    657     <dl>
    658       <dt><var>std</var> and <var>dst</var></dt><dd>
    659 		are 3 or more characters specifying the standard
    660 		and daylight saving time (DST) zone names.
    661 		Starting with POSIX.1-2001, <var>std</var>
    662 		and <var>dst</var> may also be
    663 		in a quoted form like '<code>&lt;+09&gt;</code>'; this allows
    664 		"<code>+</code>" and "<code>-</code>" in the names.
    665       </dd>
    666       <dt><var>offset</var></dt><dd>
    667 		is of the form
    668 		'<code>[&plusmn;]<var>hh</var>[:<var>mm</var>[:<var>ss</var>]]</code>'
    669 		and specifies the offset west of UT.  '<var>hh</var>'
    670 		may be a single digit; 0&le;<var>hh</var>&le;24.
    671 		The default DST offset is one hour ahead of standard time.
    672       </dd>
    673       <dt><var>date</var>[<code>/</code><var>time</var>]<code>,</code><var>date</var>[<code>/</code><var>time</var>]</dt><dd>
    674 		specifies the beginning and end of DST.  If this is absent,
    675 		the system supplies its own rules for DST, and these can
    676 		differ from year to year; typically US DST rules are used.
    677       </dd>
    678       <dt><var>time</var></dt><dd>
    679 		takes the form
    680 		'<var>hh</var>[<code>:</code><var>mm</var>[<code>:</code><var>ss</var>]]'
    681 		and defaults to 02:00.
    682 		This is the same format as the offset, except that a
    683 		leading '<code>+</code>' or '<code>-</code>' is not allowed.
    684       </dd>
    685       <dt><var>date</var></dt><dd>
    686 		takes one of the following forms:
    687 	<dl>
    688 	  <dt><code>J</code><var>n</var> (1&le;<var>n</var>&le;365)</dt><dd>
    689 			origin-1 day number not counting February 29
    690           </dd>
    691 	  <dt><var>n</var> (0&le;<var>n</var>&le;365)</dt><dd>
    692 			origin-0 day number counting February 29 if present
    693           </dd>
    694 	  <dt><code>M</code><var>m</var><code>.</code><var>n</var><code>.</code><var>d</var> (0[Sunday]&le;<var>d</var>&le;6[Saturday], 1&le;<var>n</var>&le;5, 1&le;<var>m</var>&le;12)</dt><dd>
    695 			for the <var>d</var>th day of
    696 			week <var>n</var> of month <var>m</var> of the
    697 			year, where week 1 is the first week in which
    698 			day <var>d</var> appears, and '<code>5</code>'
    699 			stands for the last week in which
    700 			day <var>d</var> appears
    701 			(which may be either the 4th or 5th week).
    702 			Typically, this is the only useful form;
    703 			the <var>n</var>
    704 			and <code>J</code><var>n</var> forms are
    705 			rarely used.
    706 	  </dd>
    707 </dl>
    708 </dd>
    709 </dl>
    710 	Here is an example POSIX TZ string for New Zealand after 2007.
    711 	It says that standard time (NZST) is 12 hours ahead of UT,
    712 	and that daylight saving time (NZDT) is observed from September's
    713 	last Sunday at 02:00 until April's first Sunday at 03:00:
    714 
    715         <pre><code>TZ='NZST-12NZDT,M9.5.0,M4.1.0/3'</code></pre>
    716 
    717 	This POSIX TZ string is hard to remember, and mishandles some
    718 	timestamps before 2008.  With this package you can use this
    719 	instead:
    720 
    721 	<pre><code>TZ='Pacific/Auckland'</code></pre>
    722   </li>
    723   <li>
    724 	POSIX does not define the exact meaning of TZ values like
    725 	"<code>EST5EDT</code>".
    726 	Typically the current US DST rules are used to interpret such values,
    727 	but this means that the US DST rules are compiled into each program
    728 	that does time conversion.  Consequently, when the US time conversion
    729 	rules change (as they did in 1987), all programs that
    730 	do time conversion must be recompiled to ensure proper results.
    731   </li>
    732   <li>
    733 	The TZ environment variable is process-global, which makes it hard
    734 	to write efficient, thread-safe applications that need access
    735 	to multiple time zones.
    736   </li>
    737   <li>
    738 	In POSIX, there's no tamper-proof way for a process to learn the
    739 	system's best idea of local wall clock time.  (This is important for
    740 	applications that an administrator wants used only at certain
    741 	times &ndash;
    742 	without regard to whether the user has fiddled with the TZ environment
    743 	variable.  While an administrator can "do everything in UT" to get
    744 	around the problem, doing so is inconvenient and precludes handling
    745 	daylight saving time shifts &ndash; as might be required to limit phone
    746 	calls to off-peak hours.)
    747   </li>
    748   <li>
    749 	POSIX provides no convenient and efficient way to determine the UT
    750 	offset and time zone abbreviation of arbitrary timestamps,
    751 	particularly for time zone settings that do not fit into the
    752 	POSIX model.
    753   </li>
    754   <li>
    755 	POSIX requires that systems ignore leap seconds.
    756   </li>
    757   <li>
    758 	The tz code attempts to support all the <code>time_t</code>
    759 	implementations allowed by POSIX.  The <code>time_t</code>
    760 	type represents a nonnegative count of
    761 	seconds since 1970-01-01 00:00:00 UTC, ignoring leap seconds.
    762 	In practice, <code>time_t</code> is usually a signed 64- or
    763 	32-bit integer; 32-bit signed <code>time_t</code> values stop
    764 	working after 2038-01-19 03:14:07 UTC, so
    765 	new implementations these days typically use a signed 64-bit integer.
    766 	Unsigned 32-bit integers are used on one or two platforms,
    767 	and 36-bit and 40-bit integers are also used occasionally.
    768 	Although earlier POSIX versions allowed <code>time_t</code> to be a
    769 	floating-point type, this was not supported by any practical
    770 	systems, and POSIX.1-2013 and the tz code both
    771 	require <code>time_t</code>
    772 	to be an integer type.
    773   </li>
    774 </ul>
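<p>
As a rough illustration of the first list item above, the following C
sketch converts a timestamp to local time under a POSIX TZ string.
The helper name <code>local_time_in_tz</code> is hypothetical and is
not part of POSIX or of the tz code; the sketch assumes a platform
that provides <code>setenv</code>, <code>tzset</code>,
and <code>localtime_r</code>.  Because it overwrites the
process-global TZ variable, it also shows why the POSIX interface is
awkward for programs that need several time zones at once.
</p>

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose setenv, tzset, localtime_r */
#endif
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: convert T to broken-down local time under the
   TZ setting TZ_VALUE.  Return 0 on success, -1 on failure.
   Caution: this overwrites the process-global TZ variable, so it is
   not safe to call from several threads at once.  */
int local_time_in_tz(const char *tz_value, time_t t, struct tm *result)
{
    assert(tz_value != NULL && result != NULL);
    if (setenv("TZ", tz_value, 1) != 0)
        return -1;
    tzset();   /* make the library re-read TZ */
    return localtime_r(&t, result) != NULL ? 0 : -1;
}
```

<p>
With the New Zealand string shown earlier,
'<code>NZST-12NZDT,M9.5.0,M4.1.0/3</code>', the timestamp 1700000000
(2023-11-14 22:13:20 UTC) should come out as 11:13:20 on 2023-11-15
with <code>tm_isdst</code> positive, since that date falls within
NZDT.
</p>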
    775 <p>
    776 These are the extensions that have been made to the POSIX functions:
    777 </p>
    778 <ul>
    779   <li>
    780     <p>
    781 	The TZ environment variable is used in generating the name of a file
    782 	from which time zone information is read (or is interpreted a la
    783 	POSIX); TZ is no longer constrained to be a three-letter time zone
    784 	name followed by a number of hours and an optional three-letter
    785 	daylight time zone name.  The daylight saving time rules to be used
    786 	for a particular time zone are encoded in the time zone file;
    787 	the format of the file allows U.S., Australian, and other rules to be
    788 	encoded, and allows for situations where more than two time zone
    789 	abbreviations are used.
    790     </p>
    791     <p>
    792 	It was recognized that allowing the TZ environment variable to
    793 	take on values such as '<code>America/New_York</code>' might
    794 	cause "old" programs
    795 	(that expect TZ to have a certain form) to operate incorrectly;
    796 	consideration was given to using some other environment variable
    797 	(for example, TIMEZONE) to hold the string used to generate the
    798 	time zone information file name.  In the end, however, it was decided
    799 	to continue using TZ: it is widely used for time zone purposes;
    800 	separately maintaining both TZ and TIMEZONE seemed a nuisance;
    801 	and systems where "new" forms of TZ might cause problems can simply
    802 	use TZ values such as "<code>EST5EDT</code>" which can be used both by
    803 	"new" programs (a la POSIX) and "old" programs (as zone names and
    804 	offsets).
    805     </p>
    806 </li>
    807 <li>
    808 	The code supports platforms with a UT offset member
    809 	in <code>struct tm</code>,
    810 	e.g., <code>tm_gmtoff</code>.
    811 </li>
    812 <li>
    813 	The code supports platforms with a time zone abbreviation member in
    814 	<code>struct tm</code>, e.g., <code>tm_zone</code>.
    815 </li>
    816 <li>
    817 	Since the TZ environment variable can now be used to control time
    818 	conversion, the <code>daylight</code>
    819 	and <code>timezone</code> variables are no longer needed.
    820 	(These variables are defined and set by <code>tzset</code>;
    821 	however, their values will not be used
    822 	by <code>localtime</code>.)
    823 </li>
    824 <li>
    825 	Functions <code>tzalloc</code>, <code>tzfree</code>,
    826 	<code>localtime_rz</code>, and <code>mktime_z</code> have been added for
    827 	more-efficient thread-safe applications that need to use
    828 	multiple time zones.  The <code>tzalloc</code>
    829 	and <code>tzfree</code> functions allocate and free objects of
    830 	type <code>timezone_t</code>, and <code>localtime_rz</code>
    831 	and <code>mktime_z</code> are like <code>localtime_r</code>
    832 	and <code>mktime</code> with an extra
    833 	<code>timezone_t</code> argument.  The functions were inspired
    834 	by NetBSD.
    835 </li>
    836 <li>
    837 	A function <code>tzsetwall</code> has been added to arrange
    838 	for the system's
    839 	best approximation to local wall clock time to be delivered by
    840 	subsequent calls to <code>localtime</code>.  Source code for portable
    841 	applications that "must" run on local wall clock time should call
    842 	<code>tzsetwall</code>; if such code is moved to "old" systems that don't
    843 	provide <code>tzsetwall</code>, you won't be able to generate an executable program.
    844 	(These time zone functions also arrange for local wall clock time to be
    845 	used if <code>tzset</code> is called &ndash; directly or indirectly &ndash;
    846 	and there's no TZ
    847 	environment variable; portable applications should not, however, rely
    848 	on this behavior since it's not the way SVR2 systems behave.)
    849 </li>
    850 <li>
    851 	Negative <code>time_t</code> values are supported, on systems
    852 	where <code>time_t</code> is signed.
    853 </li>
    854 <li>
    855 	These functions can account for leap seconds, thanks to Bradley White.
    856 </li>
    857 </ul>
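<p>
The <code>tm_gmtoff</code> and <code>tm_zone</code> members mentioned
in the list above can be read as in the following C sketch.  The
structure <code>zone_info</code> and the
function <code>zone_info_at</code> are hypothetical names chosen for
this illustration.  The sketch assumes a platform whose
<code>struct tm</code> has both members (for example glibc
with <code>_DEFAULT_SOURCE</code>, or the BSDs), and it sets TZ
through the ordinary POSIX interface rather than through
<code>tzalloc</code>, which many platforms lack.
</p>

```c
/* glibc exposes tm_gmtoff and tm_zone only when _DEFAULT_SOURCE is
   defined; the BSDs and macOS expose them by default.  */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical result type for this sketch.  */
struct zone_info {
    long ut_offset;   /* seconds east of UT, copied from tm_gmtoff */
    char abbr[16];    /* abbreviation, copied from tm_zone */
};

/* Fill *INFO with the UT offset and abbreviation in effect at T under
   the TZ setting TZ_VALUE.  Return 0 on success, -1 on failure.
   Like any code that sets TZ, this is not thread-safe.  */
int zone_info_at(const char *tz_value, time_t t, struct zone_info *info)
{
    struct tm tm;
    assert(tz_value != NULL && info != NULL);
    if (setenv("TZ", tz_value, 1) != 0)
        return -1;
    tzset();
    if (localtime_r(&t, &tm) == NULL)
        return -1;
    info->ut_offset = tm.tm_gmtoff;
    /* tm_zone may point into storage that a later tzset invalidates,
       so copy the string instead of keeping the pointer.  */
    strncpy(info->abbr, tm.tm_zone ? tm.tm_zone : "", sizeof info->abbr - 1);
    info->abbr[sizeof info->abbr - 1] = '\0';
    return 0;
}
```

<p>
Note the sign convention: <code>tm_gmtoff</code> counts seconds east
of UT, whereas the offset in a POSIX TZ string counts hours west.
For '<code>NZST-12NZDT,M9.5.0,M4.1.0/3</code>' the sketch should
therefore report 43200 during NZST and 46800 during NZDT.
</p>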
    858 <p>
    859 Points of interest to folks with other systems:
    860 </p>
    861 <ul>
    862   <li>
    863 	Code compatible with this package is already part of many platforms,
    864 	including GNU/Linux, Android, the BSDs, Chromium OS, Cygwin, AIX, iOS,
    865 	BlackBerry 10, macOS, Microsoft Windows, OpenVMS, and Solaris.
    866 	On such hosts, the primary use of this package
    867 	is to update obsolete time zone rule tables.
    868 	To do this, you may need to compile the time zone compiler
    869 	'<code>zic</code>' supplied with this package instead of using
    870 	the system '<code>zic</code>', since the format
    871 	of <code>zic</code>'s input is occasionally extended, and a
    872 	platform may still be shipping an older <code>zic</code>.
    873   </li>
    874   <li>
    875 	The UNIX Version 7 <code>timezone</code> function is not
    876 	present in this package;
    877 	it's impossible to reliably map <code>timezone</code>'s arguments (a "minutes west
    878 	of GMT" value and a "daylight saving time in effect" flag) to a
    879 	time zone abbreviation, and we refuse to guess.
    880 	Programs that in the past used the <code>timezone</code> function may now examine
    881 	<code>localtime(&amp;clock)-&gt;tm_zone</code>
    882 	(if <code>TM_ZONE</code> is defined) or
    883 	<code>tzname[localtime(&amp;clock)-&gt;tm_isdst]</code>
    884 	(if <code>HAVE_TZNAME</code> is defined)
    885 	to learn the correct time zone abbreviation to use.
    886   </li>
    887   <li>
    888 	The 4.2BSD <code>gettimeofday</code> function is not used in
    889 	this package.
    890 	That function formerly let users obtain the current UTC offset and DST flag,
    891 	but this functionality was removed in later versions of BSD.
    892   </li>
    893   <li>
    894 	In SVR2, time conversion fails for near-minimum or near-maximum
    895 	<code>time_t</code> values when doing conversions for places
    896 	that don't use UT.
    897 	This package takes care to do these conversions correctly.
    898 	A comment in the source code tells how to get compatibly wrong
    899 	results.
    900   </li>
    901 </ul>
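<p>
The <code>tzname</code>-based replacement for the Version 7
<code>timezone</code> function described above might look like the
following C sketch.  The function name <code>abbreviation_at</code>
is hypothetical, and the sketch uses plain POSIX feature macros
rather than this package's <code>TM_ZONE</code>
and <code>HAVE_TZNAME</code> configuration macros.
Since <code>tzname</code> holds only two strings, the result can be
wrong for places that have used more than two abbreviations.
</p>

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose setenv, tzset, localtime_r, tzname */
#endif
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: return the time zone abbreviation in effect at
   CLOCK under the TZ setting TZ_VALUE, or NULL if it cannot be
   determined.  The result points into tzname storage and should be
   treated as valid only until the next call to tzset.  */
const char *abbreviation_at(const char *tz_value, time_t clock)
{
    struct tm tm;
    assert(tz_value != NULL);
    if (setenv("TZ", tz_value, 1) != 0)
        return NULL;
    tzset();   /* re-read TZ and (re)initialize tzname */
    if (localtime_r(&clock, &tm) == NULL || tm.tm_isdst < 0)
        return NULL;
    /* tzname[0] is the standard abbreviation, tzname[1] the DST one.  */
    return tzname[tm.tm_isdst > 0];
}
```

<p>
Unlike the old <code>timezone</code> function, this takes a timestamp
rather than an offset and a flag, so the library can consult the
rules in effect at that instant instead of guessing.
</p>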
    902 <p>
    903 The functions that are conditionally compiled
    904 if <code>STD_INSPIRED</code> is defined
    905 should, at this point, be looked on primarily as food for thought.  They are
    906 not in any sense "standard compatible" &ndash; some are not, in fact,
    907 specified in <em>any</em> standard.  They do, however, represent responses of
    908 various authors to
    909 standardization proposals.
    910 </p>
    911 
    912 <p>
    913 Other time conversion proposals, in particular the one developed by folks at
    914 Hewlett Packard, offer a wider selection of functions that provide capabilities
    915 beyond those provided here.  The absence of such functions from this package
    916 is not meant to discourage the development, standardization, or use of such
    917 functions.  Rather, their absence reflects the decision to make this package
    918 contain valid extensions to POSIX, to ensure its broad acceptability.  If
    919 more powerful time conversion functions can be standardized, so much the
    920 better.
    921 </p>
    922   </section>
    923 
    924 
    925   <section>
    926     <h2 id="stability">Interface stability</h2>
    927 <p>
    928 The tz code and data supply the following interfaces:
    929 </p>
    930 <ul>
    931   <li>
    932    A set of zone names as per "<a href="#naming">Names of time zone
    933    rules</a>" above.
    934   </li>
    935   <li>
    936    Library functions described in "<a href="#functions">Time and date
    937    functions</a>" above.
    938   </li>
    939   <li>
    940    The programs <code>tzselect</code>, <code>zdump</code>,
    941    and <code>zic</code>, documented in their man pages.
    942   </li>
    943   <li>
    944    The format of <code>zic</code> input files, documented in
    945    the <code>zic</code> man page.
    946   </li>
    947   <li>
    948    The format of <code>zic</code> output files, documented in
    949    the <code>tzfile</code> man page.
    950   </li>
    951   <li>
    952    The format of zone table files, documented in <code>zone1970.tab</code>.
    953   </li>
    954   <li>
    955    The format of the country code file, documented in <code>iso3166.tab</code>.
    956   </li>
    957   <li>
    958    The version number of the code and data, as the first line of
    959    the text file '<code>version</code>' in each release.
    960   </li>
    961 </ul>
    962 <p>
    963 Interface changes in a release attempt to preserve compatibility with
    964 recent releases.  For example, tz data files typically do not rely on
    965 recently-added <code>zic</code> features, so that users can run
    966 older <code>zic</code> versions to process newer data
    967 files.  <a href="tz-link.html">Sources for time zone and daylight
    968 saving time data</a> describes how
    969 releases are tagged and distributed.
    970 </p>
    971 
    972 <p>
    973 Interfaces not listed above are less stable.  For example, users
    974 should not rely on particular UT offsets or abbreviations for
    975 timestamps, as data entries are often based on guesswork and these
    976 guesses may be corrected or improved.
    977 </p>
    978   </section>
    979 
    980 
    981   <section>
    982     <h2 id="calendar">Calendrical issues</h2>
    983 <p>
    984 Calendrical issues are a bit out of scope for a time zone database,
    985 but they indicate the sort of problems that we would run into if we
    986 extended the time zone database further into the past.  An excellent
    987 resource in this area is Nachum Dershowitz and Edward M. Reingold,
    988 <cite><a href="https://www.cs.tau.ac.il/~nachum/calendar-book/third-edition/">Calendrical
    989 Calculations: Third Edition</a></cite>, Cambridge University Press (2008).
    990 Other information and sources are given in the file '<samp>calendars</samp>'
    991 in the tz distribution.  They sometimes disagree.
    992 </p>
    993   </section>
    994 
    995 
    996   <section>
    997     <h2 id="planets">Time and time zones on other planets</h2>
    998 <p>
    999 Some people's work schedules use Mars time.  Jet Propulsion Laboratory
   1000 (JPL) coordinators have kept Mars time on and off at least since 1997
   1001 for the Mars Pathfinder mission.  Some of their family members have
   1002 also adapted to Mars time.  Dozens of special Mars watches were built
   1003 for JPL workers who kept Mars time during the Mars Exploration
   1004 Rovers mission (2004).  These timepieces look like normal Seikos and
   1005 Citizens but use Mars seconds rather than terrestrial seconds.
   1006 </p>
   1007 
   1008 <p>
   1009 A Mars solar day is called a "sol" and has a mean period equal to
   1010 about 24 hours 39 minutes 35.244 seconds in terrestrial time.  It is
   1011 divided into a conventional 24-hour clock, so each Mars second equals
   1012 about 1.02749125 terrestrial seconds.
   1013 </p>
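<p>
The ratio quoted above follows directly from the sol length:
24&nbsp;h 39&nbsp;min 35.244&nbsp;s is 88775.244 terrestrial seconds,
and dividing by the 86400 Mars seconds in a sol gives about
1.02749125.  The following C sketch restates that arithmetic; the
macro and function names are made up for this illustration.
</p>

```c
#include <assert.h>

/* Mean length of a sol in terrestrial seconds: 24 h 39 min 35.244 s.  */
#define EARTH_SECONDS_PER_SOL (24 * 3600 + 39 * 60 + 35.244)

/* A sol is divided into a conventional 24-hour clock.  */
#define MARS_SECONDS_PER_SOL (24 * 3600)

/* Length of one Mars second, in terrestrial seconds.  */
double earth_seconds_per_mars_second(void)
{
    return EARTH_SECONDS_PER_SOL / MARS_SECONDS_PER_SOL;
}

/* Convert a duration in terrestrial seconds to Mars seconds.  */
double earth_to_mars_seconds(double earth_seconds)
{
    assert(earth_seconds >= 0.0);   /* this sketch handles durations only */
    return earth_seconds / earth_seconds_per_mars_second();
}
```

<p>
For instance, a duration of 88775.244 terrestrial seconds converts to
one full sol of 86400 Mars seconds, up to floating-point rounding.
</p>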
   1014 
   1015 <p>
   1016 The prime meridian of Mars goes through the center of the crater
   1017 Airy-0, named in honor of the British astronomer who built the
   1018 Greenwich telescope that defines Earth's prime meridian.  Mean solar
   1019 time on the Mars prime meridian is called Mars Coordinated Time (MTC).
   1020 </p>
   1021 
   1022 <p>
   1023 Each landed mission on Mars has adopted a different reference for
   1024 solar time keeping, so there is no real standard for Mars time zones.
   1025 For example, the Mars Exploration Rover project (2004) defined two
   1026 time zones "Local Solar Time A" and "Local Solar Time B" for its two
   1027 missions, each zone designed so that its time equals local true solar
   1028 time at approximately the middle of the nominal mission.  Such a "time
   1029 zone" is not particularly suited for any application other than the
   1030 mission itself.
   1031 </p>
   1032 
   1033 <p>
   1034 Many calendars have been proposed for Mars, but none have achieved
   1035 wide acceptance.  Astronomers often use Mars Sol Date (MSD), which is a
   1036 sequential count of Mars solar days elapsed since about 1873-12-29
   1037 12:00 GMT.
   1038 </p>
   1039 
   1040 <p>
   1041 In our solar system, Mars is the planet with time and calendar most
   1042 like Earth's.  On other planets, Sun-based time and calendars would
   1043 work quite differently.  For example, although Mercury's sidereal
   1044 rotation period is 58.646 Earth days, Mercury revolves around the Sun
   1045 so rapidly that an observer on Mercury's equator would see a sunrise
   1046 only every 175.97 Earth days, i.e., a Mercury year is 0.5 of a Mercury
   1047 day.  Venus is more complicated, partly because its rotation is
   1048 slightly retrograde: its year is 1.92 of its days.  Gas giants like
   1049 Jupiter are trickier still, as their polar and equatorial regions
   1050 rotate at different rates, so that the length of a day depends on
   1051 latitude.  This effect is most pronounced on Neptune, where the day is
   1052 about 12 hours at the poles and 18 hours at the equator.
   1053 </p>
   1054 
   1055 <p>
   1056 Although the tz database does not support time on other planets, it is
   1057 documented here in the hopes that support will be added eventually.
   1058 </p>
   1059 
   1060 <p>
   1061 Sources:
   1062 </p>
   1063 <ul>
   1064   <li>
   1065 Michael Allison and Robert Schmunk,
   1066 "<a href="https://www.giss.nasa.gov/tools/mars24/help/notes.html">Technical
   1067 Notes on Mars Solar Time as Adopted by the Mars24 Sunclock</a>"
   1068 (2015-06-30).
   1069   </li>
   1070   <li>
   1071 Jia-Rui Chong,
   1072 "<a href="http://articles.latimes.com/2004/jan/14/science/sci-marstime14">Workdays
   1073 Fit for a Martian</a>", Los Angeles Times
   1074 (2004-01-14), pp A1, A20-A21.
   1075   </li>
   1076   <li>
   1077 Tom Chmielewski,
   1078 "<a href="https://www.theatlantic.com/technology/archive/2015/02/jet-lag-is-worse-on-mars/386033/">Jet
   1079 Lag Is Worse on Mars</a>", The Atlantic (2015-02-26).
   1080   </li>
   1081   <li>
   1082 Matt Williams,
   1083 "<a href="https://www.universetoday.com/37481/days-of-the-planets/">How
   1084 long is a day on the other planets of the solar system?</a>", Universe Today
   1085 (2017-04-27).
   1086   </li>
   1087 </ul>
   1088   </section>
   1089 
   1090   <footer>
   1091     <hr>
   1092 This file is in the public domain, so clarified as of 2009-05-17 by
   1093 Arthur David Olson.
   1094   </footer>
   1095 </body>
   1096 </html>
   1097