theory.html revision 1.1
      1 <!DOCTYPE html>
      2 <html lang="en">
      3 <head>
      4   <title>Theory and pragmatics of the tz code and data</title>
      5   <meta charset="UTF-8">
      6 </head>
      7 
<!-- The somewhat-unusual indenting style in this file is intended to
      9      shrink the output of the shell command 'diff Theory Theory.html',
     10      where 'Theory' was the plain text file that this file is derived
     11      from.  The 'Theory' file used leading white space to indent, and
     12      when possible that indentation is preserved here.  Eventually we
     13      may stop doing this and remove this comment.  -->
     14 
     15 <body>
     16   <h1>Theory and pragmatics of the tz code and data</h1>
     17   <h3>Outline</h3>
     18   <nav>
     19     <ul>
     20       <li><a href="#scope">Scope of the tz database</a></li>
     21       <li><a href="#naming">Names of time zone rules</a></li>
     22       <li><a href="#abbreviations">Time zone abbreviations</a></li>
     23       <li><a href="#accuracy">Accuracy of the tz database</a></li>
     24       <li><a href="#functions">Time and date functions</a></li>
     25       <li><a href="#stability">Interface stability</a></li>
     26       <li><a href="#calendar">Calendrical issues</a></li>
     27       <li><a href="#planets">Time and time zones on other planets</a></li>
     28     </ul>
     29   </nav>
     30 
     31 
     32   <section>
     33     <h2 id="scope">Scope of the tz database</h2>
     34 <p>
     35 The tz database attempts to record the history and predicted future of
     36 all computer-based clocks that track civil time.  To represent this
     37 data, the world is partitioned into regions whose clocks all agree
     38 about timestamps that occur after the somewhat-arbitrary cutoff point
     39 of the POSIX Epoch (1970-01-01 00:00:00 UTC).  For each such region,
     40 the database records all known clock transitions, and labels the region
     41 with a notable location.  Although 1970 is a somewhat-arbitrary
     42 cutoff, there are significant challenges to moving the cutoff earlier
     43 even by a decade or two, due to the wide variety of local practices
     44 before computer timekeeping became prevalent.
     45 </p>
     46 
     47 <p>
     48 Clock transitions before 1970 are recorded for each such location,
     49 because most systems support timestamps before 1970 and could
     50 misbehave if data entries were omitted for pre-1970 transitions.
     51 However, the database is not designed for and does not suffice for
     52 applications requiring accurate handling of all past times everywhere,
     53 as it would take far too much effort and guesswork to record all
     54 details of pre-1970 civil timekeeping.
     55 </p>
     56 
     57 <p>
     58 As described below, reference source code for using the tz database is
     59 also available.  The tz code is upwards compatible with POSIX, an
     60 international standard for UNIX-like systems.  As of this writing, the
     61 current edition of POSIX is:
     62   <a href="http://pubs.opengroup.org/onlinepubs/9699919799/">
     63   The Open Group Base Specifications Issue 7</a>,
     64   IEEE Std 1003.1-2008, 2016 Edition.
     65 </p>
     66   </section>
     67 
     68 
     69 
     70   <section>
     71     <h2 id="naming">Names of time zone rules</h2>
     72 <p>
     73 Each of the database's time zone rules has a unique name.
     74 Inexperienced users are not expected to select these names unaided.
     75 Distributors should provide documentation and/or a simple selection
     76 interface that explains the names; for one example, see the 'tzselect'
     77 program in the tz code.  The
     78 <a href="http://cldr.unicode.org/">Unicode Common Locale Data
     79 Repository</a> contains data that may be useful for other
     80 selection interfaces.
     81 </p>
     82 
     83 <p>
     84 The time zone rule naming conventions attempt to strike a balance
     85 among the following goals:
     86 </p>
     87 <ul>
     88   <li>
     89    Uniquely identify every region where clocks have agreed since 1970.
     90    This is essential for the intended use: static clocks keeping local
     91    civil time.
     92   </li>
     93   <li>
     94    Indicate to experts where that region is.
     95   </li>
     96   <li>
     97    Be robust in the presence of political changes.  For example, names
     98    of countries are ordinarily not used, to avoid incompatibilities
     99    when countries change their name (e.g. Zaire&rarr;Congo) or when
    100    locations change countries (e.g. Hong Kong from UK colony to
    101    China).
    102   </li>
    103   <li>
    104    Be portable to a wide variety of implementations.
    105   </li>
    106   <li>
   Use a consistent naming convention over the entire world.
    108   </li>
    109 </ul>
    110 <p>
    111 Names normally have the
    112 form <var>AREA</var><code>/</code><var>LOCATION</var>,
    113 where <var>AREA</var> is the name of a continent or ocean,
    114 and <var>LOCATION</var> is the name of a specific
    115 location within that region.  North and South America share the same
    116 area, '<code>America</code>'.  Typical names are
    117 '<code>Africa/Cairo</code>', '<code>America/New_York</code>', and
    118 '<code>Pacific/Honolulu</code>'.
    119 </p>
    120 
    121 <p>
    122 Here are the general rules used for choosing location names,
    123 in decreasing order of importance:
    124 </p>
    125 <ul>
    126   <li>
    127 	Use only valid POSIX file name components (i.e., the parts of
    128 		names other than '<code>/</code>').  Do not use the file name
    129 		components '<code>.</code>' and '<code>..</code>'.
    130 		Within a file name component,
    131 		use only ASCII letters, '<code>.</code>',
    132 		'<code>-</code>' and '<code>_</code>'.  Do not use
    133 		digits, as that might create an ambiguity with POSIX
    134 		TZ strings.  A file name component must not exceed 14
    135 		characters or start with '<code>-</code>'.  E.g.,
    136 		prefer '<code>Brunei</code>' to
    137 		'<code>Bandar_Seri_Begawan</code>'.  Exceptions: see
    138 		the discussion
    139 		of legacy names below.
    140   </li>
    141   <li>
    142 	A name must not be empty, or contain '<code>//</code>', or
    143 	start or end with '<code>/</code>'.
    144   </li>
    145   <li>
    146 	Do not use names that differ only in case.  Although the reference
    147 		implementation is case-sensitive, some other implementations
    148 		are not, and they would mishandle names differing only in case.
    149   </li>
    150   <li>
    151 	If one name <var>A</var> is an initial prefix of another
    152 		name <var>AB</var> (ignoring case), then <var>B</var>
    153 		must not start with '<code>/</code>', as a
    154 		regular file cannot have
    155 		the same name as a directory in POSIX.  For example,
    156 		'<code>America/New_York</code>' precludes
    157 		'<code>America/New_York/Bronx</code>'.
    158   </li>
    159   <li>
    160 	Uninhabited regions like the North Pole and Bouvet Island
    161 		do not need locations, since local time is not defined there.
    162   </li>
    163   <li>
    164 	There should typically be at least one name for each ISO 3166-1
    165 		officially assigned two-letter code for an inhabited country
    166 		or territory.
    167   </li>
    168   <li>
    169 	If all the clocks in a region have agreed since 1970,
    170 		don't bother to include more than one location
    171 		even if subregions' clocks disagreed before 1970.
    172 		Otherwise these tables would become annoyingly large.
    173   </li>
    174   <li>
    175 	If a name is ambiguous, use a less ambiguous alternative;
		e.g. many cities are named San José and Georgetown, so
    177 		prefer '<code>Costa_Rica</code>' to '<code>San_Jose</code>' and '<code>Guyana</code>' to '<code>Georgetown</code>'.
    178   </li>
    179   <li>
    180 	Keep locations compact.  Use cities or small islands, not countries
    181 		or regions, so that any future time zone changes do not split
    182 		locations into different time zones.  E.g. prefer
    183 		'<code>Paris</code>' to '<code>France</code>', since
    184 		France has had multiple time zones.
    185   </li>
    186   <li>
    187 	Use mainstream English spelling, e.g. prefer
    188 		'<code>Rome</code>' to '<code>Roma</code>', and prefer
		'<code>Athens</code>' to the Greek
		'<code>Αθήνα</code>' or the Romanized
		'<code>Athína</code>'.
    192 		The POSIX file name restrictions encourage this rule.
    193   </li>
    194   <li>
    195 	Use the most populous among locations in a zone,
    196 		e.g. prefer '<code>Shanghai</code>' to
    197 		'<code>Beijing</code>'.  Among locations with
    198 		similar populations, pick the best-known location,
    199 		e.g. prefer '<code>Rome</code>' to '<code>Milan</code>'.
    200   </li>
    201   <li>
    202 	Use the singular form, e.g. prefer '<code>Canary</code>' to '<code>Canaries</code>'.
    203   </li>
    204   <li>
    205 	Omit common suffixes like '<code>_Islands</code>' and
    206 		'<code>_City</code>', unless that would lead to
    207 		ambiguity.  E.g. prefer '<code>Cayman</code>' to
    208 		'<code>Cayman_Islands</code>' and
    209 		'<code>Guatemala</code>' to
    210 		'<code>Guatemala_City</code>', but prefer
    211 		'<code>Mexico_City</code>' to '<code>Mexico</code>'
    212 		because the country
    213 		of Mexico has several time zones.
    214   </li>
    215   <li>
    216 	Use '<code>_</code>' to represent a space.
    217   </li>
    218   <li>
    219 	Omit '<code>.</code>' from abbreviations in names, e.g. prefer
    220 		'<code>St_Helena</code>' to '<code>St._Helena</code>'.
    221   </li>
    222   <li>
    223 	Do not change established names if they only marginally
    224 		violate the above rules.  For example, don't change
    225 		the existing name '<code>Rome</code>' to
    226 		'<code>Milan</code>' merely because
    227 		Milan's population has grown to be somewhat greater
    228 		than Rome's.
    229   </li>
    230   <li>
    231 	If a name is changed, put its old spelling in the
    232 		'<code>backward</code>' file.
    233 		This means old spellings will continue to work.
    234   </li>
    235 </ul>
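<p>
Several of the rules above are mechanical enough to check automatically.
The following Python sketch is one possible checker and is not part of the
tz code; the function names <code>name_problems</code> and
<code>set_problems</code> are hypothetical, and the legacy-name exceptions
are deliberately not modeled.
</p>

```python
import re

# ASCII letters, '.', '-' and '_' only; digits are deliberately excluded.
COMPONENT_RE = re.compile(r"[A-Za-z._-]+")
MAX_COMPONENT_LEN = 14


def name_problems(name):
    """Return a list of rule violations found in a single name."""
    problems = []
    for component in name.split("/"):
        if component == "":
            # Covers an empty name, '//', and a leading or trailing '/'.
            problems.append("empty file name component")
            continue
        if component in (".", ".."):
            problems.append(f"component {component!r} is reserved")
        if not COMPONENT_RE.fullmatch(component):
            problems.append(f"{component!r} has a disallowed character")
        if len(component) > MAX_COMPONENT_LEN:
            problems.append(f"{component!r} exceeds 14 characters")
        if component.startswith("-"):
            problems.append(f"{component!r} starts with '-'")
    return problems


def set_problems(names):
    """Return violations that involve pairs of names in a collection."""
    problems = []
    seen = {}
    for name in names:
        folded = name.lower()
        if folded in seen and seen[folded] != name:
            problems.append(f"{name} and {seen[folded]} differ only in case")
        seen.setdefault(folded, name)
    for name in names:
        for other in names:
            # A name that is also a directory cannot exist as a regular file.
            if other.lower().startswith(name.lower() + "/"):
                problems.append(f"{name} is a directory prefix of {other}")
    return problems


print(name_problems("Europe/Paris"))
print(name_problems("Asia/Bandar_Seri_Begawan"))
print(set_problems(["America/New_York", "America/New_York/Bronx"]))
```

<p>
Splitting on '<code>/</code>' lets one test cover the empty name, a doubled
slash, and a leading or trailing slash, since each produces an empty component.
</p>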
    236 
    237 <p>
    238 The file '<code>zone1970.tab</code>' lists geographical locations used
    239 to name time
    240 zone rules.  It is intended to be an exhaustive list of names for
    241 geographic regions as described above; this is a subset of the names
    242 in the data.  Although a '<code>zone1970.tab</code>' location's longitude
    243 corresponds to its LMT offset with one hour for every 15 degrees east
    244 longitude, this relationship is not exact.
    245 </p>
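<p>
The longitude relationship can be worked through numerically: 15 degrees per
hour means 240 seconds per degree and 4 seconds per arcminute.  The Python
sketch below applies that arithmetic.  The function names are hypothetical,
and the longitude of 2&deg;20' east used for Paris is an illustrative
assumption rather than a value quoted from the table.
</p>

```python
def lmt_offset_seconds(degrees, minutes=0, east=True):
    """Approximate LMT offset from UT, in seconds, for a longitude.

    The Earth turns 360 degrees in 24 hours, so one degree of longitude
    corresponds to 240 seconds and one arcminute to 4 seconds.
    """
    total = degrees * 240 + minutes * 4
    return total if east else -total


def format_offset(total_seconds):
    """Render an offset in seconds as a signed h:mm:ss string."""
    sign = "+" if total_seconds >= 0 else "-"
    hours, rest = divmod(abs(total_seconds), 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{sign}{hours}:{minutes:02d}:{seconds:02d}"


# Assumed longitude for Paris: 2 degrees 20 minutes east.
paris = lmt_offset_seconds(2, 20)
print(format_offset(paris))  # prints +0:09:20
```

<p>
Under that assumption the result is one second short of the 0:09:21 offset
cited elsewhere in this document for France, which is the kind of small
discrepancy meant by "not exact".
</p>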
    246 
    247 <p>
    248 Older versions of this package used a different naming scheme,
    249 and these older names are still supported.
    250 See the file '<code>backward</code>' for most of these older names
    251 (e.g., '<code>US/Eastern</code>' instead of '<code>America/New_York</code>').
    252 The other old-fashioned names still supported are
    253 '<code>WET</code>', '<code>CET</code>', '<code>MET</code>', and '<code>EET</code>' (see the file '<code>europe</code>').
    254 </p>
    255 
    256 <p>
    257 Older versions of this package defined legacy names that are
    258 incompatible with the first rule of location names, but which are
    259 still supported.  These legacy names are mostly defined in the file
    260 '<code>etcetera</code>'.  Also, the file '<code>backward</code>' defines the legacy names
    261 '<code>GMT0</code>', '<code>GMT-0</code>' and '<code>GMT+0</code>', and the file '<code>northamerica</code>' defines the
    262 legacy names '<code>EST5EDT</code>', '<code>CST6CDT</code>', '<code>MST7MDT</code>', and '<code>PST8PDT</code>'.
    263 </p>
    264 
    265 <p>
    266 Excluding '<code>backward</code>' should not affect the other data.  If
    267 '<code>backward</code>' is excluded, excluding '<code>etcetera</code>' should not affect the
    268 remaining data.
    269 </p>
    270 
    271 
    272   </section>
    273   <section>
    274     <h2 id="abbreviations">Time zone abbreviations</h2>
    275 <p>
    276 When this package is installed, it generates time zone abbreviations
    277 like '<code>EST</code>' to be compatible with human tradition and POSIX.
    278 Here are the general rules used for choosing time zone abbreviations,
    279 in decreasing order of importance:
    280 <ul>
    281   <li>
    282 	Use three or more characters that are ASCII alphanumerics or
    283 		'<code>+</code>' or '<code>-</code>'.
    284 		Previous editions of this database also used characters like
    285 		'<code> </code>' and '<code>?</code>', but these
    286 		characters have a special meaning to
    287 		the shell and cause commands like
    288 			'<code>set `date`</code>'
    289 		to have unexpected effects.
    290 		Previous editions of this rule required upper-case letters,
    291 		but the Congressman who introduced Chamorro Standard Time
    292 		preferred "ChST", so lower-case letters are now allowed.
    293 		Also, POSIX from 2001 on relaxed the rule to allow
    294 		'<code>-</code>', '<code>+</code>',
    295 		and alphanumeric characters from the portable character set
    296 		in the current locale.  In practice ASCII alphanumerics and
    297 		'<code>+</code>' and '<code>-</code>' are safe in all locales.
    298 
    299 		In other words, in the C locale the POSIX extended regular
    300 		expression <code>[-+[:alnum:]]{3,}</code> should match
    301 		the abbreviation.
    302 		This guarantees that all abbreviations could have been
    303 		specified by a POSIX TZ string.
    304   </li>
    305   <li>
    306 	Use abbreviations that are in common use among English-speakers,
    307 		e.g. 'EST' for Eastern Standard Time in North America.
    308 		We assume that applications translate them to other languages
    309 		as part of the normal localization process; for example,
    310 		a French application might translate 'EST' to 'HNE'.
    311   </li>
    312   <li>
    313 	For zones whose times are taken from a city's longitude, use the
    314 		traditional <var>x</var>MT notation, e.g. 'PMT' for
    315 		Paris Mean Time.
    316 		The only name like this in current use is 'GMT'.
    317   </li>
    318   <li>
    319 	Use 'LMT' for local mean time of locations before the introduction
    320 		of standard time; see "<a href="#scope">Scope of the
    321 		tz database</a>".
    322   </li>
    323   <li>
    324 	If there is no common English abbreviation, use numeric offsets like
    325 		<code>-</code>05 and <code>+</code>0830 that are
    326 		generated by zic's <code>%z</code> notation.
    327   </li>
    328   <li>
    329 	Use current abbreviations for older timestamps to avoid confusion.
    330 		For example, in 1910 a common English abbreviation for UT +01
    331 		in central Europe was 'MEZ' (short for both "Middle European
		Zone" and for "Mitteleuropäische Zeit" in German).  Nowadays
    333 		'CET' ("Central European Time") is more common in English, and
    334 		the database uses 'CET' even for circa-1910 timestamps as this
    335 		is less confusing for modern users and avoids the need for
    336 		determining when 'CET' supplanted 'MEZ' in common usage.
    337   </li>
    338   <li>
    339 	Use a consistent style in a zone's history.  For example, if a zone's
    340 		history tends to use numeric abbreviations and a particular
    341 		entry could go either way, use a numeric abbreviation.
    342   </li>
    343 </ul>
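<p>
The character rule in the first guideline above can be checked with a short
script.  This Python sketch is illustrative only: Python's <code>re</code>
module does not support POSIX bracket classes such as
<code>[:alnum:]</code>, so the pattern spells out the ASCII alphanumerics,
which is assumed here to be equivalent in the C locale.  The name
<code>is_valid_abbreviation</code> is hypothetical.
</p>

```python
import re

# ASCII-only stand-in for the POSIX ERE [-+[:alnum:]]{3,} in the C locale.
ABBR_RE = re.compile(r"[-+A-Za-z0-9]{3,}")


def is_valid_abbreviation(abbr):
    """Return True if abbr uses only allowed characters and has length 3+."""
    return ABBR_RE.fullmatch(abbr) is not None


for candidate in ["EST", "ChST", "-05", "+0830", "UT", "E T", "A?T"]:
    print(candidate, is_valid_abbreviation(candidate))
```

<p>
<code>fullmatch</code> is used so that the whole abbreviation must conform,
not just a substring of it.
</p>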
    344     [The remaining guidelines predate the introduction of <code>%z</code>.
    345     They are problematic as they mean tz data entries invent
    346     notation rather than record it.  These guidelines are now
    347     deprecated and the plan is to gradually move to <code>%z</code> for
    348     inhabited locations and to "<code>-</code>00" for uninhabited locations.]
    349 <ul>
    350   <li>
    351 	If there is no common English abbreviation, abbreviate the English
    352 		translation of the usual phrase used by native speakers.
    353 		If this is not available or is a phrase mentioning the country
    354 		(e.g. "Cape Verde Time"), then:
    355 	<ul>
    356 	  <li>
    357 		When a country is identified with a single or principal zone,
			append 'T' to the country's ISO code, e.g. 'CVT' for
    359 			Cape Verde Time.  For summer time append 'ST';
    360 			for double summer time append 'DST'; etc.
    361 	  </li>
    362 	  <li>
    363 		Otherwise, take the first three letters of an English place
    364 			name identifying each zone and append 'T', 'ST', etc.
    365 			as before; e.g. 'CHAST' for CHAtham Summer Time.
    366 	  </li>
    367 	</ul>
    368   </li>
    369   <li>
    370 	Use UT (with time zone abbreviation '<code>-</code>00') for
    371 		locations while uninhabited.  The leading
    372 		'<code>-</code>' is a flag that the time
    373 		zone is in some sense undefined; this notation is
    374 		derived from Internet RFC 3339.
    375   </li>
    376 </ul>
    377 <p>
    378 Application writers should note that these abbreviations are ambiguous
    379 in practice: e.g. 'CST' has a different meaning in China than
    380 it does in the United States.  In new applications, it's often better
    381 to use numeric UT offsets like '<code>-</code>0600' instead of time zone
    382 abbreviations like 'CST'; this avoids the ambiguity.
    383 </p>
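<p>
As a rough illustration of the numeric style, the Python sketch below
formats a UT offset given in seconds east of UT.  It is not zic's
implementation: the function name and the <code>shortest</code> parameter
are hypothetical, and the choice to drop trailing zero fields in the short
form is an assumption modeled on the examples '<code>-</code>05' and
'<code>+</code>0830' above.
</p>

```python
def numeric_abbreviation(offset_seconds, shortest=True):
    """Format a UT offset (seconds east of UT) as a numeric string.

    With shortest=True, minutes and seconds are omitted when they are zero.
    With shortest=False, at least hours and minutes are always shown.
    """
    sign = "+" if offset_seconds >= 0 else "-"
    hours, rest = divmod(abs(offset_seconds), 3600)
    minutes, seconds = divmod(rest, 60)
    text = f"{sign}{hours:02d}"
    if seconds:
        return f"{text}{minutes:02d}{seconds:02d}"
    if minutes or not shortest:
        return f"{text}{minutes:02d}"
    return text


print(numeric_abbreviation(-5 * 3600))                   # prints -05
print(numeric_abbreviation(8 * 3600 + 30 * 60))          # prints +0830
print(numeric_abbreviation(-6 * 3600, shortest=False))   # prints -0600
```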
    384   </section>
    385 
    386 
    387   <section>
    388     <h2 id="accuracy">Accuracy of the tz database</h2>
    389 <p>
    390 The tz database is not authoritative, and it surely has errors.
    391 Corrections are welcome and encouraged; see the file CONTRIBUTING.
    392 Users requiring authoritative data should consult national standards
    393 bodies and the references cited in the database's comments.
    394 </p>
    395 
    396 <p>
    397 Errors in the tz database arise from many sources:
    398 </p>
    399 <ul>
    400   <li>
    401    The tz database predicts future timestamps, and current predictions
    402    will be incorrect after future governments change the rules.
    403    For example, if today someone schedules a meeting for 13:00 next
    404    October 1, Casablanca time, and tomorrow Morocco changes its
    405    daylight saving rules, software can mess up after the rule change
    406    if it blithely relies on conversions made before the change.
    407   </li>
    408   <li>
    409    The pre-1970 entries in this database cover only a tiny sliver of how
    410    clocks actually behaved; the vast majority of the necessary
    411    information was lost or never recorded.  Thousands more zones would
    412    be needed if the tz database's scope were extended to cover even
    413    just the known or guessed history of standard time; for example,
    414    the current single entry for France would need to split into dozens
    415    of entries, perhaps hundreds.  And in most of the world even this
    416    approach would be misleading due to widespread disagreement or
    417    indifference about what times should be observed.  In her 2015 book
    418    <cite>The Global Transformation of Time, 1870-1950</cite>, Vanessa Ogle writes
    419    "Outside of Europe and North America there was no system of time
    420    zones at all, often not even a stable landscape of mean times,
    421    prior to the middle decades of the twentieth century".  See:
    422    Timothy Shenk, <a
    423    href="https://www.dissentmagazine.org/blog/booked-a-global-history-of-time-vanessa-ogle">Booked:
    424    A Global History of Time</a>. <cite>Dissent</cite> 2015-12-17.
    425   </li>
    426   <li>
    427    Most of the pre-1970 data entries come from unreliable sources, often
    428    astrology books that lack citations and whose compilers evidently
    429    invented entries when the true facts were unknown, without
    430    reporting which entries were known and which were invented.
    431    These books often contradict each other or give implausible entries,
    432    and on the rare occasions when they are checked they are
    433    typically found to be incorrect.
    434   </li>
    435   <li>
    436    For the UK the tz database relies on years of first-class work done by
    437    Joseph Myers and others; see
    438    "<a href="https://www.polyomino.org.uk/british-time/">History of
    439    legal time in Britain</a>".
    440    Other countries are not done nearly as well.
    441   </li>
    442   <li>
    443    Sometimes, different people in the same city would maintain clocks
    444    that differed significantly.  Railway time was used by railroad
    445    companies (which did not always agree with each other),
    446    church-clock time was used for birth certificates, etc.
    447    Often this was merely common practice, but sometimes it was set by law.
    448    For example, from 1891 to 1911 the UT offset in France was legally
    449    0:09:21 outside train stations and 0:04:21 inside.
    450   </li>
    451   <li>
    452    Although a named location in the tz database stands for the
    453    containing region, its pre-1970 data entries are often accurate for
    454    only a small subset of that region.  For example, <code>Europe/London</code>
    455    stands for the United Kingdom, but its pre-1847 times are valid
    456    only for locations that have London's exact meridian, and its 1847
    457    transition to GMT is known to be valid only for the L&amp;NW and the
    458    Caledonian railways.
    459   </li>
    460   <li>
    461    The tz database does not record the earliest time for which a zone's
    462    data entries are thereafter valid for every location in the region.
    463    For example, <code>Europe/London</code> is valid for all locations in its
    464    region after GMT was made the standard time, but the date of
    465    standardization (1880-08-02) is not in the tz database, other than
    466    in commentary.  For many zones the earliest time of validity is
    467    unknown.
    468   </li>
    469   <li>
    470    The tz database does not record a region's boundaries, and in many
    471    cases the boundaries are not known.  For example, the zone
    472    <code>America/Kentucky/Louisville</code> represents a region around
    473    the city of
    474    Louisville, the boundaries of which are unclear.
    475   </li>
    476   <li>
    477    Changes that are modeled as instantaneous transitions in the tz
    478    database were often spread out over hours, days, or even decades.
    479   </li>
    480   <li>
    481    Even if the time is specified by law, locations sometimes
    482    deliberately flout the law.
    483   </li>
    484   <li>
    485    Early timekeeping practices, even assuming perfect clocks, were
    486    often not specified to the accuracy that the tz database requires.
    487   </li>
    488   <li>
    489    Sometimes historical timekeeping was specified more precisely
    490    than what the tz database can handle.  For example, from 1909 to
    491    1937 Netherlands clocks were legally UT +00:19:32.13, but the tz
    492    database cannot represent the fractional second.
    493   </li>
    494   <li>
    495    Even when all the timestamp transitions recorded by the tz database
    496    are correct, the tz rules that generate them may not faithfully
    497    reflect the historical rules.  For example, from 1922 until World
    498    War II the UK moved clocks forward the day following the third
    499    Saturday in April unless that was Easter, in which case it moved
    500    clocks forward the previous Sunday.  Because the tz database has no
    501    way to specify Easter, these exceptional years are entered as
    502    separate tz Rule lines, even though the legal rules did not change.
    503   </li>
    504   <li>
    505    The tz database models pre-standard time using the proleptic Gregorian
    506    calendar and local mean time (LMT), but many people used other
    507    calendars and other timescales.  For example, the Roman Empire used
    508    the Julian calendar, and had 12 varying-length daytime hours with a
    509    non-hour-based system at night.
    510   </li>
    511   <li>
    512    Early clocks were less reliable, and data entries do not represent
    513    clock error.
    514   </li>
    515   <li>
    516    The tz database assumes Universal Time (UT) as an origin, even
    517    though UT is not standardized for older timestamps.  In the tz
    518    database commentary, UT denotes a family of time standards that
    519    includes Coordinated Universal Time (UTC) along with other variants
    520    such as UT1 and GMT, with days starting at midnight.  Although UT
    521    equals UTC for modern timestamps, UTC was not defined until 1960,
    522    so commentary uses the more-general abbreviation UT for timestamps
    523    that might predate 1960.  Since UT, UT1, etc. disagree slightly,
    524    and since pre-1972 UTC seconds varied in length, interpretation of
    525    older timestamps can be problematic when subsecond accuracy is
    526    needed.
    527   </li>
    528   <li>
    529    Civil time was not based on atomic time before 1972, and we don't
    530    know the history of earth's rotation accurately enough to map SI
    531    seconds to historical solar time to more than about one-hour
    532    accuracy.  See: Stephenson FR, Morrison LV, Hohenkerk CY.
    533    <a href="http://dx.doi.org/10.1098/rspa.2016.0404">Measurement
    534    of the Earth's rotation: 720 BC to AD 2015</a>.
    535    <cite>Proc Royal Soc A</cite>. 2016 Dec 7;472:20160404.
    536    Also see: Espenak F. <a
    537    href="https://eclipse.gsfc.nasa.gov/SEhelp/uncertainty2004.html">Uncertainty
   in Delta T (ΔT)</a>.
    539   </li>
    540   <li>
    541    The relationship between POSIX time (that is, UTC but ignoring leap
    542    seconds) and UTC is not agreed upon after 1972.  Although the POSIX
    543    clock officially stops during an inserted leap second, at least one
    544    proposed standard has it jumping back a second instead; and in
    545    practice POSIX clocks more typically either progress glacially during
    546    a leap second, or are slightly slowed while near a leap second.
    547   </li>
    548   <li>
    549    The tz database does not represent how uncertain its information is.
    550    Ideally it would contain information about when data entries are
    551    incomplete or dicey.  Partial temporal knowledge is a field of
    552    active research, though, and it's not clear how to apply it here.
    553   </li>
    554 </ul>
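<p>
The item above on POSIX time and UTC can be made concrete with the POSIX
"Seconds Since the Epoch" arithmetic, which counts every day as exactly
86400 seconds.  The Python sketch below transcribes that formula for years
from 1970 onward; the function name <code>posix_seconds</code> is
hypothetical.  It shows that the leap second inserted at the end of
2016-12-31 and the first second of 2017 receive the same POSIX value.
</p>

```python
def posix_seconds(tm_year, tm_yday, tm_hour, tm_min, tm_sec):
    """POSIX seconds since the Epoch for a broken-down UTC time.

    tm_year counts years since 1900 and tm_yday counts days since
    January 1, following the conventions of the C struct tm.
    Intended for years from 1970 onward.
    """
    return (tm_sec + tm_min * 60 + tm_hour * 3600 + tm_yday * 86400
            + (tm_year - 70) * 31536000
            + ((tm_year - 69) // 4) * 86400
            - ((tm_year - 1) // 100) * 86400
            + ((tm_year + 299) // 400) * 86400)


leap_second = posix_seconds(116, 365, 23, 59, 60)  # 2016-12-31 23:59:60 UTC
next_second = posix_seconds(117, 0, 0, 0, 0)       # 2017-01-01 00:00:00 UTC
print(leap_second, next_second)  # prints 1483228800 1483228800
```

<p>
Because two distinct UTC seconds collapse onto one POSIX value, a POSIX
timestamp alone cannot say which of the two was meant.
</p>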
    555 <p>
    556 In short, many, perhaps most, of the tz database's pre-1970 and future
    557 timestamps are either wrong or misleading.  Any attempt to pass the
    558 tz database off as the definition of time should be unacceptable to
    559 anybody who cares about the facts.  In particular, the tz database's
    560 LMT offsets should not be considered meaningful, and should not prompt
    561 creation of zones merely because two locations differ in LMT or
    562 transitioned to standard time at different dates.
    563 </p>
    564   </section>
    565 
    566 
    567   <section>
    568     <h2 id="functions">Time and date functions</h2>
    569 <p>
    570 The tz code contains time and date functions that are upwards
    571 compatible with those of POSIX.
    572 </p>
    573 
    574 <p>
    575 POSIX has the following properties and limitations.
    576 </p>
    577 <ul>
    578   <li>
    579     <p>
    580 	In POSIX, time display in a process is controlled by the
    581 	environment variable TZ.  Unfortunately, the POSIX TZ string takes
    582 	a form that is hard to describe and is error-prone in practice.
	Also, POSIX TZ strings can't deal with daylight saving time rules
	that do not fit this form (for example, Israeli rules), or with
	situations where more than two time zone abbreviations are used
	in an area.
    586     </p>
    587     <p>
    588       The POSIX TZ string takes the following form:
    589     </p>
    590     <p>
      <var>std</var><var>offset</var>[<var>dst</var>[<var>offset</var>][<code>,</code><var>date</var>[<code>/</code><var>time</var>]<code>,</code><var>date</var>[<code>/</code><var>time</var>]]]
    592     </p>
    593     <p>
    594 	where:
    595     <dl>
    596       <dt><var>std</var> and <var>dst</var></dt><dd>
    597 		are 3 or more characters specifying the standard
    598 		and daylight saving time (DST) zone names.
    599 		Starting with POSIX.1-2001, <var>std</var>
    600 		and <var>dst</var> may also be
    601 		in a quoted form like '<code>&lt;UTC+10&gt;</code>'; this allows
    602 		"<code>+</code>" and "<code>-</code>" in the names.
    603       </dd>
    604       <dt><var>offset</var></dt><dd>
    605 		is of the form
		'<code>[&plusmn;]<var>hh</var>[:<var>mm</var>[:<var>ss</var>]]</code>'
    607 		and specifies the offset west of UT.  '<var>hh</var>'
    608 		may be a single digit; 0&le;<var>hh</var>&le;24.
    609 		The default DST offset is one hour ahead of standard time.
    610       </dd>
    611       <dt><var>date</var>[<code>/</code><var>time</var>]<code>,</code><var>date</var>[<code>/</code><var>time</var>]</dt><dd>
    612 		specifies the beginning and end of DST.  If this is absent,
    613 		the system supplies its own rules for DST, and these can
    614 		differ from year to year; typically US DST rules are used.
    615       </dd>
    616       <dt><var>time</var></dt><dd>
    617 		takes the form
		'<var>hh</var>[<code>:</code><var>mm</var>[<code>:</code><var>ss</var>]]'
    619 		and defaults to 02:00.
    620 		This is the same format as the offset, except that a
    621 		leading '<code>+</code>' or '<code>-</code>' is not allowed.
    622       </dd>
    623       <dt><var>date</var></dt><dd>
    624 		takes one of the following forms:
    625 	<dl>
	  <dt><code>J</code><var>n</var> (1&le;<var>n</var>&le;365)</dt><dd>
    627 			origin-1 day number not counting February 29
    628           </dd>
    629 	  <dt><var>n</var> (0&le;<var>n</var>&le;365)</dt><dd>
    630 			origin-0 day number counting February 29 if present
    631           </dd>
    632 	  <dt><code>M</code><var>m</var><code>.</code><var>n</var><code>.</code><var>d</var> (0[Sunday]&le;<var>d</var>&le;6[Saturday], 1&le;<var>n</var>&le;5, 1&le;<var>m</var>&le;12)</dt><dd>
    633 			for the <var>d</var>th day of
    634 			week <var>n</var> of month <var>m</var> of the
    635 			year, where week 1 is the first week in which
    636 			day <var>d</var> appears, and '<code>5</code>'
    637 			stands for the last week in which
    638 			day <var>d</var> appears
    639 			(which may be either the 4th or 5th week).
    640 			Typically, this is the only useful form;
    641 			the <var>n</var>
    642 			and <code>J</code><var>n</var> forms are
    643 			rarely used.
    644 	  </dd>
    645 </dl>
    646 </dd>
    647 </dl>
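<p>
The sign convention for <var>offset</var> is easy to get backwards, because
the value counts time west of UT while most other notations count east.
The following Python sketch converts an <var>offset</var> field to seconds
east of UT.  It is a simplified illustration and not the tz code's parser;
the function name is hypothetical, and accepting one- or two-digit minute
and second fields is an assumption.
</p>

```python
import re

# Optional sign, hours, then optional minutes and optional seconds.
OFFSET_RE = re.compile(r"([+-]?)([0-9]{1,2})(?::([0-9]{1,2})(?::([0-9]{1,2}))?)?")


def posix_offset_to_seconds_east(text):
    """Convert a POSIX TZ offset field to seconds east of UT."""
    match = OFFSET_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a POSIX TZ offset: {text!r}")
    sign, hh, mm, ss = match.groups()
    hours, minutes, seconds = int(hh), int(mm or 0), int(ss or 0)
    if hours > 24 or minutes > 59 or seconds > 59:
        raise ValueError(f"offset field out of range: {text!r}")
    west = hours * 3600 + minutes * 60 + seconds
    if sign == "-":
        west = -west
    # POSIX counts west of UT as positive, so negate to get seconds east.
    return -west


print(posix_offset_to_seconds_east("5"))    # prints -18000
print(posix_offset_to_seconds_east("-12"))  # prints 43200
```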
    648 	Here is an example POSIX TZ string for New Zealand after 2007.
    649 	It says that standard time (NZST) is 12 hours ahead of UTC,
    650 	and that daylight saving time (NZDT) is observed from September's
    651 	last Sunday at 02:00 until April's first Sunday at 03:00:
    652 
    653         <pre><code>TZ='NZST-12NZDT,M9.5.0,M4.1.0/3'</code></pre>
    654 
    655 	This POSIX TZ string is hard to remember, and mishandles some
    656 	timestamps before 2008.  With this package you can use this
    657 	instead:
    658 
    659 	<pre><code>TZ='Pacific/Auckland'</code></pre>
    660   </li>
    661   <li>
    662 	POSIX does not define the exact meaning of TZ values like
    663 	"<code>EST5EDT</code>".
    664 	Typically the current US DST rules are used to interpret such values,
    665 	but this means that the US DST rules are compiled into each program
    666 	that does time conversion.  As a result, when US time conversion
    667 	rules change (as they did in 1987), all programs that
    668 	do time conversion must be recompiled to ensure proper results.
    669   </li>
    670   <li>
    671 	The TZ environment variable is process-global, which makes it hard
    672 	to write efficient, thread-safe applications that need access
    673 	to multiple time zones.
    674   </li>
    675   <li>
    676 	In POSIX, there's no tamper-proof way for a process to learn the
    677 	system's best idea of local wall clock time.  (This is important for
    678 	applications that an administrator wants used only at certain
    679 	times &ndash;
    680 	without regard to whether the user has fiddled with the TZ environment
    681 	variable.  While an administrator can "do everything in UTC" to get
    682 	around the problem, doing so is inconvenient and precludes handling
    683 	daylight saving time shifts &ndash; as might be required to limit phone
    684 	calls to off-peak hours.)
    685   </li>
    686   <li>
    687 	POSIX provides no convenient and efficient way to determine the UT
    688 	offset and time zone abbreviation of arbitrary timestamps,
    689 	particularly for time zone settings that do not fit into the
    690 	POSIX model.
    691   </li>
    692   <li>
    693 	POSIX requires that systems ignore leap seconds.
    694   </li>
    695   <li>
    696 	The tz code attempts to support all the <code>time_t</code>
    697 	implementations allowed by POSIX.  The <code>time_t</code>
    698 	type represents a nonnegative count of
    699 	seconds since 1970-01-01 00:00:00 UTC, ignoring leap seconds.
    700 	In practice, <code>time_t</code> is usually a signed 64- or
    701 	32-bit integer; 32-bit signed <code>time_t</code> values stop
    702 	working after 2038-01-19 03:14:07 UTC, so
    703 	new implementations these days typically use a signed 64-bit integer.
    704 	Unsigned 32-bit integers are used on one or two platforms,
    705 	and 36-bit and 40-bit integers are also used occasionally.
    706 	Although earlier POSIX versions allowed <code>time_t</code> to be a
    707 	floating-point type, this was not supported by any practical
    708 	systems, and POSIX.1-2013 and the tz code both
    709 	require <code>time_t</code>
    710 	to be an integer type.
    711   </li>
    712 </ul>
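<p>
As a rough illustration of the <code>M</code><var>m</var><code>.</code><var>n</var><code>.</code><var>d</var>
date form described above, the following Python sketch resolves such a rule
to a calendar date for a given year.  The function name
<code>posix_rule_date</code> is hypothetical and is not part of the tz code;
only the Python standard library is assumed.
</p>

```python
import calendar
from datetime import date


def posix_rule_date(year, m, n, d):
    """Resolve POSIX rule Mm.n.d to a date in the given year.

    d is the POSIX weekday (0 = Sunday ... 6 = Saturday), n is the
    week (1..5, where 5 means the last occurrence), m is the month.
    """
    # POSIX counts Sunday as 0; date.weekday() counts Monday as 0.
    target = (d - 1) % 7
    first = date(year, m, 1)
    # Day of month of the first occurrence of the target weekday.
    day = 1 + (target - first.weekday()) % 7
    # Advance to the nth occurrence.
    day += 7 * (n - 1)
    # Week 5 means "last": step back if we ran past the month's end.
    days_in_month = calendar.monthrange(year, m)[1]
    while day > days_in_month:
        day -= 7
    return date(year, m, day)


# The New Zealand rules from the example TZ string, applied to 2008.
dst_start = posix_rule_date(2008, 9, 5, 0)  # last Sunday of September
dst_end = posix_rule_date(2008, 4, 1, 0)    # first Sunday of April
print(dst_start, dst_end)
```

<p>
For 2008 this yields 2008-09-28 and 2008-04-06.  The sketch covers only the
date part of a rule; the optional '<code>/</code><var>time</var>' suffix
and the UT offsets would have to be applied separately.
</p>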
    713 <p>
    714 These are the extensions that have been made to the POSIX functions:
    715 </p>
    716 <ul>
    717   <li>
    718     <p>
    719 	The TZ environment variable is used in generating the name of a file
    720 	from which time zone information is read (or is interpreted a la
    721 	POSIX); TZ is no longer constrained to be a three-letter time zone
    722 	name followed by a number of hours and an optional three-letter
    723 	daylight time zone name.  The daylight saving time rules to be used
    724 	for a particular time zone are encoded in the time zone file;
    725 	the format of the file allows U.S., Australian, and other rules to be
    726 	encoded, and allows for situations where more than two time zone
    727 	abbreviations are used.
    728     </p>
    729     <p>
    730 	It was recognized that allowing the TZ environment variable to
    731 	take on values such as '<code>America/New_York</code>' might
    732 	cause "old" programs
    733 	(that expect TZ to have a certain form) to operate incorrectly;
    734 	consideration was given to using some other environment variable
    735 	(for example, TIMEZONE) to hold the string used to generate the
    736 	time zone information file name.  In the end, however, it was decided
    737 	to continue using TZ: it is widely used for time zone purposes;
    738 	separately maintaining both TZ and TIMEZONE seemed a nuisance;
    739 	and systems where "new" forms of TZ might cause problems can simply
    740 	use TZ values such as "<code>EST5EDT</code>" which can be used both by
    741 	"new" programs (a la POSIX) and "old" programs (as zone names and
    742 	offsets).
    743     </p>
    744 </li>
    745 <li>
    746 	The code supports platforms with a UT offset member
    747 	in <code>struct tm</code>,
    748 	e.g., <code>tm_gmtoff</code>.
    749 </li>
    750 <li>
    751 	The code supports platforms with a time zone abbreviation member in
    752 	<code>struct tm</code>, e.g., <code>tm_zone</code>.
    753 </li>
    754 <li>
    755 	Since the TZ environment variable can now be used to control time
    756 	conversion, the <code>daylight</code>
    757 	and <code>timezone</code> variables are no longer needed.
    758 	(These variables are defined and set by <code>tzset</code>;
    759 	however, their values will not be used
    760 	by <code>localtime</code>.)
    761 </li>
    762 <li>
    763 	Functions <code>tzalloc</code>, <code>tzfree</code>,
    764 	<code>localtime_rz</code>, and <code>mktime_z</code> are provided for
    765 	more-efficient thread-safe applications that need to use
    766 	multiple time zones.  The <code>tzalloc</code>
    767 	and <code>tzfree</code> functions allocate and free objects of
    768 	type <code>timezone_t</code>, and <code>localtime_rz</code>
    769 	and <code>mktime_z</code> are like <code>localtime_r</code>
    770 	and <code>mktime</code> with an extra
    771 	<code>timezone_t</code> argument.  The functions were inspired
    772 	by NetBSD.
    773 </li>
    774 <li>
    775 	A function <code>tzsetwall</code> has been added to arrange
    776 	for the system's
    777 	best approximation to local wall clock time to be delivered by
    778 	subsequent calls to <code>localtime</code>.  Source code for portable
    779 	applications that "must" run on local wall clock time should call
    780 	<code>tzsetwall</code>; if such code is moved to "old" systems that don't
    781 	provide <code>tzsetwall</code>, you won't be able to generate an executable program.
    782 	(These time zone functions also arrange for local wall clock time to be
    783 	used if <code>tzset</code> is called &ndash; directly or indirectly &ndash;
    784 	and there's no TZ
    785 	environment variable; portable applications should not, however, rely
    786 	on this behavior since it's not the way SVR2 systems behave.)
    787 </li>
    788 <li>
    789 	Negative <code>time_t</code> values are supported, on systems
    790 	where <code>time_t</code> is signed.
    791 </li>
    792 <li>
    793 	These functions can account for leap seconds, thanks to Bradley White.
    794 </li>
    795 </ul>
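<p>
To make the <code>time_t</code> range concrete, here is a small Python
sketch (not part of the tz code) that converts the extreme values of a
signed 32-bit <code>time_t</code>, and one negative value, to UTC.  The
helper name <code>utc_from_time_t</code> is hypothetical; it uses plain
date arithmetic, so it ignores leap seconds just as POSIX does.
</p>

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def utc_from_time_t(t):
    """Convert a POSIX seconds count to a UTC datetime.

    Adding a timedelta to the Epoch also works for negative counts.
    """
    return EPOCH + timedelta(seconds=t)


INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

print(utc_from_time_t(INT32_MAX))  # last second a signed 32-bit time_t can hold
print(utc_from_time_t(INT32_MIN))  # earliest second it can hold
print(utc_from_time_t(-1))         # one second before the Epoch
```

<p>
The first value is 2038-01-19 03:14:07 UTC, matching the limit mentioned
earlier; the second is 1901-12-13 20:45:52 UTC.
</p>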
    796 <p>
    797 Points of interest to folks with other systems:
    798 </p>
    799 <ul>
    800   <li>
    801 	Code compatible with this package is already part of many platforms,
    802 	including GNU/Linux, Android, the BSDs, Chromium OS, Cygwin, AIX, iOS,
    803 	BlackBerry 10, macOS, Microsoft Windows, OpenVMS, and Solaris.
    804 	On such hosts, the primary use of this package
    805 	is to update obsolete time zone rule tables.
    806 	To do this, you may need to compile the time zone compiler
    807 	'<code>zic</code>' supplied with this package instead of using
    808 	the system '<code>zic</code>', since the format
    809 	of <code>zic</code>'s input is occasionally extended, and a
    810 	platform may still be shipping an older <code>zic</code>.
    811   </li>
    812   <li>
    813 	The UNIX Version 7 <code>timezone</code> function is not
    814 	present in this package;
    815 	it's impossible to reliably map <code>timezone</code>'s arguments (a "minutes west
    816 	of GMT" value and a "daylight saving time in effect" flag) to a
    817 	time zone abbreviation, and we refuse to guess.
    818 	Programs that in the past used the <code>timezone</code> function may now examine
    819 	<code>localtime(&amp;clock)-&gt;tm_zone</code>
    820 	(if <code>TM_ZONE</code> is defined) or
    821 	<code>tzname[localtime(&amp;clock)-&gt;tm_isdst]</code>
    822 	(if <code>HAVE_TZNAME</code> is defined)
    823 	to learn the correct time zone abbreviation to use.
    824   </li>
    825   <li>
    826 	The 4.2BSD <code>gettimeofday</code> function is not used in
    827 	this package.
    828 	That function formerly let users obtain the current UTC offset and DST flag,
    829 	but this functionality was removed in later versions of BSD.
    830   </li>
    831   <li>
    832 	In SVR2, time conversion fails for near-minimum or near-maximum
    833 	<code>time_t</code> values when doing conversions for places
    834 	that don't use UT.
    835 	This package takes care to do these conversions correctly.
    836 	A comment in the source code tells how to get compatibly wrong
    837 	results.
    838   </li>
    839 </ul>
    840 <p>
    841 The functions that are conditionally compiled
    842 if <code>STD_INSPIRED</code> is defined
    843 should, at this point, be looked on primarily as food for thought.  They are
    844 not in any sense "standard compatible" &ndash; some are not, in fact,
    845 specified in <em>any</em> standard.  They do, however, represent responses of
    846 various authors to
    847 standardization proposals.
    848 </p>
    849 
    850 <p>
    851 Other time conversion proposals, in particular the one developed by folks at
    852 Hewlett Packard, offer a wider selection of functions that provide capabilities
    853 beyond those provided here.  The absence of such functions from this package
    854 is not meant to discourage the development, standardization, or use of such
    855 functions.  Rather, their absence reflects the decision to make this package
    856 contain valid extensions to POSIX, to ensure its broad acceptability.  If
    857 more powerful time conversion functions can be standardized, so much the
    858 better.
    859 </p>
    860   </section>
    861 
    862 
    863   <section>
    864     <h2 id="stability">Interface stability</h2>
    865 <p>
    866 The tz code and data supply the following interfaces:
    867 </p>
    868 <ul>
    869   <li>
    870    A set of zone names as per "<a href="#naming">Names of time zone
    871    rules</a>" above.
    872   </li>
    873   <li>
    874    Library functions described in "<a href="#functions">Time and date
    875    functions</a>" above.
    876   </li>
    877   <li>
    878    The programs <code>tzselect</code>, <code>zdump</code>,
    879    and <code>zic</code>, documented in their man pages.
    880   </li>
    881   <li>
    882    The format of <code>zic</code> input files, documented in
    883    the <code>zic</code> man page.
    884   </li>
    885   <li>
    886    The format of <code>zic</code> output files, documented in
    887    the <code>tzfile</code> man page.
    888   </li>
    889   <li>
    890    The format of zone table files, documented in <code>zone1970.tab</code>.
    891   </li>
    892   <li>
    893    The format of the country code file, documented in <code>iso3166.tab</code>.
    894   </li>
    895   <li>
    896    The version number of the code and data, as the first line of
    897    the text file '<code>version</code>' in each release.
    898   </li>
    899 </ul>
    900 <p>
    901 Interface changes in a release attempt to preserve compatibility with
    902 recent releases.  For example, tz data files typically do not rely on
    903 recently-added <code>zic</code> features, so that users can run
    904 older <code>zic</code> versions to process newer data
    905 files.  <a href="tz-link.htm">Sources for time zone and daylight
    906 saving time data</a> describes how
    907 releases are tagged and distributed.
    908 </p>
    909 
    910 <p>
    911 Interfaces not listed above are less stable.  For example, users
    912 should not rely on particular UT offsets or abbreviations for
    913 timestamps, as data entries are often based on guesswork and these
    914 guesses may be corrected or improved.
    915 </p>
    916   </section>
    917 
    918 
    919   <section>
    920     <h2 id="calendar">Calendrical issues</h2>
    921 <p>
    922 Calendrical issues are a bit out of scope for a time zone database,
    923 but they indicate the sort of problems that we would run into if we
    924 extended the time zone database further into the past.  An excellent
    925 resource in this area is Nachum Dershowitz and Edward M. Reingold,
    926 <cite><a href="https://www.cs.tau.ac.il/~nachum/calendar-book/third-edition/">Calendrical
    927 Calculations: Third Edition</a></cite>, Cambridge University Press (2008).
    928 Other information and sources are given in the file '<samp>calendars</samp>'
    929 in the tz distribution.  They sometimes disagree.
    930 </p>
    931   </section>
    932 
    933 
    934   <section>
    935     <h2 id="planets">Time and time zones on other planets</h2>
    936 <p>
    937 Some people's work schedules use Mars time.  Jet Propulsion Laboratory
    938 (JPL) coordinators have kept Mars time on and off at least since 1997
    939 for the Mars Pathfinder mission.  Some of their family members have
    940 also adapted to Mars time.  Dozens of special Mars watches were built
    941 for JPL workers who kept Mars time during the Mars Exploration
    942 Rovers mission (2004).  These timepieces look like normal Seikos and
    943 Citizens but use Mars seconds rather than terrestrial seconds.
    944 </p>
    945 
    946 <p>
    947 A Mars solar day is called a "sol" and has a mean period equal to
    948 about 24 hours 39 minutes 35.244 seconds in terrestrial time.  It is
    949 divided into a conventional 24-hour clock, so each Mars second equals
    950 about 1.02749125 terrestrial seconds.
    951 </p>
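<p>
The conversion factor in the preceding paragraph can be checked with a
short Python sketch.  The constant names are illustrative only, and the sol
length is the approximate value quoted above.
</p>

```python
# Mean length of a sol in terrestrial seconds: 24 h 39 min 35.244 s.
SOL_SECONDS = 24 * 3600 + 39 * 60 + 35.244

# A sol is divided into 24 * 60 * 60 Mars seconds.
MARS_SECONDS_PER_SOL = 24 * 60 * 60

# Length of one Mars second, in terrestrial seconds.
mars_second = SOL_SECONDS / MARS_SECONDS_PER_SOL
print(round(mars_second, 8))
```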
    952 
    953 <p>
    954 The prime meridian of Mars goes through the center of the crater
    955 Airy-0, named in honor of the British astronomer who built the
    956 Greenwich telescope that defines Earth's prime meridian.  Mean solar
    957 time on the Mars prime meridian is called Mars Coordinated Time (MTC).
    958 </p>
    959 
    960 <p>
    961 Each landed mission on Mars has adopted a different reference for
    962 solar time keeping, so there is no real standard for Mars time zones.
    963 For example, the Mars Exploration Rover project (2004) defined two
    964 time zones "Local Solar Time A" and "Local Solar Time B" for its two
    965 missions, each zone designed so that its time equals local true solar
    966 time at approximately the middle of the nominal mission.  Such a "time
    967 zone" is not particularly suited for any application other than the
    968 mission itself.
    969 </p>
    970 
    971 <p>
    972 Many calendars have been proposed for Mars, but none have achieved
    973 wide acceptance.  Astronomers often use Mars Sol Date (MSD), which is a
    974 sequential count of Mars solar days elapsed since about 1873-12-29
    975 12:00 GMT.
    976 </p>
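<p>
As a very rough sketch, an approximate MSD can be obtained by dividing the
terrestrial time elapsed since that epoch by the mean sol length given
earlier.  The Python code below makes two simplifying assumptions: it
treats the "about 1873-12-29 12:00 GMT" epoch as exact, and it ignores the
difference between UTC and the time scale that astronomers actually use, so
its result is only approximate.  The function name
<code>approx_msd</code> is hypothetical.
</p>

```python
from datetime import datetime, timezone

# Mean sol length in terrestrial seconds (approximate).
SOL_SECONDS = 88775.244

# Assumption: the approximate epoch is treated as exact here.
MSD_EPOCH = datetime(1873, 12, 29, 12, 0, tzinfo=timezone.utc)


def approx_msd(when):
    """Return an approximate Mars Sol Date for an aware datetime."""
    return (when - MSD_EPOCH).total_seconds() / SOL_SECONDS


print(approx_msd(datetime(2000, 1, 1, 12, 0, tzinfo=timezone.utc)))
```

<p>
For precise work, the Allison and Schmunk technical notes listed under
Sources below give the full procedure.
</p>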
    977 
    978 <p>
    979 In our solar system, Mars is the planet with time and calendar most
    980 like Earth's.  On other planets, Sun-based time and calendars would
    981 work quite differently.  For example, although Mercury's sidereal
    982 rotation period is 58.646 Earth days, Mercury revolves around the Sun
    983 so rapidly that an observer on Mercury's equator would see a sunrise
    984 only every 175.97 Earth days, i.e., a Mercury year is 0.5 of a Mercury
    985 day.  Venus is more complicated, partly because its rotation is
    986 slightly retrograde: its year is 1.92 of its days.  Gas giants like
    987 Jupiter are trickier still, as their polar and equatorial regions
    988 rotate at different rates, so that the length of a day depends on
    989 latitude.  This effect is most pronounced on Neptune, where the day is
    990 about 12 hours at the poles and 18 hours at the equator.
    991 </p>
    992 
    993 <p>
    994 Although the tz database does not support time on other planets, it is
    995 documented here in the hopes that support will be added eventually.
    996 </p>
    997 
    998 <p>
    999 Sources:
   1000 </p>
   1001 <ul>
   1002   <li>
   1003 Michael Allison and Robert Schmunk,
   1004 "<a href="https://www.giss.nasa.gov/tools/mars24/help/notes.html">Technical
   1005 Notes on Mars Solar Time as Adopted by the Mars24 Sunclock</a>"
   1006 (2012-08-08).
   1007   </li>
   1008   <li>
   1009 Jia-Rui Chong,
   1010 "<a href="http://articles.latimes.com/2004/jan/14/science/sci-marstime14">Workdays
   1011 Fit for a Martian</a>", Los Angeles Times
   1012 (2004-01-14), pp A1, A20-A21.
   1013   </li>
   1014   <li>
   1015 Tom Chmielewski,
   1016 "<a href="https://www.theatlantic.com/technology/archive/2015/02/jet-lag-is-worse-on-mars/386033/">Jet
   1017 Lag Is Worse on Mars</a>", The Atlantic (2015-02-26).
   1018   </li>
   1019   <li>
   1020 Matt Williams,
   1021 "<a href="https://www.universetoday.com/37481/days-of-the-planets/">How
   1022 long is a day on the other planets of the solar system?</a>", Universe Today
   1023 (2017-04-27).
   1024   </li>
   1025 </ul>
   1026   </section>
   1027 
   1028   <footer>
   1029     <hr>
   1030 This file is in the public domain, so clarified as of 2009-05-17 by
   1031 Arthur David Olson.
   1032   </footer>
   1033 </body>
   1034 </html>
   1035