<chapter xmlns="http://docbook.org/ns/docbook" version="5.0"
         xml:id="std.iterators" xreflabel="Iterators">
<?dbhtml filename="iterators.html"?>

<info><title>
  Iterators
  <indexterm><primary>Iterators</primary></indexterm>
</title>
  <keywordset>
    <keyword>ISO C++</keyword>
    <keyword>library</keyword>
  </keywordset>
</info>



<!-- Sect1 01 : Predefined -->
<section xml:id="std.iterators.predefined" xreflabel="Predefined"><info><title>Predefined</title></info>


  <section xml:id="iterators.predefined.vs_pointers" xreflabel="Versus Pointers"><info><title>Iterators vs. Pointers</title></info>

   <para>
     This FAQ <link linkend="faq.iterator_as_pod">entry</link> points
     out that iterators are not implemented as pointers.  They are a
     generalization of pointers, but libstdc++ implements them as
     separate classes.
   </para>
   <para>
     Keeping that simple fact in mind as you design your code will
     prevent a whole lot of difficult-to-understand bugs.
   </para>
   <para>
     You can think of it the other way 'round, even.  Since iterators
     are a generalization, that means
     that <emphasis>pointers</emphasis> are
     <emphasis>iterators</emphasis>, and that pointers can be used
     wherever an iterator would be.  All those functions in the
     Algorithms section of the Standard will work just as well on plain
     arrays and their pointers.
   </para>
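   <para>
     As a minimal sketch of that point, the standard algorithms below are
     handed nothing but raw pointers into a plain array.  The helper
     names <code>sort_and_sum</code> and <code>index_of</code> are
     hypothetical and exist only for this illustration.
   </para>
   <programlisting language="c++"><![CDATA[
```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>

// Hypothetical helper: sorts a plain array in place and returns the sum
// of its elements.  Both algorithms receive raw pointers as iterators.
int sort_and_sum(int* data, std::size_t n)
{
    int* first = data;       // points at the first element
    int* last  = data + n;   // points one past the last element
    std::sort(first, last);
    return std::accumulate(first, last, 0);
}

// Hypothetical helper: returns the index of value, or n if it is absent.
std::size_t index_of(const int* data, std::size_t n, int value)
{
    const int* hit = std::find(data, data + n, value);
    return static_cast<std::size_t>(hit - data);
}
```
]]></programlisting>
   <para>
     Nothing here is specific to arrays; the same calls would compile
     unchanged with the iterators of a standard container.
   </para>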
   <para>
     That doesn't mean that when you pass in a pointer, it gets
     wrapped into some special delegating iterator-to-pointer class
     with a layer of overhead.  (If you think that's the case
     anywhere, you don't understand templates to begin with...)  Oh,
     no; if you pass in a pointer, then the compiler will instantiate
     that template using <code>T*</code> as a type, and good old
     high-speed pointer arithmetic as its operations, so the resulting
     code will be doing exactly the same things as it would be doing
     if you had hand-coded it yourself (for the 273rd time).
   </para>
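   <para>
     The instantiation can be made visible with a small sketch.  The
     function templates <code>sum_range</code> and
     <code>is_raw_pointer</code> are hypothetical names invented for
     this example; when they are called with an array, the deduced
     iterator type is the pointer type itself.
   </para>
   <programlisting language="c++"><![CDATA[
```cpp
#include <type_traits>
#include <vector>

// Hypothetical algorithm written purely in terms of iterator operations.
template <typename Iter>
long sum_range(Iter first, Iter last)
{
    // When called with int*, Iter is exactly int*: ++first and *first
    // are ordinary pointer operations, with no wrapper class in between.
    long total = 0;
    for (; first != last; ++first)
        total += *first;
    return total;
}

// Hypothetical probe: reports whether the deduced iterator type is a
// raw pointer.
template <typename Iter>
bool is_raw_pointer(Iter, Iter)
{
    return std::is_pointer<Iter>::value;
}
```
]]></programlisting>
   <para>
     The same <code>sum_range</code> also accepts
     <code>std::vector&lt;int&gt;::iterator</code>; each distinct
     iterator type simply gets its own instantiation.
   </para>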
   <para>
     How much overhead <emphasis>is</emphasis> there when using an
     iterator class?  Very little.  Most of the layering classes
     contain nothing but typedefs, and typedefs are
     "meta-information" that simply tells the compiler some
     nicknames; they don't create code.  That information gets passed
     down through inheritance, so while the compiler has to do work
     looking up all the names, your runtime code does not.  (This has
     been a prime concern from the beginning.)
   </para>
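   <para>
     One way to see this compile-time-only information is through
     <code>std::iterator_traits</code>, which exposes the same nested
     type names for pointers and for class iterators.  The sketch below
     assumes a C++11 compiler for <code>static_assert</code>; the
     typedef names and <code>count_steps</code> are hypothetical.
   </para>
   <programlisting language="c++"><![CDATA[
```cpp
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical shorthand names for the two sets of traits.
typedef std::iterator_traits<int*>                        pointer_traits;
typedef std::iterator_traits<std::vector<int>::iterator>  vector_traits;

// All of these checks are resolved by the compiler; none emits code.
static_assert(std::is_same<pointer_traits::value_type, int>::value,
              "a pointer to int iterates over int");
static_assert(std::is_same<vector_traits::value_type, int>::value,
              "a vector<int> iterator iterates over int");
static_assert(std::is_same<pointer_traits::iterator_category,
                           std::random_access_iterator_tag>::value,
              "pointers are random-access iterators");
static_assert(std::is_empty<std::random_access_iterator_tag>::value,
              "category tags carry no data");

// Hypothetical helper: counts the steps from first to last, using the
// traits only to name the result type.
template <typename Iter>
typename std::iterator_traits<Iter>::difference_type
count_steps(Iter first, Iter last)
{
    typename std::iterator_traits<Iter>::difference_type n = 0;
    for (; first != last; ++first)
        ++n;
    return n;
}
```
]]></programlisting>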


  </section>

  <section xml:id="iterators.predefined.end" xreflabel="end() Is One Past the End"><info><title>One Past the End</title></info>


   <para>This starts off sounding complicated, but is actually very easy,
      especially towards the end.  Trust me.
   </para>
   <para>Beginners usually have a little trouble understanding the whole
      'past-the-end' thing, until they remember their early algebra classes
      (see, they <emphasis>told</emphasis> you that stuff would come in handy!) and
      the concept of half-open ranges.
   </para>
   <para>First, some history, and a reminder of some of the funkier rules in
      C and C++ for built-in arrays.  The following rules have always been
      true for both languages:
   </para>
   <orderedlist inheritnum="ignore" continuation="restarts">
      <listitem>
        <para>You can point anywhere in the array, <emphasis>or to the first element
          past the end of the array</emphasis>.  A pointer that points to one
          past the end of the array is guaranteed to be as unique as a
          pointer to somewhere inside the array, so that you can compare
          such pointers safely.
        </para>
      </listitem>
      <listitem>
        <para>You can only dereference a pointer that points into an array.
          If your array pointer points outside the array -- even to just
          one past the end -- and you dereference it, Bad Things happen.
        </para>
      </listitem>
      <listitem>
        <para>Strictly speaking, simply pointing anywhere else invokes
          undefined behavior.  Most programs won't puke until such a
          pointer is actually dereferenced, but the standards leave that
          up to the platform.
        </para>
      </listitem>
   </orderedlist>
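   <para>
     The first two rules can be sketched in code as follows.  The
     function names are hypothetical; the point is that the
     past-the-end pointer is formed and compared, but never
     dereferenced.
   </para>
   <programlisting language="c++"><![CDATA[
```cpp
#include <cstddef>

// Hypothetical helper: counts elements by walking from the first
// element up to, but not including, the past-the-end position.
std::size_t count_elements(const int* first, const int* last)
{
    std::size_t n = 0;
    for (const int* p = first; p != last; ++p)  // comparing with 'last' is fine
        ++n;                                    // 'last' itself is never dereferenced
    return n;
}

// Hypothetical helper: checks that every element pointer compares less
// than the past-the-end pointer of the same array.
bool end_is_after_every_element(const int* first, const int* last)
{
    for (const int* p = first; p != last; ++p)
        if (!(p < last))
            return false;
    return true;
}
```
]]></programlisting>
   <para>
     Given <code>int a[5]</code>, the expression <code>a + 5</code> is
     the legal past-the-end pointer; <code>*(a + 5)</code> and
     <code>a + 6</code> are the cases the second and third rules warn
     about.
   </para>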
   <para>The reason this past-the-end addressing was allowed is to make it
      easy to write a loop to go over an entire array, e.g.,
      <code>while (*d++ = *s++);</code>.
   </para>
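   <para>
     Spelled out as a complete function, that loop might look like the
     sketch below.  The name <code>copy_string</code> and its return
     value are assumptions made for illustration; the caller is assumed
     to supply a destination that is large enough.
   </para>
   <programlisting language="c++"><![CDATA[
```cpp
#include <cstddef>

// Hypothetical helper: copies a NUL-terminated string and returns the
// number of characters written, terminator included.
std::size_t copy_string(char* dest, const char* src)
{
    char* d = dest;
    const char* s = src;
    while ((*d++ = *s++) != '\0')
        ;   // all of the work happens in the loop condition
    // Both pointers have now been advanced past the terminator.  If the
    // destination was exactly large enough, d is one past its end,
    // which is legal to hold and compare but not to dereference.
    return static_cast<std::size_t>(d - dest);
}
```
]]></programlisting>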
   <para>So, when you think of two pointers delimiting an array, don't think
      of them as indexing 0 through N-1.  Think of them as <emphasis>boundary
      markers</emphasis>:
   </para>
   <programlisting>

   beginning            end
     |                   |
     |                   |               This is bad.  Always having to
     |                   |               remember to add or subtract one.
     |                   |               Off-by-one bugs very common here.
     V                   V
        array of N elements
     |---|---|--...--|---|---|
     | 0 | 1 |  ...  |N-2|N-1|
     |---|---|--...--|---|---|

     ^                       ^
     |                       |
     |                       |           This is good.  This is safe.  This
     |                       |           is guaranteed to work.  Just don't
     |                       |           dereference 'end'.
   beginning                end

   </programlisting>
   <para>See?  Everything between the boundary markers is part of the array.
      Simple.
   </para>
   <para>Now think back to your junior-high school algebra course, when you
      were learning how to draw graphs.  Remember that a graph terminating
      with a solid dot meant, "Everything up through this point,"
      and a graph terminating with an open dot meant, "Everything up
      to, but not including, this point," respectively called closed
      and open ranges?  Remember how closed ranges were written with
      brackets, <emphasis>[a,b]</emphasis>, and open ranges were written with parentheses,
      <emphasis>(a,b)</emphasis>?
   </para>
   <para>The boundary markers for arrays describe a <emphasis>half-open range</emphasis>,
      starting with (and including) the first element, and ending with (but
      not including) the position one past the last element:
      <emphasis>[beginning,end)</emphasis>.  See, I told you it would be
      simple in the end.
   </para>
   <para>Iterators, and everything working with iterators, follow this same
      time-honored tradition.  A container's <code>begin()</code> member
      function returns an iterator referring to the first element, and its
      <code>end()</code> member function returns a past-the-end iterator,
      which is guaranteed to be unique and comparable against any other
      iterator pointing into the middle of the container.  For an empty
      container, the two compare equal.
   </para>
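   <para>
     A minimal sketch of the usual idiom, using
     <code>std::vector</code> as a stand-in for any container.  The
     helper names <code>sum_all</code> and <code>last_or</code> are
     hypothetical.
   </para>
   <programlisting language="c++"><![CDATA[
```cpp
#include <vector>

// Hypothetical helper: sums a vector by walking the half-open range
// [begin(), end()).
int sum_all(const std::vector<int>& v)
{
    int total = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
        total += *it;   // 'it' is never equal to end() here, so this is safe
    return total;
}

// Hypothetical helper: returns the last element, or fallback if the
// container is empty.
int last_or(const std::vector<int>& v, int fallback)
{
    if (v.begin() == v.end())   // empty: there is nothing to step back to
        return fallback;
    std::vector<int>::const_iterator it = v.end();
    --it;                       // step back from past-the-end to the last element
    return *it;
}
```
]]></programlisting>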
   <para>Container constructors, container member functions, and algorithms
      all take pairs of iterators describing a range of values on which to
      operate.  All of these ranges are half-open ranges, so you pass the
      beginning iterator as the starting parameter, and the
      one-past-the-end iterator as the finishing parameter.
   </para>
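   <para>
     For instance, the sketch below passes the same kind of iterator
     pair to a range constructor, a range <code>insert</code>, and an
     algorithm.  The function names <code>build_twice</code> and
     <code>count_value</code> are hypothetical.
   </para>
   <programlisting language="c++"><![CDATA[
```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a vector holding two copies of the range
// [first, last), one after the other.
std::vector<int> build_twice(const int* first, const int* last)
{
    std::vector<int> v(first, last);   // constructor taking a range
    v.insert(v.end(), first, last);    // member function taking a range
    return v;
}

// Hypothetical helper: counts occurrences of value in the whole vector.
std::ptrdiff_t count_value(const std::vector<int>& v, int value)
{
    return std::count(v.begin(), v.end(), value);   // algorithm taking a range
}
```
]]></programlisting>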
   <para>This generalizes very well.  You can operate on sub-ranges quite
      easily this way; functions accepting a <emphasis>[first,last)</emphasis> range
      don't know or care whether they are the boundaries of an entire {array,
      sequence, container, whatever}, or whether they only enclose a few
      elements from the center.  This approach also makes zero-length
      sequences very simple to recognize:  if the two endpoints compare
      equal, then the {array, sequence, container, whatever} is empty.
   </para>
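   <para>
     Both observations fit in a few lines.  In this sketch,
     <code>is_empty_range</code> and <code>reverse_interior</code> are
     hypothetical names; the second one hands
     <code>std::reverse</code> a sub-range that excludes the first and
     last elements.
   </para>
   <programlisting language="c++"><![CDATA[
```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: a half-open range is empty exactly when its two
// endpoints compare equal.
template <typename Iter>
bool is_empty_range(Iter first, Iter last)
{
    return first == last;
}

// Hypothetical helper: reverses only the interior elements, leaving the
// first and last elements where they are.
void reverse_interior(std::vector<int>& v)
{
    if (v.size() < 3)
        return;                                // no interior to reverse
    std::reverse(v.begin() + 1, v.end() - 1);  // sub-range [begin()+1, end()-1)
}
```
]]></programlisting>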
   <para>Just don't dereference <code>end()</code>.
   </para>

  </section>
</section>

<!-- Sect1 02 : Stream -->

</chapter>