/* Interface to prologue value handling for GDB.
   Copyright (C) 2003-2024 Free Software Foundation, Inc.

   This file is part of GDB.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */

#ifndef GDB_PROLOGUE_VALUE_H
#define GDB_PROLOGUE_VALUE_H

/* What sort of value is this?  This determines the interpretation
   of subsequent fields.  */
enum prologue_value_kind
{
  /* We don't know anything about the value.  This is also used for
     values we could have kept track of, when doing so would have
     been too complex and we don't want to bother.  The bottom of
     our lattice.  */
  pvk_unknown,

  /* A known constant.  K is its value.  */
  pvk_constant,

  /* The value that register REG originally had *UPON ENTRY TO THE
     FUNCTION*, plus K.  If K is zero, this means, obviously, just
     the value REG had upon entry to the function.  REG is a GDB
     register number.  Before we start interpreting, we initialize
     every register R to { pvk_register, R, 0 }.  */
  pvk_register,
};

/* When we analyze a prologue, we're really doing 'abstract
   interpretation' or 'pseudo-evaluation': running the function's code
   in simulation, but using conservative approximations of the values
   it would have when it actually runs.  For example, if our function
   starts with the instruction:

      addi r1, 42     # add 42 to r1

   we don't know exactly what value will be in r1 after executing this
   instruction, but we do know it'll be 42 greater than its original
   value.

   If we then see an instruction like:

      addi r1, 22     # add 22 to r1

   we still don't know what r1's value is, but again, we can say it is
   now 64 greater than its original value.

   If the next instruction were:

      mov r2, r1      # set r2 to r1's value

   then we can say that r2's value is now the original value of r1
   plus 64.

   It's common for prologues to save registers on the stack, so we'll
   need to track the values of stack frame slots, as well as the
   registers.  So after an instruction like this:

      mov (fp+4), r2

   we'd know that the stack slot four bytes above the frame pointer
   holds the original value of r1 plus 64.

   And so on.

   Of course, this can only go so far before it gets unreasonable.  If
   we wanted to be able to say anything about the value of r1 after
   the instruction:

      xor r1, r3      # exclusive-or r1 and r3, place result in r1

   then things would get pretty complex.  But remember, we're just
   doing a conservative approximation; if exclusive-or instructions
   aren't relevant to prologues, we can just say r1's value is now
   'unknown'.  We can ignore things that are too complex, if that loss
   of information is acceptable for our application.

   So when I say "conservative approximation" here, what I mean is an
   approximation that is either accurate, or marked "unknown", but
   never inaccurate.

   Once you've reached the current PC, or an instruction that you
   don't know how to simulate, you stop.  Now you can examine the
   state of the registers and stack slots you've kept track of.

   - To see how large your stack frame is, just check the value of the
     stack pointer register; if it's the original value of the SP
     minus a constant, then that constant is the stack frame's size.
     If the SP's value has been marked as 'unknown', then that means
     the prologue has done something too complex for us to track, and
     we don't know the frame size.

   - To see where we've saved the previous frame's registers, we just
     search the values we've tracked --- stack slots, usually, but
     registers, too, if you want --- for something equal to the
     register's original value.  If the ABI suggests a standard place
     to save a given register, then we can check there first, but
     really, anything that will get us back the original value will
     probably work.

   Sure, this takes some work.  But prologue analyzers aren't
   quick-and-simple pattern matching to recognize a few fixed prologue
   forms any more; they're big, hairy functions.  Along with inferior
   function calls, prologue analysis accounts for a substantial
   portion of the time needed to stabilize a GDB port.  So I think
   it's worthwhile to look for an approach that will be easier to
   understand and maintain.  In the approach used here:

   - It's easier to see that the analyzer is correct: you just see
     whether the analyzer properly (albeit conservatively) simulates
     the effect of each instruction.

   - It's easier to extend the analyzer: you can add support for new
     instructions, and know that you haven't broken anything that
     wasn't already broken.

   - It's orthogonal: to gather new information, you don't need to
     complicate the code for each instruction.  As long as your domain
     of conservative values is already detailed enough to tell you
     what you need, then all the existing instruction simulations are
     already gathering the right data for you.

   A 'struct prologue_value' is a conservative approximation of the
   real value the register or stack slot will have.  */

struct prologue_value {

  /* What sort of value is this?  This determines the interpretation
     of subsequent fields.  */
  enum prologue_value_kind kind;

  /* The meanings of the following fields depend on 'kind'; see the
     comments for the specific 'kind' values.  */
  int reg;
  CORE_ADDR k;
};

typedef struct prologue_value pv_t;


/* Return the unknown prologue value --- { pvk_unknown, ?, ? }.  */
pv_t pv_unknown (void);

/* Return the prologue value representing the constant K.  */
pv_t pv_constant (CORE_ADDR k);

/* Return the prologue value representing the original value of
   register REG, plus the constant K.  */
pv_t pv_register (int reg, CORE_ADDR k);


/* Return conservative approximations of the results of the following
   operations.  */
pv_t pv_add (pv_t a, pv_t b);               /* a + b */
pv_t pv_add_constant (pv_t v, CORE_ADDR k); /* v + k */
pv_t pv_subtract (pv_t a, pv_t b);          /* a - b */
pv_t pv_logical_and (pv_t a, pv_t b);       /* a & b */
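
/* Illustration only, not part of GDB: a minimal, self-contained
   sketch of how operations like the ones above can propagate the
   three value kinds, replaying the 'addi r1, 42; addi r1, 22;
   mov r2, r1' walk-through from the comment above struct
   prologue_value.  The names addr_t, val, reg_plus, add_constant,
   add, subtract and walk_through are hypothetical stand-ins chosen
   so the block compiles on its own; the real definitions may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for this header's types.  GDB's CORE_ADDR
// is an unsigned integer type, so arithmetic on K wraps around.
using addr_t = std::uint64_t;
enum kind_t { k_unknown, k_constant, k_register };
struct val { kind_t kind; int reg; addr_t k; };

val unknown () { return { k_unknown, 0, 0 }; }
val constant (addr_t k) { return { k_constant, 0, k }; }
val reg_plus (int reg, addr_t k) { return { k_register, reg, k }; }

// v + k: constants and register-relative values absorb the addend;
// unknown stays unknown.
val add_constant (val v, addr_t k)
{
  if (v.kind == k_unknown)
    return v;
  v.k += k;
  return v;
}

// a + b: representable only if at least one operand is a constant.
val add (val a, val b)
{
  if (a.kind == k_constant)
    return add_constant (b, a.k);
  if (b.kind == k_constant)
    return add_constant (a, b.k);
  return unknown ();
}

// a - b: a constant subtrahend folds into K; two values relative to
// the same register differ by a constant; anything else is unknown.
val subtract (val a, val b)
{
  if (b.kind == k_constant)
    return add_constant (a, 0 - b.k);
  if (a.kind == k_register && b.kind == k_register && a.reg == b.reg)
    return constant (a.k - b.k);
  return unknown ();
}

// Simulate "addi r1, 42; addi r1, 22; mov r2, r1" on a machine with
// registers r0..r2 and return the final value of r2.
val walk_through ()
{
  val regs[3];
  for (int r = 0; r < 3; r++)
    regs[r] = reg_plus (r, 0);  // every register starts as itself
  regs[1] = add_constant (regs[1], 42);
  regs[1] = add_constant (regs[1], 22);
  regs[2] = regs[1];
  return regs[2];
}
```

   Under these assumptions, walk_through () yields { k_register, 1,
   64 }: the original value of r1 plus 64.  Adding two values that
   are relative to different registers has no representation in this
   domain, so the sketch returns unknown for it.  */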


/* Return non-zero iff A and B are identical expressions.

   This is not the same as asking if the two values are equal; the
   result of such a comparison would have to be a pv_boolean, and
   asking whether two 'unknown' values were equal would give you
   pv_maybe.  Same for comparing, say, { pvk_register, R1, 0 } and
   { pvk_register, R2, 0 }.

   Instead, this function asks whether the two representations are the
   same.  */
int pv_is_identical (pv_t a, pv_t b);
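
/* Illustration only, not part of GDB: a sketch of what comparing
   representations, rather than run-time values, can look like.  Only
   the fields that the kind gives meaning to are compared.  The names
   addr_t, val and is_identical are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstdint>

using addr_t = std::uint64_t;
enum kind_t { k_unknown, k_constant, k_register };
struct val { kind_t kind; int reg; addr_t k; };

// Compare representations, looking only at the fields that KIND
// makes meaningful.
bool is_identical (val a, val b)
{
  if (a.kind != b.kind)
    return false;
  switch (a.kind)
    {
    case k_unknown:
      return true;              // REG and K carry no meaning here
    case k_constant:
      return a.k == b.k;
    case k_register:
      return a.reg == b.reg && a.k == b.k;
    }
  return false;
}
```

   In this sketch two unknown values are identical even though
   nothing is known about whether they are equal, while R1 + 0 and
   R2 + 0 are not identical even though the two registers might
   happen to hold the same value at run time.  */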


/* Return non-zero if A is known to be a constant.  */
int pv_is_constant (pv_t a);

/* Return non-zero if A is the original value of register number R
   plus some constant, zero otherwise.  */
int pv_is_register (pv_t a, int r);


/* Return non-zero if A is the original value of register R plus the
   constant K.  */
int pv_is_register_k (pv_t a, int r, CORE_ADDR k);
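
/* Illustration only, not part of GDB: a sketch of the frame-size
   check described in the comment above struct prologue_value, built
   on predicates like the ones above.  The names addr_t, val,
   is_register, is_register_k and frame_size are hypothetical
   stand-ins, and register number 13 in the note below is made up.

```cpp
#include <cassert>
#include <cstdint>

using addr_t = std::uint64_t;
enum kind_t { k_unknown, k_constant, k_register };
struct val { kind_t kind; int reg; addr_t k; };

// True if A is the original value of register R plus some constant.
bool is_register (val a, int r)
{
  return a.kind == k_register && a.reg == r;
}

// True if A is the original value of register R plus exactly K.
bool is_register_k (val a, int r, addr_t k)
{
  return is_register (a, r) && a.k == k;
}

// If SP_NOW is the entry SP minus a constant, store that constant in
// *SIZE and return true; otherwise the frame size is not known.
bool frame_size (val sp_now, int sp_regnum, addr_t *size)
{
  if (!is_register (sp_now, sp_regnum))
    return false;
  *size = 0 - sp_now.k;         // K holds the negated size, mod 2^64
  return true;
}
```

   With SP as register 13 and a simulated SP of { k_register, 13,
   -32 } (stored as an unsigned value), frame_size reports 32.  An
   unknown SP makes it return false.  */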

/* A conservative boolean type, including "maybe", when we can't
   figure out whether something is true or not.  */
enum pv_boolean {
  pv_maybe,
  pv_definite_yes,
  pv_definite_no,
};


/* Decide whether a reference to SIZE bytes at ADDR refers exactly to
   an element of an array.  The array starts at ARRAY_ADDR, and has
   ARRAY_LEN values of ELT_SIZE bytes each.  If ADDR definitely does
   refer to an array element, set *I to the index of the referenced
   element in the array, and return pv_definite_yes.  If it definitely
   doesn't, return pv_definite_no.  If we can't tell, return pv_maybe.

   If the reference does touch the array, but doesn't fall exactly on
   an element boundary, or doesn't refer to the whole element, return
   pv_maybe.  */
enum pv_boolean pv_is_array_ref (pv_t addr, CORE_ADDR size,
				 pv_t array_addr, CORE_ADDR array_len,
				 CORE_ADDR elt_size,
				 int *i);
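
/* Illustration only, not part of GDB: a sketch of the three-way
   decision described above.  It uses signed offsets to keep the
   range checks easy to read, which is an assumption of this sketch
   and not a statement about how GDB implements the function.  The
   names addr_t, val, tri and is_array_ref are hypothetical
   stand-ins.

```cpp
#include <cassert>
#include <cstdint>

using addr_t = std::uint64_t;
enum kind_t { k_unknown, k_constant, k_register };
struct val { kind_t kind; int reg; addr_t k; };
enum tri { maybe, definite_yes, definite_no };

tri is_array_ref (val addr, std::int64_t size, val array_addr,
                  std::int64_t array_len, std::int64_t elt_size, int *i)
{
  // The distance between the two addresses is known only if both are
  // constants or both are relative to the same register.
  bool comparable
    = ((addr.kind == k_constant && array_addr.kind == k_constant)
       || (addr.kind == k_register && array_addr.kind == k_register
           && addr.reg == array_addr.reg));
  if (!comparable)
    return maybe;

  // Signed byte offset of the reference from the start of the array.
  std::int64_t off = static_cast<std::int64_t> (addr.k - array_addr.k);

  if (off + size <= 0 || off >= array_len * elt_size)
    return definite_no;         // entirely before or after the array
  if (off % elt_size != 0 || size != elt_size)
    return maybe;               // overlaps, but not one whole element
  *i = static_cast<int> (off / elt_size);
  return definite_yes;
}
```

   For an array of sixteen 4-byte elements starting at SP - 64, a
   4-byte reference at SP - 56 is element 2, a 4-byte reference at
   SP - 54 straddles two elements and yields maybe, and a reference
   at SP + 0 lies past the end and yields definite_no.  */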


/* A 'pv_area' keeps track of values stored in a particular region of
   memory.  */
class pv_area
{
public:

  /* Create a new area, tracking stores relative to the original value
     of BASE_REG.  If BASE_REG is SP, then this effectively records the
     contents of the stack frame: the original value of the SP is the
     frame's CFA, or some constant offset from it.

     Stores to constant addresses, unknown addresses, or to addresses
     relative to registers other than BASE_REG will trash this area; see
     pv_area::store_would_trash.

     To check whether a pointer refers to this area, only the low
     ADDR_BIT bits will be compared.  */
  pv_area (int base_reg, int addr_bit);

  ~pv_area ();

  DISABLE_COPY_AND_ASSIGN (pv_area);

  /* Store the SIZE-byte value VALUE at ADDR in this area.

     If ADDR is not relative to the same base register we used in
     creating this area, then we can't tell which values here the
     stored value might overlap, and we'll have to mark everything as
     unknown.  */
  void store (pv_t addr,
	      CORE_ADDR size,
	      pv_t value);

  /* Return the SIZE-byte value at ADDR in this area.  This may return
     pv_unknown ().  */
  pv_t fetch (pv_t addr, CORE_ADDR size);

  /* Return true if storing to address ADDR in this area would force
     us to mark the contents of the entire area as unknown.  This could
     happen if, say, ADDR is unknown, since we could be storing
     anywhere.  Or, it could happen if ADDR is relative to a register
     other than this area's base register, since we don't know the
     relative values of the two registers.

     If you've reached such a store, it may be better to simply stop the
     prologue analysis, and return the information you've gathered,
     instead of losing all that information, most of which is probably
     okay.  */
  bool store_would_trash (pv_t addr);

  /* Search this area for the original value of REG.  If we can't find
     it, return false; if we can find it, return true, and if OFFSET_P
     is non-null, set *OFFSET_P to the register's offset within this
     area.  GDBARCH is the architecture of which REG is a member.

     In the worst case, this takes time proportional to the number of
     items stored in this area.  If you plan to gather a lot of
     information about registers saved in this area, consider calling
     pv_area::scan instead, and collecting all your information in one
     pass.  */
  bool find_reg (struct gdbarch *gdbarch, int reg, CORE_ADDR *offset_p);


  /* For every part of this area whose value we know, apply FUNC to
     CLOSURE, the value's address, its size, and the value itself.  */
  void scan (void (*func) (void *closure,
			   pv_t addr,
			   CORE_ADDR size,
			   pv_t value),
	     void *closure);

private:

  struct area_entry;

  /* Delete all entries from this area.  */
  void clear_entries ();

  /* Return a pointer to the first entry we hit in this area starting
     at OFFSET and going forward.

     This may return a null pointer, if this area has no entries.

     And since the entries are a ring, this may return an entry that
     entirely precedes OFFSET.  This is the correct behavior: depending
     on the sizes involved, we could still overlap such an entry, with
     wrap-around.  */
  struct area_entry *find_entry (CORE_ADDR offset);

  /* Return non-zero if the SIZE bytes at OFFSET would overlap ENTRY;
     return zero otherwise.  ENTRY must belong to this area.  */
  int overlaps (struct area_entry *entry,
		CORE_ADDR offset,
		CORE_ADDR size);

  /* This area's base register.  */
  int m_base_reg;

  /* The mask to apply to addresses, to make the wrap-around happen at
     the right place.  */
  CORE_ADDR m_addr_mask;

  /* An element of the doubly-linked ring of entries, or null if we
     have none.  */
  struct area_entry *m_entry;
};
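
/* Illustration only, not part of GDB: a toy model of the store,
   fetch, store_would_trash and find_reg behavior documented above.
   It swaps the ring of entries for a std::map and leaves out the
   address mask, wrap-around, the gdbarch argument and any check of
   the register's size.  The names addr_t, val, toy_area and
   example_find_saved_fp are hypothetical stand-ins, and register
   numbers 13 (SP) and 11 (FP) are made up.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

using addr_t = std::uint64_t;
enum kind_t { k_unknown, k_constant, k_register };
struct val { kind_t kind; int reg; addr_t k; };

// A toy area: a sorted map from signed offset (relative to the base
// register's entry value) to the size and value stored there.
class toy_area
{
public:
  explicit toy_area (int base_reg) : m_base_reg (base_reg) {}

  bool store_would_trash (val addr) const
  {
    return addr.kind != k_register || addr.reg != m_base_reg;
  }

  void store (val addr, addr_t size, val value)
  {
    if (store_would_trash (addr))
      {
        m_slots.clear ();       // the store could have hit any slot
        return;
      }
    // Drop every slot the new store overlaps, then record it.
    std::int64_t lo = static_cast<std::int64_t> (addr.k);
    std::int64_t hi = lo + static_cast<std::int64_t> (size);
    for (auto it = m_slots.begin (); it != m_slots.end ();)
      {
        std::int64_t s_lo = it->first;
        std::int64_t s_hi
          = s_lo + static_cast<std::int64_t> (it->second.size);
        if (s_lo < hi && lo < s_hi)
          it = m_slots.erase (it);
        else
          ++it;
      }
    if (value.kind != k_unknown)
      m_slots[lo] = { size, value };
  }

  val fetch (val addr, addr_t size) const
  {
    if (!store_would_trash (addr))
      {
        auto it = m_slots.find (static_cast<std::int64_t> (addr.k));
        if (it != m_slots.end () && it->second.size == size)
          return it->second.value;
      }
    return { k_unknown, 0, 0 };
  }

  // Look for a slot holding REG's entry value, that is REG + 0.
  bool find_reg (int reg, addr_t *offset_p) const
  {
    for (const auto &s : m_slots)
      if (s.second.value.kind == k_register
          && s.second.value.reg == reg && s.second.value.k == 0)
        {
          if (offset_p != nullptr)
            *offset_p = static_cast<addr_t> (s.first);
          return true;
        }
    return false;
  }

private:
  struct slot { addr_t size; val value; };
  int m_base_reg;
  std::map<std::int64_t, slot> m_slots;
};

// Simulate "push fp" (sp -= 8; *sp = fp), then look FP up again.
bool example_find_saved_fp (addr_t *offset)
{
  toy_area stack (13);
  val sp = { k_register, 13, static_cast<addr_t> (-8) };
  stack.store (sp, 8, { k_register, 11, 0 });
  return stack.find_reg (11, offset);
}
```

   In this model example_find_saved_fp reports FP saved at offset -8
   from the entry SP (stored as an unsigned value).  A later store
   through an unknown address empties the map, which is the loss of
   information that checking store_would_trash first lets an
   analyzer avoid.  */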

#endif /* GDB_PROLOGUE_VALUE_H */