                      How to enhance XKB configuration

                         Kamil Toman, Ivan U. Pascal

                              25 November 2002

                                  Abstract

     This guide is aimed at alleviating one's labour when creating a
     new (internationalized) keyboard layout. Unlike other documents,
     this guide emphasizes the keymap developer's point of view.


1.  Overview

The developer of a new layout should read the XKB protocol specification
(The X Keyboard Extension: Protocol Specification [1]) at least to clarify
for themselves some XKB-specific terms used in this document and elsewhere
in XKB configuration. It is also wise to understand how the X server and
a client digest their keyboard inputs (with and without XKB).

Another useful source is Ivan Pascal's text about XKB configuration [2].

   [1] https://www.x.org/docs/XKB/XKBproto.pdf
   [2] http://pascal.tsu.ru/en/xkb/

Note that this document covers only enhancements which are to be made to
XFree86 versions 4.3.x and newer.


2.  The Basics

At boot time (or later at the user's command) the X server starts its xkb
keyboard extension module and reads data from a compiled configuration file.

This compiled configuration file is prepared by the program xkbcomp, which
behaves much like an ordinary compiler (see man xkbcomp). Its input is a set
of human-readable xkb configuration files which are verified and then composed
into a useful xkb configuration. Users don't need to deal with xkbcomp
themselves; for them it is invisible. Usually, it is run upon X server startup.

As you probably already know, XKB configuration consists of five main
modules:

      Keycodes
            Tables that define the translation from keyboard scan codes into
            reasonably symbolic names, maximum and minimum valid keycodes,
            symbolic aliases, and a description of physically present LED
            indicators. The primary purpose of this component is to allow
            definitions of maps of symbols (see below) to be independent of
            physical keyboard scancodes. There are two main conventions for
            symbolic names (at most four characters long):

               o  names which express some traditional meaning, like <SPCE>
                  (which stands for space bar)

               o  names which express a relative position on the keyboard,
                  for example <AE01> (the 1/exclamation mark key on US
                  keyboards), with on its right the keys <AE02>, <AE03>, etc.

      Types
            Types describe how the pressed key is affected by active modifiers
            (like Shift, Control, Alt, ...). There are several predefined
            types which cover most of the usual combinations.

      Compat
            The compatibility component defines the internal behaviour of
            modifiers. Using the compat component you can assign various
            actions (elaborately described in the XKB specification) to key
            events. This is also the place where the behaviour of LED
            indicators is defined.

      Symbols
            For i18n purposes, this is the most important table. It defines
            what values (=symbols) are assigned to what keycodes (represented
            by their symbolic name, see above). More than one value may be
            defined for each key and then it depends on the key type and on
            the modifiers state (respective compat component) which value
            will be the resulting one when the key is pressed.

      Geometry
            Geometry files aren't used by xkb itself but they may be used by
            some external programs to depict a keyboard image.

All these components have their files located in the xkb configuration tree,
in subdirectories with the same name (usually in /usr/share/X11/xkb).
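
As a rough illustration of the Keycodes component described above, a keycodes
file might contain a fragment like the following. This is only a sketch: the
section name "example" is made up, and the numeric codes follow the common
evdev numbering, which is an assumption and may not match your server or
keyboard.

     xkb_keycodes "example" {
         minimum = 8;
         maximum = 255;

         <TLDE> = 49;    // traditional name: tilde/grave key
         <AE01> = 10;    // positional name: row E, first key
         <SPCE> = 65;    // traditional name: space bar
         <RALT> = 108;   // traditional name: right Alt

         alias <ALGR> = <RALT>;        // symbolic alias

         indicator 1 = "Caps Lock";    // physically present LED
     };

Symbol maps then refer only to names such as <AE01>, never to the numbers.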


3.  Enhancing the XKB Configuration

Most XKB enhancements arise from the need to define new output symbols for
some input key events. In other words, the need to define a new symbol map (for
a new language, or standard, or just to feel more comfortable when typing text).

What do you need to do? Generally, you have to define the following things:

   o  the map of symbols itself

   o  the rules to allow users to select the new mapping

   o  the description of the new layout

First of all, it is good to go through existing layouts and to examine them
to see if there is something you could easily adjust to fit your needs. Even
if there is nothing similar, you may get some ideas about the basic concepts
and the tricks used.

3.1  Levels and Groups

Since XFree86 4.3.0, you can use multiple layouts in the xkb configuration.
Though still within the boundaries of the xkb protocol and its general ideas,
the keymap designer must obey new rules when creating new maps. In exchange
we get a more powerful and cleaner configuration system.

Remember that it is the application which must decide which symbol matches
which keycode according to the effective modifier state. The X server itself
sends only an input event message. Of course, usually the interpretation is
done by Xlib, Xaw, Motif, Qt, Gtk, or similar libraries. The X server only
supplies its mapping table (usually upon application startup).

You can think of the X server's symbol table as an irregular table where
each keycode has its row and where each combination of modifiers determines
exactly one column. The resulting cell then gives the proper symbolic value.
Not all keycodes need to bind different values for different combinations of
modifiers. The Enter key (<RTRN>), for instance, usually doesn't depend on any
modifiers so it has only one column defined in its row.

Note that in XKB there is no prior assumption that certain modifiers are
bound to certain columns. By editing the proper files (see Key Types, below)
this mapping can be changed as well.

The XKB approach is far more flexible than that of the original X protocol.
XKB introduces one additional term: the group. You can think of a group
as a vector of columns per keycode (naturally the dimension of this
vector may differ for different keycodes). What is it good for? The group is
not very useful unless you intend to use more than one logically different
set of symbols (like more than one alphabet) defined in a single mapping
table. But then the group has a natural meaning: each symbol set has its own
group and changing it means selecting a different one. The XKB approach allows
up to four different groups. The columns inside each group are called (shift)
levels. The X server knows what the current group is and reports it together
with the modifier state and the keycode in key events.

To sum it up:

   o  for each keycode the XKB keyboard map contains up to four one-dimensional
      tables - groups (logically different symbol sets)

   o  for each group of a keycode the XKB keyboard map contains some columns -
      shift levels (values reached by combinations of Shift, Ctrl, Alt, ...
      modifiers)

   o  different keycodes can have different numbers of groups

   o  different groups of one keycode can have different numbers of shift
      levels

   o  the current group number is tracked by the X server

It is clear that if you sanely define levels and groups, and sanely bind
modifiers and associated actions, you can have up to four different symbol
sets loaded simultaneously, each of them residing in its own group.
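
The shape of this table can be sketched in symbols-file syntax. The fragment
below is purely illustrative (the section name "example_groups" and the choice
of keys and keysyms are assumptions); it shows one key with two groups of two
levels each and one key with a single group and a single level. A layout meant
for the multi-layout mechanism would keep its symbols in the first group only.

     partial alphanumeric_keys
     xkb_symbols "example_groups" {

       // two groups, two shift levels in each
       key <AD01>  { symbols[Group1] = [ q, Q ],
                     symbols[Group2] = [ Cyrillic_shorti, Cyrillic_SHORTI ] };

       // one group, one level: the row has a single column
       key <RTRN>  { [ Return ] };
     };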

The multi-layout concept provides a facility to manipulate xkb groups and
symbol definitions in a way that allows almost arbitrary composition of
predefined symbol tables. To keep it fully functional you have to:

   o  define all symbols only in the first group

   o  (re)define any modifiers with extra care to avoid strange (anisometric)
      behaviour


4.  Defining New Layouts

See "Some Words About XKB internals" [3] for an explanation of the XKB
terms used here and the problems addressed by the XKB extension.

See "Common notes about XKB configuration files language" [4] for a more
precise explanation of the syntax of XKB configuration files.

   [3] http://pascal.tsu.ru/en/xkb/internals.html
   [4] http://pascal.tsu.ru/en/xkb/gram-common.html

4.1  Predefined XKB Symbol Sets

If you are about to define some European symbol map extension, you might want
to use one of four predefined Latin alphabet layouts.

Let's assume you want to extend an existing keymap and override a few keys.
Take a simple U.K. keyboard as an example (defined in pc/gb):

     default partial alphanumeric_keys
     xkb_symbols "basic" {

       include "pc/latin"

       name[Group1]="Great Britain";

       key <AE02>  { [         2,   quotedbl,  twosuperior,    oneeighth ] };
       key <AE03>  { [         3,   sterling, threesuperior,    sterling ] };
       key <AC11>  { [apostrophe,         at, dead_circumflex, dead_caron] };
       key <TLDE>  { [     grave,    notsign,          bar,          bar ] };
       key <BKSL>  { [numbersign, asciitilde,   dead_grave,   dead_breve ] };

       key <RALT>  { type[Group1]="TWO_LEVEL",
                     [ ISO_Level3_Shift, Multi_key ] };

       modifier_map Mod5   { <RALT> };
     };

It defines a new layout in the basic variant as an extension of a common latin
alphabet layout. The layout (symbol set) name is set to "Great Britain".
Then there are redefinitions of a few keycodes and a modifier binding. As
you can see, the number of shift levels is the same for the <AE02>, <AE03>,
<AC11>, <TLDE> and <BKSL> keys but it differs from the number of shift
levels of <RALT>.

Note that the <RALT> key itself is bound to Mod5 and that it serves as the
shift modifier for LevelThree and, together with Shift, as a Compose key.
It is a good habit to respect this rule in a new similar layout.

You could now define more variants of your new layout besides basic
simply by including (augmenting/overriding/...) the basic definition and
altering what may be needed.
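
A further variant built this way might look like the sketch below. The variant
name "example_variant", its description string, and the symbols chosen for
<AE04> are invented for illustration; only the include path follows the pc/gb
example above.

     partial alphanumeric_keys
     xkb_symbols "example_variant" {

       // start from the basic variant and change only what differs
       include "pc/gb(basic)"

       name[Group1]="Great Britain - example variant";

       key <AE04>  { [         4,     dollar,     EuroSign,   onequarter ] };
     };

Everything not mentioned, including the <RALT> binding, is inherited from the
included basic variant.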

4.2  Key Types

The differences in the number of columns (shift levels) are caused by the
different types of the keys (see the Types definition in section 2, The
Basics). Most keycodes get their key type implicitly from the included
"pc/latin" file, which sets it to "FOUR_LEVEL_ALPHABETIC". The only exception
is the <RALT> keycode, which is explicitly given the "TWO_LEVEL" key type.

All those names refer to pre-defined shift level schemes. Usually you can
choose a suitable shift level scheme from the list of default types in the
types subdirectory of the xkb configuration tree.

The most used schemes are:

      ONE_LEVEL
            The key does not depend on any modifiers. The symbol from the
            first level is always chosen.

      TWO_LEVEL
            The key uses the modifier Shift and may have two possible values.
            The second level is chosen by the Shift modifier. If the Lock
            modifier (usually Caps-lock) applies, the symbol is further
            processed using system-specific capitalization rules. If both
            the Shift and Lock modifiers apply, the symbol from the second
            level is taken and capitalization rules are applied (but usually
            have no effect).

      ALPHABETIC
            The key uses the modifiers Shift and Lock. It may have two
            possible values. The second level is chosen by Shift. When the
            Lock modifier applies, the symbol from the first level is taken
            and further processed using system-specific capitalization rules.
            If both the Shift and Lock modifiers apply, the symbol from the
            first level is taken and no capitalization rules are applied.
            This is often called shift-cancels-caps behaviour.

      THREE_LEVEL
            Is the same as TWO_LEVEL but it considers an extra modifier:
            LevelThree, which can be used to gain the symbol value from the
            third level. If both the Shift and LevelThree modifiers apply,
            the value from the third level is taken. As in TWO_LEVEL, the
            Lock modifier doesn't influence the resulting level - only Shift
            and LevelThree are taken into consideration. If the Lock modifier
            is active, capitalization rules are applied to the resulting
            symbol.

      FOUR_LEVEL
            Is the same as THREE_LEVEL but, unlike THREE_LEVEL, if both the
            Shift and LevelThree modifiers apply, the symbol is taken from
            the fourth level.

      FOUR_LEVEL_ALPHABETIC
            Is similar to FOUR_LEVEL but also defines shift-cancels-caps
            behaviour as in ALPHABETIC. If both Lock and LevelThree apply,
            the symbol from the third level is taken and the capitalization
            rules are applied. If all three modifiers (Lock and Shift and
            LevelThree) apply, the symbol from the third level is taken and
            no capitalization rules are applied.

      KEYPAD
            As the name suggests, this scheme is primarily used for numeric
            keypads. The scheme considers two modifiers: Shift and NumLock.
            If none of the modifiers applies, the symbol from the first level
            is taken. If either the Shift or the NumLock modifier applies, the
            symbol from the second level is taken. If both the Shift and the
            NumLock modifiers apply, the symbol from the first level is taken.
            Again, a shift-cancels-caps variant.

      FOUR_LEVEL_KEYPAD
            Is similar to the KEYPAD scheme but considers also the LevelThree
            modifier. If the LevelThree modifier applies, the symbol from the
            third level is taken. If both Shift and LevelThree or NumLock and
            LevelThree apply, the symbol from the fourth level is taken. If
            all three (Shift+NumLock+LevelThree) apply, the symbol from the
            third level is taken. This also is a shift-cancels-caps variant.

      FOUR_LEVEL_MIXED_KEYPAD
            A four-level keypad scheme where the first two levels behave like
            the KEYPAD scheme (with Shift and NumLock). The LevelThree modifier
            acts as an override, providing access to two normally Shift-ed
            levels: when LevelThree is active, the NumLock state is ignored.
            Intended for the digit area of the keypad.

      FOUR_LEVEL_X
            A four-level scheme where the base level accepts no modifier,
            LevelThree provides two more Shift-ed levels (levels 2 and 3),
            and Ctrl plus Alt select the fourth level. Intended for the
            operator part of a keypad, though since NumLock plays no part,
            it is not keypad-specific.

Besides that, there are some schemes for special purposes:

      PC_CONTROL_LEVEL2
            Similar to the TWO_LEVEL scheme but it considers the Control
            modifier rather than Shift. That is, the symbol from the
            second level is chosen by Control rather than by Shift.

      PC_ALT_LEVEL2
            Similar to the TWO_LEVEL scheme but it considers the Alt
            modifier rather than Shift. That is, the symbol from
            the second level is chosen by Alt rather than by Shift.

      CTRL+ALT
            The key uses the modifiers Alt and Control. It may have two
            possible values. If just one modifier (Alt or Control) applies,
            the symbol from the first level is chosen. Only if both the Alt
            and Control modifiers apply is the symbol from the second level
            chosen.

      SHIFT+ALT
            The key uses the modifiers Shift and Alt. It may have two
            possible values. If just one modifier (Alt or Shift) applies,
            the symbol from the first level is chosen. Only if both the
            Alt and Shift modifiers apply is the symbol from the second
            level chosen.

If needed, special caps schemes may be used. They redefine the standard
behaviour of all *ALPHABETIC types. The layouts (maps of symbols) with keys
defined in the respective types then automatically change their behaviour
accordingly. Possible redefinitions are:

   o internal

   o internal_nocancel

   o shift

   o shift_nocancel

None of these schemes should be used directly. They are defined merely for
the 'caps:' xkb option (used to globally change the behaviour of layouts).

Don't alter any of the existing key types. If you need a different behaviour,
create a new type.
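
Creating a new type could look roughly like this. The section name "example"
and the type name "EXAMPLE_LEVEL3_ONLY" are hypothetical; the sketch defines a
two-level scheme in which the second level is reached by LevelThree and Shift
is not considered at all.

     partial xkb_types "example" {

         virtual_modifiers LevelThree;

         type "EXAMPLE_LEVEL3_ONLY" {
             modifiers = LevelThree;
             map[None] = Level1;
             map[LevelThree] = Level2;
             level_name[Level1] = "Base";
             level_name[Level2] = "Level3";
         };
     };

A symbols file would then refer to the new type by name, in the same way the
<RALT> key refers to "TWO_LEVEL" in the pc/gb example (the key and symbols
here are again only an illustration):

     key <AB01>  { type[Group1]="EXAMPLE_LEVEL3_ONLY",
                   [ z, guillemotleft ] };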

4.2.1  More on Definitions of Types

When the XKB software deals with a separate type description, it gets a
complete list of modifiers that should be taken into account from the
'modifiers=<list of modifiers>' list and expects a set of 'map[<combination of
modifiers>]=<level indication>' instructions that contain the mapping for
each combination of modifiers mentioned in that list. Modifiers that are not
explicitly listed are NOT taken into account when the resulting shift level
is computed. If some combination is omitted, the program (subroutine) should
choose the first level for this combination (a quite reasonable behavior).

Let's consider an example with two modifiers, ModOne and ModTwo:

     type "..." {
         modifiers = ModOne+ModTwo;
         map[None] = Level1;
         map[ModOne] = Level2;
     };

In this case the map has a statement for ModOne only and ModOne+ModTwo is
omitted. This means that if ModTwo is active, the subroutine can't find an
explicit mapping for this combination and will use the default level, i.e.
Level1.

But in the case that the type is described as:

     type "..." {
         modifiers = ModOne;
         map[None] = Level1;
         map[ModOne] = Level2;
     };

ModTwo will not be taken into account and the resulting level depends on
the ModOne state only. That is, ModTwo alone produces Level1 but the
combination ModOne+ModTwo (as well as ModOne alone) produces Level2.

What does it mean if the second modifier is not ModTwo but Lock? It means that
in the first case (Lock itself is included in the list of modifiers but
combinations with this modifier aren't mentioned in the map statements) the
internal capitalization rules will be applied to the symbol from the first
level. But in the second case the capitalization will be applied to the symbol
chosen according to the first modifier - and this can be the symbol from the
first as well as from the second level.

Usually, all modifiers introduced in the 'modifiers=<list of modifiers>' list
are used for shift level calculation and then discarded. Sometimes this is not
desirable. If you want to use a modifier for shift level calculation but you
don't want to discard it, you may list it in 'preserve[<combination of
modifiers>]=<list of modifiers>'. That is, for a given combination all listed
modifiers will be preserved. If the Lock modifier is preserved then the
resulting symbol is passed to the internal capitalization routine regardless
of whether it has been used for a shift level calculation or not.
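
A preserve statement might be used as in the following sketch. The type name
"EXAMPLE_ALPHABETIC" is hypothetical and the fragment would have to live inside
an xkb_types section; it is meant only to show the syntax described above, not
to reproduce any shipped type.

     type "EXAMPLE_ALPHABETIC" {
         modifiers = Shift+Lock;
         map[Shift] = Level2;
         preserve[Lock] = Lock;
         level_name[Level1] = "Base";
         level_name[Level2] = "Caps";
     };

Read with the rules above: Lock alone has no explicit map entry, so the first
level is chosen, and because Lock is preserved for that combination the symbol
is still passed to the capitalization routine. Shift+Lock has neither a map
entry nor a preserve entry, so it yields the first level with Lock discarded.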

Any key type description can use both real and virtual modifiers. Since real
modifiers always have standard names it is not necessary to explicitly
declare them. Virtual modifiers can have arbitrary names and must be declared
(prior to using them) directly in the key type definition:

     virtual_modifiers <comma-separated list of modifiers> ;

as seen, for example, in the basic, pc, or mousekeys key type definitions.

4.3  Rules

Once you are finished with your symbol map you need to add it to the rules file.
The rules file describes how all the five basic components (keycodes, types,
compat, symbols, and geometry) should be composed to give a sensible resulting
xkb configuration.

The main advantage of rules over formerly used keymaps is the possibility to
simply parameterize (once) fixed patterns of configurations and thus to
elegantly allow substitutions of various local configurations into predefined
templates.

A pattern in a rules file (often located in /usr/share/X11/xkb/rules) can be
parameterized with four arguments: Model, Layout, Variant, and Options.
For most cases the parameters Model and Layout should be sufficient for choosing
a functional keyboard mapping.

The rules file itself is composed of pattern lines and lines with rules. Each
pattern line starts with an exclamation mark ('!') and describes how XKB will
interpret the subsequent lines (rules). A sample rules file looks like this:

     ! model                   =   keycodes
       macintosh_old           =   macintosh
       ...
       *                       =   xfree86

     ! model                   =   symbols
       hp                      =   +inet(%m)
       microsoftpro            =   +inet(%m)
       geniuscomfy             =   +inet(%m)

     ! model       layout[1]   =   symbols
       macintosh   us          =   macintosh/us%(v[1])
       *           *           =   pc/pc(%m)+pc/%l[1]%(v[1])

     ! model       layout[2]   =   symbols
       macintosh   us          =   +macintosh/us[2]%(v[2]):2
       *           *           =   +pc/%l[2]%(v[2]):2

     ! option                  =   types
       caps:internal           =   +caps(internal)
       caps:internal_nocancel  =   +caps(internal_nocancel)

Each rule defines what a certain combination of values on the left side of the
equals sign ('=') results in. For example, a (keyboard) model macintosh_old
instructs xkb to take definitions of keycodes from file keycodes/macintosh
while the rest of the models (represented by a wildcard '*') instructs it to
take them from file keycodes/xfree86. The wildcard represents all possible
values on the left side which were not found in any of the previous rules.
The more specialized (more complete) rules have higher precedence than
general ones, i.e. the more general rules supply reasonable default values.

As you can see, some lines contain substitution parameters - the parameters
preceded by the percent sign ('%'). The first alphabetical character after
the percent sign expands to the value which has been found on the left side.
For example +%l%(v) expands into +cz(bksl) if the respective values on the
left side were the cz layout in its bksl variant. Moreover, if the layout or
variant parameter is followed by a pair of brackets ('[', ']') it means that
xkb should place the layout or variant into the specified xkb group. If the
brackets are omitted, the first group is the default value.

So the second block of rules enhances symbol definitions for some particular
keyboard models with extra keys (for internet, multimedia, ...). Other
models are left intact. Similarly, the last block overrides some key type
definitions, so the common global behaviour "shift cancels caps" or "shift
doesn't cancel caps" can be selected. The rest of the rules produce special
symbols for each US variant of the macintosh keyboard, and standard pc symbols
in appropriate variants as a default.
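
To see how the sample rules combine, consider a worked expansion. The request
below (model pc104, layouts us and cz, with the bksl variant for the second
layout, no options) is an assumed input chosen for illustration, and the
result is what the substitution rules described above would be expected to
yield for the sample file, not captured output from a real server.

     // assumed request
     //   model   = pc104
     //   layout  = us,cz
     //   variant = ,bksl

     // "! model = keycodes":          only '*' matches
     keycodes  =  xfree86

     // "! model = symbols":           pc104 is not listed, nothing is added
     // "! model layout[1] = symbols": '* *' matches, v[1] is empty
     // "! model layout[2] = symbols": '* *' matches
     symbols   =  pc/pc(pc104)+pc/us+pc/cz(bksl):2

The trailing ':2' places the cz symbols into the second group, while the us
symbols stay in the default first group.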

4.4  Descriptive Files of Rules

Now you just need to add a detailed description to the <rules>.xml description
file so that other users (and external programs which often parse this file)
know what your work is about.

4.4.1  Old Descriptive Files

The formerly used descriptive files were named <rules>.lst. Their structure is
very simple and quite self-descriptive, but this simplicity also had some
drawbacks: for example, there was no way to describe local variants of layouts
and there were problems with the localization of descriptions. To preserve
compatibility with some older programs, new XML descriptive files can be
converted to the old '.lst' format.

The meaning of each possible parameter of the rules file should be described.
For the sample rules file given above, the .lst file could look like this:

     ! model
       pc104        Generic 104-key PC
       microsoft    Microsoft Natural
       pc98         PC-98xx Series
       macintosh    Original Macintosh
       ...

     ! layout
       us      U.S. English
       cz      Czech
       de      German
       ...

     ! option
       caps:internal           uses internal capitalization, Shift cancels Caps
       caps:internal_nocancel  uses internal capitalization, Shift doesn't cancel Caps

And that should be it. Enjoy creating your own xkb mapping.