1<?xml version="1.0" encoding="ISO-8859-1"?>
2<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
3 "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
4]>
5
6<article>
7
8  <articleinfo>
9    <!-- Title information -->
10    <title>Distributed Multihead X design</title>
11    <authorgroup>
12      <author><firstname>Kevin E.</firstname><surname>Martin</surname></author>
13      <author><firstname>David H.</firstname><surname>Dawes</surname></author>
14      <author><firstname>Rickard E.</firstname><surname>Faith</surname></author>
15    </authorgroup>
16    <pubdate>29 June 2004 (created 25 July 2001)</pubdate>
17    <abstract><para>
18        This document covers the motivation, background, design, and
19        implementation of the distributed multihead X (DMX) system.  It
20        is a living document and describes the current design and
21        implementation details of the DMX system.  As the project
22        progresses, this document will be continually updated to reflect
23        the changes in the code and/or design.  <emphasis remap="it">Copyright 2001 by VA
24        Linux Systems, Inc., Fremont, California.  Copyright 2001-2004
25        by Red Hat, Inc., Raleigh, North Carolina</emphasis>
26      </para></abstract>
27  </articleinfo>
28
29<!-- Begin the document -->
30<sect1>
31<title>Introduction</title>
32
33<sect2>
34<title>The Distributed Multihead X Server</title>
35
36<para>Current Open Source multihead solutions are limited to a single
37physical machine.  A single X server controls multiple display devices,
38which can be arranged as independent heads or unified into a single
39desktop (with Xinerama).  These solutions are limited to the number of
40physical devices that can co-exist in a single machine (e.g., due to the
41number of AGP/PCI slots available for graphics cards).  Thus, large
42tiled displays are not currently possible.  The work described in this
43paper will eliminate the requirement that the display devices reside in
44the same physical machine.  This will be accomplished by developing a
45front-end proxy X server that will control multiple back-end X servers
46that make up the large display.
47</para>
48
49<para>The overall structure of the distributed multihead X (DMX) project is
50as follows: A single front-end X server will act as a proxy to a set of
51back-end X servers, which handle all of the visible rendering.  X
52clients will connect to the front-end server just as they normally would
53to a regular X server.  The front-end server will present an abstracted
54view to the client of a single large display.  This will ensure that all
55standard X clients will continue to operate without modification
56(limited, as always, by the visuals and extensions provided by the X
57server).  Clients that are DMX-aware will be able to use an extension to
58obtain information about the back-end servers (e.g., for placement of
59pop-up windows, window alignments by the window manager, etc.).
60</para>
61
62<para>The architecture of the DMX server is divided into two main sections:
63input (e.g., mouse and keyboard events) and output (e.g., rendering and
windowing requests).  Each of these is described briefly below, and the
65rest of this design document will describe them in greater detail.
66</para>
67
68<para>The DMX server can receive input from three general types of input
69devices: "local" devices that are physically attached to the machine on
70which DMX is running, "backend" devices that are physically attached to
71one or more of the back-end X servers (and that generate events via the
72X protocol stream from the backend), and "console" devices that can be
73abstracted from any non-back-end X server.  Backend and console devices
74are treated differently because the pointer device on the back-end X
75server also controls the location of the hardware X cursor.  Full
76support for XInput extension devices is provided.
77</para>
78
79<para>Rendering requests will be accepted by the front-end server; however,
80rendering to visible windows will be broken down as needed and sent to
81the appropriate back-end server(s) via X11 library calls for actual
82rendering.  The basic framework will follow a Xnest-style approach.  GC
83state will be managed in the front-end server and sent to the
84appropriate back-end server(s) as required.  Pixmap rendering will (at
85least initially) be handled by the front-end X server.  Windowing
requests (e.g., ordering, mapping, moving, etc.) will be handled in the
87front-end server.  If the request requires a visible change, the
88windowing operation will be translated into requests for the appropriate
89back-end server(s).  Window state will be mirrored in the back-end
90server(s) as needed.
91</para>
92</sect2>
93
94<sect2>
95<title>Layout of Paper</title>
96
97<para>The next section describes the general development plan that was
98actually used for implementation.  The final section discusses
99outstanding issues at the conclusion of development.  The first appendix
100provides low-level technical detail that may be of interest to those
101intimately familiar with the X server architecture.  The final appendix
102describes the four phases of development that were performed during the
103first two years of development.
104</para>
105
106<para>The final year of work was divided into 9 tasks that are not
107described in specific sections of this document.  The major tasks during
108that time were the enhancement of the reconfiguration ability added in
109Phase IV, addition of support for a dynamic number of back-end displays
110(instead of a hard-coded limit), and the support for back-end display
111and input removal and addition.  This work is mentioned in this paper,
112but is not covered in detail.
113</para>
114</sect2>
115</sect1>
116
117<!-- ============================================================ -->
118<sect1>
119<title>Development plan</title>
120
121<para>This section describes the development plan from approximately June
1222001 through July 2003.
123</para>
124
125<sect2>
126<title>Bootstrap code</title>
127
128<para>To allow for rapid development of the DMX server by multiple
129developers during the first development stage, the problem will be
130broken down into three tasks: the overall DMX framework, back-end
rendering services and input device handling services.  However, before
work on these tasks began, a simple framework that each developer
could use was implemented to bootstrap the development effort.  This
134framework renders to a single back-end server and provides dummy input
135devices (i.e., the keyboard and mouse).  The simple back-end rendering
136service was implemented using the shadow framebuffer support currently
137available in the XFree86 environment.
138</para>
139
140<para>Using this bootstrapping framework, each developer has been able to
141work on each of the tasks listed above independently as follows: the
142framework will be extended to handle arbitrary back-end server
143configurations; the back-end rendering services will be transitioned to
144the more efficient Xnest-style implementation; and, an input device
145framework to handle various input devices via the input extension will
146be developed.
147</para>
148
<para>Status: The bootstrap code is complete.   <!-- August 2001 -->
150</para>
151
152</sect2>
153
154<sect2>
155<title>Input device handling</title>
156
157<para>An X server (including the front-end X server) requires two core
158input devices -- a keyboard and a pointer (mouse).  These core devices
159are handled and required by the core X11 protocol.  Additional types of
160input devices may be attached and utilized via the XInput extension.
These are usually referred to as "XInput extension devices".
162</para>
163
164<para>There are some options as to how the front-end X server gets its core
165input devices:
166
167<orderedlist>
168<listitem>
169    <para>Local Input. The physical input devices (e.g., keyboard and
170    mouse) can be attached directly to the front-end X server.  In this
171    case, the keyboard and mouse on the machine running the front-end X
172    server will be used.  The front-end will have drivers to read the
173    raw input from those devices and convert it into the required X
174    input events (e.g., key press/release, pointer button press/release,
175    pointer motion).  The front-end keyboard driver will keep track of
176    keyboard properties such as key and modifier mappings, autorepeat
    state, keyboard sound and LED state.  Similarly, the front-end
    pointer driver will keep track of pointer properties such as the
179    button mapping and movement acceleration parameters.  With this
180    option, input is handled fully in the front-end X server, and the
181    back-end X servers are used in a display-only mode.  This option was
182    implemented and works for a limited number of Linux-specific
183    devices.  Adding additional local input devices for other
184    architectures is expected to be relatively simple.
185</para>
186
187    <para>The following options are available for implementing local input
188    devices:
189
190<orderedlist>
191<listitem>
192        <para>The XFree86 X server has modular input drivers that could
193        be adapted for this purpose.  The mouse driver supports a wide
194        range of mouse types and interfaces, as well as a range of
195        Operating System platforms.  The keyboard driver in XFree86 is
196        not currently as modular as the mouse driver, but could be made
197        so.  The XFree86 X server also has a range of other input
198        drivers for extended input devices such as tablets and touch
        screens.  Unfortunately, the XFree86 drivers are generally
        complex, often simultaneously providing support for multiple
        devices across multiple architectures, and they rely so heavily
        on XFree86-specific helper functions that this option was not
        pursued.
204</para>
205</listitem>
206
207<listitem>
208        <para>The <command>kdrive</command> X server in XFree86 has built-in drivers that
        support PS/2 mice and keyboards under Linux.  The mouse driver
        can indirectly handle other mouse types if the Linux utility
        <command>gpm</command> is used to translate the native mouse protocol into
212        PS/2 mouse format.  These drivers could be adapted and built in
213        to the front-end X server if this range of hardware and OS
214        support is sufficient.  While much simpler than the XFree86
215        drivers, the <command>kdrive</command> drivers were not used for the DMX
216        implementation.
217</para>
218</listitem>
219
220<listitem>
221        <para>Reimplementation of keyboard and mouse drivers from
222        scratch for the DMX framework.  Because keyboard and mouse
223        drivers are relatively trivial to implement, this pathway was
224        selected.  Other drivers in the X source tree were referenced,
225        and significant contributions from other drivers are noted in
226        the DMX source code.
227</para>
228</listitem>
229</orderedlist>
230</para>
231</listitem>
232
233<listitem>
234    <para>Backend Input.  The front-end can make use of the core input
235    devices attached to one or more of the back-end X servers.  Core
236    input events from multiple back-ends are merged into a single input
237    event stream.  This can work sanely when only a single set of input
238    devices is used at any given time.  The keyboard and pointer state
239    will be handled in the front-end, with changes propagated to the
240    back-end servers as needed.  This option was implemented and works
241    well.  Because the core pointer on a back-end controls the hardware
242    mouse on that back-end, core pointers cannot be treated as XInput
    extension devices.  However, all back-end XInput extension devices
244    can be mapped to either DMX core or DMX XInput extension devices.
245</para>
246</listitem>
247
248<listitem>
249    <para>Console Input.  The front-end server could create a console
250    window that is displayed on an X server independent of the back-end
251    X servers.  This console window could display things like the
252    physical screen layout, and the front-end could get its core input
253    events from events delivered to the console window.  This option was
254    implemented and works well.  To help the human navigate, window
255    outlines are also displayed in the console window.  Further, console
256    windows can be used as either core or XInput extension devices.
257</para>
258</listitem>
259
260<listitem>
261    <para>Other options were initially explored, but they were all
262    partial subsets of the options listed above and, hence, are
263    irrelevant.
264</para>
265</listitem>
266
267</orderedlist>
268</para>
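
<para>The merging of core input events from several back-ends, as
described for the backend input option above, can be pictured with the
following minimal sketch.  It assumes, purely for illustration, that each
back-end delivers events in timestamp order and that the front-end
forwards the oldest pending event first.  The types and the function
merge_next_event() are hypothetical and are not taken from the Xdmx
source.
</para>

<programlisting><![CDATA[
```c
#include <stddef.h>

/* Hypothetical event record; not the Xdmx data structure. */
typedef struct {
    unsigned long time;   /* back-end timestamp in milliseconds */
    int backend;          /* index of the originating back-end */
} InputEvent;

/* Hypothetical per-back-end queue of events, oldest first. */
typedef struct {
    const InputEvent *events;
    size_t count;
    size_t next;          /* index of the next undelivered event */
} EventQueue;

/* Pop the oldest pending event across all queues into *out.
 * Returns 1 on success, 0 when every queue is empty. */
int merge_next_event(EventQueue *queues, size_t nqueues, InputEvent *out)
{
    EventQueue *best = NULL;

    for (size_t i = 0; i < nqueues; i++) {
        EventQueue *q = &queues[i];
        if (q->next >= q->count)
            continue;                       /* nothing pending here */
        if (best == NULL ||
            q->events[q->next].time < best->events[best->next].time)
            best = q;
    }
    if (best == NULL)
        return 0;
    *out = best->events[best->next];
    best->next++;
    return 1;
}
```
]]></programlisting>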
269
270<para>Although extended input devices are not specifically mentioned in the
271Distributed X requirements, the options above were all implemented so
272that XInput extension devices were supported.
273</para>
274
275<para>The bootstrap code (Xdmx) had dummy input devices, and these are
276still supported in the final version.  These do the necessary
277initialization to satisfy the X server's requirements for core pointer
278and keyboard devices, but no input events are ever generated.
279</para>
280
281<para>Status: The input code is complete.  Because of the complexity of the
282XFree86 input device drivers (and their heavy reliance on XFree86
283infrastructure), separate low-level device drivers were implemented for
284Xdmx.  The following kinds of drivers are supported (in general, the
285devices can be treated arbitrarily as "core" input devices or as XInput
286"extension" devices; and multiple instances of different kinds of
287devices can be simultaneously available):
288<orderedlist>
289<listitem>
        <para> A "dummy" device driver that never generates events.
</para>
291</para>
292</listitem>
293
294<listitem>
295        <para> "Local" input is from the low-level hardware on which the
296        Xdmx binary is running.  This is the only area where using the
297        XFree86 driver infrastructure would have been helpful, and then
298        only partially, since good support for generic USB devices does
299        not yet exist in XFree86 (in any case, XFree86 and kdrive driver
300        code was used where possible).  Currently, the following local
301        devices are supported under Linux (porting to other operating
302        systems should be fairly straightforward):
303        <itemizedlist>
304            <listitem><para>Linux keyboard</para></listitem>
305            <listitem><para>Linux serial mouse (MS)</para></listitem>
306            <listitem><para>Linux PS/2 mouse</para></listitem>
307            <listitem><para>USB keyboard</para></listitem>
308            <listitem><para>USB mouse</para></listitem>
309            <listitem><para>USB generic device (e.g., joystick, gamepad, etc.)</para></listitem>
310        </itemizedlist>
311</para>
312</listitem>
313
314<listitem>
315        <para> "Backend" input is taken from one or more of the back-end
316        displays.  In this case, events are taken from the back-end X
317        server and are converted to Xdmx events.  Care must be taken so
318        that the sprite moves properly on the display from which input
319        is being taken.
320</para>
321</listitem>
322
323<listitem>
324        <para> "Console" input is taken from an X window that Xdmx
325        creates on the operator's display (i.e., on the machine running
326        the Xdmx binary).  When the operator's mouse is inside the
327        console window, then those events are converted to Xdmx events.
328        Several special features are available: the console can display
329        outlines of windows that are on the Xdmx display (to facilitate
330        navigation), the cursor can be confined to the console, and a
331        "fine" mode can be activated to allow very precise cursor
332        positioning.
333</para>
334</listitem>
335</orderedlist>
336
337</para>
338
339</sect2>
340
341<!-- May 2002; July 2003 -->
342
343<sect2>
344<title>Output device handling</title>
345
346<para>The output of the DMX system displays rendering and windowing
347requests across multiple screens.  The screens are typically arranged in
348a grid such that together they represent a single large display.
349</para>
350
351<para>The output section of the DMX code consists of two parts.  The first
352is in the front-end proxy X server (Xdmx), which accepts client
353connections, manages the windows, and potentially renders primitives but
354does not actually display any of the drawing primitives.  The second
355part is the back-end X server(s), which accept commands from the
356front-end server and display the results on their screens.
357</para>
358
359<sect3>
360<title>Initialization</title>
361
362<para>The DMX front-end must first initialize its screens by connecting to
363each of the back-end X servers and collecting information about each of
364these screens.  However, the information collected from the back-end X
365servers might be inconsistent.  Handling these cases can be difficult
366and/or inefficient.  For example, a two screen system has one back-end X
367server running at 16bpp while the second is running at 32bpp.
368Converting rendering requests (e.g., XPutImage() or XGetImage()
369requests) to the appropriate bit depth can be very time consuming.
370Analyzing these cases to determine how or even if it is possible to
371handle them is required.  The current Xinerama code handles many of
372these cases (e.g., in PanoramiXConsolidate()) and will be used as a
373starting point.  In general, the best solution is to use homogeneous X
374servers and display devices.  Using back-end servers with the same depth
375is a requirement of the final DMX implementation.
376</para>
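
<para>The same-depth requirement stated above can be checked during
initialization with a test along the following lines.  This is only a
sketch: the BackendScreen structure and depths_are_consistent() are
hypothetical names, and the real consolidation step (compare
PanoramiXConsolidate()) examines considerably more than the depth.
</para>

<programlisting><![CDATA[
```c
#include <stddef.h>

/* Hypothetical summary of what the front-end learns about one
 * back-end screen while connecting to it. */
typedef struct {
    int width;
    int height;
    int depth;   /* depth reported by the back-end, e.g. 16 or 24 */
} BackendScreen;

/* Return 1 when all back-end screens share one depth, 0 otherwise.
 * An empty or single-entry list is treated as consistent. */
int depths_are_consistent(const BackendScreen *screens, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (screens[i].depth != screens[0].depth)
            return 0;
    }
    return 1;
}
```
]]></programlisting>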
377
378<para>Once this screen consolidation is finished, the relative position of
379each back-end X server's screen in the unified screen is initialized.  A
380full-screen window is opened on each of the back-end X servers, and the
381cursor on each screen is turned off.  The final DMX implementation can
382also make use of a partial-screen window, or multiple windows per
383back-end screen.
384</para>
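
<para>For the common case of identical screens arranged in a grid, the
position of each back-end screen in the unified screen could be computed
as sketched below.  The row-major ordering, the equal screen sizes and
the names Origin and layout_grid() are assumptions made for this
example; DMX itself also supports other arrangements.
</para>

<programlisting><![CDATA[
```c
#include <stddef.h>

/* Top-left corner of one back-end screen in unified-screen pixels. */
typedef struct { int x, y; } Origin;

/* Compute the origin of each screen in a row-major grid that is
 * `cols` screens wide (cols must be non-zero), assuming every screen
 * is width x height pixels. */
void layout_grid(Origin *origins, size_t nscreens, size_t cols,
                 int width, int height)
{
    for (size_t i = 0; i < nscreens; i++) {
        origins[i].x = (int)(i % cols) * width;
        origins[i].y = (int)(i / cols) * height;
    }
}
```
]]></programlisting>

<para>With four 1280x1024 screens and two columns, this places the
screens at (0,0), (1280,0), (0,1024) and (1280,1024).
</para>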
385</sect3>
386
387<sect3>
388<title>Handling rendering requests</title>
389
390<para>After initialization, X applications connect to the front-end server.
391There are two possible implementations of how rendering and windowing
392requests are handled in the DMX system:
393
394<orderedlist>
395<listitem>
396    <para>A shadow framebuffer is used in the front-end server as the
397    render target.  In this option, all protocol requests are completely
398    handled in the front-end server.  All state and resources are
399    maintained in the front-end including a shadow copy of the entire
400    framebuffer.  The framebuffers attached to the back-end servers are
401    updated by XPutImage() calls with data taken directly from the
402    shadow framebuffer.
403</para>
404
405    <para>This solution suffers from two main problems.  First, it does not
406    take advantage of any accelerated hardware available in the system.
407    Second, the size of the XPutImage() calls can be quite large and
408    thus will be limited by the bandwidth available.
409</para>
410
411    <para>The initial DMX implementation used a shadow framebuffer by
412    default.
413</para>
414</listitem>
415
416<listitem>
417    <para>Rendering requests are sent to each back-end server for
418    handling (as is done in the Xnest server described above).  In this
419    option, certain protocol requests are handled in the front-end
420    server and certain requests are repackaged and then sent to the
421    back-end servers.  The framebuffer is distributed across the
422    multiple back-end servers.  Rendering to the framebuffer is handled
423    on each back-end and can take advantage of any acceleration
424    available on the back-end servers' graphics display device.  State
425    is maintained both in the front and back-end servers.
426</para>
427
428    <para>This solution suffers from two main drawbacks.  First, protocol
429    requests are sent to all back-end servers -- even those that will
430    completely clip the rendering primitive -- which wastes bandwidth
431    and processing time.  Second, state is maintained both in the front-
432    and back-end servers.  These drawbacks are not as severe as in
433    option 1 (above) and can either be overcome through optimizations or
434    are acceptable.  Therefore, this option will be used in the final
435    implementation.
436</para>
437
438    <para>The final DMX implementation defaults to this mechanism, but also
439    supports the shadow framebuffer mechanism.  Several optimizations
440    were implemented to eliminate the drawbacks of the default
    mechanism.  These optimizations are described in the section below and
442    in Phase II of the Development Results (see appendix).
443</para>
444</listitem>
445
446</orderedlist>
447</para>
448
<para>Status: Both the shadow framebuffer and Xnest-style code are complete.
450<!-- May 2002 -->
451</para>
452
453</sect3>
454</sect2>
455
456<sect2>
457<title>Optimizing DMX</title>
458
459<para>Initially, the Xnest-style solution's performance will be measured
460and analyzed to determine where the performance bottlenecks exist.
461There are four main areas that will be addressed.
462</para>
463
464<para>First, to obtain reasonable interactivity with the first development
465phase, XSync() was called after each protocol request.  The XSync()
466function flushes any pending protocol requests.  It then waits for the
467back-end to process the request and send a reply that the request has
468completed.  This happens with each back-end server and performance
469greatly suffers.  As a result of the way XSync() is called in the first
470development phase, the batching that the X11 library performs is
471effectively defeated.  The XSync() call usage will be analyzed and
472optimized by batching calls and performing them at regular intervals,
473except where interactivity will suffer (e.g., on cursor movements).
474</para>
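
<para>One way to picture the batching described above is the following
bookkeeping sketch.  It deliberately does not call Xlib; it only decides
when a call to XSync() would be issued.  The count-based policy (the text
above speaks of regular intervals) and the names SyncState and
note_request() are assumptions made for illustration.
</para>

<programlisting><![CDATA[
```c
/* Hypothetical bookkeeping for one back-end connection. */
typedef struct {
    int pending;     /* requests sent since the last sync */
    int batch_size;  /* sync after this many requests */
    int syncs;       /* number of syncs performed so far */
} SyncState;

/* Record one request.  Returns 1 when a sync should be issued now
 * (this is where a real server would call XSync()), else 0.
 * Interactive requests, such as cursor motion, always sync so that
 * the user sees the result immediately. */
int note_request(SyncState *s, int interactive)
{
    s->pending++;
    if (interactive || s->pending >= s->batch_size) {
        s->pending = 0;
        s->syncs++;
        return 1;
    }
    return 0;
}
```
]]></programlisting>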
475
476<para>Second, the initial Xnest-style solution described above sends the
477repackaged protocol requests to all back-end servers regardless of
whether or not they would be completely clipped out.  The requests that
are trivially rejected on the back-end server waste the limited
480bandwidth available.  By tracking clipping changes in the DMX X server's
481windowing code (e.g., by opening, closing, moving or resizing windows),
482we can determine whether or not back-end windows are visible so that
483trivial tests in the front-end server's GC ops drawing functions can
484eliminate these unnecessary protocol requests.
485</para>
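
<para>The kind of trivial test meant here can be sketched as a
rectangle overlap check.  For simplicity the example tests the drawing
extents against whole back-end screen rectangles rather than against
tracked window visibility, and the names Rect, rects_overlap() and
select_backends() are hypothetical.
</para>

<programlisting><![CDATA[
```c
/* Rectangle in unified-screen coordinates; x2 and y2 are exclusive. */
typedef struct { int x1, y1, x2, y2; } Rect;

/* Return 1 when the two rectangles share at least one pixel. */
int rects_overlap(const Rect *a, const Rect *b)
{
    return a->x1 < b->x2 && b->x1 < a->x2 &&
           a->y1 < b->y2 && b->y1 < a->y2;
}

/* Set send[i] to 1 for each back-end screen that the drawing extents
 * touch and to 0 otherwise.  Returns the number of back-ends that
 * still need to receive the request. */
int select_backends(const Rect *extents, const Rect *screens,
                    int nscreens, int *send)
{
    int count = 0;

    for (int i = 0; i < nscreens; i++) {
        send[i] = rects_overlap(extents, &screens[i]);
        count += send[i];
    }
    return count;
}
```
]]></programlisting>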
486
487<para>Third, each protocol request will be analyzed to determine if it is
488possible to break the request into smaller pieces at display boundaries.
489The initial ones to be analyzed are put and get image requests since
490they will require the greatest bandwidth to transmit data between the
491front and back-end servers.  Other protocol requests will be analyzed
492and those that will benefit from breaking them into smaller requests
493will be implemented.
494</para>
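
<para>Breaking an image request at display boundaries amounts to
clipping its rectangle against each back-end screen and translating the
result into that screen's local coordinates.  The following is a minimal
sketch under that assumption; Rect, rect_intersect() and
piece_for_screen() are hypothetical names, and the copying of the pixel
data itself is omitted.
</para>

<programlisting><![CDATA[
```c
/* Rectangle with exclusive x2 and y2. */
typedef struct { int x1, y1, x2, y2; } Rect;

/* Intersect a and b into *out.  Returns 1 if the result is non-empty. */
int rect_intersect(const Rect *a, const Rect *b, Rect *out)
{
    out->x1 = a->x1 > b->x1 ? a->x1 : b->x1;
    out->y1 = a->y1 > b->y1 ? a->y1 : b->y1;
    out->x2 = a->x2 < b->x2 ? a->x2 : b->x2;
    out->y2 = a->y2 < b->y2 ? a->y2 : b->y2;
    return out->x1 < out->x2 && out->y1 < out->y2;
}

/* Clip an image rectangle (unified-screen coordinates) to one back-end
 * screen and translate the piece into that screen's local coordinates.
 * Returns 0 when no part of the image falls on the screen. */
int piece_for_screen(const Rect *image, const Rect *screen, Rect *local)
{
    Rect clipped;

    if (!rect_intersect(image, screen, &clipped))
        return 0;
    local->x1 = clipped.x1 - screen->x1;
    local->y1 = clipped.y1 - screen->y1;
    local->x2 = clipped.x2 - screen->x1;
    local->y2 = clipped.y2 - screen->y1;
    return 1;
}
```
]]></programlisting>

<para>Each back-end then receives only the pixels that land on its own
screen, so an image that straddles two screens is sent as two smaller
requests instead of two full-size ones.
</para>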
495
496<para>Fourth, an extension is being considered that will allow font glyphs to
497be transferred from the front-end DMX X server to each back-end server.
498This extension will permit the front-end to handle all font requests and
499eliminate the requirement that all back-end X servers share the exact
500same fonts as the front-end server.  We are investigating the
501feasibility of this extension during this development phase.
502</para>
503
504<para>Other potential optimizations will be determined from the performance
505analysis.
506</para>
507
508<para>Please note that in our initial design, we proposed optimizing BLT
509operations (e.g., XCopyArea() and window moves) by developing an
510extension that would allow individual back-end servers to directly copy
511pixel data to other back-end servers.  This potential optimization was
512in response to the simple image movement implementation that required
513potentially many calls to GetImage() and PutImage().  However, the
514current Xinerama implementation handles these BLT operations
515differently.  Instead of copying data to and from screens, they generate
516expose events -- just as happens in the case when a window is moved from
517off a screen to on screen.  This approach saves the limited bandwidth
518available between front and back-end servers and is being standardized
519with Xinerama.  It also eliminates the potential setup problems and
520security issues resulting from having each back-end server open
521connections to all other back-end servers.  Therefore, we suggest
522accepting Xinerama's expose event solution.
523</para>
524
525<para>Also note that the approach proposed in the second and third
526optimizations might cause backing store algorithms in the back-end to be
527defeated, so a DMX X server configuration flag will be added to disable
528these optimizations.
529</para>
530
531<para>Status: The optimizations proposed above are complete.  It was
determined that using the xfs font server was sufficient and
533creating a new mechanism to pass glyphs was redundant; therefore, the
534fourth optimization proposed above was not included in DMX.
535<!-- September 2002 -->
536</para>
537
538</sect2>
539
540<sect2>
541<title>DMX X extension support</title>
542
543<para>The DMX X server keeps track of all the windowing information on the
544back-end X servers, but does not currently export this information to
545any client applications.  An extension will be developed to pass the
546screen information and back-end window IDs to DMX-aware clients.  These
547clients can then use this information to directly connect to and render
548to the back-end windows.  Bypassing the DMX X server allows DMX-aware
549clients to break up complex rendering requests on their own and send
550them directly to the windows on the back-end server's screens.  An
551example of a client that can make effective use of this extension is
552Chromium.
553</para>
554
555<para>Status: The extension, as implemented, is fully documented in
556"Client-to-Server DMX Extension to the X Protocol".  Future changes
557might be required based on feedback and other proposed enhancements to
558DMX.  Currently, the following facilities are supported:
559<orderedlist>
560<listitem><para>
561        Screen information (clipping rectangle for each screen relative
562        to the virtual screen)
563</para></listitem>
564<listitem><para>
565        Window information (window IDs and clipping information for each
566        back-end window that corresponds to each DMX window)
567</para></listitem>
568<listitem><para>
569        Input device information (mappings from DMX device IDs to
570        back-end device IDs)
571</para></listitem>
572<listitem><para>
573        Force window creation (so that a client can override the
574        server-side lazy window creation optimization)
575</para></listitem>
576<listitem><para>
577        Reconfiguration (so that a client can request that a screen
578        position be changed)
579</para></listitem>
580<listitem><para>
581        Addition and removal of back-end servers and back-end and
582        console inputs.
583</para></listitem>
584</orderedlist>
585</para>
586<!-- September 2002; July 2003 -->
587
588</sect2>
589
590<sect2>
591<title>Common X extension support</title>
592
593<para>The XInput, XKeyboard and Shape extensions are commonly used
594extensions to the base X11 protocol.  XInput allows multiple and
595non-standard input devices to be accessed simultaneously.  These input
596devices can be connected to either the front-end or back-end servers.
XKeyboard allows much finer control over keyboard mappings.  Shape adds
598support for arbitrarily shaped windows and is used by various window
599managers.  Nearly all potential back-end X servers make these extensions
600available, and support for each one will be added to the DMX system.
601</para>
602
603<para>In addition to the extensions listed above, support for the X
604Rendering extension (Render) is being developed.  Render adds digital
605image composition to the rendering model used by the X Window System.
606While this extension is still under development by Keith Packard of HP,
607support for the current version will be added to the DMX system.
608</para>
609
610<para>Support for the XTest extension was added during the first
611development phase.
612</para>
613
614<!-- WARNING: this list is duplicated in the Phase IV discussion -->
615<para>Status: The following extensions are supported and are discussed in
616more detail in Phase IV of the Development Results (see appendix):
617    BIG-REQUESTS,
618    DEC-XTRAP,
619    DMX,
620    DPMS,
621    Extended-Visual-Information,
622    GLX,
623    LBX,
624    RECORD,
625    RENDER,
626    SECURITY,
627    SHAPE,
628    SYNC,
629    X-Resource,
630    XC-APPGROUP,
631    XC-MISC,
632    XFree86-Bigfont,
633    XINERAMA,
634    XInputExtension,
635    XKEYBOARD, and
636    XTEST.
637<!-- November 2002; updated February 2003, July 2003 -->
638</para>
639</sect2>
640
641<sect2>
642<title>OpenGL support</title>
643
644<para>OpenGL support using the Mesa code base exists in XFree86 release 4
645and later.  Currently, the direct rendering infrastructure (DRI)
646provides accelerated OpenGL support for local clients and unaccelerated
647OpenGL support (i.e., software rendering) is provided for non-local
648clients.
649</para>
650
651<para>The single head OpenGL support in XFree86 4.x will be extended to use
652the DMX system.  When the front and back-end servers are on the same
653physical hardware, it is possible to use the DRI to directly render to
654the back-end servers.  First, the existing DRI will be extended to
655support multiple display heads, and then to support the DMX system.
OpenGL rendering requests will be rendered directly to each back-end X
657server.  The DRI will request the screen layout (either from the
658existing Xinerama extension or a DMX-specific extension).  Support for
659synchronized swap buffers will also be added (on hardware that supports
660it).  Note that a single front-end server with a single back-end server
661on the same physical machine can emulate accelerated indirect rendering.
662</para>
663
664<para>When the front and back-end servers are on different physical
665hardware or are using non-XFree86 4.x X servers, a mechanism to render
666primitives across the back-end servers will be provided.  There are
667several options as to how this can be implemented.
668</para>
669
670<orderedlist>
671<listitem>
672    <para>The existing OpenGL support in each back-end server can be
673    used by repackaging rendering primitives and sending them to each
674    back-end server.  This option is similar to the unoptimized
675    Xnest-style approach mentioned above.  Optimization of this solution
676    is beyond the scope of this project and is better suited to other
677    distributed rendering systems.
678</para></listitem>
679
680<listitem>
681    <para>Rendering to a pixmap in the front-end server using the
682    current XFree86 4.x code, and then displaying to the back-ends via
683    calls to XPutImage() is another option.  This option is similar to
684    the shadow frame buffer approach mentioned above.  It is slower and
685    bandwidth intensive, but has the advantage that the back-end servers
686    are not required to have OpenGL support.
687</para></listitem>
688</orderedlist>
689
690<para>These, and other, options will be investigated in this phase of the
691work.
692</para>
693
<para>Work by others has made Chromium DMX-aware.  Chromium will use the
695DMX X protocol extension to obtain information about the back-end
696servers and will render directly to those servers, bypassing DMX.
697</para>
698
699<para>Status: OpenGL support by the glxProxy extension was implemented by
700SGI and has been integrated into the DMX code base.
701</para>
702<!-- May 2003-->
703</sect2>
704
705</sect1>
706
707<!-- ============================================================ -->
708<sect1>
709<title>Current issues</title>
710
<para>This section outlines the current issues that require further
investigation.
713</para>
714
715<sect2>
716<title>Fonts</title>
717
718<para>The font path and glyphs need to be the same for the front-end and
719each of the back-end servers.  Font glyphs could be sent to the back-end
720servers as necessary but this would consume a significant amount of
721available bandwidth during font rendering for clients that use many
722different fonts (e.g., Netscape).  Initially, the font server (xfs) will
723be used to provide the fonts to both the front-end and back-end servers.
724Other possibilities will be investigated during development.
725</para>
726</sect2>
727
728<sect2>
729<title>Zero width rendering primitives</title>
730
731<para>To allow pixmap and on-screen rendering to be pixel perfect, all
732back-end servers must render zero width primitives exactly the same as
733the front-end renders the primitives to pixmaps.  For those back-end
734servers that do not exactly match, zero width primitives will be
735automatically converted to one width primitives.  This can be handled in
736the front-end server via the GC state.
737</para>
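
<para>The conversion described above can be sketched as follows.  This is
only an illustration under stated assumptions: the SketchGC and
SketchBackend types, the zero_width_matches flag and the
sketch_backend_line_width() function are hypothetical names invented for
this example and are not part of the DMX or X server sources.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the line-width part of the GC state. */
typedef struct {
    unsigned int line_width;      /* 0 requests a zero width line */
} SketchGC;

/* Hypothetical per-back-end flag: true when the back-end renders zero
 * width primitives exactly as the front-end does when drawing to pixmaps. */
typedef struct {
    bool zero_width_matches;
} SketchBackend;

/* Return the line width that would be sent to the given back-end. */
unsigned int sketch_backend_line_width(const SketchGC *gc,
                                       const SketchBackend *be)
{
    assert(gc != NULL && be != NULL);
    if (gc->line_width == 0 && !be->zero_width_matches)
        return 1;                 /* convert to a width-one primitive */
    return gc->line_width;        /* otherwise pass the width through */
}
```
]]></programlisting>

<para>In this sketch the front-end GC is left untouched; only the value
forwarded to a mismatching back-end changes, so pixmap rendering in the
front-end is unaffected.
</para>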
738</sect2>
739
740<sect2>
741<title>Output scaling</title>
742
743<para>With very large tiled displays, it might be difficult to read the
744information on the standard X desktop.  In particular, the cursor can be
745easily lost and fonts could be difficult to read.  Automatic primitive
746scaling might prove to be very useful.  We will investigate the
747possibility of scaling the cursor and providing a set of alternate
748pre-scaled fonts to replace the standard fonts that many applications
749use (e.g., fixed).  Other options for automatic scaling will also be
750investigated.
751</para>
752</sect2>
753
754<sect2>
755<title>Per-screen colormaps</title>
756
757<para>Each screen's default colormap in the set of back-end X servers
should be adjustable via a configuration utility.  This support
would allow the back-end screens to be calibrated via custom gamma
tables.  On 24-bit systems that support a DirectColor visual, this type
of correction can be accommodated.  One possible implementation would be
to advertise a TrueColor visual to X clients of the DMX server while
using DirectColor visuals on the back-end servers to implement this type
of color correction.  Other options will be investigated.
765</para>
766</sect2>
767</sect1>
768
769<!-- ============================================================ -->
770<appendix>
771<title>Appendix</title>
772
773<sect1>
774<title>Background</title>
775
776<para>This section describes the existing Open Source architectures that
777can be used to handle multiple screens and upon which this development
778project is based.  This section was written before the implementation
779was finished, and may not reflect actual details of the implementation.
780It is left for historical interest only.
781</para>
782
783<sect2>
784<title>Core input device handling</title>
785
786<para>The following is a description of how core input devices are handled
787by an X server.
788</para>
789
790<sect3>
791<title>InitInput()</title>
792
793<para>InitInput() is a DDX function that is called at the start of each
794server generation from the X server's main() function.  Its purpose is
795to determine what input devices are connected to the X server, register
796them with the DIX and MI layers, and initialize the input event queue.
797InitInput() does not have a return value, but the X server will abort if
either a core keyboard device or a core pointer device is not
799registered.  Extended input (XInput) devices can also be registered in
800InitInput().
801</para>
802
803<para>InitInput() usually has implementation specific code to determine
804which input devices are available.  For each input device it will be
805using, it calls AddInputDevice():
806
807<variablelist>
808<varlistentry>
809<term>AddInputDevice()</term>
810<listitem><para>This DIX function allocates the device structure,
811registers a callback function (which handles device init, close, on and
812off), and returns the input handle, which can be treated as opaque.  It
813is called once for each input device.
814</para></listitem>
815</varlistentry>
816</variablelist>
817</para>
818
<para>Once input handles for the core keyboard and core pointer devices have
been obtained from AddInputDevice(), both devices must be registered as
core devices.  If either core device is not registered, then the X server
will exit with a fatal error when it attempts to start the input devices in
InitAndStartDevices(), which is called directly after InitInput() (see
below).
824</para>
825
826<para>The core pointer device is then registered with the miPointer code
827(which does the high level cursor handling).  While this registration
828is not necessary for correct miPointer operation in the current XFree86
829code, it is still done mostly for compatibility reasons.
830</para>
831
832<para><variablelist>
833
834<varlistentry>
835<term>miRegisterPointerDevice()</term>
<listitem><para>This MI function registers the core
pointer's input handle with the miPointer code.
838</para></listitem></varlistentry>
839</variablelist>
840</para>
841
842<para>The final part of InitInput() is the initialization of the input
843event queue handling.  In most cases, the event queue handling provided
844in the MI layer is used.  The primary XFree86 X server uses its own
845event queue handling to support some special cases related to the XInput
846extension and the XFree86-specific DGA extension.  For our purposes, the
847MI event queue handling should be suitable.  It is initialized by
848calling mieqInit():
849
850<variablelist>
851<varlistentry>
852<term>mieqInit()</term>
853<listitem><para>This MI function initializes the MI event queue for the
854core devices, and is passed the public component of the input handles
855for the two core devices.
856</para></listitem></varlistentry>
857</variablelist>
858</para>
859
860<para>If a wakeup handler is required to deliver synchronous input
861events, it can be registered here by calling the DIX function
862RegisterBlockAndWakeupHandlers().  (See the devReadInput() description
863below.)
864</para>
865</sect3>
866
867<sect3>
868<title>InitAndStartDevices()</title>
869
870<para>InitAndStartDevices() is a DIX function that is called immediately
871after InitInput() from the X server's main() function.  Its purpose is
872to initialize each input device that was registered with
873AddInputDevice(), enable each input device that was successfully
874initialized, and create the list of enabled input devices.  Once each
875registered device is processed in this way, the list of enabled input
876devices is checked to make sure that both a core keyboard device and
877core pointer device were registered and successfully enabled.  If not,
InitAndStartDevices() returns failure, which results in the X server
exiting with a fatal error.
880</para>
881
882<para>Each registered device is initialized by calling its callback
883(dev-&gt;deviceProc) with the DEVICE_INIT argument:
884
885<variablelist>
886<varlistentry>
887<term>(*dev-&gt;deviceProc)(dev, DEVICE_INIT)</term>
888<listitem>
889<para>This function initializes the
890device structs with core information relevant to the device.
891</para>
892
893<para>For pointer devices, this means specifying the number of buttons,
894default button mapping, the function used to get motion events (usually
895miPointerGetMotionEvents()), the function used to change/control the
896core pointer motion parameters (acceleration and threshold), and the
897motion buffer size.
898</para>
899
900<para>For keyboard devices, this means specifying the keycode range,
901default keycode to keysym mapping, default modifier mapping, and the
902functions used to sound the keyboard bell and modify/control the
903keyboard parameters (LEDs, bell pitch and duration, key click, which
904keys are auto-repeating, etc).
905</para></listitem></varlistentry>
906</variablelist>
907</para>
908
909<para>Each initialized device is enabled by calling EnableDevice():
910
911<variablelist>
912<varlistentry>
913<term>EnableDevice()</term>
914<listitem>
915<para>EnableDevice() calls the device callback with
916DEVICE_ON:
917    <variablelist>
918    <varlistentry>
919    <term>(*dev-&gt;deviceProc)(dev, DEVICE_ON)</term>
920    <listitem>
921    <para>This typically opens and
922    initializes the relevant physical device, and when appropriate,
923    registers the device's file descriptor (or equivalent) as a valid
924    input source.
925    </para></listitem></varlistentry>
926    </variablelist>
927    </para>
928
929    <para>EnableDevice() then adds the device handle to the X server's
930    global list of enabled devices.
931</para></listitem></varlistentry>
932</variablelist>
933</para>
934
<para>InitAndStartDevices() then verifies that a valid core keyboard and a
valid core pointer have been initialized and enabled.  It returns failure if
either is missing.
938</para>
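
<para>The initialize-then-enable sequence above can be sketched as follows.
This is a simplified model, not the DIX code: the SketchDevice structure,
the SKETCH_* constants and both functions are hypothetical stand-ins, and
the real device structure carries far more state.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the X server's own definitions. */
enum { SKETCH_DEVICE_INIT, SKETCH_DEVICE_ON,
       SKETCH_DEVICE_OFF, SKETCH_DEVICE_CLOSE };
enum { SKETCH_SUCCESS = 0, SKETCH_FAILURE = 1 };
enum { SKETCH_CORE_KEYBOARD, SKETCH_CORE_POINTER, SKETCH_EXTENSION };

typedef struct SketchDevice {
    int (*deviceProc)(struct SketchDevice *dev, int what);
    int role;          /* SKETCH_CORE_KEYBOARD, SKETCH_CORE_POINTER, ... */
    int initialized;
    int enabled;
} SketchDevice;

/* A device callback that handles the four requests named in the text. */
int sketch_device_proc(SketchDevice *dev, int what)
{
    assert(dev != NULL);
    switch (what) {
    case SKETCH_DEVICE_INIT:      /* fill in core device information */
        dev->initialized = 1;
        return SKETCH_SUCCESS;
    case SKETCH_DEVICE_ON:        /* open the physical device */
        if (!dev->initialized)
            return SKETCH_FAILURE;
        dev->enabled = 1;
        return SKETCH_SUCCESS;
    case SKETCH_DEVICE_OFF:       /* stop input, keep the data structures */
        dev->enabled = 0;
        return SKETCH_SUCCESS;
    case SKETCH_DEVICE_CLOSE:     /* release everything */
        dev->enabled = 0;
        dev->initialized = 0;
        return SKETCH_SUCCESS;
    default:
        return SKETCH_FAILURE;
    }
}

/* Initialize and enable every registered device, then check that both
 * core devices ended up enabled. */
int sketch_init_and_start(SketchDevice *devs, size_t ndevs)
{
    int have_keyboard = 0, have_pointer = 0;
    size_t i;

    for (i = 0; i < ndevs; i++) {
        SketchDevice *dev = &devs[i];
        if (dev->deviceProc(dev, SKETCH_DEVICE_INIT) != SKETCH_SUCCESS)
            continue;             /* not initialized, so never enabled */
        if (dev->deviceProc(dev, SKETCH_DEVICE_ON) != SKETCH_SUCCESS)
            continue;
        if (dev->role == SKETCH_CORE_KEYBOARD)
            have_keyboard = 1;
        if (dev->role == SKETCH_CORE_POINTER)
            have_pointer = 1;
    }
    return (have_keyboard && have_pointer) ? SKETCH_SUCCESS : SKETCH_FAILURE;
}
```
]]></programlisting>

<para>A caller that receives SKETCH_FAILURE here corresponds to the X server
exiting with a fatal error.
</para>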
939</sect3>
940
941<sect3>
942<title>devReadInput()</title>
943
944<para>Each device will have some function that gets called to read its
945physical input.  These may be called in a number of different ways.  In
946the case of synchronous I/O, they will be called from a DDX
947wakeup-handler that gets called after the server detects that new input is
948available.  In the case of asynchronous I/O, they will be called from a
949(SIGIO) signal handler triggered when new input is available.  This
950function should do at least two things: make sure that input events get
951enqueued, and make sure that the cursor gets moved for motion events
952(except if these are handled later by the driver's own event queue
953processing function, which cannot be done when using the MI event queue
954handling).
955</para>
956
957<para>Events are queued by calling mieqEnqueue():
958
959<variablelist>
960<varlistentry>
961<term>mieqEnqueue()</term>
962<listitem>
963<para>This MI function is used to add input events to the
964event queue.  It is simply passed the event to be queued.
965</para></listitem></varlistentry>
966</variablelist>
967</para>
968
969<para>The cursor position should be updated when motion events are
970enqueued, by calling either miPointerAbsoluteCursor() or
971miPointerDeltaCursor():
972
973<variablelist>
974<varlistentry>
975<term>miPointerAbsoluteCursor()</term>
976<listitem>
977<para>This MI function is used to move the
978cursor to the absolute coordinates provided.
979</para></listitem></varlistentry>
980<varlistentry>
981<term>miPointerDeltaCursor()</term>
982<listitem>
983<para>This MI function is used to move the cursor
984relative to its current position.
985</para></listitem></varlistentry>
986</variablelist>
987</para>
988</sect3>
989
990<sect3>
991<title>ProcessInputEvents()</title>
992
993<para>ProcessInputEvents() is a DDX function that is called from the X
994server's main dispatch loop when new events are available in the input
995event queue.  It typically processes the enqueued events, and updates
996the cursor/pointer position.  It may also do other DDX-specific event
997processing.
998</para>
999
1000<para>Enqueued events are processed by mieqProcessInputEvents() and passed
1001to the DIX layer for transmission to clients:
1002
1003<variablelist>
1004<varlistentry>
1005<term>mieqProcessInputEvents()</term>
1006<listitem>
1007<para>This function processes each event in the
1008event queue, and passes it to the device's input processing function.
1009The DIX layer provides default functions to do this processing, and they
1010handle the task of getting the events passed back to the relevant
1011clients.
1012</para></listitem></varlistentry>
1013<varlistentry>
1014<term>miPointerUpdate()</term>
1015<listitem>
<para>This function resynchronizes the cursor position
1017with the new pointer position.  It also takes care of moving the cursor
1018between screens when needed in multi-head configurations.
1019</para></listitem></varlistentry>
1020</variablelist>
1021</para>
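
<para>The split between enqueueing events when input is read and draining
them from the dispatch loop can be sketched with a small ring buffer.  This
is a simplified model and not the MI event queue itself: the SketchEvent and
SketchQueue types, the queue size and all function names are assumptions
made for this example.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_QUEUE_SIZE 8       /* arbitrary size for the example */

typedef struct {
    int type;                     /* e.g. key press or pointer motion */
    int detail;                   /* e.g. keycode or button number */
} SketchEvent;

typedef struct {
    SketchEvent events[SKETCH_QUEUE_SIZE];
    size_t head;                  /* next event to process */
    size_t tail;                  /* next free slot */
} SketchQueue;

typedef void (*SketchHandler)(const SketchEvent *ev, void *closure);

void sketch_queue_init(SketchQueue *q)
{
    assert(q != NULL);
    q->head = q->tail = 0;
}

/* Called when input is read: returns 1 if the event was queued and 0 if
 * the queue was full and the event had to be dropped. */
int sketch_enqueue(SketchQueue *q, SketchEvent ev)
{
    size_t next = (q->tail + 1) % SKETCH_QUEUE_SIZE;
    if (next == q->head)
        return 0;
    q->events[q->tail] = ev;
    q->tail = next;
    return 1;
}

/* Called from the dispatch loop: hands each queued event to the handler
 * and returns the number of events processed. */
size_t sketch_process_events(SketchQueue *q, SketchHandler handler,
                             void *closure)
{
    size_t count = 0;
    while (q->head != q->tail) {
        handler(&q->events[q->head], closure);
        q->head = (q->head + 1) % SKETCH_QUEUE_SIZE;
        count++;
    }
    return count;
}

/* Example handler: adds each event's detail to an int counter. */
void sketch_sum_details(const SketchEvent *ev, void *closure)
{
    *(int *)closure += ev->detail;
}
```
]]></programlisting>

<para>One slot is always left empty so that a full queue can be told apart
from an empty one; a queue of size 8 therefore holds at most 7 events.
</para>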
1022
1023</sect3>
1024
1025<sect3>
1026<title>DisableDevice()</title>
1027
<para>DisableDevice() is a DIX function that removes an input device from the
1029list of enabled devices.  The result of this is that the device no
1030longer generates input events.  The device's data structures are kept in
1031place, and disabling a device like this can be reversed by calling
1032EnableDevice().  DisableDevice() may be called from the DDX when it is
1033desirable to do so (e.g., the XFree86 server does this when VT
1034switching).  Except for special cases, this is not normally called for
1035core input devices.
1036</para>
1037
1038<para>DisableDevice() calls the device's callback function with
1039<constant>DEVICE_OFF</constant>:
1040
1041<variablelist>
1042<varlistentry>
1043<term>(*dev-&gt;deviceProc)(dev, DEVICE_OFF)</term>
1044<listitem>
1045<para>This typically closes the
1046relevant physical device, and when appropriate, unregisters the device's
1047file descriptor (or equivalent) as a valid input source.
1048</para></listitem></varlistentry>
1049</variablelist>
1050</para>
1051
1052<para>DisableDevice() then removes the device handle from the X server's
1053global list of enabled devices.
1054</para>
1055
1056</sect3>
1057
1058<sect3>
1059<title>CloseDevice()</title>
1060
<para>CloseDevice() is a DIX function that removes an input device from the
1062list of available devices.  It disables input from the device and frees
1063all data structures associated with the device.  This function is
1064usually called from CloseDownDevices(), which is called from main() at
1065the end of each server generation to close all input devices.
1066</para>
1067
1068<para>CloseDevice() calls the device's callback function with
1069<constant>DEVICE_CLOSE</constant>:
1070
1071<variablelist>
1072<varlistentry>
1073<term>(*dev-&gt;deviceProc)(dev, DEVICE_CLOSE)</term>
1074<listitem>
1075<para>This typically closes the
1076relevant physical device, and when appropriate, unregisters the device's
1077file descriptor (or equivalent) as a valid input source.  If any device
1078specific data structures were allocated when the device was initialized,
1079they are freed here.
1080</para></listitem></varlistentry>
1081</variablelist>
1082</para>
1083
1084<para>CloseDevice() then frees the data structures that were allocated
1085for the device when it was registered/initialized.
1086</para>
1087
1088</sect3>
1089
1090<sect3>
1091<title>LegalModifier()</title>
1092<!-- dmx/dmxinput.c - currently returns TRUE -->
1093<para>LegalModifier() is a required DDX function that can be used to
1094restrict which keys may be modifier keys.  This seems to be present for
1095historical reasons, so this function should simply return TRUE
1096unconditionally.
1097</para>
1098
1099</sect3>
1100</sect2>
1101
1102<sect2>
1103<title>Output handling</title>
1104
1105<para>The following sections describe the main functions required to
1106initialize, use and close the output device(s) for each screen in the X
1107server.
1108</para>
1109
1110<sect3>
1111<title>InitOutput()</title>
1112
1113<para>This DDX function is called near the start of each server generation
1114from the X server's main() function.  InitOutput()'s main purpose is to
1115initialize each screen and fill in the global screenInfo structure for
1116each screen.  It is passed three arguments: a pointer to the screenInfo
1117struct, which it is to initialize, and argc and argv from main(), which
1118can be used to determine additional configuration information.
1119</para>
1120
1121<para>The primary tasks for this function are outlined below:
1122
1123<orderedlist>
1124<listitem>
1125    <para><emphasis remap="bf">Parse configuration info:</emphasis> The first task of InitOutput()
    is to parse any configuration information from the configuration
1127    file.  In addition to the XF86Config file, other configuration
1128    information can be taken from the command line.  The command line
1129    options can be gathered either in InitOutput() or earlier in the
1130    ddxProcessArgument() function, which is called by
1131    ProcessCommandLine().  The configuration information determines the
1132    characteristics of the screen(s).  For example, in the XFree86 X
1133    server, the XF86Config file specifies the monitor information, the
1134    screen resolution, the graphics devices and slots in which they are
1135    located, and, for Xinerama, the screens' layout.
1136</para>
1137</listitem>
1138
1139<listitem>
1140    <para><emphasis remap="bf">Initialize screen info:</emphasis> The next task is to initialize
1141    the screen-dependent internal data structures.  For example, part of
1142    what the XFree86 X server does is to allocate its screen and pixmap
1143    private indices, probe for graphics devices, compare the probed
1144    devices to the ones listed in the XF86Config file, and add the ones that
1145    match to the internal xf86Screens&lsqb;&rsqb; structure.
1146</para>
1147</listitem>
1148
1149<listitem>
1150    <para><emphasis remap="bf">Set pixmap formats:</emphasis> The next task is to initialize the
1151    screenInfo's image byte order, bitmap bit order and bitmap scanline
    unit/pad.  The screenInfo's pixmap format's depth, bits per pixel
    and scanline padding are also initialized at this stage.
1154</para>
1155</listitem>
1156
1157<listitem>
1158    <para><emphasis remap="bf">Unify screen info:</emphasis> An optional task that might be done at
1159    this stage is to compare all of the information from the various
    screens and determine if they are compatible (i.e., if the set of
1161    screens can be unified into a single desktop).  This task has
1162    potential to be useful to the DMX front-end server, if Xinerama's
1163    PanoramiXConsolidate() function is not sufficient.
1164</para>
1165</listitem>
1166</orderedlist>
1167</para>
1168
1169<para>Once these tasks are complete, the valid screens are known and each
1170of these screens can be initialized by calling AddScreen().
1171</para>
1172</sect3>
1173
1174<sect3>
1175<title>AddScreen()</title>
1176
1177<para>This DIX function is called from InitOutput(), in the DDX layer, to
1178add each new screen to the screenInfo structure.  The DDX screen
1179initialization function and command line arguments (i.e., argc and argv)
1180are passed to it as arguments.
1181</para>
1182
1183<para>This function first allocates a new Screen structure and any privates
1184that are required.  It then initializes some of the fields in the Screen
1185struct and sets up the pixmap padding information.  Finally, it calls
1186the DDX screen initialization function ScreenInit(), which is described
below.  It returns the number of the screen that was just added, or -1
1188if there is insufficient memory to add the screen or if the DDX screen
1189initialization fails.
1190</para>
1191</sect3>
1192
1193<sect3>
1194<title>ScreenInit()</title>
1195
1196<para>This DDX function initializes the rest of the Screen structure with
1197either generic or screen-specific functions (as necessary).  It also
1198fills in various screen attributes (e.g., width and height in
1199millimeters, black and white pixel values).
1200</para>
1201
<para>The screen init function usually calls several functions to perform
certain screen initialization tasks.  They are described below:
1204
1205<variablelist>
1206<varlistentry>
1207<term>{mi,*fb}ScreenInit()</term>
1208<listitem>
1209<para>The DDX layer's ScreenInit() function usually
1210calls another layer's ScreenInit() function (e.g., miScreenInit() or
1211fbScreenInit()) to initialize the fallbacks that the DDX driver does not
1212specifically handle.
1213</para>
1214
1215<para>After calling another layer's ScreenInit() function, any
1216screen-specific functions either wrap or replace the other layer's
function pointers.  If a function is to be wrapped, the old
function pointer from the other layer is stored in a screen private
area.  Common functions to wrap are CloseScreen() and SaveScreen().
1220</para></listitem></varlistentry>
1221
1222<varlistentry>
1223<term>miInitializeBackingStore()</term>
1224<listitem>
1225<para>This MI function initializes the
1226screen's backing storage functions, which are used to save areas of
1227windows that are currently covered by other windows.
1228</para></listitem></varlistentry>
1229
1230<varlistentry>
1231<term>miDCInitialize()</term>
1232<listitem>
1233<para>This MI function initializes the MI cursor
1234display structures and function pointers.  If a hardware cursor is used,
1235the DDX layer's ScreenInit() function will wrap additional screen and
1236the MI cursor display function pointers.
1237</para></listitem></varlistentry>
1238</variablelist>
1239</para>
1240
<para>Another common task for the ScreenInit() function is to initialize the
1242output device state.  For example, in the XFree86 X server, the
1243ScreenInit() function saves the original state of the video card and
1244then initializes the video mode of the graphics device.
1245</para>
1246</sect3>
1247
1248<sect3>
1249<title>CloseScreen()</title>
1250
1251<para>This function restores any wrapped screen functions (and in
1252particular the wrapped CloseScreen() function) and restores the state of
1253the output device to its original state.  It should also free any
1254private data it created during the screen initialization.
1255</para>
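
<para>The wrap and unwrap pattern used by ScreenInit() and CloseScreen() can
be sketched as follows.  The SketchScreen and SketchPriv types and all
function names are hypothetical; the real Screen structure holds many more
function pointers and uses its own private-area mechanism.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct SketchScreen SketchScreen;
typedef int (*SketchCloseScreenProc)(SketchScreen *screen);

struct SketchScreen {
    SketchCloseScreenProc CloseScreen;  /* current entry point */
    void *devPrivate;                   /* private area of the wrapping layer */
    int lower_closed;                   /* set by the lower layer, for the demo */
};

/* Private data kept by the wrapping layer. */
typedef struct {
    SketchCloseScreenProc saved_close;  /* the wrapped function pointer */
} SketchPriv;

/* Stands in for the CloseScreen() installed by another layer. */
int sketch_lower_close(SketchScreen *screen)
{
    screen->lower_closed = 1;
    return 1;
}

/* The wrapper: restore the saved pointer, free the private data, then
 * call down to the layer that was wrapped. */
int sketch_wrapped_close(SketchScreen *screen)
{
    SketchPriv *priv = screen->devPrivate;

    assert(priv != NULL);
    screen->CloseScreen = priv->saved_close;
    screen->devPrivate = NULL;
    free(priv);
    return screen->CloseScreen(screen);
}

/* Screen initialization: returns 1 on success and 0 if out of memory. */
int sketch_screen_init(SketchScreen *screen)
{
    SketchPriv *priv = malloc(sizeof *priv);

    if (priv == NULL)
        return 0;
    /* The lower layer's initialization would normally set this. */
    screen->CloseScreen = sketch_lower_close;
    screen->lower_closed = 0;
    /* Save the old pointer, then install the wrapper. */
    priv->saved_close = screen->CloseScreen;
    screen->devPrivate = priv;
    screen->CloseScreen = sketch_wrapped_close;
    return 1;
}
```
]]></programlisting>

<para>Because the wrapper restores the saved pointer before chaining down,
the screen is left in the state the lower layer expects, which matters when
the server regenerates.
</para>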
1256</sect3>
1257
1258<sect3>
1259<title>GC operations</title>
1260
1261<para>When the X server is requested to render drawing primitives, it does
1262so by calling drawing functions through the graphics context's operation
1263function pointer table (i.e., the GCOps functions).  These functions
1264render the basic graphics operations such as drawing rectangles, lines,
1265text or copying pixmaps.  Default routines are provided either by the MI
1266layer, which draws indirectly through a simple span interface, or by the
1267framebuffer layers (e.g., CFB, MFB, FB), which draw directly to a
1268linearly mapped frame buffer.
1269</para>
1270
1271<para>To take advantage of special hardware on the graphics device,
1272specific GCOps functions can be replaced by device specific code.
1273However, many times the graphics devices can handle only a subset of the
1274possible states of the GC, so during graphics context validation,
1275appropriate routines are selected based on the state and capabilities of
1276the hardware.  For example, some graphics hardware can accelerate single
1277pixel width lines with certain dash patterns.  Thus, for dash patterns
1278that are not supported by hardware or for width 2 or greater lines, the
1279default routine is chosen during GC validation.
1280</para>
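
<para>The selection made during GC validation can be sketched as follows.
The structure fields, the dash_supported flag and the function names are
assumptions for illustration only; a real driver inspects many more GC
fields, and treating width 0 and width 1 alike as single-pixel lines is a
simplification made here.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SKETCH_PATH_SOFTWARE, SKETCH_PATH_HARDWARE };

typedef struct SketchGC SketchGC;
typedef int (*SketchPolylinesProc)(const SketchGC *gc);

struct SketchGC {
    unsigned int line_width;
    bool dashed;                  /* true when a dash pattern is in use */
    bool dash_supported;          /* hypothetical: hardware handles the pattern */
    SketchPolylinesProc Polylines;/* entry in the GC's operation table */
};

/* Stands in for the default routine from the MI or framebuffer layer. */
int sketch_default_polylines(const SketchGC *gc)
{
    (void)gc;
    return SKETCH_PATH_SOFTWARE;
}

/* Stands in for a device specific accelerated routine. */
int sketch_accel_polylines(const SketchGC *gc)
{
    (void)gc;
    return SKETCH_PATH_HARDWARE;
}

/* Choose the line routine that matches the current GC state. */
void sketch_validate_gc(SketchGC *gc)
{
    bool thin, dash_ok;

    assert(gc != NULL);
    thin = gc->line_width <= 1;
    dash_ok = !gc->dashed || gc->dash_supported;
    gc->Polylines = (thin && dash_ok) ? sketch_accel_polylines
                                      : sketch_default_polylines;
}
```
]]></programlisting>

<para>Validation must run again whenever the relevant GC state changes, so
that the function pointer always matches what the hardware can handle.
</para>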
1281
1282<para>Note that some pointers to functions that draw to the screen are
1283stored in the Screen structure.  They include GetImage(), GetSpans(),
1284CopyWindow() and RestoreAreas().
1285</para>
1286</sect3>
1287
1288<sect3>
1289<title>Xnest</title>
1290
1291<para>The Xnest X server is a special proxy X server that relays the X
protocol requests that it receives to a <quote>real</quote> X server that then
1293processes the requests and displays the results, if applicable.  To the X
1294applications, Xnest appears as if it is a regular X server.  However,
1295Xnest is both server to the X application and client of the real X
1296server, which will actually handle the requests.
1297</para>
1298
1299<para>The Xnest server implements all of the standard input and output
1300initialization steps outlined above.
1301</para>
1302
1303<para><variablelist>
1304<varlistentry>
1305<term>InitOutput()</term>
1306<listitem>
<para>Xnest takes its configuration information from
command line arguments via ddxProcessArgument().  This information
includes the real X server display to connect to, its default visual
class, the screen depth, the Xnest window's geometry, etc.  Xnest then
connects to the real X server, gathers visual, colormap, depth and
pixmap information about that server's display, and creates a window on
that server, which will be used as the root window for Xnest.
1314</para>
1315
1316<para>Next, Xnest initializes its internal data structures and uses the
1317data from the real X server's pixmaps to initialize its own pixmap
1318formats.  Finally, it calls AddScreen(xnestOpenScreen, argc, argv) to
1319initialize each of its screens.
1320</para></listitem></varlistentry>
1321
1322<varlistentry>
1323<term>ScreenInit()</term>
1324<listitem>
1325<para>Xnest's ScreenInit() function is called
1326xnestOpenScreen().  This function initializes its screen's depth and
1327visual information, and then calls miScreenInit() to set up the default
1328screen functions.  It then calls miInitializeBackingStore() and
1329miDCInitialize() to initialize backing store and the software cursor.
1330Finally, it replaces many of the screen functions with its own
1331functions that repackage and send the requests to the real X server to
1332which Xnest is attached.
1333</para></listitem></varlistentry>
1334
1335<varlistentry>
1336<term>CloseScreen()</term>
1337<listitem>
1338<para>This function frees its internal data structure
1339allocations.  Since it replaces instead of wrapping screen functions,
1340there are no function pointers to unwrap.  This can potentially lead to
1341problems during server regeneration.
1342</para></listitem></varlistentry>
1343
1344<varlistentry>
1345<term>GC operations</term>
1346<listitem>
1347<para>The GC operations in Xnest are very simple since
1348they leave all of the drawing to the real X server to which Xnest is
1349attached.  Each of the GCOps takes the request and sends it to the
real X server using standard Xlib calls.  For example, suppose the X
application issues an XDrawLines() call.  This call turns into a
1352protocol request to Xnest, which calls the xnestPolylines() function
1353through Xnest's GCOps function pointer table.  The xnestPolylines()
1354function is only a single line, which calls XDrawLines() using the same
1355arguments that were passed into it.  Other GCOps functions are very
1356similar.  Two exceptions to the simple GCOps functions described above
1357are the image functions and the BLT operations.
1358</para>
1359
<para>The image functions, GetImage() and PutImage(), must use a temporary
image to hold the image to be put or the image that was just grabbed
from the screen while it is in transit to the real X server or the
1363client.  When the image has been transmitted, the temporary image is
1364destroyed.
1365</para>
1366
1367<para>The BLT operations, CopyArea() and CopyPlane(), handle not only the
1368copy function, which is the same as the simple cases described above,
1369but also the graphics exposures that result when the GC's graphics
1370exposure bit is set to True.  Graphics exposures are handled in a helper
function, xnestBitBlitHelper().  This function collects the exposure
events from the real X server and, if any of them result in regions being
exposed, passes those regions back to the MI layer so that it
can generate exposure events for the X application.
1375</para></listitem></varlistentry>
1376</variablelist>
1377</para>
1378
1379<para>The Xnest server takes its input from the X server to which it is
1380connected.  When the mouse is in the Xnest server's window, keyboard and
1381mouse events are received by the Xnest server, repackaged and sent back
1382to any client that requests those events.
1383</para>
1384</sect3>
1385
1386<sect3>
1387<title>Shadow framebuffer</title>
1388
<para>The most common type of framebuffer is a linear array of memory that
1390maps to the video memory on the graphics device.  However, accessing
1391that video memory over an I/O bus (e.g., ISA or PCI) can be slow.  The
1392shadow framebuffer layer allows the developer to keep the entire
1393framebuffer in main memory and copy it back to video memory at regular
1394intervals.  It also has been extended to handle planar video memory and
1395rotated framebuffers.
1396</para>
1397
1398<para>There are two main entry points to the shadow framebuffer code:
1399
1400<variablelist>
1401<varlistentry>
1402<term>shadowAlloc(width, height, bpp)</term>
1403<listitem>
<para>This function allocates the
in-memory copy of the framebuffer of size width*height*bpp.  It returns a
1406pointer to that memory, which will be used by the framebuffer
1407ScreenInit() code during the screen's initialization.
1408</para></listitem></varlistentry>
1409
1410<varlistentry>
1411<term>shadowInit(pScreen, updateProc, windowProc)</term>
1412<listitem>
1413<para>This function
1414initializes the shadow framebuffer layer.  It wraps several screen
1415drawing functions, and registers a block handler that will update the
1416screen.  The updateProc is a function that will copy the damaged regions
1417to the screen, and the windowProc is a function that is used when the
1418entire linear video memory range cannot be accessed simultaneously so
1419that only a window into that memory is available (e.g., when using the
1420VGA aperture).
1421</para></listitem></varlistentry>
1422</variablelist>
1423</para>
1424
1425<para>The shadow framebuffer code keeps track of the damaged area of each
1426screen by calculating the bounding box of all drawing operations that
1427have occurred since the last screen update.  Then, when the block handler
1428is next called, only the damaged portion of the screen is updated.
1429</para>
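
<para>The bounding-box damage tracking described above can be sketched as
follows.  The SketchBox and SketchDamage types and the function names are
hypothetical and do not correspond to the shadow framebuffer sources; the
update callback plays the role of the updateProc.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A rectangle; x2 and y2 are exclusive. */
typedef struct {
    int x1, y1, x2, y2;
} SketchBox;

typedef struct {
    bool dirty;                   /* true if anything was drawn */
    SketchBox box;                /* bounding box of all drawing so far */
} SketchDamage;

typedef void (*SketchUpdateProc)(const SketchBox *box, void *closure);

/* Called by each wrapped drawing function with the area it touched. */
void sketch_damage_add(SketchDamage *d, SketchBox b)
{
    assert(d != NULL && b.x1 <= b.x2 && b.y1 <= b.y2);
    if (!d->dirty) {
        d->box = b;
        d->dirty = true;
        return;
    }
    if (b.x1 < d->box.x1) d->box.x1 = b.x1;
    if (b.y1 < d->box.y1) d->box.y1 = b.y1;
    if (b.x2 > d->box.x2) d->box.x2 = b.x2;
    if (b.y2 > d->box.y2) d->box.y2 = b.y2;
}

/* Called from the block handler: copy the damaged area to the screen,
 * then reset.  Returns true if an update was performed. */
bool sketch_damage_flush(SketchDamage *d, SketchUpdateProc update,
                         void *closure)
{
    assert(d != NULL && update != NULL);
    if (!d->dirty)
        return false;
    update(&d->box, closure);
    d->dirty = false;
    return true;
}

/* Example update function: records the box it was asked to copy. */
void sketch_record_update(const SketchBox *box, void *closure)
{
    *(SketchBox *)closure = *box;
}
```
]]></programlisting>

<para>A single bounding box is cheap to maintain, at the cost of copying
untouched pixels when two small, distant areas are drawn between updates.
</para>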
1430
1431<para>Note that since the shadow framebuffer is kept in main memory, all
1432drawing operations are performed by the CPU and, thus, no accelerated
1433hardware drawing operations are possible.
1434</para>
1435
1436</sect3>
1437</sect2>
1438
1439<sect2>
1440<title>Xinerama</title>
1441
1442<para>Xinerama is an X extension that allows multiple physical screens
1443controlled by a single X server to appear as a single screen.  Although
1444the extension allows clients to find the physical screen layout via
1445extension requests, it is completely transparent to clients at the core
1446X11 protocol level.  The original public implementation of Xinerama came
1447from Digital/Compaq.  XFree86 rewrote it, filling in some missing pieces
1448and improving both X11 core protocol compliance and performance.  The
1449Xinerama extension will be passing through X.Org's standardization
1450process in the near future, and the sample implementation will be based
1451on this rewritten version.
1452</para>
1453
1454<para>The current implementation of Xinerama is based primarily in the DIX
1455(device independent) and MI (machine independent) layers of the X
1456server.  With few exceptions the DDX layers do not need any changes to
1457support Xinerama.  X server extensions often do need modifications to
1458provide full Xinerama functionality.
1459</para>
1460
1461<para>The following is a code-level description of how Xinerama functions.
1462</para>
1463
1464<para>Note: Because the Xinerama extension was originally called the
1465PanoramiX extension, many of the Xinerama functions still have the
1466PanoramiX prefix.
1467</para>
1468
1469<variablelist>
1470<varlistentry>
1471<term>PanoramiXExtensionInit()</term>
1472<listitem>
1473    <para>PanoramiXExtensionInit() is a
1474    device-independent extension function that is called at the start of
1475    each server generation from InitExtensions(), which is called from
1476    the X server's main() function after all output devices have been
1477    initialized, but before any input devices have been initialized.
1478    </para>
1479
1480    <para>PanoramiXNumScreens is set to the number of physical screens.  If
1481    only one physical screen is present, the extension is disabled, and
1482    PanoramiXExtensionInit() returns without doing anything else.
1483    </para>
1484
1485    <para>The Xinerama extension is registered by calling AddExtension().
1486    </para>
1487
1488    <para>GC and Screen private
1489    indexes are allocated, and both GC and Screen private areas are
1490    allocated for each physical screen.  These hold Xinerama-specific
1491    per-GC and per-Screen data.  Each screen's CreateGC and CloseScreen
1492    functions are wrapped by XineramaCreateGC() and
1493    XineramaCloseScreen() respectively.  Some new resource classes are
1494    created for Xinerama drawables and GCs, and resource types for
1495    Xinerama windows, pixmaps and colormaps.
1496    </para>
1497
1498    <para>A region (PanoramiXScreenRegion) is
1499    initialized to be the union of the screen regions.
1500    The relative positioning information for the
1501    physical screens is taken from the ScreenRec x and y members, which
    the DDX layer must initialize in InitOutput().  The bounds of the
    combined screen are also calculated (PanoramiXPixWidth and
    PanoramiXPixHeight).
1505    </para>
1506
1507    <para>The DIX layer has a list of function pointers
1508    (ProcVector&lsqb;&rsqb;) that
1509    holds the entry points for the functions that process core protocol
1510    requests.  The requests that Xinerama must intercept and break up
1511    into physical screen-specific requests are wrapped.  The original
1512    set is copied to SavedProcVector&lsqb;&rsqb;.  The types of requests
1513    intercepted are Window requests, GC requests, colormap requests,
1514    drawing requests, and some geometry-related requests.  This wrapping
1515    allows the bulk of the protocol request processing to be handled
1516    transparently to the DIX layer.  Some operations cannot be dealt with
1517    in this way and are handled with Xinerama-specific code within the
1518    DIX layer.
1519    </para>
1520</listitem></varlistentry>
1521
1522<varlistentry>
1523<term>PanoramiXConsolidate()</term>
1524<listitem>
1525    <para>PanoramiXConsolidate() is a
1526    device-independent extension function that is called directly from
1527    the X server's main() function after extensions and input/output
1528    devices have been initialized, and before the root windows are
1529    defined and initialized.
1530</para>
1531
1532    <para>This function finds the set of depths (PanoramiXDepths&lsqb;&rsqb;) and
1533    visuals (PanoramiXVisuals&lsqb;&rsqb;)
1534    common to all of the physical screens.
1535    PanoramiXNumDepths is set to the number of common depths, and
1536    PanoramiXNumVisuals is set to the number of common visuals.
1537    Resources are created for the single root window and the default
1538    colormap.  Each of these resources has per-physical screen entries.
1539    </para>
1540</listitem></varlistentry>
1541
1542<varlistentry>
1543<term>PanoramiXCreateConnectionBlock()</term>
1544<listitem>
    <para>PanoramiXCreateConnectionBlock() is a
1546    device-independent extension function that is called directly from
1547    the X server's main() function after the per-physical screen root
1548    windows are created.  It is called instead of the standard DIX
1549    CreateConnectionBlock() function.  If this function returns FALSE,
1550    the X server exits with a fatal error.  This function will return
1551    FALSE if no common depths were found in PanoramiXConsolidate().
1552    With no common depths, Xinerama mode is not possible.
1553    </para>
1554
1555    <para>The connection block holds the information that clients get when
1556    they open a connection to the X server.  It includes information
1557    such as the supported pixmap formats, number of screens and the
    sizes, depths, visuals, default colormap information, etc., for each
    of the screens (much of the information that <command>xdpyinfo</command> shows).  The
1560    connection block is initialized with the combined single screen
1561    values that were calculated in the above two functions.
1562    </para>
1563
1564    <para>The Xinerama extension allows the registration of connection
1565    block callback functions.  The purpose of these is to allow other
1566    extensions to do processing at this point.  These callbacks can be
1567    registered by calling XineramaRegisterConnectionBlockCallback() from
1568    the other extension's ExtensionInit() function.  Each registered
1569    connection block callback is called at the end of
1570    PanoramiXCreateConnectionBlock().
1571    </para>
1572</listitem></varlistentry>
1573</variablelist>
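<para>The request wrapping performed in PanoramiXExtensionInit() can be
sketched as follows.  This is a simplified illustration, not the
server's code: the handler signature, the table size, the REQ_DRAW
opcode and all function names are assumptions made for this example.
Only the idea is taken from the description above: the original
ProcVector&lsqb;&rsqb; entries are copied to a saved table, and each
wrapper replays the request once per physical screen through the saved
entry.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <string.h>

#define NUM_REQUESTS 4   /* hypothetical; the real table covers every core request */
#define NUM_SCREENS  2   /* hypothetical number of physical screens */

enum { REQ_DRAW = 1 };   /* hypothetical request opcode */

/* Simplified handler signature, assumed for illustration only. */
typedef int (*RequestProc)(int screen, int arg);

static RequestProc proc_vector[NUM_REQUESTS];        /* live dispatch table */
static RequestProc saved_proc_vector[NUM_REQUESTS];  /* original entry points */
static int calls_per_screen[NUM_SCREENS];

/* Stand-in for a standard single-screen request handler. */
static int standard_handler(int screen, int arg)
{
    (void)arg;
    calls_per_screen[screen]++;
    return 0;
}

/* Wrapper: replay the request once per physical screen via the saved entry. */
static int wrapped_draw(int screen, int arg)
{
    int s, status = 0;
    (void)screen;   /* the client only sees the single composite screen */
    assert(saved_proc_vector[REQ_DRAW] != NULL);
    for (s = 0; s < NUM_SCREENS && status == 0; s++)
        status = saved_proc_vector[REQ_DRAW](s, arg);
    return status;
}

/* Fill the table with the standard handlers (the DIX layer's job). */
static void table_init(void)
{
    int i;
    for (i = 0; i < NUM_REQUESTS; i++)
        proc_vector[i] = standard_handler;
}

/* Save the original table, then install wrappers for intercepted requests. */
static void extension_init(void)
{
    memcpy(saved_proc_vector, proc_vector, sizeof proc_vector);
    proc_vector[REQ_DRAW] = wrapped_draw;
}
```
]]></programlisting>
<para>After table_init() and extension_init() have run, dispatching
REQ_DRAW through proc_vector reaches the standard handler once for each
screen, while requests that are not intercepted still go directly to
their original handlers.
</para>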
1574
1575<sect3>
1576<title>Xinerama-specific changes to the DIX code</title>
1577
1578<para>There are a few types of Xinerama-specific changes within the DIX
1579code.  The main ones are described here.
1580</para>
1581
<para>Functions that deal with colormap- or GC-related operations outside of
the intercepted protocol requests have a test added so that the
processing is skipped for screen numbers &gt; 0.  This is because these
operations are handled for the single Xinerama screen, so the processing
is done only once, for screen 0.
1586</para>
1587
1588<para>The handling of motion events does some coordinate translation between
1589the physical screen's origin and screen zero's origin.  Also, motion
1590events must be reported relative to the composite screen origin rather
1591than the physical screen origins.
1592</para>
1593
1594<para>There is some special handling for cursor, window and event processing
1595that cannot (either not at all or not conveniently) be done via the
1596intercepted protocol requests.  A particular case is the handling of
1597pointers moving between physical screens.
1598</para>
1599</sect3>
1600
1601<sect3>
1602<title>Xinerama-specific changes to the MI code</title>
1603
1604<para>The only Xinerama-specific change to the MI code is in miSendExposures()
1605to handle the coordinate (and window ID) translation for expose events.
1606</para>
1607</sect3>
1608
1609<sect3>
1610<title>Intercepted DIX core requests</title>
1611
1612<para>Xinerama breaks up drawing requests for dispatch to each physical
1613screen.  It also breaks up windows into pieces for each physical screen.
1614GCs are translated into per-screen GCs.  Colormaps are replicated on
1615each physical screen.  The functions handling the intercepted requests
1616take care of breaking the requests and repackaging them so that they can
1617be passed to the standard request handling functions for each screen in
1618turn.  In addition, and to aid the repackaging, the information from
1619many of the intercepted requests is used to keep up to date the
1620necessary state information for the single composite screen.  Requests
1621(usually those with replies) that can be satisfied completely from this
1622stored state information do not call the standard request handling
1623functions.
1624</para>
1625
1626</sect3>
1627
1628</sect2>
1629
1630</sect1>
1631
1632<!-- ============================================================ -->
1633
1634<sect1>
1635<title>Development Results</title>
1636
1637<para>In this section the results of each phase of development are
1638discussed.  This development took place between approximately June 2001
1639and July 2003.
1640</para>
1641
1642<sect2>
1643<title>Phase I</title>
1644
1645<para>The initial development phase dealt with the basic implementation
1646including the bootstrap code, which used the shadow framebuffer, and the
1647unoptimized implementation, based on an Xnest-style implementation.
1648</para>
1649
1650<sect3>
1651<title>Scope</title>
1652
1653<para>The goal of Phase I is to provide fundamental functionality that can
1654act as a foundation for ongoing work:
1655<orderedlist>
1656<listitem>
1657    <para>Develop the proxy X server
1658    <itemizedlist>
1659	<listitem>
1660	<para>The proxy X server will operate on the X11 protocol and
1661	relay requests as necessary to correctly perform the request.
1662	</para></listitem>
1663	<listitem>
1664	<para>Work will be based on the existing work for Xinerama and
1665	Xnest.
1666	</para></listitem>
1667	<listitem>
1668	<para>Input events and windowing operations are handled in the
1669	proxy server and rendering requests are repackaged and sent to
1670	each of the back-end servers for display.
1671	</para></listitem>
1672	<listitem>
1673	<para>The multiple screen layout (including support for
1674	overlapping screens) will be user configurable via a
1675	configuration file or through the configuration tool.
1676	</para></listitem>
1677    </itemizedlist>
1678    </para></listitem>
1679    <listitem>
1680    <para>Develop graphical configuration tool
1681    <itemizedlist>
1682	<listitem>
	<para>There will potentially be a large number of X servers to
1684	configure into a single display.  The tool will allow the user
1685	to specify which servers are involved in the configuration and
1686	how they should be laid out.
1687	</para></listitem>
1688    </itemizedlist>
1689    </para></listitem>
1690    <listitem>
1691    <para>Pass the X Test Suite
1692    <itemizedlist>
1693	<listitem>
1694	<para>The X Test Suite covers the basic X11 operations.  All
1695	tests known to succeed must correctly operate in the distributed
1696	X environment.
1697	</para></listitem>
1698    </itemizedlist>
1699    </para></listitem>
1700</orderedlist>
1701
1702</para>
1703
1704<para>For this phase, the back-end X servers are assumed to be unmodified X
1705servers that do not support any DMX-related protocol extensions; future
1706optimization pathways are considered, but are not implemented; and the
1707configuration tool is assumed to rely only on libraries in the X source
1708tree (e.g., Xt).
1709</para>
1710</sect3>
1711
1712<sect3>
1713<title>Results</title>
1714
1715<para>The proxy X server, Xdmx, was developed to distribute X11 protocol
1716requests to the set of back-end X servers.  It opens a window on each
1717back-end server, which represents the part of the front-end's root
1718window that is visible on that screen.  It mirrors window, pixmap and
1719other state in each back-end server.  Drawing requests are sent to
1720either windows or pixmaps on each back-end server.  This code is based
1721on Xnest and uses the existing Xinerama extension.
1722</para>
1723
1724<para>Input events can be taken from (1) devices attached to the back-end
server, (2) core devices attached directly to the Xdmx server, or (3)
a <quote>console</quote> window on another X server.  Events for these devices
1727are gathered, processed and delivered to clients attached to the Xdmx
1728server.
1729</para>
1730
1731<para>An intuitive configuration format was developed to help the user
1732easily configure the multiple back-end X servers.  It was defined (see
1733grammar in Xdmx man page) and a parser was implemented that is used by
1734the Xdmx server and by a standalone xdmxconfig utility.  The parsing
1735support was implemented such that it can be easily factored out of the X
1736source tree for use with other tools (e.g., vdl).  Support for
1737converting legacy vdl-format configuration files to the DMX format is
1738provided by the vdltodmx utility.
1739</para>
1740
1741<para>Originally, the configuration file was going to be a subsection of
1742XFree86's XF86Config file, but that was not possible since Xdmx is a
1743completely separate X server.  Thus, a separate config file format was
1744developed.  In addition, a graphical configuration
1745tool, xdmxconfig, was developed to allow the user to create and arrange
1746the screens in the configuration file.  The <emphasis remap="bf">-configfile</emphasis> and <emphasis remap="bf">-config</emphasis>
1747command-line options can be used to start Xdmx using a configuration
1748file.
1749</para>
1750
1751<para>An extension that enables remote input testing is required for the X
1752Test Suite to function.  During this phase, this extension (XTEST) was
1753implemented in the Xdmx server.  The results from running the X Test
1754Suite are described in detail below.
1755</para>
1756</sect3>
1757
1758<sect3>
1759<title>X Test Suite</title>
1760
1761        <sect4>
1762          <title>Introduction</title>
1763            <para>
1764              The X Test Suite contains tests that verify Xlib functions
1765              operate correctly.  The test suite is designed to run on a
1766              single X server; however, since X applications will not be
1767              able to tell the difference between the DMX server and a
1768              standard X server, the X Test Suite should also run on the
1769              DMX server.
1770            </para>
1771            <para>
1772              The Xdmx server was tested with the X Test Suite, and the
1773              existing failures are noted in this section.  To put these
1774              results in perspective, we first discuss expected X Test
1775              failures and how errors in underlying systems can impact
1776              Xdmx test results.
1777            </para>
1778        </sect4>
1779
1780        <sect4>
1781          <title>Expected Failures for a Single Head</title>
1782            <para>
1783              A correctly implemented X server with a single screen is
1784              expected to fail certain X Test tests.  The following
1785              well-known errors occur because of rounding error in the X
1786              server code:
1787              <literallayout>
1788XDrawArc: Tests 42, 63, 66, 73
1789XDrawArcs: Tests 45, 66, 69, 76
1790              </literallayout>
1791            </para>
1792            <para>
1793              The following failures occur because of the high-level X
1794              server implementation:
1795              <literallayout>
1796XLoadQueryFont: Test 1
1797XListFontsWithInfo: Tests 3, 4
1798XQueryFont: Tests 1, 2
1799              </literallayout>
1800            </para>
1801            <para>
1802              The following test fails when running the X server as root
1803              under Linux because of the way directory modes are
1804              interpreted:
1805              <literallayout>
1806XWriteBitmapFile: Test 3
1807              </literallayout>
1808            </para>
1809            <para>
1810              Depending on the video card used for the back-end, other
1811              failures may also occur because of bugs in the low-level
1812              driver implementation.  Over time, failures of this kind
1813              are usually fixed by XFree86, but will show up in Xdmx
1814              testing until then.
1815            </para>
1816        </sect4>
1817
1818        <sect4>
1819          <title>Expected Failures for Xinerama</title>
1820            <para>
1821              Xinerama fails several X Test Suite tests because of
1822              design decisions made for the current implementation of
1823              Xinerama.  Over time, many of these errors will be
1824              corrected by XFree86 and the group working on a new
              Xinerama implementation.  Until then, because Xdmx is
              built on Xinerama, it will share these X Test Suite
              failures.
1827            </para>
1828
1829            <para>
1830              We may be able to fix or work-around some of these
1831              failures at the Xdmx level, but this will require
1832              additional exploration that was not part of Phase I.
1833            </para>
1834
1835            <para>
1836              Xinerama is constantly improving, and the list of
1837              Xinerama-related failures depends on XFree86 version and
1838              the underlying graphics hardware.  We tested with a
1839              variety of hardware, including nVidia, S3, ATI Radeon,
1840              and Matrox G400 (in dual-head mode).  The list below
1841              includes only those failures that appear to be from the
1842              Xinerama layer, and does not include failures listed in
1843              the previous section, or failures that appear to be from
1844              the low-level graphics driver itself:
1845            </para>
1846
1847            <para>
1848              These failures were noted with multiple Xinerama
1849              configurations:
1850              <literallayout>
1851XCopyPlane: Tests 13, 22, 31 (well-known Xinerama implementation issue)
1852XSetFontPath: Test 4
1853XGetDefault: Test 5
1854XMatchVisualInfo: Test 1
1855              </literallayout>
1856            </para>
1857            <para>
1858              These failures were noted only when using one dual-head
1859              video card with a 4.2.99.x XFree86 server:
1860              <literallayout>
1861XListPixmapFormats: Test 1
1862XDrawRectangles: Test 45
1863              </literallayout>
1864            </para>
1865            <para>
1866              These failures were noted only when using two video cards
1867              from different vendors with a 4.1.99.x XFree86 server:
1868              <literallayout>
1869XChangeWindowAttributes: Test 32
1870XCreateWindow: Test 30
1871XDrawLine: Test 22
1872XFillArc: Test 22
1873XChangeKeyboardControl: Tests 9, 10
1874XRebindKeysym: Test 1
1875              </literallayout>
1876            </para>
1877        </sect4>
1878
1879        <sect4>
1880	  <title>Additional Failures from Xdmx</title>
1881
1882            <para>
1883              When running Xdmx, no unexpected failures were noted.
1884              Since the Xdmx server is based on Xinerama, we expect to
1885              have most of the Xinerama failures present in the Xdmx
1886              server.  Similarly, since the Xdmx server must rely on the
1887              low-level device drivers on each back-end server, we also
1888              expect that Xdmx will exhibit most of the back-end
1889              failures.  Here is a summary:
1890              <literallayout>
1891XListPixmapFormats: Test 1 (configuration dependent)
1892XChangeWindowAttributes: Test 32
1893XCreateWindow: Test 30
XCopyPlane: Tests 13, 22, 31
1895XSetFontPath: Test 4
1896XGetDefault: Test 5 (configuration dependent)
1897XMatchVisualInfo: Test 1
1898XRebindKeysym: Test 1 (configuration dependent)
1899                </literallayout>
1900            </para>
1901            <para>
1902              Note that this list is shorter than the combined list for
1903              Xinerama because Xdmx uses different code paths to perform
1904              some Xinerama operations.  Further, some Xinerama failures
1905              have been fixed in the XFree86 4.2.99.x CVS repository.
1906            </para>
1907        </sect4>
1908
1909        <sect4>
1910          <title>Summary and Future Work</title>
1911
1912            <para>
1913              Running the X Test Suite on Xdmx does not produce any
1914              failures that cannot be accounted for by the underlying
1915              Xinerama subsystem used by the front-end or by the
1916              low-level device-driver code running on the back-end X
              servers.  The Xdmx server therefore is as <quote>correct</quote> as
1918              possible with respect to the standard set of X Test Suite
1919              tests.
1920            </para>
1921
1922            <para>
1923              During the following phases, we will continue to verify
1924              Xdmx correctness using the X Test Suite.  We may also use
              other test suites or write additional tests that run
1926              under the X Test Suite that specifically verify the
1927              expected behavior of DMX.
1928            </para>
1929        </sect4>
1930</sect3>
1931
1932<sect3>
1933<title>Fonts</title>
1934
1935<para>In Phase I, fonts are handled directly by both the front-end and the
1936back-end servers, which is required since we must treat each back-end
server during this phase as a <quote>black box</quote>.  What this requires is that
1938<emphasis remap="bf">the front- and back-end servers must share the exact same font
1939path</emphasis>.  There are two ways to help make sure that all servers share the
1940same font path:
1941
1942<orderedlist>
1943  <listitem>
1944    <para>First, each server can be configured to use the same font
1945    server.  The font server, xfs, can be configured to serve fonts to
1946    multiple X servers via TCP.
1947    </para></listitem>
1948
1949  <listitem>
1950    <para>Second, each server can be configured to use the same font
1951    path and either those font paths can be copied to each back-end
1952    machine or they can be mounted (e.g., via NFS) on each back-end
1953    machine.
1954    </para></listitem>
1955</orderedlist>
1956</para>
1957
1958<para>One additional concern is that a client program can set its own font
1959path, and if it does so, then that font path must be available on each
1960back-end machine.
1961</para>
1962
1963<para>The -fontpath command line option was added to allow users to
initialize the font path of the front-end server.  This font path is
1965propagated to each back-end server when the default font is loaded.  If
1966there are any problems, an error message is printed, which will describe
1967the problem and list the current font path.  For more information about
1968setting the font path, see the -fontpath option description in the man
1969page.
1970</para>
1971</sect3>
1972
1973<sect3>
1974<title>Performance</title>
1975
1976<para>Phase I of development was not intended to optimize performance.  Its
1977focus was on completely and correctly handling the base X11 protocol in
1978the Xdmx server.  However, several insights were gained during Phase I,
1979which are listed here for reference during the next phase of
1980development.
1981</para>
1982
1983<orderedlist>
1984  <listitem>
1985    <para>Calls to XSync() can slow down rendering since it requires a
1986    complete round trip to and from a back-end server.  This is
    especially problematic when communicating over long-haul networks.
1988    </para></listitem>
1989
1990  <listitem>
1991    <para>Sending drawing requests to only the screens that they overlap
1992    should improve performance.
1993    </para></listitem>
1994</orderedlist>
1995</sect3>
1996
1997<sect3>
1998<title>Pixmaps</title>
1999
2000<para>Pixmaps were originally expected to be handled entirely in the
2001front-end X server; however, it was found that this overly complicated
2002the rendering code and would have required sending potentially large
images to each back-end server that required them when copying from pixmap
2004to screen.  Thus, pixmap state is mirrored in the back-end server just
2005as it is with regular window state.  With this implementation, the same
2006rendering code that draws to windows can be used to draw to pixmaps on
2007the back-end server, and no large image transfers are required to copy
2008from pixmap to window.
2009</para>
2010
2011</sect3>
2012
2013</sect2>
2014
2015<!-- ============================================================ -->
2016<sect2>
2017<title>Phase II</title>
2018
2019<para>The second phase of development concentrates on performance
2020optimizations.  These optimizations are documented here, with
2021<command>x11perf</command> data to show how the optimizations improve performance.
2022</para>
2023
2024<para>All benchmarks were performed by running Xdmx on a dual processor
20251.4GHz AMD Athlon machine with 1GB of RAM connecting over 100baseT to
2026two single-processor 1GHz Pentium III machines with 256MB of RAM and ATI
2027Rage 128 (RF) video cards.  The front end was running Linux
20282.4.20-pre1-ac1 and the back ends were running Linux 2.4.7-10 and
2029version 4.2.99.1 of XFree86 pulled from the XFree86 CVS repository on
2030August 7, 2002.  All systems were running Red Hat Linux 7.2.
2031</para>
2032
2033<sect3>
2034<title>Moving from XFree86 4.1.99.1 to 4.2.0.0</title>
2035
<para>For Phase II, the working source tree was moved to the branch tagged
2037with dmx-1-0-branch and was updated from version 4.1.99.1 (20 August
20382001) of the XFree86 sources to version 4.2.0.0 (18 January 2002).
2039After this update, the following tests were noted to be more than 10%
2040faster:
2041<screen>
20421.13   Fill 300x300 opaque stippled trapezoid (161x145 stipple)
20431.16   Fill 1x1 tiled trapezoid (161x145 tile)
20441.13   Fill 10x10 tiled trapezoid (161x145 tile)
20451.17   Fill 100x100 tiled trapezoid (161x145 tile)
20461.16   Fill 1x1 tiled trapezoid (216x208 tile)
20471.20   Fill 10x10 tiled trapezoid (216x208 tile)
20481.15   Fill 100x100 tiled trapezoid (216x208 tile)
20491.37   Circulate Unmapped window (200 kids)
2050</screen>
2051And the following tests were noted to be more than 10% slower:
2052<screen>
20530.88   Unmap window via parent (25 kids)
20540.75   Circulate Unmapped window (4 kids)
20550.79   Circulate Unmapped window (16 kids)
20560.80   Circulate Unmapped window (25 kids)
20570.82   Circulate Unmapped window (50 kids)
20580.85   Circulate Unmapped window (75 kids)
2059</screen>
2060</para>
2061
2062<para>These changes were not caused by any changes in the DMX system, and
2063may point to changes in the XFree86 tree or to tests that have more
2064"jitter" than most other <command>x11perf</command> tests.
2065</para>
2066</sect3>
2067
2068<sect3>
2069<title>Global changes</title>
2070
2071<para>During the development of the Phase II DMX server, several global
2072changes were made.  These changes were also compared with the Phase I
2073server.  The following tests were noted to be more than 10% faster:
2074<screen>
20751.13   Fill 300x300 opaque stippled trapezoid (161x145 stipple)
20761.15   Fill 1x1 tiled trapezoid (161x145 tile)
20771.13   Fill 10x10 tiled trapezoid (161x145 tile)
20781.17   Fill 100x100 tiled trapezoid (161x145 tile)
20791.16   Fill 1x1 tiled trapezoid (216x208 tile)
20801.19   Fill 10x10 tiled trapezoid (216x208 tile)
20811.15   Fill 100x100 tiled trapezoid (216x208 tile)
20821.15   Circulate Unmapped window (4 kids)
2083</screen>
2084</para>
2085
2086<para>The following tests were noted to be more than 10% slower:
2087<screen>
20880.69   Scroll 10x10 pixels
20890.68   Scroll 100x100 pixels
20900.68   Copy 10x10 from window to window
20910.68   Copy 100x100 from window to window
20920.76   Circulate Unmapped window (75 kids)
20930.83   Circulate Unmapped window (100 kids)
2094</screen>
2095</para>
2096
2097<para>For the remainder of this analysis, the baseline of comparison will
2098be the Phase II deliverable with all optimizations disabled (unless
2099otherwise noted).  This will highlight how the optimizations in
2100isolation impact performance.
2101</para>
2102</sect3>
2103
2104<sect3>
2105<title>XSync() Batching</title>
2106
2107<para>During the Phase I implementation, XSync() was called after every
2108protocol request made by the DMX server.  This provided the DMX server
2109with an interactive feel, but defeated X11's protocol buffering system
2110and introduced round-trip wire latency into every operation.  During
2111Phase II, DMX was changed so that protocol requests are no longer
2112followed by calls to XSync().  Instead, the need for an XSync() is
noted, and XSync() calls are only made every 100 ms or when the DMX
server specifically needs to make a call to guarantee interactivity.
With this new system, X11 buffers protocol as much as possible during a
100 ms interval, and many unnecessary XSync() calls are avoided.
2117</para>
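<para>The batching policy can be sketched as follows.  This is a minimal
illustration under stated assumptions: the SyncState structure and the
function names are hypothetical, time is passed in as a plain
millisecond counter, and the actual round trip to the back-end server is
replaced by a counter.  It is not the Xdmx implementation.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>

#define SYNC_INTERVAL_MS 100   /* batching interval from the text above */

/* Hypothetical per-back-end bookkeeping. */
typedef struct {
    bool sync_needed;    /* a request was sent since the last sync */
    long last_sync_ms;   /* time of the last round trip */
    int  sync_count;     /* round trips actually made */
} SyncState;

/* Stand-in for the real round trip to the back-end server. */
static void do_sync(SyncState *s, long now_ms)
{
    s->sync_count++;
    s->sync_needed = false;
    s->last_sync_ms = now_ms;
}

/* Called after a protocol request is sent: only note that a sync is due. */
static void note_sync_needed(SyncState *s)
{
    s->sync_needed = true;
}

/* Called periodically: sync at most once per interval, and only if needed. */
static void sync_if_due(SyncState *s, long now_ms)
{
    assert(now_ms >= s->last_sync_ms);
    if (s->sync_needed && now_ms - s->last_sync_ms >= SYNC_INTERVAL_MS)
        do_sync(s, now_ms);
}

/* Called when interactivity requires an immediate round trip. */
static void sync_now(SyncState *s, long now_ms)
{
    if (s->sync_needed)
        do_sync(s, now_ms);
}
```
]]></programlisting>
<para>With this policy, fifty requests issued within the first 50 ms
cause no round trips at all, and a single round trip is made once the
100 ms interval has elapsed.
</para>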
2118
<para>Out of more than 300 <command>x11perf</command> tests, 8 tests became more than 100X
faster, with 68 more than 50X faster, 114 more than 10X faster,
and 181 more than 2X faster.  See the table below for a summary.
2122</para>
2123
2124<para>The following tests were noted to be more than 10% slower with
2125XSync() batching on:
2126<screen>
21270.88   500x500 tiled rectangle (161x145 tile)
21280.89   Copy 500x500 from window to window
2129</screen>
2130</para>
2131</sect3>
2132
2133<sect3>
2134<title>Offscreen Optimization</title>
2135
2136<para>Windows span one or more of the back-end servers' screens; however,
2137during Phase I development, windows were created on every back-end
2138server and every rendering request was sent to every window regardless
2139of whether or not that window was visible.  With the offscreen
2140optimization, the DMX server tracks when a window is completely off of a
2141back-end server's screen and, in that case, it does not send rendering
2142requests to those back-end windows.  This optimization saves bandwidth
2143between the front and back-end servers, and it reduces the number of
2144XSync() calls.  The performance tests were run on a DMX system with only
two back-end servers.  Greater performance gains are expected as the
number of back-end servers increases.
2147</para>
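<para>The offscreen test can be sketched as a rectangle overlap check
between a window and each back-end screen.  The Rect type and the
function names below are hypothetical and were chosen for this example.
The sketch also considers only the window's bounding rectangle, which is
an assumption and may be simpler than what the DMX server tracks.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical rectangle in the coordinates of the combined DMX screen. */
typedef struct { int x, y, width, height; } Rect;

/* True when the two rectangles share at least one pixel. */
static bool rect_overlaps(const Rect *a, const Rect *b)
{
    return a->x < b->x + b->width  && b->x < a->x + a->width &&
           a->y < b->y + b->height && b->y < a->y + a->height;
}

/* Mark the back-ends a rendering request must be sent to; returns how many. */
static int select_backends(const Rect *window, const Rect *screens,
                           int nscreens, bool *send)
{
    int i, count = 0;
    assert(nscreens >= 0);
    for (i = 0; i < nscreens; i++) {
        send[i] = rect_overlaps(window, &screens[i]);
        if (send[i])
            count++;
    }
    return count;
}
```
]]></programlisting>
<para>For two 1280x1024 screens placed side by side, a 200x200 window at
(100, 100) selects only the first back-end, while the same window at
(1200, 100) straddles the boundary and selects both.
</para>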
2148
2149<para>Out of more than 300 <command>x11perf</command> tests, 3 tests were at least twice as
2150fast, and 146 tests were at least 10% faster.  Two tests were more than
215110% slower with the offscreen optimization:
2152<screen>
21530.88   Hide/expose window via popup (4 kids)
21540.89   Resize unmapped window (75 kids)
2155</screen>
2156</para>
2157</sect3>
2158
2159<sect3>
2160<title>Lazy Window Creation Optimization</title>
2161
2162<para>As mentioned above, during Phase I, windows were created on every
2163back-end server even if they were not visible on that back-end.  With
2164the lazy window creation optimization, the DMX server does not create
2165windows on a back-end server until they are either visible or they
2166become the parents of a visible window.  This optimization builds on the
2167offscreen optimization (described above) and requires it to be enabled.
2168</para>
2169
2170<para>The lazy window creation optimization works by creating the window
2171data structures in the front-end server when a client creates a window,
2172but delays creation of the window on the back-end server(s).  A private
2173window structure in the DMX server saves the relevant window data and
2174tracks changes to the window's attributes and stacking order for later
2175use.  The only times a window is created on a back-end server are (1)
2176when it is mapped and is at least partially overlapping the back-end
2177server's screen (tracked by the offscreen optimization), or (2) when the
2178window becomes the parent of a previously visible window.  The first
2179case occurs when a window is mapped or when a visible window is copied,
2180moved or resized and now overlaps the back-end server's screen.  The
2181second case occurs when starting a window manager after having created
2182windows to which the window manager needs to add decorations.
2183</para>
2184
2185<para>When either case occurs, a window on the back-end server is created
2186using the data saved in the DMX server's window private data structure.
2187The stacking order is then adjusted to correctly place the window on the
2188back-end and lastly the window is mapped.  From this time forward, the
2189window is handled exactly as if the window had been created at the time
2190of the client's request.
2191</para>
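<para>The first creation case can be sketched as follows.  This is an
illustration only: the LazyWindow structure, the MAX_BACKENDS limit and
the function names are hypothetical, and the second case (a window
becoming the parent of a previously visible window) is left out to keep
the example short.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BACKENDS 8   /* hypothetical limit for this sketch */

typedef struct { int x, y, width, height; } Rect;

/* Hypothetical front-end private data for one client window. */
typedef struct {
    Rect geometry;               /* saved when the client creates the window */
    bool mapped;
    bool created[MAX_BACKENDS];  /* does a back-end window exist yet? */
} LazyWindow;

/* True when the two rectangles share at least one pixel. */
static bool rect_overlaps(const Rect *a, const Rect *b)
{
    return a->x < b->x + b->width  && b->x < a->x + a->width &&
           a->y < b->y + b->height && b->y < a->y + a->height;
}

/* Re-evaluate after a map, move or resize.
 * Returns the number of back-end windows newly created. */
static int lazy_update(LazyWindow *w, const Rect *screens, int nscreens)
{
    int i, created = 0;
    assert(nscreens >= 0 && nscreens <= MAX_BACKENDS);
    for (i = 0; i < nscreens; i++) {
        if (w->created[i])
            continue;   /* already exists on this back-end */
        if (w->mapped && rect_overlaps(&w->geometry, &screens[i])) {
            /* The real server would create, restack and map here. */
            w->created[i] = true;
            created++;
        }
    }
    return created;
}
```
]]></programlisting>
<para>In this sketch an unmapped window creates nothing on any back-end,
mapping it creates it only where it overlaps a screen, and moving it
onto a second screen creates one more back-end window.
</para>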
2192
2193<para>Note that when a window is no longer visible on a back-end server's
2194screen (e.g., it is moved offscreen), the window is not destroyed;
2195rather, it is kept and reused later if the window once again becomes
visible on the back-end server's screen.  An earlier version of this
optimization destroyed such windows, but that approach was rejected
because it increased bandwidth when windows were opaquely moved or
resized, which is common in many window managers.
2200</para>
2201
<para>The performance tests were run on a DMX system with only two back-end
servers.  Greater performance gains are expected as the number of
back-end servers increases.
2205</para>
2206
2207<para>This optimization improved the following <command>x11perf</command> tests by more
2208than 10%:
2209<screen>
22101.10   500x500 rectangle outline
22111.12   Fill 100x100 stippled trapezoid (161x145 stipple)
22121.20   Circulate Unmapped window (50 kids)
22131.19   Circulate Unmapped window (75 kids)
2214</screen>
2215</para>
2216</sect3>
2217
2218<sect3>
2219<title>Subdividing Rendering Primitives</title>
2220
2221<para>X11 imaging requests transfer significant data between the client and
2222the X server.  During Phase I, the DMX server would then transfer the
2223image data to each back-end server.  Even with the offscreen
2224optimization (above), these requests still required transferring
2225significant data to each back-end server that contained a visible
2226portion of the window.  For example, if the client uses XPutImage() to
2227copy an image to a window that overlaps the entire DMX screen, then the
2228entire image is copied by the DMX server to every back-end server.
2229</para>
2230
2231<para>To reduce the amount of data transferred between the DMX server and
2232the back-end servers when XPutImage() is called, the image data is
2233subdivided and only the data that will be visible on a back-end server's
2234screen is sent to that back-end server.  Xinerama already implements a
2235subdivision algorithm for XGetImage() and no further optimization was
2236needed.
2237</para>
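<para>The subdivision step can be illustrated with a short C sketch.  It
is a simplified illustration, not the Xdmx code: the Rect and SubImage
types and the subdivide_put_image() function are hypothetical, all
rectangles are assumed to be in global DMX screen coordinates, and
clipping against the window itself is ignored.  The resulting fields
correspond to the src_x, src_y, dest_x, dest_y, width, and height
arguments that XPutImage() accepts.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
typedef struct { int x, y, w, h; } Rect;

typedef struct {
    int src_x, src_y; /* offset of the visible part within the image */
    int dst_x, dst_y; /* destination, relative to the back-end screen */
    int w, h;         /* size of the visible part */
} SubImage;

static int imax(int a, int b) { return a > b ? a : b; }
static int imin(int a, int b) { return a < b ? a : b; }

/* Intersect the image's destination rectangle with one back-end
 * screen.  Returns false when no part of the image is visible there,
 * in which case nothing needs to be sent to that back-end server. */
static bool subdivide_put_image(Rect image, Rect screen, SubImage *out)
{
    assert(out != NULL);

    int left   = imax(image.x, screen.x);
    int top    = imax(image.y, screen.y);
    int right  = imin(image.x + image.w, screen.x + screen.w);
    int bottom = imin(image.y + image.h, screen.y + screen.h);

    if (left >= right || top >= bottom)
        return false;

    out->src_x = left - image.x;
    out->src_y = top - image.y;
    out->dst_x = left - screen.x;
    out->dst_y = top - screen.y;
    out->w = right - left;
    out->h = bottom - top;
    return true;
}
```
]]></programlisting>

<para>For example, with two 1280x1024 back-end screens placed side by
side, a 500x500 image whose left edge is at x=1000 is split into a
280-pixel-wide piece for the first screen and a 220-pixel-wide piece for
the second, so each back-end server receives only the part it displays.
</para>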
2238
2239<para>Other rendering primitives were analyzed, but the time required to
2240subdivide these primitives was a significant proportion of the time
2241required to send the entire rendering request to the back-end server, so
2242this optimization was rejected for the other rendering primitives.
2243</para>
2244
<para>Again, the performance tests were run on a DMX system with only two
back-end servers.  Greater performance gains are expected as the number
of back-end servers increases.
2248</para>
2249
2250<para>This optimization improved the following <command>x11perf</command> tests by more
2251than 10%:
2252<screen>
22531.12   Fill 100x100 stippled trapezoid (161x145 stipple)
22541.26   PutImage 10x10 square
22551.83   PutImage 100x100 square
22561.91   PutImage 500x500 square
22571.40   PutImage XY 10x10 square
22581.48   PutImage XY 100x100 square
22591.50   PutImage XY 500x500 square
22601.45   Circulate Unmapped window (75 kids)
22611.74   Circulate Unmapped window (100 kids)
2262</screen>
2263</para>
2264
2265<para>The following test was noted to be more than 10% slower with this
2266optimization:
2267<screen>
22680.88   10-pixel fill chord partial circle
2269</screen>
2270</para>
2271</sect3>
2272
2273<sect3>
2274<title>Summary of x11perf Data</title>
2275
<para>With all of the optimizations on, 53 <command>x11perf</command> tests were more than
100X faster than in the unoptimized Phase II deliverable, 69 were more
than 50X faster, 73 were more than 10X faster, and 199 were more than
twice as fast.  No tests were more than 10% slower than in the
unoptimized Phase II deliverable.  (Compared with the Phase I
deliverable, only Circulate Unmapped window (100 kids) was more than 10%
slower in the Phase II deliverable.  As noted above, this test seems to
have wider variability than other <command>x11perf</command> tests.)
2284</para>
2285
2286<para>The following table summarizes relative <command>x11perf</command> test changes for
2287all optimizations individually and collectively.  Note that some of the
2288optimizations have a synergistic effect when used together.
2289<screen>
2290
1: XSync() batching only
2: Offscreen optimization only
3: Lazy window creation optimization only
4: Subdividing rendering primitives (subdivprims) only
5: All optimizations
2296
2297    1     2    3    4      5 Operation
2298------ ---- ---- ---- ------ ---------
2299  2.14 1.85 1.00 1.00   4.13 Dot
2300  1.67 1.80 1.00 1.00   3.31 1x1 rectangle
2301  2.38 1.43 1.00 1.00   2.44 10x10 rectangle
2302  1.00 1.00 0.92 0.98   1.00 100x100 rectangle
2303  1.00 1.00 1.00 1.00   1.00 500x500 rectangle
2304  1.83 1.85 1.05 1.06   3.54 1x1 stippled rectangle (8x8 stipple)
2305  2.43 1.43 1.00 1.00   2.41 10x10 stippled rectangle (8x8 stipple)
2306  0.98 1.00 1.00 1.00   1.00 100x100 stippled rectangle (8x8 stipple)
2307  1.00 1.00 1.00 1.00   0.98 500x500 stippled rectangle (8x8 stipple)
2308  1.75 1.75 1.00 1.00   3.40 1x1 opaque stippled rectangle (8x8 stipple)
2309  2.38 1.42 1.00 1.00   2.34 10x10 opaque stippled rectangle (8x8 stipple)
2310  1.00 1.00 0.97 0.97   1.00 100x100 opaque stippled rectangle (8x8 stipple)
2311  1.00 1.00 1.00 1.00   0.99 500x500 opaque stippled rectangle (8x8 stipple)
2312  1.82 1.82 1.04 1.04   3.56 1x1 tiled rectangle (4x4 tile)
2313  2.33 1.42 1.00 1.00   2.37 10x10 tiled rectangle (4x4 tile)
2314  1.00 0.92 1.00 1.00   1.00 100x100 tiled rectangle (4x4 tile)
2315  1.00 1.00 1.00 1.00   1.00 500x500 tiled rectangle (4x4 tile)
2316  1.94 1.62 1.00 1.00   3.66 1x1 stippled rectangle (17x15 stipple)
2317  1.74 1.28 1.00 1.00   1.73 10x10 stippled rectangle (17x15 stipple)
2318  1.00 1.00 1.00 0.89   0.98 100x100 stippled rectangle (17x15 stipple)
2319  1.00 1.00 1.00 1.00   0.98 500x500 stippled rectangle (17x15 stipple)
2320  1.94 1.62 1.00 1.00   3.67 1x1 opaque stippled rectangle (17x15 stipple)
2321  1.69 1.26 1.00 1.00   1.66 10x10 opaque stippled rectangle (17x15 stipple)
2322  1.00 0.95 1.00 1.00   1.00 100x100 opaque stippled rectangle (17x15 stipple)
2323  1.00 1.00 1.00 1.00   0.97 500x500 opaque stippled rectangle (17x15 stipple)
2324  1.93 1.61 0.99 0.99   3.69 1x1 tiled rectangle (17x15 tile)
2325  1.73 1.27 1.00 1.00   1.72 10x10 tiled rectangle (17x15 tile)
2326  1.00 1.00 1.00 1.00   0.98 100x100 tiled rectangle (17x15 tile)
2327  1.00 1.00 0.97 0.97   1.00 500x500 tiled rectangle (17x15 tile)
2328  1.95 1.63 1.00 1.00   3.83 1x1 stippled rectangle (161x145 stipple)
2329  1.80 1.30 1.00 1.00   1.83 10x10 stippled rectangle (161x145 stipple)
2330  0.97 1.00 1.00 1.00   1.01 100x100 stippled rectangle (161x145 stipple)
2331  1.00 1.00 1.00 1.00   0.98 500x500 stippled rectangle (161x145 stipple)
2332  1.95 1.63 1.00 1.00   3.56 1x1 opaque stippled rectangle (161x145 stipple)
2333  1.65 1.25 1.00 1.00   1.68 10x10 opaque stippled rectangle (161x145 stipple)
2334  1.00 1.00 1.00 1.00   1.01 100x100 opaque stippled rectangle (161x145...
2335  1.00 1.00 1.00 1.00   0.97 500x500 opaque stippled rectangle (161x145...
2336  1.95 1.63 0.98 0.99   3.80 1x1 tiled rectangle (161x145 tile)
2337  1.67 1.26 1.00 1.00   1.67 10x10 tiled rectangle (161x145 tile)
2338  1.13 1.14 1.14 1.14   1.14 100x100 tiled rectangle (161x145 tile)
2339  0.88 1.00 1.00 1.00   0.99 500x500 tiled rectangle (161x145 tile)
2340  1.93 1.63 1.00 1.00   3.53 1x1 tiled rectangle (216x208 tile)
2341  1.69 1.26 1.00 1.00   1.66 10x10 tiled rectangle (216x208 tile)
2342  1.00 1.00 1.00 1.00   1.00 100x100 tiled rectangle (216x208 tile)
2343  1.00 1.00 1.00 1.00   1.00 500x500 tiled rectangle (216x208 tile)
2344  1.82 1.70 1.00 1.00   3.38 1-pixel line segment
2345  2.07 1.56 0.90 1.00   3.31 10-pixel line segment
2346  1.29 1.10 1.00 1.00   1.27 100-pixel line segment
2347  1.05 1.06 1.03 1.03   1.09 500-pixel line segment
2348  1.30 1.13 1.00 1.00   1.29 100-pixel line segment (1 kid)
2349  1.32 1.15 1.00 1.00   1.32 100-pixel line segment (2 kids)
2350  1.33 1.16 1.00 1.00   1.33 100-pixel line segment (3 kids)
2351  1.92 1.64 1.00 1.00   3.73 10-pixel dashed segment
2352  1.34 1.16 1.00 1.00   1.34 100-pixel dashed segment
2353  1.24 1.11 0.99 0.97   1.23 100-pixel double-dashed segment
2354  1.72 1.77 1.00 1.00   3.25 10-pixel horizontal line segment
2355  1.83 1.66 1.01 1.00   3.54 100-pixel horizontal line segment
2356  1.86 1.30 1.00 1.00   1.84 500-pixel horizontal line segment
2357  2.11 1.52 1.00 0.99   3.02 10-pixel vertical line segment
2358  1.21 1.10 1.00 1.00   1.20 100-pixel vertical line segment
2359  1.03 1.03 1.00 1.00   1.02 500-pixel vertical line segment
2360  4.42 1.68 1.00 1.01   4.64 10x1 wide horizontal line segment
2361  1.83 1.31 1.00 1.00   1.83 100x10 wide horizontal line segment
2362  1.07 1.00 0.96 1.00   1.07 500x50 wide horizontal line segment
2363  4.10 1.67 1.00 1.00   4.62 10x1 wide vertical line segment
2364  1.50 1.24 1.06 1.06   1.48 100x10 wide vertical line segment
2365  1.06 1.03 1.00 1.00   1.05 500x50 wide vertical line segment
2366  2.54 1.61 1.00 1.00   3.61 1-pixel line
2367  2.71 1.48 1.00 1.00   2.67 10-pixel line
2368  1.19 1.09 1.00 1.00   1.19 100-pixel line
2369  1.04 1.02 1.00 1.00   1.03 500-pixel line
2370  2.68 1.51 0.98 1.00   3.17 10-pixel dashed line
2371  1.23 1.11 0.99 0.99   1.23 100-pixel dashed line
2372  1.15 1.08 1.00 1.00   1.15 100-pixel double-dashed line
2373  2.27 1.39 1.00 1.00   2.23 10x1 wide line
2374  1.20 1.09 1.00 1.00   1.20 100x10 wide line
2375  1.04 1.02 1.00 1.00   1.04 500x50 wide line
2376  1.52 1.45 1.00 1.00   1.52 100x10 wide dashed line
2377  1.54 1.47 1.00 1.00   1.54 100x10 wide double-dashed line
2378  1.97 1.30 0.96 0.95   1.95 10x10 rectangle outline
2379  1.44 1.27 1.00 1.00   1.43 100x100 rectangle outline
2380  3.22 2.16 1.10 1.09   3.61 500x500 rectangle outline
2381  1.95 1.34 1.00 1.00   1.90 10x10 wide rectangle outline
2382  1.14 1.14 1.00 1.00   1.13 100x100 wide rectangle outline
2383  1.00 1.00 1.00 1.00   1.00 500x500 wide rectangle outline
2384  1.57 1.72 1.00 1.00   3.03 1-pixel circle
2385  1.96 1.35 1.00 1.00   1.92 10-pixel circle
2386  1.21 1.07 0.86 0.97   1.20 100-pixel circle
2387  1.08 1.04 1.00 1.00   1.08 500-pixel circle
2388  1.39 1.19 1.03 1.03   1.38 100-pixel dashed circle
2389  1.21 1.11 1.00 1.00   1.23 100-pixel double-dashed circle
2390  1.59 1.28 1.00 1.00   1.58 10-pixel wide circle
2391  1.22 1.12 0.99 1.00   1.22 100-pixel wide circle
2392  1.06 1.04 1.00 1.00   1.05 500-pixel wide circle
2393  1.87 1.84 1.00 1.00   1.85 100-pixel wide dashed circle
2394  1.90 1.93 1.01 1.01   1.90 100-pixel wide double-dashed circle
2395  2.13 1.43 1.00 1.00   2.32 10-pixel partial circle
2396  1.42 1.18 1.00 1.00   1.42 100-pixel partial circle
2397  1.92 1.85 1.01 1.01   1.89 10-pixel wide partial circle
2398  1.73 1.67 1.00 1.00   1.73 100-pixel wide partial circle
2399  1.36 1.95 1.00 1.00   2.64 1-pixel solid circle
2400  2.02 1.37 1.00 1.00   2.03 10-pixel solid circle
2401  1.19 1.09 1.00 1.00   1.19 100-pixel solid circle
2402  1.02 0.99 1.00 1.00   1.01 500-pixel solid circle
2403  1.74 1.28 1.00 0.88   1.73 10-pixel fill chord partial circle
2404  1.31 1.13 1.00 1.00   1.31 100-pixel fill chord partial circle
2405  1.67 1.31 1.03 1.03   1.72 10-pixel fill slice partial circle
2406  1.30 1.13 1.00 1.00   1.28 100-pixel fill slice partial circle
2407  2.45 1.49 1.01 1.00   2.71 10-pixel ellipse
2408  1.22 1.10 1.00 1.00   1.22 100-pixel ellipse
2409  1.09 1.04 1.00 1.00   1.09 500-pixel ellipse
2410  1.90 1.28 1.00 1.00   1.89 100-pixel dashed ellipse
2411  1.62 1.24 0.96 0.97   1.61 100-pixel double-dashed ellipse
2412  2.43 1.50 1.00 1.00   2.42 10-pixel wide ellipse
2413  1.61 1.28 1.03 1.03   1.60 100-pixel wide ellipse
2414  1.08 1.05 1.00 1.00   1.08 500-pixel wide ellipse
2415  1.93 1.88 1.00 1.00   1.88 100-pixel wide dashed ellipse
2416  1.94 1.89 1.01 1.00   1.94 100-pixel wide double-dashed ellipse
2417  2.31 1.48 1.00 1.00   2.67 10-pixel partial ellipse
2418  1.38 1.17 1.00 1.00   1.38 100-pixel partial ellipse
2419  2.00 1.85 0.98 0.97   1.98 10-pixel wide partial ellipse
2420  1.89 1.86 1.00 1.00   1.89 100-pixel wide partial ellipse
2421  3.49 1.60 1.00 1.00   3.65 10-pixel filled ellipse
2422  1.67 1.26 1.00 1.00   1.67 100-pixel filled ellipse
2423  1.06 1.04 1.00 1.00   1.06 500-pixel filled ellipse
2424  2.38 1.43 1.01 1.00   2.32 10-pixel fill chord partial ellipse
2425  2.06 1.30 1.00 1.00   2.05 100-pixel fill chord partial ellipse
2426  2.27 1.41 1.00 1.00   2.27 10-pixel fill slice partial ellipse
2427  1.98 1.33 1.00 0.97   1.97 100-pixel fill slice partial ellipse
2428 57.46 1.99 1.01 1.00 114.92 Fill 1x1 equivalent triangle
2429 56.94 1.98 1.01 1.00  73.89 Fill 10x10 equivalent triangle
2430  6.07 1.75 1.00 1.00   6.07 Fill 100x100 equivalent triangle
2431 51.12 1.98 1.00 1.00 102.81 Fill 1x1 trapezoid
2432 51.42 1.82 1.01 1.00  94.89 Fill 10x10 trapezoid
2433  6.47 1.80 1.00 1.00   6.44 Fill 100x100 trapezoid
2434  1.56 1.28 1.00 0.99   1.56 Fill 300x300 trapezoid
2435 51.27 1.97 0.96 0.97 102.54 Fill 1x1 stippled trapezoid (8x8 stipple)
2436 51.73 2.00 1.02 1.02  67.92 Fill 10x10 stippled trapezoid (8x8 stipple)
2437  5.36 1.72 1.00 1.00   5.36 Fill 100x100 stippled trapezoid (8x8 stipple)
2438  1.54 1.26 1.00 1.00   1.59 Fill 300x300 stippled trapezoid (8x8 stipple)
2439 51.41 1.94 1.01 1.00 102.82 Fill 1x1 opaque stippled trapezoid (8x8 stipple)
2440 50.71 1.95 0.99 1.00  65.44 Fill 10x10 opaque stippled trapezoid (8x8...
2441  5.33 1.73 1.00 1.00   5.36 Fill 100x100 opaque stippled trapezoid (8x8...
2442  1.58 1.25 1.00 1.00   1.58 Fill 300x300 opaque stippled trapezoid (8x8...
2443 51.56 1.96 0.99 0.90 103.68 Fill 1x1 tiled trapezoid (4x4 tile)
2444 51.59 1.99 1.01 1.01  62.25 Fill 10x10 tiled trapezoid (4x4 tile)
2445  5.38 1.72 1.00 1.00   5.38 Fill 100x100 tiled trapezoid (4x4 tile)
2446  1.54 1.25 1.00 0.99   1.58 Fill 300x300 tiled trapezoid (4x4 tile)
2447 51.70 1.98 1.01 1.01 103.98 Fill 1x1 stippled trapezoid (17x15 stipple)
2448 44.86 1.97 1.00 1.00  44.86 Fill 10x10 stippled trapezoid (17x15 stipple)
2449  2.74 1.56 1.00 1.00   2.73 Fill 100x100 stippled trapezoid (17x15 stipple)
2450  1.29 1.14 1.00 1.00   1.27 Fill 300x300 stippled trapezoid (17x15 stipple)
2451 51.41 1.96 0.96 0.95 103.39 Fill 1x1 opaque stippled trapezoid (17x15...
2452 45.14 1.96 1.01 1.00  45.14 Fill 10x10 opaque stippled trapezoid (17x15...
2453  2.68 1.56 1.00 1.00   2.68 Fill 100x100 opaque stippled trapezoid (17x15...
2454  1.26 1.10 1.00 1.00   1.28 Fill 300x300 opaque stippled trapezoid (17x15...
2455 51.13 1.97 1.00 0.99 103.39 Fill 1x1 tiled trapezoid (17x15 tile)
2456 47.58 1.96 1.00 1.00  47.86 Fill 10x10 tiled trapezoid (17x15 tile)
2457  2.74 1.56 1.00 1.00   2.74 Fill 100x100 tiled trapezoid (17x15 tile)
2458  1.29 1.14 1.00 1.00   1.28 Fill 300x300 tiled trapezoid (17x15 tile)
2459 51.13 1.97 0.99 0.97 103.39 Fill 1x1 stippled trapezoid (161x145 stipple)
2460 45.14 1.97 1.00 1.00  44.29 Fill 10x10 stippled trapezoid (161x145 stipple)
2461  3.02 1.77 1.12 1.12   3.38 Fill 100x100 stippled trapezoid (161x145 stipple)
2462  1.31 1.13 1.00 1.00   1.30 Fill 300x300 stippled trapezoid (161x145 stipple)
2463 51.27 1.97 1.00 1.00 103.10 Fill 1x1 opaque stippled trapezoid (161x145...
2464 45.01 1.97 1.00 1.00  45.01 Fill 10x10 opaque stippled trapezoid (161x145...
  2.67 1.56 1.00 1.00   2.69 Fill 100x100 opaque stippled trapezoid (161x145...
  1.29 1.13 1.00 1.01   1.27 Fill 300x300 opaque stippled trapezoid (161x145...
2467 51.41 1.96 1.00 0.99 103.39 Fill 1x1 tiled trapezoid (161x145 tile)
2468 45.01 1.96 0.98 1.00  45.01 Fill 10x10 tiled trapezoid (161x145 tile)
2469  2.62 1.36 1.00 1.00   2.69 Fill 100x100 tiled trapezoid (161x145 tile)
2470  1.27 1.13 1.00 1.00   1.22 Fill 300x300 tiled trapezoid (161x145 tile)
2471 51.13 1.98 1.00 1.00 103.39 Fill 1x1 tiled trapezoid (216x208 tile)
2472 45.14 1.97 1.01 0.99  45.14 Fill 10x10 tiled trapezoid (216x208 tile)
2473  2.62 1.55 1.00 1.00   2.71 Fill 100x100 tiled trapezoid (216x208 tile)
2474  1.28 1.13 1.00 1.00   1.20 Fill 300x300 tiled trapezoid (216x208 tile)
2475 50.71 1.95 1.00 1.00  54.70 Fill 10x10 equivalent complex polygon
2476  5.51 1.71 0.96 0.98   5.47 Fill 100x100 equivalent complex polygons
2477  8.39 1.97 1.00 1.00  16.75 Fill 10x10 64-gon (Convex)
2478  8.38 1.83 1.00 1.00   8.43 Fill 100x100 64-gon (Convex)
2479  8.50 1.96 1.00 1.00  16.64 Fill 10x10 64-gon (Complex)
2480  8.26 1.83 1.00 1.00   8.35 Fill 100x100 64-gon (Complex)
2481 14.09 1.87 1.00 1.00  14.05 Char in 80-char line (6x13)
2482 11.91 1.87 1.00 1.00  11.95 Char in 70-char line (8x13)
2483 11.16 1.85 1.01 1.00  11.10 Char in 60-char line (9x15)
2484 10.09 1.78 1.00 1.00  10.09 Char16 in 40-char line (k14)
2485  6.15 1.75 1.00 1.00   6.31 Char16 in 23-char line (k24)
2486 11.92 1.90 1.03 1.03  11.88 Char in 80-char line (TR 10)
2487  8.18 1.78 1.00 0.99   8.17 Char in 30-char line (TR 24)
2488 42.83 1.44 1.01 1.00  42.11 Char in 20/40/20 line (6x13, TR 10)
2489 27.45 1.43 1.01 1.01  27.45 Char16 in 7/14/7 line (k14, k24)
2490 12.13 1.85 1.00 1.00  12.05 Char in 80-char image line (6x13)
2491 10.00 1.84 1.00 1.00  10.00 Char in 70-char image line (8x13)
2492  9.18 1.83 1.00 1.00   9.12 Char in 60-char image line (9x15)
2493  9.66 1.82 0.98 0.95   9.66 Char16 in 40-char image line (k14)
2494  5.82 1.72 1.00 1.00   5.99 Char16 in 23-char image line (k24)
2495  8.70 1.80 1.00 1.00   8.65 Char in 80-char image line (TR 10)
2496  4.67 1.66 1.00 1.00   4.67 Char in 30-char image line (TR 24)
2497 84.43 1.47 1.00 1.00 124.18 Scroll 10x10 pixels
2498  3.73 1.50 1.00 0.98   3.73 Scroll 100x100 pixels
2499  1.00 1.00 1.00 1.00   1.00 Scroll 500x500 pixels
2500 84.43 1.51 1.00 1.00 134.02 Copy 10x10 from window to window
2501  3.62 1.51 0.98 0.98   3.62 Copy 100x100 from window to window
2502  0.89 1.00 1.00 1.00   1.00 Copy 500x500 from window to window
2503 57.06 1.99 1.00 1.00  88.64 Copy 10x10 from pixmap to window
2504  2.49 2.00 1.00 1.00   2.48 Copy 100x100 from pixmap to window
2505  1.00 0.91 1.00 1.00   0.98 Copy 500x500 from pixmap to window
2506  2.04 1.01 1.00 1.00   2.03 Copy 10x10 from window to pixmap
2507  1.05 1.00 1.00 1.00   1.05 Copy 100x100 from window to pixmap
2508  1.00 1.00 0.93 1.00   1.04 Copy 500x500 from window to pixmap
2509 58.52 1.03 1.03 1.02  57.95 Copy 10x10 from pixmap to pixmap
2510  2.40 1.00 1.00 1.00   2.45 Copy 100x100 from pixmap to pixmap
2511  1.00 1.00 1.00 1.00   1.00 Copy 500x500 from pixmap to pixmap
2512 51.57 1.92 1.00 1.00  85.75 Copy 10x10 1-bit deep plane
2513  6.37 1.75 1.01 1.01   6.37 Copy 100x100 1-bit deep plane
2514  1.26 1.11 1.00 1.00   1.24 Copy 500x500 1-bit deep plane
2515  4.23 1.63 0.98 0.97   4.38 Copy 10x10 n-bit deep plane
2516  1.04 1.02 1.00 1.00   1.04 Copy 100x100 n-bit deep plane
2517  1.00 1.00 1.00 1.00   1.00 Copy 500x500 n-bit deep plane
2518  6.45 1.98 1.00 1.26  12.80 PutImage 10x10 square
2519  1.10 1.87 1.00 1.83   2.11 PutImage 100x100 square
2520  1.02 1.93 1.00 1.91   1.91 PutImage 500x500 square
2521  4.17 1.78 1.00 1.40   7.18 PutImage XY 10x10 square
2522  1.27 1.49 0.97 1.48   2.10 PutImage XY 100x100 square
2523  1.00 1.50 1.00 1.50   1.52 PutImage XY 500x500 square
2524  1.07 1.01 1.00 1.00   1.06 GetImage 10x10 square
2525  1.01 1.00 1.00 1.00   1.01 GetImage 100x100 square
2526  1.00 1.00 1.00 1.00   1.00 GetImage 500x500 square
2527  1.56 1.00 0.99 0.97   1.56 GetImage XY 10x10 square
2528  1.02 1.00 1.00 1.00   1.02 GetImage XY 100x100 square
2529  1.00 1.00 1.00 1.00   1.00 GetImage XY 500x500 square
2530  1.00 1.00 1.01 0.98   0.95 X protocol NoOperation
2531  1.02 1.03 1.04 1.03   1.00 QueryPointer
2532  1.03 1.02 1.04 1.03   1.00 GetProperty
2533100.41 1.51 1.00 1.00 198.76 Change graphics context
2534 45.81 1.00 0.99 0.97  57.10 Create and map subwindows (4 kids)
2535 78.45 1.01 1.02 1.02  63.07 Create and map subwindows (16 kids)
2536 73.91 1.01 1.00 1.00  56.37 Create and map subwindows (25 kids)
2537 73.22 1.00 1.00 1.00  49.07 Create and map subwindows (50 kids)
2538 72.36 1.01 0.99 1.00  32.14 Create and map subwindows (75 kids)
2539 70.34 1.00 1.00 1.00  30.12 Create and map subwindows (100 kids)
2540 55.00 1.00 1.00 0.99  23.75 Create and map subwindows (200 kids)
2541 55.30 1.01 1.00 1.00 141.03 Create unmapped window (4 kids)
2542 55.38 1.01 1.01 1.00 163.25 Create unmapped window (16 kids)
2543 54.75 0.96 1.00 0.99 166.95 Create unmapped window (25 kids)
2544 54.83 1.00 1.00 0.99 178.81 Create unmapped window (50 kids)
2545 55.38 1.01 1.01 1.00 181.20 Create unmapped window (75 kids)
2546 55.38 1.01 1.01 1.00 181.20 Create unmapped window (100 kids)
2547 54.87 1.01 1.01 1.00 182.05 Create unmapped window (200 kids)
2548 28.13 1.00 1.00 1.00  30.75 Map window via parent (4 kids)
2549 36.14 1.01 1.01 1.01  32.58 Map window via parent (16 kids)
2550 26.13 1.00 0.98 0.95  29.85 Map window via parent (25 kids)
2551 40.07 1.00 1.01 1.00  27.57 Map window via parent (50 kids)
2552 23.26 0.99 1.00 1.00  18.23 Map window via parent (75 kids)
2553 22.91 0.99 1.00 0.99  16.52 Map window via parent (100 kids)
2554 27.79 1.00 1.00 0.99  12.50 Map window via parent (200 kids)
2555 22.35 1.00 1.00 1.00  56.19 Unmap window via parent (4 kids)
2556  9.57 1.00 0.99 1.00  89.78 Unmap window via parent (16 kids)
2557 80.77 1.01 1.00 1.00 103.85 Unmap window via parent (25 kids)
2558 96.34 1.00 1.00 1.00 116.06 Unmap window via parent (50 kids)
2559 99.72 1.00 1.00 1.00 124.93 Unmap window via parent (75 kids)
2560112.36 1.00 1.00 1.00 125.27 Unmap window via parent (100 kids)
2561105.41 1.00 1.00 0.99 120.00 Unmap window via parent (200 kids)
2562 51.29 1.03 1.02 1.02  74.19 Destroy window via parent (4 kids)
2563 86.75 0.99 0.99 0.99 116.87 Destroy window via parent (16 kids)
2564106.43 1.01 1.01 1.01 127.49 Destroy window via parent (25 kids)
2565120.34 1.01 1.01 1.00 140.11 Destroy window via parent (50 kids)
2566126.67 1.00 0.99 0.99 145.00 Destroy window via parent (75 kids)
2567126.11 1.01 1.01 1.00 140.56 Destroy window via parent (100 kids)
2568128.57 1.01 1.00 1.00 137.91 Destroy window via parent (200 kids)
2569 16.04 0.88 1.00 1.00  20.36 Hide/expose window via popup (4 kids)
2570 19.04 1.01 1.00 1.00  23.48 Hide/expose window via popup (16 kids)
2571 19.22 1.00 1.00 1.00  20.44 Hide/expose window via popup (25 kids)
2572 17.41 1.00 0.91 0.97  17.68 Hide/expose window via popup (50 kids)
2573 17.29 1.01 1.00 1.01  17.07 Hide/expose window via popup (75 kids)
2574 16.74 1.00 1.00 1.00  16.17 Hide/expose window via popup (100 kids)
2575 10.30 1.00 1.00 1.00  10.51 Hide/expose window via popup (200 kids)
2576 16.48 1.01 1.00 1.00  26.05 Move window (4 kids)
2577 17.01 0.95 1.00 1.00  23.97 Move window (16 kids)
2578 16.95 1.00 1.00 1.00  22.90 Move window (25 kids)
2579 16.05 1.01 1.00 1.00  21.32 Move window (50 kids)
2580 15.58 1.00 0.98 0.98  19.44 Move window (75 kids)
2581 14.98 1.02 1.03 1.03  18.17 Move window (100 kids)
2582 10.90 1.01 1.01 1.00  12.68 Move window (200 kids)
2583 49.42 1.00 1.00 1.00 198.27 Moved unmapped window (4 kids)
2584 50.72 0.97 1.00 1.00 193.66 Moved unmapped window (16 kids)
2585 50.87 1.00 0.99 1.00 195.09 Moved unmapped window (25 kids)
2586 50.72 1.00 1.00 1.00 189.34 Moved unmapped window (50 kids)
2587 50.87 1.00 1.00 1.00 191.33 Moved unmapped window (75 kids)
2588 50.87 1.00 1.00 0.90 186.71 Moved unmapped window (100 kids)
2589 50.87 1.00 1.00 1.00 179.19 Moved unmapped window (200 kids)
2590 41.04 1.00 1.00 1.00  56.61 Move window via parent (4 kids)
2591 69.81 1.00 1.00 1.00 130.82 Move window via parent (16 kids)
2592 95.81 1.00 1.00 1.00 141.92 Move window via parent (25 kids)
2593 95.98 1.00 1.00 1.00 149.43 Move window via parent (50 kids)
2594 96.59 1.01 1.01 1.00 153.98 Move window via parent (75 kids)
2595 97.19 1.00 1.00 1.00 157.30 Move window via parent (100 kids)
2596 96.67 1.00 0.99 0.96 159.44 Move window via parent (200 kids)
2597 17.75 1.01 1.00 1.00  27.61 Resize window (4 kids)
2598 17.94 1.00 1.00 0.99  25.42 Resize window (16 kids)
2599 17.92 1.01 1.00 1.00  24.47 Resize window (25 kids)
2600 17.24 0.97 1.00 1.00  24.14 Resize window (50 kids)
2601 16.81 1.00 1.00 0.99  22.75 Resize window (75 kids)
2602 16.08 1.00 1.00 1.00  21.20 Resize window (100 kids)
2603 12.92 1.00 0.99 1.00  16.26 Resize window (200 kids)
2604 52.94 1.01 1.00 1.00 327.12 Resize unmapped window (4 kids)
2605 53.60 1.01 1.01 1.01 333.71 Resize unmapped window (16 kids)
2606 52.99 1.00 1.00 1.00 337.29 Resize unmapped window (25 kids)
2607 51.98 1.00 1.00 1.00 329.38 Resize unmapped window (50 kids)
2608 53.05 0.89 1.00 1.00 322.60 Resize unmapped window (75 kids)
2609 53.05 1.00 1.00 1.00 318.08 Resize unmapped window (100 kids)
2610 53.11 1.00 1.00 0.99 306.21 Resize unmapped window (200 kids)
2611 16.76 1.00 0.96 1.00  19.46 Circulate window (4 kids)
2612 17.24 1.00 1.00 0.97  16.24 Circulate window (16 kids)
2613 16.30 1.03 1.03 1.03  15.85 Circulate window (25 kids)
2614 13.45 1.00 1.00 1.00  14.90 Circulate window (50 kids)
2615 12.91 1.00 1.00 1.00  13.06 Circulate window (75 kids)
2616 11.30 0.98 1.00 1.00  11.03 Circulate window (100 kids)
2617  7.58 1.01 1.01 0.99   7.47 Circulate window (200 kids)
2618  1.01 1.01 0.98 1.00   0.95 Circulate Unmapped window (4 kids)
2619  1.07 1.07 1.01 1.07   1.02 Circulate Unmapped window (16 kids)
2620  1.04 1.09 1.06 1.05   0.97 Circulate Unmapped window (25 kids)
2621  1.04 1.23 1.20 1.18   1.05 Circulate Unmapped window (50 kids)
2622  1.18 1.53 1.19 1.45   1.24 Circulate Unmapped window (75 kids)
2623  1.08 1.02 1.01 1.74   1.01 Circulate Unmapped window (100 kids)
2624  1.01 1.12 0.98 0.91   0.97 Circulate Unmapped window (200 kids)
2625</screen>
2626</para>
2627</sect3>
2628
2629<sect3>
2630<title>Profiling with OProfile</title>
2631
2632<para>OProfile (available from http://oprofile.sourceforge.net/) is a
2633system-wide profiler for Linux systems that uses processor-level
2634counters to collect sampling data.  OProfile can provide information
2635that is similar to that provided by <command>gprof</command>, but without the
2636necessity of recompiling the program with special instrumentation (i.e.,
2637OProfile can collect statistical profiling information about optimized
2638programs).  A test harness was developed to collect OProfile data for
2639each <command>x11perf</command> test individually.
2640</para>
2641
2642<para>Test runs were performed using the RETIRED_INSNS counter on the AMD
2643Athlon and the CPU_CLK_HALTED counter on the Intel Pentium III (with a
test configuration different from the one described above).  The
OProfile output was examined and compared with <command>gprof</command> output, but
this investigation did not lead to changes that improved the
<command>x11perf</command> numbers.
2648</para>
2649
2650</sect3>
2651
2652<!--
2653<sect3>Retired Instructions
2654
2655<p>The initial tests using OProfile were done using the RETIRED_INSNS
2656counter with DMX running on the dual-processor AMD Athlon machine - the
2657same test configuration that was described above and that was used for
2658other tests.  The RETIRED_INSNS counter counts retired instructions and
2659showed drawing, text, copying, and image tests to be dominated (&gt;
266030%) by calls to Hash(), SecurityLookupIDByClass(),
2661SecurityLookupIDByType(), and StandardReadRequestFromClient().  Some of
2662these tests also executed significant instructions in
2663WaitForSomething().
2664
2665<p>In contrast, the window tests executed significant
2666instructions in SecurityLookupIDByType(), Hash(),
2667StandardReadRequestFromClient(), but also executed significant
2668instructions in other routines, such as ConfigureWindow().  Some time
2669was spent looking at Hash() function, but optimizations in this routine
2670did not lead to a dramatic increase in <tt/x11perf/ performance.
2671-->
2672
2673<!--
2674<sect3>Clock Cycles
2675
2676<p>Retired instructions can be misleading because Intel/AMD instructions
2677execute in variable amounts of time.  The OProfile tests were repeated
2678using the Intel CPU_CLK_HALTED counter with DMX running on the second
back-end machine.  Note that this is a different test configuration than
2680the one described above.  However, these tests show the amount of time
2681(as measured in CPU cycles) that are spent in each routine.  Because
2682<tt/x11perf/ was running on the first back-end machine and because
2683window optimizations were on, the load on the second back-end machine
2684was not significant.
2685
2686<p>Using CPU_CLK_HALTED, DMX showed simple drawing
2687tests spending more than 10% of their time in
2688StandardReadRequestFromClient(), with significant time (&gt; 20% total)
2689spent in SecurityLookupIDByClass(), WaitForSomething(), and Dispatch().
2690For these tests, &lt; 5% of the time was spent in Hash(), which explains
2691why optimizing the Hash() routine did not impact <tt/x11perf/ results.
2692
2693<p>The trapezoid, text, scrolling, copying, and image tests were
2694dominated by time in ProcFillPoly(), PanoramiXFillPoly(), dmxFillPolygon(),
2695SecurityLookupIDByClass(), SecurityLookupIDByType(), and
2696StandardReadRequestFromClient().  Hash() time was generally above 5% but
2697less than 10% of total time.
2698-->
2699
2700<sect3>
2701<title>X Test Suite</title>
2702
2703<para>The X Test Suite was run on the fully optimized DMX server using the
2704configuration described above.  The following failures were noted:
2705<screen>
2706XListPixmapFormats: Test 1              [1]
2707XChangeWindowAttributes: Test 32        [1]
2708XCreateWindow: Test 30                  [1]
2709XFreeColors: Test 4                     [3]
2710XCopyArea: Test 13, 17, 21, 25, 30      [2]
2711XCopyPlane: Test 11, 15, 27, 31         [2]
2712XSetFontPath: Test 4                    [1]
2713XChangeKeyboardControl: Test 9, 10      [1]
2714
2715[1] Previously documented errors expected from the Xinerama
2716    implementation (see Phase I discussion).
2717[2] Newly noted errors that have been verified as expected
2718    behavior of the Xinerama implementation.
2719[3] Newly noted error that has been verified as a Xinerama
2720    implementation bug.
2721</screen>
2722</para>
2723
2724</sect3>
2725
2726</sect2>
2727
2728<!-- ============================================================ -->
2729<sect2>
2730<title>Phase III</title>
2731
<para>During the third phase of development, support was provided for the
following extensions: SHAPE, RENDER, XKEYBOARD, and XInput.
2734</para>
2735
2736<sect3>
2737<title>SHAPE</title>
2738
2739<para>The SHAPE extension is supported.  Test applications (e.g., xeyes and
2740oclock) and window managers that make use of the SHAPE extension will
2741work as expected.
2742</para>
2743</sect3>
2744
2745<sect3>
2746<title>RENDER</title>
2747
<para>The RENDER extension is supported.  The version included in the DMX
CVS tree is version 0.2, and this version is fully supported by Xdmx.
Applications using only version 0.2 functions will work correctly;
however, some applications that use functions from later versions do not
properly check the extension's major/minor version numbers.  These
applications will fail with a BadImplementation error when using
post-version 0.2 functions.  This is expected behavior.  When the DMX
CVS tree is updated to include newer versions of RENDER, support for
these newer functions will be added to the DMX X server.
2757</para>
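<para>A client can avoid this failure by comparing the version reported
by the server with the version it needs before issuing newer requests.
The helper below is a minimal sketch with a hypothetical name,
version_at_least(); in a real client the major and minor values would
come from a version query such as XRenderQueryVersion() in the Xrender
library, which this self-contained example does not call.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: returns true when the reported extension
 * version (major.minor) is at least need_major.need_minor. */
static bool version_at_least(int major, int minor,
                             int need_major, int need_minor)
{
    assert(major >= 0 && minor >= 0);
    if (major != need_major)
        return major > need_major;
    return minor >= need_minor;
}
```
]]></programlisting>

<para>With the version 0.2 implementation described above, a check for
any later minor version returns false, and the application can fall back
to version 0.2 requests or to core protocol rendering.
</para>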
2758</sect3>
2759
2760<sect3>
2761<title>XKEYBOARD</title>
2762
2763<para>The XKEYBOARD extension is supported.  If present on the back-end X
2764servers, the XKEYBOARD extension will be used to obtain information
2765about the type of the keyboard for initialization.  Otherwise, the
2766keyboard will be initialized using defaults.  Note that this departs
from older behavior: when Xdmx is compiled without XKEYBOARD support,
the keyboard map from the back-end X server is preserved.  With
XKEYBOARD support, the map is not preserved because better information
about, and control of, the keyboard are available.
2771</para>
2772</sect3>
2773
2774<sect3>
2775<title>XInput</title>
2776
2777<para>The XInput extension is supported.  Any device can be used as a core
2778device and be used as an XInput extension device, with the exception of
2779core devices on the back-end servers.  This limitation is present
2780because cursor handling on the back-end requires that the back-end
2781cursor sometimes track the Xdmx core cursor -- behavior that is
2782incompatible with using the back-end pointer as a non-core device.
2783</para>
2784
2785<para>Currently, back-end extension devices are not available as Xdmx
2786extension devices, but this limitation should be removed in the future.
2787</para>
2788
2789<para>To demonstrate the XInput extension, and to provide more examples for
2790low-level input device driver writers, USB device drivers have been
2791written for mice (usb-mou), keyboards (usb-kbd), and
2792non-mouse/non-keyboard USB devices (usb-oth).  Please see the man page
2793for information on Linux kernel drivers that are required for using
2794these Xdmx drivers.
2795</para>
2796</sect3>
2797
2798<sect3>
2799<title>DPMS</title>
2800
2801<para>The DPMS extension is exported but does not do anything at this time.
2802</para>
2803
2804</sect3>
2805
2806<sect3>
2807<title>Other Extensions</title>
2808
2809<para>The LBX,
2810       SECURITY,
2811       XC-APPGROUP, and
2812       XFree86-Bigfont
2813extensions do not require any special Xdmx support and have been exported.
2814</para>
2815
2816<para>The
2817    BIG-REQUESTS,
2818    DEC-XTRAP,
2819    DOUBLE-BUFFER,
2820    Extended-Visual-Information,
2821    FontCache,
2822    GLX,
2823    MIT-SCREEN-SAVER,
2824    MIT-SHM,
2825    MIT-SUNDRY-NONSTANDARD,
2826    RECORD,
2828    SGI-GLX,
2829    SYNC,
2830    TOG-CUP,
2831    X-Resource,
2832    XC-MISC,
2833    XFree86-DGA,
2834    XFree86-DRI,
2835    XFree86-Misc,
2836    XFree86-VidModeExtension, and
2837    XVideo
2838extensions are <emphasis remap="it">not</emphasis> supported at this time, but will be evaluated
2839for inclusion in future DMX releases.  <emphasis remap="bf">See below for additional work
2840on extensions after Phase III.</emphasis>
2841</para>
2842</sect3>
2843</sect2>
2844
2845<sect2>
2846<title>Phase IV</title>
2847
2848<sect3>
2849<title>Moving to XFree86 4.3.0</title>
2850
2851<para>For Phase IV, the recent release of XFree86 4.3.0 (27 February 2003)
2852was merged onto the dmx.sourceforge.net CVS trunk and all work is
2853proceeding using this tree.
2854</para>
2855</sect3>
2856
2857<sect3>
<title>Extensions</title>
2859
2860<sect4>
2861<title>XC-MISC (supported)</title>
2862
2863<para>XC-MISC is used internally by the X library to recycle XIDs from the
2864X server.  This is important for long-running X server sessions.  Xdmx
2865supports this extension.  The X Test Suite passed and failed the exact
2866same tests before and after this extension was enabled.
2867<!-- Tested February/March 2003 -->
2868</para>
2869</sect4>
2870
2871<sect4>
2872<title>Extended-Visual-Information (supported)</title>
2873
2874<para>The Extended-Visual-Information extension provides a method for an X
2875client to obtain detailed visual information.  Xdmx supports this
2876extension.  It was tested using the <filename>hw/dmx/examples/evi</filename> example
2877program.  <emphasis remap="bf">Note that this extension is not Xinerama-aware</emphasis> -- it will
2878return visual information for each screen even though Xinerama is
2879causing the X server to export a single logical screen.
2880<!-- Tested March 2003 -->
2881</para>
2882</sect4>
2883
2884<sect4>
2885<title>RES (supported)</title>
2886
2887<para>The X-Resource extension provides a mechanism for a client to obtain
2888detailed information about the resources used by other clients.  This
2889extension was tested with the <filename>hw/dmx/examples/res</filename> program.  The
2890X Test Suite passed and failed the exact same tests before and after
2891this extension was enabled.
2892<!-- Tested March 2003 -->
2893</para>
2894</sect4>
2895
2896<sect4>
2897<title>BIG-REQUESTS (supported)</title>
2898
2899<para>This extension enables the X11 protocol to handle requests longer
2900than 262140 bytes.  The X Test Suite passed and failed the exact same
2901tests before and after this extension was enabled.
2902<!-- Tested March 2003 -->
2903</para>
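<para>The 262140-byte figure follows from the core protocol's request
header, which stores the request length in a 16-bit field counted in
4-byte units.  The sketch below works through that arithmetic;
<literal>CORE_MAX_REQUEST_BYTES</literal> and
<literal>needs_big_requests</literal> are illustrative names and are not
part of Xlib or Xdmx.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The core protocol stores a request's length in a 16-bit field that
 * counts 4-byte units, which gives the 262140-byte ceiling. */
#define CORE_MAX_REQUEST_BYTES ((uint32_t)UINT16_MAX * 4u)

/* True when a request of `bytes` bytes only fits with BIG-REQUESTS.
 * (Hypothetical helper, for illustration only.) */
static bool needs_big_requests(uint32_t bytes)
{
    assert(bytes % 4u == 0);   /* X requests are padded to 4-byte units */
    return bytes > CORE_MAX_REQUEST_BYTES;
}
```
]]></programlisting>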
2904</sect4>
2905
2906<sect4>
2907<title>XSYNC (supported)</title>
2908
2909<para>This extension provides facilities for two different X clients to
2910synchronize their requests.  This extension was minimally tested with
2911<command>xdpyinfo</command> and the X Test Suite passed and failed the exact same
2912tests before and after this extension was enabled.
2913<!-- Tested March 2003 -->
2914</para>
2915</sect4>
2916
2917<sect4>
2918<title>XTEST, RECORD, DEC-XTRAP (supported) and XTestExtension1 (not supported)</title>
2919
<para>The XTEST and RECORD extensions were developed by the X Consortium for
2921use in the X Test Suite and are supported as a standard in the X11R6
2922tree.  They are also supported in Xdmx.  When X Test Suite tests that
2923make use of the XTEST extension are run, Xdmx passes and fails exactly
2924the same tests as does a standard XFree86 X server.  When the
2925<literal remap="tt">rcrdtest</literal> test (a part of the X Test Suite that verifies the RECORD
2926extension) is run, Xdmx passes and fails exactly the same tests as does
2927a standard XFree86 X server. <!-- Tested February/March 2003 -->
2928</para>
2929
<para>There are two older XTEST-like extensions: XTestExtension1 and
DEC-XTRAP.  The XTestExtension1 extension was developed for use by the
X Testing Consortium with a test suite that eventually became (part of?)
the X Test Suite.  Unlike XTEST, which only allows events to be sent to
the server, the XTestExtension1 extension also allowed events to be
recorded (similar to the RECORD extension).  The DEC-XTRAP extension was
developed by the Digital Equipment Corporation.
2938</para>
2939
2940<para>The DEC-XTRAP extension is available from Xdmx and has been tested
2941with the <command>xtrap*</command> tools which are distributed as standard X11R6
2942clients. <!-- Tested March 2003 -->
2943</para>
2944
<para>The XTestExtension1 extension is <emphasis>not</emphasis> supported because it does not appear
2946to be used by any modern X clients (the few that support it also support
2947XTEST) and because there are no good methods available for testing that
2948it functions correctly (unlike XTEST and DEC-XTRAP, the code for
2949XTestExtension1 is not part of the standard X server source tree, so
2950additional testing is important). <!-- Tested March 2003 -->
2951</para>
2952
<para>Most of these extensions are documented in the X11R6 source tree.
Further, several original papers exist that this author was unable to
locate -- for completeness and historical interest, citations are
provided:
2957<variablelist>
2958<varlistentry>
2959<term>XRECORD</term>
2960<listitem>
<para>Martha Zimet. Extending X For Recording.  8th Annual X
Technical Conference, Boston, MA, January 24-26, 1994.
2963</para></listitem></varlistentry>
2964<varlistentry>
2965<term>DEC-XTRAP</term>
2966<listitem>
2967<para>Dick Annicchiarico, Robert Chesler, Alan Jamison. XTrap
2968Architecture. Digital Equipment Corporation, July 1991.
2969</para></listitem></varlistentry>
2970<varlistentry>
2971<term>XTestExtension1</term>
2972<listitem>
2973<para>Larry Woestman. X11 Input Synthesis Extension
2974Proposal. Hewlett Packard, November 1991.
2975</para></listitem></varlistentry>
2976</variablelist>
2977</para>
2978</sect4>
2979
2980<sect4>
2981<title>MIT-MISC (not supported)</title>
2982
2983<para>The MIT-MISC extension is used to control a bug-compatibility flag
2984that provides compatibility with xterm programs from X11R1 and X11R2.
2985There does not appear to be a single client available that makes use of
this extension and there is no way to verify that it works correctly.
2987The Xdmx server does <emphasis>not</emphasis> support MIT-MISC.
2988</para>
2989</sect4>
2990
2991<sect4>
2992<title>SCREENSAVER (not supported)</title>
2993
2994<para>This extension provides special support for the X screen saver.  It
was tested with <command>beforelight</command>, which appears to be the only client that
2996works with it.  When Xinerama was not active, <command>beforelight</command> behaved
2997as expected.  However, when Xinerama was active, <command>beforelight</command> did
2998not behave as expected.  Further, when this extension is not active,
2999<command>xscreensaver</command> (a widely-used X screen saver program) did not behave
3000as expected.  Since this extension is not Xinerama-aware and is not
3001commonly used with expected results by clients, we have left this
3002extension disabled at this time.
3003</para>
3004</sect4>
3005
3006<sect4>
3007<title>GLX (supported)</title>
3008
3009<para>The GLX extension provides OpenGL and GLX windowing support.  In
Xdmx, the extension is called glxProxy, and it is Xinerama-aware.  It
3011works by either feeding requests forward through Xdmx to each of the
3012back-end servers or handling them locally.  All rendering requests are
3013handled on the back-end X servers.  This code was donated to the DMX
3014project by SGI.  For the X Test Suite results comparison, see below.
3015</para>
3016</sect4>
3017
3018<sect4>
3019<title>RENDER (supported)</title>
3020
3021<para>The X Rendering Extension (RENDER) provides support for digital image
composition.  Geometric and text rendering are supported.  RENDER is
partially Xinerama-aware: text and the most basic compositing operator
are Xinerama-aware; however, its higher-level primitives (triangles,
triangle strips, and triangle fans) are not yet Xinerama-aware.  The
RENDER extension is still under development, and is currently at version 0.8.
3027Additional support will be required in DMX as more primitives and/or
3028requests are added to the extension.
3029</para>
3030
3031<para>There is currently no test suite for the X Rendering Extension;
3032however, there has been discussion of developing a test suite as the
3033extension matures.  When that test suite becomes available, additional
3034testing can be performed with Xdmx.  The X Test Suite passed and failed
3035the exact same tests before and after this extension was enabled.
3036</para>
3037</sect4>
3038
3039<sect4>
3040<title>Summary</title>
3041
3042<!-- WARNING: this list is duplicated in the "Common X extension
3043support" section -->
3044<para>To summarize, the following extensions are currently supported:
3045    BIG-REQUESTS,
3046    DEC-XTRAP,
3047    DMX,
3048    DPMS,
3049    Extended-Visual-Information,
3050    GLX,
3051    LBX,
3052    RECORD,
3053    RENDER,
3054    SECURITY,
3055    SHAPE,
3056    SYNC,
3057    X-Resource,
3058    XC-APPGROUP,
3059    XC-MISC,
3060    XFree86-Bigfont,
3061    XINERAMA,
3062    XInputExtension,
3063    XKEYBOARD, and
3064    XTEST.
3065</para>
3066
3067<para>The following extensions are <emphasis>not</emphasis> supported at this time:
3068    DOUBLE-BUFFER,
3069    FontCache,
3070    MIT-SCREEN-SAVER,
3071    MIT-SHM,
3072    MIT-SUNDRY-NONSTANDARD,
3073    TOG-CUP,
3074    XFree86-DGA,
3075    XFree86-Misc,
3076    XFree86-VidModeExtension,
    XTestExtension1, and
3078    XVideo.
3079</para>
3080</sect4>
3081</sect3>
3082
3083<sect3>
3084<title>Additional Testing with the X Test Suite</title>
3085
3086<sect4>
3087<title>XFree86 without XTEST</title>
3088
3089<para>After the release of XFree86 4.3.0, we retested the XFree86 X server
3090with and without using the XTEST extension.  When the XTEST extension
3091was <emphasis>not</emphasis> used for testing, the XFree86 4.3.0 server running on our
3092usual test system with a Radeon VE card reported unexpected failures in
3093the following tests:
3094<literallayout>
3095XListPixmapFormats: Test 1
3096XChangeKeyboardControl: Tests 9, 10
3097XGetDefault: Test 5
3098XRebindKeysym: Test 1
3099</literallayout>
3100</para>
3101</sect4>
3102
3103<sect4>
3104<title>XFree86 with XTEST</title>
3105
3106<para>When using the XTEST extension, the XFree86 4.3.0 server reported the
3107following errors:
3108<literallayout>
3109XListPixmapFormats: Test 1
3110XChangeKeyboardControl: Tests 9, 10
3111XGetDefault: Test 5
3112XRebindKeysym: Test 1
3113
3114XAllowEvents: Tests 20, 21, 24
3115XGrabButton: Tests 5, 9-12, 14, 16, 19, 21-25
3116XGrabKey: Test 8
3117XSetPointerMapping: Test 3
3118XUngrabButton: Test 4
3119</literallayout>
3120</para>
3121
3122<para>While these errors may be important, they will probably be fixed
3123eventually in the XFree86 source tree.  We are particularly interested
3124in demonstrating that the Xdmx server does not introduce additional
3125failures that are not known Xinerama failures.
3126</para>
3127</sect4>
3128
3129<sect4>
3130<title>Xdmx with XTEST, without Xinerama, without GLX</title>
3131
3132<para>Without Xinerama, but using the XTEST extension, the following errors
were reported from Xdmx (note that these are the same as for the XFree86
4.3.0 server, except that XGetDefault no longer fails):
3135<literallayout>
3136XListPixmapFormats: Test 1
3137XChangeKeyboardControl: Tests 9, 10
3138XRebindKeysym: Test 1
3139
XAllowEvents: Tests 20, 21, 24
3141XGrabButton: Tests 5, 9-12, 14, 16, 19, 21-25
3142XGrabKey: Test 8
3143XSetPointerMapping: Test 3
3144XUngrabButton: Test 4
3145</literallayout>
3146</para>
3147</sect4>
3148
3149<sect4>
3150<title>Xdmx with XTEST, with Xinerama, without GLX</title>
3151
3152<para>With Xinerama, using the XTEST extension, the following errors
3153were reported from Xdmx:
3154<literallayout>
3155XListPixmapFormats: Test 1
3156XChangeKeyboardControl: Tests 9, 10
3157XRebindKeysym: Test 1
3158
3159XAllowEvents: Tests 20, 21, 24
3160XGrabButton: Tests 5, 9-12, 14, 16, 19, 21-25
3161XGrabKey: Test 8
3162XSetPointerMapping: Test 3
3163XUngrabButton: Test 4
3164
3165XCopyPlane: Tests 13, 22, 31 (well-known XTEST/Xinerama interaction issue)
3166XDrawLine: Test 67
3167XDrawLines: Test 91
3168XDrawSegments: Test 68
3169</literallayout>
3170Note that the first two sets of errors are the same as for the XFree86
31714.3.0 server, and that the XCopyPlane error is a well-known error
3172resulting from an XTEST/Xinerama interaction when the request crosses a
3173screen boundary.  The XDraw* errors are resolved when the tests are run
3174individually and they do not cross a screen boundary.  We will
3175investigate these errors further to determine their cause.
3176</para>
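<para>The comparison made in this section amounts to a set difference
between the baseline failures and the Xdmx failures.  The sketch below
illustrates that idea at the granularity of function names (test numbers
are ignored), using the names reported above; the helper names are
hypothetical and this is not the procedure that was actually used to
compare the results.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` appears in the first `n` entries of `list`. */
static int contains(const char *const *list, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i], name) == 0)
            return 1;
    return 0;
}

/* Copies into `out` every entry of `candidate` that is absent from
 * `baseline` and returns how many entries were copied.  `out` must have
 * room for `n_candidate` pointers. */
static size_t new_failures(const char *const *baseline, size_t n_baseline,
                           const char *const *candidate, size_t n_candidate,
                           const char **out)
{
    size_t count = 0;
    assert(out != NULL);
    for (size_t i = 0; i < n_candidate; i++)
        if (!contains(baseline, n_baseline, candidate[i]))
            out[count++] = candidate[i];
    return count;
}

/* Failing functions reported above for XFree86 4.3.0 with XTEST. */
static const char *const xfree86_failures[] = {
    "XListPixmapFormats", "XChangeKeyboardControl", "XGetDefault",
    "XRebindKeysym", "XAllowEvents", "XGrabButton", "XGrabKey",
    "XSetPointerMapping", "XUngrabButton",
};

/* Failing functions reported above for Xdmx with XTEST and Xinerama. */
static const char *const xdmx_xinerama_failures[] = {
    "XListPixmapFormats", "XChangeKeyboardControl", "XRebindKeysym",
    "XAllowEvents", "XGrabButton", "XGrabKey", "XSetPointerMapping",
    "XUngrabButton", "XCopyPlane", "XDrawLine", "XDrawLines",
    "XDrawSegments",
};
```
]]></programlisting>
<para>Passing the two arrays to <literal>new_failures</literal> leaves
only the XCopyPlane and XDraw* entries, which are the failures discussed
in the note above.
</para>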
3177</sect4>
3178
3179<sect4>
3180<title>Xdmx with XTEST, with Xinerama, with GLX</title>
3181
3182<para>With GLX enabled, using the XTEST extension, the following errors
were reported from Xdmx (these results are from early in Phase IV
development, but were confirmed with a late Phase IV snapshot):
3185<literallayout>
3186XListPixmapFormats: Test 1
3187XChangeKeyboardControl: Tests 9, 10
3188XRebindKeysym: Test 1
3189
3190XAllowEvents: Tests 20, 21, 24
3191XGrabButton: Tests 5, 9-12, 14, 16, 19, 21-25
3192XGrabKey: Test 8
3193XSetPointerMapping: Test 3
3194XUngrabButton: Test 4
3195
3196XClearArea: Test 8
3197XCopyArea: Tests 4, 5, 11, 14, 17, 23, 25, 27, 30
3198XCopyPlane: Tests 6, 7, 10, 19, 22, 31
3199XDrawArcs: Tests 89, 100, 102
3200XDrawLine: Test 67
3201XDrawSegments: Test 68
3202</literallayout>
3203Note that the first two sets of errors are the same as for the XFree86
32044.3.0 server, and that the third set has different failures than when
3205Xdmx does not include GLX support.  Since the GLX extension adds new
3206visuals to support GLX's visual configs and the X Test Suite runs tests
3207over the entire set of visuals, additional rendering tests were run and
3208presumably more of them crossed a screen boundary.  This conclusion is
3209supported by the fact that nearly all of the rendering errors reported
are resolved when the tests are run individually and they do not cross a
3211screen boundary.
3212</para>
3213
3214<para>Further, when hardware rendering is disabled on the back-end displays,
3215many of the errors in the third set are eliminated, leaving only:
3216<literallayout>
3217XClearArea: Test 8
XCopyArea: Tests 4, 5, 11, 14, 17, 23, 25, 27, 30
XCopyPlane: Tests 6, 7, 10, 19, 22, 31
3220</literallayout>
3221</para>
3222</sect4>
3223
3224<sect4>
3225<title>Conclusion</title>
3226
3227<para>We conclude that all of the X Test Suite errors reported for Xdmx are
3228the result of errors in the back-end X server or the Xinerama
implementation.  Further, all of these errors that can reasonably be
fixed at the Xdmx layer have been fixed.  (Where appropriate, we have
submitted patches to the XFree86 and Xinerama upstream maintainers.)
3232</para>
3233</sect4>
3234</sect3>
3235
3236<sect3>
3237<title>Dynamic Reconfiguration</title>
3238
3239<para>During this development phase, dynamic reconfiguration support was
3240added to DMX.  This support allows an application to change the position
3241and offset of a back-end server's screen.  For example, if the
3242application would like to shift a screen slightly to the left, it could
3243query Xdmx for the screen's &lt;x,y&gt; position and then dynamically
3244reconfigure that screen to be at position &lt;x+10,y&gt;.  When a screen
3245is dynamically reconfigured, input handling and a screen's root window
3246dimensions are adjusted as needed.  These adjustments are transparent to
3247the user.
3248</para>
3249
3250<sect4>
3251<title>Dynamic reconfiguration extension</title>
3252
3253<para>The application interface to DMX's dynamic reconfiguration is through
3254a function in the DMX extension library:
3255<programlisting>
3256Bool DMXReconfigureScreen(Display *dpy, int screen, int x, int y)
3257</programlisting>
where <parameter>dpy</parameter> is the DMX server's display, <parameter>screen</parameter> is the number of the
screen to be reconfigured, and <parameter>x</parameter> and <parameter>y</parameter> are the new coordinates
of the upper left-hand corner of the screen to be reconfigured.
3261</para>
3262
<para>The coordinates are not limited other than as required by the X
protocol, which limits all coordinates to signed 16-bit numbers.  In
addition, all coordinates within a screen must also be legal values.
Therefore, setting a screen's upper left-hand coordinates such that the
right or bottom edge of the screen is greater than 32,767 is illegal.
3268</para>
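<para>An application can check a proposed position against this limit
before requesting a reconfiguration.  The following is a minimal sketch
of such a check, assuming that the application already knows the
screen's width and height and that the right and bottom edges are taken
to be the last pixel column and row;
<literal>screen_origin_is_legal</literal> is a hypothetical helper and is
not part of the DMX extension library.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when a screen of the given size, placed with its upper left-hand
 * corner at (x, y), keeps every coordinate within a signed 16-bit value.
 * (Hypothetical helper; the edge convention is an assumption.) */
static bool screen_origin_is_legal(long x, long y,
                                   unsigned width, unsigned height)
{
    assert(width > 0 && height > 0);
    /* Last pixel column and row covered by the screen. */
    long right  = x + (long)width  - 1;
    long bottom = y + (long)height - 1;
    return x >= INT16_MIN && y >= INT16_MIN &&
           right <= INT16_MAX && bottom <= INT16_MAX;
}
```
]]></programlisting>
<para>For a hypothetical 1280x1024 screen, an origin of &lt;32000,0&gt;
is rejected because the right edge would lie beyond 32,767, while
&lt;-100,-100&gt; is accepted.
</para>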
3269</sect4>
3270
3271<sect4>
3272<title>Bounding box</title>
3273
3274<para>When the Xdmx server is started, a bounding box is calculated from
3275the screens' layout given either on the command line or in the
3276configuration file.  This bounding box is currently fixed for the
3277lifetime of the Xdmx server.
3278</para>
3279
<para>While it is possible to move a screen outside of the bounding box, it
is currently not possible to change the dimensions of the bounding box.
For example, it is possible to specify coordinates of &lt;-100,-100&gt;
for the upper left-hand corner of a screen that was previously at
coordinates &lt;0,0&gt;.  As expected, the image shown on that screen
shifts down and to the right; however, since the bounding box is fixed,
the left side and upper portions of the screen exposed by the
reconfiguration are no longer accessible on that screen.  Those
inaccessible regions are filled with black.
3289</para>
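<para>The geometry of this example can be sketched as a rectangle
intersection.  The code below is an illustration only: the
<literal>Rect</literal> type and both functions are hypothetical, the
screen sizes used with them are made up, and the Xdmx server's own
bookkeeping may differ.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rectangle type: origin plus size, in global coordinates. */
typedef struct { int x, y, width, height; } Rect;

/* Smallest rectangle that encloses all `n` screens (n must be >= 1). */
static Rect bounding_box(const Rect *screens, size_t n)
{
    assert(screens != NULL && n > 0);
    int left = screens[0].x, top = screens[0].y;
    int right = left + screens[0].width, bottom = top + screens[0].height;
    for (size_t i = 1; i < n; i++) {
        int r = screens[i].x + screens[i].width;
        int b = screens[i].y + screens[i].height;
        if (screens[i].x < left) left = screens[i].x;
        if (screens[i].y < top)  top  = screens[i].y;
        if (r > right)  right  = r;
        if (b > bottom) bottom = b;
    }
    return (Rect){ left, top, right - left, bottom - top };
}

/* Part of `screen` that still lies inside the fixed box.  Returns false
 * when the two do not overlap at all. */
static bool accessible_region(Rect screen, Rect box, Rect *out)
{
    int left   = screen.x > box.x ? screen.x : box.x;
    int top    = screen.y > box.y ? screen.y : box.y;
    int sr = screen.x + screen.width,  br = box.x + box.width;
    int sb = screen.y + screen.height, bb = box.y + box.height;
    int right  = sr < br ? sr : br;
    int bottom = sb < bb ? sb : bb;
    if (right <= left || bottom <= top)
        return false;
    *out = (Rect){ left, top, right - left, bottom - top };
    return true;
}
```
]]></programlisting>
<para>For two hypothetical 1280x1024 screens placed side by side, the
bounding box is 2560x1024.  Moving the first screen to
&lt;-100,-100&gt; leaves an accessible region of 1180x924 at
&lt;0,0&gt;; the remaining strip of 100 pixels along the top and left of
that screen corresponds to the area that is filled with black.
</para>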
3290
3291<para>This fixed bounding box limitation will be addressed in a future
3292development phase.
3293</para>
3294</sect4>
3295
3296<sect4>
3297<title>Sample applications</title>
3298
3299<para>An example of where this extension is useful is in setting up a video
3300wall.  It is not always possible to get everything perfectly aligned,
3301and sometimes the positions are changed (e.g., someone might bump into a
3302projector).  Instead of physically moving projectors or monitors, it is
3303now possible to adjust the positions of the back-end server's screens
3304using the dynamic reconfiguration support in DMX.
3305</para>
3306
3307<para>Other applications, such as automatic setup and calibration tools,
3308can make use of dynamic reconfiguration to correct for projector
3309alignment problems, as long as the projectors are still arranged
3310rectilinearly.  Horizontal and vertical keystone correction could be
3311applied to projectors to correct for non-rectilinear alignment problems;
3312however, this must be done external to Xdmx.
3313</para>
3314
3315<para>A sample test program is included in the DMX server's examples
3316directory to demonstrate the interface and how an application might use
3317dynamic reconfiguration.  See <filename>dmxreconfig.c</filename> for details.
3318</para>
3319</sect4>
3320
3321<sect4>
3322<title>Additional notes</title>
3323
3324<para>In the original development plan, Phase IV was primarily devoted to
3325adding OpenGL support to DMX; however, SGI became interested in the DMX
3326project and developed code to support OpenGL/GLX.  This code was later
3327donated to the DMX project and integrated into the DMX code base, which
3328freed the DMX developers to concentrate on dynamic reconfiguration (as
3329described above).
3330</para>
3331</sect4>
3332</sect3>
3333
3334<sect3>
3335<title>Doxygen documentation</title>
3336
3337<para>Doxygen is an open-source (GPL) documentation system for generating
3338browseable documentation from stylized comments in the source code.  We
3339have placed all of the Xdmx server and DMX protocol source code files
3340under Doxygen so that comprehensive documentation for the Xdmx source
3341code is available in an easily browseable format.
3342</para>
3343</sect3>
3344
3345<sect3>
3346<title>Valgrind</title>
3347
3348<para>Valgrind, an open-source (GPL) memory debugger for Linux, was used to
3349search for memory management errors.  Several memory leaks were detected
3350and repaired.  The following errors were not addressed:
3351<orderedlist>
3352    <listitem><para>
3353        When the X11 transport layer sends a reply to the client, only
3354        those fields that are required by the protocol are filled in --
3355        unused fields are left as uninitialized memory and are therefore
3356        noted by valgrind.  These instances are not errors and were not
3357        repaired.
3358    </para></listitem>
3359    <listitem><para>
3360        At each server generation, glxInitVisuals allocates memory that
        is never freed.  The amount of memory lost each generation is
        approximately equal to 128 bytes for each back-end visual.
3363        Because the code involved is automatically generated, this bug
3364        has not been fixed and will be referred to SGI.
3365    </para></listitem>
3366    <listitem><para>
3367        At each server generation, dmxRealizeFont calls XLoadQueryFont,
3368        which allocates a font structure that is not freed.
3369        dmxUnrealizeFont can free the font structure for the first
3370        screen, but cannot free it for the other screens since they are
3371        already closed by the time dmxUnrealizeFont could free them.
3372        The amount of memory lost each generation is approximately equal
        to 80 bytes per font per back-end.  When this bug is fixed in
        the X server's device-independent (dix) code, DMX will be
3375        able to properly free the memory allocated by XLoadQueryFont.
3376    </para></listitem>
3377</orderedlist>
3378</para>
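<para>The approximate figures quoted for the two unrepaired leaks can be
combined into a rough estimate of the memory lost over several server
generations.  The sketch below does only that arithmetic;
<literal>leaked_bytes</literal> is a hypothetical helper, and the visual,
font, and back-end counts passed to it are assumptions supplied by the
caller, not measurements.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>

/* Approximate per-generation costs quoted in the list above. */
enum { BYTES_PER_VISUAL = 128, BYTES_PER_FONT_PER_BACKEND = 80 };

/* Rough number of bytes leaked after `generations` server generations.
 * `backend_visuals` is the total number of back-end visuals. */
static unsigned long leaked_bytes(unsigned long generations,
                                  unsigned long backend_visuals,
                                  unsigned long fonts,
                                  unsigned long backends)
{
    unsigned long per_generation =
        BYTES_PER_VISUAL * backend_visuals +
        BYTES_PER_FONT_PER_BACKEND * fonts * backends;
    assert(backends > 0);
    return generations * per_generation;
}
```
]]></programlisting>
<para>For example, with 32 back-end visuals, 20 fonts, and 4 back-ends,
the estimate is 10496 bytes per generation, or 104960 bytes after 10
generations.
</para>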
3379</sect3>
3380
3381<sect3>
3382<title>RATS</title>
3383
3384<para>RATS (Rough Auditing Tool for Security) is an open-source (GPL)
3385security analysis tool that scans source code for common
3386security-related programming errors (e.g., buffer overflows and TOCTOU
3387races).  RATS was used to audit all of the code in the hw/dmx directory
3388and all "High" notations were checked manually.  The code was either
3389re-written to eliminate the warning, or a comment containing "RATS" was
3390inserted on the line to indicate that a human had checked the code.
3391Unrepaired warnings are as follows:
3392<orderedlist>
3393    <listitem><para>
3394        Fixed-size buffers are used in many areas, but code has been
        added to protect against buffer overflows (e.g., XmuSnprintf).
3396        The only instances that have not yet been fixed are in
3397        config/xdmxconfig.c (which is not part of the Xdmx server) and
3398        input/usb-common.c.
3399    </para></listitem>
3400    <listitem><para>
3401        vprintf and vfprintf are used in the logging routines.  In
3402        general, all uses of these functions (e.g., dmxLog) provide a
3403        constant format string from a trusted source, so the use is
3404        relatively benign.
3405    </para></listitem>
3406    <listitem><para>
3407        glxProxy/glxscreens.c uses getenv and strcat.  The use of these
3408        functions is safe and will remain safe as long as
        ExtensionsString is longer than GLXServerExtensions (ensuring
        this may not be obvious to the casual programmer, but this is in
3411        automatically generated code, so we hope that the generator
3412        enforces this constraint).
3413    </para></listitem>
3414</orderedlist>
3415
3416</para>
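<para>The kinds of guards described in the list above can be sketched in
portable C.  The code below is an illustration and is not taken from the
Xdmx sources: <literal>append_checked</literal> and
<literal>format_log</literal> are hypothetical names, and the standard
<literal>vsnprintf</literal> function stands in for whatever bounded
formatting routine the server uses.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Appends `src` to `dst` only if the result, including the terminating
 * NUL, fits in `dst_size` bytes.  Returns false and leaves `dst`
 * untouched otherwise.  `dst` must already be NUL-terminated. */
static bool append_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t used = strlen(dst);
    size_t need = strlen(src);
    if (used >= dst_size || need >= dst_size - used)
        return false;
    memcpy(dst + used, src, need + 1);   /* copies the NUL as well */
    return true;
}

/* Formats a log line into a fixed-size buffer.  vsnprintf() never writes
 * more than `size` bytes; the return value says whether the text fit.
 * Callers are expected to pass a constant format string. */
static bool format_log(char *buf, size_t size, const char *format, ...)
{
    va_list args;
    va_start(args, format);
    int written = vsnprintf(buf, size, format, args);
    va_end(args);
    return written >= 0 && (size_t)written < size;
}
```
]]></programlisting>
<para>Checking the length before copying makes the safety of a
concatenation explicit at the call site, rather than relying on one
buffer being declared longer than another.
</para>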
3417
3418</sect3>
3419
3420</sect2>
3421
3422</sect1>
3423
3424</appendix>
3425
3426  </article>
3427
3428  <!-- Local Variables: -->
3429  <!-- fill-column: 72  -->
3430  <!-- End:             -->
3431