1706f2543Smrg Client Scheduling in X 2706f2543Smrg Keith Packard 3706f2543Smrg SuSE 4706f2543Smrg 10/28/99 5706f2543Smrg 6706f2543SmrgHistory: 7706f2543Smrg 8706f2543SmrgSince the original X server was written at Digital in 1987, the OS and DIX 9706f2543Smrglayers shared responsibility for scheduling the order to service 10706f2543Smrgclient requests. The original design was simplistic; under the maximum 11706f2543Smrgfirst make it work, then make it work well, this was a good idea. Now 12706f2543Smrgthat we have a bit more experience with X applications, it's time to 13706f2543Smrgrethink the design. 14706f2543Smrg 15706f2543SmrgThe basic dispatch loop in DIX looks like: 16706f2543Smrg 17706f2543Smrg for (;;) 18706f2543Smrg { 19706f2543Smrg nready = WaitForSomething (...); 20706f2543Smrg while (nready--) 21706f2543Smrg { 22706f2543Smrg isItTimeToYield = FALSE; 23706f2543Smrg while (!isItTimeToYield) 24706f2543Smrg { 25706f2543Smrg if (!ReadRequestFromClient (...)) 26706f2543Smrg break; 27706f2543Smrg (execute request); 28706f2543Smrg } 29706f2543Smrg } 30706f2543Smrg } 31706f2543Smrg 32706f2543SmrgWaitForSomething looks like: 33706f2543Smrg 34706f2543Smrg for (;;) 35706f2543Smrg if (ANYSET (ClientsWithInput)) 36706f2543Smrg return popcount (ClientsWithInput); 37706f2543Smrg select (...) 38706f2543Smrg compute clientsReadable from select result; 39706f2543Smrg return popcount (clientsReadable) 40706f2543Smrg } 41706f2543Smrg 42706f2543SmrgReadRequestFromClient looks like: 43706f2543Smrg 44706f2543Smrg if (!fullRequestQueued) 45706f2543Smrg { 46706f2543Smrg read (); 47706f2543Smrg if (!fullRequestQueued) 48706f2543Smrg { 49706f2543Smrg remove from ClientsWithInput; 50706f2543Smrg timesThisConnection = 0; 51706f2543Smrg return 0; 52706f2543Smrg } 53706f2543Smrg } 54706f2543Smrg if (twoFullRequestsQueued) 55706f2543Smrg add to ClientsWithInput; 56706f2543Smrg 57706f2543Smrg if (++timesThisConnection >= 10) 58706f2543Smrg { 59706f2543Smrg isItTimeToYield = TRUE; 60706f2543Smrg timesThisConnection = 0; 61706f2543Smrg } 62706f2543Smrg return 1; 63706f2543Smrg 64706f2543SmrgHere's what happens in this code: 65706f2543Smrg 66706f2543SmrgWith a single client executing a stream of requests: 67706f2543Smrg 68706f2543Smrg A client sends a packet of requests to the server. 69706f2543Smrg 70706f2543Smrg WaitForSomething wakes up from select and returns that client 71706f2543Smrg to Dispatch 72706f2543Smrg 73706f2543Smrg Dispatch calls ReadRequestFromClient which reads a buffer (4K) 74706f2543Smrg full of requests from the client 75706f2543Smrg 76706f2543Smrg The server executes requests from this buffer until it emptys, 77706f2543Smrg in two stages -- 10 requests at a time are executed in the 78706f2543Smrg inner Dispatch loop, a buffer full of requests are executed 79706f2543Smrg because WaitForSomething immediately returns if any clients 80706f2543Smrg have complete requests pending in their input queues. 81706f2543Smrg 82706f2543Smrg When the buffer finally emptys, the next call to ReadRequest 83706f2543Smrg FromClient will return zero and Dispatch will go back to 84706f2543Smrg WaitForSomething; now that the client has no requests pending, 85706f2543Smrg WaitForSomething will block in select again. If the client 86706f2543Smrg is active, this select will immediately return that client 87706f2543Smrg as ready to read. 88706f2543Smrg 89706f2543SmrgWith multiple clients sending streams of requests, the sequence 90706f2543Smrgof operations is similar, except that ReadRequestFromClient will 91706f2543Smrgset isItTimeToYield after each 10 requests executed causing the 92706f2543Smrgserver to round-robin among the clients with available requests. 93706f2543Smrg 94706f2543SmrgIt's important to realize here that any complete requests which have been 95706f2543Smrgread from clients will be executed before the server will use select again 96706f2543Smrgto discover input from other clients. A single busy client can easily 97706f2543Smrgmonopolize the X server. 98706f2543Smrg 99706f2543SmrgSo, the X server doesn't share well with clients which are more interactive 100706f2543Smrgin nature. 101706f2543Smrg 102706f2543SmrgThe X server executes at most a buffer full of requests before again heading 103706f2543Smrginto select; ReadRequestFromClient causes the server to yield when the 104706f2543Smrgclient request buffer doesn't contain a complete request. When 105706f2543Smrgthat buffer is executed quickly, the server spends a lot of time 106706f2543Smrgin select discovering that the same client again has input ready. Thus 107706f2543Smrgthe server also runs busy clients less efficiently than is would be 108706f2543Smrgpossible. 109706f2543Smrg 110706f2543SmrgWhat to do. 111706f2543Smrg 112706f2543SmrgThere are several things evident from the above discussion: 113706f2543Smrg 114706f2543Smrg 1 The server has a poor metric for deciding how much work it 115706f2543Smrg should do at one time on behalf of a particular client. 116706f2543Smrg 117706f2543Smrg 2 The server doesn't call select often enough to detect less 118706f2543Smrg aggressive clients in the face of busy clients, especially 119706f2543Smrg when those clients are executing slow requests. 120706f2543Smrg 121706f2543Smrg 3 The server calls select too often when executing fast requests. 122706f2543Smrg 123706f2543Smrg 4 Some priority scheme is needed to keep interactive clients 124706f2543Smrg responding to the user. 125706f2543Smrg 126706f2543SmrgAnd, there are some assumptions about how X applications work: 127706f2543Smrg 128706f2543Smrg 1 Each X request is executed relatively quickly; a request-granularity 129706f2543Smrg is good enough for interactive response almost all of the time. 130706f2543Smrg 131706f2543Smrg 2 X applications receiving mouse/keyboard events are likely to 132706f2543Smrg warrant additional attention from the X server. 133706f2543Smrg 134706f2543SmrgInstead of a request-count metric for work, a time-based metric should be 135706f2543Smrgused. The server should select a reasonable time slice for each client 136706f2543Smrgand execute requests for the entire timeslice before yielding to 137706f2543Smrganother client. 138706f2543Smrg 139706f2543SmrgInstead of returning immediately from WaitForSomething if clients have 140706f2543Smrgcomplete requests queued, the server should go through select each 141706f2543Smrgtime and gather as many ready clients as possible. This involves 142706f2543Smrgpolling instead of blocking and adding the ClientsWithInput to 143706f2543SmrgclientsReadable after the select returns. 144706f2543Smrg 145706f2543SmrgInstead of yielding when the request buffer is empty for a particular 146706f2543Smrgclient, leave the yielding to the upper level scheduling and allow 147706f2543Smrgthe server to try and read again from the socket. If the client 148706f2543Smrgis busy, another buffer full of requests will already be waiting 149706f2543Smrgto be delivered thus avoiding the call through select and the 150706f2543Smrgadditional overhead in WaitForSomething. 151706f2543Smrg 152706f2543SmrgFinally, the dispatch loop should not simply execute requests from the 153706f2543Smrgfirst available client, instead each client should be prioritized with 154706f2543Smrgbusy clients penalized and clients receiving user events praised. 155706f2543Smrg 156706f2543SmrgHow it's done: 157706f2543Smrg 158706f2543SmrgPolling the current time of day from the OS is too expensive to 159706f2543Smrgbe done at each request boundary, so instead an interval timer is 160706f2543Smrgset allowing the server to track time changes by counting invocations 161706f2543Smrgof the related signal handler. Instead of using the wall time for 162706f2543Smrgthis purpose, the process CPU time is used instead. This serves 163706f2543Smrgtwo purposes -- first, it allows the server to consume no CPU cycles 164706f2543Smrgwhen idle, second it avoids conflicts with SIGALRM usage in other 165706f2543Smrgparts of the server code. It's not without problems though; other 166706f2543SmrgCPU intensive processes on the same machine can reduce interactive 167706f2543Smrgresponse time within the X server. The dispatch loop can now 168706f2543Smrgcalculate an approximate time value using the number of signals 169706f2543Smrgreceived. The granularity of the timer sets the scheduling jitter, 170706f2543Smrgat 20ms it's only occasionally noticeable. 171706f2543Smrg 172706f2543SmrgThe changes to WaitForSomething and ReadRequestFromClient are 173706f2543Smrgstraightforward, adjusting when select is called and avoiding 174706f2543Smrgsetting isItTimeToYield too often. 175706f2543Smrg 176706f2543SmrgThe dispatch loop changes are more extensive, now instead of 177706f2543Smrgexecuting requests from all available clients, a single client 178706f2543Smrgis chosen after each call to WaitForSomething, requests are 179706f2543Smrgexecuted for that client and WaitForSomething is called again. 180706f2543Smrg 181706f2543SmrgEach client is assigned a priority, the dispatch loop chooses the 182706f2543Smrgclient with the highest priority to execute. Priorities are 183706f2543Smrgupdated in three ways: 184706f2543Smrg 185706f2543Smrg 1. Clients which consume their entire slice are penalized 186706f2543Smrg by having their priority reduced by one until they 187706f2543Smrg reach some minimum value. 188706f2543Smrg 189706f2543Smrg 2. Clients which have executed no requests for some time 190706f2543Smrg are praised by having their priority raised until they 191706f2543Smrg return to normal priority. 192706f2543Smrg 193706f2543Smrg 3. Clients which receive user input are praised by having 194706f2543Smrg their priority rased until they reach some maximal 195706f2543Smrg value, above normal priority. 196706f2543Smrg 197706f2543SmrgThe effect of these changes is to both improve interactive application 198706f2543Smrgresponse and benchmark numbers at the same time. 199706f2543Smrg 200706f2543Smrg 201706f2543Smrg 202706f2543Smrg 203706f2543Smrg 204706f2543Smrg$XFree86: $ 205