                        Client Scheduling in X

                             Keith Packard
                                 SuSE
                               10/28/99

History:

Since the original X server was written at Digital in 1987, the OS and DIX
layers have shared responsibility for scheduling the order in which client
requests are serviced.  The original design was simplistic; under the maxim
"first make it work, then make it work well", this was a good idea.  Now
that we have a bit more experience with X applications, it's time to
rethink the design.

The basic dispatch loop in DIX looks like:

    for (;;)
    {
        nready = WaitForSomething (...);
        while (nready--)
        {
            isItTimeToYield = FALSE;
            while (!isItTimeToYield)
            {
                if (!ReadRequestFromClient (...))
                    break;
                (execute request);
            }
        }
    }

WaitForSomething looks like:

    for (;;)
    {
        if (ANYSET (ClientsWithInput))
            return popcount (ClientsWithInput);
        select (...);
        compute clientsReadable from select result;
        return popcount (clientsReadable);
    }

ReadRequestFromClient looks like:

    if (!fullRequestQueued)
    {
        read ();
        if (!fullRequestQueued)
        {
            remove from ClientsWithInput;
            timesThisConnection = 0;
            return 0;
        }
    }
    if (twoFullRequestsQueued)
        add to ClientsWithInput;

    if (++timesThisConnection >= 10)
    {
        isItTimeToYield = TRUE;
        timesThisConnection = 0;
    }
    return 1;

Here's what happens in this code:

With a single client executing a stream of requests:

    A client sends a packet of requests to the server.

    WaitForSomething wakes up from select and returns that client
    to Dispatch.

    Dispatch calls ReadRequestFromClient, which reads a buffer (4K)
    full of requests from the client.

    The server executes requests from this buffer until it empties,
    in two stages -- ten requests at a time are executed in the
    inner Dispatch loop, and the whole buffer is drained because
    WaitForSomething returns immediately whenever any client has a
    complete request pending in its input queue.

    When the buffer finally empties, the next call to
    ReadRequestFromClient will return zero and Dispatch will go back
    to WaitForSomething; now that the client has no requests pending,
    WaitForSomething will block in select again.  If the client
    is active, this select will immediately return that client
    as ready to read.
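
The following is a minimal, self-contained sketch of the per-client
bookkeeping behind the ReadRequestFromClient pseudocode above; the names
ClientRec, FullRequestQueued, ReadRequestSketch and REQUEST_BURST are
illustrative only, not the server's real data structures, and the
ClientsWithInput manipulation is elided.  It shows the three behaviors
the walkthrough relies on: read into a 4K buffer, reset and yield when
no complete request is buffered, and force a yield after ten consecutive
requests from the same connection.

    #include <unistd.h>

    #define REQUEST_BURST 10            /* requests executed before yielding */

    typedef struct _Client {
        int     fd;                     /* connection to the client */
        char    buffer[4096];           /* raw request bytes read so far */
        int     bytes;                  /* number of bytes in buffer */
        int     timesThisConnection;    /* requests since the last yield */
    } ClientRec;

    extern int  isItTimeToYield;
    extern int  FullRequestQueued (ClientRec *client);

    /* returns 1 if a request is ready to execute, 0 if the caller should yield */
    int
    ReadRequestSketch (ClientRec *client)
    {
        ssize_t n;

        if (!FullRequestQueued (client))
        {
            n = read (client->fd, client->buffer + client->bytes,
                      sizeof (client->buffer) - client->bytes);
            if (n > 0)
                client->bytes += (int) n;
            if (!FullRequestQueued (client))
            {
                /* nothing whole to execute: reset the burst count and yield */
                client->timesThisConnection = 0;
                return 0;
            }
        }
        if (++client->timesThisConnection >= REQUEST_BURST)
        {
            isItTimeToYield = 1;
            client->timesThisConnection = 0;
        }
        return 1;
    }

Dispatch would call such a routine in its inner loop, breaking out
whenever it returns zero or isItTimeToYield has been set.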

With multiple clients sending streams of requests, the sequence
of operations is similar, except that ReadRequestFromClient will
set isItTimeToYield after every ten requests executed, causing the
server to round-robin among the clients with available requests.

It's important to realize here that any complete requests which have been
read from clients will be executed before the server will use select again
to discover input from other clients.  A single busy client can easily
monopolize the X server.

So the X server doesn't share well with clients which are more interactive
in nature.

The X server executes at most a buffer full of requests before heading
back into select; ReadRequestFromClient causes the server to yield whenever
the client request buffer doesn't contain a complete request.  When
that buffer is executed quickly, the server spends a lot of time
in select discovering that the same client again has input ready.  Thus
the server also runs busy clients less efficiently than would be
possible.

What to do:

There are several things evident from the above discussion:

    1.  The server has a poor metric for deciding how much work it
        should do at one time on behalf of a particular client.

    2.  The server doesn't call select often enough to detect less
        aggressive clients in the face of busy clients, especially
        when those clients are executing slow requests.

    3.  The server calls select too often when executing fast requests.

    4.  Some priority scheme is needed to keep interactive clients
        responding to the user.

And there are some assumptions about how X applications work:

    1.  Each X request is executed relatively quickly; request
        granularity is good enough for interactive response almost
        all of the time.

    2.  X applications receiving mouse/keyboard events are likely to
        warrant additional attention from the X server.

Instead of a request-count metric for work, a time-based metric should be
used.  The server should select a reasonable time slice for each client
and execute requests for the entire time slice before yielding to
another client.

Instead of returning immediately from WaitForSomething if clients have
complete requests queued, the server should go through select each
time and gather as many ready clients as possible.  This involves
polling instead of blocking, and adding ClientsWithInput to
clientsReadable after the select returns.

Instead of yielding when the request buffer is empty for a particular
client, leave the yielding to the upper-level scheduling and allow
the server to try to read again from the socket.  If the client
is busy, another buffer full of requests will already be waiting
to be delivered, thus avoiding the call through select and the
additional overhead in WaitForSomething.
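
Here is a minimal sketch of the WaitForSomething change described above,
using only the standard select interface; GatherReadyClients, AllClients
and maxfd are illustrative names, not the server's, and error handling is
elided.  When any client already has a complete request queued, it polls
with a zero timeout rather than blocking, then merges ClientsWithInput
into clientsReadable so a single pass gathers every runnable client.

    #include <string.h>
    #include <sys/select.h>

    extern fd_set   AllClients;         /* every client connection */
    extern fd_set   ClientsWithInput;   /* complete requests already queued */
    extern int      maxfd;

    int
    GatherReadyClients (fd_set *clientsReadable)
    {
        struct timeval  poll_now = { 0, 0 };
        struct timeval  *wait = NULL;       /* NULL means block in select */
        fd_set          readable;
        int             i, nready = 0;

        /* if work is already queued, poll instead of blocking */
        for (i = 0; i <= maxfd; i++)
            if (FD_ISSET (i, &ClientsWithInput))
            {
                wait = &poll_now;
                break;
            }

        memcpy (&readable, &AllClients, sizeof (fd_set));
        (void) select (maxfd + 1, &readable, NULL, NULL, wait);

        /* merge clients with buffered requests into the readable set */
        FD_ZERO (clientsReadable);
        for (i = 0; i <= maxfd; i++)
            if (FD_ISSET (i, &readable) || FD_ISSET (i, &ClientsWithInput))
            {
                FD_SET (i, clientsReadable);
                nready++;
            }
        return nready;
    }

The return value plays the same role as popcount (clientsReadable) in the
original WaitForSomething pseudocode.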

Finally, the dispatch loop should not simply execute requests from the
first available client; instead, each client should be prioritized, with
busy clients penalized and clients receiving user events praised.

How it's done:

Polling the current time of day from the OS is too expensive to
be done at each request boundary, so instead an interval timer is
set, allowing the server to track the passage of time by counting
invocations of the related signal handler.  Rather than wall time,
the process CPU time is used for this purpose.  This serves two
purposes -- first, it allows the server to consume no CPU cycles
when idle; second, it avoids conflicts with SIGALRM usage in other
parts of the server code.  It's not without problems, though; other
CPU-intensive processes on the same machine can reduce interactive
response time within the X server.  The dispatch loop can now
calculate an approximate time value using the number of signals
received (a sketch of such a timer appears at the end of this note).
The granularity of the timer sets the scheduling jitter; at 20ms
it's only occasionally noticeable.

The changes to WaitForSomething and ReadRequestFromClient are
straightforward, adjusting when select is called and avoiding
setting isItTimeToYield too often.

The dispatch loop changes are more extensive: instead of
executing requests from all available clients, a single client
is chosen after each call to WaitForSomething, requests are
executed for that client, and WaitForSomething is called again.

Each client is assigned a priority, and the dispatch loop chooses the
client with the highest priority to execute.  Priorities are
updated in three ways:

    1.  Clients which consume their entire time slice are penalized
        by having their priority reduced by one until they
        reach some minimum value.

    2.  Clients which have executed no requests for some time
        are praised by having their priority raised until they
        return to normal priority.

    3.  Clients which receive user input are praised by having
        their priority raised until they reach some maximal
        value, above normal priority.

The effect of these changes is to improve both interactive application
response and benchmark numbers at the same time.
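
To make the timer mechanism concrete, here is a minimal sketch using the
POSIX setitimer interface; it is not the server's actual implementation,
and the names TICK_MS, SLICE_TICKS, now_ticks and start_slice_timer are
illustrative.  ITIMER_VIRTUAL counts process CPU time, so an idle server
receives no signals, and its SIGVTALRM leaves SIGALRM free for other
uses, matching the goals described above.

    #include <signal.h>
    #include <string.h>
    #include <sys/time.h>

    #define TICK_MS     20  /* timer granularity: the scheduling jitter */
    #define SLICE_TICKS 1   /* ticks granted to a client per time slice */

    static volatile sig_atomic_t now_ticks; /* approximate CPU time, in ticks */

    static void
    tick_handler (int sig)
    {
        (void) sig;
        now_ticks++;            /* the handler does nothing but count */
    }

    void
    start_slice_timer (void)
    {
        struct sigaction    sa;
        struct itimerval    it;

        memset (&sa, 0, sizeof sa);
        sa.sa_handler = tick_handler;
        sigemptyset (&sa.sa_mask);
        sa.sa_flags = SA_RESTART;   /* don't abort interrupted system calls */
        sigaction (SIGVTALRM, &sa, NULL);

        /* fire every TICK_MS milliseconds of consumed CPU time */
        it.it_interval.tv_sec = 0;
        it.it_interval.tv_usec = TICK_MS * 1000;
        it.it_value = it.it_interval;
        setitimer (ITIMER_VIRTUAL, &it, NULL);
    }

    /*
     * In the dispatch loop, a client's slice ends when the tick count
     * advances past the slice it was granted:
     *
     *     slice_end = now_ticks + SLICE_TICKS;
     *     while (now_ticks < slice_end && client has a complete request)
     *         execute one request;
     */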