.. _context:

Context
=======

A Gallium rendering context encapsulates the state which affects 3D
rendering such as blend state, depth/stencil state, texture samplers,
etc.

Note that resource/texture allocation is not per-context but per-screen.


Methods
-------

CSO State
^^^^^^^^^

All Constant State Object (CSO) state is created, bound, and destroyed
with triplets of methods that all follow a specific naming scheme.
For example, ``create_blend_state``, ``bind_blend_state``, and
``destroy_blend_state``.

CSO objects handled by the context object:

* :ref:`Blend`: ``*_blend_state``
* :ref:`Sampler`: Texture sampler states are bound separately for fragment,
  vertex, geometry and compute shaders with the ``bind_sampler_states``
  function. The ``start`` and ``num_samplers`` parameters indicate a range
  of samplers to change. NOTE: at this time, start is always zero and
  the CSO module will always replace all samplers at once (no sub-ranges).
  This may change in the future.
* :ref:`Rasterizer`: ``*_rasterizer_state``
* :ref:`depth-stencil-alpha`: ``*_depth_stencil_alpha_state``
* :ref:`Shader`: These are create, bind and destroy methods for vertex,
  fragment and geometry shaders.
* :ref:`vertexelements`: ``*_vertex_elements_state``


Resource Binding State
^^^^^^^^^^^^^^^^^^^^^^

This state describes how resources in various flavours (textures,
buffers, surfaces) are bound to the driver.


* ``set_constant_buffer`` sets a constant buffer to be used for a given shader
  type. index is used to indicate which buffer to set (some APIs may allow
  multiple ones to be set, and binding a specific one later, though drivers
  are mostly restricted to the first one right now).

* ``set_framebuffer_state``

* ``set_vertex_buffers``


Non-CSO State
^^^^^^^^^^^^^

These pieces of state are too small, variable, and/or trivial to have CSO
objects. They all follow simple, one-method binding calls, e.g.
``set_blend_color``.

* ``set_stencil_ref`` sets the stencil front and back reference values
  which are used as comparison values in stencil test.
* ``set_blend_color``
* ``set_sample_mask`` sets the per-context multisample sample mask. Note
  that this takes effect even if multisampling is not explicitly enabled if
  the framebuffer surface(s) are multisampled. Also, this mask is AND-ed
  with the optional fragment shader sample mask output (when emitted).
* ``set_sample_locations`` sets the sample locations used for rasterization.
  ``get_sample_position`` still returns the default locations. When NULL,
  the default locations are used.
* ``set_min_samples`` sets the minimum number of samples that must be run.
* ``set_clip_state``
* ``set_polygon_stipple``
* ``set_scissor_states`` sets the bounds for the scissor test, which culls
  pixels before blending to render targets. If the :ref:`Rasterizer` does
  not have the scissor test enabled, then the scissor bounds never need to
  be set since they will not be used. Note that scissor xmin and ymin are
  inclusive, but xmax and ymax are exclusive. The inclusive ranges in x
  and y would be [xmin..xmax-1] and [ymin..ymax-1]. The number of scissors
  should be the same as the number of set viewports and can be up to
  PIPE_MAX_VIEWPORTS.
* ``set_viewport_states``
* ``set_window_rectangles`` sets the window rectangles to be used for
  rendering, as defined by GL_EXT_window_rectangles. There are two
  modes - include and exclude, which define whether the supplied
  rectangles are to be used for including fragments or excluding
  them. All of the rectangles are ORed together, so in exclude mode,
  any fragment inside any rectangle would be culled, while in include
  mode, any fragment outside all rectangles would be culled. xmin/ymin
  are inclusive, while xmax/ymax are exclusive (same as scissor states
  above). Note that this only applies to draws, not clears or
  blits. (Blits have their own way to pass the requisite rectangles
  in.)
* ``set_tess_state`` configures the default tessellation parameters:

  * ``default_outer_level`` is the default value for the outer tessellation
    levels. This corresponds to GL's ``PATCH_DEFAULT_OUTER_LEVEL``.
  * ``default_inner_level`` is the default value for the inner tessellation
    levels. This corresponds to GL's ``PATCH_DEFAULT_INNER_LEVEL``.

* ``set_debug_callback`` sets the callback to be used for reporting
  various debug messages, eventually reported via KHR_debug and
  similar mechanisms.

Samplers
^^^^^^^^

pipe_sampler_state objects control how textures are sampled (coordinate
wrap modes, interpolation modes, etc). Note that samplers are not used
for texture buffer objects. That is, pipe_context::bind_sampler_views()
will not bind a sampler if the corresponding sampler view refers to a
PIPE_BUFFER resource.

Sampler Views
^^^^^^^^^^^^^

These are the means to bind textures to shader stages. To create one, specify
its format, swizzle and LOD range in the sampler view template.

If the texture format differs from the template format, the texture is said
to be cast to another format. Casting can be done only between compatible
formats, that is, formats that have matching component order and sizes.

Swizzle fields specify the way in which fetched texel components are placed
in the result register. For example, ``swizzle_r`` specifies what is going to be
placed in the first component of the result register.

The ``first_level`` and ``last_level`` fields of the sampler view template
specify the LOD range the texture is going to be constrained to. Note that these
values are in addition to the respective min_lod, max_lod values in the
pipe_sampler_state (that is if min_lod is 2.0, and first_level 3, the first mip
level used for sampling from the resource is effectively the fifth).

The ``first_layer`` and ``last_layer`` fields specify the layer range the
texture is going to be constrained to. Similar to the LOD range, this is added
to the array index which is used for sampling.

* ``set_sampler_views`` binds an array of sampler views to a shader stage.
  Every binding point acquires a reference
  to a respective sampler view and releases a reference to the previous
  sampler view.

  Sampler views outside of ``[start_slot, start_slot + num_views)`` are
  unmodified. If ``views`` is NULL, the behavior is the same as if
  ``views[n]`` was NULL for the entire range, i.e. releasing the reference
  for all the sampler views in the specified range.

* ``create_sampler_view`` creates a new sampler view. ``texture`` is associated
  with the sampler view, which results in the sampler view holding a reference
  to the texture. The format specified in the template must be compatible
  with the texture format.

* ``sampler_view_destroy`` destroys a sampler view and releases its reference
  to the associated texture.

Hardware Atomic buffers
^^^^^^^^^^^^^^^^^^^^^^^

Buffers containing hw atomics are required to support the feature
on some drivers.

Drivers that require this need to fill the ``set_hw_atomic_buffers`` method.

Shader Resources
^^^^^^^^^^^^^^^^

Shader resources are textures or buffers that may be read or written
from a shader without an associated sampler. This means that they
have no support for floating point coordinates, address wrap modes or
filtering.

There are 2 types of shader resources: buffers and images.

Buffers are specified using the ``set_shader_buffers`` method.

Images are specified using the ``set_shader_images`` method. When binding
images, the ``level``, ``first_layer`` and ``last_layer`` pipe_image_view
fields specify the mipmap level and the range of layers the image will be
constrained to.

Surfaces
^^^^^^^^

These are the means to use resources as color render targets or depth/stencil
attachments. To create one, specify the mip level, the range of layers, and
the bind flags (either PIPE_BIND_DEPTH_STENCIL or PIPE_BIND_RENDER_TARGET).
Note that layer values are in addition to what is indicated by the geometry
shader output variable XXX_FIXME (that is if first_layer is 3 and geometry
shader indicates index 2, the 5th layer of the resource will be used). These
first_layer and last_layer parameters will only be used for 1d array, 2d array,
cube, and 3d textures; otherwise they are 0.

* ``create_surface`` creates a new surface.

* ``surface_destroy`` destroys a surface and releases its reference to the
  associated resource.

Stream output targets
^^^^^^^^^^^^^^^^^^^^^

Stream output, also known as transform feedback, allows writing the primitives
produced by the vertex pipeline to buffers. This is done after the geometry
shader or vertex shader if no geometry shader is present.

The stream output targets are views into buffer resources which can be bound
as stream outputs and specify a memory range where it's valid to write
primitives. The pipe driver must implement memory protection such that any
primitives written outside of the specified memory range are discarded.

Two stream output targets can use the same resource at the same time, but
with a disjoint memory range.

Additionally, the stream output target internally maintains the offset
into the buffer which is incremented every time something is written to it.
The internal offset is equal to how much data has already been written.
It can be stored in device memory and the CPU actually doesn't have to query
it.

The stream output target can be used in a draw command to provide
the vertex count. The vertex count is derived from the internal offset
discussed above.

* ``create_stream_output_target`` creates a new target.

* ``stream_output_target_destroy`` destroys a target. Users of this should
  use pipe_so_target_reference instead.

* ``set_stream_output_targets`` binds stream output targets. The parameter
  offset is an array which specifies the internal offset of the buffer. The
  internal offset is, besides writing, used for reading the data during the
  draw_auto stage, i.e. it specifies how much data there is in the buffer
  for the purposes of the draw_auto stage. -1 means the buffer should
  be appended to, and everything else sets the internal offset.

NOTE: The currently-bound vertex or geometry shader must be compiled with
the properly-filled-in structure pipe_stream_output_info describing which
outputs should be written to buffers and how. The structure is part of
pipe_shader_state.

Clearing
^^^^^^^^

Clear is one of the most difficult concepts to nail down to a single
interface (due to both different requirements from APIs and also driver/hw
specific differences).

``clear`` initializes some or all of the surfaces currently bound to
the framebuffer to particular RGBA, depth, or stencil values.
Currently, this does not take into account color or stencil write masks (as
used by GL), and always clears the whole surfaces (no scissoring as used by
GL clear or explicit rectangles like d3d9 uses). It can, however, also clear
only depth or stencil in a combined depth/stencil surface.
If a surface includes several layers then all layers will be cleared.

``clear_render_target`` clears a single color rendertarget with the specified
color value. While it is only possible to clear one surface at a time (which can
include several layers), this surface need not be bound to the framebuffer.
If render_condition_enabled is false, any current rendering condition is ignored
and the clear will be unconditional.

``clear_depth_stencil`` clears a single depth, stencil or depth/stencil surface
with the specified depth and stencil values (for combined depth/stencil buffers,
it is also possible to only clear one or the other part). While it is only
possible to clear one surface at a time (which can include several layers),
this surface need not be bound to the framebuffer.
If render_condition_enabled is false, any current rendering condition is ignored
and the clear will be unconditional.

``clear_texture`` clears a non-PIPE_BUFFER resource's specified level
and bounding box with a clear value provided in that resource's native
format.

``clear_buffer`` clears a PIPE_BUFFER resource with the specified clear value
(which may be multiple bytes in length). Logically this is a memset with a
multi-byte element value starting at offset bytes from resource start, going
for size bytes. It is guaranteed that size % clear_value_size == 0.

Evaluating Depth Buffers
^^^^^^^^^^^^^^^^^^^^^^^^

``evaluate_depth_buffer`` is a hint to decompress the current depth buffer
assuming the current sample locations to avoid problems that could arise when
using programmable sample locations.

If a depth buffer is rendered with different sample location state than
what is current at the time of reading the depth buffer, the values may differ
because depth buffer compression can depend on the sample locations.


Uploading
^^^^^^^^^

For simple single-use uploads, use ``pipe_context::stream_uploader`` or
``pipe_context::const_uploader``. The latter should be used for uploading
constants, while the former should be used for uploading everything else.
PIPE_USAGE_STREAM is implied in both cases, so don't use the uploaders
for static allocations.

Usage:

Call u_upload_alloc or u_upload_data as many times as you want. After you are
done, call u_upload_unmap. If the driver doesn't support persistent mappings,
u_upload_unmap makes sure the previously mapped memory is unmapped.

Gotchas:

- Always fill the memory immediately after u_upload_alloc. Any following call
  to u_upload_alloc and u_upload_data can unmap memory returned by previous
  u_upload_alloc.
- Don't interleave calls using stream_uploader and const_uploader. If you use
  one of them, do the upload, unmap, and only then can you use the other one.


Drawing
^^^^^^^

``draw_vbo`` draws a specified primitive. The primitive mode and other
properties are described by ``pipe_draw_info``.

The ``mode``, ``start``, and ``count`` fields of ``pipe_draw_info`` specify
the mode of the primitive and the vertices to be fetched, in the range from
``start`` to ``start + count - 1``, inclusive.

Every instance with instanceID in the range from ``start_instance`` to
``start_instance + instance_count - 1``, inclusive, will be drawn.

If ``index_size`` != 0, all vertex indices will be looked up from the index
buffer.

In indexed draw, ``min_index`` and ``max_index`` respectively provide a lower
and upper bound of the indices contained in the index buffer inside the range
from ``start`` to ``start + count - 1``. This allows the driver to
determine which subset of vertices will be referenced during the draw call
without having to scan the index buffer. Providing an over-estimation of
the true bounds, for example, a ``min_index`` and ``max_index`` of 0 and
0xffffffff respectively, must give exactly the same rendering, albeit with less
performance due to unreferenced vertex buffers being unnecessarily DMA'ed or
processed. Providing an underestimation of the true bounds will result in
undefined behavior, but should not result in program or system failure.

In case of non-indexed draw, ``min_index`` should be set to
``start`` and ``max_index`` should be set to ``start + count - 1``.

``index_bias`` is a value added to every vertex index after lookup and before
fetching vertex attributes.

When drawing indexed primitives, the primitive restart index can be
used to draw disjoint primitive strips. For example, several separate
line strips can be drawn by designating a special index value as the
restart index. The ``primitive_restart`` flag enables/disables this
feature. The ``restart_index`` field specifies the restart index value.

When primitive restart is in use, array indexes are compared to the
restart index before adding the index_bias offset.

If a given vertex element has ``instance_divisor`` set to 0, it is said
it contains per-vertex data and effective vertex attribute address needs
to be recalculated for every index.

    attribAddr = ``stride`` * index + ``src_offset``

If a given vertex element has ``instance_divisor`` set to non-zero,
it is said it contains per-instance data and effective vertex attribute
address needs to be recalculated for every ``instance_divisor``-th instance.

    attribAddr = ``stride`` * (instanceID / ``instance_divisor``) + ``src_offset``

In the above formulas, ``src_offset`` is taken from the given vertex element
and ``stride`` is taken from a vertex buffer associated with the given
vertex element.
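
The two address formulas above can be sketched in C. This is only an
illustration: the ``example_*`` structures and function are hypothetical
stand-ins rather than Gallium types, and the sketch assumes that the division
by ``instance_divisor`` is an integer division applied to the instance ID
before the multiplication by ``stride``, which is what "recalculated for every
``instance_divisor``-th instance" suggests.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   /* Hypothetical stand-ins for the fields named in the text; these are
    * not the real Gallium structures. */
   struct example_vertex_element {
      uint32_t src_offset;        /* byte offset of the attribute */
      uint32_t instance_divisor;  /* 0 selects per-vertex data */
   };

   struct example_vertex_buffer {
      uint32_t stride;            /* bytes between consecutive elements */
   };

   /* Byte offset into the vertex buffer for one attribute fetch. */
   static uint64_t
   example_attrib_addr(const struct example_vertex_element *ve,
                       const struct example_vertex_buffer *vb,
                       uint32_t index, uint32_t instance_id)
   {
      assert(ve && vb);

      if (ve->instance_divisor == 0) {
         /* Per-vertex data: the address advances with every index. */
         return (uint64_t)vb->stride * index + ve->src_offset;
      }

      /* Per-instance data: the address advances once every
       * instance_divisor instances (integer division is an assumption). */
      return (uint64_t)vb->stride * (instance_id / ve->instance_divisor) +
             ve->src_offset;
   }

With a stride of 16 and a ``src_offset`` of 8, index 3 of a per-vertex
attribute is fetched from byte 56, while instances 4 and 5 of an attribute
with a divisor of 2 both read from byte 40.
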

The calculated attribAddr is used as an offset into the vertex buffer to
fetch the attribute data.

The value of ``instanceID`` can be read in a vertex shader through a system
value register declared with INSTANCEID semantic name.


Queries
^^^^^^^

Queries gather some statistic from the 3D pipeline over one or more
draws. Queries may be nested, though not all state trackers exercise this.

Queries can be created with ``create_query`` and deleted with
``destroy_query``. To start a query, use ``begin_query``, and when finished,
use ``end_query`` to end the query.

``create_query`` takes a query type (``PIPE_QUERY_*``), as well as an index,
which is the vertex stream for ``PIPE_QUERY_PRIMITIVES_GENERATED`` and
``PIPE_QUERY_PRIMITIVES_EMITTED``, and allocates a query structure.

``begin_query`` will clear/reset previous query results.

``get_query_result`` is used to retrieve the results of a query. If
the ``wait`` parameter is TRUE, then the ``get_query_result`` call
will block until the results of the query are ready (and TRUE will be
returned). Otherwise, if the ``wait`` parameter is FALSE, the call
will not block and the return value will be TRUE if the query has
completed or FALSE otherwise.

``get_query_result_resource`` is used to store the result of a query into
a resource without synchronizing with the CPU. This write will optionally
wait for the query to complete, and will optionally write whether the value
is available instead of the value itself.

``set_active_query_state`` sets whether all current non-driver queries except
TIME_ELAPSED are active or paused.

The interface currently includes the following types of queries:

``PIPE_QUERY_OCCLUSION_COUNTER`` counts the number of fragments which
are written to the framebuffer without being culled by
:ref:`depth-stencil-alpha` testing or shader KILL instructions.
The result is an unsigned 64-bit integer.
This query can be used with ``render_condition``.

In cases where a boolean result of an occlusion query is enough,
``PIPE_QUERY_OCCLUSION_PREDICATE`` should be used. It is just like
``PIPE_QUERY_OCCLUSION_COUNTER`` except that the result is a boolean
value of FALSE for cases where COUNTER would result in 0 and TRUE
for all other cases.
This query can be used with ``render_condition``.

In cases where a conservative approximation of an occlusion query is enough,
``PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE`` should be used. It behaves
like ``PIPE_QUERY_OCCLUSION_PREDICATE``, except that it may return TRUE in
additional, implementation-dependent cases.
This query can be used with ``render_condition``.

``PIPE_QUERY_TIME_ELAPSED`` returns the amount of time, in nanoseconds,
the context takes to perform operations.
The result is an unsigned 64-bit integer.

``PIPE_QUERY_TIMESTAMP`` returns a device/driver internal timestamp,
scaled to nanoseconds, recorded after all commands issued prior to
``end_query`` have been processed.
This query does not require a call to ``begin_query``.
The result is an unsigned 64-bit integer.

``PIPE_QUERY_TIMESTAMP_DISJOINT`` can be used to check the
internal timer resolution and whether the timestamp counter has become
unreliable due to things like throttling etc. - only if this is FALSE
a timestamp query (within the timestamp_disjoint query) should be trusted.
The result is a 64-bit integer specifying the timer resolution in Hz,
followed by a boolean value indicating whether the timestamp counter
is discontinuous or disjoint.

``PIPE_QUERY_PRIMITIVES_GENERATED`` returns a 64-bit integer indicating
the number of primitives processed by the pipeline (regardless of whether
stream output is active or not).

``PIPE_QUERY_PRIMITIVES_EMITTED`` returns a 64-bit integer indicating
the number of primitives written to stream output buffers.

``PIPE_QUERY_SO_STATISTICS`` returns 2 64-bit integers corresponding to
the result of
``PIPE_QUERY_PRIMITIVES_EMITTED`` and
the number of primitives that would have been written to stream output buffers
if they had infinite space available (primitives_storage_needed), in this order.
XXX the 2nd value is equivalent to ``PIPE_QUERY_PRIMITIVES_GENERATED`` but it is
unclear if it should be increased if stream output is not active.

``PIPE_QUERY_SO_OVERFLOW_PREDICATE`` returns a boolean value indicating
whether a selected stream output target has overflowed as a result of the
commands issued between ``begin_query`` and ``end_query``.
This query can be used with ``render_condition``. The output stream is
selected by the stream number passed to ``create_query``.

``PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE`` returns a boolean value indicating
whether any stream output target has overflowed as a result of the commands
issued between ``begin_query`` and ``end_query``. This query can be used
with ``render_condition``, and its result is the logical OR of multiple
``PIPE_QUERY_SO_OVERFLOW_PREDICATE`` queries, one for each stream output
target.

``PIPE_QUERY_GPU_FINISHED`` returns a boolean value indicating whether
all commands issued before ``end_query`` have completed. However, this
does not imply serialization.
This query does not require a call to ``begin_query``.

``PIPE_QUERY_PIPELINE_STATISTICS`` returns an array of the following
64-bit integers:

* Number of vertices read from vertex buffers.
* Number of primitives read from vertex buffers.
* Number of vertex shader threads launched.
* Number of geometry shader threads launched.
* Number of primitives generated by geometry shaders.
* Number of primitives forwarded to the rasterizer.
* Number of primitives rasterized.
* Number of fragment shader threads launched.
* Number of tessellation control shader threads launched.
* Number of tessellation evaluation shader threads launched.

If a shader type is not supported by the device/driver,
the corresponding values should be set to 0.

``PIPE_QUERY_PIPELINE_STATISTICS_SINGLE`` returns a single counter from
the ``PIPE_QUERY_PIPELINE_STATISTICS`` group. The specific counter must
be selected when calling ``create_query`` by passing one of the
``PIPE_STAT_QUERY`` enums as the query's ``index``.

Gallium does not guarantee the availability of any query types; one must
always check the capabilities of the :ref:`Screen` first.


Conditional Rendering
^^^^^^^^^^^^^^^^^^^^^

A drawing command can be skipped depending on the outcome of a query
(typically an occlusion query, or streamout overflow predicate).
The ``render_condition`` function specifies the query which should be checked
prior to rendering anything. Functions always honoring render_condition include
(and are limited to) draw_vbo and clear.
The blit, clear_render_target and clear_depth_stencil functions (but
not resource_copy_region, which seems inconsistent) can also optionally honor
the current render condition.

If ``render_condition`` is called with ``query`` = NULL, conditional
rendering is disabled and drawing takes place normally.

If ``render_condition`` is called with a non-null ``query``, subsequent
drawing commands will be predicated on the outcome of the query.
Commands will be skipped if ``condition`` is equal to the predicate result
(for non-boolean queries such as OCCLUSION_QUERY, zero counts as FALSE,
non-zero as TRUE).

If ``mode`` is PIPE_RENDER_COND_WAIT the driver will wait for the
query to complete before deciding whether to render.
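
The skip rule for the waiting case can be sketched as follows. The
``example_query`` structure and the helper are hypothetical and exist only for
illustration; the one thing taken from the text is the comparison between
``condition`` and the predicate result, with zero counting as FALSE.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical model of a finished query; not a Gallium type. */
   struct example_query {
      bool ready;        /* result is available */
      uint64_t result;   /* counter or boolean result */
   };

   /* Returns true if a draw should be skipped.  A NULL query models
    * render_condition() being called with query = NULL. */
   static bool
   example_draw_is_skipped(const struct example_query *query, bool condition)
   {
      if (!query)
         return false;   /* conditional rendering disabled */

      /* Models PIPE_RENDER_COND_WAIT: the decision is only taken once
       * the result is known. */
      assert(query->ready);

      /* For non-boolean queries, zero counts as FALSE. */
      bool predicate = query->result != 0;
      return predicate == condition;
   }

Under this rule, an occlusion counter of zero combined with ``condition`` =
FALSE skips the draw, while any non-zero count lets it through.
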

If ``mode`` is PIPE_RENDER_COND_NO_WAIT and the query has not yet
completed, the drawing command will be executed normally. If the query
has completed, drawing will be predicated on the outcome of the query.

If ``mode`` is PIPE_RENDER_COND_BY_REGION_WAIT or
PIPE_RENDER_COND_BY_REGION_NO_WAIT rendering will be predicated as above
for the non-REGION modes but in the case that an occlusion query returns
a non-zero result, regions which were occluded may be omitted by subsequent
drawing commands. This can result in better performance with some GPUs.
Normally, if the occlusion query returned a non-zero result subsequent
drawing happens normally so fragments may be generated, shaded and
processed even where they're known to be obscured.


Flushing
^^^^^^^^

``flush``

PIPE_FLUSH_END_OF_FRAME: Whether the flush marks the end of frame.

PIPE_FLUSH_DEFERRED: It is not required to flush right away, but it is required
to return a valid fence. If fence_finish is called with the returned fence
and the context is still unflushed, and the ctx parameter of fence_finish is
equal to the context where the fence was created, fence_finish will flush
the context.

PIPE_FLUSH_ASYNC: The flush is allowed to be asynchronous. Unlike
``PIPE_FLUSH_DEFERRED``, the driver must still ensure that the returned fence
will finish in finite time. However, subsequent operations in other contexts of
the same screen are no longer guaranteed to happen after the flush. Drivers
which use this flag must implement pipe_context::fence_server_sync.

PIPE_FLUSH_HINT_FINISH: Hints to the driver that the caller will immediately
wait for the returned fence.

Additional flags may be set together with ``PIPE_FLUSH_DEFERRED`` for even
finer-grained fences. Note that as a general rule, GPU caches may not have been
flushed yet when these fences are signaled. Drivers are free to ignore these
flags and create normal fences instead. At most one of the following flags can
be specified:

PIPE_FLUSH_TOP_OF_PIPE: The fence should be signaled as soon as the next
command is ready to start executing at the top of the pipeline, before any of
its data is actually read (including indirect draw parameters).

PIPE_FLUSH_BOTTOM_OF_PIPE: The fence should be signaled as soon as the previous
command has finished executing on the GPU entirely (but data written by the
command may still be in caches and inaccessible to the CPU).


``flush_resource``

Flush the resource cache, so that the resource can be used
by an external client. Possible usage:

- flushing a resource before presenting it on the screen
- flushing a resource if some other process or device wants to use it

This shouldn't be used to flush caches if the resource is only managed
by a single pipe_screen and is not shared with another process.
(i.e. you shouldn't use it to flush caches explicitly if you want to e.g.
use the resource for texturing)

Fences
^^^^^^

``pipe_fence_handle``, and related methods, are used to synchronize
execution between multiple parties. Examples include CPU <-> GPU synchronization,
renderer <-> windowing system, multiple external APIs, etc.

A ``pipe_fence_handle`` can either be 'one time use' or 're-usable'. A 'one time use'
fence behaves like a traditional GPU fence. Once it reaches the signaled state it
is forever considered to be signaled.

Once a re-usable ``pipe_fence_handle`` becomes signaled, it can be reset
back into an unsignaled state. The ``pipe_fence_handle`` will be reset to
the unsignaled state by performing a wait operation on said object, i.e.
``fence_server_sync``. As a corollary to this behaviour, a re-usable
``pipe_fence_handle`` can only have one waiter.

This behaviour is useful in producer <-> consumer chains. It helps avoid
unnecessarily sharing a new ``pipe_fence_handle`` each time a new frame is
ready. Instead, the fences are exchanged once ahead of time, and access is synchronized
through GPU signaling instead of direct producer <-> consumer communication.

``fence_server_sync`` inserts a wait command into the GPU's command stream.

``fence_server_signal`` inserts a signal command into the GPU's command stream.

There are no guarantees that the wait/signal commands will be flushed when
calling ``fence_server_sync`` or ``fence_server_signal``. An explicit
call to ``flush`` is required to make sure the commands are emitted to the GPU.

The Gallium implementation may implicitly ``flush`` the command stream during a
``fence_server_sync`` or ``fence_server_signal`` call if necessary.

Resource Busy Queries
^^^^^^^^^^^^^^^^^^^^^

``is_resource_referenced``



Blitting
^^^^^^^^

These methods emulate classic blitter controls.

These methods operate directly on ``pipe_resource`` objects, and stand
apart from any 3D state in the context. Blitting functionality may be
moved to a separate abstraction at some point in the future.

``resource_copy_region`` blits a region of a resource to a region of another
resource, provided that both resources have the same format, or compatible
formats, i.e., formats for which copying the bytes from the source resource
unmodified to the destination resource will achieve the same effect as a
textured quad blitter. The source and destination may be the same resource,
but overlapping blits are not permitted.
This can be considered the equivalent of a CPU memcpy.
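
Since overlapping copies within one resource are not permitted, a caller may
want to test for overlap before issuing the copy. The sketch below is one way
to do that under stated assumptions: ``example_box`` is a hypothetical
stand-in carrying ``x``/``y``/``z`` and ``width``/``height``/``depth`` fields,
both boxes are assumed to address the same mip level of the same resource, and
all extents are assumed to be positive.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>

   /* Hypothetical stand-in for a 3D region; not the Gallium box type. */
   struct example_box {
      int x, y, z;
      int width, height, depth;
   };

   /* True if the half-open ranges [a, a + an) and [b, b + bn) intersect. */
   static bool
   example_range_overlaps(int a, int an, int b, int bn)
   {
      return a < b + bn && b < a + an;
   }

   /* True if source and destination regions share at least one texel.
    * Two boxes overlap only if they overlap along every axis. */
   static bool
   example_boxes_overlap(const struct example_box *src,
                         const struct example_box *dst)
   {
      assert(src && dst);
      return example_range_overlaps(src->x, src->width, dst->x, dst->width) &&
             example_range_overlaps(src->y, src->height, dst->y, dst->height) &&
             example_range_overlaps(src->z, src->depth, dst->z, dst->depth);
   }

Two 4x4 regions that merely touch along an edge do not overlap under this
test, because the ranges are treated as half-open.
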

``blit`` blits a region of a resource to a region of another resource, including
scaling, format conversion, and up-/downsampling, as well as a destination clip
rectangle (scissors) and window rectangles. It can also optionally honor the
current render condition (but either way the blit itself never contributes
anything to queries currently gathering data).
As opposed to manually drawing a textured quad, this lets the pipe driver choose
the optimal method for blitting (like using a special 2D engine), and usually
offers, for example, accelerated stencil-only copies even where
PIPE_CAP_SHADER_STENCIL_EXPORT is not available.


Transfers
^^^^^^^^^

These methods are used to get data to/from a resource.

``transfer_map`` creates a memory mapping and the transfer object
associated with it.
The returned pointer points to the start of the mapped range according to
the box region, not the beginning of the resource. If transfer_map fails,
the returned pointer to the buffer memory is NULL, and the pointer
to the transfer object remains unchanged (i.e. it can be non-NULL).

``transfer_unmap`` removes the memory mapping for and destroys
the transfer object. The pointer into the resource should be considered
invalid and discarded.

``texture_subdata`` and ``buffer_subdata`` perform a simplified
transfer for simple writes. Basically transfer_map, data write, and
transfer_unmap all in one.


The box parameter to some of these functions defines a 1D, 2D or 3D
region of pixels. This is self-explanatory for 1D, 2D and 3D texture
targets.

For PIPE_TEXTURE_1D_ARRAY and PIPE_TEXTURE_2D_ARRAY, the box::z and box::depth
fields refer to the array dimension of the texture.

For PIPE_TEXTURE_CUBE, the box::z and box::depth fields refer to the
faces of the cube map (z + depth <= 6).

For PIPE_TEXTURE_CUBE_ARRAY, the box::z and box::depth fields refer to both
the face and array dimension of the texture (face = z % 6, array = z / 6).


.. _transfer_flush_region:

transfer_flush_region
%%%%%%%%%%%%%%%%%%%%%

If a transfer was created with ``FLUSH_EXPLICIT``, it will not automatically
be flushed on write or unmap. Flushes must be requested with
``transfer_flush_region``. Flush ranges are relative to the mapped range, not
the beginning of the resource.



.. _texture_barrier:

texture_barrier
%%%%%%%%%%%%%%%

This function flushes all pending writes to the currently-set surfaces and
invalidates all read caches of the currently-set samplers. This can be used
for both regular textures as well as for framebuffers read via FBFETCH.



.. _memory_barrier:

memory_barrier
%%%%%%%%%%%%%%

This function flushes caches according to which of the PIPE_BARRIER_* flags
are set.



.. _resource_commit:

resource_commit
%%%%%%%%%%%%%%%

This function changes the commit state of a part of a sparse resource. Sparse
resources are created by setting the ``PIPE_RESOURCE_FLAG_SPARSE`` flag when
calling ``resource_create``. Initially, sparse resources only reserve a virtual
memory region that is not backed by memory (i.e., it is uncommitted). The
``resource_commit`` function can be called to commit or uncommit parts (or all)
of a resource. The driver manages the underlying backing memory.

The contents of newly committed memory regions are undefined. Calling this
function to commit an already committed memory region is allowed and leaves its
content unchanged. Similarly, calling this function to uncommit an already
uncommitted memory region is allowed.

For buffers, the given box must be aligned to multiples of
``PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE``.
As an exception to this rule, if the size
of the buffer is not a multiple of the page size, changing the commit state of
the last (partial) page requires a box that ends at the end of the buffer
(i.e., box->x + box->width == buffer->width0).



.. _pipe_transfer:

PIPE_TRANSFER
^^^^^^^^^^^^^

These flags control the behavior of a transfer object.

``PIPE_TRANSFER_READ``
  Resource contents are read back (or accessed directly) at transfer create
  time.

``PIPE_TRANSFER_WRITE``
  Resource contents will be written back at transfer_unmap time (or modified
  as a result of being accessed directly).

``PIPE_TRANSFER_MAP_DIRECTLY``
  A transfer should directly map the resource. May return NULL if not supported.

``PIPE_TRANSFER_DISCARD_RANGE``
  The memory within the mapped region is discarded. Cannot be used with
  ``PIPE_TRANSFER_READ``.

``PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE``
  Discards all memory backing the resource. It should not be used with
  ``PIPE_TRANSFER_READ``.

``PIPE_TRANSFER_DONTBLOCK``
  Fail if the resource cannot be mapped immediately.

``PIPE_TRANSFER_UNSYNCHRONIZED``
  Do not synchronize pending operations on the resource when mapping. The
  interaction of any writes to the map and any operations pending on the
  resource is undefined. Cannot be used with ``PIPE_TRANSFER_READ``.

``PIPE_TRANSFER_FLUSH_EXPLICIT``
  Written ranges will be notified later with :ref:`transfer_flush_region`.
  Cannot be used with ``PIPE_TRANSFER_READ``.

``PIPE_TRANSFER_PERSISTENT``
  Allows the resource to be used for rendering while mapped.
  PIPE_RESOURCE_FLAG_MAP_PERSISTENT must be set when creating
  the resource.
  If COHERENT is not set, memory_barrier(PIPE_BARRIER_MAPPED_BUFFER)
  must be called to ensure the device can see what the CPU has written.

``PIPE_TRANSFER_COHERENT``
  If PERSISTENT is set, this ensures any writes done by the device are
  immediately visible to the CPU and vice versa.
  PIPE_RESOURCE_FLAG_MAP_COHERENT must be set when creating
  the resource.

Compute kernel execution
^^^^^^^^^^^^^^^^^^^^^^^^

A compute program can be defined, bound or destroyed using
``create_compute_state``, ``bind_compute_state`` or
``destroy_compute_state`` respectively.

Any of the subroutines contained within the compute program can be
executed on the device using the ``launch_grid`` method. This method
will execute as many instances of the program as elements in the
specified N-dimensional grid, hopefully in parallel.

The compute program has access to four special resources:

* ``GLOBAL`` represents a memory space shared among all the threads
  running on the device. An arbitrary buffer created with the
  ``PIPE_BIND_GLOBAL`` flag can be mapped into it using the
  ``set_global_binding`` method.

* ``LOCAL`` represents a memory space shared among all the threads
  running in the same working group. The initial contents of this
  resource are undefined.

* ``PRIVATE`` represents a memory space local to a single thread.
  The initial contents of this resource are undefined.

* ``INPUT`` represents a read-only memory space that can be
  initialized at ``launch_grid`` time.

These resources use a byte-based addressing scheme, and they can be
accessed from the compute program by means of the LOAD/STORE TGSI
opcodes. Additional resources to be accessed using the same opcodes
may be specified by the user with the ``set_compute_resources``
method.

In addition, normal texture sampling is allowed from the compute
program: ``bind_sampler_states`` may be used to set up texture
samplers for the compute stage and ``set_sampler_views`` may
be used to bind a number of sampler views to it.
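
As a rough illustration of the statement that ``launch_grid`` runs one
program instance per element of the N-dimensional grid, the instance count is
the product of the grid extents. The helper below is a sketch with a
hypothetical name and signature. It does not use the real ``launch_grid``
arguments, and it assumes that the product fits in 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of program instances for a grid with `dims` dimensions, where
 * grid_size[i] is the extent of the grid along dimension i.
 * Assumes the product does not overflow 64 bits.
 */
static uint64_t grid_instance_count(const uint32_t *grid_size, unsigned dims)
{
    assert(grid_size != NULL && dims >= 1);

    uint64_t count = 1;
    for (unsigned i = 0; i < dims; i++)
        count *= grid_size[i];
    return count;
}
```

A 4 x 2 x 3 grid therefore corresponds to 24 instances, and any zero extent
yields an empty launch.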

Mipmap generation
^^^^^^^^^^^^^^^^^

If PIPE_CAP_GENERATE_MIPMAP is true, ``generate_mipmap`` can be used
to generate mipmaps for the specified texture resource.
It replaces texel image levels base_level+1 through
last_level for the layer range first_layer through last_layer.
It returns TRUE if mipmap generation succeeds, otherwise it
returns FALSE. Mipmap generation may fail when it is not supported
for particular texture types or formats.

Device resets
^^^^^^^^^^^^^

The state tracker can query or request notifications of when the GPU
is reset for whatever reason (application error, driver error). When
a GPU reset happens, the context becomes unusable and all related state
should be considered lost and undefined. Despite that, context
notifications are single-shot, i.e. subsequent calls to
``get_device_reset_status`` will return PIPE_NO_RESET.

* ``get_device_reset_status`` queries whether a device reset has happened
  since the last call or since the last notification by callback.
* ``set_device_reset_callback`` sets a callback which will be called when
  a device reset is detected. The callback is only called synchronously.

Bindless
^^^^^^^^

If PIPE_CAP_BINDLESS_TEXTURE is TRUE, the following ``pipe_context`` functions
are used to create/delete bindless handles, and to make them resident in the
current context when they are going to be used by shaders.

* ``create_texture_handle`` creates a 64-bit unsigned integer texture handle
  that is going to be directly used in shaders.
* ``delete_texture_handle`` deletes a 64-bit unsigned integer texture handle.
* ``make_texture_handle_resident`` makes a 64-bit unsigned integer texture
  handle resident in the current context to be accessible by shaders for
  texture mapping.
* ``create_image_handle`` creates a 64-bit unsigned integer image handle that
  is going to be directly used in shaders.
* ``delete_image_handle`` deletes a 64-bit unsigned integer image handle.
* ``make_image_handle_resident`` makes a 64-bit unsigned integer image handle
  resident in the current context to be accessible by shaders for image loads,
  stores and atomic operations.

Using several contexts
----------------------

Several contexts from the same screen can be used at the same time. Objects
created on one context cannot be used in another context, but the objects
created by the screen methods can be used by all contexts.

Transfers
^^^^^^^^^

A transfer on one context is not expected to synchronize properly with
rendering on other contexts, thus only areas not yet used for rendering should
be locked.

A flush is required after transfer_unmap before other contexts can be expected
to see the uploaded data, unless:

* Using persistent mapping. When combined with coherent mapping, unmapping the
  resource is also not required to use it in other contexts. Without coherent
  mapping, memory_barrier(PIPE_BARRIER_MAPPED_BUFFER) should be called on the
  context that has mapped the resource. No flush is required.

* Mapping the resource with PIPE_TRANSFER_MAP_DIRECTLY.

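
The flush rules above can be summarised as a small decision helper. This is
only a sketch of the rules as stated in this section: the ``EXAMPLE_*``
constants use made-up bit values (the real ``PIPE_TRANSFER_*`` values are
defined in the Gallium headers) and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only, NOT the real PIPE_TRANSFER_* values. */
enum {
    EXAMPLE_TRANSFER_MAP_DIRECTLY = 1 << 0,
    EXAMPLE_TRANSFER_PERSISTENT   = 1 << 1,
    EXAMPLE_TRANSFER_COHERENT     = 1 << 2
};

/*
 * True if a flush is needed after transfer_unmap before other contexts
 * can be expected to see the uploaded data.
 */
static bool needs_flush_after_unmap(unsigned flags)
{
    return (flags & (EXAMPLE_TRANSFER_PERSISTENT |
                     EXAMPLE_TRANSFER_MAP_DIRECTLY)) == 0;
}

/*
 * True if memory_barrier(PIPE_BARRIER_MAPPED_BUFFER) should be called on
 * the mapping context: persistent mapping without coherent mapping.
 */
static bool needs_mapped_buffer_barrier(unsigned flags)
{
    return (flags & EXAMPLE_TRANSFER_PERSISTENT) != 0 &&
           (flags & EXAMPLE_TRANSFER_COHERENT) == 0;
}

/* Usage: a few cases taken directly from the rules above. */
static void check_examples(void)
{
    assert(needs_flush_after_unmap(0));
    assert(!needs_flush_after_unmap(EXAMPLE_TRANSFER_PERSISTENT));
    assert(needs_mapped_buffer_barrier(EXAMPLE_TRANSFER_PERSISTENT));
}
```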