Name

    MESA_shader_integer_functions

Name Strings

    GL_MESA_shader_integer_functions

Contact

    Ian Romanick <ian.d.romanick@intel.com>

Contributors

    All the contributors of GL_ARB_gpu_shader5

Status

    Supported by all GLSL 1.30 capable drivers in Mesa 12.1 and later

Version

    Version 3, March 31, 2017

Number

    OpenGL Extension #495

Dependencies

    This extension is written against the OpenGL 3.2 (Compatibility Profile)
    Specification.

    This extension is written against Version 1.50 (Revision 09) of the OpenGL
    Shading Language Specification.

    GLSL 1.30 (OpenGL) or GLSL ES 3.00 (OpenGL ES) is required.

    This extension interacts with ARB_gpu_shader5.

    This extension interacts with ARB_gpu_shader_fp64.

    This extension interacts with NV_gpu_shader5.

Overview

    GL_ARB_gpu_shader5 extends GLSL in a number of useful ways.  Much of this
    added functionality requires significant hardware support.  There are many
    aspects, however, that can be easily implemented on any GPU with "real"
    integer support (as opposed to simulating integers using floating point
    calculations).

    This extension provides a set of new features to the OpenGL Shading
    Language to support capabilities of these GPUs, extending the
    capabilities of version 1.30 of the OpenGL Shading Language and version
    3.00 of the OpenGL ES Shading Language.  Shaders using the new
    functionality provided by this extension should enable this
    functionality via the construct

      #extension GL_MESA_shader_integer_functions : require   (or enable)

    This extension provides a variety of new features for all shader types,
    including:

      * support for implicitly converting signed integer types to unsigned
        types, as well as more general implicit conversion and function
        overloading infrastructure to support new data types introduced by
        other extensions;

      * new built-in functions supporting:

        * splitting a floating-point number into a significand and exponent
          (frexp), or building a floating-point number from a significand and
          exponent (ldexp);

        * integer bitfield manipulation, including functions to find the
          position of the most or least significant set bit, count the number
          of one bits, and bitfield insertion, extraction, and reversal;

        * extended integer precision math, including add with carry, subtract
          with borrow, and extended multiplication;

    The resulting extension is a strict subset of GL_ARB_gpu_shader5.

IP Status

    No known IP claims.

New Procedures and Functions

    None

New Tokens

    None

Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
(OpenGL Operation)

    None.

Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
(Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
(Per-Fragment Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
(Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
(State and State Requests)

    None.

Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
Specification (Invariance)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

Modifications to The OpenGL Shading Language Specification, Version 1.50
(Revision 09)

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_MESA_shader_integer_functions : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_MESA_shader_integer_functions        1

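    As an illustration only (not part of the extension text), a fragment
    shader might request the extension and test the preprocessor define as
    sketched below.  The output name fragColor and the fallback branch are
    assumptions made for this example.

```glsl
#version 130
#extension GL_MESA_shader_integer_functions : enable

out vec4 fragColor;   // hypothetical output name

void main()
{
#ifdef GL_MESA_shader_integer_functions
    // Extension available: use one of the new built-ins.
    int ones = bitCount(0xF0u);              // four bits are set
    fragColor = vec4(float(ones) / 32.0);
#else
    // Fallback when the implementation does not expose the extension.
    fragColor = vec4(0.0);
#endif
}
```

    With the "enable" behavior a shader can still compile when the
    extension is missing, which is why the define is checked; with
    "require" the #ifdef would be unnecessary.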

    Modify Section 4.1.10, Implicit Conversions, p. 27

    (modify table of implicit conversions)

                                Can be implicitly
        Type of expression        converted to
        ---------------------   -----------------
        int                     uint, float
        ivec2                   uvec2, vec2
        ivec3                   uvec3, vec3
        ivec4                   uvec4, vec4

        uint                    float
        uvec2                   vec2
        uvec3                   vec3
        uvec4                   vec4

    (modify second paragraph of the section) No implicit conversions are
    provided to convert from unsigned to signed integer types or from
    floating-point to integer types.  There are no implicit array or structure
    conversions.

    (insert before the final paragraph of the section) When performing
    implicit conversion for binary operators, there may be multiple data types
    to which the two operands can be converted.  For example, when adding an
    int value to a uint value, both values can be implicitly converted to uint
    and float.  In such cases, a floating-point type is chosen if either
    operand has a floating-point type.  Otherwise, an unsigned integer type is
    chosen if either operand has an unsigned integer type.  Otherwise, a
    signed integer type is chosen.

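    The conversion rules above can be illustrated with a small fragment
    shader.  This is a sketch rather than normative text; the variable names
    and the output name fragColor are assumptions made for the example.

```glsl
#version 130
#extension GL_MESA_shader_integer_functions : require

out vec4 fragColor;   // hypothetical output name

void main()
{
    int   i = 3;
    uint  u = 4u;
    float f = 0.5;

    uint  a = i;        // int -> uint, allowed by the modified table
    uint  b = i + u;    // no float operand, one uint operand: uint
    float c = u + f;    // one float operand: float
    uint  d = u % i;    // operands of % are converted the same way

    // int e = u;       // still an error: no uint -> int conversion

    fragColor = vec4(float(a), float(b), c, float(d));
}
```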

    Modify Section 5.9, Expressions, p. 57

    (modify bulleted list as follows, adding support for implicit conversion
    between signed and unsigned types)

    Expressions in the shading language are built from the following:

    * Constants of type bool, int, int64_t, uint, uint64_t, float, all vector
      types, and all matrix types.

    ...

    * The operator modulus (%) operates on signed or unsigned integer scalars
      or vectors.  If the fundamental types of the operands do not match, the
      conversions from Section 4.1.10 "Implicit Conversions" are applied to
      produce matching types.  ...


    Modify Section 6.1, Function Definitions, p. 63

    (modify description of overloading, beginning at the top of p. 64)

     Function names can be overloaded.  The same function name can be used for
     multiple functions, as long as the parameter types differ.  If a function
     name is declared twice with the same parameter types, then the return
     types and all qualifiers must also match, and it is the same function
     being declared.  For example,

       vec4 f(in vec4 x, out vec4  y);   // (A)
       vec4 f(in vec4 x, out uvec4 y);   // (B) okay, different argument type
       vec4 f(in ivec4 x, out uvec4 y);  // (C) okay, different argument type

       int  f(in vec4 x, out ivec4 y);  // error, only return type differs
       vec4 f(in vec4 x, in  vec4  y);  // error, only qualifier differs
       vec4 f(const in vec4 x, out vec4 y);  // error, only qualifier differs

     When function calls are resolved, an exact type match for all the
     arguments is sought.  If an exact match is found, all other functions are
     ignored, and the exact match is used.  If no exact match is found, then
     the implicit conversions in Section 4.1.10 (Implicit Conversions) will be
     applied to find a match.  Mismatched types on input parameters (in or
     inout or default) must have a conversion from the calling argument type
     to the formal parameter type.  Mismatched types on output parameters (out
     or inout) must have a conversion from the formal parameter type to the
     calling argument type.

     If implicit conversions can be used to find more than one matching
     function, a single best-matching function is sought.  To determine a best
     match, the conversions between calling argument and formal parameter
     types are compared for each function argument and pair of matching
     functions.  After these comparisons are performed, each pair of matching
     functions are compared.  A function definition A is considered a better
     match than function definition B if:

       * for at least one function argument, the conversion for that argument
         in A is better than the corresponding conversion in B; and

       * there is no function argument for which the conversion in B is better
         than the corresponding conversion in A.

     If a single function definition is considered a better match than every
     other matching function definition, it will be used.  Otherwise, a
     semantic error occurs and the shader will fail to compile.

     To determine whether the conversion for a single argument in one match is
     better than that for another match, the following rules are applied, in
     order:

       1. An exact match is better than a match involving any implicit
          conversion.

       2. A match involving an implicit conversion from float to double is
          better than a match involving any other implicit conversion.

       3. A match involving an implicit conversion from either int or uint to
          float is better than a match involving an implicit conversion from
          either int or uint to double.

     If none of the rules above apply to a particular pair of conversions,
     neither conversion is considered better than the other.

     For the function prototypes (A), (B), and (C) above, the following
     examples show how the rules apply to different sets of calling argument
     types:

       f(vec4, vec4);        // exact match of vec4 f(in vec4 x, out vec4 y)
       f(vec4, uvec4);       // exact match of vec4 f(in vec4 x, out uvec4 y)
       f(vec4, ivec4);       // matched to vec4 f(in vec4 x, out vec4 y)
                             //   (C) not relevant, can't convert vec4 to
                             //   ivec4.  (A) better than (B) for 2nd
                             //   argument (rule 2), same on first argument.
       f(ivec4, vec4);       // NOT matched.  All three match by implicit
                             //   conversion.  (C) is better than (A) and (B)
                             //   on the first argument.  (A) is better than
                             //   (B) and (C) on the second argument.

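    A minimal sketch of overload resolution with the new int-to-uint
    conversion follows.  The functions pick() and twice() and the output
    name fragColor are hypothetical and exist only for this illustration.

```glsl
#version 130
#extension GL_MESA_shader_integer_functions : require

out vec4 fragColor;   // hypothetical output name

// Hypothetical overloads used only for this illustration.
float pick(int  x) { return 0.25; }   // (P)
float pick(uint x) { return 0.50; }   // (Q)

uint twice(uint x) { return x * 2u; }

void main()
{
    float p = pick(1);     // exact match with (P)
    float q = pick(1u);    // exact match with (Q)

    // No exact match for an int argument, so the int -> uint implicit
    // conversion from Section 4.1.10 is applied to the input parameter.
    uint t = twice(3);

    fragColor = vec4(p, q, float(t), 1.0);
}
```

    By the same reading of rules 1 through 3, a call with an int argument to
    a pair of overloads taking uint and float would match both by implicit
    conversion with neither conversion ranked better, so such a call would
    be expected to fail to compile.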

    Modify Section 8.3, Common Functions, p. 84

    (add support for single-precision frexp and ldexp functions)

    Syntax:

      genType frexp(genType x, out genIType exp);
      genType ldexp(genType x, in genIType exp);

    The function frexp() splits each single-precision floating-point number in
    <x> into a binary significand, a floating-point number in the range [0.5,
    1.0), and an integral exponent of two, such that:

      x = significand * 2 ^ exponent

    The significand is returned by the function; the exponent is returned in
    the parameter <exp>.  For a floating-point value of zero, the significand
    and exponent are both zero.  For a floating-point value that is an
    infinity or is not a number, the results of frexp() are undefined.

    If the input <x> is a vector, this operation is performed in a
    component-wise manner; the value returned by the function and the value
    written to <exp> are vectors with the same number of components as <x>.

    The function ldexp() builds a single-precision floating-point number from
    each significand component in <x> and the corresponding integral exponent
    of two in <exp>, returning:

      significand * 2 ^ exponent

    If this product is too large to be represented as a single-precision
    floating-point value, the result is considered undefined.

    If the input <x> is a vector, this operation is performed in a
    component-wise manner; the value passed in <exp> and returned by the
    function are vectors with the same number of components as <x>.

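    As a worked example of the definitions above, 8.0 splits into a
    significand of 0.5 and an exponent of 4, since 8.0 = 0.5 * 2^4.  The
    shader below is a sketch; the variable names and the output name
    fragColor are assumptions.

```glsl
#version 130
#extension GL_MESA_shader_integer_functions : require

out vec4 fragColor;   // hypothetical output name

void main()
{
    // Scalar form: 8.0 = 0.5 * 2^4
    int   e;
    float m = frexp(8.0, e);        // m = 0.5, e = 4
    float x = ldexp(m, e);          // rebuilds 8.0

    // Vector form: one exponent per component.
    ivec2 ev;
    vec2  mv = frexp(vec2(1.0, 0.0), ev);   // mv = (0.5, 0.0), ev = (1, 0)
    vec2  xv = ldexp(mv, ev);               // rebuilds (1.0, 0.0)

    fragColor = vec4(x / 8.0, m, xv);
}
```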

    (add support for new integer built-in functions)

    Syntax:

      genIType bitfieldExtract(genIType value, int offset, int bits);
      genUType bitfieldExtract(genUType value, int offset, int bits);

      genIType bitfieldInsert(genIType base, genIType insert, int offset,
                              int bits);
      genUType bitfieldInsert(genUType base, genUType insert, int offset,
                              int bits);

      genIType bitfieldReverse(genIType value);
      genUType bitfieldReverse(genUType value);

      genIType bitCount(genIType value);
      genIType bitCount(genUType value);

      genIType findLSB(genIType value);
      genIType findLSB(genUType value);

      genIType findMSB(genIType value);
      genIType findMSB(genUType value);

    The function bitfieldExtract() extracts bits <offset> through
    <offset>+<bits>-1 from each component in <value>, returning them in the
    least significant bits of the corresponding component of the result.  For
    unsigned data types, the most significant bits of the result will be set
    to zero.  For signed data types, the most significant bits will be set to
    the value of bit <offset>+<bits>-1.  If <bits> is zero, the result will be
    zero.  The result will be undefined if <offset> or <bits> is negative, or
    if the sum of <offset> and <bits> is greater than the number of bits used
    to store the operand.  Note that for vector versions of bitfieldExtract(),
    a single pair of <offset> and <bits> values is shared for all components.

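    The following sketch shows the zero-extending (unsigned) and
    sign-extending (signed) behavior of bitfieldExtract() on a few constant
    inputs.  The variable names and the output name fragColor are
    assumptions made for the example.

```glsl
#version 130
#extension GL_MESA_shader_integer_functions : require

out vec4 fragColor;   // hypothetical output name

void main()
{
    // Unsigned: bits 8..15 of 0xABCD1234 are 0x12; upper bits are zero.
    uint field = bitfieldExtract(0xABCD1234u, 8, 8);    // 0x12u

    // Signed: bits 12..15 of 0xF000 are 1111; bit 15 is replicated into
    // the upper bits, so the result is -1.
    int sfield = bitfieldExtract(0xF000, 12, 4);        // -1

    // Vector: the same <offset> and <bits> apply to every component.
    uvec2 nibbles = bitfieldExtract(uvec2(0x12u, 0x34u), 4, 4);  // (1, 3)

    fragColor = vec4(float(field) / 255.0, float(sfield),
                     vec2(nibbles) / 15.0);
}
```
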
    The function bitfieldInsert() inserts the <bits> least significant bits of
    each component of <insert> into the corresponding component of <base>.
    The result will have bits numbered <offset> through <offset>+<bits>-1
    taken from bits 0 through <bits>-1 of <insert>, and all other bits taken
    directly from the corresponding bits of <base>.  If <bits> is zero, the
    result will simply be <base>.  The result will be undefined if <offset> or
    <bits> is negative, or if the sum of <offset> and <bits> is greater than
    the number of bits used to store the operand.  Note that for vector
    versions of bitfieldInsert(), a single pair of <offset> and <bits> values
    is shared for all components.

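    A few constant cases of bitfieldInsert(), given as a sketch; the
    variable names and the output name fragColor are assumptions.

```glsl
#version 130
#extension GL_MESA_shader_integer_functions : require

out vec4 fragColor;   // hypothetical output name

void main()
{
    // Clear bits 8..15 of an all-ones word by inserting eight zero bits.
    uint cleared = bitfieldInsert(0xFFFFFFFFu, 0u, 8, 8);    // 0xFFFF00FFu

    // Place the low eight bits of 0xAB at bit offset 4.
    uint placed = bitfieldInsert(0u, 0xABu, 4, 8);           // 0x00000AB0u

    // Combine two 16-bit values into one 32-bit word.
    uint packed = bitfieldInsert(0x1234u, 0xBEEFu, 16, 16);  // 0xBEEF1234u

    fragColor = vec4(float(cleared >> 24), float(placed),
                     float(packed >> 16), 1.0);
}
```
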
    The function bitfieldReverse() reverses the bits of <value>.  The bit
    numbered <n> of the result will be taken from bit (<bits>-1)-<n> of
    <value>, where <bits> is the total number of bits used to represent
    <value>.

    The function bitCount() returns the number of one bits in the binary
    representation of <value>.

    The function findLSB() returns the bit number of the least significant one
    bit in the binary representation of <value>.  If <value> is zero, -1 will
    be returned.

    The function findMSB() returns the bit number of the most significant bit
    in the binary representation of <value>.  For positive integers, the
    result will be the bit number of the most significant one bit.  For
    negative integers, the result will be the bit number of the most
    significant zero bit.  For a <value> of zero or negative one, -1 will be
    returned.

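    The results described above for 32-bit operands can be summarized with
    constant inputs, as in the sketch below.  The variable names and the
    output name fragColor are assumptions made for the example.

```glsl
#version 130
#extension GL_MESA_shader_integer_functions : require

out vec4 fragColor;   // hypothetical output name

void main()
{
    uint r   = bitfieldReverse(1u);   // 0x80000000u: bit 0 moves to bit 31
    int  n   = bitCount(0xF0u);       // 4
    int  lsb = findLSB(8);            // 3
    int  lz  = findLSB(0);            // -1: no one bit present
    int  msb = findMSB(8u);           // 3
    int  neg = findMSB(-8);           // 2: highest zero bit of ...11111000
    int  m1  = findMSB(-1);           // -1: no zero bit present

    fragColor = vec4(float(r >> 31), float(n + lsb + msb + neg),
                     float(lz), float(m1));
}
```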

    (support for unsigned integer add/subtract with carry-out)

    Syntax:

      genUType uaddCarry(genUType x, genUType y, out genUType carry);
      genUType usubBorrow(genUType x, genUType y, out genUType borrow);

    The function uaddCarry() adds 32-bit unsigned integers or vectors <x> and
    <y>, returning the sum modulo 2^32.  The value <carry> is set to zero if
    the sum was less than 2^32, or one otherwise.

    The function usubBorrow() subtracts the 32-bit unsigned integer or vector
    <y> from <x>, returning the difference if non-negative or 2^32 plus the
    difference, otherwise.  The value <borrow> is set to zero if x >= y, or
    one otherwise.

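    One possible use of the carry and borrow outputs is multi-word
    arithmetic.  The sketch below models a 64-bit unsigned value as a uvec2;
    the helpers add64() and sub64(), the word layout, and the output name
    fragColor are all assumptions made for this illustration.

```glsl
#version 130
#extension GL_MESA_shader_integer_functions : require

out vec4 fragColor;   // hypothetical output name

// Hypothetical helpers: a 64-bit unsigned value is modelled as a uvec2
// with the low word in .x and the high word in .y.
uvec2 add64(uvec2 a, uvec2 b)
{
    uint carry;
    uint lo = uaddCarry(a.x, b.x, carry);    // carry is 0u or 1u
    uint hi = a.y + b.y + carry;             // add the carry to the high word
    return uvec2(lo, hi);
}

uvec2 sub64(uvec2 a, uvec2 b)
{
    uint borrow;
    uint lo = usubBorrow(a.x, b.x, borrow);  // borrow is 1u when a.x < b.x
    uint hi = a.y - b.y - borrow;
    return uvec2(lo, hi);
}

void main()
{
    // 0x00000000FFFFFFFF + 1 = 0x0000000100000000
    uvec2 sum  = add64(uvec2(0xFFFFFFFFu, 0u), uvec2(1u, 0u));  // (0, 1)
    // Subtracting 1 again restores the original value.
    uvec2 diff = sub64(sum, uvec2(1u, 0u));       // (0xFFFFFFFFu, 0)

    fragColor = vec4(float(sum.y), float(diff.y), 0.0, 1.0);
}
```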

    (support for signed and unsigned multiplies, with 32-bit inputs and a
     64-bit result spanning two 32-bit outputs)

    Syntax:

      void umulExtended(genUType x, genUType y, out genUType msb,
                        out genUType lsb);
      void imulExtended(genIType x, genIType y, out genIType msb,
                        out genIType lsb);

    The functions umulExtended() and imulExtended() multiply 32-bit unsigned
    or signed integers or vectors <x> and <y>, producing a 64-bit result.  The
    32 least significant bits are returned in <lsb>; the 32 most significant
    bits are returned in <msb>.

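    Two constant cases, given as a sketch: (2^32 - 1)^2 equals
    0xFFFFFFFE00000001, and the signed product -1 * 2 is the 64-bit value
    -2.  The helper mulHigh(), the variable names, and the output name
    fragColor are assumptions made for this example.

```glsl
#version 130
#extension GL_MESA_shader_integer_functions : require

out vec4 fragColor;   // hypothetical output name

// Hypothetical helper: upper 32 bits of an unsigned 32 x 32 product.
uint mulHigh(uint x, uint y)
{
    uint msb, lsb;
    umulExtended(x, y, msb, lsb);
    return msb;
}

void main()
{
    uint hi, lo;
    umulExtended(0xFFFFFFFFu, 0xFFFFFFFFu, hi, lo);
    // hi = 0xFFFFFFFEu, lo = 1u

    int shi, slo;
    imulExtended(-1, 2, shi, slo);
    // 64-bit result is -2: shi = -1 (all ones), slo = -2

    uint top = mulHigh(0x80000000u, 4u);    // 2^31 * 4 = 2^33, so top = 2u

    fragColor = vec4(float(lo), float(-shi), float(top), float(hi >> 31));
}
```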

GLX Protocol

    None.

Dependencies on ARB_gpu_shader_fp64

    This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
    of implicit conversions supported in the OpenGL Shading Language.  If more
    than one of these extensions is supported, an expression of one type may
    be converted to another type if that conversion is allowed by any of these
    specifications.

    If ARB_gpu_shader_fp64 or a similar extension introducing new data types
    is not supported, the function overloading rule in the GLSL specification
    preferring promotion of an input parameter from a smaller type to a larger
    type is never applicable, as all data types are of the same size.  That
    rule and the example referring to "double" should be removed.


Dependencies on NV_gpu_shader5

    This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
    of implicit conversions supported in the OpenGL Shading Language.  If more
    than one of these extensions is supported, an expression of one type may
    be converted to another type if that conversion is allowed by any of these
    specifications.

    If NV_gpu_shader5 is supported, integer data types are supported with four
    different precisions (8-, 16-, 32-, and 64-bit) and floating-point data
    types are supported with three different precisions (16-, 32-, and
    64-bit).  The extension adds the following rule for output parameters,
    which is similar to the one present in this extension for input
    parameters:

       5. If the formal parameters in both matches are output parameters, a
          conversion from a type with a larger number of bits per component is
          better than a conversion from a type with a smaller number of bits
          per component.  For example, a conversion from an "int16_t" formal
          parameter type to "int" is better than one from an "int8_t" formal
          parameter type to "int".

    Such a rule is not provided in this extension because there is no
    combination of types in this extension and ARB_gpu_shader_fp64 where this
    rule has any effect.


Errors

    None


New State

    None

New Implementation Dependent State

    None

Issues

    (1) What should this extension be called?

      UNRESOLVED.  This extension borrows from GL_ARB_gpu_shader5, so creating
      some sort of a play on that name would be viable.  However, nothing in
      this extension should require SM5 hardware, so such a name would be a
      little misleading and weird.

      Since the primary purpose is to add integer related functions from
      GL_ARB_gpu_shader5, call this extension GL_MESA_shader_integer_functions
      for now.

    (2) Why is some of the formatting in this extension weird?

      RESOLVED: This extension is formatted to minimize the differences (as
      reported by 'diff --side-by-side -W180') with the GL_ARB_gpu_shader5
      specification.

    (3) Should ldexp and frexp be included?

      RESOLVED: Yes.  Few GPUs have native instructions to implement these
      functions.  These are generally implemented using existing GLSL built-in
      functions and the other functions provided by this extension.

    (4) Should umulExtended and imulExtended be included?

      RESOLVED: Yes.  These functions should be implementable on any GPU that
      can support the rest of this extension, but the implementation may be
      complex.  The implementation on a GPU that only supports 32bit x 32bit =
      32bit multiplication would be quite expensive.  However, many GPUs
      (including OpenGL 4.0 GPUs that already support this function) have a
      32bit x 16bit = 48bit multiplier.  The implementation there is only
      trivially more expensive than regular 32bit multiplication.

    (5) Should the pack and unpack functions be included?

      RESOLVED: No.  These functions are already available via
      GL_ARB_shading_language_packing.

    (6) Should the "BitsTo" functions be included?

      RESOLVED: No.  These functions are already available via
      GL_ARB_shader_bit_encoding.

Revision History

    Rev.      Date     Author    Changes
    ----  -----------  --------  -----------------------------------------
     3    31-Mar-2017  Jon Leech Add ES support (OpenGL-Registry/issues/3)
     2     7-Jul-2016  idr       Fix typo in #extension line
     1    20-Jun-2016  idr       Initial version based on GL_ARB_gpu_shader5.