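Hi there, Pocol here.
While digging through articles on wave intrinsics, I came across someone on GitHub who wrote an implementation of WaveActiveLerp(), so I'd like to introduce it.
The author's write-up is here: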
https://github.com/AlexSabourinDev/cranberry_blog/blob/master/WaveActiveLerp.md
The implementation lives at https://github.com/AlexSabourinDev/cranberry_blog/blob/master/WaveActiveLerp_Shaders/WaveActiveLerp.hlsl and looks like this:
uint WaveGetLastLaneIndex()
{
    uint4 ballot = WaveActiveBallot(true);
    uint4 bits = firstbithigh(ballot); // Returns -1 (0xFFFFFFFF) if no bits set.

    // For reasons unclear to me, firstbithigh causes us to consider `bits` as a vector when compiling for RDNA
    // This then causes us to generate a waterfall loop later on in WaveReadLaneAt :(
    // Force scalarization here. See: https://godbolt.org/z/barT3rM3W
    bits = WaveReadLaneFirst(bits);
    bits = select(bits == 0xFFFFFFFF, 0, bits + uint4(0, 32, 64, 96));

    return max(max(max(bits.x, bits.y), bits.z), bits.w);
}

float WaveReadLaneLast(float t)
{
    uint lastLane = WaveGetLastLaneIndex();
    return WaveReadLaneAt(t, lastLane);
}

// Interpolates as lerp(lerp(Lane2, Lane1, t1), Lane0, t0), etc
//
// NOTE: Values need to be sorted in order of last interpolant to first interpolant.
//
// As an example, say we have the loop:
// for(int i = 0; i < 4; i++)
//    result = lerp(result, values[i], interpolations[i]);
//
// Lane0 should hold the last value, i.e. values[3]. NOT values[0].
//
// WaveActiveLerp instead implements the loop as a reverse loop:
// for(int i = 3; i >= 0; i--)
//    result = lerp(result, values[i], interpolations[i]);
//
// return.x == result of the wave's interpolation
// return.y == product of all the wave's (1-t) for continued interpolation.
float2 WaveActiveLerp(float value, float t)
{
    // lerp(v1, v0, t0) = v1 * (1 - t0) + v0 * t0
    // lerp(lerp(v2, v1, t1), v0, t0)
    // = (v2 * (1 - t1) + v1 * t1) * (1 - t0) + v0 * t0
    // = v2 * (1 - t1) * (1 - t0) + v1 * t1 * (1 - t0) + v0 * t0

    // We can then split the elements of our sum for each thread.
    // Lane0 = v0 * t0
    // Lane1 = v1 * t1 * (1 - t0)
    // Lane2 = v2 * (1 - t1) * (1 - t0)

    // As you can see, each thread's (1 - tn) term is simply the product of the previous thread's terms.
    // We can achieve this result by using WavePrefixProduct
    float prefixProduct = WavePrefixProduct(1.0f - t);
    float laneValue = value * t * prefixProduct;
    float interpolation = WaveActiveSum(laneValue);

    // If you don't need this for a continued interpolation, you can simply remove this part.
    float postfixProduct = prefixProduct * (1.0f - t);
    float oneMinusT = WaveReadLaneLast(postfixProduct);

    return float2(interpolation, oneMinusT);
}
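To check that the prefix-product formulation really matches the reverse loop described in the comments, here is a minimal CPU-side sketch in Python (not HLSL). It emulates a wave as a plain list, one element per lane, and assumes every lane is active. The names `wave_prefix_product`, `wave_active_lerp`, `reference_reverse_loop` and `combine_waves` are my own for illustration and are not part of the original repository. `combine_waves` is my reading of the "continued interpolation" comment, not something taken from the source.

```python
def lerp(a, b, t):
    """Same definition as HLSL lerp: a * (1 - t) + b * t."""
    return a * (1.0 - t) + b * t


def wave_prefix_product(xs):
    """Emulates WavePrefixProduct: lane i receives the product of lanes 0..i-1.

    The scan is exclusive, so lane 0 receives 1.0.
    """
    out = []
    acc = 1.0
    for x in xs:
        out.append(acc)
        acc *= x
    return out


def wave_active_lerp(values, ts):
    """Emulates WaveActiveLerp for a non-empty wave where all lanes are active.

    Returns (interpolation, one_minus_t), mirroring return.x and return.y.
    """
    prefix = wave_prefix_product([1.0 - t for t in ts])
    # Each lane contributes value * t * (product of earlier lanes' (1 - t)).
    interpolation = sum(v * t * p for v, t, p in zip(values, ts, prefix))
    # The last lane's postfix product is the product of every lane's (1 - t).
    one_minus_t = prefix[-1] * (1.0 - ts[-1])
    return interpolation, one_minus_t


def reference_reverse_loop(initial, values, ts):
    """The reverse loop from the comments: lane 0 is interpolated last."""
    result = initial
    for i in range(len(values) - 1, -1, -1):
        result = lerp(result, values[i], ts[i])
    return result


def combine_waves(later, earlier):
    """Hypothetical helper: chains two (x, y) results.

    `later` covers the lanes applied last, `earlier` the lanes applied first.
    """
    x_later, y_later = later
    x_earlier, y_earlier = earlier
    return x_later + y_later * x_earlier, y_later * y_earlier
```

With a starting value `initial`, the full result is `initial * y + x`, where `(x, y)` is what `wave_active_lerp` returns. Under that reading, `y` lets one result be folded into the next, which is what `combine_waves` sketches.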
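I can't think of a use case off the top of my head right now, but it seems like the kind of thing that will come in handy somewhere once you know about it.
...And that's it for this introduction to the WaveActiveLerp() implementation.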