Sorting a million sprites

When I wrote this article in 2020 about rendering a million sprites, I was also writing a 2D rendering framework that combined this technique with DOTS. I was planning to use the framework for Academia: School Simulator. I was not successful then because I found out that Graphics.DrawMeshInstancedIndirect() does not respect sorting layers, which the game’s existing rendering framework relies on heavily. Switching would have introduced a risky number of changes that I wasn’t comfortable with, especially since the game was about to be released. So the framework just sat in my repository, ready to be forgotten. That is, until I had a chat with Arthur DAntonio recently. He is the Senior Software Engineering Leader at LVGameDev LLC (and the owner, I think). They are the fantastic developers of SimAirport.

I had corresponded with Arthur before, when Academia was still in its infancy and SimAirport was already in the middle of its Early Access. We exchanged programming notes back then. It was a long while before we talked again. He contacted me through LinkedIn after seeing my post about looking for a job in Europe. We scheduled a chat and it was very enlightening; I learned a lot. After talking about our lives and games, we eventually got to the technical stuff, because that’s just what programmers’ conversations devolve into. I learned that they’re using Graphics.DrawMeshInstancedIndirect() to render millions of sprites without breaking a sweat. In the framework that I made, I had to sort the sprites (back to front, or top to bottom) on the CPU before sending them to the GPU. The cost of that sort grows with the number of sprites, so the CPU takes a hit. I asked how they did it and was pleasantly surprised by the answer: they didn’t need to. Just let the GPU handle it, as GPUs are very good at this. I was so intrigued by this revelation that I revisited my forgotten framework over the next few days.

Revisiting the million sprites once again

I went back to my quick sample of rendering a million sprites using a compute buffer. Some changes were made. This is the shader that I’ve used to render the sprites:

Shader "Instanced/SpriteRendererIndexedUv" {
    Properties {
        _MainTex ("Albedo (RGB)", 2D) = "white" {}
        _Cutoff ("Alpha cutoff", Range(0,1)) = 0.5
    }
    
    SubShader {
        Tags{
            "Queue"="AlphaTest"
            "IgnoreProjector"="True"
            "RenderType"="TransparentCutout"
        }
        Cull Back
        Lighting Off
        ZWrite On
        AlphaTest Greater 0
        Blend SrcAlpha OneMinusSrcAlpha
        Pass {
            CGPROGRAM
            // Upgrade NOTE: excluded shader from OpenGL ES 2.0 because it uses non-square matrices
            #pragma exclude_renderers gles

            #pragma vertex vert
            #pragma fragment frag
            #pragma target 4.5

            #include "UnityCG.cginc"

            sampler2D _MainTex;
            fixed _Cutoff;

            // xyz is the position, w is the rotation
            StructuredBuffer<float4> translationAndRotationBuffer;
            StructuredBuffer<float> scaleBuffer;
            
            StructuredBuffer<float4> colorsBuffer;

            // Note here that uvBuffer is only the available UV coordinates
            // An int value from uvIndexBuffer would then index the uvBuffer
            StructuredBuffer<float4> uvBuffer;
            StructuredBuffer<int> uvIndexBuffer;

            struct v2f {
                float4 pos : SV_POSITION;
                float2 uv: TEXCOORD0;
                fixed4 color : COLOR0;
            };

            float4x4 rotationZMatrix(float zRotRadians){
                float c = cos(zRotRadians);
                float s = sin(zRotRadians);
                float4x4 ZMatrix  = 
                    float4x4( 
                       c,  s, 0,  0,
                       -s, c, 0,  0,
                       0,  0, 1,  0,
                       0,  0, 0,  1);
                return ZMatrix;
            }

            v2f vert(appdata_full v, uint instanceID : SV_InstanceID) {
                float4 translationAndRot = translationAndRotationBuffer[instanceID];
                int uvIndex = uvIndexBuffer[instanceID];
                float4 uv = uvBuffer[uvIndex];
                
                //rotate the vertex
                v.vertex = mul(v.vertex - float4(0.5, 0.5, 0,0), rotationZMatrix(translationAndRot.w));
                
                //scale it
                float scale = scaleBuffer[instanceID];
                float3 worldPosition = translationAndRot.xyz + (v.vertex.xyz * scale);
                
                v2f o;
                o.pos = UnityObjectToClipPos(float4(worldPosition, 1.0f));
                
                // XY here is the dimension (width, height). 
                // ZW is the offset in the texture (the actual UV coordinates)
                o.uv =  v.texcoord * uv.xy + uv.zw;
                
                o.color = colorsBuffer[instanceID];
                return o;
            }

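            // No SV_Depth output here: the depth comes from SV_POSITION, and an
            // output parameter that is declared but never written is invalid
            fixed4 frag(v2f i) : SV_Target {
                fixed4 col = tex2D(_MainTex, i.uv) * i.color;
                // Discard fragments below the cutoff so they don't write depth
                clip(col.a - _Cutoff);
                col.rgb *= col.a;

                return col;
            }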

            ENDCG
        }
    }
}

There are a few significant changes. The first is that we now need 3 floats for the position, since the GPU needs the z component to sort. The position is stored in translationAndRotationBuffer as the xyz of a float4, and the w value holds the rotation (in radians). The scale now lives in its own buffer.
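To make the vertex math concrete, here is a minimal CPU-side sketch in plain C# of what the vertex shader does to a single quad corner. The names SpriteVertexMath and TransformCorner are made up for this illustration, and it assumes the quad's corners run from 0 to 1 (which is what the 0.5 offset in the shader suggests).

```csharp
using System;

public static class SpriteVertexMath {
    // Mirrors the vertex shader for one quad corner.
    // (px, py, pz) is the sprite position (xyz of the float4),
    // rotation is the w component in radians.
    public static (float x, float y, float z) TransformCorner(
        float cornerX, float cornerY,
        float px, float py, float pz,
        float rotation, float scale) {
        // Shift the corner so the quad is centered on the origin
        float cx = cornerX - 0.5f;
        float cy = cornerY - 0.5f;

        // Same result as mul(rowVector, rotationZMatrix(rotation)) in the shader
        float c = (float)Math.Cos(rotation);
        float s = (float)Math.Sin(rotation);
        float rx = cx * c - cy * s;
        float ry = cx * s + cy * c;

        // Scale, then translate. The quad is flat, so z is just the sprite's z.
        return (px + rx * scale, py + ry * scale, pz);
    }
}
```

For example, the corner (1, 1) of a sprite at (2, 3, 3) with no rotation and a scale of 0.3 lands at approximately (2.15, 3.15, 3).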

Another significant change is the use of uvIndexBuffer. The buffer uvBuffer now only contains the available UVs, each representing a separate image inside the texture (aka atlas). Each entry is a float4 where xy is the width and height and zw is the offset in the texture. You can store the UVs from your atlas here and you only need to set them once. The int value in uvIndexBuffer is then used to index a UV value in uvBuffer. This means that each sprite doesn’t have to keep a float4 to represent its UV value. It can just keep a single int instead. I learned this from Arthur as well.
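The lookup can be sketched outside Unity in plain C#. This is only an illustration under the assumption of a uniform grid atlas; AtlasUvTable, BuildGrid and Resolve are hypothetical names, and tuples stand in for float4.

```csharp
using System;

public static class AtlasUvTable {
    // Each entry is (width, height, offsetX, offsetY) in normalized UV space,
    // the same layout the shader reads from uvBuffer (xy = size, zw = offset).
    public static (float w, float h, float x, float y)[] BuildGrid(int columns, int rows) {
        var table = new (float w, float h, float x, float y)[columns * rows];
        float w = 1f / columns;
        float h = 1f / rows;
        for (int row = 0; row < rows; row++) {
            for (int col = 0; col < columns; col++) {
                table[row * columns + col] = (w, h, col * w, row * h);
            }
        }
        return table;
    }

    // Same math as the shader: o.uv = v.texcoord * uv.xy + uv.zw
    public static (float u, float v) Resolve(
        (float w, float h, float x, float y) rect, float texU, float texV) {
        return (texU * rect.w + rect.x, texV * rect.h + rect.y);
    }
}
```

A sprite then only stores an index such as 5, and the quad's own 0 to 1 texture coordinates are remapped into that cell of the atlas.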

The following is the MonoBehaviour test script that I used to render sprites using the mentioned shader:

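using Unity.Mathematics; // float4
using UnityEngine;

// Note: Assertion and MeshExtension are not Unity APIs; they appear to be
// helper classes from my own codebase and are not shown here.
public class ComputeBufferMultipleSpritesIndexedUv : MonoBehaviour {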
    [SerializeField]
    private Camera referenceCamera;
    
    [SerializeField]
    private Material material;
    
    [SerializeField]
    private float spriteScale = 0.3f;  

    [SerializeField]
    private int count;

    private Mesh mesh;
    
    // Compressed transform information stored as a float4:
    // xyz is the position, w is the rotation in radians
    private ComputeBuffer translationAndRotationBuffer;

    private ComputeBuffer scaleBuffer;
    
    private ComputeBuffer colorBuffer;
    
    // uvBuffer contains float4 values in which xy is the uv dimension and zw is the texture offset
    private ComputeBuffer uvBuffer;
    private ComputeBuffer uvIndexBuffer;

    private uint[] args;
    
    private ComputeBuffer argsBuffer;

    private const int UV_X_ELEMENTS = 4;
    private const int UV_Y_ELEMENTS = 4;

    private void Awake() {
        Assertion.AssertNotNull(this.referenceCamera);
        
        QualitySettings.vSyncCount = 0;
        Application.targetFrameRate = -1;
        
        this.mesh = MeshExtension.Quad(1.0f);
        
        // Prepare available UVs
        const int uvCount = UV_X_ELEMENTS * UV_Y_ELEMENTS;
        float4[] uvs = new float4[uvCount];
        for (int u = 0; u < UV_X_ELEMENTS; u++) {
            for (int v = 0; v < UV_Y_ELEMENTS; v++) {
                int index = v * UV_X_ELEMENTS + u;
                uvs[index] = new float4(0.25f, 0.25f, u * 0.25f, v * 0.25f);
            }
        }
        
        this.uvBuffer = new ComputeBuffer(uvs.Length, 16);
        this.uvBuffer.SetData(uvs);
        int uvBufferId = Shader.PropertyToID("uvBuffer");
        this.material.SetBuffer(uvBufferId, this.uvBuffer);
        
        // Prepare values
        float4[] translationAndRotations = new float4[this.count];
        float[] scales = new float[this.count];
        float4[] colors = new float4[this.count];
        int[] uvIndices = new int[this.count]; 

        const float maxRotation = Mathf.PI * 2;
        float screenRatio = (float) Screen.width / Screen.height;
        float orthoSize = this.referenceCamera.orthographicSize;
        float maxX = orthoSize * screenRatio;
        
        for (int i = 0; i < this.count; ++i) {
            // transform
            float y = UnityEngine.Random.Range(-orthoSize, orthoSize);
            float x = UnityEngine.Random.Range(-maxX, maxX);
            float z = -y; // Negate y so that higher sprites are rendered prior to sprites below
            float rotation = UnityEngine.Random.Range(0, maxRotation);
            translationAndRotations[i] = new float4(x, y, z, rotation);
            scales[i] = this.spriteScale;

            // UV index
            uvIndices[i] = UnityEngine.Random.Range(0, uvCount);
            
            // color
            float r = UnityEngine.Random.Range(0f, 1.0f);
            float g = UnityEngine.Random.Range(0f, 1.0f);
            float b = UnityEngine.Random.Range(0f, 1.0f);
            colors[i] = new float4(r, g, b, 1.0f);
        }
        
        this.translationAndRotationBuffer = new ComputeBuffer(this.count, 16);
        this.translationAndRotationBuffer.SetData(translationAndRotations);
        int translationAndRotationBufferId = Shader.PropertyToID("translationAndRotationBuffer");
        this.material.SetBuffer(translationAndRotationBufferId, this.translationAndRotationBuffer);

        this.scaleBuffer = new ComputeBuffer(this.count, sizeof(float));
        this.scaleBuffer.SetData(scales);
        int scaleBufferId = Shader.PropertyToID("scaleBuffer");
        this.material.SetBuffer(scaleBufferId, this.scaleBuffer);

        this.uvIndexBuffer = new ComputeBuffer(this.count, sizeof(int));
        this.uvIndexBuffer.SetData(uvIndices);
        int uvIndexBufferId = Shader.PropertyToID("uvIndexBuffer");
        this.material.SetBuffer(uvIndexBufferId, this.uvIndexBuffer);
        
        this.colorBuffer = new ComputeBuffer(this.count, 16);
        this.colorBuffer.SetData(colors);
        int colorsBufferId = Shader.PropertyToID("colorsBuffer");
        this.material.SetBuffer(colorsBufferId, this.colorBuffer);

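        // Indirect args: index count per instance (6 for a quad), instance count,
        // start index location, base vertex location, start instance location
        this.args = new uint[] {
            6, (uint)this.count, 0, 0, 0
        };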
        this.argsBuffer = new ComputeBuffer(1, this.args.Length * sizeof(uint), ComputeBufferType.IndirectArguments);
        this.argsBuffer.SetData(this.args);
    }

    private static readonly Bounds BOUNDS = new Bounds(Vector2.zero, Vector3.one * 10);

    private void Update() {   
        // Draw
        Graphics.DrawMeshInstancedIndirect(this.mesh, 0, this.material, BOUNDS, this.argsBuffer);
    }
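
    private void OnDestroy() {
        // Release the GPU buffers; Unity warns about leaked ComputeBuffers otherwise
        this.translationAndRotationBuffer?.Release();
        this.scaleBuffer?.Release();
        this.colorBuffer?.Release();
        this.uvBuffer?.Release();
        this.uvIndexBuffer?.Release();
        this.argsBuffer?.Release();
    }
}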

This script is very similar to the one in the old article with a few changes. I’m using only a single scale value here (spriteScale) because the goal of this script is to check if the sprite sorting is correct. The sorted sprites might not look sorted if they have different scales. Note that the sprites are anchored at the center here.

Another change is that uvBuffer now contains only the available UVs in the sprite texture, which in this case is a 4×4 grid of smileys. Each sprite no longer needs its own UV values (4 floats). It only needs an index (an int) into uvBuffer. This saves memory as the number of sprites grows. These indices are selected randomly and stored in uvIndices.
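As a rough back-of-the-envelope check of that saving, the following plain C# sketch compares the two layouts. UvMemoryEstimate and its methods are hypothetical names, and the byte sizes assume a 16-byte float4 and a 4-byte int, as used for the buffer strides in the script.

```csharp
public static class UvMemoryEstimate {
    private const int Float4Bytes = 16; // 4 floats of 4 bytes each
    private const int IntBytes = 4;

    // Old layout: every sprite carries its own float4 UV rectangle
    public static long PerSpriteUvBytes(long spriteCount) {
        return spriteCount * Float4Bytes;
    }

    // New layout: one int per sprite plus one shared table of atlas entries
    public static long IndexedUvBytes(long spriteCount, int atlasEntries) {
        return spriteCount * IntBytes + (long)atlasEntries * Float4Bytes;
    }
}
```

For a million sprites and the 16-entry atlas used here, PerSpriteUvBytes(1000000) gives 16,000,000 bytes while IndexedUvBytes(1000000, 16) gives 4,000,256 bytes, so the UV data shrinks to roughly a quarter.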

Another difference is that we now set the z value, which is just the negation of y, since we want sprites at the top to have a lesser z than those at the bottom (top-to-bottom sorting).

With this shader and script, we then set up a simple scene with a camera and another GameObject to which ComputeBufferMultipleSpritesIndexedUv is added:

We hit Play, and…

Well, that doesn’t look right. The sprites are not sorted from top to bottom. They don’t appear to be sorted at all. This is where I got stuck for a while. I searched far and wide for what I did wrong. I tweaked the shader lots of times. I even tried using SV_Depth to make sure that it was writing to the z-buffer. Shader programming is not my strong suit, but I know enough for the type of games I’m making. I was so stumped that I asked Arthur for help again. He sent their shader code and told me that mine was already correct.

I had wasted enough time on this, so I decided to write another test script. This time, each sprite is tied to a Transform so I can drag it while the editor is playing and see the sorting in real time. This is what it looks like:

using Unity.Mathematics; // float4
using UnityEngine;

public class ComputeBufferSpriteDraggableTest : MonoBehaviour {
    [SerializeField]
    private Camera referenceCamera;
    
    [SerializeField]
    private Material material;

    [SerializeField]
    private float spriteScale = 0.3f;

    [SerializeField]
    private Transform[] spriteTransforms;
    
    private Mesh mesh;
    
    // Compressed transform information stored as a float4:
    // xyz is the position, w is the rotation in radians
    private ComputeBuffer translationAndRotationBuffer;

    private ComputeBuffer scaleBuffer;
    
    private ComputeBuffer colorBuffer;
    
    // uvBuffer contains float4 values in which xy is the uv dimension and zw is the texture offset
    private ComputeBuffer uvBuffer;
    private ComputeBuffer uvIndexBuffer;

    private uint[] args;
    
    private ComputeBuffer argsBuffer;

    private const int UV_X_ELEMENTS = 4;
    private const int UV_Y_ELEMENTS = 4;

    private float4[] translationAndRotations;
    private float[] scales;
    private float4[] colors;
    private int[] uvIndices; 
    
    private void Awake() {
        Assertion.AssertNotNull(this.referenceCamera);
        
        QualitySettings.vSyncCount = 0;
        Application.targetFrameRate = -1;
        
        this.mesh = MeshExtension.Quad(1.0f);
        
        // Prepare available UVs
        const int uvCount = UV_X_ELEMENTS * UV_Y_ELEMENTS;
        float4[] uvs = new float4[uvCount];
        for (int u = 0; u < UV_X_ELEMENTS; u++) {
            for (int v = 0; v < UV_Y_ELEMENTS; v++) {
                int index = v * UV_X_ELEMENTS + u;
                uvs[index] = new float4(0.25f, 0.25f, u * 0.25f, v * 0.25f);
            }
        }
        
        this.uvBuffer = new ComputeBuffer(uvs.Length, 16);
        this.uvBuffer.SetData(uvs);
        int uvBufferId = Shader.PropertyToID("uvBuffer");
        this.material.SetBuffer(uvBufferId, this.uvBuffer);
        
        // Prepare values
        int count = this.spriteTransforms.Length;
        this.translationAndRotations = new float4[count];
        this.scales = new float[count];
        this.colors = new float4[count];
        this.uvIndices = new int[count]; 
        
        for (int i = 0; i < count; ++i) {
            this.scales[i] = this.spriteScale;

            // UV index
            this.uvIndices[i] = UnityEngine.Random.Range(0, uvCount);
            
            // color
            this.colors[i] = new float4(1.0f, 1.0f, 1.0f, 1.0f);
        }
        
        this.translationAndRotationBuffer = new ComputeBuffer(count, 16);

        UpdateSpriteTransforms();

        this.scaleBuffer = new ComputeBuffer(count, sizeof(float));
        this.scaleBuffer.SetData(this.scales);
        int scaleBufferId = Shader.PropertyToID("scaleBuffer");
        this.material.SetBuffer(scaleBufferId, this.scaleBuffer);

        this.uvIndexBuffer = new ComputeBuffer(count, sizeof(int));
        this.uvIndexBuffer.SetData(this.uvIndices);
        int uvIndexBufferId = Shader.PropertyToID("uvIndexBuffer");
        this.material.SetBuffer(uvIndexBufferId, this.uvIndexBuffer);
        
        this.colorBuffer = new ComputeBuffer(count, 16);
        this.colorBuffer.SetData(this.colors);
        int colorsBufferId = Shader.PropertyToID("colorsBuffer");
        this.material.SetBuffer(colorsBufferId, this.colorBuffer);

        this.args = new uint[] {
            6, (uint)count, 0, 0, 0
        };
        this.argsBuffer = new ComputeBuffer(1, this.args.Length * sizeof(uint), ComputeBufferType.IndirectArguments);
        this.argsBuffer.SetData(this.args);
    }

    private void UpdateSpriteTransforms() {
        for (int i = 0; i < this.spriteTransforms.Length; i++) {
            Transform spriteTransform = this.spriteTransforms[i];
            Vector3 position = spriteTransform.position;
            position.z = -position.y; // Negate y for sorting
            // w is the rotation in radians (the shader rotates around z), not a scale
            float rotation = spriteTransform.eulerAngles.z * Mathf.Deg2Rad;
            this.translationAndRotations[i] = new float4(position, rotation);
        }
        
        this.translationAndRotationBuffer.SetData(this.translationAndRotations);
        int translationAndRotationBufferId = Shader.PropertyToID("translationAndRotationBuffer");
        this.material.SetBuffer(translationAndRotationBufferId, this.translationAndRotationBuffer);
    }

    private static readonly Bounds BOUNDS = new Bounds(Vector2.zero, Vector3.one);

    private void Update() {   
        UpdateSpriteTransforms();
        
        // Draw
        Graphics.DrawMeshInstancedIndirect(this.mesh, 0, this.material, BOUNDS, this.argsBuffer);
    }
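
    private void OnDestroy() {
        // Release the GPU buffers; Unity warns about leaked ComputeBuffers otherwise
        this.translationAndRotationBuffer?.Release();
        this.scaleBuffer?.Release();
        this.colorBuffer?.Release();
        this.uvBuffer?.Release();
        this.uvIndexBuffer?.Release();
        this.argsBuffer?.Release();
    }
}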

This new test code is mostly the same as the previous one, but it generates only as many sprites as there are entries in spriteTransforms. I really only need 2 for this test. A big difference is the addition of UpdateSpriteTransforms(). This method updates the data in translationAndRotationBuffer from the Transforms in spriteTransforms. It is also called every frame so that the positions of the sprites are updated whenever I drag them during play mode. This way, I can see whether the sorting changes when a sprite changes its y position.

I set up a scene for this test as follows:

SpriteA and SpriteB are just empty GameObjects. We only need them for their position. We hit play and we see something like this:

I intentionally positioned SpriteA and SpriteB close together so that they would overlap. I can now go to the Scene window and drag either of the two.

Upon inspection, I noticed that the sorting works, but in reverse. The order is from bottom to top. This is where I had a eureka moment. The way I set the z position is wrong. I don’t have to negate the y position. I can just assign the y position directly.

// Change this
position.z = -position.y; // Negate y for sorting

// to this
position.z = position.y;

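This is because the GPU does not sort by raw z value. What matters is distance from the camera: with ZWrite On, the depth test keeps, for each pixel, the fragment closest to the camera, so geometry that is farther away ends up behind. Assuming the usual 2D setup where the camera sits at a negative z and looks toward +z, a larger z means farther away, so sprites higher on the screen (larger y) need a larger z to end up behind the sprites below them. What a stupid mistake. Negating the y position is a habit from my own 2D rendering frameworks, where I do the sorting myself by a render order value that usually uses the negation of the y position. I still have a long way to go in learning shaders.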
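Here is a minimal CPU sketch in plain C# of why distance from the camera decides which sprite stays visible. DepthTestSketch and VisibleSprite are hypothetical names, and the sketch assumes a camera that looks along +z, so depth is simply the sprite's z minus the camera's z. It models a single pixel covered by every sprite.

```csharp
public static class DepthTestSketch {
    // Returns the index of the sprite that remains visible on a pixel
    // covered by all sprites, no matter in which order they are submitted.
    public static int VisibleSprite(float[] spriteZ, float cameraZ) {
        int winner = -1;
        float nearest = float.PositiveInfinity;
        for (int i = 0; i < spriteZ.Length; i++) {
            // Distance along the view direction (assumed to be +z)
            float depth = spriteZ[i] - cameraZ;

            // Keep the fragment if it is at least as close as the stored one
            // (comparable to a less-or-equal depth test)
            if (depth <= nearest) {
                nearest = depth;
                winner = i;
            }
        }
        return winner;
    }
}
```

Take an upper sprite at y = 1 (index 0) and a lower sprite at y = 0 (index 1), with the camera at z = -10. With z = y, the z values are { 1, 0 } and the lower sprite wins, which is the top-to-bottom order we want. With z = -y, the z values are { -1, 0 } and the upper sprite wins, which is the reversed order seen in the test.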

I fixed my previous script to check and…

Voilà! The sprites are now sorted from top to bottom. Now I want to check if I can render a million of these while keeping them sorted. I have these settings on the script:

I have to reduce the scale to avoid overdraw. This is what a million smiley sprites looks like:

Upon zooming in on the Scene window, we can see that the sprites are indeed sorted from top to bottom:

When we open the Stats window, we can see that this is nothing, even in the editor:

Conclusion

We can sort compute buffer sprites by letting the GPU do it. This will shave some CPU cycles off my 2D rendering framework based on this technique. The bottleneck now is updating the transformation buffer whenever there are changes, but there are plenty of ideas to try for speeding this up.
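One such idea, sketched here in plain C# as an assumption rather than something I have measured, is to track which sprite indices changed during a frame and upload only that range. DirtyRange is a hypothetical helper. In Unity, its Start and Count could be passed to the ComputeBuffer.SetData overload that takes a managed start index, a buffer start index and a count.

```csharp
public sealed class DirtyRange {
    private int min = int.MaxValue;
    private int max = -1;

    // True when at least one index was marked since the last Clear()
    public bool IsDirty {
        get { return this.max >= this.min; }
    }

    // First changed index, or 0 when nothing changed
    public int Start {
        get { return this.IsDirty ? this.min : 0; }
    }

    // Number of elements from the first to the last changed index, inclusive
    public int Count {
        get { return this.IsDirty ? this.max - this.min + 1 : 0; }
    }

    // Call whenever the sprite at this index moves or rotates
    public void Mark(int index) {
        if (index < this.min) {
            this.min = index;
        }
        if (index > this.max) {
            this.max = index;
        }
    }

    // Call after the range has been uploaded
    public void Clear() {
        this.min = int.MaxValue;
        this.max = -1;
    }
}
```

This uploads everything between the first and last changed sprite, so it only pays off when changes are clustered. Scattered changes would need several ranges or a different approach.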

Another lesson is to chat with other programmers, especially those who develop games similar to yours. You will improve each other’s lives. So with that, feel free to contact me if you want to schedule a chat. I can’t possibly chat with all of you, but I will consider some. Send me a message on Twitter or on LinkedIn.

If you like my posts, please subscribe to my mailing list. I’ll also give you a free game.
