How to use RaycastCommand

It’s been a while since my last post. Since then, I’ve been busy processing my visa and exit clearance, and I haven’t opened Unity at all. I arrived in Germany last February 9 and I’ve started my new job. I waited for things to settle before writing again. Sorry it took a while. But I’m back, and I hope I can write monthly again.

One of the things I discovered while optimizing the game I’m currently working on is RaycastCommand. It’s a Unity API for making multiple raycast requests in a single call. It uses the Job System, so the raycasts are computed on multiple worker threads. Instead of making many calls to methods like Physics.Raycast() or Physics.RaycastNonAlloc(), you can collect all your raycast requests and submit them through RaycastCommand. This can significantly reduce the time spent on raycasts on the main thread, freeing CPU time for other things. Another good thing is that ECS is not required. You can use it right now in your MonoBehaviour-based project. You just need to be familiar with the Job System, native collections, and the Burst compiler.

A Contrived Example

Let’s make a simple environment where we can compare raycasts made via Physics.RaycastNonAlloc() and RaycastCommand. For this test environment, we can just spawn cubes with a BoxCollider and a Rigidbody set to kinematic. We spawn a fixed number of them in Awake(), at random positions within fixed bounds.

Then we maintain a fixed number of contrived Character instances, each with a position and a direction. We use these as the origin and direction of the raycasts. The struct is simply the following:

private struct Character {
    public Vector3 position;
    public Vector3 direction;

    public void Set(Vector3 position, Vector3 direction) {
        this.position = position;
        this.direction = direction;
    }
}

On Update(), we can set random positions and directions to the characters. From here, we can make our raycast requests and compare. Without further ado, I’m just going to show the whole sample code that demonstrates the comparison:

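using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using Unity.Jobs.LowLevel.Unsafe; // JobsUtility
using UnityEngine;
using UnityEngine.Profiling; // Profiler

public class RaycastCommandSample : MonoBehaviour {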
    [SerializeField]
    private int characterCount = 1000;
    
    [SerializeField]
    private int colliderCount = 100;

    [SerializeField]
    private Vector3 minBounds = new Vector3(-5, -5, -5);

    [SerializeField]
    private Vector3 maxBounds = new Vector3(5, 5, 5);

    [SerializeField]
    private GameObject colliderPrefab;
    
    private Character[] characters;
 
    private int layerMask;

    private void Awake() {
        // Prepare random colliders
        for (int i = 0; i < this.colliderCount; i++) {
            GameObject instance = Instantiate(this.colliderPrefab, RandomVector3(), Quaternion.identity);
            instance.name = $"Collider {i}";
        }
        
        // Prepare characters
        this.characters = new Character[this.characterCount];
        for (int i = 0; i < this.characterCount; i++) {
            this.characters[i] = new Character();
        }

        this.layerMask = LayerMask.GetMask("Default");
    }

    private void Update() {
        SetPositionsAndDirections();

        Profiler.BeginSample("UseRaycastNonAlloc()");
        UseRaycastNonAlloc();
        Profiler.EndSample();
        
        Profiler.BeginSample("UseRaycastCommand()");
        UseRaycastCommand();
        Profiler.EndSample();
    }
    
    private readonly RaycastHit[] nonAllocHitResults = new RaycastHit[8];

    private void UseRaycastNonAlloc() {
        for (int c = 0; c < this.characterCount; ++c) {
            ref Character character = ref this.characters[c];
            
            Ray ray = new() {
                origin = character.position,
                direction = character.direction
            };
            
            int hits = Physics.RaycastNonAlloc(ray, this.nonAllocHitResults, 50, this.layerMask);
            int limit = Mathf.Min(hits, this.nonAllocHitResults.Length);
            for (int hitIndex = 0; hitIndex < limit; hitIndex++) {
                RaycastHit hit = this.nonAllocHitResults[hitIndex];
                if (hit.collider != null) {
                    // Do something with the result
                    // Uncomment Debug.Log() to check that the raycast worked
                    //Debug.Log($"Hit {hit.collider.gameObject.name}");
                }
            }
        }
    }

    private void UseRaycastCommand() {
        // Collect positions and directions for raycast
        // We use native collections so we could prepare the raycast commands in a job
        NativeArray<Vector3> positions = new(this.characterCount, Allocator.TempJob);
        NativeArray<Vector3> directions = new(this.characterCount, Allocator.TempJob);
        for (int i = 0; i < this.characterCount; i++) {
            positions[i] = this.characters[i].position;
            directions[i] = this.characters[i].direction;
        }

        const int maxHits = 8;
        NativeArray<RaycastHit> results = new(this.characterCount * maxHits, Allocator.TempJob);
        NativeArray<RaycastCommand> commands = new(this.characterCount, Allocator.TempJob);

        // Prepare RaycastCommands
        PrepareRaycastCommandsJob prepareCommandsJob = new() {
            positions = positions,
            directions = directions,
            commands = commands,
            layerMask = this.layerMask
        };
        JobHandle handle = prepareCommandsJob.ScheduleParallel(commands.Length, 64, default);

        // Schedule raycasts
        int commandsPerJob = Mathf.Max(this.characterCount / JobsUtility.JobWorkerCount, 1);
        handle = RaycastCommand.ScheduleBatch(commands, results, commandsPerJob, maxHits, handle);

        // We complete here so we get the results
        handle.Complete();
        
        // Process the results
        for (int i = 0; i < this.characterCount; i++) {
            RaycastResultEnumerator hitEnumerator = new(ref results, i, maxHits);
            while (hitEnumerator.HasNextHit(out RaycastHit hit)) {
                // Do something with the result
                // Uncomment Debug.Log() to check that the raycast worked
                //Debug.Log($"Hit: {hit.collider.name}");
            }
        }

        // Don't forget to dispose
        positions.Dispose();
        directions.Dispose();
        results.Dispose();
        commands.Dispose();
    }

    [BurstCompile]
    private struct PrepareRaycastCommandsJob : IJobFor {
        [ReadOnly]
        public NativeArray<Vector3> positions;

        [ReadOnly]
        public NativeArray<Vector3> directions;
        
        [NativeDisableParallelForRestriction]
        public NativeArray<RaycastCommand> commands;

        public int layerMask;

        public void Execute(int index) {
            Vector3 position = this.positions[index];
            Vector3 direction = this.directions[index];
            this.commands[index] = new RaycastCommand(
                position, direction, new QueryParameters(this.layerMask), 50);
        }
    }

    private void SetPositionsAndDirections() {
        for (int i = 0; i < this.characters.Length; i++) {
            this.characters[i].Set(RandomVector3(), RandomVector3().normalized);
        }
    }

    private Vector3 RandomVector3() {
        float x = Random.Range(this.minBounds.x, this.maxBounds.x);
        float y = Random.Range(this.minBounds.y, this.maxBounds.y);
        float z = Random.Range(this.minBounds.z, this.maxBounds.z);
        return new Vector3(x, y, z);
    }

    private struct Character {
        public Vector3 position;
        public Vector3 direction;

        public void Set(Vector3 position, Vector3 direction) {
            this.position = position;
            this.direction = direction;
        }
    }
}

Awake() is pretty straightforward. We instantiate the random colliders, using colliderCount as the number to spawn. We also prepare the Character instances and resolve the layerMask that we’re going to use. It’s important to cache this value here since LayerMask.GetMask() allocates garbage, which is bad when it’s called every frame.

This is what the random colliders look like in my editor.

The meat is really in Update(). First, we call SetPositionsAndDirections(), which assigns random positions and directions to the characters. After this, we call the two methods to compare, each wrapped in Profiler samples so we can see them in the Profiler. I believe the methods are named descriptively enough: UseRaycastNonAlloc() and UseRaycastCommand().

The code in UseRaycastNonAlloc() is how we usually make raycast queries. We go through each character and use its position and direction to build the Ray, which is then fed to Physics.RaycastNonAlloc(). Then we loop over the hits it reported and process them.

For UseRaycastCommand(), we’re going to mimic what happens in UseRaycastNonAlloc() but use RaycastCommand. This is where the fun starts.

Using RaycastCommand

First, we collect the positions and directions in NativeArrays. We do it this way so that we can prepare the RaycastCommand structs in a Burst-compiled job. In your game, this part might not necessarily be positions and directions. What we’re doing here is collecting the data needed to prepare the raycasts. For example, you might want to know whether obstacles are blocking units from their current attack targets. Preparing the raycasts for that case is a bit different, since the direction has to be computed as target - unitPosition and then normalized. This part depends on your use case.
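For the line-of-sight case mentioned above, the preparation job could look roughly like the sketch below. PrepareLineOfSightCommandsJob, unitPositions and targetPositions are names made up for illustration. The sketch also assumes that limiting the ray to the unit-to-target distance is what you want, which may need adjusting if the target’s own collider is on the raycast layer.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Hypothetical job: builds one RaycastCommand per unit, aimed at its target.
[BurstCompile]
public struct PrepareLineOfSightCommandsJob : IJobFor {
    [ReadOnly]
    public NativeArray<Vector3> unitPositions;

    [ReadOnly]
    public NativeArray<Vector3> targetPositions;

    [WriteOnly]
    public NativeArray<RaycastCommand> commands;

    public int layerMask;

    public void Execute(int index) {
        Vector3 origin = this.unitPositions[index];
        Vector3 toTarget = this.targetPositions[index] - origin;
        float distance = toTarget.magnitude;

        // Normalize by hand so the length is computed only once.
        // Fall back to an arbitrary direction when unit and target coincide
        // (assumption: a zero-length ray simply reports no hit).
        Vector3 direction = distance > 1e-5f
            ? toTarget / distance
            : new Vector3(0f, 0f, 1f);

        // The ray stops at the target, so any hit is something in between.
        this.commands[index] = new RaycastCommand(
            origin, direction, new QueryParameters(this.layerMask), distance);
    }
}
```

It would be scheduled the same way as PrepareRaycastCommandsJob in the sample, with the two position arrays filled from your units and their targets.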

Next we prepare an array of RaycastHits. Notice how the length of this array is characterCount * maxHits. There’s this text in the documentation: “The result for a command at index N in the command buffer will be stored at index N * maxHits in the results buffer.” I was confused by this at first because I was using just characterCount as the length of the results. My understanding is that when calling RaycastCommand.ScheduleBatch(), you specify maxHits, the maximum number of colliders a single ray can hit. The results array therefore has to be big enough to hold maxHits results for each raycast. Since we raycast once per character, its length should be characterCount * maxHits.
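To make the layout concrete, here is the index arithmetic on its own, in plain C# with no Unity types. RaycastResultLayout is a hypothetical helper, not something Unity provides.

```csharp
using System;

// Hypothetical helper: index arithmetic for the RaycastCommand results buffer.
public static class RaycastResultLayout {
    // Total number of RaycastHit slots needed for commandCount commands.
    public static int BufferLength(int commandCount, int maxHits) {
        if (commandCount < 0) {
            throw new ArgumentOutOfRangeException(nameof(commandCount));
        }

        if (maxHits < 1) {
            throw new ArgumentOutOfRangeException(nameof(maxHits));
        }

        // checked: fail loudly instead of silently overflowing
        return checked(commandCount * maxHits);
    }

    // Index of the first slot reserved for the command at commandIndex.
    // Its slots run from this index up to (but not including) this index + maxHits.
    public static int SliceStart(int commandIndex, int maxHits) {
        return commandIndex * maxHits;
    }
}
```

With 1,000 characters and maxHits of 8, BufferLength(1000, 8) is 8000, and the command at index 3 owns slots 24 to 31.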

You decide the most appropriate value of maxHits for your game. This depends on how the physical world of your game is made. How many colliders can a single ray realistically pass through? How many does it hit on average? Decide on a reasonable value: not too big, but not too small either.
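One possible way to sanity-check a chosen maxHits is to count how many rays filled every slot they were given, since a full slice suggests hits may have been dropped. The sketch below is plain C# and is only an assumption about how you might measure this: MaxHitsTuning is a hypothetical helper, and colliderIds stands in for the colliderInstanceID of each result slot, with 0 meaning an invalid result.

```csharp
// Hypothetical helper for judging whether maxHits is too small.
public static class MaxHitsTuning {
    // colliderIds holds one entry per result slot (rayCount * maxHits entries),
    // where 0 marks an invalid result.
    public static int CountSaturatedRays(int[] colliderIds, int rayCount, int maxHits) {
        int saturated = 0;

        for (int r = 0; r < rayCount; r++) {
            int start = r * maxHits;
            int validHits = 0;

            // Scan front to back and stop at the first invalid slot.
            // Slots after it may hold stale data and must be ignored.
            while (validHits < maxHits && colliderIds[start + validHits] != 0) {
                validHits++;
            }

            if (validHits == maxHits) {
                // Every slot was used, so this ray may have hit more colliders.
                saturated++;
            }
        }

        return saturated;
    }
}
```

If a large share of rays come back saturated, that hints maxHits is too small for the scene. If almost none do, there may be room to lower it and shrink the results array.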

Then we prepare the array of RaycastCommands, whose length is the same as the number of characters. The RaycastCommand instances here contain default values, so we have to fill them in. Since we are using native collections, we can do that in a Burst-compiled job. This is what PrepareRaycastCommandsJob does:

[BurstCompile]
private struct PrepareRaycastCommandsJob : IJobFor {
    [ReadOnly]
    public NativeArray<Vector3> positions;

    [ReadOnly]
    public NativeArray<Vector3> directions;
    
    [NativeDisableParallelForRestriction]
    public NativeArray<RaycastCommand> commands;

    public int layerMask;

    public void Execute(int index) {
        Vector3 position = this.positions[index];
        Vector3 direction = this.directions[index];
        this.commands[index] = new RaycastCommand(
            position, direction, new QueryParameters(this.layerMask), 50);
    }
}

We schedule this job to prepare the RaycastCommands. After this, we can now call RaycastCommand.ScheduleBatch(). This method accepts and returns a JobHandle as well so we can chain it with the handle result of scheduling PrepareRaycastCommandsJob:

// Prepare RaycastCommands
PrepareRaycastCommandsJob prepareCommandsJob = new() {
    positions = positions,
    directions = directions,
    commands = commands,
    layerMask = this.layerMask
};
JobHandle handle = prepareCommandsJob.ScheduleParallel(commands.Length, 64, default);

// Schedule raycasts (chain with previous handle)
int commandsPerJob = Mathf.Max(this.characterCount / JobsUtility.JobWorkerCount, 1);
handle = RaycastCommand.ScheduleBatch(commands, results, commandsPerJob, maxHits, handle);

The value of commandsPerJob is computed so that the raycasts are distributed evenly amongst the available worker threads. From here, we can call JobHandle.Complete() so we can get the results of the raycasts and process them however we like.
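The commandsPerJob arithmetic can be isolated into a small function to see what it produces. This is a plain C# sketch. RaycastBatching is a hypothetical helper, and the extra clamp on the worker count is purely defensive: I am not certain JobsUtility.JobWorkerCount can ever report 0, but a zero there would make the original division throw.

```csharp
using System;

// Hypothetical helper mirroring the commandsPerJob computation in the sample.
public static class RaycastBatching {
    public static int MinCommandsPerJob(int commandCount, int workerCount) {
        // Defensive clamp so the division below can never divide by zero.
        int workers = Math.Max(workerCount, 1);

        // Integer division spreads the commands across workers;
        // each job gets at least one command.
        return Math.Max(commandCount / workers, 1);
    }
}
```

For 1,000 raycasts on 7 workers this gives 142 commands per job. With more workers than raycasts, for example 3 raycasts on 8 workers, it falls back to 1.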

Lastly, don’t forget to dispose the temporary native collections that you have created.

Processing the results

Processing the results is a bit tricky due to the usage of maxHits. In the results array, maxHits amount of RaycastHit is reserved for every raycast. Therefore, the starting RaycastHit of a raycast at index r is at r * maxHits. Going through the results looks like this:

NativeArray<RaycastHit> results = new(raycastCount * maxHits, Allocator.TempJob);
for(int r = 0; r < raycastCount; ++r) {
    int startIndex = r * maxHits;

    // Go through each raycast hit
    for(int i = 0; i < maxHits; ++i) {
        int raycastHitIndex = startIndex + i;
        RaycastHit hit = results[raycastHitIndex];
        // Do something with hit here
    }
}

That’s not all. There’s also this part of the documentation:

If maxHits is larger than the actual number of results for the command the result buffer will contain some invalid results which did not hit anything. The first invalid result is identified by the collider being null. The second and later invalid results are not written to by the raycast command so their colliders are not guaranteed to be null. When iterating over the results the loop should stop when the first invalid result is found.

You have to check for the first invalid result and discard the rest. Iterating through the results should look like this instead:

NativeArray<RaycastHit> results = new(raycastCount * maxHits, Allocator.TempJob);
for(int r = 0; r < raycastCount; ++r) {
    int startIndex = r * maxHits;

    // Go through each raycast hit
    for(int i = 0; i < maxHits; ++i) {
        int raycastHitIndex = startIndex + i;
        RaycastHit hit = results[raycastHitIndex];
   
        if(hit.collider == null) {
            // Found an invalid result. No need to continue.
            break;
        }

        // Do something with hit here
    }
}

It’s error prone to repeat this code so I implemented an enumerator for it:

public struct RaycastResultEnumerator {
    private readonly NativeArray<RaycastHit> results;
    private readonly int startingIndex;
    private readonly int maxHits;

    private int localIndex;

    public RaycastResultEnumerator(ref NativeArray<RaycastHit> results, int raycastIndex, int maxHits)
    {
        this.results = results;
        this.startingIndex = raycastIndex * maxHits;
        this.maxHits = maxHits;

        this.localIndex = 0;
    }

    public bool HasNextHit(out RaycastHit hit)
    {
        if (this.localIndex >= this.maxHits)
        {
            // Reached the end
            hit = default;
            return false;
        }

        int hitIndex = this.startingIndex + this.localIndex;
        hit = this.results[hitIndex];
        if (hit.colliderInstanceID == 0)
        {
            // Documentation says that iteration should stop as soon as a collider is null
            return false;
        }

        // Move to next
        ++this.localIndex;

        return true;
    }
}

Notice that I used if (hit.colliderInstanceID == 0) to check whether the current result is invalid. I did it this way so I can use this enumerator in Burst-compiled code. I wouldn’t be able to use it in Burst-compiled jobs had I used if (hit.collider == null), because collider is a managed reference type, which the Burst compiler can’t compile.

In the sample code, this is how the enumerator is used:

// Process the results
for (int i = 0; i < this.characterCount; i++) {
    RaycastResultEnumerator hitEnumerator = new(ref results, i, maxHits);
    while (hitEnumerator.HasNextHit(out RaycastHit hit)) {
        // Do something with the result
        // Uncomment Debug.Log() to check that the raycast worked
        //Debug.Log($"Hit: {hit.collider.name}");
    }
}

So how’s the performance?

This is the moment that you’ve been waiting for, haven’t you? I’ve set up a scene with a GameObject that has the RaycastCommandSample component added. I’ve used the default cube as the colliderPrefab, with a Rigidbody component added and Is Kinematic set to true. I’ve set characterCount to 1,000 (so 1,000 raycasts) and colliderCount to 100. The profiler on my machine looks like this (note: this is in editor):

UseRaycastNonAlloc() takes around 2.0ms – 2.5ms while UseRaycastCommand() takes around 0.5ms – 0.7ms. Going by these ranges, UseRaycastCommand() is roughly 3x to 5x faster. Notice also that the physics calls (the orange ones) are distributed amongst the worker threads. These are the specs of my laptop:

The performance should get better on machines with more cores, since the raycasts are spread across more worker threads. That’s pretty awesome to me!

Optimize further?

Wait! There’s more! Since the raycast results are already in a NativeArray, we can pass them to another Burst-compiled job for processing, and it can even run in parallel. This can shave off a bit more CPU time. A job that finds the distance of the nearest hit can look like this:

[BurstCompile]
private struct ResolveNearestHitJob : IJobFor {
    [ReadOnly]
    public NativeArray<RaycastHit> results;

    // Set result here if the raycast has at least one hit
    [NativeDisableParallelForRestriction]
    public NativeArray<bool> hasHitResults;

    [NativeDisableParallelForRestriction]
    public NativeArray<float> nearestDistanceResults;

    public int maxHits;
    
    public void Execute(int index) {
        RaycastResultEnumerator hitEnumerator = new(ref this.results, index, this.maxHits);
        float nearestDistance = float.MaxValue;
        
        while (hitEnumerator.HasNextHit(out RaycastHit hit)) {
            // Has at least one hit
            this.hasHitResults[index] = true;

            float distance = hit.distance;
            if (distance < nearestDistance) {
                nearestDistance = distance;
            }
        }
        
        // Store the nearest distance
        this.nearestDistanceResults[index] = nearestDistance;
    }
}

Then at the call site, we can schedule this job and chain it after calling RaycastCommand.ScheduleBatch():

// Schedule raycasts
int commandsPerJob = Mathf.Max(this.characterCount / JobsUtility.JobWorkerCount, 1);
handle = RaycastCommand.ScheduleBatch(commands, results, commandsPerJob, maxHits, handle);

// Prepare processing job
NativeArray<bool> hasHitResults = new(this.characterCount, Allocator.TempJob);
NativeArray<float> nearestDistanceResults = new(this.characterCount, Allocator.TempJob);
ResolveNearestHitJob resolveNearestJob = new() {
    results = results,
    hasHitResults = hasHitResults,
    nearestDistanceResults = nearestDistanceResults,
    maxHits = maxHits
};
handle = resolveNearestJob.ScheduleParallel(this.characterCount, 64, handle);

// We only complete here
handle.Complete();

// Process nearest hit distance
for (int i = 0; i < this.characterCount; i++) {
    if (!hasHitResults[i]) {
        // No hit
        continue;
    }

    float nearestDistance = nearestDistanceResults[i];
    // Do something with nearestDistance
}

// Don't forget to dispose
positions.Dispose();
directions.Dispose();
results.Dispose();
commands.Dispose();

hasHitResults.Dispose();
nearestDistanceResults.Dispose();

That’s all I have for now. I hope you found this compelling. Consider using this if you have lots of raycasts.
