GOAP Extensions

We use GOAP for our AI in Academia. It’s working well for us so far. New behaviour can be added easily and AI glitches can be fixed easily. But I’ve had problems with it as well. One that needed fixing is duplicate actions.

We have different classes of characters in the game: students, teachers, workers, cooks, nurses, and janitors (more are coming). Each of them has a different set of actions, but most of the time they also share a common set of actions. For example, eating. If I fix the eating behavior in students, I also have to apply the same fix to the other classes. This is a maintenance nightmare. A character class could break if I happen to forget a certain fix, and applying a fix to each of the classes is tedious, too.

GOAP Actions Refactoring

I needed a way to refactor GOAP actions such that I could edit one set of actions and have the change apply to all the character classes. Thus, I introduced “extensions” to our GOAP framework.

GoapExtensions

An extension is basically a reference to another GOAP data set. During parsing of this data, the system adds all the actions found in the extension. An extension can also have a set of preconditions. These preconditions are added to all the actions in the extension. For example, from the image above, the NeedsBehaviour extension will only be executed if HasRequestedTreatment = false.
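To make this concrete, here is a minimal sketch of how extension parsing could merge actions. Everything here (the `GoapDataSet`/`GoapAction` names, the dict-based preconditions) is an illustrative assumption, not our actual framework code.

```python
class GoapAction:
    def __init__(self, name, preconditions=None):
        self.name = name
        # Map of condition name -> required bool value
        self.preconditions = dict(preconditions or {})

class GoapDataSet:
    def __init__(self, actions=None, extensions=None):
        self.actions = list(actions or [])
        # Each extension is a (data_set, extra_preconditions) pair
        self.extensions = list(extensions or [])

    def resolve_actions(self):
        """Own actions plus all extension actions, with the extension's
        preconditions added to every inherited action."""
        resolved = list(self.actions)
        for data_set, extra_preconditions in self.extensions:
            for action in data_set.resolve_actions():
                merged = dict(action.preconditions)
                merged.update(extra_preconditions)
                resolved.append(GoapAction(action.name, merged))
        return resolved

# A character class data set referencing a shared "needs" data set
needs = GoapDataSet(actions=[GoapAction("Eat"), GoapAction("Rest")])
student = GoapDataSet(
    actions=[GoapAction("AttendClass")],
    extensions=[(needs, {"HasRequestedTreatment": False})])
```

With this arrangement, fixing a “needs” action only touches the shared data set; every class that references it picks up the fix on the next parse.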

The refactored actions that pertain to “needs” are now placed in their own separate GOAP data set:

NeedsBehaviour

The specific GOAP data for each character class can simply reference this “needs” data to be able to execute such actions. I only need to fix this “needs” GOAP data if I have to fix a “needs” related behavior. No need to apply separate fixes to each character class.

This feature turned out to be very useful. Every time there’s a new set of behaviours that could be reused, I put them in a separate GOAP data set. The character classes that require it can just add this new data as an extension. For example, for now, only students can use computers, so I made a separate GOAP data set called “ComputerLabBehaviour” and added it as an extension to the student’s GOAP data. Later on, if we decide that teachers can also use computers, I can simply add the “ComputerLabBehaviour” data as an extension to the teacher’s GOAP data.

Behaviours
Our current set of behaviours

Avoiding Expensive A*

The most expensive A* searches are the ones whose destination cannot be reached. It’s a paradox when using A*-based path-finding algorithms: you want to know the most optimal path, but at the same time, you also want to know if a certain position is reachable before you run the search. Fortunately, there is a way.

I don’t know the exact term for this; it goes by many names: “flood fill”, “seed fill”, or “connection labeling”. The concept is to “label” connected tiles. Label here simply means setting a unique int value for a set of connected tiles. Before invoking A*, you simply compare the labels of the start and destination tiles. If they are the same, the two tiles can reach each other and A* can proceed. If not, they are unreachable and you can skip the expensive search entirely. That’s pretty neat for something as simple as comparing integers.

ConnectionLabeling
Here’s an example from our current game Academia. 2 tiles can’t reach the 4 tiles.
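The pre-check itself amounts to one integer comparison before the search. Here’s a hedged sketch; the `Tile` class and the `a_star_search` callback are stand-ins for illustration, not our actual code.

```python
class Tile:
    def __init__(self, label):
        self.label = label

def can_reach(tile_a, tile_b):
    """Tiles that share the same connection label are mutually reachable."""
    return tile_a.label == tile_b.label

def find_path(start, goal, a_star_search):
    # Cheap integer comparison first; only run A* when it can succeed.
    if not can_reach(start, goal):
        return None  # Unreachable: skip the expensive search entirely
    return a_star_search(start, goal)
```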

This is the flood filler algorithm that I’ve used:

Queue<Tile> disconnectedQueue = new Queue<Tile>();
Queue<Tile> connectedQueue = new Queue<Tile>();

int currentLabel = 1;

// startingTile could be anything from random tile 
// or a tile that has recently been updated. Any tile in a map works.
connectedQueue.Enqueue(startingTile);

while(connectedQueue.Count > 0 || disconnectedQueue.Count > 0) {
    ProcessConnectedQueue();
    ProcessDisconnectedQueue();
}

void ProcessConnectedQueue() {
    while(connectedQueue.Count > 0) {
        Tile tile = connectedQueue.Dequeue();
        if(tile is already labeled) {
            continue;
        }

        bool hasBlocker = tile.HasBlocker; // Whether or not it blocks movement

        // Label the tile
        if(hasBlocker) {
            tile.Label = 0; // Assign zero for blockers
        } else {
            tile.Label = currentLabel; // Connected walkable tiles share the current label
        }

        // Add each neighbor
        foreach(Tile neighbor in tile) {
            AddToQueue(neighbor, hasBlocker);
        }
    }
}

// Identify an unlabeled tile and add it to the connectedQueue for processing
void ProcessDisconnectedQueue() {
    while(disconnectedQueue.Count > 0) {
        Tile tile = disconnectedQueue.Dequeue();
        if(tile is already labeled) {
            // We are not interested in labeled tiles
            continue;
        }

        if(!tile.HasBlocker) {
            // Update the label if the next tile to process is not a blocker
            ++currentLabel;
        }

        connectedQueue.Enqueue(tile);
        break;
    }
}

void AddToQueue(Tile tile, bool hasBlocker) {
    if(tile is already labeled) {
        // No need to add to queue since it was already labeled
        return;
    }

    if(tile.HasBlocker == hasBlocker) {
        // Has the same blocker value. They are connected.
        connectedQueue.Enqueue(tile);
    } else {
        // Not the same blocker value. They are disjoint.
        disconnectedQueue.Enqueue(tile);
    }
}

The tile labels are updated every time a tile becomes a blocker, like when a wall is built on it. Our implementation of the algorithm runs across multiple frames so that the game won’t slow down whenever something is built. We process X number of tiles per frame. I’m still tuning the right number.
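The post doesn’t show the multi-frame mechanics, so here is one hedged way to slice the work, using a Python generator as a stand-in for a coroutine. `TILES_PER_FRAME` and the loop structure are illustrative assumptions.

```python
TILES_PER_FRAME = 50  # The "X tiles per frame" knob mentioned above

def relabel_incrementally(tiles_to_process, label_tile):
    """Labels tiles in chunks, yielding after every TILES_PER_FRAME tiles.

    The game loop resumes the generator once per frame, so a large
    relabeling job never stalls a single frame."""
    processed = 0
    for tile in tiles_to_process:
        label_tile(tile)
        processed += 1
        if processed % TILES_PER_FRAME == 0:
            yield  # Hand control back to the game loop

# Game loop side: create the job once, then advance it once per frame:
#   job = relabel_incrementally(dirty_tiles, label_fn)
#   ... each frame: next(job, None)
```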

What do you think? Let me know if you have ideas on how to make the flood fill algorithm run faster.

Our Tile Class

The foundation of every tile-based game is the structure of its tile model. I think our game Academia has come a long way in the evolution of its tile structure. It’s robust enough to be shown to the public. If you’re making a tile-based game, hopefully this helps you.

I’ll show you the member variables first:

public class Tile {

    private readonly TileGrid grid; // The parent grid
    private readonly Cell cell;

    private NodeHandle handle;

    // Note here that this is a map of layer and TileItem pair
    private Dictionary<string, TileItem> itemMap = new Dictionary<string, TileItem>();

    // Note here that task items are added to a separate map which uses its BuildItem as the key
    private Dictionary<string, TileItem> taskItemMap = new Dictionary<string, TileItem>();

    private readonly IntVector2 position;

    private int itemFlags;

    // May be null if tile is not in any zone
    private Zone zone;

    ...
}

We included the parent TileGrid so we can easily find other tiles whenever we have a reference to one tile. This helps a lot when neighbor tiles are needed. cell contains information such as the tile’s world position, cell width and height, bottom-left position, top-right position, etc. The NodeHandle handle acts as the node in our A* framework. position is the tile position in integer coordinates.

The dictionaries itemMap and taskItemMap are used to store the TileItem instances in each layer. A TileItem contains information about what occupies a tile. For example, a TileItem instance for a table object means that a table occupies this tile. It also records whether the item blocks the tile (a wall, for example). A Tile can have multiple TileItem instances for cases like the tile having a floor, dirt, and an object on top of it; each of these is in a different layer. The use of a dictionary also implies that there can only be one TileItem per layer. This helps in checking if a tile already contains an item in a certain layer, which is usually used to prevent players from building objects on tiles that already have existing objects.

We differentiate between normal items and task items. Normal built items are added to itemMap while task items go into taskItemMap. This differentiation is needed so that items in multiple layers can be built on the tile, for example, build a floor and then build a table on top of it. Tasks used to be stored in only one layer but we found this to be inadequate, thus the current implementation.

itemFlags is a bitmask containing a bunch of information, such as whether the tile is blocked, whether it blocks students, whether it contains a “character slot”, and the orientation of the character if one uses the slot. We specifically used a bitmask for faster “reachable” checking during an A* search.
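The bitmask idea can be sketched in a few lines. The flag values below are assumptions for illustration; only the technique, a single int tested with bitwise AND, mirrors the description above.

```python
# Illustrative flag values (one bit each); not the actual TileItemLayout constants
PHYSICAL_BLOCKER = 1 << 0
STUDENT_BLOCKER  = 1 << 1
CHARACTER_SLOT   = 1 << 2

def combine_flags(item_flags_list):
    """Recreate a tile's flags by OR-ing the flags of its items."""
    flags = 0
    for item_flags in item_flags_list:
        flags |= item_flags
    return flags

def has_physical_blocker(flags):
    # A single AND is much cheaper than scanning the item dictionaries
    # for every tile visited during an A* search.
    return (flags & PHYSICAL_BLOCKER) != 0
```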

zone is the Zone instance where the tile is located. We added this for optimization purposes. In the old implementation we had to query the zone at a certain tile position, which was slow and not very ideal.

Here’s the full class (removed function comments because “<text>” like these messes WordPress’ formatting):

    public class Tile {

        private readonly TileGrid grid; // The parent grid
        private readonly Cell cell;

        private NodeHandle handle;

        // Note here that this is a map of layer and TileItem pair
        private Dictionary<string, TileItem> itemMap = new Dictionary<string, TileItem>();

        // Note here that task items are added to a separate map which uses its BuildItem as the key
        private Dictionary<string, TileItem> taskItemMap = new Dictionary<string, TileItem>();

        private readonly IntVector2 position;

        private int itemFlags;

        // May be null if tile is not in any zone
        private Zone zone;

        public Tile(TileGrid grid, Cell cell) {
            this.grid = grid;
            this.cell = cell;
            this.position = new IntVector2(cell.x, cell.y);
        }

        public void Add(TileItem tileItem) {
            if (tileItem.Data.WorkerTask) {
                Assertion.Assert(!this.taskItemMap.ContainsKey(tileItem.BuildItemData.TileLayer));
                this.taskItemMap[tileItem.BuildItemData.TileLayer] = tileItem;
            } else {
                Assertion.Assert(!this.itemMap.ContainsKey(tileItem.Layer)); // Should not contain the item yet
                this.itemMap[tileItem.Layer] = tileItem;

                // Add the flag as well
                this.itemFlags |= tileItem.Flags;
            }
        }

        public void Remove(string tileLayer, string tileItemId) {
            // Can't remove task items through this method
            // Use RemoveTask() instead
            Assertion.Assert(!TileLayers.TASKS.EqualsString(tileLayer));

            TileItem item = null;
            Assertion.Assert(this.itemMap.TryGetValue(tileLayer, out item)); // Item should exist
            
            Assertion.Assert(item.Data.ItemId.Equals(tileItemId));
            this.itemMap.Remove(tileLayer);
            Assertion.Assert(!Contains(tileLayer));

            // Recreate the flags from the existing items
            this.itemFlags = 0;
            foreach (KeyValuePair<string, TileItem> entry in this.itemMap) {
                this.itemFlags |= entry.Value.Flags;
            }
        }

        public void RemoveTask(string buildItemLayer, string tileItemId) {
            // Must be the layer of the built item
            Assertion.Assert(!TileLayers.TASKS.EqualsString(buildItemLayer));

            TileItem taskItem = this.taskItemMap.Find(buildItemLayer);
            Assertion.AssertNotNull(taskItem);

            Assertion.Assert(taskItem.Data.ItemId.Equals(tileItemId));
            this.taskItemMap.Remove(buildItemLayer);
        }

        public bool Contains(string layerName) {
            // Note that a task layer is no longer just one item
            // It's a layer of items by itself
            Assertion.Assert(!TileLayers.TASKS.Equals(layerName));
            return this.itemMap.ContainsKey(layerName);
        }

        public bool ContainsTask(string layerName) {
            // Note that a task layer is no longer just one item
            // It's a layer of items by itself
            Assertion.Assert(!TileLayers.TASKS.Equals(layerName));
            return this.taskItemMap.ContainsKey(layerName);
        }

        public TileItem GetItem(string layerName) {
            // Note that a task layer is no longer just one item
            // It's a layer of items by itself
            Assertion.Assert(!TileLayers.TASKS.Equals(layerName));

            TileItem item = null;
            this.itemMap.TryGetValue(layerName, out item);

            // This may return null
            // Client code should check for this
            return item;
        }

        public TileItem GetTaskItem(string layerName) {
            // Note that a task layer is no longer just one item
            // It's a layer of items by itself
            Assertion.Assert(!TileLayers.TASKS.Equals(layerName));
            return this.taskItemMap.Find(layerName);
        }

        public bool HasTaskItems {
            get {
                return this.taskItemMap.Count > 0;
            }
        }

        public TileItem GetTopTaskItem() {
            if(!HasTaskItems) {
                return null;
            }

            for(int i = 0; i < TileLayers.ORDERED_LAYERS.Length; ++i) {
                TileItem taskItem = GetTaskItem(TileLayers.ORDERED_LAYERS[i]);
                if(taskItem != null) {
                    return taskItem;
                }
            }

            return null;
        }

        public bool HasCharacterSlot {
            get {
                return (this.itemFlags & TileItemLayout.CHARACTER_SLOT) > 0;
            }
        }

        public bool HasPhysicalBlocker {
            get {
                return (this.itemFlags & TileItemLayout.PHYSICAL_BLOCKER) > 0;
            }
        }

        public bool HasStudentBlocker {
            get {
                return (this.itemFlags & TileItemLayout.STUDENT_BLOCKER) > 0;
            }
        }

        internal NodeHandle Handle {
            get {
                return handle;
            }

            set {
                this.handle = value;
            }
        }

        public Cell Cell {
            get {
                return cell;
            }
        }

        public TileGrid Grid {
            get {
                return grid;
            }
        }

        public IntVector2 Position {
            get {
                return position;
            }
        }

        public Zone Zone {
            get {
                return zone;
            }

            set {
                zone = value;
            }
        }

        public bool Contains(Vector3 worldPosition) {
            return this.cell.Contains(worldPosition);
        }

    }

GOAP For Our New Game

I’m excited that we’re making a builder type of game in the vein of Prison Architect, Banished, and Rimworld. I love playing such games. Ours is a school management game where you can design classrooms and offices, hire teachers, design the curriculum, and guide students to their educational success.

currentgamescreenshot

For every new game, it’s always my aim to implement a new algorithm or system and learn something new. I’ve always been fascinated with an AI planning system called Goal Oriented Action Planning or GOAP. If you’re not familiar with it, here’s a simple tutorial. I haven’t developed such a system myself as the games that I’ve made so far had no use for it. I think it’s the perfect AI system for builder games. I hope I’m right.

Why GOAP

The primary reason is I’m lazy. I don’t want to wire and connect stuff like you do with Finite State Machines and Behaviour Trees. I just want to provide a new action and my agents will use it when needed. Another main reason is that I reckoned there are going to be a lot of action order combinations in the game. I don’t want to enumerate all of those combinations. I want the game agents to just discover them and surprise the player.

Another important reason is that the AI system itself is an aid for development. There are going to be lots of objects in the game that the agents may interact with. While I’m adding them one by one, I’ll just add the actions that can be done with each object and the agents will do the rest. I don’t have to reconfigure them much every time there’s a new action available. Just add the action and it’s done.

Some Tweaks

While making the system, I had some ideas that would make the generic GOAP system better. They sure have paid off.

Multiple Sequenced Actions

Instead of performing only a single action, each of our custom GOAP actions contains a set of modular atomic actions. Each atomic action is executed in sequence. This is what it looks like in the editor:

multipleactions

By doing it this way, I can make reusable atomic actions that can be used by any agent. A GOAP action then is just a named object that contains preconditions, effects, and a set of atomic actions.

GoapResult

I incorporated the concept of action results like how it is in Behaviour Trees. An atomic action execution returns either SUCCESS, FAILED, or RUNNING. This is what the atomic action base class looks like:

public abstract class GoapAtomAction {

    public virtual void ResetForPlanning(GoapAgent agent) {
    }

    public virtual bool CanExecute(GoapAgent agent) {
        return true;
    }

    public virtual GoapResult Start(GoapAgent agent) {
        return GoapResult.SUCCESS;
    }

    public virtual GoapResult Update(GoapAgent agent) {
        return GoapResult.SUCCESS;
    }

    public virtual void OnFail(GoapAgent agent) {
    }

}

When an atom action returns FAILED, the whole current plan fails and the agent plans again. A RUNNING result means that the current action is still running, which also means that the current plan is still ongoing. A SUCCESS result means that the action has finished its execution and the agent can proceed to the next atomic action. When all of the atomic actions have returned SUCCESS, the whole GOAP action is a success and the next GOAP action in the plan will be executed.

This concept makes it easy for me to add failure conditions while an action is being executed. Whenever one action fails, the agent automatically replans and proceeds to execute its new set of actions.
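The sequencing rule above can be sketched compactly. This is a hedged illustration of the result-handling logic, not our actual runner; the names and the `(result, next_index)` return shape are assumptions.

```python
from enum import Enum

class GoapResult(Enum):
    SUCCESS = 0
    FAILED = 1
    RUNNING = 2

def update_goap_action(atom_actions, index, agent):
    """Advance one GOAP action by one tick. Returns (result, next_index)."""
    result = atom_actions[index](agent)
    if result == GoapResult.FAILED:
        return GoapResult.FAILED, index        # Whole plan fails; agent replans
    if result == GoapResult.RUNNING:
        return GoapResult.RUNNING, index       # Keep updating the same atom
    next_index = index + 1
    if next_index >= len(atom_actions):
        return GoapResult.SUCCESS, next_index  # Whole GOAP action succeeded
    return GoapResult.RUNNING, next_index      # Move on to the next atom
```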

Condition Resolver

Condition Resolvers are objects that can query current world conditions which you need during planning. I implemented this as another base class in our system. The concrete classes can then be selectable in the editor. This is what the base class looks like:

public abstract class ConditionResolver {

    private bool resolved;
    private bool conditionMet;

    public ConditionResolver() {
        Reset();
    }

    public void Reset() {
        this.resolved = false;
        this.conditionMet = false;
    }

    public bool IsMet(GoapAgent agent) {
        if(!this.resolved) {
            // Not yet resolved
            this.conditionMet = Resolve(agent);
            this.resolved = true;
        }

        return this.conditionMet;
    }

    protected abstract bool Resolve(GoapAgent agent);

}

Note that it has logic such that Resolve() will only be invoked once; concrete subclasses only need to override this method. Resolve() may execute complex calculations, so we need to make sure that it’s called at most once, when needed, during planning.

This is what it looks like in editor:

conditionresolvers

All conditions default to false unless they have a resolver which is used to query the actual state of the condition.

Usage

Once the conditions, resolvers, and actions have been set up, all that’s left to do is to add goal conditions and invoke Replan().

void Start() {
    this.agent = GetComponent<GoapAgent>();
    Assertion.AssertNotNull(this.agent);

    // Start the AI
    this.agent.ClearGoals();
    this.agent.AddGoal("StudentBehaviour", true);
    this.agent.Replan();
}

If there are new goals to satisfy, the same calls can be invoked to change the goal(s) for a new plan to be executed.

So Far So Good

Our custom GOAP system is working well for us… for now. I now have working worker agents and student agents, and more will be added. Here’s hoping that we don’t need to revamp the system as we’re already so deep into it.

I Created a Politically Corrupt AI

(This was originally posted in Squeaky Wheel’s website.)

I’ve been using Genetic Algorithm as an aide for game design and development. It fills me with excitement that I can simulate natural selection to help me look for the best solutions to problems. Now I’ll tell you the story of how I used it to improve our AI.

My GA knowledge is still somewhat limited. I learned how to write code for it using this website (go read it, it’s awesome! So unlike smarty academic papers). To give you an idea in casual speak, GA basically simulates the theory of evolution to look for the most “fit” solution. In this simulation, the species or “individuals” are the solutions to the problem. At the start, a certain number of individuals are spawned with random configurations. As such, most of them are dumb solutions at the beginning. Each of them is then assessed and given a fitness score; a higher score means an individual is closer to the solution. Based on this initial population, we spawn the next generation. To do that, we either mutate individuals or let them breed (yes, solutions can mate), with the rule that those with a higher fitness score have a higher chance of being selected for mutation or breeding. With this new generation, we repeat the process of assessment and spawning until we find that one individual that solves our problem.

When coding a GA, you need some important pieces: the individual representation, the fitness assessment function, the mutation function, and the crossover function. If you have these four, you can run a GA. Individual representation is a bit tricky. How do you represent a solution that can also be mutated and bred? One of the most common representations is a list of bits. This can be implemented with a list of booleans, or with plain integers and bit manipulation. Mutation is then just flipping a random number of bits. Breeding, or crossover, is simply exchanging a certain number of bits between two individuals.
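The bit-list operators described above are only a few lines of code. Here’s a minimal sketch using Python lists of 0/1 ints; the single-cut crossover and the `flip_count` parameter are illustrative choices.

```python
import random

def mutate(bits, flip_count=1):
    """Flip flip_count randomly chosen bits (duplicates possible)."""
    mutated = list(bits)
    for _ in range(flip_count):
        i = random.randrange(len(mutated))
        mutated[i] ^= 1  # Flip the bit
    return mutated

def crossover(parent_a, parent_b):
    """Exchange everything after a random cut point between two parents."""
    cut = random.randrange(1, len(parent_a))
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b
```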

Representation by bits was the only representation I knew of. It’s what AI Junkie taught me and I stuck with it. That was until I read a book called “Essentials of Metaheuristics”, a highly recommended book. The contents are written in an informal way, not in an academic bullshit way. It’s a primer on the different algorithms in the field of metaheuristics, though most of it is about GA. From there, I learned that you can represent an individual with anything: lists, trees, graphs, your own data structure. Mutation and crossover can be any made-up alteration of your representation. It can be adding a child, removing a child, changing a value, swapping nodes and edges. Anything! I realized how dumb I was for not arriving at that thought myself.

That gave me an aha moment. What if I automated the creation of our AI using GA? Our AI configuration is very simple. At the same time, the AI is also the most neglected part of our game. We haven’t touched it for a long time. We have a working AI that I configured by hand, but our mechanics have already changed so much that we don’t know if it’s still competitive. Configuring a new AI would take time.

My team gave me a week to work on this, May 2-8, 2016. I wasn’t so sure it would work. What if looking for a better AI took too long, like days of running the simulation? I certainly thought so, because the assessment function is to pit two AI players against each other; the one who wins gets a bigger fitness score. A single playthrough takes time, even when sped up. The point is, making the GA could have been a waste of time.

The first thing I did was make a fast mode of our game: no animations, movement becomes teleportation, removed the standby/wait times, etc. It wasn’t easy. I didn’t have time to write another version of the game solely for the GA. Instead, I used what we have now and provided a mode where it can be played extremely fast. Finally, I had a mode where an AI vs AI game takes around 1 minute to complete 15 turns. Still not fast enough, but quite good already.

Then I made something I call a “multi frame” GA. Basically, it’s a GA that is spread across multiple frames. Note that the assessment function is for the AI to play the game, so the GA must wait for that game to end before it can move on to the rest of the algorithm. In fact, if there are 10 individuals in a generation, 10 games must be played before moving on to spawn the next generation.

Our AI configuration is all code. We use a utility based AI, by the way. It is represented as a set of “task scorers”. Each scorer has an arbitrary set of “considerations”, which are classes that help compute the score for a task. The AI generates all the possible tasks, scores them using the scorers, and picks the one with the highest value.
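The scorer/consideration structure can be sketched like this. The scoring rule here (summing consideration values) and all the names are assumptions for illustration; the real game’s considerations may combine scores differently.

```python
def score_task(task, considerations, state):
    """Score one task by accumulating its considerations' contributions."""
    score = 0.0
    for consideration in considerations:
        score += consideration(task, state)
    return score

def pick_best_task(tasks, scorers, state):
    """scorers maps task id -> list of consideration functions.
    Returns the task with the highest utility score."""
    best_task, best_score = None, float("-inf")
    for task in tasks:
        score = score_task(task, scorers.get(task["id"], []), state)
        if score > best_score:
            best_task, best_score = task, score
    return best_task
```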

My plan was to use the GA to generate different combinations of these scorers and considerations until we get one that beats the current best configuration. Before anything else, I needed the configuration to be saved in a file: every time the GA finds a better AI, it should save that configuration. So I turned the AI configuration into XML, using the class names and variables of the consideration classes in the file, and I load them back using reflection. It looks like this now:

<ScorerSet id="IndividualZero" fitness="1" timestamp="">

 <Scorer taskId="Bribe">
   <Consideration name="BlockadedDistrictConsideration" />
   <Consideration name="TerrorizedConsideration" />

   <Consideration name="MinimumBudgetConsideration">
     <Variable type="NamedInt" name="minimumBudget" value="1000" />
   </Consideration>

   <Consideration name="ReachableConsideration">
     <Variable type="NamedFloat" name="multiplierIfUnreachable" value="0" />
   </Consideration>

   <Consideration name="MustHaveMatchingPlatformConsideration" />

   <Consideration name="ReachReputationConsideration">
     <Variable type="NamedFloat" name="populationPercentage" value="0.85" />
   </Consideration>

   <Consideration name="BonusTuningConsideration">
     <Variable type="NamedFloat" name="bonusToSet" value="1.0" />
   </Consideration>

   <Consideration name="CommandPointsConsideration" />

   <Consideration name="NeighborCountConsideration">
     <Variable type="NamedInt" name="desirableCount" value="4" />
   </Consideration>

   <Consideration name="OpponentStaffCountConsideration" />

   <Consideration name="BribeBetterThanCampaignConsideration">
     <Variable type="NamedInt" name="minReputationGainDifference" value="1000" />
     <Variable type="NamedInt" name="rankIfMet" value="17" />
     <Variable type="NamedInt" name="bonusIfMet" value="10" />
     <Variable type="NamedFloat" name="multiplierIfMet" value="5" />
   </Consideration>

   <Consideration name="ScandalCountReachedConsideration">
     <Variable type="NamedInt" name="scandalCount" value="4" />
     <Variable type="NamedFloat" name="multiplierIfMet" value="0" />
   </Consideration>
 </Scorer>

 <Scorer taskId="RaiseFunds">
   <Consideration name="BlockadedDistrictConsideration" />
   <Consideration name="TerrorizedConsideration" />

   <Consideration name="ReachableConsideration">
     <Variable type="NamedFloat" name="multiplierIfUnreachable" value="0" />
   </Consideration>

   <Consideration name="HasSignificantFundsConsideration">
     <Variable type="NamedInt" name="preferredAmountToRaise" value="1000" />
   </Consideration>

   <Consideration name="BonusTuningConsideration">
     <Variable type="NamedFloat" name="bonusToSet" value="1.0" />
   </Consideration>

   <Consideration name="LowFundsConsideration">
     <Variable type="NamedFloat" name="fundsThreshold" value="0.3" />
   </Consideration>

   <Consideration name="CommandPointsConsideration" />
   <Consideration name="HigherReputationConsideration" />

   <Consideration name="NeighborCountConsideration">
     <Variable type="NamedInt" name="desirableCount" value="4" />
   </Consideration>

   <Consideration name="NearBailiwickConsideration" />
 </Scorer>

 <!-- ... Rest is omitted -->
</ScorerSet>

The mutation function then is just a matter of:

  • Add random consideration
  • Remove random consideration
  • Change some random variables

Crossover between two individuals is simply swapping a random number of considerations.
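These operators work on lists of considerations instead of bits. Here’s a hedged sketch; the dict-based entries, the operator choices, and the tweak rule are all illustrative assumptions, not the actual Squeaky Wheel code.

```python
import copy
import random

def mutate_scorer(considerations, pool):
    """Randomly add, remove, or tweak a consideration entry."""
    mutated = copy.deepcopy(considerations)
    op = random.choice(("add", "remove", "tweak"))
    if op == "add":
        mutated.append(copy.deepcopy(random.choice(pool)))
    elif op == "remove" and mutated:
        mutated.pop(random.randrange(len(mutated)))
    elif op == "tweak" and mutated:
        # Scale each variable of one random entry by a random factor
        entry = mutated[random.randrange(len(mutated))]
        for key in entry.get("vars", {}):
            entry["vars"][key] *= random.uniform(0.5, 1.5)
    return mutated

def crossover_scorers(a, b):
    """Swap a random number of leading considerations between two sets."""
    count = random.randint(1, min(len(a), len(b)))
    child_a = b[:count] + a[count:]
    child_b = a[:count] + b[count:]
    return child_a, child_b
```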

For the initial population, I used our existing AI as a starting point. I called it “Individual Zero”. The first population are individuals that are mutated versions of him. Now with all the pieces together, I have a GA that looks for a better AI.

First Runs

When I was about to finish, I was too excited to run it. I knew it would take time, so I planned to run the simulation overnight while I slept. It was a Saturday night, and I was sleepy and tired. I fixed some errors that would potentially halt the simulation and set the settings for the fast mode: every game lasts only 15 turns, and the population size is set to only 10. Then I finally let the baby run. I watched the first few AI matches, and after a while, with nothing going wrong, I locked the laptop and went to bed.

When I woke up, I checked it immediately. The simulation had hung. I looked at the Task Manager: 90% RAM usage. So I closed it. But lo and behold, it had generated 10 AI configurations, each one better than the one before it. So I was like “that went well”. I pushed my work and restarted my laptop. Then my laptop wouldn’t start; it only said “Diagnosing Your PC”. But I wasn’t worried. I knew I hadn’t done anything stupid; the simulation had probably just messed up the memory. An hour later, my laptop was alive again, and I immediately fixed the memory leak. Lesson learned when writing a resource intensive GA like this one.

After coffee, I decided to play a game against the new AI and see if it’s really better. Maybe it’s just better against other AIs but not humans. So I used my usual strategy: bribed supporters on the first few turns and acquired districts as soon as I could. I was winning. Every 5 turns, the game shows a graph of how many supporters each candidate has. I beat his numbers every time. I even had more acquired districts than him. Am I really playing against a better AI? On the last turn, I made my actions. I knew I was gonna win; I controlled more districts than him. That dread that I had probably just wasted time returned again. Election came… the AI won by an unbelievable landslide. Jaw dropped. Turns out he didn’t care about the voters and was befriending more patrons than me. Well, I guess I created a monster.

One AI is Easy, but FOUR?!?!?

I have implemented my own FSM and Behavior Tree systems for AI in games. I know how to properly use them in most cases. They’ve worked really well. I honestly think I already have the working knowledge for the AI of the kind of games I’d like to make… until Party Animals came. I felt stupid and incapable.

FSMs and Behavior Trees are good for single agent AI. Single agent here means that the simulated behavior is only responsible for itself. It doesn’t see other agents and try to perform tasks together with them more effectively. For example, let’s describe the single agent behavior of an enemy unit guarding a gate:

  • Unit stands still near the gate
  • If it sees a player unit within its range,
    • Chase and attack that player unit
    • If player unit dies or goes out of range, go back to gate
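This behavior maps naturally to a small FSM. Here’s a minimal sketch in plain C# (the class and state names are mine, for illustration only, not from any actual game code):

```csharp
using System;

public enum GuardState { Guarding, Chasing, Returning }

public class GateGuard {
    public GuardState State { get; private set; } = GuardState.Guarding;

    // playerInRange: the guard sees a player unit within its range
    // atGate: the guard has arrived back at its gate position
    public void Update(bool playerInRange, bool atGate) {
        switch (State) {
            case GuardState.Guarding:
                // stand still near the gate until a player unit shows up
                if (playerInRange) State = GuardState.Chasing;
                break;
            case GuardState.Chasing:
                // target died or escaped: head back to the gate
                if (!playerInRange) State = GuardState.Returning;
                break;
            case GuardState.Returning:
                if (playerInRange) State = GuardState.Chasing;
                else if (atGate) State = GuardState.Guarding;
                break;
        }
    }
}
```

Each frame, the game feeds the guard only what it can perceive about itself and its target; nothing in the machine knows about other guards, which is exactly the limitation described next.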

Now let’s say we assign two enemy units with this behavior to that gate. It will still work, but it’s probably not the best strategy for defending the gate. Both of them will chase the target as soon as they see one. They could have decided that one of them chases the player’s unit while the other remains on guard. This is no longer single agent. Multiple agent behavior is another beast with its own set of complexities.

AI vs AI in Action!

In Party Animals, a player controls 4 units called “Staff”. One is the candidate character himself and the other 3 can be any combination of Police, Accountant, Investigator, PR Manager, or Goon (we’ll add more). Each Staff has a common set of abilities and a unique one. The game is turn based. Each Staff can execute one action and move to an adjacent district once per turn. For each Staff, the player may choose to move and then execute an action, or execute first and then move to another district. Aside from this, the player also has to assign a budget for each action; the budget allocated affects how effective the action is.

Here’s the problem: implement an AI opponent that controls 4 Staff units and poses a challenge to the player. It doesn’t have to play the best moves, but it should at least be smart enough to pick and order its actions sensibly. Sounds easy? (If yes, email me your solution, please.)

I think the main difference between single agent AI systems and multiple agent ones is the concept of planning. There’s no planning in FSMs or Behavior Trees. You simply pick which branch to run based on some conditions and let the agent perform the behavior described in that branch. You can’t simply pick a branch of behavior for a multiple agent AI. There are questions to answer before letting the agents do their tasks:

  • Is each agent assigned the best task for it in the current situation?
  • Does the order of execution of the agents’ tasks give the highest benefit?
  • Are the selected actions sensible from the player’s perspective? Is the AI playing the game right?

Single agent systems just can’t answer these questions. I had to look for other ways.

Solution 1: Genetic Algorithm

I love brute force solutions. A genetic algorithm could certainly be used for Party Animals. Since the game is turn based, there’s no pressure for the AI to run as fast as possible, and I don’t need the best answer either. I was thinking that I could run 1000 generations each turn and select the best gene as the set of actions the AI would take in that turn. In the end, I decided against it because I wasn’t sure how long 1000 generations would take. Sure, the game is turn based, but we don’t want the player to wait too long for the AI to make its move. The chromosome encoding/decoding itself would take too much time to implement, as there are lots of possible actions. The fitness scoring would be quite complicated, too; there are lots of variables to consider.

Solution 2: SHOP Planning System

Here’s the link to the paper describing the algorithm. It looks simple, but I’m not quite sure how to implement it. I don’t even know if it’s a multi agent solution. It’s more like a distant cousin of GOAP.

Solution 3: Resource Assignment Algorithm

I found this one on Gamasutra. I liked it because it’s really easy to understand and convert to code. It’s very straightforward; I implemented it as soon as I understood it. The current AI actually uses this solution. The algorithm is somewhat brute force, but it doesn’t take as much time as a genetic algorithm would. These are the major steps:

  1. Generate all possible tasks
  2. Generate all possible assignments (all possible agent-task pair)
  3. Assign a score to each possible assignment (this is based on game rules)
  4. Sort possible assignments by score in descending order
  5. Assign tasks to agents from the sorted possible assignments until all agents have a task
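The steps above can be sketched in code. This is my own minimal greedy version, not the actual Party Animals implementation; it assumes each agent gets exactly one task and that the same task may be offered to more than one agent (whether tasks are exclusive depends on the game rules):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Assignment {
    public string Agent;
    public string Task;
    public float Score; // computed from game rules (step 3)
}

public static class ResourceAssignment {
    // Steps 2, 4, and 5: pair every agent with every task, sort the pairs by
    // score, then greedily take the best pair until every agent has a task.
    public static List<Assignment> Solve(List<string> agents, List<string> tasks,
            Func<string, string, float> score) {
        // step 2: all possible agent-task pairs, scored via step 3's function
        var candidates = new List<Assignment>();
        foreach (var agent in agents) {
            foreach (var task in tasks) {
                candidates.Add(new Assignment {
                    Agent = agent, Task = task, Score = score(agent, task)
                });
            }
        }

        // steps 4 and 5: sort descending, assign until all agents are covered
        var result = new List<Assignment>();
        var assignedAgents = new HashSet<string>();
        foreach (var candidate in candidates.OrderByDescending(c => c.Score)) {
            if (assignedAgents.Add(candidate.Agent)) {
                result.Add(candidate);
            }
            if (assignedAgents.Count == agents.Count) break;
        }
        return result;
    }
}
```

With 4 Staff and a few dozen tasks the candidate list stays small, which is why this brute force pass is cheap compared to running a GA every turn.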

The hardest part of this solution is step 3. There’s no single rule for computing the score of each task. It depends heavily on the game rules and the current conditions of the AI’s agents. In a way, you’re turning the quality of assigning a task to an agent into a number. This value should at least be comparable across all types of tasks, since they are all treated the same way when sorted in step 4. Coming up with these computations is not easy. I think I need to study some special math for this.

For instance, the Campaign task has various factors like population, the platform value of the candidate, the concern value of the target district, and the distance of the Staff that will carry out the task. Another task, Gift, has another set of factors: population, the inverse of the candidate’s platform value (meaning the lower the value, the more likely the task should be done), and distance. Campaign has 4 factors while Gift has only 3. How do I come up with a formula or scoring system that gives reasonable values across these two tasks? There’s also the matter of precedence. Campaign should be a more important task than Gift, but because the two have different factors, Campaign could end up with a much lower score. The scoring system should be able to compensate for this.
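One possible approach (an assumption on my part, not the game’s actual scoring) is to normalize every factor into the 0–1 range first, average the normalized factors so the factor count stops mattering, and then scale by a per-task priority weight so that Campaign can still outrank Gift:

```csharp
public static class TaskScoring {
    // Normalize a raw factor into [0, 1] given its known min/max range.
    public static float Normalize(float value, float min, float max) {
        if (max <= min) return 0f;
        float t = (value - min) / (max - min);
        if (t < 0f) t = 0f;
        if (t > 1f) t = 1f;
        return t;
    }

    // Average the normalized factors (so 4 factors vs 3 factors compare
    // fairly), then scale by a per-task priority, e.g. Campaign = 2, Gift = 1.
    public static float Score(float[] normalizedFactors, float taskPriority) {
        if (normalizedFactors.Length == 0) return 0f;
        float sum = 0f;
        foreach (float f in normalizedFactors) sum += f;
        return (sum / normalizedFactors.Length) * taskPriority;
    }
}
```

Because every score lands in the same priority-scaled range regardless of how many factors feed it, the sort in step 4 compares apples to apples.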

Resource Assignment main function

Solution 4: Dynamic Task Allocation

This is lengthy Master’s Thesis material. I did not understand it the first time I read it; I learned to appreciate it when I went through it again. It’s actually very clever. The algorithm was inspired by swarm colonies like ants, bees, and wasps. It turns out that colonies don’t have a central command that assigns tasks, not even the queen. How then does the colony know which tasks to prioritize when certain circumstances arise?

The individuals of the colony emit pheromone stimuli. Larvae emit pheromones to signal their hunger, sentries emit pheromones to signal intruders, etc. Each caste in the colony has a set of thresholds associated with each stimulus. When a stimulus becomes greater than an individual’s threshold, that individual will respond to the stimulus with high probability. Responding to the stimulus in turn lowers it. For example, worker ants would have a low threshold for a task like “gather food” but a high one for “defend colony”. This way, worker ants respond to hunger stimuli more than to intruder alert stimuli. You can probably imagine by now what a soldier ant’s threshold values look like.
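In the swarm literature, this response rule is commonly written as P = s² / (s² + θ²), where s is the stimulus intensity and θ is the individual’s threshold (the fixed-threshold model from Bonabeau et al.). Here’s a sketch assuming that formulation; the thesis may use a different variant:

```csharp
using System;

public static class ThresholdModel {
    // Probability that an individual with the given threshold responds to a
    // stimulus of the given intensity. Low threshold => responds readily;
    // high threshold => mostly ignores the stimulus until it gets intense.
    public static double ResponseProbability(double stimulus, double threshold) {
        double s2 = stimulus * stimulus;
        double t2 = threshold * threshold;
        if (s2 + t2 == 0) return 0; // no stimulus, no threshold: never respond
        return s2 / (s2 + t2);
    }
}
```

A worker ant with a low “gather food” threshold approaches probability 1 as the hunger stimulus rises, while a soldier with a high threshold barely reacts to the same stimulus; the probability is exactly 0.5 when the stimulus equals the threshold.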

I would have used this algorithm had I understood it earlier, but I was almost finished with the implementation of the Resource Assignment Algorithm and could not afford to tear it down and start again. We need a working AI soon. Either way, I could still add this algorithm as another AI profile in the future, if there’s still time.

A Generic Duration Timer Class

In game development, there are a lot of instances where you would like to do the following:

  • Wait for X seconds
  • Fly for X seconds
  • Fire for X seconds
  • Move forward for X seconds

The common element in these actions is the concept of a time duration. The most common solution that I see in tutorial pages looks like the following:

public class SomeComponent : MonoBehaviour {
    [SerializeField]
    private float durationTime; // the time to wait or go through

    private float polledTime;

    void Awake() {
        this.polledTime = 0;
    }

    void Update() {
        this.polledTime += Time.deltaTime;

        if(this.polledTime >= this.durationTime) {
            // time's up
            // maybe reset or do something since duration is up
            return;
        }

        float ratio = this.polledTime / this.durationTime;
        // do something with ratio like interpolation (lerp)
    }
}

This is good enough, but we can do something better. Imagine if you are maintaining more than one duration time. You’d probably do something like this:

public class SomeComponent : MonoBehaviour {
    [SerializeField]
    private float waitDurationTime;

    private float waitPolledTime;

    [SerializeField]
    private float fireDurationTime;

    private float firePolledTime;

    void Update() {
        UpdateWait();
        UpdateFire();
    }
    
    private void UpdateWait() {
        this.waitPolledTime += Time.deltaTime;

        if(this.waitPolledTime >= this.waitDurationTime) {
            // time's up
            // maybe reset or do something since duration is up
            return;
        }

        float ratio = this.waitPolledTime / this.waitDurationTime;
        // do something with ratio
    }
    
    private void UpdateFire() {
        this.firePolledTime += Time.deltaTime;

        if(this.firePolledTime >= this.fireDurationTime) {
            // time's up
            // maybe reset or do something since duration is up
            return;
        }

        float ratio = this.firePolledTime / this.fireDurationTime;
        // do something with ratio
    }
}

We can see here that the same floating point arithmetic is repeated for each timed duration. A simple way to manage this is to refactor it into a reusable DurationTimer class.

using UnityEngine;

/**
 * Generic class for implementing timers (specified in seconds)
 */
public class DurationTimer {
    private float polledTime;
    private float durationTime;

    /**
     * Constructor with a specified duration time
     */
    public DurationTimer(float durationTime) {
        Reset(durationTime);
    }

    /**
     * Updates the timer
     */
    public void Update() {
        this.polledTime += Time.deltaTime;
    }

    /**
     * Resets the timer
     */
    public void Reset() {
        this.polledTime = 0;
    }

    /**
     * Resets the timer and assigns a new duration time
     */
    public void Reset(float durationTime) {
        Reset();
        this.durationTime = durationTime;
    }

    /**
     * Returns whether or not the timed duration has elapsed
     */
    public bool HasElapsed() {
        return Comparison.TolerantGreaterThanOrEquals(this.polledTime, this.durationTime);
    }

    /**
     * Returns the ratio of polled time to duration time. Returned value is 0 to 1 only
     */
    public float GetRatio() {
        if(Comparison.TolerantLesserThanOrEquals(this.durationTime, 0)) {
            // bad duration time value
        // if durationTime is zero, the ratio will be infinity (divide by zero)
            // we just return 1.0 here for safety
            return 1.0f;
        }

        float ratio = this.polledTime / this.durationTime;
        return Mathf.Clamp(ratio, 0, 1);
    }

    /**
     * Returns the polled time since it started
     */
    public float GetPolledTime() {
        return this.polledTime;
    }

    /**
     * Forces the timer to end
     */
    public void EndTimer() {
        this.polledTime = this.durationTime;
    }

    /**
     * Returns the durationTime
     */
    public float GetDurationTime() {
        return this.durationTime;
    }

}

What we did here is simply contain the duration timer variables and routines in a separate class. Take note of the usage of the Comparison functions in HasElapsed() and GetRatio(): tolerant comparisons guard against floating point precision errors when the polled time lands very close to the duration.
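The Comparison class itself isn’t shown in this post. A minimal epsilon-based sketch of the functions used above might look like this (the method names match the calls above, but the bodies and the epsilon value are my assumptions):

```csharp
using System;

public static class Comparison {
    // Tolerance for treating two floats as equal; tune to your game's scale.
    private const float Epsilon = 0.0001f;

    public static bool TolerantEquals(float a, float b) {
        return Math.Abs(a - b) < Epsilon;
    }

    public static bool TolerantGreaterThanOrEquals(float a, float b) {
        return a > b || TolerantEquals(a, b);
    }

    public static bool TolerantLesserThanOrEquals(float a, float b) {
        return a < b || TolerantEquals(a, b);
    }
}
```

Without the tolerance, accumulated Time.deltaTime sums like 0.99999994f could fail a plain `>=` check against a duration of 1.0f for an extra frame.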

Let’s look at the basic usage of this class:

public class SomeComponent : MonoBehaviour {
    [SerializeField]
    private float duration; // the time to wait or go through

    private DurationTimer timer;

    void Awake() {
        this.timer = new DurationTimer(this.duration);
    }

    void Update() {
        this.timer.Update();

        if(this.timer.HasElapsed()) {
            // time's up
            // maybe reset or do something since duration is up
            return;
        }

        float ratio = this.timer.GetRatio();
        // do something with ratio like interpolation (lerp)
    }
}

By using this class, we reduce the clutter of floating point arithmetic (increment, check, compute ratio) and use the class’s functions instead, making the code more readable. Here’s how it looks when used for more than one timed duration:

public class SomeComponent : MonoBehaviour {
    [SerializeField]
    private float waitDuration; // the time to wait or go through

    private DurationTimer waitTimer;
    
    [SerializeField]
    private float fireDuration;
    
    private DurationTimer fireTimer;

    void Awake() {
        this.waitTimer = new DurationTimer(this.waitDuration);
        this.fireTimer = new DurationTimer(this.fireDuration);
    }

    void Update() {
        UpdateWait();
        UpdateFire();
    }
    
    private void UpdateWait() {
        this.waitTimer.Update();

        if(this.waitTimer.HasElapsed()) {
            // time's up
            // maybe reset or do something since duration is up
            return;
        }

        float ratio = this.waitTimer.GetRatio();
        // do something with ratio like interpolation (lerp)
    }
    
    private void UpdateFire() {
        this.fireTimer.Update();

        if(this.fireTimer.HasElapsed()) {
            // time's up
            // maybe reset or do something since duration is up
            return;
        }

        float ratio = this.fireTimer.GetRatio();
        // do something with ratio like interpolation (lerp)
    }
}

We can also reuse this class when building other frameworks that deal with time.

public class WaitAction : FsmAction {
    private DurationTimer timer; // pretty cool huh?
	
    ...
}