Various test entities made from multiple actors. (Oops, no art makes them really difficult to parse from a freeze frame.)
This month, I tried to wrap up some missing pieces that are important to prototyping gameplay. I got close, but spent a lot of time refactoring other subsystems to accommodate the new things, and the remaining work will have to spill over into September.
About halfway through the month, I ran into Lua garbage accumulation issues again, and dropped everything to look into it. I now understand what’s going on, and have an interim workaround.
Some more additions to the routine macro system:
- Added support for multi-line and nested ‘if’
- Added ‘if/else’
- Added simple ‘while’ loops, with ‘continue’, ‘break’, and nesting support
- Added ‘and/or’ operators in conditions, with ‘and’ getting higher precedence
- Added time scaling variable for routine states so that they can step through a list of states at different speeds
RoutineState debug visualization of a ‘Game of Life’ macro.
Here’s a followup to the macro performance concerns from last month. I wrote an actor that implements Conway’s Game of Life over a region of terrain in a room, and then I wrote another version of it with most of the logic handled by routine macros. The cell region / playfield was 11×16, big enough to hold one 9×16 pentadecathlon pattern.
Performance was surprisingly not that bad. Running one cycle per frame with VSync disabled, the Lua version was 553 FPS, and the macro version was 492 FPS. This measurement isn’t super accurate because the update cycles happen on a per-tick basis, and ticks run on a fixed timestep of 256 ticks per second, but I was expecting much worse numbers for the macro version.
Conway’s Game of Life is not a good representation of the kinds of things I expect routine macros to handle in-game. Instead of implementing the logic as macros, you would define a command which implements Game of Life in pure Lua, and then have the RoutineState idle on that command. This was just a silly idea that I had to see through to completion.
Auto Collision Handlers and Masking
I tried to do something like this back in January, but struggled with the implementation and backed out my changes. I have a better idea of what I want now.
Handlers are declared and attached to actors. When an actor is being processed, we go through every overlapping actor, and if the handlers are relevant to the collision event, they are executed. They’re good for implementing broadly-applicable behaviors, and not so good for infrequent or very specific kinds of collisions.
Bitmasks determine if a collision is relevant. Each actor maintains two bitmasks: one for the kinds of collision events that it accepts, and one for the kinds of collision events that it advertises to other actors. Handlers also maintain a bitmask, usually with just one ‘on’ bit for the specific category that they deal with. For a collision to register, the ‘ingress’ and ‘egress’ masks must have at least one matching ‘on’ bit. For example, the player actor’s ingress mask could include a “powerups” bit to indicate that it’s interested in examining collisions with power-ups and bonus items. An extra life power-up would have the same bit set in its egress mask. The player would receive overlap events with the power-up, but not vice versa. The typical enemy creature doesn’t need to deal with power-ups (unless it’s some kind of thief character, I guess), and so it wouldn’t have the “powerups” bit set, and therefore wouldn’t receive overlap information for that event.
The bitmasks are 32 bits wide. This is a byproduct of using LuaJIT’s bit module to do the bitwise comparisons. The last bit is reserved as an “always send this event through” indicator, so that actors can deal with special cases manually. I think I’ll be OK with this upper limit for the whole project. The masking doesn’t need to be 100% precise: even with handlers, further checks can be performed to determine if a collision event is of interest.
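The matching rule can be sketched like this. The names and bit assignments are hypothetical; under LuaJIT the comparison would use bit.band from the bit module, but a plain-Lua fallback is included here so the sketch is self-contained:

```lua
-- Plain-Lua stand-in for LuaJIT's bit.band, for illustration only.
local function band(a, b)
  local result, place = 0, 1
  while a > 0 and b > 0 do
    if a % 2 == 1 and b % 2 == 1 then result = result + place end
    a, b = math.floor(a / 2), math.floor(b / 2)
    place = place * 2
  end
  return result
end

local MASK_POWERUPS = 8 -- hypothetical bit assignment

local player = { ingress = MASK_POWERUPS }     -- wants power-up events
local extra_life = { egress = MASK_POWERUPS }  -- advertises as a power-up
local snail = { ingress = 0 }                  -- typical enemy: ignores power-ups

-- A collision registers when ingress and egress share at least one 'on' bit.
local function collisionRelevant(receiver, sender)
  return band(receiver.ingress, sender.egress) ~= 0
end

print(collisionRelevant(player, extra_life)) --> true
print(collisionRelevant(snail, extra_life))  --> false
```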
Bolero has supported floating platforms in one form or another since Hibernator, but just this month I became aware of a serious limitation: there is no general method of treating AABB regions like solid obstacles that other actors can’t pass through on all sides.
I’ve made a few barrier devices, like a sliding door that the player can’t pass through, and a column-like enemy that is also impassable, but these are one-off entities that only push actors horizontally. Floating platforms only have to worry about pushing players upwards. In attempting to make an actor which is impassable on the left and right sides, but which also serves as a surface to stand on, and a ceiling to bump into, I realized I had a pretty big problem.
In order to simplify platformer collisions, an actor’s movement is handled one axis at a time. First, horizontal velocity is applied, and then the horizontal position is corrected to account for tilemap collisions. Then vertical velocity is applied, followed by more tilemap corrections. If corrections happen after actors have moved along both axes, a lot of context is lost, especially when attempting to resolve collisions near the corners of bounding boxes.
The floating platforms and obstacle actors do collision detection and correction after the actor has moved on both axes. This is OK for floating platforms, but unreliable for any entity that deals with both horizontal and vertical bounds. I spent a few hours trying to come up with a compromise. Maybe it’d be alright to prioritize surface connections, then ceilings, then left and right sides? Ultimately, the amount of teleportation near corners was unacceptable.
My solution was to check for ‘blocking’ actors within the platformer tick, one axis at a time, close to the code that corrects position against the tilemap. The action scene maintains a table identifying all blocking actors, and for every actor which is monitoring for blocking collisions, a for loop compares their AABB against the AABB of every blocking actor. A 4-bit mask determines which sides of a blocking actor need to be dealt with and which can be ignored.
This isn’t perfect (and probably won’t ever be perfect to be honest), but at least I can now whip up large actors that behave like floor, wall and ceiling. Cases where actors get squeezed against the environment need more consideration.
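A rough sketch of what the per-axis blocking check might look like, with hypothetical names, a plain-Lua stand-in for the bitwise side-mask test, and only the horizontal pass shown:

```lua
-- Side mask bits: which faces of a blocking actor are solid.
local SIDE_LEFT, SIDE_RIGHT, SIDE_TOP, SIDE_BOTTOM = 1, 2, 4, 8

local function overlaps(a, b)
  return a.x < b.x + b.w and b.x < a.x + a.w
     and a.y < b.y + b.h and b.y < a.y + a.h
end

-- Plain-Lua stand-in for a bitwise AND against the side mask.
local function hasSide(mask, side)
  return math.floor(mask / side) % 2 == 1
end

-- Horizontal correction: called right after horizontal velocity is
-- applied, close to the tilemap correction code.
local function correctHorizontal(actor, blocker)
  if not overlaps(actor, blocker) then return end
  if actor.dx > 0 and hasSide(blocker.sides, SIDE_LEFT) then
    actor.x = blocker.x - actor.w -- pushed out of the blocker's left face
  elseif actor.dx < 0 and hasSide(blocker.sides, SIDE_RIGHT) then
    actor.x = blocker.x + blocker.w -- pushed out of the right face
  end
end

-- An actor moving right into a blocker that is solid on all four sides:
local actor = { x = 9, y = 0, w = 4, h = 4, dx = 2 }
local wall = { x = 12, y = 0, w = 4, h = 4, sides = 15 }
correctHorizontal(actor, wall)
print(actor.x) --> 8
```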
I added a few built-in actor references with special behavior:
- Target: The actor that ‘self’ is currently fixated on.
- Leader: If specified, ‘self’ copies this actor’s target reference instead of searching for targets itself.
- Children: An array of references to actors that were created by calling self:addChild(). The child actors start with the parent assigned as their leader.
These references are intended for implementing the following:
Leader-Follower Chain of Command
A visualization of actor leader / follower connections. Tinted actors are further down the chain of command. (Neither the lines nor the tinting would be visible in a normal game session.)
After I added target hunting in May, I was concerned about enemies wasting a lot of time on the selection process. My planned solution was to defer the selection to nearby ‘leaders’. In the group of eleven creatures shown above, the rightmost creature’s target selection flows down to the rest. If the player defeats that creature, then the next actor in the list becomes leaderless and begins hunting for itself, becoming the new leader of the group.
This was in my todo list for a long time, but now I’m not sure if it’s really necessary. I’m facing these issues:
- How do actors determine who is the leader and who are the followers? For testing, I created a basic ranking system where the longest-lived enemy with the highest index becomes leader to any nearby enemies. This demonstrates that target selection flows down successfully, but it’s not actually useful as-is. You’d probably want some sort of commander entity to direct soldiers to a specific target, and that’s an entity that I don’t have designed or implemented yet.
- How far should an actor search for a leader? If an enemy picks up on a leader chain which is targeting something on the other side of the screen, this leaves the enemy unable to select targets nearby that pose an immediate threat.
- And the biggest issue: target selection really isn’t that bad in terms of CPU usage right now. Before chopping up the scene context into little rooms that can be flipped on or off, my test levels had a lot of distant actors that just sat around, waiting for the player to get in range. This is no longer the case.
So I’ll probably leave this stuff alone until there’s a better use case. It could still be good for modelling enemies with ‘assistants’ who always work together, but right now, this is the carriage leading the horse.
Group Actor Contraptions (AKA ‘gac’ / ‘gacs’)
This was more successful. Bolero wasn’t designed to support multiple hitboxes per actor, and it’s too late to tear everything apart right now, so these are basically a hack to streamline designing entities with pinned or moving parts.
Actors can now spawn child actors from a list, with all the error-checking boilerplate wrapped away. The parent keeps an array of references to its children, and each child has the parent assigned as its leader. I planned in my todo list to have a more formal system of attachment rules, and to manage what happens when the parent is removed, but I need more game content to understand exactly what’s needed. For now, all gacs position themselves using one-off logic.
A good example of a gac is a floating platform which is safe to stand on, but hazardous to touch anywhere below the top surface. Make the parent the hazardous region, and spawn a child to represent the platform. Position the child above the parent on every tick. I also converted the controllable multi-actor vehicle described in April’s devlog post.
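That floating-platform example could be sketched like this. addChild() exists in the engine, but this standalone version and the field names are assumptions for illustration:

```lua
-- Hypothetical sketch of a gac: a hazardous parent spawns a safe
-- platform child and pins it above itself every tick.
local function addChild(parent, child)
  parent.children = parent.children or {}
  parent.children[#parent.children + 1] = child
  child.leader = parent -- children start with the parent as their leader
  return child
end

local hazard = { x = 32, y = 64, w = 48, h = 16 }
local platform = addChild(hazard, { x = 0, y = 0, w = 48, h = 8 })

-- One-off per-tick positioning: keep the platform flush with the top
-- of the hazardous region.
local function positionChildren(parent)
  for _, child in ipairs(parent.children or {}) do
    child.x = parent.x
    child.y = parent.y - child.h
  end
end

positionChildren(hazard)
print(platform.x, platform.y) --> 32 56
```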
I’m not done with this, but like the leader-follower thing, I need more content to get a better picture of what my actual needs are.
More Garbage Accumulation Issues
I noticed that I was getting uneven CPU usage in scenes with over 100 actors. While investigating, I found that the Lua memory counter was showing the same kind of rapid accumulation that I saw back in January. This wasn’t a memory leak on my end: I’m not creating tons of new tables or concatenating 10000 strings every frame, and if I manually call a garbage collection cycle, usage drops from hundreds of megabytes down to 15-20 MB.
I didn’t have an embedded CPU usage graph in January, so maybe I didn’t notice the performance spikes back then. Or maybe the engine was just a lot more prone to spiking at the time for many other reasons, and I couldn’t tell the difference. I turned off game features one by one, and eventually noticed that the problem wasn’t happening (or it was so subdued that I didn’t notice) when actor-to-actor collision checking was disabled. It was also less serious when fixed-grid spatial partitioning was enabled as the method for broad-phase collision detection, instead of brute-forcing every possible collision. In my test room, the partitioning halved the number of actor-to-actor checks being made.
Were the collision checks just taking too long? The last thing I added to the loop was a couple of bitwise ANDs using the LuaJIT bit operations module, as a way to filter incoming and outgoing collision events. Surely a bitwise AND couldn’t be the cause? Running a zillion bitwise ops in a separate LuaJIT prompt completes in no time.
I made several optimizations along the way, but they weren’t making much difference. Finally, one change to how actors are instantiated did help. Every time a new actor instance was added to a scene, I was using pairs() twice to:
- Clear out any existing fields in the recycled pooled table
- Copy actor defaults over to the table
The pairs() iterator is not eligible for compilation in LuaJIT. At the time, I did it this way because I was having issues with corruption, or fields not being written, or something like that. I switched to an alternate method of copying template fields using a secondary ‘guide’ table to supply the keys, and now actor-to-actor collision checking is substantially faster across the board, and I stopped seeing the memory count skyrocket when there were many actors added and removed frequently. Workloads that were crawling at single-digit frames per second are now getting over 100 FPS. Wow!
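The guide-table trick can be sketched like this. The template and field names are made up; the point is that the per-spawn hot path uses a numeric for loop (which LuaJIT can compile) while pairs() only runs once, at registration time:

```lua
-- Hypothetical actor template.
local template = { hp = 10, speed = 2, friendly = false }

-- Build the guide once, when the actor type is registered. pairs() is
-- fine here because this is not a hot path.
local guide = {}
for k in pairs(template) do
  guide[#guide + 1] = k
end

-- Per-spawn: refill a recycled pooled table without pairs(). (Fields
-- outside the template would need their own key list to be cleared.)
local function resetActor(instance)
  for i = 1, #guide do
    local k = guide[i]
    instance[k] = template[k]
  end
  return instance
end

local pooled = { hp = 1 }
resetActor(pooled)
print(pooled.hp, pooled.speed) --> 10 2
```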
Unfortunately, this didn’t solve the issue from January, which was somewhat held at bay with a couple of ‘magic’ collectgarbage(“count”) calls in the main scene loop. I was frustrated at the time, but decided to just leave those bits of code as-is so I could focus on developing the rest of the game.
I tried taking the collectgarbage(“count”) lines out, just to see what would happen. It made the original January problem appear once again. I couldn’t really continue development without some kind of resolution to this, so I dropped my other tasks and looked into my remaining troubleshooting options.
1: Turn off JIT compilation and monitor performance + memory usage
Guess I should have tried this back in January. LuaJIT’s interpreter is fast (still 100+ FPS in my test rooms), reports consistent CPU usage, and most importantly, does not produce tons and tons of garbage with my current codebase. Upon finding this, I was half-inclined to immediately just turn off JIT for the whole project once the resource initialization code is done.
love.update() graph with JIT off:
love.update() graph with JIT on:
2: Figure out how to enable LuaJIT -jv mode from within LOVE
LuaJIT has a “verbose” mode that logs the current state of tracing, and which also provides messages when an NYI component is encountered during a trace. It can be enabled from within LOVE, though you have to copy some files from the LuaJIT installation to a location that require() can see, and they have to be from the same LuaJIT version that LOVE is using. This forum post goes over the process, along with some other debug features. I had to find the LuaJIT version that matches the version of LOVE I’m using (I have a few different appimage containers, and also a version installed through Fedora’s package manager):
print(jit.version, jit.version_num) -- LuaJIT 2.1.0-beta3 20100
I downloaded the LuaJIT source matching that version and compiled it with make. Once it was done, I found the relevant files in ./src/jit. I copied the whole jit folder to my project folder, and included v.lua at the beginning of my project’s main.lua:
local jitVerbose = require("jit.v")
jitVerbose.on([filename])
This was enough to get JIT status messages printing to the terminal. I had been reluctant to dive into this because I was concerned that I wouldn’t understand the logs, but the messages pointing out NYI features are self-explanatory.
Skimming through the output (which is high-volume!), I noticed that even just the “count” variation of collectgarbage is NYI. Maybe this is why having two collectgarbage(“count”) calls in the main scene runner function — one at the beginning and one at the end — had a positive effect back in January. Was it causing trace attempts in that area to abort, which would otherwise cause issues (somehow)?
[TRACE --- (150/1) animation.lua:525 -- NYI: FastFunc collectgarbage at scene_ops.lua:1389]
I thought that perhaps these memory allocation failures were the root of my problems. mcode is the compiled machine code that LuaJIT emits. These messages appear over and over again in the log, and it’s always followed by a trace flush, which AFAICT throws out all existing JIT work and starts over. (?)
[TRACE --- (155/0) platforming.lua:236 -- failed to allocate mcode memory at collision.lua:217]
[TRACE flush]
Searching the web for this error brings up some discussions where similar errors appear when running LuaJIT on Android. The suggestion at the end of this chain to set sizemcode and maxmcode to much larger values than the defaults did seem to help, at least for a little while.
I’ve kept a pared-down test version of my project from January so that I can refer back to this issue. In that copy, turning JIT off resolves the garbage accumulation issue (although due to poor optimization at the time, it quickly drops to single-digit FPS as more actors spawn in). Verbose mode shows the same mcode allocation failures that the project in its current state does. Increasing sizemcode + maxmcode also seems to mostly clean up the garbage issue, for the few minutes that I ran it.
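For reference, raising those limits might look like the snippet below, placed early in startup. The specific values are guesses, and jit.opt is only present under LuaJIT, so the require is guarded:

```lua
-- Hypothetical mcode sizing tweak; only takes effect under LuaJIT.
local ok, jit_opt = pcall(require, "jit.opt")
if ok and type(jit_opt) == "table" then
  -- sizemcode: size of each mcode area (KB); maxmcode: total limit (KB)
  jit_opt.start("sizemcode=64", "maxmcode=4096")
end
```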
I thought that was that, but the following day, even with the mcode allocation parameters raised, I found that once again, the game started having garbage accumulation issues. Looking at the verbose output, many flushes are happening when it’s behaving like this. My assumption at this point is that 1) I have too much code that’s eligible for JIT and 2) it’s full of if-then-else blocks which cause too many side traces to be started, which then causes everything to be flushed.
At this point, I’m tired of investigating this and just want to work on the game, so I’m disabling JIT after initialization is complete.
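The switch-off itself is one call. A guarded sketch, assuming it runs once initialization is finished (e.g. at the end of love.load()):

```lua
-- Disable the JIT compiler and fall back to the interpreter.
-- jit.off() only exists under LuaJIT, so the require is guarded.
local ok, jit_lib = pcall(require, "jit")
if ok and type(jit_lib) == "table" and jit_lib.off then
  jit_lib.off() -- interpreter-only from here on
end
```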
I moved a lot of long-winded comment blocks out of the codebase and into text files as documentation.
Flattening Read-Only Tables
Added some functions to merge tables that contain the same key-value contents*, and applied them to spawnpoint data and geometric shape info at startup. These both contain a lot of redundant sub-tables, often with the same contents (in some cases, lots and lots of empty tables). This data is considered read-only at run time, so it seems OK to re-point these references. Whether it makes any real difference is a question I can’t answer until the project is closer to the finish line.
(* Limitations: anything with NaN is considered a different table; tables with different subtables that have the same contents are still considered different tables.)
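A simplified sketch of the idea: hash each subtable by a serialized form of its flat contents and re-point duplicates at one canonical copy. The equality rules here are cruder than the real thing (tostring-based, so the exact NaN behavior noted above isn't reproduced), and subtables-of-subtables are compared by identity, matching the second limitation:

```lua
-- Build a string signature from a table's sorted key/value pairs.
-- Table-valued entries stringify to their address, so distinct
-- subtables with equal contents still count as different.
local function signature(t)
  local keys = {}
  for k in pairs(t) do keys[#keys + 1] = tostring(k) end
  table.sort(keys)
  local parts = {}
  for _, k in ipairs(keys) do
    parts[#parts + 1] = k .. "=" .. tostring(t[k])
  end
  return table.concat(parts, ";")
end

-- Re-point duplicate subtables at the first copy seen.
local function mergeDuplicates(parent)
  local seen = {}
  for k, v in pairs(parent) do
    if type(v) == "table" then
      local sig = signature(v)
      if seen[sig] then
        parent[k] = seen[sig] -- share the canonical copy
      else
        seen[sig] = v
      end
    end
  end
end

local spawn = { a = { x = 1 }, b = { x = 1 }, c = { x = 2 } }
mergeDuplicates(spawn)
print(spawn.a == spawn.b, spawn.a == spawn.c) --> true false
```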
Linked Lists for Spatial Partitions
I added doubly-linked lists to the hunt partition using the method described in Game Programming Patterns Chapter 20. (Nope, Lua doesn’t have them built-in.) The rationale was to avoid using pairs() or next() to iterate through lists of targets, which are the only ways to visit every item in a hash table when the keys aren’t known ahead of time, or when checking every possible key would be too expensive. Both are NYI in LuaJIT. I guess it doesn’t matter that much right now, since I have JIT completely off after startup.
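A minimal version of that pattern, with hypothetical field names: each partition cell keeps a head pointer, and the links live on the actors themselves, giving O(1) insert and removal with plain field accesses instead of hash iteration:

```lua
-- Insert an actor at the head of a cell's list.
local function cellAdd(cell, actor)
  actor.prevInCell = nil
  actor.nextInCell = cell.head
  if cell.head then cell.head.prevInCell = actor end
  cell.head = actor
end

-- Unlink an actor from its cell in O(1), no searching required.
local function cellRemove(cell, actor)
  if actor.prevInCell then
    actor.prevInCell.nextInCell = actor.nextInCell
  else
    cell.head = actor.nextInCell
  end
  if actor.nextInCell then
    actor.nextInCell.prevInCell = actor.prevInCell
  end
end

local cell = {}
local a, b = { name = "a" }, { name = "b" }
cellAdd(cell, a)
cellAdd(cell, b)
cellRemove(cell, a)
print(cell.head.name, cell.head.nextInCell) --> b nil
```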
Shifting Modules, Cleaning up State Tables
I moved several ‘toolkit’ modules into the core codebase: spatial partitioning, the loops to check actor-to-actor overlap, ‘signaling’ triggers, and the module that generates actor references and refreshes them per-tick.
Added four general-purpose fields to each actor: ‘a’ through ‘d’. This is part of a plan to make lightweight actors with fewer subtables at some point in the future.
Spawn Priority Flag
Added a ‘high priority’ flag for spawnpoints in Tiled. When maps are converted to the internal format, if this flag is set, then the spawnpoint is moved to the beginning of the list of spawnpoints. The quantity of high priority spawnpoints is also added to the map properties. This does nothing right now, but I plan to eventually support spawning actors incrementally, and this would allow critical actors to be exempt from that system. For example, you’d probably want an elevator to spawn in before the things that are supposed to be resting on top of it appear.
I got rid of the SpawnDef and actor specification stuff from about a year ago. What a relief. I haven’t used either in a long time, if ever. Instead of trying to preemptively cram actor configuration into the spawn-into-scene code, it’s much more sensible at this point to just spawn the damned actor and make the changes you want. If multiple entities need to do the same kinds of customization, then just wrap that code in another function and call it instead.
It did kind of make sense at one point, when map spawnpoints were implemented as a special tilemap layer with individual tiles representing actors, and there was no way to attach additional custom data to them. The project has outgrown this, and if I really need tilemap-based spawning, there are other ways to do it at a project level.
Async Event Inboxes
I started writing a per-actor asynchronous event system. I’ve started on this before, and ended up scrapping the work as I couldn’t demonstrate a practical use to myself. This time, I can kind of justify its existence, for some specific use cases, but it’s really a “this would be neat” thing and not a “this is necessary to continue” thing, so I stopped again, just short of deleting the work.
While cleaning out my project docs folder, I found some old todo and wishlist files. I’ve completed a fair chunk of the items described since I last edited them, like using spritebatches and shaders, and supporting multiple maps per scene. These things felt insurmountable at the time.
I’m hoping that by October, I’ll be able to work on more prototype gameplay fragments without having to deal with lower-level stuff. There are just a few issues related to actor attachment / placement that I need to get sorted out. I’ll post another update around the end of September.
Codebase issue tickets:
Project lines of code:
(Edit 31 Aug 2020: Spelling)