Memory Management in Game Engines: What I’ve Learned (So Far)



This post is a note-to-self as we figure out what memory management looks like for RendererEngine. It documents what I’ve learned so far — and where I am still unsure.
Why Memory Management Matters in Games
Games simulate a world in real time, and that means performance really matters. Most games aim to run at 60 FPS, which gives you roughly 16.7 milliseconds (1000 ms / 60) per frame to handle game logic, physics, AI, rendering, and more. That’s not a lot of time, and every microsecond counts.
And that’s where memory allocation becomes a big deal. Standard allocation approaches can introduce performance problems that are unacceptable in real-time applications.
GameObject* enemy = new GameObject();
// ... game logic ...
delete enemy;
The standard new and delete typically call the system heap allocator (malloc and free) under the hood. These general-purpose allocators are built to handle any size, any lifetime, and any thread, not to be as fast as possible for one particular workload. Unlike malloc, new doesn't just give you memory; it also initializes the object via its constructor, which adds its own cost.
You can skip the constructor cost by calling malloc directly, but you still risk memory fragmentation, which slows down future allocations. And in game dev, you don’t have time to spare, or memory to waste either.
After researching various approaches, I've found several strategies, each with distinct advantages and disadvantages:
Memory Arenas
Memory arenas involve allocating a large block upfront and then managing allocations within that space.
Pros:
- Fast allocation (often just a pointer increment)
- No fragmentation within the arena
- Simple implementation
Cons:
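- If objects with different lifetimes share an arena, memory stays held until the longest-lived one is done, which can waste memory
- Individual allocations can't be freed; memory comes back only when the whole arena is reset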
- Usually requires manual destruction of objects
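To make the "pointer increment" concrete, here is a minimal sketch of a bump arena. The class name `BumpArena` and its methods are made up for illustration and are not RendererEngine's API; it assumes a single thread and a fixed capacity chosen up front.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical bump arena; names are illustrative only.
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity)
        : m_base(static_cast<std::uint8_t*>(std::malloc(capacity))),
          m_capacity(m_base != nullptr ? capacity : 0) {}
    ~BumpArena() { std::free(m_base); }
    BumpArena(const BumpArena&) = delete;
    BumpArena& operator=(const BumpArena&) = delete;

    // `alignment` must be a power of two. Returns nullptr when the arena is full.
    void* Allocate(std::size_t size,
                   std::size_t alignment = alignof(std::max_align_t)) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::uintptr_t start =
            reinterpret_cast<std::uintptr_t>(m_base) + m_offset;
        const std::uintptr_t aligned =
            (start + alignment - 1) & ~(static_cast<std::uintptr_t>(alignment) - 1);
        const std::size_t padding = static_cast<std::size_t>(aligned - start);
        const std::size_t remaining = m_capacity - m_offset;
        if (padding > remaining || size > remaining - padding) {
            return nullptr;
        }
        m_offset += padding + size;  // the "bump"
        return reinterpret_cast<void*>(aligned);
    }

    // Frees everything at once. No destructors run, so objects that own
    // resources have to be destroyed manually before this call.
    void Reset() { m_offset = 0; }

    std::size_t Used() const { return m_offset; }

private:
    std::uint8_t* m_base;
    std::size_t m_capacity;
    std::size_t m_offset = 0;
};
```

Note that mixed sizes are not a problem for this kind of allocator. The real restriction is that nothing can be released individually.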
Object Pools
Object pools pre-allocate fixed-size objects of the same type.
Pros:
- Very fast allocation and deallocation
- No fragmentation
- Excellent for frequently allocated/deallocated objects of the same type
Cons:
- Fixed object size limits flexibility
- Can waste memory if pool is too large
- Requires knowing object types and counts beforehand
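A pool can be sketched as an array of slots threaded onto a free list, so both allocation and release are a couple of pointer writes. `ObjectPool` and the example type `Bullet` below are hypothetical names; the capacity is a compile-time constant here purely to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical example type, used only to exercise the pool.
struct Bullet {
    float x;
    float y;
};

// Hypothetical fixed-capacity pool for objects of type T.
template <typename T, std::size_t Capacity>
class ObjectPool {
public:
    ObjectPool() {
        // Thread every slot onto the free list.
        for (std::size_t i = 0; i < Capacity; ++i) {
            m_slots[i].next = m_free;
            m_free = &m_slots[i];
        }
    }
    ObjectPool(const ObjectPool&) = delete;
    ObjectPool& operator=(const ObjectPool&) = delete;

    // Pops a free slot and constructs a T in it; nullptr when the pool is full.
    template <typename... Args>
    T* Create(Args&&... args) {
        if (m_free == nullptr) {
            return nullptr;
        }
        Slot* slot = m_free;
        m_free = slot->next;
        ++m_live;
        return ::new (static_cast<void*>(slot->storage)) T(std::forward<Args>(args)...);
    }

    // Runs the destructor and pushes the slot back onto the free list.
    // `object` must have come from Create() on this pool.
    void Destroy(T* object) {
        assert(object != nullptr);
        object->~T();
        Slot* slot = reinterpret_cast<Slot*>(object);
        slot->next = m_free;
        m_free = slot;
        --m_live;
    }

    std::size_t LiveCount() const { return m_live; }

private:
    // A slot holds either a free-list link or the storage for one T.
    union Slot {
        Slot* next;
        alignas(T) unsigned char storage[sizeof(T)];
    };

    Slot m_slots[Capacity];
    Slot* m_free = nullptr;
    std::size_t m_live = 0;
};
```

Because a freed slot goes to the head of the list, the most recently released object is the next one reused, which tends to be cache-friendly.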
Free Lists
Free lists track available memory blocks for reuse.
Pros:
- Reuses memory efficiently
- Can handle varied allocation sizes
- Reduces fragmentation compared to general allocators
Cons:
- More complex implementation
- Still susceptible to some fragmentation
- Slower than pools or arenas
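As a rough illustration of where the extra complexity and cost come from, here is a first-fit free-list allocator over one fixed buffer. Everything in it (`FreeListAllocator`, the header layout, first-fit, address-ordered coalescing) is one possible design chosen for the sketch, not the only way to build one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical first-fit free-list allocator; single-threaded sketch.
class FreeListAllocator {
public:
    explicit FreeListAllocator(std::size_t capacity) : m_capacity(RoundUp(capacity)) {
        m_buffer = static_cast<std::uint8_t*>(std::malloc(m_capacity));
        if (m_buffer != nullptr && m_capacity >= sizeof(Node)) {
            m_head = reinterpret_cast<Node*>(m_buffer);
            m_head->size = m_capacity;
            m_head->next = nullptr;
        }
    }
    ~FreeListAllocator() { std::free(m_buffer); }
    FreeListAllocator(const FreeListAllocator&) = delete;
    FreeListAllocator& operator=(const FreeListAllocator&) = delete;

    void* Allocate(std::size_t size) {
        if (size > m_capacity) return nullptr;
        const std::size_t needed = RoundUp(size) + sizeof(Node);
        Node** link = &m_head;
        // Linear search: this walk is why free lists are slower than pools.
        for (Node* node = m_head; node != nullptr; link = &node->next, node = node->next) {
            if (node->size < needed) continue;
            if (node->size - needed >= sizeof(Node)) {
                // Split: the tail of the block stays on the free list.
                Node* rest = reinterpret_cast<Node*>(
                    reinterpret_cast<std::uint8_t*>(node) + needed);
                rest->size = node->size - needed;
                rest->next = node->next;
                *link = rest;
                node->size = needed;
            } else {
                *link = node->next;  // remainder too small; hand out the whole block
            }
            return reinterpret_cast<std::uint8_t*>(node) + sizeof(Node);
        }
        return nullptr;
    }

    void Free(void* ptr) {
        if (ptr == nullptr) return;
        Node* block = reinterpret_cast<Node*>(static_cast<std::uint8_t*>(ptr) - sizeof(Node));
        // Keep the list sorted by address so neighbours can be merged.
        Node* prev = nullptr;
        Node* next = m_head;
        while (next != nullptr && next < block) {
            prev = next;
            next = next->next;
        }
        block->next = next;
        if (prev != nullptr) prev->next = block; else m_head = block;
        if (next != nullptr && End(block) == reinterpret_cast<std::uint8_t*>(next)) {
            block->size += next->size;
            block->next = next->next;
        }
        if (prev != nullptr && End(prev) == reinterpret_cast<std::uint8_t*>(block)) {
            prev->size += block->size;
            prev->next = block->next;
        }
    }

    std::size_t LargestFreeBlock() const {
        std::size_t largest = 0;
        for (const Node* node = m_head; node != nullptr; node = node->next) {
            if (node->size > largest) largest = node->size;
        }
        return largest;
    }

private:
    // Header at the start of every block; `next` is only used while free.
    struct alignas(std::max_align_t) Node {
        std::size_t size;  // whole block, header included
        Node* next;
    };
    static std::size_t RoundUp(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        return (n + a - 1) / a * a;
    }
    static std::uint8_t* End(Node* node) {
        return reinterpret_cast<std::uint8_t*>(node) + node->size;
    }

    std::size_t m_capacity;
    std::uint8_t* m_buffer = nullptr;
    Node* m_head = nullptr;
};
```

Coalescing on free limits fragmentation but does not remove it: a long-lived block in the middle of the buffer still splits the free space in two.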
Custom Allocators
Custom allocators are purpose-specific allocators built for individual subsystems.
Pros:
- Optimized for specific use patterns
- Can be tuned for performance or memory efficiency
- More control over memory usage
Cons:
- Significantly more complex to implement and maintain
- Different systems may need different allocators
- Requires deep understanding of memory usage patterns
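One way to keep per-subsystem allocators manageable is to have every subsystem code against a small interface and then layer behaviour on top of it. The sketch below is an assumption about how that could look: `IAllocator`, `HeapAllocator`, and `BudgetedAllocator` are hypothetical names, and the "policy" shown (a hard byte budget with peak tracking) is just one example of subsystem-specific tuning.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical interface that each subsystem allocates through.
class IAllocator {
public:
    virtual ~IAllocator() = default;
    virtual void* Allocate(std::size_t size) = 0;
    virtual void Free(void* ptr, std::size_t size) = 0;
};

// Simplest backing allocator: forwards to the system heap.
class HeapAllocator final : public IAllocator {
public:
    void* Allocate(std::size_t size) override { return std::malloc(size); }
    void Free(void* ptr, std::size_t /*size*/) override { std::free(ptr); }
};

// Wraps another allocator and enforces a byte budget for one subsystem.
class BudgetedAllocator final : public IAllocator {
public:
    BudgetedAllocator(const char* name, IAllocator& backing, std::size_t budget)
        : m_name(name), m_backing(backing), m_budget(budget) {}

    void* Allocate(std::size_t size) override {
        if (size > m_budget - m_used) {
            return nullptr;  // over budget: fail instead of silently growing
        }
        void* ptr = m_backing.Allocate(size);
        if (ptr != nullptr) {
            m_used += size;
            if (m_used > m_peak) m_peak = m_used;
        }
        return ptr;
    }

    // The caller passes back the size it asked for, so no per-block header is needed.
    void Free(void* ptr, std::size_t size) override {
        if (ptr == nullptr) return;
        assert(size <= m_used);
        m_backing.Free(ptr, size);
        m_used -= size;
    }

    const char* Name() const { return m_name; }
    std::size_t Used() const { return m_used; }
    std::size_t Peak() const { return m_peak; }

private:
    const char* m_name;
    IAllocator& m_backing;
    std::size_t m_budget;
    std::size_t m_used = 0;
    std::size_t m_peak = 0;
};
```

The peak counter is the useful part in practice: logging it per subsystem gives real numbers to size arenas and pools with. The virtual call is a cost to weigh against the flexibility.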
When to Use What: My Current Mapping
Strategy | Best Used For | How to Use It |
---|---|---|
Arena | Per-frame temporary data | Allocate a large block at init; bump pointer per allocation; reset each frame |
Object Pool | Repeated small, fixed-size objects | Preallocate pool of reusable objects; recycle via free list |
Free List | Varied-size, dynamic allocations | Manage a reusable set of free memory blocks; reuse freed memory |
Custom Allocator | Specialized subsystems (scripting, audio) | Tailor behavior to specific allocation patterns; use where performance-critical |
Virtual Memory | Large dynamic systems (e.g., asset streaming) | Reserve large virtual space; commit physical memory on demand |
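The reserve-then-commit idea in the last row can be sketched on POSIX systems with `mmap` and `mprotect`: map a large range with no access rights, then enable read/write on pages as the arena grows into them. This is a POSIX-only sketch (on Windows the analogous calls are `VirtualAlloc` with `MEM_RESERVE` and then `MEM_COMMIT`), the class name `VirtualArena` is hypothetical, and exactly how the OS accounts for the reserved range varies by platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical arena that reserves address space and commits pages on demand.
class VirtualArena {
public:
    explicit VirtualArena(std::size_t reserve_bytes)
        : m_page(static_cast<std::size_t>(sysconf(_SC_PAGESIZE))) {
        const std::size_t rounded = RoundUpToPage(reserve_bytes);
        // PROT_NONE: the range is reserved but not yet usable.
        void* base = mmap(nullptr, rounded, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base != MAP_FAILED) {
            m_base = static_cast<std::uint8_t*>(base);
            m_reserved = rounded;
        }
    }
    ~VirtualArena() {
        if (m_base != nullptr) munmap(m_base, m_reserved);
    }
    VirtualArena(const VirtualArena&) = delete;
    VirtualArena& operator=(const VirtualArena&) = delete;

    // Bump allocation with 16-byte granularity; nullptr when the reservation is exhausted.
    void* Allocate(std::size_t size) {
        if (size > m_reserved - m_used) return nullptr;
        size = (size + 15) & ~static_cast<std::size_t>(15);
        if (size > m_reserved - m_used) return nullptr;
        const std::size_t new_used = m_used + size;
        if (new_used > m_committed) {
            // Make just enough additional pages readable and writable.
            const std::size_t new_committed = RoundUpToPage(new_used);
            if (mprotect(m_base + m_committed, new_committed - m_committed,
                         PROT_READ | PROT_WRITE) != 0) {
                return nullptr;
            }
            m_committed = new_committed;
        }
        void* ptr = m_base + m_used;
        m_used = new_used;
        return ptr;
    }

    std::size_t Committed() const { return m_committed; }
    std::size_t PageSize() const { return m_page; }

private:
    std::size_t RoundUpToPage(std::size_t n) const {
        return (n + m_page - 1) / m_page * m_page;
    }

    std::size_t m_page;
    std::uint8_t* m_base = nullptr;
    std::size_t m_reserved = 0;
    std::size_t m_committed = 0;
    std::size_t m_used = 0;
};
```

Pointers handed out earlier stay valid as the arena grows, because the base address never moves. The trade-off is a system call each time a new page range is committed.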
Current Implementation in RendererEngine
Currently, RendererEngine uses a hybrid approach combining arena allocation and pool allocation:
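// Simplified example of our current approach
#include <stddef.h>  // size_t
#include <stdint.h>  // uint8_t

// Bump allocator over one contiguous block
struct ArenaAllocator {
    void* Allocate(size_t size);
    void Clear();
    uint8_t* m_memory;        // start of the backing block
    size_t m_current_offset;  // bytes handed out so far
};

// Assumed layout (not shown in the original snippet): a link stored
// inside each unused chunk
struct PoolFreeNode {
    PoolFreeNode* next;
};

// Fixed-size chunk allocator backed by a free list
struct PoolAllocator {
    void* Allocate();
    void Free(void* ptr);
    PoolFreeNode* head;  // first free chunk
    size_t chunk_size;   // size of every chunk, in bytes
};

// Temporary arena for frame allocations
struct ArenaTemp {
    ArenaAllocator* Arena;
    size_t SavedOffset;  // arena offset captured by BeginTempArena
};

ArenaTemp BeginTempArena(ArenaAllocator* arena);
void EndTempArena(ArenaTemp temp);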
We initialize a global arena at startup, then create subsystem-specific pools and sub-arenas. This gives us fast per-frame allocation without fragmentation, reuse of memory across frames, and simple reset semantics.
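The declarations above don't show function bodies, so here is a guess at how the temp-arena pair could work: save the offset on begin, restore it on end. The bodies and the `m_capacity` field are assumptions added for this sketch, and it assumes `m_memory` points at a 16-byte-aligned block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct ArenaAllocator {
    void* Allocate(std::size_t size);
    void Clear();
    std::uint8_t* m_memory;
    std::size_t m_current_offset;
    std::size_t m_capacity;  // assumption: not part of the original declaration
};

// Assumed body: round the offset up to 16 bytes, then bump it.
void* ArenaAllocator::Allocate(std::size_t size) {
    const std::size_t aligned =
        (m_current_offset + 15) & ~static_cast<std::size_t>(15);
    if (aligned > m_capacity || size > m_capacity - aligned) {
        return nullptr;  // arena exhausted
    }
    m_current_offset = aligned + size;
    return m_memory + aligned;
}

void ArenaAllocator::Clear() { m_current_offset = 0; }

struct ArenaTemp {
    ArenaAllocator* Arena;
    std::size_t SavedOffset;
};

// Remember where the arena was.
ArenaTemp BeginTempArena(ArenaAllocator* arena) {
    return ArenaTemp{arena, arena->m_current_offset};
}

// Roll the arena back, releasing everything allocated since BeginTempArena.
void EndTempArena(ArenaTemp temp) {
    assert(temp.SavedOffset <= temp.Arena->m_current_offset);
    temp.Arena->m_current_offset = temp.SavedOffset;
}

// Usage sketch: scratch memory that lives only inside the Begin/End pair.
// Returns false if the arena could not provide the scratch buffer.
bool SumWithScratch(ArenaAllocator* arena, std::size_t count, std::size_t* out_sum) {
    ArenaTemp temp = BeginTempArena(arena);
    auto* scratch = static_cast<std::size_t*>(arena->Allocate(count * sizeof(std::size_t)));
    bool ok = scratch != nullptr;
    if (ok) {
        *out_sum = 0;
        for (std::size_t i = 0; i < count; ++i) scratch[i] = i;
        for (std::size_t i = 0; i < count; ++i) *out_sum += scratch[i];
    }
    EndTempArena(temp);  // scratch is gone after this line
    return ok;
}
```

Begin/End pairs have to nest like a stack: ending an outer scope also discards anything an inner scope left behind.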
But there are still big gaps to resolve:
- What to do when an arena runs out mid-frame
- How to size memory pools to avoid both under- and over-allocation
Key Questions I'm Still Wrestling With
- Memory Reservation Strategy
  - Should I preallocate a fixed-size region?
  - What if it’s too small or too large?
  - Should I reserve virtual memory and commit pages as needed?
  - What’s the overhead of dynamic commitment vs. precommit?
- Mixing Allocation Strategies
  - How do I combine arenas + pools + a general-purpose allocator like TLSF (Two-Level Segregated Fit) cleanly?
  - Which systems benefit most from specialization?
  - What happens if allocations cross subsystem boundaries?
- Growth Strategy
  - When should memory grow?
  - How much should I grow by? (50%, double?)
  - How do I avoid stalling the frame while growing?
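One candidate answer to the growth questions, sketched here only as an option to benchmark: instead of reallocating and moving a full arena, chain a new block onto it, doubling the block size each time. `GrowingArena` is a hypothetical name, the doubling factor is an arbitrary choice, and the 16-byte alignment relies on `malloc` returning memory aligned at least that strictly, which is typical on 64-bit platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical arena that grows by chaining geometrically larger blocks.
class GrowingArena {
public:
    explicit GrowingArena(std::size_t first_block_size)
        : m_next_size(first_block_size) {}
    ~GrowingArena() {
        for (Block& block : m_blocks) std::free(block.data);
    }
    GrowingArena(const GrowingArena&) = delete;
    GrowingArena& operator=(const GrowingArena&) = delete;

    void* Allocate(std::size_t size) {
        size = (size + 15) & ~static_cast<std::size_t>(15);
        if (m_blocks.empty() ||
            size > m_blocks.back().capacity - m_blocks.back().used) {
            // Out of room: the tail of the old block is abandoned, and
            // existing pointers stay valid because nothing is moved.
            if (!AddBlock(size)) return nullptr;
        }
        Block& block = m_blocks.back();
        void* ptr = block.data + block.used;
        block.used += size;
        return ptr;
    }

    // Keeps only the newest (largest) block so the next frame starts with
    // more room and is less likely to grow again.
    void Reset() {
        while (m_blocks.size() > 1) {
            std::free(m_blocks.front().data);
            m_blocks.erase(m_blocks.begin());
        }
        if (!m_blocks.empty()) m_blocks.back().used = 0;
    }

    std::size_t BlockCount() const { return m_blocks.size(); }
    std::size_t TotalCapacity() const {
        std::size_t total = 0;
        for (const Block& block : m_blocks) total += block.capacity;
        return total;
    }

private:
    struct Block {
        std::uint8_t* data;
        std::size_t capacity;
        std::size_t used;
    };

    bool AddBlock(std::size_t min_size) {
        const std::size_t capacity = m_next_size > min_size ? m_next_size : min_size;
        auto* data = static_cast<std::uint8_t*>(std::malloc(capacity));
        if (data == nullptr) return false;
        m_blocks.push_back(Block{data, capacity, 0});
        m_next_size = capacity * 2;  // double for the next growth step
        return true;
    }

    std::vector<Block> m_blocks;
    std::size_t m_next_size;
};
```

This does not remove the stall: the `malloc` for a new block still happens mid-frame. Doubling just makes it rare, and keeping the largest block across `Reset()` means a steady workload should stop growing after a few frames.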
I don’t have all the answers yet. This post is a snapshot of what I’ve figured out — and what I haven’t. My current approach works, but I know it won’t scale forever. I’ll keep refining it, benchmarking, and documenting the journey.