Understanding ECS. Finally explained

What is ECS? How did Unity make ECS? How do I make ECS? Data-oriented ECS? And what does ECS stand for?

The last question first: ECS stands for Entity Component System.

ECS is essentially an architecture, used primarily in game development, where everything in the game world is defined as an Entity, and each entity can have attributes or aspects that we call Components. Systems are the logic that runs over those components. Hence the name ECS.

After my last post I realized I needed to work through these questions myself a bit more before exploring them in detail in a blog post. I had three more posts planned on conceptual ideas for ECS, but realized no one (myself included) would want to read them! So I watched some talks by Unity, specifically "Understanding data-oriented design for entity component systems" (Unity at GDC 2019). A second talk also helped, though it only made sense once I had watched the first. I am now ready to give an overview of Unity's ECS, or at least of how I would make an ECS.

Entities and Components

Made in draw.io

Above you can see a super-simplified ECS system I made to help you get the idea.

As you can see in the diagram above, accessing the components of a specific Entity is as simple as indexing into the component arrays. So we have no real need for an Entity object; an Entity becomes just an int. Say we want all the components that belong to Entity 2: we go into each array at index 2 and we have that Entity's components. For simple applications this works great, but if you wanted to, you could also represent an "Entity" as a component of its own.
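To make the indexing idea concrete, here is a minimal sketch. The component types `Position` and `Velocity`, the `World` struct, and the `integrate` function are all hypothetical names chosen for illustration; they are not from Unity or from the diagram.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical component types, named here only for illustration.
struct Position { float x, y, z; };
struct Velocity { float x, y, z; };

constexpr std::size_t kMaxEntities = 6;  // six slots, like a small fixed array

// An entity is nothing more than an index into every component array.
using EntityId = std::size_t;

struct World {
    std::array<Position, kMaxEntities> positions{};   // zero-initialized
    std::array<Velocity, kMaxEntities> velocities{};  // zero-initialized
};

// Move one entity by its velocity. Both components live at the same index.
inline void integrate(World& world, EntityId e, float dt) {
    assert(e < kMaxEntities);
    Position& p = world.positions[e];
    const Velocity& v = world.velocities[e];
    p.x += v.x * dt;
    p.y += v.y * dt;
    p.z += v.z * dt;
}
```

Calling `integrate(world, 2, dt)` touches only slot 2 of each array, which is all that "Entity 2" means in this layout.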

What if you want to "delete" an Entity, or really just remove it from the component arrays? If we simply did that, we would end up with a gap in our arrays. To solve this we can use a queue: every time we remove an Entity's components from the arrays, we store its index in a queue of to_be_filled indexes (literally a queue of ints), and when instantiating we dequeue from to_be_filled and "fill" the gaps. But how do we actually mark an Entity as removed or deleted? In this case I can use the "Entity" components and store a bool value in them. Whenever we "instantiate" a new Entity we set that Entity component's bool to true, and whenever we remove an Entity we set it to false. Code for an Entity component could look like this:

struct Entity
{
    int id = 0;               // potentially very useful
    Entity *parent = nullptr; // direct memory pointer to a parent entity
    bool alive = false;       // false by default; set to true when instantiated
};
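The remove-and-refill idea can be sketched as follows. This is only one way to do it, assuming a fixed number of slots; the class name `EntitySlots` and its methods are made up for this example, while `to_be_filled_` mirrors the queue described above.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical bookkeeping for entity slots, assuming a fixed capacity.
class EntitySlots {
public:
    static constexpr std::size_t kNone = static_cast<std::size_t>(-1);

    explicit EntitySlots(std::size_t capacity) : alive_(capacity, false) {}

    // Reuse a freed slot if there is one, otherwise take the next fresh slot.
    // Returns kNone when every slot is in use.
    std::size_t create() {
        std::size_t id;
        if (!to_be_filled_.empty()) {
            id = to_be_filled_.front();
            to_be_filled_.pop();
        } else if (next_fresh_ < alive_.size()) {
            id = next_fresh_++;
        } else {
            return kNone;
        }
        alive_[id] = true;
        return id;
    }

    // Mark the slot dead and remember its index so create() can fill the gap.
    void destroy(std::size_t id) {
        assert(id < alive_.size() && alive_[id]);
        alive_[id] = false;
        to_be_filled_.push(id);
    }

    bool is_alive(std::size_t id) const {
        return id < alive_.size() && alive_[id];
    }

private:
    std::vector<bool> alive_;               // the "alive" flag per entity
    std::queue<std::size_t> to_be_filled_;  // indexes of removed entities
    std::size_t next_fresh_ = 0;            // first slot that was never used
};
```

With three slots, creating three entities, destroying entity 1, and creating again hands back index 1, so the gap is filled before any new space is needed.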

The first diagram uses fixed arrays and doesn't allow for more than 6 entities! What if I want to instantiate more than 6? (This is just an example; obviously you could make those arrays longer.) To solve this we can keep lists of arrays, and when every array in a list is full, add another array to the list.

Here you can see there are lists of arrays for all our components. When we run out of room we can allocate a new array, add it to the list, and set its values. Why not just use lists for the whole thing? Because a linked list stores at least one pointer alongside every element, which for small components can roughly double the memory used, and each node is allocated separately, so neighbouring elements usually end up far apart in memory. Jumping around in memory is bad for performance and costs precious milliseconds.
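A list of arrays for a single component type might look like the sketch below. `BlockList` is a hypothetical name, the block size of 6 simply echoes the diagram, and the element type is assumed to be default-constructible and copyable.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical growable storage for one component type:
// a list of fixed-size arrays ("blocks").
template <typename T, std::size_t BlockSize = 6>
class BlockList {
public:
    // Append a value, allocating a new block when the last one is full.
    // Returns the index (the entity id) the value was stored at.
    std::size_t push_back(const T& value) {
        if (size_ == blocks_.size() * BlockSize) {
            blocks_.push_back(std::make_unique<std::array<T, BlockSize>>());
        }
        (*blocks_[size_ / BlockSize])[size_ % BlockSize] = value;
        return size_++;
    }

    // Entity index -> (block, offset inside the block).
    T& operator[](std::size_t index) {
        assert(index < size_);
        return (*blocks_[index / BlockSize])[index % BlockSize];
    }

    std::size_t size() const { return size_; }
    std::size_t block_count() const { return blocks_.size(); }

private:
    std::vector<std::unique_ptr<std::array<T, BlockSize>>> blocks_;
    std::size_t size_ = 0;
};
```

Growing never copies or moves existing components; only a new block is allocated. The cost is one extra hop per access, from the list to the block.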

Although this design lets us instantiate more components, we are now moving around a lot in memory between the lists of arrays, and if you want to handle lots of Entities that can become a problem. Say you wanted to access the meshRenderer and position components of each Entity. Every iteration you would have to jump between the arrays in each list of arrays, and that would slow down our game. If we could store them all packed together we would eliminate a lot of that moving around in memory, but with the current setup we can't do this very effectively.

Unity solved this by allocating Archetype data in fixed-size chunks of memory (16 KiB each according to Unity's Entities documentation; I originally guessed 64k), each chunk holding arrays of components. The chunks are linked together using direct pointers, which is OK in this case since we aren't following pointers that often.

I'm guessing the chunks are kept small so that they work well with the L1/L2 CPU caches. The unit those caches work in is the cache line:

A cache line is the unit of data transfer between the cache and main memory. Typically the cache line is 64 bytes. The processor will read or write an entire cache line when any location in the 64 byte region is read or written.

from medium.com (I highly recommend reading this post!)
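As a rough illustration of why tightly packed arrays matter, the sketch below counts how many 64-byte cache lines a packed array touches. The function name `lines_for_packed` is made up for this example, and it assumes the array starts on a cache-line boundary.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kCacheLineBytes = 64;  // typical size, per the quote

// Cache lines needed to read `count` elements stored back to back,
// assuming the array starts on a cache-line boundary.
inline std::size_t lines_for_packed(std::size_t count,
                                    std::size_t element_bytes) {
    assert(element_bytes > 0);
    const std::size_t total_bytes = count * element_bytes;
    return (total_bytes + kCacheLineBytes - 1) / kCacheLineBytes;  // round up
}
```

For example, 100 components of 12 bytes each occupy 1200 bytes and fit in 19 cache lines when packed. If each of those 100 components were a separately allocated node, reading them could pull in on the order of 100 different lines.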

The arrays of components in a chunk are limited in length by the size of the chunk, so if we have an archetype with lots of components, fewer Entities and their corresponding components will fit in each chunk. This is easier to understand if we revise our previous diagram to better reflect what a chunk really looks like:

[Diagrams: ECS Archetype Chunks, ECS Component Chunks]

This is to be expected, and it's not bad, just something to be aware of. We can still access the components by index, but they are now as close together in memory as they can be, which makes accessing multiple components of an Entity very fast. (Edit: sorry about the duplicate rot3 components, I didn't mean to do that.)
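The trade-off between the number of components and the number of Entities per chunk is simple arithmetic, sketched below. Both function names are hypothetical, the chunk size is passed in as a parameter, and the calculation ignores any per-chunk header and alignment padding that a real engine would have.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Rough number of entities that fit in one chunk, given the size in bytes
// of each component in the archetype. Ignores headers and padding.
inline std::size_t entities_per_chunk(
        std::size_t chunk_bytes,
        const std::vector<std::size_t>& component_sizes) {
    const std::size_t bytes_per_entity = std::accumulate(
        component_sizes.begin(), component_sizes.end(), std::size_t{0});
    assert(bytes_per_entity > 0);
    return chunk_bytes / bytes_per_entity;
}

// Byte offset of each component array inside the chunk when the arrays
// are laid out one after another, each `capacity` elements long.
inline std::vector<std::size_t> array_offsets(
        std::size_t capacity,
        const std::vector<std::size_t>& component_sizes) {
    std::vector<std::size_t> offsets;
    std::size_t offset = 0;
    for (std::size_t size : component_sizes) {
        offsets.push_back(offset);
        offset += size * capacity;
    }
    return offsets;
}
```

With a chunk of 16384 bytes, an archetype made of a 12-byte and a 16-byte component fits 585 Entities per chunk. Add a second 12-byte component and a 64-byte one, and only 157 fit.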


Paper Prototypes Blog
