Object-Oriented Entity-Component-System Design
de_keyboard 2021-08-16 16:31:49 +0000 UTC [ - ]
* physics - position component + physics component
* rendering - position component + animation component
* etc.
How do we now store these components? How do we create / access aggregates efficiently?
If we have two arrays then there is lots of hopping around:
PositionComponent[]
PhysicsComponent[]
Maybe we need some kind of grouping class?

struct EntityData {
PositionComponent* position; // null if not a position entity
PhysicsComponent* physics; // null if not a physics entity
}
Interfaces have their issues too, but at least it's fairly clear what to do:

class BouncyBall : IHasPosition, IHasPhysics {
IPosition getPosition() {
// ...
}
IPhysics getPhysics() {
// ...
}
}
Anyone solved this before?
royjacobs 2021-08-16 16:36:32 +0000 UTC [ - ]
Of course, if your system needs to iterate across 20 components to do its job then maybe you need to check if you've factored your components correctly.
munificent 2021-08-16 17:45:40 +0000 UTC [ - ]
It's bad for spatial locality. You end up with many more CPU cache misses, which significantly slows down execution. Using the CPU cache effectively is one of the primary reasons to use ECS.
Twisol 2021-08-16 18:05:06 +0000 UTC [ - ]
Of course, if you access N components you need N pages in the cache concurrently, which is going to fall over for a not-too-large N. But N=2 or N=3 seems unlikely to kill spatial locality.
I can imagine it gets a little more complicated with prefetch, but you're still using the prefetched pages -- you just need to prefetch pages for two separate arrays (potentially at different rates based on component size) rather than one. Do these details end up snowballing in a way I'm not seeing, or are there details I'm just missing outright?
munificent 2021-08-16 18:43:47 +0000 UTC [ - ]
It sounds like you're thinking of virtual memory (i.e. pages of memory either being in RAM or on disk). But CPU caching is about the much smaller L1, L2, and L3 caches directly on the chip itself.
Let's say you have two kinds of components, A and B. You have those stored in two contiguous arrays:
AAAAAAAAAAAAAAAAAAA...
BBBBBBBBBBBBBBBBBBB...
Each entity has an A and B, and your system needs to access both of those to do its work. The code will look like:

for each entity:
    access some data in component A
    access some data in component B
    do some computation
On the first access of A, the A component for that entity and a bunch of subsequent As for other entities get loaded into the cache line. On the next access of B, the B for that entity along with a bunch of subsequent Bs gets loaded into a cache line. If you are lucky the A and B arrays will be at addresses such that the chip is able to put them in different cache lines. At that point, I think you'll mostly be OK. But if you're unlucky and they are fighting for the same cache line, then each access can end up evicting the previous one and forcing a main memory look up for every single component.
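A minimal sketch of the loop being described, with made-up component names (not from the comment) standing in for the A and B arrays:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical components: "A" and "B" in the example above.
struct VelocityA { float vx, vy; };
struct PositionB { float x, y; };

// One system pass: for each entity, read its A, update its B.
// Both arrays are walked sequentially, so each cache line fetched
// from either array is fully used before moving on.
void integrate(std::vector<PositionB>& positions,
               const std::vector<VelocityA>& velocities,
               float dt) {
    for (std::size_t i = 0; i < positions.size(); ++i) {
        positions[i].x += velocities[i].vx * dt;  // touch A and B
        positions[i].y += velocities[i].vy * dt;  // do some computation
    }
}
```

Whether the two streams cooperate or fight in the cache depends on where the allocator places them, which is the unlucky case described above.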
Twisol 2021-08-16 23:16:50 +0000 UTC [ - ]
I think I mostly just used the wrong term (page instead of line) -- I'm on board with the distinction you're drawing.
> But if you're unlucky and they are fighting for the same cache line, then each access can end up evicting the previous one and forcing a main memory look up for every single component.
Yeah, I think this specifically is what I'm having a hard time imagining. I guess it depends on how memory regions get mapped into cache lines, and I think I'm assuming something closer to fully-associative, while you're allowing for something closer to direct-mapped.
I can see the trouble -- you don't really have much control over the mapping, unless you make some really deep assumptions about the CPU architecture and go to great lengths to position your arrays appropriately in memory. That would certainly be in the original spirit of data-oriented design, but it's perhaps a bit beyond reasonable for most systems.
Thanks for working through this with me!
(P.S. For others, I found this illustration that helped load a bit of CPU cache architecture back into my brain: http://csillustrated.berkeley.edu/PDFs/handouts/cache-3-asso...)
quotemstr 2021-08-16 18:49:12 +0000 UTC [ - ]
munificent 2021-08-16 21:12:40 +0000 UTC [ - ]
renox 2021-08-17 06:33:25 +0000 UTC [ - ]
Here it's a cache (associativity) conflict; as gugagore said, it can impact performance even in read-only situations. And yes, you're describing a way to avoid the conflicts.
gugagore 2021-08-16 20:03:13 +0000 UTC [ - ]
nikki93 2021-08-16 18:36:59 +0000 UTC [ - ]
royjacobs 2021-08-16 19:25:27 +0000 UTC [ - ]
It's also an option to have your component data interleaved, if you know the iteration usage upfront, I suppose.
jayd16 2021-08-16 20:33:23 +0000 UTC [ - ]
BulgarianIdiot 2021-08-16 17:09:58 +0000 UTC [ - ]
Ideally you want to have one modifier/controller, but you can have as many readers as you want.
When you can't have a single controller, you have several options:
1. Pass ownership. Animated components control position only by animation. Physics components control position only by physics. You can pass this control in time from physics to animation and back.
2. Express one through the other. In this case, express animation as acting on physics constraints, and let the physics engine compute the final position. This way animation becomes just another "physical force" in your game. It can be hard to do sophisticated animation this way though.
3. Have a physics-specific position and an animation-specific position, and have the final position be computed as a formula of both. Maybe you sum them, so either one can move relative to a base offset and affect the final position. This depends on what the position is of.
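A tiny sketch of option 3 (names invented for illustration): each system owns its own piece of the position, and only the composition formula reads both.

```cpp
struct Vec2 { float x, y; };

// Physics owns `base`; animation owns `offset`. Neither system
// writes the other's data; the "real" position is derived on demand.
struct SplitPosition {
    Vec2 base;    // written only by the physics system
    Vec2 offset;  // written only by the animation system
};

Vec2 finalPosition(const SplitPosition& p) {
    return { p.base.x + p.offset.x, p.base.y + p.offset.y };
}
```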
jayd16 2021-08-16 16:46:53 +0000 UTC [ - ]
Unity then schedules work for your system by passing all the relevant arrays.
Described in detail here: https://docs.unity3d.com/Packages/com.unity.entities@0.17/ma...
danbolt 2021-08-16 18:20:15 +0000 UTC [ - ]
I didn't work on the game, but I spoke with some of the developers of Homeworld: Deserts of Kharak. Since there was a straightforward quantity of entities and components (a bunch of vehicles in a closed desert space), the space for all data was preallocated at initialization time. I can't speak further on the specifics though.
[1] https://github.com/skypjack/entt/blob/master/docs/md/entity....
[2] https://docs.rs/specs/0.17.0/specs/struct.VecStorage.html
sparkie 2021-08-16 16:42:33 +0000 UTC [ - ]
[1]:https://www.doc.ic.ac.uk/%7Escd/ShapesOnwards.pdf; A more recent revision of the work here: https://www.researchgate.net/publication/341693673_Reshape_y...
quotemstr 2021-08-16 18:45:25 +0000 UTC [ - ]
dkersten 2021-08-16 20:49:06 +0000 UTC [ - ]
Sure. Take a look at EnTT[1], a popular C++ ECS library. It comes with two main tools to deal with this: Sorting[2] and groups[3]. EnTT gives you a large spectrum of tools with different trade-offs so that you can tune your code based on usage patterns. Obviously different bits of code will have conflicting access patterns, so there's no one-size-fits-all solution, but EnTT lets you optimise the patterns that are most important to you (based on profiling, hopefully).
[1] https://github.com/skypjack/entt
[2] Sort one component to be in the same order as another component, so that they can be efficiently accessed together: https://github.com/skypjack/entt/wiki/Crash-Course:-entity-c...
[3] https://github.com/skypjack/entt/wiki/Crash-Course:-entity-c...
throw149102 2021-08-16 18:01:23 +0000 UTC [ - ]
See: https://en.wikipedia.org/wiki/AoS_and_SoA
Jonathan Blow has a good talk about it here: https://www.youtube.com/watch?v=YGTZr6bmNmk
beiller 2021-08-16 18:47:43 +0000 UTC [ - ]
meheleventyone 2021-08-16 16:41:23 +0000 UTC [ - ]
This is partly why a lot of ECS demos have a lot of homogeneous elements (they share all components in common). For example, particle systems have long been written in a data-oriented manner when running on the CPU. So if you implement it in the ECS style you can just run through the arrays in order and it's all good. Or Unity's city sim example. But games tend to have much more heterogeneous entities (they share fewer components in common).
The most obvious example I can think of to dispel the myth of ECS's inherent DoDness is an ECS wherein each component storage is a linked list with each element individually allocated. Even iterating through the homogeneous entity example is likely to be extremely slow in comparison to flat arrays. So there is nothing about the pattern that demands it be implemented in a data-oriented manner.
But back to a more heterogeneous example. I'm going to try to explain it generally because I think a worked version would be enormous and maybe cloud things more? Typically component storage is indexed by the entity ID: you want to look up the component in the storage associated with a particular ID. If all your storages are flat arrays where the entity ID is just an index into the array, then the more heterogeneous your entities, the more gaps you will have to iterate over, and correspondingly more memory your game will take up. This isn't great for cache locality or memory usage, and we have to iterate over every entity for all systems to find the valid ones.
So the next step uses a dense array and a secondary backing array that is indexed by the entity id. So we can keep our components packed nicely but still look them up easily.
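A bare-bones version of that dense-plus-backing-array idea (often called a sparse set); removal and error handling are omitted:

```cpp
#include <cstdint>
#include <vector>

constexpr std::uint32_t kAbsent = UINT32_MAX;  // entity has no such component

template <typename Component>
struct SparseStorage {
    std::vector<std::uint32_t> sparse;    // indexed by entity ID -> dense slot
    std::vector<std::uint32_t> entities;  // dense slot -> entity ID
    std::vector<Component> dense;         // packed component data

    void add(std::uint32_t entity, const Component& c) {
        if (entity >= sparse.size()) sparse.resize(entity + 1, kAbsent);
        sparse[entity] = static_cast<std::uint32_t>(dense.size());
        entities.push_back(entity);
        dense.push_back(c);
    }
    bool has(std::uint32_t entity) const {
        return entity < sparse.size() && sparse[entity] != kAbsent;
    }
    Component& get(std::uint32_t entity) { return dense[sparse[entity]]; }
};
```

Systems iterate `dense` straight through; random access by entity ID costs one extra indirection through `sparse`.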
Instead of iterating over all the entities for every system, we can find the shortest component storage for the set of components the system uses, iterate directly over that, and look up the other components in their storages by the current entity ID. Now we iterate over potentially many fewer entities but essentially do a random lookup into the other component storages for each one. So we're introducing cache misses for the benefit of having fewer things to iterate over.
So what we want is the benefits of blazing through arrays without the downsides of them being pretty sparse, while ideally minimizing cache misses. Which is why the concept of an Archetype was invented: we keep our components in flat arrays, but crucially change our storage so that instead of keeping one flat array per component, we keep separate component storages for each archetype of entity we have right now.
Going from:
AAAAAAAAAA
BBBBBBBBBB
CCCCCCCCCC
To:
(ABC) A B C
(AB) AAA BBB
(AC) AAAAA CCCCC
(C) CCCCC
If we have a system that just iterates C's it can find all the archetype storages and iterate straight through the C array for them one by one. So ideally we only pay a cache miss when we change archetype, have good cache locality and are iterating the minimum set. Similarly a system that uses components A and C will only iterate the archetype storage of ABC and AC and blaze straight through the A and C arrays of each. Same deal.
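A toy rendering of that layout (illustrative, not a real implementation): each archetype owns a packed array per component it contains, and a system over C walks only the archetypes that have one.

```cpp
#include <vector>

// Empty vector = this archetype doesn't have that component.
struct Archetype {
    std::vector<float> a, b, c;
};

// A "system" over component C: visit each archetype's C array in turn.
// Within an archetype the walk is fully contiguous; the only jump is
// when we move on to the next archetype's storage.
float sumC(const std::vector<Archetype>& archetypes) {
    float total = 0.0f;
    for (const Archetype& arch : archetypes)
        for (float v : arch.c) total += v;
    return total;
}
```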
This comes at a cost of making adding and removing components from an entity more expensive.
We're also ignoring interacting with other components or the world and how that might work. For example we might want to do damage to another entity entirely. Or we might want to look up the properties of the piece of ground we're stood on. So there is a whole other layer of places we can ruin all this good work by wanting to access stuff pretty randomly. Relationships in games tend to be spatial and stuff tends to move around so it's hard to see a general case solution to the problem.
Then there are other axes to think on, like ease of creating the game, how flexible it is to change the game, iteration speed, designer friendliness and so on. Rarely IME has the gameplay code itself been the bottleneck outside of stupid mistakes.
In games this level of optimization is really great when you do have a big, mostly homogeneous set of things. Then it's well worth the time to structure your data for efficient memory access. City sims, games like Factorio and so on are classic examples.
hypertele-Xii 2021-08-16 17:15:03 +0000 UTC [ - ]
Or is CPU cache really so slow it can literally only look at one stride of memory at a time?
I'm skeptical this kind of optimization is necessary.
meheleventyone 2021-08-16 17:22:15 +0000 UTC [ - ]
This archetype based approach is used in quite a few big ECS projects. Unity’s ECS and Bevy amongst them.
As with anything performance related though, particularly when considering the underlying principles of data oriented design you should be analysing the performance of your approach on the target hardware.
de_keyboard 2021-08-16 16:52:15 +0000 UTC [ - ]
> This comes at a cost of making adding and removing components from an entity more expensive.
I think you could write a design-time tool that takes a simple description file (with hints) and outputs code that stores your entities and components efficiently.
Description file:
{
"archetypes": [
{
"components": [ "A", "B", "C" ]
},
{
"components": [ "B" ]
},
{
"components": [ "B", "C" ]
}
]
}
Output:

class ArchetypeABC {
A a;
B b;
C c;
}
class ArchetypeB {
B b;
}
class ArchetypeBC {
B b;
C c;
}
class EntityStore {
ArchetypeABC[] entitiesABC;
ArchetypeB[] entitiesB;
ArchetypeBC[] entitiesBC;
}
codetrotter 2021-08-16 17:10:22 +0000 UTC [ - ]
So I would expect the code corresponding to their comment to look like this instead of what you wrote:
class ArchetypeABC {
A[] as;
B[] bs;
C[] cs;
}
class ArchetypeB {
B[] bs;
}
class ArchetypeBC {
B[] bs;
C[] cs;
}
But maybe I misunderstood?
meheleventyone 2021-08-16 17:02:11 +0000 UTC [ - ]
throwaway13337 2021-08-16 16:39:08 +0000 UTC [ - ]
But, assuming it isn't, you would make that portion of things its own component/trait of the entity. Components that rely on it could be declared to require it (in Unity, there is a RequireComponent annotation). So you can be sure that if that component exists, its required component also exists on the entity. I think this is a reasonably satisfying solution.
learc83 2021-08-16 17:38:33 +0000 UTC [ - ]
ECS is a specific software architecture where (among other things) there are no entity-level properties, because an entity is just an identifier--all properties are stored in logicless, data-only components. Unity DOTS has an implementation of this.
jcelerier 2021-08-16 16:41:00 +0000 UTC [ - ]
that's a very very video game centric point of view. If a pattern only works in a couple of fields of application, it's not a very good pattern.
munificent 2021-08-16 16:56:11 +0000 UTC [ - ]
ECS was invented for and is primarily used by videogames.
> If a pattern only works in a couple fields of application, it's not a very good pattern.
I completely and totally disagree. How would you even define a "field of application" without there being patterns and practices that are unique to it? If every domain uses the same techniques, what is the difference?
Off the top of my head, here are some patterns that I rarely see outside of their primary domain:
Programming languages and compilers:
* Recursive descent
* Top-down operator precedence parsing
* The visitor pattern
* Mark-sweep garbage collection and the tri-color abstraction
Game and simulation programming:
* ECS
* Per-frame arena allocators
* Game loops
* Double buffering
* Spatial partitioning
jcelerier 2021-08-16 22:23:03 +0000 UTC [ - ]
munificent 2021-08-16 23:51:09 +0000 UTC [ - ]
In principle, yes. In practice, I've almost never seen the visitor pattern used outside of programming languages. Maybe once or twice in UI frameworks.
> methods and techniques maybe. These words don't mean the same thing, a pattern is much more general than a technique !
What's your definition of a "pattern" and why wouldn't the things I list fit in it?
ehaliewicz2 2021-08-17 00:02:07 +0000 UTC [ - ]
void_mint 2021-08-16 16:55:34 +0000 UTC [ - ]
> Entity–component–system (ECS) is a software architectural pattern that is mostly used in video game development.
adamrezich 2021-08-16 16:55:21 +0000 UTC [ - ]
viktorcode 2021-08-16 17:12:26 +0000 UTC [ - ]
The problem here lies in linking those separate components (i.e. indices in arrays) to entities.
michannne 2021-08-16 16:59:39 +0000 UTC [ - ]
> if we have two arrays then there is lots of hopping around
Why? You can just create a custom allocation scheme assigning one giant chunk of memory to all components, and give each system a custom iterator that steps according to the layout of the components it cares about.
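One way to picture that (illustrative, not production-grade): interleave the components in one buffer and step through it with a fixed byte stride, reading only the component a given system cares about.

```cpp
#include <cstddef>
#include <vector>

struct Position { float x, y; };
struct Health   { int hp; };

// One record per entity, components interleaved in a single chunk.
struct Record { Position pos; Health health; };

// A "system" that only needs Position: advance by the full record
// size each step, reading just the leading Position (offset 0).
float sumX(const std::vector<Record>& chunk) {
    float total = 0.0f;
    const unsigned char* p =
        reinterpret_cast<const unsigned char*>(chunk.data());
    for (std::size_t i = 0; i < chunk.size(); ++i, p += sizeof(Record))
        total += reinterpret_cast<const Position*>(p)->x;
    return total;
}
```

The trade-off: a system touching only one component still pulls the whole interleaved record through the cache, so this layout favors systems that use most of the record.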
resonantjacket5 2021-08-16 18:06:38 +0000 UTC [ - ]
That way each entity can access its data quickly, and if there is some heavy physics computation it can easily iterate over it in a list (e.g. checking for collisions).
gh123man 2021-08-16 17:17:50 +0000 UTC [ - ]
echohack5 2021-08-16 17:25:16 +0000 UTC [ - ]
wlamartin 2021-08-17 09:25:33 +0000 UTC [ - ]
After some serious initial difficulty getting my head away from "this is a block" to "this entity has a block" I think it's turned out to make a lot of sense.
For example, checking whether a value shared by a "when there is a new tweet" block is in scope when someone wants to use it (i.e. in a nested block) is as simple as having a few components to represent a pointer from the value to the ID of the block that shares it, plus a list of blocks that share scope for this location, and then comparing the two.
kvark 2021-08-16 18:52:39 +0000 UTC [ - ]
For example, a mesh may contain multiple materials. Is each material chunk a separate entity? Or maybe each bone in a skeleton is a separate entity with its own "parent" and "transform" plus other components.
One of the different approaches is component graph systems [1]. It lacks the ability to mix and match components, but provides a more natural (and simpler) model to program for.
[1] https://github.com/kvark/froggy/wiki/Component-Graph-System
runald 2021-08-17 06:30:56 +0000 UTC [ - ]
Wouldn't that take too long? Not really, just copy an actual released game, but using shitty (I mean, free) assets and make up some random content. It should cut the total development time from a week to a month at most. By copying, I mean everything, from the main menu, to in game-menu, to the little bits of logic that makes a game fun and engaging.
Kinrany 2021-08-17 09:03:46 +0000 UTC [ - ]
serverholic 2021-08-16 21:20:12 +0000 UTC [ - ]
meheleventyone 2021-08-16 19:01:16 +0000 UTC [ - ]
nikki93 2021-08-16 18:33:35 +0000 UTC [ - ]
- Combined with an inspector UI, you can explore gameplay by adding and removing components from entities and arrive at design emergently. One way to look at this is also that you write components and systems to handle the main gameplay path you start out thinking about, but your queries encode many other codepaths than just that (a combinatorial explosion of component membership in entity is possible). This lets you get a kind of "knob crawl" that you see in eg. sound design when tweaking parameters live with synths too. It lets artists and designers using the editor explore many gameplay possibilities.
- The way I see the component data is it's kind of an interface / source of truth, but some subsystems may end up storing transient data elsewhere at runtime (eg. a spatial octree or contact graph for physics). However as components are added or removed or component properties updated, the caches should be updated accordingly. You get a single focal point for scene state. Once some state is expressed as a component you get undo and redo, saving to scene files, saving an individual (or group of) entity as a blueprint to reuse, ...
The cache thing feels like a minor point to me, inside a larger category of allowing you to massage your data based on access patterns by decoupling the logic acting on it. With performance being one of the goals of said massaging along with many others.
I also find myself not really focusing on the "system" aspect as much as the entity / component; esp. re: embedding constructs for that into a library. I've found you can get far just having one large "updateGame()" function that does queries and then performs the relevant mutations in the query bodies, and you can then separate code into more functions (usually just simple free functions without parameters) from there that become your systems. There's a bit of a rabbit hole designing reusable scheduling and event systems and whatnot but I feel like just simple calls to top level / free functions like this on a per-game basis seems a lot clearer and ultimately more flexible (it's just regular procedural code and you're in control of what happens when). I like seeing the backtrace / callstack and being the owner of things and then being explicit all the way up vs. entering from some emergently scheduled event dispatch system.
peterthehacker 2021-08-16 15:23:26 +0000 UTC [ - ]
[0] https://martinfowler.com/bliki/DesignStaminaHypothesis.html
nodivbyzero 2021-08-16 16:57:21 +0000 UTC [ - ]
EnTT is a header-only, tiny and easy to use library for game programming and much more written in modern C++. Among others, it's used in Minecraft by Mojang, the ArcGIS Runtime SDKs by Esri and the amazing Ragdoll.
elteto 2021-08-16 18:09:22 +0000 UTC [ - ]
entt.hpp:
#include "config/version.h" #include "core/algorithm.hpp" #include "core/any.hpp" #include "core/attribute.h" #include "core/family.hpp" #include "core/hashed_string.hpp" #include "core/ident.hpp" #include "core/monostate.hpp" #include "core/type_info.hpp" #include "core/type_traits.hpp" #include "core/utility.hpp" #include "entity/component.hpp" #include "entity/entity.hpp" #include "entity/group.hpp" #include "entity/handle.hpp" #include "entity/helper.hpp" #include "entity/observer.hpp" #include "entity/organizer.hpp" #include "entity/poly_storage.hpp" #include "entity/registry.hpp" #include "entity/runtime_view.hpp" #include "entity/snapshot.hpp" #include "entity/sparse_set.hpp" #include "entity/storage.hpp" #include "entity/utility.hpp" #include "entity/view.hpp" #include "locator/locator.hpp" #include "meta/adl_pointer.hpp" #include "meta/container.hpp" #include "meta/ctx.hpp" #include "meta/factory.hpp" #include "meta/meta.hpp" #include "meta/node.hpp" #include "meta/pointer.hpp" #include "meta/policy.hpp" #include "meta/range.hpp" #include "meta/resolve.hpp" #include "meta/template.hpp" #include "meta/type_traits.hpp" #include "meta/utility.hpp" #include "platform/android-ndk-r17.hpp" #include "poly/poly.hpp" #include "process/process.hpp" #include "process/scheduler.hpp" #include "resource/cache.hpp" #include "resource/handle.hpp" #include "resource/loader.hpp" #include "signal/delegate.hpp" #include "signal/dispatcher.hpp" #include "signal/emitter.hpp" #include "signal/sigh.hpp"
linkdd 2021-08-16 21:03:50 +0000 UTC [ - ]
So yes, this is tiny. Tinier than Unity, CryEngine, Unreal Engine, or other huge frameworks of that kind.
meheleventyone 2021-08-16 22:40:09 +0000 UTC [ - ]
linkdd 2021-08-16 23:07:08 +0000 UTC [ - ]
Also, there is Hazel[0] which is based on entt and is the subject of an amazing youtube series[1].
IMHO, tinier means fewer features; at that point, what could be considered comparable?
[0] - https://github.com/TheCherno/Hazel
[1] - https://www.youtube.com/playlist?list=PLlrATfBNZ98dC-V-N3m0Go4deliWHPFwT
meheleventyone 2021-08-17 07:04:22 +0000 UTC [ - ]
A comparable ECS implementation is what you're after: is entt tiny in comparison to similar projects?
reidjs 2021-08-16 16:23:35 +0000 UTC [ - ]
codr7 2021-08-16 17:34:04 +0000 UTC [ - ]
You can sort of, kind of get there with free functions and overloading as long as you're not doing anything fancy with the methods.
NovaX 2021-08-16 19:19:08 +0000 UTC [ - ]
component = metadata + data
system = metadata processors
This way you can decouple business rules from the model by using metadata to instruct how the entity should be processed. At work we use json schema as our type system to describe our entities, where every instance includes the schemas that it implements. The metadata allows us to render an entity, process it via server-side triggers, store and search, use generic CRUD routes, etc. In a language like Java, this is the same as using reflection to inspect interfaces and annotations for processing an object.
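A loose sketch of that shape (all names invented): components carry metadata saying how they should be handled, and "systems" are looked up by that metadata instead of being hard-wired to concrete types.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

struct Component {
    std::string kind;  // metadata: which processing rules apply
    double value;      // data
};

using Processor = std::function<void(Component&)>;

// Generic driver: no business rules here, only metadata dispatch.
// Adding a new rule means registering a processor, not editing this loop.
void runSystems(std::vector<Component>& components,
                const std::map<std::string, Processor>& systems) {
    for (Component& c : components) {
        auto it = systems.find(c.kind);
        if (it != systems.end()) it->second(c);
    }
}
```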
reidjs 2021-08-16 20:17:56 +0000 UTC [ - ]
NovaX 2021-08-16 20:26:54 +0000 UTC [ - ]
royjacobs 2021-08-16 16:34:26 +0000 UTC [ - ]
All of this is pretty much out of the window for this design, so what is the benefit of the ECS here?
ScoobleDoodle 2021-08-16 16:44:35 +0000 UTC [ - ]
At the individual object level the different components are not cache coherent. So the rendering and physics instances of one object are not in any memory coherent location.
Because the physics will do its SIMD to resolve the mutual state. And then rendering will do the SIMD for their aggregate.
royjacobs 2021-08-16 19:27:23 +0000 UTC [ - ]
jayd16 2021-08-16 16:07:51 +0000 UTC [ - ]
insaneisnotfree 2021-08-17 04:39:04 +0000 UTC [ - ]
TinkersW 2021-08-17 02:42:28 +0000 UTC [ - ]
Just no! Pick one and stick with it.
adamnemecek 2021-08-16 17:53:59 +0000 UTC [ - ]
jeremycw 2021-08-16 17:07:57 +0000 UTC [ - ]
It's that all games essentially (and most software in general) boil down to: transform(input, current_state) -> output, new_state
Then, for some finite set of platforms and hardware there will be an optimal transform to accomplish this and it is our job as engineers to make "the code" approach this optimal transform.
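In code, that shape is just a pure step function (types invented for illustration): same input and state always produce the same output and next state.

```cpp
#include <utility>

struct Input  { float dx; };           // what the player did this frame
struct State  { int frame; float x; };
struct Output { float drawX; };        // what to render

// The whole game as transform(input, current_state) -> (output, new_state).
std::pair<Output, State> step(Input in, State s) {
    s.x += in.dx;   // advance the world
    s.frame += 1;
    return { Output{s.x}, s };
}
```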
bob1029 2021-08-16 19:25:18 +0000 UTC [ - ]
Something about this does not sit well with me.
Data is absolutely worthless if it is generated on top of a garbage schema. Having poor modeling is catastrophic to any complex software project and will be the root of all evil downstream.
In my view, the principal reason people hate SQL is because no one took the time to "build the world" and consult with the business experts to verify if their model was well-aligned with reality (i.e. the schema is a dumpster fire). As a consequence, recursive queries and other abominations are required to obtain meaningful business insights. If you took the time to listen to the business explain the complex journey that - for instance - user email addresses went down, you may have decided to model them in their own table rather than as a dumb string fact on the Customers table with zero historization potential.
Imagine if you could go back in time and undo all those little fuck ups in your schemas. With the power of experience and planning ahead, you can do the 2nd best thing.
jeremycw 2021-08-16 19:46:15 +0000 UTC [ - ]
nikki93 2021-08-17 00:05:41 +0000 UTC [ - ]
Just like arrays and structs, it's yet another data structure to be used in the general data-oriented approach, one that becomes useful because those creation / destruction patterns come up in games and adding and removing components is a great way to express runtime behavior as well as explore gameplay.
The "focus" on ECS may just come from it being an interesting space as of late vs. arrays, structs and for loops that have been around for ever, but it's mostly just an acknowledgement of common array, struct and for loop patterns that arise. There's also a lot out there about the systems part and scheduling and event handling but I think it's almost best to start out with simple procedural code (that then has access to the aforementioned data structure) and let patterns collect pertinent to the game in question.
One big aspect I personally dig is if you establish an entity / data schema you get scene saving, undo / redo, blueprint / prefab systems that are all quite useful and basically necessary if you want to collaborate with artists and game designers on a content-based game, and empowers them to express large spaces of possibilities without editing the code.
zarkov99 2021-08-17 00:37:17 +0000 UTC [ - ]
typon 2021-08-16 19:44:27 +0000 UTC [ - ]
Keyframe 2021-08-16 21:28:51 +0000 UTC [ - ]
"Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowcharts; they'll be obvious."
- Fred Brooks
dragonwriter 2021-08-17 07:25:58 +0000 UTC [ - ]
But this understanding is fundamentally, deeply wrong, in the same way that civil engineering based approaches to software engineering are wrong for most software applications.
That is: yes, all software systems are data transformation systems, but most software problems are not “how do I produce the system most narrowly tailored to the present requirements” but more often “how to engineer a system for success with the pace and kind of change that we can expect over time in this space”.
(Now, games, particularly, are both pushing the limits of hardware and fairly static, so making them narrowly-tailored, poorly adaptable static works is often not wrong. But that doesn't generalize to all, or even most, software.)
sitkack 2021-08-17 15:43:04 +0000 UTC [ - ]
Design principles can be applied to all implementation mechanisms.
WorldMaker 2021-08-16 19:24:14 +0000 UTC [ - ]
C# has some good tools to head towards that direction (async/await is a powerful Monadic transformer, for instance; it's not a generic enough transformer on its own of course, but an interesting start), but as this article points out most videogames work in C# today still has to keep the back foot in C/C++ land at all times and C/C++ mentalities are still going to clip the wings of abstraction work.
(ETA: Local maxima are still useful of course! Just that I'd like to point out that they can also be a trap.)
learc83 2021-08-16 20:51:55 +0000 UTC [ - ]
The quotes imply that this is a bad reason, but in soft realtime systems you often want complete control of memory allocation.
Even in the case of something like Unity--in order to give developers the performance they want--they've designed a subset of C# they call high-performance C#, where memory is manually allocated.
In most cases if you're using an ECS, it's because you care so much about performance that you want to organize most of your data around cache locality. If you don't care about performance, something like the classic Unity Game Object component architecture is a lot easier to work with.
unknownOrigin 2021-08-16 22:12:53 +0000 UTC [ - ]
waste_monk 2021-08-17 04:49:49 +0000 UTC [ - ]
Big respect to the work (and the people behind that work) that goes into getting modern AAA games to hit these targets.
WorldMaker 2021-08-16 22:28:05 +0000 UTC [ - ]
The "rule" that C/C++ is always "more performant" is just wrong. It's a bit of a sunk cost fallacy that because the games industry has a lot of (constantly reinvented) experience in performance optimizing C/C++ that they can't get the same or better benefits if they used better languages and higher abstractions. (It's the exact same sunk cost fallacy that a previous games industry generation said C/C++ would never beat hand-tuned Assembly and it wasn't worth trying.)
In Enterprise day jobs I've seen a ton of "high performance" C# with regular garbage collection. Performance optimizing C# and garbage collection is a different art than performance optimizing manually allocated memory code, but it is an art/science that exists. I've even seen some very high performance games written entirely in C# and not "high performance C#" but the real thing with honest garbage collection.
(It's a different art to performance optimize C# code but it isn't even that different, at a high level a lot of the techniques are very similar like knowing when to use shared pools or deciding when you can entirely stack allocate a structure instead of pushing it elsewhere in memory, etc.)
The implication in the discussion above is that a possible huge sweet spot for a lot of game development would actually be a language a lot more like Haskell, if not just Haskell. A lot of the "ECS" abstraction boils away into the ether if you have proper Monads and a nice do-notation for working with them. You'd get something of the best of both worlds that you could write what looks like the usual imperative code games have "always" been written in, but with the additional power of a higher abstraction and more complex combinators than what are often written by hand (many, many times over) in ECS systems.
So far I've not seen any production videogame even flirt with a language like Haskell. It clearly doesn't look anything like C/C++ so there's no imagination for how performant it might actually be to write a game in it (outside of hobbyist toys). But there are High Frequency Trading companies out there using Haskell in production. It can clearly hit some strong performant numbers. The art to doing so is even more different from C/C++ than C#'s is, but it exists and there are experts out there doing it.
Performance is a good reason to do things, but I think the videogames industry tends to especially lean on "performance" as a crutch to avoid learning new things. I think as an industry there's a lot of reason to avoid engaging more experts and expertise in programming languages and their performance optimization methodologies when it is far easier to train "passionate" teens extremely over-simplified (and generally wrong) maxims like "C++ will always be more performant than C#" than to keep up with the actual state of the art. I think the games industry is happiest, for a number of reasons, not exploring better options outside of local maxima and "performance" is an easily available excuse.
vvanders 2021-08-16 23:43:35 +0000 UTC [ - ]
I've seen impressive things done with Lua, from literate AI programming with coroutines to building composable, component-based language constructs instead of standard OOP. You have things like GOAL[1], which ran on crazy small systems (the Lua I saw ran in a 400kb block as well).
On performance, data-oriented design and efficient use of caches is how you get faster. I've done it in Java, I've done it in C#, I've done it in Rust and C++. Certain languages have better primitives for data layout, so you see gamedev gravitate toward them. We used to do things like "in-place seek-free" loading, where an object was serialized directly to disk and pointers were written as offsets that were fixed up post-load. Techniques like this easily net 10-30x performance benefits. It's the same reason database engines run circles around standard language constructs.
[1] https://en.m.wikipedia.org/wiki/Game_Oriented_Assembly_Lisp
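The offset-fixup trick described above can be sketched in a few lines. This is a hedged illustration, not any particular engine's loader: the `Node` layout, the `kNullOffset` sentinel, and the `fixup` helper are all invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// On disk, "next" is stored as a byte offset from the start of the
// blob; after loading, we patch it into a real pointer in place.
struct Node {
    int32_t value;
    union {
        uint64_t next_offset; // on-disk form
        Node*    next;        // in-memory form after fixup
    };
};

// Sentinel offset meaning "null pointer".
constexpr uint64_t kNullOffset = UINT64_MAX;

// Patch every node's offset into a pointer relative to `base`.
void fixup(char* base, const uint64_t* node_offsets, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        Node* n = reinterpret_cast<Node*>(base + node_offsets[i]);
        n->next = (n->next_offset == kNullOffset)
                      ? nullptr
                      : reinterpret_cast<Node*>(base + n->next_offset);
    }
}
```

Because the pointer overwrites the offset in place, the loaded blob needs no per-object allocation, parsing, or copying, which is where the "10-30x" class of wins tends to come from.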
WorldMaker 2021-08-17 00:16:23 +0000 UTC [ - ]
ehaliewicz2 2021-08-16 23:55:22 +0000 UTC [ - ]
account42 2021-08-17 10:35:29 +0000 UTC [ - ]
C/C++ hasn't beaten the performance of hand-tuned assembly - it has simply gotten close enough that the cost of hand-tuned assembly is not worth it in most cases.
pjlegato 2021-08-17 00:19:56 +0000 UTC [ - ]
It's arguably significantly less work to learn how to tune the GC and optimize it for your situation than it is to deal with manual memory allocation and all of its fallout.
renox 2021-08-17 06:20:22 +0000 UTC [ - ]
And no, I'm not joking: I work in C++ and I know exactly how annoying memory errors can be. Thanks a lot, valgrind|ASAN developers!
pjlegato 2021-08-18 17:56:46 +0000 UTC [ - ]
There is a common misconception that GC invariably precludes the construction of a "high performance" system, which is not true. If your use case allows you to not care as much about larger memory consumption -- 2x to 3x does seem like a reasonable first approximation of "larger" -- then GC is indeed a viable option for building "high performance" systems.
This case is not uncommon. Not everyone is targeting a memory constrained console or embedded system.
In many (though of course not all) cases, the tradeoff is well worth it -- consume more memory at runtime, spend some time tuning the GC, and in exchange developers can ship a product faster, by having to spend significantly less time dealing with manual memory allocation.
learc83 2021-08-19 01:24:40 +0000 UTC [ - ]
Ignoring the amount of memory used, GC tuning in a managed language doesn't give you the flexibility to control memory layout that's needed for maximum cache locality.
>If your use case allows you to not care as much about larger memory consumption -- 2x to 3x does seem like a reasonable first approximation of "larger" -- then GC is indeed a viable option for building "high performance" systems.
And not ignoring the amount of memory used: in the context of this thread (video games, specifically "high performance" video games), 2x to 3x is almost never going to be acceptable.
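The layout control being argued over here is essentially the array-of-structs versus struct-of-arrays distinction. A minimal sketch in C++ with invented names; this is the kind of choice that's hard to express when the runtime owns object layout and scatters references across the heap:

```cpp
#include <cstddef>
#include <vector>

// Array-of-structs: each entity's fields are interleaved, so a pass
// that only reads positions still drags velocities and health through
// the cache with every line it loads.
struct EntityAoS {
    float px, py;
    float vx, vy;
    float health;
};

// Struct-of-arrays: each field is contiguous, so a position-only pass
// touches only the cache lines it actually needs.
struct EntitiesSoA {
    std::vector<float> px, py;
    std::vector<float> vx, vy;
    std::vector<float> health;
};

// A movement system reads four contiguous streams and never pulls
// `health` into the cache at all.
void integrate(EntitiesSoA& e, float dt) {
    for (size_t i = 0; i < e.px.size(); ++i) {
        e.px[i] += e.vx[i] * dt;
        e.py[i] += e.vy[i] * dt;
    }
}
```

In a managed language the equivalent is usually arrays of primitives rather than arrays of objects, which works but cuts against the grain of the type system.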
gmueckl 2021-08-17 22:40:25 +0000 UTC [ - ]
learc83 2021-08-17 15:18:26 +0000 UTC [ - ]
Who said it's a rule? What C/C++ gets you is the ability to manually allocate memory without jumping through hoops.
> Performance optimizing C# and garbage collection is a different art than performance optimizing manually allocated memory code, but it is an art/science that exists. I've even seen some very high performance games written entirely in C# and not "high performance C#" but the real thing with honest garbage collection.
Performance optimizing C# with garbage collection for high performance soft-realtime systems (I've done it) relies on tricks like object pooling to avoid triggering the GC, along with avoiding many of the more advanced language features. Even then you don't get the same level of control. I'm also almost completely certain that the high performance C# games you're talking about aren't using C# for the engine, but feel free to provide examples so I can take a look.
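The pooling trick mentioned here translates across languages. A minimal free-list pool, sketched in C++ for illustration (the `Pool` class and its API are invented, and assume trivially-constructible types); the C# version of the same idea reuses pooled objects so the GC never sees garbage:

```cpp
#include <cstddef>
#include <new>

// A fixed-capacity pool: acquire/release never touch the allocator
// (the analogue of "never produce garbage" in a GC'd language).
// Free slots are chained into an intrusive free list.
template <typename T, size_t N>
class Pool {
    union Slot {
        T obj;
        Slot* next_free;
        Slot() : next_free(nullptr) {}
    };
    Slot slots_[N];
    Slot* free_head_;

public:
    Pool() {
        for (size_t i = 0; i + 1 < N; ++i)
            slots_[i].next_free = &slots_[i + 1];
        slots_[N - 1].next_free = nullptr;
        free_head_ = &slots_[0];
    }

    // Returns nullptr when the pool is exhausted.
    T* acquire() {
        if (!free_head_) return nullptr;
        Slot* s = free_head_;
        free_head_ = s->next_free;
        return new (&s->obj) T{}; // construct in place
    }

    void release(T* p) {
        p->~T();
        Slot* s = reinterpret_cast<Slot*>(p);
        s->next_free = free_head_;
        free_head_ = s;
    }
};
```

The cost being traded away is exactly the "level of control" point: the pool caps allocation latency, but the language still decides where the pooled objects live.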
If your game (or parts of your game) doesn't need the performance that comes with a higher degree of memory layout control, then by all means use whatever tools you want to.
I've written game logic in C#, F#, Ruby, Haxe, Python, Lua, Java, JavaScript and Elixir.
>The implication in the discussion above is that a possible huge sweet spot for a lot of game development would actually be a language a lot more like Haskell, if not just Haskell.
There almost certainly is for game logic. Many modern game engines provide higher level scripting languages.
However, if what you are working on is in that sweet spot, you likely didn't need an ECS to begin with and a classic component architecture would have probably been a lot easier to deal with.
>But there are High Frequency Trading companies out there using Haskell in production.
HFT is not game dev. "Performance" in HFT doesn't mean the same thing as performance in games.
I haven't used Haskell specifically, but I've toyed with using Elixir for gamedev. Its reliance on linked lists makes it extremely difficult to iterate quickly enough. There are workarounds, of course, but the workarounds remove most of what is nice about Elixir in the first place.
>Performance is a good reason to do things, but I think the videogames industry tends to especially lean on "performance" as a crutch to avoid learning new things. I think as an industry there's a lot of reason to avoid engaging more experts and expertise in programming languages and their performance optimization methodologies when it is far easier to train "passionate" teens extremely over-simplified (and generally wrong) maxims like "C++ will always be more performant than C#" than to keep up with the actual state of the art. I think the games industry is happiest, for a number of reasons, not exploring better options outside of local maxima and "performance" is an easily available excuse.
The average engine coder writing high performance code in C++ isn't a "passionate teen". They are experienced software engineers who want to stick as close to the metal as they feasibly can.
The games industry (outside of AAA games) also has an extremely low barrier to entry, and it's something that nearly every programmer has thought about doing at some point--if Haskell turns out to be a fantastic language for making games, it will almost certainly happen sooner or later.
WorldMaker 2021-08-17 22:29:34 +0000 UTC [ - ]
Statistically the median age in the games industry is 25 and always has been. It's a perpetually young industry not known for retaining experienced talent. I know that the median alone doesn't tell you much about how long a tail of senior talent there is (you need the standard deviation for that), but given what I've seen as mostly an outside observer with a strong interest, the burnout rate in the industry remains as high as ever, and senior developers with decades of experience are more likely an anomaly than the norm. In terms of anecdata, all of the senior software developers whose careers I've followed on blogs and/or LinkedIn are in management positions or entirely different industries after 30. I realize my sample is biased by the people I chose to follow (for whichever reason) and anecdata is not data, but it's really hard for me to square "experienced software engineers" with "in practice, it looks like no one over 30".
learc83 2021-08-19 01:33:44 +0000 UTC [ - ]
Where are you getting this information from? The only hard data I can find is from self-selected survey responses, but this survey from IGDA shows only 10% of employed game developers are under 25 [1]. My guess is that (as you've acknowledged is possible) there's some serious selection bias going on. You said you have an interest in burnout, so I'm guessing you're more likely to follow/notice game devs who discuss this topic. This group is more likely to be suffering from burnout, I'd wager.
Another poster already mentioned that engine devs (the ones writing most of the C++) tend to be older than the industry average.
1. https://s3-us-east-2.amazonaws.com/igda-website/wp-content/u...
gmueckl 2021-08-17 22:47:59 +0000 UTC [ - ]
BobbyJo 2021-08-16 18:50:10 +0000 UTC [ - ]
Doesn't the former enable the latter? Ideally, language (both human and machine) would have the semantics needed to represent all transforms, but that's not the case. Code you rely on, since none of it is written in isolation, needs to enable you to implement data-oriented design should you so choose.
Also, I don't think pointing out that 'all games are essentially...' is particularly useful. It's true, no question, but that doesn't mean it's the most useful mental model for people to use when developing software. Our job as engineers is to make software that functions according to some set of desires, and those desires may directly conflict with approaching an optimal transform.
jeremycw 2021-08-16 19:30:52 +0000 UTC [ - ]
Not necessarily. ECS is a local maximum when developing a general purpose game engine. Since it's general purpose, it can do nothing more than provide a lowest-common-denominator interface that can be used to make any game. If you are building a game from scratch, why would you limit yourself to a lowest-common-denominator interface when there's no need? Just write the exact concrete code that needs to be there to solve the problem.
> Our job as engineers is to make software that functions according to some set of desires, and those desires may directly conflict with approaching an optimal transform.
All runtime desires of the software must be encoded in the transform. So no software functionality should get in the way of approaching the optimal transform. What does get in the way of approaching the optimal transform is code organization, architecture and abstraction that is non-essential to performing the transform.
BrS96bVxXBLzf5B 2021-08-16 20:00:54 +0000 UTC [ - ]
Good luck with that when the exact code to solve the problem is not the exact code the next week, because the problem has changed or evolved.
Not to suggest an ECS is the answer, but this line of thinking is reductive of the realities of creating a piece of art. It's not a spec you can draw a diagram for and trust will stay basically the same. It's a creature you discover, revealing more of itself over time. The popularity of ECS comes from the accessible composition it provides. It's not the only way of composing data, but being able to say "AddX", "RemoveX" without the implementation details of what struct holds what data and what groupings might matter is what makes it appealing.
meheleventyone 2021-08-17 08:04:43 +0000 UTC [ - ]
What you’re basically saying is a solution should be flexible to change because making a game requires trial and error. I totally agree with that.
Using a general solution is one path to flexibility, but it does come with a cost. It's flexibility built on a tower of complexity, and if you look at a modern ECS implementation that is performant, it's actually quite a lot of complexity. You're also reducing flexibility in the sense that these sorts of solutions generally have preferred patterns you need to fit your game design into. So you end up introducing a learning, maintenance and conceptual burden into the project you might not need.
OTOH if you have a specific problem you can write a specific solution for you will end up with less code, hopefully in a conceptually coherent form. That in itself offers flexibility. Simple code you can easily replace is often more flexible than complex code you need to coax into a new form.
The key is to recognise whether the problem you need flexibility for is specific or general.
These architectural patterns are fun to argue over and obsessed over by armchair game developers but are a trap if you’re trying to make a game rather than a general purpose game engine.
Which isn’t to say you don’t want some framework underlying things for all sorts of mundane reasons. But most games could get away with that being an entity type that gets specialised rather than anything more complex.
BrS96bVxXBLzf5B 2021-08-17 08:35:14 +0000 UTC [ - ]
Agreed with the 'flexibility on a tower of complexity', 100%! :) I was trying not to appear too dogmatic by describing it as 'accessible composition'; generally any solution that is 'accessible' is also broad enough that it has as many flaws as benefits, and an ECS definitely isn't an exception.
> These architectural patterns are fun to argue over and obsessed over by armchair game developers but are a trap if you’re trying to make a game rather than a general purpose game engine.
Again, agreed. Speaking from experience as an iterator and rapid prototyper who has used an ECS for years: I've been bitten by the complexity, but I haven't been able to beat the flexibility of being able to just write something like `entity->Add<ScaleAnimation>(...)`, `entity->Add<DestroyAfter>(...)`, `entity->Add<Autotranslate>(...)`, `entity->Add<Sprite>(...)` to quickly and easily create a thing that looks nice, pops in smoothly, moves effortlessly, and destroys itself thoughtlessly. It lets you move between ideas quickly, and then you can pivot to addressing concerns if any show up.
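The `entity->Add<...>()` style described above can be approximated with very little machinery. A hedged sketch, not any real ECS: a type-indexed map (here `std::type_index` keys and type-erased `shared_ptr<void>` values) gives the composition API while ignoring iteration order, archetypes, and cache layout entirely. The `Entity`, `Sprite`, and `DestroyAfter` names are invented for the example.

```cpp
#include <memory>
#include <typeindex>
#include <unordered_map>
#include <utility>

// Minimal type-indexed component bag: Add<T>/Get<T>/Remove<T> give the
// "accessible composition" described above, with none of a real ECS's
// archetype or cache-layout machinery.
class Entity {
    std::unordered_map<std::type_index, std::shared_ptr<void>> components_;

public:
    template <typename T, typename... Args>
    T* Add(Args&&... args) {
        auto p = std::make_shared<T>(std::forward<Args>(args)...);
        components_[std::type_index(typeid(T))] = p;
        return p.get();
    }

    template <typename T>
    T* Get() {
        auto it = components_.find(std::type_index(typeid(T)));
        return it == components_.end()
                   ? nullptr
                   : static_cast<T*>(it->second.get());
    }

    template <typename T>
    void Remove() {
        components_.erase(std::type_index(typeid(T)));
    }
};
```

This buys the fast iteration on ideas described in the comment; the storage and locality questions from the top of the thread only start to matter once systems iterate over many entities per frame.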
meheleventyone 2021-08-17 10:06:27 +0000 UTC [ - ]
BobbyJo 2021-08-16 22:07:30 +0000 UTC [ - ]
There is a need: the limits of the human mind. Nobody can model an entire (worthwhile) game in their head, so unless you plan on recursively rewriting the entire program as each new oversight pops up, you aren't going to get anywhere near optimal anyway.
debaserab2 2021-08-16 21:25:44 +0000 UTC [ - ]
Coming from the realm of someone who has mostly swum in the OO pool for their career, I struggle to understand how a concrete implementation of something like a video game wouldn't spiral out of control quickly without significant organization and some amount of abstraction overhead. That said, I have found ECS-type systems to be so general purpose that you end up doing a lot of things to please the ECS design itself rather than focusing on the implementation.
Do you have any examples of games and/or code that are written in more of a data oriented way? I'd really love to learn more about this approach.
jeremycw 2021-08-16 22:01:34 +0000 UTC [ - ]
debaserab2 2021-08-16 22:19:34 +0000 UTC [ - ]
throwaway17_17 2021-08-16 22:07:32 +0000 UTC [ - ]
[1] https://www.youtube.com/watch?v=rX0ItVEVjHc
debaserab2 2021-08-16 22:20:06 +0000 UTC [ - ]
typon 2021-08-16 19:49:51 +0000 UTC [ - ]
This Mike Acton post describes it accurately: http://www.macton.ninja/home/onwhydodisntamodellingapproacha...
nikki93 2021-08-17 00:12:38 +0000 UTC [ - ]
An example is the prefab hierarchy you get in Unity, which is expressed through the data (prefabs and their relationships). (Note: I mean specifically the prefab inheritance hierarchy, not the transform spatial hierarchy -- the former has more overlap with the "is a" relationships). The code processing this hierarchy could've just been plain C code that parses the files and maintains an in-memory set of structures about them, even. You then get to define how properties inherit, what overriding means, etc. yourself.
typon 2021-08-17 05:14:13 +0000 UTC [ - ]