IL2CPP is Unity’s AOT solution for C#, which takes compiled C# in the form of IL-code and turns it into C++ code, which can then be compiled using regular native toolchains. I have looked at a lot of this generated C++ code and the machine code it compiles to. There is a lot I’d like to tell you about IL2CPP: what it does, how it could do what it does better, why Burst is often still faster than IL2CPP, why that comparison is misleading etc. However, that is not for today.
Today I want to tell you about a tool I am writing, which is currently going by cpp2better
. You insert this little tool into your build process and the generated C++ code magically gets better. No work needed on your side. More concretely, cpp2better
parses the generated C++ and then makes a million small changes to it to reduce waste. cpp2better
is meant as a final optimization step for master builds and can add a minute of build time. (It is also incompatible with Unity’s cloud build, because you need control over the Unity installation to integrate it into your build.)
cpp2better
affects the size of the resulting binary and the speed of the code. All are improved. All numbers below assume that you build with MasterWithLTCG
(because shipping a master build without LTCG when using IL2CPP seems unreasonable):
- Binary sizes:
- With IL2CPP set to optimize runtime speed, I have seen binary size drops of 20%-30%. (I have seen drops from 240MB to 160MB, and from 64MB to 44MB, for example.)
- With IL2CPP set to optimize binary size, I have seen binary size drops of 10%-15%.
- Runtime performance: depends heavily on the code. Some code gets dramatically faster (almost competitive with Burst), other code gets a good bit faster (50%), and some code doesn’t get faster.
- In a real world example, I have seen savings of up to 2ms of frame time for a reasonably optimized game on PS5 (original frame times around 16ms) that has a good mixture of native engine code, Burst code, and IL2CPP code.
The numbers for runtime performance are unfortunately hard to quantify on average. A typical Unity game has native engine code, Burst code, and IL2CPP code. If your game has moved all of its code into Burst, then speeding up IL2CPP makes no difference, for example. In a synthetic example, I have taken one of Unity’s DOTS samples, disabled Burst, and compared numbers. Across the entire frame, cpp2better
produced between 20% and 25% better frametimes, but a good chunk of the code is still native engine code.
The original motivation for cpp2better
are games that use DOTS but cannot use Burst everywhere. This is the scenario that I have optimized for the most so far, and this is where I would expect to see the largest gains: all the code between Burst calls and all the ECS systems that take a lot of time to move to Burst. In particular, cpp2better
does not intend to replace Burst for jobs. Burst is likely still going to give you better codegen for math-heavy code (float3
and bool4
, in particular).
Cards on the table: I have many more ideas of what to do here, and I would have more time for this if I could turn this into a product I can sell. If this is something you are interested in, please get in touch. I am also looking for studios with either mobile or Switch games to test this on. The tool is running in real games already, but so far only on PC, XBox, and PlayStation.