Notes on ML-Agents Development History (Part 1: Up to Version 1.0)

I've just installed and tested the basic functionality of Unity ML-Agents Release 18. Just before that, I did the same with Release 2, which is also referred to as "Verified 1.0.8". I was surprised by how much had visibly changed between these two releases. This made me curious about how the package evolved, so I went looking for information from its past.

Most releases were announced on the Unity blog, but some only had GitHub release notes. Here is a compilation of links alongside a few highlights that caught my eye. Follow the links for a complete list of changes:

2017/6/26: The earliest public information I could find was Unity announcing their intent to join in AI research and applications. Annoyingly, some of the linked blog posts have since disappeared, apparently in some sort of migration of their blog hosting system. For example, the "second part of this blog series" link now leads to a 404 error.

2017/9/18: The ML-Agents Toolkit officially kicks off with version 0.1, describing a general architecture that I'm sure has since evolved and a long list of ambitious ideas they wanted to support. Many of them did come to be! Though of course not all of them, and some have since disappeared.

2017/12/8: Version 0.2 introduced curriculum learning, and launched a community challenge to motivate people to play with the toolkit.

2018/3/15: Version 0.3 introduced imitation learning, multi-brain training, and an optional poll model. Recurrent Neural Networks came in as part of a "Memory-Enhanced Agents" umbrella.

2018/6/18: Version 0.4 allowed training using the Unity editor, no longer requiring a compiled executable. A Udacity nanodegree was introduced, though sadly that's too rich for my blood. More training environments were added, and one of them (Pyramids) specifically demonstrates the "Curiosity" capability. Curiosity got its own blog post.

2018/9/11: Version 0.5 added a Gym interface and replicated a few environments from OpenAI Gym. It also expanded the capability to enable/disable discrete actions, but it's not clear whether that was related to OpenAI Gym.

2018/12/17: Version 0.6 is an architectural revamp changing how ml-agents AI brains fit in the Unity object hierarchy. It also introduced a "demonstration recorder" for off-line imitation learning. Is that still around?

2019/3/1: Version 0.7 is another big infrastructure change, switching runtime neural network inference from external TensorFlowSharp to Unity's own Inference Engine (a.k.a. Barracuda) to support more Unity runtime platforms.

2019/4/15: Version 0.8 is an infrastructure change that allows multiple Unity simulations to run in parallel on a single machine. Strangely, this is the recommended approach for taking advantage of machines with many processing cores. (Later research found that Unity is working to improve multicore performance across the board, not just for ml-agents, with something called DOTS.)

2019/8/1: Version 0.9 (release notes) is the first of two releases focused on throughput and efficiency.

2019/9/30: Version 0.10 finished what 0.9 started: improving sample throughput (asynchronous environments) and sample efficiency via the GAIL (0.9) and SAC (0.10) algorithms.

2019/11/4: Version 0.11 (release notes) once again changed the brain's place in the Unity object hierarchy.

2019/12/2: Version 0.12 (release notes) moved from TensorFlow 1 to 2 via the TF1-compatible interfaces. It appears this work was never finished: ml-agents moved to PyTorch instead of completing the TF2 migration.

2020/1/8: Version 0.13 (release notes)

2020/2/28: Version 0.14 added the ability to train via adversarial self-play. The announcement includes a short history of learning from self-play.

2020/3/6: Not a version, but this is when ml-agents got serious enough to get a course up on Unity Learn (Hummingbirds) as well as an "AI for Beginners" course on Unity Learn Premium.

2020/3/18: Version 0.15 (release notes) wrapped up a lot of housekeeping in preparation for the 1.0 release.

After the 1.0 release, development focus for ml-agents shifted toward refinement, with a corresponding reduction in blog announcements.