Recently there has been a lot of excitement around several fields of machine learning that broke out of their research paper phase, meaning they got good enough for everyday people to see what they could do without having to chew through dense research lingo. Most of the hype centered on the large language models powering ChatGPT and friends. Generative systems have also come of age. I played around with image generators briefly, but there are also audio and video generators out there.

None of those systems is specifically applicable to a rover brain, so while they are technically interesting, I didn't see anything I wanted to pursue yet. I knew it was only a matter of time, though, before those technologies built up enough infrastructure to support rover-applicable machine learning systems. That might have just happened with the launch of LeRobot by Hugging Face.

According to Wikipedia, Hugging Face the company has a few related divisions. One of them is Hugging Face Hub, a repository of machine learning models and datasets. It is sort of like how GitHub is a repository of source code, but at a higher conceptual level and focused on machine learning. Models on Hugging Face Hub are tagged by topic. If I click on any of the "Natural Language Processing" tags today, it returns tens of thousands of models. But if I click on "Robotics", there are only 27 models. LeRobot is an effort by Hugging Face to jumpstart this field, and I'll be watching to see if it takes off or fizzles out.

Rerun.io

Regardless of LeRobot's success or failure, a quick skim through its documentation taught me a lot, and I didn't even follow all of its links to other projects. Of those I looked at, the most promising gem is the data visualization tool Rerun.io. Skimming through its documentation and examples, Rerun sounded like something right out of my dreams: a time-based data visualization tool that not only plots 2D graphs like Grafana, but also visualizes data in 3D space. From simple spinning LIDAR to depth cameras, Rerun claims to put all data into a single time-indexed visualization. I will definitely take a closer look in the near future.