Window Shopping Google ARCore: Tracking

I started learning about Google ARCore SDK by reading the "Fundamental Concepts" document. I'm not in it for augmented reality, but to see if I can adapt machine vision capabilities for robotics. So while there are some interesting things ARCore could tell me about a particular point in time, the real power is when things start moving and ARCore works to track them.

The Trackable object types in the SDK represent data points in three dimension space that ARCore, well, tracks. I'm inferring these are visible features that are unique enough in the visible scene to be picked out across multiple video frames, and whose movement across those frames were sufficient for ARCore to calculate its position. Since those conditions won't always hold, individual points of interest will come and go as the user moves around in the environment.

From there we can infer such an ephemeral nature would require a fair amount of work to make such data useful for augmented reality apps. We'd need to follow multiple feature points so that we can tolerate individual disappearances without losing our reference. And when new interesting features comes on to the scene, we'd need to decide if they should be added to the set of information followed. Thankfully, the SDK offers the Anchor object to encapsulate this type of work in a form usable by app authors, letting us designate a particular trackable point as important, and telling ARCore it needs to put in extra effort to make sure that point does not disappear. This anchor designation apparently brings in a lot of extra processing, because ARCore can only support a limited number of simultaneous anchors and there are repeated reminders to release anchors if they are no longer necessary.

So anchors are a limited but valuable resource for tracking specific points of interest within an environment, and that led to the even more interesting possibilities opened up by ARCore Cloud Anchor API. This is one of Google's cloud services, remembering an anchor in general enough terms that another user on another phone can recognize the same point in real world space. In robot navigation terms, it means multiple different robots can share a set of navigation landmarks, which would be a fascinating feature if it can be adapted to serve as such.

In the meantime, I move on to the ARCore Design Guidelines document.