In the short time we've had usable generative AI systems, they've quickly evolved from "obviously nonsense but there is an outline of an idea" to "superficially fine but nonsense beyond the surface". The result of asking an image generator to design a rover has improved from a jumble of pixels to something that looks like a machine at first glance but, upon closer inspection, couldn't possibly work. These systems are evolving rapidly, so I'll check back in a few months to see what progress they've made.

In the meantime, today's systems may be usable if I ignore mechanical functionality and focus on appearance. For this second round, I'm asking Microsoft Bing Image Creator (powered by OpenAI DALL-E) to design a cute mascot for the Sawppy project, hoping for something like the mascot for the Mars 2020 rover naming contest. I gave it the prompt:

Mars rover with a rectangular smiling head and six wheels holding a sign that says "Build Your Own Rover!" in hand-drawn cartoon style on a white background.

And here are the results:

Contestant #1 comes across as a little creepy because it seems to have two faces: one on the front of the body and another on top of the mast. It also has only four wheels instead of the six I asked for.

Contestant #2 at least has only a single face, and a friendlier-looking one, but again it has only four wheels, and the suspension linkages are missing entirely, leaving the body to float in midair. Mars has gravity, so this won't work. The sign also skipped the word "own" for some reason, though if that were the only flaw, it would be easy to fix in a photo editor.

Contestant #3 has a single face and a sign with all the words. Still only four wheels, but at least they're connected with mechanical-looking linkages instead of a cartoon arc or nothing at all.

The good news with contestant #4 is that it has more than four wheels. The bad news is that it has five. I guess the AI judged this to be a fair compromise between four and six wheels? Only three of these wheels have visible suspension linkages, and they're connected to the outside of the wheel instead of the center. Perhaps the AI had landing struts and pads in mind, and mistakenly thought replacing the pads with wheels would work equally well. An additional data point is that the "five wheels" and "attached to tires" problems also came up in another rover design drawn from Quinn Morley's prompt. (See yesterday's post.) This is not an accident... something in DALL-E is intentionally doing this, but why?

I was going to critique this entry for lacking a smile, until I noticed there are little arcs on the front of the body. They're the wrong distance from the eyes on top of the mast to make a smiling face, but I guess that was satisfactory for an AI "does it have a smiling mouth Yes/No" checklist.


Looking at these as a group, I noticed they're all drawn at the same three-quarter view angle in an orthographic projection with almost no perspective distortion. (The heads of #3 and maybe #1 show some perspective on their sides.) That was not part of my prompt, and I'm curious whether it is typical of "hand-drawn cartoon style".

I like telling the generative engine to draw in cartoon style because it reduces a lot of visual noise and mitigates the uncanny valley effect of generators getting little details wrong. I think I'll start with "cartoon style" for my image generator sessions unless I have a reason to do otherwise.

I also noticed all of these rovers have a boxy body on top of wheels and a boxy head on top of a mast, so it understood that much about the robots sent to Mars. But its training set must be dominated by vehicles on Earth, or at least that's my hypothesis for its obsession with four wheels instead of the six I asked for.

None of these images are good enough to be the new Sawppy project mascot, but they're very close. I'll try again later. Bing beat Google to the punch on this one, but Google is working on an answer. Adobe also has a limited free tier for their Adobe Firefly product. I'm confident there will be more options in a few months. This was a fun distraction and good enough to let my brain think up a solution to my recent circuit board analysis problem.