I'm teasing apart the stream of data sent by the main board of a Canon Pixma MX340 multi-function inkjet to its control panel. I've separated out its 196-byte bulk data transfers from 2-byte command sequences. The bulk data transfers are LCD screen bitmaps, so I can ignore them for now and focus on functional commands.

The main problem here is that I never managed to find a data sheet or useful reference material for the main chip on the control panel, marked as NEC K13988. So these command sequences are opaque bytes picked out by my Saleae logic analyzer. A few of these immediately changed machine behavior so I could make guesses about what they mean, but the rest are just a mystery.

I thought I had a huge challenge on my hands, trying to build a state machine to parse a language without knowing the vocabulary or syntax. After drawing a few diagrams on scratch paper, I noticed they all ended up as a straightforward pattern matching exercise. Well, it would be much easier to treat the problem that way, and I should always try that easy thing first.

Logically this would be a switch statement, but since I'm working in Python, I thought I would try to be a bit more clever and use existing data structure infrastructure instead of writing my own. I thought a Python dictionary could do the job. I feed it a command sequence and ask if it's one that I've already seen. The minor twist is that I build up my command sequence in a list as bytes arrive on the serial port, but a list is not a valid data type for a dictionary key, because keys must be hashable, which in practice means an immutable data type.
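As a rough illustration of the dictionary idea, here is a minimal sketch. The byte values, the descriptions, and the names `KNOWN_SEQUENCES`, `describe`, and `usable_as_key` are all made up for this example; they are not values captured from the printer or names from the actual project.

```python
# Hypothetical lookup table: keys are tuples of bytes, values are labels.
# These byte values are placeholders, NOT real K13988 commands.
KNOWN_SEQUENCES = {
    (0x04, 0x42): "example command A",
    (0x06, 0x10): "example command B",
}


def usable_as_key(candidate):
    """Return True if Python accepts candidate as a dictionary key."""
    try:
        {candidate: None}
        return True
    except TypeError:
        # Raised for unhashable types such as list.
        return False


def describe(sequence_bytes):
    """Look up a sequence that was accumulated in a list."""
    key = tuple(sequence_bytes)  # a list is unhashable, a tuple is fine
    return KNOWN_SEQUENCES.get(key, "unknown")


received = [0x04, 0x42]  # built up byte by byte, so it starts as a list
print(usable_as_key(received), usable_as_key(tuple(received)))
print(describe(received))
```

Using `dict.get` with a default keeps never-before-seen sequences from raising `KeyError`, which seems handy while most of the vocabulary is still unknown.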

The first workaround, then, is to convert a list into its immutable counterpart, which Python calls a tuple. This mostly worked, but the list-to-tuple conversion has a subtle special case when converting a list of tuples (each two-byte command sequence is a tuple) to a tuple of tuples: it misbehaves when the original list has only a single entry. It looks like somewhere along the line, a tuple whose single entry is another tuple gets collapsed into just the inner tuple. I don't fully understand what's going on, but I was able to rig up a second workaround to make the dictionary lookup happen.
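Without seeing the exact conversion code it is hard to say what caused the collapse, but two well-known Python behaviors produce this symptom, and both are shown in the sketch below. Parentheses alone do not create a tuple (the trailing comma does), and `tuple(x)` iterates over `x` rather than wrapping it. The byte values and the helper `to_key` are hypothetical, added only for illustration.

```python
pair = (0x04, 0x42)  # placeholder bytes, not a real command

# Parentheses only group an expression, so this is still the flat 2-tuple.
grouped = (pair)

# A trailing comma is what makes a one-element tuple of tuples.
wrapped = (pair,)

# tuple() iterates over its argument, so a lone tuple comes back flat...
flattened = tuple(pair)

# ...while a one-entry list of tuples keeps its nesting.
from_list = tuple([pair])


def to_key(commands):
    """Hypothetical helper: turn a list of byte pairs into a hashable key.

    Nesting is preserved even when the list has a single entry.
    """
    return tuple(tuple(command) for command in commands)


print(len(grouped), len(wrapped), len(flattened), len(from_list))
print(to_key([[0x04, 0x42]]) == wrapped)
```

If either pattern is behind the single-entry case, building the key in one place with a helper like `to_key` would sidestep it, though this is a guess rather than a diagnosis of the original code.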

Once that was up and running, I could successfully look up the LCD screen update sequence and collapse that sequence of commands, including its 5 bulk data transfers, into a single line on my console output. This is a great start! Now I can proceed to fill in the rest.


Source code for this quick-and-dirty data parsing project is publicly available on GitHub.

This teardown ran far longer than I originally thought it would. Click here to rewind back to where this adventure started.