Pairing an AMG8833 thermal sensor with an Adafruit Memento camera gave me a thermal camera, but my code was running quite slowly. I found an example illustrating use of NumPy (specifically the ulab.numpy subset) to interpolate data from the AMG8833's sensor grid onto a larger grid, and adapted it to my project. My performance marker timers say this resulted in a total of ~320ms per frame, or roughly 3 frames per second. Here's an excerpt from rendering four frames (times in microseconds):

read 38028 scaled 596 mapped 1520 blit 27626 grid 224501 refresh 24528 total 316799
read 38237 scaled 596 mapped 1520 blit 28789 grid 223636 refresh 24438 total 317216
read 38296 scaled 566 mapped 1580 blit 27567 grid 226170 refresh 24438 total 318617
read 38356 scaled 626 mapped 1728 blit 28849 grid 198901 refresh 24587 total 293047
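
For reference, per-stage numbers like these can be collected by recording a timestamp after each stage and printing the differences. The following is a minimal sketch, not the project's actual timing code: the StageTimer class and the stand-in workloads are hypothetical, and it assumes time.monotonic_ns() is available and that the logged values are microseconds (consistent with a total near 317000 for a ~320ms frame).

```python
import time


class StageTimer:
    """Hypothetical helper: records a timestamp after each named stage."""

    def __init__(self):
        self.start = time.monotonic_ns()
        self.marks = []  # list of (label, timestamp in nanoseconds)

    def mark(self, label):
        self.marks.append((label, time.monotonic_ns()))

    def report(self):
        """Return a line like 'read 123 scaled 45 total 168' (microseconds)."""
        parts = []
        previous = self.start
        for label, stamp in self.marks:
            parts.append("{} {}".format(label, (stamp - previous) // 1000))
            previous = stamp
        parts.append("total {}".format((previous - self.start) // 1000))
        return " ".join(parts)


timer = StageTimer()
sum(range(1000))      # stand-in for reading the sensor
timer.mark("read")
sorted(range(1000))   # stand-in for scaling the readings
timer.mark("scaled")
line = timer.report()
print(line)
```

Printing one line per frame keeps the log easy to compare by eye, which is how the grid stage stands out above.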

More important than the interpolation itself was having an example from which I could learn NumPy. My takeaway is to avoid writing loops that iterate through arrays as much as possible. Almost every performance win here boils down to replacing a tight loop with a single array operation.
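
As a small illustration of that takeaway, here is a sketch that maps temperatures onto a 0-255 range twice: once with an explicit loop and once as a single array expression. The temperature values and the 20-36 degree range are made up for the example, and the import falls back to desktop NumPy so the snippet can also be tried off-device.

```python
try:
    from ulab import numpy as np  # CircuitPython
except ImportError:
    import numpy as np            # desktop Python, for experimenting

MIN_C = 20.0  # illustrative range limits, not taken from the project
MAX_C = 36.0

temps = np.array([20.0, 24.0, 28.0, 36.0])

# Loop version: one Python-level iteration per element.
scaled_loop = []
for t in temps:
    scaled_loop.append((t - MIN_C) * 255.0 / (MAX_C - MIN_C))

# Array version: the same arithmetic as one expression over the whole array.
scaled_np = (temps - MIN_C) * 255.0 / (MAX_C - MIN_C)

print(scaled_loop)
print(scaled_np)
```

Both versions compute the same values; the difference is where the iteration happens.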

Bitmap as NumPy Array

The biggest win was converting my thermal overlay drawing commands into a single NumPy operation. The critical part is creating an ndarray view on top of the existing bitmap data in order to avoid copying its bits around.

output_ndview = np.frombuffer(output_bitmap,dtype=np.uint16).reshape((240,240))

This was the key allowing me to describe large scale bitmap operations without having to write my own for loops to iterate over x,y coordinates. The loops are still happening, of course, but now they're within fast native code free of Python runtime overhead.
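
To make the view behavior concrete, here is a minimal sketch that uses a plain bytearray as a stand-in for the bitmap's pixel memory. That stand-in is an assumption made so the snippet can run without a display; in the project the buffer is output_bitmap. The value 0xF800 is pure red in RGB565, chosen only for illustration.

```python
try:
    from ulab import numpy as np  # CircuitPython
except ImportError:
    import numpy as np            # desktop Python, for experimenting

WIDTH = 240
HEIGHT = 240

# Stand-in for the bitmap: two bytes per 16-bit pixel.
pixel_buffer = bytearray(WIDTH * HEIGHT * 2)

# View the same memory as a 240x240 array of 16-bit values (no copy).
view = np.frombuffer(pixel_buffer, dtype=np.uint16).reshape((HEIGHT, WIDTH))

# Fill a 30x30 block in the top-left corner with one slice assignment
# instead of nested loops over x and y.
view[0:30, 0:30] = 0xF800

# Count the bytes that changed in the original buffer.
changed = sum(1 for b in pixel_buffer if b != 0)
print(changed)
```

Because the array is a view rather than a copy, the write through the array shows up in the underlying buffer with no separate copy-back step.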

Subset Blues

I knew ulab.numpy was a subset of full NumPy and was curious whether I would miss the parts left out, or whether they would be too esoteric for me to notice. The answer is the former: even as a beginner I quickly ran into situations where I found a NumPy answer on something like a Stack Overflow thread, only to find the feature missing from ulab.numpy. One example is repeat(), which I replaced with my own series of unrolled copy operations.
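
One way to stand in for repeat() is a handful of strided slice assignments, one per offset within each enlarged cell. This is a sketch of the general idea for a 2x enlargement, not necessarily the copy sequence used in the project, and it assumes the array library supports slice assignment with a step. The name enlarge_2x is a hypothetical helper.

```python
try:
    from ulab import numpy as np  # CircuitPython
except ImportError:
    import numpy as np            # desktop Python, for experimenting


def enlarge_2x(src):
    """Return a new 2D array with every element of src doubled in x and y."""
    rows, cols = src.shape
    out = np.zeros((rows * 2, cols * 2), dtype=src.dtype)
    # Four "unrolled" copies, one per position inside each 2x2 output cell.
    out[0::2, 0::2] = src
    out[0::2, 1::2] = src
    out[1::2, 0::2] = src
    out[1::2, 1::2] = src
    return out


small = np.array([[1, 2], [3, 4]], dtype=np.uint16)
big = enlarge_2x(small)
print(big)
```

The number of copy statements grows with the square of the enlargement factor, which is the price of unrolling instead of calling repeat().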

List Comprehension For Palette Lookup

The final bit of code to be replaced by NumPy operations was a thermal color palette lookup. My first implementation did it easily with nested for loops iterating through the x and y axes, but it was not fast. This feels like an operation that might have a NumPy operator, but nothing in ulab.numpy sounded applicable. Full NumPy offers a way to execute an arbitrary Python function over every element in an array, but that was missing from ulab.numpy. After reading through several Stack Overflow threads I decided to create a list comprehension out of the palette lookup and build a NumPy array around the list. I've already explained why I didn't like list comprehensions, but performance numbers don't lie: performing palette lookup via list comprehension was at least an order of magnitude faster. For that kind of gain, I'll hold my nose and use a list comprehension.
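
A palette lookup through a list comprehension might look like the sketch below. The four-entry palette and the index values are invented for illustration (the project's palette and array sizes differ), and lookup is a hypothetical helper name.

```python
try:
    from ulab import numpy as np  # CircuitPython
except ImportError:
    import numpy as np            # desktop Python, for experimenting

# Illustrative RGB565 palette: blue, green, yellow, red.
PALETTE = [0x001F, 0x07E0, 0xFFE0, 0xF800]


def lookup(indices):
    """Map a 2D array of palette indices to a 2D array of RGB565 colors."""
    rows, cols = indices.shape
    # The comprehension is the only Python-level loop in this function.
    colors = [PALETTE[int(i)] for i in indices.flatten()]
    return np.array(colors, dtype=np.uint16).reshape((rows, cols))


indices = np.array([[0, 1], [2, 3]], dtype=np.uint8)
colors = lookup(indices)
print(colors)
```

Flattening first keeps the comprehension to a single level, so the result only needs one reshape to return to two dimensions.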

Final Results

I've replaced almost every for loop in my old code with NumPy operations; the only remaining inner loop is the for inside my list comprehension. All of these changes add up to quite an improvement, as can be seen in these times for generating four frames:

read 38624 scaled 775 interpolated 1132 mapped 2444 blit 28551 grid 6199 refresh 25361 total 103086
read 38624 scaled 626 interpolated 924 mapped 2175 blit 28730 grid 33319 refresh 25153 total 129551
read 38594 scaled 685 interpolated 1043 mapped 2295 blit 27716 grid 6288 refresh 25452 total 102073
read 38504 scaled 656 interpolated 924 mapped 2295 blit 28044 grid 33289 refresh 25213 total 128925
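
Assuming the logged values are microseconds, the total column converts to a frame rate as sketched below, using the first line from the earlier run and the third line from this one. The function name frames_per_second is a hypothetical helper.

```python
def frames_per_second(log_line):
    """Parse a 'label value label value ...' timing line; return (ms, fps)."""
    fields = log_line.split()
    timings = dict(zip(fields[0::2], (int(v) for v in fields[1::2])))
    total_us = timings["total"]  # assumed to be microseconds
    return total_us / 1000.0, 1000000.0 / total_us


before = "read 38028 scaled 596 mapped 1520 blit 27626 grid 224501 refresh 24528 total 316799"
after = "read 38594 scaled 685 interpolated 1043 mapped 2295 blit 27716 grid 6288 refresh 25452 total 102073"

before_ms, before_fps = frames_per_second(before)
after_ms, after_fps = frames_per_second(after)
print("before: {:.0f} ms, {:.1f} fps".format(before_ms, before_fps))
print("after:  {:.0f} ms, {:.1f} fps".format(after_ms, after_fps))
```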

As low as 102ms, almost 10fps, which is great! In fact, it marks the finish line: 9-10fps is as fast as the AMG8833 can deliver due to legal limitations imposed on thermal sensors. Going faster won't gain anything, so this ends my practice session of CircuitPython performance optimization. I will wrap up a few details and move on to the next project.


https://github.com/Roger-random/circuitpython_tests/blob/main/pycamera_amg88xx/code.py