Out of curiosity, I took a look at some tools provided by search engines to help sites drive traffic. I'm very glad site traffic isn't important to https://newscrewdriver.com because search engines ask for some weird things. On the other hand, I appreciate that making a good internet search engine is a constant battle against unscrupulous people trying to game the system. There are some legitimate things sites can do under the umbrella of "search engine optimization" but there are a lot of deceptive SEO techniques out there as well.

This site is my personal project notebook. I write things down as I go, and it's all searchable whenever I want to look back on something. It's publicly visible in case someone is willing to sift through a lot of my ramblings to find a nugget of gold useful to them. Unfortunately, it also means it's publicly available for others to copy for less noble purposes. There are a lot of those! When I want to search my own project notebook, I have to restrict the search to my own site or I'll get a lot of mangled junk copies.

Here's one example: several years ago I was playing around with PIC microcontrollers and using Microchip's MPLAB development tools. When I moved from browser-based MPLAB Xpress to desktop MPLAB X IDE, I wrote down a few thoughts about how the move affected my ability to use git source control: MPLAB Xpress vs. MPLAB X: Git Source Control. Recently I started using another tool that wouldn't necessarily work with git (more on that in a future post) and searched for my old post.

My default search engine is DuckDuckGo, which subcontracts out to Microsoft's Bing. I repeated my search on both sites to confirm this observation: if I search on the title of my blog post, the first page of results on both search sites includes at least two hits from sites that copied my content. And there are more copycats on the second page, third page, and so on. You know what's not on the first few pages of search results? My original page.

I blurred the URLs out of my screenshot above because I am not going to help drive traffic to those sites and I'm certainly not going to click on those links. But the preview snippet shown by DuckDuckGo/Bing is enough to show me they've changed a few words in order to avoid being an exact copy. There's a Roman numeral obsession where "MPLAB X" became "MPLAB ten" (uhhhh I guess it might be?) and "C source code" became "century source code" (no, absolutely not). There was a misguided attempt to spell check when "git source control" became "get source control". But the most puzzling changes involved things like taking my "and" and replacing it with "or" (or vice versa), which drastically changes the meaning of my original sentence. Their search/replace algorithm is not only ignorant of technical terminology, it's plain dumb about English syntax. I don't know how those sites became ranked higher than my original text, and I'm not sure what purpose they aim to serve. I'm just sad about the possibility that someone would be led astray in their own project because they found a mangled version of my project notebook.
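
I have no idea how those sites actually generate their text, but it doesn't take much to reproduce the mangling I saw. Here's a minimal Python sketch, assuming nothing smarter than a context-blind whole-word lookup. The `SUBSTITUTIONS` table and the `mangle` function are my own hypothetical names, and the table contains only the swaps I quoted above.

```python
import re

# Hypothetical substitution table built only from the swaps quoted above.
# The copycat sites' real word lists and logic are unknown to me.
SUBSTITUTIONS = {
    "X": "ten",       # "MPLAB X" read as a Roman numeral
    "C": "century",   # "C" read as Roman numeral 100, then turned into a word
    "git": "get",     # misguided "spell check"
    "and": "or",      # conjunction swap that changes the sentence's meaning
}


def mangle(text):
    """Swap every whole word found in SUBSTITUTIONS, ignoring all context."""
    def swap(match):
        word = match.group(0)
        # Words not in the table pass through unchanged.
        return SUBSTITUTIONS.get(word, word)

    # \b\w+\b matches one whole word at a time, so "Xpress" is left alone
    # even though it starts with "X".
    return re.sub(r"\b\w+\b", swap, text)


original = "MPLAB X and git source control for C source code"
print(mangle(original))
# prints: MPLAB ten or get source control for century source code
```

A lookup like this has no notion of product names, programming languages, or grammar, which would explain why the output is both technically wrong and semantically different from what I wrote.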