Dell XPS 8950 Stress Test with Folding@Home
I had another lengthy saga running in parallel with my lengthy Canon Pixma MX340 teardown. The Dell XPS 8950 I bought primarily for SteamVR with my Valve Index began exhibiting bug checks on an irregular basis. This is not good. I paid a premium over similar-spec computers on the expectation that an XPS would be more reliable and, failing that, that Dell would be more likely to fix things that go wrong. Well, the first part turned out to be wrong. Thankfully the second part was eventually tested and found to be true, but it took some work to get there.
The first thing I needed was a better way to reproduce the issue. I wanted to collect many bug check memory dumps to compare against each other, and I needed a way to verify whether the problem had been resolved. Since I bought this computer mainly for SteamVR, the bug check usually happened while I was in the middle of a VR session. It spoiled a few Beat Saber songs and abruptly ended firefights with Combine soldiers in Half-Life: Alyx, but not every VR session triggered the problem and I wasn't going to just stay in VR until it occurred.
I found hardware tests in Dell's SupportAssist tool (more on SupportAssist in a future post) and ran those. My computer passed the tests with no errors. I looked for a way to run these tests in a loop but didn't find one.
I tried just leaving the computer on and running, without doing anything in particular. Over the course of a week, I got two bug checks. This is better than unpredictable crashes in VR sessions, but waiting 3-4 days between reproductions of a failure is still not great.
I increased system workload by installing and running Folding@Home. It kept the GPU busy but CPU utilization would drop off after a few minutes. I eventually figured out Windows 11 detected a long-running compute process and decided to restrict Folding@Home to the four power-efficient E-Cores on my i7-12700 CPU. Gah, foiled! I worked around this by disabling the E-Cores in system BIOS (where they were called Atom Cores). With E-Cores out of the picture, CPU utilization stayed at 100% with all eight hyper-threaded P-cores running at full blast.
I would rather have had a procedure to consistently and immediately reproduce the crash, but I never found one. With Folding@Home running, the bug check would usually occur within 12-24 hours, and this was the best I had. Over the course of about two weeks, Folding@Home helped me generate a decently sized collection of bug check crash memory dumps to examine.
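For anyone wanting to put numbers on "how long between crashes", here is a rough Python sketch and not something I actually ran during this saga. It assumes dumps are small memory dumps sitting in the default C:\Windows\Minidump folder (a full memory dump normally goes to a single MEMORY.DMP file instead, so this would not apply), and that each file's modification time is close enough to the crash time. The function names are made up for illustration, and reading that folder may require an elevated prompt.

```python
from datetime import datetime
from pathlib import Path
from statistics import mean

# Assumption: default location for small memory dumps on Windows.
# Change this if the system is configured to write dumps elsewhere.
DUMP_DIR = Path(r"C:\Windows\Minidump")


def crash_times(dump_dir):
    """Return sorted modification times of every *.dmp file in dump_dir."""
    return sorted(
        datetime.fromtimestamp(p.stat().st_mtime) for p in dump_dir.glob("*.dmp")
    )


def hours_between(times):
    """Return the gap, in hours, between each pair of consecutive timestamps."""
    ordered = sorted(times)
    return [(b - a).total_seconds() / 3600 for a, b in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    times = crash_times(DUMP_DIR)
    for t in times:
        print(t.isoformat(sep=" ", timespec="minutes"))
    gaps = hours_between(times)
    if gaps:
        print(f"{len(times)} dumps, mean gap {mean(gaps):.1f} hours")
    else:
        print("Need at least two dumps to compute a gap.")
```

A shrinking mean gap after changing the workload would suggest the new workload reproduces the problem faster, which is the comparison I was making by feel between idling and Folding@Home.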