Realtek r8168 Driver Is Not r8169 Driver Predecessor
I have a Dell Inspiron 7577 whose onboard Realtek Ethernet hardware would randomly quit under Proxmox VE. [UPDATE: After a Proxmox VE kernel update from 6.2.16-15-pve to 6.2.16-18-pve, this problem no longer occurs, and the machine stays connected to the network.] After trying some kernel flags that didn't help, I put in place an ugly hack to reboot the computer every time the network watchdog went off. This would at least keep the machine accessible from the network most of the time while I learned more about this problem.
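To illustrate the detection half of such a hack, here is a minimal Python sketch that scans kernel log text for the transmit-timeout message the Linux network watchdog emits. It is not the script I actually used. The function name `find_watchdog_timeouts`, the interface name `enp59s0`, and the sample log line are made up for illustration, and the sketch deliberately stops short of rebooting anything.

```python
import re

# Pattern for the kernel's transmit-timeout message, which has the form
# "NETDEV WATCHDOG: <iface> (<driver>): transmit queue <n> timed out".
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_watchdog_timeouts(log_text):
    """Return a list of (interface, driver, queue) tuples found in log_text."""
    hits = []
    for line in log_text.splitlines():
        match = WATCHDOG_RE.search(line)
        if match:
            hits.append((match["iface"], match["driver"], int(match["queue"])))
    return hits


if __name__ == "__main__":
    # Illustrative sample line only; the interface name is hypothetical.
    sample = "NETDEV WATCHDOG: enp59s0 (r8169): transmit queue 0 timed out"
    for iface, driver, queue in find_watchdog_timeouts(sample):
        print(f"{iface} ({driver}) timed out on queue {queue}")
```

In practice the log text would come from something like the kernel ring buffer or the system journal, and whatever acts on a hit (a reboot, in my case) is a separate decision.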
In my initial research, I found some people who claimed switching to the r8168 driver kept their machines online. Judging by the names, I thought the r8168 driver was the immediate predecessor to the r8169 driver that is currently part of my system and causing me headaches. But after reading a bit more, I've learned this was not the case. While both r8168 and r8169 refer to Linux drivers for Realtek Ethernet hardware, they exist in parallel and are maintained by two different development teams.
r8169: is an in-tree kernel driver that supports a few Ethernet adapters including R8168.
r8168: module built from source provided by Realtek.
-- Excerpt from "r8168/r8169 - which one should I use?" on AskUbuntu.com
This is a lot more complicated than "previous version". As an in-tree kernel driver, r8169 is updated in lock step with Linux itself, largely independent of Realtek's product cycle. As a vendor-provided module, r8168 is updated to support new Realtek hardware, but won't necessarily stay in sync with Linux updates.
This explains why, when someone has a new computer that doesn't have networking under Linux, the suggestion is to try the r8168 driver: Realtek would add support for new hardware before Linux developers got around to it. It also explains why people running the r8168 driver run into problems later: they updated their Linux kernel and could no longer run an r8168 driver built for an earlier kernel.
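Before weighing one driver against the other, it helps to confirm which one is actually bound to the Ethernet interface. On Linux, sysfs exposes this as a symbolic link at `/sys/class/net/<interface>/device/driver`. The following is a minimal Python sketch that reads the link. The function name `bound_driver` and its `sysfs_root` parameter are my own illustrative choices, not part of any tool.

```python
import os
from pathlib import Path


def bound_driver(iface, sysfs_root="/sys/class/net"):
    """Return the name of the kernel driver bound to iface, or None.

    Reads the sysfs symlink <sysfs_root>/<iface>/device/driver, whose
    target ends in the driver name (for example r8169 or r8168).
    Virtual interfaces such as loopback have no such link.
    """
    link = Path(sysfs_root) / iface / "device" / "driver"
    if not link.is_symlink():
        return None
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    root = "/sys/class/net"
    if os.path.isdir(root):
        # List every interface and the driver behind it, if any.
        for name in sorted(os.listdir(root)):
            print(f"{name}: {bound_driver(name) or '(no driver link)'}")
```

Running this before and after a driver swap or kernel update is a quick way to see whether the interface ended up on r8169 or r8168.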
Given this knowledge, I'm very skeptical that running r8168 would help me. Some Proxmox users report that it's the opposite of helpful, killing their network connection entirely. D'oh! Another interesting data point from that forum thread was the anecdotal observation that Proxmox clusters accelerate faults with the Realtek driver. This matches my observation. Before I set up a Proxmox cluster, the network fault would occur roughly once or twice a day. After my cluster was up and running, it would occur many times a day, with uptime as short as an hour and a half.
Even if switching to r8168 would help, it would only be a temporary solution. The next Linux update in this area would break the driver until Realtek catches up with an update. The best I can hope for from r8168 is a data point informing an investigation of what triggers this fault condition, which seems like a lot of work for little gain. I decided against trying the r8168 driver. There are many other pieces in this puzzle.
Featured image created by Microsoft Bing Image Creator powered by DALL-E 3 with prompt "Cartoon drawing of a black laptop computer showing a crying face on screen and holding a network cable"