Walt Stoneburner's Ramblings

Currating Chaos

Netgate SG-4860 Bricked

This afternoon while updating my Netgate SG-4860 pfSense firewall from version 2.5.2 to 2.6.0, a process that's seamless and takes mere moments to upgrade and reboot, the GUI web console reported that the firewall was not coming back from its soft reboot.

When I went to check on the firewall, it was off. That was highly unusual.

When I unplugged it and plugged it back in, the status light turned red, the web interface wouldn't load, and the device powered itself back off. I couldn't ssh nor ping the device.

At that point, I went to the USB serial console to figure out what was happening. Details of how to do this are readily available at Netgate's Website.

Typically, one grabs a Mini-USB to USB-A cable, be sure to get a real cable (it has a thick cord) and not one for charging devices (it has a thin cord), and a USB Bridge to UART driver. Here are several:

Some of the newer Netgate modles have UARTs that require the full UART capabilities. On a side note, many of these drivers take an unusually long time to install (so much so, you'll think they've hung) on Apple or Microsoft systems. Most modern Linux kernels come with a driver already, so nothing need be installed on that platform unless you are using a much older kernel.

At that point, when a powered up Netgate device is connected to your computer via the USB cable, a device called /dev/cu.SLAB_USBtoUART or /dev/cu.usbserial will appear. (The "cu" means "calling unit", such as the old days of connecting to a phone modem.) You'll want to connect to the serial port with the settings of N,8,1 and a default baudrate of 115200, using no hardware handshakes (RTS/CTS), nor software flow control (^S/^Q). While you could use minicom, it turns out that you can also use screen and specify the baudrate:

$ screen /dev/cu.usbserial 115200

You may have to press enter to wake the pfSense terminal console. And, if you're using screen Control-A d will disconnect.

In my case, I got nothing. No text console. The device was dead.

In reality, the device was bricked.

This reddit thread by pfn0nsense outlines the problem perfectly:

I have a SG-4860 that has been running great for 3+ years, but has recently turned into a brick. The status light is solid red on power up, but eventually just turns off. I am unable to see any output via the USB console port. As far as I can tell this happened out of the blue, not during a reboot or any sort of update or power cycle.

And, ProperToday8 had an answer no one likes to hear -- it's an honest-to-God known hardware problem.

You have fallen victim to the Intel Atom C2000 SoC flaw. Google it.

The Intel Atom C2000 bug has been killing products from a variety of different manufactures, at least since 2017.

Cisco explains it in a support advisory entitles Clock Signal Component Issue.

In some units, we have seen the clock signal component degrade over time. Although the Cisco products with this component are currently performing normally, we expect product failures to increase over the years, beginning after the unit has been in operation for approximately 18 months. Once the component has failed, the system will stop functioning, will not boot, and is not recoverable. This component is also used by other companies.

Other folks have been reporting it:

Netgate, aware of the issue, commits to repairing or replacing the unit if you're still under warranty. If you're not under warranty, you could be looking at a replacement cost upwards of $500 including a $75 diagnostic fee.

More than likely if you are leaning in that direction by being out of warranty, you might as well purchase a new unit. If you reach out to them, they may comp you an extra year on the replacement unit.

For those that are curious, I did try factory reseting the device, downloading the original firmware and attempting to install it via USB stick, but alas -- the device can't read its boot ROM to get that far.

So the "solution" is to decomission the device, plan on it never coming back, and replacing it with a newer, faster, better model. Netgate has many appliances.
pfSense recommends some too.

Should some brilliant electrical engineer out there happen to know how to take a screw driver and a soldering iron to the unit and correct the problem with some replacement component, please reach out to me. I don't intend on disposing the unit for a while. Extra bonus points if the unit can be repurposed to a general computer or the parts salvaged.