More dead hardware
06 November 2022
As part of the fault-finding rebuild of my personal workstation I had a leftover motherboard that, aside from suspected glitches on the Northbridge DMA interface, seemed to be in good working order, so I decided to use it to build a new headless server. What I thought would be a weekend project ended up taking six weeks, and in the process I ended up buying enough core components for two servers. Given how much the cost of CPUs is rocketing, having an extra bare-bones system ready to go made economic sense, but the whole process of getting them built is something I could have done without.
A nasty surprise
I ordered in the cheapest AM4 socket CPU I could find, which was a Ryzen 3 off Amazon, and installed it onto the motherboard, which in turn went into a rack-mount case since this was intended to be a headless server. After hooking up my mini-monitor and a USB keyboard I was greeted by spinning fans but nothing on the display. At the time I thought a suspect motherboard had given up the ghost, so I RMA'd it and ordered in a new motherboard and some more memory. As it would turn out the latter would have the same problem, so I sent the CPU back to Amazon and ordered a new one, but in hindsight this CPU may have actually been good.
The new CPU had the same problems with the new motherboard, so I ordered in a new power supply just in case that was the problem, even though a quick check with a multimeter suggested the existing power supply was good. Either way the results were the same. As a precaution I checked the two pairs of RAM DIMMs I had in my personal workstation using Memtest86+ but they came out good. Getting desperate, I ordered a piezo speaker in the hope that BIOS beep sequences could be used for diagnostics, but no sound was emitted. Rather than do another RMA I went ahead and ordered in a Ryzen 5 4500 just in case it was a problem with the Ryzen 3 CPU model, but I had the same problem with that one as well.
At this point I was seriously contemplating the prospect that I had had three dead-on-arrival CPUs in a row from two vendors, which to me was hard to believe. Handling was no different from previous systems I had built, and considering assembly was done at my electronics bench, which had an anti-static mat, I doubt electrostatic discharge could be blamed for three faulty parts. By now I had already bought in duplicates of every single core component, and given how much the cost of CPUs had already shot up this year I may as well push ahead and assemble two complete systems.
Some actual success
Stock CPU coolers are a pain to clean, and just in case it was a cooler fan issue I ordered in two Noctua CPU coolers of the kind I had used in my previous workstations. By this time the motherboard I had RMA'd had been tested as good by the vendor and returned, so I decided to put the Ryzen 3 onto it. Again the system did not pass POST, but this time I got a beep sequence of one long and three short beeps which, assuming I was reading the correct guide, was the sequence for graphics card failure and was typically associated with power issues. Luckily I had some spare PCIe graphics cards, and after installing one of them things got past POST and into the BIOS setup. I don't know if the on-board graphics was disabled or malfunctioning, but I now had a system that was booting.
At this point I could also verify that both my power supply units were good and that it was a dead or incompatible motherboard I was dealing with, and I noticed that it used the A320 chipset rather than the B550 chipset that every other AMD motherboard I had bought this year used. Upon it arriving I kitted it all up, and after getting the same beep sequence as before I stuck in a graphics card and now had a second working bare-bones computer. It is possible that the A320-based motherboard just needed a BIOS upgrade, but that was a moot point as there was no way of applying one without an operational CPU.
Root-causes
All in, this year I have initiated four returns of merchandise, of which only one was confirmed as faulty, so it is worth looking at what triggered the returns. The first CPU I sent back was indeed faulty, but the glitches I noticed with its replacement I now believe were down to a fault with the Audigy Fx, since that system also had intermittent latency issues that went away when the card was left out after the motherboard was replaced with one that had more PCI-Express slots. An expansion card doing stuff to the Northbridge DMA controller that shows up as memory faults on the CPU at the very least sounds circumstantial, but with everything else checking out good in isolation, it prima facie looks like the culprit. Or it could just have been the sort of random glitch that has to be expected with a whole 64 gigabytes of non-error-checking memory.
Most of the headaches in the last month or so were down to motherboards actually needing a graphics card even though they had on-board video out, and it was only after I bought in a piezo speaker in order to hear the POST beep sequences (the rack-mount case I was using did not have a speaker) that I was pointed in the right direction. I had not realised that on-board graphics chips these days are instead integrated into CPUs. In the past I have always installed a dedicated graphics card rather than use on-board video on self-builds, so I had never come across this gotcha before, and although the CPU I sent back to Amazon was accepted as faulty I suspect in reality it was good, so this pitfall was responsible for two false fault returns. Ironically, sticking in a graphics card was something I tried with the A320 motherboard before the non-faulty B550-based one was returned to me.