It's time for a new server
Setting out to waste some money
It’s only been about two years since I built an atrocity of a VM server in a decade-old workstation box. It still gets the job done, but there is empty space to fill in my homelab cabinet so it’s time for an upgrade. In the interest of killing two birds with one stone, I’d like it to replace my old Synology NAS too, which is also perfectly adequate despite its age.
The constraints and use case of this system make it a pretty challenging build to spec out:
- Two GPUs: one for the gaming VM, one for an AI VM. Both must have at least PCIe4 x8 bandwidth.
- Shallow 2u chassis: so it fits in my cabinet. I’ve already been through too many different-sized ones.
- At least 16 cores. Ideally, more.
- At least 128 GB of RAM.
- ECC support. This limits us to server/workstation SKUs and only the most recent consumer ones.
- Room for enough SSDs to get at least 10 TB of usable capacity. I’ve currently got about 5 TB used so this gives me decent headroom.
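To sanity-check that capacity target, here's a quick back-of-the-envelope sketch. The drive counts and sizes are hypothetical candidates, not a final layout, and real usable space will be a bit lower after filesystem overhead:

```python
# Rough usable-capacity math for a few candidate SSD layouts.
# Sizes in TB; filesystem overhead is ignored.

def mirror_pairs(n_drives: int, size_tb: float) -> float:
    """RAID-10 style: half the raw capacity is usable."""
    return (n_drives // 2) * size_tb

def raidz1(n_drives: int, size_tb: float) -> float:
    """Single parity: one drive's worth of capacity is lost."""
    return (n_drives - 1) * size_tb

target_tb = 10  # want at least 10 TB usable
for layout, usable in [
    ("4x 4TB mirrored", mirror_pairs(4, 4)),
    ("6x 4TB mirrored", mirror_pairs(6, 4)),
    ("5x 4TB raidz1",   raidz1(5, 4)),
]:
    print(f"{layout}: {usable:.0f} TB usable, meets target: {usable >= target_tb}")
```

Four mirrored drives falls short of the target, which is part of why the drive count matters so much in the case selection below.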
Let’s start picking some parts.
Case and GPUs
These choices end up being intrinsically linked, since the shallow cabinet turns out to be the biggest problem. Nothing more than about 14" deep will really fit. I considered 3u and 4u chassis as well, but somehow 2u ended up working out the best.
While 3u ostensibly supports a full-height card, consumer GPUs don’t really fit because they tend to extend a bit above the PCI bracket. You’d think 4u avoids this problem, but the only 4u case I can find that is 14" deep or less has the motherboard on an elevated tray that makes it no better than a 3u.
There are way more options available for 2u cases, but this constrains me to half-height GPUs. There aren’t many options with risers to circumvent this, and none of them are good. There also aren’t many good half-height GPUs. Thus, it was time to lock in our first two part selections:
- Gigabyte GeForce RTX 4060 OC Low Profile 8G, for the gaming VM
- Nvidia RTX A2000 12GB, for the AI VM (Update: I got a P4 instead. It was cheaper.)
The 4060 is basically the only “gaming” GPU available in this form factor that isn’t a decade old. I acknowledge that the A2000 is a weirder choice here, but ultimately I chose it because:
- It has a decent amount of RAM.
- It’s way cheaper than any of the half-height Teslas. (Update: Actually, it wasn’t.)
To fit both of these dual-slot cards in a 2u case, now one of two things must be true:
- It takes a full ATX motherboard, so there are more than 4 bays, or
- I find a mATX board that actually puts two (physically) x16 slots in the 1st and 3rd positions.
After hours of searching, I could not really find any mATX board that worked here, so I settled for the only shallow 2u full-ATX case I could find, locking in our third part:
- Sliger CX2137b
It’s only 13" deep, so no problem there. It has 7 PCI bays and takes a flexATX PSU. Looks pretty decent too.
Only one issue. I originally planned to find a case with room for two 5.25" bays so I could put a backplane in them and fill it with 2.5" SSDs. This case doesn’t have the 5.25" bays, so I had to find another option. I could fill the interior of the case with SATA SSDs, but it only really fits five. I’d end up with a 4-member RAID-10, a non-redundant boot disk, and a huge mess of cables. Time to figure out how to solve this problem.
Motherboard
At this point, I decided to commit to slapping a bunch of NVMe in it. Thus, a new constraint has been introduced: we need a buttload of M.2, or at least lots of PCIe slots we can stick risers in.
After looking at basically every motherboard SKU produced for the latest 3 or 4 generations of both AMD and Intel chips, I figured out that there is basically only one chipset that will get the job done: Intel W680.
There aren’t a ton of boards built on this chipset, since it targets the pretty niche part of the workstation segment that isn’t on Xeons. However, the workstation application is ultimately the consideration that makes this the best chip for the job: these boards are all intended to support multiple GPUs.
All of the W680 boards I found let you run the 16 PCIe Gen5 lanes off the CPU as either a single x16 or two x8s. We’ll be choosing the latter option.
We must now compare boards for their other features, particularly:
- How many ethernet ports does it have?
- Does the firmware actually support ECC?
- How much storage can I attach to it?
- Any extra goodies like IPMI?
At this point, I had three serious candidates: one from Supermicro (X13SAE-F), one from Asus (Pro WS W680-ACE IPMI), and one from Gigabyte (MW34-SP0).
Though I’m a pretty big Gigabyte stan, unfortunately there’s enough FUD on the internet about them flaking pretty hard on the ECC support for this one to make me wary. It also only has one 2.5GbE port - which is enough bandwidth, but I’ve only got gigabit switches, so I was betting on two 1GbE ports I could LACP. With two strikes against it, I decided to eliminate it from the running.
The remaining two both had IPMI (though the Asus is weird - we’ll get to that in a second) and ECC support that apparently works. The decision ultimately came down to the Asus being $100 cheaper and having two 2.5GbE ports instead of just one.
Indeed, we have finally locked in our fourth part:
- Asus Pro WS W680-ACE IPMI
It has a PCIe layout that looks something like this:
- CPU
- PCIe 5.0 x8 (in a x16 slot)
- PCIe 5.0 x8 (in a x16 slot)
- PCIe 4.0 x4 M.2
- PCH
- 2x PCIe 3.0 x4 (in a x16 slot)
- PCIe 3.0 x1 (for the IPMI card)
- 2x PCIe 4.0 x4 M.2
Yeah. That’s the weird part - the IPMI is on an expansion card. It seems like the motherboard does have the BMC on it, but this card is needed to give it ethernet and VGA to turn it into a full-fledged IPMI. Odd choice, but whatever. Fortunately it doesn’t take up any of the PCIe I actually want to use. Still plenty of PCIe lanes for a bunch of NVMe.
Storage
Update: Don’t buy the parts listed in this section. They didn’t work. Refer to my new post.
We’re going to jam 6 M.2 drives in this thing, using the two onboard M.2 slots attached to the PCH, plus two carrier cards adding two drives each in the x4 slots. Since the slots are x4, this means we need active (switched) cards:
- 2x GLOTRENDS PA20 Dual M.2 NVMe to PCIe 3.0 X4 Adapter
After considering several options, the smart move is to just make the bulk of the storage something cheap, reliable, and high density, even if it doesn’t have the highest performance:
- 6x TEAMGROUP MP34 4TB
We’re limited to Gen3 speeds and bottlenecked through the adapters’ PCIe switches anyway, so the only real loss here is in random IOPS. Maybe we can make up for it with the 3-way stripe or the mirroring.
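To put a number on that bottleneck, here's a rough sketch of the sequential bandwidth two drives see when they share a switch's Gen3 x4 uplink. These are theoretical line rates (8 GT/s with 128b/130b encoding); real throughput will be somewhat lower:

```python
# Sequential-bandwidth math for two NVMe drives behind a PCIe switch
# whose uplink to the slot is Gen3 x4. Theoretical line rates only.

GEN3_GBPS_PER_LANE = 8 * (128 / 130) / 8  # 8 GT/s, 128b/130b encoding -> GB/s

uplink_gbps = GEN3_GBPS_PER_LANE * 4   # x4 uplink shared by both drives
per_drive_gbps = uplink_gbps / 2       # both drives saturating it at once

print(f"uplink:    {uplink_gbps:.2f} GB/s")
print(f"per drive: {per_drive_gbps:.2f} GB/s")
```

Roughly 2 GB/s per drive under simultaneous load - slower than these drives can go, but plenty for a home NAS workload.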
We’ll run some random spare SSD I have around for boot:
- Intel DC P4511 1 TB
Power supply
We’re stuck with a flex PSU thanks to the case we had to choose. Let’s just pick the biggest one available from a respectable brand:
- Silverstone FX600 Platinum
CPU
We want a CPU that supports ECC and has a ton of cores. However, we’re pretty limited in terms of power. After doing some math, I figured I can afford a 65W TDP CPU. This means an i9-13900 is the best choice. However, it so happens that the -K is cheaper right now. I’ll get one of those and clock it down to non-K speeds.
- Intel i9-13900K
Not opting to pay extra for a 14900, since it doesn’t really seem to perform any better.
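The "doing some math" step looked roughly like this. Everything except the PSU rating and the 4060's board power is an estimate on my part, not a measured value:

```python
# Rough power budget to sanity-check the CPU choice. All numbers except
# the PSU rating and the 4060's board power are estimates.

psu_w = 600
budget = {
    "RTX 4060": 115,           # board power from the spec sheet
    "second GPU": 70,          # guess for a low-profile workstation card
    "motherboard + BMC": 30,
    "RAM (4 DIMMs)": 20,
    "NVMe + boot drive": 50,
    "fans + cooler": 10,
}
remaining = psu_w - sum(budget.values())
print(f"left for the CPU: {remaining} W")
```

Around 300 W left for the CPU, which covers a 65 W TDP part's 219 W turbo limit with margin to spare; a stock -K at its 253 W limit would be cutting it closer, hence the downclock.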
RAM
We need 4x 32GB DDR5 ECC UDIMMs. There is only one SKU on the QVL that I can actually seem to purchase right now, and I don’t feel like playing games:
- Kingston DDR5-4800 32GB ECC
Adding it all up
The last touches are fans and a CPU cooler. Let’s find those, recap the BOM and add up some numbers. I planned to get a bunch of stuff from eBay to save money, but it turns out that most of these parts aren’t exactly plentiful on the secondary market.
Item | Qty | Condition | $/each | Total |
---|---|---|---|---|
Sliger CX2137b | 1 | New | $199.00 | $199.00 |
Arctic F8 PWM Fan | 3 | New | $8.00 | $24.00 |
Asus Pro WS W680-ACE IPMI | 1 | New | $399.99 | $399.99 |
Intel i9-13900K | 1 | New | $490.00 | $490.00 |
Dynatron Q5 LGA1700 Cooler | 1 | New | $49.95 | $49.95 |
Kingston DDR5-4800 32GB | 4 | New | $118.75 | $475.00 |
GIGABYTE GeForce RTX 4060 | 1 | New | $319.99 | $319.99 |
Silverstone FX600 | 1 | New | $174.99 | $174.99 |
Update: I ended up deviating from my original plan. An explanation and new BOM are in this new post. tl;dr: I got a cheaper GPU, the Team NVMe was buggy, my Intel NVMe failed, and the PCIe switches sucked.
Everything’s been ordered except the A2000, which I’m going to put off for a little bit to make sure everything else works first.
[a previous version of this post had a pie chart of the cost breakdown here, but I didn’t feel like updating it after changing the BOM, so it has been removed]
My power budget is also pretty tight, so let’s see what kind of shape we’re in. I could not find actual power data for some of these parts, so it’s estimated based on similar ones (or wild guesses).
Item | Power |
---|---|
Intel i9-13900K (throttled) | 219 W |
GIGABYTE GeForce RTX 4060 | 115 W |
Asus Pro WS W680-ACE IPMI | 30 W |
4x Kingston DDR5-4800 32GB | 20 W |
Adding rough estimates for the second GPU, drives, and fans on top of the table, max draw looks like 520 W, so we still have a decent amount of headroom with the 600 W PSU. If I end up running into any problems I can just further limit the CPU or one of the GPUs.
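For the curious, here's roughly how that total comes together. The first four numbers are from the table above; the rest are my "wild guesses" for the parts without published power data:

```python
# Reconstructing the ~520 W worst-case estimate. Table values plus
# guessed figures for parts with no published power data.

from_table = {
    "i9-13900K (throttled)": 219,
    "RTX 4060": 115,
    "motherboard + IPMI": 30,
    "RAM": 20,
}
guessed = {
    "second GPU": 70,
    "NVMe drives": 50,
    "fans": 10,
}
total = sum(from_table.values()) + sum(guessed.values())
print(f"estimated max draw: {total} W of 600 W")
```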
Update: I changed the BOM, but I don’t feel like finding new power numbers. It should be basically the same.
Now we wait
The parts have yet to arrive. Will update with a post about my struggles when they do.