I want to host some LLMs locally and use more advanced models. Since new hardware is out of the question, I think I should be able to pull something off by buying some yesteryear equipment on eBay etc. Has anybody attempted such a project? Does it scale horizontally? (I.e., can I connect two boxes to overcome the slowness of a single box?)


The trend I see is Mac Minis with a lot of unified memory. The people buying these are typically very well off, though. Prices for even old GPUs like 3090s are ridiculous now. I don’t think connecting 2 machines over Ethernet would work well, but putting 2 GPUs in a single machine does.
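For a rough sense of what one or two 3090s (24 GB of VRAM each) can hold, here is a back-of-the-envelope sketch. It only counts the model weights and then pads them by a flat 20% for KV cache and runtime buffers. That overhead figure, the function names, and the example model sizes are my own assumptions for illustration, not measurements, and it treats 24 GB as 24 GiB.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float, num_gpus: int,
         vram_gib_per_gpu: float = 24.0, overhead_fraction: float = 0.2) -> bool:
    """Rough check: do weights plus assumed overhead fit in the combined VRAM?

    overhead_fraction is a guess (KV cache, activations, buffers); real usage
    depends on context length and the inference runtime.
    """
    needed = weights_gib(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed <= num_gpus * vram_gib_per_gpu


if __name__ == "__main__":
    # Hypothetical examples: a 70B model at 4-bit and an 8B model at 16-bit.
    for params, bits, gpus in [(70, 4, 1), (70, 4, 2), (8, 16, 1)]:
        size = weights_gib(params, bits)
        verdict = "fits" if fits(params, bits, gpus) else "does not fit"
        print(f"{params}B @ {bits}-bit: ~{size:.1f} GiB weights, {verdict} on {gpus} GPU(s)")
```

Under these assumptions, a 70B model quantized to 4 bits is too big for one card but fits across two, which is the main reason a second GPU in the same box is attractive.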
I bought a used 3090 two years ago, and back then they were usually listed for €800-1000 in my country. I thought I was lucky to find one for €700 after searching for a few months, and I don’t think they’ve ever been cheaper than this here. There are definitely fewer of them available now, but you can still buy one for €950 (and possibly even lower if you’re patient). So prices have gone up, but IMO not by ridiculous amounts like RAM.