• teuto@lemmy.teuto.icu
    8 days ago

    I picked up a pair of old Tesla P40s. Right now I’m running a Q4 quant of Qwen 2.5 72B, which fits in the combined 48GB of VRAM with 12k of context. The cards aren’t as fast as newer consumer GPUs, but they generate text as fast as I can read, and they cost less than a used 3080.
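
For anyone checking whether a similar setup would fit, here is a rough back-of-the-envelope sketch of the memory arithmetic. None of the numbers below come from the post: the parameter count (72.7B), the layer and KV-head shape (80 layers, 8 KV heads, head dimension 128), the fp16 KV cache, and especially the 4.5 bits per weight are assumptions, and real Q4 variants differ in size. The function names are made up for illustration, and runtime overhead such as compute buffers is ignored.

```python
GIB = 1024 ** 3

def weight_bytes(n_params, bits_per_weight):
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bytes_per_elem=2):
    """KV cache size: one key and one value vector per layer, KV head, and token."""
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_elem

# Assumed model shape and quantization (not stated in the post).
weights = weight_bytes(n_params=72.7e9, bits_per_weight=4.5)
kv = kv_cache_bytes(n_layers=80, n_kv_heads=8, head_dim=128, n_tokens=12 * 1024)

total_gib = (weights + kv) / GIB
budget_gib = 48  # two 24 GB cards, treated here as 48 GiB

print(f"weights:  {weights / GIB:.2f} GiB")
print(f"KV cache: {kv / GIB:.2f} GiB")
print(f"total:    {total_gib:.2f} GiB of {budget_gib} GiB")
```

Under these assumptions the weights dominate and the 12k context adds only a few GiB, which is consistent with the model fitting across the two cards. A heavier Q4 variant would leave much less headroom.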

      • teuto@lemmy.teuto.icu
        7 days ago

        I have a Dell PowerEdge 730, which was about $200. Its CPU shrouds perfectly match the GPU intakes, so air from the server fans just flows through both. I’ve seen a few 3D-printable fan mounts for jury-rigging the cards into a regular tower too.