• xyzzy@lemm.ee
    2 months ago

    The 1B parameter version of Llama 3.2 showed even slower results at 0.0093 tokens per second, based on the partial model run with data stored on disk.

    I mean, cool? They got a C interface library to compile using an older C standard, and the 1B model predictably runs like trash. It will take hours to do anything meaningful at that rate.
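Back-of-the-envelope check of that "hours" claim, as a minimal Python sketch (the 100-token reply length is an illustrative assumption, not from the article):

```python
# Rough arithmetic check of the quoted rate (0.0093 tokens per second).
RATE_TOK_PER_S = 0.0093

def time_for_tokens(n_tokens: int, rate: float = RATE_TOK_PER_S) -> float:
    """Seconds needed to generate n_tokens at a given tokens/second rate."""
    return n_tokens / rate

# Even a short ~100-token reply would take roughly three hours:
seconds = time_for_tokens(100)
print(f"{seconds:.0f} s ≈ {seconds / 3600:.1f} h")  # prints "10753 s ≈ 3.0 h"
```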

  • stingpie@lemmy.world
    2 months ago

    I would be so much more positive about this if you linked the actual source, not just an article that regurgitates everything word for word. Also, why is this article on 'Indian Defence Review'? India and Pakistan nearly had a nuclear war this morning.