• MagicShel@lemmy.zip
    English
    arrow-up
    28
    ·
    4 days ago

    I guess there’s no need to make better AI, then. It’s already as good as it’s allowed to be. Now we can focus on running existing models more efficiently, and maybe locally. There’s that chip company that said they could 10x the energy efficiency of AI if the model were built into the chip, but it would mean giving up the innovation that comes from rapidly iterating on models. Well, this is the moment. If we could have local models that ran for the cost of charging your car and everyone had access to them, maybe that would give us a measure of what AI has always promised mankind.

    • Eager Eagle@lemmy.world
      English
      arrow-up
      8
      ·
      edit-2
      4 days ago

      Yeah, that is pretty insane at 17k tokens per second. It feels as if the answers are already cached, just waiting for you to prompt.

      https://chatjimmy.ai/

      If they manage to fit larger and more recent models on the chip, it could greatly reduce the energy required to run these models.

      The recent diffusion LLM from Google is also really exciting, and I think diffusion models might become the new architecture of choice when running on general-purpose GPUs, especially consumer cards, which are usually memory-constrained but compute-capable.

    • partofthevoice@lemmy.zip
      English
      arrow-up
      2
      ·
      4 days ago

      IIRC, that particular configuration was more like 72x, not 10x. It would probably be a good fit for large models that can do many things, so you’re less “stuck.”