

Your company has bought you the latest and greatest, and likely covers commercial token usage too.
You can't compare LLMs at scale to running one locally; it's not the same experience or capabilities.


Which are increasingly out of reach for a normal person. Phones, let alone PC hardware, have increased exponentially in recent history.


Describe "greased lightning", because it's much slower and needs to handle compression for context.
We're moving in that direction, but an M5 is not what the majority of people are running at home.


How does that compare to the closed models Anthropic offers, at the context and scale they offer?
I run Qwen3.6 27B locally and it's usable with 16 GB of VRAM, but it's still not the same as a data centre of Blackwell clusters.
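
For a rough idea of why a 27B model is only just usable on a 16 GB card, here is a back-of-the-envelope sketch. The function name and the bit widths are my own illustration, not anything from a specific runtime, and it counts the weights only; context (KV cache) and runtime overhead come on top.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Assumption: every parameter is stored at the same bit width.
    Real quantisation formats mix widths and add small per-block overheads.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Compare a 27B-parameter model at common storage precisions.
    for bits in (16, 8, 4):
        print(f"27B at {bits}-bit: {weight_memory_gib(27, bits):.1f} GiB")
```

At 4-bit the weights come to roughly 12.6 GiB, which squeezes into 16 GB with little room left for context; at 8-bit or 16-bit they don't fit at all.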


It's not the 90s anymore. Unless there's a compression algorithm that puts billions of relationships into a manageable size, local AI under 8 GB of VRAM is limited to highly specific tasks (text-to-speech, for example, fits in under 1 GB), let alone the context required for keeping a conversation going or writing code.
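
To put a number on the context part, here is a minimal sketch of the usual key/value cache estimate for a transformer. The layer count, KV head count and head size below are hypothetical placeholders, not the specs of any particular model, and it assumes plain full attention with 16-bit cache values (no cache quantisation, no sliding window).

```python
def kv_cache_gib(tokens: int, layers: int, kv_heads: int, head_dim: int,
                 bytes_per_value: int = 2) -> float:
    """Approximate KV cache size in GiB for a given context length."""
    # Factor of 2: one key tensor and one value tensor per layer.
    total_bytes = 2 * layers * kv_heads * head_dim * bytes_per_value * tokens
    return total_bytes / 2**30


# Hypothetical architecture, chosen only for illustration.
HYPOTHETICAL_MODEL = dict(layers=48, kv_heads=8, head_dim=128)

if __name__ == "__main__":
    for tokens in (4_096, 32_768, 131_072):
        size = kv_cache_gib(tokens, **HYPOTHETICAL_MODEL)
        print(f"{tokens:>7} tokens of context: {size:.2f} GiB")
```

Under those assumptions the cache grows linearly with context, so a long coding session costs memory on top of the weights, which is exactly what a small card doesn't have.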


What's the cost of the compute you need to run something locally?
The majority of people don't have 32 GB of VRAM to run something remotely as capable.


Hacktivists on GitHub. Show me your Forgejo.


These systems support a latent load so it’s not all at once. Something like this but at a massive scale.
https://www.ti.com/lit/an/slva670a/slva670a.pdf
Very cool engineering.


Yes, this is similar to opencode or hermes. A gateway platform to integrate LLMs and tools
Likely something like this
https://www.fresh222.com/loss-prevention-system-for-retail-stores-and-warehouses/


This is the way. Computer use agents are common and can easily ‘browse’ to a page and grab the content.


Tmux/screen foreign concepts


Quote me in full.
You can run it at scale, on Huawei. You can also run it on a CPU.


Thank you for proving my point. It can be run on a CPU.
"It's slow, it's inefficient." It still runs.
It's a foundational model, just like R1 was.


Yes, you can run it at scale. Which is why it uses Huawei hardware.
You can run it on anything, scaled or not


You can run it on CPU alone. Not surprising they’re building their own AI ecosystem
Marketing incumbent