Where people landed is pretty clear. Qwen 3.6 27B is one of the first local coding models that feels broadly useful rather than a toy, but the economics of running it locally are still situational. If you care about privacy, offline use, censorship resistance, or simply owning the stack, local runs make sense. If you only care about getting the best coding help per dollar, API access still wins by a mile. Several commenters did the math and concluded that even expensive hosted usage would take years to catch up with the cost of a maxed MacBook. Others pushed back that tokens are a metered dependency, while hardware is an owned asset and avoids shipping code to a provider.
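The break-even arithmetic commenters described can be sketched in a few lines. This is a minimal illustration, not anyone's actual numbers: the hardware price, monthly API spend, and electricity cost below are hypothetical placeholders, and the function name is made up for this example.

```python
def breakeven_months(hardware_cost_usd: float,
                     monthly_api_spend_usd: float,
                     monthly_power_cost_usd: float = 0.0) -> float:
    """Months until owned hardware costs less than continued API usage.

    Ignores resale value, depreciation, and any quality gap between
    the local model and the hosted one.
    """
    net_monthly_saving = monthly_api_spend_usd - monthly_power_cost_usd
    if net_monthly_saving <= 0:
        # Running locally never pays for itself at this usage level.
        return float("inf")
    return hardware_cost_usd / net_monthly_saving


# Hypothetical inputs: a 4000 USD laptop, 100 USD/month of API usage,
# and 10 USD/month of extra electricity for local inference.
months = breakeven_months(4000.0, 100.0, 10.0)
print(f"break-even after {months:.1f} months ({months / 12:.1f} years)")
```

With these placeholder values the hardware takes a bit under four years to pay for itself, which is the shape of the argument the API-first commenters were making. The ownership argument is about the things this calculation leaves out.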
The most useful technical correction was about hardware tradeoffs. For dense models like Qwen 3.6 27B, memory bandwidth matters as much as raw RAM, which is why Apple Silicon is attractive in the first place. But that same fact means a quiet headless Mac Mini, Strix Halo box, DGX Spark, or used multi-GPU desktop can be a better fit than a laptop depending on whether you want portability, silence, context length, or raw tokens per second. A repeated theme was that running heavy local inference on the same laptop you are actively using is unpleasant. Heat, fan noise, battery drain, and UI lag are real. Plenty of people have instead settled on a dedicated box on the local network and connect to it from a lighter client machine.
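To see why bandwidth matters so much for a dense model, a common rule of thumb is that generating each token requires reading roughly all of the weights from memory once, so memory bandwidth divided by weight size gives a rough ceiling on decode speed. The sketch below applies that rule. The bandwidth figures are hypothetical tiers, not specifications of any machine named above, and the 4-bit quantization is an assumption.

```python
def decode_ceiling_tokens_per_s(bandwidth_gb_s: float,
                                params_billion: float,
                                bits_per_weight: float) -> float:
    """Rough upper bound on decode speed for a bandwidth-bound dense model.

    Assumes every weight is read once per generated token and ignores
    KV-cache traffic, compute limits, and runtime overhead, so real
    throughput will be lower.
    """
    weights_gb = params_billion * bits_per_weight / 8  # GB read per token
    return bandwidth_gb_s / weights_gb


# Hypothetical bandwidth tiers in GB/s (placeholders, not vendor specs).
for bandwidth in (270.0, 540.0, 810.0):
    ceiling = decode_ceiling_tokens_per_s(bandwidth, params_billion=27.0,
                                          bits_per_weight=4.0)
    print(f"{bandwidth:.0f} GB/s -> at most ~{ceiling:.0f} tokens/s")
```

Under these assumptions, tripling bandwidth triples the ceiling while adding RAM beyond what the weights and context need changes nothing, which is why the choice of box depends on more than memory capacity.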
People also drew a sharper line between the dense 27B model and Qwen’s faster sparse variants like 35B-A3B. The sparse models can feel much snappier and are often “good enough” for planning, tool use, and background agents, but several people said the dense 27B is still the smarter coding model when tasks get harder. That led to a broader point. Benchmarks and greenfield demos flatter local models. The real test is messy existing codebases, long context, tool calling, and edit-heavy sessions. There the consensus was more modest. Qwen 3.6 27B is good enough to accelerate real work, especially when tightly scoped, but it still falls short of frontier cloud models on difficult brownfield tasks.
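The dense versus sparse tradeoff is easier to see with numbers. The sketch below assumes the "35B-A3B" name means roughly 35 billion total parameters with about 3 billion active per token, and it assumes 4-bit weights. Both are assumptions for illustration, and the class and field names are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class ModelShape:
    name: str
    total_params_b: float   # parameters that must fit in memory (billions)
    active_params_b: float  # parameters used per generated token (billions)

    def weights_gb(self, bits_per_weight: float) -> float:
        """Approximate memory needed just to hold the weights."""
        return self.total_params_b * bits_per_weight / 8

    def read_per_token_gb(self, bits_per_weight: float) -> float:
        """Approximate weight data touched to generate one token."""
        return self.active_params_b * bits_per_weight / 8


# Parameter counts inferred from the model names; treat as assumptions.
dense = ModelShape("dense 27B", total_params_b=27.0, active_params_b=27.0)
sparse = ModelShape("sparse 35B-A3B", total_params_b=35.0, active_params_b=3.0)

for model in (dense, sparse):
    print(f"{model.name}: ~{model.weights_gb(4):.1f} GB resident, "
          f"~{model.read_per_token_gb(4):.1f} GB read per token")
```

Under these assumptions the sparse model needs more memory to load but touches far less of it per token, which fits the reports that it feels snappier. The sketch says nothing about answer quality, which is where commenters still preferred the dense model.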
A final thread running through the comments was that local use is valuable even when it is not economically optimal. A lot of people are using smaller local models to learn the stack, understand the jargon, and get a concrete feel for how model weights, runtimes, quants, context windows, and tool calls actually behave. That educational and strategic value came up almost as often as coding performance itself. The mood was not that local models have already won. It was that Qwen 3.6 27B makes the category impossible to dismiss anymore.