LongCat-2.0, a large-scale MoE model with 1.6T total and 48B Active

AI
Hardware
Open Source
China
Infrastructure

LongCat-2.0 is Meituan’s new open-model announcement for a very large mixture-of-experts model. The post claims 1.6 trillion total parameters with 48 billion active, pretraining over 35 trillion tokens, and both training and serving on huge in-house AI ASIC clusters. It also highlights architectural choices like n-gram embeddings and pitches the model as a serious large-scale system rather than a consumer-local model. That framing drove the real interest. People read this less as “here is one more chatbot” and more as evidence that a Chinese company outside the usual AI-lab shortlist may have trained and deployed a giant model on a non-Nvidia stack.

Treat this as a hardware and ecosystem signal first, not just another model launch. If you depend on export controls or Nvidia lock-in as a moat, update that view now, but wait for released weights and independent use before treating LongCat itself as a frontier model worth building around.

June 30, 2026
longcat.chat

Key insights

The bigger story is the hardware stack

What stands out is not another giant parameter count. It is the possibility, inferred by commenters rather than confirmed in the post, that a company trained and deployed a model this large on Huawei Ascend-class hardware despite the weaker surrounding software ecosystem. That changes the competitive picture because AI leadership depends on an entire stack of compilers, runtimes, networking, and operations, not just on buying fast chips. If that stack is now good enough for full-scale training, export controls have already forced an alternative ecosystem into existence.

Stop modeling China’s AI capacity as a direct function of Nvidia access. Track whether domestic software tooling and cluster operations are catching up, because that is the piece that turns substitute chips into a real platform.

Attribution:

gardnr #1
BoorishBears #1
chvid #1

Compute scale claims need frontier context

A big raw chip count does not mean frontier parity. One comment argued that a 50,000-chip system is still small compared with the largest Western training runs. That fits the idea that matching the frontier costs more than following it, since a follower can reuse a similar architecture and lessons already visible in public work. The useful read is not that LongCat leapfrogged OpenAI or Anthropic. It is that fast followers can now assemble enough compute to field serious large models without the exact same budget or hardware base.

Separate “credible large-scale model builder” from “frontier leader” in your planning. Fast-following labs can become strategically relevant well before they match the very largest training runs.

Attribution:

throwa356262 #1 #2
mrngld #1

N-gram embeddings are the real technical novelty

The most concrete architectural point people pulled out was the continued use of n-gram embeddings, which LongCat had explored in earlier, smaller releases. That matters because most launch chatter collapses into parameter counts, while this is one of the few specifics that could plausibly affect efficiency or capability in a nontrivial way. Commenters connected it to other efficiency work like low-bit models and saw it as part of a broader pattern of practical model-design experiments happening outside the big US labs.

When the weights land, look past benchmark tables and inspect the architecture. If you care about efficiency, features like n-gram embeddings may be more reusable than whatever headline score this model posts.
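To make the idea concrete, here is a minimal sketch of one way n-gram embeddings can work: each position gets its ordinary token embedding plus a second vector looked up by hashing the n-gram that ends there. This is an illustration only, not LongCat's actual design, which the announcement summarized here does not spell out. The table sizes, the hash function, and every name in the block are hypothetical.

```python
import random

VOCAB_SIZE = 1000      # hypothetical vocabulary size
NGRAM_BUCKETS = 4096   # hypothetical number of hash buckets for n-grams
DIM = 8                # hypothetical embedding width


def make_table(rows: int, dim: int, seed: int) -> list:
    """Build a small random embedding table (stands in for learned weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


TOKEN_TABLE = make_table(VOCAB_SIZE, DIM, seed=0)
NGRAM_TABLE = make_table(NGRAM_BUCKETS, DIM, seed=1)


def ngram_bucket(ngram: tuple) -> int:
    """Deterministic polynomial hash of a token-id tuple into a bucket index."""
    h = 0
    for tok in ngram:
        h = (h * 1_000_003 + tok + 1) % NGRAM_BUCKETS
    return h


def embed(token_ids: list, n: int = 2) -> list:
    """Per-position vector = token embedding + embedding of the n-gram ending there."""
    out = []
    for i, tok in enumerate(token_ids):
        vec = list(TOKEN_TABLE[tok])
        if i + 1 >= n:  # a full n-gram is available at this position
            bucket = ngram_bucket(tuple(token_ids[i + 1 - n : i + 1]))
            vec = [a + b for a, b in zip(vec, NGRAM_TABLE[bucket])]
        out.append(vec)
    return out
```

In this sketch the same token receives a different vector depending on what precedes it, so some local context is captured at the embedding layer before any attention runs. Hashing into a fixed number of buckets keeps the table size bounded at the cost of occasional collisions.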

Attribution:

Imustaskforhelp #1

Ad hoc LLM testing is easy to overread

The niche nuclear-fuel test sparked a better point than the result itself. One-shot prompts on obscure domains mostly show that evaluation design is hard, because wording, hidden assumptions, and randomness all matter. A more useful test would supply source material in context and probe whether the model can reason over it. That shifts the question from memorized trivia to applied competence, which is closer to how many teams actually use these systems.

Do not greenlight or reject a model on a single clever stump question. Build evals around your real workflows, include the context your product would provide, and run enough trials to see variance.
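A rough sketch of what that looks like in practice: build the prompt with the source material included, ask it many times, and report a pass rate with an error estimate instead of a single verdict. The stub model below is a hypothetical stand-in for a real model call, and the function names are made up for illustration.

```python
import random
import statistics
from typing import Callable


def run_trials(
    model: Callable[[str], str],
    prompt: str,
    grade: Callable[[str], bool],
    n_trials: int = 20,
) -> dict:
    """Ask the same prompt n_trials times and summarize the pass rate."""
    outcomes = [1.0 if grade(model(prompt)) else 0.0 for _ in range(n_trials)]
    pass_rate = statistics.mean(outcomes)
    # Standard error of a proportion: a rough guide to how noisy the estimate is.
    std_err = (pass_rate * (1 - pass_rate) / n_trials) ** 0.5
    return {"pass_rate": pass_rate, "std_err": std_err, "n": n_trials}


def with_context(question: str, source_text: str) -> str:
    """Build a prompt that supplies the source material the product would provide."""
    return (
        "Use only the source below to answer.\n\n"
        f"Source:\n{source_text}\n\n"
        f"Question: {question}"
    )


def make_stub_model(p_correct: float, seed: int = 0) -> Callable[[str], str]:
    """Hypothetical stand-in for a model: answers correctly with probability p_correct."""
    rng = random.Random(seed)
    return lambda prompt: "42" if rng.random() < p_correct else "unsure"


if __name__ == "__main__":
    model = make_stub_model(p_correct=0.7)
    prompt = with_context("What is the answer?", "The answer is 42.")
    print(run_trials(model, prompt, grade=lambda out: "42" in out, n_trials=50))
```

Swapping the stub for a real API call and the lambda for a domain-specific grader is the real work. The point of the structure is that one pass or one failure never decides the outcome.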

Attribution:

bel8 #1
icepush #1
teaearlgraycold #1

Open model does not mean locally usable

Even with mixture-of-experts sparsity, a 1.6-trillion-parameter model with 48 billion active is far outside normal local setups. People argued over the exact practical cutoff, but the common point held: this is not a llama.cpp-on-a-laptop release for most users. Specialized high-memory machines and aggressive quantization might make experimentation possible, yet bandwidth and tooling are still the real constraints. The term “open” here points more to access and ecosystem than to broad personal deployability.

If your strategy depends on local deployment, screen announcements for active parameter size, memory footprint, and runtime support before getting excited. Many “open” launches are only open to teams with serious inference hardware.
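The memory screen is simple arithmetic. The sketch below uses the parameter counts from the announcement and computes weight storage alone at a few bit widths. The bit widths are illustrative assumptions, and the figures ignore KV cache, activations, and runtime overhead, so real requirements are higher.

```python
# Back-of-the-envelope weight memory for a sparse MoE model.
# Parameter counts come from the announcement; bit widths are illustrative.

TOTAL_PARAMS = 1.6e12   # total parameters claimed
ACTIVE_PARAMS = 48e9    # parameters active per token claimed


def weight_gigabytes(num_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def footprint_table(bit_widths=(16, 8, 4)) -> dict:
    """Map each bit width to (total GB, active-per-token GB)."""
    return {
        bits: (
            weight_gigabytes(TOTAL_PARAMS, bits),
            weight_gigabytes(ACTIVE_PARAMS, bits),
        )
        for bits in bit_widths
    }


if __name__ == "__main__":
    print(f"active fraction: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")
    for bits, (total_gb, active_gb) in footprint_table().items():
        print(f"{bits:>2}-bit: total {total_gb:,.0f} GB, active {active_gb:,.0f} GB")
```

At 4 bits per weight the full set of experts still needs about 800 GB, even though only about 24 GB of weights are touched per token. Sparsity cuts compute per token, but every expert still has to live somewhere the runtime can reach quickly, which is why the active parameter count alone is a misleading guide to local usability.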

Attribution:

lcampbell #1
nl #1
aetherspawn #1
hnfong #1

Missing artifacts undercut the launch

The absence of downloadable weights, the weak tooling support, and broken or missing Hugging Face assets made several readers treat the release as incomplete at best. That matters because open-model credibility now depends on operational details. Can people actually run it, integrate it, and inspect it? A glossy blog post without artifacts lands closer to marketing than to a meaningful open release, especially when the model lineage is already in question.

Judge open-model announcements by the release package, not the blog copy. Weights, licenses, runtime compatibility, and reproducible docs are what determine whether your team can do anything with it.

Attribution:

blagui #1
gwerbin #1
james2doyle #1
tcper #1
yorwba #1

Against the grain

This may be much closer to DeepSeek than advertised

Skeptics argued that the public materials do not make it obvious where LongCat ends and DeepSeek begins. With the preview release timing lining up with DeepSeek V4-Pro and key architecture choices looking familiar, the burden is on Meituan to show what is actually new. That does not mean there is no contribution. It means the current evidence is too thin to distinguish an original large training effort from a heavily derivative model release.

Until the full report and weights arrive, treat novelty claims conservatively. If your team tracks model vendors, keep separate notes for architecture reuse, post-training differences, and independently verified training claims.

Attribution:

doctorpangloss #1
MikuMikuMe #1

Political refusals still limit Chinese model utility

One simple test produced a refusal on a Mao question, and commenters treated that as expected rather than surprising. For many global use cases that is not a side issue. It is a product constraint, because refusal patterns leak into summarization, search, and enterprise knowledge work in ways that are hard to predict from benchmarks alone.

If you serve international users or sensitive domains, add political and historical prompts to your acceptance tests. Capability benchmarks will not tell you where a model’s hard refusal boundaries sit.
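One way to wire that into an acceptance suite is a refusal-rate gate over a fixed prompt set. The sketch below is a deliberately crude heuristic: the marker phrases are hypothetical examples, real refusals vary by model and language, and a production check would likely need a classifier or human review. The `model` argument is any callable that takes a prompt and returns text.

```python
from typing import Callable, Iterable

# Hypothetical marker phrases; extend these from refusals you actually observe.
REFUSAL_MARKERS = (
    "i can't help",
    "i cannot help",
    "i'm unable to",
    "let's talk about something else",
)


def looks_like_refusal(response: str) -> bool:
    """Crude keyword check; normalizes curly apostrophes before matching."""
    text = response.lower().replace("\u2019", "'")
    return any(marker in text for marker in REFUSAL_MARKERS)


def refusal_report(
    model: Callable[[str], str],
    prompts: Iterable[str],
    max_refusal_rate: float = 0.0,
) -> dict:
    """Run every prompt once and report which ones were refused."""
    prompts = list(prompts)
    refused = [p for p in prompts if looks_like_refusal(model(p))]
    rate = len(refused) / len(prompts) if prompts else 0.0
    return {
        "refusal_rate": rate,
        "refused_prompts": refused,
        "passed": rate <= max_refusal_rate,
    }
```

Keeping the list of refused prompts, not just the rate, matters here: the goal is to learn where the boundaries sit so you can decide whether they intersect your product's use cases.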

Attribution:

mlmonkey #1
gitowiec #1

In plain English

ASIC

Application-Specific Integrated Circuit, a chip designed for a narrow class of tasks rather than general-purpose computing.

DeepSeek V4-Pro

A large language model from DeepSeek that commenters suspected LongCat may partly build on or resemble.

llama.cpp

A widely used open source project for running language models efficiently on local machines.

n-gram embeddings

A method that represents short sequences of tokens together, rather than only individual tokens, to capture local patterns more directly.

pretraining

The initial large-scale training phase where a model learns patterns from massive amounts of text or other data before any task-specific tuning.

quantization

A technique that reduces the precision of model weights to shrink memory use and speed up inference, often with some quality tradeoff.

Reference links

Model and release pages

LongCat-2.0 on Hugging Face
Used to show that the model page existed but weights were not yet available.
LongCat Flash Chat on Hugging Face
Cited as evidence that Meituan had previously published a LongCat model and was not obviously faking the project.

Hardware and ecosystem references

Nitter post speculating about Huawei Ascend 910C use
Referenced to support the inference that LongCat may have used Huawei Ascend chips.
Dwarkesh Patel interview with Jensen Huang
Linked in a subthread about why the software and hardware ecosystem around GPUs matters as much as the chips themselves.

Background on Meituan

Meituan Wikipedia page
Shared to show that Meituan is a broad tech conglomerate, not just a food-delivery app.
Wang Xing Wikipedia page
Linked for background on Meituan’s founder.

Prior discussion

Earlier Hacker News submission about LongCat n-gram embeddings
Provided as background for the comment that n-gram embeddings were a notable ongoing LongCat research direction.
