Rio de Janeiro's "homegrown" LLM appears to be a merge of an existing model

AI
Open Source
Government
Machine Learning

Rio’s municipal IT arm released Rio-3.5-Open-397B as a locally developed Qwen 3.5 derivative that supposedly beat comparable open models and had been trained with public backing. The linked GitHub issue claims the uploaded weights were not a freshly post-trained model but an almost exact blend of 60 percent Nex-N2 Pro and 40 percent Qwen 3.5. Because Nex is itself a Qwen-derived model, this is not some impossible cross-model hack. It is a known merge technique that can work when models share the same architecture and training lineage.

Once people noticed the model answering with Nex’s name and compared tensors, the official account shifted. The Hugging Face page was updated to say the release was a merge followed by on-policy distillation, and that the wrong checkpoint had been uploaded by mistake. Most commenters did not buy that explanation, largely because no corrected model appeared quickly and the archived benchmarks looked roughly halfway between Qwen and Nex, which is what a simple merge would suggest.

The conversation landed in two places. First, the scandal is about misrepresenting process, not about violating some sacred ownership norm in open weights. Second, model merging itself is real but easy to oversell. It can nudge benchmark scores, especially when combining a base model with one of its fine-tunes, yet it often produces chimera models that look better on narrow evals than in broad real use.

Treat flashy model launches the way you treat security claims. Ask for model cards, reproducible evals, and evidence of the actual training pipeline before you give credit, budget, or press coverage.

June 14, 2026
github.com

Key insights

Why the merge could work at all

Because Nex-N2 Pro is itself a Qwen 3.5 derivative, the alleged blend is not mixing unrelated brains. It is combining a base model with one of its descendants, which makes direct weight interpolation much more plausible. Comments pointed to prior work on model soups and loss-surface smoothness to explain why these blends can stay functional instead of collapsing.

Do not generalize from this case to “any two models can be merged.” If you are evaluating or attempting model merges, first check shared architecture, tokenizer, and lineage.
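
As a rough illustration of what direct weight interpolation means, here is a minimal sketch in plain Python. The names `Weights`, `check_compatible`, and `linear_merge` are hypothetical, the tensors are toy lists and not real checkpoints, and the 0.6 weighting simply mirrors the ratio alleged in the GitHub issue. The compatibility check only covers tensor names and sizes; tokenizer and lineage would still need to be verified separately.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: tensor name -> flattened values.
Weights = Dict[str, List[float]]


def check_compatible(a: Weights, b: Weights) -> None:
    """Raise ValueError unless both checkpoints share tensor names and sizes."""
    if a.keys() != b.keys():
        differing = sorted(set(a) ^ set(b))
        raise ValueError(f"tensor names differ: {differing}")
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"size mismatch in {name}")


def linear_merge(a: Weights, b: Weights, alpha: float) -> Weights:
    """Return alpha * a + (1 - alpha) * b, tensor by tensor."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    check_compatible(a, b)
    return {
        name: [alpha * x + (1.0 - alpha) * y for x, y in zip(a[name], b[name])]
        for name in a
    }


# Toy values only (assumption); real checkpoints hold billions of parameters.
finetune = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.5]}
base = {"layer0.weight": [0.0, 1.0], "layer0.bias": [0.0]}
merged = linear_merge(finetune, base, alpha=0.6)
```

The check fails fast when the two checkpoints do not line up, which is the cheapest of the compatibility tests mentioned above.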

Attribution:

x312 #1
oofbey #1
nightpool #1
bwhitty #1

Benchmark gains are exactly where merges mislead

A few commenters pushed past the drama and looked at the likely performance profile. The archived Rio numbers appeared to sit roughly between Qwen and Nex, which is what you would expect from a weighted blend. That fits a common pattern where merged or surgically modified models show a bump on a few targeted evals, then lose coherence on broader tasks or long reasoning chains.

If a merged model posts surprising leaderboard wins, test it on your own workload before switching. Narrow benchmark uplift is weak evidence for production quality.
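
One way to make "sits roughly between Qwen and Nex" concrete is to compute, per benchmark, where a candidate score falls between the base and the fine-tune. The sketch below uses hypothetical function names and placeholder scores that are not the real Rio, Qwen, or Nex numbers. Benchmark scores need not move linearly with merge weights, so treat this as a screening heuristic, not proof.

```python
from typing import Dict, Optional


def implied_fraction(base: float, finetune: float, candidate: float) -> Optional[float]:
    """Position of candidate between base (0.0) and finetune (1.0).

    Returns None when base and finetune score the same, since the
    position is then undefined.
    """
    gap = finetune - base
    if gap == 0:
        return None
    return (candidate - base) / gap


def fractions_by_benchmark(
    base: Dict[str, float],
    finetune: Dict[str, float],
    candidate: Dict[str, float],
) -> Dict[str, Optional[float]]:
    """Apply implied_fraction to every benchmark present in `base`."""
    return {
        name: implied_fraction(base[name], finetune[name], candidate[name])
        for name in base
    }


# HYPOTHETICAL placeholder scores for illustration only.
base_scores = {"eval_a": 60.0, "eval_b": 40.0}
finetune_scores = {"eval_a": 70.0, "eval_b": 50.0}
candidate_scores = {"eval_a": 66.0, "eval_b": 45.0}

fractions = fractions_by_benchmark(base_scores, finetune_scores, candidate_scores)
for name, frac in fractions.items():
    print(name, frac)
```

Fractions that stay between 0 and 1 on every benchmark are consistent with a blend; values well outside that range on some evals would point to additional training or a different origin.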

Attribution:

Aurornis #1
andai #1
manquer #1
avereveard #1

The giveaway was the model naming itself

The fastest clue was behavioral, not forensic. Without a system prompt, the model reportedly identified itself as Nex, which suggests fine-tuned identity text survived inside the weights. That is a useful reminder that post-training leaves recognizable fingerprints, and so does failing to do the post-training you claimed.

Simple prompting can expose provenance issues before you run deeper analysis. Ask a model about its identity, style, and baked-in behaviors when you are vetting a supposedly new release.
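
A minimal sketch of such an identity probe follows. The `generate` argument is an assumed callable that sends one prompt with no system prompt and returns the reply; it is not a real library API, and `fake_generate` is a stub so the example runs without a model. The prompts and the list of names are illustrative.

```python
from typing import Callable, Dict, List

# Illustrative prompts (assumption); extend with style and behavior probes.
IDENTITY_PROMPTS = [
    "Who are you?",
    "What model are you, and who trained you?",
    "What is your name?",
]


def probe_identity(
    generate: Callable[[str], str], known_names: List[str]
) -> Dict[str, int]:
    """Count how many replies mention each known model name.

    Matching is a crude case-insensitive substring test, so short names
    can produce false positives inside longer words.
    """
    counts = {name: 0 for name in known_names}
    for prompt in IDENTITY_PROMPTS:
        reply = generate(prompt).lower()
        for name in known_names:
            if name.lower() in reply:
                counts[name] += 1
    return counts


def fake_generate(prompt: str) -> str:
    """Stub standing in for a real model call (hypothetical reply text)."""
    return "I am Nex, a large language model."


hits = probe_identity(fake_generate, ["Nex", "Qwen", "Rio"])
print(hits)
```

With a real model, sample several replies per prompt, since identity answers can vary between generations.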

Attribution:

jdiff #1 #2

The public funding angle raised the stakes

What turned this from ordinary open-model drama into a political story was the public bragging. Commenters pointed to the mayor’s post describing the model as publicly funded and trained in Rio over the last year. Once officials tie civic prestige and taxpayer money to a technical claim, a sloppy checkpoint story stops looking like harmless launch chaos.

If public money or executive sponsorship is involved, demand artifact-level auditability before the announcement. Governance risk shows up faster than model quality risk.

Attribution:

jdiff #1
low_tech_love #1
mgambati #1

Against the grain

The wrong-checkpoint explanation is still testable

One defense held that the team may have uploaded the pre-distillation merge while the real contribution was on-policy distillation applied afterward. That would not excuse the launch, but it would change the technical conclusion from “pure fabrication” to “bad release hygiene plus bad communications.” The claim lives or dies on whether a distinct final checkpoint ever appears and can be verified.

Leave a narrow lane open for operational error until the artifacts settle. If a team says the wrong model was uploaded, the next move is simple: wait for the replacement and compare weights and evals.
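
Comparing weights can be sketched as a least-squares fit: for a given tensor, find the alpha that best explains the candidate as alpha * a + (1 - alpha) * b, then look at what is left over. The function name `fit_blend` and the toy vectors below are hypothetical, and this is not necessarily the method used in the GitHub issue. A near-zero residual with a stable alpha across tensors suggests a plain merge; a checkpoint that really went through further distillation should leave a noticeable residual.

```python
from math import sqrt
from typing import List, Tuple


def fit_blend(candidate: List[float], a: List[float], b: List[float]) -> Tuple[float, float]:
    """Fit candidate ~= alpha * a + (1 - alpha) * b by least squares.

    Returns (alpha, relative_residual), where the residual norm is divided
    by the norm of the candidate tensor.
    """
    if not (len(candidate) == len(a) == len(b)):
        raise ValueError("tensors must have the same length")
    d = [x - y for x, y in zip(a, b)]           # direction a - b
    r = [c - y for c, y in zip(candidate, b)]   # candidate - b
    dd = sum(x * x for x in d)
    if dd == 0:
        raise ValueError("a and b are identical; alpha is undefined")
    alpha = sum(ri * di for ri, di in zip(r, d)) / dd
    residual = sqrt(sum((ri - alpha * di) ** 2 for ri, di in zip(r, d)))
    norm = sqrt(sum(c * c for c in candidate))
    return alpha, (residual / norm if norm else residual)


# Toy vectors (assumption), standing in for one flattened tensor each.
model_a = [1.0, 2.0, -1.0]
model_b = [0.0, 1.0, 1.0]
pure_merge = [0.6 * x + 0.4 * y for x, y in zip(model_a, model_b)]
further_trained = [0.6, 1.6, 5.0]  # last value moved away from the blend

alpha_merge, resid_merge = fit_blend(pure_merge, model_a, model_b)
alpha_other, resid_other = fit_blend(further_trained, model_a, model_b)
```

On real checkpoints the same fit would be run per tensor, and the distribution of alphas and residuals compared between the original upload and any replacement.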

Attribution:

rafaquintanilha #1

A city experimenting with local AI is reasonable

A few Brazilian commenters argued that the embarrassing launch should not erase the underlying goal. They would rather see government invest in domestic AI capability than depend entirely on foreign vendors, especially in countries without a strong private AI sector. The bad part is the apparent misrepresentation, not the idea of public-sector model work itself.

Do not let this incident harden into “governments should never build AI.” Separate the legitimacy of strategic local capability from whether this specific project earned trust.

Attribution:

thimabi #1 #2

In plain English

architecture

The structural design of a neural network, including layer layout, dimensions, and other core building choices.

checkpoint

A saved snapshot of a model's weights at a particular stage of training or post-training.

evals

Evaluations, usually repeatable tests used to measure how well a model performs on specific tasks.

Hugging Face

A company and platform widely used to host, share, and run machine learning models and datasets.

model soups

A research term for combining multiple trained models or fine-tunes by averaging their weights.

Nex-N2 Pro

A separately released open-weight language model that commenters say was itself derived from Qwen 3.5.

on-policy distillation

A distillation method where training uses outputs generated by the current student model, then scores or improves them using a teacher.

post-training

The work done after a base model is trained, such as tuning behavior, improving safety, or adapting it to tasks.

Qwen

A family of language models from Alibaba. Qwen 3.5 is the base that both Rio-3.5-Open-397B and Nex-N2 Pro are described as deriving from.

Reference links

Primary evidence and official model pages

GitHub issue on Nex-N2 repo alleging Rio model is a merge
Primary source for the weight-comparison allegation and provenance investigation
Rio-3.5-Open-397B Hugging Face page
Official model page that commenters say was updated to add the merge explanation
Nex-N2-Pro Hugging Face page
Reference model used in the alleged merge and benchmark comparisons

Archived evidence and public statements

Archived Rio model card on Wayback Machine
Used to compare the earlier benchmark claims and pre-edit model card text
Mayor’s X post describing the model as publicly funded and trained in Rio
Cited as evidence that public officials were publicly claiming a homegrown, publicly funded training effort

Model merging references

Hugging Face PEFT model merging guide
Practical explainer for how direct weight merging works
Model soups paper
Frequently cited paper showing that averaging compatible model weights can work surprisingly well
Mode connectivity and linear interpolation paper
Cited to explain why linear combinations of trained models can remain in a low-loss region
SwiReasoning paper
Paper named on the Rio model card as part of the claimed inference or training approach

Related background references

Folklore.org: A Rich Neighbor Named Xerox
Source for the Bill Gates quote used as an analogy about copying and attribution
MIT Thickets
Dropped into the conversation as a related reference on representation geometry and model internals
