Rio de Janeiro's "homegrown" LLM appears to be a merge of an existing model
- AI
- Open Source
- Government
- Machine Learning
Rio’s municipal IT arm released Rio-3.5-Open-397B as a locally developed Qwen 3.5 derivative that supposedly beat comparable open models and had been trained with public backing. The linked GitHub issue claims the uploaded weights were not a fresh post-trained model, but an almost exact blend of 60 percent Nex-N2 Pro and 40 percent Qwen 3.5. That matters because Nex itself is already a Qwen-derived model, so this is not some impossible cross-model hack. Blending weights this way is a known merge technique that can work when models share the same architecture and training lineage.
Once people noticed the model answering with Nex’s name and compared tensors, the official account changed. The Hugging Face page was updated to say the release was a merge followed by on-policy distillation, and that the wrong checkpoint had been uploaded by mistake. Most people did not buy that explanation, largely because no corrected model appeared quickly and the archived benchmarks looked roughly halfway between Qwen and Nex, which is exactly what a simple merge would suggest.
The conversation landed in two places. First, the scandal is about misrepresenting process, not about violating some sacred ownership norm in open weights. Second, model merging itself is real but easy to oversell. It can nudge benchmark scores, especially when combining a base model with one of its fine-tunes, yet it often produces chimera models that look better on narrow evals than in broad real use.
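To make the tensor comparison concrete, here is a minimal sketch of how a fixed-ratio blend can be detected. It is not the method used in the GitHub issue, which is not described here. It assumes the merge is a plain per-tensor weighted average, that all three checkpoints share tensor names and shapes, and it uses small random arrays in place of real weights. The function name `estimate_merge_coefficient` and the toy tensor names are hypothetical.

```python
import numpy as np

def estimate_merge_coefficient(merged, model_a, model_b):
    """Fit merged ≈ alpha * model_a + (1 - alpha) * model_b by least squares.

    Each argument is a dict mapping tensor name -> NumPy array.
    Returns (alpha, relative_residual). A residual near zero means the
    merged weights are explained almost entirely by a linear blend.
    """
    num = 0.0
    den = 0.0
    for name in merged:
        diff_ab = (model_a[name] - model_b[name]).ravel().astype(np.float64)
        diff_mb = (merged[name] - model_b[name]).ravel().astype(np.float64)
        num += diff_mb @ diff_ab
        den += diff_ab @ diff_ab
    alpha = num / den

    # Measure how much of the merged weights the fitted blend fails to explain.
    err = 0.0
    norm = 0.0
    for name in merged:
        predicted = alpha * model_a[name] + (1.0 - alpha) * model_b[name]
        err += np.sum((merged[name] - predicted) ** 2)
        norm += np.sum(merged[name] ** 2)
    return alpha, np.sqrt(err / norm)

# Toy stand-ins (assumption): a "base" model and a "fine-tune" derived from it.
rng = np.random.default_rng(0)
base = {"layer.weight": rng.normal(size=(8, 8)), "layer.bias": rng.normal(size=8)}
finetune = {k: v + 0.1 * rng.normal(size=v.shape) for k, v in base.items()}

# Build a 60/40 blend, then try to recover the ratio from the weights alone.
merged = {k: 0.6 * finetune[k] + 0.4 * base[k] for k in base}
alpha, residual = estimate_merge_coefficient(merged, finetune, base)
print(f"estimated alpha = {alpha:.3f}, relative residual = {residual:.2e}")
```

A model that had gone through real additional training after the merge would be expected to leave a noticeably larger residual, since its weights would no longer sit on the line between the two parents.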
Treat flashy model launches the way you treat security claims. Ask for model cards, reproducible evals, and evidence of the actual training pipeline before you give credit, budget, or press coverage.
- github.com