Open Reproduction of DeepSeek-R1

AI
Open Source
Machine Learning
Developer Tools

The submitted repo is Hugging Face's Open-R1 project, an attempt to recreate DeepSeek-R1 in the open. The page sounds ambitious, but the key update buried in the repo is that only step 1 is done: a 350,000-example reasoning dataset called Mixture-of-Thoughts and a recipe for a 7B distilled model meant to match DeepSeek-R1-Distill-Qwen-7B. Commenters who read closely pointed out that this is not the same as reproducing the full R1 model or its training pipeline.

If you care about genuinely reproducible LLM training, treat Open-R1 as a partial artifact and benchmark it against projects like OLMo, Nemotron, and OpenThoughts that expose more of the stack. For strategy and budgeting, assume that "open" and "reproduced" still need to be checked line by line, especially around datasets, validators, and training recipes.

June 11, 2026
github.com

Key insights

Validator shortcuts undermine reproduction claims

The Open-R1 rewards code cited in the thread carries a TODO for a proper validator and falls back to exact line-by-line stdout matching, which shows why many reproduction efforts look stronger in announcements than in implementation. For a reasoning model, evaluation logic is part of the result. If that piece is brittle or unfinished, claims of matching reported capability are hard to trust even when weights and scripts are public.

When you assess an open model project, inspect the evaluators before you trust the benchmarks. If your team is building on one of these repos, budget time to replace placeholder reward and validation code.

Attribution:

spmurrayzzz #1
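
To make the validator concern concrete, here is a minimal Python sketch that contrasts exact line-by-line stdout matching with a slightly more tolerant comparison. It is not Open-R1's code and does not come from the commenters: the function names `exact_match` and `tolerant_match` and the `rel_tol` default are hypothetical, chosen only for illustration.

```python
import math


def exact_match(expected: str, actual: str) -> bool:
    """Brittle baseline: every stdout line must match character for character."""
    return expected.split("\n") == actual.split("\n")


def _tokens_equal(a: str, b: str, rel_tol: float) -> bool:
    """Compare numerically when both tokens parse as floats, else as text."""
    try:
        return math.isclose(float(a), float(b), rel_tol=rel_tol)
    except ValueError:
        return a == b


def tolerant_match(expected: str, actual: str, rel_tol: float = 1e-6) -> bool:
    """Ignore surrounding blank lines, spacing within lines, and tiny float drift."""
    exp_lines = [line.split() for line in expected.strip().splitlines()]
    act_lines = [line.split() for line in actual.strip().splitlines()]
    if len(exp_lines) != len(act_lines):
        return False
    for exp, act in zip(exp_lines, act_lines):
        if len(exp) != len(act):
            return False
        if not all(_tokens_equal(e, a, rel_tol) for e, a in zip(exp, act)):
            return False
    return True


# A correct answer with one trailing space fails the exact check.
print(exact_match("3.0\n", "3.0 \n"))     # False
print(tolerant_match("3.0\n", "3.0 \n"))  # True
```

Even the tolerant version is only a starting point. Problems with several valid outputs need a task-specific checker, which is the kind of work a TODO in reward code tends to defer.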

OLMo is the clearest open baseline

OLMo stood out because it releases the full datasets, not just weights and a recipe, and one commenter pointed to an independent reproduction by AMD as evidence that outsiders can actually rebuild something close to the original. Nemotron was treated as useful but weaker on openness because NVIDIA publishes only part of the training data blend. That difference matters more than model branding because the missing data is exactly what blocks outside verification.

If you want a serious reference for open LLM operations, start with OLMo and use Nemotron as a partial template. Ask vendors and research teams for dataset completeness, not just model cards and training scripts.

Attribution:

aesthesia #1
achrono #1
lambda #1

OpenThoughts is stronger on data curation

OpenThoughts got attention because it ships a widely used reasoning dataset and explains how that data was curated, which is the part many projects still wave away. Commenters also noted recent 32B Qwen3-based releases on Hugging Face, suggesting the project is still moving even if the public blog looks quiet. That made it a more actionable source for reasoning-data methodology than Open-R1's still-aspirational later steps.

If your bottleneck is reasoning data rather than base-model pretraining, study OpenThoughts before copying Open-R1. The curation recipe is likely to be more reusable than a headline claim about eventual full reproduction.

Attribution:

madiator #1
lambda #1
poppafuze #1

Training cost claims remain too fuzzy

The cost figures cited in the discussion span nearly an order of magnitude. One commenter cited DeepSeek's own claim that R1 training cost $294,000, then contrasted it with OLMo 3's estimated market-rate cost of $2.75 million. That gap points to the same core problem as the reproducibility debate: published numbers often hide donated compute, omitted stages, or selective accounting.

Do not use splashy training-cost numbers for planning without breaking them into compute, data, and post-training stages. For budgeting, carry scenarios from low seven figures upward unless the team also publishes auditable assumptions.

Attribution:

lambda #1
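
One way to act on the cost advice is to itemize cost per stage rather than carry a single headline number. The Python sketch below is illustrative only: the `Stage` fields, the stage names, and every number in `stages` are hypothetical placeholders, not estimates for any real model. The only figures taken from the discussion are the $294,000 and $2.75 million values passed to `headline_gap`.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    gpu_hours: float          # accelerator hours consumed by this stage
    usd_per_gpu_hour: float   # market rental rate, not a donated or internal rate
    data_usd: float = 0.0     # licensing, generation, and annotation spend

    @property
    def cost(self) -> float:
        return self.gpu_hours * self.usd_per_gpu_hour + self.data_usd


def total_cost(stages: list[Stage]) -> float:
    """Sum compute and data spend across all training stages."""
    return sum(stage.cost for stage in stages)


def headline_gap(reported_usd: float, itemized_usd: float) -> float:
    """How many times larger the itemized estimate is than the headline number."""
    return itemized_usd / reported_usd


# Placeholder stages with made-up numbers, shown only to exercise the structure.
stages = [
    Stage("pretraining", gpu_hours=100_000, usd_per_gpu_hour=2.0, data_usd=50_000),
    Stage("post-training", gpu_hours=10_000, usd_per_gpu_hour=2.0),
]
print(total_cost(stages))  # 270000.0

# The two figures cited in the discussion.
print(round(headline_gap(294_000, 2_750_000), 1))  # 9.4
```

Keeping each stage as its own row makes it obvious when a published figure covers only one of them.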

Against the grain

Age alone makes the repo irrelevant

Calling the project simply too old cuts against the more nuanced view that partial open artifacts still have value. The dissenting point is that the repo no longer tracks the frontier closely enough to anchor current expectations about reasoning-model reproduction, regardless of its original ambition.

If you need a current competitive stack, do not anchor on older replication efforts just because they were widely discussed. Check the last substantive milestone before treating a repo as a live reference.

Attribution:

yieldcrv #1

In plain English

7B

A model with roughly 7 billion parameters, a common size label for large language models.

AMD

Advanced Micro Devices, a major CPU and GPU vendor.

DeepSeek-R1

A reasoning-focused large language model from DeepSeek that was widely discussed for its performance and reportedly low training cost.

Mixture-of-Thoughts

A curated dataset of reasoning examples released as part of Open-R1's first step.

Nemotron

A model family from NVIDIA that commenters treated as a useful but only partly open reference, because NVIDIA publishes only part of the training data blend.

OLMo

A family of AI models from the Allen Institute for AI positioned as highly transparent and reproducible, with full training datasets and training details released publicly.

Open-R1

A Hugging Face project that aims to openly reproduce the DeepSeek-R1 training process and related models.

Qwen3

A family of large language models from Alibaba. The recent 32B OpenThoughts releases mentioned in the discussion are based on it.

stdout

Standard output, the normal text stream a program writes as its output.

weights

The learned numerical parameters inside a machine learning model that determine how it behaves.

Reference links

Open model projects

Open-R1 GitHub repository
The submitted project attempting an open reproduction of DeepSeek-R1.
OLMo GitHub repository
Presented as the best example of a fully open modern LLM training pipeline.
Nemotron GitHub repository
Referenced as a strong modern model family with an open recipe but incomplete released data.
OpenThoughts
Suggested as a better resource for reasoning-data curation and smaller reasoning models.

Documentation and technical evidence

Open-R1 rewards validator code
Used to show that a key validator still falls back to exact string matching on stdout.
Nemotron pretraining documentation
Quoted to show that only part of Nemotron's training data blend is openly released.
OpenThinker Agent Complete collection
Referenced as evidence that OpenThoughts has recent 32B Qwen3-based model releases.
AMD 1B language model article
Cited as an independent reproduction related to OLMo.

Cost references

TechSpot on DeepSeek R1 training cost claim
Source for the cited claim that DeepSeek said R1 cost $294,000 to train.
OLMo 3 paper
Referenced for OLMo 3's estimated market-rate training cost of $2.75 million.