HackerRank open sourced its ATS. My resume scored 90/100. Oh wait 74. No – 88
- AI
- Hiring
- Regulation
- Machine Learning
The post examined HackerRank’s open-source hiring tool, which uses multiple LLM calls to extract resume data and assign a score out of 100 plus bonuses. Running the same resume repeatedly produced materially different scores, enough to move a candidate above or below an arbitrary cutoff. That made the author’s core point easy to grasp even for non-ML readers: if the same input can swing from pass to fail, the score is not a stable measurement. People dug into the mechanics, but the useful conclusion was broader than temperature settings or sampler details. LLM scoring is noisy, and making the output deterministic would not fix the underlying problem: the rubric itself is thin, subjective, and poorly aligned with actual hiring quality.
If you use LLMs to rank or score candidates, treat the output as noisy triage at best, not a decision. Audit the rubric first, measure variance across repeated runs, and assume legal and fairness risk if public artifacts like GitHub activity stand in for job quality.
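To make "measure variance across repeated runs" concrete, here is a minimal sketch in Python. It assumes you have already collected several scores for the same resume from whatever scorer you use. The function name `score_stability` and the cutoff of 80 are hypothetical choices for illustration, not part of HackerRank's tool; the three sample scores are the ones quoted in the post's title.

```python
from statistics import mean, pstdev


def score_stability(scores, cutoff):
    """Summarize how much repeated scores of one resume disagree.

    `scores` holds results from scoring the SAME resume several times.
    `cutoff` is the pass/fail threshold (a hypothetical value here).
    """
    if not scores:
        raise ValueError("need at least one score")
    passes = sum(1 for s in scores if s >= cutoff)
    return {
        "mean": mean(scores),
        "stdev": pstdev(scores),               # population standard deviation
        "spread": max(scores) - min(scores),   # worst-case disagreement
        "pass_rate": passes / len(scores),     # share of runs at or above cutoff
        "flips": 0 < passes < len(scores),     # True if runs disagree on pass/fail
    }


# The three scores from the post's title, checked against an assumed cutoff of 80.
runs = [90, 74, 88]
report = score_stability(runs, cutoff=80)
print(report)
```

With these inputs the spread is 16 points and the runs disagree on pass/fail, which is the situation the post describes. A check like this only measures instability; a low spread says nothing about whether the rubric itself is valid.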
- danunparsed.com