Fixing a kubelet memory leak in Kubernetes 1.36
- Infrastructure
- Open Source
- Programming
This post is a debugging writeup by the engineer who found a kubelet memory leak in Kubernetes 1.36.0, 1.36.1, and 1.36.2. The kubelet is the per-node agent that manages pods, so a leak there grows on every affected node. The bug came from Go context cancellation logic in the pod sync path: a cancel function was stored but never called, so the memory tied to it stayed reachable and accumulated across repeated syncs. The author recommends confirming the cluster version with `kubectl version`, inspecting each node's kubelet `process_resident_memory_bytes` metric, and restarting the kubelet as a stopgap until 1.36.3 ships.
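The stored-but-never-called cancel function described above can be illustrated with a minimal Go sketch. This is not the kubelet's code: the `syncer` type, its methods, and the `retained` helper are hypothetical names invented for the illustration. It only shows the general pattern, in which each sync derives a child context and keeps the cancel function around, so the child context stays reachable for as long as the long-lived owner does.

```go
package main

import (
	"context"
	"fmt"
)

// syncer is a hypothetical stand-in for a long-lived component that
// derives a fresh context on every sync. It is not kubelet code.
type syncer struct {
	parent  context.Context
	cancels []context.CancelFunc // leaky variant: stored, never called
}

// syncLeaky derives a child context and stores its cancel function
// without ever calling it. Each stored function keeps its child
// context reachable, so memory grows with the number of syncs.
func (s *syncer) syncLeaky() {
	_, cancel := context.WithCancel(s.parent)
	s.cancels = append(s.cancels, cancel)
}

// syncFixed cancels the child context as soon as the work returns,
// so nothing from this sync outlives the call.
func (s *syncer) syncFixed(work func(context.Context)) {
	ctx, cancel := context.WithCancel(s.parent)
	defer cancel()
	work(ctx)
}

// retained runs n syncs and reports how many cancel functions the
// syncer is still holding afterwards.
func retained(n int, leaky bool) int {
	s := &syncer{parent: context.Background()}
	for i := 0; i < n; i++ {
		if leaky {
			s.syncLeaky()
		} else {
			s.syncFixed(func(context.Context) {})
		}
	}
	return len(s.cancels)
}

func main() {
	fmt.Println("leaky:", retained(1000, true))  // prints "leaky: 1000"
	fmt.Println("fixed:", retained(1000, false)) // prints "fixed: 0"
}
```

The usual Go idiom is to pair every `context.WithCancel` with a `defer cancel()` in the same function. `go vet` flags a discarded cancel function, but it may not flag one that is stored and then forgotten, which is how this kind of leak can get past review.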
If you run Kubernetes 1.36.0 through 1.36.2, check kubelet resident memory on every node now instead of assuming liveness probes or basic health checks will catch this. More broadly, treat control-plane and node-agent resource regressions as something to monitor explicitly after upgrades, especially on small nodes, where a leak tends to surface first as evicted workloads rather than as a killed kubelet.
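As a rough sketch of the per-node check, the Go snippet below pulls `process_resident_memory_bytes` out of a Prometheus-format metrics dump and compares it to a limit. The assumptions are that you already have the kubelet's metrics text in hand (for example from its `/metrics` endpoint), that the metric appears without labels, and that the sample values and the 512 MiB limit are made up for illustration. `residentBytes` is a hypothetical helper, not part of any Kubernetes client library.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// metricName is the standard process metric named in the post.
const metricName = "process_resident_memory_bytes"

// residentBytes scans Prometheus text-format output and returns the
// value of the unlabeled process_resident_memory_bytes sample.
// The second result is false if the metric is missing or malformed.
func residentBytes(metricsText string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(metricsText))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Skips blank lines, "# HELP"/"# TYPE" comments, and other metrics.
		if len(fields) < 2 || fields[0] != metricName {
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return 0, false
		}
		return v, true
	}
	return 0, false
}

func main() {
	// Made-up sample; real input would come from each node's kubelet.
	sample := `# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 1.48471808e+08
`
	const limitBytes = 512 * 1024 * 1024 // hypothetical alert threshold

	rss, ok := residentBytes(sample)
	if !ok {
		fmt.Println("metric not found")
		return
	}
	fmt.Printf("kubelet RSS: %.0f bytes, over limit: %v\n", rss, rss > limitBytes)
}
```

A single reading says little about a leak. Sampling the same node repeatedly and watching for steady growth is the more useful signal, and the right threshold depends on node size and pod count.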
- heyoncall.com