HN Debrief The signal in the discussion

Use your Nvidia GPU's VRAM as swap space on Linux

Infrastructure
Hardware
Linux
Developer Tools

The project exposes Nvidia VRAM as a block device and lets Linux use it as swap. The author built it for a hybrid-graphics laptop where the discrete Nvidia GPU and its VRAM were mostly idle, while the soldered system RAM was the real bottleneck. That framing mattered. People did not treat this as a general memory upgrade. They treated it as a practical hack for machines with fixed RAM, spare VRAM, and workloads that are bursty enough that you can reclaim the swap before launching a game, local model, or GPU job.

The main reaction was "fun idea, rough implementation." The posted benchmark of about 1.3 GB/s on an RTX 3070 laptop looked far too low given PCIe 4.0 bandwidth and GDDR6 bandwidth, and several comments pinned that on the software path, not the hardware. This design goes through user space, the Linux NBD stack, CUDA copies, and bounce buffers. That means lots of context switches and poor handling of tiny 4 KB page traffic, which is exactly the wrong shape for saturating PCIe. A few people noted that NVMe swap is actually a very optimized path already, so VRAM only wins if you care more about latency characteristics or avoiding SSD wear than raw throughput. Even then, the thread landed on a more sobering point. Linux swap itself is a bottleneck at higher speeds because page unmapping and TLB shootdowns are expensive. So this is not just "rewrite the driver and get 20x better numbers."

There was also a hard boundary between "swap" and "real memory." Consumer GPU memory is generally not cache coherent with the CPU, so you cannot just add VRAM to system RAM and call it a 32 GB machine. At best you can use it as a slow, distant memory tier or as block-backed swap. That is why several people pointed to CXL, not PCIe 5, as the thing that would make expansion-card memory behave like actual RAM. Until then, VRAM sits in the awkward middle. Faster than disk in some ways, but still high latency and operationally fragile.

That fragility came up a lot. If VRAM is occupied by swap, starting a game, an LLM, Wayland compositor work, or any other GPU-heavy task can create ugly failure modes unless the swap is drained first. The author said dynamic handoff is still on the to-do list. Laptop users also worried that keeping the discrete GPU "in use" would block power gating and hurt battery life. The consensus was that this is fine for plugged-in systems or desktops, but a bad default for mobile use.

Comments also surfaced that this is not a brand new idea so much as a new implementation. Older Linux approaches used MTD or phram to map video memory. Other projects use OpenCL, FUSE, or similar tricks on AMD and Nvidia. That gave the project some legitimacy as part of a long-running class of hacks, while reinforcing the same lesson every time. It works. It is clever. It is not magic. If your box regularly needs this, the better fix is still more RAM, or at least zram plus a sane disk-backed swap setup.

Idle GPU memory can be repurposed for niche memory pressure problems, but today’s Linux and driver stack make it more of an interesting workaround than a credible replacement for RAM, NVMe swap, or forthcoming coherent memory standards like CXL.

26 May, 2026
github.com

Discussion mood

Curious and positive about the hack itself, but skeptical that it is broadly useful in its current form because the software path is slow, the operational edge cases are nasty, and normal NVMe or zram setups are often good enough.

Key insights

01 The disappointing benchmark is mostly a software architecture problem, not a verdict on VRAM.
Routing swap through NBD, user space, CUDA, and bounce buffers turns simple page traffic into a storm of copies and wakeups, while Linux swap internals add their own ceiling through page unmapping and TLB shootdowns. Even a cleaner driver would still run into kernel memory-management limits before getting close to PCIe or GDDR bandwidth.

The hardware is not the real bottleneck here. Linux swap and the chosen I/O path are.
- Teknoman117 #1
- lstodd #1
- tumblestick #1
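
As a rough illustration of why 4 KB page traffic is the wrong shape for this path, the sketch below converts the posted figure of about 1.3 GB/s into pages per second and an average time budget per page. The throughput and page size come from the discussion. The PCIe comparison value (about 31.5 GB/s, the theoretical figure for a 4.0 x16 link) is an assumption, since the thread does not state the laptop's actual link width, and the function names are illustrative only.

```python
PAGE_SIZE = 4096  # bytes; the 4 KB page size mentioned in the discussion


def pages_per_second(throughput_bytes_per_s, page_size=PAGE_SIZE):
    """How many pages per second a given throughput corresponds to."""
    return throughput_bytes_per_s / page_size


def per_page_budget_us(throughput_bytes_per_s, page_size=PAGE_SIZE):
    """Average time available per page, in microseconds, at that throughput."""
    return 1e6 * page_size / throughput_bytes_per_s


measured = 1.3e9               # bytes/s, the benchmark figure quoted in the thread
assumed_pcie_ceiling = 31.5e9  # bytes/s, ASSUMED theoretical PCIe 4.0 x16 rate

for label, rate in [("measured", measured), ("assumed PCIe ceiling", assumed_pcie_ceiling)]:
    print(f"{label}: {pages_per_second(rate):,.0f} pages/s, "
          f"{per_page_budget_us(rate):.2f} us per page")
```

At the measured rate this works out to roughly 317,000 pages per second, or a little over 3 microseconds per page on average. Under the assumed ceiling the budget shrinks to about 0.13 microseconds per page, which gives a sense of how little room there is for per-page copies and wakeups.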
02 Consumer VRAM is not just "more RAM on another bus."
Without cache coherency, the CPU cannot safely treat GPU memory like normal system memory, which is why this has to be swap or another explicitly managed tier. CXL is the missing piece for real memory expansion over PCIe-style links, not a faster PCIe generation or a kernel toggle.

VRAM can be a storage tier. It cannot be transparent main memory on ordinary consumer hardware.
- Tuna-Fish #1 #2
- Teknoman117 #1
- tiberious726 #1
03 The viable use case is narrower and cleaner than the headline suggests.
This works when GPU-heavy and memory-heavy workloads do not overlap, and when you can `swapoff` the VRAM device before starting a game or local model. That makes it a tactical pressure valve for fixed-RAM machines, not a permanently enabled feature.

This only makes sense when your idle GPU and your memory-hungry apps take turns.
- c0dejedi #1
- Saris #1
- nuccy #1
- NortySpock #1
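
The take-turns pattern could be scripted along these lines. This is a minimal sketch, not the project's own tooling: the device path `/dev/nbd0` and both function names are hypothetical, the commands need root, and `swapoff` can fail if system RAM cannot absorb the pages being moved back. The discussion does not say whether the project's daemon must also be stopped before the VRAM is actually released to other GPU users.

```python
import subprocess


def handoff_commands(swap_device, workload):
    """Build the sequence: drain the swap device, run the workload, re-enable swap."""
    return [
        ["swapoff", swap_device],
        list(workload),
        ["swapon", swap_device],
    ]


def run_with_vram_released(swap_device, workload, dry_run=True):
    """Run a GPU-heavy command with VRAM-backed swap disabled for its duration.

    With dry_run=True (the default) nothing is executed; the planned
    commands are returned so they can be inspected first.
    """
    off, job, on = handoff_commands(swap_device, workload)
    if dry_run:
        return [off, job, on]
    # swapoff returns only after the kernel has moved swapped pages off the device.
    subprocess.run(off, check=True)
    try:
        subprocess.run(job, check=False)
    finally:
        # Re-enable swap even if the workload fails or cannot be started.
        subprocess.run(on, check=True)
    return [off, job, on]


# Hypothetical device path and workload, shown in dry-run mode only.
print(run_with_vram_released("/dev/nbd0", ["my-game", "--fullscreen"]))
```

Keeping dry-run as the default is a deliberate choice here, since a failed `swapoff` under memory pressure is exactly the kind of edge case the thread warned about.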
04 The operational risk is not just speed.
Keeping swap in VRAM can interfere with desktop graphics allocation and keep the discrete GPU from fully idling, which means possible Wayland crashes under VRAM pressure and worse battery life on laptops. The cost of "using idle VRAM" is that the GPU may stop being meaningfully idle.

VRAM swap competes with graphics stability and power management. That tradeoff is easy to ignore until it bites you.
- drdaeman #1
- kllrnohj #1
- nyrikki #1
- qingcharles #1
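
One way to check whether the discrete GPU is actually idling is to read the kernel's runtime power-management status for its PCI device. The sketch below assumes the standard sysfs layout under `/sys/bus/pci/devices`; the PCI address `0000:01:00.0` is a placeholder that varies per machine, and the function name is illustrative. It is not something the thread or the project provides.

```python
from pathlib import Path


def gpu_runtime_status(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Return the runtime PM status string for a PCI device, or None if unreadable.

    Typical values include "active" and "suspended".
    """
    status_file = Path(sysfs_root) / pci_address / "power" / "runtime_status"
    try:
        return status_file.read_text().strip()
    except OSError:
        return None


# Placeholder address; find the real one with `lspci -D` on the target machine.
print(gpu_runtime_status("0000:01:00.0"))
```

If the status stays "active" while VRAM swap is enabled and no other GPU work is running, that would be consistent with the battery-life concern raised in the comments.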

Against the grain

01 Avoiding SSD wear is one of the few clean arguments for this project.
If a machine does swap regularly, moving some of that churn off NAND and onto VRAM removes write-cycle concerns entirely. That benefit is real even if the performance case is mixed.

If flash endurance is your pain point, VRAM swap solves a problem NVMe never will.
- dannyw #1
- c0dejedi #1
02 SSD wear is usually overstated for mainstream use.
One commenter reported a decade-old heavily swapped laptop SSD that still looks healthy, while another pointed out that niche workflows like Gentoo builds can absolutely create enough churn for endurance to matter. The better reading is that wear is workload-specific, not a universal reason to prefer VRAM.

Flash wear is not fake, but it is also not a blanket justification for this hack.
- markhahn #1
- LtdJorge #1
03 Treating GPU memory as a first-class managed resource in the kernel is not obviously absurd.
The current project is a hack because the platform does not expose VRAM as coherent memory, but the broader instinct is sound. Operating systems increasingly need to manage multiple memory tiers, and accelerators are part of that future.

The implementation is niche today. The resource-management idea is bigger than this project.
- londons_explore #1

Reference links

Kernel and systems research

Research paper on Linux swap bottlenecks and TLB shootdowns
Cited to argue that Linux memory-management overhead, not just this project's design, limits swap bandwidth at high speeds.
Linux PCI peer-to-peer DMA documentation
Referenced as an existing Linux mechanism for direct device-to-device data movement over PCIe.

Related VRAM-backed storage projects

ArchWiki: Swap on video RAM
Older Linux approaches to using video memory for swap or storage.
Overv vramfs
Earlier VRAM-backed filesystem project mentioned as a predecessor.
GpuRamDrive
Windows proof of concept for creating a virtual drive backed by GPU RAM.
GpuRamDrive AMD fork
Fork adding AMD support to the Windows GPU RAM drive idea.
nbdkit vram plugin
OpenCL-based VRAM block device plugin that works beyond Nvidia CUDA.
vramblk
Another implementation of the same concept using OpenCL.

GPU storage and interconnect references

Nvidia GPUDirect Storage documentation
Referenced as a way for GPUs to communicate more directly with storage devices.
Microsoft DirectStorage repository
Mentioned alongside GPUDirect Storage as related work for fast storage-to-GPU paths.

Background references from comments

Earlier Hacker News discussion about why Linux swap still matters
Used to support the argument that swap is useful for memory reclamation, not just emergency overflow.
IBM CAPI overview reference
Named as an example of the kind of hardware support needed for coherent attached memory, though no specific page was given.