HN Debrief

Claude Code and Codex can have real-time conversation via Git

  • AI
  • Developer Tools
  • Open Source
  • Programming

The post demonstrates a small system that lets Claude Code and Codex pass messages through a Git repository in near real time. Instead of wiring the agents together directly, it stores intermediate state in Git so handoffs, risks, prompts, unresolved questions, and final decisions can be attached to the same branch or pull request as the code. Commenters treated that tie-in to branches and pull requests as the only distinctive part. Few thought "agents can send text to each other" was new. Many already do it with append-only files, tmux panes, inbox and outbox folders, SQLite, GitHub issues, Jira, NATS, or custom tools. The conversation settled on a blunt point: transport is easy, oversight is hard.

If you are building multi-agent developer workflows, focus less on inventing a chat channel and more on reviewability, sandboxing, and human approval before changes ship. Git can be a useful substrate when you want provenance tied to branches and pull requests, but it is not the hard part and may add complexity you do not need.
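To make "transport is easy" concrete, here is a minimal sketch of the append-only file approach several commenters mentioned. It is not the post's implementation. The function names, the `handoff.jsonl` file name, and the record fields (`seq`, `ts`, `sender`, `kind`, `body`) are all assumptions chosen for illustration, and the sequence numbering only holds if one agent writes at a time.

```python
import json
import time
from pathlib import Path


def append_message(log_path, sender, kind, body):
    """Append one message as a single JSON line and return its sequence number."""
    path = Path(log_path)
    # Sequence number = existing non-empty lines + 1. This assumes a single
    # writer at a time; concurrent writers would need a lock or per-agent files.
    seq = 1
    if path.exists():
        with path.open("r", encoding="utf-8") as f:
            seq = sum(1 for line in f if line.strip()) + 1
    record = {"seq": seq, "ts": time.time(), "sender": sender, "kind": kind, "body": body}
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return seq


def read_messages(log_path, after_seq=0):
    """Return messages with seq greater than after_seq, in file order."""
    path = Path(log_path)
    if not path.exists():
        return []
    messages = []
    with path.open("r", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                record = json.loads(line)
                if record["seq"] > after_seq:
                    messages.append(record)
    return messages


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "handoff.jsonl"  # hypothetical file name
        append_message(log, "claude-code", "review_request", "Please check the retry logic.")
        append_message(log, "codex", "risk", "Retries are unbounded on timeout.")
        # An agent that has already seen message 1 polls for anything newer.
        for msg in read_messages(log, after_seq=1):
            print(msg["sender"], msg["kind"], msg["body"])
```

Committing a file like this to the branch under review would put the handoff log next to the code, which is the provenance angle the discussion found interesting. The messaging itself is about thirty lines.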

Discussion mood

Curious but skeptical. People liked the idea of tying agent handoffs to Git for auditability, but most felt the communication mechanism itself was ordinary and already achievable with simpler tools. The mood turned practical fast: sandbox agents, keep a visible trail, and do not mistake cross-agent chatter for real progress.

Key insights

  1.

    Audit trail is the actual product

    What makes this approach interesting is not agent-to-agent messaging. It is the attempt to make intermediate state reviewable inside normal code workflows. The useful artifact is a branch or pull request that captures review requests, risks, unresolved claims, prompts, handoffs, and final decisions. That reframes the tool from "chat for agents" into provenance for human reviewers. It also connects to a practical constraint raised elsewhere in the discussion: directly driving one tool from another may trigger new API terms and costs, so storing the process in Git can be a cheaper and more inspectable control plane.

    If you build agent workflows, store reasoning checkpoints and handoffs where your team already reviews code. Optimize for what a human can inspect later, not for how clever the agents’ transport layer looks.

      Attribution:
    • syumei #1
    • iandanforth #1
    • jeswin #1
  2.

    Custom Git refs make behavior harder to reason about

    Using dedicated refs keeps agent traffic out of the working tree, but it also hides state from the Git model most developers already understand. That makes ordinary operations harder to predict. Rebases may strand context commits, worktrees may not map cleanly to conversation state, and append-only JSONL logs still need a story for ordering and branching. The critique is not that Git is wrong. It is that once you step outside normal commits and branches, you lose a lot of battle-tested mental models and tooling.

    If you put workflow state in Git, prefer structures that behave like standard Git objects and histories unless you have a strong reason not to. Your future debugging cost will come from edge cases around rebases, merges, and clones, not from writing messages.

      Attribution:
    • snthpy #1
  3.

    The winning pattern is constrained agents plus human review

    The most grounded implementations treat agents as sandboxed workers with narrow permissions, not peers you trust to self-organize. People described running them in Proxmox VMs or Docker containers, giving them restricted GitHub access, and routing work through Jira, GitHub issues, and pull requests. That setup matters because the bottleneck is not getting agents to coordinate. It is getting a human to verify what they agreed to do before anything reaches production. The issue tracker becomes less a project management tool and more the safety rail that bounds scope and creates accountability.

    Put every agent behind a permission boundary and a workflow tool your team already trusts. Make shipping contingent on human review of the artifact, not on whether two models appear to agree.

      Attribution:
    • rigonkulous #1 #2 #3
    • resonious #1
    • mohsen1 #1
    • tuo-lei #1
  4.

    Cross-checking models works best as critique, not consensus

    Using one model to review another can be useful, but only if you expect disagreement and treat it as signal for your own review. A commenter building an electronic medical record said a fresh Claude session harshly criticized a model design proposed by an earlier session. Another said this is exactly why they ask a second model whether code actually solves the stated problem and matches spec before they inspect it themselves. The value is not that the models converge. It is that they expose weak assumptions and overconfident mistakes faster than a single pass does.

    Use a second model as an adversarial reviewer on scoped tasks like spec compliance, security concerns, or architecture checks. Do not wait for model consensus before acting. Use disagreement to focus your own review.

      Attribution:
    • mexicocitinluez #1
    • peddling-brink #1
    • F7F7F7 #1
  5.

    Earlier multi-agent tools failed from complexity and burn

    Past attempts in this space already show the failure mode. Beads was remembered as useful while it stayed simple, but it degraded as features piled up and upgrade paths became painful. Gastown drew even harsher comments for janky setup, fragile sessions, and surprising token or subscription burn even while idle. That history sharpens the critique of any new coordination framework. The risk is not just technical overdesign. It is building a system that consumes money and attention faster than it produces reliable output.

    Keep multi-agent infrastructure boring. Measure idle cost, upgrade friction, and setup reliability early, because those are the things that kill adoption long before raw capability does.

      Attribution:
    • Game_Ender #1
    • theshrike79 #1
    • ffsm8 #1
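
The review rule from insights 3 and 4 (ship on human approval, treat model disagreement as a pointer for the human) can be sketched as a small gate. This is an illustrative sketch only. The `Review` type, the `shipping_decision` function, and the reviewer labels are hypothetical names, not part of any tool mentioned in the discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    reviewer: str      # hypothetical label, e.g. "alice" or "second-model"
    is_human: bool     # True for a person, False for a model acting as critic
    approved: bool
    note: str = ""


def shipping_decision(reviews):
    """Return (can_ship, focus_notes).

    can_ship is True only when at least one human reviewed and every human
    review approved. Model verdicts never unlock shipping; their objections
    are returned as focus_notes so the human knows where to look first.
    """
    human = [r for r in reviews if r.is_human]
    can_ship = bool(human) and all(r.approved for r in human)
    focus_notes = [
        r.note for r in reviews
        if not r.is_human and not r.approved and r.note
    ]
    return can_ship, focus_notes
```

Under this sketch, two models that agree with each other still produce `can_ship == False` until a person signs off, and a model's objection does not block a human approval. It only surfaces in `focus_notes`.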

Against the grain

  1.

    Prompted coordination is less interesting than emergent use

    The Git bridge does not prove much if both agents were explicitly instructed to use it. The more meaningful milestone would be an agent recognizing when a shared channel would help, discovering or proposing it, and coordinating its use with another agent on its own. That raises the bar from a scripted demo to adaptive behavior. It also hints at a future ambiguity. Once models are trained on patterns like this, it will get harder to tell whether a tactic was improvised or merely recalled.

    Judge agent demos by when and why the agent chooses a coordination method, not just by whether the method works once shown. Autonomous tool selection is a more useful benchmark than successful message passing.

      Attribution:
    • xg15 #1 #2
    • avaer #1
  2.

    Multi-agent chatter may just mask bad task specification

    Some people rejected the whole premise that more agent conversation adds value. In that view, the expense and complexity come from using multiple models to patch over weak requirements and vague prompts. If the task were specified tightly enough, extra negotiation between agents would mostly disappear. That does not kill the idea of review or provenance, but it does challenge the assumption that agent teams are the right default.

    Before adding another model, tighten the task, interface, and success criteria for a single one. If the second agent only exists to compensate for ambiguity, fix the ambiguity first.

      Attribution:
    • burgerone #1
    • varispeed #1

In plain English

A2A
Agent-to-Agent, a protocol or category of systems for software agents to communicate with each other.
Docker
A platform for packaging and running software in isolated containers so it behaves consistently across environments.
Git
A version control system developers use to track code changes and switch between branches of a project.
JSONL
JSON Lines, a text format where each line is a separate JSON object, often used as an append-only log.
NATS
A lightweight messaging system for publish-subscribe and request-reply communication between services.
Proxmox
An open source platform for running virtual machines and containers on a server.
SQLite
A lightweight embedded database that stores data in a single file.
tmux
A terminal multiplexer that lets one terminal window host multiple persistent command-line sessions and panes.

Reference links

Agent coordination tools and repos

  • Deciduous
    Example tool that keeps agent decisions in a DAG so other agents can read and respond.
  • Beads
    Referenced as a related agent collaboration and external memory system.
  • Beans
    Suggested as a simpler alternative to Beads.
  • grpvn
    Shared as a small Go and SQLite app for agents to talk to each other.
  • piclaw
    Tool with a chat feature so active model sessions can talk to each other.
  • mori
    Broader tool that uses NATS pub-sub so agents can communicate and consult each other.

Git and context syncing ideas

  • csp
    Similar experiment for syncing agent context over Git.
  • grug-brain.mcp
    Memory system built on Git and used across computers.
  • git-meta
    Suggested as a more suitable basis for this kind of Git-backed coordination.

Operational tooling and examples

  • synadia-agents
    Offered as another protocol or tooling direction for supervised agent systems.
  • NATS Messaging
    Linked to explain that NATS is a real-time messaging system rather than a memory store.
  • Tmux-cli
    Shared as a wrapper that makes tmux-based agent communication more reliable.