Claude Code and Codex can have real-time conversation via Git

AI
Developer Tools
Open Source
Programming

The post demonstrates a small system that lets Claude Code and Codex pass messages through a Git repository in near real time. Instead of wiring the agents together directly, it stores intermediate state in Git so handoffs, risks, prompts, unresolved questions, and final decisions can be attached to the same branch or pull request as the code. Commenters saw that record-keeping as the only distinctive part. Few thought "agents can send text to each other" was new: many already do it with append-only files, tmux panes, inbox and outbox folders, SQLite, GitHub issues, Jira, NATS, or custom tools. The conversation settled on a blunt point: transport is easy, oversight is hard.

If you are building multi-agent developer workflows, focus less on inventing a chat channel and more on reviewability, sandboxing, and human approval before changes ship. Git can be a useful substrate when you want provenance tied to branches and pull requests, but it is not the hard part and may add complexity you do not need.

June 4, 2026
medium.com
Discuss on HN

Discussion mood

Curious but skeptical. People liked the idea of tying agent handoffs to Git for auditability, but most felt the communication mechanism itself was ordinary and already achievable with simpler tools. The mood turned practical fast: sandbox agents, keep a visible trail, and do not mistake cross-agent chatter for real progress.

Key insights

Audit trail is the actual product

What makes this approach interesting is not agent-to-agent messaging. It is the attempt to make intermediate state reviewable inside normal code workflows. The useful artifact is a branch or pull request that captures review requests, risks, unresolved claims, prompts, handoffs, and final decisions. That reframes the tool from "chat for agents" into provenance for human reviewers. It also connects to a practical constraint raised in the same discussion: directly driving one tool from another may trigger new API terms and costs, so storing the process in Git can be a cheaper and more inspectable control plane.

If you build agent workflows, store reasoning checkpoints and handoffs where your team already reviews code. Optimize for what a human can inspect later, not for how clever the agents’ transport layer looks.
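
One way to picture that advice is a small handoff log kept in the repository next to the code. The sketch below is illustrative only and is not the post's implementation: the `Handoff` fields and the helper names `append_handoff` and `read_handoffs` are assumptions made up for this example.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Handoff:
    """One reviewable handoff between agents (hypothetical schema)."""
    sender: str
    recipient: str
    summary: str
    risks: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)


def append_handoff(log_path: Path, handoff: Handoff) -> None:
    """Append one handoff as a single JSON line; earlier lines are never rewritten."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(handoff), sort_keys=True) + "\n")


def read_handoffs(log_path: Path) -> list[Handoff]:
    """Load every handoff in the order it was appended."""
    with log_path.open(encoding="utf-8") as f:
        return [Handoff(**json.loads(line)) for line in f if line.strip()]
```

If a file like this is committed on the feature branch, the handoffs show up in the same pull request diff a reviewer already reads, which is the property the discussion cared about.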

Attribution:

syumei #1
iandanforth #1
jeswin #1

Custom Git refs make behavior harder to reason about

Using dedicated refs keeps agent traffic out of the working tree, but it also hides state from the Git model most developers already understand. That makes ordinary operations harder to predict. Rebases may strand context commits, worktrees may not map cleanly to conversation state, and append-only JSONL logs still need a story for ordering and branching. The critique is not that Git is wrong. It is that once you step outside normal commits and branches, you lose a lot of battle-tested mental models and tooling.

If you put workflow state in Git, prefer structures that behave like standard Git objects and histories unless you have a strong reason not to. Your future debugging cost will come from edge cases around rebases, merges, and clones, not from writing messages.
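
To make the ordering problem concrete, here is a minimal sketch of one possible merge rule for two diverged append-only JSONL logs. It assumes every message carries a unique `id`, a `sender`, and a numeric `ts` timestamp; those field names and the `merge_logs` helper are hypothetical, and nothing in the post or the discussion specifies this scheme.

```python
import json


def merge_logs(ours: str, theirs: str) -> str:
    """Union two JSONL logs by message id and return them in a deterministic order.

    Assumes each line is a JSON object with "id", "sender", and "ts" keys
    (a hypothetical schema for this sketch).
    """
    by_id = {}
    for text in (ours, theirs):
        for line in text.splitlines():
            if not line.strip():
                continue
            msg = json.loads(line)
            # Keep the first copy seen; duplicates of the same id are dropped.
            by_id.setdefault(msg["id"], msg)
    # Sort by timestamp, then sender, then id so ties resolve the same way everywhere.
    ordered = sorted(by_id.values(), key=lambda m: (m["ts"], m["sender"], m["id"]))
    return "".join(json.dumps(m, sort_keys=True) + "\n" for m in ordered)
```

Even this small rule has sharp edges: timestamps from different machines can be skewed, and a re-sorted log is no longer strictly append-only. That is the kind of edge case the critique above warns about.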

Attribution:

snthpy #1

The winning pattern is constrained agents plus human review

The most grounded implementations treat agents as sandboxed workers with narrow permissions, not peers you trust to self-organize. People described running them in Proxmox VMs or Docker containers, giving them restricted GitHub access, and routing work through Jira, GitHub issues, and pull requests. That setup matters because the bottleneck is not getting agents to coordinate. It is getting a human to verify what they agreed to do before anything reaches production. The issue tracker becomes less a project management tool and more the safety rail that bounds scope and creates accountability.

Put every agent behind a permission boundary and a workflow tool your team already trusts. Make shipping contingent on human review of the artifact, not on whether two models appear to agree.
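
As a rough illustration of that rule, the gate below counts only human approvals and ignores agreement between models. The `Review` type, the `can_ship` function, and the `min_human_approvals` parameter are hypothetical names for this sketch and do not correspond to any tool mentioned in the discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    """A single review of an artifact such as a pull request (hypothetical type)."""
    reviewer: str
    is_human: bool
    approved: bool


def can_ship(reviews: list[Review], min_human_approvals: int = 1) -> bool:
    """Allow shipping only when enough humans approved and no human rejected.

    Reviews from models are deliberately ignored here, so two agents agreeing
    with each other never unlocks a release.
    """
    human_reviews = [r for r in reviews if r.is_human]
    if any(not r.approved for r in human_reviews):
        return False
    return sum(r.approved for r in human_reviews) >= min_human_approvals
```

In practice the same check would usually live in branch protection or the issue tracker rather than in custom code; the sketch only shows the shape of the rule.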

Attribution:

rigonkulous #1 #2 #3
resonious #1
mohsen1 #1
tuo-lei #1

Cross-checking models works best as critique, not consensus

Using one model to review another can be useful, but only if you expect disagreement and treat it as signal for your own review. A commenter building an electronic medical record said a fresh Claude session harshly criticized a model design proposed by an earlier session. Another said this is exactly why they ask a second model whether code actually solves the stated problem and matches spec before they inspect it themselves. The value is not that the models converge. It is that they expose weak assumptions and overconfident mistakes faster than a single pass does.

Use a second model as an adversarial reviewer on scoped tasks like spec compliance, security concerns, or architecture checks. Do not wait for model consensus before acting. Use disagreement to focus your own review.

Attribution:

mexicocitinluez #1
peddling-brink #1
F7F7F7 #1

Earlier multi-agent tools failed from complexity and burn

Past attempts in this space already show the failure mode. Beads was remembered as useful when simple, then degraded as features piled up and upgrade paths became painful. Gastown drew even harsher comments for janky setup, fragile sessions, and surprising token or subscription burn even while idle. That history sharpens the critique of any new coordination framework. The risk is not just technical overdesign. It is building a system that consumes money and attention faster than it produces reliable output.

Keep multi-agent infrastructure boring. Measure idle cost, upgrade friction, and setup reliability early, because those are the things that kill adoption long before raw capability does.

Attribution:

Game_Ender #1
theshrike79 #1
ffsm8 #1

Against the grain

Prompted coordination is less interesting than emergent use

The Git bridge does not prove much if both agents were explicitly instructed to use it. The more meaningful milestone would be an agent recognizing when a shared channel would help, discovering or proposing it, and coordinating its use with another agent on its own. That raises the bar from a scripted demo to adaptive behavior. It also hints at a future ambiguity. Once models are trained on patterns like this, it will get harder to tell whether a tactic was improvised or merely recalled.

Judge agent demos by when and why the agent chooses a coordination method, not just by whether the method works once shown. Autonomous tool selection is a more useful benchmark than successful message passing.

Attribution:

xg15 #1 #2
avaer #1

Multi-agent chatter may just mask bad task specification

Some people rejected the whole premise that more agent conversation adds value. In that view, the expense and complexity come from using multiple models to patch over weak requirements and vague prompts. If the task were specified tightly enough, extra negotiation between agents would mostly disappear. That does not kill the idea of review or provenance, but it does challenge the assumption that agent teams are the right default.

Before adding another model, tighten the task, interface, and success criteria for a single one. If the second agent only exists to compensate for ambiguity, fix the ambiguity first.

Attribution:

burgerone #1
varispeed #1

In plain English

A2A

Agent-to-Agent, a protocol idea for software agents to discover and interact with each other.

Docker

A platform for packaging software into portable containers so it runs consistently across different machines.

Git

A distributed version control system used to track changes in source code and coordinate software development.

JSONL

JSON Lines, a file format where each line is a separate JSON object.

NATS

A lightweight messaging system used by applications to send events and data between services.

Proxmox

A virtualization platform commonly used to run virtual machines and containers on a server.

SQLite

A widely used embedded database engine that runs inside an application rather than as a separate server.

tmux

A terminal multiplexer that lets a user run and manage multiple terminal sessions inside one window.

Reference links

Agent protocols

A2A Protocol
Mentioned as the newer protocol effort that ACP was rolled into.
Agent Communication Protocol introduction
Cited as an existing protocol that can let agents control one another.
Agent CLIENT Protocol
Linked to clarify it is a different project from ACP/A2A.

Agent coordination tools and repos

Deciduous
Example tool that keeps agent decisions in a DAG so other agents can read and respond.
Beads
Referenced as a related agent collaboration and external memory system.
Beans
Suggested as a simpler alternative to Beads.
grpvn
Shared as a small Go and SQLite app for agents to talk to each other.
piclaw
Tool with a chat feature so active model sessions can talk to each other.
mori
Broader tool that uses NATS pub-sub so agents can communicate and consult each other.

Git and context syncing ideas

csp
Similar experiment for syncing agent context over Git.
grug-brain.mcp
Memory system built on Git and used across computers.
git-meta
Suggested as a more suitable basis for this kind of Git-backed coordination.

Operational tooling and examples

synadia-agents
Offered as another protocol or tooling direction for supervised agent systems.
NATS Messaging
Linked to explain that NATS is a real-time messaging system rather than a memory store.
Tmux-cli
Shared as a wrapper that makes tmux-based agent communication more reliable.