Pick your backend (Go or Rust) and frontend above. Watch the spotlight: by the merge step, two people typing in the same spot — even one of them offline — produce one agreed document, with no lock, no “resolve conflict” dialog, and no lost keystrokes. That guarantee is the whole project. The first time you see convergence is the ★ merge step’s unit test — two inserts at the same spot, both survive, byte-equal — about a third of the way in; everything before it is the scaffolding that makes that test honest.
Understand why CRDTs beat locks and OT
IntermediateCompare the three ways to make collaborative editing work — locking, Operational Transformation, and CRDTs — so you can see why a CRDT is the only one that merges concurrent and offline edits without a central arbiter.
New in this step
CRDT A Conflict-free Replicated Data Type — a data structure whose merge is built in, so concurrent edits always combine into one agreed result with no central server.
Operational Transformation (OT) The older approach (early Google Docs) where a central server rewrites each op against the others; correct but needs a transform function for every pair of op types.
strong eventual consistency The guarantee that any two replicas which have seen the same set of edits hold the same state, regardless of order — what a CRDT gives you for free.
commutative operations Operations whose result does not depend on order; applying them in any sequence, with duplicates, lands the same state — the property that removes the need for a lock or arbiter.
Locks, OT, CRDTs — the trade-off
Locking (one writer at a time) kills the experience and can’t handle offline edits. Operational Transformation (OT, what early Google Docs used) is correct but notoriously hard: every pair of op types needs a transform function, and a central server must order everything. A CRDT (Conflict-free Replicated Data Type) is a data structure designed so that concurrent updates commute — apply the same set of ops in any order, on any replica, with duplicates, and you reach the same state (strong eventual consistency). No central arbiter, offline-friendly, and the merge is built into the type. The cost is a cleverer data model and per-element metadata — which is where a fast runtime earns its keep.
Set up the toolchain, Redis, and Postgres locally
BeginnerInstall your backend’s toolchain plus a CLI WebSocket client and start Redis and Postgres in Docker — the one-command local stack every later step assumes is already running.
New in this step
Docker Compose A YAML file that defines and runs containers (here Redis and Postgres) so everyone gets the same throwaway services with docker compose up.
WebSocket A long-lived, two-way connection over one TCP socket — the transport that lets the server push every collaborator’s keystroke to every client in real time.
websocat A command-line WebSocket client; you open two of them into one room to drive two collaborators and watch them converge without writing any UI.
Redis An in-memory data store used here for two jobs: pub/sub fan-out of edits across server instances, and presence (who is online).
PostgreSQL The durable database that holds compacted snapshots of each document so a room survives a server restart.
REDIS_URL / DATABASE_URL Environment variables holding the connection strings; reading them from the env means the same build runs locally and in the cloud.
What you need on the machine first
This is the only environment-setup step, so it covers everything later steps assume. Install the toolchain
for the backend you picked: Go (go.dev/dl) for the default path, or Rust via
rustup for the spotlight path. Install websocat (or wscat) — a CLI WebSocket
client you’ll use to drive two clients into one room and watch them converge. The AI feature modules
(Copilot, semantic search, /ai commands) need a free Gemini key; everything in the base build runs without
one. Costs nothing: Redis and Postgres run in local Docker, the toolchains and websocat are free, and the
Gemini free tier covers the optional features.
docker-compose.yml
services:
redis:
image: redis:7
ports: ["6379:6379"]
db:
image: postgres:16
environment: { POSTGRES_PASSWORD: dev, POSTGRES_DB: concord }
ports: ["5432:5432"]Run it + set the env every later step reads
docker compose up -d
export REDIS_URL="redis://localhost:6379"
export DATABASE_URL="postgres://postgres:dev@localhost:5432/concord?sslmode=disable"
redis-cli ping # PONG
# CLI WebSocket client for the demo (one of):
cargo install websocat # or: brew install websocat / npm i -g wscat
# Only for the optional AI features — create a free key at https://aistudio.google.com/apikey :
export GEMINI_API_KEY="your-key-here"What success looks like
redis-cli ping answers PONG and both env vars are exported — every later server reaches Redis on :6379 and Postgres on :5432.
Model the document as a sequence CRDT
IntermediateRepresent the text as an ordered list of elements, each with a globally-unique, totally-ordered id and a tombstone flag — the shape that lets every replica sort the same characters the same way with no server assigning positions.
New in this step
sequence CRDT A CRDT specialised for ordered text — it merges concurrent inserts and deletes into one agreed sequence of characters.
RGA (Replicated Growable Array) A common sequence-CRDT design where each insert names the element it goes after; it is the algorithm you hand-roll on the Go path.
(site, counter) identifier A character’s globally-unique id: site names the replica that created it and counter is that replica’s local tick, so no two replicas ever mint the same id.
totally-ordered ids A rule that puts any two ids in a definite order on every replica, so insertions never collide and never need a server to assign positions.
tombstone A deleted element kept in place but flagged hidden, so a concurrent edit that referenced it still resolves instead of pointing at nothing.
Stable ids + tombstones = order without coordination
A text CRDT (RGA / YATA-style) gives every inserted character a unique id (siteId, counter) and records
which element it was inserted after. Because ids are globally unique and totally ordered, any two replicas
sort the same elements the same way — so insertions never collide and never need a server to assign
positions. Deletes don’t remove the element; they set a tombstone so concurrent edits referencing it
still resolve. Garbage-collecting tombstones safely is an advanced topic; snapshot compaction later handles
the practical size, so you never have to.
The element shape (illustrative)
// one character/element in the sequence
{
"id": { "site": "a3f9", "counter": 42 }, // unique + totally ordered
"after": { "site": "a3f9", "counter": 41 }, // the element this was inserted after
"value": "h",
"deleted": false // tombstone
}Define the op and sync wire protocol
IntermediateSpecify the messages clients and the server exchange — edit ops, an initial state sync, and ephemeral presence — as one canonical envelope so the server, the tests, and every client serialize the exact same bytes.
New in this step
wire protocol The exact byte/JSON shape of every message on the connection; pinning it once means the server, tests, and all clients agree without re-inventing the format per step.
t tag (sync / op / presence) A discriminator field naming each message’s type, so a receiver routes a frame correctly; an unknown t is ignored to keep the protocol forward-compatible.
seq A per-client monotonically increasing op counter the sender stamps on each op; it lets a reconnecting client de-duplicate its own replayed ops, but never gates convergence.
base64-encoded state Binary engine state encoded as ASCII text so it rides safely inside JSON; sync.state carries the full document this way.
RGA op (kind / id / after / value) — Go path On the Go backend, op.data is the self-describing RGA edit itself — insert a value with an id after another id, or delete an id — so the client speaks element ids.
yrs update v2 (base64) — Rust path On the Rust backend, op.data is an opaque, base64-encoded encode_update_v2() blob; the yrs engine has no element ids on the wire, so the client just round-trips the update bytes.
One envelope, two engines: op.data is engine-specific
Every message has a type t. A newcomer first gets one sync (the current state) before it can apply live
ops; each edit is an op; presence (cursor, name, colour) is presence — ephemeral, broadcast but never
persisted. The subtlety that bites cross-backend reuse: the inside of an op depends on the engine.
- Go (hand-rolled RGA):
op.datais the RGA op itself —insertafter an id, ordeletean id. The client speaks element ids. - Rust (yrs):
op.datacarries an opaque, base64-encodedencode_update_v2()update — the yrs engine has no element ids at the wire; the client round-trips update bytes.
Both are the same envelope ({"t":"op","seq":N,"data":{…}}) and both converge under the CRDT guarantee —
they just disagree about what data holds. The seq is a per-client, monotonically increasing op counter
the sender stamps on each op; it lets a client de-duplicate its own replayed ops on reconnect and gives the
log a stable order, but it never gates convergence (the CRDT tolerates gaps and duplicates regardless). Keep
the protocol small and versioned; ignore an unknown t so the wire stays forward-compatible.
Wire messages (the canonical contract)
// server -> newcomer, once on connect: base64 of the engine's full-state encoding
{ "t": "sync", "state": "<base64 state>" }
// either direction, per edit — SAME envelope, engine-specific data:
{ "t": "op", "seq": 41, "data": { "kind": "insert", "id": {"site":"a3f9","counter":42},
"after": {"site":"a3f9","counter":41}, "value": "h" } } // Go (RGA)
{ "t": "op", "seq": 42, "data": { "kind": "delete", "id": {"site":"a3f9","counter":40} } } // Go (RGA)
{ "t": "op", "seq": 41, "data": { "update": "<base64 yrs update>" } } // Rust (yrs)
// ephemeral — broadcast, never persisted into the document:
{ "t": "presence", "user": "kerim", "cursor": 128, "color": "#2DD4BF" }Make the contract a real source file, not a comment
This envelope is referenced by every step downstream — the scaffold, the merge, fan-out, snapshots, the demo,
and the frontends — so it should exist as one canonical type module per backend, not as prose each step
re-invents. The deliverable below produces exactly that: internal/wire/wire.go in Go (the Msg, Op, and
ID types every package imports) and src/wire.rs in Rust (a serde-derived Msg). Defining the shape once,
in one file, is what guarantees the server, the tests, and the client all serialize the same bytes — and
it is the single place to change if the contract ever grows.
Agent prompt — paste into an agent with repo access
Role: Senior engineer in this repo (produce BOTH backends' canonical wire module).
Context: The wire contract above — {"t":"sync","state":<base64>}, {"t":"op","seq":N,"data":{...}},
{"t":"presence","user","cursor","color"} — with op.data engine-specific (Go = RGA op {kind,id,after,value};
Rust = {update:<base64>}). This is the ONE shared type module both backends import.
Task: Create the canonical wire-types module for each backend:
- Go: internal/wire/wire.go with Msg (t, plus the per-variant fields), Op (seq + data), and ID {Site,Counter}.
- Rust: src/wire.rs with a serde Serialize/Deserialize Msg mirroring the same JSON.
Requirements:
- One file per backend; every other package/module imports these types rather than re-declaring the shape.
- Go: json struct tags match the wire keys exactly (t, seq, data, state, user, cursor, color); ID is
{Site,Counter} as in the crdt package. Use json.RawMessage (or an interface) for op.data so the RGA op
rides inside unchanged.
- Rust: #[serde(tag = "t")] enum (or a struct with an Option set) so {"t":"sync"|"op"|"presence"} round-trips;
op.data carries {"update": String} (base64). An unknown t deserializes without erroring (forward-compatible).
- No behaviour — pure types + (de)serialization. Round-tripping a sample of each frame is byte-stable.
Tests / acceptance:
- Go: a table test marshals each of sync/op/presence and unmarshals back to an equal value; `go vet ./...` passes.
- Rust: a serde round-trip test for each variant; `cargo clippy -- -D warnings` is clean.
Output: a unified diff (both files) plus one line on why the shape lives in exactly one module.What success looks like
The round-trip test goes green on both backends: marshalling each of sync/op/presence and unmarshalling back yields an equal value, byte-stable. The op envelope keeps op.data opaque to the wire types — Go carries the RGA op JSON {kind,id,after,value} inside, Rust carries {"update":"<base64>"} — and an unknown t deserializes without erroring, so the contract stays forward-compatible.
Scaffold the Rust sync server (axum + WebSocket)
Rust IntermediateCreate a Rust axum service that upgrades GET /rooms/{id}/ws to a WebSocket and echoes a hello frame — the transport skeleton you hang the CRDT engine on next.
New in this step
axum A Rust web framework (on tokio) that routes HTTP and upgrades WebSocket connections; it handles the I/O while the CRDT does the thinking.
tokio Rust’s async runtime — it drives the many concurrent connections a collaborative server holds open at once.
WebSocketUpgrade The axum extractor that turns an incoming HTTP request into a live WebSocket, handing you a per-connection task to read and write frames.
path parameter (axum 0.8) axum 0.8’s route-capture syntax: /rooms/{id}/ws binds the room id from the URL so each connection knows which document it joined.
Why Rust for the engine
The merge runs on every keystroke from every collaborator — it’s the hot path. Rust gives you that
performance with no GC pauses, plus the most mature CRDT libraries (yrs, the Rust port of Yjs, and
automerge). axum + tokio handle the WebSocket I/O; the CRDT does the thinking.
Dependencies
# Cargo.toml
[dependencies]
axum = { version = "0.8", features = ["ws"] }
tokio = { version = "1", features = ["full"] }
yrs = "0.27" # the y-crdt sequence CRDT
redis = { version = "0.27", features = ["tokio-comp"] } # tokio-comp = async multiplexed connection
serde = { version = "1", features = ["derive"] }
serde_json = "1"Agent prompt — paste into an agent with repo access
Role: Senior Rust engineer in this repo.
Context: axum 0.8 with the "ws" feature, tokio. We are building a collaborative-editor sync server.
Task: Scaffold a server with GET /rooms/{id}/ws that upgrades to a WebSocket and, on connect, sends a JSON
{"t":"hello","room":<id>} frame, then echoes any text frame back.
Requirements:
- Use axum::extract::ws::{WebSocketUpgrade, WebSocket, Message} and a per-connection task.
- Graceful handling of close frames; structured logging.
- No CRDT yet — just the transport.
Tests / acceptance:
- `cargo build` and `cargo clippy -- -D warnings` are clean.
- A websocat/ws client to /rooms/demo/ws receives the hello frame and gets echoes.
Output: a unified diff plus the connection lifecycle in one paragraph.What success looks like
cargo build and cargo clippy -- -D warnings are clean. Connecting websocat ws://localhost:8080/rooms/demo/ws prints {"t":"hello","room":"demo"} immediately on connect, and anything you type is echoed straight back. This is transport only — no document merges yet.
Scaffold the Go sync server (WebSocket)
Go IntermediateCreate a Go service that upgrades GET /rooms/{id}/ws to a WebSocket and echoes a hello frame — the transport skeleton you hand-roll the CRDT engine onto next.
New in this step
coder/websocket A clean, modern Go WebSocket library; websocket.Accept upgrades the request and wsjson reads/writes JSON frames.
goroutine-per-connection Go’s idiom of handling each open socket in its own lightweight goroutine — a natural fit for the many long-lived connections a collaborative server holds.
r.PathValue (Go 1.22) The standard library’s route-parameter reader: mux.HandleFunc("GET /rooms/{id}/ws", ...) then r.PathValue("id") gives you the room id, no third-party router needed.
Go is capable here — with a caveat
Go’s goroutine-per-connection model is a great fit for WebSocket fan-out, and github.com/coder/websocket
is a clean, modern library. The caveat lands at the next step: Go has no widely-adopted, production-grade
text-CRDT library, so you’ll hand-roll the merge (or bind a Rust/C one). That gap is exactly why this
project scores Rust 5 and Go 3.
Set up the module
go mod init github.com/you/concord
go get github.com/coder/websocket github.com/redis/go-redis/v9 github.com/jackc/pgx/v5Upgrade to a WebSocket
mux := http.NewServeMux()
mux.HandleFunc("GET /rooms/{id}/ws", func(w http.ResponseWriter, r *http.Request) {
c, err := websocket.Accept(w, r, nil)
if err != nil { return }
defer c.CloseNow()
room := r.PathValue("id")
ctx := r.Context()
_ = wsjson.Write(ctx, c, map[string]any{"t": "hello", "room": room})
for {
var msg json.RawMessage
if err := wsjson.Read(ctx, c, &msg); err != nil { return }
_ = c.Write(ctx, websocket.MessageText, msg) // echo for now
}
})Agent prompt — paste into an agent with repo access
Role: Senior Go engineer in this repo.
Context: github.com/coder/websocket. Building a collaborative-editor sync server.
Task: Scaffold GET /rooms/{id}/ws that accepts a WebSocket, sends {"t":"hello","room":id}, then echoes frames.
Requirements:
- Use coder/websocket with wsjson for JSON frames; one goroutine per connection; respect r.Context().
- Clean close handling; structured logging with log/slog.
Tests / acceptance:
- `go vet ./...` and `go build ./...` pass.
- A ws client to /rooms/demo/ws receives the hello frame and echoes.
Output: a unified diff plus a note on read/write deadline handling.What success looks like
go vet ./... and go build ./... pass, and websocat ws://localhost:8080/rooms/demo/ws receives {"t":"hello","room":"demo"} on connect then echoes every text frame — the same transport-only observable as the Rust scaffold; merge engine next.
★ Apply and merge edits with a CRDT (Rust)
Rust AdvancedHold each room’s document in a yrs text type and apply local and remote edits as CRDT updates — the spotlight step where convergence becomes real and you write zero transform functions.
New in this step
yrs / Yjs yrs is the Rust port of Yjs, a battle-tested sequence CRDT (the YATA algorithm); it handles insert ids, ordering, and tombstones internally so you don’t.
Y.Doc / Y.Text A Doc is one CRDT document; get_or_insert_text("body") gives a TextRef (a Y.Text) — the editable shared string you insert into and read from.
transact_mut Opens a write transaction on the Doc; every mutation runs inside one, and committing it produces the update you broadcast.
encode_update_v2 Encodes the change a transaction made as opaque bytes — these are exactly the bytes the wire step’s op.data.update carries (base64) to every peer.
apply_update Merges a peer’s decoded update into the local Doc; it is idempotent and order-independent, so re-applying or reordering updates converges — that is the CRDT guarantee.
state vector A compact summary of how much of each replica’s history a Doc already has; an empty one means know nothing yet, so ask for everything.
encode_state_as_update_v2 Encodes a Doc’s full state as one update; called with a default (empty) state vector it returns the whole document — the bytes a newcomer’s sync frame carries.
Arc Mutex shared state Each room’s Doc is shared mutable state across async tasks, so it lives behind Arc<Mutex<…>> — the shared-ownership-plus-lock pattern the Rust track teaches.
The spotlight: convergence for free
yrs implements a battle-tested sequence CRDT (the YATA algorithm behind Yjs). You apply an edit to a
TextRef inside a transaction, encode the resulting update as bytes, and broadcast it (these are the
opaque bytes the wire-protocol step’s op.data.update carries); receivers apply_update in any order, any
number of times, and converge — that’s the CRDT guarantee, and you didn’t write a single transform function.
The engine handles insert ids, ordering, and tombstones internally. For a newcomer’s sync, encode the full
state with encode_state_as_update_v2(&StateVector::default()) — an empty state vector means “give me
everything”. Each Room’s Doc is shared mutable state across async tasks, so it lives behind the same
Arc<Mutex<…>> pattern the Rust track teaches in Share state safely across async tasks.
Apply + encode an update (yrs)
use yrs::{Doc, GetString, ReadTxn, StateVector, Text, Transact, Update};
use yrs::updates::decoder::Decode;
use yrs::updates::encoder::Encode;
let doc = Doc::new();
let text = doc.get_or_insert_text("body");
// local edit inside a transaction
let update = {
let mut txn = doc.transact_mut();
text.insert(&mut txn, 0, "hello");
txn.encode_update_v2() // <-- broadcast these bytes (op.data.update, base64)
};
// full state for a newcomer's {"t":"sync"} (empty StateVector = everything):
let snapshot = doc.transact().encode_state_as_update_v2(&StateVector::default());
// applying a remote update converges, regardless of order
// (both decode_v2 and apply_update return a Result in current yrs — propagate it):
// let mut txn = doc.transact_mut();
// txn.apply_update(Update::decode_v2(&bytes)?)?;Agent prompt — paste into an agent with repo access
Two clients each insert a character at position 0 at the same moment, then exchange their encode_update_v2 bytes. After both apply both updates, how do the two get_string outputs compare — and could the result ever be that one keystroke wins and the other is dropped?
Role: Senior Rust engineer in this repo.
Context: yrs 0.27. Each room owns a yrs Doc with a text named "body", held behind Arc<Mutex<Room>>.
Task: Implement Room::apply_local(edit) -> Vec<u8>, Room::apply_remote(update_bytes), and Room::sync_state()
so concurrent edits converge and a newcomer can be bootstrapped.
Requirements:
- Local edits run in a transact_mut and return encode_update_v2() bytes to broadcast as op.data.update.
- apply_remote decodes (Update::decode_v2) and applies (apply_update); both return Result — propagate it.
It is idempotent and order-independent (yrs guarantees this).
- sync_state() returns encode_state_as_update_v2(&StateVector::default()) — the full state for {"t":"sync"}.
- Expose text() via GetString for the demo and snapshotting.
Tests / acceptance:
- A unit test: build two Docs, apply two concurrent inserts at position 0 in opposite orders on each, and
assert both Docs' get_string are byte-equal after exchanging updates.
- A second client built only from sync_state() (apply_update into a fresh Doc) is byte-equal to the source.
- `cargo clippy -- -D warnings` clean.
Output: a unified diff plus why no transform functions are needed.What success looks like
The convergence unit test goes green and no keystroke is dropped. Two Docs each insert a character at position 0, exchange encode_update_v2() bytes, and apply them in opposite orders — both get_string() outputs come out byte-equal (both characters survive, interleaved deterministically; never last-write-wins). A fresh Doc built only from sync_state() — encode_state_as_update_v2(&StateVector::default()) decoded with apply_update — is byte-equal to the source. You wrote zero transform functions; the merge is the engine.
★ A running RGA, simple case first (Go)
Go IntermediateWith no mature Go CRDT library, hand-roll a small RGA that compiles and converges for in-order ops — anchor, place, delete, render — so the property test has a concrete, working core to harden before you add buffering.
New in this step
after anchor Each inserted element records the id it was inserted after; that anchor is how every replica reconstructs the same order without a server assigning positions.
(counter, site) tie-break When two elements share the same anchor (a concurrent insert at the same spot), lower (counter, site) goes first — a rule every replica applies identically, so order never diverges.
placement walk The scan that finds where a new element belongs among the run of elements after its anchor; getting it right is the single subtlety that makes or breaks convergence.
sibling vs descendant subtree True siblings share the same anchor and are compared by the tie-break; an earlier sibling’s descendants must be stepped over, never landed inside — confuse them and the same ops yield different text.
idempotent operation Applying the same op twice has no extra effect (a known id is ignored); this is what makes replays and duplicate deliveries harmless.
Get something that compiles and converges, then harden it
This is the honest cost of Go here, and the spotlight-equivalent for the default backend — so we build it in
two passes instead of one leap. An RGA (Replicated Growable Array) gives each element an id (site, counter);
an insert references the id it goes after; deletes set a tombstone. Pass one (this step) handles the
common case where every op’s after anchor is already present: find the anchor, place the new element among
its siblings by a deterministic tie-break, and render with String() (skipping tombstones). The one subtlety
that makes or breaks convergence is the placement walk. After the anchor sits a run of elements: that
anchor’s direct children plus their own descendants, interleaved in document order. You must compare the
newcomer only against true siblings (elements anchored at the same id) and step over an earlier
sibling’s descendants — never stop inside another element’s subtree. Get that wrong and the same ops in two
orders land an element on different sides of a sibling, and the replicas diverge. The code below runs and
converges for any sequence of ops that arrive after the element they reference — the normal case on a single
ordered socket. Pass two (the next step) adds the buffering that makes it converge even when ops arrive
out of order. Building the simple, working core first means the property test has something concrete to
harden, not a blank canvas.
RGA core — runs and converges for in-order ops
package crdt
import "strings"
type ID struct {
Site string
Counter uint64
}
type Elem struct {
ID, After ID
Value rune
Deleted bool
}
type Doc struct {
elems []Elem // in document order (head sentinel = zero ID at index -1)
index map[ID]int // ID -> position in elems, for fast anchor lookup
}
func NewDoc() *Doc { return &Doc{index: map[ID]int{}} }
// firstWins reports whether sibling a is placed before sibling b when they share
// an anchor: lower (Counter, Site) goes first, so every replica agrees on order.
func firstWins(a, b ID) bool {
if a.Counter != b.Counter {
return a.Counter < b.Counter
}
return a.Site < b.Site
}
// descendsFrom walks the After-chain of id and reports whether it leads back to
// anchor — i.e. id sits inside anchor's subtree (a child/grandchild), not a sibling.
func (d *Doc) descendsFrom(id, anchor ID) bool {
for id != (ID{}) {
i, ok := d.index[id]
if !ok {
return false
}
id = d.elems[i].After
if id == anchor {
return true
}
}
return false
}
// ApplyInsert places e among the siblings sharing its anchor by the firstWins
// tie-break, stepping OVER any earlier sibling's descendants so e never lands
// inside another element's subtree. Idempotent: a known id is ignored.
// (Buffering for an unknown anchor: next step.)
func (d *Doc) ApplyInsert(e Elem) {
if _, seen := d.index[e.ID]; seen {
return // idempotent
}
pos := 0
if e.After != (ID{}) {
anchor, ok := d.index[e.After]
if !ok {
return // anchor unseen — pass two will buffer instead of dropping
}
pos = anchor + 1
}
// Walk the run after the anchor. Decide ONLY at true siblings (same After);
// skip past an earlier sibling's descendants; stop when the run ends.
for pos < len(d.elems) {
c := d.elems[pos]
if c.After == e.After { // a true sibling sharing e's anchor
if firstWins(e.ID, c.ID) {
break // e sorts before this sibling — place it here
}
pos++
continue
}
if d.descendsFrom(c.After, e.After) {
pos++ // inside an earlier sibling's subtree — step over it
continue
}
break // left this anchor's run entirely
}
d.elems = append(d.elems, Elem{})
copy(d.elems[pos+1:], d.elems[pos:])
d.elems[pos] = e
d.reindex(pos)
}
// ApplyDelete tombstones an element. Idempotent; unknown ids are ignored for now.
func (d *Doc) ApplyDelete(id ID) {
if i, ok := d.index[id]; ok {
d.elems[i].Deleted = true
}
}
func (d *Doc) reindex(from int) {
for i := from; i < len(d.elems); i++ {
d.index[d.elems[i].ID] = i
}
}
// String renders the visible text, skipping tombstones.
func (d *Doc) String() string {
var b strings.Builder
for _, e := range d.elems {
if !e.Deleted {
b.WriteRune(e.Value)
}
}
return b.String()
}What success looks like
The package compiles and your first convergence holds: insert A and B both anchored at the head plus a child x anchored after A, apply them in two different in-order sequences on two fresh Docs, and both String() to "AxB" — the descendant-aware walk lands x between A and B.
★ Make the RGA converge out of order (Go)
Go AdvancedAdd a buffer that holds an op whose after anchor hasn’t arrived yet and drains it when the anchor lands — the step that turns “works on one ordered socket” into a real CRDT that converges in any order.
New in this step
causal readiness An op is ready only once the element it depends on (its after anchor) is present; until then it must wait rather than be applied or dropped.
pending buffer A map of not-yet-ready ops keyed by the missing anchor; over Redis fan-out and reconnects ops will arrive out of order, so you hold them here instead of losing the edit.
draining as a cascade When an anchor finally lands, you apply the ops waiting on it — and each one applied may unblock yet more buffered ops, so the drain repeats until nothing is ready.
Buffering is what turns 'works on one socket' into a real CRDT
The simple core converges only when ops arrive after their anchors. Over Redis fan-out and reconnects, ops
will arrive out of order, so an insert can reference an after you haven’t seen. Don’t drop it — buffer
it keyed by the missing anchor (causal readiness), and when that anchor finally lands, drain the buffer
(which may cascade: a drained op can unblock another). The same idea covers a delete for an unknown id. This
buffer is part of the document’s durable state — the serialization step persists it too, so a reload
mid-gap never silently loses an edit. With buffering in place, the engine converges for any order, any
duplicates: that’s the guarantee the property test will prove.
Agent prompt — paste into an agent with repo access
An insert arrives before the element its `after` anchor points to. The simple core dropped it. With the buffer, what happens to that insert — and when the anchor finally lands, what does String() converge to compared with a replica that saw the same ops in causal order?
Role: Senior Go engineer in this repo.
Context: internal/crdt/rga.go from the previous step (Doc with ApplyInsert/ApplyDelete/String + the
descendant-aware placement walk + the firstWins tie-break, in-order only — an unseen anchor is dropped).
Task: Extend Doc with a pending buffer so out-of-order ops converge, WITHOUT changing the placement walk.
Requirements:
- Add pending map[ID][]Elem (inserts waiting on an unseen After) and pendDel []ID (deletes for unseen ids).
- ApplyInsert: if e.After is unknown, append e to pending[e.After] and return; after any successful place,
drain pending[placed.ID] recursively (a drained insert may unblock further inserts — a cascade).
- ApplyDelete: if the id is unknown, record it in pendDel (deduped); when its element later lands during a
drain, tombstone it. A delete of a known id tombstones immediately.
- Keep both ops idempotent (a known id / already-buffered delete is a no-op) and KEEP the firstWins placement
walk from pass one unchanged — buffering only decides WHEN to place, never WHERE.
Tests / acceptance:
- Apply an insert whose After arrives LATER; assert the text is correct once the anchor lands (and the
reviewer trace A/B/childX still converges to "AxB" in every order).
- A delete that arrives before its target still tombstones once the element lands.
- Re-apply every op a second time; assert String() is unchanged (idempotence).
- A seeded fuzz: one fixed op set applied in many random orders (full shuffles, with duplicates) all yield
identical String() — and match a reference tree-walk renderer.
- `go test ./... -race` passes.
Output: a unified diff plus how the buffer drains and why it cannot lose, duplicate, or misplace an op.What success looks like
go test ./... -race passes and the engine now converges in any order — matching the Rust path’s guarantee. An insert whose after arrives later is buffered, not dropped; when the anchor lands the buffer drains (cascading to any insert it unblocks) and String() equals a replica that saw the ops in causal order. A delete that arrives before its target tombstones once the element appears, re-applying every op a second time leaves String() unchanged (idempotent), and the seeded fuzz — one op set in many shuffled orders with duplicates — yields one identical text every run.
Serialize the Doc once — elements and the pending buffer (Go)
Go AdvancedDefine one codec that round-trips the whole Doc — placed elements and the pending buffer — to bytes, so sync, snapshots, and history all persist the same format and a restart mid-gap loses nothing.
New in this step
serialization Turning the in-memory Doc into a flat byte sequence (and back) so it can be sent as a sync frame or stored in a snapshot row.
encoding/gob (vs JSON) Go’s native binary format; it round-trips a map keyed by a struct id cleanly, which JSON cannot — exactly the shape the element index and pending buffer have.
buffered op as durable state An op held because its anchor hasn’t arrived is a real, unacknowledged edit; serialize it too, or a restart mid-gap silently drops it and convergence quietly breaks.
The buffer is durable state, not scratch memory
The Go path needs exactly one serialization format, defined here so sync, the Postgres snapshot, and
history’s version capture all agree. Encode the ordered element list and the pending-op buffer: an op
buffered because its anchor hadn’t arrived is a real, unacknowledged edit — if a snapshot persisted only the
placed elements, a restart mid-gap would silently drop it and convergence would quietly break (and the green
property test wouldn’t catch it, because both reloaded Docs would be identically wrong). Use encoding/gob,
not JSON: gob round-trips a Go map cleanly (JSON can’t key a map by a struct id), and the element list plus
the buffer are exactly the gob-friendly shapes you already have. The bytes this produces are what
{"t":"sync","state":"<base64>"} carries and what the snapshot row stores.
gob codec over the whole Doc
package crdt
import (
"bytes"
"encoding/gob"
)
// wire form of the Doc: placed elements (in order) + the pending buffer.
type docState struct {
Elems []Elem
Pending map[ID][]Elem // inserts waiting on an unseen After
PendDel []ID // deletes waiting on an unseen element
}
// Snapshot encodes the FULL Doc (visible + buffered) for sync/snapshot/version.
func (d *Doc) Snapshot() ([]byte, error) {
var buf bytes.Buffer
st := docState{Elems: d.elems, Pending: d.pending, PendDel: d.pendDel}
if err := gob.NewEncoder(&buf).Encode(st); err != nil {
return nil, err
}
return buf.Bytes(), nil
}
// Load rebuilds a Doc (and its index) from Snapshot bytes.
func (d *Doc) Load(b []byte) error {
var st docState
if err := gob.NewDecoder(bytes.NewReader(b)).Decode(&st); err != nil {
return err
}
d.elems, d.pending, d.pendDel = st.Elems, st.Pending, st.PendDel
d.index = make(map[ID]int, len(st.Elems))
d.reindex(0)
return nil
}Agent prompt — paste into an agent with repo access
Role: Senior Go engineer in this repo.
Context: internal/crdt/rga.go has Doc{elems, index, pending, pendDel} with ApplyInsert/ApplyDelete/String.
Task: Add internal/crdt/codec.go with Doc.Snapshot() ([]byte, error) and Doc.Load([]byte) error using gob.
Requirements:
- Encode BOTH the placed element list AND the pending buffer (and pending deletes) — never just visible state.
- Load rebuilds the index and is the inverse of Snapshot for any Doc.
- gob, not JSON (the index/pending maps are keyed by the ID struct).
Tests / acceptance:
- Round-trip: a Doc with some buffered (anchor-not-yet-seen) ops Snapshots, Loads into a fresh Doc, then the
late anchor arrives on BOTH — assert both String() are equal (the buffer survived the round-trip).
- `go test ./... -race` passes.
Output: a unified diff plus why the pending buffer must be serialized.What success looks like
go test ./... -race passes. A Doc holding some still-buffered (anchor-not-yet-seen) ops Snapshot()s to gob bytes, Load()s into a fresh Doc that rebuilds its index, and when the late anchor arrives on both the original and the reloaded Doc, both String() outputs match. The pending buffer survived the round-trip — a restart mid-gap loses nothing, which is why the codec serializes the buffer and not just the visible elements. These are the bytes {"t":"sync","state":"<base64>"} and the snapshot row carry.
Compose the app: wire routes, engine, Redis, Postgres, shutdown (Rust)
Rust AdvancedWrite the main.rs entrypoint that composes the parts into one runnable process — the axum router and WS route, a shared room registry, the Redis client, the Postgres pool, and graceful shutdown.
New in this step
composition root (entrypoint) The one place that builds and wires every dependency into a running process; until main composes the parts you only have pieces.
#[tokio::main] The attribute that starts the tokio async runtime and runs your async fn main, so the server can await many connections at once.
sqlx / PgPool sqlx is an async Postgres client; a PgPool keeps a reusable set of connections open so loading or snapshotting a room never pays full connect cost.
axum State axum’s mechanism for sharing one value (here the room registry) across all handlers, so every WebSocket connection reaches the same rooms.
room registry The shared Arc<Mutex<HashMap<String, Arc<Mutex<Room>>>>> mapping room id to room; it lazy-loads each room from its latest Postgres snapshot on first access.
graceful shutdown axum::serve(...).with_graceful_shutdown(...) on a Ctrl-C signal drains in-flight connections instead of dropping live sockets mid-edit.
The entrypoint is what makes it runnable
Until now you have parts; main composes them into a process you can run. Read PORT/REDIS_URL/
DATABASE_URL from env, open one redis client and one sqlx pool (closed on shutdown), and build a Registry
of rooms shared across handlers via axum::extract::State. The registry is the canonical shared-mutable-state
problem from the Rust track — Arc<Mutex<HashMap<String, Arc<Mutex<Room>>>>>. Mount
GET /rooms/{id}/ws (axum 0.8 {id} syntax), GET / (the demo page you’ll add next), and GET /healthz,
then axum::serve(...).with_graceful_shutdown(...) so SIGINT drains connections cleanly.
Agent prompt — paste into an agent with repo access
Role: Senior Rust engineer in this repo.
Context: axum 0.8 (ws), tokio, the redis crate, sqlx (postgres); Room (apply_local/apply_remote/sync_state)
and a Registry that lazy-loads a Room from its latest Postgres snapshot.
Task: Write src/main.rs that composes the whole service.
Requirements:
- #[tokio::main]; read PORT (default 8080), REDIS_URL, DATABASE_URL from env.
- Build one redis client + one sqlx PgPool; share a Registry (Arc<Mutex<HashMap<..>>>) via State.
- Router: GET /rooms/{id}/ws (WebSocketUpgrade), GET /healthz -> "ok", GET / -> the bundled demo page.
- axum::serve with .with_graceful_shutdown on a ctrl_c signal.
Tests / acceptance:
- `cargo build` and `cargo clippy -- -D warnings` are clean.
- `curl localhost:8080/healthz` returns "ok"; a websocat client to /rooms/demo/ws receives {"t":"sync",...}.
Output: a unified diff plus the shutdown sequence (what gets drained, in what order).What success looks like
cargo build and cargo clippy -- -D warnings are clean, and the service runs as one process. curl localhost:8080/healthz returns ok; a websocat client to /rooms/demo/ws now receives a {"t":"sync","state":"<base64>"} frame on connect (the room lazy-loaded from its latest Postgres snapshot, empty on first run). Ctrl-C drains in-flight connections through with_graceful_shutdown instead of dropping sockets. This is the step that makes everything runnable.
Compose the app: wire routes, engine, Redis, Postgres, shutdown (Go)
Go AdvancedWrite cmd/server/main.go that composes the parts into one runnable process — the HTTP mux with the WS route, a shared room registry, the Redis client, the pgx pool, and graceful shutdown.
New in this step
composition root The one file that builds and wires every dependency into a running process; cmd/server/main.go is where the pieces become a server you can run.
pgx / pgxpool pgx is the Go Postgres driver; a pgxpool.Pool keeps a reusable set of connections so loading or snapshotting a room never pays full connect cost.
http.ServeMux / http.Server The standard library router and server; mux.HandleFunc("GET /rooms/{id}/ws", ...) registers routes and http.Server runs them — no framework needed.
http.FileServer Serves a directory of static files; here it hands out web/demo.html at GET / so the two-client demo needs no build.
http.Server.Shutdown Stops accepting new connections and waits for in-flight ones to finish, so a deploy or Ctrl-C never drops a live editing socket mid-write.
SIGINT / SIGTERM The interrupt and terminate signals (Ctrl-C, a container stop); catching them with signal.Notify is what triggers the graceful Shutdown and stops subscriber goroutines.
One main that composes the parts into a process
cmd/server/main.go is the composition root. Read PORT/REDIS_URL/DATABASE_URL from env, open one
redis.Client and one pgxpool.Pool (both closed on shutdown), and build a room.Registry (a
map[string]*room.Room guarded by a mutex) that lazy-loads each room from its latest Postgres snapshot via
the codec you just wrote. Mount GET /rooms/{id}/ws (r.PathValue("id")), GET / (the demo page next), and
GET /healthz. Run an http.Server and call srv.Shutdown(ctx) on SIGINT/SIGTERM so in-flight sockets and
subscriber goroutines stop cleanly.
cmd/server/main.go (composition essentials)
func main() {
ctx := context.Background()
rdb := redis.NewClient(mustParse(os.Getenv("REDIS_URL")))
pool, err := pgxpool.New(ctx, os.Getenv("DATABASE_URL"))
if err != nil { log.Fatal(err) }
defer pool.Close()
reg := room.NewRegistry(rdb, store.New(pool)) // lazy-loads rooms from snapshots
mux := http.NewServeMux()
mux.HandleFunc("GET /rooms/{id}/ws", server.HandleWS(reg))
mux.HandleFunc("GET /healthz", func(w http.ResponseWriter, _ *http.Request) { w.Write([]byte("ok")) })
mux.Handle("GET /", http.FileServer(http.Dir("web"))) // serves web/demo.html
srv := &http.Server{Addr: ":" + cmp.Or(os.Getenv("PORT"), "8080"), Handler: mux}
go srv.ListenAndServe()
stop := make(chan os.Signal, 1)
signal.Notify(stop, os.Interrupt, syscall.SIGTERM)
<-stop
sctx, cancel := context.WithTimeout(ctx, 10*time.Second)
defer cancel()
_ = srv.Shutdown(sctx)
}Agent prompt — paste into an agent with repo access
Role: Senior Go engineer in this repo.
Context: internal/room (Registry, Room), internal/server (HandleWS), internal/store (snapshots via pgx),
internal/hub (go-redis pub/sub), internal/crdt (Doc + codec).
Task: Write cmd/server/main.go composing the service.
Requirements:
- Read PORT (default 8080), REDIS_URL, DATABASE_URL from env; one redis.Client + one pgxpool.Pool, closed on shutdown.
- Build a room.Registry that lazy-loads a room from its latest snapshot (crdt.Doc.Load) on first access.
- Routes: GET /rooms/{id}/ws, GET /healthz -> "ok", GET / -> http.FileServer over ./web.
- http.Server with Shutdown(ctx) on SIGINT/SIGTERM; subscriber goroutines stop on context cancel.
Tests / acceptance:
- `go build ./...` and `go vet ./...` pass.
- `curl localhost:8080/healthz` returns "ok"; a ws client to /rooms/demo/ws receives {"t":"sync",...}.
Output: a unified diff plus the shutdown order (server, then rooms' subscribers).What success looks like
go build ./... and go vet ./... pass, and cmd/server/main.go runs the whole process. curl localhost:8080/healthz returns ok; a ws client to /rooms/demo/ws receives {"t":"sync","state":"<base64>"} on connect — base64 of the gob-encoded Doc the registry lazy-loaded from the latest snapshot (empty on first run). SIGINT/SIGTERM calls srv.Shutdown(ctx) so in-flight sockets and subscriber goroutines stop cleanly: same routes, same shutdown discipline as the Rust path.
Broadcast ops with Redis pub/sub and reconcile late joiners
Rust AdvancedPublish every applied update to a per-room Redis channel and subscribe to it on every instance, and send a newcomer the full state before live ops — so a client on one server sees edits from a client on another.
New in this step
Redis pub/sub A broadcast channel: every server instance subscribes, anyone publishes, and Redis delivers each message to all of them — how instance 1’s edits reach a client on instance 2.
per-room channel One channel per document (room:{id}); each applied op (and presence) is published here as the JSON envelope, so fan-out stays scoped to the room that produced the edit.
subscriber task A per-room async task that listens on the channel and apply_remotes each incoming update into the local Doc, keeping every instance’s copy converged.
late-joiner sync A client connecting mid-session first gets one {"t":"sync"} with the full sync_state(), then the live stream — and because the CRDT tolerates duplicates and gaps, it never mis-applies the overlap.
Redis makes many server instances behave like one
With more than one server instance, a client on instance A must see edits from a client on instance B.
Publish each update to room:{id} and have every instance subscribe; Redis fans them out. A joining client
gets the one-time {"t":"sync","state":...} (the full encoded state from sync_state()) and then the live
stream — so it never misses or double-applies (the CRDT tolerates duplicates anyway). This is the pub/sub
fan-out the Redis track teaches in Fan out events with pub/sub.
Agent prompt — paste into an agent with repo access
Client A edits on server instance 1; client B is connected to instance 2. Trace the bytes: what exact envelope lands on the Redis `room:{id}` channel, and how does instance 2 turn it back into the same character on B's screen?
Role: Senior Rust engineer in this repo.
Context: yrs Room (apply_local/apply_remote/sync_state) + the redis crate (tokio-comp); the wire envelope
{"t":"op","seq","data":{"update":<base64>}} from the protocol step.
Task: Wire Redis pub/sub: publish each op envelope to channel room:{id}; subscribe and apply_remote incoming
updates; on a new WS connection send {"t":"sync","state":<base64 sync_state()>} before streaming live ops.
Requirements:
- One subscriber task per room; publish AFTER local apply; don't re-send an update to the socket it came from.
- The sync state is sync_state() = encode_state_as_update_v2(&StateVector::default()), base64-encoded.
Tests / acceptance:
- Two server instances + two clients: an edit on client A appears on client B within ~100ms.
- A late joiner ends byte-equal with existing clients.
Output: a unified diff plus the late-joiner sequence.What success looks like
Run two server instances and connect client A to one, client B to the other. An edit on A appears on B within ~100ms — even across instances, because each apply_local publishes its op to the Redis room:{id} channel and every instance’s subscriber apply_remotes it. On the wire that op is the engine-specific Rust envelope:
{ "t": "op", "seq": 41, "data": { "update": "<base64 encode_update_v2 bytes>" } }A client that joins late first gets {"t":"sync","state":<base64 sync_state()>}, then the live stream, and ends byte-equal with the existing clients — duplicates and gaps are harmless.
Broadcast ops with Redis pub/sub and reconcile late joiners
Go AdvancedPublish every applied op to a per-room Redis channel, subscribe to it across instances, and send newcomers the current state first — so a client on one server sees edits from a client on another.
New in this step
go-redis pub/sub The go-redis client’s Publish/Subscribe: every instance subscribes, anyone publishes, and Redis delivers to all — how instance 1’s edits reach a client on instance 2.
per-room channel One channel per document (room:{id}); each applied RGA op is published here as its self-describing JSON envelope, so fan-out stays scoped to the room that produced the edit.
SUBSCRIBE goroutine A per-room goroutine that reads the channel and runs each incoming op through ApplyInsert/ApplyDelete (idempotent, buffer-tolerant), keeping every instance’s Doc converged.
late-joiner sync A client connecting mid-session first gets one {"t":"sync"} with the serialized Doc, then the live stream — and because the RGA tolerates duplicates and gaps, the overlap is harmless.
Agent prompt — paste into an agent with repo access
Client A edits on instance 1; client B is on instance 2. What exact JSON envelope lands on the Redis `room:{id}` channel on the Go path, and how does instance 2's SUBSCRIBE goroutine turn it back into the same character on B?
Role: Senior Go engineer in this repo.
Context: The hand-rolled RGA Doc + github.com/redis/go-redis/v9.
Task: Wire Redis pub/sub: PUBLISH each op to room:{id}; a per-room SUBSCRIBE goroutine applies incoming ops;
a new WS connection receives {"t":"sync"} with the serialized Doc before the live op stream.
Requirements:
- Apply locally, then publish; incoming ops go through ApplyInsert/ApplyDelete (idempotent + buffered).
- Serialize the Doc (its element list) for the sync; gob or JSON is fine.
Tests / acceptance:
- Two instances + two clients: an edit on A reaches B within ~100ms.
- A late joiner converges byte-equal with existing clients.
Output: a unified diff plus how buffering handles out-of-order arrival.What success looks like
Two instances, client A on one and B on the other: an edit on A reaches B within ~100ms — each instance SUBSCRIBEs room:{id} and runs incoming ops through ApplyInsert/ApplyDelete (idempotent). On the Go path the published envelope is the self-describing RGA op, not opaque bytes. A late joiner receives {"t":"sync"} with the serialized Doc first, then the live stream, and converges byte-equal.
Persist and compact snapshots in Postgres
Rust AdvancedPeriodically write the encoded CRDT state to Postgres so a room survives restarts, and compact the op log into the snapshot to bound its growth.
New in this step
snapshot The full convergent state serialized to one Postgres row; on restart you load it and replay any later ops, so the room comes back byte-equal.
append-only op log The stream of every edit ever applied; it grows forever and tombstones accumulate, which is why you periodically fold it into a snapshot.
compaction Folding the op log up to a point into one snapshot and dropping those ops, bounding storage without losing the ability to reload.
BYTEA Postgres’s raw-bytes column type; the encoded CRDT state (yrs updateV2 bytes here) is stored directly in snapshots.state as BYTEA.
UPSERT An insert-or-update in one statement (INSERT ... ON CONFLICT DO UPDATE); keyed by room_id it keeps exactly one latest snapshot row per room.
Snapshots are the durable truth; compaction bounds growth
An append-only op log grows forever and tombstones accumulate. Periodically (every N ops or T seconds)
serialize the convergent state to a snapshot row and truncate the applied op log up to that point.
Restart = load the latest snapshot, then replay any ops after it.
snapshot table
CREATE TABLE snapshots (
room_id TEXT PRIMARY KEY,
state BYTEA NOT NULL, -- encoded CRDT state
op_seq BIGINT NOT NULL, -- last op included
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);Agent prompt — paste into an agent with repo access
Role: Senior Rust engineer in this repo.
Context: yrs Room with encodable state; Postgres via sqlx or tokio-postgres; the snapshots table above.
Task: Add snapshot persistence + compaction.
Requirements:
- Every 200 ops or 5s (whichever first), UPSERT the encoded state + op_seq into snapshots.
- On room load, fetch the snapshot, apply_update it into a fresh Doc, then replay later ops.
- Writes are transactional; never block the edit hot path (do it off a channel/task).
Tests / acceptance:
- Kill and reload a room: the document is byte-equal to before the restart.
Output: a unified diff plus the compaction trigger rationale.What success looks like
Type into room demo, wait for a snapshot to fire (every 200 ops or 5s), then kill the server and restart it. Reconnecting shows the same document — the room reloaded by apply_update-ing the snapshots.state bytes into a fresh Doc and replaying any later ops, byte-equal to before the restart. The compaction write runs off a task so it never blocks the edit hot path, and psql shows one row per room in snapshots with a monotonically rising op_seq.
Persist and compact snapshots in Postgres
Go AdvancedPeriodically serialize the Doc to Postgres and compact the op log so a room survives restarts and the log stops growing forever.
New in this step
snapshot The full convergent state serialized to one Postgres row; on restart you Doc.Load it and replay any later ops, so the room comes back byte-equal.
append-only op log The stream of every RGA op ever applied; it grows forever and tombstones accumulate, which is why you periodically fold it into a snapshot.
compaction Folding the op log up to a point into one snapshot and truncating those ops, bounding storage without losing the ability to reload.
BYTEA Postgres’s raw-bytes column type; the gob-encoded Doc (elements and pending buffer) is stored directly in snapshots.state as BYTEA.
UPSERT An insert-or-update in one statement (INSERT ... ON CONFLICT DO UPDATE); keyed by room_id it keeps exactly one latest snapshot row per room.
Agent prompt — paste into an agent with repo access
Role: Senior Go engineer in this repo.
Context: The RGA Doc (serializable element list); pgx; the snapshots table (room_id, state BYTEA, op_seq).
Task: Add snapshot persistence + compaction.
Requirements:
- Every 200 ops or 5s, UPSERT the serialized Doc + op_seq; truncate the in-memory op log up to op_seq.
- On room load, deserialize the snapshot then replay later ops; persistence runs off the hot path.
Tests / acceptance:
- Reload a room from a snapshot and assert String() equals the pre-restart document.
Output: a unified diff plus how you keep persistence off the edit path.What success looks like
Edit room demo, let a snapshot fire (every 200 ops or 5s), kill the server, and restart. The reloaded room String()s to exactly the pre-restart text — Doc.Load rebuilt it (elements and pending buffer) from the gob bytes in snapshots.state, then replayed later ops. The snapshots row stores gob where Rust stores updateV2 bytes, but it’s the same table and the same byte-equal restart guarantee, and the write stays off the edit hot path.
Prove convergence with a property test
Rust AdvancedWrite a randomized test that applies the same concurrent ops in many shuffled orders and asserts every replica ends identical — turning convergence from a claim into a proof for your yrs integration.
New in this step
convergence The property that every replica which has seen the same edits ends with byte-identical text, regardless of order or duplicates — the one guarantee that defines this whole project.
property / fuzz test A test that asserts an invariant (“all replicas equal”) over many randomly generated inputs, rather than a handful of fixed cases — it catches the order that breaks you.
proptest A Rust property-testing crate (a seeded loop works too) that generates and shuffles op sets across iterations so convergence is exercised under thousands of orderings.
Convergence is the one property you must test
Everything else is plumbing; convergence is the guarantee. Generate random insert/delete op sets, shuffle
them, apply to multiple fresh Docs in different orders (with duplicates), and assert all get_string
outputs are equal. yrs is well-tested, so this guards your integration, not the library.
Agent prompt — paste into an agent with repo access
Role: Senior Rust engineer in this repo.
Context: yrs Room integration.
Task: Add a property/fuzz test for convergence (proptest or a seeded loop).
Requirements:
- Generate K random ops; apply them to 3 fresh Docs in 3 different shuffled orders, with some duplicates.
- Assert all 3 get_string outputs are byte-equal.
- Run >= 1000 iterations in CI.
Tests / acceptance:
- `cargo test convergence` passes reliably.
Output: a unified diff plus what classes of bug this would catch.What success looks like
cargo test convergence passes across ≥1000 iterations — K random ops applied to 3 fresh Docs in 3 shuffled orders, with duplicates, all produce byte-equal get_string() — proving the property for your yrs integration, not just assuming it from the library.
Prove convergence with a property test
Go AdvancedWrite a randomized test that applies the same concurrent ops in many shuffled orders and asserts every replica ends identical — the proof that your hand-rolled RGA is a real CRDT, not just a plausible one.
New in this step
convergence The property that every replica which has seen the same edits ends with byte-identical text, regardless of order or duplicates — the one guarantee that defines this whole project.
testing/quick Go’s standard property-testing helper (a seeded loop works too); it generates and shuffles op sets across iterations so convergence is exercised under thousands of orderings.
-race flag go test -race instruments the binary to catch concurrent unsynchronised access; passing it proves the per-room mutex actually guards the Doc.
Agent prompt — paste into an agent with repo access
Role: Senior Go engineer in this repo.
Context: The hand-rolled RGA Doc.
Task: Add a convergence property test.
Requirements:
- Generate K random insert/delete ops; apply to 3 fresh Docs in different shuffled orders, with duplicates
and some out-of-order (after-before-anchor) arrivals to exercise buffering.
- Assert all 3 String() outputs are identical.
- Use testing/quick or a seeded loop; >= 1000 iterations.
Tests / acceptance:
- `go test ./... -run TestConvergence -race` passes reliably.
Output: a unified diff plus which ordering hazards the test exercises.What success looks like
go test ./... -run TestConvergence -race passes reliably over ≥1000 iterations: K random insert/delete ops applied to 3 fresh Docs in different shuffled orders — with duplicates and out-of-order (after-before-anchor) arrivals to exercise the buffer — all yield identical String(). -race clean means the per-room mutex actually guards the Doc. Same property and same identical-final-text assertion as the Rust test; this is the guarantee that makes the hand-rolled RGA a real CRDT, not just a plausible one.
Show who's here: presence and live cursors with Redis
IntermediateTrack who’s online in one per-room Redis sorted set keyed by an expiry score — heartbeat with ZADD, list live members with ZRANGEBYSCORE, sweep stale ones with ZREMRANGEBYSCORE, and broadcast cursor moves over the same pub/sub channel — so presence stays fast and ephemeral, never touching the document.
New in this step
ephemeral presence Who’s online, where their cursor is, and their colour — throwaway state you broadcast but never persist into the CRDT, so it can’t corrupt the document.
Redis sorted set A set whose members each carry a numeric score and stay ordered by it; one set per room (presence:{room}) holds the whole room’s membership in a single key.
ZADD Adds or re-scores a member; a heartbeat re-adds the user with a fresh now + ttl score, which is how membership stays current.
ZRANGEBYSCORE Returns members whose score falls in a range; ZRANGEBYSCORE presence:{room} <now> +inf is exactly the still-live members.
ZREMRANGEBYSCORE Removes members in a score range; a periodic sweep of -inf <now> reclaims everyone whose expiry has fallen into the past.
score-as-expiry Storing each member’s unix-ms expiry as its score lets one ranged read filter live from dead, instead of giving every user a separate TTL key.
why KEYS is a foot-gun KEYS presence:* scans the entire keyspace and blocks the server; a sorted set sidesteps it — list membership with one ranged read, never KEYS.
Presence is ephemeral — one sorted set, never KEYS
Presence (who’s online, where their cursor is, their colour) is throwaway state — never persist it into
the CRDT. Keep it in one Redis sorted set per room, presence:{room}, where the member is the user id
and the score is the entry’s expiry as a unix-ms timestamp. A heartbeat re-adds the member with a fresh
now + ttl score (ZADD); the live members are exactly those whose score is still in the future
(ZRANGEBYSCORE presence:{room} <now> +inf); a periodic sweep drops the expired ones
(ZREMRANGEBYSCORE presence:{room} -inf <now>). One key holds the whole room, so listing presence is a
single ranged read — never KEYS presence:*, which scans the entire keyspace and blocks the server (the
foot-gun the Redis track calls out in Scan the keyspace safely in production; sorted-set
membership sidesteps the scan-vs-KEYS problem entirely). Cursor moves still broadcast over the same pub/sub
channel as a presence message, exactly like an op.
Presence as a sorted set (redis-cli)
# heartbeat every few seconds: re-add the member with a fresh expiry score (now + 10s, in ms)
ZADD presence:roomA 1718900000000 kerim
# who's here right now = members whose expiry is still in the future (score >= now)
ZRANGEBYSCORE presence:roomA 1718899990000 +inf
# sweep the expired: drop everyone whose score has fallen into the past
ZREMRANGEBYSCORE presence:roomA -inf 1718899990000What success looks like
Heartbeat two users with ZADD presence:roomA <now+ttl> <user>, then ZRANGEBYSCORE presence:roomA <now> +inf lists exactly the members whose expiry is still in the future. Stop one user’s heartbeat and they stop appearing in the ZRANGEBYSCORE <now> +inf read the moment their score falls behind now; a periodic ZREMRANGEBYSCORE … -inf <now> sweep then reclaims the row — expiry is enforced by the score filter, not a Redis TTL. Membership is one ranged read over a single sorted set, never a KEYS presence:* keyspace scan. Cursor moves arrive on other clients as {"t":"presence",...} frames and never touch the document — presence stays ephemeral.
Watch two clients converge: the bundled demo page
IntermediateSave one self-contained web/demo.html — no build, no framework — that opens two WebSockets to /rooms/demo/ws, shows two textareas side by side, and has a per-client offline/reconnect toggle so you can see concurrent edits merge and queued offline edits replay.
New in this step
optimistic local apply A pane applies your keystroke to its own document immediately, before the server confirms it, so typing feels instant; the CRDT guarantees it still converges with everyone else.
offline op queue + replay While a client is offline its ops pile up in a local list; on reconnect it flushes the list and the server merges the backlog — the reconnect-replay payoff the project promised.
ESM import from a CDN (esm.sh) Loading a JS library as a browser ES module straight from a URL (import ... from "https://esm.sh/yjs"), so the demo runs with no bundler or install step.
The payoff, in two browser tabs
This is the artifact the project promised and the definition of done in docs/project-specs/concord.md
hinges on — and it is the page your composed main already serves at GET /. It is deliberately zero-install:
one static file the backend hands out, using only the browser’s WebSocket. Each pane is an independent client
on room demo. Type in either box; the demo diffs the change, emits the canonical op envelopes from the
Define the op and sync wire protocol step, and the server applies + broadcasts them so both panes converge.
A tiny client-side RGA in the page applies the self-describing Go op (kind, id, after, value) and
re-renders the text — the same descendant-aware placement you built, so what you see in the browser matches
what the engine computes. The offline toggle closes a client’s socket and queues its ops locally; reconnect
flushes the queue, the server merges the backlog, and both panes end byte-identical. That is convergence and
reconnect-replay, visible.
web/demo.html
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Concord — two-client convergence demo</title>
<style>
body { font: 15px system-ui, sans-serif; margin: 24px; background: #0b0f17; color: #e8edf5; }
h1 { font-size: 18px; }
.panes { display: flex; gap: 16px; flex-wrap: wrap; }
.pane { flex: 1 1 320px; border: 1px solid #243049; border-radius: 10px; padding: 12px; }
.bar { display: flex; align-items: center; gap: 10px; margin-bottom: 8px; }
.pill { font-size: 12px; padding: 2px 8px; border-radius: 999px; border: 1px solid #355; }
.pill.live { color: #3FD27A; border-color: #2c6; }
.pill.off { color: #f5b14c; border-color: #a83; }
textarea { width: 100%; height: 220px; box-sizing: border-box; font: 14px ui-monospace, monospace;
background: #0e1422; color: #e8edf5; border: 1px solid #243049; border-radius: 8px; padding: 10px; resize: vertical; }
button { font: inherit; cursor: pointer; border-radius: 8px; border: 1px solid #355; background: #16203a; color: inherit; padding: 4px 10px; }
.hint { color: #9fb0c8; font-size: 13px; margin: 6px 0 14px; }
</style>
</head>
<body>
<h1>Concord — type in both, watch them converge</h1>
<p class="hint">Two independent clients on room <code>demo</code>. Type in either box; edits flow as CRDT ops
through the server and both converge. Click <b>Go offline</b> on one, keep typing, then <b>Reconnect</b> —
its queued ops replay and both boxes end identical.</p>
<div class="panes" id="panes"></div>
<script type="module">
import * as Y from "https://esm.sh/yjs@13.6.14";
const WS_URL = (location.origin.replace(/^http/, "ws")) + "/rooms/demo/ws";
// Change to "rust" if you are running the Rust backend
const BACKEND = "go";
// Base64 helpers for the Rust (yrs/Yjs) path
const toBase64 = (arr) => {
let s = ""; for (let i = 0; i < arr.length; i++) s += String.fromCharCode(arr[i]);
return btoa(s);
};
const fromBase64 = (str) => {
const bin = atob(str), arr = new Uint8Array(bin.length);
for (let i = 0; i < bin.length; i++) arr[i] = bin.charCodeAt(i);
return arr;
};
// --- GO ENGINE (RGA) ---
// A tiny client-side RGA, faithful to the Go engine: applies the self-describing
// {kind,id,after,value} op so each pane can render the convergent document.
function newRGA() { return { elems: [], index: new Map() }; }
const key = (id) => id.site + ":" + id.counter;
const isZero = (id) => id.site === "" && id.counter === 0;
const sameId = (a, b) => a.site === b.site && a.counter === b.counter;
function firstWins(a, b) { return a.counter !== b.counter ? a.counter < b.counter : a.site < b.site; }
function reindex(d, from) { for (let i = from; i < d.elems.length; i++) d.index.set(key(d.elems[i].id), i); }
function descendsFrom(d, id, anchor) {
while (!isZero(id)) {
const i = d.index.get(key(id));
if (i === undefined) return false;
id = d.elems[i].after;
if (sameId(id, anchor)) return true;
}
return false;
}
function rgaInsert(d, e) {
if (d.index.has(key(e.id))) return; // idempotent
let pos = 0;
if (!isZero(e.after)) {
const a = d.index.get(key(e.after));
if (a === undefined) return; // anchors arrive in order on one socket
pos = a + 1;
}
while (pos < d.elems.length) { // decide only at true siblings; skip descendants
const c = d.elems[pos];
if (sameId(c.after, e.after)) { if (firstWins(e.id, c.id)) break; pos++; continue; }
if (descendsFrom(d, c.after, e.after)) { pos++; continue; }
break;
}
d.elems.splice(pos, 0, e);
reindex(d, pos);
}
function rgaDelete(d, id) { const i = d.index.get(key(id)); if (i !== undefined) d.elems[i].deleted = true; }
function rgaText(d) { let s = ""; for (const e of d.elems) if (!e.deleted) s += e.value; return s; }
function visibleIds(d) { const a = []; for (const e of d.elems) if (!e.deleted) a.push(e.id); return a; }
// One client per pane.
function makeClient(site, root) {
// Go state
const docGo = newRGA();
let counter = 0;
// Rust state
const docRust = new Y.Doc();
const textRust = docRust.getText("body");
let seq = 0, online = false, ws = null;
const queue = []; // ops typed while offline
const ta = root.querySelector("textarea");
const pill = root.querySelector(".pill");
const toggle = root.querySelector("button");
const send = (data) => {
const frame = JSON.stringify({ t: "op", seq: seq++, data });
if (online && ws && ws.readyState === 1) ws.send(frame); else queue.push(frame);
};
// The Rust server speaks v2 exclusively (encode_update_v2 / decode_v2), so the
// browser must emit and decode v2 too — a v1 update decodes to an empty doc with
// no error, silently breaking sync.
docRust.on("updateV2", (update, origin) => {
if (origin !== "remote") send({ update: toBase64(update) });
});
const render = () => {
const at = ta.selectionStart;
ta.value = BACKEND === "go" ? rgaText(docGo) : textRust.toString();
ta.selectionStart = ta.selectionEnd = Math.min(at, ta.value.length);
};
ta.addEventListener("input", () => { // diff the box, emit insert/delete ops
const next = ta.value;
if (BACKEND === "go") {
const prev = rgaText(docGo);
let p = 0; while (p < prev.length && p < next.length && prev[p] === next[p]) p++;
let s = 0; while (s < prev.length - p && s < next.length - p && prev[prev.length - 1 - s] === next[next.length - 1 - s]) s++;
for (const id of visibleIds(docGo).slice(p, prev.length - s)) { rgaDelete(docGo, id); send({ kind: "delete", id }); }
const ids = visibleIds(docGo);
let after = p > 0 ? ids[p - 1] : { site: "", counter: 0 };
for (const ch of next.slice(p, next.length - s)) {
const id = { site, counter: ++counter };
rgaInsert(docGo, { id, after, value: ch, deleted: false });
send({ kind: "insert", id, after, value: ch });
after = id;
}
} else {
const prev = textRust.toString();
let p = 0; while (p < prev.length && p < next.length && prev[p] === next[p]) p++;
let s = 0; while (s < prev.length - p && s < next.length - p && prev[prev.length - 1 - s] === next[next.length - 1 - s]) s++;
docRust.transact(() => {
if (prev.length - s - p > 0) textRust.delete(p, prev.length - s - p);
if (next.length - s - p > 0) textRust.insert(p, next.slice(p, next.length - s));
}, "local");
}
render();
});
const connect = () => {
ws = new WebSocket(WS_URL);
ws.onopen = () => {
online = true; pill.textContent = "live"; pill.className = "pill live";
while (queue.length) ws.send(queue.shift()); // replay queued ops on reconnect
};
ws.onmessage = (ev) => {
let m; try { m = JSON.parse(ev.data); } catch (_) { return; }
if (m.t === "op" && m.data) {
if (BACKEND === "go" && m.data.kind) {
if (m.data.kind === "insert") rgaInsert(docGo, { id: m.data.id, after: m.data.after, value: m.data.value, deleted: false });
else if (m.data.kind === "delete") rgaDelete(docGo, m.data.id);
render();
} else if (BACKEND === "rust" && m.data.update) {
Y.applyUpdateV2(docRust, fromBase64(m.data.update), "remote");
render();
}
} else if (m.t === "sync" && m.state && BACKEND === "rust") {
try { Y.applyUpdateV2(docRust, fromBase64(m.state), "remote"); render(); } catch(e) {}
}
};
ws.onclose = () => { online = false; pill.textContent = "offline"; pill.className = "pill off"; };
};
toggle.addEventListener("click", () => {
if (online) ws.close(); else { pill.textContent = "reconnecting…"; connect(); }
toggle.textContent = online ? "Reconnect" : "Go offline";
});
connect();
}
const panes = document.getElementById("panes");
[["a", "Client A"], ["b", "Client B"]].forEach(([site, label]) => {
const el = document.createElement("div");
el.className = "pane";
el.innerHTML =
'<div class="bar"><b>' + label + '</b><span class="pill">connecting…</span>' +
'<button>Go offline</button></div><textarea spellcheck="false"></textarea>';
panes.appendChild(el);
makeClient(site, el);
});
</script>
</body>
</html>Agent prompt — paste into an agent with repo access
Take one pane offline and type several characters so they queue locally. Before the queue flushes on reconnect, do the two panes match? And once it flushes, what makes both converge rather than the backlog overwriting — or being overwritten by — what the other pane typed meanwhile?
Role: Front-end engineer in this repo (vanilla JS, no framework, no build step).
Context: The composed server serves this file at GET / and exposes GET /rooms/{id}/ws. The wire protocol is
{"t":"sync","state":"<base64>"} once on connect, then {"t":"op","seq":N,"data":{...}} both directions; the Go
op.data is {kind, id, after, value}, the Rust op.data is {update: <base64>}. Two clients share room "demo".
Task: Write web/demo.html — one self-contained file with two side-by-side textareas, each an independent
client on /rooms/demo/ws, plus a per-client offline/reconnect toggle.
Requirements:
- Add a BACKEND = "go" | "rust" toggle (a const in the script is fine).
- Go: diff textarea against current text, emit/apply {"t":"op","seq","data":{kind...}} using a small RGA that matches the Go engine's (Counter, Site) tie-break.
- Rust: use ESM Yjs to apply {"t":"op","seq","data":{update:...}} and the sync state. Encode updates to base64 before sending.
- Offline toggle: close the socket and queue outgoing ops locally; on reconnect, flush the queue (replay).
- A status pill per client: live / offline / reconnecting…; ignore the sync frame's opaque bytes on Go.
Tests / acceptance:
- Open in two browser tabs: typing in one appears in the other within ~100ms and both converge byte-equal.
- Take one client offline, type several characters, reconnect — the queued ops replay and both panes match.
Output: the complete web/demo.html, no commentary.What success looks like
Open the served page at GET /. Type in both panes at the same spot at the same time: edits flow as op envelopes through the server and within ~100ms both textareas show the identical document — both keystrokes survive, no last-write-wins, no conflict dialog. Click Go offline on one pane (pill flips to offline), type several characters that queue locally, then Reconnect — the queued ops replay, the server merges the backlog, and both panes end byte-identical (on the Rust path set BACKEND = "rust"; the panes round-trip opaque updateV2 bytes — same wire envelope). This is the convergence and reconnect-replay the project promised, visible in two panes.
Build the collaborative editor (Jetpack Compose)
Jetpack Compose IntermediateBuild the editor screen — render the live text, drive the room WebSocket, apply remote ops to the visible text, and draw remote cursors — so a Compose user edits the same converging document as every other client.
New in this step
BasicTextField Compose’s editable text widget; its value changes are what you diff into CRDT ops, and it re-renders the converged document.
ViewModel The screen’s state holder that owns the document model and the socket, surviving recomposition and config changes.
optimistic local apply + authoritative merge Show your own keystroke instantly, but treat the server/CRDT stream as the source of truth — remote ops merge in and the text still converges for everyone.
presence heartbeat A small presence message (cursor index, colour) sent every few seconds to refresh your entry in the room’s sorted set, so others see you as online.
index↔element-id bridge The mapping that turns a flat (offset, length) change in the text field into ops on element ids — and back — so a plain widget can drive an id-keyed CRDT.
Agent prompt — paste into an agent with repo access
Role: Android engineer (Kotlin, Jetpack Compose) in this repo.
Context: Server /rooms/{id}/ws sends {"t":"sync"|"op"|"presence"}; the client sends ops + presence heartbeats.
Task: Build a collaborative editor screen.
Requirements:
- A BasicTextField whose changes emit ops to the socket (Go = id-keyed JSON insert/delete ops; Rust/yrs = opaque base64 update bytes).
- Apply incoming "op" messages to the local document model; render remote cursors from "presence".
- Send a presence heartbeat (cursor index, colour) every few seconds; reconnect with backoff.
- Local echo is optimistic; remote ops merge (the server/CRDT is the source of truth).
Tests / acceptance:
- ViewModel unit test: applying two interleaved remote ops yields the expected text.
Output: a unified diff plus the optimistic-local + authoritative-merge flow.What success looks like
The ViewModel unit test goes green: applying two interleaved remote ops yields the expected merged text. On device the BasicTextField reflects the converged document with optimistic local edits, and remote cursors render from presence heartbeats.
Build the collaborative editor (Flutter)
Flutter IntermediateBuild the editor screen — render the live text with a TextField, stream ops over the room WebSocket, and paint remote cursors — so a Flutter user edits the same converging document as every other client.
New in this step
TextField Flutter’s editable text widget (with a controller); its edits are what you diff into CRDT ops, and it re-renders the converged document.
state notifier (Riverpod/Bloc) The screen’s state holder that owns the document model and the socket and rebuilds the UI when either changes.
optimistic local apply + authoritative merge Show your own keystroke instantly, but treat the server/CRDT stream as the source of truth — remote ops merge in and the text still converges for everyone.
presence heartbeat A small presence message (cursor index, colour) sent every few seconds to refresh your entry in the room’s sorted set, so others see you as online.
index↔element-id bridge The mapping that turns a flat (offset, length) change in the text field into ops on element ids — and back — so a plain widget can drive an id-keyed CRDT.
Agent prompt — paste into an agent with repo access
Role: Flutter engineer (Dart) in this repo.
Context: Server /rooms/{id}/ws sends {"t":"sync"|"op"|"presence"}.
Task: Build a collaborative editor screen with a state notifier (Riverpod/Bloc).
Requirements:
- A TextField/EditableText whose edits emit ops to the socket (Go = id-keyed JSON insert/delete ops; Rust/yrs = opaque base64 update bytes).
- Apply incoming ops to the document model; overlay remote cursors from presence.
- Heartbeat presence every few seconds; reconnect with backoff; optimistic local + authoritative merge.
Tests / acceptance:
- A notifier unit test: two interleaved remote ops produce the expected text.
Output: a unified diff plus the reconnect/backoff strategy.What success looks like
The notifier unit test goes green: two interleaved remote ops produce the expected merged text. The TextField renders the live document with optimistic local edits, remote cursors overlay from presence, and a dropped socket reconnects with backoff and re-syncs without losing edits.
Build the collaborative editor (SwiftUI)
SwiftUI IntermediateBuild the editor screen — render the live text, drive the room WebSocket with URLSessionWebSocketTask, and show remote cursors — so a SwiftUI user edits the same converging document as every other client.
New in this step
TextEditor SwiftUI’s multi-line editable text view; its changes are what you diff into CRDT ops, and it re-renders the converged document.
URLSessionWebSocketTask Apple’s built-in WebSocket client; you send ops and presence on it and receive the live sync/op/presence stream.
@Observable model on @MainActor An @Observable state object, mutated on the @MainActor, that owns the document and socket so the UI updates safely off the network stream.
optimistic local apply + authoritative merge Show your own keystroke instantly, but treat the server/CRDT stream as the source of truth — remote ops merge in and the text still converges for everyone.
presence heartbeat A small presence message (cursor index, colour) sent every few seconds to refresh your entry in the room’s sorted set, so others see you as online.
index↔element-id bridge The mapping that turns a flat (offset, length) change in the editor into ops on element ids — and back — so a plain view can drive an id-keyed CRDT.
Agent prompt — paste into an agent with repo access
Role: iOS engineer (Swift, SwiftUI, Swift Concurrency) in this repo.
Context: Server /rooms/{id}/ws sends {"t":"sync"|"op"|"presence"}.
Task: Build a collaborative editor backed by an @Observable model and URLSessionWebSocketTask.
Requirements:
- A TextEditor whose edits emit ops (Go = id-keyed JSON insert/delete; Rust/yrs = opaque base64 update bytes); apply incoming ops to the model on @MainActor.
- Render remote cursors from presence; heartbeat every few seconds; reconnect with backoff.
- Optimistic local edits, authoritative merge from the server.
Tests / acceptance:
- A unit test: applying two interleaved remote ops yields the expected text.
Output: a unified diff plus the concurrency boundaries.What success looks like
The unit test goes green: applying two interleaved remote ops yields the expected merged text. The TextEditor renders the live document on @MainActor with optimistic local edits and authoritative server merges, while the URLSessionWebSocketTask reconnects with backoff and re-syncs.
Deploy: Cloud Run + Cloud SQL + Memorystore
AdvancedRun the WebSocket server on Cloud Run with managed Postgres (Cloud SQL) and Redis (Memorystore) — the same convergence guarantee, now on serverless infrastructure where clients reconnect and re-sync painlessly.
New in this step
Cloud Run Google’s serverless container host; it runs your WebSocket server and scales instances up and down, with Memorystore keeping every instance consistent via pub/sub.
Cloud SQL Managed PostgreSQL — the same Postgres your snapshots target locally, now hosted, so the durable convergent truth survives in the cloud.
Memorystore Managed Redis; it provides the pub/sub fan-out and presence that keep edits flowing across Cloud Run instances.
connection-duration cap Serverless platforms cap how long one request (here a long-lived socket) may stay open, so clients must reconnect and re-sync — which the CRDT makes painless because duplicates and gaps converge.
WebSockets on serverless — mind the connection lifetime
Cloud Run supports WebSockets but caps request (connection) duration, so clients must reconnect and
re-sync — which the CRDT makes painless (duplicates and gaps converge). Pub/sub via Memorystore keeps
every instance consistent; Cloud SQL holds snapshots. Set a sensible max instance concurrency for long-lived
sockets.
Deploy
gcloud run deploy concord \
--source . \
--add-cloudsql-instances PROJECT:us-central1:concord-pg \
--set-env-vars REDIS_URL=...,DATABASE_URL=... \
--region us-central1 --allow-unauthenticated --timeout 3600What success looks like
gcloud run deploy prints a public service URL. Open the demo against it in two tabs and edits still converge across instances — Memorystore fans ops out, Cloud SQL holds the snapshots. When Cloud Run caps a long-lived socket and drops it, the client reconnects and re-syncs; because the CRDT tolerates duplicates and gaps, the document comes back byte-equal with no lost edits — the same convergence guarantee, now over managed infrastructure.
Suggest a rewrite or continuation with Gemini
Optional add-on IntermediateMount POST /rooms/{id}/copilot — request the selected text plus a mode, call Gemini for a rewrite or continuation, and return only the suggestion — so the AI proposes, and nothing in the document changes until the user accepts.
New in this step
Gemini Google’s LLM API; here it returns a rewrite or continuation of the selected text — an additive Copilot, never load-bearing for collaboration correctness.
generateContent The Gemini text-generation method; you send a tight instruction plus the selection and read back the suggested text.
server-side API key The GEMINI_API_KEY stays on your server; the editor calls your endpoint, never Gemini directly, so the key is never shipped to a client.
200 / 400 / 502 status codes Distinct outcomes the editor can tell apart: 200 with the suggestion, 400 for an unknown mode, 502 when the upstream Gemini call fails — a bad request vs an outage.
The AI never edits — it proposes (over one endpoint)
The Copilot is additive: the user selects text, you ask Gemini for a rewrite/continuation, and you show
it. Nothing changes until the user accepts — consistent with the platform’s “no auto-actions” rule. Keep the
GEMINI_API_KEY server-side; the editor calls your endpoint, never Gemini directly. Mount one route,
POST /rooms/{id}/copilot, with the canonical wire contract from the spec: the request body is
{ selection, mode } where mode is rewrite or continue, and the success response is { suggestion }.
Return 200 with the suggestion, 400 for an unknown mode, and 502 when the upstream Gemini call
fails — so the editor can tell a bad request apart from an upstream outage. The suggestion is only proposed
here; the next step turns an accepted one into CRDT ops.
Chat prompt — paste into a chat to get the code
Role: Gemini integration engineer. The reader has no repo here — return complete code.
Context: A server endpoint in the user's selected backend; GEMINI_API_KEY in env.
Task: Mount POST /rooms/{id}/copilot. Request {"selection": "...", "mode": "rewrite"|"continue"} ->
response {"suggestion": "..."}. It calls suggest(selection, mode), which asks Gemini generateContent with a
tight instruction and returns the suggested text.
Requirements:
- Wire contract: 200 with {"suggestion": "..."} on success; 400 when mode is not "rewrite" or "continue";
502 when the upstream Gemini call fails or times out.
- Server-side key; 20s timeout; return plain text in suggestion; cap input length.
- "rewrite" improves clarity/grammar of the selection; "continue" extends it in the same voice.
- Link the official Gemini text-generation docs rather than hardcoding a model id that may change.
Tests / acceptance (describe):
- POST with mode "continue" on "The meeting is" returns 200 and a non-empty suggestion.
- POST with mode "summarise" returns 400 without calling Gemini; an upstream failure returns 502.
Output: the complete handler (route + suggest), no commentary.What success looks like
POST /rooms/demo/copilot with {"selection":"The meeting is","mode":"continue"} returns 200 and {"suggestion":"..."} — a non-empty continuation in the same voice. An unknown mode like "summarise" returns 400 without calling Gemini, and an upstream failure or timeout returns 502, so the editor can tell a bad request from an outage. The suggestion is only proposed — nothing in the document changes until the next step turns an accepted one into CRDT ops.
Insert the accepted suggestion as CRDT ops
Optional add-on IntermediateWhen the user accepts a suggestion, turn it into ordinary insert/delete ops so it merges and syncs like any other edit.
An AI edit is just an edit
The suggestion must flow through the same CRDT path as a keystroke — replace the selection with delete ops for the old element ids and insert ops for the new text. Then it broadcasts, persists, and converges exactly like a human edit, with full collaboration and undo. No special-casing.
Agent prompt — paste into an agent with repo access
Role: Senior engineer in this repo (use the selected backend + frontend).
Context: The editor has a current selection (a range of element ids) and a suggested replacement string.
Task: Apply an accepted Copilot suggestion as CRDT ops.
Requirements:
- Emit delete ops for the selected element ids and insert ops for the replacement (anchored correctly).
- Route through the normal local-apply + broadcast path; no bypass of the CRDT or persistence.
- The change must appear on other clients and survive a restart.
Tests / acceptance:
- Accepting a suggestion on one client updates a second client's document to the same text.
Output: a unified diff plus why routing through the CRDT preserves convergence + undo.What success looks like
Accepting a Copilot suggestion on one client deletes the selection’s element ids and inserts the new text as ordinary CRDT ops — so a second connected client’s document updates to the same text within ~100ms and the change survives a restart. Because it flows through the normal local-apply + broadcast path with no privileged write, the AI edit merges, persists, and undoes exactly like a keystroke.
From one snapshot to an authored version history
Optional add-on IntermediateGeneralize the single latest-state snapshot into many named, authored versions — keep the op-log as the event stream and stamp a labelled checkpoint for each saved version — so a document gains a browsable, restorable history.
New in this step
event sourcing Treat the stream of edits as the source of truth — you never mutate past events, you append new ones and stamp checkpoints, so any point in history can be reconstructed.
op-log as event stream The sequence of CRDT ops is the event log; a version is just a named point over it, which is what makes time-travel and diff fall out of the model.
version (named checkpoint) A row holding the full encoded state at a point, plus author, label, and the op sequence it covers — stored whole (not a delta) so it can be rebuilt on its own.
Event sourcing: the op-log is the log, snapshots are checkpoints
The base build already persists the convergent state in the Persist and compact snapshots in Postgres step — one row holding the latest encoded CRDT state. History generalizes that: the stream of CRDT ops is an event log, and a version is a named checkpoint over it (author, label, the op sequence it covers, a timestamp). You never mutate past events; you append new ones and stamp checkpoints — classic event sourcing. Each version stores a full encoded state, not a delta, so any version can be reconstructed on its own later.
versions table
CREATE TABLE versions (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
room_id TEXT NOT NULL,
label TEXT NOT NULL, -- human name, e.g. "before the rewrite"
author TEXT NOT NULL,
state BYTEA NOT NULL, -- full encoded CRDT state at this point
op_seq BIGINT NOT NULL, -- last op included in this snapshot
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX versions_room_time ON versions (room_id, created_at DESC);Capture a named version
Optional add-on IntermediateAdd an endpoint that snapshots the room’s current convergent state into the versions table with the author and a label — a read of the converged state, so a version can never diverge from what collaborators see.
Reuse the encoder you already have
Capturing a version is the snapshot step plus metadata: encode the current convergent state exactly as the persistence step does (Rust: encode_state_as_update_v2 with a default StateVector for the full state; Go: serialize the RGA element list), then insert it with the author, a label, and the current op sequence. Nothing about the live document changes — a version is a read of the converged state, so it can never diverge from what collaborators see.
Agent prompt — paste into an agent with repo access
Role: Senior engineer in this repo (use the selected backend).
Context: Rooms already persist a latest-state snapshot to Postgres; the versions table (room_id, label, author, state BYTEA, op_seq) exists.
Task: Add POST /rooms/{id}/versions { label, author } that captures the current convergent state as a named version.
Requirements:
- Encode the FULL current state the same way the snapshot step does (Rust: encode_state_as_update_v2 with a default StateVector; Go: serialize the RGA element list); store it in `state`.
- Record the current op_seq so a later compaction can tell which ops the version already covers.
- Read a consistent converged state (snapshot under the same lock/transaction the persistence step uses); run it off the edit hot path.
Tests / acceptance:
- Capturing twice yields two rows; each row's state reloads (into a fresh Doc / fresh RGA) to the exact text shown at capture time.
- Capturing never blocks or alters live editing.
Output: a unified diff plus where the capture hooks into the existing persistence path.What success looks like
POST /rooms/demo/versions {"label":"before the rewrite","author":"kerim"} returns 201 with {"id":12,"op_seq":41}. Capturing twice yields two versions rows, and each row’s full encoded state reloads — into a fresh Doc (Rust apply_update) or fresh RGA (Go deserialize) — to the exact text shown at capture time. A version is a read of the converged state, so it captures off the hot path and never blocks or alters live editing.
Browse versions and diff two points
Optional add-on AdvancedList a room’s versions and compute a structural diff between any two — added, removed, and moved element ranges, not a character-by-character text diff — so the comparison knows what moved instead of guessing.
New in this step
structural (id-keyed) diff Because every element carries a stable unique id, you diff by id — ids only in B are added, ids only in A are removed, ids whose order changed are moved — a precise semantic diff a text diff can only guess at.
Diff the structure, using the CRDT's stable ids
A naive text diff guesses what moved; your CRDT already knows, because every element carries a stable, globally-unique id. Materialize each version (Rust: apply its stored update into a fresh Doc; Go: deserialize its element list) into an ordered list of (id, value, deleted), then diff by id: ids in B but not A are insertions, ids in A but not B are deletions, and ids whose relative order changed are moves. The result is a precise, semantic diff — exactly what a “compare versions” view renders — and it falls out of the model instead of being reverse-engineered from text.
Agent prompt — paste into an agent with repo access
Role: Senior engineer in this repo (use the selected backend).
Context: versions holds full encoded states; you can materialize any version to an ordered list of elements with stable ids.
Task: Add GET /rooms/{id}/versions (list) and GET /rooms/{id}/diff?from=A&to=B returning a structural diff.
Requirements:
- Endpoints must return exact JSON shapes: /versions -> {id, op_seq}, /diff -> {added, removed, moved}.
- Materialize both versions to ordered (id, value, deleted) element lists.
- Diff by element id: report added ids (in B not A), removed ids (in A not B), and ranges whose order changed (moves); group consecutive ids into ranges for a readable result.
- A pure function over the two materialized lists — no live-document access, no AI.
Tests / acceptance:
- Inserting a paragraph between two versions shows exactly that range as added and nothing as removed.
- Diffing a version against itself returns an empty diff.
Output: a unified diff plus the id-keyed diff algorithm in one paragraph.What success looks like
GET /rooms/demo/diff?from=A&to=B returns {"added":[...],"removed":[...],"moved":[...]} keyed by stable element id, not character text. Insert a paragraph between two versions and that exact range shows up under added with nothing in removed; diff a version against itself and every list is empty. Because each element carries a globally-unique id, the diff knows what moved instead of guessing — a precise semantic diff that falls out of the CRDT model.
Restore a past version as a new edit
Optional add-on AdvancedRestore a chosen version without overwriting anything — emit the ordinary insert/delete ops that transform the current document into the chosen one, through the same path as a keystroke — so concurrent work is merged, never erased.
New in this step
non-destructive restore Restoring as a new edit rather than a rollback — a destructive overwrite would erase concurrent work and bypass the merge, so instead the restore flows through the ordinary CRDT path and is itself undoable.
materialize-then-diff Rebuild the target version’s text, diff it against the current convergent text, and emit only the minimal insert/delete ops needed — one path that works for both the yrs and hand-rolled engines.
Restore is an edit, not a rollback
Destructive overwrites break collaboration — they erase concurrent work and bypass the merge. Instead, restore non-destructively: materialize the target version’s text, diff it against the current convergent text, and emit the resulting insert/delete ops through the normal local-apply + broadcast path — the exact discipline the Insert the accepted suggestion as CRDT ops step uses for AI edits. The restore then merges, broadcasts, persists, and is itself snapshot-able like any edit; nobody’s in-flight changes are lost, and convergence is untouched. (yrs also exposes a first-class snapshot() / encode_state_from_snapshot time-travel API — it needs the Doc created with skip_gc — but materialize-then-diff works identically for Go’s hand-rolled engine, so we keep one path for both backends.)
Agent prompt — paste into an agent with repo access
Role: Senior engineer in this repo (use the selected backend).
Context: You can materialize any version's text; the editor applies local ops via a local-apply + Redis broadcast path; the ai-copilot step established "insert as CRDT ops".
Task: Add POST /rooms/{id}/restore { version_id } that restores a version non-destructively.
Requirements:
- Endpoint must return exact JSON shape: /restore -> {applied: true}.
- Materialize the target version's text; diff it against the current convergent text to get a minimal insert/delete op set.
- Apply those ops through the SAME local-apply + broadcast path as a normal edit — never overwrite state or bypass the CRDT/persistence.
- The restore is a normal edit: it broadcasts to other clients and can itself be captured as a new version.
Tests / acceptance:
- After restore, every connected client converges byte-equal to the target text.
- A concurrent edit made during the restore is preserved (merged), not lost.
- Restoring does not delete or rewrite the prior versions.
Output: a unified diff plus why routing through CRDT ops preserves convergence and undo.What success looks like
POST /rooms/demo/restore {"version_id":7} returns {"applied":true}, and every connected client converges byte-equal to the target version’s text — because restore materializes that version, diffs it against the current convergent text, and emits the resulting insert/delete ops through the normal local-apply + broadcast path. A concurrent edit made during the restore is merged, not lost, the prior versions stay untouched, and the restore is itself a normal edit you can snapshot or undo.
Compact the op-log behind named versions
Optional add-on AdvancedOnce a version covers a prefix of the op-log, prune those ops — the named snapshot is the durable truth for everything up to its op sequence — so the log stops growing while reload, diff, and restore still work.
New in this step
retention rule (lowest retained op_seq) Compact behind the lowest op_seq across the versions you must still reconstruct — any op at or below it is already captured by a full version, so it can drop without losing the ability to reload, diff, or restore.
Snapshots let you garbage-collect the event stream
This generalizes the base compaction. The base project compacts into one latest snapshot; with history you compact behind the oldest version you still need. Any op whose sequence is at or below the lowest retained version’s op_seq is already captured by a full snapshot, so it can drop from the op-log — the in-memory log from the snapshot step, or an ops table if you persisted it. That bounds growth without losing the ability to reload, diff, or restore: keep the versions, drop the superseded raw ops.
prune ops a version already covers
-- if you persist the op-log to a table: drop ops a snapshot has already absorbed
DELETE FROM ops
WHERE room_id = $1
AND seq <= (SELECT min(op_seq) FROM versions WHERE room_id = $1);Agent prompt — paste into an agent with repo access
Role: Senior engineer in this repo (use the selected backend).
Context: versions hold full snapshots with op_seq; the op-log (in-memory or a table) accumulates behind them.
Task: Add a compaction job that prunes ops already covered by the lowest retained version.
Requirements:
- Compute the lowest op_seq across the versions you must still reconstruct; drop ops at or below it.
- Run off the edit hot path on a timer; never remove an op a retained version does not yet cover.
- Generalizes the base snapshot/compaction step — it does not replace named versions.
Tests / acceptance:
- After compaction, a late joiner still reconstructs the document byte-equal (convergence/merge-compatibility unaffected).
- An older retained version still materializes and restores correctly after its raw ops are pruned.
Output: a unified diff plus the retention rule (which op_seq you compact behind and why).What success looks like
The compaction job deletes ops whose seq is at or below min(op_seq) across the retained versions, bounding op-log growth. After it runs, a late joiner still reconstructs the document byte-equal — convergence is unaffected, because the named snapshot is the durable truth for everything up to its op sequence — and an older retained version still materializes and restores correctly even though its raw ops are gone. You keep the versions and drop only the superseded ops.
Add pgvector and a workspace index
Optional add-on IntermediateEnable the pgvector extension and add a table holding one embedding vector per document, indexed for fast nearest-neighbour search — so meaning-based search lives in the same Postgres that already holds the durable truth.
New in this step
pgvector A Postgres extension adding a vector column type and similarity indexes, so embeddings live beside the snapshots they describe — one system of record, one transaction.
vector(N) and its dimension A fixed-length array of floats; N (e.g. 768) MUST equal the embedding dimension the model emits, or inserts fail with a dimension error — keep both sides in lock-step.
approximate nearest neighbour (ANN) Finding the closest vectors fast by trading exactness for speed — what makes “find the most similar documents” practical at scale.
HNSW index A graph-based ANN index pgvector offers; it serves the “closest vectors” ordering quickly so search and backlinks aren’t full scans.
cosine distance (vector_cosine_ops) The similarity measure for text embeddings; the vector_cosine_ops opclass tells the HNSW index to rank by the <=> cosine-distance operator (smaller = more similar).
Vectors live next to the snapshots they describe
Semantic search needs a vector per document. Store it in Postgres beside the snapshots so the index and the durable truth share one system of record and one transaction. pgvector adds a vector(N) column type and approximate-nearest-neighbour (ANN) indexes; an HNSW index over a cosine-distance opclass (vector_cosine_ops) makes “find the closest vectors” fast. N must equal the embedding dimension you request from the model in the next step — pick one (e.g. 768) and keep both sides in lock-step.
enable pgvector + the embeddings table
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE doc_embeddings (
room_id TEXT PRIMARY KEY, -- one current vector per document
embedding vector(768) NOT NULL, -- dimension MUST match the model output
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- approximate nearest neighbour over cosine distance
CREATE INDEX doc_embeddings_hnsw
ON doc_embeddings USING hnsw (embedding vector_cosine_ops);Embed a snapshot with Gemini
Optional add-on IntermediateWhen a snapshot persists, turn its plain text into an embedding vector with the Gemini Embeddings API and UPSERT it into the embeddings table — capturing the document’s meaning as a point you can search by.
New in this step
embedding A mapping from text to a point in vector space where nearby points mean similar things — what makes meaning-based search possible instead of keyword matching.
embedContent / embedding.values The Gemini embeddings method; you send the snapshot’s text and read the float vector back from embedding.values.
outputDimensionality The length of vector you request; set it equal to your vector(N) column (e.g. 768) or the insert fails with a dimension mismatch.
retrieval task type A hint telling the model the embedding is for document retrieval (vs a query), which improves how well stored docs and search queries line up.
L2-normalization Scaling the vector to unit length; Gemini pre-normalizes only its 3072-dim output, so for vector(768) you must normalize yourself or cosine distance is meaningless.
An embedding is the document's meaning as a vector
An embedding maps text to a point in vector space where nearby points mean similar things — that is what makes meaning-based search possible. Call Gemini’s embedContent with the snapshot’s plain text and read the vector from embedding.values; request an outputDimensionality equal to your vector(N) column, and set the document retrieval task type (or instruction) per the embeddings docs. Keep the API key server-side; the editor never calls Gemini directly. Model ids change — link the docs rather than pinning one.
Chat prompt — paste into a chat to get the code
Role: Gemini integration engineer. The reader has no repo here — return complete code.
Context: A server-side handler in the user's selected backend; GEMINI_API_KEY in env; doc_embeddings(room_id, embedding vector(768)).
Task: Implement embedSnapshot(roomId, plainText) that embeds the text with the Gemini Embeddings API and UPSERTs the vector for that room.
Requirements:
- Call the embedContent method; read the float vector from embedding.values.
- Request outputDimensionality matching the vector(N) column (e.g. 768) and the document retrieval task type per the docs. L2-normalize the 768-dimensional embedding.
- Server-side key; 20s timeout; trim very long documents to the model's input-token limit; UPSERT by room_id (ON CONFLICT DO UPDATE).
- Link the official Gemini Embeddings docs instead of hardcoding a model id that may change.
Tests / acceptance (describe):
- Embedding two paragraphs yields an L2-normalized 768-length float vector that inserts without a dimension error.
- A re-embed of the same room updates the existing row rather than duplicating it.
Output: the complete handler, no commentary.What success looks like
embedSnapshot(roomId, plainText) calls Gemini’s embedContent, reads embedding.values, and the resulting 768-length float vector — L2-normalized to unit length (required for vector(768), since Gemini only pre-normalizes the 3072-dim output) — UPSERTs into doc_embeddings without a dimension error. Re-embedding the same room updates the existing row by room_id rather than duplicating it, so each document keeps exactly one current vector.
Re-embed on every snapshot — index the mutating truth
Optional add-on AdvancedHook re-embedding into snapshot persistence and version capture, debounced to the snapshot cadence, so the stored vector always reflects the document’s current converged state — never a stale or in-flight one.
New in this step
stale index over a mutating corpus Unlike a static document set, a collaborative doc changes constantly, so its embedding goes stale the moment someone types — the index must be refreshed against the converged truth.
debounce to snapshot cadence Trigger the embed only on the snapshot/version events (every N ops or T seconds), off the edit hot path, so you embed converged states — not every intermediate keystroke — and editing never stalls.
The indexed truth keeps changing under you
This is the real lesson: unlike a static corpus, a collaborative document mutates constantly, so its embedding goes stale the moment someone types. Tie re-embedding to the events that already mark a consistent state — the snapshot-persistence step and the Capture a named version step from history — rather than to keystrokes. Debounce to the snapshot cadence (every N ops or T seconds) so you embed converged states, not every intermediate one, and UPSERT by room_id so the index holds exactly one current vector per document. The vector then lags the live text by at most one snapshot interval and never reflects an unconverged state.
Agent prompt — paste into an agent with repo access
Role: Senior engineer in this repo (use the selected backend).
Context: Snapshots persist on a cadence; history captures named versions; embedSnapshot(roomId, plainText) exists.
Task: Trigger re-embedding whenever a snapshot or a named version is persisted.
Requirements:
- Call embedSnapshot with the SAME converged plain text the snapshot/version stored — never an unconverged in-flight buffer.
- Debounce to the snapshot cadence; run the embed off the edit hot path (a task/goroutine) so it never blocks editing.
- UPSERT keeps one current vector per room; a failed embed is retried/logged, not fatal to the edit path.
Tests / acceptance:
- After edits + a snapshot, the stored vector reflects the new text (a search matching the new content ranks the doc higher than before).
- Embedding latency never delays applying or broadcasting an op.
Output: a unified diff plus where the embed hooks into the snapshot/version path.What success looks like
After some edits and a snapshot, the stored vector reflects the new text — a search matching the new content ranks the doc higher than before. Re-embedding is tied to the snapshot/version events (debounced to the snapshot cadence) and runs off the edit hot path, so embedding latency never delays applying or broadcasting an op, and the UPSERT keeps one current vector per room. The index lags the live text by at most one snapshot interval and never reflects an unconverged state.
Semantic search and backlinks over pgvector (Rust)
Optional add-on AdvancedEmbed the query and rank documents by cosine distance with sqlx and the pgvector crate, and surface each document’s nearest neighbours as backlinks — two read-only routes that never touch convergence.
New in this step
pgvector crate (Vector binding) The Rust pgvector crate (sqlx feature) lets you bind a pgvector::Vector straight into a query and read one back, so the embedding crosses the SQL boundary without manual encoding.
cosine-distance operator pgvector’s <=> operator returns cosine distance between two vectors; ORDER BY embedding <=> $1 LIMIT k is the nearest-neighbour ranking the HNSW index serves.
distance-to-similarity Cosine distance is 1 - cosine similarity, so a smaller <=> means more similar; you report similarity = 1 - distance to the client.
backlinks “Related docs” — the same nearest-neighbour query seeded with a document’s own vector, excluding itself, so each doc surfaces the ones most like it.
ANN search and backlinks are the same query, behind two GET routes
Search embeds the query text and orders documents by the <=> cosine-distance operator; “related docs” (backlinks) is the same query seeded with a document’s own vector, excluding itself. The HNSW index makes both fast. With sqlx, bind a pgvector::Vector straight into the query and read it back the same way. Cosine distance is 1 - cosine similarity, so a smaller <=> means more similar. Expose them as two read-only routes from the spec’s contract: GET /rooms/{id}/search?q=...&k=5 and GET /rooms/{id}/backlinks?k=5, both returning the same JSON shape — an array of { room_id, similarity } ordered most-similar first. q is the search text (required for search); k caps the result count (default 5).
Cargo dependency
# Cargo.toml — add to the existing deps
pgvector = { version = "0.4", features = ["sqlx"] }
# sqlx with its "postgres" + a tokio runtime feature provides the Pool you bind Vector intoAgent prompt — paste into an agent with repo access
Role: Senior Rust engineer in this repo.
Context: sqlx Pool<Postgres>; the pgvector crate (sqlx feature); doc_embeddings(room_id, embedding vector(768)); embedSnapshot exists.
Task: Mount GET /rooms/{id}/search?q=&k= and GET /rooms/{id}/backlinks?k= over pgvector, backed by
search(query, k) and backlinks(roomId, k).
Requirements:
- Both routes return 200 with a JSON array [{"room_id": "...", "similarity": 0.93}, ...], most-similar first.
q is the search text (required for search); k caps the count (default 5).
- Embed the query (reuse the embeddings call with the query retrieval task type/instruction per the docs).
- search: ORDER BY embedding <=> $1 LIMIT k, binding a pgvector::Vector; similarity = 1 - cosine_distance.
- backlinks: the same query seeded with the room's own vector, with WHERE room_id <> $self to exclude itself.
- Parameterised queries only; the HNSW index serves the ordering.
Tests / acceptance:
- GET /rooms/{id}/backlinks returns the topically-similar doc first; an unrelated doc falls below a threshold.
- Two documents about the same topic rank each other as top backlinks.
- Search and backlinks are read-only over the index — they never write document state or affect convergence.
- `cargo clippy -- -D warnings` is clean.
Output: a unified diff plus the distance-to-similarity conversion.What success looks like
cargo clippy -- -D warnings is clean. GET /rooms/{id}/search?q=...&k=5 and GET /rooms/{id}/backlinks?k=5 each return 200 with [{"room_id":"...","similarity":0.93},...], most-similar first — similarity = 1 - (embedding <=> $1), the cosine distance the HNSW index serves. Two documents about the same topic rank each other as top backlinks; an unrelated doc falls below the threshold. Both routes are read-only over the index and never write document state or touch convergence.
Semantic search and backlinks over pgvector (Go)
Optional add-on AdvancedEmbed the query and rank documents by cosine distance with pgx and pgvector-go, and surface each document’s nearest neighbours as backlinks — two read-only routes that never touch convergence.
New in this step
pgvector-go (RegisterTypes) The Go pgvector-go library registers a vector type on each pgx connection (via pgxvec.RegisterTypes), so you can pass pgvector.NewVector([]float32{...}) as a parameter and scan it back.
cosine-distance operator pgvector’s <=> operator returns cosine distance between two vectors; ORDER BY embedding <=> $1 LIMIT k is the nearest-neighbour ranking the HNSW index serves.
distance-to-similarity Cosine distance is 1 - cosine similarity, so a smaller <=> means more similar; you report similarity = 1 - distance to the client.
backlinks “Related docs” — the same nearest-neighbour query seeded with a document’s own vector, excluding itself, so each doc surfaces the ones most like it.
Register the type once, then it is just SQL behind two GET routes
pgvector-go registers a vector type on the pgx connection so you can pass a pgvector.Vector as a query parameter and scan it back. Search orders by the <=> cosine-distance operator; backlinks is the same query seeded with a document’s own vector, excluding itself. The HNSW index serves both, and cosine distance is 1 - cosine similarity. Expose them as the spec’s two read-only routes: GET /rooms/{id}/search?q=...&k=5 and GET /rooms/{id}/backlinks?k=5, both returning an array of { room_id, similarity } ordered most-similar first. q is the search text (required for search); k caps the result count (default 5).
install + register the vector type
// go get github.com/pgvector/pgvector-go
import (
"github.com/pgvector/pgvector-go"
pgxvec "github.com/pgvector/pgvector-go/pgx"
)
// register the vector type after each connection opens (e.g. pool AfterConnect)
config.AfterConnect = func(ctx context.Context, conn *pgx.Conn) error {
return pgxvec.RegisterTypes(ctx, conn)
}Agent prompt — paste into an agent with repo access
Role: Senior Go engineer in this repo.
Context: pgxpool; github.com/pgvector/pgvector-go (+ its /pgx subpackage); doc_embeddings(room_id, embedding vector(768)); embedSnapshot exists.
Task: Mount GET /rooms/{id}/search?q=&k= and GET /rooms/{id}/backlinks?k= over pgvector, backed by
Search(ctx, query, k) and Backlinks(ctx, roomID, k).
Requirements:
- Both routes return 200 with a JSON array [{"room_id": "...", "similarity": 0.93}, ...], most-similar first.
q is the search text (required for search); k caps the count (default 5).
- Register the vector type via pgxvec.RegisterTypes (per connection / pool AfterConnect).
- Embed the query (query retrieval task type/instruction per the docs); pass pgvector.NewVector([]float32{...}) as the parameter.
- Search: ORDER BY embedding <=> $1 LIMIT k; return room ids + similarity (1 - cosine_distance).
- Backlinks: the same query seeded with the room's own vector, excluding itself (WHERE room_id <> $2).
Tests / acceptance:
- GET /rooms/{id}/backlinks returns the topically-similar doc first; an unrelated doc falls below the threshold.
- Two topically-similar docs are each other's top backlink.
- Search and backlinks are read-only over the index — they never write document state or affect convergence.
- `go vet ./...` and `go build ./...` pass.
Output: a unified diff plus the distance-to-similarity conversion.What success looks like
go vet ./... and go build ./... pass. After pgxvec.RegisterTypes registers the vector type on each connection, GET /rooms/{id}/search?q=...&k=5 and GET /rooms/{id}/backlinks?k=5 each return 200 with [{"room_id":"...","similarity":0.93},...], most-similar first — similarity = 1 - (embedding <=> $1). Two topically-similar docs are each other’s top backlink; an unrelated doc falls below the threshold. Same contract and same read-only-over-the-index guarantee as the Rust path — neither route writes document state or affects convergence.
Define the typed editor tools
Optional add-on IntermediateDeclare the small set of typed tools the model may call — find_all, insert_at, replace_range — as Gemini function declarations, the only way the model can touch the document, so it can never emit arbitrary bytes.
New in this step
Gemini function calling A mode where you declare functions the model may invoke; it then returns structured calls to them instead of prose, so you control exactly what it can do.
function declaration The spec of one tool — its name, description, and typed parameters — that you hand the model; keeping the set tiny and total (find_all/insert_at/replace_range) bounds what an AI edit can be.
functionCall part What the model returns instead of text: a name plus args you execute, so there is no path for it to write document bytes directly.
JSON-schema parameters The typed shape of a tool’s arguments (an object with named, required properties); it lets the model fill in valid, checkable args rather than freeform strings.
Typed tools, not freeform text, keep AI edits safe
A /ai command (“turn this list into a table”, “summarize this section as bullets at the top”) should not let the model emit arbitrary document bytes — it should call typed tools whose effects you control. Gemini function calling lets you declare functions with a JSON-schema parameters object; the model then returns structured functionCall parts (name + args) instead of prose. Keep the tool set tiny and total: find_all(pattern) reads, insert_at(index, text) and replace_range(start, end, text) write. Every write becomes CRDT ops in a later step — so there is no path for the model to bypass the merge.
function declarations (Gemini tools)
{
"tools": [{
"functionDeclarations": [
{ "name": "find_all",
"description": "Find all ranges matching a pattern in the current document.",
"parameters": { "type": "object",
"properties": { "pattern": { "type": "string" } },
"required": ["pattern"] } },
{ "name": "insert_at",
"description": "Insert text at a character index.",
"parameters": { "type": "object",
"properties": { "index": { "type": "integer" }, "text": { "type": "string" } },
"required": ["index", "text"] } },
{ "name": "replace_range",
"description": "Replace the text in the half-open range [start, end) with new text.",
"parameters": { "type": "object",
"properties": { "start": { "type": "integer" }, "end": { "type": "integer" }, "text": { "type": "string" } },
"required": ["start", "end", "text"] } }
]
}]
}Run the agentic tool-calling loop
Optional add-on AdvancedDrive a short server-side loop — send the command plus the tool declarations, run each tool the model calls, return the result, and repeat until it finishes — so a single command can plan across several steps without ever bypassing the merge.
New in this step
agentic loop The model plans across multiple steps — call a tool, see its result, call the next — instead of answering in one shot; you run that back-and-forth server-side until it’s done.
functionResponse The message you send back after executing a functionCall, carrying the result (and the matching call id), so the model can plan its next step.
max-iterations cap A small ceiling on loop turns so a misbehaving model can’t run forever; on the cap (or a final text answer) you stop and return the collected pending edits plus a summary.
Agentic = the model plans across multiple tool calls, behind one endpoint
A single command may need several steps — find the list, read it, replace the range. That is an agentic loop: send the user’s instruction + the current document context + the tool declarations; if the response has a functionCall, execute it and append a functionResponse (with the matching call id) to the conversation, then call again; stop when the model returns a final text answer with no more calls. Bound the loop with a small max-iterations cap and keep the key server-side. find_all runs against the live converged text; the writing tools are staged for the next step, where they become CRDT ops. Mount the loop behind the spec’s route, POST /rooms/{id}/ai: the request body is { instruction } and the response is { edits, summary } — edits is the collected list of pending (not-yet-applied) writes and summary is the model’s final text. The next per-backend step applies those edits as CRDT ops; this endpoint only plans them.
Agent prompt — paste into an agent with repo access
Role: Gemini integration engineer in this repo (server-side, selected backend).
Context: The three tool declarations (find_all/insert_at/replace_range) from the previous step; GEMINI_API_KEY in env; access to the room's current converged text.
Task: Mount POST /rooms/{id}/ai backed by runAiCommand(roomId, instruction), a bounded agentic
function-calling loop.
Requirements:
- Wire contract: request {"instruction": "..."}; response 200 {"edits": [...pending...], "summary": "..."},
400 on a missing/empty instruction. The edits are pending (not yet applied here).
- Send instruction + a slice of the current document + tools (functionDeclarations); read functionCall parts (name + args).
- Execute each call (find_all reads; insert_at/replace_range are collected as pending edits), then send a functionResponse with the matching id; repeat.
- Cap iterations (e.g. 6); on the cap or a final text part, stop and return the collected pending edits + the model's summary.
- Server-side key; per-call timeout; link the official Gemini function-calling docs instead of pinning a model id.
Tests / acceptance:
- POST {"instruction":"Add a one-line summary at the top"} returns 200 with an insert_at(0, ...) pending edit
and a summary; the loop terminates within the cap.
- A command needing no change returns 200 with no pending edits (no spurious writes).
Output: a unified diff plus how the loop terminates and stays bounded.What success looks like
POST /rooms/demo/ai {"instruction":"Add a one-line summary at the top"} returns 200 with {"edits":[...],"summary":"..."} — the edits list holds a pending insert_at(0, ...) (not yet applied) and the loop terminates within the iteration cap. A command that needs no change returns 200 with an empty edits list — no spurious writes. The model only ever returns structured functionCall parts for the three typed tools, so there’s no path to emit arbitrary document bytes; the next per-backend step turns these pending edits into CRDT ops.
Apply /ai tool results as CRDT ops (Rust)
Optional add-on AdvancedTranslate each pending edit into yrs text operations inside a transaction, so an AI edit produces the same update bytes as a human keystroke.
No special-casing: the AI uses the same engine
insert_at(i, s) is text.insert(&mut txn, i, s); replace_range(a, b, s) is text.remove_range(&mut txn, a, b - a) then text.insert(&mut txn, a, s). Run them in one transact_mut, take encode_update_v2(), and broadcast it through the exact path your merge + Redis steps already use. Because it is an ordinary update, it converges, persists, snapshots, and undoes like any edit — the model gets no privileged write path.
Agent prompt — paste into an agent with repo access
Role: Senior Rust engineer in this repo.
Context: yrs Room (apply_local/apply_remote/encode); runAiCommand returns pending edits (insert_at/replace_range); the merge + Redis broadcast path exists.
Task: Apply a set of pending /ai edits to the room as ordinary CRDT ops.
Requirements:
- Map insert_at -> text.insert; replace_range -> text.remove_range then text.insert, all in ONE transact_mut.
- Broadcast encode_update_v2() through the normal local-apply + Redis path; no bypass of CRDT or persistence.
- Apply against current indices; if the doc shifted, recompute from find_all rather than trusting stale offsets.
Tests / acceptance:
- An /ai edit on one client appears byte-equal on a second client and survives a restart.
- Concurrent human typing during the /ai apply merges; nothing is lost.
- `cargo clippy -- -D warnings` is clean.
Output: a unified diff plus why reusing the keystroke path preserves convergence + undo.What success looks like
cargo clippy -- -D warnings is clean. A pending /ai edit maps insert_at/replace_range to text.insert/text.remove_range in one transact_mut, and the resulting encode_update_v2() broadcasts through the normal local-apply + Redis path — so the edit appears byte-equal on a second client and survives a restart. Human typing during the apply merges; nothing is lost. The model gets no privileged write path, so its edit converges, persists, and undoes exactly like a keystroke.
Apply /ai tool results as CRDT ops (Go)
Optional add-on AdvancedTranslate each pending edit into RGA insert/delete ops, routed through the same local-apply and broadcast path as a keystroke.
The hand-rolled engine treats AI edits as ordinary edits
insert_at(i, s) becomes a run of ApplyInsert ops anchored at the element id at position i; replace_range(a, b, s) tombstones the element ids in the half-open range [a, b) with ApplyDelete and inserts s at the anchor. Route them through the normal local-apply + Redis publish path from your merge step — the same ApplyInsert / ApplyDelete the network calls use — so the edit broadcasts, persists, buffers out-of-order, and converges exactly like a human one. No special path for the model.
Agent prompt — paste into an agent with repo access
Role: Senior Go engineer in this repo.
Context: The hand-rolled RGA Doc (ApplyInsert/ApplyDelete/String); runAiCommand returns pending edits; the Redis broadcast path exists.
Task: Apply a set of pending /ai edits to the Doc as ordinary RGA ops.
Requirements:
- Resolve positions to element ids, then insert_at -> ApplyInsert run; replace_range -> ApplyDelete the ids in [a,b) + ApplyInsert the new text at the anchor.
- Publish each op through the normal local-apply + Redis path; ops stay idempotent + buffer-tolerant (no bypass).
- Recompute positions from find_all if the document shifted; never trust stale offsets.
Tests / acceptance:
- An /ai edit on one client reaches a second client byte-equal and survives a restart.
- Concurrent typing during the apply merges; the convergence property test still passes.
- `go test ./... -race` passes.
Output: a unified diff plus why reusing the op path preserves convergence.What success looks like
go test ./... -race passes. A pending /ai edit resolves positions to element ids — insert_at becomes an ApplyInsert run, replace_range tombstones the ids in [a,b) with ApplyDelete then inserts at the anchor — and each op publishes through the normal local-apply + Redis path. So the edit reaches a second client byte-equal, survives a restart, and concurrent typing merges with the convergence property test still green. Same observable as the Rust path; no special path for the model.
Make /ai safe, observable, and convergent
Optional add-on AdvancedGate every /ai run behind a preview, record it as a named version for provenance, and prove its edits converge like any other — so an AI command is safe, reversible, and merge-compatible. (This step’s named-version capture assumes the history module is also enabled.)
New in this step
preview-confirm (no auto-apply) Show the model’s planned edits as a diff and apply only when the user confirms — respecting the platform’s no-auto-actions rule, so the AI never changes the document on its own.
provenance Capture a named version labelled with the command before applying, so every agentic edit is browsable, diffable, restorable, and undoable through history.
range validation Check every tool argument and range against the current document and reject out-of-bounds edits, so a stale or hallucinated offset can’t corrupt the text.
Confirm-before-apply, provenance, and the convergence guarantee
The apex ties the three modules together. Respect the platform’s no-auto-actions rule: show the model’s planned edits as a diff preview and apply only on the user’s confirm. Use history for provenance — capture a named version (“/ai: turn list into table”) before applying, so every agentic edit is browsable, diffable, restorable, and undoable. Guardrails: bound the loop, validate every tool argument and range against the live document, and reject out-of-bounds edits. Because the edits flow through the ordinary CRDT path, the convergence guarantee that defines this project is untouched — an AI edit is just an edit.
Agent prompt — paste into an agent with repo access
Role: Senior engineer in this repo (use the selected backend + frontend).
Context: runAiCommand returns pending edits + a summary; edits apply as CRDT ops; history can capture named versions.
Task: Wrap /ai with preview-confirm, provenance, and guardrails.
Requirements:
- Return the pending edits as a diff preview; apply ONLY after the user confirms (no auto-apply).
- Capture a named version labelled with the command BEFORE applying, so the run is browsable/diffable/restorable via history.
- Validate tool args and ranges against the current document; reject out-of-bounds; keep the iteration cap.
Tests / acceptance:
- A confirmed /ai command's edits appear on every connected client and survive a restart.
- Three replicas that received the run's ops in different orders end byte-equal (convergence/merge-compatibility unaffected).
- The run shows up in history as a restorable version; declining the preview changes nothing.
Output: a unified diff plus how provenance + the CRDT path keep AI edits fully reversible and convergent.What success looks like
A /ai run shows its planned edits as a diff preview and applies only on confirm — declining changes nothing. On confirm, a named version is captured first (“/ai: turn list into table”) for provenance, then the edits flow through the ordinary CRDT path: they appear on every connected client and survive a restart, three replicas that received the ops in different orders end byte-equal, and the run is browsable, diffable, and restorable in history. Out-of-bounds tool arguments are rejected. The convergence guarantee that defines the project is untouched — an AI edit is just an edit.
Where to take it next
- Go deep on the engine: the Rust track (ownership, async tokio) and the Redis track (pub/sub, TTL presence, streams).
- See how this project’s
rust 5/go 3ratings stack up across builds on the Compare page. - Curious why Go scores 3/5 here but 5/5 for a REST API? That contrast — same language, different problem — is the whole point of the rating system.