A Spreadsheet at Google-Sheets Scale: Cell Families and Focused Recompute in lazily-rs

A spreadsheet is the original reactive program. You type a number into one cell, and formulas scattered across the sheet update themselves. It is the canonical demo for every signals library — and also the canonical place where naive reactivity falls over, because a real sheet has millions of cells and you are only ever looking at a few dozen of them.

This post is about two new primitives in lazily, CellMap and CellFamily, and a benchmark that drives them at the documented capacity of a Google Sheets workbook: 10,000,000 cells. The headline isn't that it builds the sheet in under a second (it does). It's that once the sheet exists, editing an input and reading a 1,000-cell viewport costs ~11 microseconds, and stays ~11 µs whether the sheet has one million cells or ten million. That flat line is the whole argument for lazy reactivity, and it falls directly out of recomputing only the cells you focus on.

If you haven't met the library before, the architecture tour covers the core graph — Cell → Slot → Signal → Effect, all owned by a Context. Everything below builds on those primitives.

The problem: a collection is not a cell

The obvious way to model a sheet of values in a signals library is one cell holding the whole collection:

let sheet = ctx.cell(HashMap::<CellId, f64>::new());

This is coarse. Every single-entry edit replaces the whole map, so the PartialEq guard sees a different HashMap and invalidates every reader of the cell — even readers that only ever touched one entry. Over a wire (lazily speaks a Snapshot/Delta sync protocol), it's worse: one keystroke re-sends the entire map.

Editing entry a wakes up the readers of entries b and c for no reason. At spreadsheet scale that is the difference between an interactive sheet and a spinning beachball.
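To make the coarse behaviour concrete, here is a minimal std-only sketch. It is not lazily's code: CoarseCell, its version counter, and readers_woken_by_editing_a are hypothetical stand-ins for "one cell holding the whole map" and "a reader that remembers what it last saw".

```rust
use std::collections::HashMap;

/// Hypothetical coarse cell: one map, one cell-wide version counter.
struct CoarseCell {
    value: HashMap<&'static str, f64>,
    version: u64,
}

impl CoarseCell {
    /// Replace the whole map; the PartialEq guard only skips identical maps.
    fn set(&mut self, next: HashMap<&'static str, f64>) {
        if self.value != next {
            self.value = next;
            self.version += 1;
        }
    }
}

/// Edits entry "a" only, then reports which readers see a stale version.
fn readers_woken_by_editing_a() -> Vec<&'static str> {
    let mut initial = HashMap::new();
    initial.insert("a", 1.0);
    initial.insert("b", 2.0);
    initial.insert("c", 3.0);
    let mut cell = CoarseCell { value: initial.clone(), version: 0 };

    // Each reader cares about one key but can only record the cell-wide version.
    let readers = [("a", cell.version), ("b", cell.version), ("c", cell.version)];

    let mut next = initial;
    next.insert("a", 10.0);
    cell.set(next);

    readers
        .iter()
        .filter(|r| r.1 != cell.version)
        .map(|r| r.0)
        .collect()
}
```

Calling readers_woken_by_editing_a() returns all three keys: with a single version for the whole map, a reader has no way to tell that its own entry did not change.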

CellMap: fine-grained, keyed reactivity

CellMap<K, V> is a hash collection whose membership is itself reactive, with one independently-tracked value cell per entry. Each entry is its own CellHandle<V>, so a reader that depends on entry a is not invalidated when entry b changes — only that entry's own dependents recompute.

use lazily::{CellMap, Context};

let ctx = Context::new();
let scores: CellMap<&'static str, i32> = CellMap::new(&ctx);
let alice = scores.entry(&ctx, "alice", 10);
let bob   = scores.entry(&ctx, "bob", 20);

// A computed over the whole collection recomputes only on *membership* change.
let n = ctx.computed({
    let scores = scores.clone();
    move |ctx| scores.len(ctx)
});
assert_eq!(ctx.get(&n), 2);

// Mutating an existing entry does NOT change membership — `n` is untouched.
alice.set(&ctx, 11);
assert_eq!(ctx.get(&n), 2);
assert_eq!(bob.get(&ctx), 20);

Two things are tracked separately:

Per-entry value — each key's CellHandle<V> has its own dependents. Editing alice invalidates only readers of alice.
Membership — a dedicated version cell bumps only when keys are added or removed, so keys() / len() readers recompute on structural change, not on every value edit.
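The split between per-entry and membership tracking can be sketched with plain version counters. This is a toy model under stated assumptions, not lazily's implementation: KeyedVersions, set, and entry_version are hypothetical names, and real dependents are replaced by counters a reader could compare against.

```rust
use std::collections::HashMap;

/// Toy model: one version counter per entry, plus a separate
/// counter that moves only when the key set changes.
struct KeyedVersions {
    entries: HashMap<&'static str, (i32, u64)>, // (value, per-entry version)
    membership: u64,
}

impl KeyedVersions {
    fn new() -> Self {
        KeyedVersions { entries: HashMap::new(), membership: 0 }
    }

    /// Update an existing key (bumps only that entry's version) or
    /// insert a new key (bumps only the membership version).
    fn set(&mut self, key: &'static str, value: i32) {
        if let Some(slot) = self.entries.get_mut(key) {
            if slot.0 != value {
                slot.0 = value;
                slot.1 += 1;
            }
            return;
        }
        self.entries.insert(key, (value, 0));
        self.membership += 1;
    }

    /// Version a reader of `key` would compare against.
    fn entry_version(&self, key: &str) -> Option<u64> {
        self.entries.get(key).map(|e| e.1)
    }
}
```

After inserting alice and bob and then setting alice to 11, membership is still 2, alice's version is 1, and bob's is 0. A len() reader and a bob reader both have nothing to do.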

CellFamily: a lazy, cached factory of cells

CellFamily<K, V> layers a value factory on top of CellMap — the same idea as Recoil/Jotai's atomFamily. You give it a function from key to value, and it lazily mints and caches one cell per key on first access:

use lazily::{CellFamily, Context};

let ctx = Context::new();
// factory: every key starts as key * 2
let fam: CellFamily<u32, u32> = CellFamily::new(&ctx, |&k| k * 2);

let c0 = fam.get(&ctx, 0);   // minted on first access -> 0
let c7 = fam.get(&ctx, 7);   // minted -> 14
assert_eq!(fam.get(&ctx, 7).get(&ctx), 14); // same cell, cached

The internals are tiny — get is just a cached entry_with against the underlying CellMap:

pub fn get(&self, ctx: &Context, key: K) -> CellHandle<V> {
    let factory = Rc::clone(&self.factory);
    let k = key.clone();
    self.map.entry_with(ctx, key, move || factory(&k))
}

This is exactly what you want for a spreadsheet: a cell exists conceptually for every coordinate, but you only pay to materialize the ones you touch. Storage is a sparse arena (Vec<Option<Node>> with a free-list) — it allocates only the cells you actually create, never the full grid.
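For readers who haven't seen the pattern, a sparse arena with a free-list can be sketched in a few lines. Only the Vec<Option<Node>> shape comes from the description above; the Node payload and the insert/remove/get/len methods are assumptions made for illustration, not lazily's actual types.

```rust
/// Hypothetical node payload; the real `Node` is not shown in this post.
struct Node {
    value: f64,
}

/// Sparse arena: slots are reused through a free-list, so the Vec grows
/// with the number of live nodes ever held at once, not with the grid size.
struct Arena {
    slots: Vec<Option<Node>>,
    free: Vec<usize>,
}

impl Arena {
    fn new() -> Self {
        Arena { slots: Vec::new(), free: Vec::new() }
    }

    /// Store a node, preferring a previously freed slot over growing the Vec.
    fn insert(&mut self, node: Node) -> usize {
        if let Some(i) = self.free.pop() {
            self.slots[i] = Some(node);
            i
        } else {
            self.slots.push(Some(node));
            self.slots.len() - 1
        }
    }

    /// Vacate a slot and remember its index for reuse.
    fn remove(&mut self, i: usize) -> Option<Node> {
        let node = self.slots.get_mut(i)?.take();
        if node.is_some() {
            self.free.push(i);
        }
        node
    }

    fn get(&self, i: usize) -> Option<&Node> {
        self.slots.get(i)?.as_ref()
    }

    /// Number of live nodes.
    fn len(&self) -> usize {
        self.slots.len() - self.free.len()
    }
}
```

Removing a node and inserting another reuses the vacated index instead of growing the Vec, which is why storage tracks the cells you create rather than the coordinates that could exist.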

The benchmark: a 10,000,000-cell sheet

The scale benchmark builds a spreadsheet-shaped graph of N input cells plus N formula slots, where each formula reads two inputs:

formula[i] = input[i] + input[i-1]

At N = 1_000_000 that's ~2,000,000 reactive nodes. To model a full-capacity Google Sheets workbook (documented limit: 10,000,000 cells), run it at N = 5_000_000 — 5M inputs + 5M formulas = 10M cells:

LAZILY_SCALE_N=5000000 cargo bench --features scale-bench --bench scale

A single criterion run on a 186 GB host:

case	mean	per cell
`build` (10M cells)	~706 ms	~71 ns
`cold_full_recalc` (5M formulas)	~518 ms	~104 ns
`full_recalc_invalidate_all` (5M)	~329 ms	~66 ns
`viewport_recalc` (1k focused cells)	~11.4 µs	~11 ns

Building all ten million cells takes about seven tenths of a second. Recomputing every formula from cold is about half a second. Those are the "you asked for everything, you got everything" numbers.
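The per-cell column is just the mean divided by the number of cells each case touches. A small sketch of that arithmetic, with per_cell_ns as a hypothetical helper name:

```rust
/// Mean wall time in nanoseconds divided by the cells the case touches.
fn per_cell_ns(mean_ns: f64, cells: f64) -> f64 {
    mean_ns / cells
}

/// The four rows of the table, rounded to whole nanoseconds.
fn table_per_cell() -> [f64; 4] {
    [
        per_cell_ns(706.0e6, 10_000_000.0).round(), // build
        per_cell_ns(518.0e6, 5_000_000.0).round(),  // cold_full_recalc
        per_cell_ns(329.0e6, 5_000_000.0).round(),  // full_recalc_invalidate_all
        per_cell_ns(11.4e3, 1_000.0).round(),       // viewport_recalc
    ]
}
```

The rounded results match the table: 71, 104, 66, and 11 ns.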

The last row is the interesting one.

Focus is the whole game

viewport_recalc edits one input cell and then reads only a 1,000-cell viewport: the cells a user can actually see. It costs ~11.4 µs. That is roughly 45,000× cheaper than the cold full recalc (~518 ms), and, the part that matters, it is the same ~11 µs the benchmark measured at 1M cells. Viewport cost is independent of sheet size.

This is the definition of lazy reactivity. An edit doesn't compute anything. It only flips dirty flags along the edited cell's dependency edges, which is a constant amount of work here because each input feeds just two formulas. Work happens on pull, when something reads a value. The 4,999,000 formulas outside the viewport are never recomputed and cost nothing. Only the cells you focus on, the viewport, pay to recompute, and they pay per-cell, not per-sheet.
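The mark-dirty-then-pull mechanism is small enough to model directly. The sketch below is a toy, not lazily's implementation: LazySheet and viewport_recomputes are hypothetical names, the formula is the benchmark's formula[i] = input[i] + input[i-1], and treating the missing input[-1] as 0.0 is an assumption made here.

```rust
/// Toy pull-based sheet with one dirty flag and one cached value per formula.
struct LazySheet {
    inputs: Vec<f64>,
    cached: Vec<f64>,
    dirty: Vec<bool>,
    recomputes: usize,
}

impl LazySheet {
    fn new(n: usize) -> Self {
        LazySheet {
            inputs: vec![1.0; n],
            cached: vec![0.0; n],
            dirty: vec![true; n],
            recomputes: 0,
        }
    }

    /// An edit computes nothing: it marks the (at most two) dependent formulas dirty.
    fn set_input(&mut self, i: usize, v: f64) {
        if self.inputs[i] == v {
            return; // equality guard: no change, no invalidation
        }
        self.inputs[i] = v;
        self.dirty[i] = true;
        if i + 1 < self.dirty.len() {
            self.dirty[i + 1] = true;
        }
    }

    /// Work happens on pull: recompute only if this formula is dirty.
    fn read_formula(&mut self, i: usize) -> f64 {
        if self.dirty[i] {
            // Assumption: the missing left neighbour of formula 0 counts as 0.0.
            let prev = if i == 0 { 0.0 } else { self.inputs[i - 1] };
            self.cached[i] = self.inputs[i] + prev;
            self.dirty[i] = false;
            self.recomputes += 1;
        }
        self.cached[i]
    }
}

/// Warm a 1,000-formula viewport, edit one input inside it, re-read the
/// viewport, and count recomputes. Requires n >= 1000.
fn viewport_recomputes(n: usize) -> usize {
    let mut sheet = LazySheet::new(n);
    for i in 0..1000 {
        sheet.read_formula(i);
    }
    sheet.recomputes = 0;
    sheet.set_input(500, 42.0);
    for i in 0..1000 {
        sheet.read_formula(i);
    }
    sheet.recomputes
}
```

viewport_recomputes(10_000) and viewport_recomputes(1_000_000) both return 2: the edit dirties two formulas, the viewport read recomputes exactly those two, and the sheet size never enters the count.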

Eager reactive systems invert this: an edit pushes recomputation through the whole graph immediately, so cost scales with sheet size whether or not anyone is looking. lazily is lazy by default — Slots mark dirty on invalidation and recompute on access — and only opts into eager evaluation when you explicitly ask with ctx.signal(). A spreadsheet is the textbook case where lazy wins: enormous graph, tiny focus.

What about memory?

Building 2,000,000 nodes uses ~414 MiB RSS — about 216 bytes per node. Because storage is a sparse arena, capacity is bounded by populated cells vs. available RAM (≈ RAM ÷ 216 B), not by the grid's theoretical size. The 186 GB host above could hold on the order of 10⁸–10⁹ populated cells — far past any realistically-filled sheet. And the per-cell cost held roughly constant from 1M → 10M, which is the evidence that the model extrapolates rather than degrading at capacity.
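The capacity estimate is plain division, sketched below. The helper names are hypothetical, and reading "186 GB" as 186 × 10⁹ bytes is an assumption; using GiB instead shifts the result by about 7% and leaves the order of magnitude unchanged.

```rust
const MIB: f64 = 1024.0 * 1024.0;

/// Resident set size (in MiB) spread over the nodes that were built.
fn bytes_per_node(rss_mib: f64, nodes: f64) -> f64 {
    rss_mib * MIB / nodes
}

/// Rough ceiling on populated cells: RAM divided by per-node cost.
/// Ignores everything else the process and OS need, so treat it as an upper bound.
fn cells_that_fit(ram_bytes: f64, per_node: f64) -> f64 {
    ram_bytes / per_node
}
```

bytes_per_node(414.0, 2_000_000.0) lands within a byte or two of the ~216 B figure (the RSS is itself rounded), and cells_that_fit(186e9, 216.0) comes out between 10⁸ and 10⁹.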

Why this composes

CellMap and CellFamily aren't a spreadsheet feature — they're the keyed-collection layer that was missing from the core graph. The same fine-grained invalidation that keeps a viewport cheap also keeps a wire-sync protocol cheap: lazily addresses collection entries with wire-stable keys, so a Snapshot/Delta stream sends only the entries that actually changed. A reactive UI cache, a derived-view layer over a database, a holon graph — anything shaped like "a big keyed collection where readers touch a small slice" gets the same flat-line behavior.

The lesson the spreadsheet teaches, made mechanical: don't recompute what no one is looking at.

Try it

cargo add lazily
# scale benchmark (10M-cell sheet)
LAZILY_SCALE_N=5000000 cargo bench --features scale-bench --bench scale