The non-negotiable
I contribute to HyperFrames, a system that renders video from HTML. The whole project turns on one property that sounds boring until you try to hold it in production: same input, same pixels, every render. Not "close enough." Not "within a JPEG artifact." Byte-identical frames across machines, across runs, across the same run replayed a week later.
If you have that property, everything downstream gets easy. Regression tests become diff old.png new.png. Cloud rendering is trivially parallel because frame N doesn't need to know anything about frame N-1. Bug reports become reproducible. Reviews become visual diffs.
If you don't have that property, you have a recorded video. Which is a fine thing to have. It's just not the same thing.
This post is about the places non-determinism kept trying to sneak back in, what we did about them, and the one adapter that almost broke the contract.
Why the obvious approaches don't clear the bar
Before HyperFrames I reached for one of two options depending on the day:
Record a browser session to MP4. Playwright, ffmpeg, chrome-recorder — pick your poison. The problem is that recording is stateful. The encoder samples at whatever cadence it can get, the browser paints at whatever cadence it can spare, and the two schedules never agree. A GC pause during recording shows up as a hitch in the output. A cold font load flashes as a glyph substitution mid-frame. Rerun the same script tomorrow and you'll get a video that is similar, not identical.
Canvas WYSIWYG tools. After Effects, Motion, Rive, the various in-browser canvas tools. Motion is authored on a timeline and rendered frame by frame — which is closer to what we want — but the moment you need typographic control, semantic HTML, a live product screenshot, or a runtime state machine, you're outside the tool's comfort zone. And capture-based export back out to video reintroduces the recording problem for anything the canvas didn't own.
Here's the tradeoff, drawn honestly:
| Axis | Recorded MP4 (Playwright + ffmpeg) | Canvas WYSIWYG (AE / Motion) | HyperFrames (HTML + one paused timeline) |
|---|---|---|---|
| Byte-identical across runs | no — encoder + browser cadence drift | yes if fully authored in canvas | yes, by construction |
| Renders parallel per frame | no — stateful capture | mostly | yes — each frame is seek(t) + rasterize |
| Handles live web content (fonts, SVG, product UI) | yes, natively | poorly — must import assets | yes, natively |
| Runtime state (form input, product screenshots, live data) | yes | no | yes |
| Debuggable | opaque — the browser is a black box | click-through in the app | open DevTools on any frame t |
| Failure mode | silent frame drops | asset drift when re-imported | loud — a missing font raises before render |
The row that makes HyperFrames worth the trouble is the first one. Everything else is a consequence.
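The first two rows of the table can be made concrete with a small sketch. It assumes frame N maps to time t = N / fps and that each rendered frame is available as a byte array; `planShards` and `firstMismatch` are hypothetical helper names for illustration, not part of HyperFrames.

```javascript
// Split the frame range [0, totalFrames) into contiguous shards, one per
// worker. No shard needs state from another, because each frame depends
// only on its own index.
function planShards(totalFrames, workerCount) {
  const shards = [];
  const base = Math.floor(totalFrames / workerCount);
  const extra = totalFrames % workerCount;
  let start = 0;
  for (let w = 0; w < workerCount; w++) {
    const size = base + (w < extra ? 1 : 0);
    if (size === 0) break; // more workers than frames
    shards.push({ start, end: start + size }); // end is exclusive
    start += size;
  }
  return shards;
}

// Compare two renders frame by frame. Returns the index of the first frame
// whose bytes differ (or that is missing on one side), or -1 if identical.
function firstMismatch(baseline, candidate) {
  const count = Math.max(baseline.length, candidate.length);
  for (let i = 0; i < count; i++) {
    const a = baseline[i];
    const b = candidate[i];
    if (!a || !b || a.length !== b.length) return i;
    for (let j = 0; j < a.length; j++) {
      if (a[j] !== b[j]) return i;
    }
  }
  return -1;
}
```

With byte-identical output, a regression check reduces to asserting that `firstMismatch` returns -1 against a stored baseline; no perceptual-diff threshold is needed.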
The core idea
The trick is embarrassingly simple to describe and gnarly to enforce: hold a single paused animation timeline. Never let it play. To render frame N at time t = N / fps, seek(t), wait for the DOM to settle, and rasterize. That's the whole loop.
```javascript
// Roughly what the render loop does. No wall-clock time enters this function.
async function renderFrame(t, page) {
  // Every adapter (GSAP, Lottie, Three, Anime, CSS, WAAPI, TypeGPU)
  // exposes seek(t). None of them are allowed to advance on their own.
  await page.evaluate((t) => window.__hf.seekAll(t), t);
  // Fonts, images, video decode, WebGL uploads — all forced to settle
  // before the compositor is allowed to paint the frame.
  await page.evaluate(() => window.__hf.settle());
  return page.screenshot({ omitBackground: false });
}
```

The side-effects of seekAll(t) are what the framework spends its complexity budget on. Every animation runtime we support has to obey the pause. Every source of implicit time — requestAnimationFrame, Date.now(), performance.now(), <video>.currentTime, the audio graph clock — has to either be plumbed through the framework's clock or held still until we say otherwise.
If any one of those leaks, the frame you get is a function of the wall clock, not of t. And then you're back to recording.
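One way to picture "plumbed through the framework's clock" is a virtual clock that only moves when the renderer seeks. The sketch below is an illustration under assumptions, not the HyperFrames implementation: `installVirtualClock` and its options are hypothetical names, and it covers only `Date.now()` and `performance.now()` (a bare `new Date()` and the media clocks would need separate handling).

```javascript
// Replace the time sources on `target` with a clock that advances only on
// seek. In a browser `target` would be `window`; any object that carries
// `Date` and `performance` works, which keeps the sketch testable.
function installVirtualClock(target, { fps, epochMs = 0 }) {
  let nowMs = 0;

  // Both functions now read the virtual time, never the wall clock.
  target.performance.now = () => nowMs;
  target.Date.now = () => epochMs + nowMs;

  return {
    // Move the clock to the start of frame `frameIndex` and return the
    // corresponding t in seconds (the value a seekAll(t) call would take).
    seekFrame(frameIndex) {
      nowMs = (frameIndex * 1000) / fps;
      return nowMs / 1000;
    },
  };
}
```

Because the reported time is a pure function of the frame index, two renders of frame N observe the same `performance.now()` regardless of how long the machine took to get there.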
Where non-determinism actually hides
I had expected the interesting bugs to live in animation math. They didn't. They lived in five places I would not have guessed before shipping this.
1. Font loading is async and glyph substitution is silent. The browser cheerfully paints a frame with a fallback font, then repaints with the real one two frames later. If you rasterize during that window, you get a frame that is stylistically off — different metrics, different kerning, sometimes a different script entirely. The fix is to await document.fonts.ready in the settle step, and to fail loudly if a @font-face didn't load. HyperFrames refuses to render a frame with a missing font. That felt aggressive until we shipped the alternative and watched a launch video go out with Arial where the client's brand font should have been.
2. requestAnimationFrame runs on browser cadence, not frame count. Every third-party animation library assumes rAF ticks at ~60Hz driven by the compositor. When you're rendering a 30fps video, or a 60fps one on a machine that's under load, rAF fires whenever the browser feels like it, and animation values drift a few pixels between runs. We had to monkey-patch rAF at the framework level so it fires exactly once per seek() and carries a frame-index clock instead of a wall-clock timestamp. That single patch closed more determinism bugs than the next four fixes combined.
3. Garbage collection pauses interfere with animation values. GSAP tweens are computed lazily; if a GC pause lands between a seek() and the paint, the interpolation math runs against slightly stale state. On a hot render machine this was invisible. On a cold Lambda worker it was a 1-2 pixel bleed on rotating elements. The mitigation was ugly: force a --gc-interval on the render worker and warm the heap with a dry-run frame before starting the real render. Not principled. Just works.
4. <video> and <audio> decode is async. Setting video.currentTime = t doesn't mean the frame at t is decoded and ready. It means the decoder has been asked to seek. The seeked event fires later, and the compositor paints whatever it has right now. HyperFrames had to add an explicit await video.seek(t) primitive that resolves on the seeked event and then waits an extra rAF for the compositor to catch up. Two frames of latency, but the alternative was a frame that looked correct 90% of the time.
5. Random seeds inside third-party libraries. Three.js particle systems, Math.random() calls inside certain GSAP plugins, any noise-based effect. All of them reach for Math.random(), which draws from a PRNG that the JS runtime seeds unpredictably at startup and that every other caller on the page advances. Two renders of the same composition produced different particle trajectories because the seed differed and the runtime had done different work before the first tween. We shipped a seeded Math.random shim at the framework entry point and required every adapter to route through it.
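Four of the five fixes above (items 1, 2, 4 and 5) can be sketched in a few small functions. This is a minimal illustration under assumptions, not the shipped code: every function name here is hypothetical, mulberry32 is one common small PRNG standing in for whatever the real shim uses, and `seekVideo` assumes the video's metadata has already loaded.

```javascript
// Item 5: a seeded PRNG (mulberry32) that can stand in for Math.random.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function random() {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Item 2: a requestAnimationFrame that fires only when the renderer flushes
// a frame, with a timestamp derived from the frame index.
function installFrameRaf(target, fps) {
  const pending = new Map();
  let nextId = 1;
  target.requestAnimationFrame = (callback) => {
    const id = nextId++;
    pending.set(id, callback);
    return id;
  };
  target.cancelAnimationFrame = (id) => {
    pending.delete(id);
  };
  return function flushFrame(frameIndex) {
    const timestampMs = (frameIndex * 1000) / fps;
    // Snapshot the ids: callbacks queued during this flush wait for the next.
    for (const id of [...pending.keys()]) {
      const callback = pending.get(id);
      if (!callback) continue; // cancelled earlier in this flush
      pending.delete(id);
      callback(timestampMs);
    }
  };
}

// Item 1: wait for font loading, then fail loudly rather than paint a fallback.
// `fontSet` is document.fonts in a browser.
async function assertFontsLoaded(fontSet, requiredFamilies) {
  await fontSet.ready;
  const unquote = (name) => name.replace(/^["']|["']$/g, "");
  const faces = Array.from(fontSet);
  for (const family of requiredFamilies) {
    const loaded = faces.some(
      (face) => unquote(face.family) === family && face.status === "loaded"
    );
    if (!loaded) throw new Error(`font "${family}" did not load; refusing to render`);
  }
}

// Item 4: resolve only after the decoder reports the seek finished, then wait
// one more frame (`nextFrame` is supplied by the caller) for the compositor.
function seekVideo(video, t, nextFrame = () => Promise.resolve()) {
  return new Promise((resolve, reject) => {
    if (video.readyState < 1) {
      reject(new Error("video metadata not loaded"));
      return;
    }
    const cleanup = () => {
      video.removeEventListener("seeked", onSeeked);
      video.removeEventListener("error", onError);
    };
    const onSeeked = () => { cleanup(); resolve(); };
    const onError = () => { cleanup(); reject(new Error(`seek to ${t}s failed`)); };
    video.addEventListener("seeked", onSeeked);
    video.addEventListener("error", onError);
    video.currentTime = t;
  }).then(nextFrame);
}
```

In a page, the first two would be installed once at the framework entry point (for example `Math.random = mulberry32(seed)` and `installFrameRaf(window, fps)`), while the last two belong in the settle step that runs before each rasterize.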
The seven-adapter surface
HyperFrames supports seven animation runtimes: GSAP (default), Lottie, Three.js, Anime.js, CSS keyframes, the Web Animations API, and TypeGPU. Each had to conform to the "one paused timeline you can seek" contract. Most were straightforward — GSAP, Anime, and WAAPI all expose a seek(t) or currentTime = t primitive natively. Lottie ships one via goToAndStop. Three.js exposes it through the animation mixer with mixer.setTime(t). TypeGPU is our own, so we designed for it.
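As a rough sketch of what "one paused timeline you can seek" looks like across these libraries, the wrappers below normalize each native primitive to a `seek(t)` that takes seconds. The registry shape and the names `adapterFactories` and `createRegistry` are assumptions for illustration; the library calls themselves (`timeline.seek`, anime.js v3-style `instance.seek` in milliseconds, `goToAndStop(value, isFrame)`, `mixer.setTime`, `animation.currentTime` in milliseconds) are the primitives named above.

```javascript
// Each factory wraps a library instance and exposes seek(t), t in seconds.
const adapterFactories = {
  // GSAP timelines seek in seconds.
  gsap: (timeline) => ({
    seek(t) { timeline.pause(); timeline.seek(t); },
  }),
  // anime.js (v3-style) instances seek in milliseconds.
  anime: (instance) => ({
    seek(t) { instance.pause(); instance.seek(t * 1000); },
  }),
  // lottie-web: with isFrame = false the value is time-based (milliseconds).
  lottie: (animation) => ({
    seek(t) { animation.goToAndStop(t * 1000, false); },
  }),
  // Three.js AnimationMixer.setTime takes seconds.
  three: (mixer) => ({
    seek(t) { mixer.setTime(t); },
  }),
  // Web Animations API: currentTime is in milliseconds.
  waapi: (animation) => ({
    seek(t) { animation.pause(); animation.currentTime = t * 1000; },
  }),
};

// A registry that fans one seek out to every registered adapter.
function createRegistry() {
  const adapters = [];
  return {
    register(kind, instance) {
      const factory = adapterFactories[kind];
      if (!factory) throw new Error(`no adapter for "${kind}"`);
      adapters.push(factory(instance));
    },
    seekAll(t) {
      for (const adapter of adapters) adapter.seek(t);
    },
  };
}
```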
The hard one was CSS keyframes.
CSS does not expose a seek(). There is no element.animation.currentTime = t that composes cleanly with animation-delay, animation-duration, and pause states. What you can do is set animation-play-state: paused, then set animation-delay to -t, which makes the paused animation resolve as if it were t seconds into playback.
That works. It also has the property that changing animation-delay restarts style resolution in ways that can trigger a repaint, and if you have a hundred elements on a frame you get a hundred style recalculations per seek. We shipped it because we had to. If I were redesigning the adapter I would drop CSS keyframes support and force those animations through WAAPI, which was designed to be seekable and doesn't fight the compositor.
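The negative-delay trick, and the WAAPI route I would prefer, can be sketched side by side. Both functions are illustrative assumptions rather than the adapter's actual code: `seekCssAnimation` and `seekViaWaapi` are hypothetical names, and `authoredDelaySeconds` stands for whatever animation-delay the stylesheet originally declared (assumed to be known to the caller).

```javascript
// CSS route: pause the animation and shift its delay so the paused state
// resolves as if playback were t seconds in. One style write per element,
// so a hundred animated elements means a hundred writes per seek.
function seekCssAnimation(element, t, authoredDelaySeconds = 0) {
  element.style.animationPlayState = "paused";
  element.style.animationDelay = `${authoredDelaySeconds - t}s`;
}

// WAAPI route: element.getAnimations() returns Animation objects whose
// currentTime (in milliseconds) can be set directly while paused.
function seekViaWaapi(element, t) {
  for (const animation of element.getAnimations()) {
    animation.pause();
    animation.currentTime = t * 1000;
  }
}
```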
What I'd redesign
The place I keep landing when I think about a v2 is: a WASM sandbox around the render page.
Right now the framework relies on discipline — every adapter promises to route through the framework's clock, every third-party library promises not to call Math.random directly. In practice, we catch the violators with linting and integration tests, but new libraries always find new ways to leak wall-clock into the frame.
If I could redo it, the render page would run inside a WASM shell that intercepts Math.random, Date.now, performance.now, requestAnimationFrame, and the audio/video decode clocks at the runtime level. Every call would return a framework-controlled deterministic value keyed off the current t. Libraries wouldn't have to cooperate. They just wouldn't be able to reach the real clock even if they tried.
That's a large project. It also might be the only real answer, because the current design is fundamentally a set of promises, and promises don't survive the next npm install.
See also
- HNSW or IVF-PQ? What I Actually Chose at 2M Documents — a different flavor of "pick the tool for the regime you're in, not the one you might grow into."
More on the HyperFrames project — including the seven-adapter design, the Lambda render path, and the CLI — is on the projects page.