Who is Kaushik Saravanan?

Kaushik Saravanan is an AI/ML engineer and MS in Artificial Intelligence Engineering candidate at Carnegie Mellon University (ECE, expected December 2027), based in Pittsburgh, PA. He was previously an Associate Application Engineer at SAP Labs India (2024–2026), where he shipped production GDPR-compliant RAG and LLM systems to 400+ users. IEEE-published researcher and Smart India Hackathon 2022 winner.

Is Kaushik Saravanan open to new AI/ML roles?

Yes. Kaushik is open to Summer 2027 AI/ML and RAG internships in the US, and full-time AI engineering roles starting January 2028 after his CMU MS-AIE graduation. Reach out via LinkedIn (linkedin.com/in/kaushiksss) or X (@Kaushiks0).

Does Kaushik need visa sponsorship?

Kaushik is an F-1 international student at Carnegie Mellon University. He has 3-year STEM OPT eligibility after his December 2027 graduation, and is open to employers who sponsor H-1B afterward.

What did Kaushik build at SAP Labs India?

At SAP Labs India (2024–2026) he engineered a GDPR-compliant, privacy-first RAG platform for SAP's internal chatbot. He scaled it to 2M+ documents and 400+ users with <2s p95 end-to-end latency, fine-tuned DeBERTa for Germany-specific PII detection (94% recall@10, MRR@10=0.82), and rewrote a credential-fetch client in dependency-free Go for 9,000+ Linux servers.

What are Kaushik's IEEE publications?

Two IEEE papers: 'Swarm Intelligence-Based Cooperative Intelligent Transportation System' (ICCIES 2025) and 'Cognitive Intrusion Detection System in Autonomous Vehicles Using Machine Learning' (ICPECTS 2024).

What is Kaushik's tech stack?

Python, Go, FastAPI, PyTorch, TensorFlow, Hugging Face Transformers, LangChain, PostgreSQL, Docker, Kubernetes, NVIDIA CUDA, Google Cloud Platform, and Microsoft Azure. Specializes in RAG pipelines, LLM fine-tuning (DeBERTa, QLoRA), and cloud observability.

What HyperFrames Taught Me About Deterministic Video Rendering

The non-negotiable

I contribute to HyperFrames, a system that renders video from HTML. The whole project turns on one property that sounds boring until you try to hold it in production: same input, same pixels, every render. Not "close enough." Not "within a JPEG artifact." Byte-identical frames across machines, across runs, across the same run replayed a week later.

If you have that property, everything downstream gets easy. Regression tests become diff old.png new.png. Cloud rendering is trivially parallel because frame N doesn't need to know anything about frame N-1. Bug reports become reproducible. Reviews become visual diffs.
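As a rough illustration of that regression check, here is a minimal Node sketch. It assumes each rendered frame is already in memory as a Buffer of PNG bytes (for example, read with fs.readFileSync). The names frameDigest and driftingFrames are hypothetical and are not part of HyperFrames.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: fingerprint one rendered frame (a Buffer of PNG bytes).
function frameDigest(frameBytes) {
  return createHash("sha256").update(frameBytes).digest("hex");
}

// Compare two renders frame by frame and return the indices that differ.
// An empty array means the two runs are byte-identical.
function driftingFrames(runA, runB) {
  if (runA.length !== runB.length) {
    throw new Error(`frame count differs: ${runA.length} vs ${runB.length}`);
  }
  const drift = [];
  for (let i = 0; i < runA.length; i++) {
    if (frameDigest(runA[i]) !== frameDigest(runB[i])) drift.push(i);
  }
  return drift;
}
```

Because the comparison is on bytes rather than on perceptual similarity, a single differing pixel is enough to fail the check, which is the point.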

If you don't have that property, you have a recorded video. Which is a fine thing to have. It's just not the same thing.

This post is about the places non-determinism kept trying to sneak back in, what we did about them, and the one adapter that almost broke the contract.

Why the obvious approaches don't clear the bar

Before HyperFrames I reached for one of two options depending on the day:

Record a browser session to MP4. Playwright, ffmpeg, chrome-recorder — pick your poison. The problem is that recording is stateful. The encoder samples at whatever cadence it can get, the browser paints at whatever cadence it can spare, and the two schedules never agree. A GC pause during recording shows up as a hitch in the output. A cold font load flashes as a glyph substitution mid-frame. Rerun the same script tomorrow and you'll get a video that is similar, not identical.

Canvas WYSIWYG tools. After Effects, Motion, Rive, the various in-browser canvas tools. Motion is authored on a timeline and rendered frame by frame — which is closer to what we want — but the moment you need typographic control, semantic HTML, a live product screenshot, or a runtime state machine, you're outside the tool's comfort zone. And capture-based export back out to video reintroduces the recording problem for anything the canvas didn't own.

Here's the tradeoff, drawn honestly:

Axis	Recorded MP4 (Playwright + ffmpeg)	Canvas WYSIWYG (AE / Motion)	HyperFrames (HTML + one paused timeline)
Byte-identical across runs	no — encoder + browser cadence drift	yes if fully authored in canvas	yes, by construction
Renders parallel per frame	no — stateful capture	mostly	yes — each frame is `seek(t)` + rasterize
Handles live web content (fonts, SVG, product UI)	yes, natively	poorly — must import assets	yes, natively
Runtime state (form input, product screenshots, live data)	yes	no	yes
Debuggable	opaque — the browser is a black box	click-through in the app	open DevTools on any frame `t`
Failure mode	silent frame drops	asset drift when re-imported	loud — a missing font raises before render

The row that makes HyperFrames worth the trouble is the first one. Everything else is a consequence.

The core idea

The trick is embarrassingly simple to describe and gnarly to enforce: hold a single paused animation timeline. Never let it play. To render frame N at time t = N / fps, seek(t), wait for the DOM to settle, and rasterize. That's the whole loop.

// Roughly what the render loop does. No wall-clock time enters this function.
async function renderFrame(t, page) {
  // Every adapter (GSAP, Lottie, Three, Anime, CSS, WAAPI, TypeGPU)
  // exposes seek(t). None of them are allowed to advance on their own.
  await page.evaluate((t) => window.__hf.seekAll(t), t);
 
  // Fonts, images, video decode, WebGL uploads — all forced to settle
  // before the compositor is allowed to paint the frame.
  await page.evaluate(() => window.__hf.settle());
 
  return page.screenshot({ omitBackground: false });
}
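Because each frame depends only on t, the frame range can be split across workers with plain arithmetic. The sketch below is one possible way to do that, not the HyperFrames scheduler; frameTime and shardFrames are hypothetical names.

```javascript
// Hypothetical helpers, not HyperFrames API.

// Frame N maps to a timestamp purely by arithmetic; no clock is consulted.
function frameTime(n, fps) {
  return n / fps;
}

// Split frames [0, totalFrames) into contiguous half-open chunks,
// at most one per worker. Each chunk can be rendered independently.
function shardFrames(totalFrames, workers) {
  const size = Math.ceil(totalFrames / workers);
  const shards = [];
  for (let start = 0; start < totalFrames; start += size) {
    const end = Math.min(start + size, totalFrames);
    shards.push({ start, end });
  }
  return shards;
}
```

A worker that receives { start, end } would call renderFrame(frameTime(n, fps), page) for each n in that range, with no knowledge of what other workers are doing.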

The side-effects of seekAll(t) are what the framework spends its complexity budget on. Every animation runtime we support has to obey the pause. Every source of implicit time — requestAnimationFrame, Date.now(), performance.now(), <video>.currentTime, the audio graph clock — has to either be plumbed through the framework's clock or held still until we say otherwise.

If any one of those leaks, the frame you get is a function of the wall clock, not of t. And then you're back to recording.
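To make "plumbed through the framework's clock" concrete, here is a minimal sketch of a virtual clock for two of those sources, Date.now() and performance.now(). It is an assumption about how such a patch could look, not the HyperFrames implementation; installVirtualClock and its epochMs parameter are hypothetical.

```javascript
// Hypothetical sketch, not the HyperFrames implementation.
// `target` is the global object to patch (window inside a render page).
// `epochMs` is a fixed, made-up wall-clock origin so Date.now() stays stable.
function installVirtualClock(target, epochMs = 0) {
  let nowMs = 0; // composition time in milliseconds, changed only by seek()

  const original = {
    dateNow: target.Date.now,
    perfNow: target.performance.now,
  };

  // Both clocks now report a pure function of the last seek(t).
  target.Date.now = () => epochMs + nowMs;
  target.performance.now = () => nowMs;

  return {
    seek(tSeconds) {
      nowMs = tSeconds * 1000;
    },
    restore() {
      target.Date.now = original.dateNow;
      target.performance.now = original.perfNow;
    },
  };
}
```

This only covers the two function calls shown. Code that constructs new Date() with no arguments, or reads media and audio clocks, would need its own interception.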

Where non-determinism actually hides

I had expected the interesting bugs to live in animation math. They didn't. They lived in five places I would not have guessed before shipping this.

1. Font loading is async and glyph substitution is silent. The browser cheerfully paints a frame with a fallback font, then repaints with the real one two frames later. If you rasterize during that window, you get a frame that is stylistically off — different metrics, different kerning, sometimes a different script entirely. The fix is to await document.fonts.ready in the settle step, and to fail loudly if a @font-face didn't load. HyperFrames refuses to render a frame with a missing font. That felt aggressive until we shipped the alternative and watched a launch video go out with Arial where the client's brand font should have been.

2. requestAnimationFrame runs on browser cadence, not frame count. Every third-party animation library assumes rAF ticks at ~60Hz driven by the compositor. When you're rendering a 30fps video, or a 60fps one on a machine that's under load, rAF fires whenever the browser feels like it, and animation values drift a few pixels between runs. We had to monkey-patch rAF at the framework level so it fires exactly once per seek() and carries a frame-index clock instead of a wall-clock timestamp. That single patch closed more determinism bugs than the other four fixes combined.

3. Garbage collection pauses interfere with animation values. GSAP tweens are computed lazily; if a GC pause lands between a seek() and the paint, the interpolation math runs against slightly stale state. On a hot render machine this was invisible. On a cold Lambda worker it was a 1-2 pixel bleed on rotating elements. The mitigation was ugly: force a --gc-interval on the render worker and warm the heap with a dry-run frame before starting the real render. Not principled. Just works.

4. <video> and <audio> decode is async. Setting video.currentTime = t doesn't mean the frame at t is decoded and ready. It means the decoder has been asked to seek. The seeked event fires later, and the compositor paints whatever it has right now. HyperFrames had to add an explicit await video.seek(t) primitive that resolves on the seeked event and then waits an extra rAF for the compositor to catch up. Two frames of latency, but the alternative was a frame that looked correct 90% of the time.

5. Random seeds inside third-party libraries. Three.js particle systems, Math.random() calls inside certain GSAP plugins, any noise-based effect. All of them reach for Math.random(), whose output depends on PRNG state the JS runtime seeds at boot and on how many values have already been drawn, not on t. Two renders of the same composition produced different particle trajectories because the runtime had done different work before the first tween. We shipped a seeded Math.random shim at the framework entry point and required every adapter to route through it.
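Items 2 and 5 can be sketched together. The code below is a hedged illustration, not the shipped patch: createFrameScheduler and seededRandom are hypothetical names, and the generator is mulberry32, a small well-known PRNG swapped in here for illustration. The source does not say which generator HyperFrames uses.

```javascript
// Hypothetical rAF replacement: callbacks run only when flush() is called,
// and they receive a timestamp derived from the frame index.
function createFrameScheduler(fps) {
  let queue = [];
  let nextId = 1;
  return {
    requestAnimationFrame(cb) {
      const id = nextId++;
      queue.push({ id, cb });
      return id;
    },
    cancelAnimationFrame(id) {
      queue = queue.filter((entry) => entry.id !== id);
    },
    // Call exactly once per seek. Callbacks that re-register themselves
    // land in the next frame's queue, as they would in a browser.
    flush(frameIndex) {
      const pending = queue;
      queue = [];
      const timestampMs = (frameIndex * 1000) / fps;
      for (const { cb } of pending) cb(timestampMs);
    },
  };
}

// mulberry32: a tiny seeded PRNG returning floats in [0, 1).
function seededRandom(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
```

In a render page, the scheduler's two methods would replace window.requestAnimationFrame and window.cancelAnimationFrame, and Math.random would be reassigned to seededRandom(someFixedSeed) before any library code runs. One caveat of this simplified version: cancelling a callback from inside another callback during the same flush has no effect.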

The seven-adapter surface

HyperFrames supports seven animation runtimes: GSAP (default), Lottie, Three.js, Anime.js, CSS keyframes, the Web Animations API, and TypeGPU. Each had to conform to the "one paused timeline you can seek" contract. Most were straightforward — GSAP, Anime, and WAAPI all expose a seek(t) or currentTime = t primitive natively. Lottie ships one via goToAndStop. Three.js exposes it through the animation mixer with mixer.setTime(t). TypeGPU is our own, so we designed for it.
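A sketch of what that contract might look like for four of the runtimes follows. The wrapper shapes and the adapters and seekAll names here are hypothetical, not the HyperFrames source. The underlying calls (a GSAP timeline's seek, lottie-web's goToAndStop, Three.js AnimationMixer.setTime, and the WAAPI currentTime property) are the ones named above, with the assumption that the caller passes in runtime objects that are already created and paused.

```javascript
// Hypothetical adapter wrappers. Each takes an existing, paused runtime
// object and exposes one method: seek(t), with t in seconds.
const adapters = {
  gsap: (timeline) => ({
    seek: (t) => timeline.seek(t),
  }),
  // goToAndStop(value, true) treats value as a frame number,
  // so the caller supplies the animation's frame rate.
  lottie: (anim, frameRate) => ({
    seek: (t) => anim.goToAndStop(t * frameRate, true),
  }),
  three: (mixer) => ({
    seek: (t) => mixer.setTime(t),
  }),
  // WAAPI currentTime is expressed in milliseconds.
  waapi: (animation) => ({
    seek: (t) => {
      animation.currentTime = t * 1000;
    },
  }),
};

// Move every registered adapter to the same composition time.
function seekAll(registered, t) {
  for (const adapter of registered) adapter.seek(t);
}
```

The value of the uniform shape is that the render loop never needs to know which runtime it is driving; it only calls seekAll(registered, t).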

The hard one was CSS keyframes.

CSS does not expose a seek(). There is no element.animation.currentTime = t that composes cleanly with animation-delay, animation-duration, and pause states. What you can do is set animation-play-state: paused, then set animation-delay to -t, which makes the paused animation resolve as if it were t seconds into playback.

That works. It also has the property that changing animation-delay restarts style resolution in ways that can trigger a repaint, and if you have a hundred elements on a frame you get a hundred style recalculations per seek. We shipped it because we had to. If I were redesigning the adapter I would drop CSS keyframes support and force those animations through WAAPI, which was designed to be seekable and doesn't fight the compositor.
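The negative-delay trick can be sketched in a few lines. This is an illustration under assumptions, not the adapter as shipped: seekCssAnimation is a hypothetical name, and the baseDelay parameter (the element's authored animation-delay in seconds, which has to be preserved when offsetting) is an addition the description above does not mention.

```javascript
// Hypothetical helper: pin a CSS keyframe animation at time t (seconds).
// `baseDelay` is the authored animation-delay in seconds (assumed 0 if none).
function seekCssAnimation(element, t, baseDelay = 0) {
  // Paused, so the animation never advances on its own.
  element.style.animationPlayState = "paused";
  // A negative delay makes the paused animation resolve as if it had
  // already been playing for t seconds.
  element.style.animationDelay = `${baseDelay - t}s`;
}

// Seeking a whole frame means one style write per animated element,
// which is where the per-seek style recalculation cost comes from.
function seekAllCssAnimations(elements, t) {
  for (const element of elements) seekCssAnimation(element, t);
}
```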

Figure: Three stacked render passes of the same 8-frame composition: Run A at 09:14 UTC, Run B at 18:47 UTC, Run C a week later. The diff columns between them are pure black: zero pixels of drift. A comparison strip below shows a recorded browser session where the diff column fills with red noise from encoder cadence, GC pauses, and font swaps.

Same input, same pixels. The black diff column between rerun stripes is the whole product: everything downstream (parallel rendering, visual regression tests, reproducible bug reports) is a consequence of that column staying black.

What I'd redesign

The place I keep landing when I think about a v2 is: a WASM sandbox around the render page.

Right now the framework relies on discipline — every adapter promises to route through the framework's clock, every third-party library promises not to call Math.random directly. In practice, we catch the violators with linting and integration tests, but new libraries always find new ways to leak wall-clock into the frame.

If I could redo it, the render page would run inside a WASM shell that intercepts Math.random, Date.now, performance.now, requestAnimationFrame, and the audio/video decode clocks at the runtime level. Every call would return a framework-controlled deterministic value keyed off the current t. Libraries wouldn't have to cooperate. They just wouldn't be able to reach the real clock even if they tried.

That's a large project. It also might be the only real answer, because the current design is fundamentally a set of promises, and promises don't survive the next npm install.