TENSORLAKE·ENGINEERING LOG / VOL. 04·EST. 2024

Engineering log.

Dispatches from the Tensorlake runtime team — how we build the sandbox layer under five million agent workloads, the research that shapes it, and what we shipped this week.

EDITION: Q2 · 2026
POSTS: 12
AUTHORS: 07
11 POSTS · UPDATED DAILY
May 1, 2026 · Engineering

Browser Harness: direct Chrome DevTools Protocol access and a self-healing harness for browser agents

Browser-use strips its harness to 592 lines and gives the LLM a raw WebSocket to Chrome's DevTools Protocol. When the agent hits a gap, it writes a helper function and saves it for future runs.

Antonio Jimeno Yepes · Engineering · 3 min

Apr 30, 2026 · Engineering

How scaffold design affects coding agent benchmark scores: lessons from Droid

Contextual instruction injection, per-model tool schemas, and planning/execution splits: the harness decisions behind Droid's 77.3% on Terminal-Bench.

Antonio Jimeno Yepes · Engineering · 3 min

Apr 29, 2026 · Engineering

Hermes: the coding agent that gets better the more you use it

Most coding agents are stateless. Hermes uses a closed learning loop where the agent creates skills from experience, improves them during use, and builds a persistent model of who you are across sessions.

Antonio Jimeno Yepes · Engineering · 3 min

Apr 29, 2026 · Engineering

Pi: a coding agent with efficient system prompting

Pi keeps its entire system prompt, including all tool definitions, under 1,000 tokens, a 10× reduction compared with tools like Claude Code or Cline.

Antonio Jimeno Yepes · Engineering · 2 min

Apr 28, 2026 · Engineering

ForgeCode: top open source coding agent in Terminal-Bench@2.0

ForgeCode reaches 81.8% on Terminal-Bench 2.0 with both Claude Opus 4.6 and GPT-5.4. A look at what the harness is doing that the model isn't.

Antonio Jimeno Yepes · Engineering · 3 min

Apr 27, 2026 · Engineering

Building Sandboxes for Computer Use

Computer Use is a tiny loop. The hard part is building the boring, reproducible desktop around it.

Bernard Kolobara · Software Engineer · Tensorlake · 8 min

Apr 24, 2026 · Engineering

Starting hundreds of sandboxes in parallel, and the design that makes it possible

We replaced reconciliation loops with a durable command outbox so that our sandbox scheduler can start thousands of sandboxes every second.

David Calavera · Software Engineer · Tensorlake · 9 min

Apr 22, 2026 · Engineering

Tensorlake is now an official Harbor environment runtime

Harbor defines and evaluates terminal tasks. Tensorlake provides the MicroVM execution layer. Together they are a full evaluation stack for CLI agents.

Tensorlake Team · Engineering · 7 min

Apr 16, 2026 · Concepts

Suspend vs. snapshot: pause a sandbox, or save it for reuse?

One is a pause button, the other is a save file. Same state, different question, and the answer shapes your cost model, your fan-out pattern, and which failures you can recover from.

Diptanu Gon Choudhury · CEO / Co-founder · 12 min

Apr 3, 2026 · Engineering

Autoresearch on steroids with sandboxes

An LLM agent can propose incremental training-script improvements, but safely executing untrusted code requires isolated sandboxes with resource limits, which is where Tensorlake comes in. Here's the overnight hill-climb, end to end.

Tensorlake Team · Engineering · 9 min