Engineering blog

Observability, incident analysis, and system reliability — in practice.

Writing for engineering teams on observability, incident analysis, and building reliable systems.

Latest writing

Posts for teams operating real systems

When Your LLM Won't Stop Talking: Rate Limiting in MCP Hangar

Your agent just discovered it can call hangar_call in a loop. Twelve hundred requests in ninety seconds. Here's how MCP Hangar's two-system rate limiting — token bucket at the command bus, exponential backoff on auth — puts a ceiling on that.

mcp, mcp-hangar, security, observability, architecture, open-source

Human-in-the-Loop for MCP: How the Approval Gate Works

Your LLM just decided to delete a production alert rule. You didn't ask it to. The approval gate puts a human between the decision and the execution — not for every tool call, but for the ones where 'undo' is a support ticket.

mcp, mcp-hangar, governance, security, llm, enterprise, human-in-the-loop

Benchmarking MCP Tool Calls: Three Findings That Aren't 'Parallel Is Faster'

We ran 5,300 measurements across 6 scenarios to benchmark MCP Hangar's parallel tool execution. The headline is 19.6× speedup. The actual findings are more interesting: stdio isn't serial, the framework costs nothing, and a hardcoded '4' was silently capping your concurrency.

mcp, benchmarks, performance, mcp-hangar, concurrency, open-source

The MCP Governance Problem Nobody's Talking About

MCP is exploding. Everyone's plugging random servers into their LLMs. Nobody's asking who's accountable when something goes catastrophically wrong. This is an enterprise nightmare waiting to happen, and you're sleepwalking into it.

mcp, governance, security, llm, observability, enterprise

Archive

All posts

6 published articles ready for crawlers and readers.

LLM-Assisted Post-Mortems: The Streetlight Effect, Industrialized

You pasted 400 lines of logs into ChatGPT and it wrote you a root cause analysis. It reads beautifully. It's also wrong — because you fed it a curated slice of reality and it hallucinated the rest. The interesting question isn't whether copy-paste works. It's what changes when the LLM can query your stack directly — and what new problems that creates.

observability, llm, post-mortem, mcp, incident-response, ai-ops, governance
