There Was No Tenant | Why Is This Down?

The audit line says the tenant check passed. It ran, it returned, nothing threw, and the green checkmark went into the log. There was no tenant. The check evaluated against an empty context, and empty matched a policy written for nobody. That is the failure mode this post is about: not a check that fails, a check that runs against nothing and calls it a pass.

Here is the one-sentence version. The MCP Tasks extension (shipping with the 2026-07-28 wave, currently in its ten-week release-candidate window) moves governance from a synchronous choke point into an async lifecycle, and the per-request identity that a Python governance layer typically binds via contextvars silently fails to cross the thread-pool boundary that every real proxy has. Your tenant, your principal, your pinned tool digest: simply not there when the task is created or served. No error. No log line saying “governance not bound.” The machinery runs; the identity doesn’t ride along.

Two days ago I wrote that everything governing an agent in the new MCP is yours to turn on. This is the uncomfortable sequel: the thing you turned on can be wired, running, and still not bound. Every snippet below was executed, on CPython 3.12 and 3.13, output shown inline. The repro does the arguing.

Four primitives teach you to trust

A governance proxy authorizes each request against per-request identity: tenant, principal, a policy snapshot, a pinned digest. The idiomatic way to carry that through a deep call stack in Python is contextvars — set it once at the request boundary, read it anywhere downstream, no threading it through every signature. Frameworks do exactly this for request IDs, auth context, OTel spans. It is the ergonomic choice, which is why everyone makes it.

And it earns your trust honestly, because the async primitives you touch every day propagate it faithfully:

import asyncio, contextvars

pin = contextvars.ContextVar("pin", default="<unset>")

async def child():
    return pin.get()

async def main():
    pin.set("acme")
    # create_task copies the current context at creation (PEP 567):
    print("create_task ->", await asyncio.create_task(child()))
    # gather, too:
    print("gather      ->", (await asyncio.gather(child()))[0])

asyncio.run(main())
# create_task -> acme
# gather      -> acme

Even the hop from a foreign thread onto the loop preserves it, because run_coroutine_threadsafe copies the caller’s context — continuing the same file:

import threading

def from_worker_thread(loop):
    pin.set("acme")            # bound in the worker thread, not on the loop
    fut = asyncio.run_coroutine_threadsafe(child(), loop)
    print("run_coroutine_threadsafe ->", fut.result())

async def main():
    loop = asyncio.get_running_loop()
    t = threading.Thread(target=from_worker_thread, args=(loop,))
    t.start()
    # Join off the loop: a bare t.join() would block the loop that child() needs.
    await asyncio.to_thread(t.join)

asyncio.run(main())
# run_coroutine_threadsafe -> acme

create_task, gather, run_coroutine_threadsafe, plain await. Four for four. You spend months watching contextvars just work, and you stop checking. That is the trap being set.
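Of the four, plain await is the only one the snippets above don't exercise. A minimal sketch, reusing the same pin and child names, shows why it is the least surprising case: an awaited coroutine runs inside the caller's task, so it shares the caller's context instead of receiving a copy.

```python
import asyncio
import contextvars

pin = contextvars.ContextVar("pin", default="<unset>")

async def child():
    return pin.get()

async def main():
    pin.set("acme")
    # No copy happens here: child() executes in main()'s own task and context.
    return await child()

result = asyncio.run(main())
print("plain await ->", result)
# plain await -> acme
```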

The one boundary that lies

Real proxies are not pure-async. Somewhere in the stack there is blocking work behind a thread pool — a synchronous vendor SDK you are not going to rewrite, a CPU-bound hash, a legacy client. That is loop.run_in_executor(...). And run_in_executor propagates context in neither direction:

import asyncio, contextvars
pin = contextvars.ContextVar("pin", default="<unset>")

# (1) The worker does NOT inherit the loop's context:
def worker_reads():
    return pin.get()

async def a():
    pin.set("loop-value")
    print("worker inherits loop ctx? ->",
          await asyncio.get_running_loop().run_in_executor(None, worker_reads))

asyncio.run(a())
# worker inherits loop ctx? -> <unset>

# (2) And what the worker binds does NOT come back:
def worker_binds():
    pin.set("acme")           # bind identity/digest in the invoke path
    return "done"

async def b():
    await asyncio.get_running_loop().run_in_executor(None, worker_binds)
    # back on the main loop — where the async task gets created and served:
    print("main loop sees worker's bind? ->", pin.get())

asyncio.run(b())
# main loop sees worker's bind? -> <unset>

run_in_executor is a contextvars black hole, and it looks identical to the four primitives that aren’t. Same await, same call site, same shape in a code review. The primitives that copy context train you to treat ambient identity as reliable; the one boundary that doesn’t is the one every real proxy crosses.

And the black hole has a floor, and the floor is worse. That bind in worker_binds didn’t vanish — it landed in the pool thread’s own context, and pool threads are reused. Watch what the next request sees:

from concurrent.futures import ThreadPoolExecutor

async def c():
    loop = asyncio.get_running_loop()
    pool = ThreadPoolExecutor(max_workers=1)
    # request A binds its tenant in the worker:
    await loop.run_in_executor(pool, worker_binds)
    # request B — a different request, same reused thread:
    print("next request on same pool thread sees ->",
          await loop.run_in_executor(pool, worker_reads))

asyncio.run(c())
# next request on same pool thread sees -> acme

max_workers=1 just makes it deterministic; the default executor reuses threads the same way. So the failure isn’t only “there was no tenant.” It’s “there was someone else’s tenant”: request B running its checks against request A’s identity, because A’s bind took up residence in a thread B happened to draw from the pool. An empty context is fail-open by omission. A leaking one is cross-tenant identity bleed, and it means the obvious “fix” (just bind the context inside the worker where you need it) doesn’t patch the hole, it arms it.

Tasks makes the hole load-bearing

Synchronously, the black hole is survivable. You bind at the loop boundary, the executor does its blocking thing, and you never need the context inside the worker — the request begins and ends on the loop, and the loop has the identity the whole time.

The Tasks extension breaks that symmetry. The new lifecycle is server-directed: the server answers tools/call with a task handle and the client drives the rest through polling — tasks/get, tasks/cancel — each poll a separate request, on a separate turn. Walk the identity through it:

1. tools/call arrives on the main loop. Identity is here, bound and verified.
2. The blocking invoke runs in a worker thread, where a proxy naturally pins the tool digest or re-reads identity. Inside the black hole.
3. The handler that mints the task handle runs on the main loop again, but nothing the worker bound came back across the boundary.
4. Later, tasks/get arrives: a different request, a different turn, whose only link to the original identity is whatever you persisted at step 3. Which, per step 3, may be nothing.

The old experimental API at least had a blocking tasks/result you could squint at as one long request. The redesign replaced it with polling, and polling is the honest shape: separate turns, no ambient anything, no session to lean on. Whatever the task knows about who owns it is what you explicitly gave it. The rest is gone.

Three failure modes fall out, all silent.

The unattributed task: created with an empty tenant because the bind lived in the worker, so every later cross-tenant check evaluates against None. That is fail-open by omission. Or, if you’re lucky, the task lands untracked and the client gets a dead handle and a “task not found”: fail-closed by accident, which reads even worse in an audit.

The misattributed task: step 2 ran on a pool thread still carrying a previous request’s bind, and whatever step 3 persists, it persists under the wrong tenant. This is the bleed from the last section, now written into a durable handle that a later tasks/get will faithfully authorize against.

Drift blindness: the digest you pinned in the worker never reaches the point where the task result is re-verified, so the integrity check you think you have doesn’t fire.

Nothing throws. Nothing logs. All three pass.
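The first two failure modes can be reproduced in a few lines. This is a sketch under stated assumptions, not MCP SDK code: tools_call, blocking_invoke, and the TASKS dict are hypothetical stand-ins for a proxy's request handler, its blocking invoke, and its task store. Each request runs in its own asyncio task to mimic separate requests.

```python
import asyncio
import contextvars
from concurrent.futures import ThreadPoolExecutor

tenant = contextvars.ContextVar("tenant", default=None)
TASKS = {}  # hypothetical task store: handle -> owner recorded at mint time

def blocking_invoke(bind_as):
    seen = tenant.get()        # what governance code inside the worker reads
    if bind_as is not None:
        tenant.set(bind_as)    # bind inside the worker (the anti-pattern)
    return seen

async def tools_call(pool, handle, bind_as):
    loop = asyncio.get_running_loop()
    seen_in_worker = await loop.run_in_executor(pool, blocking_invoke, bind_as)
    # Step 3: mint the handle back on the loop, trusting ambient context.
    TASKS[handle] = tenant.get()
    return seen_in_worker

async def main():
    pool = ThreadPoolExecutor(max_workers=1)  # one thread makes reuse deterministic
    try:
        first = await asyncio.create_task(tools_call(pool, "t1", "acme"))
        second = await asyncio.create_task(tools_call(pool, "t2", None))
    finally:
        pool.shutdown()
    return first, second

first_seen, second_seen = asyncio.run(main())
print("owners recorded at mint ->", TASKS)
print("request 2's worker read ->", second_seen)
# owners recorded at mint -> {'t1': None, 't2': None}
# request 2's worker read -> acme
```

Both handles are minted with no owner (unattributed), and the second request's worker reads the first request's tenant (the precondition for misattribution). Nothing in the run raises or logs.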

This is not a Python story

Three preconditions, and the failure follows:

1. Request-scoped context held ambiently: contextvars, thread-locals, async-local storage. Every governance layer.
2. A thread pool for blocking or CPU work, or a synchronous SDK. Every real proxy.
3. An async lifecycle where identity must survive request → worker → loop → a later, separate poll. That’s Tasks.

The first two were always in your stack. MCP just made the third one mainstream, which is why the people who will hit this haven’t hit it yet — the RC is locked, the wave lands July 28, and the migration off the old experimental Tasks API is mandatory. The seam is about to take production traffic.

And it is not a CPython quirk. Go’s context.Context doesn’t cross a goroutine you forgot to pass it to. Node’s AsyncLocalStorage doesn’t survive a worker_threads hop or an unwrapped thread-pool callback. Same shape, different runtime: ambient context ends where the stack ends, and an async task is a new stack.
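The same shape can be reproduced without contextvars at all. Staying in Python rather than switching to Go or Node, here is a minimal sketch using threading.local, the older ambient mechanism; the ambient and worker_reads names are illustrative only.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

ambient = threading.local()

def worker_reads():
    # Thread-local attributes exist only on the thread that set them.
    return getattr(ambient, "tenant", "<unset>")

ambient.tenant = "acme"  # bound on the submitting thread
with ThreadPoolExecutor(max_workers=1) as pool:
    seen = pool.submit(worker_reads).result()

print("worker sees thread-local ->", seen)
# worker sees thread-local -> <unset>
```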

Pass the data

The tempting fix is to patch the boundary: wrap the executor call with contextvars.copy_context().run(...) so the worker can at least read the loop’s identity. The standard library already ships this fix as a stock part. asyncio.to_thread (Python 3.9+) does exactly that wrapping for you, and it’s strictly better than raw run_in_executor: the worker inherits the loop’s context (verified: it reads loop-value, not <unset>), and because each call runs in its own copy, nothing the worker binds can take up residence in the pool thread. to_thread doesn’t bleed.

And still: what the worker binds dies with the copy. The mutation never reaches the loop, verified in both directions. The primitive that patches the read direction and disarms the bleed still cannot carry a bind back across the boundary, which tells you the boundary isn’t under-implemented, it’s telling you something. Wrapping keeps you doing the thing that caused this: relying on ambient context across a seam designed not to carry it, one that reopens the next time someone calls a vendor SDK through a bare executor three layers down.
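The three to_thread claims can be checked together. In this sketch the default executor is swapped for a single-thread pool purely so that the second request is guaranteed to reuse the first request's thread; request_a and request_b are illustrative names, not part of any API.

```python
import asyncio
import contextvars
from concurrent.futures import ThreadPoolExecutor

pin = contextvars.ContextVar("pin", default="<unset>")

def worker_reads():
    return pin.get()

def worker_binds():
    pin.set("worker-value")  # lands in the per-call context copy
    return "done"

async def request_a():
    pin.set("loop-value")
    inherited = await asyncio.to_thread(worker_reads)  # read direction
    await asyncio.to_thread(worker_binds)              # write direction
    return inherited, pin.get()

async def request_b():
    # A separate request with nothing bound, served by the same pool thread.
    return await asyncio.to_thread(worker_reads)

async def main():
    loop = asyncio.get_running_loop()
    loop.set_default_executor(ThreadPoolExecutor(max_workers=1))
    a = await asyncio.create_task(request_a())
    b = await asyncio.create_task(request_b())
    return a, b

(inherited, after_bind), next_request = asyncio.run(main())
print("worker inherits loop ctx? ->", inherited)
print("loop sees worker's bind?  ->", after_bind)
print("next request sees         ->", next_request)
# worker inherits loop ctx? -> loop-value
# loop sees worker's bind?  -> loop-value
# next request sees         -> <unset>
```

The read works and the bleed is gone, but the loop still holds loop-value after the worker bound worker-value: the bind did not come back.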

The durable fix is to stop treating per-request identity as ambient the moment work becomes async. Capture it as explicit data at the one point where it is unambiguous — the request boundary, on the loop — and thread it into the task as a value: a governance snapshot the task carries, not a contextvar it hopes to read. Ambient context is a convenience for a single synchronous call stack. An async task is a new stack, a new thread, and a later turn. Give it its papers.
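One way this could look, as a sketch rather than a reference design: GovernanceSnapshot, tools_call, tasks_get, and the in-memory TASKS store are all hypothetical names, and the digest string is a placeholder. The point is only the data flow: the snapshot is built once on the loop, handed to the worker as an argument, and persisted with the handle so a later poll has something explicit to authorize against.

```python
import asyncio
import uuid
from dataclasses import dataclass

@dataclass(frozen=True)
class GovernanceSnapshot:
    tenant: str
    principal: str
    tool_digest: str

TASKS: dict[str, GovernanceSnapshot] = {}  # hypothetical durable task store

def blocking_invoke(snapshot: GovernanceSnapshot, payload: str) -> str:
    # Identity arrives as an argument; nothing ambient is read in the worker.
    return f"{snapshot.tenant}:{payload}"

async def tools_call(snapshot: GovernanceSnapshot, payload: str) -> str:
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, blocking_invoke, snapshot, payload)
    handle = uuid.uuid4().hex
    TASKS[handle] = snapshot  # owner persisted as data, not read from context
    return handle

def tasks_get(handle: str, caller: GovernanceSnapshot) -> GovernanceSnapshot:
    owner = TASKS.get(handle)
    if owner is None or owner.tenant != caller.tenant:
        raise PermissionError("task not visible to caller")
    return owner

async def main():
    acme = GovernanceSnapshot("acme", "alice", "sha256:placeholder")
    globex = GovernanceSnapshot("globex", "bob", "sha256:placeholder")
    handle = await tools_call(acme, "ping")
    owner = tasks_get(handle, acme)      # a later poll by the owning tenant
    try:
        tasks_get(handle, globex)        # a poll from another tenant
        denied = False
    except PermissionError:
        denied = True
    return owner, denied

owner, denied = asyncio.run(main())
print(owner.tenant, denied)
# acme True
```

Because the snapshot is frozen and travels by value, it does not matter which executor or thread the invoke lands on, and the pinned digest is available wherever the result is re-verified.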

Contextvars are a caching layer for identity, and an async task is a cache invalidation you didn’t know you triggered. When the work outlives the stack, pass the data.

Three questions for your next architecture review

1. Find every thread-pool hop on your invoke path, and sort them: to_thread lets the worker read your context and contains its binds; raw run_in_executor does neither. The worker reads nothing, and what it binds squats in a reused thread for the next request to inherit. For each hop: what does governance code inside it read, what does it bind that something downstream expects to see, and which request gets that bind instead?
2. When a task is minted in your stack, is its owner an explicit field captured at the request boundary, or a value someone reads from context at creation time and assumes was there? Could you tell the difference in production, given that both look like a passing check?
3. What does your policy engine do with an empty principal? If the answer is anything but a hard deny with a loud log, then every context hole in your stack is a fail-open, and you’ve just read about one that ships in July.
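For the third question, "hard deny with a loud log" could be as small as the following sketch. The authorize function, the GovernanceNotBound exception, and the logger name are hypothetical; a real policy engine will have its own shapes. What matters is that an empty identity raises instead of being compared.

```python
import logging
from typing import Optional

log = logging.getLogger("governance")

class GovernanceNotBound(PermissionError):
    """Raised when a check is asked to run with no identity bound."""

def authorize(tenant: Optional[str], principal: Optional[str],
              resource_tenant: str) -> bool:
    # An empty identity is never compared against policy: it is an error.
    if not tenant or not principal:
        log.error("governance not bound: tenant=%r principal=%r",
                  tenant, principal)
        raise GovernanceNotBound("identity missing at authorization time")
    return tenant == resource_tenant

print(authorize("acme", "alice", "acme"))
# True
```

With this in place, the empty-context runs described earlier surface as exceptions and error logs rather than as passing checks.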

The check that fails is a bug. The check that runs against nothing is an architecture. Async didn’t break your governance — it revealed that your governance was a variable you hoped would still be there. It won’t be. Pass the data.

Disclosure: I build MCP Hangar in this space — an MIT-licensed governance layer that sits on the MCP call path, which is exactly where the identity in this post has to be captured before it can be threaded anywhere. The entire project is at github.com/mcp-hangar/mcp-hangar. I’m not pitching it here — but it shapes what I notice, and you should know that.