FastAPI + JoinableQueue + Threads = Deadlock

What happened?

My FastAPI server froze and every incoming request hung indefinitely. The periodic logging loop kept running because it lives on another thread.

Investigation

Reproduce code

import itertools
import logging
import threading
import time
from multiprocessing import JoinableQueue

import uvicorn
from fastapi import FastAPI, Query

logging.basicConfig(
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
    level=logging.INFO,
)
logger = logging.getLogger("stuck-server")

app = FastAPI()
task_queue = JoinableQueue(maxsize=1)
worker_should_run = threading.Event()
worker_should_run.set()
threads_started = threading.Event()
task_counter = itertools.count()


def log_heartbeat() -> None:
    while True:
        logger.info("heartbeat: FastAPI process is still logging")
        time.sleep(2)


def drain_queue() -> None:
    while True:
        worker_should_run.wait()
        task_id = task_queue.get()
        logger.info("worker consumed %s", task_id)
        time.sleep(0.2)
        task_queue.task_done()


@app.on_event("startup")
def start_threads() -> None:
    if threads_started.is_set():
        return
    threading.Thread(target=drain_queue, daemon=True).start()
    threading.Thread(target=log_heartbeat, daemon=True).start()
    threads_started.set()
    logger.info("background threads started")


@app.get("/transcribe")
async def transcribe(
    block: bool = Query(
        False,
        description=(
            "When true, pause the worker and enqueue tasks until JoinableQueue.put() "
            "blocks forever, freezing the server."
        ),
    ),
) -> dict[str, str]:
    task_id = f"task-{next(task_counter)}"
    if block:
        worker_should_run.clear()
        logger.info("pausing worker and filling queue before blocking")
        # the worker is likely parked in get(); feed it one warmup item so it
        # finishes that cycle and then blocks at worker_should_run.wait()
        task_queue.put(f"{task_id}-warmup")
        time.sleep(1)  # let the worker consume the warmup item and park
        task_queue.put(f"{task_id}-filler")  # fills the queue (maxsize=1); nothing drains it
        logger.info("queue filled; the next put call will block forever")
        task_queue.put(f"{task_id}-blocking")
        return {"status": "unreachable"}  # pragma: no cover
    worker_should_run.set()
    task_queue.put(task_id)
    logger.info("task submitted normally")
    return {"status": "enqueued", "task_id": task_id}


@app.get("/healthz")
async def healthz() -> dict[str, str]:
    return {"status": "ok"}


if __name__ == "__main__":
    uvicorn.run(
        "stuck_server:app",
        host="127.0.0.1",
        port=8001,
        log_level="info",
    )

Guiding question

Do you understand how this happens?

How did the deadlock form?

The main thread runs the asyncio event loop. Request handlers hand work to a worker thread via a multiprocessing.JoinableQueue. When that worker dies (the repro simulates this by pausing it), nothing drains the queue, the queue fills up, and the next synchronous put() call blocks forever. Because that put() runs on the event-loop thread, the entire FastAPI server freezes at that point. The logging heartbeat runs on a separate thread, so it keeps printing even while the main thread is wedged.

Background

How OS manages many I/Os efficiently

Operating systems provide mechanisms for waiting on many I/O sources efficiently. Without them, the CPU would sit idle while a single task blocks.

Inefficient case:

Task A
|
|---- IO wait (3 seconds) ----|
|
rest of Task A

In this case the CPU cannot do useful work until the I/O finishes.


More efficient case:

Task A
|
| STOP (await)
|
Task B runs
Task C runs
Task D runs
|
IO completes
|
Task A resumes

While Task A waits on I/O, the scheduler can run other tasks, making much better use of the CPU. Having a mechanism to run other work while a task waits on I/O is therefore crucial.

Linux I/O model

Linux provides an API named epoll that can wait on a large number of file descriptors efficiently. epoll is a kernel-level I/O event notification API; it lets you ask the kernel to watch many descriptors and tell you when they become ready.

Linux also follows the well-known design idea:

Everything is a file

Most I/O resources are managed as file descriptors (fd), which are just integers. Examples include:

  • files
  • sockets (network connections)
  • pipes
  • terminals
  • timers
  • event notification handles

When you open a socket, the OS returns an fd, e.g.

socket fd = 42

You then use that fd to issue I/O calls:

read(fd)
write(fd)

Role of epoll

epoll watches the state of each registered fd.

A program typically:

  1. registers the fds it wants to monitor
  2. calls epoll_wait() to wait for those fds to become ready

epoll_wait()

When the program calls epoll_wait(), the OS watches for readiness:

fd ready ?

and puts the process to sleep until something becomes ready.

If a network packet arrives, the kernel marks the socket as readable:

socket fd → READABLE

At that point epoll_wait() returns with the list of ready fds, for example:

ready fds:
- socket_1
- socket_7
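
Python exposes this API directly as select.epoll (Linux-only). A minimal sketch, using a socketpair to stand in for a real network connection:

```python
import select
import socket

# a connected pair of sockets stands in for a network connection
r, w = socket.socketpair()

ep = select.epoll()
ep.register(r.fileno(), select.EPOLLIN)  # 1. register the fd for readability

w.send(b"ping")  # simulate a packet arriving on the watched socket

# 2. epoll_wait(): sleeps until a registered fd becomes ready
events = ep.poll(timeout=1)
for fd, event in events:
    data = r.recv(1024)
    print(f"fd {fd} is READABLE: {data!r}")

ep.unregister(r.fileno())
ep.close()
```

ep.poll() returns (fd, eventmask) pairs, matching the "ready fds" list above.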

Event loop

When epoll_wait() returns, user-space code resumes. In most asynchronous programs an event loop performs this work.

Typical flow:

while True:
    run_ready_tasks()
    events = epoll_wait()
    resume_tasks_for(events)

This enables the efficient cycle of

I/O wait
↓
run other tasks
↓
I/O complete
↓
resume original task

In Linux:

  • most I/O surfaces are exposed as file descriptors (fd)
  • epoll lets the kernel watch many fds efficiently
  • epoll_wait() reports which fd is ready
  • the event loop restarts the task that was waiting on that fd

All of this allows the CPU to keep working on other tasks while one task waits for I/O.
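
A toy version of that loop can be sketched with the selectors module (epoll under the hood on Linux) and a socketpair as the only I/O source; the task names here are illustrative, not a real framework:

```python
import socket
import selectors
from collections import deque

sel = selectors.DefaultSelector()     # epoll under the hood on Linux
ready = deque()                       # the "Ready Tasks" queue
log = []                              # records the order things ran in

r, w = socket.socketpair()            # stands in for a network connection
r.setblocking(False)

def task_a_resume():
    # runs once the fd is readable again
    log.append(("resumed", r.recv(1024)))

# Task A is "waiting on I/O": its continuation is registered with the selector
sel.register(r, selectors.EVENT_READ, task_a_resume)

def task_b():
    log.append(("ran", "task B"))
    w.send(b"hello")                  # this completes Task A's I/O

ready.append(task_b)

for _ in range(2):                    # two passes suffice for this demo
    while ready:                      # run_ready_tasks()
        ready.popleft()()
    for key, _ in sel.select(timeout=0.5):   # epoll_wait()
        ready.append(key.data)               # resume_tasks_for(events)
```

Task B runs first; the selector then reports the fd ready, and the loop re-queues Task A's continuation.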

How the event loop works

The event loop is implemented in Python, not in the OS. Conceptually it looks like this:

+----------------------+
|      Event Loop      |
+----------------------+
     |            |
     |            |
Ready Tasks     epoll
 (Queue)      (waiting)

  • Ready Tasks → tasks that can run immediately
  • epoll → file descriptors that are currently waiting for I/O

Consider this example:

Task A
|
| STOP (await)
|
Task B runs
Task C runs
Task D runs
|
IO completes
|
Task A resumes

When Task A executes an await, the following happens:

  1. Task A enters the ready queue.
  2. The event loop schedules Task A.
  3. Task A hits await and suspends.
  4. The socket fd (say 42) backing that await is registered with epoll.
  5. Control returns to the event loop.
  6. The ready queue now contains [Task B, Task C, Task D].
  7. The loop runs B, C, and D.
  8. More network data arrives and marks fd 42 READABLE.
  9. The event loop knows fd 42 belongs to Task A, so it re-queues Task A.
  10. Task A eventually runs again and continues.
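
The same ordering can be observed with plain asyncio, using asyncio.sleep to stand in for the socket I/O:

```python
import asyncio

order = []

async def task_a():
    order.append("A starts")
    await asyncio.sleep(0.1)   # suspend; the loop runs other tasks meanwhile
    order.append("A resumes")

async def other(name):
    order.append(f"{name} runs while A is suspended")

async def main():
    # gather schedules all four tasks on the same event loop
    await asyncio.gather(task_a(), other("B"), other("C"), other("D"))

asyncio.run(main())
print(order)
```

A starts first, suspends at its await, B/C/D run, and A resumes last.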

What is a coroutine?

A coroutine is a function that can pause and later resume. A regular function runs straight through once called; a coroutine can yield control mid-way.

Python's await keyword pauses the currently running coroutine and hands control back to the event loop. The event loop runs other tasks and resumes the suspended coroutine when its awaited I/O (or other awaited operation) completes.

async def task_a():
    data = await sock.recv()  # assumes sock is an async-capable socket (e.g. one from an async I/O library)
    print(data)

The await pauses task_a:

task_a()
  |
  await sock.recv()
  |
  pause

Putting it all together:

Event Loop
   |
run(Task A)
   |
Coroutine A
   |
await future
   |
pause
   |
return control
   |
Event Loop

socket readable
      |
epoll_wait returns
      |
Future done
      |
Task A re-enters ReadyQueue
      |
Event Loop
      |
resume coroutine
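
The future-based flow above can be reproduced with asyncio directly; set_result plays the role of "epoll_wait returns / Future done":

```python
import asyncio

async def coroutine_a(fut):
    value = await fut            # pause; control returns to the event loop
    return f"resumed with {value}"

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    task = asyncio.create_task(coroutine_a(fut))
    await asyncio.sleep(0)       # let coroutine_a run up to its await
    fut.set_result("data")       # Future done -> Task A re-enters the ready queue
    return await task

result = asyncio.run(main())
print(result)
```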

How JoinableQueue works

multiprocessing.JoinableQueue provides a task queue plus synchronization for tracking task completion. Unlike a normal Queue, it lets a producer wait until every enqueued task has been processed.

This structure shows up frequently in parallel programs:

Main Process
    |
    v
Producer → Queue → Worker1
                   Worker2
                   Worker3

Common examples include map-reduce jobs, batch processing, crawlers, and generic task executors.

JoinableQueue acts as the synchronization primitive that lets the producer wait until all workers finish their tasks.

If a task in the queue never gets processed (so no one calls task_done()), join() blocks forever. In practice the program appears stuck — a deadlock.
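
A minimal happy-path sketch of that contract, using a thread as the worker for brevity (JoinableQueue also works across processes):

```python
import threading
from multiprocessing import JoinableQueue

q = JoinableQueue()
processed = []

def worker():
    while True:
        item = q.get()
        processed.append(item)
        q.task_done()          # every get() must be paired with a task_done()

threading.Thread(target=worker, daemon=True).start()

for i in range(3):
    q.put(i)

q.join()                       # blocks until task_done() was called for each item
print("all tasks processed:", sorted(processed))
```

If the worker ever stopped calling task_done(), the q.join() line would block forever.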

Failure analysis

Why did the event loop hang completely?

  1. The worker thread consuming the JoinableQueue dies (in the repro, it is paused and never resumes).
  2. The async handler still tries to synchronously enqueue work.
  3. That put() call blocks once the queue fills up, so the handler never returns.
  4. Because the handler runs on the main event-loop thread, the thread stops processing events.

Why does put() on the queue hang?

Because the worker that should be draining the JoinableQueue is dead, nothing consumes the items, the queue reaches its capacity, and the next put() call blocks.

Summary

The asyncio event loop is blocked by a synchronous hang inside an async handler. To prevent a repeat:

  1. Give queue.put() a timeout and handle queue.Full (log, drop, or retry) so the event loop never blocks forever.
  2. Add a health check/restart for the worker thread so the queue keeps draining; if the worker dies, the main loop can fail fast instead of wedging.
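
A sketch of remedy 1: the timeout parameter makes put() raise queue.Full instead of blocking forever, so the handler can log and return an error instead of wedging:

```python
import queue                      # multiprocessing queues raise queue.Full
from multiprocessing import JoinableQueue

task_queue = JoinableQueue(maxsize=1)
task_queue.put("first")           # fills the queue; no worker is draining it

try:
    task_queue.put("second", timeout=0.5)   # bounded wait instead of forever
    outcome = "enqueued"
except queue.Full:
    outcome = "rejected: queue full, worker is not draining"

print(outcome)
```

Note that even a bounded put() still blocks the event-loop thread for up to the timeout; in an async handler it is safer to hand the call to a thread, e.g. via loop.run_in_executor.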