
Job lifecycle & realtime delivery

Sending a message to a tmate triggers a multi-stage pipeline involving FastAPI, Celery, Redis, and WebSockets. Understanding this flow helps you debug delays and build reliable integrations.

1. Request intake

  1. Client calls POST /v1/chats or POST /v1/chats/{thread_id}/messages.
  2. FastAPI (app/api/routes/chats.py) validates the Supabase user, thread access, and attachment metadata.
  3. The API stores the user message, touches the thread, and enqueues a background job if a tmate needs to respond.

2. Job creation

  • Jobs are recorded in the database with a unique job_id.
  • Celery enqueues run_agent_job defined in app/worker/tasks.py.
  • Metadata includes agent_key, thread_id, user_id, attachments, and optional session_id.
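
As a rough illustration of the metadata listed above, the sketch below models a job record as a dataclass. The class name JobMetadata, the helper to_task_kwargs, the example values, and the use of a random UUID for job_id are all assumptions made for this example; the real schema in app/worker/tasks.py may differ.

```python
import uuid
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class JobMetadata:
    """Hypothetical container for the metadata fields listed above."""

    agent_key: str
    thread_id: str
    user_id: str
    attachments: list = field(default_factory=list)
    session_id: Optional[str] = None  # optional, as noted in the list above
    # Assumption: a random UUID is one simple way to obtain a unique job_id.
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def to_task_kwargs(meta: JobMetadata) -> dict:
    """Flatten the metadata into plain keyword arguments for a task call."""
    return asdict(meta)


# Example values only; "researcher" is not a real agent key from the docs.
meta = JobMetadata(agent_key="researcher", thread_id="t-1", user_id="u-1")
task_kwargs = to_task_kwargs(meta)
```

A dictionary like task_kwargs is the kind of payload that could be handed to Celery when enqueuing run_agent_job, although the task's actual signature is not shown in this guide.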

3. Worker execution

Inside run_agent_job:

  1. Resolve the user context (resolve_user_context) to load org info, enabled tmates, and billing plan.
  2. Apply the user context to environment variables so downstream tools can read USER_ID, ENABLED_AGENTS, etc.
  3. Instantiate the agent via create_agent and call agent.run(...) (which in turn calls run_api → TmatesAgentsSDK).
  4. Record usage if billing is enabled (BillingManager.record_usage).
  5. Clean up the response text using _strip_attachment_links so it does not duplicate direct download URLs.
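
Step 2 can be pictured with the following minimal sketch, which writes a user context into environment variables for the duration of a job. The function name applied_user_context, the comma-separated encoding of ENABLED_AGENTS, and the restore-on-exit behavior are assumptions for illustration, not the worker's confirmed implementation.

```python
import os
from contextlib import contextmanager


@contextmanager
def applied_user_context(user_id: str, enabled_agents: list):
    """Expose the user context via environment variables while a job runs."""
    updates = {
        "USER_ID": user_id,
        # Assumption: the list of enabled agents is stored comma-separated.
        "ENABLED_AGENTS": ",".join(enabled_agents),
    }
    # Remember prior values so they can be restored afterwards.
    previous = {key: os.environ.get(key) for key in updates}
    os.environ.update(updates)
    try:
        yield
    finally:
        for key, old_value in previous.items():
            if old_value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old_value
```

Restoring the previous values is a design choice of this sketch: worker processes are typically long-lived, so leaving one user's USER_ID in place could leak into the next job handled by the same process.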

If the worker fails with a transient DB error, Celery retries the job using the retry policies defined near the top of tasks.py.
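
The retry behavior can be approximated in plain Python as shown below. This is an analogue for reasoning about timing, not the policy in tasks.py: the exception class TransientDBError, the helper run_with_retries, and the retry count and delays are all hypothetical. Celery itself offers comparable behavior through task options such as autoretry_for and retry_backoff.

```python
import time


class TransientDBError(Exception):
    """Hypothetical stand-in for a transient database error."""


def run_with_retries(fn, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call fn, retrying on TransientDBError with exponential backoff.

    The retry limit and delays are illustrative values, not the real policy.
    """
    attempt = 0
    while True:
        try:
            return fn()
        except TransientDBError:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the error
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
            attempt += 1
```

When a job seems slow rather than stuck, backoff of this kind is one possible explanation: each retry adds a growing delay before the next attempt.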

4. Posting results back to the API

Workers can’t talk to WebSockets directly, so they call internal endpoints:

  • _post_chat_status_to_api → POST /v1/internal/chat-status for “started”, “in_progress”, or progress updates.
  • _post_chat_result_to_api → POST /v1/internal/agent-result with the final response text and attachments.

app/api/routes/agent_results.py validates the payload, persists the assistant message, and emits WebSocket events.
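
To make the callback shape concrete, the sketch below builds (without sending) the two internal requests using only the standard library. The endpoint paths come from the list above; the helper names, the payload field names (job_id, status, text, attachments), and the default base URL are assumptions, and any authentication the internal endpoints require is not shown.

```python
import json
import os
import urllib.request

# APP_BASE_URL is the setting named in the troubleshooting checklist;
# the localhost fallback is an assumption for local experiments.
APP_BASE_URL = os.environ.get("APP_BASE_URL", "http://localhost:8000")


def build_internal_post(path, payload, base_url=APP_BASE_URL):
    """Build a JSON POST request for an internal endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def build_status_request(job_id, status, base_url=APP_BASE_URL):
    """Status update, e.g. "started" or "in_progress" (field names assumed)."""
    payload = {"job_id": job_id, "status": status}
    return build_internal_post("/v1/internal/chat-status", payload, base_url)


def build_result_request(job_id, text, attachments, base_url=APP_BASE_URL):
    """Final result with response text and attachments (field names assumed)."""
    payload = {"job_id": job_id, "text": text, "attachments": attachments}
    return build_internal_post("/v1/internal/agent-result", payload, base_url)
```

Because the worker reaches the API over HTTP, a wrong base URL makes these calls fail even when the job itself succeeds, which is why the base URL is worth checking first when results never appear.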

5. WebSocket fan-out

app/api/routes/websocket.py keeps a ConnectionManager keyed by user ID. When a new message arrives:

  1. The API serializes the message into a payload that mirrors REST responses.
  2. Every active socket for that user receives { "type": "new_message", ... }.
  3. If the socket write fails (network error), the connection is pruned and clients must reconnect.

Status updates follow the same path with "type": "chat_status", allowing clients to show typing indicators or progress meters.
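
The fan-out and pruning behavior described above might look roughly like this. It is a simplified sketch, not the code in app/api/routes/websocket.py: the method names are assumptions, and the sockets are assumed to expose an async send_json method, as Starlette's WebSocket does. FakeSocket and BrokenSocket are test stand-ins only.

```python
import asyncio
from collections import defaultdict


class ConnectionManager:
    """Simplified sketch: tracks open sockets per user ID."""

    def __init__(self):
        self._sockets = defaultdict(set)

    def connect(self, user_id, socket):
        self._sockets[user_id].add(socket)

    def active_count(self, user_id):
        return len(self._sockets.get(user_id, ()))

    async def broadcast(self, user_id, event):
        """Send event to every socket of the user; prune sockets that fail."""
        delivered = 0
        # Iterate over a copy so failed sockets can be removed safely.
        for socket in list(self._sockets.get(user_id, ())):
            try:
                await socket.send_json(event)
                delivered += 1
            except Exception:
                # Write failed: drop the connection; the client must reconnect.
                self._sockets[user_id].discard(socket)
        return delivered


class FakeSocket:
    """Stand-in socket that records what it was sent."""

    def __init__(self):
        self.sent = []

    async def send_json(self, event):
        self.sent.append(event)


class BrokenSocket:
    """Stand-in socket that simulates a network failure."""

    async def send_json(self, event):
        raise ConnectionError("socket closed")
```

For example, after registering one FakeSocket and one BrokenSocket for the same user, awaiting broadcast delivers the event to the healthy socket and leaves only that socket registered.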

6. Attachments

  • Agents call register_generated_attachments(job_id, attachments) to stage files while they upload to storage.
  • _post_chat_result_to_api attaches those entries to the response so clients know which files to fetch from /v1/files/download/....
  • _strip_attachment_links removes raw download URLs accidentally left in the text body (“download here: https://…”) to prevent leaking credentials.
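
A link-stripping pass of the kind described in the last bullet could be sketched with a regular expression. The pattern below, which matches any URL containing /v1/files/download/, is an assumption; the real _strip_attachment_links may match different URLs or handle surrounding text differently.

```python
import re

# Assumption: download links are recognizable by the /v1/files/download/ path.
_DOWNLOAD_URL = re.compile(r"https?://\S*/v1/files/download/\S*")


def strip_attachment_links(text: str) -> str:
    """Remove raw download URLs from a response body (illustrative only)."""
    cleaned = _DOWNLOAD_URL.sub("", text)
    # Collapse runs of spaces left behind by the removal.
    return re.sub(r"[ \t]{2,}", " ", cleaned).strip()


example = strip_attachment_links(
    "Report ready. Download here: "
    "https://api.example.com/v1/files/download/abc?token=secret"
)
```

Unrelated URLs pass through untouched in this sketch, so ordinary links in the agent's reply are preserved.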

7. Troubleshooting checklist

Symptom | Where to look
Job never completes | Check Celery worker logs and Redis connectivity; verify CELERY_BROKER_URL.
WebSocket doesn’t update | Ensure worker is calling _post_chat_result_to_api; confirm the API host can reach itself (correct APP_BASE_URL).
Attachments missing | Confirm the agent registered them and the storage backend is configured; inspect _strip_attachment_links logs.
Duplicate responses | Look for retries in Celery logs; idempotency relies on the internal API ignoring duplicate job_ids.
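
The idempotency rule behind the duplicate-responses case can be illustrated with a tiny in-memory model. ResultStore and its accept method are hypothetical; the real internal API presumably enforces this against the database, for instance with a uniqueness check on job_id, but that mechanism is not documented here.

```python
class ResultStore:
    """In-memory sketch of accepting each job_id at most once."""

    def __init__(self):
        self._results = {}

    def accept(self, job_id: str, text: str) -> bool:
        """Store the first result for a job_id; ignore later duplicates."""
        if job_id in self._results:
            return False  # duplicate delivery, e.g. after a Celery retry
        self._results[job_id] = text
        return True

    def get(self, job_id: str):
        return self._results.get(job_id)
```

If duplicate replies still reach clients, one thing to check is whether the retried job was issued a new job_id, since a check like this can only deduplicate on identical IDs.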

Follow these stages end to end whenever you diagnose slow or missing replies from tmates.