Job lifecycle & realtime delivery
Sending a message to a tmate triggers a multi-stage pipeline involving FastAPI, Celery, Redis, and WebSockets. Understanding this flow helps you debug delays and build reliable integrations.
1. Request intake
- Client calls POST /v1/chats or POST /v1/chats/{thread_id}/messages.
- FastAPI (app/api/routes/chats.py) validates the Supabase user, thread access, and attachment metadata.
- The API stores the user message, touches the thread, and enqueues a background job if a tmate needs to respond.
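From the client's side, the two entry points differ only in the path. The sketch below builds (but does not send) such a request. It assumes that POST /v1/chats starts a new thread, that the Supabase session travels as a bearer token, and that the body carries a content field; the host, header, and body shape are illustrative assumptions, not part of the documented API.

```python
import json
import urllib.request
from typing import Optional

API_BASE = "https://api.example.com"  # hypothetical host


def build_message_request(token: str, content: str,
                          thread_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the POST request for a user message."""
    # Assumption: omitting thread_id means "start a new chat".
    path = f"/v1/chats/{thread_id}/messages" if thread_id else "/v1/chats"
    body = json.dumps({"content": content}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        API_BASE + path,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it; keeping construction separate makes the routing choice easy to inspect while debugging.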
2. Job creation
- Jobs are recorded in the database with a unique job_id.
- Celery enqueues run_agent_job, defined in app/worker/tasks.py.
- Metadata includes agent_key, thread_id, user_id, attachments, and an optional session_id.
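The metadata listed above can be pictured as a small record that is flattened into task arguments. This is a minimal sketch: JobMetadata and to_task_kwargs are hypothetical names, and the UUID format for job_id is an assumption (the source only says the ID is unique).

```python
import uuid
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class JobMetadata:
    """Hypothetical container for the fields a job carries."""
    agent_key: str
    thread_id: str
    user_id: str
    attachments: List[Dict[str, Any]] = field(default_factory=list)
    session_id: Optional[str] = None  # optional, per the list above
    # Assumption: job_id is a UUID string.
    job_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def to_task_kwargs(meta: JobMetadata) -> Dict[str, Any]:
    """Flatten the metadata into JSON-friendly keyword arguments."""
    return asdict(meta)
```

Keeping the payload to plain strings, lists, and dicts matters because task arguments are serialized before they reach the broker.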
3. Worker execution
Inside run_agent_job:
- Resolve the user context (resolve_user_context) to load org info, enabled tmates, and billing plan.
- Apply the user context to environment variables so downstream tools can read USER_ID, ENABLED_AGENTS, etc.
- Instantiate the agent via create_agent and call agent.run(...) (which in turn calls run_api → TmatesAgentsSDK).
- Record usage if billing is enabled (BillingManager.record_usage).
- Clean up attachments using _strip_attachment_links to avoid duplicating direct download URLs.
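The environment-variable step can be sketched as follows. The function names are hypothetical, and the comma-separated encoding of ENABLED_AGENTS is an assumption; only the variable names USER_ID and ENABLED_AGENTS come from the steps above.

```python
import os
from typing import Iterable, List, Mapping, MutableMapping


def apply_user_context(user_id: str, enabled_agents: Iterable[str],
                       env: MutableMapping[str, str] = os.environ) -> None:
    """Export the user context so downstream tools can read it."""
    env["USER_ID"] = user_id
    # Assumption: the agent list is serialized as a comma-separated string.
    env["ENABLED_AGENTS"] = ",".join(enabled_agents)


def read_enabled_agents(env: Mapping[str, str] = os.environ) -> List[str]:
    """What a downstream tool might do to recover the list."""
    raw = env.get("ENABLED_AGENTS", "")
    return [name for name in raw.split(",") if name]
```

Environment variables are process-wide, so values set for one job remain visible to whatever runs next in the same worker process until they are overwritten.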
If the worker fails with a transient DB error, Celery retries the job using the retry policies defined near the top of tasks.py.
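To make the retry behaviour concrete, here is a plain-Python stand-in for retry with exponential backoff. It is not Celery's API and not the policy defined in tasks.py; TransientDBError, the retry count, and the delays are all illustrative assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientDBError(Exception):
    """Hypothetical stand-in for a transient database failure."""


def run_with_retries(fn: Callable[[], T], max_retries: int = 3,
                     base_delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying transient errors with exponential backoff."""
    attempt = 0
    while True:
        try:
            return fn()
        except TransientDBError:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the failure
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s with the defaults
            attempt += 1
```

Because a retried job runs the whole task body again, any side effect before the failure point (such as a status post) may happen more than once.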
4. Posting results back to the API
Workers can’t talk to WebSockets directly, so they call internal endpoints:
- _post_chat_status_to_api → POST /v1/internal/chat-status for “started”, “in_progress”, or progress updates.
- _post_chat_result_to_api → POST /v1/internal/agent-result with the final response text and attachments.
app/api/routes/agent_results.py validates the payload, persists the assistant message, and emits WebSocket events.
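The two internal calls carry different bodies. The sketch below shows one plausible shape for each; every field name here is an assumption made for illustration, since the actual schemas live in the API routes and are not reproduced in this document.

```python
from typing import Any, Dict, List, Optional


def build_status_payload(job_id: str, thread_id: str, status: str,
                         detail: Optional[str] = None) -> Dict[str, Any]:
    """Body a worker might send to POST /v1/internal/chat-status."""
    payload: Dict[str, Any] = {
        "job_id": job_id,
        "thread_id": thread_id,
        "status": status,  # e.g. "started" or "in_progress"
    }
    if detail is not None:
        payload["detail"] = detail  # optional free-form progress text
    return payload


def build_result_payload(job_id: str, thread_id: str, text: str,
                         attachments: Optional[List[Dict[str, Any]]] = None
                         ) -> Dict[str, Any]:
    """Body a worker might send to POST /v1/internal/agent-result."""
    return {
        "job_id": job_id,
        "thread_id": thread_id,
        "text": text,
        "attachments": list(attachments or []),
    }
```

Including job_id in both bodies lets the API correlate status updates with the final result and recognize repeated deliveries.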
5. WebSocket fan-out
app/api/routes/websocket.py keeps a ConnectionManager keyed by user ID. When a new message arrives:
- The API serializes the message into a payload that mirrors REST responses.
- Every active socket for that user receives { "type": "new_message", ... }.
- If the socket write fails (network error), the connection is pruned and clients must reconnect.
Status updates follow the same path with "type": "chat_status", allowing clients to show typing indicators or progress meters.
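The fan-out and pruning behaviour described above can be sketched with a small in-memory manager. This is an illustration, not the code in websocket.py: the method names are assumptions, and FakeSocket is a test double standing in for a real WebSocket object that exposes an async send_json method.

```python
import asyncio
from collections import defaultdict
from typing import Any, Dict, List, Set


class FakeSocket:
    """Test double for a WebSocket connection."""

    def __init__(self, broken: bool = False) -> None:
        self.broken = broken
        self.sent: List[Dict[str, Any]] = []

    async def send_json(self, payload: Dict[str, Any]) -> None:
        if self.broken:
            raise ConnectionError("socket closed")
        self.sent.append(payload)


class ConnectionManager:
    """Tracks open sockets per user ID and fans messages out to them."""

    def __init__(self) -> None:
        self._sockets: Dict[str, Set[Any]] = defaultdict(set)

    def connect(self, user_id: str, socket: Any) -> None:
        self._sockets[user_id].add(socket)

    def active(self, user_id: str) -> int:
        return len(self._sockets.get(user_id, ()))

    async def send_to_user(self, user_id: str, payload: Dict[str, Any]) -> int:
        """Send to every socket for the user; return how many succeeded."""
        delivered = 0
        # Iterate over a copy so failed sockets can be pruned mid-loop.
        for socket in list(self._sockets.get(user_id, ())):
            try:
                await socket.send_json(payload)
                delivered += 1
            except Exception:
                # Write failed: drop the connection; the client must reconnect.
                self._sockets[user_id].discard(socket)
        return delivered


async def demo() -> int:
    manager = ConnectionManager()
    manager.connect("user-1", FakeSocket())
    manager.connect("user-1", FakeSocket(broken=True))
    return await manager.send_to_user("user-1", {"type": "chat_status"})


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A manager like this lives in one process's memory, so a client only receives events from the API process that holds its socket.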
6. Attachments
- Agents call register_generated_attachments(job_id, attachments) to stage files while they upload to storage.
- _post_chat_result_to_api attaches those entries to the response so clients know which files to fetch from /v1/files/download/....
- _strip_attachment_links removes raw download URLs accidentally left in the text body (“download here: https://…”) to prevent leaking credentials.
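The link-stripping step might look roughly like the sketch below. It is not the real _strip_attachment_links; the function name and the pattern are assumptions, namely that a raw download URL is any http(s) link whose path contains /v1/files/download/.

```python
import re

# Assumption: raw download URLs are http(s) links containing this path segment.
_DOWNLOAD_URL = re.compile(r"https?://\S*/v1/files/download/\S*")


def strip_attachment_links(text: str) -> str:
    """Remove raw download URLs and tidy the whitespace they leave behind."""
    cleaned = _DOWNLOAD_URL.sub("", text)
    # Collapse runs of spaces or tabs left where a URL used to be.
    cleaned = re.sub(r"[ \t]{2,}", " ", cleaned)
    return cleaned.strip()
```

Links that do not match the download pattern pass through untouched, so ordinary references in the reply text survive.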
7. Troubleshooting checklist
| Symptom | Where to look |
|---|---|
| Job never completes | Check Celery worker logs and Redis connectivity; verify CELERY_BROKER_URL. |
| WebSocket doesn’t update | Ensure the worker is calling _post_chat_result_to_api; confirm the API host can reach itself (correct APP_BASE_URL). |
| Attachments missing | Confirm the agent registered them and the storage backend is configured; inspect _strip_attachment_links logs. |
| Duplicate responses | Look for retries in Celery logs; idempotency relies on the internal API ignoring duplicate job_ids. |
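
The duplicate-response row depends on the receiving side dropping repeated deliveries. A minimal in-memory sketch of that idea follows; ResultInbox and its payload shape are hypothetical, and a real deployment would need a durable check (for example a database lookup on job_id) rather than a set held in memory.

```python
from typing import Any, Dict, List, Set


class ResultInbox:
    """In-memory stand-in for an endpoint that ignores duplicate job_ids."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()
        self.messages: List[Dict[str, Any]] = []

    def accept(self, payload: Dict[str, Any]) -> bool:
        """Store the result once; return False for a repeated job_id."""
        job_id = payload["job_id"]
        if job_id in self._seen:
            return False  # retry of a job already handled: ignore it
        self._seen.add(job_id)
        self.messages.append(payload)
        return True
```

If duplicates still reach clients, compare the job_id values in the worker logs: identical IDs point at the deduplication check, different IDs point at the job being enqueued twice.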
Follow these stages end to end whenever you diagnose slow or missing replies from tmates.