# Latency

> What Activepieces adds to a synchronous webhook, warm versus cold, so you can set expectations before building on it

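A **synchronous webhook** holds the connection open until your flow returns, so the caller waits for the orchestration around your flow plus your flow's own work. The numbers on this page are the floor: what the caller waits before your flow does any significant work of its own, with the slow paths (cold start, heavy load, timeout) included.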

<Note>
  **Scope:** the recommended production setup (`SANDBOX_CODE_ONLY`, `AP_REUSE_SANDBOX=true`, one flow per worker), which self-hosted deployments and **dedicated Cloud workers** run. The shared Cloud freemium pool uses a different sandbox. See [Production Setup](/install/configure-operate/production-setup).
</Note>

## What to expect

| Situation                                  | Caller waits                     |
| ------------------------------------------ | -------------------------------- |
| **Warm** (the normal case)                 | **\~0.2 s**                      |
| **Cold** (first request on a fresh worker) | **\~2 s**                        |
| **Under heavy load** (every worker busy)   | **\~0.5 s**                      |
| **Timeout** (flow never responds)          | 30 s, then the connection closes |

## Warm vs cold

<img src="https://mintcdn.com/activepieces/vmaQxJIiR2E_kMnS/resources/diagrams/cold-vs-warm.png?fit=max&auto=format&n=vmaQxJIiR2E_kMnS&q=85&s=4ddd3d230a8965d8714d8c2ec9f86e0f" alt="Warm vs cold latency: a warm sync webhook returns in ~0.2 s, while a cold first request on a fresh worker takes ~2 s, dominated by installing pieces (~0.5 s) and booting the engine (~0.9 s)" width="2000" height="938" data-path="resources/diagrams/cold-vs-warm.png" />

**Cold** means a worker that has never seen this flow: it installs the pieces, fetches the bundle, and forks an engine first. **Warm** reuses all three. You hit cold when:

* It is the first request after a deploy, restart, or scale-up.
* `AP_REUSE_SANDBOX` is off (then every request is cold).
* A burst exceeds your warm worker count. Each worker runs one flow at a time, so the surplus queues or starts cold.
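
The burst rule above can be sketched as simple arithmetic. This is an illustrative model, not Activepieces code: `split_burst` and its parameters are hypothetical names, and it assumes the whole burst arrives at once while every warm worker is idle.

```python
def split_burst(burst_size: int, warm_workers: int) -> tuple[int, int]:
    """Split a burst into requests served warm and the surplus.

    Assumption: each worker runs one flow at a time, so at most
    `warm_workers` requests get a warm slot immediately. The surplus
    either queues behind a warm worker or starts cold elsewhere.
    """
    if burst_size < 0 or warm_workers < 0:
        raise ValueError("burst_size and warm_workers must be non-negative")
    served_warm = min(burst_size, warm_workers)
    surplus = burst_size - served_warm
    return served_warm, surplus


if __name__ == "__main__":
    warm, surplus = split_burst(burst_size=12, warm_workers=8)
    print(f"served warm: {warm}, queued or cold: {surplus}")
```

With 12 simultaneous requests and 8 warm workers, 8 are served warm and 4 queue or start cold.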

<Tip>
  Size **statically for peak** so a warm slot is always waiting. Autoscaling's boot lag cannot defend a 30 s sync budget. Cold starts are not Cloud-specific: they are how the engine boots, and you will see them anywhere.
</Tip>

## Your own work usually dominates

A warm call is almost entirely per-step bookkeeping, not your code: a progress callback per step, tens of ms each. A step that calls a third-party API waits on that API, often far longer. Next to a real outbound call, the Activepieces overhead is the smaller number.
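
To see how the two parts compare, here is a rough estimate of the caller's wait. It is a simplified additive model, not how Activepieces computes anything: the floors are the approximate figures from the table under "What to expect", and `estimate_caller_wait` and `platform_share` are hypothetical helper names.

```python
# Approximate platform floors from the "What to expect" table, in seconds.
PLATFORM_FLOOR_S = {"warm": 0.2, "cold": 2.0, "heavy_load": 0.5}

# After this long the connection closes, whatever the flow is doing.
SYNC_TIMEOUT_S = 30.0


def estimate_caller_wait(situation: str, own_work_s: float) -> float:
    """Estimate the caller's wait: platform floor plus your flow's work.

    Assumption: the two simply add up, capped at the sync timeout.
    """
    total = PLATFORM_FLOOR_S[situation] + own_work_s
    return min(total, SYNC_TIMEOUT_S)


def platform_share(situation: str, own_work_s: float) -> float:
    """Fraction of the (uncapped) wait that is platform overhead."""
    floor = PLATFORM_FLOOR_S[situation]
    return floor / (floor + own_work_s)


if __name__ == "__main__":
    # Hypothetical flow whose outbound API call takes 0.8 s.
    wait = estimate_caller_wait("warm", 0.8)
    share = platform_share("warm", 0.8)
    print(f"caller waits ~{wait:.1f} s, platform share ~{share:.0%}")
```

In this model, a warm flow with a 0.8 s outbound call spends about one fifth of the caller's wait on Activepieces overhead and the rest on the third party.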

## Measured (warm)

Four-step flow (webhook, math, code in `isolated-vm`, response), single warm call, no contention:

| p50    | avg    | p95    | p99    |
| ------ | ------ | ------ | ------ |
| 163 ms | 176 ms | 260 ms | 343 ms |
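
To compare your own deployment against this table, you can compute the same statistics from latency samples you collect yourself. The sketch below uses the nearest-rank percentile method, which is an assumption: this page does not say which method produced the figures above. `percentile` and `summarize` are hypothetical helpers.

```python
def percentile(samples_ms: list[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% at or below it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be an integer in 1..100")
    ordered = sorted(samples_ms)
    # Integer ceiling of p * n / 100, which avoids float rounding surprises.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize(samples_ms: list[float]) -> dict[str, float]:
    """Return the same columns as the table: p50, avg, p95, p99."""
    return {
        "p50": percentile(samples_ms, 50),
        "avg": sum(samples_ms) / len(samples_ms),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
    }


if __name__ == "__main__":
    # Placeholder samples for illustration only, not measured data.
    samples = [float(ms) for ms in range(1, 101)]
    print(summarize(samples))
```

Collect samples from single warm calls with no contention if you want figures comparable to the table.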

<Note>
  The same setup measures \~165 ms unloaded but \~505 ms under sustained peak load. The difference is the app tier saturating, not the flow slowing down. Always ask which load a figure was measured under. Throughput view: [Benchmark](/install/architecture/benchmark).
</Note>

## Reduce it

* **Keep workers warm:** `AP_REUSE_SANDBOX=true` plus a statically sized, always-on fleet.
* **Size for peak concurrency:** N concurrent requests need N warm workers ([sizing](/install/configure-operate/production-setup#sizing)).
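* **Fewer, heavier steps:** each step adds its own bookkeeping overhead (a progress callback, tens of ms), so consolidating steps shortens the warm path.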
* **Don't let a slow third party block the response:** use an async webhook and callback instead of holding against the 30 s ceiling.
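
For the sizing rule, one way to estimate peak concurrency is Little's law: concurrent requests are roughly the arrival rate multiplied by how long each request holds a worker. This is a general queueing heuristic, not an Activepieces formula, and `warm_workers_for_peak` and its parameters are hypothetical names. Treat the result as a starting point to validate under load.

```python
import math


def warm_workers_for_peak(peak_requests_per_s: float, wait_per_request_s: float) -> int:
    """Estimate warm workers needed so every peak request finds a warm slot.

    Assumptions: one flow per worker at a time, a steady arrival rate at
    peak, and `wait_per_request_s` covering platform overhead plus your
    flow's own work.
    """
    if peak_requests_per_s < 0 or wait_per_request_s < 0:
        raise ValueError("rate and wait must be non-negative")
    concurrent = peak_requests_per_s * wait_per_request_s
    # Round up, and keep at least one warm worker.
    return max(1, math.ceil(concurrent))


if __name__ == "__main__":
    # Hypothetical peak: 10 requests/s, each holding a worker for 0.5 s.
    print(warm_workers_for_peak(10, 0.5))
```

At 10 requests per second with each request holding a worker for 0.5 s, the estimate is 5 warm workers. Real traffic is bursty, so leave headroom above the estimate.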
