> ## Documentation Index > Fetch the complete documentation index at: https://www.activepieces.com/docs/llms.txt > Use this file to discover all available pages before exploring further. # Latency > What Activepieces adds to a synchronous webhook, warm versus cold, so you can set expectations before building on it A **synchronous webhook** holds the connection open until your flow returns, so the caller waits for the orchestration around your flow plus your flow's own work. These are floor numbers, slow paths included. **Scope:** the recommended production setup (`SANDBOX_CODE_ONLY`, `AP_REUSE_SANDBOX=true`, one flow per worker), which self-hosted deployments and **dedicated Cloud workers** run. The shared Cloud freemium pool uses a different sandbox. See [Production Setup](/install/configure-operate/production-setup). ## What to expect | Situation | Caller waits | | ------------------------------------------ | -------------------------------- | | **Warm** (the normal case) | **\~0.2 s** | | **Cold** (first request on a fresh worker) | **\~2 s** | | **Under heavy load** (every worker busy) | **\~0.5 s** | | **Timeout** (flow never responds) | 30 s, then the connection closes | ## Warm vs cold Warm vs cold latency: a warm sync webhook returns in ~0.2 s, while a cold first request on a fresh worker takes ~2 s, dominated by installing pieces (~0.5 s) and booting the engine (~0.9 s)

Warm vs cold latency: a warm sync webhook returns in ~0.2 s, while a cold first request on a fresh worker takes ~2 s, dominated by installing pieces (~0.5 s) and booting the engine (~0.9 s)

**Cold** means a worker that has never seen this flow: it installs the pieces, fetches the bundle, and forks an engine first. **Warm** reuses all three. You hit cold when: * It is the first request after a deploy, restart, or scale-up. * `AP_REUSE_SANDBOX` is off (then every request is cold). * A burst exceeds your warm worker count. Each worker runs one flow at a time, so the surplus queues or starts cold. Size **statically for peak** so a warm slot is always waiting. Autoscaling's boot lag cannot defend a 30 s sync budget. Cold starts are not Cloud-specific, they are how the engine boots, and you will see them anywhere. ## Your own work usually dominates A warm call is almost entirely per-step bookkeeping, not your code: a progress callback per step, tens of ms each. A step that calls a third-party API waits on that API, often far longer. Next to a real outbound call, the Activepieces overhead is the smaller number. ## Measured (warm) Four-step flow (webhook, math, code in `isolated-vm`, response), single warm call, no contention: | p50 | avg | p95 | p99 | | ------ | ------ | ------ | ------ | | 163 ms | 176 ms | 260 ms | 343 ms | The same setup measures \~165 ms unloaded but \~505 ms under sustained peak: the app tier saturating, not the flow slowing. Always ask which load a figure was measured under. Throughput view: [Benchmark](/install/architecture/benchmark). ## Reduce it * **Keep workers warm:** `AP_REUSE_SANDBOX=true` plus a statically sized, always-on fleet. * **Size for peak concurrency:** N concurrent requests need N warm workers ([sizing](/install/configure-operate/production-setup#sizing)). * **Fewer, heavier steps:** each step adds overhead. * **Don't let a slow third party block the response:** use an async webhook and callback instead of holding against the 30 s ceiling.