Async Data Model

Many Virlo resources are multi-phase: the main job completes quickly, but secondary jobs (AI viral analysis, audience demographics, audience geography, per-video intelligence, tracked-creator reports) keep working in the background. Without explicit signals, a GET /v1/agents/:id could return status: "completed" while analysis_data was still null — with no way to tell whether the null meant "still working" or "never coming."

This page documents the additive fields that solve that problem uniformly across every async resource: finalized, pending_jobs[], and intelligence_status. They are purely informational — every existing field keeps its current shape and meaning, so existing integrations continue to work unchanged.

Everything on this page is additive. You can ignore the new fields entirely and your existing polling loops will keep working. We added them because we heard from teams (especially those building agents on top of the API) that the previous "is this null because it's pending or because it doesn't exist?" ambiguity was the single biggest source of confusion.

The shape of an in-progress response

Every GET that returns a resource with secondary jobs now includes two new top-level fields inside the data envelope:

Agent still analyzing

{
  "data": {
    "...": "...existing agent fields unchanged...",
    "finalized": false,
    "pending_jobs": [
      {
        "type": "viral_analysis",
        "status": "processing",
        "poll_url": "/v1/agents/2b1f.../analysis/latest",
        "result_path": "data.analysis_data",
        "webhook_event": "content_research_agent.run.completed",
        "retry_after_seconds": 15
      }
    ]
  }
}

Legacy Orbit/Comet GETs use the same shape with orbit.run.completed / comet.run.completed webhook events during the migration window.

When the resource is fully ready, the shape collapses to its simplest form:

{
  "data": {
    "...": "...existing fields unchanged...",
    "finalized": true,
    "pending_jobs": []
  }
}

That's it. No new headers, no new response codes, no behavior change on the existing fields.

`finalized`

A single boolean that tells you whether the resource and every secondary job spawned by it are done. Once finalized: true, the response will not change again until you explicitly trigger something (e.g. re-running the search, refreshing the audience snapshot).

`finalized`	What it means
`true`	All work is done. Trust every field. `pending_jobs` will be `[]`.
`false`	At least one job is still running, queued, or being retried. Check `pending_jobs[]` to see which. Some fields you see may still be `null` because their job hasn't completed yet — not because the resource is broken.

We deliberately picked the word finalized over is_complete to avoid colliding with status: "completed", which is about the main job only. A resource can have status: "completed" and finalized: false at the same time — the main scrape finished, but the AI analysis is still running.
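The distinction between the two signals can be made explicit in client code. The following is a minimal sketch, not part of the API: the function name `describe_state` and its three return labels are hypothetical, and it assumes you pass in the contents of the `data` envelope.

```python
def describe_state(data: dict) -> str:
    """Combine the main-job `status` with `finalized` into one label.

    The labels returned here are illustrative names, not API values.
    """
    if data.get("finalized") is True:
        # The main job and every run-level secondary job are done.
        return "finalized"
    if data.get("status") == "completed":
        # Main job finished, but at least one secondary job is still in flight.
        return "main_done_secondary_pending"
    # The main job itself has not completed yet.
    return "main_running"
```

A progress UI could show a different message for each label while still using `finalized` as the only stop condition.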

One scoped exception: per-item intelligence. finalized: true covers the run and its run-level jobs (scrape, AI analysis). On Data Intelligence agents, per-video / per-slideshow intelligence trickles in item by item and can still be pending on some rows shortly after the run finalizes — that's exactly why each item carries its own intelligence_status. Treat finalized as the run-level signal and intelligence_status as the per-item signal; re-fetch the list until the items you need are ready (or terminal).

`pending_jobs[]`

When finalized: false, pending_jobs[] enumerates every secondary job still in flight. Each entry tells you exactly what is pending, where to poll for it, and which webhook will fire when it's done:

Name	Type	Required	Description
`type`	string	Yes	Machine-readable identifier for the pending job. Currently one of: `viral_analysis`, `audience_demographics`, `audience_geography`, `tracking_report`.
`status`	string	Yes	Lifecycle state of the secondary job. One of: `pending` (queued), `processing` (running now), `failed` (last attempt failed, may retry), `expired` (gave up after the retry window).
`job_id`	string	No	Stable identifier for this job when one exists (e.g. audience snapshot jobs). Often absent for jobs that are implicit children of the parent resource.
`poll_url`	string	No	Path you can GET to retrieve this specific piece of data once it's ready. The URL is always free to poll.
`result_path`	string	No	Dot-separated path from the response root (e.g. `data.analysis_data`) that tells you where the result will appear once it's ready. Useful for agents that diff before-and-after payloads.
`webhook_event`	string	No	Name of an event from the existing Supported events list that fires when this job reaches a terminal state. Subscribe via `POST /v1/webhooks` to avoid polling. Every value listed here is an event we already supported; no new event names were introduced for this feature, so existing webhook subscribers keep working as-is.
`started_at`	string	No	ISO-8601 timestamp when the job started (if known).
`retry_after_seconds`	integer	No	Suggested wait before your next poll. Honor this if you can, because it adapts to current load.
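If you prefer typed access over raw dictionaries, the fields above map naturally onto a small data class. This is a sketch under stated assumptions: `PendingJob` and `resolve_result_path` are hypothetical client-side names, and the path resolver assumes `result_path` is a dot-separated path, as in the `data.analysis_data` example.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class PendingJob:
    """Client-side view of one pending_jobs[] entry (hypothetical helper)."""

    type: str
    status: str
    job_id: Optional[str] = None
    poll_url: Optional[str] = None
    result_path: Optional[str] = None
    webhook_event: Optional[str] = None
    started_at: Optional[str] = None
    retry_after_seconds: Optional[int] = None

    @classmethod
    def from_dict(cls, raw: dict) -> "PendingJob":
        # Keep only the keys this sketch knows about, so additive fields
        # introduced later do not break parsing.
        known = {k: raw[k] for k in cls.__dataclass_fields__ if k in raw}
        return cls(**known)


def resolve_result_path(response: dict, result_path: str) -> Any:
    """Walk a dot-separated path from the response root; None if absent."""
    node: Any = response
    for key in result_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node
```

After a re-fetch, `resolve_result_path(response, job.result_path)` returns the populated value once the job has landed, and `None` while the field is still empty.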

If a secondary job ultimately fails or never runs (e.g. the user disabled it), it stops appearing in pending_jobs[] after the retry window closes. You'll see finalized: true and the corresponding result field stays at its terminal value (typically null). This avoids the "infinite pending" trap.
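Put together, these rules let a client decide what a `null` field means. The sketch below is one possible reading, not an official algorithm: `explain_null` and its return labels are hypothetical, and it assumes the full response (including the `data` envelope) is passed in.

```python
def explain_null(response: dict, result_path: str) -> str:
    """Classify why the field at `result_path` is null.

    Returns one of three illustrative labels:
      "pending"  - a job in pending_jobs[] advertises this result_path
      "terminal" - the resource is finalized, so the null will not change
      "unlisted" - not finalized, but no pending job targets this path
    """
    data = response.get("data") or {}
    for job in data.get("pending_jobs") or []:
        if job.get("result_path") == result_path:
            return "pending"
    if data.get("finalized") is True:
        return "terminal"
    return "unlisted"
```

Only the `"pending"` case is worth waiting on; for the other two, nothing in the response says the field is still on its way.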

`intelligence_status` (per-video / per-slideshow)

For agent (and legacy Orbit/Comet) video and slideshow lists, each item carries an intelligence_status field that disambiguates a null intelligence object:

`intelligence_status`	Meaning	What to do
`ready`	The intelligence fields are populated on this item.	Use `intelligence` directly.
`pending`	The resource has `data_intelligence_enabled: true` but this item's intelligence hasn't been computed yet.	Re-fetch later, or subscribe to the relevant `*.run.completed` webhook.
`disabled`	The resource was created without `data_intelligence_enabled`.	Don't expect intelligence on this item. Intelligence cannot be added retroactively — create a new agent with `data_intelligence_enabled: true`.
`failed`	Intelligence was attempted but couldn't complete.	Treat as terminal; no further work will happen automatically.
`skipped`	The item was filtered out of intelligence processing (e.g. unsupported language, missing transcript / panel text).	Treat as terminal.

This eliminates the most common confusion: "I see intelligence: null — does that mean my agent isn't done yet, or that I forgot to enable Data Intelligence?"

Videos and slideshows have separate intelligence pipelines and slightly different field shapes (slideshows analyse panel text instead of transcripts and drop video-only fields like visual_format, camera_perspective, and scene_changed). The intelligence_status enum, polling rules, and lifecycle are identical — so a single envelope-aware client handles both.
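Because the enum is shared, one helper can sort any video or slideshow list into "use now", "re-fetch later", and "stop waiting". This is a minimal sketch: `partition_by_intelligence` is a hypothetical name, and it treats a missing or unrecognized `intelligence_status` as not waitable, which is an assumption rather than documented behavior.

```python
def partition_by_intelligence(items):
    """Split items into (ready, pending, terminal) by intelligence_status.

    ready    - intelligence is populated and can be used directly
    pending  - worth re-fetching later
    terminal - disabled, failed, or skipped; no further work will happen
    """
    ready, pending, terminal = [], [], []
    for item in items:
        status = item.get("intelligence_status")
        if status == "ready":
            ready.append(item)
        elif status == "pending":
            pending.append(item)
        else:
            # Assumption: anything else (including a missing status) is
            # treated as terminal so the caller never waits on it forever.
            terminal.append(item)
    return ready, pending, terminal
```

A re-fetch loop can then stop as soon as the `pending` list is empty, regardless of whether the items are videos or slideshows.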

Polling pattern

The simplest correct loop now looks like this:

Python — wait until everything is ready

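import time

import requests


def fetch_when_finalized(url, headers, timeout_s=900, base_delay_s=15):
    """Poll `url` until the resource reports finalized: true, or raise on timeout."""
    started = time.monotonic()
    while time.monotonic() - started < timeout_s:
        # A per-request timeout keeps one hung connection from outliving timeout_s.
        resp = requests.get(url, headers=headers, timeout=30)
        resp.raise_for_status()  # surface HTTP errors instead of a KeyError on "data"
        res = resp.json()["data"]
        if res.get("finalized") is True:
            return res
        # Honor the smallest retry_after_seconds across pending jobs if present.
        pending = res.get("pending_jobs") or []
        delay = min(
            (j.get("retry_after_seconds") or base_delay_s for j in pending),
            default=base_delay_s,
        )
        time.sleep(delay)
    raise TimeoutError(f"{url} not finalized after {timeout_s}s")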

TypeScript — wait until everything is ready

async function fetchWhenFinalized(
  url: string,
  headers: Record<string, string>,
  { timeoutMs = 900_000, baseDelayMs = 15_000 } = {},
) {
  const started = Date.now()
  while (Date.now() - started < timeoutMs) {
    const res = await fetch(url, { headers })
    if (!res.ok) throw new Error(`${url} returned HTTP ${res.status}`)
    const data = (await res.json()).data ?? {}
    if (data.finalized === true) return data

    // Honor the smallest retry_after_seconds across pending jobs if present;
    // fall back to baseDelayMs only when no job supplies a hint.
    const pending: Array<{ retry_after_seconds?: number }> = data.pending_jobs ?? []
    const hints = pending.map((j) =>
      j.retry_after_seconds ? j.retry_after_seconds * 1000 : baseDelayMs,
    )
    const delayMs = hints.length > 0 ? Math.min(...hints) : baseDelayMs
    await new Promise((r) => setTimeout(r, delayMs))
  }
  throw new Error(`${url} not finalized in time`)
}

This single loop replaces every per-resource "wait until status === completed, then re-fetch and hope analysis is there" hack. It works identically for Content Research Agent, Orbit, Comet, Satellite, and Tracking responses.

You can still poll the legacy status field on the main job if you want progress UI before the resource is finalized. finalized is the end-state signal; status is the main-job signal. Both are accurate.

Webhook pattern (recommended)

Polling works fine, but webhooks are dramatically cheaper for high-volume use cases and remove all timing guesswork. Each entry in pending_jobs[] tells you exactly which event to subscribe to:

Register one webhook for every event you care about

curl -X POST https://api.virlo.ai/v1/webhooks \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-app.example.com/webhooks/virlo",
    "description": "All async completions",
    "enabled_events": [
      "content_research_agent.run.completed",
      "orbit.run.completed",
      "comet.run.completed",
      "audience.snapshot.completed",
      "tracking.cycle.completed"
    ]
  }'

When the event arrives, the delivery body carries the same data shape you'd get from the corresponding GET endpoint, including the now-populated finalized: true and empty pending_jobs: []. See Webhooks → Envelope for full delivery details, headers, signing, and retry behavior.

The recommended hybrid pattern: subscribe to webhooks for completion, fall back to polling only if you haven't received an event after ~10× retry_after_seconds. This handles webhook misconfiguration and transient network issues without making your integration brittle.
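One way to express the hybrid pattern is to block on a flag that your webhook handler sets, and start polling only if the flag stays unset for the grace period. The sketch below assumes a threaded application; `wait_webhook_then_poll`, `webhook_seen`, and `poll_once` are hypothetical names, and `poll_once` stands in for whatever function fetches the resource and returns its `data` envelope.

```python
import threading
import time
from typing import Callable, Optional, Tuple


def wait_webhook_then_poll(
    webhook_seen: threading.Event,
    poll_once: Callable[[], dict],
    retry_after_s: float = 15,
    grace_multiplier: float = 10,
    poll_timeout_s: float = 900,
) -> Tuple[str, Optional[dict]]:
    """Prefer the webhook; fall back to polling after the grace period.

    Returns ("webhook", None) if the handler set `webhook_seen` in time,
    or ("poll", data) once a polled response reports finalized: true.
    """
    # Grace period of roughly 10x retry_after_seconds by default.
    if webhook_seen.wait(timeout=retry_after_s * grace_multiplier):
        return ("webhook", None)

    # No event arrived: fall back to plain polling with its own deadline.
    deadline = time.monotonic() + poll_timeout_s
    while time.monotonic() < deadline:
        data = poll_once()
        if data.get("finalized") is True:
            return ("poll", data)
        time.sleep(retry_after_s)
    raise TimeoutError("not finalized after webhook grace period and polling")
```

Your webhook endpoint would call `webhook_seen.set()` when the matching event is delivered; everything else in the function is ordinary polling.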

Mapping per resource

The table below lists, for each resource, its secondary jobs and the matching `pending_jobs[].type` and `pending_jobs[].webhook_event` values, so you can copy/paste the relevant rows for the features you use:

Resource	Secondary job	`pending_jobs[].type`	`pending_jobs[].webhook_event`
Content Research Agent (`GET /v1/agents/:id`)	AI viral analysis	`viral_analysis`	`content_research_agent.run.completed`
Orbit (`GET /v1/orbit/:id`) (legacy)	AI viral analysis	`viral_analysis`	`orbit.run.completed`
Comet (`GET /v1/comet/:id`) (legacy)	AI viral analysis (per cycle)	`viral_analysis`	`comet.run.completed`
Satellite (`GET /v1/satellite/creator/status/:job_id`)	Audience demographics	`audience_demographics`	`audience.snapshot.completed`
Satellite (same)	Audience geography	`audience_geography`	`audience.snapshot.completed`
Tracking (`GET /v1/tracking/creators/:id`)	AI tracking report	`tracking_report`	`tracking.cycle.completed`
Audience job (`GET /v1/audience/snapshot/:job_id`)	The snapshot itself	—	`audience.snapshot.completed`

audience_demographics and audience_geography are powered by the same snapshot job, so a single audience.snapshot.completed event resolves both pending entries in one shot.
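Since several pending entries can share one event, it helps to de-duplicate before subscribing. The following sketch derives a webhook registration body from a `pending_jobs[]` array; the helper names are hypothetical, and the body mirrors only the three keys shown in the curl example (`url`, `description`, `enabled_events`).

```python
def events_for_pending_jobs(pending_jobs):
    """Return the distinct webhook events that resolve the given jobs, sorted."""
    return sorted(
        {job["webhook_event"] for job in pending_jobs if job.get("webhook_event")}
    )


def build_webhook_payload(callback_url, pending_jobs, description="Async completions"):
    """Build a JSON-serializable body for POST /v1/webhooks (illustrative)."""
    return {
        "url": callback_url,
        "description": description,
        "enabled_events": events_for_pending_jobs(pending_jobs),
    }
```

For a Satellite response with both audience jobs pending, `enabled_events` contains `audience.snapshot.completed` exactly once.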

Backwards compatibility

These fields are 100% additive. Specifically:

No existing field changed shape or meaning. status: "completed" still means "the main job is done"; analysis_data: null still means "no analysis yet"; audience_demographics: null still means "no snapshot yet." We just added complementary fields that tell you why a null is null.
No new webhook events were introduced. Every pending_jobs[].webhook_event value is an event we already supported and that already fires from the same processor today. Existing webhook subscribers will not see duplicate deliveries.
No new auth, no new headers, no new rate limits. The new fields are returned on the same endpoints under the same auth as today.
Unknown-key tolerant clients are unaffected. If your deserializer ignores unrecognized fields (the default in most JSON libraries), these are invisible to you.
Default behavior is unchanged. If you ignore finalized and pending_jobs[] and keep polling on status only, your integration continues to work exactly as it always has.

If you're building something new or rebuilding an integration: use finalized as your "stop polling" condition and you'll never have to think about which sub-jobs to wait for again. If you have a stable integration that already works: nothing forces you to migrate.