Data Intelligence
Virlo Data Intelligence is an optional per-item enrichment layer that produces 40+ structured fields for every video in your Orbit or Comet results, plus a parallel (slightly smaller) schema for every slideshow. Fields span content classification, visual format, hook analysis, captions/panel text, tone, brand safety, engagement signals, and more — all returned as typed JSON alongside the regular item data.
Overview
When Data Intelligence is enabled, each video collected by an Orbit search or Comet cycle is analyzed by Virlo's AI pipeline. The result is a structured intelligence object attached to every video, covering topics, hooks, visual format, tone, brand safety, CTAs, and more, ready for filtering, aggregation, and programmatic analysis at scale.
Data Intelligence fields are populated asynchronously. They will appear on each video object once processing completes (typically 30–90 seconds after the video is collected). Poll the video endpoint or use webhooks to detect completion.
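Because the fields arrive asynchronously, a client usually wants a bounded wait rather than an open-ended loop. The sketch below is one way to do that. It assumes the `intelligence` key is missing or null until processing completes, and `fetch_video` is a hypothetical caller-supplied function that returns the latest video object (for example, by calling the video endpoint). Neither name comes from the Virlo API.

```python
import time
from typing import Callable, Optional


def wait_for_intelligence(
    fetch_video: Callable[[], dict],
    timeout_s: float = 180.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Poll until the video carries an `intelligence` object or the timeout passes.

    Assumption: `intelligence` is absent or null while processing is pending.
    `fetch_video` is a hypothetical helper supplied by the caller.
    """
    waited = 0.0
    while True:
        video = fetch_video()
        if video.get("intelligence"):
            return video["intelligence"]
        if waited >= timeout_s:
            return None  # still pending; the caller decides whether to retry
        sleep(interval_s)
        waited += interval_s


# Demo with a fake fetcher that becomes ready on the third call.
_responses = iter([
    {"id": "v1"},
    {"id": "v1", "intelligence": None},
    {"id": "v1", "intelligence": {"sentiment": "positive"}},
])
demo_result = wait_for_intelligence(lambda: next(_responses), sleep=lambda s: None)
```

The default timeout of 180 seconds is twice the upper end of the typical 30–90 second window; adjust it to your own tolerance.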
How to enable
Set data_intelligence_enabled: true when creating an Orbit search or a Comet configuration.
Orbit
Pass the flag in the POST /v1/orbit request body:
- Name: data_intelligence_enabled
- Type: boolean
- Description: Enable per-video Data Intelligence analysis. Default: false.
Enable on Orbit
curl -X POST https://api.virlo.ai/v1/orbit \
-H "Authorization: Bearer {token}" \
-H "Content-Type: application/json" \
-d '{
"name": "Skincare Routine Research",
"keywords": ["skincare routine", "glass skin tutorial"],
"platforms": ["tiktok", "youtube"],
"time_period": "this_week",
"data_intelligence_enabled": true
}'
Comet
Pass the flag in the POST /v1/comet request body:
- Name: data_intelligence_enabled
- Type: boolean
- Description: Enable per-video Data Intelligence analysis for every cycle. Default: false.
Enable on Comet
curl -X POST https://api.virlo.ai/v1/comet \
-H "Authorization: Bearer {token}" \
-H "Content-Type: application/json" \
-d '{
"name": "Daily Skincare Trends",
"keywords": ["skincare routine", "glass skin"],
"platforms": ["tiktok"],
"cadence": "daily",
"data_intelligence_enabled": true
}'
Pricing
Data Intelligence adds $1.00 per run on top of the base search cost.
| Product | Base cost | + Intelligence | Total |
|---|---|---|---|
| Orbit search | $0.50 | +$1.00 | $1.50 |
| Comet cycle | $0.50 | +$1.00 | $1.50 per cycle |
The intelligence fee is charged once per run regardless of how many videos are returned. Polling and retrieving intelligence fields after processing is always free.
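Since the fee is flat per run, budgeting is simple arithmetic. A minimal sketch using the prices from the table above; the function name is illustrative, and it assumes a daily Comet runs exactly one cycle per day.

```python
ORBIT_BASE_USD = 0.50        # base cost per Orbit search (from the pricing table)
COMET_BASE_USD = 0.50        # base cost per Comet cycle (from the pricing table)
INTELLIGENCE_FEE_USD = 1.00  # flat add-on per run, independent of video count


def estimate_cost(runs: int, base_usd: float, intelligence: bool = True) -> float:
    """Return the estimated total in USD for `runs` searches or cycles."""
    per_run = base_usd + (INTELLIGENCE_FEE_USD if intelligence else 0.0)
    return round(runs * per_run, 2)


# A daily Comet over a 30-day month, with and without Data Intelligence.
monthly_with = estimate_cost(30, COMET_BASE_USD)                         # 45.0
monthly_without = estimate_cost(30, COMET_BASE_USD, intelligence=False)  # 15.0
```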
Field reference
All intelligence fields are returned inside the intelligence object on each video. Fields are grouped into categories below.
Content Classification
| Field | Type | Description |
|---|---|---|
| primary_topic | string | Main topic/subject of the video |
| secondary_topics | string[] | Additional topics covered |
| keywords | string[] | Extracted keywords relevant to the content |
| category | string | High-level content category (open vocabulary) |
| content_format | string | Content format type (open vocabulary) |
Visual Analysis
| Field | Type | Description |
|---|---|---|
| visual_format | enum | Primary visual presentation style |
| camera_perspective | enum | Camera angle and positioning |
| setting | enum | Physical location/environment |
| lighting_quality | enum | Lighting quality assessment |
| has_face_visible | boolean | Whether a human face is visible |
Hook & Copy
| Field | Type | Description |
|---|---|---|
| hook_text | string | The opening hook text/spoken words |
| hook_type | enum | Category of hook technique used |
| visual_hook_type | enum | Visual hook technique in the first frames |
| has_text_overlay | boolean | Whether text overlays are present |
| text_overlay_purpose | enum | Purpose of the primary text overlay |
Captions & Transcript
| Field | Type | Description |
|---|---|---|
| has_onscreen_captions | boolean | Whether burned-in captions are shown |
| caption_style | enum | Visual style of on-screen captions |
| transcript_quality | enum | Transcript quality rating |
| transcript_word_count | integer | Total words in the transcript |
| language_detected | string | ISO 639-1 language code |
Tone & Sentiment
| Field | Type | Description |
|---|---|---|
| emotional_tone | enum | Dominant emotional tone of the content |
| sentiment | enum | Overall sentiment polarity |
| speaking_style | enum | Vocal delivery style |
Brand & Safety
| Field | Type | Description |
|---|---|---|
| brand_safety_tier | enum | Brand safety classification tier |
| is_nsfw | boolean | Whether content is not safe for work |
| sensitive_topics | enum[] | List of sensitive topics detected |
| is_sponsored | boolean | Whether the video is sponsored content |
| brands_mentioned | string[] | Brand names mentioned in the video |
Engagement Signals
| Field | Type | Description |
|---|---|---|
| cta_usages | enum[] | Call-to-action types used |
| social_proof_used | enum[] | Social proof techniques employed |
| trend_references | string[] | References to current trends or challenges |
Meta
| Field | Type | Description |
|---|---|---|
| summary | string | One-sentence summary of the video content |
| is_educational | boolean | Whether the video is primarily educational |
| low_confidence_fields | string[] | Field names where the model had low confidence |
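The low_confidence_fields list lets you exclude uncertain values before aggregating. A minimal sketch of that idea follows; the helper name is illustrative and not part of the API.

```python
def confident_fields(intelligence: dict) -> dict:
    """Return a copy of the intelligence object without low-confidence fields."""
    flagged = set(intelligence.get("low_confidence_fields") or [])
    return {
        name: value
        for name, value in intelligence.items()
        if name not in flagged and name != "low_confidence_fields"
    }


sample = {
    "hook_type": "question",
    "sentiment": "mixed",
    "is_sponsored": False,
    "low_confidence_fields": ["sentiment"],
}
trusted = confident_fields(sample)
# `sentiment` is dropped because the model flagged it; the other fields remain.
```

Dropping flagged fields is the conservative choice. If you would rather keep them, tag the values instead so downstream reports can show them separately.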
Slideshow intelligence
Slideshows (TikTok image carousels and equivalent on other platforms) are analysed by a separate intelligence pipeline that produces a parallel — but not identical — set of fields on each slideshow item. The lifecycle (pending → ready) and the intelligence_status enum are identical to videos, so a single envelope-aware client handles both. The schema differences are summarised below.
Same fields as videos
The following fields are produced for both videos and slideshows with identical semantics:
primary_topic · secondary_topics · keywords · category · content_format · background_reasoning · background_type · foreground_reasoning · foreground_type · hook_text · hook_type · language_detected · is_multilingual · emotional_tone · sentiment · has_face_visible · setting · brand_safety_tier · is_nsfw · sensitive_topics · is_educational · is_sponsored · brands_mentioned · cta_usages · trend_references · social_proof_used · summary · low_confidence_fields
Slideshow-only fields (not present on videos)
| Field | Type | Description |
|---|---|---|
| narrative_arc | enum | How the panels are sequenced (e.g. problem_solution, buildup_to_reveal, numbered_list, linear_storytelling). |
| text_density | enum | How much copy is on the panels overall (minimal · moderate · high). |
| image_count | integer | Number of panels in the slideshow. |
| panel_texts | string[] | Per-panel transcribed text, indexed in order. |
| panel_text_full | string | All panel text concatenated; the "transcript equivalent" for slideshows. |
| panel_text_word_count | integer | Word count across panel_text_full. |
| panel_text_character_count | integer | Character count across panel_text_full. |
Video-only fields (not present on slideshows)
Slideshows have no spoken audio, no continuous motion, and no per-frame camera work — so the following video-only fields are omitted from the slideshow schema:
visual_format · camera_perspective · lighting_quality · visual_complexity · scene_changed · visual_hook_type · has_text_overlay · text_overlay_content · text_overlay_purpose · has_onscreen_captions · caption_style · transcript_quality · transcript_word_count · transcript_character_count · speaking_style
Example slideshow intelligence
{
"intelligence_status": "ready",
"intelligence": {
"primary_topic": "Minimal morning skincare routine for oily skin",
"secondary_topics": ["niacinamide benefits", "SPF layering"],
"keywords": ["skincare", "oily skin", "morning routine"],
"category": "beauty",
"content_format": "tutorial",
"narrative_arc": "numbered_list",
"text_density": "moderate",
"image_count": 4,
"panel_texts": [
"Step 1: gentle cleanser",
"Step 2: niacinamide serum",
"Step 3: lightweight moisturiser",
"Step 4: SPF 50"
],
"panel_text_full": "Step 1: gentle cleanser. Step 2: niacinamide serum. Step 3: lightweight moisturiser. Step 4: SPF 50",
"panel_text_word_count": 15,
"panel_text_character_count": 102,
"language_detected": "en",
"is_multilingual": false,
"emotional_tone": "inspiring",
"sentiment": "positive",
"has_face_visible": false,
"background_type": "real_world_photo",
"foreground_type": "real_photo_subject",
"setting": "indoor_bathroom",
"hook_text": "Stop wasting money on a 12-step routine.",
"hook_type": "bold_claim",
"brand_safety_tier": "safe",
"is_nsfw": false,
"sensitive_topics": [],
"is_educational": true,
"is_sponsored": false,
"brands_mentioned": ["CeraVe", "La Roche-Posay"],
"cta_usages": [
{ "type": "link_in_bio", "text": "products linked below" }
],
"trend_references": ["skin minimalism"],
"social_proof_used": [],
"summary": "Four-step morning routine for oily skin presented as a numbered carousel with affordable product picks.",
"low_confidence_fields": []
}
}
When polling, treat slideshows exactly like videos: use the top-level finalized + pending_jobs[] for the parent run, and intelligence_status per item. The same *.run.completed webhook fires when both video and slideshow intelligence have landed.
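The envelope-aware handling described above can be sketched as follows. The field names finalized, pending_jobs, and intelligence_status come from this page, but their exact shapes (a boolean, a list, and a string on each item) are assumptions. Inferring the item kind from the presence of panel_texts is a heuristic, not a documented contract.

```python
def run_is_settled(run: dict) -> bool:
    """True once the parent run is finalized and no jobs are still pending.

    Assumption: `finalized` is a boolean and `pending_jobs` is a list.
    """
    return bool(run.get("finalized")) and not run.get("pending_jobs")


def ready_intelligence(items: list) -> list:
    """Collect intelligence objects from items whose status is 'ready'."""
    return [
        item["intelligence"]
        for item in items
        if item.get("intelligence_status") == "ready" and item.get("intelligence")
    ]


def is_slideshow(intelligence: dict) -> bool:
    """Heuristic: only slideshows carry the slideshow-only `panel_texts` field."""
    return "panel_texts" in intelligence


run = {"finalized": True, "pending_jobs": []}
items = [
    {"intelligence_status": "ready", "intelligence": {"panel_texts": ["Step 1"]}},
    {"intelligence_status": "pending", "intelligence": None},
    {"intelligence_status": "ready", "intelligence": {"visual_format": "talking_head"}},
]
ready = ready_intelligence(items)
kinds = ["slideshow" if is_slideshow(i) else "video" for i in ready]
```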
Enum values
Below are the canonical allowed values for each enum field. Values are stable — new additions are append-only and versioned.
hook_type
question · bold_claim · shock_statement · story_tease · tutorial_promise · controversy · before_after · pov_setup · statistic · direct_address · trend_reference · cliffhanger · negation · relatable_scenario · comparison · mystery_setup · none
speaking_style
conversational · formal · hype_energy · whisper_asmr · voiceover_narration · comedic · storytelling · instructional · monotone · aggressive · deadpan · shouting · flirty
emotional_tone
funny · inspiring · shocking · educational · controversial · heartwarming · wholesome · angry · sad · hype · calm · sarcastic · nostalgic · cringe · dark_humor · urgent · mysterious · relatable · neutral
sentiment
positive · negative · neutral · mixed
cta_usages
follow · subscribe · like_video · comment · share · save_post · tag_friend · link_in_bio · visit_website · dm_message · buy · pre_order · download · sign_up · use_code · enter_giveaway · vote · book_appointment
social_proof_used
testimonial · before_after_results · statistic_cited · celebrity_mention · expert_endorsement · popularity_claim · user_count · award_mention
sensitive_topics
mild_profanity · strong_profanity · violence_described · drug_reference · alcohol · tobacco_vaping · gambling · controversial_politics · religious_discussion · mental_health · eating_disorders · self_harm_reference · sexual_content · hate_speech · medical_claims · financial_advice · weapons · none
brand_safety_tier
safe · low_risk · medium_risk · high_risk · unsafe
transcript_quality
clean · partial · garbled
visual_format
talking_head · pov_footage · interview · street_interview · screen_recording · text_messaging_thread · slideshow_text · animation_motion_graphics · b_roll_montage · whiteboard_presentation · vlog_handheld · activity_demonstration · product_closeup · green_screen_commentary · split_screen_duet · gameplay_background · native_gameplay · stream_overlay · pip_stream_layout · food_overhead · dance_full_body · other
caption_style
standard_subtitles · animated_word_by_word · large_bold_centered · karaoke_highlight · meme_top_bottom
camera_perspective
selfie_front · rear_camera · tripod_static · handheld_moving · overhead_topdown · drone_aerial
setting
indoor_home_general · indoor_bedroom · indoor_bathroom · indoor_kitchen · indoor_living_room · indoor_studio · indoor_gym · indoor_office · indoor_classroom · outdoor_urban · outdoor_nature · outdoor_beach · outdoor_pool · car · vehicle_other · restaurant_cafe · store_retail · event_venue · studio_set · generic
lighting_quality
professional · cinematic · natural_good · golden_hour · natural_dim · ring_light · harsh_artificial · mixed · low_quality
visual_hook_type
text_hook · extreme_closeup · before_state · shocking_image · aesthetic_setup · person_speaking_to_camera · motion_action · crowded_scene · mystery_object · dramatic_zoom · animal_pet · none
text_overlay_purpose
title · list_item · statistic · quote · dialogue_label · chapter_marker · branding · cta · meme_caption · none
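Because new enum values are append-only, a client that validates strictly against today's lists can break when a value is added. One tolerant approach, sketched below, is to report unrecognised values instead of rejecting the item. The three fields shown are copied from the lists above; the helper name and the value "restricted" are made up for the demo.

```python
KNOWN_VALUES = {
    "sentiment": {"positive", "negative", "neutral", "mixed"},
    "brand_safety_tier": {"safe", "low_risk", "medium_risk", "high_risk", "unsafe"},
    "transcript_quality": {"clean", "partial", "garbled"},
}


def unknown_enum_values(intelligence: dict) -> dict:
    """Map each checked field to the values this client does not recognise."""
    unknown = {}
    for field, allowed in KNOWN_VALUES.items():
        raw = intelligence.get(field)
        if raw is None:
            continue
        values = raw if isinstance(raw, list) else [raw]
        extras = [value for value in values if value not in allowed]
        if extras:
            unknown[field] = extras
    return unknown


# "restricted" is a made-up value standing in for a future addition.
report = unknown_enum_values({"sentiment": "positive", "brand_safety_tier": "restricted"})
```

Logging the report rather than raising keeps pipelines running while still surfacing schema drift.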
Full example
Below is a complete video object with intelligence fields populated — a viral skincare routine video on TikTok.
Intelligence-enriched video response
{
"id": "b3a1f892-7c4e-4d8a-9f12-6e8b4a2c1d05",
"url": "https://www.tiktok.com/@glowbysara/video/7392847561023456789",
"description": "the glass skin routine that changed my life 🧴✨ step by step for beginners #skincare #glassskin #routine",
"platform": "tiktok",
"views": 2847000,
"likes": 341200,
"shares": 89400,
"comments": 12800,
"bookmarks": 267000,
"publish_date": "2026-04-22T14:30:00Z",
"author": {
"username": "glowbysara",
"avatar_url": "https://p16-sign-sg.tiktokcdn.com/avatar/glowbysara.jpg",
"followers": 890000,
"verified": true
},
"hashtags": ["skincare", "glassskin", "routine", "skincareroutine", "glowup"],
"keyword_found_by": "glass skin tutorial",
"intelligence": {
"primary_topic": "skincare routine",
"secondary_topics": ["glass skin", "Korean beauty", "beginner skincare"],
"keywords": ["glass skin", "double cleanse", "hyaluronic acid", "snail mucin", "SPF"],
"category": "beauty_skincare",
"content_format": "step_by_step_tutorial",
"visual_format": "product_closeup",
"camera_perspective": "tripod_static",
"setting": "indoor_bathroom",
"lighting_quality": "ring_light",
"has_face_visible": true,
"hook_text": "the glass skin routine that changed my life",
"hook_type": "bold_claim",
"visual_hook_type": "before_state",
"has_text_overlay": true,
"text_overlay_purpose": "list_item",
"has_onscreen_captions": true,
"caption_style": "animated_word_by_word",
"transcript_quality": "clean",
"transcript_word_count": 312,
"language_detected": "en",
"emotional_tone": "inspiring",
"sentiment": "positive",
"speaking_style": "conversational",
"brand_safety_tier": "safe",
"is_nsfw": false,
"sensitive_topics": ["none"],
"is_sponsored": false,
"brands_mentioned": ["COSRX", "Laneige", "Beauty of Joseon"],
"cta_usages": ["follow", "save_post", "comment"],
"social_proof_used": ["before_after_results", "popularity_claim"],
"trend_references": ["glass skin challenge", "75 hard skincare"],
"summary": "Step-by-step Korean glass skin routine for beginners using drugstore and K-beauty products, demonstrating double cleansing through SPF application with before/after results.",
"is_educational": true,
"low_confidence_fields": []
},
"intent_match": {
"matches": true,
"reasoning": "Video features a step-by-step tutorial format (content_format: step_by_step_tutorial) with product closeups (visual_format: product_closeup), on-screen captions (has_onscreen_captions: true), and a positive, educational tone — all consistent with the intent to find beginner-friendly skincare routines."
}
}
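For spreadsheet or dataframe work it can help to flatten a video object like the one above into a single row. This is a minimal sketch; the `intel_` prefix and the `|` separator for list values are arbitrary choices made here, not Virlo conventions.

```python
def flatten_video(video: dict) -> dict:
    """Flatten a video object into one flat row suitable for CSV export."""
    row = {
        "id": video.get("id"),
        "platform": video.get("platform"),
        "views": video.get("views"),
    }
    for name, value in (video.get("intelligence") or {}).items():
        if isinstance(value, list):
            # Join list values so each field fits in one cell.
            value = "|".join(str(entry) for entry in value)
        row[f"intel_{name}"] = value
    return row


row = flatten_video({
    "id": "abc",
    "platform": "tiktok",
    "views": 100,
    "intelligence": {
        "hook_type": "bold_claim",
        "cta_usages": ["follow", "comment"],
        "is_nsfw": False,
    },
})
```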
Use cases
Brand safety filtering
Filter out high-risk content before syndicating UGC or placing ads. Query videos and filter on brand_safety_tier and sensitive_topics to enforce your brand guidelines programmatically.
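# Keep videos in the two safest tiers with no sensitive topics detected.
# The examples on this page show both [] and ['none'] for "nothing detected",
# so accept either form.
safe_videos = [
    v for v in videos
    if v['intelligence']['brand_safety_tier'] in ('safe', 'low_risk')
    and set(v['intelligence']['sensitive_topics']) <= {'none'}
]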
Hook analysis at scale
Analyze which hook types drive the highest engagement across your niche. Aggregate hook_type and visual_hook_type fields to discover patterns in top-performing content.
from collections import Counter
hook_stats = Counter(v['intelligence']['hook_type'] for v in videos)
top_hooks = hook_stats.most_common(5)
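Counting hook types shows which are most common, not which perform best. To compare performance, you could group an engagement metric by hook_type, as in the sketch below. The engagement-rate formula (likes + comments + shares, divided by views) is an assumption made for illustration, not a Virlo-defined metric.

```python
from collections import defaultdict
from statistics import median


def engagement_rate(video: dict) -> float:
    """(likes + comments + shares) / views; an assumed definition."""
    views = video.get("views") or 0
    if views == 0:
        return 0.0
    interactions = sum(video.get(key) or 0 for key in ("likes", "comments", "shares"))
    return interactions / views


def median_engagement_by_hook(videos: list) -> dict:
    """Median engagement rate per hook_type, skipping videos without intelligence."""
    buckets = defaultdict(list)
    for video in videos:
        intel = video.get("intelligence")
        if not intel:
            continue
        buckets[intel.get("hook_type", "none")].append(engagement_rate(video))
    return {hook: median(rates) for hook, rates in buckets.items()}


sample_videos = [
    {"views": 1000, "likes": 100, "comments": 10, "shares": 10,
     "intelligence": {"hook_type": "question"}},
    {"views": 2000, "likes": 100, "comments": 50, "shares": 50,
     "intelligence": {"hook_type": "question"}},
    {"views": 500, "likes": 100, "comments": 0, "shares": 0,
     "intelligence": {"hook_type": "bold_claim"}},
    {"views": 800, "likes": 10, "comments": 0, "shares": 0, "intelligence": None},
]
by_hook = median_engagement_by_hook(sample_videos)
```

The median is used rather than the mean so that a single viral outlier does not dominate a hook type with few samples.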
Content format breakdown
Understand the visual format distribution in a niche to inform your production style. Group by visual_format and compare average view counts per format.
from statistics import mean

# Collect view counts per visual format, then average each group.
format_views = {}
for v in videos:
    fmt = v['intelligence']['visual_format']
    format_views.setdefault(fmt, []).append(v['views'])

format_avg = {fmt: mean(views) for fmt, views in format_views.items()}
Competitive intelligence
Track which brands are mentioned in viral content, identify sponsorship patterns, and monitor competitor share-of-voice by aggregating brands_mentioned and is_sponsored across large video sets.
// Count how many times each brand is mentioned across all videos.
const brandMentions = {};
videos.forEach(v => {
  v.intelligence.brands_mentioned.forEach(brand => {
    brandMentions[brand] = (brandMentions[brand] || 0) + 1;
  });
});

// Sort brands by mention count, highest first.
const sorted = Object.entries(brandMentions)
  .sort(([, a], [, b]) => b - a);
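To separate sponsorship patterns from organic mentions, the same aggregation can be split on is_sponsored. The Python sketch below is one possible approach; the function name and the "share" calculation (a brand's mentions divided by all brand mentions) are illustrative assumptions.

```python
from collections import Counter


def brand_share_of_voice(videos: list) -> dict:
    """Per-brand organic and sponsored mention counts plus share of all mentions."""
    organic, sponsored = Counter(), Counter()
    for video in videos:
        intel = video.get("intelligence") or {}
        target = sponsored if intel.get("is_sponsored") else organic
        # Count each brand once per video so one video cannot dominate.
        target.update(set(intel.get("brands_mentioned") or []))

    total = sum(organic.values()) + sum(sponsored.values())
    if total == 0:
        return {}
    return {
        brand: {
            "organic": organic[brand],
            "sponsored": sponsored[brand],
            "share": (organic[brand] + sponsored[brand]) / total,
        }
        for brand in set(organic) | set(sponsored)
    }


voice = brand_share_of_voice([
    {"intelligence": {"is_sponsored": False, "brands_mentioned": ["COSRX", "Laneige"]}},
    {"intelligence": {"is_sponsored": True, "brands_mentioned": ["COSRX"]}},
    {"intelligence": {"is_sponsored": False, "brands_mentioned": ["COSRX", "COSRX"]}},
])
```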
Intent-based format discovery
Use intent with Data Intelligence to find videos matching a specific content format. For example, find "long-text-overlay reaction" videos — a person on camera with long overlay text (>50 words), not speaking, reacting to the text.
import requests

response = requests.post(
    'https://api.virlo.ai/v1/orbit',
    headers={'Authorization': 'Bearer YOUR_API_KEY'},
    json={
        'name': 'Long Text Overlay Reactions',
        'keywords': ['reaction', 'storytime', 'text overlay'],
        'platforms': ['tiktok'],
        'time_period': 'this_week',
        'data_intelligence_enabled': True,
        'intent': 'Find videos where a person is visible on camera with long text overlay (>50 words), not speaking, reacting to the overlay text'
    }
)
orbit_id = response.json()['data']['orbit_id']

# After the run completes, filter to matched videos only
matched = requests.get(
    f'https://api.virlo.ai/v1/orbit/{orbit_id}/videos',
    headers={'Authorization': 'Bearer YOUR_API_KEY'},
    params={'intent_match': 'true'}
)
Workflow
This section walks through common end-to-end workflows that combine Data Intelligence with intent matching. Each example shows the full lifecycle: creating the search, waiting for results, and consuming the enriched data.
Example 1: Find UGC-style talking-head product reviews
A brand team wants to find authentic-feeling product reviews in the talking-head format for a UGC campaign.
Step 1 — Create an Orbit with intent and intelligence
import requests, time

API = 'https://api.virlo.ai/v1'
HEADERS = {'Authorization': 'Bearer YOUR_API_KEY'}

orbit = requests.post(f'{API}/orbit', headers=HEADERS, json={
    'name': 'UGC Talking Head Reviews',
    'keywords': ['product review', 'honest review', 'trying this product'],
    'platforms': ['tiktok', 'instagram'],
    'time_period': 'this_week',
    'data_intelligence_enabled': True,
    'intent': (
        'Find talking-head style product reviews where the creator '
        'is speaking to camera, showing the product, with a conversational '
        'or instructional tone. Must not be a sponsored post.'
    ),
}).json()

orbit_id = orbit['data']['orbit_id']
print(f'Orbit created: {orbit_id}')
Step 2 — Poll until complete
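# Poll every 10 seconds until the run reports completion.
while True:
    result = requests.get(
        f'{API}/orbit/{orbit_id}',
        headers=HEADERS,
    ).json()
    if result['data']['status'] == 'completed':
        break
    time.sleep(10)

# intent_summary is null unless both intent and Data Intelligence were used.
summary = result['data'].get('intent_summary')
if summary:
    print(f"Matched: {summary['matched']} / {summary['total_evaluated']}")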
Step 3 — Fetch only matching videos
matched = requests.get(
    f'{API}/orbit/{orbit_id}/videos',
    headers=HEADERS,
    params={'intent_match': 'true'},
).json()

for video in matched['data']['videos']:
    intel = video['intelligence']
    print(f"{video['url']}")
    print(f"  Format: {intel['visual_format']}")
    print(f"  Tone: {intel['emotional_tone']}")
    print(f"  Sponsored: {intel['is_sponsored']}")
    print(f"  Match reason: {video['intent_match']['reasoning']}")
    print()
Example 2: Brand safety audit with a Comet
A media buyer sets up a daily Comet to continuously monitor a niche and flag any content that might be unsafe for ad placement.
Step 1 — Create the Comet
comet = requests.post(f'{API}/comet', headers=HEADERS, json={
    'name': 'Daily Brand Safety Audit - Fitness',
    'keywords': ['fitness transformation', 'weight loss journey'],
    'platforms': ['tiktok', 'youtube'],
    'cadence': 'daily',
    'min_views': 50000,
    'data_intelligence_enabled': True,
    'intent': (
        'Find fitness content that is brand-safe for health supplement ads. '
        'Must NOT contain: medical claims, eating disorder references, '
        'extreme dieting, strong profanity, or NSFW content.'
    ),
}).json()

comet_id = comet['data']['id']
Step 2 — After each cycle completes, fetch brand-safe matches
Use the comet.run.completed webhook or poll the Comet videos endpoint.
safe_videos = requests.get(
    f'{API}/comet/{comet_id}/videos',
    headers=HEADERS,
    params={'intent_match': 'true'},
).json()

for v in safe_videos['data']['videos']:
    intel = v['intelligence']
    print(f"{v['url']}")
    print(f"  Safety: {intel['brand_safety_tier']}")
    print(f"  Sensitive: {intel['sensitive_topics']}")
    print(f"  Reason: {v['intent_match']['reasoning']}")
Step 3 — Get the full summary
comet_detail = requests.get(
    f'{API}/comet/{comet_id}',
    headers=HEADERS,
).json()

summary = comet_detail['data'].get('intent_summary')
if summary:
    print(f"Brand-safe: {summary['matched']}")
    print(f"Flagged: {summary['total_evaluated'] - summary['matched']}")
Example 3: Webhook-driven pipeline
For production systems, use webhooks instead of polling. When an Orbit or Comet run completes, the webhook payload includes intent_summary with matched_urls for immediate processing.
from flask import Flask, request

app = Flask(__name__)


def enqueue_for_review(url):
    # Placeholder: replace with your own queue or review system.
    print(f'Queued for review: {url}')


@app.post('/webhooks/virlo')
def handle_webhook():
    payload = request.json
    event = payload['event']

    if event in ('orbit.run.completed', 'comet.run.completed'):
        data = payload['data']
        summary = data.get('intent_summary')

        if summary and summary['matched'] > 0:
            # Process matched URLs immediately — no pagination needed
            for url in summary['matched_urls']:
                enqueue_for_review(url)
            print(f"Enqueued {summary['matched']} matched videos for review")

    return '', 200
The matched_urls array in intent_summary gives you all matching video URLs in a single list, so you can process them without paginating through the full video set.
Intent matching
When you provide an intent along with data_intelligence_enabled: true, the system performs a post-intelligence evaluation step. After all 40+ intelligence fields are populated for each video, the system uses those structured fields to evaluate whether each video matches your stated intent.
How it works
- You create an Orbit or Comet with both intent and data_intelligence_enabled: true
- Videos are collected and intelligence fields are populated as usual
- Each video's intelligence fields are evaluated against your intent using Gemini Flash
- Results are stored per-run and returned on video objects
The intent_match object
Each video object receives an intent_match field as a sibling of intelligence (not nested inside it):
{
"id": "b3a1f892-...",
"url": "https://www.tiktok.com/@creator/video/123",
"views": 2847000,
"intelligence": { ... },
"intent_match": {
"matches": true,
"reasoning": "Video has a visible face (has_face_visible: true), long text overlay with >50 words (transcript_word_count: 312, has_text_overlay: true), and conversational speaking style consistent with reaction format."
}
}
| Field | Type | Description |
|---|---|---|
| matches | boolean | Whether the video matches the stated intent based on its intelligence fields |
| reasoning | string | Explanation citing specific intelligence field values that informed the decision |
intent_match is null when:
- intent was not provided on the Orbit/Comet
- data_intelligence_enabled is false
- Intelligence analysis failed for the video
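Because intent_match can be null, client code should treat it as a three-way result rather than a boolean. A minimal sketch follows; the function and bucket names are illustrative.

```python
def partition_by_intent(videos: list) -> dict:
    """Split videos into matched, unmatched, and not-evaluated groups."""
    groups = {"matched": [], "unmatched": [], "not_evaluated": []}
    for video in videos:
        match = video.get("intent_match")
        if match is None:
            # No intent, intelligence disabled, or analysis failed.
            groups["not_evaluated"].append(video)
        elif match.get("matches"):
            groups["matched"].append(video)
        else:
            groups["unmatched"].append(video)
    return groups


groups = partition_by_intent([
    {"id": "a", "intent_match": {"matches": True, "reasoning": "..."}},
    {"id": "b", "intent_match": {"matches": False, "reasoning": "..."}},
    {"id": "c", "intent_match": None},
])
```

Keeping the not-evaluated group separate avoids silently counting failed analyses as non-matches.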
The intent_summary object
The run/config response includes an intent_summary object with aggregate match statistics:
{
"intent_summary": {
"matched": 28,
"total_evaluated": 142,
"not_evaluated": 5,
"matched_urls": [
"https://www.tiktok.com/@creator1/video/123",
"https://www.tiktok.com/@creator2/video/456"
]
}
}
| Field | Type | Description |
|---|---|---|
| matched | integer | Number of videos that matched the intent |
| total_evaluated | integer | Total videos evaluated against the intent |
| not_evaluated | integer | Videos skipped (e.g. intelligence failed) |
| matched_urls | string[] | URLs of all matched videos |
intent_summary is null when the run did not use both intent and data_intelligence_enabled: true.
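A common derived metric is the match rate. The sketch below computes it from the summary and tolerates a null summary. It also computes evaluation coverage under the assumption that not_evaluated is counted separately from total_evaluated; the page does not state this explicitly.

```python
from typing import Optional


def match_rate(summary: Optional[dict]) -> Optional[float]:
    """matched / total_evaluated, or None when there is nothing to divide."""
    if not summary or not summary.get("total_evaluated"):
        return None
    return summary["matched"] / summary["total_evaluated"]


def evaluation_coverage(summary: Optional[dict]) -> Optional[float]:
    """Share of videos that were evaluated at all.

    Assumption: `not_evaluated` is disjoint from `total_evaluated`.
    """
    if not summary:
        return None
    seen = summary.get("total_evaluated", 0) + summary.get("not_evaluated", 0)
    return summary.get("total_evaluated", 0) / seen if seen else None


example = {"matched": 28, "total_evaluated": 142, "not_evaluated": 5}
rate = match_rate(example)               # 28 / 142, about 19.7%
coverage = evaluation_coverage(example)  # 142 / 147, about 96.6%
```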
Filtering by intent match
Use the ?intent_match query parameter on paginated video endpoints to filter server-side:
GET /v1/orbit/:orbit_id/videos?intent_match=true
GET /v1/orbit/:orbit_id/videos?intent_match=false
GET /v1/comet/:id/videos?intent_match=true
GET /v1/comet/:id/videos?intent_match=false
This returns only videos where intent_match.matches equals the specified value, avoiding the need to fetch all videos and filter client-side.
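If you build request URLs by hand, a small helper keeps the query string consistent. The sketch below only assembles the URL shapes listed above and makes no network call; the helper name is illustrative.

```python
from typing import Optional
from urllib.parse import quote, urlencode

API_BASE = "https://api.virlo.ai/v1"


def videos_url(product: str, resource_id: str, intent_match: Optional[bool] = None) -> str:
    """Build the videos endpoint URL for an Orbit or Comet."""
    if product not in ("orbit", "comet"):
        raise ValueError("product must be 'orbit' or 'comet'")
    url = f"{API_BASE}/{product}/{quote(resource_id, safe='')}/videos"
    if intent_match is not None:
        # The endpoints above take the literal strings "true" and "false".
        url += "?" + urlencode({"intent_match": "true" if intent_match else "false"})
    return url


matched_url = videos_url("orbit", "abc123", intent_match=True)
all_url = videos_url("comet", "xyz789")
```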
