You have a product photo. You want a cinematic multi-shot fashion film — the kind with camera cuts, different angles, and a real narrative feel. Not just a 3-second loop of the product slowly rotating.
Traditional image-to-video (I2V) animates one frame. You feed it a handbag photo, it gives you the handbag gently wobbling for five seconds. That's useful, but it's not a film.
Seedance 2.0's storyboard reference mode changes everything. Instead of animating a single image, you pass it a multi-panel storyboard grid — a 2×2 or 3×3 layout where each panel is a different shot — and Seedance 2.0 reads each panel as a sequential scene. The output is a real multi-shot video with cinematic transitions between angles.
This tutorial walks you through the entire pipeline: from a single product photo to a polished fashion film, using nothing but Python and the ArkRoute API. Full code included — copy, paste, run.
By the end of this tutorial, you'll have a 10-second cinematic fashion film generated entirely by AI. The pipeline:
1. Generate a multi-panel storyboard grid with an AI image model.
2. Submit the grid to Seedance 2.0 in storyboard reference mode.
3. Poll for completion and download the finished MP4.
Each step is an API call. No video editing software. No manual compositing. No $2,000 freelancer.
You'll need Python with `requests` installed (`pip install requests`) and an ArkRoute API key.

A storyboard for Seedance 2.0 is a single image divided into a grid of panels. Each panel represents one shot in your final video.
| Grid | Panels | Best For | Recommended Duration |
|---|---|---|---|
| 2×2 | 4 shots | 5–10 second films | 10s |
| 3×3 | Up to 9 shots | 10–15 second films | 15s |
| 2×3 or 3×2 | 6 shots | Medium-length narratives | 10–15s |
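If you're scripting the pipeline, the table above translates directly into a lookup. A minimal sketch (`pick_grid` is a hypothetical helper, not part of any API):

```python
# Hypothetical helper: map a desired shot count to a grid layout and a
# recommended clip duration, following the table above.
GRID_PRESETS = [
    (4, "2x2", 10),   # up to 4 shots -> 2x2 grid, 10s film
    (6, "2x3", 10),   # up to 6 shots -> 2x3 grid, 10-15s film
    (9, "3x3", 15),   # up to 9 shots -> 3x3 grid, 15s film
]

def pick_grid(num_shots: int) -> tuple[str, int]:
    """Return (grid_layout, duration_seconds) for the given shot count."""
    for max_shots, grid, duration in GRID_PRESETS:
        if num_shots <= max_shots:
            return grid, duration
    raise ValueError("Seedance storyboards top out at 9 panels (3x3)")

print(pick_grid(4))  # -> ('2x2', 10)
```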
Each panel should be a distinct camera angle or scene. Think of it like a director's shot list:
# Storyboard Prompt Template
"A 2x2 storyboard grid for a luxury fashion film.
Product: [YOUR PRODUCT DESCRIPTION].
Panel 1 (top-left): Close-up of [product] on marble surface,
soft directional lighting, shallow depth of field
Panel 2 (top-right): Model carrying [product] walking through
[location], golden hour, 35mm film look
Panel 3 (bottom-left): Detail shot of [product] texture and
material, macro lens, studio lighting
Panel 4 (bottom-right): Wide establishing shot, [product] in
lifestyle context, cinematic composition"
💡 Key principle: Make each panel visually distinct. Different camera distances (close-up vs. wide), different lighting, different angles. The more variety across panels, the more dynamic your final video will feel. Seedance 2.0 reads the visual differences between panels to create distinct shots.
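For batch work, you can assemble the template programmatically instead of hand-writing it per product. A sketch with illustrative panel descriptions (`build_storyboard_prompt` is a hypothetical helper, not required wording):

```python
def build_storyboard_prompt(product: str, panels: list[str]) -> str:
    """Assemble a 2x2 storyboard prompt from four panel descriptions."""
    assert len(panels) == 4, "a 2x2 grid needs exactly 4 panels"
    positions = ["top-left", "top-right", "bottom-left", "bottom-right"]
    lines = [
        "A 2x2 storyboard grid for a luxury fashion film.",
        f"Product: {product}.",
    ]
    for i, (pos, desc) in enumerate(zip(positions, panels), start=1):
        lines.append(f"Panel {i} ({pos}): {desc}")
    return "\n".join(lines)

prompt = build_storyboard_prompt(
    "A caramel leather handbag with gold hardware",
    [
        "Close-up on a marble surface, soft directional lighting",
        "Model walking through a Paris street, golden hour",
        "Macro detail of leather texture, warm studio lighting",
        "Wide lifestyle shot, cafe table, cinematic composition",
    ],
)
print(prompt)
```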
You can generate the storyboard with any AI image model that handles multi-panel layouts well. ArkRoute gives you several options on the same API:
import requests
API_KEY = "your_arkroute_api_key"
BASE = "https://api.ark-route.com/v1"
# Generate a 2x2 storyboard grid
resp = requests.post(f"{BASE}/images/generations",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "nano-banana-2",
        "prompt": """A 2x2 storyboard grid for a luxury fashion film.
Product: A caramel leather handbag with gold hardware.
Panel 1 (top-left): Close-up of the handbag on a marble surface,
soft directional lighting, shallow depth of field, luxury editorial
Panel 2 (top-right): A woman carrying the handbag walking through
a Paris cobblestone street, golden hour, 35mm film look
Panel 3 (bottom-left): Extreme close-up of the leather texture
and gold clasp detail, macro lens, warm studio lighting
Panel 4 (bottom-right): Wide shot of the handbag on a cafe table
with an espresso, autumn leaves, Parisian atmosphere""",
        "size": "1024x1024"
    }
).json()
storyboard_url = resp["data"][0]["url"]
print(f"Storyboard: {storyboard_url}")
# Same API, just swap the model
resp = requests.post(f"{BASE}/images/generations",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "seedream-3.0",
        "prompt": "A 2x2 storyboard grid for a luxury fashion film...",
        "size": "1024x1024"
    }
).json()
storyboard_url = resp["data"][0]["url"]
NanoBanana 2 typically generates in 3–5 seconds and costs about $0.02. Seedream 3.0 takes 10–20 seconds but produces sharper detail. For storyboards, NanoBanana is usually sufficient — the storyboard is a reference layout, not the final output.
🎨 Tip: If you already have a storyboard image (designed in Figma, hand-drawn, or generated elsewhere), skip this step entirely. Just host it somewhere accessible and use the URL directly in Step 3.
This is the core of the technique. When you pass an image to Seedance 2.0, two things can happen depending on whether you set image_role:
| Mode | Parameter | Behavior |
|---|---|---|
| Image-to-Video (default) | No image_role | The image becomes the first frame. Seedance animates it — camera slowly moves, objects gently shift. One continuous shot. |
| Storyboard Reference | image_role: "reference_image" | The image is read as a visual reference. Seedance interprets each panel as a separate scene and generates a multi-shot video with cuts between them. |
The image_role: "reference_image" parameter is the entire difference between "animate this photo" and "direct a multi-shot film from this storyboard."
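To see how small the switch is, here are the two payloads side by side, identical except for one key. The URL and values are placeholders that mirror the request below:

```python
storyboard_url = "https://example.com/storyboard.png"  # placeholder

# Default I2V: the grid itself becomes the animated first frame.
i2v_payload = {
    "model": "seedance-2.0-fast",
    "prompt": "A handbag fashion film, cinematic",
    "image_url": storyboard_url,
    "duration": 10,
}

# Storyboard reference mode: one added key changes the behavior.
storyboard_payload = {**i2v_payload, "image_role": "reference_image"}

# The only difference between the two payloads:
print(set(storyboard_payload) - set(i2v_payload))  # -> {'image_role'}
```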
# Submit video generation with storyboard reference
resp = requests.post(f"{BASE}/video/generations",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "seedance-2.0-fast",
        "prompt": "Follow the 4-panel storyboard as a shot sequence. "
                  "A caramel leather handbag fashion film — Paris autumn "
                  "street, golden hour, cinematic cuts between scenes. "
                  "Each panel becomes a unique camera angle. Smooth "
                  "transitions, luxury editorial feel, 35mm film grain.",
        "image_url": storyboard_url,
        "image_role": "reference_image",
        "duration": 10,
        "aspect_ratio": "16:9",
        "resolution": "720p"
    }
).json()
task_id = resp["id"]
provider = resp["provider"]
print(f"Task submitted: {task_id}")
print(f"Provider: {provider}")
🔑 Two critical details:
1. The provider field in the response tells you which upstream account is handling your task. You must pass it back when polling for status — it routes the poll to the correct backend.
2. Your prompt should explicitly reference the storyboard — phrases like "follow the 4-panel storyboard" and "each panel becomes a unique camera angle" help Seedance understand the intent.
Seedance 2.0 Fast typically takes 60–150 seconds for a 10-second clip. Pro takes 120–240 seconds. Poll the status endpoint until status is "succeeded":
import time
print("Waiting for video generation...")
poll_count = 0
while True:
    time.sleep(5)
    poll_count += 1
    r = requests.get(
        f"{BASE}/video/status/{task_id}",
        params={"provider": provider},
        headers={"Authorization": f"Bearer {API_KEY}"},
    ).json()
    status = r["status"]
    print(f" Poll #{poll_count}: {status}")
    if status == "succeeded":
        video_url = r["video_url"]
        print(f"\n✅ Video ready: {video_url}")
        break
    if status == "failed":
        print(f"\n❌ Generation failed: {r}")
        raise RuntimeError("Video generation failed")

# Download the MP4
video_data = requests.get(video_url).content
with open("fashion_film.mp4", "wb") as f:
    f.write(video_data)
print(f"Saved: fashion_film.mp4 ({len(video_data) / 1024 / 1024:.1f} MB)")
The response is a flat JSON object — {"id": "...", "status": "succeeded", "video_url": "..."}. No nested structures.
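Because the payload is flat, a few lines are enough to branch on it. A sketch (`parse_status` is a hypothetical helper built on the fields shown above):

```python
def parse_status(resp: dict):
    """Return the video URL if done, None if still running, raise on failure."""
    status = resp.get("status")
    if status == "succeeded":
        return resp["video_url"]
    if status == "failed":
        raise RuntimeError(f"Generation failed: {resp}")
    return None  # still queued / processing

print(parse_status({"id": "abc", "status": "processing"}))  # -> None
print(parse_status({"id": "abc", "status": "succeeded",
                    "video_url": "https://cdn.example.com/x.mp4"}))
```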
Here's the full end-to-end script. Copy it, set your API key, and run:
#!/usr/bin/env python3
"""
AI Fashion Film Pipeline
========================
Product description → AI storyboard → Seedance 2.0 → cinematic film
Usage:
python fashion_film.py
Requirements:
pip install requests
"""
import requests
import time
import sys
# ── Configuration ──────────────────────────────────────────
API_KEY = "your_arkroute_api_key" # Get one free at ark-route.com
BASE = "https://api.ark-route.com/v1"
PRODUCT = "A caramel leather handbag with gold hardware and quilted stitching"
LOCATION = "Paris cobblestone streets in autumn"
STYLE = "luxury editorial, 35mm film grain, golden hour"
DURATION = 10 # 5, 10, or 15 seconds
VIDEO_MODEL = "seedance-2.0-fast" # "seedance-2.0" for Pro quality
IMAGE_MODEL = "nano-banana-2" # "seedream-3.0" for higher quality
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}
def generate_storyboard():
    """Step 1-2: Generate a 2x2 storyboard grid."""
    print("🎨 Generating storyboard...")
    prompt = f"""A 2x2 storyboard grid for a luxury fashion film.
Product: {PRODUCT}.
Panel 1 (top-left): Close-up of the product on a marble surface,
soft directional lighting, shallow depth of field, {STYLE}
Panel 2 (top-right): A stylish woman carrying the product walking
through {LOCATION}, {STYLE}
Panel 3 (bottom-left): Extreme close-up detail shot of the product
texture and craftsmanship, macro lens, warm studio lighting
Panel 4 (bottom-right): Wide establishing shot, the product placed
in a lifestyle setting — cafe table with espresso, autumn leaves,
cinematic atmosphere, {STYLE}"""
    resp = requests.post(f"{BASE}/images/generations",
        headers=HEADERS,
        json={
            "model": IMAGE_MODEL,
            "prompt": prompt,
            "size": "1024x1024"
        }
    ).json()
    if "data" not in resp or not resp["data"]:
        print(f"❌ Storyboard generation failed: {resp}")
        sys.exit(1)
    url = resp["data"][0]["url"]
    print(f" ✅ Storyboard ready: {url}")
    return url
def generate_video(storyboard_url):
    """Step 3: Submit storyboard to Seedance 2.0 reference mode."""
    print(f"\n🎬 Submitting to {VIDEO_MODEL} (storyboard reference mode)...")
    prompt = (
        f"Follow the 4-panel storyboard as a shot sequence. "
        f"A {PRODUCT} fashion film — {LOCATION}, {STYLE}. "
        f"Cinematic cuts between scenes. Each panel becomes a unique "
        f"camera angle. Smooth transitions, luxury editorial feel. "
        f"No text overlays."
    )
    resp = requests.post(f"{BASE}/video/generations",
        headers=HEADERS,
        json={
            "model": VIDEO_MODEL,
            "prompt": prompt,
            "image_url": storyboard_url,
            "image_role": "reference_image",
            "duration": DURATION,
            "aspect_ratio": "16:9",
            "resolution": "720p"
        }
    ).json()
    if "id" not in resp:
        print(f"❌ Video submission failed: {resp}")
        sys.exit(1)
    task_id = resp["id"]
    provider = resp["provider"]
    print(f" Task ID: {task_id}")
    print(f" Provider: {provider}")
    return task_id, provider
def poll_and_download(task_id, provider):
    """Step 4: Poll until complete, then download the MP4."""
    print(f"\n⏳ Waiting for video (this takes 60-150s for Fast, 120-240s for Pro)...")
    start = time.time()
    poll_count = 0
    while True:
        time.sleep(5)
        poll_count += 1
        r = requests.get(
            f"{BASE}/video/status/{task_id}",
            params={"provider": provider},
            headers=HEADERS,
        ).json()
        status = r["status"]
        elapsed = int(time.time() - start)
        print(f" [{elapsed}s] Poll #{poll_count}: {status}")
        if status == "succeeded":
            video_url = r["video_url"]
            print(f"\n✅ Video ready! ({elapsed}s total)")
            print(f" URL: {video_url}")
            # Download
            video_data = requests.get(video_url).content
            filename = "fashion_film.mp4"
            with open(filename, "wb") as f:
                f.write(video_data)
            print(f" Saved: {filename} ({len(video_data) / 1024 / 1024:.1f} MB)")
            return video_url
        if status == "failed":
            print(f"\n❌ Generation failed after {elapsed}s")
            print(f" Response: {r}")
            sys.exit(1)
        # Safety timeout at 10 minutes
        if elapsed > 600:
            print(f"\n⚠️ Timeout after {elapsed}s. Task may still be processing.")
            print(f" Check manually: GET {BASE}/video/status/{task_id}?provider={provider}")
            sys.exit(1)
def main():
    print("=" * 60)
    print(" AI Fashion Film Pipeline")
    print(f" Product: {PRODUCT}")
    print(f" Video model: {VIDEO_MODEL} | Duration: {DURATION}s")
    print("=" * 60)
    storyboard_url = generate_storyboard()
    task_id, provider = generate_video(storyboard_url)
    video_url = poll_and_download(task_id, provider)
    print(f"\n{'=' * 60}")
    print(" 🎥 Pipeline complete!")
    print(f" Storyboard: {storyboard_url}")
    print(f" Video: {video_url}")
    print(f" Local file: fashion_film.mp4")
    print(f"{'=' * 60}")

if __name__ == "__main__":
    main()
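One optional refinement: the script polls at a fixed 5-second interval. For longer Pro renders you could grow the delay between polls instead. A sketch of capped exponential backoff (a generic pattern, not an API requirement):

```python
import itertools

def backoff_delays(base: float = 5.0, factor: float = 2.0, cap: float = 30.0):
    """Yield poll delays that grow geometrically, capped: 5, 10, 20, 30, 30, ..."""
    delay = base
    while True:
        yield delay
        delay = min(delay * factor, cap)

# First five delays between polls:
print(list(itertools.islice(backoff_delays(), 5)))  # -> [5.0, 10.0, 20.0, 30.0, 30.0]
```

In the polling loop, replace `time.sleep(5)` with `time.sleep(next(delays))` where `delays = backoff_delays()`.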
Use `seedance-2.0-fast` to test prompts and storyboards cheaply ($1.70/clip at 10s), then switch to `seedance-2.0` Pro for the final version ($4.30/clip at 10s).

Let's do the math for a complete pipeline run:
| Step | Model | Cost |
|---|---|---|
| Storyboard generation | NanoBanana 2 | ~$0.02 |
| Draft video (iteration) | Seedance 2.0 Fast, 10s | $1.70 |
| Hero video (final) | Seedance 2.0 Pro, 10s | $4.30 |
| Total (1 draft + 1 hero) | | ~$6.02 |
In practice, you might generate 2–3 draft storyboards ($0.06) and 2–3 draft videos ($3.40–$5.10) before landing on a winner to upscale with Pro. A realistic full session: under $10.
💰 For comparison: A freelance videographer shooting a product film costs $500–$2,000. A motion graphics studio runs $2,000–$10,000. This pipeline delivers comparable quality for under $10, in under 5 minutes, with infinite iteration.
| Model | 5 seconds | 10 seconds | 15 seconds |
|---|---|---|---|
| Seedance 2.0 Fast | $0.85 | $1.70 | $2.55 |
| Seedance 2.0 Pro | $2.15 | $4.30 | $6.45 |
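Both tiers scale linearly with duration ($0.17/s for Fast, $0.43/s for Pro), so budgeting is one multiplication. A sketch, assuming that linear pricing holds:

```python
# Per-second rates implied by the pricing table above.
RATES = {
    "seedance-2.0-fast": 0.17,  # $/second
    "seedance-2.0": 0.43,       # $/second (Pro)
}

def clip_cost(model: str, duration: int) -> float:
    """Estimated cost in USD for one clip, assuming linear per-second pricing."""
    return round(RATES[model] * duration, 2)

print(clip_cost("seedance-2.0-fast", 10))  # -> 1.7
print(clip_cost("seedance-2.0", 15))       # -> 6.45
```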
Everything in this tutorial runs on ArkRoute's API. Sign up, get 500 free credits, and generate your first fashion film in the next 5 minutes.
Start Free →

No credit card required for the free tier. 500 credits = enough for several test films.