Skill: Dynamic Gap Detection
What
When directing scenes sequentially, detect spatial/temporal gaps between the current panel and the next story beat, and insert bridge frames on the fly.
Why
Scene plans are written at plot level: they describe KEY BEATS, not every physical movement. A plan may say "Panel 01: corridor. Panel 02: door opens." But the camera can't teleport 10 meters. Closure, Scott McCloud's term for the reader mentally filling in what happens between panels, only works when the gap is small enough to infer. Without bridge frames, the comic reads as if it is teleporting between locations.
The scene-directing agent must hold three things simultaneously:
- What just came out (the actual panel image)
- Where the story goes next (the next beat in scene-plan.md)
- The gap between them (spatial distance, time elapsed, emotional shift)
If the gap is too large for one frame → insert bridge frames.
How
Gap Assessment After Every Panel
After seeing panel N's output, before writing panel N+1:
1. Where is the camera RIGHT NOW? (location, height, direction)
2. Where does the NEXT STORY BEAT need the camera? (location, height, direction)
3. How far is that? (meters, steps, emotional distance)
4. Can the viewer's brain bridge this gap with one cut? (McCloud test)
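The first three questions can be made concrete with a small record of where the camera is and where it needs to be. This is a minimal sketch, not part of any existing tool: `CameraPose`, `spatial_gap_m`, and `gap_in_steps` are hypothetical names, positions are assumed to be floor coordinates in meters, and the 0.75 m stride is an assumed average. The fourth question (the McCloud test) remains a judgment call and is not computed here.

```python
import math
from dataclasses import dataclass

# Assumed average stride length in meters; adjust per character.
STRIDE_M = 0.75


@dataclass(frozen=True)
class CameraPose:
    """Camera floor position (meters), height (meters), and heading (degrees)."""
    x: float
    y: float
    height: float
    heading_deg: float


def spatial_gap_m(current: CameraPose, target: CameraPose) -> float:
    """Straight-line floor distance between two camera positions."""
    return math.dist((current.x, current.y), (target.x, target.y))


def gap_in_steps(current: CameraPose, target: CameraPose) -> int:
    """The same distance expressed in whole steps, rounded up."""
    return math.ceil(spatial_gap_m(current, target) / STRIDE_M)


# Panel N ended in the corridor; the next beat needs the camera at the door.
corridor = CameraPose(x=0.0, y=0.0, height=1.6, heading_deg=0.0)
door = CameraPose(x=0.0, y=10.0, height=1.6, heading_deg=0.0)
```

With these assumed values, the corridor-to-door gap is 10 meters, which is more than ten steps at the assumed stride.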
The McCloud Test
Ask: "If I put these two images side by side, would the reader understand the movement between them without any text?"
- YES → proceed to next beat directly
- NO → insert 1-3 bridge frames
Bridge Frame Types
| Gap Type | Bridge Needed | Example |
|---|---|---|
| 1-3 steps | Usually none | "She turns her head" → next beat |
| 5-10 steps | 1 bridge frame | Walking closer to door |
| 10+ steps | 2-3 bridge frames | Corridor to room entrance |
| Room change | 1 transition frame | Doorway threshold shot |
| Emotional shift | 1 reaction frame | Character processes new information |
| Time skip | 1 establishing frame | New time/location needs grounding |
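The table can be read as a lookup from gap type to a suggested number of bridge frames. The sketch below is one possible encoding under stated assumptions: `GapType` and `bridge_range` are hypothetical names, and because the table does not say where 4 steps or exactly 10 steps belong, this version assumes both fall in the one-bridge row.

```python
from enum import Enum
from typing import Tuple


class GapType(Enum):
    STEPS = "steps"
    ROOM_CHANGE = "room change"
    EMOTIONAL_SHIFT = "emotional shift"
    TIME_SKIP = "time skip"


def bridge_range(gap_type: GapType, steps: int = 0) -> Tuple[int, int]:
    """Return the (min, max) number of bridge frames the table suggests."""
    if gap_type is GapType.STEPS:
        if steps <= 3:
            return (0, 0)  # "usually none"
        if steps <= 10:
            return (1, 1)  # assumption: 4 steps and exactly 10 steps land here
        return (2, 3)
    # Room change, emotional shift, and time skip each call for one frame.
    return (1, 1)
```

The result is a starting point only. The McCloud test on the actual output decides whether a bridge is really needed.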
Bridge Frame Prompting
Bridge frames describe MOVEMENT, not events:
- "You are closer now. The door is bigger in your view."
- "You have taken three steps forward. The bed is closer."
- "She has turned her body. Her feet point toward the room."
They are NOT story beats — they carry the camera from A to B.
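A bridge prompt in this style can be assembled from just two facts: how far the camera moved and which landmark now looks closer. The helper below is a hypothetical sketch (`bridge_prompt` is not an existing function) that mirrors the wording of the examples above and deliberately has no parameter for story events.

```python
def bridge_prompt(steps_taken: int, landmark: str) -> str:
    """Build a movement-only prompt: distance covered and what looks closer."""
    if steps_taken < 1:
        raise ValueError("a bridge frame must move the camera at least one step")
    noun = "step" if steps_taken == 1 else "steps"
    return (
        f"You have taken {steps_taken} {noun} forward. "
        f"The {landmark} is closer."
    )
```

For example, `bridge_prompt(3, "bed")` yields a prompt equivalent to the second example in the list.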
Real-Time Insertion
The scene-directing agent must be willing to:
- Add frames that aren't in the scene plan — bridge frames emerge from seeing the output, not from pre-planning
- Skip planned frames — if the model gave you a frame that covers two beats at once, skip the redundant one
- Split one planned beat into 2-3 frames — if the beat requires movement that can't happen in one image
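These three operations amount to editing the remaining plan in place as outputs arrive. A minimal sketch, assuming the plan is held as a plain list of beat descriptions; `insert_bridge`, `skip_beat`, and `split_beat` are hypothetical helper names.

```python
from typing import List


def insert_bridge(plan: List[str], index: int, bridge: str) -> List[str]:
    """Add an unplanned frame before plan[index]; later beats shift by one."""
    return plan[:index] + [bridge] + plan[index:]


def skip_beat(plan: List[str], index: int) -> List[str]:
    """Drop plan[index] because the previous output already covered it."""
    return plan[:index] + plan[index + 1:]


def split_beat(plan: List[str], index: int, parts: List[str]) -> List[str]:
    """Replace plan[index] with 2-3 frames that together carry the same beat."""
    if not 2 <= len(parts) <= 3:
        raise ValueError("split a beat into 2 or 3 frames")
    return plan[:index] + parts + plan[index + 1:]


# After seeing Panel 01, the director decides the door is still too far away.
plan = ["corridor, 10m from door", "door opens"]
plan = insert_bridge(plan, 1, "closer to door, 2 steps away")
```

Each helper returns a new list rather than mutating the original, so the written scene plan stays available for comparison with what was actually directed.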
The Three-State Hold
At all times during sequential generation, the agent holds:
- CURRENT: What did panel N actually show?
- NEXT: What does the scene plan say happens next?
- GAP: What physical/emotional distance exists between them?
This is not pre-planning. This is live directing — like a film director watching playback and deciding the next shot.
Common Mistakes
- Pre-writing bridge frames: you can't know where bridges are needed until you SEE the output. The model might close the gap in one frame, or leave a bigger gap than expected.
- Ignoring spatial gaps: jumping from "10m away in corridor" to "door open, room revealed" with no transition. The reader cannot infer the movement in between.
- Over-bridging: adding 5 frames to walk 3 steps. Not every movement needs a frame. Only bridge when the McCloud test fails.
- Bridge frames with story events: bridge frames are pure movement. Don't add emotional beats or dialogue to them. Keep them simple.
Proven Pattern (hospital-entry-test)
- Scene plan: Panel 01 (corridor, 10m from door) → Panel 02 (door opens)
- Gap detected: 10 meters of walking + door interaction = too far for one cut
- Bridge frame inserted: Panel 02 became "closer to door, 2 steps away, warm light brighter"
- Door opening moved to Panel 03
- Result: smooth spatial progression that feels like continuous movement