Skill: Multi-Tab Parallel Image Generation
What
Generate panel images for one investigation — one model per project (preferably Gemini) — with proper panel assignment, naming, download, and verification. No overlaps, no missed panels.
RECOMMENDED: Manual Generation First (Token-Saving)
Before using browser automation, recommend the user generate images manually.
The agent should:
- Prepare all prompts — extract from scene-plan.md, format them ready to paste
- Present prompts sequentially — give the user panel-01 prompt, wait for confirmation, then panel-02, etc.
- User pastes into Gemini/ChatGPT themselves — they click "Create image", paste the prompt, download the result
- User downloads and names the file: panel-01.png, panel-02.png, etc., saved into projects/{id}/panels/{style}/
- Agent verifies after — once user says "done", agent checks all panels exist and validates
Why manual first:
- Browser automation consumes significant tokens (navigating, clicking, waiting, screenshot verification)
- User can generate images faster manually (no round-trip latency per click)
- User sees the images as they generate and can re-roll immediately if quality is off
- Gemini/ChatGPT UI changes frequently — manual is always reliable
- Agent tokens are better spent on writing, knowledge extraction, and QA
When to use automation instead:
- User explicitly asks for automated generation
- User is away and wants the agent to run autonomously
- Gap-filling: only 3-5 panels need regen (small batch, worth the automation overhead)
- User has already tried manually and hit issues
Manual workflow:
Agent: "Here's panel 01 prompt. Paste this into Gemini and download as panel-01.png:"
→ [full prompt text]
User: *generates, downloads, places in panels folder*
User: "done" or "next"
Agent: "Here's panel 02 prompt:"
→ [full prompt text]
... repeat for all panels ...
Agent: "Let me verify all panels are present and match."
→ ls projects/{id}/panels/{style}/
→ Read each panel image → visual validation
Automated Approach (when manual isn't suitable)
Why Automation
Single-tab sequential generation takes ~40 minutes per investigation (25 panels × ~90 s each). Four tabs in parallel take ~12 minutes. The bottleneck is waiting on generation (~10 s per image), not prompting, so running several tabs overlaps those waits at little extra cost.
ONE MODEL PER PROJECT (NON-NEGOTIABLE)
Use ONE model for an entire investigation. Mixing models within a project creates visual inconsistency — different color temperatures, different rendering styles, different atmospheric treatment. The investigation should feel like one artist made it.
Default: Gemini. Superior quality, better character consistency, better accuracy, better atmosphere.
Switch to ChatGPT ONLY when:
- Gemini's daily image generation quota is exhausted
- Gemini repeatedly refuses a prompt (content policy)
- Gemini download mechanism is completely broken (download Tiers 1 and 2 both fail)
If you switch mid-project: Note which panels were generated on which model. Consider re-generating the outliers on the primary model later for consistency.
Per-project assignment:
Investigation 1 → Gemini (all 25 panels)
Investigation 2 → Gemini (all 25 panels)
Investigation 3 → Gemini (all 25 panels)
Investigation 4 → Gemini (all 25 panels)
Only switch to ChatGPT if Gemini quota hits or downloads fail.
If quota hits mid-project: finish remaining panels on ChatGPT,
schedule a reminder to re-generate those panels on Gemini
when quota resets (~24 hours).
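Tracking which model produced which panel can be as simple as an append-only log. The following is a minimal sketch, not an established part of this workflow: the TSV file, `record_model`, and `outliers` are hypothetical names, and it assumes panel numbers are logged zero-padded. When a panel is logged more than once, the latest entry wins, so a later Gemini re-generation clears the outlier.

```shell
# Hypothetical provenance log: one "NN<TAB>model" line per generated panel.

# record_model LOG NN MODEL: append one entry to the log.
record_model() {
  printf '%s\t%s\n' "$2" "$3" >> "$1"
}

# outliers LOG PRIMARY: print panels whose latest entry is not PRIMARY.
outliers() {
  awk -F '\t' -v primary="$2" '
    { model[$1] = $2 }                      # later entries overwrite earlier ones
    END { for (p in model) if (model[p] != primary) print p }
  ' "$1" | sort
}
```

For example, after `record_model panel-models.tsv 19 chatgpt`, running `outliers panel-models.tsv gemini` lists 19 until that panel is logged again with `gemini`.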
When to Use
- After scene-plan.md is complete for a project
- When generating all 25 panels for one investigation
- When re-generating gap panels after a failed batch
Prerequisites
- scene-plan.md exists in the project directory with 25 panel prompts
- Each prompt in scene-plan.md is assigned to a model (Gemini by default; ChatGPT only as a fallback)
- Panel folder exists: projects/{id}/panels/{style}/
- Chrome MCP tools available (Claude in Chrome extension)
- User logged into both Gemini and ChatGPT in browser
Phase 1: Setup (once per investigation)
1.1 Create the Panel Manifest
Before generating, create a tracking manifest:
INVESTIGATION: {project_id}
STYLE: {art_style}
PANELS FOLDER: projects/{id}/panels/{style}/
TOTAL: 25
GEMINI PANELS: [list from scene-plan.md, e.g., 1, 4, 5, 6, 7, 8, 9, 10, 14, 15, 19, 21, 24, 25]
CHATGPT PANELS: [list from scene-plan.md, e.g., 2, 3, 11, 12, 13, 16, 17, 18, 20, 22, 23]
TAB ASSIGNMENT:
Tab A (Gemini): panels [1, 5, 9, 14, 19, 25] → gemini tab 1
Tab B (Gemini): panels [4, 6, 7, 8, 10, 15, 21, 24] → gemini tab 2
Tab C (ChatGPT): panels [2, 11, 16, 20, 23] → chatgpt tab 1
Tab D (ChatGPT): panels [3, 12, 13, 17, 18, 22] → chatgpt tab 2
Rules for tab assignment:
- Split Gemini panels roughly evenly across 2 Gemini tabs
- Split ChatGPT panels roughly evenly across 2 ChatGPT tabs
- NO panel appears on more than one tab
- Write down the assignment BEFORE starting
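One way to satisfy these rules mechanically is a round-robin split. The sketch below is a hypothetical helper (`assign_tabs` is not part of any tool named here) and assumes panels are numbered 1..TOTAL and that every tab runs the same model; for a mixed manifest it would be run once per model's panel list.

```shell
# assign_tabs TOTAL TABS
# Prints one line per tab, dealing panel numbers round-robin so that
# every panel lands on exactly one tab and tab sizes differ by at most 1.
assign_tabs() (
  total=$1
  tabs=$2
  t=1
  while [ "$t" -le "$tabs" ]; do
    line="Tab $t:"
    p=$t
    while [ "$p" -le "$total" ]; do
      line="$line $p"
      p=$((p + tabs))
    done
    printf '%s\n' "$line"
    t=$((t + 1))
  done
)

assign_tabs 25 4
# Tab 1: 1 5 9 13 17 21 25
# Tab 2: 2 6 10 14 18 22
# Tab 3: 3 7 11 15 19 23
# Tab 4: 4 8 12 16 20 24
```

Because tab k receives panels k, k+TABS, k+2·TABS, and so on, cycle n always fires consecutive panel numbers, which makes progress easy to read off the panels folder.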
1.2 Open 4 Tabs (ALL GEMINI by default)
Default: 4 Gemini tabs. One model per project.
Tab A: navigate to https://gemini.google.com/app → click "Create image"
Tab B: navigate to https://gemini.google.com/app → click "Create image" (new chat)
Tab C: navigate to https://gemini.google.com/app → click "Create image" (new chat)
Tab D: navigate to https://gemini.google.com/app → click "Create image" (new chat)
Why ALL Gemini: One model = one visual style = visual consistency across the investigation. Gemini is the better generator. Don't split models unless forced.
When to switch a tab to ChatGPT:
- Gemini quota exhausted → navigate that tab to https://chatgpt.com
- Gemini download keeps failing on one tab → switch THAT tab only
- Gemini refuses a specific prompt → send that ONE panel to ChatGPT, then switch back
Gemini session limit: Each Gemini tab becomes unresponsive after ~6-8 images. When a tab slows down:
- Navigate to https://gemini.google.com/app (fresh chat)
- Click "Create image" again
- Continue with the next panel in that tab's queue
- This is NOT switching models — it's refreshing the Gemini session
Verify all 4 tabs are on fresh Gemini pages. Record tab IDs.
Phase 2: Generation Cycle (repeat until all 25 done)
The Cycle: Fire → Wait → Download ONE AT A TIME → Validate → Next
CRITICAL LEARNING: Do NOT batch-download from all tabs simultaneously. When multiple files arrive in Downloads with random UUID names, you CANNOT reliably map which file came from which tab. This causes panel number mismatches.
Instead: Download and move ONE TAB AT A TIME.
Each cycle processes 4 panels simultaneously for GENERATION, but downloads SEQUENTIALLY for accuracy.
Step 1: FIRE (send 4 prompts, one per tab)
For each tab, send the NEXT panel in that tab's assignment list:
Gemini tabs:
1. Click the prompt field (find "Enter a prompt for Gemini" textbox)
2. Type the prompt from scene-plan.md (verbatim, minus the metadata lines)
3. Click "Send message" button (or press Return)
ChatGPT tabs:
1. Click the "Ask anything" field
2. Type: "Generate an image: {prompt from scene-plan.md}"
3. Press Return
Fire all 4 tabs in rapid succession — don't wait between them. Generation happens in parallel.
Step 2: WAIT
wait 12-15 seconds (let all 4 tabs finish generating)
Step 3: DOWNLOAD ONE TAB AT A TIME (sequential, not batch)
Process each tab one at a time. For each tab, complete download → move → validate before moving to the next tab.
For Tab A (know it generated panel {NN}):
1. CLEAR Downloads first:
ls ~/Downloads/*.png 2>/dev/null | grep -v videogpt
(should be empty; if not, move/delete stale files first. The *.png glob already matches *.part0.png files, so listing both patterns would show those files twice.)
2. DOWNLOAD from this tab:
Gemini: click image → lightbox → find("Download full-sized image") → click ref
ChatGPT: run JS download with a.download = 'panel-{NN}.png'
3. WAIT 3 seconds
4. CHECK Downloads:
ls -lt ~/Downloads/*.png 2>/dev/null | grep -v videogpt
(should show exactly 1 new file; *.png also matches a *.part0.png download)
5. MOVE with correct name:
mv ~/Downloads/{the-one-file} projects/{id}/panels/{style}/panel-{NN}.png
6. VALIDATE — read the image and confirm it matches the prompt:
Read the panel file → visually check: does this match panel {NN}'s scene-plan description?
If NO: delete it, note panel {NN} as "needs regen"
7. THEN move to Tab B and repeat.
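The check-then-move part of these steps (4 and 5) can be sketched as a single guard that refuses to rename anything unless exactly one candidate file is present. This is an illustrative helper under stated assumptions: `claim_download` is a hypothetical name, the directories are passed in explicitly, and NN is assumed to be already zero-padded. It skips videogpt files, mirroring the `grep -v videogpt` filter above.

```shell
# claim_download DOWNLOADS_DIR PANELS_DIR NN
# Moves the single .png in DOWNLOADS_DIR to PANELS_DIR/panel-NN.png.
# Runs in a subshell, so "exit" only ends this function call.
claim_download() (
  downloads=$1
  panels=$2
  nn=$3
  count=0
  found=
  for f in "$downloads"/*.png; do       # *.png also matches *.part0.png
    [ -e "$f" ] || continue             # glob matched nothing
    case "${f##*/}" in *videogpt*) continue ;; esac
    count=$((count + 1))
    found=$f
  done
  if [ "$count" -ne 1 ]; then
    echo "expected exactly 1 new .png, found $count; nothing moved" >&2
    exit 1
  fi
  mv "$found" "$panels/panel-$nn.png"
)
```

A non-zero exit means Downloads was not clean (stale files) or the download never arrived; in either case the panel number is never attached to a guessed file.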
Why sequential download matters:
When you batch-download from 4 tabs, 4 files arrive with random UUID names. You then guess which UUID = which panel based on timestamp order. But tabs don't finish in the order you sent prompts. Tab C might finish before Tab A. Result: panel numbers get swapped. This is exactly what happened in Session 1 — panels 14/15 were swapped, panel 18 got wrong content.
The fix is simple: Download one tab at a time. You know which tab generated which panel (from the manifest). You download from that tab. There's only one new file. You name it correctly. No ambiguity.
Gemini download (3-tier approach per tab):
Tier 1 — Native download button (preferred):
1. Click the generated image thumbnail in chat to open lightbox
2. Wait 2 seconds for lightbox to fully load
3. find("Download full-sized image") → get the ref in the LIGHTBOX DIALOG
(there will be multiple refs — use the LAST one, which is in the dialog overlay)
4. Click that ref
5. Wait 3 seconds
6. Check ~/Downloads/ for exactly 1 new Gemini_Generated_Image_*.png
7. Move immediately with correct panel name
Tier 2 — Re-open lightbox with fresh refs:
If Tier 1 fails (no file appeared):
1. Press Escape to close lightbox
2. Click the image again to re-open
3. Run find("Download full-sized image") AGAIN for fresh refs
4. Click the new ref
5. Check Downloads
Tier 3 — Reroute to ChatGPT:
If both tiers fail:
1. Note the panel number as "failed on Gemini"
2. After completing all other tabs, open a ChatGPT tab
3. Send the prompt with "Generate an image:" prefix
4. Download via JS method
5. Move and validate
Key learning: The download button ref CHANGES every time you open/close the lightbox. ALWAYS find() AFTER opening to get current refs.
Step 4: VALIDATE each panel (NON-NEGOTIABLE)
After moving each file, READ the image and visually confirm it matches:
Read projects/{id}/panels/{style}/panel-{NN}.png
Compare to scene-plan.md panel {NN} description
Ask: does this image show what the prompt asked for?
If YES: panel confirmed, move to next tab
If NO: delete the file, add panel {NN} to the "regen" list
Do NOT skip validation. A mismatched panel is worse than a missing panel — it silently corrupts the investigation.
Step 5: VERIFY count after all 4 tabs processed
ls projects/{id}/panels/{style}/ | sort
Check: which panels are present? Which are missing? Which need regen? Update the manifest.
Step 6: NEXT — fire next 4 prompts
Cross off the completed panels. Send the next panel in each tab's assignment list. If a tab's list is exhausted, that tab is done.
Phase 3: Gap Fill
After all cycles complete, verify:
# Should show panel-01.png through panel-25.png
ls projects/{id}/panels/{style}/ | sort
If gaps exist:
- List missing panel numbers
- Assign them to available tabs (any tab works; stay on the project's model unless a fallback is forced)
- Re-run the cycle for just those panels
- Verify again
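Listing the missing panel numbers does not have to be done by eye. A minimal sketch, assuming the panel-NN.png naming used throughout this document (`missing_panels` is a hypothetical helper name):

```shell
# missing_panels PANELS_DIR [TOTAL]
# Prints the zero-padded number of every panel-NN.png that is absent.
# TOTAL defaults to 25.
missing_panels() (
  dir=$1
  total=${2:-25}
  n=1
  while [ "$n" -le "$total" ]; do
    nn=$(printf '%02d' "$n")
    [ -f "$dir/panel-$nn.png" ] || printf '%s\n' "$nn"
    n=$((n + 1))
  done
)
```

Empty output means the set is complete; otherwise each printed number goes onto the gap-fill list. Note that this only checks presence, so visual validation is still required.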
Common gap causes:
- Download silently failed (file didn't appear in Downloads)
- ChatGPT navigated to home page instead of creating new chat
- Gemini download button unresponsive
Phase 4: Cleanup
# Remove any leftover downloads
rm ~/Downloads/videogpt*.png 2>/dev/null
# Verify final count
ls projects/{id}/panels/{style}/*.png | wc -l
# Must be 25
Naming Convention
Panel files: panel-{NN}.png where NN is zero-padded (01-25)
Download names: Use the panel number in the JS download: a.download = 'panel-07.png'
Never use the UUID filename — always rename to panel-NN.png when moving
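Zero-padding is easy to get wrong by hand, so the filename can be derived from the panel number. A small sketch (`panel_name` is a hypothetical helper); it strips one leading zero first because `printf '%d'` would otherwise read inputs such as `08` as invalid octal.

```shell
# panel_name N: print the canonical filename for panel N (1-25).
panel_name() (
  n=${1#0}                        # tolerate already-padded input such as "07"
  printf 'panel-%02d.png\n' "$n"
)

panel_name 7     # panel-07.png
panel_name 25    # panel-25.png
```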
Model Selection Guide
Gemini is the SUPERIOR image generator. Better character consistency, better accuracy, better atmospheric quality, better composition. Use Gemini as the PRIMARY model. ChatGPT is the FALLBACK for when Gemini downloads fail or for high-volume batch work where download reliability matters more than quality.
Default: Generate ALL panels on Gemini first. Use ChatGPT only for:
- Panels that Gemini refuses (content policy)
- Re-generating gap panels when Gemini's download mechanism fails
- Rapid batch fills when speed > quality
| Panel Type | Use | Why |
|---|---|---|
| ALL cinematic/atmospheric panels | Gemini (primary) | Superior in every dimension — lighting, texture, composition, consistency |
| ALL data panels (maps, diagrams) | Gemini first, ChatGPT fallback | Gemini handles these well too, especially with nano-banana prompting |
| Character consistency across panels | Gemini | Much better at maintaining features across generations |
| Ink wash / watercolor / artistic styles | Gemini | Dramatically better artistic interpretation |
| When Gemini download fails | ChatGPT (fallback) | Reliable JS download method |
| Rapid gap-fill (speed over quality) | ChatGPT | Faster cycle time, reliable downloads |
General principles:
- AI image generation UIs (Gemini, ChatGPT, etc.) change their layouts, buttons, and limits without notice
- Don't assume download buttons, session limits, or prompt fields work the same as last time
- At the start of each generation session: test one image manually to confirm the workflow still works
- If the download mechanism breaks: try the other platform, or ask the user to download manually
- Always prefix prompts with "Generate an image:" — this is required for ChatGPT and doesn't hurt on other platforms
- Always include aspect ratio in prompts (e.g., "16:9 widescreen") — most models default to square
- Always rename downloaded files immediately to panel-NN.png; never keep UUID filenames
Anti-Overlap Protocol
Before sending ANY prompt:
- Check the manifest: which panel number am I sending to which tab?
- Each panel number appears EXACTLY ONCE in the manifest
- After downloading, name the file with the CORRECT panel number
- After moving, verify the panel number matches the scene-plan entry
If confused about which panel a download belongs to:
- Check which tab you downloaded from
- Check that tab's assignment list
- The most recent image in that tab = the most recent panel in that tab's queue
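The "exactly once" rule can be checked before the first prompt is sent. The sketch below is an assumption-laden illustration: `check_manifest` is a hypothetical helper, and it expects simplified assignment lines on stdin of the form `Tab A: 1 5 9` or `Tab A: 1, 5, 9` (no brackets or trailing notes).

```shell
# check_manifest TOTAL < assignment-lines
# Reports any panel assigned more than once or not at all.
# Runs in a subshell; exits non-zero if any problem is found.
check_manifest() (
  total=$1
  # Strip the "Tab X:" label, then put one panel number per line.
  panels=$(sed 's/^[^:]*://' | tr -s ' ,' '\n\n' | grep -E '^[0-9]+$' | sort -n)
  status=0
  dupes=$(printf '%s\n' "$panels" | uniq -d)
  if [ -n "$dupes" ]; then
    echo "assigned more than once:" $dupes >&2
    status=1
  fi
  n=1
  while [ "$n" -le "$total" ]; do
    if ! printf '%s\n' "$panels" | grep -qx "$n"; then
      echo "not assigned: $n" >&2
      status=1
    fi
    n=$((n + 1))
  done
  exit "$status"
)
```

Usage: `check_manifest 25 < tabs.txt`, where `tabs.txt` (a hypothetical file) holds the TAB ASSIGNMENT lines in the simplified form described above.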
Throughput Math
| Approach | Time per 25 panels |
|---|---|
| 1 tab sequential | ~40 min |
| 2 tabs parallel | ~22 min |
| 3 tabs parallel | ~15 min |
| 4 tabs parallel | ~12 min |
The sweet spot is 3-4 tabs. More than 4 creates too much context overhead tracking which tab has which panel.
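The number of fire/wait/download cycles behind each row is a ceiling division. The sketch below reproduces only those cycle counts and the single-tab baseline from the 90 s figure stated earlier; the minutes in the table are this document's own estimates and are not derived here. `cycles` is a hypothetical helper name.

```shell
# cycles PANELS TABS: number of cycles needed, i.e. ceil(PANELS / TABS).
cycles() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

for tabs in 1 2 3 4; do
  printf '%d tab(s): %d cycles\n' "$tabs" "$(cycles 25 "$tabs")"
done
# 1 tab(s): 25 cycles
# 2 tab(s): 13 cycles
# 3 tab(s): 9 cycles
# 4 tab(s): 7 cycles

# Single-tab baseline: 25 panels at ~90 s each.
echo "$(( 25 * 90 )) s"    # 2250 s, i.e. 37.5 min (~40 min in the table)
```

Going from 3 to 4 tabs saves only two cycles (9 to 7), which is consistent with the diminishing returns in the table.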
Example: Full Run for One Investigation (All Gemini)
INVESTIGATION: strait-of-hormuz-military-chokepoint
STYLE: maritime-cartographic
MODEL: Gemini (all 25 panels)
Tab A (Gemini): 1, 5, 9, 13, 17, 21, 25 (7 panels)
Tab B (Gemini): 2, 6, 10, 14, 18, 22 (6 panels)
Tab C (Gemini): 3, 7, 11, 15, 19, 23 (6 panels)
Tab D (Gemini): 4, 8, 12, 16, 20, 24 (6 panels)
Cycle 1: Fire panels 1, 2, 3, 4 → wait 12s → per tab, one at a time: download → move → validate (4/25)
Cycle 2: Fire panels 5, 6, 7, 8 → wait 12s → per tab, one at a time: download → move → validate (8/25)
Cycle 3: Fire panels 9, 10, 11, 12 → wait → download → move → verify (12/25)
→ If Tab A becomes unresponsive (session limit, typically after ~6-8 images): refresh Tab A (navigate to /app, click "Create image")
Cycle 4: Fire panels 13, 14, 15, 16 → wait → download → move → verify (16/25)
Cycle 5: Fire panels 17, 18, 19, 20 → wait → download → move → verify (20/25)
→ If Tabs B, C, D become unresponsive (session limit): refresh all three
Cycle 6: Fire panels 21, 22, 23, 24 → wait → download → move → verify (24/25)
Cycle 7: Fire panel 25 (Tab A only) → wait → download → move → verify (25/25)
Gap check: ls panels/maritime-cartographic/ | sort
Fill any gaps (re-gen on same model).
Done. ~12 minutes total.
If Gemini Quota Hits Mid-Project
Scenario: Gemini quota exhausted after panel 18.
Remaining: panels 19-25 (7 panels)
Option A (PREFERRED): Wait for quota reset (~24 hours)
→ Schedule reminder: "Resume Gemini generation for panels 19-25"
→ Save progress, document which panels are done
Option B (URGENT): Switch all 4 tabs to ChatGPT, finish 19-25
→ Note in learning-log.md: "Panels 19-25 on ChatGPT (quota hit)"
→ Re-generate on Gemini later for consistency (optional)
Common Mistakes
| Mistake | Fix |
|---|---|
| Batch downloading from all tabs at once | NEVER. Download one tab at a time. This is the #1 cause of panel swaps. |
| Skipping visual validation after download | ALWAYS read the image and verify it matches the prompt. |
| Same panel sent to two tabs | Check manifest before every prompt |
| Wrong panel number in filename | Download from ONE tab, name it immediately — no ambiguity |
| Forgot to move .part0.png files | Always check for both .png and .part0.png |
| Gemini page unresponsive | Open new Gemini chat (navigate to /app) |
| ChatGPT shows home page after prompt | Re-type prompt, ensure click is on the input field |
| Downloads going to wrong folder | Clear Downloads before each single-tab download |
| Panel count < 25 after all cycles | Run gap-fill phase |
| Images are square, not 16:9 | Add "16:9 widescreen" to every prompt |
| Moving files by timestamp order across tabs | DON'T. Timestamps don't reflect tab order. Download one tab at a time instead. |
The Session 1 Lesson
In the first production run, panels 14/15 were swapped and panel 18 got wrong content because:
- 3 tabs generated simultaneously
- All downloads arrived with UUID filenames
- Files were moved by timestamp, assuming Tab A finishes before Tab B
- But Tab C actually finished first — so timestamps didn't match tab assignment
The fix: Parallel GENERATION (fire 4 prompts fast) but SEQUENTIAL DOWNLOAD (process one tab at a time). You get 80% of the throughput benefit with 100% of the accuracy.
Production Learnings (Structural — Always True)
| Learning | Detail |
|---|---|
| Don't mix models within a project | Visual inconsistency (color temperature, rendering style) across panels |
| Parallel gen + sequential download = sweet spot | Fire prompts in parallel, but download one tab at a time for naming accuracy |
| Timestamp-based file matching fails | Tabs don't finish in order. Panel swaps happen when you guess by timestamp |
| Manual generation saves tokens | User pasting prompts directly is faster and cheaper than browser automation |
| AI image UIs change frequently | Download buttons, layouts, and limits change without notice. Don't hardcode assumptions about specific UI elements — verify each session |
| Always verify what you downloaded | Read the image after saving. A mismatched panel is worse than a missing one |
gemini-image-generation.md — single-tab Gemini workflow
browser-image-attachment.md — attaching reference images
multi-session-coordination.md — coordinating across browser lock
scene-plan.md per project — the prompts to send