Phase 1: Backbone Extraction

Extract every concept, definition, formula, algorithm, code snippet, and diagram from the course PDFs, **strictly aligned to page numbers**. Zero information loss is the target.

PublishedMay 30, 2026

Loading actions...

5 minBeginnerprompt10 files

Skill content

Main instructions and any bundled files for this skill.

markdown

Additional Files (9)

Phase 1: Backbone Extraction

Goal

Extract every concept, definition, formula, algorithm, code snippet, and diagram from the course PDFs, strictly aligned to page numbers. Zero information loss is the target.

File reading rule

Always use the /pdf skill to read PDF files. Do not use Python, bash, or any direct file I/O. Invoke /pdf with the uploaded file, then work from its output.

If a file is not already a PDF, stop and ask the user to convert it using /pdf first before continuing.

Backbone reference format

Input type	Unit	Reference format
PDF / slide file	Page number	`Lecture X, p. Y`
Topic list	Section item	`Section X.Y`
Generated syllabus	Syllabus entry	`Module X, Section Y`

Use the same format throughout all phases. Do not mix formats.

Extraction procedure

Step 1: Read via /pdf skill

Invoke the /pdf skill on the uploaded file. Note total page count from the output.

Step 2: Page-by-page extraction

For EVERY page, output one block. Keep blocks dense — no padding:

### Page X (of N) — [Slide title]

**Key content:**
- [Definitions — exact wording]
- [Formulas — $LaTeX$]
- [Algorithms — pseudocode or code block]
- [Diagrams — describe structure, name all labeled elements]
- [Tables — reproduce as Markdown]
- [Examples — reproduce in full]

**Concepts introduced:** [comma-separated]
**[EXPAND]:** [concept — reason] *(omit if nothing to flag)*

Empty/title pages still get a block:

### Page 1 (of 30) — Course Title
**Key content:** [Title page — no substantive content]
**Concepts introduced:** —

Step 3: Speed calibration by tier

Apply the tier set in Phase 0:

Tier	Page handling
Light (≤60p)	Full block per page
Medium (61–200p)	Full block per page; write lecture summary after each lecture
Heavy (201–400p)	Full block per page; after each lecture, compress to concept-inventory format (concept + location + type only) before moving to next lecture

For Heavy tier, the concept-inventory format is:

- [Concept name] | Lecture X, pp. Y-Z | [definition/theorem/algorithm/pattern]

This replaces the full block in memory but the full blocks are still written to the extract file.

Step 4: Completeness check (fast)

Pages in file: N  →  "### Page X" blocks in output: must = N

Content-rich pages: each block ≥ 5 lines of substantive content. If shorter, re-read the page.

Step 5: Lecture summary (append to extract file)

## Lecture Summary

**Topic:** [main topic]
**Total pages:** N
**Key concepts (by appearance):** [numbered list — keep brief]
**Prerequisites assumed:** [from earlier lectures]
**[EXPAND] markers:** [list with page refs]
**Open questions:** [unclear or contradictory items]

For topic list inputs

When the user pastes a topic list instead of a PDF:

Parse into hierarchy: Top level → Module, Second → Section, Third → Sub-concept
Assign numbers: 1.1, 1.2, 2.1, etc.
Write course-backbone.md with the full outline
Show the user: "Does this structure look right?" — adjust if needed
Populate each section with world-knowledge concept blocks, marked [From: world knowledge — no source PDF]

Output

lecture-XX-extract.md per PDF (zero-padded)
course-backbone.md if input was a topic list

Anti-patterns

DO NOT read PDFs with Python or direct file tools — use /pdf skill only
DO NOT merge multiple pages into one block
DO NOT skip pages you consider unimportant
DO NOT pad with meta-commentary — dense content only

Contents

Prompt Playground

1 Variable

Fill Variables

EXPAND

Preview

# Phase 1: Backbone Extraction

## Goal

Extract every concept, definition, formula, algorithm, code snippet, and diagram from the course PDFs, **strictly aligned to page numbers**. Zero information loss is the target.

## File reading rule

**Always use the `/pdf` skill to read PDF files.** Do not use Python, bash, or any direct file I/O. Invoke `/pdf` with the uploaded file, then work from its output.

If a file is not already a PDF, stop and ask the user to convert it using `/pdf` first before continuing.

## Backbone reference format

| Input type | Unit | Reference format |
|---|---|---|
| PDF / slide file | Page number | `Lecture X, p. Y` |
| Topic list | Section item | `Section X.Y` |
| Generated syllabus | Syllabus entry | `Module X, Section Y` |

Use the same format throughout all phases. Do not mix formats.

## Extraction procedure

### Step 1: Read via /pdf skill

Invoke the `/pdf` skill on the uploaded file. Note total page count from the output.

### Step 2: Page-by-page extraction

For EVERY page, output one block. Keep blocks dense — no padding:

```markdown
### Page X (of N) — [Slide title]

**Key content:**
- [Definitions — exact wording]
- [Formulas — $LaTeX$]
- [Algorithms — pseudocode or code block]
- [Diagrams — describe structure, name all labeled elements]
- [Tables — reproduce as Markdown]
- [Examples — reproduce in full]

**Concepts introduced:** [comma-separated]
**[EXPAND]:** [concept — reason] *(omit if nothing to flag)*
```

Empty/title pages still get a block:
```markdown
### Page 1 (of 30) — Course Title
**Key content:** [Title page — no substantive content]
**Concepts introduced:** —
```

### Step 3: Speed calibration by tier

Apply the tier set in Phase 0:

| Tier | Page handling |
|---|---|
| Light (≤60p) | Full block per page |
| Medium (61–200p) | Full block per page; write lecture summary after each lecture |
| Heavy (201–400p) | Full block per page; after each lecture, compress to concept-inventory format (concept + location + type only) before moving to next lecture |

For Heavy tier, the concept-inventory format is:
```
- [Concept name] | Lecture X, pp. Y-Z | [definition/theorem/algorithm/pattern]
```
This replaces the full block in memory but the full blocks are still written to the extract file.

### Step 4: Completeness check (fast)

```
Pages in file: N  →  "### Page X" blocks in output: must = N
```

Content-rich pages: each block ≥ 5 lines of substantive content. If shorter, re-read the page.

### Step 5: Lecture summary (append to extract file)

```markdown
## Lecture Summary

**Topic:** [main topic]
**Total pages:** N
**Key concepts (by appearance):** [numbered list — keep brief]
**Prerequisites assumed:** [from earlier lectures]
**[EXPAND] markers:** [list with page refs]
**Open questions:** [unclear or contradictory items]
```

---

## For topic list inputs

When the user pastes a topic list instead of a PDF:

1. Parse into hierarchy: Top level → Module, Second → Section, Third → Sub-concept
2. Assign numbers: `1.1`, `1.2`, `2.1`, etc.
3. Write `course-backbone.md` with the full outline
4. Show the user: "Does this structure look right?" — adjust if needed
5. Populate each section with world-knowledge concept blocks, marked `[From: world knowledge — no source PDF]`

---

## Output

- `lecture-XX-extract.md` per PDF (zero-padded)
- `course-backbone.md` if input was a topic list

## Anti-patterns

- **DO NOT** read PDFs with Python or direct file tools — use `/pdf` skill only
- **DO NOT** merge multiple pages into one block
- **DO NOT** skip pages you consider unimportant
- **DO NOT** pad with meta-commentary — dense content only

View Original Source

Related Skills

General

PromptBeginner5 minmarkdown

Untitled Skill

193

Jan 12, 2026

General

PromptBeginner5 minmarkdown

Frontend Typescript Linting.mdc

TypeScript and ESLint rules that MUST be followed when creating, modifying, or reviewing any file under apps/frontend/, including .ts, .tsx, .js, and .jsx files. Also apply when discussing frontend li...

160

Feb 15, 2026

General

PromptBeginner5 minmarkdown

2. Apply Deepthink Protocol (reason about dependencies

risks

127

Jan 15, 2026