Sephir recipe · productivity

Summarize a YouTube video without leaving the tab

Target query: how to summarize a youtube video with ai in browser

The job

You have a 40-minute lecture, technical walkthrough, or video essay open on YouTube, and you need the point fast. You want the main claims, useful examples, and exact timestamps in a clean outline, without bouncing between tabs or rewatching sections to find where each idea appeared.

Why this is hard without Sephir

Without Sephir, you end up opening YouTube’s transcript, copying a huge text block, pasting it into a separate chat tool, then fixing the format by hand. Long transcripts are messy, timestamp mapping breaks, and verification becomes a second task. The agent layer keeps the extraction and summarization in the same tab so the source and output stay aligned.

How Sephir does it

  1. Open the YouTube video and expand the native transcript panel.
  2. Open Sephir in the sidepanel with Cmd+Shift+S.
  3. Ask for a timestamped summary of the visible transcript.
  4. Watch extractPageText(active tab) capture transcript text from the page.
  5. Review the outline and tighten it into sections like claims, evidence, and actions.
  6. Save the run as /yt-summary and export the result as Markdown or JSON.
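The extraction and outlining steps above can be sketched in a few lines. This is a minimal illustration, not Sephir's implementation: it assumes the captured transcript arrives as plain text where each timestamp line (such as 0:42 or 1:02:03) is followed by its caption lines, and the names Cue, parse_transcript, and bucket are hypothetical. The real output of extractPageText may be shaped differently.

```python
import re
from dataclasses import dataclass

# Assumed input format: a timestamp line ("0:00", "12:34", or "1:02:03")
# followed by one or more caption lines. Hypothetical; the text captured
# by extractPageText may differ.
TIMESTAMP = re.compile(r"^(?:(\d+):)?(\d{1,2}):(\d{2})$")


@dataclass
class Cue:
    seconds: int
    text: str


def parse_transcript(raw: str) -> list[Cue]:
    """Group caption lines under the timestamp line that precedes them."""
    cues: list[Cue] = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            hours, minutes, secs = match.groups()
            total = int(hours or 0) * 3600 + int(minutes) * 60 + int(secs)
            cues.append(Cue(total, ""))
        elif cues:
            # Caption text: append to the most recent cue.
            cues[-1].text = f"{cues[-1].text} {line}".strip()
    return cues


def format_stamp(seconds: int) -> str:
    """Render seconds as M:SS, or H:MM:SS once the video passes an hour."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def bucket(cues: list[Cue], window: int = 300) -> dict[str, str]:
    """Merge cues into fixed windows (default 5 minutes) keyed by start time."""
    sections: dict[int, list[str]] = {}
    for cue in cues:
        start = (cue.seconds // window) * window
        sections.setdefault(start, []).append(cue.text)
    return {format_stamp(s): " ".join(parts) for s, parts in sorted(sections.items())}
```

Fixed five-minute windows are only a starting point for the outline. The synthesis pass would normally regroup the text by topic, but keeping the start time of each window is what lets every point in the summary link back to a place in the video.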

The skill behind it

Sephir uses one extraction tool, then a structured synthesis pass so you get a readable outline tied to the source transcript.

yaml
skill: /yt-summary
tools: extractPageText
intent: Extract visible YouTube transcript text and return a timestamped outline of key points.
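To make the shape of that definition concrete, here is a rough sketch of how a three-field skill like this could be read and checked. It is an assumption-heavy illustration: the Skill class, parse_skill, and allows are hypothetical names, the comma-separated tools handling is a guess, and nothing here reflects how Sephir actually loads skills.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    tools: tuple[str, ...]
    intent: str


def parse_skill(text: str) -> Skill:
    """Parse flat 'key: value' lines (hypothetical loader, not Sephir's)."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    # Assumption: multiple tools would be written as a comma-separated list.
    tools = tuple(t.strip() for t in fields["tools"].split(",") if t.strip())
    return Skill(fields["skill"], tools, fields["intent"])


def allows(skill: Skill, tool: str) -> bool:
    """A skill may only call the tools it declares."""
    return tool in skill.tools
```

Declaring a single tool keeps the skill narrow: under this reading it can read text from the page, and anything else would be rejected by the allows check.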

What it costs

Sephir runs this on your own ChatGPT Plus via Codex OAuth or your own API key. Typical usage is ~4,000–8,000 input tokens and ~500–1,000 output tokens on Claude Opus 4, GPT-5.5, or Gemini 3 Pro. Short videos can fit the Free tier’s single-turn flow. See for Free vs Pro Lifetime details.
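For a rough sense of where a given video falls in that input range, a back-of-envelope estimate helps. The sketch below rests on two assumptions that are not from Sephir: a speaking rate of about 150 words per minute and about 1.33 tokens per English word. Real counts vary with the speaker, the language, and the model's tokenizer, and the function name is hypothetical.

```python
def estimate_input_tokens(
    minutes: float,
    words_per_minute: int = 150,    # assumed average speaking rate
    tokens_per_word: float = 1.33,  # assumed rule of thumb for English text
) -> int:
    """Rough transcript size in tokens for a video of the given length."""
    return round(minutes * words_per_minute * tokens_per_word)


if __name__ == "__main__":
    for length in (10, 20, 40):
        print(f"{length} min: about {estimate_input_tokens(length)} tokens")
```

Under these assumptions a 40-minute lecture lands near the top of the typical range, and a 20-minute video near the bottom.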

Related

  • Built for focused async work:
  • For sidebar comparison on this workflow:

Try it free

Free tier covers BYOK + single-turn chat.