AutoShow automates the processing of audio and video content from various sources, including YouTube videos, playlists, podcast RSS feeds, and local media files. It leverages transcription services and large language models (LLMs) to perform transcription, summarization, and chapter generation.
Currently there are three pieces:

- `autoshow-cli`, which provides the widest set of functionality, including text, image, speech, music, and video generation capabilities.
- `autoshow.app`, which provides a paid product version of the AutoShow CLI functionality (with some modalities currently in development) and doesn't require technical expertise to use.
- `autoshow`, the repo you're on now, which splits the difference: an open source, local development experience with a frontend UI you can host yourself.
The whole AutoShow project started with this repo and eventually split off into the dedicated CLI repo and the private repo behind the product.
I would like to eventually upstream much of the CLI and paid app functionality into this repo; check back around the beginning of 2026 for updates.
AutoShow can generate diverse content formats including:
- Summaries and Chapters:
  - Concise summaries
  - Detailed chapter descriptions
  - Bullet-point summaries
  - Chapter titles with timestamps
- Social Media Posts:
  - X (Twitter)
- Creative Content:
  - Rap songs
  - Rock songs
  - Country songs
- Educational and Informative Content:
  - Key takeaways
  - Comprehension questions
  - Frequently asked questions (FAQs)
  - Curated quotes
  - Blog outlines and drafts
- Support for multiple input types (YouTube links, local video and audio files)
- Integration with various:
  - LLMs (ChatGPT, Claude, Gemini)
  - Transcription services (Deepgram, Assembly)
- Customizable prompts for generating titles, summaries, chapter titles/descriptions, key takeaways, and questions to test comprehension
- Markdown output with metadata and formatted content
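To illustrate the last point, here is a hypothetical sketch of the Markdown output, with front matter built from the content's metadata. The exact fields and sections depend on the content source and the selected prompts; none of the values below come from the repo:

```md
---
title: "Example Episode Title"
channel: "Example Channel"
publishDate: "2024-01-15"
---

## Episode Summary

A concise, LLM-generated summary of the episode...

## Chapters

### 00:00 - Introduction

A short description of the opening segment...
```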
The AutoShow workflow consists of the following steps, each feeding into the next (a sketch of this pipeline follows the list):
- The user provides a content input (video URL or local file) and front matter is created based on the content's metadata.
- The audio is downloaded (if necessary).
- Transcription is performed using the selected transcription service.
- A customizable prompt is inserted containing instructions for the show notes or other content forms.
- The transcript is processed by the selected LLM service to generate the desired output based on the selected prompts.
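Here is a minimal TypeScript sketch of that pipeline. Every name in it is hypothetical, not taken from the actual codebase; the real project wires these stages to its own downloader, the chosen transcription service (Deepgram or Assembly), and the chosen LLM (ChatGPT, Claude, Gemini):

```ts
type Metadata = { title: string; channel?: string; publishDate?: string }

// Hypothetical stage signatures, declared as stubs so the sketch type-checks.
declare function fetchMetadata(input: string): Promise<Metadata>
declare function downloadAudio(input: string): Promise<string>
declare function transcribe(audioPath: string): Promise<string>
declare function buildPrompt(sections: string[]): string
declare function runLLM(prompt: string, transcript: string): Promise<string>

// Step 1: build front matter from the content's metadata.
function createFrontMatter(meta: Metadata): string {
  const fields = Object.entries(meta)
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) => `${key}: "${value}"`)
  return ['---', ...fields, '---'].join('\n')
}

// Steps 2-5: each stage consumes the previous stage's output.
async function generateShowNotes(input: string): Promise<string> {
  const meta = await fetchMetadata(input)
  const frontMatter = createFrontMatter(meta)
  const audioPath = await downloadAudio(input)        // step 2 (skipped for local audio)
  const transcript = await transcribe(audioPath)      // step 3
  const prompt = buildPrompt(['summary', 'chapters']) // step 4: customizable instructions
  const output = await runLLM(prompt, transcript)     // step 5
  return `${frontMatter}\n\n${output}`
}
```

Because the stages are strictly sequential, swapping the transcription service or the LLM only changes a single stage; the rest of the pipeline is unaffected.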
`.github/setup.sh` checks to ensure a `.env` file exists and Node dependencies are installed. Run it with the `setup` script in `package.json`:

```bash
npm run setup
```
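The `.env` file holds API keys for whichever transcription and LLM services you plan to use. The variable names below are assumptions based on the services listed above, not taken from the repo; see `docs/README.md` for the actual names:

```bash
# Hypothetical .env entries; the exact variable names may differ.
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=...
GEMINI_API_KEY=...
DEEPGRAM_API_KEY=...
ASSEMBLY_API_KEY=...
```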
Example commands for all available options can be found in `docs/README.md`.
Start the development server:

```bash
npm run dev
```

Open `localhost:4321` in your browser.
> ✨ Hello beautiful human! ✨
>
> Jenn Junod, host of Teach Jenn Tech & Shit2TalkAbout