A Node.js application that transcribes audio recordings into text and automatically saves them to Google Docs. Perfect for maintaining voice notes, meeting minutes, or any spoken content in a written format.
Create a .env file with the following variables:
OPENAI_SPEECH_API_KEY=
DOC_ID=The ID of the Google document where the text will be saved
TIMEZONE=your_timezone (example: Asia/Jerusalem)
PERSONAL_AUTH_TOKEN=Create your personal key to use when calling the API
# Notion
NOTION_API_KEY=Your Notion integration secret
NOTION_DATABASE_ID=The Notion database ID or full database URL
# Google Service Account Credentials
TYPE=
PROJECT_ID=
PRIVATE_KEY_ID=
PRIVATE_KEY=
CLIENT_EMAIL=
CLIENT_ID=
AUTH_URI=
TOKEN_URI=
AUTH_PROVIDER_X509_CERT_URL=
CLIENT_X509_CERT_URL=
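The service account variables above mirror the fields of a Google service account JSON key file. The following is a minimal sketch of how they might be reassembled into a credentials object. It assumes the variables are already loaded into the environment (for example from the .env file) and that PRIVATE_KEY is stored on one line with escaped newlines. The function name buildServiceAccountCredentials and the choice of required variables are hypothetical, not taken from the app.

```javascript
// Hypothetical helper: rebuilds a service account credentials object
// from the environment variables listed above.
function buildServiceAccountCredentials(env) {
  // Assumption: these three are the minimum needed to authenticate.
  const required = ["PROJECT_ID", "PRIVATE_KEY", "CLIENT_EMAIL"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    type: env.TYPE || "service_account",
    project_id: env.PROJECT_ID,
    private_key_id: env.PRIVATE_KEY_ID,
    // Assumption: the key is stored with literal "\n" sequences,
    // which are converted back to real newlines here.
    private_key: env.PRIVATE_KEY.replace(/\\n/g, "\n"),
    client_email: env.CLIENT_EMAIL,
    client_id: env.CLIENT_ID,
    auth_uri: env.AUTH_URI,
    token_uri: env.TOKEN_URI,
    auth_provider_x509_cert_url: env.AUTH_PROVIDER_X509_CERT_URL,
    client_x509_cert_url: env.CLIENT_X509_CERT_URL,
  };
}
```

Calling buildServiceAccountCredentials(process.env) at startup would fail fast with a readable message if a required variable is absent.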
- Audio file transcription using OpenAI's Whisper model
- Automatic saving of transcriptions to Google Docs
- Timestamp recording for each transcription
- Support for M4A audio format
- RESTful API interface
- Node.js with Express.js
- OpenAI API (Whisper model for transcription)
- Google Docs API
- Multer for file upload handling
- Node.js installed
- OpenAI API key
- Google Cloud project with enabled Google Docs API
- Google Service Account credentials
- Notion integration (optional): Notion API key and database
curl -F "audio=@[path to file].m4a;type=audio/m4a" \
-H "Authorization:[your personal auth token]" \
-X POST https://[your api url]/transcribe
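The same request can be sent from JavaScript. The sketch below builds an equivalent multipart POST, assuming Node.js 18 or later (for the global FormData and Blob). The helper name buildTranscribeRequest is hypothetical. The field name, MIME type, and raw Authorization header value are copied from the curl command above.

```javascript
// Hypothetical helper: builds the same multipart POST as the curl command above.
// Assumes Node.js 18+ (global FormData and Blob).
function buildTranscribeRequest(apiUrl, authToken, audioBytes, fileName) {
  const form = new FormData();
  // Field name "audio" and type "audio/m4a" are taken from the curl example.
  form.append("audio", new Blob([audioBytes], { type: "audio/m4a" }), fileName);
  return {
    // Trim trailing slashes so the path is not doubled.
    url: `${apiUrl.replace(/\/+$/, "")}/transcribe`,
    options: {
      method: "POST",
      // The token is sent as-is, matching the curl header.
      headers: { Authorization: authToken },
      body: form,
    },
  };
}
```

The result can be passed straight to fetch(url, options). The response format is not documented here, so check response.status before relying on the body.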
The app can create a Notion page for each transcription when NOTION_API_KEY and NOTION_DATABASE_ID are configured. Notion updates run in parallel with Google Docs for speed.
- Go to Notion Developers → Create a new internal integration.
- Copy the integration secret into NOTION_API_KEY.
You can use an existing database or create a new one. Share the database with your integration (Share → Invite → select your integration) so it has access.
Required/Supported properties (column names and types):
- Title property (type: Title)
  - Name can be anything. The app auto-detects the Title property.
- Content property (type: Rich text)
  - Recommended names: content, text, body, description, or notes. The app picks the first matching Rich text property.
- tags (type: Multi-select)
  - General tags. The app prioritizes existing options but may add new options here if needed.
  - Property must be named exactly tags (case-insensitive). This prevents conflicts with similarly named columns like project-tags.
- category-tags (type: Multi-select)
  - High-level categories matched conservatively via AI from your transcription.
  - The app will ONLY use options that already exist in this property and will NOT create new options.
  - Create the options you want to be eligible, for example: dev, health.
- project-tags (type: Multi-select)
  - Project-specific tags matched conservatively via AI.
  - The app will ONLY use options that already exist in this property and will NOT create new options.
  - Create the options you want to be eligible, for example: smart-journal, p1v3, p1v4.
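The property rules above can be sketched as a lookup over a Notion database's properties object, where each key is a column name and each value carries a type such as title, rich_text, or multi_select. This is an illustration only: the function name detectProperties is hypothetical, "first matching" is assumed to mean first in iteration order, and case-insensitive matching for category-tags and project-tags is an assumption (the text only states it for tags).

```javascript
// Recommended names for the Content column, as listed above.
const CONTENT_NAMES = ["content", "text", "body", "description", "notes"];

// Hypothetical helper: maps the app's logical fields to column names.
// `properties` has the shape of a Notion database object's `properties` field.
function detectProperties(properties) {
  const entries = Object.entries(properties);
  // Returns the first column of the given type whose lowercased name passes the check.
  const find = (type, nameMatches = () => true) => {
    const hit = entries.find(
      ([name, prop]) => prop.type === type && nameMatches(name.toLowerCase())
    );
    return hit ? hit[0] : null;
  };
  return {
    title: find("title"), // any name is accepted
    content: find("rich_text", (n) => CONTENT_NAMES.includes(n)),
    tags: find("multi_select", (n) => n === "tags"), // exact name only
    categoryTags: find("multi_select", (n) => n === "category-tags"),
    projectTags: find("multi_select", (n) => n === "project-tags"),
  };
}
```

Matching tags by exact name rather than by substring is what keeps a project-tags column from being mistaken for the general tags column.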
Notes:
- NOTION_DATABASE_ID may be the raw 32-character ID or the full database URL. The app extracts the ID automatically (hyphens and query params are handled).
- If category-tags or project-tags properties are missing, they are simply skipped.
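As an illustration of the ID handling described in the notes, the sketch below accepts a raw ID, a hyphenated ID, or a full URL and returns the 32-character hexadecimal ID. The function name extractNotionDatabaseId and the regular expression are assumptions, not the app's actual code.

```javascript
// Hypothetical helper: normalizes NOTION_DATABASE_ID to a 32-character hex ID.
function extractNotionDatabaseId(input) {
  const withoutQuery = input.trim().split(/[?#]/)[0]; // drop "?v=..." and fragments
  const compact = withoutQuery.replace(/\/+$/, "").replace(/-/g, ""); // drop hyphens
  // The ID is expected to be the last 32 hex characters of the remaining text.
  const match = compact.match(/[0-9a-f]{32}$/i);
  return match ? match[0].toLowerCase() : null;
}
```

Stripping the query string first matters because a database URL's v parameter is itself a 32-character ID (the view), which could otherwise be picked up by mistake.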
- Title: generated by AI to summarize the transcription.
- Content: the full transcription text (stored in the first suitable Rich text property).
- tags: generated by AI; reuses existing options when possible and may create new options if needed.
- category-tags: matched strictly against existing options; never creates new options.
- project-tags: matched strictly against existing options; never creates new options.
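The difference between the two tag policies can be sketched as one function with a switch. The name reconcileTags, the allowNew option, and the case-insensitive comparison are assumptions made for illustration. The AI step that proposes the candidate tags is out of scope here.

```javascript
// Hypothetical helper: reconciles AI-suggested tags with a column's existing options.
// allowNew = false models category-tags and project-tags (strict);
// allowNew = true models the general tags column.
function reconcileTags(suggested, existingOptions, { allowNew = false } = {}) {
  const byLowerName = new Map(
    existingOptions.map((name) => [name.toLowerCase(), name])
  );
  const result = new Set(); // Set removes duplicates while keeping order
  for (const raw of suggested) {
    const tag = raw.trim();
    if (!tag) continue;
    const existing = byLowerName.get(tag.toLowerCase());
    if (existing) {
      result.add(existing); // reuse the option's existing spelling
    } else if (allowNew) {
      result.add(tag); // only the general tags column may grow
    }
  }
  return [...result];
}
```

For example, with existing options dev and health, the suggestions Dev and cooking reduce to dev under the strict policy, and to dev and cooking when new options are allowed.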
The app performs title generation, tags, category-tags, and project-tags extraction in parallel to reduce latency.
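One common way to run independent steps in parallel in Node.js is Promise.all. The sketch below shows the pattern, not the app's actual code. The ai object and its four method names are hypothetical stand-ins for whatever performs each AI call.

```javascript
// Hypothetical sketch: starts all four AI calls at once and waits for all of them.
// `ai` is an assumed object whose methods each return a Promise.
async function enrichTranscription(text, ai) {
  const [title, tags, categoryTags, projectTags] = await Promise.all([
    ai.generateTitle(text),
    ai.extractTags(text),
    ai.matchCategoryTags(text),
    ai.matchProjectTags(text),
  ]);
  return { title, tags, categoryTags, projectTags };
}
```

With Promise.all, total latency is roughly that of the slowest call rather than the sum of all four. If any call rejects, the whole operation rejects, so a caller that prefers partial results might use Promise.allSettled instead.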