Building on refined OpenAI models and extensive research on user needs, this project set out to select a speech framework that meets the requirements of embedded systems.
We used Python Flask on the back end and React with Socket.io on the front end to keep the web page and the server in sync, display GPT text on the web page, and combine Azure speech services with OpenAI GPT models for speech recognition and synthesis.
During implementation we found compatibility issues caused by differences between the ARM32 and ARM64 architectures and restructured the code to run on both. The final project received a score of 95 points.
graph LR;
word((Trigger Word));
asr[Automatic\nSpeech Recognition];
nlp[Natural\nLanguage Processing];
gpt[Generative\nPre-trained Transformer];
tts[Text-To-Speech];
sound((Sound));
word --> asr --> nlp --> gpt --> tts --> sound;
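The flow in the graph above can be sketched as a chain of plain functions. This is only an illustration of the stage order: every function name here is hypothetical, audio is represented as a string for simplicity, and the stubs stand in for the real wake-word engine, Azure speech services, and the GPT call.

```python
from typing import Optional

WAKE_PHRASE = "hey speaker"  # assumed wake phrase, for illustration only


def detect_wake_word(audio: str) -> bool:
    """Trigger Word stage: stub that checks for the assumed wake phrase."""
    return audio.lower().startswith(WAKE_PHRASE)


def recognize_speech(audio: str) -> str:
    """ASR stage: stub that drops the wake phrase and returns the rest as text."""
    return audio[len(WAKE_PHRASE):].strip(" ,")


def build_prompt(text: str) -> str:
    """NLP stage: stub that only trims whitespace from the recognized text."""
    return text.strip()


def generate_reply(prompt: str) -> str:
    """GPT stage: stub that echoes the prompt instead of calling a model."""
    return f"(reply to: {prompt})"


def synthesize(text: str) -> bytes:
    """TTS stage: stub that returns UTF-8 bytes in place of audio samples."""
    return text.encode("utf-8")


def run_pipeline(audio: str) -> Optional[bytes]:
    """Run word -> asr -> nlp -> gpt -> tts; return None without a wake word."""
    if not detect_wake_word(audio):
        return None
    text = recognize_speech(audio)
    prompt = build_prompt(text)
    reply = generate_reply(prompt)
    return synthesize(reply)
```

Keeping each stage behind its own function means a stub can be swapped for a real service one at a time without changing the order of the pipeline.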
Video Link: Twitter
Demo Link: Smart Speaker
- Smart Speaker based on GPT by OpenAI
  - Table Of Content
  - Characteristics
  - Steps
    - Step 1. Install all dependencies (client: npm install)
    - Step 2. Train wake word (Optional)
    - Step 3. Rename .env.example to .env and fill in the .env file
    - Step 4. Set TEST_MODE or IS_RASPBERRYPI to True in server/utils/config.py (Important); set the connect URL in client/src/app.js (Optional)
    - Step 5. Run sh start.sh, or start the server (app.py) and the client (npm start)
  - Installation
  - Reference
- prompt completion
- continuous dialog
- precise ASR (speech to text)
- Prompt: Write a tagline for an ice cream shop.
- Completion: We serve up smiles with every scoop!
- Suggest one name for a horse.
- Lightning
- Suggest one name for a black horse.
- Midnight
- Suggest three names for a horse that is a superhero.
- Super Stallion
- Captain Colt
- Mighty Mustang
$ Suggest three names for an animal that is a superhero.
Animal: Cat
Names: Captain Sharpclaw, Agent Fluffball, The Incredible Feline
Animal: Dog
Names: Ruff the Protector, Wonder Canine, Sir Barks-a-Lot
Animal: Horse
Names: Super Stallion, Mighty Mare, The Magnificent Equine
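The Animal/Names example above follows a few-shot pattern: worked examples come first, and the prompt ends on an open "Names:" line for the model to complete. A minimal sketch of assembling such a prompt as a string is shown here. The function name `build_few_shot_prompt` is hypothetical and no API call is made.

```python
from typing import List, Sequence, Tuple


def build_few_shot_prompt(
    instruction: str,
    examples: Sequence[Tuple[str, List[str]]],
    query: str,
) -> str:
    """Build a prompt from an instruction, worked examples, and an open query."""
    lines = [instruction, ""]
    for animal, names in examples:
        lines.append(f"Animal: {animal}")
        lines.append(f"Names: {', '.join(names)}")
    # Leave the final "Names:" line empty so the model fills it in.
    lines.append(f"Animal: {query}")
    lines.append("Names:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Suggest three names for an animal that is a superhero.",
    [
        ("Cat", ["Captain Sharpclaw", "Agent Fluffball", "The Incredible Feline"]),
        ("Dog", ["Ruff the Protector", "Wonder Canine", "Sir Barks-a-Lot"]),
    ],
    "Horse",
)
```

The resulting string would then be sent as the prompt text; the completion is expected to supply the horse names.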
Step 2. Train wake word (Optional)
Step 4. Set TEST_MODE or IS_RASPBERRYPI to True in server/utils/config.py (Important); set the connect URL in client/src/app.js (Optional)
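The contents of server/utils/config.py are not shown here, so the following is only a guess at how the two flags might look, assuming both are plain booleans that default to False. The `describe_arch` helper is hypothetical; it uses the standard `platform.machine()` call to tell ARM32 from ARM64, which may help when deciding whether to set IS_RASPBERRYPI.

```python
import platform
from typing import Optional

# Assumed defaults; the real file may define these differently.
TEST_MODE = False       # set to True to run without the speaker hardware
IS_RASPBERRYPI = False  # set to True when running on a Raspberry Pi


def describe_arch(machine: Optional[str] = None) -> str:
    """Classify a machine string as 'arm64', 'arm32', or 'other'."""
    name = (machine or platform.machine()).lower()
    if name in ("aarch64", "arm64"):
        return "arm64"
    if name.startswith("arm"):  # e.g. armv6l, armv7l
        return "arm32"
    return "other"
```

Calling `describe_arch()` with no argument classifies the machine the script is running on.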
Run install.sh, or follow the steps below:
# macOS
# Fixes: src/pyaudio/device_api.c:9:10: fatal error: 'portaudio.h' file not found
brew install portaudio
pip3 install PyAudio
# Linux
sudo apt install python3-pyaudio
# https://stackoverflow.com/questions/58974116/how-to-install-libasound2-dev-32-bit-without-using-apt-get
sudo apt-get install libportaudio2
pip3 install pvporcupine
pip3 install pvcobra
sudo apt-get update
sudo apt-get install build-essential libssl-dev libasound2 wget
pip install azure-cognitiveservices-speech
cd ./code && mv .env.example .env
pip3 install python-dotenv
PICOVOICE_AI_KEY=${YOUR-PICOVOICE-AI-KEY}
SPEECH_KEY=${MICROSOFT-AZURE-SPEECH-KEY}
SPEECH_REGION=${MICROSOFT-AZURE-SPEECH-REGION}
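Once the .env file is filled in, the three keys above need to reach the server process. One way to read and validate them is sketched here; `load_settings` and `REQUIRED_KEYS` are hypothetical names, and the project's own loading code may differ. `load_dotenv` comes from the python-dotenv package installed earlier and is skipped if the package is missing.

```python
import os
from typing import Dict, Mapping, Optional

try:
    # python-dotenv copies values from a .env file into os.environ.
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    # Fall back to whatever is already set in the environment.
    pass

REQUIRED_KEYS = ("PICOVOICE_AI_KEY", "SPEECH_KEY", "SPEECH_REGION")


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the required keys, or raise if any is missing or empty."""
    source = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not source.get(key)]
    if missing:
        raise RuntimeError("Missing settings in .env: " + ", ".join(missing))
    return {key: source[key] for key in REQUIRED_KEYS}
```

Failing at startup with the names of the missing keys is usually easier to debug than an authentication error from a speech service later on.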