1) Generate the WAV file to be recognized. SpeechRecognition needs a WAV file here; it cannot read MP3 files.
Install the libraries:
sudo apt install espeak ffmpeg libespeak1
pip install pyttsx3
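Code:
def demo_tts_wav():
    import pyttsx3
    engine = pyttsx3.init()
    engine.setProperty('rate', 150)     # speaking rate in words per minute
    engine.setProperty('volume', 1.0)   # volume, from 0.0 to 1.0
    voices = engine.getProperty('voices')
    # voices[0] is simply the first installed voice; pick a Chinese voice if one is available
    engine.setProperty('voice', voices[0].id)
    text = '你好,我是一个AI机器人'  # "Hello, I am an AI robot"
    #engine.say(text)  # uncomment to also play the text through the speakers
    filename = 'ni_hao.wav'
    engine.save_to_file(text, filename)
    engine.runAndWait()  # runs the queued command, which writes the file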
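If the audio you want to recognize already exists as an MP3, the ffmpeg package installed above can convert it to WAV. The following is a minimal sketch and not part of the original code: the helper names build_ffmpeg_cmd and mp3_to_wav and the file names are hypothetical, and the 16 kHz mono 16-bit settings are an assumption chosen as a common input format for speech recognizers.

```python
import subprocess


def build_ffmpeg_cmd(src, dst, rate=16000):
    """Build the ffmpeg command line that converts src into a mono PCM WAV file."""
    return [
        'ffmpeg',
        '-y',                  # overwrite dst if it already exists
        '-i', src,             # input file, for example an MP3
        '-ac', '1',            # downmix to one channel
        '-ar', str(rate),      # resample to the given rate in Hz
        '-c:a', 'pcm_s16le',   # 16-bit little-endian PCM
        dst,
    ]


def mp3_to_wav(src, dst, rate=16000):
    """Run ffmpeg; raises subprocess.CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_cmd(src, dst, rate), check=True)


# Usage (hypothetical file names):
# mp3_to_wav('ni_hao.mp3', 'ni_hao.wav')
```

Building the command separately from running it keeps the arguments easy to inspect or log before ffmpeg is actually invoked.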
2) Speech recognition with speech_recognition
Install the libraries:
pip install SpeechRecognition
pip install pocketsphinx
Download the model files: CMU Sphinx - Browse /Acoustic and Language Models/Mandarin at SourceForge.net
pip install vosk
Download a model into the code directory: VOSK Models
Unzip it and rename the extracted folder to model
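The steps above leave the unpacked model in a folder named model next to the code. A quick check before running recognition can give a clearer error than a failed model load. This is a minimal sketch: the helper name check_vosk_model is hypothetical, and it only verifies that the folder exists and is not empty, not that its contents are a valid model.

```python
import os


def check_vosk_model(path='model'):
    """Return True if path is an existing, non-empty directory."""
    return os.path.isdir(path) and len(os.listdir(path)) > 0


# Usage:
# if not check_vosk_model():
#     print("Model folder 'model' is missing or empty")
```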
Code:
def demo_speech_recognition():
    import speech_recognition as sr
    r = sr.Recognizer()
    try:
        audio_file = sr.AudioFile('ni_hao.wav')
        with audio_file as source:
            audio_data = r.record(source)  # read the whole file into memory
        # Alternative backends (recognize_wit also needs a Wit.ai key argument):
        #text = r.recognize_google(audio_data, language='zh-CN')
        #text = r.recognize_wit(audio_data)
        text = r.recognize_vosk(audio_data, language='zh-CN')
        print("Recognition result:", text)
    except Exception as e:
        print("Could not recognize speech:", str(e))
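With the Vosk backend, the returned value is typically a JSON string rather than plain text. Below is a minimal sketch for extracting the text, assuming the result looks like {"text": "..."}. The helper name extract_text is hypothetical, and the sample string is made up for illustration, not real recognizer output. Vosk commonly separates Chinese tokens with spaces, so the helper can optionally remove them.

```python
import json


def extract_text(result, strip_spaces=True):
    """Pull the 'text' field out of a Vosk-style JSON result string."""
    try:
        text = json.loads(result).get('text', '')
    except (ValueError, TypeError, AttributeError):
        # Not JSON, or not a JSON object: fall back to the raw value as text.
        text = str(result)
    return text.replace(' ', '') if strip_spaces else text


# Made-up sample input, not real recognizer output.
sample = '{"text": "你好 我 是 一个 机器人"}'
print(extract_text(sample))
```

The fallback branch means the same helper can also be used with backends that return plain text.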