Reading gemma3:4b's replies aloud naturally in real time with pyttsx3

The goal is to have gemma3:4b's replies read aloud naturally, in real time, using pyttsx3.

For the read-aloud part, I create a new file with

touch main4.py

so that any problems are easier to isolate when they occur.

from module.module_audio_to_text import AudioToTextCorrector
from ollama import chat, ChatResponse
import pyttsx3
import emoji

# Model name
OLLAMA_MODEL = 'gemma3:4b'

# Initialize pyttsx3
engine = pyttsx3.init()

def ask_ollama(prompt: str) -> str:
    try:
        response: ChatResponse = chat(model=OLLAMA_MODEL, messages=[
            {
                'role': 'user',
                'content': prompt,
            }
        ])
        return response.message.content.strip()
    except Exception as e:
        print(f"Ollamaエラー: {e}")
        return "エラーが発生しました。"

def remove_emoji(text: str) -> str:
    return emoji.replace_emoji(text, replace='')

def speak(text: str):
    clean_text = remove_emoji(text)
    print("\n【読み上げるテキスト】")
    print(clean_text)
    engine.say(clean_text)
    engine.runAndWait()

def main():
    audio_to_text = AudioToTextCorrector("config.json")

    while True:
        corrected_text = audio_to_text.record_and_correct(timeout_seconds=10)

        if corrected_text is None:
            print("終了条件に達したため、ループを抜けます。")
            break

        print("\n【認識・補正したテキスト】")
        print(corrected_text)

        # Send the question to Ollama
        ollama_reply = ask_ollama(corrected_text)

        print("\n【gemma3:4bの返答】")
        print(ollama_reply)

        # Read gemma3:4b's reply aloud
        speak(ollama_reply)

if __name__ == "__main__":
    main()

When I ran it, the output was:

[2025-05-09 19:01:12.451] [ctranslate2] [thread 8719969] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.
stand by ready OK
recording...
finished
Ollamaエラー: llama runner process has terminated: exit status 2
終了条件に達したため、ループを抜けます。

Looking into the error, the message

Ollamaエラー: llama runner process has terminated: exit status 2

means that Ollama's internal process (llama-runner) crashed and exited.
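
In main4.py, ask_ollama returns a fixed error string when the call fails, and that string is then handed to speak. The following is a minimal sketch of one alternative, assuming the crash may be transient: retry a few times and return None so the caller can skip the read-aloud. The names ask_with_retry, retries, and delay_seconds are hypothetical, and ask_fn stands in for any function that raises on failure (for example, a variant of ask_ollama without its try/except).

```python
import time
from typing import Callable, Optional


def ask_with_retry(
    ask_fn: Callable[[str], str],
    prompt: str,
    retries: int = 2,
    delay_seconds: float = 1.0,
) -> Optional[str]:
    """Call ask_fn(prompt); retry on any exception, return None if every attempt fails."""
    # One initial attempt plus `retries` further attempts.
    for attempt in range(1, retries + 2):
        try:
            return ask_fn(prompt)
        except Exception as e:  # assumption: narrow this to the errors you actually expect
            print(f"attempt {attempt} failed: {e}")
            if attempt <= retries:
                time.sleep(delay_seconds)
    return None
```

In the main loop this could be used as `reply = ask_with_retry(ask_fn, corrected_text)` followed by `if reply is not None: speak(reply)`. A retry will not help if the machine is genuinely out of memory, so this only covers the transient case.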

Since the runner had crashed, I quit Chrome once to free up memory and ran the script again.

This time, however, no sound came out.

engine = pyttsx3.init('nsss')  # explicitly select the macOS (NSSpeechSynthesizer) driver

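Specifying the driver explicitly like this did not help either.

So I created a separate test file with

touch tts_test.py

and put the following in it: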

import pyttsx3

engine = pyttsx3.init('nsss')
text = "こんにちは。今日はどんな気分ですか？映画でも見ませんか？"
engine.say(text)
engine.runAndWait()

I ran this.

Still no sound was played.
After trying various things, the sound finally played once I rebooted the machine.
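
The reboot is what fixed it in my case, but when pyttsx3 stays silent on Japanese text, one more thing that may be worth checking is whether a Japanese voice is actually selected. The sketch below is only an assumption about a possible cause, not something I confirmed here. It relies on the pyttsx3 calls getProperty('voices') and setProperty('voice', ...), and assumes each voice reports locale strings such as 'ja_JP' in its languages attribute. The helper names find_voice_by_language and select_voice are hypothetical.

```python
from typing import Any, Iterable, Optional


def find_voice_by_language(voices: Iterable[Any], lang_prefix: str) -> Optional[str]:
    """Return the id of the first voice whose language code starts with lang_prefix."""
    prefix = lang_prefix.lower()
    for voice in voices:
        for lang in getattr(voice, "languages", None) or []:
            # Some drivers report languages as bytes, so decode defensively.
            if isinstance(lang, bytes):
                lang = lang.decode("utf-8", errors="ignore")
            if str(lang).lower().startswith(prefix):
                return voice.id
    return None


def select_voice(engine: Any, lang_prefix: str = "ja") -> Optional[str]:
    """Switch the engine to a matching voice if one exists; return its id or None."""
    voice_id = find_voice_by_language(engine.getProperty("voices"), lang_prefix)
    if voice_id is not None:
        engine.setProperty("voice", voice_id)
    return voice_id
```

In tts_test.py this would be called as `print(select_voice(engine))` right after `pyttsx3.init(...)`. If it prints None, no voice with a matching language code was reported by the driver.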

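Apparently the reboot is what resolved it.
The error seems to have been caused by running out of memory,
so I may need a machine with somewhat higher specs for this to run comfortably.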

It works for now, so
next I will turn the speech playback part into a module.
After that I will try out a wake word engine.
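
As a rough idea of what that module could look like, here is a minimal sketch. The class name TextToSpeech and the helper strip_emoji are hypothetical, and the file name (something like module/module_tts.py) is also just a guess. To keep the sketch dependency-free, it swaps emoji.replace_emoji for a rough regular expression over common emoji ranges. The engine can be injected, so the class can be exercised without a real speech engine.

```python
import re
from typing import Any, Callable

# Rough emoji ranges; a stand-in for emoji.replace_emoji (assumption: not exhaustive).
_EMOJI_PATTERN = re.compile("[\U0001F000-\U0001FAFF\u2600-\u27BF\uFE0F\u200D]")


def strip_emoji(text: str) -> str:
    """Remove characters in the rough emoji ranges above."""
    return _EMOJI_PATTERN.sub("", text)


class TextToSpeech:
    """Thin wrapper around a pyttsx3-style engine (anything with say() and runAndWait())."""

    def __init__(self, engine: Any = None, cleaner: Callable[[str], str] = strip_emoji):
        if engine is None:
            import pyttsx3  # imported lazily so a fake engine can be used in tests
            engine = pyttsx3.init()
        self._engine = engine
        self._cleaner = cleaner

    def speak(self, text: str) -> str:
        """Clean the text, speak it if anything is left, and return what was spoken."""
        clean_text = self._cleaner(text).strip()
        if clean_text:
            self._engine.say(clean_text)
            self._engine.runAndWait()
        return clean_text
```

In main4.py, the module-level engine and the speak function would then be replaced by `tts = TextToSpeech()` and `tts.speak(ollama_reply)`. Passing `cleaner=remove_emoji` would keep the original emoji-library behavior.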
