認識したテキストをgemma3:4bへ質問して返答を表示する

🎯 目標
* マイクで音声入力
* faster-whisperでテキスト化
* Ollama（gemma3:4b）に質問として送信
* 返答を画面に表示するだけ（まだ読み上げはしない）

AudioToTextCorrector でテキストを取得
そのテキストを gemma3:4b に渡す
gemma3:4bの返答をprintする

これに伴い

from module.module_audio_to_text import AudioToTextCorrector

def main():
    audio_to_text = AudioToTextCorrector("config.json")

    recognized_texts = []

    while True:
        corrected_text = audio_to_text.record_and_correct(timeout_seconds=10)

        if corrected_text is None:
            print("終了条件に達したため、ループを抜けます。")
            break

        recognized_texts.append(corrected_text)
        print(corrected_text)

    if recognized_texts:
        message = "\n".join(recognized_texts)
        print("\n入力された音声テキスト一覧:")
        print(message)
    else:
        print("入力メッセージはありませんでした")

if __name__ == "__main__":
    main()

のコードを変更

これで実行

今日の天気
と音声入力すると

python main3.py
[2025-05-08 04:08:59.862] [ctranslate2] [thread 8630251] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.
stand by ready OK
recording...
finished

【認識・補正したテキスト】
今日の気候はどうでしょうか。

【gemma3:4bの返答】
はい、今日の気候についてお調べします。

現在（2024年5月16日16時30分）の天気は以下の通りです。

*   **場所:** 東京都
*   **天気:** 晴れ
*   **気温:** 18℃
*   **降水確率:** 0%
*   **風:** 北西風 3～5km/h

より詳細な情報が必要な場合は、場所を具体的に指定してください。例えば、「大阪府の今日の天気は？」のように聞いていただければ、より正確な情報をお伝えできます。

また、以下のサイトでも最新の天気予報を確認できます。

*   **Yahoo!天気・災害:** [https://weather.yahoo.co.jp/](https://weather.yahoo.co.jp/)
*   **日本気象協会 tenki.jp:** [https://tenki.jp/](https://tenki.jp/)
stand by ready OK
recording...
finished
10秒間音声が入力されなかったため、処理を終了します。
終了条件に達したため、ループを抜けます。

となる

とりあえず音声で入力し
これを認識補正することで簡単な質問でも適切に回答可能できそう

次は
gemma3:4bの返答を
pyttsx3 を使って
自然にリアルタイム読み上げる

コメントを残す コメントをキャンセル

コメントを残すコメントをキャンセル