Flyer Analysis (Gemini) – Linux & Android Dialy

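The article "【AI × チラシ解析】忙しい社会人のための“節約レシピ提案アプリ”を作ってみた" ("[AI × Flyer Analysis] I Built a 'Money-Saving Recipe Suggestion App' for Busy Working Adults") looks like exactly what I was after.

I created a file for the script:

touch image_analysis.py

My flyer image is a .png, so the script needs to load that format.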

import glob

from google import genai  # provided by the google-genai package
from PIL import Image

class Gemini:
    def __init__(self):
        API_KEY = "YOUR_API_KEY"  # placeholder for the key you generated
        self.model = "gemini-2.0-flash"
        self.client = genai.Client(api_key=API_KEY)
        # Prompt (Japanese): "These are supermarket ad images. Extract *all*
        # products and prices shown in each ad as a list. Also suggest a recipe
        # for tonight using the listed ingredients, with the approximate cost
        # per serving. Seasonings, rice and similar staples may be used freely."
        self.prompt = "スーパーマーケットの広告画像です。それぞれの広告に掲載されている商品と価格を”全て”抽出してリストにしてください。また掲載されている食材を使った今晩のレシピを提案してください。その際、1人前のおおよその価格も計算して教えてください。なお、調味料や米などは自由に使えるものとします。"

    def loadImage(self):
        # Load the scraped flyer images
        image_paths = glob.glob("./source/*.jpg")
        self.images = [Image.open(path) for path in image_paths]

    def run(self):
        # Pass the images and the prompt as one flat list of parts
        response = self.client.models.generate_content(
            model=self.model,
            contents=[*self.images, self.prompt],
        )
        self.response = response.text
        print(self.response)

That is the reference code.

I changed it to output only the product list, then copied the image into a source directory:

mkdir source
cp step-1.png source
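
The reference code only globs for .jpg files, while the image here is a .png. As a minimal sketch, assuming the images live in ./source, a helper that accepts several extensions avoids editing the pattern each time. The name find_images and the default extension list are illustrative assumptions, not part of the reference code.

```python
from pathlib import Path


def find_images(directory="./source", extensions=(".png", ".jpg", ".jpeg")):
    """Return sorted paths of files in `directory` with one of `extensions`."""
    folder = Path(directory)
    if not folder.is_dir():
        # A missing directory is treated as "no images" rather than an error.
        return []
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        path
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in wanted
    )
```

The returned paths can be passed to Image.open() one by one, in the same way the script does with the output of glob.glob().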

When I ran the script, I got:

Traceback (most recent call last):
  File "/Users/snowpool/aw10s/gemini/image_analysis.py", line 1, in <module>
    from google import genai
ImportError: cannot import name 'genai' from 'google' (unknown location)

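At first I assumed the import statement was wrong. (In fact, from google import genai is the import used by the google-genai package, so this error more likely means that package was not installed.)

I changed the import to

import google.generativeai as genai

but the script still did nothing, because it had no main block. So I added one: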

if __name__ == "__main__":
    gemini = Gemini()
    gemini.loadImage()
    gemini.run()

With that added, running the script failed:

python image_analysis.py
Traceback (most recent call last):
  File "/Users/snowpool/aw10s/gemini/image_analysis.py", line 34, in <module>
    gemini = Gemini()
             ^^^^^^^^
  File "/Users/snowpool/aw10s/gemini/image_analysis.py", line 9, in __init__
    self.client = genai.Client(api_key=API_KEY)
                  ^^^^^^^^^^^^
AttributeError: module 'google.generativeai' has no attribute 'Client'
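
Both tracebacks are consistent with two different Python packages being mixed up: from google import genai together with genai.Client belongs to the google-genai package, while import google.generativeai as genai together with genai.configure() belongs to the older google-generativeai package. As a minimal sketch, the standard library can report which of the two is installed. The helper name installed_sdks is an illustrative assumption.

```python
from importlib import metadata

# PyPI distribution names of the two Gemini SDKs involved here.
SDK_PACKAGES = ("google-genai", "google-generativeai")


def installed_sdks(packages=SDK_PACKAGES):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_sdks().items():
        print(f"{name}: {version or 'not installed'}")
```

If only google-generativeai shows a version, the genai.Client style of the reference code cannot work until google-genai is installed as well.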

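I was told that Google's google-generativeai library has no class called Client.

The suggested fix was to call genai.configure() and create a GenerativeModel directly, along with this model guidance:
gemini-2.0-flash → text only (no image support)
gemini-pro-vision → supports image input ✅

This guidance turned out to be unreliable: gemini-pro-vision has been deprecated, as the error further down shows, and gemini-2.0-flash does accept image input. At the time, though, I followed it: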

import glob
from PIL import Image
import google.generativeai as genai

class Gemini:
    def __init__(self):
        API_KEY = "YOUR_API_KEY"  # placeholder; never hard-code a real key
        genai.configure(api_key=API_KEY)

        # Select the model (gemini-pro-vision was the suggested image-capable model)
        self.model = genai.GenerativeModel("gemini-pro-vision")
        # Prompt (Japanese): "These are supermarket ad images. Extract *all*
        # products and prices shown in each ad as a list."
        self.prompt = "スーパーマーケットの広告画像です。それぞれの広告に掲載されている商品と価格を”全て”抽出してリストにしてください。"

    def loadImage(self):
        # Collect ./source/*.png
        image_paths = glob.glob("./source/*.png")
        if not image_paths:
            print("⚠️ No images found")
        self.images = [Image.open(path) for path in image_paths]

    def run(self):
        for idx, image in enumerate(self.images):
            print(f"🖼 Processing image {idx+1}...")
            try:
                response = self.model.generate_content(
                    [self.prompt, image],
                    stream=False
                )
                print("✅ Result:")
                print(response.text)
            except Exception as e:
                print("⚠️ Error:", e)

# Entry point
if __name__ == "__main__":
    gemini = Gemini()
    gemini.loadImage()
    gemini.run()
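
The script above keeps the API key in the source file, which makes it easy to publish by accident. A minimal sketch of reading it from the environment instead, assuming a variable named GEMINI_API_KEY (the variable name and the helper load_api_key are assumptions made for this example):

```python
import os


def load_api_key(env_var="GEMINI_API_KEY"):
    """Read the API key from an environment variable, failing loudly if unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key
```

In __init__ this would replace the literal, for example genai.configure(api_key=load_api_key()).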

Running this version failed, though:

🖼 Processing image 1...
⚠️ Error: 404 Gemini 1.0 Pro Vision has been deprecated on July 12, 2024. Consider switching to different model, for example gemini-1.5-flash.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1745014048.933884 7409777 init.cc:232] grpc_wait_for_shutdown_with_timeout() timed out.

Switching to gemini-1.5-pro-vision did not help either:

🖼 Processing image 1...
⚠️ Error: 404 models/gemini-1.5-pro-vision is not found for API version v1beta, or is not supported for generateContent. Call ListModels to see the list of available models and their supported methods.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1745014171.856705 7412024 init.cc:232] grpc_wait_for_shutdown_with_timeout() timed out.
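
The 404 message suggests calling ListModels instead of guessing model names. A minimal sketch, assuming the google-generativeai package used in this post and an API key in a GEMINI_API_KEY environment variable (that variable name and the helper models_supporting are assumptions for this example). genai.list_models() and the supported_generation_methods attribute come from that library.

```python
import os


def models_supporting(models, method="generateContent"):
    """Return the names of models whose supported methods include `method`."""
    return [
        model.name
        for model in models
        if method in getattr(model, "supported_generation_methods", [])
    ]


if __name__ == "__main__":
    api_key = os.environ.get("GEMINI_API_KEY")  # assumed variable name
    if not api_key:
        print("GEMINI_API_KEY is not set; skipping the API call.")
    else:
        try:
            import google.generativeai as genai
        except ImportError:
            print("google-generativeai is not installed.")
        else:
            genai.configure(api_key=api_key)
            for name in models_supporting(genai.list_models()):
                print(name)
```

Any name it prints can be passed to genai.GenerativeModel(); whether a given model accepts images still has to be checked against the model documentation.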

So I changed the model version again, this time to gemini-1.5-pro:

import glob
from PIL import Image
import google.generativeai as genai

class Gemini:
    def __init__(self):
        API_KEY = "YOUR_API_KEY"  # placeholder; never hard-code a real key
        genai.configure(api_key=API_KEY)

        # Select the model (gemini-1.5-pro accepts image input)
        self.model = genai.GenerativeModel("gemini-1.5-pro")
        # Prompt (Japanese): "These are supermarket ad images. Extract *all*
        # products and prices shown in each ad as a list, without altering them."
        self.prompt = "スーパーマーケットの広告画像です。それぞれの広告に掲載されている商品と価格を変更せずに”全て”抽出してリストにしてください。"

    def loadImage(self):
        # Collect ./source/*.png
        image_paths = glob.glob("./source/*.png")
        if not image_paths:
            print("⚠️ No images found")
        self.images = [Image.open(path) for path in image_paths]

    def run(self):
        for idx, image in enumerate(self.images):
            print(f"🖼 Processing image {idx+1}...")
            try:
                response = self.model.generate_content(
                    [self.prompt, image],
                    stream=False
                )
                print("✅ Result:")
                print(response.text)
            except Exception as e:
                print("⚠️ Error:", e)

# Entry point
if __name__ == "__main__":
    gemini = Gemini()
    gemini.loadImage()
    gemini.run()

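Running this produced:

✅ Result:
The list is as follows.

* Cup Noodle: 125 yen
* Fabric softener: 699 yen
* Frozen food: 109 yen
* Pocari Sweat: 329 yen
* Kirin Lemon: 459 yen
* Ochazuke: 179 yen
* Adult diapers: 299 yen
* Mixed nuts: 329 yen
* Milk: 299 yen
* Dressing: 249 yen
* Adhesive bandages: 169 yen
* Face mask: 99 yen
* Soy milk drink: 259 yen
* Tissue paper: 399 yen
* Häagen-Dazs: 199 yen
* Nori: 199 yen
* Sliced bread: 89 yen
* Wet wipes: 99 yen
* Bananas: 139 yen
* Eggs: 128 yen
* Potato chips: 159 yen
* Bacon: 89 yen
* Cherry tomatoes: 299 yen
* Cucumbers: 249 yen
* Salad chicken: 299 yen
* Tofu: 369 yen
* Yogurt: 179 yen
* Frozen udon: 848 yen
* Frozen food: 880 yen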
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1745015805.274105 7441488 init.cc:232] grpc_wait_for_shutdown_with_timeout() timed out.

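The original prompt was

        self.prompt = "スーパーマーケットの広告画像です。それぞれの広告に掲載されている商品と価格を”全て”抽出してリストにしてください。"

("These are supermarket ad images. Extract *all* products and prices shown in each ad as a list.") It differs from the prompt above only in lacking the phrase 変更せずに ("without altering them").

With the original prompt, many items came back with two prices:

🖼 Processing image 1...
✅ Result:
Here is a list of the products and prices shown in the image.

* Cup noodles: 109 yen
* Drink: 30 yen/329 yen
* Sliced bread: 125 yen/249 yen
* Insecticide: 50 yen/699 yen
* Pocari Sweat: 150 yen/459 yen
* Chicken: 179 yen/329 yen
* Milk: 299 yen
* Pen: 20 yen/249 yen
* Adhesive bandages: 169 yen/119 yen
* Carton drink: 10 yen/299 yen
* Shampoo: 259 yen/399 yen
* Häagen-Dazs: 250 yen/199 yen
* Green tea: 199 yen
* Toilet paper: 89 yen
* Eggs: 99 yen
* Bananas: 139 yen/128 yen
* Tofu: 159 yen
* Natto: 89 yen
* Fried tofu: 299 yen
* Yogurt: 249 yen
* Mandarin oranges: 299 yen/369 yen
* Sausages: 179 yen/848 yen
* Ham: 880 yen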
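
Comparing the two runs by eye is tedious. As a minimal sketch, assuming the model keeps returning one "* name: price" bullet per item as in the outputs above, the text can be parsed into (name, prices) pairs so that entries with more than one price are easy to spot. The function name parse_price_list is illustrative, and the format assumption may not hold for other flyers or prompts.

```python
import re

# Matches bullets such as "* 牛乳：299円", "* Milk: 299 yen" or "* 卵：99円/128円".
LINE_RE = re.compile(r"^\*\s*(?P<name>.+?)\s*[：:]\s*(?P<prices>.+)$")
PRICE_RE = re.compile(r"(\d+)\s*(?:円|yen)")


def parse_price_list(text):
    """Return a list of (product name, [prices in yen]) from bullet lines."""
    items = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip headings, blank lines and log output
        prices = [int(p) for p in PRICE_RE.findall(match.group("prices"))]
        if prices:
            items.append((match.group("name"), prices))
    return items


def ambiguous_items(items):
    """Return the names of items that were reported with more than one price."""
    return [name for name, prices in items if len(prices) > 1]
```

Running ambiguous_items(parse_price_list(response.text)) on each run gives a quick count of how many entries a prompt left with conflicting prices.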
