Trying Yomitoku on Colab to analyze photos and receipts

I want to try analyzing photos and receipts with Yomitoku.

I found an example that runs it on Colab, so I will use that as a reference.

Apparently the information you get back differs depending on the output format.

I ran the experiment on an A100.

The library is published at
https://github.com/kotaro-kinoshita/yomitoku
so that is the place to read the documentation.

A GPU is basically required.
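Before installing anything, it may be worth confirming that the Colab runtime actually has a GPU attached. The following is a minimal sketch of my own, not something from the Yomitoku docs: it only checks whether the `nvidia-smi` tool is present and lists at least one device. The helper name `gpu_visible` is hypothetical.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if nvidia-smi is on PATH and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # No NVIDIA tooling found, so most likely a CPU-only runtime.
        return False
    try:
        # "nvidia-smi -L" prints one line per GPU, e.g. "GPU 0: ...".
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible:", gpu_visible())
```

If this prints False on Colab, switching the runtime type to a GPU one before continuing is probably the first thing to try.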

! pip install yomitoku

This installs the library.

At the end of the install I was told to restart the session, so I restarted it as the dialog box prompted.

Next, prepare the photo.

Since this is Google Colab, create an image folder under /content and put the JPG files in it.
If you point Yomitoku at the whole folder, it analyzes every image file inside.

!mkdir image

This creates the folder.
Upload the photo into it.

aw10s/ollama/images/test.jpg

is the file I uploaded.
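To double-check what actually landed in the folder before running the analysis, a small listing helper is handy. This is only a sketch: `list_images` is a hypothetical helper, and the suffix set below is my assumption about which files count as images, not the list of formats Yomitoku accepts.

```python
from pathlib import Path

# Assumption: suffixes I treat as images for this check only.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_images(folder):
    """Create the folder if needed and return the image files in it, sorted."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    return sorted(
        p
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

On Colab, `list_images("/content/image")` should then show test.jpg once the upload has finished.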

! yomitoku /content/image/ -f md -o results -v --figure

This runs the analysis.
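If you would rather launch the same command from Python than from a `!` cell, something like the following should work. It is a sketch that only reuses the flags from the command above. My reading is that `-f` picks the output format and `-o` the output directory, and that `-v` and `--figure` correspond to the visualization images and extracted figures that appear in the output, but I have not checked that against the docs. The function name is hypothetical.

```python
import shlex


def build_yomitoku_command(input_path, fmt="md", outdir="results"):
    """Build the argument list for the same CLI call used in this post."""
    return ["yomitoku", str(input_path), "-f", fmt, "-o", outdir, "-v", "--figure"]


cmd = build_yomitoku_command("/content/image/", fmt="md")
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would then execute it, assuming the `yomitoku` command is on the PATH after the install.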

Result:

2024-12-15 21:31:34,841 - yomitoku.base - INFO - Initialize TextDetector
model.safetensors: 100% 102M/102M [00:04<00:00, 23.8MB/s]
2024-12-15 21:31:40,639 - yomitoku.base - INFO - Initialize TextRecognizer
config.json: 100% 256/256 [00:00<00:00, 1.87MB/s]
model.safetensors: 100% 200M/200M [00:08<00:00, 23.9MB/s]
2024-12-15 21:31:50,844 - yomitoku.base - INFO - Initialize LayoutParser
model.safetensors: 100% 172M/172M [00:07<00:00, 23.5MB/s]
2024-12-15 21:31:59,554 - yomitoku.base - INFO - Initialize TableStructureRecognizer
model.safetensors: 100% 172M/172M [00:03<00:00, 43.0MB/s]
2024-12-15 21:32:04,793 - yomitoku.cli.main - INFO - Output directory: results
2024-12-15 21:32:04,793 - yomitoku.cli.main - INFO - Processing file: /content/image/test.jpg
2024-12-15 21:32:06,818 - yomitoku.base - INFO - TextDetector __call__ elapsed_time: 1.9037435054779053
2024-12-15 21:32:07,114 - yomitoku.base - INFO - LayoutParser __call__ elapsed_time: 2.19960355758667
2024-12-15 21:32:07,247 - yomitoku.base - INFO - TableStructureRecognizer __call__ elapsed_time: 0.13303065299987793
2024-12-15 21:32:08,789 - yomitoku.base - INFO - TextRecognizer __call__ elapsed_time: 1.9707973003387451
2024-12-15 21:32:08,863 - yomitoku.cli.main - INFO - Output file: results/image_test_p1_ocr.jpg
2024-12-15 21:32:08,914 - yomitoku.cli.main - INFO - Output file: results/image_test_p1_layout.jpg
2024-12-15 21:32:08,927 - yomitoku.cli.main - INFO - Output file: results/image_test_p1.md
2024-12-15 21:32:08,928 - yomitoku.cli.main - INFO - Total Processing time: 4.13 sec

The results end up in the results folder, which is created automatically.

The output file is named along the lines of
<filename>_p1.md (in this run the input folder name was prepended as well).
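Going only by the log above, where /content/image/test.jpg became results/image_test_p1.md, the naming looks like folder name, file stem, then page number. Here is a sketch of that guess. It is inferred from this single run, so how nested folders or multi-page inputs are named is an open question, and `guess_output_name` is a hypothetical helper.

```python
from pathlib import Path


def guess_output_name(input_file, fmt, page=1):
    """Guess the output file name: <parent folder>_<stem>_p<page>.<fmt>."""
    p = Path(input_file)
    return f"{p.parent.name}_{p.stem}_p{page}.{fmt}"


print(guess_output_name("/content/image/test.jpg", "md"))
```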
The contents of image_test_p1.md are:

<img src="figures/image_test_p1_figure_0.png" width="200px"><br>
登録番号 T5080401017738<br>とれたて食楽部<br>静岡県袋井市山名町3\-3<br>TEL 0538\-41\-1100

2024年 8月10日\(土\)08:59 \#000011<br>000801精算機1<br>000801精算機1<br>3901

09:08<br>R9309<br>\#000003<br>お会計券<br>西澤<br>000008

¥150<br>内8 ★きゅうり/鈴木 仁<br>P2023300101503

内8 ★きゅうり/小林宗作

P2055600101303

¥130

内8 リーフレタス/\(有\)成神工 ¥216<br>P2086402402169

|小計<br>\(内税 8%対象額<br>買上点数|¥496|
|-|-|
||¥496\)|
||3点|
|合計|¥496|
|\(税率 8%対象額|¥496\)|
|\(内消費税等 8%|¥36\)|
|課税事業者||
|\(税率 8%対象額|¥216\)|
|\(内消費税等 8%|¥16\)|
|免税事業者||
|\(税率 8%対象額|¥280\)|

¥496<br>クレジット<br>¥36\)<br>\(内消費税等

、内Sは軽減税率対象商品です。

This is the content that was OCR'd.
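The Markdown keeps line breaks as `<br>` and backslash-escapes characters such as `-`, `(` and `#`, which gets in the way if you just want the plain text. Below is a rough sketch of how I would flatten it. The helper name is hypothetical, and skipping the image tag and table rows is my own simplification.

```python
import re

# Matches a backslash followed by a Markdown-special character.
ESCAPED = re.compile(r"\\([\\`*_{}\[\]()#+\-.!|])")


def markdown_to_lines(md_text):
    """Split on <br>, drop Markdown backslash escapes, skip images and tables."""
    lines = []
    for block in md_text.splitlines():
        stripped = block.strip()
        if stripped.startswith("<img") or stripped.startswith("|"):
            continue  # simplification: ignore figure tags and table rows
        for part in stripped.split("<br>"):
            part = ESCAPED.sub(r"\1", part).strip()
            if part:
                lines.append(part)
    return lines


# First text block of the output above, copied verbatim.
sample = r"登録番号 T5080401017738<br>とれたて食楽部<br>静岡県袋井市山名町3\-3<br>TEL 0538\-41\-1100"
for line in markdown_to_lines(sample):
    print(line)
```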

Next,
since the information you get apparently differs by output format,
I output the result in JSON format.

! yomitoku /content/image/ -f json -o results -v --figure

The result:

2024-12-16 21:00:01,501 - yomitoku.base - INFO - Initialize TextDetector
2024-12-16 21:00:02,677 - yomitoku.base - INFO - Initialize TextRecognizer
2024-12-16 21:00:04,059 - yomitoku.base - INFO - Initialize LayoutParser
2024-12-16 21:00:05,086 - yomitoku.base - INFO - Initialize TableStructureRecognizer
2024-12-16 21:00:06,063 - yomitoku.cli.main - INFO - Output directory: results
2024-12-16 21:00:06,063 - yomitoku.cli.main - INFO - Processing file: /content/image/test.jpg
2024-12-16 21:00:07,057 - yomitoku.base - INFO - TextDetector __call__ elapsed_time: 0.9070024490356445
2024-12-16 21:00:07,207 - yomitoku.base - INFO - LayoutParser __call__ elapsed_time: 1.0565669536590576
2024-12-16 21:00:07,340 - yomitoku.base - INFO - TableStructureRecognizer __call__ elapsed_time: 0.1329059600830078
2024-12-16 21:00:08,864 - yomitoku.base - INFO - TextRecognizer __call__ elapsed_time: 1.8073179721832275
2024-12-16 21:00:08,940 - yomitoku.cli.main - INFO - Output file: results/image_test_p1_ocr.jpg
2024-12-16 21:00:08,992 - yomitoku.cli.main - INFO - Output file: results/image_test_p1_layout.jpg
2024-12-16 21:00:08,994 - yomitoku.cli.main - INFO - Output file: results/image_test_p1.json
2024-12-16 21:00:08,995 - yomitoku.cli.main - INFO - Total Processing time: 2.93 sec

The output file is
image_test_p1.json
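The JSON is where the extra information shows up: reading order, table cell positions, and per-word scores. The sketch below shows one way to pull those out with only the standard library. It assumes the top-level keys `paragraphs`, `tables` and `words` and the field names visible in the listing that follows. The helper names and the 0.5 threshold are my own choices, and the sample dict is a trimmed copy of a few values from that listing.

```python
import json
from pathlib import Path


def load_result(path):
    """Load one Yomitoku JSON result file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def paragraphs_in_order(result):
    """Paragraph texts sorted by their "order" field."""
    paragraphs = sorted(result.get("paragraphs", []), key=lambda p: p["order"])
    return [p["contents"] for p in paragraphs]


def table_to_grid(table):
    """Place each cell's text at (row, col); both are 1-based in this output.

    Cells with row_span or col_span > 1 are written at their top-left only.
    """
    grid = [["" for _ in range(table["n_col"])] for _ in range(table["n_row"])]
    for cell in table["cells"]:
        grid[cell["row"] - 1][cell["col"] - 1] = cell["contents"]
    return grid


def low_confidence_words(result, threshold=0.5):
    """Words whose rec_score is below the threshold (0.5 is an assumption)."""
    return [
        (w["content"], w["rec_score"])
        for w in result.get("words", [])
        if w["rec_score"] < threshold
    ]


# Trimmed sample: values copied from the real output, most entries omitted.
sample = {
    "paragraphs": [
        {"contents": "¥130", "order": 7},
        {"contents": "P2055600101303", "order": 6},
    ],
    "tables": [
        {
            "n_row": 11,
            "n_col": 2,
            "cells": [
                {"row": 4, "col": 1, "contents": "合計"},
                {"row": 4, "col": 2, "contents": "¥496"},
                {"row": 5, "col": 1, "contents": "(税率 8%対象額"},
                {"row": 5, "col": 2, "contents": "¥496)"},
            ],
        }
    ],
    "words": [
        {"content": "、内Sは軽減税率対象商品です。", "rec_score": 0.03351603075861931},
        {"content": "クレジット", "rec_score": 0.9985630512237549},
    ],
}

print(paragraphs_in_order(sample))
print(table_to_grid(sample["tables"][0])[3])
print(low_confidence_words(sample))
```

On the real file, `load_result("results/image_test_p1.json")` would replace the sample. Among the words shown in the listing, the footer note (rec_score about 0.03) and the leaf lettuce line (about 0.49) are the ones that fall under 0.5, which matches where the OCR text looks shakiest.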

Its contents:

{
    "figures": [
        {
            "box": [
                586,
                249,
                1633,
                503
            ],
            "direction": "horizontal",
            "order": 0,
            "paragraphs": []
        }
    ],
    "paragraphs": [
        {
            "box": [
                638,
                569,
                1548,
                872
            ],
            "contents": "登録番号 T5080401017738\nとれたて食楽部\n静岡県袋井市山名町3-3\nTEL 0538-41-1100",
            "direction": "horizontal",
            "order": 1,
            "role": null
        },
        {
            "box": [
                569,
                924,
                1614,
                1080
            ],
            "contents": "2024年 8月10日(土)08:59 #000011\n000801精算機1\n000801精算機1\n3901",
            "direction": "horizontal",
            "order": 2,
            "role": null
        },
        {
            "box": [
                545,
                1173,
                1636,
                1347
            ],
            "contents": "09:08\nR9309\n#000003\nお会計券\n西澤\n000008",
            "direction": "horizontal",
            "order": 3,
            "role": null
        },
        {
            "box": [
                516,
                1387,
                1688,
                1577
            ],
            "contents": "¥150\n内8 ★きゅうり/鈴木 仁\nP2023300101503",
            "direction": "horizontal",
            "order": 4,
            "role": null
        },
        {
            "box": [
                485,
                1558,
                1448,
                1660
            ],
            "contents": "内8 ★きゅうり/小林宗作",
            "direction": "horizontal",
            "order": 5,
            "role": null
        },
        {
            "box": [
                641,
                1652,
                1186,
                1734
            ],
            "contents": "P2055600101303",
            "direction": "horizontal",
            "order": 6,
            "role": null
        },
        {
            "box": [
                1532,
                1534,
                1703,
                1620
            ],
            "contents": "¥130",
            "direction": "horizontal",
            "order": 7,
            "role": null
        },
        {
            "box": [
                484,
                1695,
                1712,
                1904
            ],
            "contents": "内8 リーフレタス/(有)成神工 ¥216\nP2086402402169",
            "direction": "horizontal",
            "order": 8,
            "role": null
        },
        {
            "box": [
                341,
                3432,
                1884,
                3719
            ],
            "contents": "¥496\nクレジット\n¥36)\n(内消費税等",
            "direction": "horizontal",
            "order": 10,
            "role": null
        },
        {
            "box": [
                288,
                3822,
                1779,
                4077
            ],
            "contents": "、内Sは軽減税率対象商品です。",
            "direction": "horizontal",
            "order": 11,
            "role": null
        }
    ],
    "tables": [
        {
            "box": [
                346,
                1915,
                1852,
                3358
            ],
            "cells": [
                {
                    "box": [
                        347,
                        1921,
                        1279,
                        2310
                    ],
                    "col": 1,
                    "col_span": 1,
                    "contents": "小計\n(内税 8%対象額\n買上点数",
                    "row": 1,
                    "row_span": 3
                },
                {
                    "box": [
                        1276,
                        1925,
                        1851,
                        2039
                    ],
                    "col": 2,
                    "col_span": 1,
                    "contents": "¥496",
                    "row": 1,
                    "row_span": 1
                },
                {
                    "box": [
                        1276,
                        2043,
                        1851,
                        2133
                    ],
                    "col": 2,
                    "col_span": 1,
                    "contents": "¥496)",
                    "row": 2,
                    "row_span": 1
                },
                {
                    "box": [
                        1276,
                        2152,
                        1851,
                        2289
                    ],
                    "col": 2,
                    "col_span": 1,
                    "contents": "3点",
                    "row": 3,
                    "row_span": 1
                },
                {
                    "box": [
                        347,
                        2296,
                        1280,
                        2426
                    ],
                    "col": 1,
                    "col_span": 1,
                    "contents": "合計",
                    "row": 4,
                    "row_span": 1
                },
                {
                    "box": [
                        1276,
                        2296,
                        1851,
                        2426
                    ],
                    "col": 2,
                    "col_span": 1,
                    "contents": "¥496",
                    "row": 4,
                    "row_span": 1
                },
                {
                    "box": [
                        348,
                        2424,
                        1280,
                        2526
                    ],
                    "col": 1,
                    "col_span": 1,
                    "contents": "(税率 8%対象額",
                    "row": 5,
                    "row_span": 1
                },
                {
                    "box": [
                        1276,
                        2424,
                        1851,
                        2526
                    ],
                    "col": 2,
                    "col_span": 1,
                    "contents": "¥496)",
                    "row": 5,
                    "row_span": 1
                },
                {
                    "box": [
                        347,
                        2540,
                        1280,
                        2695
                    ],
                    "col": 1,
                    "col_span": 1,
                    "contents": "(内消費税等 8%",
                    "row": 6,
                    "row_span": 1
                },
                {
                    "box": [
                        1276,
                        2540,
                        1851,
                        2695
                    ],
                    "col": 2,
                    "col_span": 1,
                    "contents": "¥36)",
                    "row": 6,
                    "row_span": 1
                },
                {
                    "box": [
                        348,
                        2705,
                        1280,
                        2852
                    ],
                    "col": 1,
                    "col_span": 1,
                    "contents": "課税事業者",
                    "row": 7,
                    "row_span": 1
                },
                {
                    "box": [
                        1276,
                        2705,
                        1851,
                        2852
                    ],
                    "col": 2,
                    "col_span": 1,
                    "contents": "",
                    "row": 7,
                    "row_span": 1
                },
                {
                    "box": [
                        348,
                        2865,
                        1280,
                        2951
                    ],
                    "col": 1,
                    "col_span": 1,
                    "contents": "(税率 8%対象額",
                    "row": 8,
                    "row_span": 1
                },
                {
                    "box": [
                        1276,
                        2865,
                        1851,
                        2951
                    ],
                    "col": 2,
                    "col_span": 1,
                    "contents": "¥216)",
                    "row": 8,
                    "row_span": 1
                },
                {
                    "box": [
                        347,
                        2980,
                        1280,
                        3102
                    ],
                    "col": 1,
                    "col_span": 1,
                    "contents": "(内消費税等 8%",
                    "row": 9,
                    "row_span": 1
                },
                {
                    "box": [
                        1276,
                        2980,
                        1851,
                        3102
                    ],
                    "col": 2,
                    "col_span": 1,
                    "contents": "¥16)",
                    "row": 9,
                    "row_span": 1
                },
                {
                    "box": [
                        348,
                        3103,
                        1280,
                        3202
                    ],
                    "col": 1,
                    "col_span": 1,
                    "contents": "免税事業者",
                    "row": 10,
                    "row_span": 1
                },
                {
                    "box": [
                        1276,
                        3103,
                        1851,
                        3202
                    ],
                    "col": 2,
                    "col_span": 1,
                    "contents": "",
                    "row": 10,
                    "row_span": 1
                },
                {
                    "box": [
                        347,
                        3211,
                        1280,
                        3355
                    ],
                    "col": 1,
                    "col_span": 1,
                    "contents": "(税率 8%対象額",
                    "row": 11,
                    "row_span": 1
                },
                {
                    "box": [
                        1276,
                        3211,
                        1851,
                        3355
                    ],
                    "col": 2,
                    "col_span": 1,
                    "contents": "¥280)",
                    "row": 11,
                    "row_span": 1
                }
            ],
            "n_col": 2,
            "n_row": 11,
            "order": 9
        }
    ],
    "words": [
        {
            "content": "、内Sは軽減税率対象商品です。",
            "det_score": 0.6796720981104631,
            "direction": "horizontal",
            "points": [
                [
                    431,
                    3885
                ],
                [
                    1752,
                    3813
                ],
                [
                    1759,
                    3925
                ],
                [
                    438,
                    3997
                ]
            ],
            "rec_score": 0.03351603075861931
        },
        {
            "content": "(内消費税等",
            "det_score": 0.8627398538033404,
            "direction": "horizontal",
            "points": [
                [
                    871,
                    3594
                ],
                [
                    1378,
                    3577
                ],
                [
                    1381,
                    3686
                ],
                [
                    875,
                    3704
                ]
            ],
            "rec_score": 0.979337215423584
        },
        {
            "content": "¥36)",
            "det_score": 0.7510144629009029,
            "direction": "horizontal",
            "points": [
                [
                    1694,
                    3557
                ],
                [
                    1881,
                    3557
                ],
                [
                    1881,
                    3664
                ],
                [
                    1694,
                    3664
                ]
            ],
            "rec_score": 0.9134319424629211
        },
        {
            "content": "クレジット",
            "det_score": 0.8124990954358461,
            "direction": "horizontal",
            "points": [
                [
                    350,
                    3484
                ],
                [
                    1270,
                    3465
                ],
                [
                    1272,
                    3569
                ],
                [
                    352,
                    3589
                ]
            ],
            "rec_score": 0.9985630512237549
        },
        {
            "content": "¥496",
            "det_score": 0.8485499834607593,
            "direction": "horizontal",
            "points": [
                [
                    1463,
                    3438
                ],
                [
                    1843,
                    3429
                ],
                [
                    1846,
                    3534
                ],
                [
                    1465,
                    3543
                ]
            ],
            "rec_score": 0.9998164176940918
        },
        {
            "content": "(税率 8%対象額",
            "det_score": 0.853462993147711,
            "direction": "horizontal",
            "points": [
                [
                    459,
                    3234
                ],
                [
                    1102,
                    3225
                ],
                [
                    1103,
                    3327
                ],
                [
                    460,
                    3336
                ]
            ],
            "rec_score": 0.9029322266578674
        },
        {
            "content": "¥280)",
            "det_score": 0.8697890050123379,
            "direction": "horizontal",
            "points": [
                [
                    1632,
                    3182
                ],
                [
                    1858,
                    3182
                ],
                [
                    1858,
                    3295
                ],
                [
                    1632,
                    3295
                ]
            ],
            "rec_score": 0.9970043301582336
        },
        {
            "content": "免税事業者",
            "det_score": 0.8758680560327103,
            "direction": "horizontal",
            "points": [
                [
                    357,
                    3103
                ],
                [
                    831,
                    3112
                ],
                [
                    829,
                    3219
                ],
                [
                    355,
                    3210
                ]
            ],
            "rec_score": 0.9983263611793518
        },
        {
            "content": "(内消費税等 8%",
            "det_score": 0.8655238629057851,
            "direction": "horizontal",
            "points": [
                [
                    475,
                    2996
                ],
                [
                    1100,
                    2996
                ],
                [
                    1100,
                    3093
                ],
                [
                    475,
                    3093
                ]
            ],
            "rec_score": 0.908987820148468
        },
        {
            "content": "¥16)",
            "det_score": 0.8736068132308011,
            "direction": "horizontal",
            "points": [
                [
                    1663,
                    2953
                ],
                [
                    1837,
                    2953
                ],
                [
                    1837,
                    3060
                ],
                [
                    1663,
                    3060
                ]
            ],
            "rec_score": 0.9977002143859863
        },
        {
            "content": "(税率 8%対象額",
            "det_score": 0.8736011394336691,
            "direction": "horizontal",
            "points": [
                [
                    488,
                    2884
                ],
                [
                    1100,
                    2884
                ],
                [
                    1100,
                    2981
                ],
                [
                    488,
                    2981
                ]
            ],
            "rec_score": 0.9527927041053772
        },
        {
            "content": "¥216)",
            "det_score": 0.8798753646381205,
            "direction": "horizontal",
            "points": [
                [
                    1613,
                    2841
                ],
                [
                    1827,
                    2835
                ],
                [
                    1830,
                    2940
                ],
                [
                    1616,
                    2946
                ]
            ],
            "rec_score": 0.9991167187690735
        },
        {
            "content": "課税事業者",
            "det_score": 0.8782089667459907,
            "direction": "horizontal",
            "points": [
                [
                    383,
                    2761
                ],
                [
                    839,
                    2773
                ],
                [
                    836,
                    2875
                ],
                [
                    381,
                    2863
                ]
            ],
            "rec_score": 0.9996485114097595
        },
        {
            "content": "(内消費税等 8%",
            "det_score": 0.8545332324999831,
            "direction": "horizontal",
            "points": [
                [
                    422,
                    2552
                ],
                [
                    1015,
                    2569
                ],
                [
                    1012,
                    2666
                ],
                [
                    419,
                    2649
                ]
            ],
            "rec_score": 0.9030265808105469
        },
        {
            "content": "¥36)",
            "det_score": 0.874239359391371,
            "direction": "horizontal",
            "points": [
                [
                    1637,
                    2517
                ],
                [
                    1801,
                    2517
                ],
                [
                    1801,
                    2614
                ],
                [
                    1637,
                    2614
                ]
            ],
            "rec_score": 0.9990556836128235
        },
        {
            "content": "(税率 8%対象額",
            "det_score": 0.8743577414363437,
            "direction": "horizontal",
            "points": [
                [
                    430,
                    2455
                ],
                [
                    1018,
                    2472
                ],
                [
                    1015,
                    2556
                ],
                [
                    427,
                    2539
                ]
            ],
            "rec_score": 0.9258973002433777
        },
        {
            "content": "¥496)",
            "det_score": 0.873335879837346,
            "direction": "horizontal",
            "points": [
                [
                    1584,
                    2421
                ],
                [
                    1787,
                    2411
                ],
                [
                    1792,
                    2503
                ],
                [
                    1589,
                    2513
                ]
            ],
            "rec_score": 0.9990760087966919
        },
        {
            "content": "合計",
            "det_score": 0.7989976852002054,
            "direction": "horizontal",
            "points": [
                [
                    422,
                    2351
                ],
                [
                    774,
                    2357
                ],
                [
                    772,
                    2459
                ],
                [
                    420,
                    2453
                ]
            ],
            "rec_score": 0.9596889615058899
        },
        {
            "content": "¥496",
            "det_score": 0.84043764577154,
            "direction": "horizontal",
            "points": [
                [
                    1423,
                    2337
                ],
                [
                    1756,
                    2319
                ],
                [
                    1761,
                    2406
                ],
                [
                    1428,
                    2424
                ]
            ],
            "rec_score": 0.9845718741416931
        },
        {
            "content": "買上点数",
            "det_score": 0.8724811626703719,
            "direction": "horizontal",
            "points": [
                [
                    474,
                    2164
                ],
                [
                    821,
                    2176
                ],
                [
                    818,
                    2263
                ],
                [
                    471,
                    2251
                ]
            ],
            "rec_score": 0.8529166579246521
        },
        {
            "content": "3点",
            "det_score": 0.877569906485891,
            "direction": "horizontal",
            "points": [
                [
                    1611,
                    2127
                ],
                [
                    1751,
                    2127
                ],
                [
                    1751,
                    2221
                ],
                [
                    1611,
                    2221
                ]
            ],
            "rec_score": 0.999936580657959
        },
        {
            "content": "(内税 8%対象額",
            "det_score": 0.8549906700132722,
            "direction": "horizontal",
            "points": [
                [
                    466,
                    2070
                ],
                [
                    1026,
                    2084
                ],
                [
                    1023,
                    2171
                ],
                [
                    464,
                    2157
                ]
            ],
            "rec_score": 0.6945281624794006
        },
        {
            "content": "¥496)",
            "det_score": 0.8776254550615946,
            "direction": "horizontal",
            "points": [
                [
                    1566,
                    2039
                ],
                [
                    1756,
                    2028
                ],
                [
                    1761,
                    2121
                ],
                [
                    1571,
                    2131
                ]
            ],
            "rec_score": 0.9979957938194275
        },
        {
            "content": "小計",
            "det_score": 0.8659234180608216,
            "direction": "horizontal",
            "points": [
                [
                    483,
                    1981
                ],
                [
                    677,
                    1981
                ],
                [
                    677,
                    2076
                ],
                [
                    483,
                    2076
                ]
            ],
            "rec_score": 0.66346675157547
        },
        {
            "content": "¥496",
            "det_score": 0.8556146869165837,
            "direction": "horizontal",
            "points": [
                [
                    1558,
                    1955
                ],
                [
                    1732,
                    1943
                ],
                [
                    1738,
                    2028
                ],
                [
                    1563,
                    2039
                ]
            ],
            "rec_score": 0.9997748136520386
        },
        {
            "content": "P2086402402169",
            "det_score": 0.8685337040086014,
            "direction": "horizontal",
            "points": [
                [
                    628,
                    1823
                ],
                [
                    1186,
                    1823
                ],
                [
                    1186,
                    1897
                ],
                [
                    628,
                    1897
                ]
            ],
            "rec_score": 0.994750440120697
        },
        {
            "content": "内8 リーフレタス/(有)成神工 ¥216",
            "det_score": 0.7733530013156038,
            "direction": "horizontal",
            "points": [
                [
                    471,
                    1719
                ],
                [
                    1716,
                    1690
                ],
                [
                    1718,
                    1800
                ],
                [
                    474,
                    1829
                ]
            ],
            "rec_score": 0.487512469291687
        },
        {
            "content": "P2055600101303",
            "det_score": 0.8620755339288183,
            "direction": "horizontal",
            "points": [
                [
                    641,
                    1658
                ],
                [
                    1185,
                    1652
                ],
                [
                    1186,
                    1729
                ],
                [
                    641,
                    1734
                ]
            ],
            "rec_score": 0.9976783394813538
        },
        {
            "content": "内8 ★きゅうり/小林宗作",
            "det_score": 0.8030553586848483,
            "direction": "horizontal",
            "points": [
                [
                    485,
                    1558
                ],
                [
                    1448,
                    1558
                ],
                [
                    1448,
                    1660
                ],
                [
                    485,
                    1660
                ]
            ],
            "rec_score": 0.8243062496185303
        },
        {
            "content": "¥130",
            "det_score": 0.8647632946312013,
            "direction": "horizontal",
            "points": [
                [
                    1532,
                    1541
                ],
                [
                    1700,
                    1534
                ],
                [
                    1703,
                    1613
                ],
                [
                    1536,
                    1620
                ]
            ],
            "rec_score": 0.9999488592147827
        },
        {
            "content": "P2023300101503",
            "det_score": 0.8535739641137016,
            "direction": "horizontal",
            "points": [
                [
                    654,
                    1502
                ],
                [
                    1183,
                    1497
                ],
                [
                    1183,
                    1566
                ],
                [
                    654,
                    1571
                ]
            ],
            "rec_score": 0.9996684789657593
        },
        {
            "content": "内8 ★きゅうり/鈴木 仁",
            "det_score": 0.7407760786246895,
            "direction": "horizontal",
            "points": [
                [
                    503,
                    1413
                ],
                [
                    1447,
                    1402
                ],
                [
                    1448,
                    1494
                ],
                [
                    504,
                    1505
                ]
            ],
            "rec_score": 0.9217264652252197
        },
        {
            "content": "¥150",
            "det_score": 0.8529803361907365,
            "direction": "horizontal",
            "points": [
                [
                    1521,
                    1386
                ],
                [
                    1685,
                    1376
                ],
                [
                    1691,
                    1455
                ],
                [
                    1527,
                    1465
                ]
            ],
            "rec_score": 0.9999033212661743
        },
        {
            "content": "000008",
            "det_score": 0.8420085177533121,
            "direction": "horizontal",
            "points": [
                [
                    918,
                    1276
                ],
                [
                    1151,
                    1269
                ],
                [
                    1153,
                    1341
                ],
                [
                    920,
                    1347
                ]
            ],
            "rec_score": 0.9971499443054199
        },
        {
            "content": "西澤",
            "det_score": 0.8301356971000148,
            "direction": "horizontal",
            "points": [
                [
                    1198,
                    1261
                ],
                [
                    1360,
                    1254
                ],
                [
                    1363,
                    1333
                ],
                [
                    1201,
                    1340
                ]
            ],
            "rec_score": 0.9971343278884888
        },
        {
            "content": "お会計券",
            "det_score": 0.8479825696054877,
            "direction": "horizontal",
            "points": [
                [
                    540,
                    1209
                ],
                [
                    835,
                    1209
                ],
                [
                    835,
                    1285
                ],
                [
                    540,
                    1285
                ]
            ],
            "rec_score": 0.9989036917686462
        },
        {
            "content": "#000003",
            "det_score": 0.832013468600706,
            "direction": "horizontal",
            "points": [
                [
                    884,
                    1207
                ],
                [
                    1148,
                    1198
                ],
                [
                    1150,
                    1269
                ],
                [
                    886,
                    1278
                ]
            ],
            "rec_score": 0.9912707805633545
        },
        {
            "content": "R9309",
            "det_score": 0.8094638501787745,
            "direction": "horizontal",
            "points": [
                [
                    1195,
                    1189
                ],
                [
                    1392,
                    1182
                ],
                [
                    1394,
                    1254
                ],
                [
                    1198,
                    1261
                ]
            ],
            "rec_score": 0.9985978603363037
        },
        {
            "content": "09:08",
            "det_score": 0.8513526355901968,
            "direction": "horizontal",
            "points": [
                [
                    1442,
                    1174
                ],
                [
                    1635,
                    1167
                ],
                [
                    1638,
                    1241
                ],
                [
                    1444,
                    1248
                ]
            ],
            "rec_score": 0.9983935952186584
        },
        {
            "content": "3901",
            "det_score": 0.8389774914979935,
            "direction": "horizontal",
            "points": [
                [
                    1465,
                    1031
                ],
                [
                    1622,
                    1024
                ],
                [
                    1625,
                    1100
                ],
                [
                    1468,
                    1108
                ]
            ],
            "rec_score": 0.9998389482498169
        },
        {
            "content": "000801精算機1",
            "det_score": 0.8261694877181531,
            "direction": "horizontal",
            "points": [
                [
                    557,
                    1015
                ],
                [
                    1029,
                    1004
                ],
                [
                    1031,
                    1068
                ],
                [
                    559,
                    1079
                ]
            ],
            "rec_score": 0.8924864530563354
        },
        {
            "content": "000801精算機1",
            "det_score": 0.7923262641892176,
            "direction": "horizontal",
            "points": [
                [
                    1130,
                    993
                ],
                [
                    1596,
                    965
                ],
                [
                    1599,
                    1024
                ],
                [
                    1134,
                    1052
                ]
            ],
            "rec_score": 0.8452281355857849
        },
        {
            "content": "2024年 8月10日(土)08:59 #000011",
            "det_score": 0.76677570112371,
            "direction": "horizontal",
            "points": [
                [
                    567,
                    950
                ],
                [
                    1601,
                    899
                ],
                [
                    1605,
                    971
                ],
                [
                    571,
                    1021
                ]
            ],
            "rec_score": 0.7705328464508057
        },
        {
            "content": "TEL 0538-41-1100",
            "det_score": 0.7891764333521688,
            "direction": "horizontal",
            "points": [
                [
                    899,
                    822
                ],
                [
                    1425,
                    800
                ],
                [
                    1428,
                    856
                ],
                [
                    902,
                    878
                ]
            ],
            "rec_score": 0.8409962058067322
        },
        {
            "content": "静岡県袋井市山名町3-3",
            "det_score": 0.8232278183336981,
            "direction": "horizontal",
            "points": [
                [
                    773,
                    750
                ],
                [
                    1545,
                    729
                ],
                [
                    1547,
                    795
                ],
                [
                    774,
                    817
                ]
            ],
            "rec_score": 0.9089908003807068
        },
        {
            "content": "とれたて食楽部",
            "det_score": 0.8368622312565417,
            "direction": "horizontal",
            "points": [
                [
                    905,
                    666
                ],
                [
                    1359,
                    657
                ],
                [
                    1360,
                    732
                ],
                [
                    906,
                    740
                ]
            ],
            "rec_score": 0.7296138405799866
        },
        {
            "content": "登録番号 T5080401017738",
            "det_score": 0.8375435412981505,
            "direction": "horizontal",
            "points": [
                [
                    640,
                    582
                ],
                [
                    1384,
                    566
                ],
                [
                    1386,
                    637
                ],
                [
                    642,
                    653
                ]
            ],
            "rec_score": 0.6307746171951294
        }
    ]
}

That is what the JSON output looks like.

Using these two files, I will try extracting
date, store name, product name, quantity, amount
and see whether the result can be turned into a CSV file.

With ChatGPT, the CSV file came out as intended.
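For comparison, here is a rough idea of what a rule-based version of this step might look like. It is only a sketch: it pairs each standalone price with the vertically nearest non-price word, using four entries copied from the JSON above. The names `pair_names_with_prices` and `max_dy` are my own, the threshold of 60 pixels is a guess for this one photo, and the data is inlined because the top-level key of the JSON file is not visible in this excerpt.

```python
import csv
import io
import re

# Four entries copied from the JSON above, reduced to the fields used here.
words = [
    {"content": "内8 ★きゅうり/小林宗作",
     "points": [[485, 1558], [1448, 1558], [1448, 1660], [485, 1660]]},
    {"content": "¥130",
     "points": [[1532, 1541], [1700, 1534], [1703, 1613], [1536, 1620]]},
    {"content": "内8 ★きゅうり/鈴木 仁",
     "points": [[503, 1413], [1447, 1402], [1448, 1494], [504, 1505]]},
    {"content": "¥150",
     "points": [[1521, 1386], [1685, 1376], [1691, 1455], [1527, 1465]]},
]

PRICE = re.compile(r"¥([\d,]+)")


def center_y(word):
    """Vertical center of a word's four-point bounding box."""
    ys = [y for _, y in word["points"]]
    return sum(ys) / len(ys)


def pair_names_with_prices(words, max_dy=60):
    """Attach each standalone price to the vertically nearest non-price word.

    max_dy is a hypothetical tuning parameter (pixels), not a Yomitoku setting.
    """
    prices = [w for w in words if PRICE.fullmatch(w["content"])]
    names = [w for w in words if not PRICE.fullmatch(w["content"])]
    rows = []
    for price in prices:
        near = [n for n in names if abs(center_y(n) - center_y(price)) <= max_dy]
        if not near:
            continue  # no plausible product name close to this price
        name = min(near, key=lambda n: abs(center_y(n) - center_y(price)))
        amount = int(PRICE.fullmatch(price["content"]).group(1).replace(",", ""))
        rows.append((name["content"], amount))
    return rows


rows = pair_names_with_prices(words)

# Write the pairs as CSV into an in-memory buffer and show the result.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["product", "amount"])
writer.writerows(rows)
print(buffer.getvalue())
```

A heuristic like this breaks as soon as the receipt is tilted more strongly or a price shares a box with the product name (as in the リーフレタス line above), which is part of why handing the text to ChatGPT is attractive here.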

Next, I try it on other receipts.

~/Downloads/Photos-001/

holds a variety of photos from Google Photos, so I test with those.

Piago receipts are extracted without problems.
Cocokara Fine receipts work too.

However, the amount column holds the line total
rather than the per-item price,
so I change the prompt.

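"From the data, extract date, store name, product name, quantity, amount, make the amount the per-item price, and output a CSV file"
is what I used,
but with this prompt the amount is still not the per-item price for some items.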

The same happens with receipts from Kyorindo and other stores.

"From the data, extract date, store name, product name, quantity, unit price, and output a CSV file"
Changing the prompt to this made no difference.
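Since rewording the prompt did not help, one fallback I can think of is to leave the CSV as it is and derive the unit price afterwards from the line total and the quantity. The sketch below assumes both columns were extracted correctly. The function name `unit_price` is my own, and the example numbers are made up, not taken from a real receipt.

```python
def unit_price(line_total, quantity):
    """Return the per-item price in yen, or None when it cannot be derived cleanly.

    A non-zero remainder usually means a discount or tax rounding is mixed in,
    so such rows are flagged (None) for manual review instead of being guessed.
    """
    if quantity <= 0:
        return None
    price, remainder = divmod(line_total, quantity)
    return price if remainder == 0 else None


# Made-up examples: 2 items totalling 300 yen, and a total that does not divide.
print(unit_price(300, 2))
print(unit_price(100, 3))
```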

Analyzing photos and receipts with Yomitoku

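In the article
"Yomitokuで写真やレシートを解析してみる" (trying out photo and receipt analysis with Yomitoku),
the analysis is run on Google Colab
and the receipt analysis succeeds.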

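Also, according to
【Python】PyTorchをAppleシリコン搭載Mac(M1、M2)にインストールする方法 – AppleシリコンGPUで動かす方法も、併せて紹介 –
(an article on installing PyTorch on Apple Silicon Macs (M1, M2) and running it on the Apple Silicon GPU):

To run PyTorch on an Nvidia GPU, the data being handled has to be moved from main memory
to the memory on the GPU.

The same goes for the Apple Silicon GPU: the data has to be moved from main memory to GPU memory.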

import torch
print(torch.backends.mps.is_available())


True
If this is printed, the MPS backend is usable.

To use the Apple Silicon GPU, use
* device = torch.device('mps')
* {data}.to(device)
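Putting the check and the two calls together, a small sketch might look like this. The helper name `pick_device_name` and the fallback order (mps, then cuda, then cpu) are my own choices, and the `try`/`except` around the import is only there so the snippet does not crash on a machine without PyTorch.

```python
try:
    import torch
except ImportError:  # keeps the sketch runnable where PyTorch is not installed
    torch = None


def pick_device_name():
    """Return 'mps' when the Apple Silicon GPU is usable, else 'cuda', else 'cpu'."""
    if torch is None:
        return "cpu"
    mps_backend = getattr(torch.backends, "mps", None)  # absent in old PyTorch
    if mps_backend is not None and mps_backend.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


device_name = pick_device_name()
print(device_name)

if torch is not None:
    device = torch.device(device_name)
    x = torch.ones(2, 3).to(device)  # move the data from main memory to the device
    print(x.device.type)
```

The resulting string can be passed as the `device` argument wherever 'mps' is hard-coded.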

mps stands for Metal Performance Shaders.

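I was curious about the memory size, so I asked ChatGPT.

On a MacBook Air (16GB model), the GPU memory size is not fixed;
it is allocated dynamically out of the 16GB of unified memory.
Up to roughly 8GB to 12GB can be allocated, but this depends on system load.

Checking the real-time usage with Activity Monitor or PyTorch as needed is recommended.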
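For the PyTorch side of that check, something like the following should work on PyTorch 2.x, where the `torch.mps` module provides `current_allocated_memory()` and `driver_allocated_memory()`. This is a sketch I have not run as part of this experiment; the helper name `mps_memory_mb` is my own, and it returns None when MPS is not usable.

```python
try:
    import torch
except ImportError:  # keeps the sketch runnable where PyTorch is not installed
    torch = None


def mps_memory_mb():
    """Return MPS memory usage in MB as a dict, or None when MPS is not usable."""
    if torch is None:
        return None
    mps_backend = getattr(torch.backends, "mps", None)
    if mps_backend is None or not mps_backend.is_available():
        return None
    mps = getattr(torch, "mps", None)
    if mps is None or not hasattr(mps, "current_allocated_memory"):
        return None  # older PyTorch without the torch.mps memory helpers
    return {
        "allocated_by_tensors": mps.current_allocated_memory() / 1024**2,
        "allocated_by_driver": mps.driver_allocated_memory() / 1024**2,
    }


print(mps_memory_mb())
```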

For now, I move on with the experiment.
https://www.muji.com/public/media/jp/doc/9952536/muji2021aw_all.pdf
I download the PDF of the MUJI 2021 autumn/winter catalog
(無印良品 2021 秋冬 収納・家具・家電・ファブリック) from here.

Tomica & Plarail catalog (トミカ&プラレールカタログwithあにあ 2022-2023)
https://www.takaratomy.co.jp/products/plarail/catalog/2022_2023TPcatalog.pdf

I move these
into the Data/image folder.

pip install yomitoku

installs yomitoku.

Next,

import cv2
import torch

from yomitoku import DocumentAnalyzer
from yomitoku.data.functions import load_image, load_pdf

if __name__ == "__main__":
    filename = "drugstore_flyer"
    pdf_filepath = f"./data/image/{filename}.pdf"

    image = load_pdf(pdf_filepath)
    analyzer = DocumentAnalyzer(
        configs={},
        visualize=True,
        device='mps'
    )

    results, ocr_vis, layout_vis = analyzer(image[0])

    # to HTML
    # results.to_html(f"./outputs/{filename}.html")

    # to image
    cv2.imwrite(f"./outputs/{filename}_ocr.jpg", ocr_vis)
    cv2.imwrite(f"./outputs/{filename}_layout.jpg", layout_vis)

I save this as
pdf_ocr.py

Next, I run it.
This ends in an error, however, so

import cv2

from yomitoku import DocumentAnalyzer
from yomitoku.data.functions import load_image, load_pdf

pdf_filepath = f"document.pdf"

image = load_pdf(pdf_filepath)
analyzer = DocumentAnalyzer(
    configs={},
    visualize=True,
    device='mps'
)

results, ocr_vis, layout_vis = analyzer(image[0])


# to image
cv2.imwrite(f"document_ocr.jpg", ocr_vis)
cv2.imwrite(f"document_layout.jpg", layout_vis)

I narrow it down to a single file like this and run it.

However,

2024-12-14 06:37:40,596 - yomitoku.base - INFO - Initialize TextDetector
model.safetensors: 100%|█████████████████████| 102M/102M [00:03<00:00, 34.0MB/s]
2024-12-14 06:37:45,343 - yomitoku.base - INFO - Initialize TextRecognizer
config.json: 100%|█████████████████████████████| 256/256 [00:00<00:00, 1.43MB/s]
model.safetensors: 100%|█████████████████████| 200M/200M [00:06<00:00, 30.7MB/s]
2024-12-14 06:37:53,752 - yomitoku.base - INFO - Initialize LayoutParser
model.safetensors: 100%|█████████████████████| 172M/172M [00:04<00:00, 34.5MB/s]
2024-12-14 06:37:59,630 - yomitoku.base - INFO - Initialize TableStructureRecognizer
model.safetensors: 100%|█████████████████████| 172M/172M [00:06<00:00, 28.3MB/s]
2024-12-14 06:38:07,932 - yomitoku.base - INFO - LayoutParser __call__ elapsed_time: 1.2879679203033447
2024-12-14 06:38:07,966 - yomitoku.base - INFO - TableStructureRecognizer __call__ elapsed_time: 0.03367877006530762
2024-12-14 06:38:09,561 - yomitoku.base - INFO - TextDetector __call__ elapsed_time: 2.916445255279541
2024-12-14 06:38:19,991 - yomitoku.base - INFO - Initialize TextDetector
2024-12-14 06:38:20,963 - yomitoku.base - INFO - Initialize TextRecognizer
2024-12-14 06:38:22,444 - yomitoku.base - INFO - Initialize LayoutParser
2024-12-14 06:38:23,230 - yomitoku.base - INFO - Initialize TableStructureRecognizer
2024-12-14 06:38:24,966 - yomitoku.base - INFO - LayoutParser __call__ elapsed_time: 1.029360055923462
2024-12-14 06:38:24,982 - yomitoku.base - INFO - TableStructureRecognizer __call__ elapsed_time: 0.01499795913696289
2024-12-14 06:38:26,832 - yomitoku.base - INFO - TextDetector __call__ elapsed_time: 2.895256757736206
2024-12-14 06:38:26,837 - yomitoku.base - ERROR - Error occurred in TextRecognizer __call__: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/snowpool/aw10s/ollama/pdf_ocr.py", line 15, in <module>
    results, ocr_vis, layout_vis = analyzer(image[0])
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/site-packages/yomitoku/document_analyzer.py", line 304, in __call__
    resutls, ocr, layout = asyncio.run(self.run(img))
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/site-packages/yomitoku/document_analyzer.py", line 293, in run
    results = await asyncio.gather(*tasks)
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/site-packages/yomitoku/ocr.py", line 83, in __call__
    rec_outputs, vis = self.recognizer(img, det_outputs.points, vis=vis)
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/site-packages/yomitoku/base.py", line 45, in wrapper
    raise e
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/site-packages/yomitoku/base.py", line 40, in wrapper
    result = func(*args, **kwargs)
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/site-packages/yomitoku/text_recognizer.py", line 103, in __call__
    for data in dataloader:
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 484, in __iter__
    return self._get_iterator()
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 415, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1138, in __init__
    w.start()
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/multiprocessing/context.py", line 288, in _Popen
    return Popen(process_obj)
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/multiprocessing/spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "/Users/snowpool/.pyenv/versions/3.10.6/lib/python3.10/multiprocessing/spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

is what I get.

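About this error message:

The cause is that the multiprocessing module is used without the proper process start-up idiom.
In particular, macOS starts processes with spawn by default,
so the error occurs unless the code is placed under if __name__ == "__main__":.
The fixed code below resolves the problem.

That is the explanation I got.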
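To see the mechanism without Yomitoku, here is a minimal standard-library sketch. With spawn, every child process re-imports the main script, so any top-level code that starts processes would run again inside the child, which is what the RuntimeError complains about. The function name `parallel_sqrt` is my own, and spawn is requested explicitly so the behavior is the same on Linux.

```python
import math
import multiprocessing as mp


def parallel_sqrt(values):
    """Compute square roots in worker processes started with spawn."""
    ctx = mp.get_context("spawn")  # the macOS default, requested explicitly here
    with ctx.Pool(processes=2) as pool:
        return pool.map(math.sqrt, values)


if __name__ == "__main__":
    # Only the parent process enters this block. Spawned children re-import
    # this file under a different module name, so they skip it.
    print(parallel_sqrt([1.0, 4.0, 9.0]))
```

Calling `parallel_sqrt` at the top level of the script instead of under the guard reproduces the same kind of bootstrapping error as in the traceback above.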

import cv2
import torch

from yomitoku import DocumentAnalyzer
from yomitoku.data.functions import load_image, load_pdf

if __name__ == "__main__":
    filename = "drugstore_flyer"
    pdf_filepath = f"./images/{filename}.pdf"

    image = load_pdf(pdf_filepath)
    analyzer = DocumentAnalyzer(
        configs={},
        visualize=True,
        device='mps'
    )

    results, ocr_vis, layout_vis = analyzer(image[0])

    # to HTML
    # results.to_html(f"./outputs/{filename}.html")

    # to image
    cv2.imwrite(f"./outputs/{filename}_ocr.jpg", ocr_vis)
    cv2.imwrite(f"./outputs/{filename}_layout.jpg", layout_vis)


This code was in
https://github.com/Shakshi3104/ymtk-supplementary/blob/main/app.py
so I rewrite my code based on it.

import cv2
import torch

from yomitoku import DocumentAnalyzer
from yomitoku.data.functions import load_image, load_pdf

if __name__ == "__main__":
    filename = "document"
    pdf_filepath = f"./images/{filename}.pdf"

    image = load_pdf(pdf_filepath)
    analyzer = DocumentAnalyzer(
        configs={},
        visualize=True,
        device='mps'
    )

    results, ocr_vis, layout_vis = analyzer(image[0])

    # to HTML
    # results.to_html(f"./outputs/{filename}.html")

    # to image
    cv2.imwrite(f"./outputs/{filename}_ocr.jpg", ocr_vis)
    cv2.imwrite(f"./outputs/{filename}_layout.jpg", layout_vis)

With the code changed like this,

mkdir outputs
mv document.pdf images/

moves the PDF into place
and also creates the output folder.

Running it now gives

2024-12-15 05:50:41,470 - yomitoku.base - INFO - Initialize TextDetector
2024-12-15 05:50:42,794 - yomitoku.base - INFO - Initialize TextRecognizer
2024-12-15 05:50:44,762 - yomitoku.base - INFO - Initialize LayoutParser
2024-12-15 05:50:45,957 - yomitoku.base - INFO - Initialize TableStructureRecognizer
2024-12-15 05:50:48,155 - yomitoku.base - INFO - LayoutParser __call__ elapsed_time: 1.2731208801269531
2024-12-15 05:50:48,189 - yomitoku.base - INFO - TableStructureRecognizer __call__ elapsed_time: 0.033370256423950195
2024-12-15 05:50:49,887 - yomitoku.base - INFO - TextDetector __call__ elapsed_time: 3.005059003829956
2024-12-15 05:51:04,448 - yomitoku.base - INFO - TextRecognizer __call__ elapsed_time: 14.56023383140564
snowpool@kubotasorunoAir ollama % mkdir outputs
snowpool@kubotasorunoAir ollama % python pdf_ocr.py
2024-12-15 05:52:39,988 - yomitoku.base - INFO - Initialize TextDetector
2024-12-15 05:52:41,732 - yomitoku.base - INFO - Initialize TextRecognizer
2024-12-15 05:52:43,589 - yomitoku.base - INFO - Initialize LayoutParser
2024-12-15 05:52:44,413 - yomitoku.base - INFO - Initialize TableStructureRecognizer
2024-12-15 05:52:46,277 - yomitoku.base - INFO - LayoutParser __call__ elapsed_time: 1.1299068927764893
2024-12-15 05:52:46,312 - yomitoku.base - INFO - TableStructureRecognizer __call__ elapsed_time: 0.03462696075439453
2024-12-15 05:52:48,106 - yomitoku.base - INFO - TextDetector __call__ elapsed_time: 2.958970069885254
2024-12-15 05:53:01,007 - yomitoku.base - INFO - TextRecognizer __call__ elapsed_time: 12.900847911834717

but only the first page
gets processed.

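The reason:

load_pdf converts the PDF into images, but only the first page is picked with image[0],
so only the first page is processed.
To process every page, the code has to loop over all pages in the PDF.

That is the explanation I got.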
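A page loop along those lines might look like the sketch below. It uses only the Yomitoku calls that already appear in the script above (`load_pdf`, `DocumentAnalyzer`, calling the analyzer on one page image) plus `cv2.imwrite`. The helper names `analyze_all_pages` and `run_on_pdf`, the `max_pages` parameter and the output file naming are my own assumptions, and the Yomitoku imports sit inside the function so the generic loop can be tried without the library installed.

```python
from pathlib import Path


def analyze_all_pages(pages, analyze, max_pages=None):
    """Yield (index, analyze(page)) for each page, stopping after max_pages."""
    for index, page in enumerate(pages):
        if max_pages is not None and index >= max_pages:
            break  # stop early, handy for long PDFs
        yield index, analyze(page)


def run_on_pdf(pdf_filepath, output_dir="./outputs", device="mps", max_pages=None):
    """Analyze a PDF page by page and write the two visualizations per page."""
    # Imported here so analyze_all_pages can be exercised without these packages.
    import cv2
    from yomitoku import DocumentAnalyzer
    from yomitoku.data.functions import load_pdf

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    stem = Path(pdf_filepath).stem

    analyzer = DocumentAnalyzer(configs={}, visualize=True, device=device)
    pages = load_pdf(pdf_filepath)  # one image per PDF page
    for index, (results, ocr_vis, layout_vis) in analyze_all_pages(
        pages, analyzer, max_pages
    ):
        cv2.imwrite(str(out / f"{stem}_p{index + 1}_ocr.jpg"), ocr_vis)
        cv2.imwrite(str(out / f"{stem}_p{index + 1}_layout.jpg"), layout_vis)
```

On macOS, `run_on_pdf("./images/document.pdf", max_pages=5)` should be called from under `if __name__ == "__main__":` for the same spawn reason as before. Starting with a small `max_pages` gives a feel for the time per page before committing to the whole PDF.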

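I change the code accordingly and run it on all pages, but

the PDF has 147 pages
processing started at 6:05
and getting through just half of it takes over an hour, so I stop it.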

Next, I try receipts on Colab.