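Trying Yomitoku on Colab to analyze photos and receipts
There is an article,
"Yomitokuで写真やレシートを解析してみる" (Trying to analyze photos and receipts with Yomitoku),
that does this on Colab, so I'll use it as a reference
Apparently the information you get differs depending on the output format
I'll experiment on an A100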
The library is published at
https://github.com/kotaro-kinoshita/yomitoku
so I'll read the documentation and so on there
A GPU is basically required
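Since a GPU is basically required, it can be worth confirming that the Colab runtime really has one before installing anything. The following is a minimal sketch, assuming an NVIDIA runtime where the nvidia-smi command is on the PATH; the helper name gpu_visible is my own and is not part of Yomitoku.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if nvidia-smi is available and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # No NVIDIA driver tools on this runtime (e.g. a CPU-only Colab session).
        return False
    try:
        # "nvidia-smi -L" prints one line per GPU, each starting with "GPU <index>:".
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible:", gpu_visible())
```

If this prints False on Colab, the runtime type probably needs to be switched to a GPU one first.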
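Install the library with
! pip install yomitoku
At the end I was told to restart the session, so I restarted as the dialog box prompted
Next, prepare a photo
Since this is Google Colab, create an image folder under /content and put the JPG in it
If you point Yomitoku at the whole folder, it analyzes every image file inside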
Create the folder with
!mkdir image
and upload the photo there
I uploaded
aw10s/ollama/images/test.jpg
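Because the whole folder gets analyzed, a quick look at what is actually in it before running can save a wasted run. This is a small sketch using only the standard library; the extension set below is an assumption made for this check and is not taken from the Yomitoku documentation, and list_images is a hypothetical helper name.

```python
from pathlib import Path

# Assumed extensions for this sanity check only; consult the Yomitoku docs
# for the formats it really accepts.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def list_images(folder: str) -> list[str]:
    """Return the sorted names of image files directly inside folder."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )


# On Colab this should show the uploaded file, e.g. test.jpg.
print(list_images("/content/image"))
```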
Run the analysis with
! yomitoku /content/image/ -f md -o results -v --figure
Result
2024-12-15 21:31:34,841 - yomitoku.base - INFO - Initialize TextDetector
model.safetensors: 100% 102M/102M [00:04<00:00, 23.8MB/s]
2024-12-15 21:31:40,639 - yomitoku.base - INFO - Initialize TextRecognizer
config.json: 100% 256/256 [00:00<00:00, 1.87MB/s]
model.safetensors: 100% 200M/200M [00:08<00:00, 23.9MB/s]
2024-12-15 21:31:50,844 - yomitoku.base - INFO - Initialize LayoutParser
model.safetensors: 100% 172M/172M [00:07<00:00, 23.5MB/s]
2024-12-15 21:31:59,554 - yomitoku.base - INFO - Initialize TableStructureRecognizer
model.safetensors: 100% 172M/172M [00:03<00:00, 43.0MB/s]
2024-12-15 21:32:04,793 - yomitoku.cli.main - INFO - Output directory: results
2024-12-15 21:32:04,793 - yomitoku.cli.main - INFO - Processing file: /content/image/test.jpg
2024-12-15 21:32:06,818 - yomitoku.base - INFO - TextDetector __call__ elapsed_time: 1.9037435054779053
2024-12-15 21:32:07,114 - yomitoku.base - INFO - LayoutParser __call__ elapsed_time: 2.19960355758667
2024-12-15 21:32:07,247 - yomitoku.base - INFO - TableStructureRecognizer __call__ elapsed_time: 0.13303065299987793
2024-12-15 21:32:08,789 - yomitoku.base - INFO - TextRecognizer __call__ elapsed_time: 1.9707973003387451
2024-12-15 21:32:08,863 - yomitoku.cli.main - INFO - Output file: results/image_test_p1_ocr.jpg
2024-12-15 21:32:08,914 - yomitoku.cli.main - INFO - Output file: results/image_test_p1_layout.jpg
2024-12-15 21:32:08,927 - yomitoku.cli.main - INFO - Output file: results/image_test_p1.md
2024-12-15 21:32:08,928 - yomitoku.cli.main - INFO - Total Processing time: 4.13 sec
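The results are in the results folder, which is created automatically
Files are written out with names like
<file name>_p1.md (in this run the input folder name, image, is prefixed as well)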
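The naming pattern can be read off the "Output file" log lines: the input /content/image/test.jpg produced results/image_test_p1_ocr.jpg, results/image_test_p1_layout.jpg and results/image_test_p1.md. A sketch that reproduces that pattern follows. It is inferred from this single run only, so treat it as an assumption rather than Yomitoku's documented behavior; expected_outputs is a hypothetical helper.

```python
from pathlib import PurePosixPath


def expected_outputs(
    input_path: str, out_dir: str = "results", fmt: str = "md", page: int = 1
) -> list[str]:
    """Guess the output paths for one page, following the pattern seen in the log."""
    src = PurePosixPath(input_path)
    # Observed pattern: <input folder name>_<file stem>_p<page>
    base = f"{src.parent.name}_{src.stem}_p{page}"
    out = PurePosixPath(out_dir)
    return [
        str(out / f"{base}_ocr.jpg"),     # OCR visualization
        str(out / f"{base}_layout.jpg"),  # layout visualization
        str(out / f"{base}.{fmt}"),       # the exported document itself
    ]


for path in expected_outputs("/content/image/test.jpg"):
    print(path)
```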
The contents of image_test_p1.md are
<img src="figures/image_test_p1_figure_0.png" width="200px"><br>
登録番号 T5080401017738<br>とれたて食楽部<br>静岡県袋井市山名町3\-3<br>TEL 0538\-41\-1100
2024年 8月10日\(土\)08:59 \#000011<br>000801精算機1<br>000801精算機1<br>3901
09:08<br>R9309<br>\#000003<br>お会計券<br>西澤<br>000008
¥150<br>内8 ★きゅうり/鈴木 仁<br>P2023300101503
内8 ★きゅうり/小林宗作
P2055600101303
¥130
内8 リーフレタス/\(有\)成神工 ¥216<br>P2086402402169
|小計<br>\(内税 8%対象額<br>買上点数|¥496|
|-|-|
||¥496\)|
||3点|
|合計|¥496|
|\(税率 8%対象額|¥496\)|
|\(内消費税等 8%|¥36\)|
|課税事業者||
|\(税率 8%対象額|¥216\)|
|\(内消費税等 8%|¥16\)|
|免税事業者||
|\(税率 8%対象額|¥280\)|
¥496<br>クレジット<br>¥36\)<br>\(内消費税等
、内Sは軽減税率対象商品です。
This is the OCR output
Next, since the output format apparently changes what information you get,
I output JSON as well
! yomitoku /content/image/ -f json -o results -v --figure
The result is
2024-12-16 21:00:01,501 - yomitoku.base - INFO - Initialize TextDetector
2024-12-16 21:00:02,677 - yomitoku.base - INFO - Initialize TextRecognizer
2024-12-16 21:00:04,059 - yomitoku.base - INFO - Initialize LayoutParser
2024-12-16 21:00:05,086 - yomitoku.base - INFO - Initialize TableStructureRecognizer
2024-12-16 21:00:06,063 - yomitoku.cli.main - INFO - Output directory: results
2024-12-16 21:00:06,063 - yomitoku.cli.main - INFO - Processing file: /content/image/test.jpg
2024-12-16 21:00:07,057 - yomitoku.base - INFO - TextDetector __call__ elapsed_time: 0.9070024490356445
2024-12-16 21:00:07,207 - yomitoku.base - INFO - LayoutParser __call__ elapsed_time: 1.0565669536590576
2024-12-16 21:00:07,340 - yomitoku.base - INFO - TableStructureRecognizer __call__ elapsed_time: 0.1329059600830078
2024-12-16 21:00:08,864 - yomitoku.base - INFO - TextRecognizer __call__ elapsed_time: 1.8073179721832275
2024-12-16 21:00:08,940 - yomitoku.cli.main - INFO - Output file: results/image_test_p1_ocr.jpg
2024-12-16 21:00:08,992 - yomitoku.cli.main - INFO - Output file: results/image_test_p1_layout.jpg
2024-12-16 21:00:08,994 - yomitoku.cli.main - INFO - Output file: results/image_test_p1.json
2024-12-16 21:00:08,995 - yomitoku.cli.main - INFO - Total Processing time: 2.93 sec
The output file is
image_test_p1.json
and its contents are
{
"figures": [
{
"box": [
586,
249,
1633,
503
],
"direction": "horizontal",
"order": 0,
"paragraphs": []
}
],
"paragraphs": [
{
"box": [
638,
569,
1548,
872
],
"contents": "登録番号 T5080401017738\nとれたて食楽部\n静岡県袋井市山名町3-3\nTEL 0538-41-1100",
"direction": "horizontal",
"order": 1,
"role": null
},
{
"box": [
569,
924,
1614,
1080
],
"contents": "2024年 8月10日(土)08:59 #000011\n000801精算機1\n000801精算機1\n3901",
"direction": "horizontal",
"order": 2,
"role": null
},
{
"box": [
545,
1173,
1636,
1347
],
"contents": "09:08\nR9309\n#000003\nお会計券\n西澤\n000008",
"direction": "horizontal",
"order": 3,
"role": null
},
{
"box": [
516,
1387,
1688,
1577
],
"contents": "¥150\n内8 ★きゅうり/鈴木 仁\nP2023300101503",
"direction": "horizontal",
"order": 4,
"role": null
},
{
"box": [
485,
1558,
1448,
1660
],
"contents": "内8 ★きゅうり/小林宗作",
"direction": "horizontal",
"order": 5,
"role": null
},
{
"box": [
641,
1652,
1186,
1734
],
"contents": "P2055600101303",
"direction": "horizontal",
"order": 6,
"role": null
},
{
"box": [
1532,
1534,
1703,
1620
],
"contents": "¥130",
"direction": "horizontal",
"order": 7,
"role": null
},
{
"box": [
484,
1695,
1712,
1904
],
"contents": "内8 リーフレタス/(有)成神工 ¥216\nP2086402402169",
"direction": "horizontal",
"order": 8,
"role": null
},
{
"box": [
341,
3432,
1884,
3719
],
"contents": "¥496\nクレジット\n¥36)\n(内消費税等",
"direction": "horizontal",
"order": 10,
"role": null
},
{
"box": [
288,
3822,
1779,
4077
],
"contents": "、内Sは軽減税率対象商品です。",
"direction": "horizontal",
"order": 11,
"role": null
}
],
"tables": [
{
"box": [
346,
1915,
1852,
3358
],
"cells": [
{
"box": [
347,
1921,
1279,
2310
],
"col": 1,
"col_span": 1,
"contents": "小計\n(内税 8%対象額\n買上点数",
"row": 1,
"row_span": 3
},
{
"box": [
1276,
1925,
1851,
2039
],
"col": 2,
"col_span": 1,
"contents": "¥496",
"row": 1,
"row_span": 1
},
{
"box": [
1276,
2043,
1851,
2133
],
"col": 2,
"col_span": 1,
"contents": "¥496)",
"row": 2,
"row_span": 1
},
{
"box": [
1276,
2152,
1851,
2289
],
"col": 2,
"col_span": 1,
"contents": "3点",
"row": 3,
"row_span": 1
},
{
"box": [
347,
2296,
1280,
2426
],
"col": 1,
"col_span": 1,
"contents": "合計",
"row": 4,
"row_span": 1
},
{
"box": [
1276,
2296,
1851,
2426
],
"col": 2,
"col_span": 1,
"contents": "¥496",
"row": 4,
"row_span": 1
},
{
"box": [
348,
2424,
1280,
2526
],
"col": 1,
"col_span": 1,
"contents": "(税率 8%対象額",
"row": 5,
"row_span": 1
},
{
"box": [
1276,
2424,
1851,
2526
],
"col": 2,
"col_span": 1,
"contents": "¥496)",
"row": 5,
"row_span": 1
},
{
"box": [
347,
2540,
1280,
2695
],
"col": 1,
"col_span": 1,
"contents": "(内消費税等 8%",
"row": 6,
"row_span": 1
},
{
"box": [
1276,
2540,
1851,
2695
],
"col": 2,
"col_span": 1,
"contents": "¥36)",
"row": 6,
"row_span": 1
},
{
"box": [
348,
2705,
1280,
2852
],
"col": 1,
"col_span": 1,
"contents": "課税事業者",
"row": 7,
"row_span": 1
},
{
"box": [
1276,
2705,
1851,
2852
],
"col": 2,
"col_span": 1,
"contents": "",
"row": 7,
"row_span": 1
},
{
"box": [
348,
2865,
1280,
2951
],
"col": 1,
"col_span": 1,
"contents": "(税率 8%対象額",
"row": 8,
"row_span": 1
},
{
"box": [
1276,
2865,
1851,
2951
],
"col": 2,
"col_span": 1,
"contents": "¥216)",
"row": 8,
"row_span": 1
},
{
"box": [
347,
2980,
1280,
3102
],
"col": 1,
"col_span": 1,
"contents": "(内消費税等 8%",
"row": 9,
"row_span": 1
},
{
"box": [
1276,
2980,
1851,
3102
],
"col": 2,
"col_span": 1,
"contents": "¥16)",
"row": 9,
"row_span": 1
},
{
"box": [
348,
3103,
1280,
3202
],
"col": 1,
"col_span": 1,
"contents": "免税事業者",
"row": 10,
"row_span": 1
},
{
"box": [
1276,
3103,
1851,
3202
],
"col": 2,
"col_span": 1,
"contents": "",
"row": 10,
"row_span": 1
},
{
"box": [
347,
3211,
1280,
3355
],
"col": 1,
"col_span": 1,
"contents": "(税率 8%対象額",
"row": 11,
"row_span": 1
},
{
"box": [
1276,
3211,
1851,
3355
],
"col": 2,
"col_span": 1,
"contents": "¥280)",
"row": 11,
"row_span": 1
}
],
"n_col": 2,
"n_row": 11,
"order": 9
}
],
"words": [
{
"content": "、内Sは軽減税率対象商品です。",
"det_score": 0.6796720981104631,
"direction": "horizontal",
"points": [
[
431,
3885
],
[
1752,
3813
],
[
1759,
3925
],
[
438,
3997
]
],
"rec_score": 0.03351603075861931
},
{
"content": "(内消費税等",
"det_score": 0.8627398538033404,
"direction": "horizontal",
"points": [
[
871,
3594
],
[
1378,
3577
],
[
1381,
3686
],
[
875,
3704
]
],
"rec_score": 0.979337215423584
},
{
"content": "¥36)",
"det_score": 0.7510144629009029,
"direction": "horizontal",
"points": [
[
1694,
3557
],
[
1881,
3557
],
[
1881,
3664
],
[
1694,
3664
]
],
"rec_score": 0.9134319424629211
},
{
"content": "クレジット",
"det_score": 0.8124990954358461,
"direction": "horizontal",
"points": [
[
350,
3484
],
[
1270,
3465
],
[
1272,
3569
],
[
352,
3589
]
],
"rec_score": 0.9985630512237549
},
{
"content": "¥496",
"det_score": 0.8485499834607593,
"direction": "horizontal",
"points": [
[
1463,
3438
],
[
1843,
3429
],
[
1846,
3534
],
[
1465,
3543
]
],
"rec_score": 0.9998164176940918
},
{
"content": "(税率 8%対象額",
"det_score": 0.853462993147711,
"direction": "horizontal",
"points": [
[
459,
3234
],
[
1102,
3225
],
[
1103,
3327
],
[
460,
3336
]
],
"rec_score": 0.9029322266578674
},
{
"content": "¥280)",
"det_score": 0.8697890050123379,
"direction": "horizontal",
"points": [
[
1632,
3182
],
[
1858,
3182
],
[
1858,
3295
],
[
1632,
3295
]
],
"rec_score": 0.9970043301582336
},
{
"content": "免税事業者",
"det_score": 0.8758680560327103,
"direction": "horizontal",
"points": [
[
357,
3103
],
[
831,
3112
],
[
829,
3219
],
[
355,
3210
]
],
"rec_score": 0.9983263611793518
},
{
"content": "(内消費税等 8%",
"det_score": 0.8655238629057851,
"direction": "horizontal",
"points": [
[
475,
2996
],
[
1100,
2996
],
[
1100,
3093
],
[
475,
3093
]
],
"rec_score": 0.908987820148468
},
{
"content": "¥16)",
"det_score": 0.8736068132308011,
"direction": "horizontal",
"points": [
[
1663,
2953
],
[
1837,
2953
],
[
1837,
3060
],
[
1663,
3060
]
],
"rec_score": 0.9977002143859863
},
{
"content": "(税率 8%対象額",
"det_score": 0.8736011394336691,
"direction": "horizontal",
"points": [
[
488,
2884
],
[
1100,
2884
],
[
1100,
2981
],
[
488,
2981
]
],
"rec_score": 0.9527927041053772
},
{
"content": "¥216)",
"det_score": 0.8798753646381205,
"direction": "horizontal",
"points": [
[
1613,
2841
],
[
1827,
2835
],
[
1830,
2940
],
[
1616,
2946
]
],
"rec_score": 0.9991167187690735
},
{
"content": "課税事業者",
"det_score": 0.8782089667459907,
"direction": "horizontal",
"points": [
[
383,
2761
],
[
839,
2773
],
[
836,
2875
],
[
381,
2863
]
],
"rec_score": 0.9996485114097595
},
{
"content": "(内消費税等 8%",
"det_score": 0.8545332324999831,
"direction": "horizontal",
"points": [
[
422,
2552
],
[
1015,
2569
],
[
1012,
2666
],
[
419,
2649
]
],
"rec_score": 0.9030265808105469
},
{
"content": "¥36)",
"det_score": 0.874239359391371,
"direction": "horizontal",
"points": [
[
1637,
2517
],
[
1801,
2517
],
[
1801,
2614
],
[
1637,
2614
]
],
"rec_score": 0.9990556836128235
},
{
"content": "(税率 8%対象額",
"det_score": 0.8743577414363437,
"direction": "horizontal",
"points": [
[
430,
2455
],
[
1018,
2472
],
[
1015,
2556
],
[
427,
2539
]
],
"rec_score": 0.9258973002433777
},
{
"content": "¥496)",
"det_score": 0.873335879837346,
"direction": "horizontal",
"points": [
[
1584,
2421
],
[
1787,
2411
],
[
1792,
2503
],
[
1589,
2513
]
],
"rec_score": 0.9990760087966919
},
{
"content": "合計",
"det_score": 0.7989976852002054,
"direction": "horizontal",
"points": [
[
422,
2351
],
[
774,
2357
],
[
772,
2459
],
[
420,
2453
]
],
"rec_score": 0.9596889615058899
},
{
"content": "¥496",
"det_score": 0.84043764577154,
"direction": "horizontal",
"points": [
[
1423,
2337
],
[
1756,
2319
],
[
1761,
2406
],
[
1428,
2424
]
],
"rec_score": 0.9845718741416931
},
{
"content": "買上点数",
"det_score": 0.8724811626703719,
"direction": "horizontal",
"points": [
[
474,
2164
],
[
821,
2176
],
[
818,
2263
],
[
471,
2251
]
],
"rec_score": 0.8529166579246521
},
{
"content": "3点",
"det_score": 0.877569906485891,
"direction": "horizontal",
"points": [
[
1611,
2127
],
[
1751,
2127
],
[
1751,
2221
],
[
1611,
2221
]
],
"rec_score": 0.999936580657959
},
{
"content": "(内税 8%対象額",
"det_score": 0.8549906700132722,
"direction": "horizontal",
"points": [
[
466,
2070
],
[
1026,
2084
],
[
1023,
2171
],
[
464,
2157
]
],
"rec_score": 0.6945281624794006
},
{
"content": "¥496)",
"det_score": 0.8776254550615946,
"direction": "horizontal",
"points": [
[
1566,
2039
],
[
1756,
2028
],
[
1761,
2121
],
[
1571,
2131
]
],
"rec_score": 0.9979957938194275
},
{
"content": "小計",
"det_score": 0.8659234180608216,
"direction": "horizontal",
"points": [
[
483,
1981
],
[
677,
1981
],
[
677,
2076
],
[
483,
2076
]
],
"rec_score": 0.66346675157547
},
{
"content": "¥496",
"det_score": 0.8556146869165837,
"direction": "horizontal",
"points": [
[
1558,
1955
],
[
1732,
1943
],
[
1738,
2028
],
[
1563,
2039
]
],
"rec_score": 0.9997748136520386
},
{
"content": "P2086402402169",
"det_score": 0.8685337040086014,
"direction": "horizontal",
"points": [
[
628,
1823
],
[
1186,
1823
],
[
1186,
1897
],
[
628,
1897
]
],
"rec_score": 0.994750440120697
},
{
"content": "内8 リーフレタス/(有)成神工 ¥216",
"det_score": 0.7733530013156038,
"direction": "horizontal",
"points": [
[
471,
1719
],
[
1716,
1690
],
[
1718,
1800
],
[
474,
1829
]
],
"rec_score": 0.487512469291687
},
{
"content": "P2055600101303",
"det_score": 0.8620755339288183,
"direction": "horizontal",
"points": [
[
641,
1658
],
[
1185,
1652
],
[
1186,
1729
],
[
641,
1734
]
],
"rec_score": 0.9976783394813538
},
{
"content": "内8 ★きゅうり/小林宗作",
"det_score": 0.8030553586848483,
"direction": "horizontal",
"points": [
[
485,
1558
],
[
1448,
1558
],
[
1448,
1660
],
[
485,
1660
]
],
"rec_score": 0.8243062496185303
},
{
"content": "¥130",
"det_score": 0.8647632946312013,
"direction": "horizontal",
"points": [
[
1532,
1541
],
[
1700,
1534
],
[
1703,
1613
],
[
1536,
1620
]
],
"rec_score": 0.9999488592147827
},
{
"content": "P2023300101503",
"det_score": 0.8535739641137016,
"direction": "horizontal",
"points": [
[
654,
1502
],
[
1183,
1497
],
[
1183,
1566
],
[
654,
1571
]
],
"rec_score": 0.9996684789657593
},
{
"content": "内8 ★きゅうり/鈴木 仁",
"det_score": 0.7407760786246895,
"direction": "horizontal",
"points": [
[
503,
1413
],
[
1447,
1402
],
[
1448,
1494
],
[
504,
1505
]
],
"rec_score": 0.9217264652252197
},
{
"content": "¥150",
"det_score": 0.8529803361907365,
"direction": "horizontal",
"points": [
[
1521,
1386
],
[
1685,
1376
],
[
1691,
1455
],
[
1527,
1465
]
],
"rec_score": 0.9999033212661743
},
{
"content": "000008",
"det_score": 0.8420085177533121,
"direction": "horizontal",
"points": [
[
918,
1276
],
[
1151,
1269
],
[
1153,
1341
],
[
920,
1347
]
],
"rec_score": 0.9971499443054199
},
{
"content": "西澤",
"det_score": 0.8301356971000148,
"direction": "horizontal",
"points": [
[
1198,
1261
],
[
1360,
1254
],
[
1363,
1333
],
[
1201,
1340
]
],
"rec_score": 0.9971343278884888
},
{
"content": "お会計券",
"det_score": 0.8479825696054877,
"direction": "horizontal",
"points": [
[
540,
1209
],
[
835,
1209
],
[
835,
1285
],
[
540,
1285
]
],
"rec_score": 0.9989036917686462
},
{
"content": "#000003",
"det_score": 0.832013468600706,
"direction": "horizontal",
"points": [
[
884,
1207
],
[
1148,
1198
],
[
1150,
1269
],
[
886,
1278
]
],
"rec_score": 0.9912707805633545
},
{
"content": "R9309",
"det_score": 0.8094638501787745,
"direction": "horizontal",
"points": [
[
1195,
1189
],
[
1392,
1182
],
[
1394,
1254
],
[
1198,
1261
]
],
"rec_score": 0.9985978603363037
},
{
"content": "09:08",
"det_score": 0.8513526355901968,
"direction": "horizontal",
"points": [
[
1442,
1174
],
[
1635,
1167
],
[
1638,
1241
],
[
1444,
1248
]
],
"rec_score": 0.9983935952186584
},
{
"content": "3901",
"det_score": 0.8389774914979935,
"direction": "horizontal",
"points": [
[
1465,
1031
],
[
1622,
1024
],
[
1625,
1100
],
[
1468,
1108
]
],
"rec_score": 0.9998389482498169
},
{
"content": "000801精算機1",
"det_score": 0.8261694877181531,
"direction": "horizontal",
"points": [
[
557,
1015
],
[
1029,
1004
],
[
1031,
1068
],
[
559,
1079
]
],
"rec_score": 0.8924864530563354
},
{
"content": "000801精算機1",
"det_score": 0.7923262641892176,
"direction": "horizontal",
"points": [
[
1130,
993
],
[
1596,
965
],
[
1599,
1024
],
[
1134,
1052
]
],
"rec_score": 0.8452281355857849
},
{
"content": "2024年 8月10日(土)08:59 #000011",
"det_score": 0.76677570112371,
"direction": "horizontal",
"points": [
[
567,
950
],
[
1601,
899
],
[
1605,
971
],
[
571,
1021
]
],
"rec_score": 0.7705328464508057
},
{
"content": "TEL 0538-41-1100",
"det_score": 0.7891764333521688,
"direction": "horizontal",
"points": [
[
899,
822
],
[
1425,
800
],
[
1428,
856
],
[
902,
878
]
],
"rec_score": 0.8409962058067322
},
{
"content": "静岡県袋井市山名町3-3",
"det_score": 0.8232278183336981,
"direction": "horizontal",
"points": [
[
773,
750
],
[
1545,
729
],
[
1547,
795
],
[
774,
817
]
],
"rec_score": 0.9089908003807068
},
{
"content": "とれたて食楽部",
"det_score": 0.8368622312565417,
"direction": "horizontal",
"points": [
[
905,
666
],
[
1359,
657
],
[
1360,
732
],
[
906,
740
]
],
"rec_score": 0.7296138405799866
},
{
"content": "登録番号 T5080401017738",
"det_score": 0.8375435412981505,
"direction": "horizontal",
"points": [
[
640,
582
],
[
1384,
566
],
[
1386,
637
],
[
642,
653
]
],
"rec_score": 0.6307746171951294
}
]
}
That is the JSON output
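The JSON carries things the Markdown does not: bounding boxes, row/col positions for each table cell, and a det_score and rec_score for each word. Below is a sketch of two ways to use that, under my own assumptions: table_to_grid and low_confidence_words are hypothetical helpers, the sample is a trimmed excerpt of the output above (boxes dropped, only the first four table rows kept), and the 0.7 threshold is arbitrary.

```python
import json  # only needed when loading the real file, see the comment below


def table_to_grid(table: dict) -> list[list[str]]:
    """Place each cell's text at its (row, col) position; rows and cols are 1-based."""
    grid = [["" for _ in range(table["n_col"])] for _ in range(table["n_row"])]
    for cell in table["cells"]:
        # Only the top-left position of a spanning cell is filled; the spanned
        # positions stay empty, like the Markdown export shown earlier.
        grid[cell["row"] - 1][cell["col"] - 1] = cell["contents"]
    return grid


def low_confidence_words(words: list[dict], threshold: float = 0.7) -> list[tuple[str, float]]:
    """Return (content, rec_score) for words recognized below the threshold."""
    return [(w["content"], w["rec_score"]) for w in words if w["rec_score"] < threshold]


# Trimmed excerpt of image_test_p1.json. For the real file use:
#   with open("results/image_test_p1.json", encoding="utf-8") as f:
#       doc = json.load(f)
doc = {
    "tables": [{
        "n_col": 2,
        "n_row": 4,  # 11 in the full output; shortened for this excerpt
        "cells": [
            {"row": 1, "col": 1, "row_span": 3, "col_span": 1,
             "contents": "小計\n(内税 8%対象額\n買上点数"},
            {"row": 1, "col": 2, "row_span": 1, "col_span": 1, "contents": "¥496"},
            {"row": 2, "col": 2, "row_span": 1, "col_span": 1, "contents": "¥496)"},
            {"row": 3, "col": 2, "row_span": 1, "col_span": 1, "contents": "3点"},
            {"row": 4, "col": 1, "row_span": 1, "col_span": 1, "contents": "合計"},
            {"row": 4, "col": 2, "row_span": 1, "col_span": 1, "contents": "¥496"},
        ],
    }],
    "words": [
        {"content": "、内Sは軽減税率対象商品です。", "rec_score": 0.03351603075861931},
        {"content": "クレジット", "rec_score": 0.9985630512237549},
        {"content": "小計", "rec_score": 0.66346675157547},
    ],
}

grid = table_to_grid(doc["tables"][0])
flagged = low_confidence_words(doc["words"])
for row in grid:
    print(row)
for content, score in flagged:
    print(f"{score:.3f} {content}")
```

The rec_score looks useful for spotting lines worth checking by eye: the footer line has a rec_score of about 0.03, far below every other word in the output.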
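Using these two files,
I'll try to extract
日付,店名,商品名,数量,金額 (date, store name, product name, quantity, amount)
and see whether they can be turned into a CSV file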
With ChatGPT, the CSV file came out as intended
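For reference, here is a sketch of what the target CSV looks like for this receipt. The rows are my own hand transcription of the OCR output above, not ChatGPT's actual output. A quantity of 1 per line is inferred from the three item lines and the "買上点数 3点" (3 items purchased) entry, and the product names are shortened by dropping the producer names.

```python
import csv
import io

# Column names as requested in the notes:
# date, store name, product name, quantity, amount
FIELDS = ["日付", "店名", "商品名", "数量", "金額"]

# Hand-transcribed from the OCR result above (illustrative only).
rows = [
    {"日付": "2024-08-10", "店名": "とれたて食楽部", "商品名": "きゅうり", "数量": 1, "金額": 150},
    {"日付": "2024-08-10", "店名": "とれたて食楽部", "商品名": "きゅうり", "数量": 1, "金額": 130},
    {"日付": "2024-08-10", "店名": "とれたて食楽部", "商品名": "リーフレタス", "数量": 1, "金額": 216},
]


def to_csv(records: list[dict]) -> str:
    """Serialize the records as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


print(to_csv(rows))

# Sanity check against the receipt: the amounts should add up to the ¥496 total.
print(sum(r["金額"] for r in rows))
```

Comparing the summed amounts with the total on the receipt is a cheap way to catch a line that was dropped or misread.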
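Next, I'll try other receipts
There are various Google Photos pictures in
~/Downloads/Photos-001/
so I tried those
Piago (ぴあご) receipts extract without problems
Cocokara Fine (ココカラファイン) works too
However, the amount comes out as the total rather than the price of a single item,
so I changed the prompt to
データから日付,店名,商品名,数量,金額 を抽出し 金額の部分は単品の金額にして CSVファイルにして
(Extract date, store name, product name, quantity, and amount from the data, make the amount the single-item price, and output a CSV file)
but with this, the amount still is not the single-item price for some receipts
The same happens with receipts from Kyorindo (杏林堂) and others
Changing the prompt to
データから日付,店名,商品名,数量,単品価格 を抽出し CSVファイルにして
(Extract date, store name, product name, quantity, and unit price from the data and output a CSV file)
made no difference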
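One possible workaround, not tried in these notes, is to stop asking the model for the unit price and derive it afterwards from the extracted total and quantity. The sketch below assumes the extracted amount really is the line total and the quantity is correct; the values in the example are hypothetical and do not come from any receipt above.

```python
from decimal import Decimal


def unit_price(line_total: int, quantity: int) -> Decimal:
    """Derive a per-item price from a line total and a quantity."""
    if quantity <= 0:
        raise ValueError("quantity must be a positive integer")
    # Decimal keeps the division exact where possible and avoids float noise in yen.
    return Decimal(line_total) / Decimal(quantity)


# Hypothetical line: 2 items with a line total of 300 yen.
print(unit_price(300, 2))
```

This only helps when the receipt prints quantities; discounts or mixed tax lines would need separate handling.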