Commit b72de93

committed
tweaks
Signed-off-by: Chris Abraham <[email protected]>
1 parent 61769fe commit b72de93

1 file changed

+7
-7
lines changed

_posts/2024-12-18-doctr-joins-pytorch-ecosystem.md

Lines changed: 7 additions & 7 deletions
@@ -50,7 +50,7 @@ Note: docTR also provides docker images for an easy deployment, such as a part o
 Now, let’s try docTR’s OCR recognition on this sample:
 
 
-![OCR sample](/assets/images/doctr-joins-pytorch-ecosystem/fg2.png){:style="width:100%;display: block;max-width:300px; margin-left:auto; margin-right:auto;"}
+![OCR sample](/assets/images/doctr-joins-pytorch-ecosystem/fg2.jpg){:style="width:100%;display: block;max-width:300px; margin-left:auto; margin-right:auto;"}
 
 
 The OCR recognition model expects an image with only one word on it and will output the predicted word with a confidence score. You can use the following snippet to test OCR capabilities from docTR:
@@ -70,7 +70,7 @@ result = model(doc)
 print(result)
 ```
 
-Here, the most important line of code is `model = recognition_predictor(pretrained=True)`. This will load a default text recognition model,** **`crnn_vgg16_bn`, but you can select other models through the `arch` parameter. You can check out the [available architectures](https://mindee.github.io/doctr/using_doctr/using_models.html).
+Here, the most important line of code is `model = recognition_predictor(pretrained=True)`. This will load a default text recognition model, `crnn_vgg16_bn`, but you can select other models through the `arch` parameter. You can check out the [available architectures](https://mindee.github.io/doctr/using_doctr/using_models.html).
 
 When run on the sample, the recognition predictor retrieves the following data: `[('MAGAZINE', 0.9872216582298279)]`
 
@@ -86,7 +86,7 @@ Note: using the DocumentFile object docTR provides an easy way to manipulate PDF
 The last example was a crop on a single word. Now, what about an image with several words on it, like this one?
 
 
-![photo of magazines](/assets/images/doctr-joins-pytorch-ecosystem/fg3.jpg){:style="width:100%;display: block;max-width:200px; margin-left:auto; margin-right:auto;"}
+![photo of magazines](/assets/images/doctr-joins-pytorch-ecosystem/fg3.jpg){:style="width:100%;display: block;max-width:300px; margin-left:auto; margin-right:auto;"}
 
 
 A text detection model is used before the text recognition to output a segmentation map representing the location of the text. Following that, the text recognition is applied on every detected patch.
@@ -113,10 +113,10 @@ plt.show()
 Running it on the full sample yields the following:
 
 
-![photo of magazines](/assets/images/doctr-joins-pytorch-ecosystem/fg4.png){:style="width:100%;display: block;max-width:200px; margin-left:auto; margin-right:auto;"}
+![photo of magazines](/assets/images/doctr-joins-pytorch-ecosystem/fg4.png){:style="width:100%;display: block;max-width:300px; margin-left:auto; margin-right:auto;"}
 
 
-Similarly to the text recognition, `detection_predictor` will load a default model (`fast_base `here). You can also load another one by providing it through the `arch` parameter.
+Similarly to the text recognition, `detection_predictor` will load a default model (`fast_base` here). You can also load another one by providing it through the `arch` parameter.
 
 
 ## The full implementation
@@ -137,7 +137,7 @@ result = model(doc)
 result.show()
 ```
 
-![photo of magazines](/assets/images/doctr-joins-pytorch-ecosystem/fg5.png){:style="width:100%;display: block;max-width:200px; margin-left:auto; margin-right:auto;"}
+![photo of magazines](/assets/images/doctr-joins-pytorch-ecosystem/fg5.png){:style="width:100%;display: block;max-width:300px; margin-left:auto; margin-right:auto;"}
 
 The last line should display a matplotlib window which shows the detected patches. Hovering the mouse over them will display their contents.
 
@@ -152,7 +152,7 @@ plt.axis('off')
 plt.show()
 ```
 
-![black text on white](/assets/images/doctr-joins-pytorch-ecosystem/fg6.png){:style="width:100%;display: block;max-width:200px; margin-left:auto; margin-right:auto;"}
+![black text on white](/assets/images/doctr-joins-pytorch-ecosystem/fg6.png){:style="width:100%;display: block;max-width:300px; margin-left:auto; margin-right:auto;"}
 
 
 The pipeline is highly customizable, where you can modify the detection or recognition model behaviors by passing arguments to the `ocr_predictor`. Please refer to the [documentation](https://mindee.github.io/doctr/using_doctr/using_models.html) to learn more about it.
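
The post touched by this diff quotes the recognition output as `[('MAGAZINE', 0.9872216582298279)]`, which reads as a list of `(word, confidence)` pairs. As a minimal sketch of how such output might be post-processed, the snippet below drops low-confidence words. It uses plain Python only and assumes the output really has that shape; `keep_confident` and its `threshold` parameter are hypothetical names for illustration, not part of docTR.

```python
from typing import List, Tuple

# Assumed shape of one recognition result: (predicted word, confidence score).
Prediction = Tuple[str, float]


def keep_confident(predictions: List[Prediction], threshold: float = 0.5) -> List[Prediction]:
    """Return only the predictions whose confidence is at least `threshold`.

    Hypothetical helper for illustration; it is not a docTR function.
    """
    return [(word, score) for word, score in predictions if score >= threshold]


if __name__ == "__main__":
    # The value quoted in the post for the single-word sample.
    sample = [("MAGAZINE", 0.9872216582298279)]
    print(keep_confident(sample, threshold=0.9))
```

The threshold is a judgment call that depends on the images being processed, so it is worth tuning on real samples rather than treating 0.5 or 0.9 as meaningful defaults.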
