QuickStart
We provide a simple model inference interface that includes pre-processing and post-processing logic.
First, you need to import the necessary dependencies and create a DocAligner
class.
Model Inference
Here is a simple example demonstrating how to use DocAligner
for model inference:
from docaligner import DocAligner
model = DocAligner()
After initializing the model, you'll need to prepare an image for inference:
You can use a test image provided by DocAligner
:
Download link: run_test_card.jpg
import docsaidkit as D
img = D.imread('path/to/run_test_card.jpg')
Or you can directly read it from a URL:
import cv2
from skimage import io
img = io.imread('https://github.com/DocsaidLab/DocAligner/blob/main/docs/run_test_card.jpg?raw=true')
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
Next, you can use the model
for inference:
result = model(img)
The inference result you receive is packaged as a Document type, containing the document's polygon, OCR text information, etc. In this module, we won't use the OCR features, so we will only utilize the image
and doc_polygon
attributes. After obtaining the inference result, you can perform various post-processing operations.
DocAligner
has encapsulated the inference using the __call__
method, so you can directly call the instance for inference.
Model Download: We have designed an automatic model download feature. When you use DocAligner
for the first time, the model will be downloaded automatically.
Output Results
1. Drawing the Polygon
Draw and save an image with the document polygon.
# draw
result.draw_doc(
folder='path/to/save/folder',
name='output_image.jpg'
)
Or without specifying a path, directly output:
# The default output path is the current directory
# The default file name uses the current time, as "output_{D.now()}.jpg".
result.draw_doc()
2. Obtaining the NumPy Image
If you have other needs, you can use the gen_doc_info_image
method and then process it as needed.
img = result.gen_doc_info_image()
3. Extracting the Flattened Image
If you know the original size of the document, you can use the gen_doc_flat_img
method to transform the document image from its polygonal boundary into a rectangular image.
H, W = 1080, 1920
flat_img = result.gen_doc_flat_img(image_size=(H, W))
If the image class is unknown, you can omit the image_size
parameter. In this case, the "smallest rectangle" will be automatically calculated based on the document polygon's boundary, setting the rectangle's dimensions as H
and W
.
flat_img = result.gen_doc_flat_img()
When your document is significantly skewed in the image, calculating a more flattened smallest rectangle may occur, resulting in some distortion upon flattening. Therefore, it's recommended to manually set the image_size
parameter in such cases.