QuickStart
We provide a simple model inference interface that includes preprocessing and postprocessing logic.
First, you need to import the necessary dependencies and create an instance of the MRZScanner
class.
Model Inference
Below is a simple example demonstrating how to use MRZScanner
for model inference:
from mrzscanner import MRZScanner
model = MRZScanner()
After initializing the model, prepare an image for inference:
You can use the test image provided by MRZScanner
:
Download link: midv2020_test_mrz.jpg
import docsaidkit as D
img = D.imread('path/to/run_test_card.jpg')
Or you can read it directly from a URL:
import cv2
from skimage import io
img = io.imread('https://github.com/DocsaidLab/MRZScanner/blob/main/docs/test_mrz.jpg?raw=true')
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
This image is quite long, and performing inference directly might cause excessive text deformation. Therefore, when calling the model, enable the do_center_crop
parameter:
Next, you can perform inference using the model
:
from mrzscanner import MRZScanner
model = MRZScanner()
result, msg = model(img, do_center_crop=True)
print(result)
# >>> ('PCAZEQAOARIN<<FIDAN<<<<<<<<<<<<<<<<<<<<<<<<<',
# 'C946302620AZE6707297F23031072W12IMJ<<<<<<<40')
print(msg)
# >>> <ErrorCodes.NO_ERROR: 'No error.'>
MRZScanner
is encapsulated with __call__
, so you can directly call the instance for inference.
We have implemented an automatic model download feature. When you use MRZScanner
for the first time, the model will be downloaded automatically.
Using with DocAligner
Looking closely at the output results above, you might notice a few typos even after using do_center_crop
.
Since we performed full-image scanning earlier, the model might misinterpret some text in the image.
To improve accuracy, we can use DocAligner
to help align the MRZ region:
import cv2
from docaligner import DocAligner # Import DocAligner
from mrzscanner import MRZScanner
from skimage import io
model = MRZScanner()
doc_aligner = DocAligner()
img = io.imread(
'https://github.com/DocsaidLab/MRZScanner/blob/main/docs/test_mrz.jpg?raw=true')
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
flat_img = doc_aligner(img).doc_flat_img # Align the MRZ region
print(model(flat_img))
# >>> ('PCAZEQAQARIN<<FIDAN<<<<<<<<<<<<<<<<<<<<<<<<<',
# 'C946302620AZE6707297F23031072W12IMJ<<<<<<<40')
After using DocAligner
, there's no need to use the do_center_crop
parameter.
Now, you can see that the output results are more accurate—the MRZ region of the image has been successfully recognized.
Error Messages
To help users understand the reasons behind errors, we designed the ErrorCodes
class.
When the model inference fails, an error message is generated, covering the following scenarios:
class ErrorCodes(Enum):
NO_ERROR = 'No error.'
INVALID_INPUT_FORMAT = 'Invalid input format.'
POSTPROCESS_FAILED_LINE_COUNT = 'Postprocess failed, number of lines not 2 or 3.'
POSTPROCESS_FAILED_TD1_LENGTH = 'Postprocess failed, length of lines not 30 when `doc_type` is TD1.'
POSTPROCESS_FAILED_TD2_TD3_LENGTH = 'Postprocess failed, length of lines not 36 or 44 when `doc_type` is TD2 or TD3.'
This filters out basic errors, such as incorrect input formats, incorrect line counts, etc.
Check Digits
Check Digits are a key part of MRZ used to ensure data accuracy. They validate numerical correctness to prevent input errors.
- Detailed steps are documented in Reference: Check Digits.
Here’s what we want to emphasize:
- We do not provide a function to calculate Check Digits!
This is because the calculation method for MRZ Check Digits is not standardized. Besides the official calculation methods, different regions can define their own Check Digit algorithms for MRZ, so providing a predefined method might limit user flexibility.
Fun fact:
The Check Digits on the back of Taiwan's Resident Certificate (ARC) differ from global standards. Without collaboration with the government, it is impossible to determine their calculation method.
Our goal is to train a model that focuses on MRZ recognition. Each output format is automatically determined by the model. Many other open-source projects provide Check Digit calculation features. For example, the Arg0s1080/mrz project includes a Check Digit calculation function, and we recommend users utilize that project directly.