DocAligner

The core functionality of this project is called "Document Localization."

title

📄️ Introduction

This task is essentially a "precursor" to OCR tasks.

We provide installation via PyPI or by cloning the project from GitHub.

We provide a simple model inference interface, including the preprocessing and postprocessing logic.

When invoking the DocAligner model, you can make advanced settings by passing parameters.

Referencing past research, we first considered a point regression model.

We utilized the SmartDoc 2015 dataset for our testing.

Based on our experiments, we have developed a model that performs quite well.

In this chapter, we briefly introduce the datasets used for training and testing our models. These datasets include a variety of document images.

In the real world, you are bound to encounter situations where things don't work as expected.

The literature on this topic is sparse; we've compiled some of the more representative papers to serve as foundational material for research.