References

The literature on this topic is sparse. We have compiled some of the more representative papers below as a starting point for further research.

Comparative Overview

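| Models             | bg01   | bg02   | bg03   | bg04   | bg05   | Overall |
| ------------------ | ------ | ------ | ------ | ------ | ------ | ------- |
| HU-PageScan [1]    | -      | -      | -      | -      | -      | 0.9923  |
| Advanced Hough [2] | 0.9886 | 0.9858 | 0.9896 | 0.9806 | -      | 0.9866  |
| LDRNet [4]         | 0.9877 | 0.9838 | 0.9862 | 0.9802 | 0.9858 | 0.9849  |
| Coarse-to-Fine [3] | 0.9876 | 0.9839 | 0.9830 | 0.9843 | 0.9614 | 0.9823  |
| SEECS-NUST-2 [3]   | 0.9832 | 0.9724 | 0.9830 | 0.9695 | 0.9478 | 0.9743  |
| LDRE [5]           | 0.9869 | 0.9775 | 0.9889 | 0.9837 | 0.8613 | 0.9716  |
| SmartEngines [5]   | 0.9885 | 0.9833 | 0.9897 | 0.9785 | 0.6884 | 0.9548  |
| NetEase [5]        | 0.9624 | 0.9552 | 0.9621 | 0.9511 | 0.2218 | 0.8820  |
| RPPDI-UPE [5]      | 0.8274 | 0.9104 | 0.9697 | 0.3649 | 0.2162 | 0.7408  |
| SEECS-NUST [5]     | 0.8875 | 0.8264 | 0.7832 | 0.7811 | 0.0113 | 0.7393  |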
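
When reading the table, it helps to check how the Overall column relates to the per-background columns. The following is a minimal sketch, not part of the original evaluation, that copies the scores above and compares each reported Overall value with a plain unweighted mean of the available backgrounds. The names `SCORES` and `unweighted_mean` are hypothetical and introduced here only for illustration.

```python
# Scores copied from the comparison table above.
# None marks a cell the table leaves blank ("-").
SCORES = {
    "HU-PageScan":    ([None, None, None, None, None], 0.9923),
    "Advanced Hough": ([0.9886, 0.9858, 0.9896, 0.9806, None], 0.9866),
    "LDRNet":         ([0.9877, 0.9838, 0.9862, 0.9802, 0.9858], 0.9849),
    "Coarse-to-Fine": ([0.9876, 0.9839, 0.9830, 0.9843, 0.9614], 0.9823),
    "SEECS-NUST-2":   ([0.9832, 0.9724, 0.9830, 0.9695, 0.9478], 0.9743),
    "LDRE":           ([0.9869, 0.9775, 0.9889, 0.9837, 0.8613], 0.9716),
    "SmartEngines":   ([0.9885, 0.9833, 0.9897, 0.9785, 0.6884], 0.9548),
    "NetEase":        ([0.9624, 0.9552, 0.9621, 0.9511, 0.2218], 0.8820),
    "RPPDI-UPE":      ([0.8274, 0.9104, 0.9697, 0.3649, 0.2162], 0.7408),
    "SEECS-NUST":     ([0.8875, 0.8264, 0.7832, 0.7811, 0.0113], 0.7393),
}


def unweighted_mean(per_bg):
    """Plain average over the backgrounds that have a score, or None if none do."""
    known = [s for s in per_bg if s is not None]
    return sum(known) / len(known) if known else None


if __name__ == "__main__":
    for name, (per_bg, overall) in SCORES.items():
        mean = unweighted_mean(per_bg)
        shown = "n/a" if mean is None else f"{mean:.4f}"
        print(f"{name:<16} reported={overall:.4f} plain_mean={shown}")
```

For SEECS-NUST the plain mean is about 0.658 against a reported 0.7393, so the Overall column is evidently not an unweighted average of the five backgrounds. It may be weighted, for example by the number of samples per background, but the source does not say.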

List of Papers

  1. HU-PageScan is a segmentation model based on pixel classification. It performs well, but the model is large and computationally expensive, and it is not robust to partial occlusion, such as fingers covering the document corners, so it does not meet practical needs.

  2. Advanced Hough is a CV-based model that performs well but, like all CV-based models, is sensitive to lighting and viewing angle.

  3. Coarse-to-Fine and SEECS-NUST-2 are deep learning-based models that use a recursive optimization strategy. They are effective but slow.

  4. LDRNet is a deep learning-based model, which we tested using the authors' released model. We found it to be fitted entirely to the SmartDoc 2015 dataset, with no ability to generalize to other scenarios. We also tried adding other data for training, but the results were still unsatisfactory, possibly because the architecture's feature fusion capability is insufficient.

  5. LDRE, SmartEngines, NetEase, RPPDI-UPE, and SEECS-NUST are all CV-based models.