DocClassifier
The core functionality of this project is called Document Classification.
Upon seeing this title, you might be confused: isn't this just a classification model? It sounds pretty ordinary!
Well, yes and no.
This time, we aim to build an atypical classification model. Its range of applications may be limited, but building it is a lot of fun.
In this project, we do not train the final classifier with the traditional cross-entropy loss. Instead, we use a similarity learning approach to classification. The overall results are quite good, and if you have the time, feel free to keep reading.
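To make the idea concrete, here is a minimal sketch of how similarity-based classification can work at inference time. It is an illustration under stated assumptions, not the project's actual implementation: the names `cosine_similarity`, `classify`, and `registry`, the 0.5 threshold, and the toy 3-dimensional vectors are all hypothetical. In a real system the embeddings would come from a trained feature extractor.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def classify(query, registry, threshold=0.5):
    """Return (label, score) of the most similar registered embedding.

    `registry` maps a label to one reference embedding (hypothetical layout).
    If the best score falls below `threshold`, the label is None, meaning
    the query does not resemble any registered class closely enough.
    """
    best_label, best_score = None, -1.0
    for label, embedding in registry.items():
        score = cosine_similarity(query, embedding)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Toy embeddings standing in for the output of a trained model (assumption).
registry = {
    "passport": [1.0, 0.0, 0.0],
    "invoice": [0.0, 1.0, 0.0],
}

label, score = classify([0.9, 0.1, 0.0], registry)
unknown_label, unknown_score = classify([0.0, 0.0, 1.0], registry)
```

One consequence of this design is that a new class can be supported by adding an entry to the registry rather than by retraining a fixed-size output layer, and a query that matches nothing can be rejected through the threshold.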
This project was conceived and initiated by kunkunlin1221, who completed the early stages of program development and feasibility verification. Since he didn't have time to write the website, he entrusted me with continuing the work, refining the details, and publishing it here.
A special thanks to him for his contribution.
2024 © Z. Yuan
📄️ Introduction
In our past projects, classification has been one of the most common machine learning tasks.
📄️ Installation
We provide installation via PyPI or by cloning the project from GitHub.
📄️ Quick Start
We provide a simple model inference interface, including preprocessing and postprocessing logic.
📄️ Advanced
When invoking the DocClassifier model, you can configure advanced settings by passing parameters.
📄️ Model Design
A full-featured model is not built overnight; it takes multiple rounds of adjustment and redesign.
📄️ Evaluation
The test dataset for this project is a private dataset. We only provide the evaluation results for this dataset.
📄️ Discussion
Based on our experiments, we have developed a model with promising performance. This model achieved over 90% accuracy on our test set and has demonstrated good results in practical applications.
📄️ Training
The sections on setting up the training environment have been moved to: Model Training Guide
📄️ Submission
The real world is full of surprises, and you're bound to encounter situations where things don't quite fit.