Please refer to ["Environment Preparation"](./environment_en.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone_en.md) to clone the project code.
## 3. Model Training / Evaluation / Prediction
SDMGR is a key information extraction algorithm that classifies each detected textline into predefined categories, such as order ID, invoice number, amount, etc.
The training and test data are collected in the wildreceipt dataset, use following command to downloaded the dataset.
```bash
wget https://paddleocr.bj.bcebos.com/ppstructure/dataset/wildreceipt.tar && tar xf wildreceipt.tar
```
Create dataset soft link to `PaddleOCR/train_data` directory.
```bash
cd PaddleOCR/ && mkdir train_data && cd train_data
ln -s ../../wildreceipt ./
```
### 3.1 Model training
The config file is `configs/kie/sdmgr/kie_unet_sdmgr.yml`, the default dataset path is `train_data/wildreceipt`.
Use the following command to load the model and predict. During the prediction, the text file storing the image path and OCR information needs to be loaded in advance. Use `Global.infer_img` to assign.