How to extract the structure of invoice data using the TensorFlow Object Detection API (Faster R-CNN)

vignesh amudha
4 min read · Apr 18, 2019

Hi everyone, recently I have been working on extracting data from invoices and saving it as structured data, which reduces the manual data entry process. This has become one of the big research topics in the community. For this blog, I prepared some sample data that we can work with.

There are many blogs about how to create a custom object or text detection dataset, and also about how to detect objects or text with Faster R-CNN, so please read those first. In this blog, I am going to focus on the errors I faced and how to recover from them.

The first thing to remember is image size. Before creating the custom bounding box dataset with labelImg, make sure all the images are the same size and that every image is a JPG or PNG. My dataset contained a GIF that I forgot to convert to JPG, and it caused an error while training the model: a GIF has four dimensions (time, width, height, channels), whereas a JPG or PNG has only three (width, height, channels). If you forget to convert GIFs to PNG or JPG, a tuple shape mismatch error will be thrown during training.
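As a quick preprocessing pass, a sketch like the following can catch this before labeling. It assumes Pillow is installed and that the images sit in a local images/ folder (a hypothetical name, adjust to your layout):

```python
# A minimal sketch: convert GIFs to JPG and force 3-channel RGB so the
# detector never sees a 4-dimensional (time, width, height, channel) array.
import os
from PIL import Image

IMAGE_DIR = "images"  # hypothetical folder name

for name in os.listdir(IMAGE_DIR):
    path = os.path.join(IMAGE_DIR, name)
    if name.lower().endswith(".gif"):
        # Keep only the first frame, save it as an RGB JPG, then drop the GIF.
        Image.open(path).convert("RGB").save(path[:-4] + ".jpg", "JPEG")
        os.remove(path)
    elif name.lower().endswith((".jpg", ".jpeg", ".png")):
        Image.open(path).convert("RGB").save(path)  # force 3 channels
```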

After creating the dataset properly, we have to install the dependency modules.

Mainly, we have to install the object detection module from the tensorflow/models/research/object_detection folder; every step is explained in the above link. Even after a proper install, I got a "No module named nets" error.

There was no explanation for this error anywhere; I figured it out by myself.

To fix this error, the procedure is the same as for object_detection: we have to install slim as well and put it on the Python path, so please follow my Colab notebook.
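For reference, these are roughly the Colab cells I mean. The clone path and the /content locations are standard Colab assumptions, not something specific to my setup; the "No module named nets" error goes away once both research/ and research/slim/ are on the Python path:

```python
# Sketch of Colab cells for the TF1-era tensorflow/models layout.
!git clone https://github.com/tensorflow/models.git
%cd /content/models/research
!protoc object_detection/protos/*.proto --python_out=.

import os, sys
# slim lives under research/slim; both folders must be importable.
sys.path.append('/content/models/research')
sys.path.append('/content/models/research/slim')
os.environ['PYTHONPATH'] = (os.environ.get('PYTHONPATH', '')
                            + ':/content/models/research'
                            + ':/content/models/research/slim')
```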

If you are working on a local system, you need a GPU to train the TensorFlow pretrained model; alternatively, you can use Google Colab's free GPU instance. I used Colab to train the model.
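Before training, it is worth confirming that the runtime actually sees a GPU (in Colab: Runtime → Change runtime type → GPU). A one-liner with the standard TensorFlow API does it:

```python
# Prints something like '/device:GPU:0' when a GPU is attached,
# or an empty string when running on CPU only.
import tensorflow as tf
print(tf.test.gpu_device_name())
```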

Then we have to select a pretrained model from the TensorFlow model zoo. At first, I selected the Faster R-CNN Inception v2 (2019) model, but it had some problem and threw an error from inside the model file. Something seems to be wrong with the newer versions of the models, so I didn't choose a new one; instead, I used an older model, the 2017 Faster R-CNN ResNet model, which is no longer on the official GitHub page, so I downloaded it from an unofficial website. Still, please do try a different, newer model.
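If you prefer to script the checkpoint download in Colab rather than clicking through the zoo page, a sketch like this works. The exact tarball name below is an assumption on my part; substitute whichever checkpoint you pick from the model zoo:

```python
# Download and unpack a model-zoo checkpoint for fine-tuning.
import tarfile
import urllib.request

MODEL = 'faster_rcnn_resnet101_coco_11_06_2017'  # assumed 2017 checkpoint name
URL = f'http://download.tensorflow.org/models/object_detection/{MODEL}.tar.gz'

urllib.request.urlretrieve(URL, MODEL + '.tar.gz')
with tarfile.open(MODEL + '.tar.gz') as tar:
    tar.extractall()  # yields model.ckpt.* files used as fine_tune_checkpoint
print('extracted', MODEL)
```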

That's it. I have shared links to all the files and the Colab notebook below, so please make use of them.

The link below contains the invoice_tag folder; please download it and keep it in your Google Drive.

Update: the file below is not working; somehow it got deleted, so please use this GitHub link instead. Also, I no longer recommend Faster R-CNN, because much more robust architectures have appeared since, such as Darknet YOLO and GCN invoice segmentation, so please go with those.

Recommended Architectures:

https://github.com/AlexeyAB/darknet

https://github.com/yhenon/pytorch-retinanet

https://towardsdatascience.com/using-graph-convolutional-neural-networks-on-structured-documents-for-information-extraction-c1088dcd2b8f

Working: https://github.com/vigneshgig/Faster_RCNN_for_Open_Images_Dataset_Keras/blob/master/invoice_segmentation_blog.ipynb

Not Working:

Download the Colab notebook, then run the file directly.

Colab notebook file

After segmenting the invoice data, extract the text from each region using Tesseract OCR, a free open-source OCR tool, and store the text in a database.
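As a sketch of that step, assuming pytesseract and Pillow are installed (the file name and box coordinates below are made-up examples; in practice they come from the detector's output for one invoice image):

```python
# Run OCR on each detected region and collect (label, text) pairs.
from PIL import Image
import pytesseract

image = Image.open('invoice.jpg')  # hypothetical file name

# (label, (left, upper, right, lower)) pairs from the detector - example values
boxes = [
    ('Company_detail', (40, 30, 580, 140)),
    ('table', (40, 400, 580, 760)),
]

for label, box in boxes:
    text = pytesseract.image_to_string(image.crop(box))
    print(label, '->', text.strip())
    # here you would insert (label, text) into your database
```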

Here are a few samples I used for invoice segmentation.

I assigned seven classification labels (a sketch of the matching label map follows the list):

Top_other — Unwanted data

Company_detail — company address, phone no, email id, etc.

Customer_detail — Customer address, phone no, email id, etc.

Invoice_detail — Invoice no, date, GST no, payment date, bill to, ship to, etc.

table — table data

table_total — sub_total, total, tax, etc

Bottom_other — Unwanted data

For study purposes, I used this kind of labeling; you can change it.
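For the TensorFlow Object Detection API, these seven classes end up in a label_map.pbtxt file. A small Python sketch that writes it (class IDs start at 1) could look like this:

```python
# Generate the label_map.pbtxt expected by the Object Detection API,
# one item per class, using the seven labels from the list above.
LABELS = ['Top_other', 'Company_detail', 'Customer_detail',
          'Invoice_detail', 'table', 'table_total', 'Bottom_other']

with open('label_map.pbtxt', 'w') as f:
    for idx, name in enumerate(LABELS, start=1):
        f.write("item {\n  id: %d\n  name: '%s'\n}\n" % (idx, name))
```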

Note: these are just invoice samples I downloaded from Google, so they are for study purposes only; this is not a real dataset.

Thanks for reading my blog.
