Working with Azure Form Recognizer
Azure Form Recognizer is one of the latest services under the aegis of Azure Cognitive Services. While optical character recognition (OCR) allows you to extract text from images and PDFs, Form Recognizer is one level of abstraction higher: it builds on OCR and allows you to assign meaning to the text that you extract. The documentation gives a good overview of the capabilities, but in this video, we want to see just what it’s capable of and what its current limitations are.
Keep in mind that the service is still in preview.
In this video, I show you:
- How to use Form Recognizer once you set it up (setup is easy if you follow the Microsoft documentation)
- How it works with PDF and JPG files
- How to tag or label your files for your training dataset
- How well the models work with documents of various quality levels and different orientations
My takeaway is that some aspects of this product are still quite immature. For example, one feature that would really improve recognition is the ability to assign data types to the labels. That would let the model potentially realize that an “S” in a numeric field is actually a “5”, and it would provide some rudimentary format checking. There also doesn’t appear to be a way to control the OCR phase (perhaps it’s possible from the API?), which causes some elements to be missed entirely.
But given the relatively small training dataset, I’m very impressed with the tool.
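Until the service supports something like typed labels, the format checking I mentioned could be approximated on the client side after extraction. Here is a minimal sketch of that idea in Python. Everything in it is an assumption for illustration: the label names, the table of letters that OCR tends to confuse with digits, and the patterns are all hypothetical, and none of it is part of Form Recognizer itself.

```python
import re

# Hypothetical table of letters that OCR commonly confuses with digits.
OCR_DIGIT_FIXES = str.maketrans(
    {"S": "5", "s": "5", "O": "0", "o": "0", "I": "1", "l": "1", "B": "8"}
)

# Hypothetical schema: label -> (data type, pattern the cleaned value must match).
FIELD_SCHEMA = {
    "invoice_number": ("digits", re.compile(r"\d+")),
    "total": ("decimal", re.compile(r"\d+\.\d{2}")),
    "vendor": ("text", None),
}


def normalize_field(label, raw):
    """Return (value, is_valid) for one extracted field.

    Text fields (and unknown labels) pass through unchanged apart from
    trimming. Numeric fields get look-alike letters swapped for digits,
    then are checked against the pattern for their type.
    """
    kind, pattern = FIELD_SCHEMA.get(label, ("text", None))
    value = raw.strip()
    if kind == "text":
        return value, True
    # Numeric field: repair likely OCR confusions and drop stray spaces.
    value = value.translate(OCR_DIGIT_FIXES).replace(" ", "")
    if kind == "decimal":
        value = value.replace(",", "")  # drop thousands separators
    return value, pattern.fullmatch(value) is not None
```

With this in place, `normalize_field("invoice_number", "1O4S")` gives `("1045", True)`, while a value that still doesn't fit its type after cleanup comes back flagged as invalid so it can be sent for manual review. The substitutions are only applied to fields declared numeric, so a vendor name like “SOS Ltd” is left alone.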
Update: as of this week, Form Recognizer should be available in all regions per Microsoft.
Check out part 2 for how Form Recognizer works without custom models.