Let's say one has a large number of digitized documents (receipts, invoices, letters, etc.) that have been given generic names by the capture software (e.g. a camera app on a smartphone). An example file name might look something like this: “2020-10-10-MGHZTVMXTT.jpg”.
Now let's say one wishes to avoid the rather unpleasant task of renaming all of these files by hand. Is there an application that will do this for you? Ideally, the application would perform OCR on each file and name the file based on the recognized text. It would be even better if the OCR results could be stored with the file so that its contents can be searched later.
I imagine such an application would need some guidance on choosing the title for a document. For example, one might “select” the name of a store and the date of the receipt in one sample document, and the application would then use this definition to name all other similar documents.
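To make the kind of naming rule I have in mind more concrete, here is a rough sketch in Python. It assumes the OCR text has already been extracted by some other tool, and it makes two simplifying assumptions of my own: the store name is the first non-empty line of the receipt, and the date appears as either YYYY-MM-DD or MM/DD/YYYY. The function names (`find_date`, `find_store`, `suggest_name`) are made up for illustration and do not come from any existing product.

```python
import re
from datetime import datetime

# Assumed date formats; a real tool would let the user define these.
DATE_PATTERNS = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "%Y-%m-%d"),
    (re.compile(r"\b\d{2}/\d{2}/\d{4}\b"), "%m/%d/%Y"),
]


def find_date(text):
    """Return the first recognizable date as YYYY-MM-DD, or None."""
    for pattern, fmt in DATE_PATTERNS:
        match = pattern.search(text)
        if match:
            try:
                return datetime.strptime(match.group(0), fmt).date().isoformat()
            except ValueError:
                continue  # looked like a date but was not a valid one
    return None


def find_store(text):
    """Assume the store name is the first non-empty line of the OCR text."""
    for line in text.splitlines():
        line = line.strip()
        if line:
            return line
    return None


def suggest_name(ocr_text, extension=".jpg"):
    """Build a file name like 2020-10-10-store-name.jpg, or None if unsure."""
    date = find_date(ocr_text)
    store = find_store(ocr_text)
    if date is None or store is None:
        return None
    # Keep only letters and digits so the result is a safe file name.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", store).strip("-").lower()
    if not slug:
        return None
    return f"{date}-{slug}{extension}"


sample = "ACME Hardware\n123 Main St\n10/10/2020\nTotal $12.50"
print(suggest_name(sample))
```

Returning `None` when either field is missing is deliberate: those are the documents I would expect the application to hand back to a human rather than guess at.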
Bonus functionality would include:
- The ability to break apart a single file that contains multiple receipts into one file per receipt.
- Automatic rotation.
- Automatic cropping of images (e.g. if there is background around the document, cropping to just the document).
- Customizable level of human review (e.g., before saving changes to files, allow the user to review the changes if desired).
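For the human review item, what I picture is a two-step workflow: first build a list of proposed renames without touching anything, then apply only the ones that are approved. A minimal sketch in Python, assuming the suggested names come from some earlier OCR step; `plan_renames`, `apply_plan` and the `approve` callback are hypothetical names, not part of any application I know of.

```python
from pathlib import Path


def plan_renames(folder, suggestions):
    """Turn {old name: proposed name} into (source, target) pairs.

    Nothing is renamed here. If two files would get the same name,
    a numeric suffix (-1, -2, ...) is appended to the later ones.
    """
    folder = Path(folder)
    plan = []
    taken = set()
    for old_name, new_name in sorted(suggestions.items()):
        stem, suffix = Path(new_name).stem, Path(new_name).suffix
        target = folder / new_name
        counter = 1
        while target.exists() or target in taken:
            target = folder / f"{stem}-{counter}{suffix}"
            counter += 1
        taken.add(target)
        plan.append((folder / old_name, target))
    return plan


def apply_plan(plan, approve=lambda source, target: True):
    """Rename each file for which approve(source, target) returns True."""
    renamed = []
    for source, target in plan:
        if approve(source, target):
            source.rename(target)
            renamed.append(target)
    return renamed
```

With the default `approve`, everything is renamed without review; passing a function that prompts the user for each pair would give the stricter level of review.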
Right now I’m reading up on IRISmart File, which looks promising, but I really have no idea what is available on this front and would appreciate any guidance you might have!