Building a document parsing app
Motivation
Document parsing means reading a document, such as a receipt or a bank statement, in one of several formats (images, PDF, MS docx/xlsx) and extracting the required information from it.
This is useful, for example, for processing files uploaded by applicants, either to auto-fill an application form or to automatically approve or reject an application.
Sample flow
We will create a Document Parser flow which, conceptually, looks like this:
Document Parser flow
graph LR
Input --> RemoteFilesLoader --> LLM --> Output
The RemoteFilesLoader uses the DocumentParser API to extract data from images and PDFs. DocumentParser is a new GDS product that provides much better output than open source OCR tools.
Here is how it looks in LaunchPad Studio:
The prompt for this flow is:
{{inputs.base_prompt}}
Document Text:
```
{%- for doc in RemoteFilesLoader.output %}
{{ doc.content }}
{%- endfor %}
```
Field Descriptions in JSON:
```
{{inputs.fields_schema}}
```
Return only field values in minified JSON format
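To make the prompt's inputs and output concrete, here is a minimal Python sketch that assembles roughly the same text as the template above and then parses a reply. It is an illustration only, not how LaunchPad Studio renders the template. The `Doc` class, the `build_prompt` and `parse_fields` helpers, the sample receipt text, and the sample reply are all hypothetical. It also assumes that `fields_schema` is a JSON object mapping each field name to a description, which the template does not confirm.

```python
import json
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for one item of RemoteFilesLoader.output."""
    content: str  # the only attribute the template above relies on


FENCE = "`" * 3  # the template wraps each section in triple backticks


def build_prompt(base_prompt: str, docs: list[Doc], fields_schema: str) -> str:
    """Approximate the template in plain Python (whitespace may differ)."""
    doc_text = "\n".join(doc.content for doc in docs)
    return "\n".join([
        base_prompt,
        "Document Text:",
        FENCE,
        doc_text,
        FENCE,
        "Field Descriptions in JSON:",
        FENCE,
        fields_schema,
        FENCE,
        "Return only field values in minified JSON format",
    ])


def parse_fields(llm_output: str, fields_schema: str) -> dict:
    """Parse the JSON reply and keep only the keys named in the schema."""
    values = json.loads(llm_output)
    expected = json.loads(fields_schema)
    # Fields the model left out come back as None instead of raising.
    return {name: values.get(name) for name in expected}


# Assumed schema shape: field name -> description.
schema = json.dumps({
    "merchant": "Name of the merchant",
    "total": "Total amount paid",
})
prompt = build_prompt(
    "Extract the fields below from the receipt.",
    [Doc("ACME Store\nTotal: 12.50")],
    schema,
)

# Made-up reply, standing in for what the LLM step might return.
reply = '{"merchant":"ACME Store","total":"12.50"}'
fields = parse_fields(reply, schema)
print(fields)
```

Filtering the reply against the schema keys means any extra keys the model adds are dropped, and any missing field shows up as `None`, which is easier to check before auto-filling a form.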
Note
You can import this JSON file into LaunchPad Studio to get the same flow.
Use the Sample App
After publishing the flow, you should get a link to the Sample App.
Alternatively, you can copy the Flow ID and open it here: https://apps.stack.govtext.gov.sg/