
Building a document parsing app

Motivation

Document parsing means reading a document, such as a receipt or a bank statement, in one of several formats (images, PDF, MS Office docx/xlsx) and extracting the required information from it.

This is useful, for example, for processing files uploaded by applicants, either to auto-fill an application form or to automatically approve or reject an application.

Sample flow

We will create a Document Parser flow which, conceptually, looks like this:

Document Parser flow

graph LR
  Input --> RemoteFilesLoader --> LLM --> Output

The RemoteFilesLoader uses the DocumentParser API to extract data from images and PDFs. DocumentParser is a new GDS product that provides much better output than open-source OCR tools.
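
To make the data flow concrete, here is a minimal Python sketch of the four stages as plain function calls. It is not how LaunchPad Studio implements the flow: `run_document_parser`, `fake_loader`, and `fake_llm` are hypothetical names, the loader and LLM are passed in as callables so that no real API is assumed, and the `content` key is borrowed from the `doc.content` attribute used in the prompt template.

```python
from typing import Callable, Dict, List


def run_document_parser(
    file_urls: List[str],
    instructions: str,
    load_remote_files: Callable[[List[str]], List[Dict[str, str]]],
    call_llm: Callable[[str], str],
) -> str:
    """Input -> RemoteFilesLoader -> LLM -> Output, as plain function calls."""
    # RemoteFilesLoader stage: turn each remote file into extracted text.
    docs = load_remote_files(file_urls)
    document_text = "\n".join(doc["content"] for doc in docs)
    # LLM stage: send the instructions plus the extracted text to the model.
    prompt = f"{instructions}\n\n{document_text}"
    # Output stage: return the model's reply unchanged.
    return call_llm(prompt)


# Hypothetical stand-ins so the sketch runs without any external service.
seen_prompts: List[str] = []


def fake_loader(urls: List[str]) -> List[Dict[str, str]]:
    return [{"content": f"text extracted from {url}"} for url in urls]


def fake_llm(prompt: str) -> str:
    seen_prompts.append(prompt)  # record what the model would have received
    return '{"total":"12.50"}'


result = run_document_parser(
    ["https://example.com/receipt.pdf"],
    "Extract the total amount.",
    fake_loader,
    fake_llm,
)
print(result)
```

Passing the loader and the model in as arguments keeps the sketch independent of any particular service; in the real flow these roles are played by the RemoteFilesLoader and LLM components.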

Here is how it looks in LaunchPad Studio: doc-parser-flow-o-complete.png

The prompt used in this flow is:

      {{inputs.base_prompt}}

      Document Text:
      ```
      {%- for doc in RemoteFilesLoader.output %}
      {{ doc.content }}
      {%- endfor %}
      ```

      Field Descriptions in JSON:
      ```
      {{inputs.fields_schema}}
      ```

      Return only field values in minified JSON format
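
As an illustration of what this template produces and how its output might be consumed, the following sketch assembles the same prompt in plain Python and parses a reply. Several things here are assumptions rather than part of the flow: the helper names `build_prompt` and `parse_fields`, the shape of `fields_schema` (a mapping of field name to description), the sample receipt text, and the hard-coded reply that stands in for a real LLM response. Whitespace may differ slightly from what the template engine emits.

```python
import json

# Built dynamically so this example can itself sit inside a fenced block.
FENCE = "`" * 3


def build_prompt(base_prompt, docs, fields_schema):
    """Assemble the prompt in plain Python, approximating the template above."""
    document_text = "\n".join(doc["content"] for doc in docs)
    return "\n".join([
        base_prompt,
        "",
        "Document Text:",
        FENCE,
        document_text,
        FENCE,
        "",
        "Field Descriptions in JSON:",
        FENCE,
        json.dumps(fields_schema),
        FENCE,
        "",
        "Return only field values in minified JSON format",
    ])


def parse_fields(llm_reply, fields_schema):
    """Parse the JSON reply and keep only the fields named in the schema."""
    data = json.loads(llm_reply.strip())
    # Fields the model left out come back as None instead of raising an error.
    return {name: data.get(name) for name in fields_schema}


# Hypothetical sample inputs; the schema format is an assumption.
fields_schema = {
    "merchant": "Name of the shop that issued the receipt",
    "total": "Total amount paid, as a number",
}
docs = [{"content": "ACME MART\nTOTAL 12.50"}]

prompt = build_prompt("Extract the fields described below.", docs, fields_schema)
reply = '{"merchant":"ACME MART","total":12.5}'  # stand-in for an LLM response
fields = parse_fields(reply, fields_schema)
print(fields)
```

Filtering the reply against the schema is one simple way to ignore any extra keys the model adds; a stricter application might reject such replies instead.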

Note

You can use this JSON file and import it into LaunchPad Studio to get the same flow.

Use the Sample App

After publishing the flow, you should get a link to the Sample App.

Alternatively, you can copy the Flow ID and open it here: https://apps.stack.govtext.gov.sg/