The most important work happens before computers are involved
Vetting and scoping a document extraction project is the key to reliable success. If you don't have a playbook for vetting your projects, then this chapter probably contains the most important advice I could give you. Nothing else matters if you try to automate a poorly defined process.
You already have a playbook for identifying document extraction problems. Now you have to learn to evaluate the different processes in your company to see whether they're a good match.
This chapter takes you back through the components we used to define document extraction and turns them into exercises you can do to evaluate how likely a use case is to succeed. These are the exercises we'll cover:
- Defining your business problem by its data inputs
- Examining the documents that provide those inputs
- Simulating the extraction process
- Simulating the business process with extracted data
If you go through these exercises with little difficulty, you're in luck: you have a document extraction use case that is likely to succeed. These exercises will also have equipped you with a briefing you can hand to your engineering team (or a vendor) to help them get started.
And if you struggle to complete these exercises, it's a sign that this use case might be difficult to automate. That outcome, too, will equip you with a useful set of questions and information, either to bring to your technical team or to resolve before they get involved.