The most important work happens before computers are involved

Vetting and scoping a document extraction project is the key to reliable success. If you don't have a playbook for vetting your projects, then this chapter probably contains the most important advice I could give you. Nothing else matters if you try to automate a poorly defined process.

You already have a playbook for identifying document extraction problems. Now you have to learn to evaluate the processes in your company to see whether they're a good match.

This chapter takes you back through the components we used to define document extraction and turns them into exercises you can do to evaluate how likely a use case is to succeed. These are the exercises we'll cover:

  1. Defining your business problem by its data inputs
  2. Examining the documents that provide those inputs
  3. Simulating the extraction process
  4. Simulating the business process with extracted data

If you go through these exercises with little difficulty, you're in luck: you have a document extraction use case that is likely to succeed. The exercises will also have equipped you with a briefing you can hand to your engineering team (or a vendor) to help them get started.

If you struggle to complete these exercises, it's a sign that this use case might be difficult to automate. That outcome, too, will equip you with a useful set of questions and information to bring to your technical team, or to resolve before they get involved.