Who will train and update my model?

Expect to need both engineers and domain experts to train your model no matter what, but the type of work they'll be doing — and when they'll be doing it — will depend upon how your model learns.

Training the model

  • For Rule Systems: your engineering team will interview domain experts, looking at data with them and understanding how they arrive at decisions. They'll translate those interviews into computer code.
  • For Machine Learning Systems: your engineering team will ask domain experts to annotate data, probably using software designed for data annotation. They'll set up a learning pipeline which uses those annotations to train a model they've created.
  • For Deep Learning Systems: the process is the same as for machine learning systems, but the amount of annotated data needed may be far smaller thanks to the ability to pre-train and fine-tune models.
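The contrast between the first two bullets can be sketched in miniature. The task, data, names, and thresholds below are all hypothetical illustrations, not drawn from any specific project: a rule system encodes interview findings directly as code, while a learned system derives its decision boundary from the experts' annotations.

```python
# Hypothetical contrast between a hand-written rule system and a tiny
# learned system, for a toy "is this message spam?" task.
# All data, names, and thresholds are illustrative assumptions.

def rule_system(message: str) -> bool:
    """Rules transcribed by engineers from interviews with domain experts."""
    text = message.lower()
    return "free money" in text or text.count("!") >= 3

def train_learned_system(annotations: list[tuple[str, bool]]):
    """Learn a single threshold (exclamation-mark count) from annotated data.

    A real pipeline would fit a statistical model; here we simply
    brute-force the threshold that best reproduces the annotations.
    """
    def accuracy(threshold: int) -> float:
        hits = sum((msg.count("!") >= threshold) == label
                   for msg, label in annotations)
        return hits / len(annotations)

    best = max(range(6), key=accuracy)  # pick the best-scoring threshold
    return lambda message: message.count("!") >= best

# Annotations produced by domain experts (illustrative):
annotated = [
    ("Win FREE MONEY now!!!", True),
    ("Claim your prize!!!!", True),
    ("Meeting moved to 3pm", False),
    ("Lunch tomorrow?", False),
]
learned_system = train_learned_system(annotated)
```

Note where the expert knowledge lives in each case: in the rule system it is frozen into the function body by engineers; in the learned system it stays in the annotated data, which is what makes the update story below so different.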

Updating the model

Updating a model is where the differences between learned and non-learned systems really begin to emerge. The fundamental question is on whom the primary responsibility for improvement falls: engineers or domain experts.

  • In rule systems, engineers have the primary responsibility for model updates. Engineers will observe the model's performance, consult with domain experts to understand its errors, and then update the rule set accordingly. Updating the model is primarily an engineering exercise.
  • In learned systems, domain experts have the primary responsibility for model updates. Domain experts will observe the model's performance, annotate more data representative of the erroneous outputs, and re-run the training pipeline. Engineering involvement is primarily keeping systems running well in general, rather than resolving accuracy problems specifically.
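The learned-system update loop described above can be sketched as a single function. Everything here is a hypothetical placeholder (the `annotate` and `train` callables stand in for your annotation tooling and existing training pipeline, respectively), meant only to show that the engineering pieces are reused unchanged while domain experts supply the new inputs.

```python
# Sketch of the learned-system update loop: domain experts annotate the
# erroneous cases, the data grows, and the existing pipeline is re-run.
# `annotate` and `train` are hypothetical stand-ins, not a library API.

def update_cycle(training_set, error_cases, annotate, train):
    """One iteration of the domain-expert update loop.

    training_set: the existing annotated examples.
    error_cases:  inputs where the model's output was observed to be wrong.
    annotate:     callable run by domain experts, labeling each error case.
    train:        the existing training pipeline, re-run unchanged.
    """
    new_annotations = [annotate(case) for case in error_cases]
    training_set = training_set + new_annotations  # grow the dataset
    return train(training_set)                     # re-run the pipeline
```

The point of the sketch is the division of labor: only `annotate` requires domain expertise on each iteration; `train` is the same engineering artifact every time.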

This difference can have a large impact on project management once the system is running. If the business unit that owns the automation project includes domain experts but not engineers, it controls its own ability to improve performance only if those domain experts can perform updates themselves.

One advantage learned systems have with respect to model updates is that they can naturally emit confidence scores alongside their output. The lowest-confidence outputs can automatically be flagged for human review and incorporated into a future training set, letting the model play an active role in guiding the kind of data it needs to become more confident in production.
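This flagging step is simple to sketch. The record shape and the threshold below are illustrative assumptions, not from the text; the idea is just to route low-confidence outputs to the review queue that feeds the next training set.

```python
# Minimal sketch of confidence-based review routing.
# The (input, prediction, confidence) shape and the 0.8 threshold
# are illustrative assumptions.

def flag_for_review(outputs, threshold=0.8):
    """Split model outputs into auto-accepted results and low-confidence
    items routed to human review (and, later, the next training set).

    outputs: iterable of (input, prediction, confidence) triples.
    """
    accepted, review_queue = [], []
    for item, prediction, confidence in outputs:
        if confidence < threshold:
            review_queue.append((item, prediction))  # humans re-annotate these
        else:
            accepted.append((item, prediction))      # shipped as-is
    return accepted, review_queue
```

In production the `review_queue` is exactly the "more data representative of the erroneous outputs" that domain experts annotate in the update loop above.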