5 Production
5.1 Definition
Production is a conceptual place where your work is being used by the intended users to make real-world decisions. Some examples include:
- A predictive model integrated with a front-line business system via an API.
- A dashboard available to support users making decisions.
- A monthly report that informs management meetings.
- An insight from a statistical model that has changed policy.
Note that not all of the above involve a technology implementation such as an API.
Jenny trained a statistical model to help predict safety-related incidents at worksites for her employer. One of the key factors that influenced an increase in safety incidents was significant rainfall in the previous 24 hours. The safety team has now adjusted their Safe Work checklists to include a check item for the amount of recent rainfall at that site. If it is more than 50 mm, a mandatory inspection is performed.
This change is hard-coded in a paper form, but it is still a data-science-driven model in production.
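The same rule could just as easily live in code as on paper. The following is a minimal sketch of Jenny's checklist rule as an R function; the function and argument names are hypothetical, and only the 50 mm threshold comes from the example above.

```r
# Threshold taken from the checklist rule in Jenny's example (millimetres).
rainfall_threshold_mm <- 50

# Hypothetical helper: returns TRUE where rainfall in the previous 24 hours
# exceeds the threshold, i.e. where a mandatory inspection would be triggered.
needs_inspection <- function(rainfall_24h_mm,
                             threshold_mm = rainfall_threshold_mm) {
  stopifnot(is.numeric(rainfall_24h_mm))
  rainfall_24h_mm > threshold_mm
}

# Three illustrative (made-up) readings for three sites.
needs_inspection(c(12, 50, 73.5))
#> [1] FALSE FALSE  TRUE
```

Whether the rule is printed on a form or wrapped in a function, the decision logic that reaches the end user is the same.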
5.2 Principles
For a data science project to be relied upon in production, it should have the following key characteristics:
- Available to end users directly
- Running on stable infrastructure
- Used to make real-world decisions
- Can be replicated
- Can be reproduced safely
- Is documented sufficiently
- Can be maintained by others
At the NYR10 Conference, Hadley Wickham summarised his views as:
- Not just once
- Not just my computer
- Not just me
5.3 Patterns for R Code
What does a data science output in production look like from an R perspective?
- An R script that is run on a server on a schedule
- A {shiny} app hosted on a server
- A {plumber} API service serving model predictions
- A hosted {quarto} dashboard
- An {rmarkdown} report
- An R package on CRAN or GitHub
Regardless of the solution, the fundamental concept is to get it off the developer's laptop and have it be reliably available to an end user.
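To make one of these patterns concrete, here is a minimal sketch of a {plumber} API that serves model predictions. Everything specific in it is an assumption made for illustration: the file name `plumber.R`, the `/predict` route, the `predict_mpg` function, and the stand-in linear model fitted on the built-in `mtcars` dataset.

```r
# plumber.R (hypothetical file name); plumber reads the #* annotations below.

# Stand-in model fitted on a built-in dataset. A real service would more
# likely load a previously saved model object, for example with readRDS().
model <- lm(mpg ~ wt, data = mtcars)

#* Predict fuel efficiency (mpg) from vehicle weight
#* @param wt Vehicle weight in units of 1000 lbs
#* @get /predict
predict_mpg <- function(wt) {
  # Query parameters can arrive as character strings, so convert explicitly.
  wt <- suppressWarnings(as.numeric(wt))
  if (length(wt) != 1 || is.na(wt)) {
    stop("`wt` must be a single number")
  }
  prediction <- predict(model, newdata = data.frame(wt = wt))
  list(wt = wt, predicted_mpg = unname(prediction))
}
```

Assuming the file is saved as `plumber.R`, it could be served locally with `plumber::pr("plumber.R") |> plumber::pr_run(port = 8000)`, after which a request to `/predict?wt=3` should return the prediction as JSON. Running this on a laptop is still not production; the same file would need to be hosted on stable infrastructure to meet the principles above.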
We will build this image up further in the next chapter when we examine more closely the elements of production deployed R code.