2  Data Science Frameworks

There are many approaches to tackling a data science project. Following a framework is important for ensuring that a project is delivered and managed successfully. A dedicated data science workflow not only helps ensure the work is done right, it also provides a useful means of engaging stakeholders, giving project management updates and keeping everyone focused on the right areas.

2.2 Development vs Production

A key distinction introduced above is the separation of development practices from production practices. A typical project will primarily involve the development practices outlined in the ‘Inner Loop’. Production work typically refers to work that supports deploying the output of development in a context where it is used to facilitate decision making. We will explore this concept further and analyse the key elements of ‘production’ code.

Note

The terms development and production are used here to describe the intent of the workflow, not a specific computing environment. In many organisations there are dedicated computing and infrastructure environments (often called ‘development’, ‘dev’, ‘test’, ‘sandbox’, ‘prod’, ‘uat’ etc.) that support the physical separation of these workflow paradigms. This will be explored later.

