The Agile Manifesto applies to data science projects. But not always in the same way as for a regular software development project.
Let's take a fast tour through the 12 principles of agile software development. And we'll do it from a data science point of view.
In our process, the first two stages are about discovery. That is, discovering the business need and what data is available.
While we won't be delivering valuable software in those stages, we will be delivering value.
Value in the sense that we'll work with you to maximise the potential of your data. And to understand what business question(s) there are to guide the analysis stage.
We welcome and embrace changing requirements. We must if the project is to be a success.
Why? Well, because the more we learn about the data, the more focus we can give to solving your data problem.
It may be that within the project it's not about delivering software as such.
There will be data munging and data assessment. For data science, delivering on those is just as important.
Also, backtracking to check and re-check data sources is common. That forms part of the frequent delivery of results for the project.
This is key in a data science project. There will be a lot of moving parts. Gathering data and preparing it for analysis for a start.
We insist on working closely with your team to enable them to see the benefits of the project. And to help steer it.
If your team understands the project, that will motivate their involvement. And a motivated team is an efficient team.
Data science projects can throw some curve balls. And yes, it can get difficult at times.
A well-motivated team who understand the project will make a difference.
We are data nerds. And that means we embrace technology in all its forms. But we still like face-to-face meetings (even if it's via Skype). Because they clear away any danger of misunderstanding.
We show progress through delivering on each step in the project lifecycle.
There may be some switching between the business understanding and data phases. But the focus will never waver. We want the same thing as you: results.
Pacing is as important in a data science project as a regular software project.
There is a cyclical nature to it but each cycle should move things on. Even if only by a small change.
Enhanced data discovery for example, or a better way to clean the data. It's all progress and counts towards the end goal.
The key is to sustain the project so it feels like it is moving forward.
You'd expect data design to be foremost in what data scientists do. But the pursuit of technical excellence is worth keeping an eye on across the whole project.
Not everyone will agree on the best way to do things - that's fine. But whichever option is chosen, it should still be the best technical solution.
Data science is a blended mix of complex components. To keep a project agile it's worth striving for simplicity where possible.
One of the core aims in a data science project is to deliver hard-to-find patterns in data. So by its very definition, simplicity is a goal for the end result.
This is an interesting one. A data science team will likely consist of different skill sets. In agile terms a self-organising team produces the best output. That includes design and architecture.
Data science professionals will bridge the gap between core functions. Such as business analysis, data engineering and programming.
Data scientists tend to use an iterative task-based process. That means review at each step is vital to a successful outcome.
The review of data collection and cleansing does take up a big chunk of total project time and for good reason.
It seems fair to say that agile practices do work in a data science project.
There are some elements of the Agile Manifesto that apply more easily than others. But the 12 guiding principles do help when planning a project.