Repeatable Analytical Pipelines

Repeatable analytical pipelines are the bedrock of data analytics best practice. The idea is that if a data analytics project takes two weeks to put together, then iterating on the analysis, adapting it to a similar project, or simply refreshing it with new data should be very simple, and should certainly not take another two weeks. (Analytics here can refer to building a predictive model, developing a feature in the rating algorithm, analysis for a rate change, etc.)

Working this way makes analysis clearer and more structured, following a distinct set of steps. As well as being far more efficient, this allows for audit trails, peer review, and quality assurance, which are especially important in insurance pricing, where analysis needs to be compliant and so carries risk.
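As an illustration of "a distinct set of steps" with an audit trail, here is a minimal Python sketch. Everything in it is hypothetical: the field names (`premium`, `claims`), the two step functions, and the `run_pipeline` helper are assumptions made for the example, not a prescribed implementation.

```python
# Hypothetical sketch: a pipeline as an ordered list of small, named steps.
# Field names and step logic are illustrative assumptions only.

def drop_invalid_premium(rows):
    """Remove rows whose premium is missing or not positive."""
    return [r for r in rows if r.get("premium") is not None and r["premium"] > 0]


def add_loss_ratio(rows):
    """Add a loss_ratio field (claims divided by premium) to every row."""
    return [{**r, "loss_ratio": r["claims"] / r["premium"]} for r in rows]


def run_pipeline(rows, steps):
    """Apply each step in order, recording row counts as a simple audit trail."""
    audit = []
    for step in steps:
        rows_in = len(rows)
        rows = step(rows)
        audit.append((step.__name__, rows_in, len(rows)))
    return rows, audit


STEPS = [drop_invalid_premium, add_loss_ratio]

sample = [
    {"policy": "A", "premium": 100.0, "claims": 50.0},
    {"policy": "B", "premium": None, "claims": 10.0},
]

result, audit = run_pipeline(sample, STEPS)
for name, rows_in, rows_out in audit:
    print(f"{name}: {rows_in} rows in, {rows_out} rows out")
```

Refreshing the analysis with new data then amounts to calling `run_pipeline` again with a different input, and the audit list gives a reviewer a record of what each step did.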

Efficiency

A pricing team that doesn’t build its analytical processes with repeatability in mind will end up spending large amounts of time on tasks that could take little or no time to complete.

This is very common in insurance pricing teams. Many analysts use closed-source software and analytical tools, along with Excel, to create analytical processes that require many manual steps to repeat, often cannot be rerun quickly, and are prone to error. Errors take time to debug and often mean starting the process again. As a result, projects take far longer to complete, resource is stretched thin, and the backlog of projects grows.

By building with repeatability in mind, a team develops a core set of processes that serves all analytical requirements, and resource is spent on continuously improving those processes rather than simply running them. Over time, the efficiency and capabilities of the team increase.

Quality Assurance

Building repeatable analytical pipelines makes quality assurance a much simpler process, resulting in higher-quality analysis and higher confidence in its conclusions.

Peer reviewing work is far easier: the process is usually clearer to follow and validate, and the analysis can be easily reproduced to check the conclusions.

Testing is easier and can be incorporated into the pipeline. If an analytical pipeline is adapted to serve an additional process, the tests remain in place, whereas if a separate analytical process were built rather than re-using an existing pipeline, the tests would need to be rebuilt.
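To make the point about tests travelling with the pipeline concrete, here is a small Python sketch. The check, the two analyses, and the `run` helper are all hypothetical names invented for this example; the point is only that a check written once inside the shared pipeline applies to every analysis that re-uses it.

```python
# Hypothetical sketch: a data check built into a shared pipeline entry point.
# Function and field names are illustrative assumptions only.

def check_no_negative_premium(rows):
    """Fail loudly if any row carries a negative premium."""
    bad = [r for r in rows if r["premium"] < 0]
    if bad:
        raise ValueError(f"{len(bad)} row(s) have a negative premium")
    return rows


def total_premium(rows):
    """First analysis: total premium across all rows."""
    return sum(r["premium"] for r in rows)


def average_premium(rows):
    """Second analysis, added later: mean premium per row."""
    return sum(r["premium"] for r in rows) / len(rows)


def run(rows, analysis):
    """Shared entry point: the check always runs before the chosen analysis."""
    rows = check_no_negative_premium(rows)
    return analysis(rows)


good = [{"premium": 100.0}, {"premium": 200.0}]

print(run(good, total_premium))
print(run(good, average_premium))
```

Here `average_premium` was added without writing any new checks, because it goes through the same `run` function. A separately built process would need its own copy of `check_no_negative_premium`.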

With quality assurance in place, the pricing team will produce reliable analysis, boosting the confidence of other stakeholders.

Why repeatable analytics is not common in pricing teams

A few main reasons usually explain this:

Tight deadlines – analysts end up prioritising short-term results over building for the long term. This often leads to a vicious cycle in which time-consuming processes reduce the time that can be spent on new tasks, resulting in even tighter deadlines.

Skill sets – a core part of repeatable analytical pipelines is automation, which is often easiest to do in code. However, strong programming skills are rare in pricing teams.

Overestimation of time required – analysts often opt to take the ‘quicker’ route of building out a solution in Excel, or of manually following the steps to run a process over and over. In reality, analysis will likely take only slightly longer if built to be repeatable, and will be quicker overall if there is any form of iteration. Some processes can even be automated more quickly than they can be run through by hand.

Conclusion

The adoption of repeatable analytical pipelines is fundamental for data analytics best practice, and can add a great deal of value to pricing teams. The premise is to streamline and systemise analytical processes so that they can be repeated easily, which in turn ensures efficiency, clarity, and quality assurance throughout each stage of analysis.
