How do you assure an algorithm anyhow?

For the last few months, I've been doing research for a new thought leadership report - tentatively titled "Assurance for New Technologies".  The central question for this report is simple: How do you audit a blockchain?  What does an assurance engagement on a machine learning-driven algorithm look like?  How can assurance techniques handle AI?

There are already many auditors using audit data analytics, and even machine learning, to improve the quality of their audit work.  These technologies are increasingly familiar parts of the auditor's arsenal.  But the inverse question is less well understood - how do new and emerging technologies work out as the subjects of assurance engagements, rather than as tools used in them?

The need for assurance over these technologies is not theoretical.  More and more data is available all the time, and organisations are naturally looking at how they can use it - whether analysing it themselves with analytics and data science, or using it to fuel machine learning that creates new algorithms and AIs that act based on that data.  But as organisations build ever greater value streams out of their data, it becomes increasingly important for them to know that their data is right, that the insights they've drawn from it are unbiased, and that any byproducts of the data are valid.  Writer Kate Crawford coined the term "data fundamentalism" to describe a belief that enough data can make correlation exact enough to take the place of causation, and that the results of data analytics have inherent truth.  But there are countless examples where this falls down - where data is shown to be incomplete, or where human biases inherent in the data we produce are dutifully recreated by the algorithms that train on that data.

So getting an external validation of your data, or the algorithms built from it, can be not only a way of getting a step ahead of regulation - GDPR, for example, entitles data subjects to be told about certain solely automated decisions made about them, and to receive meaningful information about the logic involved - but also a way for firms to show quality.  Having an independently audited algorithm can be the differentiator between low- and high-quality competitors, and can attract customers and investment with a track record of proven trust.  So, back to the original question: how does one actually complete an assurance engagement on something like an algorithm?

The Big 4 and other larger firms have already been innovating on this, alongside IT audit start-ups - and that's where my research is currently focused.  The piece I'm producing (scheduled for Q3/4 this year) will review the current thinking in the field, and propose a few principles for how to think about applying assurance techniques not just to the current wave of technologies, but to any future technological innovation.

Algorithmic assurance as it's currently practised usually combines detailed IT auditing - reading through code and checking it for errors - with high-level tests of controls and governance.  This might take the form of questions about where data comes from, or probing how the machine learning process was run, or reviewing how updates will be made in the future.  Together, these elements help the algorithm auditor to build a picture of the suitability and reliability of the resulting tool - even if the algorithm itself may not be directly understandable.  Ethics is also a central pillar of this process - with the potential impact and harms of the algorithm being as important as how it was made.  But there's plenty more still to be learned - and invented - about this bold new world of assurance.
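
To make the idea of assessing an algorithm that isn't directly understandable a little more concrete, here is a minimal sketch of one kind of outcome test an auditor might run without opening the black box at all: comparing how often each group receives a favourable decision.  This is an illustration under my own assumptions rather than a description of any firm's methodology - the function names, the group labels and the sample decisions are all hypothetical.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the share of favourable outcomes for each group.

    records: iterable of (group, outcome) pairs, where outcome is True
    when the algorithm's decision was favourable to the person.
    """
    totals = defaultdict(int)
    favourable = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        if outcome:
            favourable[group] += 1
    return {group: favourable[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Ratio of the lowest group rate to the highest (1.0 means equal rates)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a favourable outcome, so rates are equal
    return min(rates.values()) / highest


# Hypothetical decisions produced by the algorithm under review.
sample = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

rates = selection_rates(sample)
ratio = rate_ratio(rates)
print(rates, round(ratio, 2))
```

A low ratio would not prove that the algorithm is unfair, but it would give the auditor a specific question to take back to the governance side of the engagement: where the training data came from, and whether the difference can be justified.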

If you would like to contribute to the research, please contact me.