Using machine learning to find the unpredictable
This spring, when I was learning R, I came across a paper by Anders Kristian Munk, Asger Gehrt Olesen and Mathieu Jacomy about using machine learning in anthropology – not to classify big data, as machine learning is often used, but to see what the algorithm can’t predict. Those unpredictable bits of data turned out to be the most interesting for qualitative analysis.
I was fascinated by the idea, and since I’d just been learning to use R for simple machine learning, I tried it on our dataset. We’ve been finding lots of interesting things in the data, but looking at the most common uses of machine vision in sci-fi, games and art is, to be honest, a bit boring: of course people use machine vision to scan things or analyse them. The idea of looking for the unpredictable bits of the dataset really appealed to me. And it worked!
I wrote up my results in a commentary on the original paper, and it was just published in Big Data &amp; Society. The paper is called Algorithmic failure as a humanities methodology: Machine learning’s mispredictions identify rich cases for qualitative analysis.
I made a video abstract for the paper too, because I think the basic idea of using algorithmic failure as a qualitative methodology has a lot of potential. Here’s my short version of the idea – read the paper for the analysis and to see just what I did.
AI is being used more and more on qualitative data. I think we could solve some of the ethical problems in AI by focusing more on its underlying epistemology. We need to find more ways of using AI and machine learning to support the fundamental epistemology of qualitative research.
The code I used for this is really quite simple, and I have published it on GitHub. I really hope people try out these ideas, whether by using and adapting the code or by finding other ways of using algorithmic failure as a generative, qualitative methodology.
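To give a flavour of how simple the core move is: the published code is in R and works on the project’s actual machine vision dataset, but the logic can be sketched in a few lines of Python. Everything below – the tiny dataset, the keyword “classifier” standing in for a trained model – is invented for illustration, not taken from the real analysis.

```python
# Toy sketch of "algorithmic failure as methodology":
# classify labelled cases, then keep the ones the classifier gets WRONG
# as candidates for close qualitative reading.
# All data and the keyword rule below are hypothetical stand-ins.

# Hypothetical labelled data: (description of a machine vision situation, topic)
data = [
    ("drone scans the battlefield for targets", "surveillance"),
    ("app analyses a selfie to estimate age", "analysis"),
    ("robot scans warehouse shelves for stock", "surveillance"),
    ("eye implant paints the world in impossible colours", "aesthetic"),
]

def predict(description):
    """Naive keyword rule standing in for a trained classifier."""
    if "scan" in description:
        return "surveillance"
    if "analys" in description:
        return "analysis"
    return "surveillance"  # majority-class fallback

# The interesting output is not the accuracy score but the failures:
mispredicted = [(text, label) for text, label in data
                if predict(text) != label]

for text, label in mispredicted:
    print(f"Rich case for qualitative analysis: {text!r} (actual: {label})")
```

The predictable cases (scanning, analysing) are exactly the ones the classifier handles easily and exactly the ones that are boring to read closely; the one case it mispredicts – the unusual, aesthetic use – is the kind of example the method surfaces for qualitative attention.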