This is a first draft that I’m using for a quick demo to my colleagues at CDN today. I’ll go through and organise it better within the next few days. – Jill 14.02.2024

In 2022 I learned about FAIR data, the movement to make research data Findable, Accessible, Interoperable and Reusable. One of UiB’s brilliant research librarians, Jenny Ostrup, patiently helped me make the dataset from the Machine Vision project FAIR – I wrote a little bit about that in my blog post enthusiastically explaining why you should learn R (which I still stand by, but maybe learn Python instead: it’s really not that hard, and it is so useful, especially now that you can ask ChatGPT for help).

Anyway, back then, Jenny suggested I find Wikidata IDs for as many items in our dataset as possible, because that makes it much easier for other researchers to connect our dataset to their datasets. The Machine Vision dataset documents 500 works of digital art, video games and narratives like movies and novels, and a lot of them were already in Wikidata. We could have saved ourselves all that time entering data about what year The Matrix was made and what country it was made in! The data for popular movies and games is very rich – you can compare our entry on The Matrix to the entry in Wikidata. But Wikidata has appalling coverage of digital artworks, indie games and electronic literature.
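(A side note on what that reconciliation work actually looks like: here’s a minimal Python sketch – mine, not part of Jenny’s workflow – that asks Wikidata’s public wbsearchentities API for candidate QIDs for a work’s title. The titles below are just examples, and search returns the closest label rather than a guaranteed match, so you’d still want to check each suggestion by hand.)

# Minimal sketch: look up candidate Wikidata QIDs for work titles using the
# public wbsearchentities API. Matches are by label, so check them by hand.
import requests

def find_qid(title, language="en"):
    """Return the QID of the first search hit for a title, or None."""
    response = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={
            "action": "wbsearchentities",
            "search": title,
            "language": language,
            "format": "json",
        },
        timeout=10,
    )
    results = response.json().get("search", [])
    return results[0]["id"] if results else None

# Example titles – in practice you would loop over all the works in the dataset
for work in ["The Matrix", "The Zizi Show"]:
    print(work, "->", find_qid(work))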

Recently I’ve been doing a lot more editing on Wikipedia (more about that another day), inspired by Deena Larsen’s impressive drive to improve coverage of women in electronic literature, and I was surprised when an experienced Wikipedian advising Deena recommended focusing on Wikidata first.

Becoming more familiar with Wikidata made me realise that we’ve been thinking way too small about FAIR data. We shouldn’t just add Wikidata identifiers to our datasets. We should add our datasets to Wikidata. And more than that, when we begin to plan our datasets, we should consider whether building on the data model already in Wikidata would be a good way to do it.

Take the Machine Vision dataset as an example. Many of the works it documents are already in Wikidata, but 189 of the digital artworks are not! We also have information about many characters in movies, novels, games and so on that are not currently in Wikidata. So I wrote up a proposal for how I could donate the dataset, showing how the variables in our dataset could map to Wikidata properties.

Work (in Machine Vision dataset): The Zizi Show
QID (Wikidata): Q124422240 (I created this)
Variables mapped to Wikidata properties:
  Year: 2020
  Creator: Jake Elwes
  Country: United Kingdom
  URL: https://zizi.ai/ -> URLs will need a statement of when they were active, as some are already dead or no longer the main URL
  Publication type: Art -> P31 (instance of) Q838948 (work of art)
  Technologies referenced: AI -> P180 (depicts) Q107307291 (artificial intelligence entity)
  Technologies used: Deepfake -> not sure?
  Topics: AI, gender, identity -> don’t import
  Sentiment: Empowering, exciting -> don’t import
  Characters: Zizi, Drag Artists -> create Zizi; don’t import “Drag artists” because they’re a collective character, so trickier

Work (in Machine Vision dataset): The Matrix
QID (Wikidata): Q83495 (already in Wikidata)
Variables mapped to Wikidata properties:
  Year: 1999
  Creator: Wachowski Sisters
  Country: United States -> P495 (country of origin) Q30 (United States)
  URL: n/a -> leave empty
  Publication type: Art -> P31 (instance of) is currently film – could add narrative, or set the import to skip properties that are already filled, like here?
  Technologies referenced: AI, Biometrics, Body scans, Drones, Holograms, Surveillance cameras, Virtual reality -> P180 (depicts)
  Technologies used: n/a
  Topics: AI, consciousness, dystopian etc. -> don’t import
  Sentiment: don’t import
  Characters: Neo, Morpheus, Rebels, Agent Smith -> don’t import Rebels as they’re a collective group; the others are already linked from the entry

Uh, nobody responded. Utter silence. Probably partly because it was a long text, and really, why would volunteers bother to read it? But the thing is, after spending more time learning about Wikidata, I’m pretty sure it’ll be absolutely fine for me to just go ahead and upload the data, or at least the bits of it that are uncontroversial.

So how would I do that? Well, there are tools for converting CSV files to Wikidata. Or, and this is easier to demonstrate here and now, you can use QuickStatements: you paste statements into the QuickStatements tool and hit Run. If you use Zotero you can do this right now with books and articles that are in your bibliography – you can export an item in your bibliography to the Wikidata QuickStatements format, which produces a text file that you can paste into a webpage that converts it into a Wikidata entry. So for instance, a recent journal article I added to my Zotero library can be output as a regular bibliographic reference like this:

Lyall, Ben. “Narratives in Numbers: Sociotechnical Storytelling with Self-Tracking.” Memory, Mind & Media 3 (2024): e1. https://doi.org/10.1017/mem.2023.12.

Or I can choose to export it as Wikidata QuickStatements.

To do this, right-click on the item, choose Export Item, then choose the format Wikidata QuickStatements:

This produces a text file with the reference in the QuickStatements format: CREATE makes a new item, and each following LAST line adds a statement to that item (P31 is instance of, P356 the DOI, P577 the publication date, P1476 the title, and so on).


CREATE
LAST P31 Q13442814
LAST Den "journal article from 'Memory, Mind & Media' published in 2024"
LAST P356 "10.1017/mem.2023.12"
LAST P953 "https://www.cambridge.org/core/product/identifier/S2635023823000127/type/journal_article"
LAST P478 "3"
LAST P304 "e1"
LAST P2093 "Ben Lyall" P1545 "1"
LAST P577 +2024-00-00T00:00:00Z/9
LAST Len "Narratives in numbers: Sociotechnical storytelling with self-tracking"
LAST P1476 en:"Narratives in numbers: Sociotechnical storytelling with self-tracking"
LAST P407 Q1860

Now go to https://quickstatements.toolforge.org, select New batch, paste in your text, click Run and hey presto, you have a new entry! If you want, you can add many references at a time.
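And to connect this back to the CSV route I mentioned above: here’s a rough Python sketch – an illustration, not an actual import script – that turns rows of a hypothetical Machine Vision CSV into QuickStatements for new items, roughly following the property mappings in my proposal. The column names (Title, Year, CountryQID) are made up for the example, and you’d want to eyeball the output before running a batch.

# Rough sketch: turn rows of a hypothetical CSV into QuickStatements (V1 syntax,
# tab-separated) for new items. Column names are assumptions for this example.
import csv

def rows_to_quickstatements(path):
    lines = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            lines.append("CREATE")
            lines.append(f'LAST\tLen\t"{row["Title"]}"')        # English label
            lines.append("LAST\tP31\tQ838948")                  # instance of: work of art
            if row.get("Year"):
                lines.append(f'LAST\tP577\t+{row["Year"]}-00-00T00:00:00Z/9')  # publication date, year precision
            if row.get("CountryQID"):
                lines.append(f'LAST\tP495\t{row["CountryQID"]}')  # country of origin
    return "\n".join(lines)

print(rows_to_quickstatements("machine_vision_works.csv"))

The printed text can then be pasted into a New batch at quickstatements.toolforge.org, exactly like the Zotero export above.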

There are lots of other tools too. For instance, the ORCIDator:

  1. If you’re a researcher and you’ve published anything, see if there is a Wikidata entry for you. If there is, check if it has your ORCID ID – add it if not. (And if you don’t have an ORCID ID yet, it’s time to make one!) If you’d rather check this programmatically, there’s a small query sketch after this list.
  2. Copy and paste the QID for the entry about yourself. It’s the grey number after your name at the top that starts with Q.
  3. Now try the ORCIDator! Paste your QID into the box. If there’s more info about you on ORCID than Wikidata, click the “Add metadata from ORCID authors to Wikidata” button. Otherwise, choose Create/amend papers for ORCID authors.
  4. Watch it add your papers and info!
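For step 1, here’s a small Python sketch against the Wikidata Query Service that looks for an item with a given ORCID iD (property P496 in Wikidata). The ORCID at the bottom is just a placeholder – swap in your own.

# Sketch: ask the Wikidata Query Service whether any item has a given ORCID iD (P496).
import requests

def qid_for_orcid(orcid):
    query = f'SELECT ?item WHERE {{ ?item wdt:P496 "{orcid}" }} LIMIT 1'
    response = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": query, "format": "json"},
        headers={"User-Agent": "orcid-check-sketch/0.1 (example)"},
        timeout=10,
    )
    bindings = response.json()["results"]["bindings"]
    if not bindings:
        return None
    return bindings[0]["item"]["value"].rsplit("/", 1)[-1]  # strip the entity URI prefix

print(qid_for_orcid("0000-0002-1825-0097"))  # placeholder ORCID – use your own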

OK, so I haven’t even explained how useful Wikidata is.

Quick links to be expanded:
