In 2022 I learned about FAIR data, the movement to make research data Findable, Accessible, Interoperable and Reusable. One of UiB’s brilliant research librarians, Jenny Ostrup, patiently helped me make the dataset from the Machine Vision project FAIR. I wrote a little about that in my blog post enthusiastically explaining why you should learn R (which I still stand by, but maybe learn Python instead: it’s really not that hard, and it is so useful, especially now that you can ask ChatGPT for help).

Anyway, back then, Jenny suggested I find Wikidata IDs for as many items in our dataset as possible, because that makes it much easier for other researchers to connect our dataset to their datasets. The Machine Vision dataset documents 500 works of digital art, video games and narratives like movies and novels, and a lot of them were already in Wikidata. We could have saved ourselves all that time entering data about what year The Matrix was made and what country it was made in! The data for popular movies and games is very rich – you can compare our entry on The Matrix to the entry in Wikidata. But Wikidata has appalling coverage of digital artworks, indie games and electronic literature.

Recently I’ve been doing a lot more editing on Wikipedia (more about that another day), inspired by Deena Larsen’s impressive drive to improve coverage of women in electronic literature, and I was surprised when an experienced Wikipedian advising Deena recommended focusing on Wikidata first.

Becoming more familiar with Wikidata made me realise that we’ve been thinking way too small in FAIR data. We shouldn’t just add Wikidata identifiers to our datasets. We should add our datasets to Wikidata. And more than that, when we begin to plan our datasets, we should be considering whether using the data model already in Wikidata would be a good way to do it.

Take the Machine Vision dataset as an example. Many of the works it documents are already in Wikidata, but 189 of the digital artworks are not! We also have information about many characters in movies, novels, games and so on that are not currently in Wikidata. So I wrote up a proposal for how I could donate the dataset, showing how the variables in our dataset could map to Wikidata properties.

In Machine Vision dataset: The Zizi Show
QID (Wikidata): Q124422240 (I created this)
Variables mapped to Wikidata properties:
Year: 2020
Creator: Jake Elwes
Country: United Kingdom
URL: https://zizi.ai/ -> URLs will need a statement of when they were active, as some are already dead or no longer the main URL
Publication type: Art -> P31 (instance of) Q838948 (work of art)
Technologies referenced: AI -> P180 (depicts) Q107307291 (artificial intelligence entity)
Technologies used: Deepfake -> Not sure?
Topics: AI, gender, identity -> don’t import
Sentiment: Empowering, exciting -> don’t import
Characters: Zizi, Drag Artists -> create Zizi, don’t import “Drag artists” because they’re a collective character so trickier

In Machine Vision dataset: The Matrix
QID (Wikidata): Q83495 (already in Wikidata)
Variables mapped to Wikidata properties:
Year: 1999
Creator: Wachowski Sisters
Country: United States -> P495 (country of origin) Q30 (United States)
URL: n/a -> empty
Publication type: Art -> P31 (currently film; could add narrative, or set to not import data when the property is already filled, like here?)
Technologies referenced: AI, Biometrics, Body scans, Drones, Holograms, Surveillance cameras, Virtual reality -> P180 (depicts)
Technologies used: n/a
Topics: AI, consciousness, dystopian etc -> don’t import
Sentiment: don’t import
Characters: Neo, Morpheus, Rebels, Agent Smith -> don’t import Rebels as they are a collective group; the others are already linked from the entry
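To make the mapping a bit more concrete, here is a minimal Python sketch of how a row like the ones above could be turned into QuickStatements lines. It is only an illustration under some assumptions: the dictionary layout and the function name `statements_for` are hypothetical, the QID lookup contains only the QIDs that appear in the proposal, and using P577 (publication date) for Year is my guess rather than something the proposal settled. Values without a known QID are reported instead of imported.

```python
# Variable -> Wikidata property. P31, P495 and P180 come from the proposal;
# using P577 (publication date) for Year is an assumption.
PROPERTY_FOR = {
    "Year": "P577",
    "Country": "P495",
    "Publication type": "P31",
    "Technologies referenced": "P180",
}

# Variables the proposal marks as "don't import".
SKIP = {"Topics", "Sentiment"}

# Text value -> QID. Only QIDs that appear in the proposal are listed;
# a real import would need a reviewed lookup for every value.
QID_FOR = {
    "Art": "Q838948",
    "AI": "Q107307291",
    "United States": "Q30",
}


def statements_for(work, qid=None):
    """Return (QuickStatements lines, unresolved values) for one work.

    With a qid, statements are added to that existing item.
    Without one, a CREATE line and an English label are emitted first.
    """
    lines, unresolved = [], []
    subject = qid or "LAST"
    if qid is None:
        lines.append("CREATE")
        lines.append(f'LAST\tLen\t"{work["Title"]}"')
    for variable, values in work.items():
        if variable in SKIP or variable not in PROPERTY_FOR:
            continue
        prop = PROPERTY_FOR[variable]
        for value in values:
            if variable == "Year":
                # Year precision (/9), same date shape as in a Zotero export.
                target = f"+{value}-00-00T00:00:00Z/9"
            elif value in QID_FOR:
                target = QID_FOR[value]
            else:
                unresolved.append((variable, value))
                continue
            lines.append(f"{subject}\t{prop}\t{target}")
    return lines, unresolved


zizi = {
    "Title": "The Zizi Show",
    "Year": ["2020"],
    "Country": ["United Kingdom"],
    "Publication type": ["Art"],
    "Technologies referenced": ["AI"],
    "Topics": ["AI", "gender", "identity"],
    "Sentiment": ["Empowering", "exciting"],
}

# The Zizi Show already has an item, so statements go to its QID.
lines, unresolved = statements_for(zizi, qid="Q124422240")
print("\n".join(lines))
print("Needs a QID before import:", unresolved)
```

Keeping the "don’t import" variables in an explicit skip list means the controversial fields cannot slip into a batch by accident, and the unresolved list shows which values still need a person to pick the right Wikidata item.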

Uh, nobody responded. Utter silence. Probably partly because it was a long text and really why would volunteers bother to look at it? But the thing is, after spending more time learning about Wikidata, I’m pretty sure it’ll be absolutely fine for me to just go ahead and upload the data, or at least the bits of it that are uncontroversial.

So how would I do that? Well, there are tools for converting CSV files to Wikidata. Or, and this is easier to demonstrate here and now, you can use QuickStatements: just paste the statements into the tool and hit Run. If you use Zotero you can do this right now with books and articles that are in your bibliography: export an item to the format Wikidata QuickStatements and you get a text file that you can paste into a webpage that converts it to a Wikidata entry. So for instance, a recent journal article I added to my Zotero library can be output as a regular bibliographic reference like this:

Lyall, Ben. “Narratives in Numbers: Sociotechnical Storytelling with Self-Tracking.” Memory, Mind & Media 3 (2024): e1. https://doi.org/10.1017/mem.2023.12.

Or I can choose to export it as Wikidata QuickStatements.

To do this, right-click on the item, choose Export Item, then choose the format Wikidata QuickStatements.

This produces a text file with the reference in the QuickStatements format:


CREATE
LAST P31 Q13442814
LAST Den "journal article from 'Memory, Mind & Media' published in 2024"
LAST P356 "10.1017/mem.2023.12"
LAST P953 "https://www.cambridge.org/core/product/identifier/S2635023823000127/type/journal_article"
LAST P478 "3"
LAST P304 "e1"
LAST P2093 "Ben Lyall" P1545 "1"
LAST P577 +2024-00-00T00:00:00Z/9
LAST Len "Narratives in numbers: Sociotechnical storytelling with self-tracking"
LAST P1476 en:"Narratives in numbers: Sociotechnical storytelling with self-tracking"
LAST P407 Q1860

Now go to https://quickstatements.toolforge.org, select New batch, paste in your text, click Run and hey presto, you have a new entry! If you want you can add many references at a time.
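Before pasting a large batch it can be reassuring to check what it will actually do. The following is a minimal Python sketch, not part of QuickStatements or Zotero, that counts how many new items a batch would create and lists the properties and English labels it sets. The function name `summarise_batch` is hypothetical, and the sketch assumes the simple line shapes shown above (`CREATE`, or `LAST` followed by a property or a label code such as `Len`).

```python
def summarise_batch(text):
    """Summarise a QuickStatements batch: new items, properties, English labels."""
    new_items = 0
    properties = []
    labels = []
    for raw in text.splitlines():
        # Split on spaces or tabs into at most three pieces:
        # subject, property (or label code), and the rest of the line.
        parts = raw.split(None, 2)
        if not parts:
            continue
        if parts[0] == "CREATE":
            new_items += 1
        elif len(parts) == 3:
            field = parts[1]
            if field.startswith("P") and field[1:].isdigit():
                if field not in properties:
                    properties.append(field)
            elif field == "Len":
                labels.append(parts[2].strip('"'))
    return {"new_items": new_items, "properties": properties, "labels": labels}


batch = """CREATE
LAST P31 Q13442814
LAST P356 "10.1017/mem.2023.12"
LAST Len "Narratives in numbers: Sociotechnical storytelling with self-tracking"
LAST P407 Q1860
"""

summary = summarise_batch(batch)
print(summary["new_items"], "new item(s)")
print("Properties set:", ", ".join(summary["properties"]))
print("Labels:", summary["labels"])
```

If the number of new items is higher than you expected, that is a good moment to search Wikidata for existing entries before clicking Run, since a duplicate is more work to clean up than to avoid.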

There are lots of other tools too. For instance, the ORCIDator:

  1. If you’re a researcher and you’ve published anything, see if there is a Wikidata entry for you. If there is, check if it has your ORCID ID – add it if not. (And if you don’t have an ORCID ID yet it’s time to make one!)
  2. Copy and paste the QID for the entry about yourself. It’s the grey number after your name at the top that starts with Q.
  3. Now try the ORCIDator! Paste your QID into the box. If there’s more info about you on ORCID than Wikidata, click the “Add metadata from ORCID authors to Wikidata” button. Otherwise, choose Create/amend papers for ORCID authors.
  4. Watch it add your papers and info!
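The first step above, checking whether Wikidata already has an entry with your ORCID iD, can also be done with a query. Here is a minimal Python sketch that builds a URL for the Wikidata Query Service asking for any item whose P496 (ORCID iD) matches. The function name `orcid_lookup_url` is hypothetical, the sample iD is the fictional example from ORCID’s own documentation, and the sketch only builds the URL: fetching it (in a browser or with a script) is left to you.

```python
import re
from urllib.parse import urlencode

# Shape of an ORCID iD: four groups of four, last character a digit or X.
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_lookup_url(orcid):
    """Build a Wikidata Query Service URL that finds items with this ORCID iD."""
    if not ORCID_PATTERN.match(orcid):
        raise ValueError(f"Not a well-formed ORCID iD: {orcid!r}")
    # P496 is the Wikidata property for ORCID iD.
    query = f'SELECT ?item WHERE {{ ?item wdt:P496 "{orcid}" . }}'
    params = urlencode({"query": query, "format": "json"})
    return "https://query.wikidata.org/sparql?" + params


# Placeholder iD from ORCID's documentation; replace it with your own.
print(orcid_lookup_url("0000-0002-1825-0097"))
```

An empty result means no Wikidata item carries that iD yet, which could mean either that there is no entry about you or that the entry exists but is missing its ORCID iD.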

OK, so I haven’t even explained how useful Wikidata is.

Quick links to be expanded:

