Generating research papers reveals our clichés
A few weeks ago Meta released Galactica, a language model that generates scientific papers based on a prompt you type in. They put it online and invited people to try it out, but took it down after just three days, once people had generated convincing but utterly false “papers”. You can still download the code, though. According to the paper they published about the model, even the biggest version can run on an NVIDIA A100, a GPU that costs 132,000 NOK, so not as utterly out of academia’s reach as foundation models were thought to be.
I tested it out while it was online. The tweet I saw announcing it was from Yann LeCun, Chief AI Scientist at Meta, who wrote that it “won’t write papers automatically for you, but it will greatly reduce your cognitive load while you write them.”
Trying it out on the kinds of research questions I tend to wonder about does not make me think it’ll be helpful in my writing process. It comes up with falsehoods (“feminist hypertext critics have mostly focused on plot”), provides no references, and is full of clichés. Having read the paper, this makes sense: the model is not trained on humanities research at all. It tried to answer, though.
The lack of references is startling, although I see from Yann LeCun’s tweet that he claims there are references, so perhaps they are just missing from the demo site. Or perhaps it simply couldn’t do references for humanities research. Without references, it is not a research paper, no matter how many papers the model has been trained on.
The clichéd structure and writing style are interesting. I tried the prompt “Summarise research on machine vision from feminist posthumanist perspective”, and while it didn’t come up with anything useful, it did replicate the sort of general statements about how important this would be that I and many others have made so often that they have become clichés.
We will identify relevant literature and analyse it thematically. We will use these themes to develop a set of recommendations for how to make machine vision more equitable, transparent and just. We will then write a paper to summarise our findings.
It suggests a systematic literature review, which is not a methodology I have ever seen used by feminist posthumanists. But yeah, those recommendations for “equitable, transparent and just” AI? I’ve promised them, too. They are simultaneously a dime a dozen and the holy grail of AI.
The “research paper” also gets a bit repetitive.
We will conduct a systematic literature review using the PRISMA guidelines. We will search academic databases for relevant literature and analyse the literature thematically. We will use these themes to develop a set of recommendations for how to make machine vision more equitable, transparent and just. We will then write a paper to summarise our findings.
The “Potential Challenges” section is particularly helpless.
We expect to find a large amount of literature on the harms of machine vision. However, it may be challenging to find literature on how these harms relate to feminist and posthumanist theories. We will need to be sensitive to this potential challenge.
I suppose this isn’t wrong; it’s just not how a human researcher would address the challenge. Well, perhaps an undergrad would, and this is the sort of thing a new MA student might write, but research degrees train you to develop connections and knowledge that isn’t already described by others.
Unsurprisingly, a lot of people criticised Galactica quite quickly. Michael Black, the director of the Max Planck Institute for Intelligent Systems, was not impressed.
And yet, I don’t think this is without promise. Yes, it’s dangerous the way it hallucinates facts and makes up references. So do all the large language models. Their glib command of clichés and writing structures makes them seem all the more convincing, if rather bland. You get so bored you glaze over and might well not check whether any of it is actually true.
But if the point of Galactica is to summarise existing research papers, that could be really helpful, although useless without references. Sites like Elicit already do this, though not well for the humanities. They also present it as a way of finding research, not generating it.
AI has been reading our papers for a good while – I wrote a blog post about how to write for machine readers and I was only half joking. We do need to be aware of how our words will be read, interpreted, processed. At the same time, once your words are out, they’re no longer in your control. If they were ever in your control. If they were ever your words.
After all, we’re all large language models in a sense, trained on all the words and sentences we’ve ever heard or read. AI just skips the subjectivity, anxiety and self-doubt.
(I generated the top image using DALL-E.)