This is fun: brand new words invented yesterday. The Norwegian Newspaper Corpus scrapes the online editions of Norwegian newspapers, matches every word against the words already used in this and other electronic corpuses (corpi?) they have access to, and lists the words that have never been used before. Norwegian, like German, uses compound words, so while the English “jazz festival” is two words, in Norwegian it’s “jazzfestival”, which means lots of the new words are freshly created compounds. Some are misspellings, others are anglicisms. (From Gisle Andersen’s presentation; this is a Bergen-based project.)
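For the curious, here is roughly how I imagine the matching step could work. This is only a minimal sketch in Python, not the project's actual pipeline, which wasn't described in detail: the function names `tokenize` and `new_words`, the tokenizing rule and the tiny sample texts are all my own assumptions for illustration.

```python
import re

# Assumption: a "word" is a run of letters, optionally joined by internal hyphens.
# The real corpus project may tokenize quite differently.
WORD_RE = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")


def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return WORD_RE.findall(text.lower())


def new_words(todays_text, known_words):
    """Return the words in todays_text that are not in known_words,
    in order of first appearance and without repeats."""
    seen = set()
    fresh = []
    for word in tokenize(todays_text):
        if word not in known_words and word not in seen:
            seen.add(word)
            fresh.append(word)
    return fresh


# Hypothetical sample data: everything seen before today, and today's text.
known = set(tokenize("Byen har en festival hver sommer. Mange liker jazz."))
today = "Årets jazzfestival åpner i dag. Byen venter mange gjester."

print(new_words(today, known))
```

Note that “jazzfestival” shows up as new even though both “jazz” and “festival” are already known, which is exactly why compounds dominate the lists. A misspelling would be flagged the same way, since a plain set lookup can't tell a typo from a coinage.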
6 thoughts on “words invented yesterday”
Lars
Hooray! One of yesterday’s new words is from Finnmarken: Helnestrål. Unfortunately, it’s just a boat name, and an old one at that. Guess we have to try harder.
Jill
Ooh, yes, game the system!!! Great idea!
lesley
corpora?
jill/txt » notes from digital textuality seminar in Bergen
[…] ng it from online newspapers. Working with Knut Hofland here in Bergen. Generates lists of new words. Material is used by lexicographers, good tool for studying emergence and spread of new words. Political eve […]
Jamie
The plural of corpus is corpora (cuz u asked)
Stephen Shimanek
Longman gives stress on the first syllable in English, something I can’t seem to get into my head.