This is fun: brand new words invented yesterday. The Norwegian Newspaper Corpus scrapes newspaper content from online versions of Norwegian newspapers, matches all the words against words used previously in this and other electronic corpuses (corpi?) they have access to, and lists words never used before. Norwegian, like German, uses compound words, so while the English “jazz festival” is two words, in Norwegian it’s “jazzfestival” — so lots of the new words are freshly created compounds. Some are misspellings, others are anglicisms. (from Gisle Andersen’s presentation — this is a Bergen-based project.)
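The matching step can be sketched in a few lines of Python. This is only a toy illustration of the idea as described above, not the corpus project's actual code: the function name `find_new_words`, the simple letter-based tokeniser and the tiny `known` word set are all assumptions made up for the example.

```python
import re


def find_new_words(text, known_words):
    """Return words in `text` that are not in `known_words`.

    Words are lowercased and listed once each, in order of first appearance.
    Hypothetical helper for illustration only.
    """
    seen = set()
    new_words = []
    # [^\W\d_]+ matches runs of Unicode letters, so å, ø and æ are kept.
    for token in re.findall(r"[^\W\d_]+", text.lower()):
        if token not in known_words and token not in seen:
            seen.add(token)
            new_words.append(token)
    return new_words


# Assumed stand-in for "every word seen in earlier corpora".
known = {"jazz", "festival", "i", "bergen", "starter", "dag"}
today = "Jazzfestival i Bergen starter i dag"
print(find_new_words(today, known))
```

Because "jazz" and "festival" are known but the compound is not, "jazzfestival" is reported as new, which is why compounding languages produce so many hits. A real system would also need to filter out misspellings and proper names.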
6 thoughts on “words invented yesterday”
Lars
Hooray! One of yesterday’s new words is from Finnmarken: Helnestrål. Unfortunately, it’s just a boat name, and an old one at that. Guess we have to try harder.
Jill
Ooh, yes, game the system!!! Great idea!
lesley
corpora?
jill/txt » notes from digital textuality seminar in Bergen
[…] ng it from online newspapers. Working with Knut Hofland here in Bergen. Generates lists of new words. Material is used by lexicographers, good tool for studying emergence and spread of new words. Political eve […]
Jamie
The plural of corpus is corpora (cuz u asked)
Stephen Shimanek
Longman gives stress on the first syllable in English, something I can’t seem to get into my head.