Last week I was in London, where I visited Pierre Huyghe’s exhibition Uumwelt at the Serpentine Gallery. You walk in, and there are flies in the air, flies and a large screen showing images flickering past, fast. The images are generated by a neural network and are reconstructions of images humans have looked at, but that the neural network hasn’t had direct access to – they are generated based on brainwave activity in the human subjects.
The images flicker past in bursts, fast fast fast fast fast slow fast fast fast, again and again, never resting. Jason Farago describes the rhythm as the machine’s “endless frantic attempts to render human thoughts into visual form”, and frantic describes it well, but it’s a nonhuman frantic, a mechanical frantic that doesn’t seem harried. It’s systematic, mechanical, but never resting, never quite sure of itself but trying again and again. I think (though I’m not sure) that this is an artefact of the fMRI scanning or the processing of the neural network that Huyghe has chosen to retain, rather than something Huyghe has introduced.
Huyghe uses technology from Yukiyasu Kamitani’s lab at Kyoto University. A gif Kamitani posted to Twitter gives a glimpse into how the system uses existing photographs as starting points for figuring out what the fMRI data might mean – the images that flicker by on the right-hand side sometimes have background features, like grass or a horizon line, that are not present in the left image (the image shown to the human). Here is a YouTube version of the gif he tweeted:
The images and even the flickering rhythms of the Kamitani Lab video are really quite close to Huyghe’s Uumwelt. At the exhibition I thought perhaps the artist had added a lot to the images, used filters or altered colours or something, but I think he actually just left the images pretty much as the neural network generated them. Here’s a short video from one of the other large screens in Uumwelt – there were several rooms in the exhibition, each with a large screen and flies. Sections of paint on the walls of the gallery were sanded down to show layers of old paint, leaving large patterns that at first glance looked like mould.
The neural network Kamitani’s lab uses has a training set of images (photographs of owls and tigers and beaches and so on) which have been viewed by humans who were hooked up to fMRI, so the system knows the patterns of brain activity that are associated with each of the training images. Then a human is shown a new image that the system doesn’t already know, and the system tries to figure out what that image looks like by combining features of the images it knows produce similar brain activity. Or to be more precise, “The reconstruction algorithm starts from a random image and iteratively optimize the pixel values so that the DNN [DNN=deep neural network] features of the input image become similar to those decoded from brain activity across multiple DNN layers” (Shen et al. 2017). Looking at the lab’s video and at Uumwelt, I suspect the neural network has seen a lot of photos of puppy dogs.
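To get a feel for what “starts from a random image and iteratively optimize the pixel values” means, here is a toy sketch of that loop in Python. This is not the Kamitani Lab's code and makes no claim about how their system is built. The two small random "layers" (`W1`, `W2`), the 8x8 image size, the step count and the learning rate are all my own assumptions, and the target features are computed from a hidden image rather than decoded from anyone's brain activity.

```python
import numpy as np

rng = np.random.default_rng(0)

N_PIXELS = 64  # a tiny 8x8 "image", flattened (assumption for illustration)

# Two hypothetical layers with random weights, standing in for a real DNN.
W1 = rng.normal(size=(32, N_PIXELS)) / np.sqrt(N_PIXELS)
W2 = rng.normal(size=(16, 32)) / np.sqrt(32)


def features(image):
    """Return the activations of both toy layers for a flattened image."""
    h1 = np.tanh(W1 @ image)
    h2 = np.tanh(W2 @ h1)
    return h1, h2


def loss_and_gradient(image, target_h1, target_h2):
    """Feature mismatch summed over both layers, and its gradient w.r.t. the pixels."""
    h1, h2 = features(image)
    d1 = h1 - target_h1
    d2 = h2 - target_h2
    loss = 0.5 * (d1 @ d1 + d2 @ d2)
    # Backpropagate by hand through the two tanh layers.
    grad_a2 = d2 * (1 - h2 ** 2)
    grad_h1 = d1 + W2.T @ grad_a2
    grad_a1 = grad_h1 * (1 - h1 ** 2)
    grad_image = W1.T @ grad_a1
    return loss, grad_image


def reconstruct(target_h1, target_h2, steps=1000, learning_rate=0.02):
    """Start from a random image and nudge its pixels toward the target features."""
    image = rng.normal(size=N_PIXELS)
    history = []
    for _ in range(steps):
        loss, grad = loss_and_gradient(image, target_h1, target_h2)
        history.append(loss)
        image = image - learning_rate * grad
    return image, history


# Stand-in for "features decoded from brain activity": here they are simply
# computed from a hidden image that the optimisation loop never sees.
hidden_image = rng.normal(size=N_PIXELS)
target_h1, target_h2 = features(hidden_image)

reconstructed, history = reconstruct(target_h1, target_h2)
print(f"feature mismatch: {history[0]:.3f} at the start, {history[-1]:.3f} at the end")
```

Each pass through the loop produces a slightly different candidate image, which may be part of why the results flicker the way they do, though that is my guess rather than anything the papers say.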
I’ve read a few of the Kamitani Lab’s papers, and as far as I’ve seen, they don’t really discuss how they conceive of vision in their research. I mean, what exactly does the brain activity correspond to? Yes, when we look at an image, our brain reacts in ways that deep neural networks can use as data to reconstruct an image that has some similarities with the image we looked at. But when we look at an image, is our brain really reacting to the pixels? Or are we instead imagining a puppy dog or an owl or whatever? I would imagine that if I look at an image of somebody I love my brain activity will be rather different than if I look at an image of somebody I hate. How would Kamitani’s team deal with that? Is that data even visual?
Kamitani’s lab also tried just asking people to imagine an image they had previously been shown. To help them remember the image, they were “asked to relate words and visual images so that they can remember visual images from word cues” (Shen et al. 2017). As you can see below, it’s pretty hard to tell the difference between a subject’s remembered swan and their remembered aeroplane. I wonder if they were really remembering the image at all, or just thinking of the concept or thing itself.
Umwelt means “environment” or “world around us” in German. Huyghe has given it an extra u at the start, in what Farago calls a “stutter” that matches the rhythms of the videos, though I had thought of it as more of a negator, an “un-environment”. Huyghe is known for his environmental art, where elements of the installation work together in an ecosystem, and of course the introduction of flies to Uumwelt is a way of combining the organic with the machine. Sensors detect the movements of the flies, as well as temperature and other data relating to the movement of humans and flies through the gallery, and this influences the display of images. The docent I spoke with said she hadn’t noticed any difference in the speed or kinds of images displayed, but that the videos seemed to move from screen to screen, and that a new set of videos that hadn’t been shown for a while would pop up from time to time. The exact nature of the interaction wasn’t clear. Perhaps the concept is more important than the actuality of it.
The flies apparently are born and die within the gallery, living their short lives entirely within the artwork. They are fed by the people working at the gallery, and appear as happy as flies usually appear, clearly attracted to the light of the videos.
Dead flies are scattered on the floors. They have no agency in this Uumwelt. At least none that affects the machines.