The Value of the Qualitative Experience: Reversing Neural Networks
As a child, I would often stare at the clouds and assign identities to their shapes. There were clouds shaped like turtles, birds, airplanes, and everything in between. The ancient Greeks looked to the night sky and did something very similar with star formations, giving us the first constellations. Recently, researchers at Google asked what a computer would see if it looked at the clouds. Broadly put: if we allow a computer to interpret an image of something it has been trained to identify, what is the result?
By reversing the flow of data through a neural network designed to identify images based on a sampling of similar images, the researchers were able to answer just that question. Let’s step back a little. An (artificial) neural network is a tool designed to answer a specific question by “learning” the qualities that lead it to the correct answer. That sounds very abstract, but one of the more common uses of neural networks is identifying the content of images (a more obscure application might be playing video games). By feeding in several million positive examples of what a “banana” looks like, the neural net is able to conclude that a given source image is likely to be a banana because it possesses many of the qualities of the bananas on which it was trained.
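To make the “learning” half a bit more concrete, here is a minimal sketch of a classifier being trained, assuming PyTorch (the researchers’ actual framework and architecture are not something this article covers). The random tensors are placeholders standing in for real labelled photographs.

```python
# A toy image classifier, assuming PyTorch. Random tensors stand in for the
# millions of labelled photos a real system would train on.
import torch
import torch.nn as nn

# Tiny convolutional network: 64x64 RGB image -> scores for 10 made-up classes
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 10),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):                      # placeholder training loop
    images = torch.rand(8, 3, 64, 64)        # stand-ins for photos of bananas, etc.
    labels = torch.randint(0, 10, (8,))      # stand-ins for the correct answers
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)    # how wrong were the network's guesses?
    loss.backward()                          # push the error back through the network
    optimizer.step()                         # nudge the weights toward better guesses
```

The point to notice is the direction of the updates: the image is fixed and the network’s weights move. The reversal described next flips that relationship.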
That raises the question: what are the qualities of a banana? We tend to answer in qualitative terms, because we live in a world that has significant variation and a shared understanding of qualitative traits. We would say that a banana is long and skinny, yellow, rounded, and curved. Computers do not share that world with us, and so can only describe the qualities of a banana in a quantitative sense. But that’s alright, because there is no situation in which a computer would be asked to qualitatively understand a banana.
Instead of asking the neural net to identify whether an image is of a banana, the researchers at Google asked it to generate a picture of what it believed a banana to be, among other things. They also required that the picture have statistics similar to natural photographs, which is the key that allows the neural net to produce a picture with qualitative properties. The results were fascinating.
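The reversal itself can be sketched as gradient ascent on the input image rather than on the network’s weights. The version below is an illustration of the idea, not the researchers’ exact method: it assumes a pretrained GoogLeNet from torchvision, uses ImageNet’s banana class index, and stands in for the “natural image statistics” prior with an occasional blur.

```python
# A sketch of the reversal: hold the trained network fixed and update the image
# so the "banana" score rises. The class index and weights below are assumptions
# (ImageNet / torchvision), not details taken from the research itself.
import torch
import torchvision.models as models
import torchvision.transforms.functional as TF

model = models.googlenet(weights="IMAGENET1K_V1").eval()
for p in model.parameters():
    p.requires_grad_(False)                   # the network no longer learns

image = torch.rand(1, 3, 224, 224, requires_grad=True)   # the image does
optimizer = torch.optim.Adam([image], lr=0.05)
BANANA = 954                                  # ImageNet's banana class (assumed)

for step in range(200):
    optimizer.zero_grad()
    score = model(image)[0, BANANA]
    (-score).backward()                       # gradient ascent on the class score
    optimizer.step()
    with torch.no_grad():
        image.clamp_(0, 1)                    # keep pixels in a valid range
        if step % 10 == 0:                    # crude stand-in for the natural-image
            image.copy_(TF.gaussian_blur(image, kernel_size=3))  # statistics prior
```

Without some prior like the blur, the optimization tends to produce high-frequency static that the network loves but that no human would call a banana; that constraint is what makes the output qualitatively readable.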
You can see that the neural network concentrates heavily on the form of the subject (as it should), but not as much on the context. So subjects that appear in many different contexts, such as a measuring cup or a screw, tend to look qualitatively further from their ideal states. Subjects that tend to appear in only a select few contexts, on the other hand, are very identifiable, such as the anemone fish and the parachute. Images of parachutes almost always have the sky in the background, and maybe a little land. In the same sense, clownfish (“anemone fish”) are almost always pictured with the anemone in the background.
By using the neural net to generate a qualitative token of its intended target, users are able to test the efficacy of their neural nets in a way that would be extremely difficult to achieve meaningfully with quantitative methods. The researchers give the example of “dumbbells”. After training their neural net to identify dumbbells and then asking it to generate a picture of one, they found that the neural net had been paying too much attention to the arms that are invariably attached to dumbbells in its training images.
The researchers were also able to ask a neural net to create an image based on what it sees in random noise. The results are purely a function of the things the network was trained to identify. It’s almost like dreaming; in the absence of other meaning, what appears? We now know that computers do not dream of electric sheep, but rather of vast landscapes and colorful iterative patterns.
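The “dreaming” variant can be sketched under the same assumptions as before: start from pure noise and, instead of pushing toward a chosen class, amplify whatever an intermediate layer already responds to. The particular layer (inception4c) is an arbitrary choice for illustration.

```python
# A sketch of the "dreaming" variant: amplify whatever features an intermediate
# layer finds in random noise, rather than steering toward any one class.
# The pretrained GoogLeNet and the chosen layer are assumptions for illustration.
import torch
import torchvision.models as models

model = models.googlenet(weights="IMAGENET1K_V1").eval()
for p in model.parameters():
    p.requires_grad_(False)

activations = {}
model.inception4c.register_forward_hook(
    lambda module, inputs, output: activations.update(feat=output)
)

image = torch.rand(1, 3, 224, 224, requires_grad=True)    # pure random noise
optimizer = torch.optim.Adam([image], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    model(image)
    loss = -activations["feat"].norm()   # reward whatever fired, whatever it was
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        image.clamp_(0, 1)
```

Because nothing tells the network what to look for, what emerges is whatever the chosen layer was tuned to detect during training, which is why the dreams are landscapes, animals, and repeating patterns rather than anything in the noise itself.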