Horses do not usually wear hats, and deep generative models, or GANs, do not normally follow the rules set by others. But a new device developed at MIT lets anyone into the GAN and tells the model, like a coder, to put a hat on the head of their horses.
In a new study at this month’s European Conference on Computer Vision, researchers showed that deep layers of neural networks can be edited, like so many lines of code, to produce stunning images that no one has seen before is.
Generation adversarial networks, or ganas, pit two neural networks against each other to create hyper-realistic images and sounds. A neural network, a generator, learns to mimic the faces seen in photographs, or speak the words it hears.
A second network, the discrepant, compares the output of the generator to the original. The generator then builds on judicious feedback until its concocted images and sounds are sufficiently convincing to pass for real.
GANs have fascinated researchers of artificial intelligence for their ability to represent amazing people, and sometimes, a bizarre cat who marries a bride standing at the door of a church Is left for, melted in a pile of fur, if left by the bride.
Like most intensive learning models, GANs rely on large datasets for learning. The more examples they see, the better they find them in imitating them.
But new studies suggest that larger datasets are not necessary. If you understand how a model is wired, Bau says, you can edit the numerical weights in its layers to achieve the behavior desired, even if no literal instance exists. No dataset? There is no problem. Just make your own.
“We’re like prisoners for our training data,” he says. “GANs only learn patterns that are already in our data. But here I can manipulate a condition in the model to make a horse with a hat. It is like editing a genetic sequence to create something completely new, such as inserting the firefly’s DNA into a plant to make it glow in the dark. ”
Bau was a software engineer at Google, and when he decided to attend school, he led the development of Google Hangouts and Google Image Search. The field of deep learning was exploding and he wanted to pursue fundamental questions in computer science.
Hoping to learn how to build transparent systems that will empower users, he joined MIT professor Antonio Toralba’s lab. There, he began to understand the deep nets and their millions of mathematical operations in how they represent the world.
Bau showed that you can, in a GAN, learn to isolate artificial neurons that, like layer cakes, had learned to attract a particular feature, like a tree, and closed them to make the tree disappear.
With this insight, Bau helped create GANPaint, a tool that lets users add and remove features such as doors and clouds from a photo. In the process, he finds that GANs have a stubborn streak: they won’t let you pull doors in the sky.
He says, “It was some rule, that used to say, the doors wouldn’t go there.” “It’s fascinating, we thought. It’s like an ‘if’ statement in a program. To me, it was a clear indication that the network had some sort of internal logic.”
To stave off many nights of sleep, Bau ran experiments and picked up through the layers of his models to equalize a conditional statement. In the end, it climbed on it. “Neural networks have different memory banks that act as a set of general rules, relating one set of learned patterns to another,” he says. “I realized that if you could identify a line of memory, you could write a new memory in it.”
In a shorter version of his ECCV talk, Bau demonstrated how to edit the model and rewrite memories using an intuitive interface he designed. He mimics a tree from one image and pastes it into another, placing it improperly on a building tower.
The model then extracts enough photographs of tree-sprouting towers to fill a family photo album. With a few more clicks, Bau transfers hats from human riders to his horses, and erases the reflection of light from a kitchen pillar.
Researchers hypothesize that each layer of a deep mesh acts as an associative memory, formed after repeated exposure to similar instances. For example, the Fed fed enough pictures of doors and clouds, the model learns that doors are the entrances to buildings, and clouds float in the sky. The model effectively remembers a set of rules for understanding the world.
The effect on manipulating GANs light is particularly striking. When GANPaint added windows to a room, for example, the model automatically added nearby reflections.