The important reason these tiny robots are taking pics of cats
At a photo shoot inside a cozy San Francisco coffee shop, models struck playful poses. Some sprawled out on shaggy pillows, limbs unfurling languidly.
Across the room, one stood, statuesque, on the top of a small white table as another strutted playfully down a wooden walkway.
Photographers captured their moves, clicking quickly from different directions and vantage points. The photos were sultry, moody, and, occasionally, featured furry paws.
This wasn’t an ordinary photo shoot. The subjects were cats with names like Passion, Shiloh, Buffy, and Blinx, who live at a café called KitTea, where visitors can pay to sip drinks and eat snacks while hanging out with resident and adoptable felines.
The photographers were Vector robots, small home robots made by Anki, and the mission was to take as many pictures as possible to help Vector learn to detect the felines that live in people’s homes.
Data — like, say, cute cat photos — is crucial to building artificial intelligence. The collection process is becoming increasingly important as we rely on AI to do an ever-increasing number of things, from helping self-driving cars navigate streets to getting virtual assistants like Alexa to respond to voices. That’s because in order for AI to work well, it generally needs to be trained first on a lot of data — and not just any kind of data, but information that reflects the kinds of tasks the AI will be working on.
But it’s not always easy to gather that data. One might even say the process can be like, well, herding cats.
Vector, which costs $250 and began shipping in October, is a cross between a companion and a pint-sized helper. It can give you a weather update, answer questions, take a picture of you, and play with the small, light-up cube that comes with it. It’s the latest model of robot from Anki, which has sold 2 million robots thus far.
Vector looks like a tiny black bulldozer with an itty-bitty lift, and brightly colored, slightly askew eyes. The robot — its creators invariably refer to it as a “he” — chitters and chatters, whether or not anyone is playing with it, and sounds like a cross between WALL-E, a guinea pig and a fart.
Vector relies on data to figure out how to do all kinds of things. That includes using its front-facing camera to recognize people and avoid bumping into objects, or its microphones to listen to human commands that start with the words “Hey Vector” and then respond appropriately.
One thing Vector can’t do right now is spot pets. Andrew Stein, Anki’s lead computer vision engineer and a cat owner himself, sees this as a problem for a robot that’s meant to engage with the world around it, which in many homes will include cats or dogs.
“If he’s smart about his environment and responds to a cat differently than a coffee mug sitting on his table, then he knows what a cat is, and that feels different,” Stein said as, nearby, a Vector photographed cats lounging on a rug.
Anki’s engineers are using artificial intelligence to teach Vector how to do this. A key (and sometimes tricky) part of making this work involves collecting data — in this case, that data includes photos of cats sitting, swiping, scratching and sniffing.
The company, which is also working on dog detection, hopes to roll out a feature that lets Vector perceive cats and dogs early next year. At first, Stein said, Vector will simply be able to detect a cat or a dog in the home, and the company is considering a range of simple reactions it could have, like taking an image owners can view in an accompanying smartphone app, or somehow interacting with the pet.
But getting Vector to notice a cat roving around your living room is not as simple as just showing the robot thousands of pictures of cats from existing online databases. Anki engineers have already used tens of thousands of these pictures to train a neural network — a kind of machine-learning algorithm loosely modeled after the way neurons function in the brain — on basic cat detection.
But Stein said the images in these databases are quite different from what cats look like from Vector’s viewpoint, which could be high above an animal or right in front of its paws, and most likely indoors.
“The key is getting data that is representative of what he will actually see when we deploy him into people’s homes,” he said.
Stein believes these images will “tune” Anki’s neural network, which Vector can then use to better spot furry friends.
The approach makes a lot of sense to Jason Corso, an associate professor at the University of Michigan who studies computer vision and video understanding. If Anki only used existing data sets on the web, or YouTube videos or Flickr photos of cats, its data would have all the biases of how humans typically take photos of their cats, he said.
For instance, if Corso took a photo of his cat, a tuxedo named Harry Potter, it would be from Corso’s height of about 5’6″. Chances are Vector won’t typically be looking at cats from that high up.
“Indeed, the robot needs to understand what a cat is from its own perspective,” he said.
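For readers curious how that kind of tuning works, here is a minimal sketch in Python. It is not Anki's code: it swaps the neural network for a tiny logistic-regression classifier, and the two data sets (`WEB_PHOTOS`, `ROBOT_PHOTOS`) are made-up stand-ins in which each "photo" is reduced to a single number. The only point is the pattern Stein describes: train first on plentiful generic data, then keep training on a small set that looks like what the robot will actually see.

```python
import math

# Hypothetical toy data: (feature, label) pairs, label 1 = cat, 0 = not a cat.
# In the robot-view set, cats fall where web non-cats did, mimicking a
# change of camera viewpoint.
WEB_PHOTOS = [(1.5, 1), (2.0, 1), (2.5, 1), (-0.5, 0), (0.0, 0), (0.5, 0)]
ROBOT_PHOTOS = [(-0.5, 1), (0.0, 1), (0.5, 1), (-2.5, 0), (-2.0, 0), (-1.5, 0)]


def predict(model, x):
    """Return the model's estimated probability that x is a cat."""
    w, b = model
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def train(model, data, steps=2000, lr=0.1):
    """Run full-batch gradient descent on the logistic loss, starting from `model`."""
    w, b = model
    n = len(data)
    for _ in range(steps):
        errors = [(predict((w, b), x) - y, x) for x, y in data]
        grad_w = sum(err * x for err, x in errors) / n
        grad_b = sum(err for err, _ in errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def accuracy(model, data):
    """Fraction of examples classified correctly at a 0.5 threshold."""
    hits = sum((predict(model, x) >= 0.5) == (y == 1) for x, y in data)
    return hits / len(data)


# Step 1: train from scratch on the generic "web" photos.
pretrained = train((0.0, 0.0), WEB_PHOTOS)

# Step 2: tune by continuing training on the robot's-eye-view photos.
tuned = train(pretrained, ROBOT_PHOTOS)

print("web-trained model on robot photos:", accuracy(pretrained, ROBOT_PHOTOS))
print("tuned model on robot photos:     ", accuracy(tuned, ROBOT_PHOTOS))
```

In this toy setup the web-trained model should stumble on the robot's-eye-view cats and the tuned one should recover. For brevity the sketch scores the tuned model on the same photos it was tuned on; a real project would hold some photos back for testing.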
To shoot the photos at KitTea, Anki employees placed Vectors on the floor, on tables and on a skinny wall-mounted catwalk. They pressed a button on Vector’s back, which captured five images in succession. A tiny, front-facing display showed the cat what the robot was shooting.
Over several hours, the team gathered more than 1,500 photos of the cats at the café.
Anki wants Vector to recognize that an animal is nearby without necessarily seeing the animal’s face, similar to how the robot can currently determine that a person is nearby by just seeing part of their body.
Eventually, Stein aims to have Vector identify specific pets rather than just determine that a cat or dog is nearby. Then, perhaps, it could react differently to different animals. That would make sense, since some animals may want to look at it while others may be more skittish or just uninterested.
This was true of the cats at the café. Some stared quizzically at the robot, while a few pawed, pounced on, or shoved it. Many of them didn’t seem to notice the robot at all; they just wanted to snooze on shaggy round pillows or sit in a kitty-sized replica of the Golden Gate Bridge and stare wistfully out the window.