Roundup
Hi, here are a few interesting bits and pieces from the world of AI. A public tax form from OpenAI reveals the crazy salaries of top AI researchers. There are also more competitions pushing for better image recognition models on mobiles, and for training systems as quickly and cheaply as possible.
Image recognition on mobiles
Google has launched another computer vision challenge to push real-time image recognition on mobile phones.
The On-Device Visual Intelligence Challenge (ODVI) is part of a workshop track at the Computer Vision and Pattern Recognition conference (CVPR), which takes place in Salt Lake City in June.
It’s challenging to build fast, accurate models for small mobile chips within a tight latency limit. ODVI will focus on a unified metric that measures the “number of correct classifications within a specified per-image average time limit of 33 [milliseconds]”, a budget that works out to roughly 30 images per second. Latency has been tricky to measure, and without a solid benchmark it’s difficult to compare different models.
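As a rough illustration of how such a metric could be scored, here is a minimal Python sketch. It is one plausible reading of the quoted definition, not Google's benchmarking code: the function name, the input format, and the rule that scoring stops once the total time budget is used up are all assumptions made for this example.

```python
from typing import Sequence


def correct_within_budget(
    latencies_ms: Sequence[float],
    correct: Sequence[bool],
    per_image_limit_ms: float = 33.0,
) -> int:
    """Count correct classifications made before the time budget runs out.

    Assumption: the budget is the per-image limit multiplied by the number
    of images, so a model may be slow on some images as long as its average
    latency stays within the limit.
    """
    if len(latencies_ms) != len(correct):
        raise ValueError("latencies_ms and correct must have the same length")

    budget_ms = per_image_limit_ms * len(latencies_ms)
    elapsed_ms = 0.0
    score = 0
    for latency, is_correct in zip(latencies_ms, correct):
        elapsed_ms += latency
        if elapsed_ms > budget_ms:
            # Out of time: remaining images count as unclassified.
            break
        if is_correct:
            score += 1
    return score


# Hypothetical measurements for two models on the same four images.
fast_model = correct_within_budget([30, 30, 30, 30], [True, False, True, True])
slow_model = correct_within_budget([50, 50, 50, 50], [True, True, True, True])
```

Under this reading, the slower model is more accurate per image but scores lower, because it exhausts the budget before it has seen every image.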
Participants will be given these tools to get started, according to Google’s blog post:
- TOCO compiler for optimizing TensorFlow models for efficient inference
- TensorFlow Lite inference engine for mobile deployment
- A benchmarking SDK that can be run locally on any Android phone
- Sample models to showcase successful mobile architectures that run inference in floating-point and quantized modes
- Google’s benchmarking tool for reliable latency measurements on specific Pixel phones
Registration for the competition is open until the deadline on 15 June.
Cash rules everything around me
OpenAI’s public tax forms reveal just how much companies are willing to splash out on AI experts, and it’s a lot.
In 2016, the non-profit research lab backed by Elon Musk paid Ilya Sutskever, one of its co-founders and its research director, a jaw-dropping $1.9m (£1.36m). Ian Goodfellow, known for his work on generative adversarial networks, was second on the list with $808,243 (£577,000). He has since moved to Google Brain. The third big name was the roboticist Pieter Abbeel, a professor at the University of California, Berkeley, and an adviser at OpenAI, who raked in $425,000 (£303,000).
AI conferences such as Neural Information Processing Systems (NIPS) have become perfect environments for poaching researchers and engineers. Many companies organise exclusive parties and events, inviting a select group of attendees in an attempt to woo them.
Since OpenAI is a non-profit, its average salaries are likely to be slightly lower than those offered by private companies, which also give new employees generous stock packages when they join.
So it's not too shabby to be an AI engineer right now.
NIPS?!
Organisers of the biggest machine learning and AI conference, the aforementioned NIPS, are thinking of changing the conference’s name.
In a tweet, the executive board announced that they will be asking the community for suggestions at the end of May.
The NIPS executive board is currently discussing the possibility of changing the name of the NIPS conference. At the end of May, we will ask the whole NIPS community for input and suggestions for a potential new name. Please be patient until then.— NIPS (@NipsConference) April 18, 2018
The move has been applauded by people hoping to move beyond a name that has been used as a racist term for Japanese people and as a nickname for nipples. Others believe the name is not very welcoming to women, and to make the point a female-led group organized the Transformationally Intelligent Technologies Symposium (TITS).
Beyond accuracy
Researchers are racing to submit their fastest and cheapest models trained on key datasets for the DAWNBench competition, which closes tomorrow just before midnight.
Most benchmarks in machine learning focus on accuracy. DAWNBench, however, measures training time, training and inference costs, and inference latency, and takes into account the different model architectures, software and hardware used.
It measures progress on three different datasets: ImageNet and CIFAR-10, both for computer vision, and SQuAD, a question answering dataset for natural language processing.
A quick scan of the current scoreboards shows that there isn’t a single magic bullet for choosing hardware or software to get you fast, cheap and accurate models.
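To make those measurements concrete, here is a small Python sketch of how training time to a target accuracy, and the cost of reaching it, could be worked out from a training log. The log format, the 0.94 target, the hourly rate and the function names are all hypothetical values chosen for illustration. They are not taken from the DAWNBench rules or from any entry on the scoreboard.

```python
from typing import Optional, Sequence, Tuple


def time_to_accuracy(
    log: Sequence[Tuple[float, float]], target: float
) -> Optional[float]:
    """Return the elapsed hours at the first checkpoint that reaches target.

    Each log entry is (elapsed_hours, validation_accuracy), in time order.
    Returns None if the target accuracy is never reached.
    """
    for elapsed_hours, accuracy in log:
        if accuracy >= target:
            return elapsed_hours
    return None


def training_cost(hours: float, hourly_rate_usd: float, machines: int = 1) -> float:
    """Cost of renting `machines` instances for `hours` at a flat hourly rate."""
    return hours * hourly_rate_usd * machines


# Hypothetical training run: accuracy checked every half hour.
run_log = [(0.5, 0.80), (1.0, 0.91), (1.5, 0.94), (2.0, 0.95)]

hours_needed = time_to_accuracy(run_log, target=0.94)
cost_usd = training_cost(hours_needed, hourly_rate_usd=2.0)
```

Fixing the accuracy target and then ranking on time or cost is what lets very different hardware and software stacks be compared on the same footing.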
“Deep learning models are fine, but researchers mainly ignore the simple tricks that make them much faster to train,” Jeremy Howard, founder of fast.ai, a popular online deep learning course, and a researcher at the University of San Francisco, told The Register.
Howard and his team are currently ranked first and second on training times and costs for a model trained on CIFAR-10. “We're trying to show that small groups with limited resources can make a big impact,” he said.
The latest scores show it’s possible to train a ResNet50 in about half an hour with half a cloud TPU pod (32 TPU2s). It’s also interesting to see an AmoebaNet, a model born from Google’s AutoML project that uses machine learning to design the architecture of new machine learning models, top the list for cheapest model to train on ImageNet. It shows some promise in using evolutionary search to build systems.
The times for training on SQuAD are much longer than for the image recognition datasets, and training is much more expensive to execute than inference because it’s much more computationally intensive.
Hey Siri
Apple has published a long and detailed blog post on the machine learning technology used to personalize ‘Hey Siri’, the phrase that activates its personal assistant Siri.
It’s a “key-phrase detection” problem, and Apple uses a recurrent neural network to recognise the words ‘Hey Siri’. To minimize the annoying problem of Siri turning on when the user doesn’t want it to, Apple talks about needing to personalize the system so that it recognizes the primary user of the phone.
The speaker recognition model is split into two parts: enrollment and recognition. Enrollment involves the user speaking into the phone to give it a few vocal training samples. A statistical model learns the unique features of the user’s voice.
Next, the recognition phase accepts or rejects the activation of Siri based on how similar an utterance of ‘Hey Siri’ sounds to the user’s voice.
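The two phases can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Apple's system: it assumes each utterance has already been turned into a fixed-length feature vector, uses a simple average as the "statistical model", and compares voices with cosine similarity against a made-up threshold of 0.8. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def enroll(samples: Sequence[Sequence[float]]) -> List[float]:
    """Enrollment: average the user's sample vectors into a speaker profile."""
    if not samples:
        raise ValueError("at least one enrollment sample is required")
    dim = len(samples[0])
    return [sum(sample[i] for sample in samples) / len(samples) for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def accept(profile: Sequence[float], utterance: Sequence[float],
           threshold: float = 0.8) -> bool:
    """Recognition: accept the activation only if the utterance is close
    enough to the enrolled profile. The threshold is an assumed value."""
    return cosine_similarity(profile, utterance) >= threshold


# Hypothetical two-dimensional feature vectors.
profile = enroll([[1.0, 0.0], [0.8, 0.2]])
same_speaker = accept(profile, [1.0, 0.1])
other_speaker = accept(profile, [0.0, 1.0])
```

Raising the threshold makes false activations less likely, at the price of rejecting the real user more often.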
“Although the average speaker recognition performance has improved significantly, anecdotal evidence suggests that the performance in reverberant (large room) and noisy (car, wind) environments still remain more challenging,” Apple said in the blog post.
"One of our current research efforts is focused on understanding and quantifying the degradation in these difficult conditions in which the environment of an incoming test utterance is a severe mismatch from the existing utterances in a user’s speaker profile." ®