Interview Natural language processing, or NLP for short, is one of the key research areas of artificial intelligence, and has had a major boost in the last couple of years.
After all, the written word is everywhere – be it in the form of scientific texts, news articles, wiki entries, or seemingly simple status updates on the social platform of your choice – and there’s a huge interest in making sense of it all.
One company especially enthusiastic about NLP is Facebook. No matter your opinion about its platform, the amount of data the Silicon Valley giant has to process every single day is simply mind-boggling. And its platform comes with some NLP already built in: take its translation capabilities, for example.
To advance what’s possible with NLP and other aspects of AI, Facebook invests in a network of research facilities under the Facebook AI Research (FAIR) moniker. Its London office is led by Sebastian Riedel, who is also a professor in natural language processing and machine learning at University College London – and a keynote speaker at The Register’s AI conference MCubed, which takes place at the end of this month.
But what can you do with it?
Prof Riedel took some time out to chat to us about his work ahead of his presentation, and shed some light on the discipline as a whole. “My research revolves around the question of how machines should represent and store knowledge read from text: should they walk around with a million books and read up where needed? Or remember everything?” the professor said. “Secondly, I want machines to be able to use this knowledge in various tasks such as answering complex questions.”
One Facebook-related use-case for that is what Prof Riedel characterises as “social recommendations – by understanding the language people use when asking for suggestions and offering recommendations, our systems can automatically help make those suggestions more useful – organizing restaurant recommendations on a map, for example.”
Besides this, and the already mentioned machine translations, NLP is also supposed to help us humans with something often debated in the context of social media: harmful content. According to Prof Riedel, “NLP systems also help us to proactively identify and deal with content that violates our policies, even before it’s seen or reported by people.”
If you still have a Facebook account and aren’t completely sure you want to volunteer your data for this kind of research, Prof Riedel can reassure you that your sensitive info isn’t being fed into these studies: “At FAIR we focus a lot on open and reproducible research aimed at improving the general state-of-the-art in AI. This often means opting for public datasets that are shared by and with the wider research community, and avoiding the use of private and sensitive data.”
FAIR isn’t the only one taking that route, as more and more companies further along the path of incorporating AI are looking into techniques such as federated learning to help keep private data private. But, then again, the discipline is going through a lot of changes right now, which means exciting times for Riedel and Co.
“The first big shift happened three years ago with the re-emergence of deep neural networks. Then, in the last year, we saw another important change after the release of ELMo and BERT, two pre-trained contextual word representations that leverage large unlabelled text collections to provide downstream models with better features.”
Keeping up, giving back
It also means a lot of work, though. “It’s been an exciting time to work in NLP, but it also means I must radically change my course module at UCL almost every year to keep up with the rate of change.”
Hanging up his sports jacket isn’t an option, though. “I still teach because I enjoy it, because I want to help grow the next generation of AI academics, start-up founders, teachers, and engineers, and because it’s one way industry can give back to the wider AI community.”
Looking into the near future of NLP, the professor hopes to see “substantial improvements in the area of language and memory. Today we are good at processing a short piece of text to answer questions about it or classify it. But how to read, say, all of Wikipedia and then utilize this knowledge downstream is still unclear. There is some interesting work in this direction emerging this year.”
He’d also like to see some improvement in terms of long form text generation. “Language models can now produce text that ‘looks like language’ but if you look closely it often only makes superficial sense. I also expect true progress in multilingual NLP beyond English and a few other mainstream languages.”
For enterprise developers with an interest in the topic, the professor has a few points to help you make your case when it comes to discussing the use of NLP in upcoming projects. “First, a lot of enterprise data (and knowledge) exists not in structured form stored in a database, but as unstructured text. Leveraging this knowledge at scale requires processing and representing in a way that machines can ‘understand it’ – and this is where NLP comes in.
“Second, language is often the most natural interface for people to interact with systems around them and therefore it is sometimes useful for companies to automate, or semi-automate, this interaction.” ®
If you want to learn more about NLP or have direct questions for Sebastian Riedel, get your ticket for MCubed London 2019 now. On September 30, Professor Riedel will give the opening keynote at the AI and machine learning conference, which is going to take place at the QE II Centre.