Computer scientists use wisdom of the crowd to help AI become an avid reader
Researchers from the University of Southampton will help train AI algorithms to extract meaning from text in different languages as part of a multimillion-pound project uniting scientists across Europe.
Data scientists from Southampton’s School of Electronics and Computer Science will scale crowdsourcing techniques that feed novel natural language processing models as part of the Cross-lingual Event-centric Open Analytics Research Academy (CLEOPATRA).
The four-year CLEOPATRA project, part of the European Union’s Horizon 2020 research and innovation programme, will develop frameworks and tools that explore the massive digital coverage generated by the intense disruption across the continent over the past decade – including appalling terrorist incidents and the dramatic movement of refugees and economic migrants.
The Marie Skłodowska-Curie Innovative Training Network is funding 15 early-career researchers across eight European universities and research institutes to develop advanced techniques for cross-lingual processing of text and other media, which will be showcased in application areas such as the digital humanities. A particular focus will be the user experience of these solutions, with the project proposing concepts and guidelines to improve the accessibility of, and interaction with, multilingual resources.
Southampton researchers will contribute to the network by proposing crowdsourcing approaches that use the wisdom of crowds to produce the data needed to train and validate the AI algorithms in a scalable, ethical and fair way.
Professor Elena Simperl, Southampton’s CLEOPATRA lead, explains: “Text processing questions can increasingly be automated through the latest AI methods, but all algorithms need to be trained. Crowdsourcing techniques can deliver examples that these algorithms can learn from, but the challenge arises when scaling the process to manage thousands of people in parallel while still producing outcomes that provide additionality. Our researchers will be investigating how to plan and manage this process.”
Advances in the collaborative research will deliver an important step toward augmented intelligence systems that can understand, summarise and translate vast quantities of text. The CLEOPATRA project is led by the L3S Research Center at the Gottfried Wilhelm Leibniz Universität in Hannover, Germany, and also includes researchers from the University of London, the Rheinische Friedrich-Wilhelms-Universität and the German National Library of Science and Technology in Germany, the Institut Jozef Stefan in Slovenia, the University of Amsterdam in the Netherlands and the University of Zagreb in Croatia.