Dr. Ralf Steinberger, European Commission - Joint Research Centre (JRC)
Title: Multilingual and cross-lingual news analysis in the Europe Media Monitor (EMM)
Abstract: Almost every large organisation uses dedicated teams to monitor public sources such as newspapers for information relevant to their field of interest. In most cases, the original manual compilation of newspaper clippings has been replaced by an automated process that gathers relevant news, classifies it into subject domains and applies text mining methods to pre-process the selected articles, so as to minimise the effort required of the human specialists who use the output. With a coverage of between 20 and 70 languages, the publicly accessible Europe Media Monitor (EMM) family of applications developed at the European Commission's Joint Research Centre (JRC) is the most multilingual system of its kind, and its analysis goes beyond the functionality provided by others. The speaker will give an overview of EMM, its users and its functionality. Text mining components used in EMM include tools for information extraction and disambiguation (persons, organisations, locations, quotations, events), name variant matching, clustering, classification, topic detection and tracking, cross-lingual cluster linking, social network generation and machine translation. Developing all these tools is particularly challenging for highly inflected languages such as those of the Slavic and the Finno-Ugric language families. The speaker will thus focus part of his talk on insights regarding the treatment of highly inflected languages. The various EMM applications are freely accessible via the starting page emm.newsbrief.eu/overview.html.
Ralf Steinberger is a computational linguist working as a lead scientist in the Opensource Text Information Mining and Analysis group at the European Commission's Joint Research Centre (JRC) in Ispra, Italy. He studied Theoretical Linguistics in Berlin and Munich and was awarded a Ph.D. in the field of Machine Translation in Manchester (UK). He has worked in industry, academia and governmental organisations in Germany, England, Japan, Ethiopia and Italy. Ralf's specialisation lies in multilinguality and in methods for providing cross-lingual information access.