Title
Automating Predictive Modeling and Knowledge Discovery
Short CV
Ioannis Tsamardinos, Ph.D., is a Professor in
the Computer Science Department of the University of Crete (UoC). He obtained
his Ph.D. in 2001 from the Intelligent Systems Program of the University of
Pittsburgh. Subsequently, he joined the faculty of the Department of
Biomedical Informatics at Vanderbilt University, where he remained until
2006, when he returned to Greece. His research interests lie in the fields
of Machine Learning, Data Science, and Bioinformatics, particularly variable
selection, causal discovery, and automation of machine learning. He
has mostly applied such methods to Bioinformatics and Biomedical
Informatics. Ioannis Tsamardinos has over 100 international refereed
publications in journals, conferences and edited volumes, more than
6000 citations in Google Scholar, and 2 US patents. He has been
awarded the ERC Consolidator Grant and the Greek
national grant on research excellence ARISTEIA II.
Abstract
There is an enormous, constantly increasing need for data analytics
(collectively meaning machine learning, statistical modeling, pattern
recognition, and data mining applications) across a wide range of
domains, including biological, biomedical, and business applications.
The primary bottleneck in the application of machine learning is the
scarcity of expert analyst time; hence there is a pressing need to
automate machine learning, and specifically predictive and diagnostic
modeling. In this talk, we present the scientific and algorithmic
problems that arise when trying to automate this process, such as
choosing an appropriate combination of algorithms for preprocessing,
transformation, imputation of missing values, and predictive
modeling; tuning the hyper-parameter values of those algorithms; and
estimating predictive performance and producing confidence
intervals. In addition, we present the problem of feature selection
and how it fits within an automated analysis pipeline, arguing that
feature selection is the main tool for knowledge discovery in this context.
Title
Emojis, Sentiment and Stance in Social Media
Short CV
Dr. Petra Kralj Novak is a researcher at the Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia. Her research belongs to the broad area of knowledge discovery from databases. Currently, as a postdoctoral researcher, she analyses social and mainstream media, focusing on the sentiment they convey. She publishes in leading machine learning and interdisciplinary journals and conferences. Her pioneering research on the role of emojis in conveying sentiment was published in P. Kralj Novak et al., "Sentiment of emojis", and is the main reference for current research on emoji use. Dr. Petra Kralj Novak is also an assistant professor at the Jožef Stefan International Postgraduate School (Ljubljana, Slovenia) and at the Faculty of Information Studies in Novo Mesto (Slovenia). She has given seminars to academic (e.g., Georgia State University, Fudan University, University of Ljubljana) and industrial audiences (Career Builder, LLC [USA]). She has also been an invited speaker at international conferences (CMC Corpora 2016, SCSC 2018).
Abstract
Social media are computer-based technologies that provide means of sharing information and ideas, as well as entertainment and engagement, and are readily available as mobile applications and websites to both private users and businesses. As social media communication is mostly informal, it is an ideal environment for the use of emojis. We have collected Twitter data and engaged 83 human annotators to label over 1.6 million tweets in 13 European languages with sentiment polarity (negative, neutral, or positive). About 4% of the annotated tweets contain emojis. We have computed the sentiment of the emojis from the sentiment of the tweets in which they occur. We observe no significant differences in the emoji rankings between the 13 languages. Consequently, we propose our Emoji Sentiment Ranking as a European language-independent resource for automated sentiment analysis. In this talk, several emoji, sentiment, and stance analysis applications will be presented, varying in data source, topic, language, and approach.
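As a rough illustration of how an emoji's sentiment can be derived from the labels of the tweets it appears in, the sketch below averages tweet polarities per emoji. It is only a minimal sketch under stated assumptions, not the procedure used to build the Emoji Sentiment Ranking: the numeric encoding (-1, 0, +1), the function names emoji_sentiment and rank, the choice to count every occurrence, and the restriction to single-code-point emojis are all hypothetical.

```python
from collections import defaultdict

# Assumed numeric encoding of the three polarity labels named in the abstract.
POLARITY = {"negative": -1, "neutral": 0, "positive": 1}


def emoji_sentiment(tweets, emojis):
    """Return {emoji: (mean polarity, occurrence count)}.

    tweets: iterable of (text, label) pairs, label being a key of POLARITY.
    emojis: set of single-code-point emoji characters to track (assumption:
            multi-code-point emoji sequences are not handled here).
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for text, label in tweets:
        for ch in text:
            if ch in emojis:
                # Every occurrence inherits the polarity of its tweet.
                totals[ch] += POLARITY[label]
                counts[ch] += 1
    return {e: (totals[e] / counts[e], counts[e]) for e in counts}


def rank(scores):
    """Order emojis from most positive to most negative mean polarity."""
    return sorted(scores, key=lambda e: scores[e][0], reverse=True)


if __name__ == "__main__":
    # Toy, made-up tweets purely for demonstration.
    toy = [
        ("great day 😀", "positive"),
        ("meh 😀", "neutral"),
        ("so sad 😢", "negative"),
        ("😢 but ok", "neutral"),
    ]
    scores = emoji_sentiment(toy, {"😀", "😢"})
    for e in rank(scores):
        print(e, scores[e])
```

With the encoding assumed here, the mean polarity of an emoji equals the share of its positive occurrences minus the share of its negative ones, so scores fall between -1 and +1. Rankings built this way separately for each language could then be compared, which is the kind of comparison the abstract reports across the 13 languages.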