Publications
Trust in News on Social Media (2018)
This paper investigates trust in news on a social media platform. It is motivated by the finding that social media is the primary news source for a large group of people, especially young adults. Given the challenges posed by online misinformation and fake news, an understanding of how users quantify their trust in news and what factors influence this trust is needed. In a study, 108 German high-school students provided trust ratings for online news items, including both quality media and fake news. The study shows that users can quantify their trust in news items and that these trust ratings correspond to expert rankings of the sources. The paper finds that psychometric scales measuring interpersonal trust are predictive of a user's mean trust rating across different news items, and shows how this can be used to provide interventions for those prone to false trust and false distrust.
Student Success Prediction and the Trade-Off between Big Data and Data Minimization (2018)
This paper explores students' daily activity in a virtual learning environment using the anonymized Open University Learning Analytics Dataset (OULAD). We show that the daily activity of students can be used to predict their success, i.e. whether they pass or fail a course, with high accuracy. This is important since daily activity can be easily obtained and anonymized. To support this, we show that the binary information of whether a student was active on a given day has similar predictive power to a combination of the exact number of clicks on that day and sensitive private data like gender, disability, and highest educational level. We further show that the anonymized activity data can be used to group students. We identify different student types based on their binarized daily activity and outline how educators and system developers can utilize this to address different learning types. Our primary stakeholders are designers and developers of learning analytics systems, as well as those who commission such systems. We discuss the privacy and design implications of our findings for data mining in educational contexts against the background of the principle of data minimization and the European Union's General Data Protection Regulation (GDPR).
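The core idea of predicting pass/fail from binarized daily activity can be sketched as follows. This is a minimal illustration on synthetic stand-in data with an off-the-shelf classifier, not the paper's actual OULAD features, model, or results:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for OULAD-style data: 200 students over 50 course days.
# For illustration, passing students are simply assumed to be active more often.
rng = np.random.default_rng(0)
n_pass, n_fail, n_days = 100, 100, 50
X_pass = rng.random((n_pass, n_days)) < 0.6   # active on ~60% of days
X_fail = rng.random((n_fail, n_days)) < 0.2   # active on ~20% of days
X = np.vstack([X_pass, X_fail]).astype(int)   # binarized daily activity (0/1)
y = np.array([1] * n_pass + [0] * n_fail)     # 1 = pass, 0 = fail

# A simple classifier already separates the two groups on this toy signal.
clf = LogisticRegression().fit(X, y)
print(f"training accuracy: {clf.score(X, y):.2f}")
```

The point of the sketch is that the feature matrix contains only 0/1 activity flags and no sensitive attributes, mirroring the paper's data-minimization argument.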
Generating captions without looking beyond objects (2016)
This paper explores new evaluation perspectives for image captioning and introduces a noun translation task that achieves comparable caption generation performance by translating from a set of nouns to full captions. This implies that in image captioning, all word categories other than nouns can be evoked by a powerful language model without sacrificing performance on the precision-oriented metric BLEU. The paper also investigates lower and upper bounds on how much individual word categories in the captions contribute to the final BLEU score. The largest possible improvements exist for nouns, verbs, and prepositions.
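BLEU, the metric discussed above, scores a candidate caption by clipped n-gram precision against reference captions. A minimal sketch of the unigram building block (modified unigram precision), written in plain Python for illustration:

```python
from collections import Counter

def modified_unigram_precision(candidate, reference):
    """Clipped unigram precision: each candidate word is credited at most
    as often as it occurs in the reference (the BLEU-1 numerator)."""
    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)
    clipped = sum(min(count, ref_counts[word])
                  for word, count in cand_counts.items())
    return clipped / len(candidate)

# Repeating "the cat" gains nothing: the reference licenses each word once.
print(modified_unigram_precision("the cat the cat".split(),
                                 "the cat sat".split()))  # → 0.5
```

Clipping is what makes BLEU precision-oriented: a candidate cannot inflate its score by repeating words the reference contains only once, which is why per-word-category contributions to BLEU can be bounded as the paper does.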
Text comparison using word vector representations and dimensionality reduction (2015)
This paper describes a technique to compare large text sources using word vector representations (word2vec) and dimensionality reduction (t-SNE) and how it can be implemented using Python. The technique provides a bird's-eye view of text sources, e.g. text summaries and their source material, and enables users to explore text sources like a geographical map. Word vector representations capture many linguistic properties such as gender, tense, plurality and even semantic concepts like "capital city of". Using dimensionality reduction, a 2D map can be computed where semantically similar words are close to each other. The technique uses the word2vec model from the gensim Python library and t-SNE from scikit-learn.
Semantic and stylistic text analysis and text summary evaluation (2015)
The main contribution of this Master's thesis is a novel way of doing text comparison using word vector representations (word2vec) and dimensionality reduction (t-SNE). This yields a bird’s-eye view of different text sources, including text summaries and their source material, and enables users to explore a text source like a geographical map.
The main goal of the thesis was to support the quality control and quality assurance efforts of a company. This goal was operationalized and subdivided into several modules. In this thesis, the Topic and Topic Comparison modules are described.
For each module, the state of the art in natural language processing and machine learning research was investigated and applied. The implementation section of the thesis discusses what each module does, how it relates to theory, how the module is implemented, the motivation for the chosen approach, and a critical assessment. The thesis also describes how to derive a text quality gold standard using machine learning.
Non-mimicking digital musical interface as a music composition aid (2012)
My Bachelor's thesis presents feedback on the advantages, disadvantages, usability problems, and suggested improvements of a non-mimicking digital musical interface with an integrated music composition aid. The interface and the music composition aid were evaluated in a thinking-aloud study with six users and analysed using a qualitative approach following Mayring.
Columbia is a music composition aid for the iPad that focuses on harmony. The music composition aid is based on templates derived from an analysis of a set of pop songs regarding their chord progressions.