Student research projects & theses

The staff of Knowledge-based Systems (KBS) will be pleased to advise you and suggest current topics for your Bachelor's or Master's thesis.

Note: Bachelor's theses may also be carried out in pairs, provided that each student's individual contribution can be clearly attributed. For correspondingly larger tasks, please also see the Master's thesis topics.

BACHELOR AND MASTER THESES

  • Developing Black-box Adversarial Attacks on Speech Emotion Recognition Models

    Speech emotion recognition (SER) has become essential in human-computer interaction, and deep learning for SER has rapidly developed into a popular research area. However, deep neural networks have been shown to be vulnerable to external attacks, especially adversarial attacks, in which adversarial examples are generated by adding perturbations that humans cannot perceive to the original real samples. In SER, adversarial examples may lead to misclassification, resulting in invalid and misinterpreted interactions with users.

    Unlike white-box adversarial attacks, which require access to the data source and the target model's parameters, black-box adversarial attacks have access to neither. Generating black-box adversarial examples is therefore more challenging. The transferability of a black-box adversarial attack is the ability of an attack model to transfer across multiple target models. Enhancing transferability saves cost, because one unified attacker is learned instead of multiple independent ones. Lifelong learning has been used to improve the transferability of black-box adversarial attacks. The goal of this topic is to further improve the transferability of black-box adversarial attacks.

    An ideal candidate should have:

    1. Good background knowledge of signal processing and deep learning.
    2. Strong programming skills in Python (PyTorch, Keras, or TensorFlow).

    Thesis supervision and contact

    Zhao Ren

  • Computer Audition for Health Care: Integrating Machine Learning into Acoustic Analytics

    Sound is an essential component of human perception of the world. In particular, human speech and body sounds can reflect important physiological and psychological health information. Computer audition aims to teach machines to perceive speech and audio signals by integrating signal processing, machine learning, and deep learning techniques. Recent advances in computer audition have proven promising in digital health applications, from disease diagnosis to therapy. Moreover, the rapid development of smartphones, tablets, and wearable devices promotes mobile health for contactless diagnosis and/or remote monitoring.

    Our goal is to develop robust and explainable deep neural networks for automated diagnosis from speech/audio signals. Possible topics include, but are not limited to:

    1. Developing COVID-19 detection models from multiple modalities, such as sounds (speech, cough, and breathing) and symptoms.
    2. Exploring the effect of preprocessing in automatic auscultation from phonocardiogram signals.
    3. Respiratory disease diagnosis from sounds.

    An ideal candidate should have:

    1. Good background knowledge of signal processing and deep learning.
    2. Strong programming skills in Python (PyTorch, Keras, or TensorFlow).

    Thesis supervision and contact

    Zhao Ren

  • Extending a Bicycle Navigation App [MSc]

    Bike GPS operates a tour portal and, among other things, advises tourism regions in the Alps on how to build up bicycle tourism and how to make a name for themselves worldwide as mountain bike destinations (e.g., Lake Garda, Livigno, the Dolomites, Ötztal, Zillertal, and Kaltern).

    In cooperation with Bike GPS, an open-source bicycle navigation app has already been developed. The goal of this Master's thesis is to extend the functionality of this app and to improve its interface design through user surveys and A/B tests. During route guidance, the app should display, for example, hotels, restaurants and refreshment stops, photos of tourist highlights, notable panoramas, and important details near the tour.

    Technically, the app is based on Flutter in order to reach both Android and iOS users. The maps are based on OpenStreetMap. Relevant prior knowledge is very helpful but not strictly required.

    Thesis supervision and contact

    Philipp Kemkes

  • Clustering of highly-variant geared motors using machine learning (Peter Wißbrock, Prof. Dr. Wolfgang Nejdl)

    In the context of Industry 4.0, industrial components are becoming more modular, customized and built in smaller quantities. This makes it increasingly difficult to determine the quality of manufactured products, whether manual or automated. Using the example of an existing data set with several hundred variants, the similarity of the acoustic behavior is to be assessed. The goal is to be able to evaluate and test variants with similar acoustic behavior under the same thresholds. Thus, the complexity of the variants shall be reduced in the end-of-line test. Here, the use of machine learning, in particular clustering, is promising.

    Typical tasks in the context of the final thesis:

    • Literature review on highly-variant components, as well as clustering methods
    • Exploratory analysis of the data set with respect to its variants
    • Implementation of selected preprocessing and clustering methods
    • Validation of the results and development of recommendations for action

     

    Recommended profile:

    • Programming skills in Python
    • Initial experience with machine learning and signal processing in general
    • Interest in working independently
    • Interest in working with industrial (real-world) data

     

    Thesis supervision and contact

    Peter Wißbrock, Gregory Palmer

  • Counterfactual Explanation of Document Retrieval Models (Dr. Koustav Rudra)

    Document retrieval consists of two phases: (1) first-stage retrieval, in which a set of 100 or 1,000 documents is retrieved for each query from a corpus of millions of documents, and (2) a second phase, in which a ranker reranks these documents by their relevance to the query. This reranker is complex and based on deep neural models, and users have no insight into the decisions it makes. Hence, our objective in this task is to understand the reasoning behind the ranking process, i.e., which words or phrases are relevant for the ranking and which changes to a document would influence its rank. We want to address the following counterfactual explanation strategies for ranking algorithms.

    1. For a given query, we rerank the documents retrieved by first-stage retrieval. Given a pair of documents, how much change (perturbation) must we make to those documents to swap their ranking order? The changes should be minimal, and the topic of a document should not change drastically. To identify the important terms or phrases, we can try different feature-based approaches, such as gradients or attention scores, and modify the documents accordingly.

    2. In some cases, web content writers want to understand the ranking strategies of search engines and use that information to create content that search engines promote to the top of the result list. In case 1, we assume a black-box ranker, i.e., we cannot update or modify the model parameters and may only modify the document content to some extent. In this scenario, by contrast, we focus on a specific document and try to update our model parameters to promote that document, i.e., so that it appears higher in the list.
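
    The two-phase pipeline and the pairwise perturbation question from strategy 1 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the "reranker" is a simple term-frequency score rather than a deep neural model, and greedy token deletion is only one possible perturbation strategy, not the method the thesis is expected to use.

```python
from collections import Counter


def first_stage(query, corpus, k=3):
    """Cheap lexical retrieval: keep the k documents sharing the most query terms."""
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(doc.lower().split())), i) for i, doc in enumerate(corpus)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for overlap, i in scored[:k] if overlap > 0]


def rerank_score(query, doc):
    """Toy stand-in for a neural reranker: query-term frequency divided by document length."""
    tokens = doc.lower().split()
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[t] for t in query.lower().split()) / len(tokens)


def counterfactual_deletions(query, doc_high, doc_low, score=rerank_score):
    """Greedily delete tokens from doc_high until doc_low scores strictly higher."""
    tokens = doc_high.split()
    target = score(query, doc_low)
    removed = []
    while tokens and score(query, " ".join(tokens)) >= target:
        # Pick the token whose removal lowers the score the most.
        best = min(
            range(len(tokens)),
            key=lambda j: score(query, " ".join(tokens[:j] + tokens[j + 1:])),
        )
        removed.append(tokens.pop(best))
    return removed


corpus = [
    "cheap flights to berlin",
    "berlin berlin travel guide",
    "cooking pasta at home",
]
query = "berlin travel"
candidates = first_stage(query, corpus)
ranking = sorted(candidates, key=lambda i: rerank_score(query, corpus[i]), reverse=True)
deleted = counterfactual_deletions(query, corpus[ranking[0]], corpus[ranking[1]])
print(ranking, deleted)
```

    The list of deleted tokens is a crude counterfactual explanation: it names the terms the toy ranker relied on to place one document above the other. With a real neural reranker, gradients or attention scores would replace the exhaustive per-token rescoring.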

     

    An ideal candidate should have:

    1. Strong background in Python and deep learning
    2. Motivation to learn and explore the data
    3. Knowledge of basic IR concepts

     

    Related Papers:

    1. Measurable Counterfactual Local Explanations for Any Classifier

     

    Thesis supervision and contact

    Dr. Koustav Rudra

  • Question-Answering System for Legal Documents (Dr. Koustav Rudra)

    Question answering is a classical task in information retrieval. However, it can be more complicated for legal documents. Legal documents contain different sections such as facts, arguments, and judgements. Hence, we may have to use this section information along with the text to develop a better legal retrieval system. In some cases, answers are also not straightforward or complete, and the retrieval system has to handle all such complications.

     

    An ideal candidate should have:

    1. Strong background in Python and deep learning
    2. Motivation to learn and explore the data

     

    Related Papers:

    1. Dense passage retrieval for open-domain question answering

     

    Thesis supervision and contact

    Dr. Koustav Rudra

  • In-depth Analysis of Negative Sampling Strategies in Dense Retrieval (Dr. Koustav Rudra)

    Recent developments in representation learning for information retrieval can be organized in a conceptual framework that establishes two pairs of contrasts: sparse vs. dense representations and unsupervised vs. learned representations. Sparse learned representations can further be decomposed into expansion and term-weighting components. In the dense supervised retrieval scenario, methods include Dense Passage Retrieval (DPR), COIL, and CLEAR. The performance of such dense supervised approaches depends heavily on the training procedure and on the quality of the negative samples. Negative samples can be collected with various strategies. For a given query, we can randomly sample negative documents from all the irrelevant ones. Alternatively, we can sample negative documents that are close to the query, i.e., that overlap with the query terms but are still irrelevant. The objective of this work is to understand how different negative sampling strategies influence the performance of different dense supervised retrieval setups.
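
    The two sampling strategies mentioned above can be contrasted in a small sketch. It is illustrative only: the helper names are hypothetical, and plain term overlap is assumed as the notion of "close to the query" (in practice, a first-stage retriever score or a dense encoder would usually take that role).

```python
import random


def term_overlap(query, doc):
    """Number of distinct terms shared by the query and the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def random_negatives(docs, relevant, n, seed=0):
    """Uniformly sample n negatives from all documents not judged relevant."""
    pool = [d for d in docs if d not in relevant]
    return random.Random(seed).sample(pool, min(n, len(pool)))


def hard_negatives(query, docs, relevant, n):
    """Pick the n irrelevant documents that overlap most with the query terms."""
    pool = [d for d in docs if d not in relevant]
    pool.sort(key=lambda d: term_overlap(query, d), reverse=True)
    return pool[:n]


docs = [
    "dense passage retrieval with bert",
    "sparse retrieval with bm25",
    "recipe for apple pie",
    "training dense retrieval models",
]
relevant = {"training dense retrieval models"}
query = "dense retrieval training"

easy = random_negatives(docs, relevant, n=2)
hard = hard_negatives(query, docs, relevant, n=2)
print(easy)
print(hard)
```

    Random negatives are often trivially distinguishable from the relevant document, whereas hard negatives force the model to separate documents that look lexically similar to the query.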

     

    An ideal candidate should have:

    1. Strong background in Python and deep learning
    2. Motivation to learn and explore the data
    3. Knowledge of basic IR concepts

     

    Related Papers:

    1. Complementing lexical retrieval with semantic residual embedding
    2. COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List
    3. Distilling dense representations for ranking using tightly-coupled teachers

     

    Thesis supervision and contact

    Dr. Koustav Rudra

  • Precedent & Statute Retrieval Task in Legal Documents (Dr. Koustav Rudra)

    In countries following the Common Law system (e.g., UK, USA, Canada, Australia, India), there are two primary sources of law – Statutes (established laws) and Precedents (prior cases). Statutes deal with applying legal principles to a situation (facts / scenario / circumstances which lead to filing the case). Precedents or prior cases help a lawyer understand how the Court has dealt with similar scenarios in the past, and prepare the legal reasoning accordingly.

    When a lawyer is presented with a situation that may lead to the filing of a case, an automatic system that identifies related prior cases involving similar situations, as well as the statutes/acts best suited to the given situation, would be very beneficial. Such a system would help not only lawyers but also laypeople, who could gain a preliminary understanding even before approaching a lawyer. It would assist them in identifying where their legal problem fits, which legal actions they can pursue (through statutes), and what the outcomes of similar cases were (through precedents).

     

    In this project our objective is to propose models for the following two tasks:

    • Task 1: Identifying relevant prior cases for a given situation
    • Task 2: Identifying the most relevant statutes for a given situation

     

    An ideal candidate should have:

    1. Strong background in Python and deep learning
    2. Motivation to learn and explore the data

     

    Related Papers:

    1. Identification of Rhetorical Roles of Sentences in Indian Legal Judgments
    2. Overview of the FIRE 2019 AILA Track: Artificial Intelligence for Legal Assistance

     

    Thesis supervision and contact

    Dr. Koustav Rudra

  • Deep Constrained Siamese Network for Near Duplicate Image Detection (Dr. Marco Fisichella)

    The problem of near-duplicate detection consists in finding those elements within a data set which are closest to a new input element, according to a given distance function and a given closeness threshold. Solving this problem for high-dimensional data sets is computationally expensive, since the amount of computation required to assess the similarity between any two elements increases with the number of dimensions. As a motivating example, an image or video sharing website would benefit from detecting near-duplicates whenever new multimedia content is uploaded. Among different approaches, near-duplicate detection in high-dimensional data sets has been effectively addressed by SimPair LSH [1, 2]. Built on top of Locality Sensitive Hashing (LSH), SimPair LSH computes and stores a small set of near-duplicate pairs in advance, and uses them to prune the candidate set generated by LSH for a given new element. In paper [2], we developed an algorithm to predict a lower bound on the number of elements pruned by SimPair LSH from the candidate set generated by LSH.

    In the current research project, we investigate how to extend the SimPair approach with a deep constrained Siamese neural network combined with deep feature learning. We examine whether the neural network is able to extract effective features for near-duplicate image detection. The extracted features are used to construct an LSH-based index. In summary, the goal of the newly proposed LSH-based approach is to substantially increase the detection efficiency of the index structure without loss of detection accuracy.
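
    As background, the basic LSH idea of hashing similar vectors into the same bucket and verifying only the candidates in that bucket can be sketched as follows. This is a generic random-hyperplane LSH for cosine similarity, written for illustration; it is not SimPair LSH and does not include its pruning step. The class and function names are hypothetical, and the feature vectors stand in for features that a Siamese network might extract.

```python
import math
import random
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class HyperplaneLSH:
    """Single-table LSH: one bit per random hyperplane (sign of the projection)."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
        self.buckets = defaultdict(list)
        self.items = []

    def signature(self, vec):
        return tuple(
            int(sum(p * x for p, x in zip(plane, vec)) >= 0.0) for plane in self.planes
        )

    def add(self, vec):
        self.items.append(vec)
        self.buckets[self.signature(vec)].append(len(self.items) - 1)

    def near_duplicates(self, vec, threshold=0.95):
        # Only candidates from the matching bucket are verified exactly.
        candidates = self.buckets.get(self.signature(vec), [])
        return [i for i in candidates if cosine(vec, self.items[i]) >= threshold]


index = HyperplaneLSH(dim=4, seed=1)
original = [1.0, 0.5, 0.0, 2.0]
index.add(original)
index.add([-1.0, -0.5, 0.0, -2.0])   # points in the opposite direction
rescaled = [2.0, 1.0, 0.0, 4.0]      # same direction as the first item
dupes = index.near_duplicates(rescaled)
print(dupes)
```

    A single hash table can miss near-duplicates that fall on different sides of one hyperplane; practical LSH indexes therefore use several tables, which in turn enlarges the candidate set that has to be verified.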

    In summary, the thesis comprises the following tasks:

    1. Implement the approach proposed in paper [2].
    2. Apply the implemented approach to real-world datasets.
    3. Review the approach proposed in paper [3].
    4. Using a Siamese neural network based on a CNN (inspired by paper [3]), extend the approach described in paper [2].

     

    References:

    1. M. Fisichella, F. Deng, and W. Nejdl. Efficient incremental near duplicate detection based on locality sensitive hashing. In DEXA, 2010. doi.org/10.1007/978-3-642-15364-8_11
    2. M. Fisichella, A. Ceroni, F. Deng, W. Nejdl. Predicting Pair Similarities for Near-Duplicate Detection in High Dimensional Spaces. In DEXA, 2014. doi.org/10.1007/978-3-319-10085-2_5
    3. Weiming Hu, et al. Deep Constrained Siamese Hash Coding Network and Load-Balanced Locality-Sensitive Hashing for Near Duplicate Image Detection. IEEE Trans. Image Process. 27(9). doi.org/10.1109/TIP.2018.2839886

     

    Thesis supervision and contact

    Dr. Marco Fisichella

  • Privacy Attacks on Graph Neural Networks (Dr. Megha Khosla)

    Machine learning (ML) algorithms have been applied in various domains, including privacy-sensitive ones such as healthcare and finance.

    Similarly, many real-world applications can be modeled as graphs in which the nodes represent entities and the edges represent the connections between them. This kind of relationship mapping preserves the underlying properties of the data. Examples include friendship networks, telephone call networks, co-authorship networks, biological networks, molecules, financial networks, and disease transmission. A special family of ML models called graph neural networks (GNNs) has been designed to handle such data.

    The problem, though, is that ML models tend to learn more than is required, and as such they leak information when the model is released. This makes them an attractive target for attackers. Such attacks include membership, attribute, and property inference attacks.

    The drawback of the attacks proposed so far for ML models is that they are designed for Euclidean or independent and identically distributed (i.i.d.) data. GNNs, however, aggregate the features of neighboring nodes to make a prediction for a node. This makes it a unique problem as well as an opportunity to understand the risk posed by GNN models based on their use of the graph structure to make predictions. The aim of this master thesis is to investigate the vulnerabilities of different GNN models to different attacks.
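
    The neighborhood aggregation that distinguishes GNNs from models for i.i.d. data can be illustrated with a minimal sketch. It shows a single round of mean aggregation without learned weights or nonlinearities, which is a strong simplification of real GNN layers; the function name and the toy graph are assumptions made for illustration.

```python
def mean_aggregate(features, adjacency):
    """One aggregation round: each node averages its own and its neighbors' features."""
    aggregated = {}
    for node, feat in features.items():
        group = [feat] + [features[n] for n in adjacency.get(node, [])]
        aggregated[node] = [sum(column) / len(group) for column in zip(*group)]
    return aggregated


features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

with_edges = mean_aggregate(features, adjacency)
without_edges = mean_aggregate(features, {})
print(with_edges["a"], without_edges["a"])
```

    Because the representation of node "a" changes when its edge to "b" is removed, the output for one node carries information about other nodes and about the graph structure. This dependence is what makes the privacy risk of GNNs differ from the i.i.d. setting.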

    Prerequisites:

    1. Good knowledge of ML and Graphs
    2. Strong programming background in Python and libraries such as PyTorch

    References:

    • Iyiola E Olatunji, Wolfgang Nejdl, and Megha Khosla. “Membership inference attack on graph neural networks”. In: arXiv preprint arXiv:2101.06570 (2021).
    • Reza Shokri et al. “Membership Inference Attacks Against Machine Learning Models”. In: 2017 IEEE Symposium on Security and Privacy (SP). 2017, pp. 3–18.
    • Ahmed Salem et al. “Ml-leaks: Model and data independent membership inference attacks and defenses on machine learning models”. In: arXiv preprint arXiv:1806.01246 (2018).
    • S. Yeom et al. “Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting”. In: 2018 IEEE 31st Computer Security Foundations Symposium (CSF). 2018, pp. 268–282.

     

    Thesis supervision and contact

    Dr. Megha Khosla, Emmanuel Iyiola Olatunji

  • Identifying conserved host response for virus infections such as COVID-19 (Prof. Nejdl, Prof. Li)

    Clinical presentations of COVID-19 are highly variable: while the majority of patients experience mild to moderate symptoms, 10%–20% of patients develop pneumonia and severe disease. We recently performed the first single-cell RNA sequencing of blood cells to determine changes in immune cell composition and activation in mild versus severe COVID-19 over time [1]. A recent study based on multi-cohort analysis of the host immune response identified conserved protective and detrimental modules associated with severity across viruses [2, 3]. We therefore hypothesize that viral infections induce a conserved host response and that this conserved response is associated with disease severity. In this project, we will 1) implement the analysis framework for identifying the conserved host response and 2) apply it to transcriptome data from multiple cohorts of patients infected with COVID-19 and other viruses.

    PDF flyer

    Thesis supervision and contact

    Prof. Dr. techn. Wolfgang Nejdl, Prof. Dr. Yang Li

  • A benchmark for biomedical understanding (Tam Nguyen)

    Psychiatric Disorders (PDs) rank 5th in terms of prevalence and account for 6.7% of “Disability Adjusted Life Years”. We have explored different types of datasets to understand the landscape of research on psychiatric disorders. In particular, we designed a categorization of psychiatric data, including genomics data, molecule data, drug review data, research publication data, and clinical data. Moreover, we built a repository of related venues and downloadable resources, including top-tier journals (Bioinformatics, ACM Transactions on Computing for Healthcare, Nature Research, Genome Research, Social Psychiatry and Psychiatric Epidemiology, etc.), top-tier conferences (IEEE International Conference on Bioinformatics and Biomedicine, ACM Conference on Health, Inference, and Learning, etc.), and top-tier workshops (AI for public health, BioKDD, WWW AI In Health). In this project, we will develop a benchmark that is based on the explored datasets and relevant to the aforementioned research community, for example a question answering benchmark that helps users cross-check psychiatric facts.

    Possible benchmark topics include:

    1. A Benchmark for (bio)medical Machine Reading Comprehension
    2. Benchmarking the Generality and Domain Specificity of MRC models
    3. Benchmark on active learning for biomedical document annotation
    4. Benchmark on active learning for biomedical image segmentation
    5. Benchmark on active learning for biomedical question answering

     

    An ideal candidate should have:

    • Good background in machine learning, deep learning, and programming (Python or R).
    • Knowledge about data analytics, especially on social media, textual data, and knowledge graph data.
    • Experience with natural language processing tasks such as machine reading comprehension and question answering is a plus.
    • Enjoyment of reading and exploring scientific articles, preferably every day.
    • Proactive attitude towards learning new things, preferably every day.

    Interested students are encouraged to email Dr. Tam Nguyen (tamnguyen(at)l3s(dot)de) to discuss the topic.

    References:

    • Survey Paper 1: Q. Jin, Z. Yuan, G. Xiong, Q. Yu, C. Tan, M. Chen, S. Huang, X. Liu, S. Yu, Biomedical Question Answering: A Comprehensive Review, (2021).
    • Towards Medical Machine Reading Comprehension with Structural Knowledge and Plain Text (EMNLP 2020)
    • BioMRC: A Dataset for Biomedical Machine Reading Comprehension 

     

    Thesis supervision and contact

    Dr. Tam Nguyen

  • Debunking Medical Misinformation - Root Causes, Benchmarks, and Explanations (Tam Nguyen)

    An abundance of false or misleading medical information has been observed on the Web and particularly on social media, posing a considerable threat to public health while eroding trust in healthcare systems. Six out of ten people search online for the cause of their medical condition, and among those who find a diagnosis online, 35% do not visit a professional medical provider. The COVID-19 pandemic has exacerbated this problem by bringing forward an infodemic surrounding the coronavirus that spreads as quickly and is as deadly as the virus itself. For example, rumours that remedies such as methanol cure COVID-19 resulted in 300+ deaths and 1000+ people falling ill. This project aims to build a framework to combat medical misinformation with a focus on social media analytics, detection benchmarks, knowledge-intensive tasks (question answering, machine reading comprehension, knowledge graph construction), and explainable mitigation measures.

    An ideal candidate should have:

    • Good background in data analytics, especially on social media, textual data, and knowledge graph data.
    • Motivation to learn about and explore the data of interest, including some human-level annotation.
    • Knowledge about machine learning, deep learning, and programming (Python or R).
    • Enjoyment of reading and exploring scientific articles, preferably every day.
    • Proactive attitude towards learning new things, preferably every day.

    Interested students are encouraged to email Dr. Tam Nguyen (tamnguyen(at)l3s(dot)de) to discuss the topic.

    References:

    • Waszak, P.M., Kasprzycka-Waszak, W. and Kubanek, A., 2018. The spread of medical fake news in social media–the pilot quantitative study. Health policy and technology, 7(2), pp.115-118.
    • Naeem, S.B., Bhatti, R. and Khan, A., 2020. An exploration of how fake news is taking over social media and putting public health at risk. Health Information & Libraries Journal.
    • Treharne, T. and Papanikitas, A., 2020. Defining and detecting fake news in health and medicine reporting. Journal of the Royal Society of Medicine, 113(8), pp.302-305.

     

    Thesis supervision and contact

    Tam Nguyen

  • Online Tool Path Adaptation for Intelligent Process Control (Svenja Reimer)

    Until now, tool paths in milling have been fixed in the NC code. Intelligent monitoring systems can already detect process instabilities caused by poorly chosen cutting parameters. However, because the tool paths are fixed, controlling the depth and width of cut has not been possible so far. The subject of this thesis is therefore the development of methods for autonomous path adaptation.

    Aspects of the thesis include:

    • Development of algorithms for generating parametric NC code
    • Definition of limit values and intelligent control approaches
    • Experiments in machine simulation and on the real machine tool

    PDF page

    Thesis supervision and contact

    Svenja Reimer

  • Intelligent Process Control Using Self-Learning Stability Maps (Svenja Reimer)

    Chatter vibrations in the machining process lead to poor surface quality. The occurrence of chatter is closely tied to the chosen process parameters (depth of cut, width of cut, spindle speed). Determining the relationship between process parameters and process stability, and selecting optimal process parameters, has so far required elaborate simulations. By using machine learning and modern monitoring systems, the machine can also learn these relationships autonomously during the process ("self-learning stability maps", German: "mitlernende Stabilitätskarten"). The goal of this thesis is to develop and implement an intelligent process control based on self-learning stability maps. The thesis includes, among others, the following points:

    • Development of machining strategies for targeted data generation for self-learning stability maps
    • Development of strategies for online learning during the process
    • Development of an objective function for the control
    • Cutting experiments on the online adaptation of process parameters

    PDF page

    Thesis supervision and contact

    Svenja Reimer

  • Understanding Relevance Search over Medical Knowledge Graphs (Prof. Ganguly)

    The amount of medical evidence being produced is rising exponentially, which makes it very difficult for medical professionals to stay up to date with recent research studies in order to practice evidence-based medicine. In this master thesis, we aim to accommodate richer query variations for online biomedical literature search and to redesign the document collection used for search into a knowledge graph. In other words, we will adapt the “exemplar query” setting developed by Mottin et al. (2016, 2018) to the biomedical information retrieval domain and try to achieve state-of-the-art performance in the TREC Precision Medicine Track.

    Specifically, the student will get first-hand experience working with unstructured text (clinical notes) and knowledge graphs in the medical domain.

    Prerequisites:

    1. Good knowledge of Natural Language Processing (NLP) and Graphs
    2. Strong programming background in Python and working knowledge of standard Machine Learning and NLP libraries

    References:

    1. Mottin et al. Exemplar queries: a new way of searching (VLDB 2016)
    2. Gu et al. Relevance Search over Schema-Rich Knowledge Graphs (WSDM 2019)

    Thesis supervision and contact

    Prof. Niloy Ganguly, Soumyadeep Roy

  • Fairness-aware Online Learning under Class Imbalance (Dr. Vasileios Iosifidis, Prof. Dr. Wolfgang Nejdl)

    Fairness-aware online learning has become an evolving field during the last few years. The goal of fairness-aware online learning is to maintain a classifier that performs well and does not discriminate over the course of the stream. Some initial works have been proposed to tackle discriminatory outcomes from online classification [1, 2]; however, these methods do not take into consideration the uneven class distribution over the course of the stream. If the imbalance problem is not tackled, the learner mainly learns the majority class and strongly misclassifies/rejects the minority. Such methods might appear to be fair for certain fairness definitions that rely on parity in the predictions between the protected and non-protected groups. In reality, though, the low discrimination scores are just an artifact of the low prediction rates for the minority class.

    In this master thesis, we want to investigate the combined problem of class imbalance and fairness-aware learning in the online setup. We focus on the Naive Bayes classifier, which has been extensively studied in the context of fairness, but only in the static setting. In this work, we plan to extend these models to the online setting, taking into account the imbalance of the population under different fairness notions such as statistical parity [3], equal opportunity [4], and equalized odds [4].
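
    Two of the fairness notions named above, and the artifact caused by low prediction rates, can be made concrete with a small sketch. The function names are hypothetical, and the measures are written in their common difference form (rate for the non-protected group minus rate for the protected group), which is an assumption about the exact formulation to be used in the thesis.

```python
def positive_rate(preds):
    """Fraction of positive (1) predictions; 0.0 for an empty group."""
    return sum(preds) / len(preds) if preds else 0.0


def statistical_parity_diff(y_pred, protected):
    """P(pred = 1 | non-protected) - P(pred = 1 | protected)."""
    prot = [p for p, s in zip(y_pred, protected) if s == 1]
    non = [p for p, s in zip(y_pred, protected) if s == 0]
    return positive_rate(non) - positive_rate(prot)


def equal_opportunity_diff(y_true, y_pred, protected):
    """True positive rate of the non-protected group minus that of the protected group."""
    prot = [p for t, p, s in zip(y_true, y_pred, protected) if t == 1 and s == 1]
    non = [p for t, p, s in zip(y_true, y_pred, protected) if t == 1 and s == 0]
    return positive_rate(non) - positive_rate(prot)


y_true    = [1, 0, 1, 0, 1, 0, 0, 0]
protected = [1, 1, 1, 1, 0, 0, 0, 0]   # 1 marks the protected group
y_pred    = [0, 0, 1, 0, 1, 1, 0, 0]

sp = statistical_parity_diff(y_pred, protected)
eo = equal_opportunity_diff(y_true, y_pred, protected)

# A degenerate classifier that always predicts the majority (negative) class.
always_negative = [0] * len(y_true)
sp_degenerate = statistical_parity_diff(always_negative, protected)
print(sp, eo, sp_degenerate)
```

    The degenerate classifier reaches a statistical parity difference of zero although it never predicts the positive class for anyone. This is the situation described above, in which a low discrimination score only reflects low prediction rates for the minority class.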

    An ideal candidate should be:

    • a self-motivated and independent learner
    • knowledgeable about machine learning (good grades in Data Mining I, Data Mining II)
    • experienced with Python or Java

     

    Interested students are encouraged to email Wolfgang Nejdl and/or Vasileios Iosifidis to schedule an appointment. A CV and transcript of records must be sent beforehand.

    References

    1. V. Iosifidis, H. Tran, E. Ntoutsi, "Fairness-enhancing interventions in stream classification", 30th International Conference on Database and Expert Systems Applications (DEXA), 2019.
    2. W. Zhang, E. Ntoutsi, "An Adaptive Fairness-aware Decision Tree Classifier", International Joint Conference on Artificial Intelligence (IJCAI), 2019.
    3. Kamiran, F., & Calders, T. (2012). Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems, 33(1), 1-33.
    4. Hardt, M., Price, E. and Srebro, N., 2016. Equality of opportunity in supervised learning. In Advances in neural information processing systems (pp. 3315-3323).

     

     

    Thesis supervision and contact

    M.Sc. Vasileios Iosifidis, Prof. Dr. techn. Wolfgang Nejdl

  • Question Answering Using Deep Learning (Prof. Anand)

    Thesis supervision and contact

    Prof. Dr. Avishek Anand

  • Faster Inference for Deep Neural Rankers (Prof. Anand)

    Thesis supervision and contact

    Prof. Dr. Avishek Anand

  • Interpretability of Neural Models (Prof. Anand)

    Thesis supervision and contact

    Prof. Dr. Avishek Anand

  • Neural Information Retrieval (Prof. Anand)

    Thesis supervision and contact

    Prof. Dr. Avishek Anand

  • Data Analytics and Mobility (Dr. Elena Demidova, M.Sc. Nicolas Tempelmeier)

    Cities of the future have a growing demand for intelligent mobility services and infrastructure to support better mobility and enhance quality of life in urban areas. The increasing availability of urban data holds great potential to facilitate efficient mobility services and infrastructure, for instance, through a better understanding of long-term trends (such as e-mobility) and their impact on transportation needs, or the correlation of mobility behavior in densely populated areas with influence factors such as weather, regional events or temporal fluctuations. Extraction, integration and analysis of heterogeneous mobility-related urban data is of interest to various stakeholder groups, including city inhabitants, city councils, providers of mobility services and public transportation. While data is spread across heterogeneous institutional repositories and Web platforms, semantic technologies and machine learning methods can be exploited to enable the extraction and analysis of data.

    Examples of such data are:

    • Queries for public transportation services
    • Historical traffic speed records
    • Environmental data
    • Warnings about traffic incidents
    • Location of construction sites
    • Calendar of scheduled events

     

    Possible topics for a MSc thesis in this context could address one of the following research questions through utilizing state-of-the-art machine learning and data mining methods:

    • Prediction of road traffic from public transportation query logs
    • Estimating the impact of events
    • Impact of traffic incidents
    • Verification of traffic incidents
    • Prediction of congestion and bottlenecks in traffic
    • Event detection

     

    For further information, see the homepage of the project “Data4UrbanMobility”: http://d4um.l3s.uni-hannover.de

    Thesis supervision and contact

    Dr. Elena Demidova, M.Sc. Nicolas Tempelmeier

  • Modelling and predicting the knowledge state of students (Prof. Ewerth)

    Thesis supervision and contact

    Prof. Dr. Ralph Ewerth

  • Search as learning (SaL): Investigating the impact of images and videos in SaL scenarios (Prof. Ewerth)

    Thesis supervision and contact

    Prof. Dr. Ralph Ewerth

  • Automatic highlighting of important segments in educational videos (Prof. Ewerth)

    Thesis supervision and contact

    Prof. Dr. Ralph Ewerth

  • Automatic question generation for (educational) videos (Prof. Ewerth)

    Thesis supervision and contact

    Prof. Dr. Ralph Ewerth

  • Using knowledge graphs for video question answering (Prof. Ewerth)

    Thesis supervision and contact

    Prof. Dr. Ralph Ewerth

  • Automatic captioning for scholarly figures (Prof. Ewerth)

    Thesis supervision and contact

    Prof. Dr. Ralph Ewerth

  • Analysing and Linking Graphical Representations in Computer Science Publications to their Implementation (Prof. Ewerth)

    Thesis supervision and contact

    Prof. Dr. Ralph Ewerth

  • Linking Formulas in Scientific Publications with their Software Implementation (Prof. Ewerth)

    Thesis supervision and contact

    Prof. Dr. Ralph Ewerth

  • Information extraction from scientific (textual) publications for knowledge graph enrichment (Prof. Ewerth)

    Thesis supervision and contact

    Prof. Dr. Ralph Ewerth

  • Improving OCR for recognizing text in scientific videos (Prof. Ewerth)

    Thesis supervision and contact

    Prof. Dr. Ralph Ewerth

  • Semi-automatic enrichment of scientific videos with external recommendations (Prof. Ewerth)

    Thesis supervision and contact

    Prof. Dr. Ralph Ewerth