Adaptive, generalized and personalized preference models for speech enhancement

Background Speech enhancement, the process of improving the quality of speech signals, not only improves the quality of experience for listeners and the quality of communication; it can also aid the performance of machine- and deep-learning models in downstream tasks. However, the trade-off between noise removal and the introduction of processing artifacts remains an open challenge [1]. The project aims to investigate the factors influencing noise-reduction preferences and to develop a technical framework around them....

May 27, 2024

Low-resource speech technology for healthcare

Background We are seeking students interested in advancing speech technology in low-resource environments. The project is deliberately open-ended and will focus on developing machine learning models and algorithms tailored to the unique challenges posed by limited data and computational resources in speech processing, including high-stakes applications such as healthcare and education. Objective(s) Potential directions are: Research and develop novel machine learning techniques optimized for low-resource speech technology applications. Design and implement efficient algorithms for speech recognition, synthesis, and understanding in resource-constrained settings....

May 27, 2024

Functional dependency between EEG oscillatory activity and pupil dilation

Background Working memory engages the frontal lobes and the autonomic nervous system. The activation of the frontal lobes can be seen as an increase in frontal midline theta (4-6 Hz) activity in the EEG signal. The activation of the autonomic nervous system can be seen as a dilation of the pupils. These two signals thus co-occur when humans use their working memory. The temporal dependency between the two signals is, however, poorly understood....
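The frontal midline theta measure mentioned above can be illustrated with a toy spectral estimate. This is a minimal numpy sketch under assumed conditions (the synthetic signal and the simple FFT band-power estimate are illustrative only; a real analysis would use sliding windows and proper spectral estimation):

```python
import numpy as np

def band_power(x, fs, lo=4.0, hi=6.0):
    """Mean power of x within the [lo, hi] Hz band (default: theta, 4-6 Hz)."""
    spec = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    mask = (freqs >= lo) & (freqs <= hi)
    return spec[mask].mean()

fs = 256                                    # sampling rate in Hz
t = np.arange(4 * fs) / fs                  # 4 seconds of toy "EEG"
eeg = np.sin(2 * np.pi * 5 * t) + 0.2 * np.sin(2 * np.pi * 30 * t)
theta = band_power(eeg, fs)                 # dominated by the 5 Hz component
beta = band_power(eeg, fs, 25.0, 35.0)
```

On this toy signal, the theta-band power dominates the higher-frequency band, which is the kind of contrast a working-memory analysis would track over time.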

December 7, 2023 · Tobias Andersen

Probabilistic Tensor Trains for Large-scale Linear Problems

Background This project will develop arithmetics for the Probabilistic Tensor Train Decomposition (Hinrich & Mørup, 2019) and apply them to solving large-scale linear problems relevant to a variety of applications, e.g. in quantum computing/physics, machine learning, and finance. First, consider the linear problem Ax = b, where A, x, and b are an N-by-M matrix, an M-by-1 vector, and an N-by-1 vector, respectively. In large-scale linear problems, M and N are large numbers with exponential scaling, e....
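The exponential scaling can be made concrete with a toy Kronecker-structured example (this is not the probabilistic TT model itself, just an assumed rank-1 illustration of why factored representations let one form Ax without ever materializing the exponentially large operands):

```python
import numpy as np
from functools import reduce

def full(factors):
    """Assemble the full Kronecker product (exponentially large in d)."""
    return reduce(np.kron, factors)

d = 8  # the full problem dimension is 2**d = 256
A_factors = [np.array([[2.0, 1.0], [0.0, 1.0]]) for _ in range(d)]
x_factors = [np.array([1.0, 0.5]) for _ in range(d)]

# Factored mat-vec: (A1 (x) ... (x) Ad)(x1 (x) ... (x) xd) = (A1 x1) (x) ... (x) (Ad xd),
# so only d small 2x2-by-2 products are needed.
b_factors = [Ai @ xi for Ai, xi in zip(A_factors, x_factors)]

# Dense check (only feasible here because d is small).
b_dense = full(A_factors) @ full(x_factors)
```

Storage for the factors grows linearly in d while the dense objects grow as 2**d; tensor-train arithmetic generalizes this idea beyond rank-1 structure.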

November 23, 2023

Bayesian VAEs - Linearized Laplace Approximation - Can B-VAEs generate meaningful examples from their latent space?

Background Variational Auto-Encoders (VAEs) are useful models for learning non-linear latent representations of data. Usually, VAEs are trained by approximating the maximum likelihood estimate of the parameters through the evidence lower bound. In Bayesian Variational Auto-Encoders (B-VAEs), we instead obtain a posterior distribution over the parameters, leading to a more robust model, e.g., for out-of-distribution detection (Daxberger and Hernández-Lobato, 2019). Bayesian VAEs are typically trained by learning a variational posterior approximation of both the latent variables and the model parameters....
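The evidence lower bound mentioned above can be written down in a few lines. This is a minimal numpy sketch, assuming a diagonal Gaussian posterior, a standard normal prior, and a Gaussian observation model (the function names are illustrative, not project code):

```python
import numpy as np

def kl_diag_gauss(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def elbo(x, x_recon, mu, log_var, obs_var=1.0):
    """Evidence lower bound: expected log-likelihood minus KL to the prior."""
    log_lik = -0.5 * np.sum((x - x_recon) ** 2 / obs_var + np.log(2 * np.pi * obs_var))
    return log_lik - kl_diag_gauss(mu, log_var)

mu, log_var = np.zeros(2), np.zeros(2)   # posterior equal to the prior
print(kl_diag_gauss(mu, log_var))        # 0.0
```

In a B-VAE, a variational posterior of this Gaussian form is placed over the network weights as well, not only over the latent variable.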

November 15, 2023

Benchmarking Data Augmentation Techniques

Background Data augmentation increases the diversity of a dataset by applying various transformations to the existing data, which helps improve model generalization, performance, and robustness while reducing the risk of overfitting. For this project, you will conduct a rigorous survey and comparison of existing methods and metrics. Currently, it is difficult to compare the effects of different data augmentation strategies. Firstly, you will research existing relevant methods, metrics, and underlying tasks in order to design a comprehensive data augmentation benchmark....
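Two of the simplest transformations such a benchmark would cover can be sketched in numpy (a minimal illustration for 1-D time series; the function names and parameters are assumptions, not a benchmark specification):

```python
import numpy as np

rng = np.random.default_rng(0)

def jitter(x, sigma=0.05):
    """Additive Gaussian noise augmentation."""
    return x + rng.normal(0.0, sigma, size=x.shape)

def time_shift(x, max_shift=10):
    """Circularly shift a 1-D signal by a random offset."""
    return np.roll(x, rng.integers(-max_shift, max_shift + 1))

x = np.sin(np.linspace(0, 2 * np.pi, 100))
augmented = [jitter(x), time_shift(x)]
```

A benchmark then asks which of many such transforms (and at which strengths) actually improve a downstream model, which is exactly where consistent metrics are needed.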

November 15, 2023

CHEAT-GPT4I: Controlled Human-out-of-thE-loop AssisTed-GPT for Instructors *- automating exam production in the age of burned-out teachers.*

Background Being a teacher at DTU is challenging and involves severe work-life balance issues; in particular, it is more exciting to work on the latest research than to be occupied by administrative tasks such as developing exam sets for assessing student learning outcomes. For instance, in the course 02450, three exam sets must be generated per year, each of which requires a minimum of one week of full-time work to complete....

November 15, 2023

CHEAT-GPT4S: Controlled Human-out-of-thE-loop AssisTed-GPT for Students *- automating report production in the age of lazy students and evil teachers.*

Background Being a student at DTU is challenging and involves severe work-life balance issues, in particular when facing horrible teachers with unreasonable perceptions of what constitutes a fair course workload. Recent efforts have tried to guide students via the laziness barometer in the course-analyzer. However, some study programmes still enforce tough courses that score unreasonably high on required workload. One such example is the 02450 Introduction to Machine Learning and Data Mining course, which includes two reports during the semester, each requiring extensive effort to hand in a satisfactory report on time....

November 15, 2023

Computational Metacognition

Background Metacognition is thinking about thinking (see, e.g., Akturk and Sahin, 2011). Metacognition in intelligent systems refers, e.g., to processes/strategies for monitoring/controlling performance, uncertainty quantification, explainability, decision-making, and learning capabilities. Subcategories include metamemory, meta-awareness, and elements of social cognition (theory of mind). Can we translate tools for evaluating metacognitive awareness in humans for use in AI? Can we design tools for evaluating the levels of metacognition in machines? Objective(s) i) Review natural and artificial metacognition - similarities and contrasts...

November 15, 2023

Effective simulation of spreading processes from privacy-preserving location data

Background In recent times, mobile phone operators have been sharing aggregated location data with researchers to study real-world phenomena such as epidemic spreading. The aggregated data are not always appropriate for modeling contagion processes, precisely because of their aggregated nature. On the other hand, aggregation is important to preserve individuals' privacy. How can we aggregate mobility data in a way that still enables us to effectively study contagion processes (such as epidemic spreading)?...
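The kind of contagion process at stake can be sketched with a toy metapopulation SIR model driven by an aggregated mobility matrix (all names and parameter values here are assumptions for illustration; the fully mixed matrix M stands in for the coarse data an operator might share instead of individual trajectories):

```python
import numpy as np

def sir_step(S, I, R, M, beta=0.3, gamma=0.1):
    """One discrete step of a metapopulation SIR model.
    M is a row-stochastic aggregated mobility matrix between regions."""
    force = beta * (M @ I)            # infection pressure felt in each region
    new_inf = S * force
    new_rec = gamma * I
    return S - new_inf, I + new_inf - new_rec, R + new_rec

n = 3
M = np.full((n, n), 1.0 / n)          # fully mixed aggregation (assumed)
S = np.array([0.99, 1.0, 1.0])        # susceptible fraction per region
I = np.array([0.01, 0.0, 0.0])        # outbreak seeded in region 0
R = np.zeros(n)
for _ in range(50):
    S, I, R = sir_step(S, I, R, M)
```

How much of the epidemic's dynamics survives a given aggregation scheme is exactly the research question; coarser M means more privacy but potentially less faithful spreading behavior.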

November 15, 2023

Explainability of Multimodal Models

Background A multimodal model is any model that takes in more than one modality of data, e.g. text and images, or image and audio, etc. An example is the CLIP model [1], which learns embeddings of text and images simultaneously. Common explainability methods generally focus on a single modality, e.g. which part of an image was important for a given prediction. The purpose of this project is to investigate how single-modality explainability methods can be extended to multimodal data....
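One single-modality method that extends naturally to the cross-modal setting is occlusion: mask parts of one input and measure the drop in the cross-modal similarity score. A minimal sketch, where `embed_image` is a hypothetical stand-in for a CLIP-like image encoder (not a real API):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8))          # toy projection standing in for an encoder

def embed_image(img):
    """Hypothetical image encoder: flatten and project."""
    return img.reshape(-1) @ W

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def occlusion_map(img, text_emb, patch=2):
    """Relevance of each patch = drop in image-text similarity when zeroed."""
    base = cosine(embed_image(img), text_emb)
    h, w = img.shape
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = img.copy()
            occluded[i:i + patch, j:j + patch] = 0.0
            heat[i // patch, j // patch] = base - cosine(embed_image(occluded), text_emb)
    return heat

img = rng.normal(size=(4, 4))
text_emb = rng.normal(size=8)
heat = occlusion_map(img, text_emb)
```

The same scheme can be mirrored on the text side (masking tokens), which is one way the project could produce explanations for both modalities of a prediction.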

November 15, 2023

GPT²A: Generative Pretrained Transformers as Teaching Assistants *- scaling report evaluations in the age of limited TA resources.*

Background Correcting reports is a very time-consuming task in courses challenged by limited resources. Historically, we have a lot of data on carefully corrected reports using rubric evaluation criteria. This project will explore the use of large language models (LLMs), and in particular recent developments enhancing LLMs with multimodality (i.e., image comprehension) [2], to enable an automated report evaluation system. The data for the project will be historically evaluated reports from the 02450 Introduction to Machine Learning and Data Mining course, covering on the order of 6 years of two semesters each, with about 200 groups per semester producing two reports....

November 15, 2023

Guided representation learning of medical signals using textual (meta) data

Background When designing and training machine learning architectures for medical data, we often neglect prior knowledge about the data domain. In this project, we want to investigate whether we can find shared information between medical signals and their corresponding textual descriptions. Recent advances in the field of NLP have made it possible to learn rich contextual information from textual data. Given cross-modality observations, Relative Representations make it possible to compare and align the latent spaces of different modalities or models....
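The relative-representation idea mentioned above can be sketched in a few lines: re-express each embedding by its cosine similarities to a fixed set of anchor samples, which makes latent spaces of different models comparable up to rotations. A minimal numpy sketch with toy data (the embeddings and anchor choice are illustrative assumptions):

```python
import numpy as np

def relative_repr(Z, anchors):
    """Rows of Z and anchors are embeddings; returns cosine sims to anchors."""
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    An = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return Zn @ An.T

rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 32))     # e.g. signal-encoder embeddings
anchors = Z[:3]                  # anchors chosen from the data itself
R = relative_repr(Z, anchors)    # shape (5, 3): one row per sample
```

Because cosine similarity is unchanged by orthogonal transformations, two encoders whose latent spaces differ by a rotation yield the same relative representation over shared anchors, which is what enables cross-modality comparison.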

November 15, 2023

Human Data Fusion

Background Available data include: EEG; eye-tracking (gaze position with fixations and saccades, pupil size); posture; skin conductance; ECG (and HRV, ...); blood oxygen saturation; breathing rate; and contextual data. Key challenges are the synchronization of data across multiple persons and the fusion of multi-modal data recorded at different times (varying contexts). Project 1: Create a common framework to make it easier to apply ML methods to an otherwise heterogeneous set of data sources....

November 15, 2023

Identifying Archetypes in Social Media Users

Background The goal of this project is to find archetypes of social media users by exploiting highly complex individual sequences capturing the usage of different social media apps (Facebook, Instagram). Archetypal Analysis (AA) is a highly interpretable decomposition technique renowned for identifying distinct characteristics, known as archetypes, expressed as convex combinations of the data observations. The analysis will be based on a large-scale dataset collected from smartphones that captures app activity as well as other behaviors (physical activity, sleep) for ~1M individuals world-wide....
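The "convex combinations" structure of AA can be made concrete with a minimal sketch of the model X ≈ S(CX): archetypes are convex combinations (C) of observations, and each observation is a convex combination (S) of archetypes. The weights below are random placeholders, not fitted values (fitting them by alternating constrained least squares is the actual AA algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)

def simplex_rows(n, k):
    """Random row-stochastic matrix: each row lies on the probability simplex."""
    W = rng.random((n, k))
    return W / W.sum(axis=1, keepdims=True)

X = rng.normal(size=(100, 5))   # 100 users, 5 behavioral features (toy data)
k = 3                           # number of archetypes
C = simplex_rows(k, 100)        # archetypes as convex combos of users
S = simplex_rows(100, k)        # users as convex combos of archetypes

archetypes = C @ X              # (k, 5); stays inside the data's convex hull
X_hat = S @ archetypes          # reconstruction whose error AA minimizes
sse = np.sum((X - X_hat) ** 2)
```

The simplex constraints are what make AA interpretable: each archetype is a plausible "extreme user", and each user's S-row reads directly as a mixture over those extremes.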

November 15, 2023

Learning Data Augmentation

Background Data augmentation increases the diversity of a dataset by applying various transformations to the existing data, which helps improve model generalization, performance, and robustness while reducing the risk of overfitting. For this project, you will begin by surveying existing methods and metrics. Afterwards, you will focus on creating new methods for learning a data augmentation scheme in order to optimize one or more selected metrics. You will evaluate your methods on well-established benchmark datasets in a selected data domain and task, which could be images, time series, or molecular graph data, e....

November 15, 2023

Sugar translation tool

Background Polysaccharides, commonly known as sugars, are among the main molecules in living beings, and they have a complex structure that may include a variety of residues and ramifications. More than four systems exist for representing these structures, but in some cases they cannot be translated one-to-one. Researchers from different scientific backgrounds may be more used to one particular naming system, which hampers communication between fields. Also, due to the complexity of polysaccharides, the systems are not straightforward for newcomers to understand....

November 15, 2023

Unsupervised Speech Enhancement

Background Speech enhancement is the task of recovering clean speech from noisy speech that has been degraded by, e.g., background noise, interfering speakers, or reverberation in poor acoustic conditions. Deep learning-based speech enhancement is typically trained in a supervised setup using datasets of separate clean speech and background noise samples, which are combined at training time; the network is then trained to directly recover the clean speech from the artificially corrupted mixture....
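The supervised data-generation step described above, mixing clean speech with noise at a target signal-to-noise ratio to create (noisy, clean) training pairs, can be sketched in numpy (the toy sine "speech" and function name are illustrative assumptions):

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean-to-noise power ratio equals snr_db, then mix."""
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + scale * noise

rng = np.random.default_rng(0)
fs = 16000
clean = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)   # 1 s of toy "speech"
noise = rng.normal(size=fs)
noisy = mix_at_snr(clean, noise, snr_db=5.0)            # input to the network
# (noisy, clean) is one supervised training pair
```

The unsupervised setting this project targets is precisely the case where such paired clean references are unavailable at training time.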

November 15, 2023
