All-atom Diffusion Transformers for Unified 3D Molecular and Material Generation

Background The All-atom Diffusion Transformer (ADiT) represents a significant advancement in the generative modeling of 3D atomic systems. Unlike traditional models that are tailored specifically for either molecules or materials, ADiT introduces a unified latent diffusion framework capable of jointly generating both periodic materials and non-periodic molecular systems using a single model. This is achieved through a two-stage process: An autoencoder maps unified, all-atom representations of molecules and materials to a shared latent embedding space. ...
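To make the two-stage idea concrete, here is a minimal, self-contained sketch of a latent diffusion setup (this is not ADiT's actual architecture; the module sizes, toy features, and the joint loss below are illustrative assumptions, and in practice the autoencoder and the latent diffusion model would be trained in separate stages):

```python
import torch
import torch.nn as nn

# Hypothetical dimensions and modules; ADiT's actual architecture differs.
latent_dim, atom_feat_dim, timesteps = 8, 16, 1000

encoder = nn.Sequential(nn.Linear(atom_feat_dim, 64), nn.SiLU(), nn.Linear(64, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.SiLU(), nn.Linear(64, atom_feat_dim))
denoiser = nn.Sequential(nn.Linear(latent_dim + 1, 64), nn.SiLU(), nn.Linear(64, latent_dim))

betas = torch.linspace(1e-4, 0.02, timesteps)
alpha_bars = torch.cumprod(1.0 - betas, dim=0)

def diffusion_loss(x):
    """Stage 1: encode atoms to latents. Stage 2: denoising loss in latent space."""
    z0 = encoder(x)                                       # (n_atoms, latent_dim)
    t = torch.randint(0, timesteps, (1,))
    noise = torch.randn_like(z0)
    zt = alpha_bars[t].sqrt() * z0 + (1 - alpha_bars[t]).sqrt() * noise
    t_emb = torch.full((z0.shape[0], 1), t.item() / timesteps)
    pred_noise = denoiser(torch.cat([zt, t_emb], dim=-1))
    recon = decoder(z0)
    return nn.functional.mse_loss(pred_noise, noise) + nn.functional.mse_loss(recon, x)

x = torch.randn(12, atom_feat_dim)   # toy "all-atom" features for one structure
print(diffusion_loss(x))
```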

June 4, 2025

Supervised Principal Component Analysis using Neural Autoencoders

Motivation and Background Principal Component Analysis (PCA) is a widely-used method for dimensionality reduction, capturing maximum variance in data through orthogonal linear transformations. However, standard PCA ignores label information, potentially overlooking directions critical for predictive tasks. By incorporating label information into PCA, supervised PCA can extract dimensions directly related to the target variable, enhancing predictive performance and interpretability. Objectives The primary objective of this project is to develop a supervised neural network autoencoder (NN-AE) that integrates label information into PCA by learning orthogonal basis functions informed by supervised targets. This methodology aims to enhance interpretability and predictive accuracy relative to traditional PCA. ...
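As a starting point, one way to realize this idea is a linear autoencoder with a supervised prediction head and a soft orthogonality penalty on the encoder weights; the sketch below is an assumed formulation, not the project's final model, and the weights alpha and beta are arbitrary choices:

```python
import torch
import torch.nn as nn

class SupervisedPCAAutoencoder(nn.Module):
    """Linear autoencoder with a supervised head (an illustrative sketch)."""
    def __init__(self, in_dim, n_components):
        super().__init__()
        self.encoder = nn.Linear(in_dim, n_components, bias=False)
        self.decoder = nn.Linear(n_components, in_dim, bias=False)
        self.head = nn.Linear(n_components, 1)            # predicts the supervised target

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), self.head(z), z

def loss_fn(model, x, y, alpha=1.0, beta=0.1):
    x_hat, y_hat, _ = model(x)
    W = model.encoder.weight                              # (k, d) basis directions
    ortho = ((W @ W.t() - torch.eye(W.shape[0])) ** 2).sum()   # soft orthogonality penalty
    return (nn.functional.mse_loss(x_hat, x)              # reconstruction (PCA-like term)
            + alpha * nn.functional.mse_loss(y_hat.squeeze(-1), y)   # supervised term
            + beta * ortho)

x, y = torch.randn(256, 20), torch.randn(256)
model = SupervisedPCAAutoencoder(20, 3)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad(); loss = loss_fn(model, x, y); loss.backward(); opt.step()
```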

April 1, 2025

Geometric Deep Generative Models for Scientific Data

Add link to the project on the DTU website: Background At the Machine Learning in Life Science (MLSS) Research Center, we have been developing methodologies that we aim to apply and test on real-world data. In short, our work focuses on latent representations, typically obtained from a Variational Autoencoder (VAE), with the goal of extracting meaningful and reliable knowledge from them. An example of this research direction is this article, which explores representations of protein sequences. They found that the geodesic distances seem to recover evolutionary development protein sequences. ...

March 24, 2025

Enhancing Relative Representations using Custom Weighted Mahalanobis Distance

Background Relative representations are a powerful tool in machine learning and data analysis, where data points are represented based on their distances or similarities to a set of reference points called anchors. Traditional methods often rely on similarity measures, such as cosine similarity, which are invariant under rotation and scaling but may not satisfy the properties of a metric space, particularly the triangle inequality. This limitation can hinder the effectiveness of certain algorithms that require a proper distance metric. ...
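The sketch below illustrates the two notions of relative representation being contrasted, computed against a small set of anchors; the weight matrix M (here the classical inverse-covariance Mahalanobis choice) is an assumption that the project's custom weighting would replace:

```python
import numpy as np

def relative_cosine(X, anchors):
    """Represent each point by its cosine similarity to the anchors."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    An = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return Xn @ An.T                                   # (n_points, n_anchors)

def relative_mahalanobis(X, anchors, M):
    """Represent each point by its Mahalanobis distance to the anchors (M must be PSD)."""
    diff = X[:, None, :] - anchors[None, :, :]         # (n_points, n_anchors, d)
    return np.sqrt(np.einsum('pad,de,pae->pa', diff, M, diff))

X = np.random.randn(100, 8)
anchors = X[:5]
M = np.linalg.inv(np.cov(X, rowvar=False))             # classical Mahalanobis choice of M
print(relative_cosine(X, anchors).shape, relative_mahalanobis(X, anchors, M).shape)
```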

January 15, 2025

Diffusion models for generating molecules in 3D

Background Diffusion models are rapidly advancing the field of 3D molecular generation, offering new tools for applications in drug discovery and materials science. These models generate realistic molecular structures by iteratively refining noisy inputs, capturing the intricate spatial relationships crucial to molecular properties. The aim of this project is to explore Equivariant Neural Diffusion (END), an innovative 3D molecular generation model that preserves equivariance to Euclidean transformations. END stands out for its learnable forward process, parameterized in a time- and data-dependent manner, ensuring robust equivariance to rigid transformations. The project will involve extending the capabilities of END, benchmarking its performance on standard molecular generation datasets, and refining its generative accuracy and scalability to enhance its utility in molecular modeling applications. ...
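For reference, the fixed forward noising process used in standard denoising diffusion models, which END replaces with a learnable, time- and data-dependent parameterization, can be written as:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\!\big(x_t;\, \sqrt{1-\beta_t}\, x_{t-1},\ \beta_t I\big), \qquad q(x_t \mid x_0) = \mathcal{N}\!\big(x_t;\, \sqrt{\bar{\alpha}_t}\, x_0,\ (1-\bar{\alpha}_t) I\big), \quad \bar{\alpha}_t = \prod_{s=1}^{t} (1-\beta_s).$$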

December 18, 2024

Equivariant graph neural networks for molecular modeling.

Background Graph Neural Networks (GNNs) have become powerful tools for modeling molecular systems, with applications in drug discovery and materials science. Equivariant GNNs, which preserve symmetries like rotations and translations, are especially well-suited for molecular modeling as they ensure that the model’s output changes consistently with the molecular structure’s orientation. This capability enhances accuracy and generalization, making them valuable for tasks such as drug discovery and material design. However, challenges remain around their robustness, generalization, and uncertainty quantification (UQ) capabilities. Reliable UQ is crucial for scientific applications, where predictions need to be interpretable and uncertainties well-calibrated. ...

December 11, 2024

Graph neural networks based on geometric algebra

Background Geometric Algebra (GA) provides a unified mathematical framework for representing and manipulating geometric entities and transformations in arbitrary dimensions. Its ability to elegantly encode rotations, reflections, and other symmetries makes it an ideal tool for advancing geometric deep learning. While current Graph Neural Networks (GNNs) have made significant strides in processing molecular data, they often rely on specialized techniques to handle equivariance and fail to fully leverage the expressive power of GA. ...

December 11, 2024

Variational inference with the spacings estimator

Background Variational inference (VI) is a key framework in Bayesian deep learning, enabling scalable approximations of complex posterior distributions. Accurate entropy estimation is critical in VI but remains challenging, particularly for high-dimensional or multimodal distributions. Traditional methods, such as closed-form approximations or Monte Carlo sampling, can be computationally intensive or inaccurate. The spacings estimator, a non-parametric technique leveraging the ordering and spacing of samples, offers a promising alternative for efficient and robust entropy estimation. ...
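For intuition, a minimal NumPy sketch of the classical Vasicek m-spacings entropy estimator is shown below; the choice m ≈ √n is a common heuristic, not a prescription from the project:

```python
import numpy as np

def spacings_entropy(samples, m=None):
    """Vasicek m-spacings estimator of differential entropy from 1-D samples (a sketch)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = len(x)
    if m is None:
        m = max(1, int(np.sqrt(n)))                      # common heuristic choice
    upper = x[np.minimum(np.arange(n) + m, n - 1)]       # clamp indices at the boundaries
    lower = x[np.maximum(np.arange(n) - m, 0)]
    return np.mean(np.log(n / (2 * m) * (upper - lower)))

samples = np.random.randn(10_000)
true_entropy = 0.5 * np.log(2 * np.pi * np.e)            # entropy of a standard Gaussian
print(spacings_entropy(samples), true_entropy)
```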

December 2, 2024

Improved Spectral Analysis for Greenhouse Gas Monitoring

Background As the world transitions from fossil fuels to renewable energy sources, natural gas (\(CH_4\)) has gained prominence as a cleaner alternative to coal and oil due to its higher \(\Delta H_r/CO_2\) ratio. However, \(CH_4\) is a potent greenhouse gas, with a global warming potential equivalent to 30.5 \(CO_2\) molecules over a 20-year horizon. This project aims to enhance the accuracy of unburned \(CH_4\) emissions monitoring, supporting efforts to reduce greenhouse gas emissions and align with global sustainability objectives. ...

November 24, 2024

Photon-Matter Interaction Detection Using Machine Learning and Computer Vision

Background The i-RASE project is a high-impact research collaboration that brings together leading international partners, including DTU Space and DTU Compute, to develop the first real-time photon-by-photon radiation detector. This novel system has transformative potential in fields such as medical imaging, industrial inspection, scientific space instrumentation, and environmental monitoring, where rapid, precise photon detection is essential. The project seeks to design an intelligent, compact, and energy-efficient system-in-package (SIP) that combines physics-inspired artificial neural networks with advanced signal processing to achieve unprecedented accuracy and speed in photon interaction detection. ...

November 20, 2024 · Alejandro Valverde Mahou

Optimizing masking-based XAI for enhanced interpretability of deep learning models

Background Explainability is a necessary component for the implementation of deep learning models in domains with critical decision-making, such as healthcare, finance, and climate. The black-box nature of these models makes them less trustworthy, and the aim of eXplainable AI (XAI) is to open the black box. Masking-based methods use repeated perturbation of the input to measure the change in the output and assess the relevance of each input pixel. The relevance is either estimated using Monte Carlo sampling of the masks [2, 3] or by optimizing the masks using back-propagation [1, 4]. Both of these methods have drawbacks: the first requires repeatedly sampling many (potentially redundant) masks in the input space, while the latter requires access to the model gradients, which may be detrimental to the safety of the models. ...
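A minimal sketch of the Monte Carlo (RISE-style) variant is shown below; real implementations upsample low-resolution masks and use a trained classifier, whereas the model and the mask distribution here are toy placeholders:

```python
import numpy as np

def mc_mask_saliency(model_fn, image, n_masks=500, keep_prob=0.5, rng=None):
    """Relevance of each pixel = average model score over random masks keeping that pixel."""
    rng = np.random.default_rng(rng)
    h, w = image.shape[:2]
    saliency = np.zeros((h, w))
    total_weight = np.zeros((h, w))
    for _ in range(n_masks):
        mask = (rng.random((h, w)) < keep_prob).astype(float)
        score = model_fn(image * mask[..., None])        # scalar score for the target class
        saliency += score * mask
        total_weight += mask
    return saliency / np.maximum(total_weight, 1e-8)

# Toy usage with a stand-in "model": mean intensity of the upper-left quadrant.
toy_model = lambda img: img[:16, :16].mean()
image = np.random.rand(32, 32, 3)
relevance = mc_mask_saliency(toy_model, image, n_masks=200)
print(relevance.shape)
```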

November 20, 2024 · Thea Brüsch

Self-supervised Learning for Point Clouds

Background 3Shape develops hardware and software for intraoral scanners. This project will revolve around the 3D mesh data of intraoral scans (scans of people's teeth). You will have the opportunity to work part-time from their offices in Kongens Nytorv. Deep self-supervised learning has been established as a strong foundation for a plethora of downstream tasks. In particular, SimCLR has been shown to perform very well on downstream tasks. However, these empirical results have not been replicated in the 3D domain. In this project, we would like to explore how to apply this to the 3D models of teeth that 3Shape works with. Initially, we would like to benchmark some established methods on our data, and then we would like to explore how to translate these methods to the 3D space. ...
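For reference, the contrastive objective at the core of SimCLR (the NT-Xent loss) can be sketched as follows; adapting the augmentations and the encoder to 3D meshes/point clouds is the actual subject of the project:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent (SimCLR) contrastive loss for a batch of paired view embeddings."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)    # (2N, d), unit-norm
    sim = z @ z.t() / temperature                          # pairwise cosine similarities
    n = z1.shape[0]
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool), float('-inf'))   # drop self-pairs
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])     # positive indices
    return F.cross_entropy(sim, targets)

z1, z2 = torch.randn(32, 128), torch.randn(32, 128)       # embeddings of two augmented views
print(nt_xent(z1, z2))
```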

May 31, 2024 · Johan Ye

Style Transfer for Point Clouds

Background 3Shape develops hardware and software for intraoral scanners. This project will revolve around the 3D mesh data of intraoral scans (scans of people's teeth). You will have the opportunity to work part-time from their offices in Kongens Nytorv. Style transfer involves merging the style of one image with the content of another. This process typically involves optimizing an objective function that minimizes a content loss and a style loss. The result is an image that retains the original content while adopting the visual characteristics of the chosen style. For point clouds, however, this is less clear-cut, as content and style are not easily separable: both are encoded in the point cloud geometry. Previous works have considered the coloring of the point cloud to be the style; however, this is not satisfactory. ...
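For reference, the classical image-based objective described above can be sketched as below, with placeholder features standing in for the usual CNN activations; defining analogous content and style terms for point-cloud geometry is exactly the open question of this project:

```python
import torch

def gram_matrix(features):                 # features: (channels, height * width)
    """Second-order feature statistics used as the classical 'style' descriptor."""
    return features @ features.t() / features.shape[1]

def style_transfer_loss(content_feats, style_feats, generated_feats, style_weight=1e3):
    content_loss = torch.mean((generated_feats - content_feats) ** 2)
    style_loss = torch.mean((gram_matrix(generated_feats) - gram_matrix(style_feats)) ** 2)
    return content_loss + style_weight * style_loss

c = torch.randn(64, 1024)                  # toy "content" features
s = torch.randn(64, 1024)                  # toy "style" features
g = torch.randn(64, 1024, requires_grad=True)
loss = style_transfer_loss(c, s, g)
loss.backward()                            # gradients w.r.t. the generated features
```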

May 31, 2024 · Johan Ye

Adaptive, generalized and personalized preference models for speech enhancement

Background Speech enhancement, the process of improving the quality of speech signals, can not only improve the quality of experience for listeners and the quality of communication, but also aid the performance of machine- and deep-learning models in downstream tasks. However, the challenge of the trade-off between noise removal and artifact incorporation is ongoing [1]. The project aims to investigate the factors influencing noise reduction preferences and develop a technical framework around them. Low data resources will be an important consideration in this project. ...

May 27, 2024

Low-resource speech technology for healthcare

Background We are seeking students interested in advancing speech technology in low-resource environments. The project is sufficiently open-ended and will be focused on developing machine learning models and algorithms tailored to address the unique challenges posed by limited data and computational resources in speech processing, also in high-stakes applications like healthcare and education. Objective(s) Potential directions are: Research and develop novel machine learning techniques optimized for low-resource speech technology applications. Design and implement efficient algorithms for speech recognition, synthesis, and understanding in resource-constrained settings. Conduct experiments, analyze results, and iterate on models to continuously improve performance and robustness. Contribute to the development of tools and frameworks to streamline the deployment and evaluation of low-resource speech models. Requirements Need to have: ...

May 27, 2024

Characterizing Knowledge Graphs using Dirichlet-Multinomial Soft Stochastic Block Modeling

Background Knowledge graphs are widely used to represent various types of entity relationships, and several methodologies have been developed to characterize and predict their structure; see also [1,2] for surveys. Typically these characterizations are based on various approaches to characterizing similarities among the entities. Recently, it has been demonstrated that the Dirichlet-multinomial stochastic block model can be used to identify entity structures in terms of how they optimally differentiate in their relational structure across a set of graphs (i.e., across the relationships) [3,4]. This project will advance such modeling approaches to provide a scalable and new framework for the modeling of knowledge graphs in which entities are defined in terms of such optimally differentiating properties. ...

May 27, 2024 · Morten Mørup

Geometric Analysis of Deep Representations

Background Modern deep neural networks, especially those in the overparameterized regime with a very high number of parameters, perform impressively well. Traditional learning theories contradict these empirical results and fail to explain this phenomenon, leading to new approaches that aim to understand why deep learning generalizes. A common belief is that flat minima [1] in the parameter space lead to models with good generalization characteristics. For instance, these models may learn to extract high-quality features from the data, known as representations. However, it has also been shown that models with equivalent performance can exist at sharp minima [2, 3]. In this project, we will study from a geometric perspective the learned representations of deep learning models and their relationship to the sharpness of the loss landscape. We will also consider in the same framework additional aspects of training that enhance generalization. ...

May 27, 2024 · Georgios Arvanitidis

Geometric Bayesian Inference

Background Bayesian neural networks are a principled approach to learning the function that maps input to output while quantifying the uncertainty of the problem. Due to the computational complexity, exact inference is prohibitive, and among several approaches the Laplace approximation is a rather simple yet effective way to perform approximate inference [1]. Recently, a geometric extension relying on Riemannian manifolds has been proposed that enables the Laplace approximation to adapt to the local structure of the posterior [2]. This new Riemannian Laplace approximation is effective and meaningful, but it comes with an increased computational cost. In this project, we will consider techniques to: 1) improve the computational efficiency of the Riemannian Laplace approximation, and 2) provide a relaxation of the basic approach that is potentially fast while retaining the geometric characteristics. ...
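To fix ideas, a minimal (Euclidean, non-Riemannian) Laplace approximation for Bayesian logistic regression is sketched below; the prior precision, learning rate, and toy data are illustrative assumptions:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def laplace_logreg(X, y, prior_prec=1.0, lr=0.1, steps=2000):
    """Find the MAP by gradient ascent, then approximate the posterior by N(w_MAP, H^{-1})."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):                               # gradient ascent on the log posterior
        p = sigmoid(X @ w)
        grad = X.T @ (y - p) - prior_prec * w
        w += lr * grad / n
    p = sigmoid(X @ w)
    H = X.T @ (X * (p * (1 - p))[:, None]) + prior_prec * np.eye(d)  # Hessian of -log posterior
    return w, np.linalg.inv(H)

X = np.random.randn(200, 3)
y = (X @ np.array([1.5, -2.0, 0.5]) + 0.3 * np.random.randn(200) > 0).astype(float)
w_map, cov = laplace_logreg(X, y)
print(w_map, np.sqrt(np.diag(cov)))                      # MAP estimate and marginal std devs
```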

May 27, 2024 · Georgios Arvanitidis

Subnetwork Learning for Laplace Approximations

Background The Laplace approximation is a promising approach to posterior approximation which can address some core issues of deep learning, such as poor calibration. Scaling this method to large parameter spaces is intractable because the covariance matrix grows quadratically with the number of neural network parameters and hence cannot be stored in memory. A proposed solution to this problem is to treat only a subset of the parameters as stochastic [1, 2, 3] and treat the rest as deterministic. However, the method of selecting a subnetwork is still an open problem. In this project we will explore the possibility of learning an optimal subnetwork structure by instantiating the small covariance matrix and backpropagating through a Bayesian loss function (ELBO, marginal likelihood, predictive posterior distribution). ...
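A minimal sketch of a fixed-subnetwork Laplace approximation is shown below, treating only the last layer as stochastic with a diagonal empirical Fisher; it assumes an already-trained model and only illustrates the setup whose subnetwork selection this project would learn:

```python
import torch
import torch.nn as nn

# Assume `model` has already been trained to a MAP estimate; here it is freshly initialized.
model = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 2))
subnet_params = list(model[-1].parameters())             # treat only these as stochastic
prior_prec = 1.0

X, y = torch.randn(256, 10), torch.randint(0, 2, (256,))
fisher = [torch.zeros_like(p) for p in subnet_params]
for xi, yi in zip(X, y):                                 # diagonal empirical Fisher
    model.zero_grad()
    loss = nn.functional.cross_entropy(model(xi[None]), yi[None])
    loss.backward()
    for f, p in zip(fisher, subnet_params):
        f += p.grad ** 2

# Posterior precision = prior precision + Fisher; draw one sample of the subnetwork weights.
post_std = [1.0 / torch.sqrt(prior_prec + f) for f in fisher]
with torch.no_grad():
    for p, s in zip(subnet_params, post_std):
        p.add_(s * torch.randn_like(p))                  # perturb only the subnetwork
    probs = torch.softmax(model(X[:5]), dim=-1)
print(probs)
```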

May 24, 2024 · Hrittik Roy

Related items in large knowledge graphs

Background Wikidata has over 100 million items. Finding related items in such a large knowledge graph with an interactive application is a challenge. Wembedder is a simple system that works with RDF2vec and Gensim embeddings and runs as a web service from https://wembedder.toolforge.org/ providing interactive related-items search. However, it embeds only around half a million items and needs to be retrained whenever new additions or modifications appear in Wikidata. Wembedder runs on a platform that only provides a few gigabytes of memory, while embedding all of the over 100 million Wikidata items would require much more than that. ...
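The kind of lookup Wembedder performs can be sketched with Gensim as below; the file name and the use of Wikidata Q-identifiers as vocabulary keys are assumptions for illustration:

```python
from gensim.models import KeyedVectors

# Load precomputed Wikidata item embeddings (hypothetical file) and query nearest neighbours.
wv = KeyedVectors.load("wikidata_items.kv")
related = wv.most_similar("Q42", topn=10)     # e.g. the Wikidata item for Douglas Adams
for item_id, similarity in related:
    print(item_id, round(similarity, 3))
```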

May 22, 2024

Network Analysis to (Improve Stem) Cell Differentiation

Background Stem cells are a type of cell that can differentiate into other cell types as well as self-renew, which makes them of interest for many different medical applications. Please read https://stemcells.nih.gov/info/basics/stc-basics for a basic introduction to stem cells. The field of systems biology focuses on investigating complex biological processes instead of single entities, often by computationally modeling and analyzing the data as networks/graphs. Examples of such networks are protein-protein interaction networks, regulatory networks, or gene-gene co-expression networks. ...

May 21, 2024

Prediction of Drug Induced Gene Expression Perturbations through Drug Target and Protein-Protein Interaction Information

Background Transcriptomics provides insights into gene expression and with it the ability to analyze one of the fundamental processes of life - the translation from gene to protein. Single Cell RNA sequencing (scRNAseq) is a technology that measures transcriptomics at the single-cell level. However, biological data is highly complex, variable, and noisy, making it challenging to analyze and work with. The goal of the project is to evaluate if deep learning can infer gene expression profiles of specific conditions (exposures) by only receiving prior information about an exposure, such as a drug's known gene targets as well as a general protein-protein interaction network. The aim is to evaluate the model based on its zero-shot performance (e.g. on unseen drugs). ...

May 21, 2024

Transfer learning & Training of (Explainable) Deep Learning Model for Single Cell Transcriptomics

Background Transcriptomics provides insights into gene expression and with it the ability to analyse one of the fundamental processes of life - the translation from gene to protein. Single Cell RNA sequencing (scRNAseq) is a technology that measures transcriptomics at the single-cell level. However, biological data is highly complex, variable, and noisy, making it challenging to analyse and work with. By building on a pre-trained, general scRNAseq deep learning model, we want to fine-tune and train the model for specific tasks. Examples of existing models are Geneformer https://www.nature.com/articles/s41586-023-06139-9 or scGPT https://www.nature.com/articles/s41592-024-02201-0. If the student decides that existing models are not suitable, there is also the option to build/train from scratch. ...

May 21, 2024

Detecting consciousness in clinically unresponsive patients with brain injury

Background Each year, traumatic brain injury results in 1.5 million hospital admissions in the EU. Of all comatose patients with traumatic brain injury, 40% die in the ICU and 20% enter a prolonged disorder of consciousness, seemingly unaware of themselves and their environment. Recent studies indicate that 15-20% of these behaviorally unresponsive patients have residual (covert) consciousness. Detecting consciousness in those people is challenging, but of utmost importance since the presumed presence or absence of consciousness affects medical decisions about treatment, including prognosis and end-of-life decisions. Consciousness can be detected from measurements of brain activity even in patients who are unable to overtly respond. ...

December 7, 2023 · Tobias Andersen

Functional dependency between EEG oscillatory activity and pupil dilation

Background Working memory engages the frontal lobes and the autonomic nervous system. The activation of the frontal lobes can be seen as an increase in frontal midline theta (4-6 Hz) activity in the EEG signal. The activation of the autonomic nervous system can be seen as a dilation of the pupils. These two signals thus co-occur when humans use their working memory. The temporal dependency between the two signals is, however, poorly understood. We do not know, for example, to what extent we can predict the development of one signal from the other. A better understanding of this dependency may provide tools for diagnosing poor working memory function as seen in e.g. patients with dementia. It may also provide a useful tool for detecting extended working memory load and the fatigue that results from it. ...

December 7, 2023 · Tobias Andersen

Blind Non-linear Equalization Using Variational Autoencoders

Background In digital communication the goal is to send information, usually represented by bits, from A (transmitter, Tx) to B (receiver, Rx). At some point in this process, the bits “meet” the physical world in the form of a channel. In optical communication, light from a laser is used to carry the information that travels through an optical fiber and is then detected at the receiver using a photodiode. However, the optical fiber channel does not perfectly pass on the light, as the light is attenuated and distorted the farther it travels. ...

November 27, 2023 · Søren Føns Nielsen

Probabilistic Tensor Trains for Large-scale Linear Problems

Background This project will develop arithmetic operations for the Probabilistic Tensor Train Decomposition (Hinrich & Mørup, 2019) and apply them to solving large-scale linear problems, which are relevant for a variety of applications, e.g. in quantum computing/physics, machine learning, and finance. First, consider the linear problem, A x = b, where A, x, and b are an N by M matrix, an M by 1 vector, and an N by 1 vector, respectively. In large-scale linear problems, M and N are large numbers with exponential scaling, e.g. $M=m^d$ and $N=n^d$ for some m, n < 10 but d in the tens or hundreds, which leads to exponential computational and storage complexity for conventional methods. ...
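To illustrate why tensor-train representations help with this exponential scaling, the sketch below reshapes a length-$m^d$ vector into a $d$-way tensor and compresses it with the standard (non-probabilistic) TT-SVD; ranks and sizes are arbitrary choices:

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Decompose a d-way tensor into Tensor Train cores via sequential truncated SVDs (a sketch)."""
    dims = tensor.shape
    cores, rank = [], 1
    mat = tensor.reshape(dims[0], -1)
    for k in range(len(dims) - 1):
        U, S, Vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(S))
        cores.append(U[:, :r].reshape(rank, dims[k], r))
        mat = (np.diag(S[:r]) @ Vt[:r]).reshape(r * dims[k + 1], -1)
        rank = r
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

# A length m**d vector is stored as d small cores instead of one exponentially large array.
m, d = 4, 10
x = np.random.rand(m ** d)                    # ~1e6 entries already at d = 10
cores = tt_svd(x.reshape([m] * d), max_rank=5)
print(sum(c.size for c in cores), "core entries vs", x.size, "raw entries")
```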

November 23, 2023

Bayesian VAEs - Linearized Laplace Approximation - Can B-VAEs generate meaningful examples from its latent space?

Background Variational Auto-Encoders (VAEs) are useful models for learning non-linear latent representations of data. Usually, VAEs are learned by obtaining an approximation of the maximum likelihood estimate of the parameters through the evidence lower bound. In Bayesian Variational Auto-Encoders (B-VAEs), we instead obtain a posterior distribution over the parameters, leading to a more robust model, e.g., for out-of-distribution detection (Daxberger and Hernández-Lobato, 2019). Bayesian VAEs are typically trained by learning a variational posterior approximation of both the latent variables and the model parameters. ...
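Schematically, the standard VAE objective and its Bayesian extension can be written as follows (a generic form; exact formulations vary between papers):

$$\mathcal{L}(\theta,\phi;x) = \mathbb{E}_{q_\phi(z\mid x)}\big[\log p_\theta(x\mid z)\big] - \mathrm{KL}\big(q_\phi(z\mid x)\,\|\,p(z)\big), \qquad \mathcal{L}_{\text{B-VAE}} = \mathbb{E}_{q(\theta)}\big[\mathcal{L}(\theta,\phi;x)\big] - \mathrm{KL}\big(q(\theta)\,\|\,p(\theta)\big).$$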

November 15, 2023

Benchmarking Data Augmentation Techniques

Background Data augmentation is used to increase the diversity of a dataset by applying various transformations to the existing data, which helps improve model generalization, performance, and robustness while reducing the risk of overfitting. For this project, you will do a rigorous survey and comparison of existing methods and metrics. Currently, it is difficult to compare effects of different data augmentation strategies. Firstly, you will research existing relevant methods, metrics and underlying tasks in order to design a comprehensive data augmentation benchmark. You will evaluate methods using well-established benchmark datasets in a selected data domain, which could be images, time series, or molecular graph data. ...

November 15, 2023

CHEAT-GPT4I:Controlled Human-out-of-thE-loop AssisTed-GPT for Instructors *- automating exam production in the age of burned-out teachers.*

Background Being a teacher at DTU is challenging, facing severe issues of work-life balance; in particular, it is more exciting to work on the latest research than to be occupied by administrative tasks such as developing exam sets for assessments of student learning outcomes. For instance, in the course 02450 three exam sets are required to be generated per year, each of which requires a minimum of one week of full-time work to complete. ...

November 15, 2023

CHEAT-GPT4S: Controlled Human-out-of-thE-loop AssisTed-GPT for Students *- automating report production in the age of lazy students and evil teachers.*

Background Being a student at DTU is challenging, facing severe issues of work-life balance, in particular facing horrible teachers with unreasonable perceptions of what is fair in terms of course workloads. Recent efforts have tried to guide students using the laziness barometer of the course-analyzer. However, some study programmes still enforce tough courses on students that score unreasonably high on the required workload. One such example is the 02450 Introduction to Machine Learning and Data Mining course, which includes two reports during the semester, requiring extensive work to hand in a satisfactory report on time. ...

November 15, 2023

Computational Metacognition

Background Metacognition is thinking about thinking; see, e.g., Akturk and Sahin (2011). Metacognition in intelligent systems refers, for example, to processes/strategies for monitoring/controlling performance, uncertainty quantification, explainability, decision-making, and learning capabilities. Subcategories include metamemory, meta-awareness, and elements of social cognition (theory of mind). Can we translate tools for evaluating metacognitive awareness in humans for use in AI? Can we design tools for evaluating the levels of metacognition in machines? Objective(s) i) Review natural and artificial metacognition - similarities and contrasts ...

November 15, 2023

Effective simulation of spreading processes from privacy-preserving location data.

Background In recent times, mobile phone operators have been sharing aggregated location data with researchers to study real-world phenomena such as epidemic spreading. The aggregated data are not always appropriate for modeling contagion processes, due to their aggregated nature. On the other hand, aggregation is important to preserve individuals' privacy. How can we aggregate mobility data in a way that still enables us to effectively study contagion processes (such as epidemic spreading)? ...

November 15, 2023

Explainability of Multimodal Models

Background A multimodal model is any model that takes in one or more different modalities of data. This could be text and images, image and audio, etc. An example of this is the CLIP model [1], which learns embeddings of text and images simultaneously. Common methods for explainability generally focus on a single modality, e.g. what part of an image was important for a given prediction. The purpose of this project is to investigate how methods for single-modality explainability can be extended to multimodal data. ...

November 15, 2023

GPT^2^A: Generative Pretrained Transformers as Teaching Assistants *- scaling report evaluations in the age of limited TA resources.*

Background Correcting reports is a very time-consuming task in courses and is challenged by limited resources. Historically, we have a lot of data on carefully corrected reports using rubric evaluation criteria. This project will explore the use of large language models (LLMs), and in particular recent developments enhancing LLMs with multimodality (i.e., image comprehension) [2], to enable an automated report evaluation system. The data for the project will be historically evaluated 02450 Introduction to Machine Learning and Data Mining reports covering on the order of 6 years of two semesters, each with about 200 groups producing two reports. Report 1 contains 23 evaluation criteria on a Likert scale from 0 to 4, whereas report 2 contains 17 evaluation criteria. Additionally, an overall evaluation of report quality is also provided, which is used to assess the students' performance in the course. ...

November 15, 2023

Guided representation learning of medical signals using textual (meta) data

Background When designing and training machine learning architectures for medical data, we often neglect prior knowledge about the data domain. In this project, we want to investigate whether we can find shared information between medical signals and their corresponding textual descriptions. Current advances in the field of NLP have made it possible to learn rich contextual information from textual data. Given cross-modality observations, Relative Representations make it possible to compare and align latent spaces of different modalities or models. ...

November 15, 2023

Human Data Fusion

Background The project builds on multimodal human data: EEG, eye-tracking (gaze position with fixations and saccades, pupil size), posture, skin conductance, ECG (and HRV, ...), blood oxygen saturation, and breathing rate, together with contextual data. Key challenges include synchronization of data, handling multiple persons, and fusion of multi-modal data that has been recorded at different times (varying contexts). Project 1: Create a common framework to make it easier to apply ML methods to an otherwise heterogeneous set of data sources. ...

November 15, 2023

Identifying Archetypes in Social Media Users

Background The goal of this project is to find archetypes of social media users, exploiting highly complex individual sequences capturing the usage of different social media apps (Facebook, Instagram). Archetypal Analysis (AA) is a highly interpretable decomposition technique renowned for identifying distinct characteristics, known as archetypes, expressed as convex combinations of the data observations. The analysis will be based on a large-scale dataset collected from smartphones that captures app activity as well as other behaviors (physical activity, sleep) for ~1M individuals worldwide. ...
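For reference, a common formulation of the AA objective (a textbook form, not necessarily the exact variant that will be used here) writes the data matrix $X$ as convex combinations of archetypes that are themselves convex combinations of observations:

$$\min_{A,B}\ \|X - XBA\|_F^2 \quad \text{s.t.}\quad b_{ij}\ge 0,\ \textstyle\sum_i b_{ij}=1, \qquad a_{jk}\ge 0,\ \textstyle\sum_j a_{jk}=1,$$

where the columns of \(XB\) are the archetypes and \(A\) reconstructs each observation as a convex combination of those archetypes.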

November 15, 2023

Learning Data Augmentation

Background Data augmentation is used to increase the diversity of a dataset by applying various transformations to the existing data, which helps improve model generalization, performance, and robustness while reducing the risk of overfitting. For this project, you will begin by surveying existing methods and metrics. Afterwards, you will focus on creating new methods for learning a data augmentation scheme in order to optimize one or more selected metrics. You will evaluate your methods using well-established benchmark datasets in a selected data domain and task, which could be images, time series, or molecular graph data, e.g. for classification. ...

November 15, 2023

Sugar translation tool

Background Polysaccharides, also commonly known as sugars, are one of the main molecules in living beings, and they present a complex structure that may include a variety of residues and ramifications. There exist more than four systems for representing these structures, but in some cases they cannot be translated one-to-one. Researchers from some scientific backgrounds may be more used to one of these naming systems, which hinders communication across fields. Also, due to the complexity of polysaccharides, it is not straightforward for newcomers to understand these systems. ...

November 15, 2023

Unsupervised Speech Enhancement

Background Speech enhancement is the task of recovering clean speech from noisy speech that has been degraded due to e.g. background noise, interfering speakers or reverberation in poor acoustic conditions. Deep learning-based speech enhancement is typically trained in a supervised setup using datasets consisting of separate clean speech and background noise samples, which are combined at training time and the network is then trained to directly recover the clean speech from the artificially corrupted mixture. This artificial mixing step limits the diversity of data that models are exposed to and harms the resulting model’s ability to generalize to real-world conditions. ...
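For reference, the artificial mixing step described above can be sketched as follows (signal names, sampling rate, and SNR range are placeholders):

```python
import torch

def mix_at_random_snr(clean, noise, snr_db_range=(-5.0, 10.0)):
    """Combine clean speech and noise at a randomly drawn signal-to-noise ratio."""
    snr_db = torch.empty(1).uniform_(*snr_db_range)
    clean_power = clean.pow(2).mean()
    noise_power = noise.pow(2).mean()
    scale = torch.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise

clean = torch.randn(16_000)        # 1 second of toy "clean speech" at 16 kHz
noise = torch.randn(16_000)
noisy = mix_at_random_snr(clean, noise)
# A supervised enhancement model would then minimize e.g. torch.mean((model(noisy) - clean) ** 2).
```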

November 15, 2023