For Master's theses, Bachelor's theses or for Software Engineering projects in the Master's program
(Most topics can be adapted in scale to fit any of the above categories)
The goal of this project is to develop a web-based tool where a lecturer can ask questions and students can provide
answers interactively via a laptop or a mobile phone. The answers should be evaluated and visualized immediately
(individually or in summary). Supported question types should be multiple choice and free text (possibly also source code).
For developing the tool, simplicity and responsiveness are more important than the number of features.
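As a rough illustration of the "evaluated and visualized in summary" part, the following minimal sketch tallies multiple-choice answers. The function name, the answer format (student id mapped to chosen option) and the returned fields are assumptions made for this example, not part of the project description.

```python
from collections import Counter

def summarize_answers(answers, correct_option):
    """Summarize multiple-choice answers (hypothetical format: student id -> option)."""
    counts = Counter(answers.values())          # how often each option was chosen
    total = len(answers)
    correct = counts.get(correct_option, 0)
    share = correct / total if total else 0.0   # avoid division by zero for no answers
    return {"counts": dict(counts), "total": total, "correct_share": share}

# Example data, invented for illustration
summary = summarize_answers(
    {"s1": "B", "s2": "A", "s3": "B", "s4": "C"}, correct_option="B"
)
print(summary)
```

The summary dictionary could then feed a bar chart per option; free-text and source-code answers would need a different evaluation step.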
Time Series-based Event Prediction with Metadata in a Multi-System Environment
Machine learning models that can predict performance-relevant events are of high
interest when dealing with large software systems. However, due to their huge
sizes and diversity, creating appropriate machine learning models might require
finding common subparts of these systems to learn from similar parts rather than
from entire but diverse systems.
The goal of this thesis is to use monitoring time series data, performance event
data and metadata of a multi-system environment to train and test machine learning
models to predict these performance events. The metadata should be used to find
similarities among the systems and their components, and then to utilize these
similarities to create machine learning models.
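One simple way to think about "similarities among the systems" is a set-based similarity over metadata tags. The sketch below uses Jaccard similarity as a stand-in; the metadata format, the tag names, the threshold and the function names are all assumptions for illustration, and the thesis may well call for a different similarity measure.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |a & b| / |a | b| (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def similar_pairs(metadata, threshold=0.5):
    """Return (system_x, system_y, score) for all pairs with score >= threshold."""
    names = sorted(metadata)
    pairs = []
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            score = jaccard(metadata[x], metadata[y])
            if score >= threshold:
                pairs.append((x, y, score))
    return pairs

# Hypothetical metadata tags per system
metadata = {
    "sys_a": {"os:linux", "db:postgres", "app:java"},
    "sys_b": {"os:linux", "db:postgres", "app:dotnet"},
    "sys_c": {"os:windows", "db:mssql", "app:dotnet"},
}
pairs = similar_pairs(metadata)
print(pairs)
```

Systems that end up in the same similarity group could then share training data for one model instead of training on each diverse system separately.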
Resource Exhaustion Prediction in a Multi-System Environment
Resource exhaustion, such as high CPU load, low memory, or low disk space, is a common
problem in software systems. One way of dealing with such problems is to create
predictive models to detect early signs of exhaustion, which enables administrators
to take actions before the actual exhaustion occurs.
The goal of this thesis is to use monitoring time series data of a multi-system
environment to (1) derive exhaustion events with a custom heuristic that indicates
resource exhaustion, and (2) train and test machine learning models to predict
these exhaustion events.
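The two steps can be sketched as follows. This is only one possible heuristic, assumed for illustration: an exhaustion event starts where a metric stays at or above a threshold for a minimum number of consecutive samples, and a sample is labeled positive if such an event starts within a given horizon after it. The threshold, duration, horizon and function names are hypothetical.

```python
def exhaustion_events(values, threshold, min_duration):
    """Start indices of runs where values stay >= threshold for >= min_duration samples."""
    events = []
    run_start = None
    for i, v in enumerate(values):
        if v >= threshold:
            if run_start is None:
                run_start = i
            # Record the event once, at the moment the run reaches min_duration
            if i - run_start + 1 == min_duration:
                events.append(run_start)
        else:
            run_start = None
    return events

def prediction_labels(n, events, horizon):
    """Label sample i with 1 if an event starts at some index in (i, i + horizon]."""
    labels = [0] * n
    for start in events:
        for i in range(max(0, start - horizon), start):
            labels[i] = 1
    return labels

# Invented CPU load samples in percent
cpu = [40, 55, 70, 92, 95, 97, 60, 91, 50]
events = exhaustion_events(cpu, threshold=90, min_duration=3)
labels = prediction_labels(len(cpu), events, horizon=2)
print(events, labels)
```

The resulting labels would serve as the prediction target when training and testing models on features derived from the time series.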
Time Series-based Predictive Maintenance with RapidMiner
Prediction of failures and events plays an important role in today's software systems. An open challenge
is to find similarities in failures across different systems. The sheer amount of available monitoring
and event data in this field requires tool support, such as RapidMiner,
a powerful data science environment that enables fast prototyping and validation of predictive models.
The goal of this thesis is to use RapidMiner to explore monitoring time series data and event data of a
multi-system environment, and to investigate and compare various predictive maintenance approaches,
thereby utilizing the different parts of RapidMiner's workflow pipeline, including data preparation and cleansing.
Low-Overhead Debugging with Sulong (Java)
Sulong is an interpreter for LLVM IR, an intermediate representation of source code that can be produced by the Clang
compiler for the C family of programming languages. It is based on the Truffle
framework for implementing interpreters for programming languages and is part of the GraalVM
project. In addition to executing programs that were compiled to LLVM IR, Sulong also supports GraalVM's integrated debugging framework to allow users to debug these programs at source-level. At the moment, this debugging support is aimed at providing correct values for all possible symbols. This, however, comes at the cost of a significant run-time overhead since it prevents or even requires undoing several performance optimizations introduced by both Clang and GraalVM.
The goal of this project is to implement an alternative performance mode
for source-level debugging with Sulong. Instead of providing current values for all source-level symbols, in this mode the debugger should display only symbols whose values are still available despite optimizations. This will require analyzing the LLVM IR executed by Sulong and the debug information
it contains. Once GraalVM's debugging framework has halted the running program and requests symbol information from it, the values referenced in the debug information can be compared against the current program state to determine which values are still available. For this project, no knowledge about compiler optimizations performed by either Clang or GraalVM is required.
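The availability check described above can be illustrated with a small, language-neutral sketch. It is written in Python for brevity and does not use Sulong's or GraalVM's actual APIs; the mapping from source-level symbols to IR value names and the dictionary of live values are assumptions made for this example.

```python
def available_symbols(debug_info, live_values):
    """Return {symbol: value} for symbols whose referenced IR value is still live.

    debug_info:  hypothetical map from source-level symbol to IR value name
    live_values: hypothetical map from IR value name to its current value
    """
    return {
        symbol: live_values[value_id]
        for symbol, value_id in debug_info.items()
        if value_id in live_values   # optimized-away values are simply skipped
    }

# Invented example: "%7" is no longer present in the program state
debug_info = {"count": "%3", "tmp": "%7", "ptr": "%9"}
live_values = {"%3": 42, "%9": 4096}
visible = available_symbols(debug_info, live_values)
print(visible)
```

In this mode the debugger would show `count` and `ptr` and omit `tmp`, rather than undoing optimizations to recover it.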
Automatic Debugging Assistant for Actor Programs (Java)
Debugging concurrent programs can be difficult due to their non-determinism, as bugs may occur only under rare circumstances.
The problem of non-determinism can be solved by employing record & replay, which captures those circumstances and allows for deterministic reproduction of a program execution.
What remains is the challenge of debugging a complex program: finding the right places to set breakpoints and choosing the right stepping operations requires trial and error.
The goal of this project is to develop debugging features for SOMns (a Truffle language, implemented in Java) that guide the debugging efforts towards the bug.
When the program crashes, we have an execution trace that allows us to reproduce the execution, and a stack trace of that error.
Using both traces we can analyse when and where a breakpoint is useful, and allow developers to focus on inspecting the program to find the cause of the bug.
Reference implementation of SOMns record and replay in GraalJS (Java)
SOMns is a research language similar to Smalltalk. It provides specialised
debugging support for its concurrency models, e.g. record & replay. Record
and replay debugging is based on the idea of recording a program trace that
allows one to deterministically reproduce an execution (including bugs). SOMns and
GraalJS are both implemented in Java with the Truffle framework and use similar
concurrency models. The goal of this thesis is to reimplement
the record and replay strategy from SOMns in GraalJS. In addition, recording
performance of the GraalJS implementation should be evaluated with benchmarks.
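The idea behind record and replay can be shown with a toy sketch. It is not the SOMns strategy: a random shuffle stands in for non-deterministic message scheduling, and the function name, message format and trace format are assumptions for illustration only.

```python
import random

def run(messages, trace=None, rng=None):
    """Process messages in some order and return (result, order).

    With trace=None the order is chosen non-deterministically and recorded;
    with a trace, the recorded order is replayed exactly.
    """
    if trace is None:
        order = list(range(len(messages)))
        (rng or random).shuffle(order)   # stands in for non-deterministic scheduling
    else:
        order = list(trace)              # replay: force the recorded order
    state = 0
    for idx in order:
        state = state * 2 + messages[idx]   # result depends on processing order
    return state, order

# Record one execution, then reproduce it deterministically
result, trace = run([1, 2, 3])
replayed, replay_order = run([1, 2, 3], trace=trace)
print(result == replayed)
```

A real implementation records far less than a full order where possible, which is why the recording overhead is worth measuring with benchmarks.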
Enhancing the AcmeAir benchmark application
AcmeAir is a simple web application that represents the booking system of an airline.
It is used to evaluate the run-time performance of debugging tools in our SOMns language
implementation. Currently, AcmeAir supports a limited set of operations that depend heavily
on the database. The goal of this project is to enhance AcmeAir with features one
could find in a real booking system, for example multiple options for finding flight
connections. Additionally, the JMeter configuration used to drive the benchmark needs
to be updated to use the new features.
The Truffle Framework allows you to write code that interoperates between
different languages. In this project, you should prepare test cases to see how well-behaving
complete set of unit tests and an analysis of the failing behaviour.
NUMA support for the G1 Garbage Collector
On multi-socket systems, memory access time depends on the memory location
relative to the processor (locality group): access latency for "closer" memory is significantly
lower than for memory that is attached to a different processor.
Currently, G1 does not exploit this, neither improving access locality nor at least preserving it.
Goals for this task could include implementation of the common heuristics used in literature that
keep objects in the same locality group as long as possible, like a) let G1 keep data in the
same locality group in the young generation and try to evenly spread data across locality
groups in old generation; or b) try to keep locality in both young and old generation.
Measure the impact of these strategies across a set of industry benchmarks and analyze
other areas in the garbage collector that might benefit from NUMA awareness and potentially