Open Projects

For Master's theses, Bachelor's theses or for Software Engineering projects in the Master's program

(Most topics can be adapted in scale to fit any of the above categories)

  • Taint Tracking for Strings in Graal.js (Java, JavaScript)
    Dynamic taint tracking is a popular analysis technique that tracks sensitive data as it flows through an executing program by marking it as tainted. Taint tracking is commonly used to prevent attackers from exploiting program vulnerabilities by injecting malicious inputs and to prevent sensitive information from leaking to untrusted third parties. One way of implementing taint tracking is to extend a runtime environment for the targeted programming language. For example, when a program receives a string value over the network from an untrusted source, the runtime may store this string value using a "tainted string" data type. This runtime could then be hardened against code injection attacks by not executing code stored in such a tainted string.
    In this project, the Graal.js JavaScript runtime should be extended to support a data type for tainted strings that stores taint information on a per-character basis. This data type should support all operations that regular strings do. When a tainted string value is concatenated with other strings, even regular ones, the resulting value should also be a tainted string that retains the taint information of the original strings. As a goal of this project, builtin functions should be available to create tainted strings from regular strings, and JavaScript's builtin "eval" function should be prevented from executing code stored in such a tainted string value. The concrete scope of the project can be adapted in consultation with the supervisor to be suitable for a bachelor thesis, master project, or master thesis.
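    The idea of per-character taint can be illustrated with a minimal Java sketch. It is not Graal.js code: the class TaintedString, its methods, and the guardedEval stand-in for a hardened "eval" are hypothetical names chosen for illustration, and taint is tracked per UTF-16 code unit.

```java
import java.util.Arrays;

/** Hypothetical sketch: a string that carries one taint flag per character. */
final class TaintedString {
    private final String value;
    private final boolean[] taint; // taint[i] is true if value.charAt(i) is tainted

    private TaintedString(String value, boolean[] taint) {
        this.value = value;
        this.taint = taint;
    }

    /** Wraps a regular string; no character is tainted. */
    static TaintedString untainted(String s) {
        return new TaintedString(s, new boolean[s.length()]);
    }

    /** Marks every character of s as tainted. */
    static TaintedString tainted(String s) {
        boolean[] flags = new boolean[s.length()];
        Arrays.fill(flags, true);
        return new TaintedString(s, flags);
    }

    /** Concatenation keeps the per-character taint of both operands. */
    TaintedString concat(TaintedString other) {
        boolean[] flags = Arrays.copyOf(taint, taint.length + other.taint.length);
        System.arraycopy(other.taint, 0, flags, taint.length, other.taint.length);
        return new TaintedString(value + other.value, flags);
    }

    /** Substrings keep the taint flags of the selected characters. */
    TaintedString substring(int begin, int end) {
        return new TaintedString(value.substring(begin, end),
                                 Arrays.copyOfRange(taint, begin, end));
    }

    boolean isTaintedAt(int index) {
        return taint[index];
    }

    boolean isTainted() {
        for (boolean flag : taint) {
            if (flag) return true;
        }
        return false;
    }

    @Override
    public String toString() {
        return value;
    }

    /** Stand-in for a hardened eval: refuses code that contains tainted characters. */
    static String guardedEval(TaintedString code) {
        if (code.isTainted()) {
            throw new SecurityException("refusing to evaluate tainted code");
        }
        return "evaluated: " + code;
    }
}
```

    In this sketch, concatenating TaintedString.untainted("x = ") with TaintedString.tainted("1+1") yields a value whose first four characters are untainted and whose last three are tainted, so guardedEval rejects it. A real implementation inside the runtime would likely use a more compact taint representation than one boolean per character.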

  • Multi-Language Benchmarks for a New Dynamic Taint Analysis Platform (C/C++, JavaScript, Python)
    Dynamic taint analysis is a program analysis technique in which a program is instrumented to track the flow of sensitive data. It can be used to prevent information leaks and to uncover security vulnerabilities, and it also has applications in several other fields. TruffleTaint is a novel dynamic taint analysis platform based on GraalVM, a multi-language virtual machine, and aims to track sensitive data across the language boundary with little overhead in execution time.
    The goal of this project is to use common benchmarks, such as those from the *Are we fast yet?* benchmark suite, to evaluate TruffleTaint's run-time overhead and data tracking capability. To this end, all benchmark programs must be implemented using combinations of C/C++, JavaScript and Python code by using GraalVM's APIs for language interoperability. Furthermore, each benchmark program must also use TruffleTaint's API to mark certain data which the benchmark operates on as sensitive and check that its flow is properly tracked.

  • Coco/R Parser for IEC 61131-3 Structured Text (Java)
    IEC 61131-3 is a standard for programming languages for Programmable Logic Controllers (PLCs). One language within this standard is Structured Text, which is similar to Pascal. In this project, a Coco/R parser that builds an abstract syntax tree for Structured Text programs should be created.

  • Data Flow and Call Graph Analysis for IEC 61131-3 Structured Text Programs (Java, optional Kotlin)
    IEC 61131-3 is a standard for programming languages for Programmable Logic Controllers (PLCs). One language within this standard is Structured Text, which is similar to Pascal. In this project, data flow graphs and call graphs should be created for Structured Text programs. A data flow graph represents the data dependencies between program statements. A call graph represents the call dependencies between procedures. The input is an abstract syntax tree of a Structured Text program (see project above). Remark: It is also possible to split the task into two independent projects, one for the data flow graph and one for the call graph.
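    Both graph kinds can be illustrated with a small Java sketch. It does not operate on a real Structured Text AST: the Assign class is a hypothetical stand-in for an assignment statement, the data flow part handles only straight-line code without branches or loops, and the call graph is assumed to be given as a map from each procedure to its callees.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class StGraphsSketch {
    /** Hypothetical stand-in for an assignment: target := f(uses...). */
    static final class Assign {
        final String target;
        final List<String> uses;

        Assign(String target, String... uses) {
            this.target = target;
            this.uses = List.of(uses);
        }
    }

    /**
     * Data flow edges for straight-line code. Each edge {d, u} says that
     * statement u reads a variable whose most recent definition is statement d.
     */
    static List<int[]> dataFlowEdges(List<Assign> statements) {
        Map<String, Integer> lastDefinition = new HashMap<>();
        List<int[]> edges = new ArrayList<>();
        for (int i = 0; i < statements.size(); i++) {
            Assign stmt = statements.get(i);
            for (String variable : stmt.uses) {
                Integer def = lastDefinition.get(variable);
                if (def != null) {
                    edges.add(new int[] {def, i});
                }
            }
            // Record the definition only after the uses, so "a := a + 1"
            // depends on the previous definition of a, not on itself.
            lastDefinition.put(stmt.target, i);
        }
        return edges;
    }

    /** Procedures transitively reachable from root in a caller -> callees map. */
    static Set<String> reachable(Map<String, List<String>> calls, String root) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> worklist = new ArrayDeque<>();
        worklist.push(root);
        while (!worklist.isEmpty()) {
            String procedure = worklist.pop();
            if (seen.add(procedure)) {
                for (String callee : calls.getOrDefault(procedure, List.of())) {
                    worklist.push(callee);
                }
            }
        }
        return seen;
    }
}
```

    For the statements a := 1; b := a + 2; c := a + b, the sketch produces the edges 0→1, 0→2 and 1→2. Control flow (IF, CASE, loops) requires a proper reaching-definitions analysis, which this sketch deliberately leaves out.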

  • Low Overhead Neural Network Predictors in a Dynamic Compiler (Java)
    An ongoing research project applies machine learning to dynamic compilers. In contrast to static compilers, the compile-time overhead of loading a neural network and running inference directly impacts overall run time. In this project, a slim ANN predictor should be implemented in the Graal compiler and compared to existing, more versatile ML frameworks (DL4J, Tribuo). The question is whether a slim predictor implemented solely in Java can outperform complex frameworks when only prediction is needed.
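    What "slim" could mean is sketched below in plain Java: inference for a network with one hidden layer, using only arrays and no external framework. The class name SlimPredictor, the network shape (ReLU hidden layer, single sigmoid output) and the weight layout are assumptions for illustration; the weights are assumed to have been trained elsewhere.

```java
/** Hypothetical sketch: inference-only predictor with one hidden layer. */
final class SlimPredictor {
    private final double[][] hiddenWeights; // [hiddenUnit][input]
    private final double[] hiddenBias;      // [hiddenUnit]
    private final double[] outputWeights;   // [hiddenUnit]
    private final double outputBias;

    SlimPredictor(double[][] hiddenWeights, double[] hiddenBias,
                  double[] outputWeights, double outputBias) {
        if (hiddenWeights.length != hiddenBias.length
                || hiddenWeights.length != outputWeights.length) {
            throw new IllegalArgumentException("layer sizes do not match");
        }
        this.hiddenWeights = hiddenWeights;
        this.hiddenBias = hiddenBias;
        this.outputWeights = outputWeights;
        this.outputBias = outputBias;
    }

    /** Returns a value in (0, 1); allocates nothing on the prediction path. */
    double predict(double[] input) {
        double output = outputBias;
        for (int h = 0; h < hiddenWeights.length; h++) {
            double sum = hiddenBias[h];
            double[] row = hiddenWeights[h];
            for (int i = 0; i < input.length; i++) {
                sum += row[i] * input[i];
            }
            output += outputWeights[h] * Math.max(0.0, sum); // ReLU activation
        }
        return 1.0 / (1.0 + Math.exp(-output)); // sigmoid activation
    }
}
```

    Because the forward pass is two nested loops over primitive arrays, it avoids model loading, tensor abstractions and allocation during prediction, which are the overheads a comparison against general-purpose frameworks would measure.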

  • New JavaScript Language features - ECMAScript proposals (Java, some JavaScript)
    JavaScript is specified in the ECMAScript language specification. It is an evolving language that is improved through a "proposal" process: each new or improved feature is specified by one proposal. Currently open proposals include Temporal (a date/time library), optional chaining (avoiding null value exceptions), decorators (similar to annotations in Java), additional methods for the Set builtin, and many more. Because the proposals differ vastly in the effort required to implement them, we have topics for projects (project in software engineering), bachelor theses and master theses. The task is to fully implement the current state of the proposal in the GraalVM/Graal.js JavaScript engine.
    Contact: Dr. Wirth

  • Humongous Object Aware Region Allocation (C++)
    The HotSpot G1 garbage collector is a regional collector: the Java heap is strictly split into same-sized regions. Objects larger than a single region ("humongous objects") are allocated using separate contiguous sets of regions, and are unmovable for performance reasons. This poses a few problems, for example:
    • At the end of the last region occupied by a humongous object, there is often a significant amount of space that is effectively wasted and unavailable for allocation.
    • Region-level fragmentation caused by never moving these objects can lead to unexpected out-of-memory situations if there are not enough contiguous regions left for a new allocation.
    This project could lessen these problems by implementing one or more changes to the existing heap management strategy, for example better region selection for evacuation and placement, automatic region-level defragmentation, over-provisioning of the heap area, more aggressive reclamation of humongous objects, and allocation of regular objects in the space remaining after a humongous object.
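    The two problems can be made concrete with a little arithmetic. The following Java sketch is only an illustration and not HotSpot code (which is written in C++): the class and method names are hypothetical, the heap is modelled as an array of free/used flags, and a simple first-fit search stands in for the real region selection.

```java
/** Hypothetical model of humongous allocation in a region-based heap. */
final class HumongousSketch {
    /** Number of contiguous regions needed to hold an object of the given size. */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes; // ceiling division
    }

    /** Bytes left unused at the end of the last region of the object. */
    static long wastedTailBytes(long objectBytes, long regionBytes) {
        return regionsNeeded(objectBytes, regionBytes) * regionBytes - objectBytes;
    }

    /**
     * First-fit search: index of the first run of `needed` contiguous free
     * regions, or -1 if no such run exists even though enough free regions
     * may be available in total.
     */
    static int findContiguousFree(boolean[] free, int needed) {
        if (needed <= 0) {
            throw new IllegalArgumentException("needed must be positive");
        }
        int run = 0;
        for (int i = 0; i < free.length; i++) {
            run = free[i] ? run + 1 : 0;
            if (run == needed) {
                return i - needed + 1;
            }
        }
        return -1;
    }
}
```

    With 1 MiB regions, an object of 2.5 MiB occupies three regions and leaves 0.5 MiB unused at the end of the third. In a heap whose free regions are laid out as free, used, free, free, used, free, four regions are free in total, yet a request for three contiguous regions fails.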
    Contact: DI Schatzl

  • G1 garbage collector Full GC improvements (C++)
    The HotSpot G1 garbage collector did not receive a parallel full-heap collector until JDK 10. It uses a parallelized mark-sweep-compact algorithm. While its performance is on par with the Full GC algorithm of the Parallel GC, there are opportunities to improve the algorithm with respect to work distribution, exploiting pre-existing work, and handling various edge cases better.
    Contact: DI Schatzl