Academic Job Offers

Do you want to join our team as a PhD student or long-term full staff member? We encourage good students to send an application (including a motivation letter, CV, etc.) to Prof. Stiefelhagen (rainer.stiefelhagen∂kit.edu).

Please note that we do not offer summer internships.

 

HiWi Job Offers

Open position: HiWi Linux Research Infrastructure (Virtualization, GPU Compute Servers)

Student Assistant — Scientific Linux Infrastructure

Support the administration of our Linux-based research servers, identity management, and virtualization infrastructure. This is not first-level IT support — the focus is on scientific computing, working alongside the responsible staff.

Tasks

GPU server setup and integration; administration of the Linux domain, Kerberos, and LDAP; work on LXC containers and KVM/QEMU VMs, including migration from LXC to VM-based setups.

Occasional workstation setup and minor hardware work on research systems (e.g. robotics). Almost no routine office IT.

Required skills

Solid Linux experience and a strong interest in server administration. Useful: shell scripting, networking basics, Kerberos/LDAP, LXC, KVM/QEMU.

What you gain

Hands-on experience with research-grade Linux infrastructure: server administration, virtualization, networking, identity management, and scientific computing platforms (+ money).

Interested? Contact CVHCI Admins. Online since April 2026.

HiWi for the SMART AGE project [pdf] (online since September 2021)

HiWi for Mobility Assistance Systems (iOS User Interface Development) [pdf] (online since February 2025)

HiWi for Mobility Assistance Systems (Hardware Integration) [pdf] (online since February 2025)

Bachelor/Master Theses

Vision in Robotics

Perception and action for robotic agents — vision–language–action models, world action models, object segmentation, grasp planning.

| Topic | Level | Supervisor | Online since | Description |
| --- | --- | --- | --- | --- |
| Various topics related to Vision–Language–Action (VLA) Models, World Action Models (WAM), Robotics Simulation (Isaac Sim), object detection, segmentation, grasp planning, etc. (tags: VLA, WAM, Isaac Sim, Robotics) | MA | D. Schneider | April 2026 | Open supervision across robotic perception and action learning. Multiple concrete directions are available; contact David Schneider to shape a topic that fits your background. |
| Cross-image visual prompting for open-vocabulary detection with SAM-3 (tags: SAM3, Object Detection, Object Segmentation, Visual Prompting) | MA | D. Schneider | April 2026 | Transfer visual prompts across images to enable open-vocabulary object detection. |

More in this area: further topics in robotic perception, embodied learning, and manipulation — contact David Schneider.

Human motion, label noise, domain generalization & GenAI

Activity recognition, muscle activation estimation, synthetic data generation, domain adaptation and generalization.

| Topic | Level | Supervisor | Online since | Description |
| --- | --- | --- | --- | --- |
| A robust approach towards imperfect scribble semantic segmentation (available) | MA | Kunyu Peng | December 2024 | Label-noise-tolerant learning from scribble annotations for semantic segmentation. |
| Referring object tracking in long video (available) | MA | Kunyu Peng | August 2024 | Language-guided object tracking across long-horizon video sequences. |
| LLMs for human muscle activation interpretation / estimation | MA | D. Schneider | April 2026 | Use large language models to interpret and estimate muscle activation patterns from motion data. |

More in this area: further topics on deep learning for human activity understanding are available — contact David Schneider or Kunyu Peng.

Document analysis & Vision–Large Language Models

Layout analysis, retrieval-augmented generation, anomaly detection, and unified representations for complex documents.

| Topic | Level | Supervisor | Online since | Description |
| --- | --- | --- | --- | --- |
| Document anomaly detection | MA | O. Moured · Y. Chen | February 2025 | Detect anomalous regions, artifacts, or forgeries in complex document layouts. |
| Multilingual few-shot document layout analysis | MA | O. Moured · Y. Chen | February 2025 | Few-shot layout analysis that generalizes across scripts and languages. |
| Vision-based long-document information retrieval | MA | O. Moured · Y. Chen | February 2025 | Retrieve information from long documents end-to-end with vision–language models. |
| Unified document representation | MA | O. Moured · Y. Chen | March 2025 | A single representation spanning text, layout, and visual elements of documents. |

More in this area: further theses on document analysis, generation, and VLLMs with RAG — contact O. Moured or Yufan Chen.

Medical computer vision

Surgical video understanding, interactive segmentation, and learning under generalization, adaptation and data-scarcity constraints.

| Topic | Level | Supervisor | Online since | Description |
| --- | --- | --- | --- | --- |
| Benchmarking hallucinations in multi-turn surgical video dialogue (new) | MA | K. Peng · J. Wei | December 2025 | Evaluate VLLM hallucinations in multi-turn dialogue grounded on surgical video. |

More in this area: interactive segmentation for medical image analysis — contact Z. Marinov. Medical CV under generalization, adaptation and data-scarcity — contact Simon Reiß.

Scene segmentation & understanding

Open-ended scene understanding, vision–language, and embodied AI.

| Topic | Level | Supervisor | Online since | Description |
| --- | --- | --- | --- | --- |
| Computer vision for real-world scene understanding | MA | J. Zhang | February 2025 | Open-ended scene understanding beyond closed-vocabulary benchmarks. |

More in this area: vision-and-language and embodied AI — contact J. Zheng.

Other topics

Visual in-context learning, few-shot, self-/semi-supervised and data-centric learning — contact Simon Reiß.

Your own proposal

Don’t see a matching topic? We welcome your own proposals in any of the areas above. Please reach out to the supervisor whose group best matches your interests.

Additional open positions across KIT are listed at ACCESS@KIT.