Academic Job Offers

Would you like to join our team as a PhD student or long-term full staff member? We encourage strong students to send an application (including motivation letter, CV, etc.) to Prof. Stiefelhagen (rainer stiefelhagen∂kit edu).

Please note that we do not offer summer internships.

HiWi Job Offers

Open position

Student Assistant — Scientific Linux Infrastructure

Support the administration of our Linux-based research servers, identity management, and virtualization infrastructure. This is not first-level IT support — the focus is on scientific computing, working alongside the responsible staff.

Tasks

GPU server setup and integration; administration of the Linux domain, Kerberos, and LDAP; work on LXC containers and KVM/QEMU VMs, including migration from LXC to VM-based setups.

Occasional workstation setup and minor hardware work on research systems (e.g. robotics). Almost no routine office IT.

Required skills

Solid Linux experience and a strong interest in server administration. Useful: shell scripting, networking basics, Kerberos/LDAP, LXC, KVM/QEMU.

What you gain

Hands-on experience with research-grade Linux infrastructure: server administration, virtualization, networking, identity management, and scientific computing platforms (+ money).

Interested? Contact CVHCI Admins. Online since April 2026.

HiWi for the SMART AGE Project [pdf] (online since September 2021)

HiWi for Mobility Assistance Systems (iOS User Interface Development) [pdf] (online since February 2025)

HiWi for Mobility Assistance Systems (Hardware Integration) [pdf] (online since February 2025)

Bachelor/Master Theses

Vision in Robotics

Perception and action for robotic agents — vision–language–action models, world action models, object segmentation, grasp planning.

Topic	Level	Supervisor	Online since	Description
Various topics related to Vision–Language–Action (VLA) Models, World Action Models (WAM), Robotics Simulation (Isaac Sim), object detection, segmentation, grasp planning, etc. Open supervision across robotic perception and action learning. Multiple concrete directions available — contact David Schneider to shape a topic that fits your background.	`MA`	D. Schneider	April 2026	Contact
Cross-image visual prompting for open-vocabulary detection with SAM-3: Transfer visual prompts across images to enable open-vocabulary object detection.	`MA`	D. Schneider	April 2026	PDF

More in this area: further topics in robotic perception, embodied learning, and manipulation — contact David Schneider.

Human motion, label noise, domain generalization & GenAI

Activity recognition, muscle activation estimation, synthetic data generation, domain adaptation and generalization.

Topic	Level	Supervisor	Online since	Description
A robust approach towards imperfect scribble semantic segmentation (available): Label-noise-tolerant learning from scribble annotations for semantic segmentation.	`MA`	Kunyu Peng	December 2024	PDF
Referring object tracking in long video (available): Language-guided object tracking across long-horizon video sequences.	`MA`	Kunyu Peng	August 2024	PDF
LLMs for human muscle activation interpretation / estimation: Use large language models to interpret and estimate muscle activation patterns from motion data.	`MA`	D. Schneider	April 2026	PDF

More in this area: further topics on deep learning for human activity understanding are available — contact David Schneider or Kunyu Peng.

Document analysis & Vision–Large Language Models

Layout analysis, retrieval-augmented generation, anomaly detection, and unified representations for complex documents.

Topic	Level	Supervisor	Online since	Description
Document anomaly detection: Detect anomalous regions, artifacts, or forgeries in complex document layouts.	`MA`	O. Moured · Y. Chen	February 2025	PDF
Multilingual few-shot document layout analysis: Few-shot layout analysis that generalizes across scripts and languages.	`MA`	O. Moured · Y. Chen	February 2025	PDF
Vision-based long-document information retrieval: Retrieve information from long documents end-to-end with vision–language models.	`MA`	O. Moured · Y. Chen	February 2025	PDF
Unified document representation: A single representation spanning text, layout, and visual elements of documents.	`MA`	O. Moured · Y. Chen	March 2025	PDF

More in this area: further theses on document analysis, generation, and VLLMs with RAG — contact O. Moured or Yufan Chen.

Medical computer vision

Surgical video understanding, interactive segmentation, and learning under generalization, adaptation and data-scarcity constraints.

Topic	Level	Supervisor	Online since	Description
Benchmarking hallucinations in multi-turn surgical video dialogue (new): Evaluate VLLM hallucinations in multi-turn dialogue grounded in surgical video.	`MA`	K. Peng · J. Wei	December 2025	PDF

More in this area: interactive segmentation for medical image analysis — contact Z. Marinov. Medical CV under generalization, adaptation and data-scarcity — contact Simon Reiß.

Scene segmentation & understanding

Open-ended scene understanding, vision–language, and embodied AI.

Topic	Level	Supervisor	Online since	Description
Computer vision for real-world scene understanding: Open-ended scene understanding beyond closed-vocabulary benchmarks.	`MA`	J. Zhang	February 2025	PDF

More in this area: vision-and-language and embodied AI — contact J. Zheng.

Your own proposal

Don’t see a matching topic? We welcome your own proposals in all of the areas above. Please reach out to the supervisor whose group best matches your interests.

Additional open positions across KIT are listed at ACCESS∂KIT.