PhD defense: Abdolrahim KADKHODAMOHAMMADI
Team: AVR
Title: 3D Detection and Pose Estimation of Medical Staff in Operating Rooms using RGB-D Images
Abstract: In this thesis, we address the problems of person detection and pose estimation in Operating Rooms (ORs), which are key ingredients for developing many applications in such environments, such as surgical activity recognition, surgical skill analysis and radiation safety monitoring. Because the OR has strict sterilization requirements and the surgical workflow should not be disrupted, cameras are currently one of the least intrusive options that can be conveniently installed in the room to sense the environment. Even though recent vision-based human detection and pose estimation methods have achieved fairly promising results on standard computer vision datasets, we show that they do not necessarily generalize well to challenging OR environments. The main challenges are the presence of many visually similar surfaces, loose and textureless clinical clothes, clutter, occlusions and crowding. To address these challenges, we propose to use a set of compact RGB-D cameras installed on the ceiling of the OR. Such cameras capture the environment with two inherently different sensors and therefore provide complementary information about the surfaces present in the scene, namely their visual appearance and their distances to the camera.
In this dissertation, we propose novel approaches that take into account depth, multi-view and temporal information to perform human detection and pose estimation. Firstly, we introduce an energy optimization approach to consistently track body poses over entire RGB-D sequences. Secondly, we present a novel approach to estimate the body poses directly in 3D by relying on both color and depth images. The approach also uses a new RGB-D body part detector. Finally, we present a multi-view approach for 3D human pose estimation, which relies on depth data to reliably incorporate information across all views. We also present a method to automatically model a priori information about the OR environment for obtaining a more robust human detection model. To evaluate our approaches, we generate several single- and multi-view datasets in operating rooms. We demonstrate very promising results on these datasets and show that our approaches outperform state-of-the-art methods on data acquired during real surgeries.
The defense will be held in English on Thursday, Dec 1st, at 13:30 in the Hirsch amphitheater at IRCAD.