Mechanical and Civil Engineering Seminar

Thursday, May 26, 2022
2:00pm to 3:00pm
"Autonomous Temporal Understanding and State Estimation during Robot-Assisted Surgery"
Ida Qin, Graduate Student, Mechanical Engineering, Caltech

PhD Thesis Defense


Robot-Assisted Surgery (RAS) has become increasingly important in modern surgical practice because of its many benefits for both patients and healthcare professionals, as compared to traditional open surgeries and minimally invasive surgeries such as laparoscopy. Artificial intelligence applications during RAS and post-operative analysis can provide various surgeon-assisting functionalities and could potentially improve surgical outcomes. These applications, ranging from providing surgeons with advisory information during RAS and post-operative analysis to virtual fixtures and supervised autonomous surgical tasks, share a necessary prerequisite: a comprehensive understanding of the current surgical scene. This understanding should include knowledge of the current surgical task being performed, the surgeon's actions and gestures, the state of the patient, etc. Currently, there is yet to be a unified effort to achieve autonomous temporal understanding and perception of an RAS at the high accuracy and efficiency required in the highly safety-critical field of medicine.

This thesis develops novel modeling methodologies and deep learning-based models for the autonomous perception and temporal segmentation of the current surgical scene during an RAS. An RAS procedure is modeled as a hierarchical system consisting of discrete surgical states at multiple levels of temporal granularity. These surgical states take the form of surgical tasks, operational steps, fine-grained surgical actions, etc. A broad range of computational experiments was performed to develop methods that achieve accurate, robust, and efficient estimation of these surgical states. Multiple novel deep learning-based models for feature extraction, noise elimination, and efficient training were proposed and tested. This thesis also shows the significant benefits of incorporating multiple types of data streams recorded by the surgical robotic system for more accurate surgical state estimation.

Two new RAS datasets that contain real-world RAS procedures and diverse experimental settings were collected and annotated, filling a gap in the datasets available for the development and testing of robust surgical state estimation models. The performance and robustness of the models in this thesis work were showcased with these highly complex and dynamic real-world RAS datasets and compared against state-of-the-art methods. A significant model performance improvement was observed in both surgical state estimation accuracy and efficiency. The modeling methodologies and deep learning-based models developed in this work have diverse potential applications to the development of next-generation surgical robotic systems.

Zoom Link:

For more information, please contact Sonya Lincoln by email at [email protected] or visit