MULTIMODAL GRAPH REPRESENTATION LEARNING FOR ROBUST SURGICAL WORKFLOW RECOGNITION WITH ADVERSARIAL FEATURE DISENTANGLEMENT

Authors

  • Muhammad Usman
  • Hasnain Kashif
  • Huzaifa Majeed
  • Saba Shahid

Keywords:

Surgical Workflow Recognition, Multimodal Data Fusion, Graph Convolutional Networks (GCN), Robotic-Assisted Surgery, MDGNet

Abstract

Surgical workflow recognition is essential for automating tasks and ensuring patient safety, but its reliability suffers when the input data is corrupted. This paper presents a graph-based approach that fuses visual and motion (kinematic) data to improve recognition accuracy under challenging conditions. The Multimodal Disentanglement Graph Network (MDGNet) models how visual and motion information interact, using an adversarial disentanglement framework to align features across the two modalities. The Contextual Calibrated Decoder uses temporal and contextual information to make the system more resilient to data changes and corruption. Together, these components allow a safety-critical surgical workflow recognition system to operate more accurately. The model achieved accuracies of 86.87% and 92.38% on two datasets, demonstrating its effectiveness in addressing data corruption and advancing automated surgical workflow recognition.
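
To illustrate the general idea of graph-based fusion of visual and motion data mentioned in the abstract, the following is a minimal sketch of a single standard graph convolution over a small graph whose nodes hold one visual embedding and several motion embeddings. It is not the authors' MDGNet: the function names (normalize_adjacency, gcn_layer, fuse_modalities), the fully connected graph, the feature sizes, and the mean pooling are all assumptions chosen only for illustration, and the adversarial disentanglement and the Contextual Calibrated Decoder are not modeled.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    deg = a_hat.sum(axis=1)                      # node degrees (always >= 1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    return d_inv_sqrt @ a_hat @ d_inv_sqrt

def gcn_layer(node_feats, adj, weight):
    """One graph convolution followed by ReLU."""
    return np.maximum(normalize_adjacency(adj) @ node_feats @ weight, 0.0)

def fuse_modalities(visual_feat, motion_feats, weight):
    """Hypothetical fusion: stack one visual node and several motion nodes,
    connect every pair of nodes (an assumption), and mean-pool the result."""
    nodes = np.vstack([visual_feat[None, :], motion_feats])
    n = nodes.shape[0]
    adj = np.ones((n, n)) - np.eye(n)            # fully connected, no self-loops yet
    hidden = gcn_layer(nodes, adj, weight)
    return hidden.mean(axis=0)                   # one fused vector per time step

# Example with random placeholder features (not real surgical data).
rng = np.random.default_rng(0)
visual = rng.normal(size=8)                      # assumed 8-dim visual embedding
motion = rng.normal(size=(2, 8))                 # assumed two motion nodes, e.g. two tool arms
weight = rng.normal(size=(8, 4))                 # untrained projection to 4 dims
fused = fuse_modalities(visual, motion, weight)
print(fused.shape)
```

In a real system the weight matrix would be learned and the fused vector would be passed to a temporal decoder that predicts the workflow step for each frame.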

Published

2026-02-11

How to Cite

Muhammad Usman, Hasnain Kashif, Huzaifa Majeed, & Saba Shahid. (2026). MULTIMODAL GRAPH REPRESENTATION LEARNING FOR ROBUST SURGICAL WORKFLOW RECOGNITION WITH ADVERSARIAL FEATURE DISENTANGLEMENT. Spectrum of Engineering Sciences, 4(2), 253–266. Retrieved from https://www.thesesjournal.com/index.php/1/article/view/1982