DYNAMIC URDU DISCOURSE-AWARE PROMPT TUNING (DUDAPT) FOR CONTEXT-ADAPTIVE IMAGE CAPTIONING

Authors

  • Ammad Hussain
  • Mubasher Hussain Malik

Abstract

We propose Dynamic Urdu Discourse-Aware Prompt Tuning (DUDAPT), a framework for context-adaptive image captioning that addresses the challenges of integrating the Urdu language. Traditional captioning systems rely on static word embeddings, which often fail to capture Urdu’s rich discourse features, such as syntactic complexity and anaphora. The proposed method introduces a dynamic embedding layer that adapts to linguistic context through three key components: a Discourse Complexity Analyzer (DCA) that evaluates sentence complexity in real time, a Dynamic Prompt Pool (DPP) that selectively activates context-aware soft prompts, and an Urdu-Aware Embedding Projector that aligns tokens with the visual-semantic space. The DCA employs a lightweight transformer to compute complexity scores, which guide the DPP in expanding or pruning the set of active prompts. The projector combines frozen Urdu embeddings with the adaptive prompts, so the framework can be integrated with conventional language decoders. The framework is implemented with a distilled Urdu-BERT model for efficiency and meta-learned multilingual prompts for robustness. Experimental validation demonstrates that DUDAPT outperforms fixed-embedding approaches by capturing discourse nuances while remaining compatible with existing captioning pipelines. This work addresses a gap in low-resource language processing and offers a scalable solution for Urdu-centric multimodal applications.
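
The interaction between the complexity score, the prompt pool, and the frozen embeddings can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything below is an assumption made for illustration only: the linear mapping from score to prompt count, the cosine-similarity ranking, the choice to prepend prompts to the token embeddings, and all function and parameter names (`num_active_prompts`, `select_prompts`, `build_decoder_input`, `k_min`, `k_max`) are hypothetical and are not taken from the paper.

```python
import math


def num_active_prompts(score, k_min=2, k_max=8):
    """Map a complexity score in [0, 1] to a prompt count.

    Assumption: a simple linear rule; the paper's rule may differ.
    """
    score = min(max(score, 0.0), 1.0)  # clamp out-of-range scores
    return k_min + round(score * (k_max - k_min))


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_prompts(pool, context, score, k_min=2, k_max=8):
    """Pick the k pool entries most similar to the context vector.

    A higher score expands the active set; a lower score prunes it.
    """
    k = min(num_active_prompts(score, k_min, k_max), len(pool))
    ranked = sorted(
        range(len(pool)),
        key=lambda i: cosine(pool[i], context),
        reverse=True,
    )
    return [pool[i] for i in ranked[:k]]


def build_decoder_input(token_embeddings, prompts):
    """Prepend the selected soft prompts to the frozen token embeddings."""
    return list(prompts) + list(token_embeddings)


if __name__ == "__main__":
    pool = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy 2-d prompt vectors
    context = [1.0, 0.0]                         # toy sentence representation
    tokens = [[0.5, 0.5]]                        # toy frozen token embedding
    simple = select_prompts(pool, context, score=0.0, k_min=1, k_max=3)
    dense = select_prompts(pool, context, score=1.0, k_min=1, k_max=3)
    print(len(build_decoder_input(tokens, simple)))
    print(len(build_decoder_input(tokens, dense)))
```

In this toy setting a low score keeps a single prompt and a high score activates the whole pool, which mirrors the expand-or-prune behaviour described above without claiming to reproduce the trained model.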

Published

2026-06-10

How to Cite

Ammad Hussain, & Mubasher Hussain Malik. (2026). DYNAMIC URDU DISCOURSE-AWARE PROMPT TUNING (DUDAPT) FOR CONTEXT-ADAPTIVE IMAGE CAPTIONING. Spectrum of Engineering Sciences, 4(6), 1110–1120. Retrieved from https://www.thesesjournal.com/index.php/1/article/view/3174