RETRIEVAL-AUGMENTED GENERATION: ARCHITECTURES, ADAPTIVE RETRIEVAL, FEEDBACK-DRIVEN OPTIMIZATION, AND OPEN RESEARCH CHALLENGES
Abstract
Retrieval-augmented generation (RAG) has become a core paradigm for grounding large language models (LLMs) in external knowledge sources, mitigating hallucinations, and enabling scalable reasoning over dynamic corpora. By integrating parametric language modeling with non-parametric retrieval mechanisms, RAG systems narrow the gap between fluent natural language generation and factual grounding. Unlike fully parametric models, RAG provides access to recent and domain-specific information at inference time. Nevertheless, recent empirical results suggest that many deployed RAG pipelines remain brittle: they rely on static similarity-based retrieval, naïve context-construction strategies, and weak or relevance-only reranking, with no feedback-driven adaptivity. This survey provides a structured overview of current RAG research, covering basic architectural designs; retriever and reranker strategies; context-construction methodologies; adaptive and reinforcement-learning-based RAG; feedback-aware models; graph-based and memory-based extensions; long-context behaviour; hallucination analysis; and emerging evaluation benchmarks. Drawing on more than forty representative studies, we introduce a unified taxonomy, present a metadata-based comparative analysis, and discuss open research challenges in depth. We argue that future RAG systems should move beyond the static-retrieval paradigm and instead integrate feedback-driven context-adaptation mechanisms to improve robustness, efficiency, and effectiveness in deployment.
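For concreteness, the following is a minimal sketch of the static similarity-based retrieve-then-generate pipeline the abstract critiques: a single fixed top-k retrieval step, naive context concatenation, and no reranking or feedback loop. The functions embed and generate are illustrative placeholders standing in for an arbitrary dense encoder and LLM, not components from any specific surveyed system.

```python
# Illustrative sketch only: a static, similarity-only RAG pipeline.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder dense encoder (any sentence-embedding model could be used)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

def generate(prompt: str) -> str:
    """Placeholder LLM call."""
    return f"[answer grounded in]\n{prompt}"

def static_rag(query: str, corpus: list[str], k: int = 3) -> str:
    doc_vecs = np.stack([embed(d) for d in corpus])       # index built once, never adapted
    scores = doc_vecs @ embed(query)                       # cosine similarity (unit vectors)
    top_k = [corpus[i] for i in np.argsort(-scores)[:k]]   # fixed top-k cut-off, no reranking
    context = "\n".join(top_k)                             # naive context concatenation
    return generate(f"Context:\n{context}\n\nQuestion: {query}")

print(static_rag("What is retrieval-augmented generation?",
                 ["RAG combines retrieval with generation.",
                  "Transformers use attention.",
                  "Reranking reorders retrieved passages."]))
```

Adaptive, feedback-driven variants discussed later in the survey replace the fixed top-k cut-off and one-shot concatenation with learned or feedback-conditioned decisions about when, what, and how much to retrieve.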













