Abstract: This work presents a novel architecture for building Retrieval-Augmented Generation (RAG) systems to improve Question Answering (QA) tasks from a target corpus. Large Language Models (LLMs) ...
Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering
Please cite this work with the following BibTeX: @inproceedings{cocchi2024augmenting, title={{Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering}}, ...