Efficient Vision-Language Instruction Tuning for LLMs - arxiv.org