Research
Peer-reviewed and ongoing work in multimodal reasoning, knowledge augmentation, and instruction-driven adaptation. Each page follows a consistent academic layout with paper metadata and figures.
Learning how much visual budget each frame receives via a lightweight allocator paired with a frozen MLLM, improving efficiency–accuracy trade-offs under aggressive compression.
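The allocation idea above can be sketched in a few lines: score each frame, then split a fixed visual-token budget across frames in proportion to those scores. This is a minimal illustrative sketch, not the paper's method; the function name, the softmax-plus-floor scheme, and the `min_tokens` floor are all assumptions (in the paper's setting the scores would come from the learned lightweight allocator).

```python
import math

def allocate_frame_budgets(frame_scores, total_budget, min_tokens=1):
    """Split a fixed visual-token budget across frames proportionally
    to per-frame importance scores, with a floor of min_tokens each.

    frame_scores: one float per frame (here plain inputs; in the
    allocator setting these would be predicted by a small network).
    Returns a list of ints summing exactly to total_budget.
    """
    n = len(frame_scores)
    # Softmax the scores so they form a distribution over frames.
    m = max(frame_scores)
    exps = [math.exp(s - m) for s in frame_scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Reserve the per-frame floor, share the rest by weight.
    spare = total_budget - min_tokens * n
    budgets = [min_tokens + math.floor(w * spare) for w in weights]
    # Hand rounding leftovers to the highest-weighted frames.
    leftover = total_budget - sum(budgets)
    order = sorted(range(n), key=lambda i: weights[i], reverse=True)
    for i in order[:leftover]:
        budgets[i] += 1
    return budgets
```

Under aggressive compression (small `total_budget`), the floor keeps every frame minimally represented while salient frames absorb most of the budget.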
QA · Knowledge
Awakening latent knowledge in LLMs through explicit and implicit imagination modules without retrieval-heavy pipelines, evaluated on NQ, TQA, and WQ.
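The retrieval-free pattern can be sketched as a two-step pipeline in the spirit of generate-then-read: first have the model imagine a supporting passage, then answer conditioned on it. This is a hedged sketch only; the prompts, the `generate` callable, and the single-pass structure are assumptions, not the paper's explicit/implicit module design.

```python
def answer_with_imagination(question, generate):
    """Imagination-style QA without external retrieval.

    generate: any text -> text callable (e.g. an LLM wrapper).
    Step 1 'imagines' background knowledge; step 2 answers
    conditioned on that imagined context.
    """
    # Step 1: elicit latent knowledge as a generated passage.
    context = generate(
        f"Write a short passage of background knowledge relevant to: {question}"
    )
    # Step 2: answer grounded in the imagined passage.
    return generate(
        f"Context: {context}\nQuestion: {question}\nAnswer briefly:"
    )
```

Because both steps use the same model, no retriever or external corpus is involved; the context is drawn entirely from the model's parameters.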
Meta-learning · Adapters
Generating task-specific adapters from instructions via hypernetworks and distillation, improving cross-task generalization with lower compute than instance-only training.
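The hypernetwork idea can be shown with a toy example: a single linear map takes an instruction embedding and emits the weights of a small bottleneck adapter (down-project, ReLU, up-project, residual). All dimensions, names, and the one-layer hypernetwork are illustrative assumptions, not the paper's architecture.

```python
import random

class AdapterHypernetwork:
    """Toy hypernetwork: instruction embedding -> adapter weights."""

    def __init__(self, instr_dim, hidden_dim, adapter_rank, seed=0):
        rng = random.Random(seed)
        self.h, self.r = hidden_dim, adapter_rank
        n_params = 2 * hidden_dim * adapter_rank  # down + up matrices
        # One linear layer from instruction embedding to all adapter params.
        self.w = [[rng.gauss(0.0, 0.02) for _ in range(n_params)]
                  for _ in range(instr_dim)]

    def generate(self, instr_embedding):
        # params = instr_embedding @ self.w  (plain-Python matmul)
        total = 2 * self.h * self.r
        params = [sum(e * row[j] for e, row in zip(instr_embedding, self.w))
                  for j in range(total)]
        k = self.h * self.r
        # Reshape the flat vector into (h x r) down and (r x h) up matrices.
        down = [params[i * self.r:(i + 1) * self.r] for i in range(self.h)]
        up = [params[k + i * self.h:k + (i + 1) * self.h]
              for i in range(self.r)]
        return down, up


def apply_adapter(x, down, up):
    """Residual bottleneck adapter: x + up(relu(down(x)))."""
    z = [max(0.0, sum(xi * down[i][j] for i, xi in enumerate(x)))
         for j in range(len(up))]
    return [xi + sum(zj * up[j][i] for j, zj in enumerate(z))
            for i, xi in enumerate(x)]
```

One instruction embedding yields one adapter, so unseen tasks get fresh adapter weights from their instructions alone, without per-instance fine-tuning.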