retrieval for multimodal interaction