papers

mi-zo: multi-information for camera control in 3D scenes with multiple objects
priority map: a priority map module for vision-and-language navigation
mlm: multitask learning with multiple languages and modalities
pm+mo: training multimodal systems to generalise with multiple objectives