Andrew Luo

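I am an Assistant Professor and PhD advisor at the University of Hong Kong (HKU), jointly appointed by the HKU Institute of Data Science and the Department of Psychology since October 2024.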

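I received my joint PhD in Machine Learning and Computational Neuroscience from Carnegie Mellon University (CMU) in 2024, where I worked with Michael Tarr and Leila Wehbe. Before that, I received my bachelor's degree in Computer Science from the Massachusetts Institute of Technology (MIT) in 2019. I also hold a master's degree in Machine Learning from CMU.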

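My work focuses on understanding the computational principles underlying visual perception, and on how these principles can help us improve and design better generative AI models. Ultimately, my goal is to bridge the gap between human and machine vision, deepening our understanding of human cognition and advancing artificial intelligence.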

I am recruiting PhD students for Fall 2026. (Updated July 1, 2026)

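PhD, master's, and undergraduate students are still welcome to collaborate remotely or to join as research assistants (RAs).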

Email / HKU Email / Google Scholar / Github / WeChat / English

Research

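I want to understand the mechanisms of human perception, and to build better generative models and machines with human-like reasoning abilities.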

	Vision Transformers with Self-Distilled Registers Yinjie Chen, Zipeng Yan, Chong Zhou, Bo Dai, Andrew F. Luo * Co-first authors in submission arxiv page / bibtex We show that artifacts can be removed from pre-trained ViTs without any labeled data by introducing registers in post-training. Our method uses a model combined with test-time augmentation to distill itself, leading to significant improvements in open-vocabulary segmentation and dense prediction tasks.
	Meta-Learning an In-Context Transformer Model of Human Higher Visual Cortex Muquan Yu, Mu Nan, Hossein Adeli, Jacob S. Prince, John A. Pyles, Leila Wehbe, Margaret M. Henderson, Michael J. Tarr, Andrew F. Luo in submission arxiv page / bibtex We show how to construct higher visual cortex encoders that can generalize across subjects, scanners, voxel sizes, protocols, and images without any additional finetuning by using meta-learning across subjects and in-context learning across stimuli.
	Reanimating Images using Neural Representations of Dynamic Stimuli Jacob Yeung, Andrew F. Luo, Gabriel Sarch, Margaret M. Henderson, Deva Ramanan, Michael J. Tarr CVPR 2025 oral arxiv page / bibtex We propose image-conditioned decoding of perceived motion from fMRI data, and show that this can be used to animate images with a video diffusion model.
	Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers Andrew F. Luo, Jacob Yeung, Rushikesh Zawar, Shaurya Dewan, Margaret M. Henderson, Leila Wehbe, Michael J. Tarr * Co-corresponding authors ICLR 2025 arxiv page / bibtex We propose an efficient gradient-free distillation module capable of extracting high-quality dense CLIP embeddings, and utilize these embeddings to understand semantic selectivity in the visual cortex.
	Disentangled Acoustic Fields For Multimodal Physical Scene Understanding Jie Yin, Andrew F. Luo, Yilun Du, Anoop Cherian, Tim K Marks, Jonathan Le Roux, Chuang Gan IROS 2024 arxiv page / bibtex We investigate the problem of visual-acoustic navigation conditioned on a continuous acoustic field representation of audio.
	DiffusionPID: Interpreting Diffusion via Partial Information Decomposition Shaurya Dewan, Rushikesh Zawar, Prakanshul Saxena, Yingshan Chang, Andrew F. Luo, Yonatan Bisk NeurIPS 2024 arxiv page / bibtex We leverage ideas from information theory to understand the contributions of individual text tokens and their interactions when generating images.
	BrainSCUBA: Fine-Grained Natural Language Captions of Visual Cortex Selectivity Andrew F. Luo, Margaret M. Henderson, Michael J. Tarr, Leila Wehbe ICLR 2024 arxiv page / bibtex We propose a way to leverage contrastive image-language models (CLIP) and fine-tuned language models to generate natural language descriptions of voxel-wise selectivity in the higher order visual areas.
	Brain Diffusion for Visual Exploration: Cortical Discovery using Large Scale Generative Models Andrew F. Luo, Margaret M. Henderson, Leila Wehbe, Michael J. Tarr * Co-corresponding authors NeurIPS 2023 oral, (top 0.7% of all submissions) project page / bibtex / code We propose a way to generate images that activate regions of the brain by leveraging natural image priors from Diffusion models.
	Neural Selectivity for Real-World Object Size In Natural Images Andrew F. Luo, Leila Wehbe, Michael J. Tarr, Margaret M. Henderson BioRxiv, 2023 (in submission) bioRxiv page / bibtex We examine the selectivity of the brain to real-world size in complex natural images.
	Learning Neural Acoustic Fields Andrew F. Luo, Yilun Du, Michael J. Tarr, Joshua B. Tenenbaum, Antonio Torralba, Chuang Gan NeurIPS 2022 (Summer internship at IBM) project page / bibtex / code We propose a learnable and compact implicit encoding for acoustic impulse responses. We find that our NAFs can achieve state-of-the-art performance at a tiny size footprint.
	Prototype memory and attention mechanisms for few shot image generation Tianqin Li, Zijie Li, Andrew F. Luo, Harold Rockwell, Amir Barati Farimani, Tai Sing Lee ICLR 2022 bibtex / code We show that having a prototype memory with attention mechanisms can improve image synthesis quality, and learn interpretable visual concept clusters.
	SurfGen: Adversarial 3D Shape Synthesis with Explicit Surface Discriminators Andrew F. Luo, Tianqin Li, Wen-Hao Zhang, Tai Sing Lee ICCV 2021 arxiv page / bibtex / code We propose a surface-based discriminator for implicit shape generation. Our discriminator uses differentiable ray-casting and marching cubes.
	End-to-End Optimization of Scene Layout Andrew F. Luo, Zhoutong Zhang, Jiajun Wu, Joshua B. Tenenbaum CVPR 2020 oral project page / bibtex / code We propose constrained scene synthesis using graph neural networks, and show that generated scenes can be refined using differentiable rendering.
	Learning to Infer and Execute 3D Shape Programs Yonglong Tian, Andrew F. Luo, Xingyuan Sun, Kevin Ellis, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu ICLR 2019 project page / bibtex / code We propose a learnable decomposition of 3D shapes into symbolic programs that can be executed.