Andrew Luo

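I am an Assistant Professor at the University of Hong Kong (HKU), jointly appointed by the HKU Institute of Data Science and the Department of Psychology since October 2024.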

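I received a joint PhD in machine learning and computational neuroscience from Carnegie Mellon University (CMU) in 2024, where I worked with Michael Tarr and Leila Wehbe. Before that, I received my bachelor's degree in computer science from the Massachusetts Institute of Technology (MIT) in 2019. I also hold a master's degree in machine learning from CMU.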

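My work focuses on understanding the computational principles underlying visual perception, and on how these principles can help us improve and design better generative AI models. Ultimately, my goal is to bridge the gap between human and machine vision, deepening our understanding of human cognition and advancing artificial intelligence.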

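For late 2024 and summer 2025, I am recruiting PhD students with backgrounds in computer vision, AI for Neuroscience (NeuroAI), and image generative models to join my research group at HKU. PhD, master's, and undergraduate students are also welcome to collaborate remotely or work as remote research assistants (RAs). Please email aluo@hku.hk with your CV and a short self-introduction.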

Email / HKU Email / Google Scholar / GitHub / WeChat / English


Research

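I want to understand the mechanisms of human perception, and to build better generative models and machines with human-like reasoning abilities.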

Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers
Andrew F. Luo, Jacob Yeung, Rushikesh Zawar, Shaurya Dewan, Margaret M. Henderson, Leila Wehbe*, Michael J. Tarr*
* Co-corresponding authors
arXiv, 2024 (in submission)
arxiv page / bibtex

We propose an efficient gradient-free distillation module capable of extracting high-quality dense CLIP embeddings, and use these embeddings to understand semantic selectivity in the visual cortex.

Disentangled Acoustic Fields For Multimodal Physical Scene Understanding
Jie Yin, Andrew F. Luo, Yilun Du, Anoop Cherian, Tim K Marks, Jonathan Le Roux, Chuang Gan
IROS 2024
arxiv page / bibtex

We investigate the problem of visual-acoustic navigation conditioned on a continuous acoustic field representation of audio.

DiffusionPID: Interpreting Diffusion via Partial Information Decomposition
Shaurya Dewan, Rushikesh Zawar, Prakanshul Saxena, Yingshan Chang, Andrew F. Luo, Yonatan Bisk
NeurIPS 2024
arxiv page / bibtex

We leverage ideas from information theory to understand the contributions of individual text tokens and their interactions when generating images.

BrainSCUBA: Fine-Grained Natural Language Captions of Visual Cortex Selectivity
Andrew F. Luo, Margaret M. Henderson, Michael J. Tarr, Leila Wehbe
ICLR 2024
arxiv page / bibtex

We propose a way to leverage contrastive image-language models (CLIP) and fine-tuned language models to generate natural language descriptions of voxel-wise selectivity in higher-order visual areas.

Brain Diffusion for Visual Exploration: Cortical Discovery using Large Scale Generative Models
Andrew F. Luo, Margaret M. Henderson, Leila Wehbe*, Michael J. Tarr*
* Co-corresponding authors
NeurIPS 2023 oral (top 0.7% of all submissions)
project page / bibtex / code

We propose a way to generate images that activate specific regions of the brain by leveraging natural image priors from diffusion models.

Neural Selectivity for Real-World Object Size In Natural Images
Andrew F. Luo, Leila Wehbe, Michael J. Tarr, Margaret M. Henderson
bioRxiv, 2023 (in submission)
bioRxiv page / bibtex

We examine the selectivity of the brain to real-world size in complex natural images.

Learning Neural Acoustic Fields
Andrew F. Luo, Yilun Du, Michael J. Tarr, Joshua B. Tenenbaum, Antonio Torralba, Chuang Gan
NeurIPS 2022 (summer internship at IBM)
project page / bibtex / code

We propose a learnable and compact implicit encoding for acoustic impulse responses. We find that our Neural Acoustic Fields (NAFs) can achieve state-of-the-art performance with a tiny storage footprint.

Prototype memory and attention mechanisms for few shot image generation
Tianqin Li*, Zijie Li*, Andrew F. Luo, Harold Rockwell, Amir Barati Farimani, Tai Sing Lee
ICLR 2022
bibtex / code

We show that a prototype memory with attention mechanisms can improve image synthesis quality and learn interpretable visual concept clusters.

SurfGen: Adversarial 3D Shape Synthesis with Explicit Surface Discriminators
Andrew F. Luo, Tianqin Li, Wen-Hao Zhang, Tai Sing Lee
ICCV 2021
arxiv page / bibtex / code

We propose a surface-based discriminator for implicit shape generation. Our discriminator uses differentiable ray-casting and marching cubes.

End-to-End Optimization of Scene Layout
Andrew F. Luo, Zhoutong Zhang, Jiajun Wu, Joshua B. Tenenbaum
CVPR 2020 oral
project page / bibtex / code

We propose constrained scene synthesis using graph neural networks, and show that generated scenes can be refined using differentiable rendering.

Learning to Infer and Execute 3D Shape Programs
Yonglong Tian, Andrew F. Luo, Xingyuan Sun, Kevin Ellis, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu
ICLR 2019
project page / bibtex / code

We propose a learnable decomposition of 3D shapes into symbolic programs that can be executed.