Mengtian (Martin) Li

Research at OpenAI

Email: martinli [dot] work [at] gmail [dot] com

I am a Member of Technical Staff on the Deep Research team at OpenAI, working on ChatGPT agent. Previously, I was part of the Multimodal team, where I contributed to enhancing the visual reasoning capabilities of our flagship models. I earned my Ph.D. from the Robotics Institute at Carnegie Mellon University, where I had the privilege of working with Deva Ramanan.

My research interests include multimodal perception and generation, long-horizon tasks, and agentic AI.


Projects

GPT-5   Aug 2025
ChatGPT agent   Jul 2025
o3 and o4-mini   Apr 2025
Thinking with Images   Apr 2025

Papers

Neehar Peri, Mengtian Li, Benjamin Wilson, Yu-Xiong Wang, James Hays, Deva Ramanan. An Empirical Analysis of Range for 3D Object Detection. In ICCV Workshop, Oct 2023.

[Paper] [Bibtex]

Ziqi Pang, Deva Ramanan, Mengtian Li and Yu-Xiong Wang. Streaming Motion Forecasting for Autonomous Driving. In IROS, Oct 2023.

[Paper] [Code] [Bibtex]


Shengcao Cao, Mengtian Li, James Hays, Deva Ramanan, Yu-Xiong Wang and Liangyan Gui. Learning Lightweight Object Detectors via Progressive Knowledge Distillation. In ICML, Jul 2023.

[Paper] [Code] [Bibtex]


Chittesh Thavamani, Mengtian Li, Francesco Ferroni and Deva Ramanan. Learning to Zoom and Unzoom. In CVPR, Jun 2023.

[Project page] [Paper] [Talk] [Code] [Bibtex]

Shubham Gupta*, Jeet Kanjani*, Mengtian Li, Francesco Ferroni, James Hays, Deva Ramanan* and Shu Kong*. Far3Det: Towards Far-Field 3D Detection. In WACV, Jan 2023.

[Project page] [Paper] [Bibtex]

Neehar Peri, Jonathon Luiten, Mengtian Li, Aljosa Osep, Laura Leal-Taixé and Deva Ramanan. Forecasting from LiDAR via Future Object Detection. In CVPR, Jun 2022.

[Project page + code] [Paper] [Bibtex]

Xiaofang Wang, Shengcao Cao*, Mengtian Li*, Kris M. Kitani. Neighborhood-Aware Neural Architecture Search. In BMVC, 2021.

[Paper] [Bibtex]

Chittesh Thavamani*, Mengtian Li*, Nicolas Cebron, Deva Ramanan. FOVEA: Foveated Image Magnification for Autonomous Navigation. In ICCV, 2021.

[Project page] [Paper] [Poster] [Code] [Bibtex]

Mengtian Li, Yu-Xiong Wang and Deva Ramanan. Towards Streaming Perception. In ECCV, 2020.

Best Paper Honorable Mention

[Project page + talk + data] [Paper] [Code] [Bibtex]

Mengtian Li, Ersin Yumer and Deva Ramanan. Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints. In ICLR, 2020.

[Project page] [Paper] [Talk] [Code] [Bibtex]

Mengtian Li, Zhe Lin, Radomír Měch, Ersin Yumer and Deva Ramanan. Photo-Sketching: Inferring Contour Drawings from Images. In WACV, 2019.

[Project page] [Paper] [Code] [Bibtex]

Mengtian Li, Laszlo Jeni, Deva Ramanan. Brute-Force Facial Landmark Analysis with a 140,000-Way Classifier. In AAAI, 2018.

[Paper] [Code] [Bibtex]

Mengtian Li, Daniel Huber. Guaranteed Parameter Estimation for Discrete Energy Minimization. In WACV, 2017.

[Paper] [Bibtex]

Mengtian Li, Alexander Shekhovtsov, Daniel Huber. Complexity of Discrete Energy Minimization Problems. In ECCV, 2016.

Spotlight Presentation

[Paper] [Poster] [Talk] [Bibtex]


Misc

Confucius (孔子) once said: “If a craftsman wants to do good work, he must first sharpen his tools (工欲善其事,必先利其器).” I find that this concept also applies to research. Over the years, I have created various tools related to my research, and I have open-sourced some of them on GitHub:

  • HTML4Vision: a Python-HTML-JavaScript tool for visualizing datasets, comparing algorithms, and making figures
  • MTCMon: a web-based cluster resource monitor (widely adopted at CMU RI)
  • DLGPUBench: a latency-focused GPU benchmark for deep learning