Karthik R. Narasimhan
Associate Professor, Computer Science, Princeton
Co-Director, Princeton NLP
Associate Director, Princeton Language and Intelligence (PLI)
I develop intelligent autonomous agents that interact with and adapt to complex, real-world environments. My research focuses on creating new paradigms, benchmarks, and frameworks to advance the capabilities and applications of machine learning and AI. Some recent highlights include:
- GPT: demonstrated an autoregressive transformer for language modeling and introduced the idea of solving NLP tasks through next-token prediction.
- ReAct and Tree of Thoughts (ToT): combined reasoning and acting with language models into one paradigm and helped kickstart LM-based AI agents.
- SWE-bench, SWE-agent (++): introduced a comprehensive benchmark for software engineering AI agents (much more than just writing code) and helped turbocharge progress in AI coding agents.
- WebShop: introduced the idea of web-based AI agents that can perform tasks on realistic websites (e.g. Amazon/eBay shopping).
- GEO: introduced the paradigm of content optimization in the age of generative engines like ChatGPT.
- TAU-bench: introduced a dynamic dual-control environment for testing AI agents at user-facing tasks like customer support.
I received my PhD from MIT, where I was advised by Prof. Regina Barzilay. I have also spent time as a research scientist at OpenAI (2017-18) and as head of research at Sierra (2023-25).
Selected Research Papers
- Language models: GPT (2018)
- Language agents: Text-DQN (2015), CALM (2020), ReAct (2022), Tree of Thoughts (2023), Reflexion (2023), CoALA (2023), SWE-agent (2024)
- Datasets/Benchmarks: SWE-bench (2023), TAU-bench (2024), WebShop (2022), GEO (2023), InterCode (2023), C-STS (2023), SILG (2021)
- Efficiency and Safety: DataMUX (2022), Toxicity in ChatGPT (2023)
- Reinforcement Learning: h-DQN (2016), Multi-Objective RL (2019), POLCO (2021), XTX (2022)
For more publications, please see my Google Scholar page.