CV | Pranav Rajbhandari

Basics

Name	Pranav Rajbhandari
Label	PhD student
Email	prajbhan [at] cs [dot] cmu [dot] edu
Url	https://pranavraj575.github.io
Summary	I am primarily interested in developing Machine Learning algorithms for strategic scenarios. I am also interested in Topology.

Education

08/2025 - Present

Pittsburgh, USA

Carnegie Mellon University

PhD

Machine Learning Department

Relevant Coursework

• Advanced Topics in Machine Learning & Game Theory

• Computational Game Solving

• Advanced Introduction to Machine Learning

08/2020 - 05/2024

Pittsburgh, USA

Carnegie Mellon University

Undergraduate

Bachelor of Science in Artificial Intelligence; additional major in Discrete Mathematics and Logic

GPA: 4.0/4.0

Relevant Coursework

	Artificial Intelligence
	• AI: Representation & Problem Solving
	• Deep Reinforcement Learning & Control
	• Advanced Deep Learning
	• Natural Language Processing
	• Convex Optimization
	• Art & Machine Learning
	• Autonomous Agents
	• Search Engines

	Mathematics
	• Algebraic Topology
	• General Topology
	• Dynamics of Polish Groups
	• Probabilistic Combinatorics
	• Extremal Combinatorics
	• Graph Theory
	• Game Theory
	• Modern Regression
	• Real Analysis

	Miscellaneous
	• Quantum Computation
	• Star Wars: The Course Awakens

Projects

02/2025 - 08/2025

Canberra, Australia
Understanding visual attention beehind bee-inspired UAV navigation

Bioengineering Group, University of New South Wales - PI: Sridhar Ravi
- Used the attention patterns of trained Reinforcement Learning (RL) agents to infer how a real bee makes movement decisions
- Built a goal-conditioned RL environment in OpenAI Gym to train a UAV to imitate bee behaviors using bee-like input sensors
- Used SHAP values, a tool for explaining model output, to measure which visual regions trained RL agents attend to
04/2024 - Present

Pittsburgh, USA
AlephZero: Extending AlphaZero to Infinite Boards

Independent Research - PI: Pranav Rajbhandari
- Defined and analyzed $\aleph_0$ board games, a class of games with potentially unbounded action spaces. Interesting examples include 'Jenga' and '5D Chess with Multiverse Time Travel', as well as classic games like 'Chess' and 'Tic-Tac-Toe'
- Developed AlephZero, an extension of AlphaZero able to learn optimal policies in $\aleph_0$ board games
- Utilized transformer architectures to define policy networks and value networks able to take multi-dimensional sequential input
- Compared approach to standard algorithms such as AlphaZero, Deep Q-Learning, and Monte Carlo Tree Search
01/2024 - 08/2025

Canberra, Australia
Fine Tuning Swimming Locomotion Learned from Mosquito Larvae

University of New South Wales; U.S. Naval Research Laboratory - PI: Sridhar Ravi; Donald Sofge
- Optimized swimming locomotion copied from mosquito larvae for use on a robotic platform
- Utilized Reinforcement Learning to guide a local search algorithm optimizing swimming locomotion
- Designed an OpenAI Gym environment utilizing a Computational Fluid Dynamics (CFD) model for training
- Sped up the training process by using a pre-trained deep neural network to accurately predict forces on a robotic swimmer
- Compared performance of various architectures, including Deep Neural Networks, Recurrent Neural Networks, and LSTMs
07/2024 - 10/2024

Washington, D.C., USA
Transformer guided coevolution: Team selection in multiagent adversarial games

U.S. Naval Research Laboratory - PI: Donald Sofge; Prithviraj Dasgupta
- Developed BERTeam, an algorithm to learn diverse and cooperative team selection for multiagent adversarial team games
- Evaluated algorithm on Pyquaticus, a simulation of robotic Marine Capture-The-Flag
- Used Masked Language Modeling to teach optimal team composition to BERTeam's transformer architecture
- Cotrained BERTeam with Coevolutionary Deep Reinforcement Learning to select teams from a diverse population of agents
- Compared training results with established algorithms in the literature
- Developed and maintained unstable_baselines3, a Python package extending stable_baselines3 to multiagent environments
08/2023 - 05/2024

Pittsburgh, USA
Geodesic complexity? It's actually quite simplex

Department of Mathematical Sciences, Carnegie Mellon University - PI: Florian Frick
- Explored geodesic complexity, a measure of difficulty for creating an efficient continuous motion plan on a metric space
- Designed a technique utilizing local properties of a space to lower bound its geodesic complexity
- Created and proved correctness of an algorithm calculating cut loci on surfaces of polyhedra, a property related to their geodesic complexity
- Applied these techniques to produce a novel result for the geodesic complexity of the octahedron
- Proved existing geodesic complexity bounds in a new way, displaying the utility of our general method
01/2023 - 05/2024

Pittsburgh, USA
Utilizing Sim-to-Real Methods for Training a Robot Arm

Reliable Autonomous Systems Laboratory, Carnegie Mellon University - PI: Reid Simmons
- Led a team of four to design and maintain an OpenAI Gym environment for a Kinova Jaco Gen3 6DOF robot arm
- Simulated a model of the robot arm compatible with the control scheme of the physical arm using the Gazebo simulator
- Utilized ROS to handle communication between the robot arm and Python scripts
- Trained a 'real life filter' with the CycleGAN algorithm to make photo-realistic simulation images used for training
- Implemented a training pipeline for a robotic manipulation task, trained in simulation and refined on the real arm
05/2023 - 08/2023

Washington, D.C., USA
Learning NEAT Emergent Behavior in Robot Swarms

Distributed Autonomous Systems Group, U.S. Naval Research Laboratory - PI: Donald Sofge
- Developed an algorithm for training local policies to produce emergent behaviors in a robot swarm
- Designed a training pipeline applying the NeuroEvolution of Augmenting Topologies (NEAT) algorithm to robot swarm control
- Tested the algorithm's performance on a variety of tasks and simulated robotic swarms using the CoppeliaSim simulator
- Utilized ROS to handle communication between Python scripts and robotic swarms (both real and simulated)
05/2023 - 08/2023

Washington, D.C., USA
UAV Routing for Enhancing the Performance of a Classifier-in-the-loop

Distributed Autonomous Systems Group, U.S. Naval Research Laboratory - PI: Swaroop Darbha
- Collaborated on an interdisciplinary research project optimizing the information gained from targets by robot swarms
- Designed a heuristic algorithm for planning robot paths inspired by approximate solutions to the Traveling Salesman Problem
- Utilized Mathematica, as well as methods from 'Convex Optimization', to optimize solutions for large test cases
- Tested our algorithm on both generated and real-life problem instances using Julia and the Gurobi optimizer
05/2022 - 08/2022

Washington, D.C., USA
Comparing Transfer Learning Methods for Continuous Reinforcement Learning

Adaptive Systems Section, U.S. Naval Research Laboratory - PI: Laura Hiatt
- Planned and executed a research project evaluating various transfer learning methods on robot arm manipulation tasks
- Designed an OpenAI Gym environment for a robotic manipulation task using the MuJoCo simulator
- Compared the performance of known transfer learning methods in transferring knowledge between Deep Neural Networks
- Utilized ROS to handle communication between the robot arm and Python scripts
02/2021 - 05/2022

Pittsburgh, USA
Creating a Strategic Agent to Play Jenga

Reliable Autonomous Systems Laboratory, Carnegie Mellon University - PI: Reid Simmons
- Planned and executed a research project evaluating the performance of various adversarial AI algorithms playing Jenga
- Implemented algorithms such as Monte Carlo Tree Search, Deep Q-Networks, and Inverse Reinforcement Learning
- Created a statistical model to estimate the stability of a Jenga tower for use in Model Based Reinforcement Learning
- Trained the model through repeatedly sampling stabilities of towers with the PyBullet physics engine

Publications

08/26/2025

Geodesic complexity of the octahedron, and an algorithm for cut loci on convex polyhedra

Florian Frick and Pranav Rajbhandari

arXiv

(Preprint, submitted to Journal of Computational Geometry)
07/16/2025

Understanding visual attention beehind bee-inspired UAV navigation

Pranav Rajbhandari, Abhi Veda, and 3 more authors

arXiv

(Preprint, poster accepted to 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025))
05/21/2025

Transformer Guided Coevolution: Improved Team Formation in Multiagent Adversarial Games

Pranav Rajbhandari, Prithviraj Dasgupta, and Donald Sofge

Proc. of the 24th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2025)
12/13/2024

Fine Tuning Swimming Locomotion Learned from Mosquito Larvae

Pranav Rajbhandari, Karthick Dhileep, and 2 more authors

2024 IEEE International Conference on Robotics and Biomimetics (ROBIO)
12/11/2024

Learning NEAT Emergent Behaviors in Robot Swarms

Pranav Rajbhandari and Donald Sofge

2024 IEEE International Conference on Robotics and Biomimetics (ROBIO)
09/14/2024

UAV Routing for Enhancing the Performance of a Classifier-in-the-loop

Deepak Prakash Kumar, Pranav Rajbhandari, and 3 more authors

2024 IEEE International Conference on Robotics and Biomimetics (ROBIO)

Presentations

03/11/2024

San Diego, USA

Learning Emergent Behavior in Robot Swarms with NEAT

Pranav Rajbhandari and Donald Sofge

Naval Applications of Machine Learning
05/03/2023

Pittsburgh, USA

Sim-to-real Transfer Reinforcement Learning

Pranav Rajbhandari, Sophia Zalewski, and Reid Simmons

Carnegie Mellon University Meeting of the Minds
05/02/2022

Pittsburgh, USA

Creating Agents to Learn Jenga

Pranav Rajbhandari and Reid Simmons

Carnegie Mellon University Meeting of the Minds

Experience

01/2026 - Present

08/2021 - 12/2022

Pittsburgh, USA
Teaching Assistant

Carnegie Mellon University

For 'AI: Representation and Problem Solving' (4 semesters), 'Concepts of Mathematics' (1 semester), and 'Probability Theory for Computer Scientists' (1 semester)
- Collaborated in a team of up to 10 Teaching Assistants to manage classes of up to 100 students
- Planned and led class-wide review sessions, as well as recitations of about 20 students
- Held office hours to help students understand course material in a one-on-one setting
- Created, tested, and graded programming assignments and written homework
- Maintained course website
02/2025 - 08/2025

Canberra, Australia
Researcher

University of New South Wales - Bioengineering Group
- Worked in the Bioengineering Group on project "Understanding visual attention beehind bee-inspired UAV navigation"
- Worked in the Bioengineering Group on project "Fine Tuning Swimming Locomotion Learned from Mosquito Larvae"
01/2024 - 10/2024

05/2023 - 08/2023

05/2022 - 08/2022

Washington, D.C., USA
Researcher

U.S. Naval Research Laboratory
- Worked in the Distributed Autonomous Systems Group on project "Transformer guided coevolution: Team selection in multiagent adversarial games"
- Worked with UNSW Canberra on project "Fine Tuning Swimming Locomotion Learned from Mosquito Larvae"
- Worked in the Distributed Autonomous Systems Group on project "Learning NEAT Emergent Behavior in Robot Swarms"
- Worked with Texas A&M Department of Mechanical Engineering on project "UAV Routing for Enhancing the Performance of a Classifier-in-the-loop"
- Worked in the Adaptive Systems Section on project "Comparing Transfer Learning Methods for Continuous Reinforcement Learning"
01/2023 - 05/2023

Mountain View, USA
Researcher

National Aeronautics and Space Administration - Ames Research Center
- Created an AI system to automate calling airport TMI events, especially Ground Stops and Ground Delay Programs
- Explored Imitation Reinforcement Learning methods to compete against the baseline of training a classifier model
- Processed historical data and created models to approximate decision processes using Python and R
01/2023 - 05/2024

02/2021 - 05/2022

Pittsburgh, USA
Researcher

Carnegie Mellon University
- Worked in the Department of Mathematical Sciences on project "Geodesic complexity? It's actually quite simplex"
- Worked in the Reliable Autonomous Systems Laboratory on project "Utilizing Sim-to-Real Methods for Training a Robot Arm"
- Worked in the Reliable Autonomous Systems Laboratory on project "Creating a Strategic Agent to Play Jenga"
05/2021 - 08/2021

Pittsburgh, USA
Research Assistant

Carnegie Mellon University
- Collaborated with a team of three researchers to develop and maintain an R package for Natural Language Processing
- Utilized Rust's BERT Natural Language Processing to tokenize and classify strings in R
12/2020 - 01/2021

Atlanta, USA
Programmer

Centers for Disease Control and Prevention - Chronic Viral Diseases Branch Immunology Lab
- Designed a Constraint Satisfaction Problem instance to automate generating laboratory experiment setup procedures
- Utilized Python and R to automate post-experiment data processing
- Refined and deployed these programs across the laboratory after prototyping and incorporating feedback from lab members

Awards

05/12/2024

Dean’s List, High Honors (8 semesters)

Carnegie Mellon University
05/10/2024

Senior Leadership Recognition Award

Carnegie Mellon University
05/01/2024

Dr. William Brown Academic Achievement Award

Carnegie Mellon University
05/01/2024

Tartan Leaders of Tomorrow

Carnegie Mellon University
03/04/2023

Winner of AI/ML Innovation Challenge

Naval Surface Warfare Center Dahlgren Division

Awarded a \$50,000 cash prize at a three-day competition hosted by the US Navy; designed an algorithm to protect ships from enemy missiles

Activities

08/2023 - 05/2024

Pittsburgh, USA
Carnegie Mellon University Super Informal Topology Discussion Group

Presenter, Member
08/2020 - 05/2024

Pittsburgh, USA
Carnegie Mellon University Track & Field

Sprint Team Captain
08/2020 - 05/2024

Pittsburgh, USA
Carnegie Mellon University PRISM Club

Volunteer, Member