CV
Basics
Name | Pranav Rajbhandari |
Label | Researcher |
Email | prajbhan [at] cs [dot] cmu [dot] edu |
Url | https://pranavraj575.github.io |
Summary | I am primarily interested in developing Machine Learning algorithms for strategic scenarios. I am also interested in Topology. |
Education
-
08/2025 - Present Pittsburgh, USA
Relevant Coursework
Machine Learning: Advanced Introduction to Machine Learning
Game Theory: Computational Game Solving
-
08/2020 - 05/2024 Pittsburgh, USA
Carnegie Mellon University
Undergraduate
Bachelor of Science in Artificial Intelligence; additional major in Discrete Mathematics and Logic
GPA: 4.0/4.0
Relevant Coursework
Mathematics: Algebraic Topology, General Topology, Dynamics of Polish Groups, Probabilistic Combinatorics, Extremal Combinatorics, Graph Theory, Game Theory, Modern Regression, Real Analysis
Miscellaneous: Quantum Computation, Star Wars: The Course Awakens
Projects
-
02/2025 - 08/2025 Canberra, Australia
Understanding visual attention beehind bee-inspired UAV navigation
Bioengineering Group, University of New South Wales - PI: Sridhar Ravi
- Used the attention patterns of trained Reinforcement Learning (RL) agents to infer how a real bee makes movement decisions
- Built a goal-conditioned RL environment in OpenAI Gym to train a UAV to imitate bee behaviors using bee-like input sensors
- Used SHAP values, a tool for explaining model output, to measure which visual regions trained RL agents attend to
-
04/2024 - Present Pittsburgh, USA
AlephZero: Extending AlphaZero to Infinite Boards
Independent Research - PI: Pranav Rajbhandari
- Defined and analyzed $\aleph_0$ board games, a class of games with potentially unbounded action spaces. Interesting examples include 'Jenga' and '5D Chess with Multiverse Time Travel', as well as classic games like 'Chess' and 'Tic-Tac-Toe'
- Developed AlephZero, an extension of AlphaZero able to learn optimal policies in $\aleph_0$ board games
- Utilized transformer architectures to define policy networks and value networks able to take multi-dimensional sequential input
- Compared the approach to standard algorithms such as AlphaZero, Deep Q-Learning, and Monte Carlo Tree Search
-
01/2024 - 08/2025 Canberra, Australia
Fine Tuning Swimming Locomotion Learned from Mosquito Larvae
University of New South Wales; U.S. Naval Research Laboratory - PI: Sridhar Ravi; Donald Sofge
- Optimized swimming locomotion copied from mosquito larvae for use on a robotic platform
- Utilized Reinforcement Learning to guide a local search algorithm optimizing swimming locomotion
- Designed an OpenAI Gym environment utilizing a Computational Fluid Dynamics (CFD) model for training
- Sped up the training process by using a pre-trained deep neural network to accurately predict forces on a robotic swimmer
- Compared performance of various architectures, including Deep Neural Networks, Recurrent Neural Networks, and LSTMs
-
07/2024 - 10/2024 Washington, D.C., USA
Transformer guided coevolution: Team selection in multiagent adversarial games
U.S. Naval Research Laboratory - PI: Donald Sofge; Prithviraj Dasgupta
- Developed BERTeam, an algorithm to learn diverse and cooperative team selection for multiagent adversarial team games
- Evaluated algorithm on Pyquaticus, a simulation of robotic Marine Capture-The-Flag
- Used Masked Language Modeling to teach optimal team composition to BERTeam's transformer architecture
- Cotrained BERTeam with Coevolutionary Deep Reinforcement Learning to select teams from a diverse population of agents
- Compared training results against established algorithms from the literature
- Developed and maintained unstable_baselines3, a Python package extending stable_baselines3 to multiagent environments
-
08/2023 - 05/2024 Pittsburgh, USA
Geodesic complexity? It's actually quite simplex
Department of Mathematical Sciences, Carnegie Mellon University - PI: Florian Frick
- Explored geodesic complexity, a measure of difficulty for creating an efficient continuous motion plan on a metric space
- Designed a technique utilizing local properties of a space to lower bound its geodesic complexity
- Created and proved correctness of an algorithm calculating cut loci on surfaces of polyhedra, a property related to their geodesic complexity
- Applied these techniques to produce a novel result for the geodesic complexity of the octahedron
- Proved existing geodesic complexity bounds in a new way, displaying the utility of our general method
-
01/2023 - 05/2024 Pittsburgh, USA
Utilizing Sim-to-Real Methods for Training a Robot Arm
Reliable Autonomous Systems Laboratory, Carnegie Mellon University - PI: Reid Simmons
- Led a team of four to design and maintain an OpenAI Gym environment for a Kinova Jaco Gen3 6DOF robot arm
- Simulated a model of the robot arm compatible with the control scheme of the physical arm using the Gazebo simulator
- Utilized ROS to handle communication between the robot arm and Python scripts
- Trained a 'real life filter' with the CycleGAN algorithm to make photo-realistic simulation images used for training
- Implemented a training pipeline for a robotic manipulation task, trained in simulation and refined on the real arm
-
05/2023 - 08/2023 Washington, D.C., USA
Learning NEAT Emergent Behavior in Robot Swarms
Distributed Autonomous Systems Group, U.S. Naval Research Laboratory - PI: Donald Sofge
- Developed an algorithm for training local policies to produce emergent behaviors in a robot swarm
- Designed a training pipeline applying the NeuroEvolution of Augmenting Topologies (NEAT) algorithm to robot swarm control
- Tested the algorithm's performance on a variety of tasks and simulated robotic swarms using the CoppeliaSim simulator
- Utilized ROS to handle communication between Python scripts and robotic swarms (both real and simulated)
-
05/2023 - 08/2023 Washington, D.C., USA
UAV Routing for Enhancing the Performance of a Classifier-in-the-loop
Distributed Autonomous Systems Group, U.S. Naval Research Laboratory - PI: Swaroop Darbha
- Collaborated on an interdisciplinary research project optimizing the information gained from targets by robot swarms
- Designed a heuristic algorithm for planning robot paths inspired by approximate solutions to the Traveling Salesman Problem
- Utilized Mathematica, as well as methods from 'Convex Optimization', to optimize solutions for large test cases
- Tested our algorithm on both generated and real-life problem instances using Julia and the Gurobi optimizer
-
05/2022 - 08/2022 Washington, D.C., USA
Comparing Transfer Learning Methods for Continuous Reinforcement Learning
Adaptive Systems Section, U.S. Naval Research Laboratory - PI: Laura Hiatt
- Planned and executed a research project evaluating various transfer learning methods on robot arm manipulation tasks
- Designed an OpenAI Gym environment for a robotic manipulation task using the MuJoCo simulator
- Compared the performance of known transfer learning methods in transferring knowledge between Deep Neural Networks
- Utilized ROS to handle communication between the robot arm and Python scripts
-
02/2021 - 05/2022 Pittsburgh, USA
Creating a Strategic Agent to Play Jenga
Reliable Autonomous Systems Laboratory, Carnegie Mellon University - PI: Reid Simmons
- Planned and executed a research project evaluating the performance of various adversarial AI algorithms playing Jenga
- Implemented algorithms such as Monte Carlo Tree Search, Deep Q-Networks, and Inverse Reinforcement Learning
- Created a statistical model to estimate the stability of a Jenga tower for use in Model Based Reinforcement Learning
- Trained the model through repeatedly sampling stabilities of towers with the PyBullet physics engine
Publications
-
08/26/2025 Geodesic complexity of the octahedron, and an algorithm for cut loci on convex polyhedra
arXiv
(Preprint, submitted to Journal of Applied and Computational Topology)
-
07/16/2025 Understanding visual attention beehind bee-inspired UAV navigation
arXiv
(Preprint, poster accepted to 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025))
-
05/21/2025 Transformer Guided Coevolution: Improved Team Formation in Multiagent Adversarial Games
Proc. of the 24th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2025)
-
12/13/2024 Fine Tuning Swimming Locomotion Learned from Mosquito Larvae
2024 IEEE International Conference on Robotics and Biomimetics (ROBIO)
-
12/11/2024 Learning NEAT Emergent Behaviors in Robot Swarms
2024 IEEE International Conference on Robotics and Biomimetics (ROBIO)
-
09/14/2024 UAV Routing for Enhancing the Performance of a Classifier-in-the-loop
2024 IEEE International Conference on Robotics and Biomimetics (ROBIO)
Presentations
-
03/11/2024 San Diego, USA
Learning Emergent Behavior in Robot Swarms with NEAT
Naval Applications of Machine Learning
-
05/03/2023 Pittsburgh, USA
Sim-to-real Transfer Reinforcement Learning
Carnegie Mellon University Meeting of the Minds
-
05/02/2022 Pittsburgh, USA
Experience
-
02/2025 - 08/2025 Canberra, Australia
Researcher
University of New South Wales - Bioengineering Group
- Worked in the Bioengineering Group on project "Understanding visual attention beehind bee-inspired UAV navigation"
- Worked in the Bioengineering Group on project "Fine Tuning Swimming Locomotion Learned from Mosquito Larvae"
-
01/2024 - 10/2024; 05/2023 - 08/2023; 05/2022 - 08/2022 Washington, D.C., USA
Researcher
U.S. Naval Research Laboratory
- Worked in the Distributed Autonomous Systems Group on project "Transformer guided coevolution: Team selection in multiagent adversarial games"
- Worked with UNSW Canberra on project "Fine Tuning Swimming Locomotion Learned from Mosquito Larvae"
- Worked in the Distributed Autonomous Systems Group on project "Learning NEAT Emergent Behavior in Robot Swarms"
- Worked with Texas A&M Department of Mechanical Engineering on project "UAV Routing for Enhancing the Performance of a Classifier-in-the-loop"
- Worked in the Adaptive Systems Section on project "Comparing Transfer Learning Methods for Continuous Reinforcement Learning"
-
01/2023 - 05/2023 Mountain View, USA
Researcher
National Aeronautics and Space Administration - Ames Research Center
- Created an AI system to automate calling airport TMI events, especially Ground Stops and Ground Delay Programs
- Explored Imitation Reinforcement Learning methods to compete against the baseline of training a classifier model
- Processed historical data and created models to approximate decision processes using Python and R
-
01/2023 - 05/2024; 02/2021 - 05/2022 Pittsburgh, USA
Researcher
Carnegie Mellon University
- Worked in the Department of Mathematical Sciences on project "Geodesic complexity? It's actually quite simplex"
- Worked in the Reliable Autonomous Systems Laboratory on project "Utilizing Sim-to-Real Methods for Training a Robot Arm"
- Worked in the Reliable Autonomous Systems Laboratory on project "Creating a Strategic Agent to Play Jenga"
-
08/2021 - 12/2022 Pittsburgh, USA
Teaching Assistant
Carnegie Mellon University
For 'AI: Representation and Problem Solving' (3 semesters), 'Concepts of Mathematics' (1 semester), and 'Probability Theory for Computer Scientists' (1 semester)
- Collaborated in a team of up to 10 Teaching Assistants to manage classes of up to 100 students
- Planned and led class-wide review sessions, as well as recitations of about 20 students
- Held office hours to help students understand course material in a one-on-one setting
- Created, tested, and graded programming assignments and written homework
-
05/2021 - 08/2021 Pittsburgh, USA
Research Assistant
Carnegie Mellon University
- Collaborated with a team of three researchers to develop and maintain an R package for Natural Language Processing
- Utilized Rust's BERT Natural Language Processing to tokenize and classify strings in R
-
12/2020 - 01/2021 Atlanta, USA
Programmer
Centers for Disease Control and Prevention - Chronic Viral Diseases Branch Immunology Lab
- Designed a Constraint Satisfaction Problem instance to automate generating laboratory experiment setup procedures
- Utilized Python and R to automate post-experiment data processing
- Refined and deployed these programs across the laboratory after prototyping and incorporating feedback from lab members
Awards
- 05/12/2024
Dean’s List, High Honors (8 semesters)
Carnegie Mellon University
- 05/10/2024
Senior Leadership Recognition Award
Carnegie Mellon University
- 05/01/2024
Dr. William Brown Academic Achievement Award
Carnegie Mellon University
- 05/01/2024
Tartan Leaders of Tomorrow
Carnegie Mellon University
- 03/04/2023
Winner of AI/ML Innovation Challenge
Naval Surface Warfare Center Dahlgren Division
Awarded a \$50,000 cash prize at a three-day competition hosted by the US Navy; designed an algorithm to protect ships from enemy missiles
Activities
-
08/2023 - 05/2024 Pittsburgh, USA
Carnegie Mellon University Super Informal Topology Discussion Group
Presenter, Member
-
08/2020 - 05/2024 Pittsburgh, USA
Carnegie Mellon University Track & Field
Sprint Team Captain
-
08/2020 - 05/2024 Pittsburgh, USA
Carnegie Mellon University PRISM Club
Volunteer, Member
Software & tools
• Stable Baselines |
• MuJoCo |
• LaTeX |
Languages
🏴‍☠️ R |
SML |
🦫 Golang |
Other languages
• English (Native) |
• Nepali (Native) |
• Latin |