publications
2025
- Florian Frick and Pranav Rajbhandari, 2025
The geodesic complexity of a length space $X$ quantifies the number of case distinctions required to continuously choose a shortest path connecting any given start and end point. We prove a local lower bound for the geodesic complexity of $X$, obtained by embedding simplices into $X\times X$. We additionally create, and prove the correctness of, an algorithm to find cut loci on the surfaces of convex polyhedra, as the structure of a space's cut loci is related to its geodesic complexity. We use these techniques to prove that the geodesic complexity of the octahedron is four. Our method is inspired by earlier work of Recio-Mitter and of Davis, and thus recovers their results on the geodesic complexity of the $n$-torus and the tetrahedron, respectively.
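For context, one common formalization of geodesic complexity is sketched below (following Recio-Mitter's convention; the symbols $E_i$, $s_i$, and $GX$ are notation assumed here, and conventions differ on whether the count is $k$ or $k+1$):

\[
\operatorname{GC}(X) \;=\; \min\Bigl\{\, k \;:\; X \times X = E_0 \sqcup \cdots \sqcup E_k,\ \text{each } E_i \text{ admitting a continuous } s_i \colon E_i \to GX \ \text{with } s_i(x,y) \text{ a shortest path from } x \text{ to } y \,\Bigr\},
\]

where $GX$ denotes the space of shortest geodesic paths in $X$. Each set $E_i$ corresponds to one "case" in a continuous choice of shortest paths, which is the counting referred to in the abstract above.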
- Pranav Rajbhandari, Abhi Veda, and 3 more authors, 2025
Bio-inspired design is often used in autonomous UAV navigation due to the capacity of biological systems for flight and obstacle avoidance despite limited sensory and computational capabilities. In particular, honeybees mainly use the sensory input of optic flow, the apparent motion of objects in their visual field, to navigate cluttered environments. In our work, we train a Reinforcement Learning agent to navigate a tunnel with obstacles using only optic flow as sensory input. We inspect the attention patterns of trained agents to determine the regions of optic flow on which they primarily base their motor decisions. We find that agents trained in this way pay most attention to regions of discontinuity in optic flow, as well as regions with large optic flow magnitude. The trained agents appear to navigate a cluttered tunnel by avoiding the obstacles that produce large optic flow, while maintaining a centered position in their environment, which resembles the behavior seen in flying insects. This pattern persists across independently trained agents, which suggests that this could be a good strategy for developing a simple explicit control law for physical UAVs.
- Pranav Rajbhandari, Prithviraj Dasgupta, and Donald Sofge. In Proc. of the 24th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2025), Detroit, Michigan, USA, 2025
With the increasing number of autonomous platforms in everyday life, forming coordinated teams of agents becomes vital. To address this, we propose BERTeam, an algorithm inspired by Natural Language Processing. BERTeam trains a transformer-based deep neural network to select team members from a population of agents. It can be integrated with coevolutionary deep reinforcement learning, which evolves a diverse set of players to choose from. We evaluate BERTeam in Marine Capture-The-Flag and find that it learns non-trivial team compositions that outperform unknown opponents. In this setting, BERTeam also outperforms MCAA, another team selection algorithm.
2024
- Pranav Rajbhandari and Donald Sofge. In 2024 IEEE International Conference on Robotics and Biomimetics (ROBIO), 2024
When researching robot swarms, many studies observe complex group behavior emerging from the individual agents' simple local actions. However, learning an individual policy that produces a desired group behavior remains a challenging problem. We present a method of training distributed robotic swarm algorithms to produce emergent behavior. Inspired by the biological evolution of emergent behavior in animals, we use an evolutionary algorithm to train a 'population' of individual behaviors to produce a desired group behavior. We perform experiments using simulations of the Georgia Tech Miniature Autonomous Blimps (GT-MABs) aerial robotics platform, conducted in the CoppeliaSim simulator. Additionally, we test on simulations of Anki Vector robots to demonstrate our algorithm's effectiveness across different modes of actuation. We evaluate our algorithm on tasks where a somewhat complex group behavior is required for success, including an Area Coverage task and a Wall Climb task. We compare behaviors evolved using our algorithm against hand-designed policies, which we create in order to exhibit the emergent behaviors we desire.
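A minimal sketch of the evolutionary loop described above: a shared individual policy (here just a parameter vector) is scored by a group-level objective and evolved by truncation selection plus Gaussian mutation. The `group_fitness` function is a toy stand-in rewarding spread-out agent positions, loosely analogous to coverage; it, and all parameter choices, are illustrative assumptions, not the paper's tasks.

```python
import random

def group_fitness(params, n_agents=5):
    # Toy group objective: agents running copies of the same policy
    # derive positions from the shared parameters; reward spread
    # (sum of pairwise distances), loosely mimicking area coverage.
    positions = [(params[0] * i + params[1]) % 1.0 for i in range(n_agents)]
    return sum(abs(a - b) for a in positions for b in positions)

def mutate(params, sigma=0.1, rng=random):
    # Gaussian perturbation of every parameter.
    return [p + rng.gauss(0, sigma) for p in params]

def evolve(pop_size=20, generations=50, seed=0):
    rng = random.Random(seed)
    # Each population member is one candidate individual behavior.
    population = [[rng.random(), rng.random()] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=group_fitness, reverse=True)
        elites = scored[: pop_size // 4]  # keep the best quarter
        population = elites + [
            mutate(rng.choice(elites), rng=rng)
            for _ in range(pop_size - len(elites))
        ]
    return max(population, key=group_fitness)

best = evolve()
```

The key point the sketch illustrates is that selection acts only on the group-level score, so no per-agent reward shaping is needed for the emergent behavior to improve.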
- Pranav Rajbhandari, Karthick Dhileep, and 2 more authors. In 2024 IEEE International Conference on Robotics and Biomimetics (ROBIO), 2024
In prior research, we analyzed the backward swimming motion of mosquito larvae and created a parameterized approximation of it in a Computational Fluid Dynamics (CFD) simulation. Since the parameterized swimming motion is replicated from observed larvae, it is not necessarily the most efficient locomotion. In this project, we further optimize this swimming locomotion for the simulated platform, using Reinforcement Learning to guide local parameter updates. Since the majority of the computational cost arises from the CFD model, we additionally train a deep neural network to predict the forces acting on the swimmer model. We find that this method is effective at performing local search to improve the parameterized swimming locomotion.
- Deepak Prakash Kumar, Pranav Rajbhandari, and 3 more authors. Journal of Intelligent & Robotic Systems, 2024
Some human-machine systems are designed so that machines (robots) gather and deliver data to remotely located operators (humans) through an interface to aid them in classification. The performance of a human as a (binary) classifier-in-the-loop is characterized by the probabilities of correctly classifying objects (or points of interest) as a true target or a false target. These two probabilities depend on the time spent collecting information at a point of interest (POI), known as dwell time. The information gain associated with collecting information at a POI is then a function of dwell time, discounted by the revisit time, i.e., the duration between consecutive revisits to the same POI, to ensure that the vehicle covers all POIs in a timely manner. The objective of the routing problem for classification is to route the vehicles optimally, which is a discrete problem, and determine the optimal dwell time at each POI, which is a continuous optimization problem, so as to maximize the total discounted information gain while visiting every POI at least once. Because this coupling of discrete and continuous decisions makes the problem hard to solve, we make the simplifying assumption that the information gain is discounted exponentially by the revisit time; this assumption enables one to decouple the routing problem from the problem of determining the optimal dwell time at each POI in the single-vehicle case. For the multi-vehicle problem, which involves task partitioning between vehicles in addition to routing and dwell time computation, we provide a fast heuristic to obtain high-quality feasible solutions.
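The exponential-discounting assumption can be written schematically as follows (the symbols $G$, $I$, $d$, $T$, and $\alpha$ are illustrative notation assumed here, not the paper's):

\[
G(d, T) \;=\; I(d)\, e^{-\alpha T},
\]

where $d$ is the dwell time at a POI, $T$ its revisit time, $I(d)$ the undiscounted information gain, and $\alpha > 0$ a discount rate. The multiplicative form is what underlies the decoupling described above: the discount factor depends only on the route through $T$, while $I(d)$ depends only on the dwell time, so each can be optimized in its own subproblem in the single-vehicle case.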