CoverNet: Multimodal Behavior Prediction using Trajectory Sets (2020-04-01)

We present CoverNet, a new method for multimodal, probabilistic trajectory prediction for urban driving. Previous work has employed a variety of methods, including multimodal regression, occupancy maps, and 1-step stochastic policies. We instead frame the trajectory prediction problem as classification over a diverse set of trajectories. The size of this set remains manageable due to the limited number of distinct actions that can be taken over a reasonable prediction horizon. We structure the trajectory set to a) ensure a desired level of coverage of the state space, and b) eliminate physically impossible trajectories. By dynamically generating trajectory sets based on the agent's current state, we can further improve our method's efficiency. We demonstrate our approach on public, real-world self-driving datasets, and show that it outperforms state-of-the-art methods.
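
The idea of classifying over a covering trajectory set can be illustrated with a small sketch. It is not the authors' implementation: the first-fit greedy rule, the distance measures, and the names `greedy_cover` and `class_label` are all assumptions chosen for illustration. The sketch keeps a candidate only if no already-kept trajectory lies within `epsilon` of it, then labels a ground-truth trajectory with the index of its nearest set member.

```python
import numpy as np

def max_pointwise_distance(a, b):
    """Largest Euclidean distance between matching waypoints of two trajectories."""
    return float(np.max(np.linalg.norm(a - b, axis=-1)))

def greedy_cover(trajectories, epsilon):
    """First-fit greedy cover (illustrative assumption, not the paper's algorithm).

    Every input trajectory ends up within epsilon of at least one kept trajectory.
    """
    kept = []
    for traj in trajectories:
        if all(max_pointwise_distance(traj, k) > epsilon for k in kept):
            kept.append(traj)
    return np.stack(kept)

def class_label(trajectory_set, ground_truth):
    """Index of the set member with the smallest mean pointwise distance to the ground truth."""
    dists = np.linalg.norm(trajectory_set - ground_truth, axis=-1).mean(axis=-1)
    return int(np.argmin(dists))

# Toy candidates: straight 10 m paths at different lateral offsets (made-up data).
t = np.linspace(0.0, 1.0, 5)

def straight(y):
    return np.stack([t * 10.0, np.full_like(t, y)], axis=-1)

candidates = np.stack([straight(y) for y in (0.0, 0.5, 2.0, 2.2, 4.0)])
trajectory_set = greedy_cover(candidates, epsilon=1.0)
label = class_label(trajectory_set, straight(1.8))
```

With these toy inputs the set shrinks from five candidates to three, and the classification target for the observed trajectory is the index of the middle one. A classifier would then be trained to output a distribution over these indices.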

Sub-Goal Trees -- a Framework for Goal-Directed Trajectory Prediction and Optimization (2019-06-12)

Many AI problems, in robotics and other domains, are goal-directed, essentially seeking a trajectory leading to some goal state. In such problems, the way we choose to represent a trajectory underlies algorithms for trajectory prediction and optimization. Interestingly, almost all prior work in imitation and reinforcement learning builds on a sequential trajectory representation -- calculating the next state in the trajectory given its predecessors. We propose a different perspective: a goal-conditioned trajectory can be represented by first selecting an intermediate state between start and goal, partitioning the trajectory into two. We then recursively predict intermediate points on each sub-segment until a complete trajectory is obtained. We call this representation a sub-goal tree, and building on it, we develop new methods for trajectory prediction, learning, and optimization. We show that in a supervised learning setting, sub-goal trees better account for trajectory variability, and can predict trajectories exponentially faster at test time by leveraging concurrent computation. Then, for optimization, we derive a new dynamic programming equation for sub-goal trees, and use it to develop new planning and reinforcement learning algorithms. These algorithms, which are not based on the standard Bellman equation, naturally account for hierarchical sub-goal structure in a task. Empirical results on motion planning domains show that the sub-goal tree framework significantly improves both accuracy and prediction time.
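
The recursive representation can be sketched in a few lines. This is only an illustration under stated assumptions: `predict_subgoal` is a hypothetical stand-in for a learned midpoint predictor and simply returns the straight-line midpoint, and `subgoal_tree` is a name invented here.

```python
def predict_subgoal(start, goal):
    """Hypothetical stand-in for a learned predictor: the straight-line midpoint."""
    return tuple((s + g) / 2.0 for s, g in zip(start, goal))

def subgoal_tree(start, goal, depth):
    """Return the interior states between start and goal, in order.

    Each call splits its segment in two and recurses on both halves,
    so a tree of the given depth yields 2**depth - 1 interior states.
    """
    if depth == 0:
        return []
    mid = predict_subgoal(start, goal)
    left = subgoal_tree(start, mid, depth - 1)
    right = subgoal_tree(mid, goal, depth - 1)
    return left + [mid] + right

start, goal = (0.0, 0.0), (8.0, 4.0)
trajectory = [start] + subgoal_tree(start, goal, depth=3) + [goal]
```

The two recursive calls do not depend on each other, so all predictions at the same tree level could be evaluated concurrently. The number of sequential steps then grows with the depth rather than with the number of trajectory points.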

A Scalable Framework for Trajectory Prediction (2019-02-27)

Trajectory prediction (TP) is of great importance for a wide range of location-based applications in intelligent transport systems such as location-based advertising, route planning, traffic management, and early warning systems. In the last few years, the widespread use of GPS navigation systems and of vehicles equipped with wireless communication technology has resulted in huge volumes of trajectory data. Using this data with spatio-temporal techniques to predict trajectories efficiently and accurately is an ongoing research problem. Existing TP approaches are limited to short-term predictions. Moreover, they cannot handle a large volume of trajectory data for long-term prediction. To address these limitations, we propose a scalable clustering and Markov chain based hybrid framework, called Traj-clusiVAT-based TP, for both short-term and long-term trajectory prediction, which can handle a large number of overlapping trajectories in a dense road network. Traj-clusiVAT can also determine the number of clusters, which represent different movement behaviours in input trajectory data. In our experiments, we compare our proposed approach with a mixed Markov model (MMM)-based scheme and a trajectory clustering, NETSCAN-based TP method for both short- and long-term trajectory predictions. We performed our experiments on two real vehicle trajectory datasets, including a large-scale trajectory dataset consisting of 3.28 million trajectories obtained from 15,061 taxis in Singapore over a period of one month. The results show that our proposed approach outperforms the existing approaches in both short- and long-term prediction performance, based on prediction accuracy and distance error (in km).
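
The Markov chain half of such a hybrid framework can be illustrated with a first-order model over road-segment labels. This sketch is not Traj-clusiVAT: the clustering stage is omitted entirely, and the function names and the toy segment sequences are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def fit_transitions(sequences):
    """Count first-order transitions between consecutive segment labels."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return counts

def predict(counts, state, steps):
    """Follow the most frequent transition for up to `steps` steps."""
    path = []
    for _ in range(steps):
        if state not in counts:  # no outgoing transition was observed
            break
        state = counts[state].most_common(1)[0][0]
        path.append(state)
    return path

# Made-up trips expressed as sequences of road-segment labels.
trips = [
    ["A", "B", "C", "D"],
    ["A", "B", "C", "E"],
    ["A", "B", "C", "D"],
    ["F", "B", "C", "D"],
]
counts = fit_transitions(trips)
short_term = predict(counts, "A", steps=1)
long_term = predict(counts, "A", steps=3)
```

In a cluster-based scheme, one such transition table would typically be fitted per cluster of similar trajectories, so that long-term predictions follow the movement pattern of the cluster the observed prefix belongs to.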

Interplanetary Transfers via Deep Representations of the Optimal Policy and/or of the Value Function (2019-04-18)

A number of applications to interplanetary trajectories have been recently proposed based on deep networks. These approaches often rely on the availability of a large number of optimal trajectories to learn from. In this paper we introduce a new method to quickly create millions of optimal spacecraft trajectories from a single nominal trajectory. Apart from the generation of the nominal trajectory, no additional optimal control problems need to be solved, as all the trajectories, by construction, satisfy Pontryagin's minimum principle and the relevant transversality conditions. We then consider deep feedforward neural networks and benchmark three learning methods on the created dataset: policy imitation, value function learning, and value function gradient learning. Our results are shown for the case of the interplanetary trajectory optimization problem of reaching Venus orbit, with the nominal trajectory starting from the Earth. We find that both policy imitation and value function gradient learning are able to learn the optimal state feedback, while in the case of value function learning the optimal policy is not captured; only the final value of the optimal propellant mass is.

Vehicle Trajectory Prediction by Transfer Learning of Semi-Supervised Models (2020-07-13)

In this work we show that semi-supervised models for vehicle trajectory prediction significantly improve performance over supervised models on state-of-the-art real-world benchmarks. Moving from supervised to semi-supervised models allows scaling up by using unlabeled data, increasing the number of images in pre-training from millions to a billion. We perform ablation studies comparing transfer learning of semi-supervised and supervised models while keeping all other factors equal. Within semi-supervised models, we compare contrastive learning with teacher-student methods, as well as networks predicting a small number of trajectories with networks predicting probabilities over a large trajectory set. Our results using both low-level and mid-level representations of the driving environment demonstrate the applicability of semi-supervised methods for real-world vehicle trajectory prediction.

Diverse Trajectory Forecasting with Determinantal Point Processes (2019-12-23)

The ability to forecast a set of likely yet diverse possible future behaviors of an agent (e.g., future trajectories of a pedestrian) is essential for safety-critical perception systems (e.g., autonomous vehicles). In particular, a set of possible future behaviors generated by the system must be diverse to account for all possible outcomes in order to take necessary safety precautions. It is not sufficient to maintain a set of the most likely future outcomes because the set may only contain perturbations of a single outcome. While generative models such as variational autoencoders (VAEs) have been shown to be a powerful tool for learning a distribution over future trajectories, randomly drawn samples from the learned implicit likelihood model may not be diverse -- the likelihood model is derived from the training data distribution and the samples will concentrate around the major mode that has the most data. In this work, we propose to learn a diversity sampling function (DSF) that generates a diverse and likely set of future trajectories. The DSF maps forecasting context features to a set of latent codes which can be decoded by a generative model (e.g., VAE) into a set of diverse trajectory samples. Concretely, the process of identifying the diverse set of samples is posed as parameter estimation for the DSF. To learn the parameters of the DSF, the diversity of the trajectory samples is evaluated by a diversity loss based on a determinantal point process (DPP). Gradient descent is performed over the DSF parameters, which in turn moves the latent codes of the sample set to find an optimal diverse and likely set of trajectories. Our method is a novel application of DPPs to optimize a set of items (trajectories) in continuous space. We demonstrate the diversity of the trajectories produced by our approach on both low-dimensional 2D trajectory data and high-dimensional human motion data.
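
How a DPP can score the diversity of a sample set may be easier to see in code. The sketch below uses one standard DPP quantity, the expected cardinality of an L-ensemble, trace(I - (L + I)^-1); whether this matches the exact loss used in the paper is an assumption, as are the Gaussian similarity kernel, the `bandwidth` parameter, and the function names.

```python
import numpy as np

def dpp_kernel(samples, quality, bandwidth=1.0):
    """Build L = diag(q) S diag(q), with a Gaussian similarity S between samples."""
    flat = samples.reshape(len(samples), -1)
    sq_dists = np.sum((flat[:, None, :] - flat[None, :, :]) ** 2, axis=-1)
    similarity = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    return quality[:, None] * similarity * quality[None, :]

def expected_cardinality(L):
    """Expected subset size under an L-ensemble DPP: trace(I - (L + I)^-1)."""
    identity = np.eye(L.shape[0])
    return float(np.trace(identity - np.linalg.inv(L + identity)))

# Toy data: three identical trajectories versus three well-separated ones.
base = np.stack([np.arange(4.0), np.zeros(4)], axis=-1)  # shape (4, 2)
clumped = np.stack([base, base, base])
diverse = np.stack([base + np.array([0.0, k]) for k in (0.0, 10.0, 20.0)])
quality = np.ones(3)  # equal likelihood weights, an assumption

score_clumped = expected_cardinality(dpp_kernel(clumped, quality))
score_diverse = expected_cardinality(dpp_kernel(diverse, quality))
```

The well-separated set scores higher than the set of duplicates, so minimising the negative of this score pushes samples apart, while the quality terms keep unlikely samples from being rewarded.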

It Is Not the Journey but the Destination: Endpoint Conditioned Trajectory Prediction (2020-07-18)

Human trajectory forecasting with multiple socially interacting agents is of critical importance for autonomous navigation in human environments, e.g., for self-driving cars and social robots. In this work, we present Predicted Endpoint Conditioned Network (PECNet) for flexible human trajectory prediction. PECNet infers distant trajectory endpoints to assist in long-range multi-modal trajectory prediction. A novel non-local social pooling layer enables PECNet to infer diverse yet socially compliant trajectories. Additionally, we present a simple "truncation-trick" for improving few-shot multi-modal trajectory prediction performance. We show that PECNet improves state-of-the-art performance on the Stanford Drone trajectory prediction benchmark by ~20.9% and on the ETH/UCY benchmark by ~40.8%. Project homepage: https://karttikeya.github.io/publication/htf/

SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction (2020-07-26)

We propose advances that address two key challenges in future trajectory prediction: (i) multimodality in both training data and predictions and (ii) constant-time inference regardless of the number of agents. Existing trajectory predictions are fundamentally limited by a lack of diversity in training data, which is difficult to acquire with sufficient coverage of possible modes. Our first contribution is an automatic method to simulate diverse trajectories in the top view. It uses pre-existing datasets and maps as initialization, mines existing trajectories to represent realistic driving behaviors, and uses a multi-agent vehicle dynamics simulator to generate diverse new trajectories that cover various modes and are consistent with scene layout constraints. Our second contribution is a novel method that generates diverse predictions while accounting for scene semantics and multi-agent interactions, with constant-time inference independent of the number of agents. We propose a convLSTM with novel state pooling operations and losses to predict scene-consistent states of multiple agents in a single forward pass, along with a CVAE for diversity. We validate our proposed multi-agent trajectory prediction approach by training and testing on the proposed simulated dataset and existing real datasets of traffic scenes. In both cases, our approach outperforms SOTA methods by a large margin, highlighting the benefits of both our diverse dataset simulation and constant-time diverse trajectory prediction methods.

Amortized Q-learning with Model-based Action Proposals for Autonomous Driving on Highways (2020-12-06)

Well-established optimization-based methods can guarantee an optimal trajectory for a short optimization horizon, typically no longer than a few seconds. As a result, choosing the optimal trajectory for this short horizon may still result in a sub-optimal long-term solution. At the same time, the resulting short-term trajectories allow for effective, comfortable, and provably safe maneuvers in a dynamic traffic environment. In this work, we address the question of how to ensure an optimal long-term driving strategy while keeping the benefits of classical trajectory planning. We introduce a reinforcement learning based approach that, coupled with a trajectory planner, learns an optimal long-term decision-making strategy for driving on highways. By generating locally optimal maneuvers online as actions, we strike a balance between the infinite low-level continuous action space and the limited flexibility of a fixed number of predefined standard lane-change actions. We evaluated our method on realistic scenarios in the open-source traffic simulator SUMO and achieved better performance than the four benchmark approaches we compared against: a random action-selecting agent, a greedy agent, a high-level discrete-action agent, and an IDM-based SUMO-controlled agent.

An Unsupervised Learning Method with Convolutional Auto-Encoder for Vessel Trajectory Similarity Computation (2021-01-09)

To achieve reliable mining results for massive vessel trajectories, one of the most important challenges is how to efficiently compute the similarities between different vessel trajectories. The computation of vessel trajectory similarity has recently attracted increasing attention in the maritime data mining research community. However, traditional shape- and warping-based methods often suffer from several drawbacks, such as high computational cost and sensitivity to unwanted artifacts and non-uniform sampling rates. To eliminate these drawbacks, we propose an unsupervised learning method which automatically extracts low-dimensional features through a convolutional auto-encoder (CAE). In particular, we first generate informative trajectory images by remapping the raw vessel trajectories into two-dimensional matrices while maintaining the spatio-temporal properties. Based on the massive vessel trajectories collected, the CAE can learn low-dimensional representations of the informative trajectory images in an unsupervised manner. Computing trajectory similarity then reduces to efficiently computing the similarities between the learned low-dimensional features, which strongly correlate with the raw vessel trajectories. Comprehensive experiments on realistic data sets have demonstrated that the proposed method largely outperforms traditional trajectory similarity computation methods in terms of efficiency and effectiveness. High-quality trajectory clustering can also be achieved using the CAE-based trajectory similarity computation results.
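
The remapping of a raw trajectory into a two-dimensional matrix can be sketched as a simple rasterisation. This is an illustrative assumption rather than the paper's procedure: it records only how often each grid cell is visited, ignores timing, and the grid size, bounds, and function name are invented here.

```python
import numpy as np

def trajectory_to_image(points, bounds, size=8):
    """Rasterise (x, y) points into a size x size matrix of visit counts."""
    (xmin, xmax), (ymin, ymax) = bounds
    image = np.zeros((size, size), dtype=np.int32)
    cols = ((points[:, 0] - xmin) / (xmax - xmin) * size).astype(int)
    rows = ((points[:, 1] - ymin) / (ymax - ymin) * size).astype(int)
    # Points on the upper boundary fall into the last cell.
    cols = np.clip(cols, 0, size - 1)
    rows = np.clip(rows, 0, size - 1)
    np.add.at(image, (rows, cols), 1)  # accumulates repeated visits correctly
    return image

# Made-up track: five evenly spaced points along the diagonal of a unit square.
track = np.stack([np.linspace(0.0, 1.0, 5), np.linspace(0.0, 1.0, 5)], axis=-1)
image = trajectory_to_image(track, bounds=((0.0, 1.0), (0.0, 1.0)))
```

Every trajectory becomes a fixed-size matrix regardless of its length or sampling rate, which is what makes it a convenient input for a convolutional auto-encoder. Similarity can then be measured between the encoder's feature vectors instead of between raw point sequences.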

Kernel Trajectory Maps for Multi-Modal Probabilistic Motion Prediction (2019-10-07)

Understanding the dynamics of an environment, such as the movement of humans and vehicles, is crucial for agents to achieve long-term autonomy in urban environments. This requires the development of methods to capture the multi-modal and probabilistic nature of motion patterns. We present Kernel Trajectory Maps (KTM) to capture the trajectories of movement in an environment. KTMs leverage the expressiveness of kernels from non-parametric modelling by projecting input trajectories onto a set of representative trajectories, to condition on a sequence of observed waypoint coordinates, and predict a multi-modal distribution over possible future trajectories. The output is a mixture of continuous stochastic processes, where each realisation is a continuous functional trajectory, which can be queried at arbitrarily fine time steps.

Variational Autoencoder Trajectory Primitives with Continuous and Discrete Latent Codes (2019-12-09)

Imitation learning is an intuitive approach for teaching motion to robotic systems. Although previous studies have proposed various methods to model demonstrated movement primitives, one limitation of existing methods is that it is not trivial to modify their planned trajectory once the model is learned. The trajectory of a robotic manipulator is often high-dimensional, and it is not easy to tune the shape of the planned trajectory in an intuitive manner. We address this problem by learning the latent space of the robot trajectory. If the latent variable of the trajectories can be learned, it can be used to tune the trajectory in an intuitive manner even when the user is not an expert. We propose a framework for modeling demonstrated trajectories with a neural network that learns the low-dimensional latent space. Our neural network structure is built on the variational autoencoder (VAE) with discrete and continuous latent variables. We extend the structure of the existing VAE to obtain a decoder that is conditioned on the goal position of the trajectory, for generalization to different goal positions. To cope with the requirement for massive training data, we use a trajectory augmentation technique inspired by the data augmentation commonly used in the computer vision community. In the proposed framework, the latent variables that encode the multiple types of trajectories are learned in an unsupervised manner. The learned decoder can be used as a motion planner in which the user can specify the goal position and the trajectory types by setting the latent variables. The experimental results show that our neural network can be trained using a limited number of demonstrated trajectories and that interpretable latent representations can be learned.

Scalable Unsupervised Multi-Criteria Trajectory Segmentation and Driving Preference Mining (2020-10-23)

We present analysis techniques for large trajectory data sets that aim to provide a semantic understanding of trajectories that goes beyond treating them as point sequences in time and space. The presented techniques use a driving preference model w.r.t. road segment traversal costs, e.g., travel time and distance, to analyze and explain trajectories. In particular, we present trajectory mining techniques that can (a) find interesting points within a trajectory indicating, e.g., a via-point, and (b) recover the driving preferences of a driver based on their chosen trajectory. We evaluate our techniques on the tasks of via-point identification and personalized routing using a data set of more than 1 million vehicle trajectories collected throughout Denmark during a 3-year period. Our techniques can be implemented efficiently and are highly parallelizable, allowing them to scale to millions or billions of trajectories.

Occ-Traj120: Occupancy Maps with Associated Trajectories (2020-03-02)

Trajectory modelling has been the principal research area for understanding and anticipating human behaviour. Predicting the dynamic path of an agent by observing it and its surrounding environment is essential for applications such as autonomous driving and indoor navigation suggestions. However, despite the many studies that have been presented, most available datasets do not contain any information on environmental factors, such as the occupancy representation of the map, which arguably play a significant role in how an agent chooses its trajectory. We present a trajectory dataset with the corresponding occupancy representations of different local maps. The dataset contains more than 120 locally-structured maps with occupancy representations and more than 110K trajectories in total. Each map has a few hundred corresponding simulated trajectories that navigate from one spatial location in a room to another point. The dataset is freely available online.

Social-WaGDAT: Interaction-aware Trajectory Prediction via Wasserstein Graph Double-Attention Network (2020-02-14)

Effective understanding of the environment and accurate trajectory prediction of surrounding dynamic obstacles are indispensable for intelligent mobile systems (like autonomous vehicles and social robots) to achieve safe and high-quality planning when they navigate in highly interactive and crowded scenarios. Due to frequent interactions and uncertainty in the scene evolution, the prediction system should support relational reasoning over different entities and provide a distribution of future trajectories for each agent. In this paper, we propose a generic generative neural system (called Social-WaGDAT) for multi-agent trajectory prediction, which takes a step toward explicit interaction modeling by incorporating relational inductive biases with a dynamic graph representation and leverages both trajectory and scene context information. We also employ an efficient kinematic constraint layer applied to vehicle trajectory prediction, which not only ensures physical feasibility but also enhances model performance. The proposed system is evaluated on three public benchmark datasets for trajectory prediction, where the agents cover pedestrians, cyclists, and on-road vehicles. The experimental results demonstrate that our model achieves better performance than various baseline approaches in terms of prediction accuracy.

Intention-aware Long Horizon Trajectory Prediction of Surrounding Vehicles using Dual LSTM Networks (2019-06-06)

As autonomous vehicles (AVs) need to interact with other road users, it is important to comprehensively understand the dynamic traffic environment, especially the possible future trajectories of surrounding vehicles. This paper presents an algorithm for long-horizon trajectory prediction of surrounding vehicles using a dual long short-term memory (LSTM) network, which is capable of effectively improving prediction accuracy in strongly interactive driving environments. In contrast to traditional approaches, which require trajectory matching and manual feature selection, this method can automatically learn high-level spatial-temporal features of driver behaviors from naturalistic driving data through sequence learning. By employing two blocks of LSTMs, the proposed method feeds the sequential trajectory to the first LSTM for driver intention recognition as an intermediate indicator, which is immediately followed by a second LSTM for future trajectory prediction. Test results from real-world highway driving data show that the proposed method can, in comparison to state-of-the-art methods, output more accurate and reasonable estimates of different future trajectories over a 5 s time horizon, with root mean square errors (RMSE) for longitudinal and lateral prediction of less than 5.77 m and 0.49 m, respectively.
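
Reporting RMSE separately for the longitudinal and lateral directions amounts to averaging squared errors per coordinate. The sketch below assumes positions are stored with the longitudinal coordinate in column 0 and the lateral coordinate in column 1; that layout, the function name, and the toy numbers are assumptions and do not reproduce the paper's evaluation.

```python
import numpy as np

def axis_rmse(predicted, actual):
    """Per-axis RMSE for arrays of shape (N, T, 2).

    Returns (longitudinal_rmse, lateral_rmse), pooling all N trajectories
    and all T time steps.
    """
    squared_error = (predicted - actual) ** 2
    rmse = np.sqrt(squared_error.reshape(-1, 2).mean(axis=0))
    return float(rmse[0]), float(rmse[1])

# Made-up example: every prediction is 3.0 m ahead and 0.4 m to the side.
actual = np.zeros((2, 4, 2))
predicted = actual + np.array([3.0, 0.4])
longitudinal, lateral = axis_rmse(predicted, actual)
```

Longitudinal errors are usually much larger than lateral ones in highway driving because speed uncertainty accumulates along the lane, which is why the two are reported separately rather than as a single distance error.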

Map-Adaptive Goal-Based Trajectory Prediction (2020-09-09)

We present a new method for multi-modal, long-term vehicle trajectory prediction. Our approach relies on using lane centerlines captured in rich maps of the environment to generate a set of proposed goal paths for each vehicle. Using these paths -- which are generated at run time and therefore dynamically adapt to the scene -- as spatial anchors, we predict a set of goal-based trajectories along with a categorical distribution over the goals. This approach allows us to directly model the goal-directed behavior of traffic actors, which unlocks the potential for more accurate long-term prediction. Our experimental results on both a large-scale internal driving dataset and on the public nuScenes dataset show that our model outperforms state-of-the-art approaches for vehicle trajectory prediction over a 6-second horizon. We also empirically demonstrate that our model is better able to generalize to road scenes from a completely new city than existing methods.

Robust Trajectory Forecasting for Multiple Intelligent Agents in Dynamic Scene (2020-05-26)

Trajectory forecasting, or trajectory prediction, of multiple interacting agents in dynamic scenes, is an important problem for many applications, such as robotic systems and autonomous driving. The problem is a great challenge because of the complex interactions among the agents and their interactions with the surrounding scenes. In this paper, we present a novel method for the robust trajectory forecasting of multiple intelligent agents in dynamic scenes. The proposed method consists of three major interrelated components: an interaction net for global spatiotemporal interactive feature extraction, an environment net for decoding dynamic scenes (i.e., the surrounding road topology of an agent), and a prediction net that combines the spatiotemporal feature, the scene feature, the past trajectories of agents and some random noise for the robust trajectory prediction of agents. Experiments on pedestrian-walking and vehicle-pedestrian heterogeneous datasets demonstrate that the proposed method outperforms the state-of-the-art prediction methods in terms of prediction accuracy.

Latent Variable Nested Set Transformers & AutoBots (2021-02-19)

Humans have the innate ability to attend to the most relevant actors in their vicinity and can forecast how they may behave in the future. This ability will be crucial for the deployment of safety-critical agents such as robots or vehicles which interact with humans. We propose a theoretical framework for this problem setting based on autoregressively modelling sequences of nested sets, using latent variables to better capture multimodal distributions over future sets of sets. We present a new model architecture, which we call a Nested Set Transformer, that employs multi-head self-attention blocks over sets of sets, serving as a form of social attention between the elements of the sets at every timestep. Our approach can produce a distribution over future trajectories for all agents under consideration, or focus upon the trajectory of an ego-agent. We validate the Nested Set Transformer in autonomous driving settings, where we refer to it as "AutoBot" and model the trajectory of an ego-agent based on the sequential observations of key attributes of multiple agents in a scene. AutoBot produces results better than state-of-the-art published prior work on the challenging nuScenes vehicle trajectory modeling benchmark. We also examine the multi-agent prediction version of our model and jointly forecast an ego-agent's future trajectory along with the other agents in the scene. We validate the behavior of our proposed Nested Set Transformer for scene-level forecasting with a pedestrian trajectory dataset.

A Generic Framework for Clustering Vehicle Motion Trajectories (2020-09-25)

The development of autonomous vehicles requires access to a large amount of data covering the driving scenarios of concern. However, manual annotation of such driving scenarios is costly, and rule-based trajectory labeling systems are subject to errors. To address this issue, we propose an effective non-parametric trajectory clustering framework consisting of five stages: (1) aligning trajectories and quantifying their pairwise temporal dissimilarities, (2) embedding the trajectory-based dissimilarities into a vector space, (3) extracting transitive relations, (4) embedding the transitive relations into a new vector space, and (5) clustering the trajectories with an optimal number of clusters. We investigate and evaluate the proposed framework on a challenging real-world dataset consisting of annotated trajectories. We observe that the proposed framework achieves promising results, despite the complexity caused by having trajectories of varying length. Furthermore, we extend the framework to validate the augmentation of the real dataset with synthetic data generated by a Generative Adversarial Network (GAN), where we examine whether the generated trajectories are consistent with the true underlying clusters.