Synthesizing Skeletons for Reactive Systems (2018-03-25)

We present an analysis technique for temporal specifications of reactive systems that identifies, on the level of individual system outputs over time, which parts of the implementation are determined by the specification, and which parts are still open. This information is represented in the form of a labeled transition system, which we call a skeleton. Each state of the skeleton is labeled with a three-valued assignment to the output variables: each output can be true, false, or open, where true or false means that the value must be true or false, respectively, and open means that either value is still possible. We present algorithms for the verification of skeletons and for the learning-based synthesis of skeletons from specifications in linear-time temporal logic (LTL). The algorithm returns a skeleton that satisfies the given LTL specification in time polynomial in the size of the minimal skeleton. Our new analysis technique can be used to recognize and repair specifications that underspecify critical situations. The technique thus complements existing methods for the recognition and repair of overspecifications via the identification of unrealizable cores.
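The three-valued labeling can be made concrete with a small sketch. The arbiter skeleton, its state names, and the Moore-style consistency check below are hypothetical illustrations under stated assumptions, not the paper's construction:

```python
from dataclasses import dataclass, field

OPEN = None  # output value not determined by the specification

@dataclass
class SkeletonState:
    outputs: dict                                    # var -> True / False / OPEN
    transitions: dict = field(default_factory=dict)  # input label -> next state name

def consistent(skeleton, initial, trace):
    """Check that a concrete run agrees with the skeleton wherever its
    three-valued labeling is determined (True or False); OPEN outputs
    accept either value."""
    state = skeleton[initial]
    for label, concrete in trace:
        state = skeleton[state.transitions[label]]
        for var, val in state.outputs.items():
            if val is not OPEN and concrete[var] != val:
                return False
    return True

# hypothetical arbiter: after a request the grant is forced high, otherwise open
skeleton = {
    "idle":  SkeletonState({"grant": OPEN}, {"req": "serve", "noreq": "idle"}),
    "serve": SkeletonState({"grant": True}, {"req": "serve", "noreq": "idle"}),
}
```

A run that grants after a request is consistent with this skeleton; one that withholds the grant in the `serve` state is not.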

Elaborating on Learned Demonstrations with Temporal Logic Specifications (2020-05-22)

Most current methods for learning from demonstrations assume that those demonstrations alone are sufficient to learn the underlying task. This is often untrue, especially if extra safety specifications exist which were not present in the original demonstrations. In this paper, we allow an expert to elaborate on their original demonstration with additional specification information using linear temporal logic (LTL). Our system converts LTL specifications into a differentiable loss. This loss is then used to learn a dynamic movement primitive that satisfies the underlying specification, while remaining close to the original demonstration. Further, by leveraging adversarial training, our system learns to robustly satisfy the given LTL specification on unseen inputs, not just those seen in training. We show that our method is expressive enough to work across a variety of common movement specification patterns such as obstacle avoidance, patrolling, keeping steady, and speed limitation. In addition, we show that our system can modify a base demonstration with complex specifications by incrementally composing multiple simpler specifications. We also implement our system on a PR-2 robot to show how a demonstrator can start with an initial (sub-optimal) demonstration, then interactively improve task success by including additional specifications enforced with our differentiable LTL loss.
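The idea of turning a temporal requirement into a differentiable loss can be sketched with a smooth robustness surrogate. The softmin temperature, the hinge loss, and the "always avoid the obstacle" example below are illustrative assumptions, not the paper's exact loss:

```python
import numpy as np

def softmin(x, temp=0.1):
    # smooth, differentiable surrogate for min (approaches min as temp -> 0)
    return -temp * np.log(np.sum(np.exp(-np.asarray(x) / temp)))

def always_robustness(traj, obstacle, radius, temp=0.1):
    """Soft robustness of G(dist(x, obstacle) > radius) over a trajectory:
    positive means the 'always avoid' specification is (softly) satisfied
    at every time step."""
    dists = np.linalg.norm(traj - obstacle, axis=1) - radius
    return softmin(dists, temp)

traj = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
rho = always_robustness(traj, obstacle=np.array([1.0, 2.0]), radius=0.5)
loss = np.maximum(0.0, -rho)  # hinge: penalize the trajectory only on violation
```

Because `softmin` is smooth, `loss` can be backpropagated through the trajectory to adjust a movement primitive's parameters.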

Predictively Encoded Graph Convolutional Network for Noise-Robust Skeleton-based Action Recognition (2020-03-16)

In skeleton-based action recognition, graph convolutional networks (GCNs), which model human body skeletons using graphical components such as nodes and connections, have recently achieved remarkable performance. However, current state-of-the-art methods for skeleton-based action recognition usually assume that completely observed skeletons will be provided. This assumption can be problematic in real scenarios, since captured skeletons may be incomplete or noisy. In this work, we propose a skeleton-based action recognition method that is robust to noise in the given skeleton features. The key insight of our approach is to train a model by maximizing the mutual information between normal and noisy skeletons in a predictive-coding manner. We have conducted comprehensive experiments on skeleton-based action recognition with defective skeletons using the NTU-RGB+D and Kinetics-Skeleton datasets. The experimental results demonstrate that our approach achieves outstanding performance on noisy skeleton samples compared with existing state-of-the-art methods.

SkeleMotion: A New Representation of Skeleton Joint Sequences Based on Motion Information for 3D Action Recognition (2019-07-30)

Due to the availability of large-scale skeleton datasets, 3D human action recognition has recently attracted the attention of the computer vision community. Many works have focused on encoding skeleton data as skeleton image representations based on the spatial structure of the skeleton joints, in which the temporal dynamics of the sequence are encoded as variations in columns and the spatial structure of each frame is represented as rows of a matrix. To further improve such representations, we introduce a novel skeleton image representation to be used as input to Convolutional Neural Networks (CNNs), named SkeleMotion. The proposed approach encodes the temporal dynamics by explicitly computing the magnitude and orientation values of the skeleton joints. Different temporal scales are employed to compute motion values, aggregating more temporal dynamics into the representation and enabling it to capture long-range joint interactions involved in actions while filtering out noisy motion values. Experimental results demonstrate the effectiveness of the proposed representation on 3D action recognition, outperforming the state of the art on the NTU RGB+D 120 dataset.
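A minimal sketch of the magnitude/orientation encoding, assuming 2D joints and a single temporal scale (the paper uses 3D joints and stacks several temporal scales into the skeleton image fed to the CNN):

```python
import numpy as np

def skelemotion_maps(joints, scale=1):
    """Magnitude/orientation maps from a joint sequence, in the spirit of
    SkeleMotion. joints: (T, J, 2) array of 2D joint positions; `scale` is
    the temporal distance between the frames being compared."""
    motion = joints[scale:] - joints[:-scale]            # joint displacement
    magnitude = np.linalg.norm(motion, axis=-1)          # (T - scale, J)
    orientation = np.arctan2(motion[..., 1], motion[..., 0])
    # transpose so rows index joints and columns index time,
    # matching the row/column convention of skeleton image representations
    return magnitude.T, orientation.T

# two joints translating one unit along x per frame
joints = np.array([[[0., 0.], [1., 0.]],
                   [[1., 0.], [2., 0.]],
                   [[2., 0.], [3., 0.]]])
mag, ori = skelemotion_maps(joints)
```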

Skeleton-Aware Networks for Deep Motion Retargeting (2020-05-12)

We introduce a novel deep learning framework for data-driven motion retargeting between skeletons, which may have different structures yet correspond to homeomorphic graphs. Importantly, our approach learns how to retarget without requiring any explicit pairing between the motions in the training set. We leverage the fact that different homeomorphic skeletons may be reduced to a common primal skeleton by a sequence of edge merging operations, which we refer to as skeletal pooling. Thus, our main technical contribution is the introduction of novel differentiable convolution, pooling, and unpooling operators. These operators are skeleton-aware, meaning that they explicitly account for the skeleton's hierarchical structure and joint adjacency, and together they serve to transform the original motion into a collection of deep temporal features associated with the joints of the primal skeleton. In other words, our operators form the building blocks of a new deep motion processing framework that embeds the motion into a common latent space, shared by a collection of homeomorphic skeletons. Thus, retargeting can be achieved simply by encoding to, and decoding from, this latent space. Our experiments show the effectiveness of our framework for motion retargeting, as well as motion processing in general, compared to existing approaches. Our approach is also quantitatively evaluated on a synthetic dataset that contains pairs of motions applied to different skeletons. To the best of our knowledge, our method is the first to perform retargeting between skeletons with differently sampled kinematic chains, without any paired examples.

Continuous Motion Planning with Temporal Logic Specifications using Deep Neural Networks (2020-04-02)

In this paper, we propose a model-free reinforcement learning method to synthesize control policies for motion planning problems with continuous states and actions. The robot is modelled as a labeled Markov decision process (MDP) with continuous state and action spaces. Linear temporal logic (LTL) is used to specify high-level tasks. We then train deep neural networks to approximate the value function and policy using an actor-critic reinforcement learning method. The LTL specification is converted into an annotated limit-deterministic Büchi automaton (LDBA) for continuously shaping the reward so that dense reward is available during training. A naive way of solving a motion planning problem with LTL specifications using reinforcement learning is to sample a trajectory and, if the trajectory satisfies the entire LTL formula, assign it a high reward for training. However, the sampling complexity needed to find such a trajectory is too high for a complex LTL formula over continuous state and action spaces. As a result, it is very unlikely that we get enough reward for training if all sample trajectories start from the initial state of the automaton. In this paper, we propose a method that samples not only an initial state from the state space, but also an arbitrary state in the automaton at the beginning of each training episode. We test our algorithm in simulation using a car-like robot and find that our method can successfully learn policies for different working configurations and LTL specifications.
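The episode-reset idea can be sketched as follows; the function and argument names are hypothetical, and a real implementation would restrict sampling to reachable automaton states:

```python
import random

def reset_episode(sample_robot_state, automaton_states, accepting):
    """Sample not only an initial robot state but also an arbitrary
    non-accepting automaton state, so shaped reward near acceptance
    stays reachable early in training (names are illustrative)."""
    s = sample_robot_state()  # continuous robot state
    q = random.choice([q for q in automaton_states if q not in accepting])
    return s, q

s, q = reset_episode(lambda: (0.0, 0.0), automaton_states=[0, 1, 2], accepting={2})
```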

Inverse Reinforcement Learning of Autonomous Behaviors Encoded as Weighted Finite Automata (2021-03-10)

This paper presents a method for learning logical task specifications and cost functions from demonstrations. Linear temporal logic (LTL) formulas are widely used to express complex objectives and constraints for autonomous systems. Yet, such specifications may be challenging to construct by hand. Instead, we consider demonstrated task executions, whose temporal logic structure and transition costs need to be inferred by an autonomous agent. We employ a spectral learning approach to extract a weighted finite automaton (WFA), approximating the unknown logic structure of the task. Thereafter, we define a product between the WFA for high-level task guidance and a labeled Markov decision process (L-MDP) for low-level control and optimize a cost function that matches the demonstrator's behavior. We demonstrate that our method is capable of generalizing the execution of the inferred task specification to new environment configurations.

Interpretable Apprenticeship Learning with Temporal Logic Specifications (2017-10-28)

Recent work has addressed using formulas in linear temporal logic (LTL) as specifications for agents planning in Markov Decision Processes (MDPs). We consider the inverse problem: inferring an LTL specification from demonstrated behavior trajectories in MDPs. We formulate this as a multiobjective optimization problem, and describe state-based ("what actually happened") and action-based ("what the agent expected to happen") objective functions based on a notion of "violation cost". We demonstrate the efficacy of the approach by employing genetic programming to solve this problem in two simple domains.

Provable Repair of Deep Neural Networks (2021-04-09)

Deep Neural Networks (DNNs) have grown in popularity over the past decade and are now being used in safety-critical domains such as aircraft collision avoidance. This has motivated a large number of techniques for finding unsafe behavior in DNNs. In contrast, this paper tackles the problem of correcting a DNN once unsafe behavior is found. We introduce the provable repair problem, which is the problem of repairing a network N to construct a new network N' that satisfies a given specification. If the safety specification is over a finite set of points, our Provable Point Repair algorithm can find a provably minimal repair satisfying the specification, regardless of the activation functions used. For safety specifications over convex polytopes containing infinitely many points, our Provable Polytope Repair algorithm can find a provably minimal repair satisfying the specification for DNNs using piecewise-linear activation functions. The key insight behind both of these algorithms is the introduction of a Decoupled DNN architecture, which allows us to reduce provable repair to a linear programming problem. Our experimental results demonstrate the efficiency and effectiveness of our Provable Repair algorithms on a variety of challenging tasks.

A Self-Supervised Gait Encoding Approach with Locality-Awareness for 3D Skeleton Based Person Re-Identification (2020-09-05)

Person re-identification (Re-ID) via gait features within 3D skeleton sequences is a newly emerging topic with several advantages. Existing solutions either rely on hand-crafted descriptors or supervised gait representation learning. This paper proposes a self-supervised gait encoding approach that can leverage unlabeled skeleton data to learn gait representations for person Re-ID. Specifically, we first create self-supervision by learning to reconstruct unlabeled skeleton sequences in reverse order, which involves richer high-level semantics and yields better gait representations. Other pretext tasks are also explored to further improve self-supervised learning. Second, inspired by the fact that motion's continuity endows adjacent skeletons within a sequence, and temporally consecutive skeleton sequences, with higher correlations (referred to as locality in 3D skeleton data), we propose a locality-aware attention mechanism and a locality-aware contrastive learning scheme, which aim to preserve locality-awareness at the intra-sequence and inter-sequence levels, respectively, during self-supervised learning. Last, with context vectors learned by our locality-aware attention mechanism and contrastive learning scheme, a novel feature named Contrastive Attention-based Gait Encodings (CAGEs) is designed to represent gait effectively. Empirical evaluations show that our approach significantly outperforms skeleton-based counterparts by 15-40% Rank-1 accuracy, and it even achieves superior performance to numerous multi-modal methods that use extra RGB or depth information. Our code is available at https://github.com/Kali-Hac/Locality-Awareness-SGE.

Model-Free Learning of Safe yet Effective Controllers (2021-03-26)

In this paper, we study the problem of learning safe control policies that are also effective -- i.e., maximizing the probability of satisfying the linear temporal logic (LTL) specification of the task, and the discounted reward capturing the (classic) control performance. We consider unknown environments that can be modeled as Markov decision processes (MDPs). We propose a model-free reinforcement learning algorithm that learns a policy that first maximizes the probability of ensuring safety, then the probability of satisfying the given LTL specification, and lastly the sum of discounted Quality of Control (QoC) rewards. Finally, we illustrate the applicability of our RL-based approach on a case study.

Learning Optimal Strategies for Temporal Tasks in Stochastic Games (2021-02-08)

Linear temporal logic (LTL) is widely used to formally specify complex tasks for autonomy. Unlike usual tasks defined by reward functions only, LTL tasks are noncumulative and require memory-dependent strategies. In this work, we introduce a method to learn optimal controller strategies that maximize the satisfaction probability of LTL specifications of the desired tasks in stochastic games, which are natural extensions of Markov Decision Processes (MDPs) to systems with adversarial inputs. Our approach constructs a product game using the deterministic automaton derived from the given LTL task and a reward machine based on the acceptance condition of the automaton, thus allowing the use of a model-free RL algorithm to learn an optimal controller strategy. Since the rewards and the transition probabilities of the reward machine do not depend on the number of sets defining the acceptance condition, our approach is scalable to a wide range of LTL tasks, as we demonstrate on several case studies.
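A reward machine driven by a task automaton can be sketched as below; the "eventually goal" automaton and the unit reward on acceptance are simplified illustrations, not the paper's exact product-game construction:

```python
class RewardMachine:
    """Minimal reward machine: states and transitions come from a
    deterministic task automaton, and a reward is issued whenever an
    accepting state is entered."""
    def __init__(self, transitions, accepting, initial):
        self.transitions = transitions  # (state, label) -> next state
        self.accepting = set(accepting)
        self.state = initial

    def step(self, label):
        self.state = self.transitions[(self.state, label)]
        return 1.0 if self.state in self.accepting else 0.0

# automaton for "eventually goal": remain in q0 until 'goal' is observed
rm = RewardMachine(
    transitions={("q0", "goal"): "qf", ("q0", "other"): "q0",
                 ("qf", "goal"): "qf", ("qf", "other"): "qf"},
    accepting={"qf"}, initial="q0")
rewards = [rm.step(label) for label in ["other", "goal", "other"]]
```

Because the reward depends only on the machine's state transitions, a standard model-free RL algorithm can optimize against it without knowing the environment's dynamics.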

Extended Markov Games to Learn Multiple Tasks in Multi-Agent Reinforcement Learning (2020-02-14)

The combination of Formal Methods with Reinforcement Learning (RL) has recently attracted interest as a way for single-agent RL to learn multiple task specifications. In this paper, we extend this line of work to multi-agent settings and formally define Extended Markov Games as a general mathematical model that allows multiple RL agents to concurrently learn various non-Markovian specifications. To introduce this new model, we provide formal definitions and proofs as well as empirical tests of RL algorithms running in this framework. Specifically, we use our model to train two different logic-based multi-agent RL algorithms to solve diverse settings of non-Markovian co-safe LTL specifications.

Control Synthesis from Linear Temporal Logic Specifications using Model-Free Reinforcement Learning (2020-03-05)

We present a reinforcement learning (RL) framework to synthesize a control policy from a given linear temporal logic (LTL) specification in an unknown stochastic environment that can be modeled as a Markov Decision Process (MDP). Specifically, we learn a policy that maximizes the probability of satisfying the LTL formula without learning the transition probabilities. We introduce a novel rewarding and path-dependent discounting mechanism based on the LTL formula such that (i) an optimal policy maximizing the total discounted reward effectively maximizes the probabilities of satisfying LTL objectives, and (ii) a model-free RL algorithm using these rewards and discount factors is guaranteed to converge to such a policy. Finally, we illustrate the applicability of our RL-based synthesis approach on two motion planning case studies.
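The rewarding and path-dependent discounting idea can be sketched roughly as follows: accepting automaton transitions yield a small reward and a smaller discount factor, while all other transitions yield zero reward with the base discount. The constants here are illustrative:

```python
def ltl_reward_and_discount(accepting_transition, gamma=0.99, gamma_b=0.9):
    """Sketch of path-dependent rewarding/discounting: accepting transitions
    return reward 1 - gamma_b and discount gamma_b; others return reward 0
    and discount gamma (constants are illustrative)."""
    if accepting_transition:
        return 1.0 - gamma_b, gamma_b
    return 0.0, gamma

# accumulate the discounted return of a run that takes two accepting transitions
ret, disc = 0.0, 1.0
for acc in [False, True, False, True]:
    r, g = ltl_reward_and_discount(acc)
    ret += disc * r
    disc *= g
```

The key property is that the discount applied at each step depends on the path taken, so runs visiting accepting transitions infinitely often accumulate higher return.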

Skeleton Image Representation for 3D Action Recognition based on Tree Structure and Reference Joints (2019-09-11)

In recent years, the computer vision research community has studied how to model temporal dynamics in videos for 3D human action recognition. To that end, two main baseline approaches have been researched: (i) Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM); and (ii) skeleton image representations used as input to a Convolutional Neural Network (CNN). Although RNN approaches present excellent results, such methods lack the ability to efficiently learn the spatial relations between the skeleton joints. On the other hand, the representations used to feed CNN approaches have the advantage of naturally learning structural information from 2D arrays (i.e., they learn spatial relations from the skeleton joints). To further improve such representations, we introduce the Tree Structure Reference Joints Image (TSRJI), a novel skeleton image representation to be used as input to CNNs. The proposed representation has the advantage of combining the use of reference joints and a tree-structured skeleton. While the former incorporates different spatial relationships between the joints, the latter preserves important spatial relations by traversing a skeleton tree with a depth-first order algorithm. Experimental results demonstrate the effectiveness of the proposed representation for 3D action recognition on two datasets, achieving state-of-the-art results on the recent NTU RGB+D 120 dataset.
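The depth-first traversal that lays joints out so that adjacent rows stay spatially related can be sketched as below; the joint names and tree edges are hypothetical, not the exact NTU joint layout:

```python
def dfs_joint_order(tree, root):
    """Depth-first ordering of skeleton joints: children are visited
    immediately after their parent, so kinematically adjacent joints
    end up in adjacent rows of the skeleton image."""
    order, stack = [], [root]
    while stack:
        joint = stack.pop()
        order.append(joint)
        stack.extend(reversed(tree.get(joint, [])))  # keep left-to-right order
    return order

tree = {"spine": ["neck", "l_hip", "r_hip"],
        "neck": ["head", "l_shoulder", "r_shoulder"]}
```

For this toy tree the traversal visits the whole upper body before descending to the hips, keeping each kinematic chain contiguous.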

Co-occurrence Feature Learning for Skeleton based Action Recognition using Regularized Deep LSTM Networks (2016-03-24)

Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions. Considering that recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) can learn feature representations and model long-term temporal dependencies automatically, we propose an end-to-end fully connected deep LSTM network for skeleton based action recognition. Inspired by the observation that the co-occurrences of the joints intrinsically characterize human actions, we take the skeleton as the input at each time slot and introduce a novel regularization scheme to learn the co-occurrence features of skeleton joints. To train the deep LSTM network effectively, we propose a new dropout algorithm which simultaneously operates on the gates, cells, and output responses of the LSTM neurons. Experimental results on three human action recognition datasets consistently demonstrate the effectiveness of the proposed model.

Multi-Agent Reinforcement Learning with Temporal Logic Specifications (2021-01-31)

In this paper, we study the problem of learning to satisfy temporal logic specifications with a group of agents in an unknown environment, which may exhibit probabilistic behaviour. From a learning perspective these specifications provide a rich formal language with which to capture tasks or objectives, while from a logic and automated verification perspective the introduction of learning capabilities allows for practical applications in large, stochastic, unknown environments. The existing work in this area is, however, limited. Of the frameworks that consider full linear temporal logic or have correctness guarantees, all methods thus far consider only the case of a single temporal logic specification and a single agent. In order to overcome this limitation, we develop the first multi-agent reinforcement learning technique for temporal logic specifications, which is also novel in its ability to handle multiple specifications. We provide correctness and convergence guarantees for our main algorithm - ALMANAC (Automaton/Logic Multi-Agent Natural Actor-Critic) - even when using function approximation. Alongside our theoretical results, we further demonstrate the applicability of our technique via a set of preliminary experiments.

Learning Interpretable Models in the Property Specification Language (2020-02-10)

We address the problem of learning human-interpretable descriptions of a complex system from a finite set of positive and negative examples of its behavior. In contrast to most of the recent work in this area, which focuses on descriptions expressed in Linear Temporal Logic (LTL), we develop a learning algorithm for formulas in the IEEE standard temporal logic PSL (Property Specification Language). Our work is motivated by the fact that many natural properties, such as an event happening at every n-th point in time, cannot be expressed in LTL, whereas it is easy to express such properties in PSL. Moreover, formulas in PSL can be more succinct and easier to interpret (due to the use of regular expressions in PSL formulas) than formulas in LTL. Our learning algorithm builds on top of an existing algorithm for learning LTL formulas. Roughly speaking, our algorithm reduces the learning task to a constraint satisfaction problem in propositional logic and then uses a SAT solver to search for a solution in an incremental fashion. We have implemented our algorithm and performed a comparative study between the proposed method and the existing LTL learning algorithm. Our results illustrate the effectiveness of the proposed approach to provide succinct human-interpretable descriptions from examples.

LTL2Action: Generalizing LTL Instructions for Multi-Task RL (2021-02-12)

We address the problem of teaching a deep reinforcement learning (RL) agent to follow instructions in multi-task environments. We employ a well-known formal language -- linear temporal logic (LTL) -- to specify instructions, using a domain-specific vocabulary. We propose a novel approach to learning that exploits the compositional syntax and the semantics of LTL, enabling our RL agent to learn task-conditioned policies that generalize to new instructions, not observed during training. The expressive power of LTL supports the specification of a diversity of complex temporally extended behaviours that include conditionals and alternative realizations. Experiments on discrete and continuous domains demonstrate the strength of our approach in learning to solve (unseen) tasks, given LTL instructions.
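One standard way to exploit LTL's compositional syntax during task execution is formula progression, which rewrites the remaining obligation after each step. The fragment below is a generic sketch of that technique (LTL2Action itself learns over the formula syntax rather than using this exact procedure):

```python
def progress(formula, props):
    """One-step LTL progression over the set of propositions true now,
    for a small fragment: atoms, 'and', 'next', and 'until' with
    atomic/boolean subformulas."""
    if formula is True or formula is False:
        return formula
    if isinstance(formula, str):                 # atomic proposition
        return formula in props
    op = formula[0]
    if op == "next":
        return formula[1]                        # X f progresses to f
    if op == "and":
        l = progress(formula[1], props)
        r = progress(formula[2], props)
        if l is False or r is False:
            return False
        if l is True:
            return r
        return l if r is True else ("and", l, r)
    if op == "until":                            # p U q == q or (p and X(p U q))
        if progress(formula[2], props) is True:
            return True
        return formula if progress(formula[1], props) is True else False
    raise ValueError(f"unsupported operator: {op}")

eventually_goal = ("until", True, "goal")        # F goal == True U goal
```

Progressing `eventually_goal` leaves it unchanged until `goal` becomes true, at which point the formula collapses to `True` and the task is complete.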

Feature Interactions on Steroids: On the Composition of ML Models (2021-05-13)

The lack of specifications is a key difference between traditional software engineering and machine learning. We discuss how it drastically impacts how we think about divide-and-conquer approaches to system design, and how it impacts reuse, testing, and debugging activities. Traditionally, specifications provide a cornerstone for compositional reasoning and for the divide-and-conquer strategy by which we build large and complex systems from components, but such specifications are hard to come by for machine-learned components. While the lack of specifications seems like a fundamentally new problem at first sight, software engineers in fact routinely deal with iffy specifications in practice: we face weak specifications, wrong specifications, and unanticipated interactions among components and their specifications. Machine learning may push us further, but the problems are not fundamentally new. Rethinking machine-learning model composition from the perspective of the feature interaction problem may even teach us a thing or two about how to move forward, including the importance of integration testing, of requirements engineering, and of design.