AutoSTG:Neural Architecture Search for Predictions of Spatio-Temporal Graphs

An introduction to AutoSTG

Posted by Ccloud on 2023-03-06
Estimated Reading Time 4 Minutes
Words 718 In Total
Viewed Times

The blog introduces the paper : AutoSTG

Introduction

A growing number of ST neural networks have been proposed for STG prediction, by leveraging the capability of modeling ST correlations, these models can achieve significant performance.

Fig1

To adapt to different tasks and less time-consuming, automated neural architecture search is used for STG prediction.

There exists two main problems:

  1. Define a proper search space for modeling STGs.
  2. Learn the weight parameters related to the attributed graph of an STG.

Fig2

The characteristics of nodes and edges are related to their own attributes and the graph structure.

The main contributions are three-fold:

  1. Propose a novel framework, entitled AutoSTG, to model STGs.
  2. Use meta learning technique to learn the adjacency matrices of spatial graph convolution layers and kernels of temporal convolution layers from the meta knowledge of the attributed graph.
  3. Conduct extensive experiments on two widely used real-world benchmark datasets to verify our framework.

Preliminaries

Tab1

Definitions and Problem Statement

Attributed graph:, where denotes nodes (locations), denotes edges (relationships between locations), denotes D dimensional attribute values for nodes, and de notes K dimensional attribute values for edges, denotes the neighbors of node i.

Graph Convolution

Employ diffusion convolution for modeling spatial correlations inn traffic predcition.

Given node state and adjacency matrices {} as inputs, diffusion convolution outputs node state :

where denotes the number of diffusion steps , is the set of weight matrices for feature learning.

can be constructed from edge representation by function ​.

As the adjacency matrices represent the diffusion probability on the graph, we adopt softmax function to compute each by normalizing :

Methodologies

Fig4

The framework consists of two parts:

  1. constructing architectures from the search space
  2. employing meta learning technique to learn the weights of SC and TC layers in the built architectures.

Search Space for Architecture Construction

The search space is convolutional-based. The network is composed of a series of cells and temporal pooling layers, where the cells are employed for modeling ST correlations and the temporal pooling layers are adopted to increase the temporal receptive fields of hidden states.

The cell search space can be denoted as a direct acyclic graph with vertices. Each vertex denotes a latent representation of the STG, i.e., , where is the number of timestamps and is the number of features. The output is the sum of all latent representations.

For each pair , the operation with the highest score is selected as the operation in the final architecture.

The candidate operation set:

  • Spatial Graph Convolution

    Suppose that the input hidden state

  • Temporal Convolution
    Suppose the input hidden state ,where each denotes the hidden state of node . Then, given the input convolution kernels .


    where denotes 1D convolution for sequence modeling.

  • zero and identity

Graph Meta Knowledge Learner

The ST correlations of data are related to be the characteristics of attributed graph . Therefore, it is essential to learn the representations, namely the meta knowledge, of nodes and edges from .

Fig5

Apply iterations to learn node and edge representations by node learners and edge learners. Let and denote the node and edge representations at -th iteration. The inputs are denoted as and while the outputs are denoted as and .

Node Learner. The first step is to compute adjacency matrices of graph convolution by edge representations.

Then the diffusion convolution can be employed for to get the higher-level node representations:

Edge Learner. For each edge, its characteristics are related to its connected nodes. Push the representations of its two connected nodes, and then use an FC layer to learn the higher-level edge representation.

where is the weight matrix and is the bias.

Meta Learners

An ST correlations of STGs are impacted by the characteristics of the attributed graph, we propose to learn the adjacency matrices of SC layer and kernels of TC layer from the meta knowledge of the attributed graph by meta learners.

SC-Meta Learner. To learn adjacency matrices from the edge meta knowledge , which consists of .

First, it employs an FC layer to learn edge representation ,which is composed of {}

where , are learnable weights and bias. So that the adjacency matrices of spatial graph convolution can be generated by .

TC-Meta Learner. As temporal correlations depend on node meta knowledge, TC-meta learner is employed to generate the temporal convolution kernels from .

where are learnable weight and bias. So that the kernel of temporal convolution is generated.

Searching Algorithm

Alg

Experiments

exp

The paper


If you like this blog or find it useful for you, you are welcome to comment on it. You are also welcome to share this blog, so that more people can participate in it. If the images used in the blog infringe your copyright, please contact the author to delete them. Thank you !