This blog introduces the paper: AutoSTG
Introduction
A growing number of spatio-temporal (ST) neural networks have been proposed for spatio-temporal graph (STG) prediction. By modeling ST correlations, these models achieve strong performance.
To adapt to different tasks while reducing time-consuming manual design, automated neural architecture search is used for STG prediction.
There are two main problems:
- Define a proper search space for modeling STGs.
- Learn the weight parameters related to the attributed graph of an STG.
The characteristics of nodes and edges are related to their own attributes and the graph structure.
The main contributions are three-fold:
- Propose a novel framework, entitled AutoSTG, to model STGs.
- Use a meta-learning technique to learn the adjacency matrices of the spatial graph convolution layers and the kernels of the temporal convolution layers from the meta knowledge of the attributed graph.
- Conduct extensive experiments on two widely used real-world benchmark datasets to verify our framework.
Preliminaries
Definitions and Problem Statement
Attributed graph:
Graph Convolution
Employ diffusion convolution for modeling spatial correlations in traffic prediction.
Given node state
where
As the adjacency matrices represent the diffusion probabilities on the graph, we adopt the softmax function to compute each
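As a rough illustration of the idea in this section, the sketch below row-normalizes raw edge scores with softmax, so that each row can be read as diffusion probabilities, and then applies a generic multi-step diffusion convolution. The function names, the scalar per-step weights (standing in for learnable weight matrices), and the toy graph are all assumptions made for illustration; they are not taken from the paper.

```python
import math

def softmax_rows(scores):
    """Normalize each row so its entries are positive and sum to 1."""
    out = []
    for row in scores:
        m = max(row)  # subtract the max for numerical stability
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def diffusion_conv(adj, h, step_weights):
    """Sum over steps p of step_weights[p] * (adj^(p+1) @ h).

    Scalar step weights replace learnable weight matrices (assumption).
    """
    out = [[0.0] * len(h[0]) for _ in h]
    diffused = h
    for w in step_weights:
        diffused = matmul(adj, diffused)  # one more diffusion step
        for i, row in enumerate(diffused):
            for j, v in enumerate(row):
                out[i][j] += w * v
    return out

# Toy example: 3 nodes, 2 features per node (values are made up).
raw_scores = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
adj = softmax_rows(raw_scores)
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h_out = diffusion_conv(adj, h, step_weights=[0.5, 0.25])
```

Because every row of `adj` sums to 1, each diffusion step computes a weighted average of neighbor states, which is what makes the "diffusion probability" reading possible.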
Methodologies
The framework consists of two parts:
- constructing architectures from the search space
- employing a meta-learning technique to learn the weights of the SC and TC layers in the built architectures.
Search Space for Architecture Construction
The search space is convolution-based. The network is composed of a series of cells and temporal pooling layers, where the cells model ST correlations and the temporal pooling layers increase the temporal receptive fields of the hidden states.
The cell search space can be denoted as a directed acyclic graph with
For each pair
The candidate operation set:
Spatial Graph Convolution
Suppose that the input hidden state
Temporal Convolution
Suppose the input hidden state, where each denotes the hidden state of node . Then, given the input convolution kernels .
where denotes 1D convolution for sequence modeling.
Zero and Identity
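The candidate operations can be sketched on a single node's hidden sequence as follows. This is only a minimal sketch: the left zero-padding (so that every candidate keeps the sequence length), the softmax-weighted mixing of candidates (a common relaxation in differentiable architecture search), and all names and values are assumptions for illustration. The spatial graph convolution candidate is left out to keep the example short.

```python
import math

def temporal_conv(seq, kernel):
    """1D convolution along time with left zero-padding (keeps the length)."""
    padded = [0.0] * (len(kernel) - 1) + list(seq)
    return [sum(k * padded[t + i] for i, k in enumerate(kernel))
            for t in range(len(seq))]

def zero(seq):
    """'zero' candidate: removes the connection between two cell nodes."""
    return [0.0] * len(seq)

def identity(seq):
    """'identity' candidate: passes the hidden state through unchanged."""
    return list(seq)

def mixed_op(seq, ops, arch_scores):
    """Softmax-weighted sum of all candidate outputs (assumed relaxation)."""
    m = max(arch_scores)
    exps = [math.exp(s - m) for s in arch_scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    outputs = [op(seq) for op in ops]
    return [sum(w * o[t] for w, o in zip(weights, outputs))
            for t in range(len(seq))]

seq = [1.0, 2.0, 3.0, 4.0]   # hidden state of one node over 4 time steps
kernel = [0.5, 0.5]          # hypothetical kernel of length 2
ops = [lambda s: temporal_conv(s, kernel), zero, identity]
mixed = mixed_op(seq, ops, arch_scores=[0.0, 0.0, 0.0])
```

With equal architecture scores the three candidates contribute equally; raising one score shifts the mixture toward that operation, which is how a search procedure of this kind can pick one operation per edge of the cell.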
Graph Meta Knowledge Learner
The ST correlations of the data are related to the characteristics of the attributed graph
Apply
Node Learner. The first step is to compute the adjacency matrices for graph convolution from the edge representations.
Then the diffusion convolution can be employed for
Edge Learner. The characteristics of each edge are related to the nodes it connects. Feed the representations of its two connected nodes into an FC layer to learn a higher-level edge representation.
where
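The edge learner step can be sketched as below. Concatenating the two node representations before the FC layer is one plausible reading of the description, and the ReLU activation, the weights, and the toy graph are assumptions made for illustration, not details confirmed by the paper.

```python
def fc_layer(x, weight, bias):
    """Fully connected layer followed by ReLU (the activation is an assumption)."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weight, bias)]

def edge_learner(node_repr, edges, weight, bias):
    """For each edge (i, j), concatenate both node representations and apply the FC layer."""
    return {(i, j): fc_layer(node_repr[i] + node_repr[j], weight, bias)
            for (i, j) in edges}

# Toy graph: 3 nodes with 2-dim representations, 2 directed edges (made-up values).
node_repr = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (1, 2)]
weight = [[1.0, 0.0, 0.0, 1.0],   # hypothetical FC weights: 4 inputs -> 2 outputs
          [-1.0, 0.0, 0.0, 0.0]]
bias = [0.0, 0.5]
edge_repr = edge_learner(node_repr, edges, weight, bias)
```

The same FC weights are shared by all edges, so the number of parameters does not grow with the size of the graph.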
Meta Learners
As the ST correlations of STGs are affected by the characteristics of the attributed graph, we propose to learn the adjacency matrices of the SC layer and the kernels of the TC layer from the meta knowledge of the attributed graph using meta learners.
SC-Meta Learner. To learn adjacency matrices from the edge meta knowledge
First, it employs an FC layer to learn edge representation
where
TC-Meta Learner. As temporal correlations depend on node meta knowledge, TC-meta learner is employed to generate the temporal convolution kernels
where
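The two meta learners can be sketched as follows: one maps edge meta knowledge to an adjacency matrix, the other maps node meta knowledge to one temporal kernel per node. This is a minimal sketch under several assumptions: a single linear layer per learner, a single adjacency matrix instead of several, softmax taken only over each node's existing outgoing edges, and made-up names, weights, and inputs.

```python
import math

def linear(x, weight, bias):
    """Linear layer: W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def sc_meta_learner(edge_mk, num_nodes, weight, bias):
    """Edge meta knowledge -> one score per edge -> row-wise softmax adjacency."""
    scores = {e: linear(mk, weight, bias)[0] for e, mk in edge_mk.items()}
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        row = {j: s for (src, j), s in scores.items() if src == i}
        if not row:
            continue  # node without outgoing edges keeps an all-zero row
        m = max(row.values())
        exps = {j: math.exp(s - m) for j, s in row.items()}
        total = sum(exps.values())
        for j, e in exps.items():
            adj[i][j] = e / total
    return adj

def tc_meta_learner(node_mk, weight, bias):
    """Node meta knowledge -> one temporal convolution kernel per node."""
    return [linear(mk, weight, bias) for mk in node_mk]

# Hypothetical meta knowledge for a 3-node graph with 3 directed edges.
edge_mk = {(0, 1): [1.0, 0.0], (0, 2): [0.0, 1.0], (1, 0): [1.0, 1.0]}
node_mk = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

adj = sc_meta_learner(edge_mk, 3, weight=[[1.0, 1.0]], bias=[0.0])
kernels = tc_meta_learner(node_mk,
                          weight=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                          bias=[0.0, 0.0, 0.0])
```

The point of the design is that the adjacency matrices and kernels are outputs of small networks rather than free parameters, so nodes and edges with similar meta knowledge receive similar weights.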
Searching Algorithm
Experiments
The paper