AutoST: Towards the Universal Modeling of Spatio-temporal Sequences

This blog introduces the paper: AutoST: Towards the Universal Modeling of Spatio-temporal Sequences

Introduction

Fig1

Previous research on different forecasting tasks falls into three typical paradigms:

Spatial-first modeling
Temporal-first modeling
Spatio-temporal synchronous modeling

However, the distribution of ST dependencies varies with the forecasting task and the corresponding dataset. These dependencies are mixed in a compound way when modeling ST sequences, and the three tasks in Fig.1(b) are representative cases. Worse, the prevalent modeling methods show an anisotropic tendency when capturing ST dependencies. Suppose we use a spatial-first model on the three tasks in Fig.2(b): task 1's states are highly influenced by the surrounding information, with the periodic pattern as an underlying factor, so the spatial-first model fits it well.

Comparing the model ability (red lines) with the ideal one in Fig.2(a), however, this kind of model is insufficient for task 2 and task 3. Similarly, suppose we use a temporal-first model on the three tasks in Fig.2(c). In that case, the model ability only matches task 3, where the periodic pattern, rather than the spatial information, decides the states. The same analysis applies to the spatio-temporal synchronous situation, where the states are mainly influenced by complex associations across the spatial and temporal dimensions, such as semantic relationships. In this paper, we aim to propose a universal model that alleviates the modeling gap across different tasks.

Fig2

The contributions are:

The first to raise and address the modeling order problem in spatio-temporal forecasting tasks, by proposing a universal modeling framework, UniST, and an automatic structure search strategy, AutoST.
Proposing 3 replaceable and unified attention-based modeling units named S2T, T2S and STS, which model spatio-temporal sequences with three different priorities: spatial-first, temporal-first and spatio-temporal synchronous.
Extensive experiments on 5 datasets and 3 sequence forecasting tasks demonstrate that using only our three modeling units already outperforms the baseline methods, and that our framework together with AutoST achieves new state-of-the-art performance.

Spatial-first:

STG2Seq
STGCN
DCRNN

Temporal-first:

Graph WaveNet
GSTNet

Spatio-temporal synchronous:

STSGCN
STFGNN
ST-ResNet
ASTGCN

Preliminary

Spatio-temporal Sequence Forecasting

Spatio-temporal sequence forecasting aims to predict the future sequence of spatio-temporal inputs based on historical observations.

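Consider a graph $G = (V, E, A)$, where $V$ and $E$ are the node set and edge set, $N = |V|$ is the number of nodes, and $A$ is the adjacency matrix of $G$. $X$ is an ST sequence of $T$ time steps, where $X_t$ is the observation of all $N$ nodes at step $t$. The problem can be defined as: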
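One common way to write this mapping is sketched below. The symbols are assumptions for illustration rather than the paper's exact notation: $T$ historical steps, $T'$ future steps, a graph $G$, and a learned function $f$.

```latex
\left[ X_{t-T+1}, \dots, X_{t} \,;\, G \right]
\;\xrightarrow{\; f \;}\;
\left[ X_{t+1}, \dots, X_{t+T'} \right]
```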

Network Architecture Search

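DARTS is the foundation of existing automated ST-forecasting research. Its objective function is the bilevel problem:

$$\min_{\alpha} \mathcal{L}_{val}(w^{*}(\alpha), \alpha) \quad \text{s.t.} \quad w^{*}(\alpha) = \arg\min_{w} \mathcal{L}_{train}(w, \alpha)$$

where $\alpha$ is the architecture and $w$ is the model weights.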

Methods

Introduce two basic modeling units: the time series linear self-attention and the high-order mix graph convolution
Propose three layers as different network backbones and build a universal modeling framework on top of these three “atomic” layers
Propose an automatic search strategy for spatio-temporal information fusion, which aims to find the optimal order of spatio-temporal modeling for various downstream tasks

Spatial/Temporal Modeling Unit

Time Series Linear Self-Attention

The linear self-attention mechanism, built on a row-wise feature map, is used to reduce the time and space complexity of attention over the time series.
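A minimal NumPy sketch of linear self-attention may help make the complexity argument concrete. It assumes the widely used `elu(x) + 1` feature map, which is not confirmed as the paper's choice. The function names are hypothetical. The point it illustrates is that keys and values are summarised once, so no (T, T) weight matrix is ever built.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1, applied element-wise to each row (an assumed choice of feature map)
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(q, k, v):
    """q, k: (T, d); v: (T, d_v). Never materialises a (T, T) matrix."""
    q_f, k_f = feature_map(q), feature_map(k)
    kv = k_f.T @ v                 # (d, d_v): keys and values summarised once
    z = q_f @ k_f.sum(axis=0)      # (T,): per-query normaliser
    return (q_f @ kv) / z[:, None]

def quadratic_attention(q, k, v):
    """Reference version that builds the explicit (T, T) weight matrix."""
    w = feature_map(q) @ feature_map(k).T
    return (w / w.sum(axis=1, keepdims=True)) @ v
```

Both functions return the same values. The linear version only reorders the matrix products, so its cost grows linearly with the sequence length T.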

High-order Mix Graph Convolution

To acquire better spatial information representation, we propose a high-order mix graph convolutional operation for spatial information mixing and feature extraction of the original inputs:

The convolution uses forward and backward state transition matrices, together with an adaptive matrix that supplies complementary spatial state information.

The total order of the graph convolution operations sets how many hops of neighbor relationships are considered for each node.
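The idea of mixing several transition matrices over multiple hops can be sketched as follows. This follows the diffusion-style convolution popularised by Graph WaveNet and is an assumption about the general shape of the operation, not the paper's exact formula. `mix_graph_conv`, `row_normalize` and the weight layout are hypothetical names chosen for illustration.

```python
import numpy as np

def row_normalize(a):
    # Turn an adjacency matrix into a row-stochastic transition matrix.
    deg = a.sum(axis=1, keepdims=True)
    return a / np.where(deg == 0, 1.0, deg)

def mix_graph_conv(x, adj, adaptive, weights):
    """x: (N, C); adj, adaptive: (N, N).
    weights: list of K arrays, each (3, C, C_out), one slice per support matrix."""
    supports = [row_normalize(adj),      # forward transition
                row_normalize(adj.T),    # backward transition
                adaptive]                # adaptive (learned) matrix, passed in here
    out = 0.0
    for s_idx, support in enumerate(supports):
        h = x
        for k in range(len(weights)):    # hop k + 1 along this support
            h = support @ h
            out = out + h @ weights[k][s_idx]
    return out
```

Here the number of hops is simply `len(weights)`, so a larger list lets each node aggregate information from more distant neighbors.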

Unified Spatio-temporal Modeling Backbone

To handle the differences in spatio-temporal dependency distribution during modeling, we first propose three novel modules, the S2T Layer, T2S Layer and STS Layer, which suit three typical spatio-temporal dependencies: spatial-first, temporal-first and spatio-temporal synchronous, respectively. All three modules are designed to have the same input and output dimensions, which provides a solid foundation for the flexible and universal modeling that follows.

Fig3

Spatial-first Modeling Layer

The S2T Layer models the ST sequence from spatial to temporal.

Temporal-first Modeling Layer

The T2S Layer models the ST sequence from temporal to spatial.

Spatial-temporal Synchronous Layer

The STS Layer aims to model the spatial and temporal information simultaneously.
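The three modeling orders can be illustrated with a toy sketch. It deliberately swaps the paper's attention and graph-convolution units for simple `tanh` mixing steps, and the joint operator used for the synchronous case (a Kronecker product over the flattened time-node axis) is an assumption made only for illustration. All function names are hypothetical.

```python
import numpy as np

def spatial_mix(x, a):
    # x: (T, N, C); mix information across nodes at every time step
    return np.tanh(np.einsum("ij,tjc->tic", a, x))

def temporal_mix(x, m):
    # mix information across time steps for every node
    return np.tanh(np.einsum("st,tnc->snc", m, x))

def s2t_layer(x, a, m):
    return temporal_mix(spatial_mix(x, a), m)   # spatial first, then temporal

def t2s_layer(x, a, m):
    return spatial_mix(temporal_mix(x, m), a)   # temporal first, then spatial

def sts_layer(x, a, m):
    # one joint step over the flattened (time, node) axis
    t, n, c = x.shape
    joint = np.kron(m, a)                        # (T*N, T*N)
    return np.tanh(joint @ x.reshape(t * n, c)).reshape(t, n, c)
```

All three functions map a `(T, N, C)` array to a `(T, N, C)` array, which is what makes them interchangeable, while the nonlinearity means the spatial-first and temporal-first orders generally give different results.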

Universal Modeling Framework

A unified ST sequence modeling framework (UniST) is built from the unified modeling backbone as follows:

Fig4

Spatio-temporal Embedding Layer

Fusion embedding

The spatial and temporal embeddings are replicated and expanded by broadcasting along their respective missing dimensions.
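The broadcasting step can be sketched in NumPy. The shapes and the choice of addition as the fusion operator are assumptions for illustration, and `fuse_embeddings` is a hypothetical name.

```python
import numpy as np

def fuse_embeddings(x_emb, spatial_emb, temporal_emb):
    """x_emb: (T, N, D); spatial_emb: (N, D); temporal_emb: (T, D)."""
    s = spatial_emb[None, :, :]    # (1, N, D): the time axis is missing, so add it
    t = temporal_emb[:, None, :]   # (T, 1, D): the node axis is missing, so add it
    return x_emb + s + t           # broadcasting replicates s over T and t over N
```

The size-1 axes are never copied in memory. NumPy expands them implicitly when the three arrays are added.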

Encoder

The encoder of UniST consists of multiple Spatio-Temporal Extractors, each of which can be arbitrarily chosen from {T2S Layer, S2T Layer, STS Layer}. All extractors are connected end to end, i.e., the output of the previous one is the input of the next one. To acquire a more diverse representation, the outputs of all extractors are added to form the final output of the encoder. With the output of the embedding layer as the input of the first extractor, the encoder is computed as:

where L refers to the number of spatio-temporal extractors.
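The chaining described above can be sketched as a short loop. The extractors are treated as arbitrary shape-preserving callables, and `encode` is a hypothetical helper, not the authors' code.

```python
import numpy as np

def encode(h0, extractors):
    """Chain the extractors and keep every intermediate output.

    h0: output of the embedding layer; extractors: non-empty list of callables."""
    outputs = []
    h = h0
    for layer in extractors:
        h = layer(h)               # output of the previous extractor feeds the next
        outputs.append(h)
    encoder_out = np.sum(np.stack(outputs), axis=0)   # add the L outputs together
    return outputs, encoder_out
```

For example, `encode(h0, [s2t, t2s, sts])` would apply three layers in that order, where `s2t`, `t2s` and `sts` stand for any callables with matching input and output shapes.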

Decoder

The decoder accepts the output of the encoder, i.e., the L outputs from the L spatio-temporal extractors. They are first added into a unified spatio-temporal representation. The result then passes through two rounds of ReLU activation and linear projection to produce the final sequence forecasting result. Given the output of each extractor, the decoder is calculated as:
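A minimal sketch of this decoder, read as sum, ReLU, linear, ReLU, linear, is given below. The exact ordering of activation and projection, the weight shapes, and the name `decode` are assumptions for illustration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def decode(extractor_outputs, w1, b1, w2, b2):
    """extractor_outputs: list of L arrays shaped (..., D).
    w1: (D, H), b1: (H,), w2: (H, D_out), b2: (D_out,)."""
    h = np.sum(np.stack(extractor_outputs), axis=0)   # unified ST representation
    h = relu(h) @ w1 + b1                             # first ReLU + linear projection
    return relu(h) @ w2 + b2                          # second ReLU + linear projection
```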

Automated Search for UniST

Even with the proposed unified ST sequence modeling framework UniST, a wrongly configured network remains a risk, since the replaceable modeling units {T2S Layer, S2T Layer, STS Layer} can be stacked in an arbitrary order. Considering the various downstream tasks, how can we build a universal model with an optimal configuration? Here we propose the Automated Spatio-Temporal modeling approach (AutoST), which learns the optimal combinatorial order that suits the spatio-temporal dependency of the current task. We design two schemes for layer combination. In this section, we first define the basic searching unit of AutoST, then introduce two designs of AutoST with different searching schemes.

AutoST Cell

Fig5

The cell is searched in the manner proposed in DARTS:
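The DARTS-style continuous relaxation can be sketched as a softmax-weighted mixture of candidate operations. How AutoST wires this into its cell is not shown in this post, so the candidate set and the helper names `mixed_op` and `derive_op` are assumptions for illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def mixed_op(x, candidate_ops, alpha):
    """Relaxed choice: weight every candidate's output by softmax(alpha)."""
    weights = softmax(np.asarray(alpha, dtype=float))
    return sum(w * op(x) for w, op in zip(weights, candidate_ops))

def derive_op(candidate_names, alpha):
    """After the search, keep the candidate with the largest architecture weight."""
    return candidate_names[int(np.argmax(alpha))]
```

During the search, `alpha` is trained together with the model weights. Afterwards `derive_op(["S2T", "T2S", "STS"], alpha)` would pick a single layer type for that position.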

Fig6

Experiments

Tab1

The paper

