

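1 Transformer-based multivariate time series representation learning (KDD 2021)

2 Transfer learning classification model for univariate time series (ICML 2021)

3 Classification of incomplete data with heterogeneous graph neural networks (WWW 2021 Best paper runner up)

4 Graph neural network-based multivariate time series imputation model (arXiv 2021)

5 MiniRocket: a fast time series classification model (KDD 2021)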


1 Transformer-based multivariate time series representation learning (KDD 2021)



Pre-trained models can potentially be used for downstream tasks such as regression, classification, forecasting, and missing value imputation.

Meanwhile, the availability of labeled multivariate time series data in particular is far more limited: extensive data labeling is often prohibitively expensive or impractical, as it may require much time and effort, special infrastructure, or domain expertise.

Therefore, it is worth exploring using only a limited amount of labeled data or leveraging the existing plethora of unlabeled data for time series data modeling.


In this work, we investigate, for the first time, the use of a transformer encoder for unsupervised representation learning of multivariate time series, as well as for the tasks of time series regression and classification.

Experimental results indicated that transformer models can convincingly outperform all current state-of-the-art modeling approaches, even when only having access to a very limited amount of training data samples (on the order of hundreds of samples), an unprecedented success for deep learning models.

Importantly, we also demonstrate that our models, using at most hundreds of thousands of parameters, can be practically trained even on CPUs; training them on GPUs allows them to be trained as fast as even the fastest and most accurate non-deep learning based approaches.
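
The excerpt above does not say which unsupervised objective is used. A common choice for this kind of pre-training, assumed here, is masked-value reconstruction: hide part of the input, ask the model to predict it, and score only the hidden positions. The sketch below shows that loss in plain Python; the function names are hypothetical, and the "model" is a stand-in that simply echoes its input rather than a transformer encoder.

```python
import random


def make_mask(n_steps, n_vars, mask_ratio, rng):
    """Return a 0/1 grid; 1 marks a value that is hidden from the model."""
    return [[1 if rng.random() < mask_ratio else 0 for _ in range(n_vars)]
            for _ in range(n_steps)]


def apply_mask(series, mask):
    """Replace hidden values with 0.0 so the model cannot see them."""
    return [[0.0 if m else x for x, m in zip(row, mrow)]
            for row, mrow in zip(series, mask)]


def masked_mse(prediction, target, mask):
    """Mean squared error computed only over the hidden positions."""
    errors = [(p - t) ** 2
              for prow, trow, mrow in zip(prediction, target, mask)
              for p, t, m in zip(prow, trow, mrow) if m]
    return sum(errors) / len(errors) if errors else 0.0


rng = random.Random(0)
series = [[0.1 * t, 1.0 - 0.1 * t] for t in range(8)]  # 8 steps, 2 variables
mask = make_mask(8, 2, mask_ratio=0.25, rng=rng)
model_input = apply_mask(series, mask)

# Stand-in for the encoder: it echoes its input, so hidden values are predicted as 0.0.
prediction = model_input
loss = masked_mse(prediction, series, mask)
```

Because the loss ignores visible positions, the model gets no credit for copying its input and has to infer hidden values from the surrounding time steps and variables.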



My thoughts



GitHub: https://github.com/gzerveas/mvts_transformer (currently 37 stars)


2 Transfer learning classification model for univariate time series (ICML 2021)



Learning to classify time series with limited data is a practical yet challenging problem. Current methods are primarily based on hand-designed feature extraction rules or domain-specific data augmentation.


Motivated by the advances in deep speech processing models and the fact that voice data are univariate temporal signals, in this paper we propose Voice2Series (V2S), a novel end-to-end approach that reprograms acoustic models for time series classification, through input transformation learning and output label mapping.

Leveraging the representation learning power of a large-scale pre-trained speech processing model, on 30 different time series tasks we show that V2S either outperforms or is tied with state-of-the-art methods on 20 tasks, and improves their average accuracy by 1.84%.
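
The two pieces named above, input transformation and output label mapping, can be illustrated with a small sketch. It assumes the input transformation pads the short series to the length the acoustic model expects and fills the padded region with trainable values, and that the label mapping averages the probabilities of several source classes per target class. The names `reprogram_input`, `map_labels`, `theta`, and `mapping` are hypothetical, and the source probabilities are made-up numbers standing in for the output of a frozen pre-trained model.

```python
def reprogram_input(x, theta):
    """Center x inside a longer input; theta fills the positions x does not occupy."""
    if len(theta) < len(x):
        raise ValueError("theta must be at least as long as x")
    left = (len(theta) - len(x)) // 2
    out = list(theta)          # trainable values cover the padded region
    for i, v in enumerate(x):
        out[left + i] = v      # the original series is kept unchanged
    return out


def map_labels(source_probs, mapping):
    """Average the source-class probabilities assigned to each target class."""
    return [sum(source_probs[i] for i in idxs) / len(idxs) for idxs in mapping]


x = [0.5, -0.2, 0.1]                 # short target-domain time series
theta = [0.01] * 7                   # trainable parameters (fixed here for illustration)
x_reprogrammed = reprogram_input(x, theta)

source_probs = [0.1, 0.3, 0.2, 0.4]  # hypothetical output of a frozen acoustic model
mapping = [[0, 1], [2, 3]]           # source classes 0,1 -> target 0; 2,3 -> target 1
target_scores = map_labels(source_probs, mapping)
predicted = max(range(len(target_scores)), key=target_scores.__getitem__)
```

In this setup only `theta` would be trained; the acoustic model's weights stay frozen, which is what makes the approach usable with limited data.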



My thoughts


GitHub: https://github.com/huckiyang/Voice2Series-Reprogramming (currently 28 stars)


3 Classification of incomplete data with heterogeneous graph neural networks (WWW 2021 Best paper runner up)



Heterogeneous information networks (HINs), also called heterogeneous graphs, are composed of multiple types of nodes and edges, and contain comprehensive information and rich semantics.

Heterogeneous models based on graph neural networks (GNNs) cannot be trained when some nodes have no attributes.

Previous studies use handcrafted methods to solve this problem, separating attribute completion from the graph learning process and, in turn, yielding poor performance.


In this paper, we hold that missing attributes can be acquired in a learnable manner, and propose an end-to-end framework for Heterogeneous Graph Neural Network via Attribute Completion (HGNN-AC), including pre-learning of topological embedding and attribute completion with attention mechanism.

HGNN-AC first uses existing HIN-Embedding methods to obtain topological embeddings of the nodes.

Then it uses the topological relationship between nodes as guidance to complete attributes for no-attribute nodes by weighted aggregation of the attributes from these attributed nodes.
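
The weighted aggregation step can be sketched as follows. This is a simplified illustration rather than the HGNN-AC implementation: it assumes the attention score between two nodes is the plain dot product of their topological embeddings, normalized with a softmax over the attributed neighbors. The function `complete_attribute` and the toy graph are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def complete_attribute(node, neighbors, embedding, attributes):
    """Build attributes for `node` as an attention-weighted sum over attributed neighbors."""
    donors = [n for n in neighbors if n in attributes]
    if not donors:
        raise ValueError("node has no attributed neighbors")
    # Attention weights come from topological embeddings, not from the attributes.
    weights = softmax([dot(embedding[node], embedding[n]) for n in donors])
    dim = len(attributes[donors[0]])
    return [sum(w * attributes[n][k] for w, n in zip(weights, donors))
            for k in range(dim)]


embedding = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
attributes = {"b": [2.0, 0.0], "c": [0.0, 4.0]}   # node "a" has no attributes
completed = complete_attribute("a", ["b", "c"], embedding, attributes)
```

Node "b" has the same embedding as "a", so it receives the larger weight and the completed vector leans toward the attributes of "b".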



My thoughts



GitHub: https://github.com/search?q=Heterogeneous+Graph+Neural+Network+via+Attribute+Completion (currently 11 stars)


4 Graph neural network-based multivariate time series imputation model (arXiv 2021)



Dealing with missing values and incomplete multivariate time series is a labor-intensive, time-consuming, and inevitable task when handling data coming from real-world applications.

Standard methods fall short in capturing the nonlinear time and space dependencies existing within networks of interconnected sensors and do not take full advantage of the available, and often strong, relational information.

Notably, most state-of-the-art imputation methods based on deep learning do not explicitly model relational aspects and, in any case, do not exploit processing frameworks able to adequately represent structured spatio-temporal data.


In this work, we present the first assessment of graph neural networks in the context of multivariate time series imputation. In particular, we introduce a novel graph neural network architecture, named GRIL, which aims at reconstructing missing data in the different channels of a multivariate time series by learning spatial-temporal representations through message passing.
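
To make the idea of imputation through message passing concrete, here is a deliberately simple sketch. It is not GRIL, which is a learned architecture; it swaps in a fixed, non-learned rule where each sensor with a missing reading takes the mean of the observed readings of its graph neighbors at the same time step. The sensor graph, the readings, and the function `impute_step` are all hypothetical.

```python
def impute_step(values, observed, adjacency):
    """One message-passing round: fill each missing sensor with the mean of its observed neighbors."""
    filled = list(values)
    for i, seen in enumerate(observed):
        if seen:
            continue  # observed readings are never overwritten
        messages = [values[j] for j in adjacency[i] if observed[j]]
        if messages:
            filled[i] = sum(messages) / len(messages)
    return filled


# Four sensors; sensors 1 and 3 have missing readings (stored as 0.0 placeholders).
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
values = [20.0, 0.0, 22.0, 0.0]
observed = [True, False, True, False]
imputed = impute_step(values, observed, adjacency)
```

A learned model would replace the fixed mean with trainable message and update functions and would also use information from other time steps.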



My thoughts





5 MiniRocket: a fast time series classification model (KDD 2021)



Until recently, the most accurate methods for time series classification were limited by high computational complexity.

While there have been considerable advances in recent years, computational complexity and a lack of scalability remain persistent problems.


We reformulate Rocket into a new method, MiniRocket. MiniRocket is up to 75 times faster than Rocket on larger datasets, and almost deterministic (and optionally, fully deterministic), while maintaining essentially the same accuracy. Using this method, it is possible to train and test a classifier on all 109 datasets from the UCR archive to state-of-the-art accuracy in under 10 minutes.
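
The excerpt does not describe how the features are computed. The sketch below illustrates the general idea as I understand it from the MiniRocket paper, so treat the details as assumptions: a small fixed set of length-9 kernels whose weights are either -1 or 2, and a single pooled feature per kernel, the proportion of positive values (PPV). It leaves out padding, the multiple dilations, and the data-derived biases, and the helper names are hypothetical.

```python
from itertools import combinations


def make_kernels(length=9, n_twos=3):
    """All kernels with `n_twos` weights equal to 2 and the remaining weights equal to -1."""
    return [[2.0 if i in positions else -1.0 for i in range(length)]
            for positions in combinations(range(length), n_twos)]


def convolve(series, kernel, dilation=1):
    """Valid (no padding) dilated convolution of a 1-D series with a kernel."""
    span = (len(kernel) - 1) * dilation
    return [sum(w * series[t + k * dilation] for k, w in enumerate(kernel))
            for t in range(len(series) - span)]


def ppv(outputs, bias):
    """Proportion of convolution outputs that exceed the bias."""
    return sum(1 for o in outputs if o - bias > 0) / len(outputs)


kernels = make_kernels()
series = [float(t % 5) for t in range(30)]   # toy periodic series
features = [ppv(convolve(series, k), bias=0.0) for k in kernels]
```

The resulting feature vector would then be passed to a linear classifier. Because the kernel set is fixed rather than sampled, the transform involves little or no randomness, which is consistent with the "almost deterministic" claim above.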

GitHub: https://github.com/angus924/minirocket (currently 87 stars)


