-
tsfresh
A tool that modularizes the process of extracting features from time series data.
tsfresh is a Python package. It automatically calculates a large number of time series characteristics, the so-called features. Furthermore, the package contains methods to evaluate the explanatory power and importance of such characteristics for regression or classification tasks.
-
Coding paradigms
-
Keep it simple
We believe that “Programs should be written for people to read, and only incidentally for machines to execute.”
-
Keep it documented
Include at least a docstring for each method and class. Describe why you are doing something, not what you are doing.
-
Keep it tested
We aim for a high test coverage.
-
-
Feature Calculator Naming
tsfresh enforces a strict naming of the created features, which you have to follow whenever you create new feature calculators.
This is due to the tsfresh.feature_extraction.settings.from_columns() method, which needs to deduce the following information from the feature name:
-
the time series that was used to calculate the feature
-
the feature calculator method that was used to derive the feature
-
all parameters that have been used to calculate the feature (optional)
The features will be named in the following format:
{time_series_name}__{feature_name}__{parameter_name_1}_{parameter_value_1}__[..]__{parameter_name_k}_{parameter_value_k}
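To make the naming format concrete, here is a minimal sketch of how such a name can be taken apart and put back together. It is not tsfresh's own implementation: `parse_feature_name` and `build_feature_name` are hypothetical helpers written for illustration. The sketch assumes that the main parts are separated by double underscores, that each parameter name is separated from its value by the last single underscore, and that string values are written in double quotes. A string value that itself contains an underscore would break this simple parser.

```python
import ast


def parse_feature_name(column):
    """Split a feature name into (time series name, calculator, parameters).

    Hypothetical helper for illustration only, not part of tsfresh.
    """
    parts = column.split("__")
    if len(parts) < 2:
        raise ValueError(f"not a valid feature name: {column!r}")
    series_name, calculator = parts[0], parts[1]

    params = {}
    for chunk in parts[2:]:
        # A parameter name may contain single underscores (e.g. "chunk_len"),
        # so only the last underscore separates the name from the value.
        name, sep, raw_value = chunk.rpartition("_")
        if not sep or not name:
            raise ValueError(f"cannot parse parameter: {chunk!r}")
        # Turn the textual value back into a Python literal (5 -> int, '"max"' -> str).
        params[name] = ast.literal_eval(raw_value)
    return series_name, calculator, params


def build_feature_name(series_name, calculator, params=None):
    """Inverse of parse_feature_name; parameters are sorted for a deterministic name."""
    parts = [series_name, calculator]
    for name in sorted(params or {}):
        value = params[name]
        value_text = f'"{value}"' if isinstance(value, str) else str(value)
        parts.append(f"{name}_{value_text}")
    return "__".join(parts)


if __name__ == "__main__":
    example = 'F_x__agg_linear_trend__attr_"slope"__chunk_len_5__f_agg_"max"'
    print(parse_feature_name(example))
    print(parse_feature_name("T_z__mean"))
```

The first two elements of the result identify the time series and the feature calculator; the dictionary holds the optional parameters and is empty for calculators that take none.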
-
-
Quick Start
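Running the tutorial from the article may produce the error described in this post:
"(Resolved 2020-11-25) tsfresh fails to download the example data: [Errno 111] Connection refused"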
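In short, here is what the example is meant to show. You are given a dataset containing time series for a number of robots, each recorded along 6 dimensions. If you plot all 6 series for each robot, you can tell by eye which robots failed, because the plots differ between failed and non-failed robots in the various dimensions.
Beyond visual inspection, though, you need actual numbers that indicate whether a robot has failed. That is exactly what tsfresh is for: it can automatically extract more than 1,200 features from these 6-dimensional data.
Those 1,200-plus features can then be fed into a model for training.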
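The extraction step can be sketched without downloading anything, which also sidesteps the connection error mentioned above. The sketch below builds a small synthetic dataset in the same long format (one row per robot and time step, six channels named F_x to T_z) and passes it to `tsfresh.extract_features`. The random values, the helper names `make_toy_robot_data` and `extract_with_tsfresh`, and the choice of `MinimalFCParameters` (a small feature set, far fewer than the 1,200-plus features of the full run) are assumptions made for illustration; pandas and tsfresh must be installed for the extraction itself.

```python
import random

# Six sensor channels per robot (forces and torques along x, y, z).
CHANNELS = ["F_x", "F_y", "F_z", "T_x", "T_y", "T_z"]


def make_toy_robot_data(n_robots=4, n_steps=15, seed=0):
    """Return long-format rows: one dict per (robot id, time step).

    The values are random noise, used only to show the expected layout.
    """
    rng = random.Random(seed)
    rows = []
    for robot_id in range(1, n_robots + 1):
        for t in range(n_steps):
            row = {"id": robot_id, "time": t}
            for channel in CHANNELS:
                row[channel] = rng.gauss(0.0, 1.0)
            rows.append(row)
    return rows


def extract_with_tsfresh(rows):
    """Turn the long-format rows into one feature row per robot."""
    # Imported here so the toy-data helper above works without tsfresh installed.
    import pandas as pd
    from tsfresh import extract_features
    from tsfresh.feature_extraction import MinimalFCParameters

    df = pd.DataFrame(rows)
    return extract_features(
        df,
        column_id="id",        # which robot a row belongs to
        column_sort="time",    # ordering within each robot's series
        default_fc_parameters=MinimalFCParameters(),  # small feature set for speed
        n_jobs=0,              # no multiprocessing for this tiny example
    )


if __name__ == "__main__":
    features = extract_with_tsfresh(make_toy_robot_data())
    # One row per robot; columns are named like "F_x__mean".
    print(features.shape)
    print(list(features.columns)[:5])
```

Leaving out `default_fc_parameters` makes tsfresh fall back to its full default feature set, which is where the large feature count comes from. The resulting table, together with the failure labels, is what would be handed to a model.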