Getting Started with tsfresh in Python

  • tsfresh

    A tool that modularizes the process of extracting features from time series data.

    tsfresh is a Python package. It automatically calculates a large number of time series characteristics, the so-called features. Further, the package contains methods to evaluate the explanatory power and importance of such characteristics for regression or classification tasks.

  • Coding paradigms

    1. Keep it simple

      We believe that “Programs should be written for people to read, and only incidentally for machines to execute.”

    2. Keep it documented

      Include at least a docstring for each method and class. Describe why you are doing something, not what you are doing.

    3. Keep it tested

      We aim for a high test coverage.

  • Feature Calculator Naming

    tsfresh enforces a strict naming of the created features, which you have to follow whenever you create new feature calculators.

    This is due to the tsfresh.feature_extraction.settings.from_columns() method which needs to deduce the following information from the feature name:

    • the time series that was used to calculate the feature
    • the feature calculator method that was used to derive the feature
    • all parameters that have been used to calculate the feature (optional)

    The features will be named in the following format:

    {time_series_name}__{feature_name}__{parameter_name_1}_{parameter_value_1}__[..]_{parameter_name_k}_{parameter_value_k}
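
    To make the naming format concrete, here is a minimal sketch of how such a name could be split back into its parts. The helper parse_feature_name is hypothetical and is not the tsfresh implementation; it assumes that double underscores separate the fields and that the last single underscore in each parameter field separates the parameter name from its value. The example feature name is illustrative, and values are left as strings.

```python
def parse_feature_name(column):
    """Split a tsfresh-style feature name into its parts (hypothetical helper).

    Returns (time_series_name, feature_name, parameters), where parameters
    maps parameter names to their values as uninterpreted strings.
    """
    parts = column.split("__")
    if len(parts) < 2:
        raise ValueError(f"not a valid feature name: {column!r}")

    series_name, calculator = parts[0], parts[1]

    params = {}
    for part in parts[2:]:
        # Assumption: the value follows the last single underscore,
        # so parameter names may themselves contain underscores.
        name, value = part.rsplit("_", 1)
        params[name] = value

    return series_name, calculator, params


# Illustrative name: series "F_x", calculator "ar_coefficient", two parameters.
parsed = parse_feature_name("F_x__ar_coefficient__coeff_1__k_10")
print(parsed)
```

    In practice, tsfresh.feature_extraction.settings.from_columns() does this work for you and also converts the values to the right types, so a helper like this is only useful for understanding the format.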

  • Quick Start

    Running the tutorial from the article may produce the following error:

    (Resolved 2020-11-25) tsfresh fails to download the example data: [Errno 111] Connection refused

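    Here is a brief summary of what the example is meant to show. You are given a dataset containing time series for different robots, recorded along 6 dimensions. You can plot all 6 time series for each robot, and by eye you can see that the plots differ between robots that failed and robots that did not.

    Beyond visual inspection, you need some data that indicates whether a robot failed. That is what tsfresh is for: it can automatically extract more than 1200 features from these 6 time series.

    These 1200+ features can then be fed into a model for training.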
