python – 从时间序列图中对值进行求和/堆栈的算法,其中数据点在时间上不匹配

我有一个图形/分析问题我无法理解.我可以做一个蛮力,但它太慢了,也许有人有更好的主意,或知道或快速的python库?

我有2个时间序列数据集(x,y),我想聚合(随后绘图).问题是系列中的x值不匹配,我真的不想诉诸于将值复制到时间箱中.

所以,鉴于这两个系列:

S1: (1;100) (5;100) (10;100)
S2: (4;150) (5;100) (18;150)

加在一起时,应该导致:

ST: (1;100) (4;250) (5;200) (10;200) (18;250)

逻辑:

x=1 s1=100, s2=None, sum=100
x=4 s1=100, s2=150, sum=250 (note s1 value from previous value)
x=5 s1=100, s2=100, sum=200
x=10 s1=100, s2=100, sum=200
x=18 s1=100, s2=150, sum=250

我目前的想法是迭代一个排序的键列表(x),保留每个系列的前一个值,并查询每个集合是否有x的新y.

任何想法,将不胜感激!

解决方法:

这是另一种方法,将更多行为放在单个数据流上:

class DataStream(object):
    def __init__(self, iterable):
        self.iterable = iter(iterable)
        self.next_item = (None, 0)
        self.next_x = None
        self.current_y = 0
        self.next()

    def next(self):
        if self.next_item is None:
            raise StopIteration()
        self.current_y = self.next_item[1]
        try:
            self.next_item = self.iterable.next()
            self.next_x = self.next_item[0]
        except StopIteration:
            self.next_item = None
            self.next_x = None
        return self.next_item

    def __iter__(self):
        return self


class MergedDataStream(object):
    def __init__(self, *iterables):
        self.streams = [DataStream(i) for i in iterables]
        self.outseq = []

    def next(self):
        xs = [stream.next_x for stream in self.streams if stream.next_x is not None]
        if not xs:
            raise StopIteration()
        next_x = min(xs)
        current_y = 0
        for stream in self.streams:
            if stream.next_x == next_x:
                stream.next()
            current_y += stream.current_y
        self.outseq.append((next_x, current_y))
        return self.outseq[-1]

    def __iter__(self):
        return self


if __name__ == '__main__':
    seqs = [
        [(1, 100), (5, 100), (10, 100)],
        [(4, 150), (5, 100), (18, 150)],
        ]

    sm = MergedDataStream(*seqs)
    for x, y in sm:
        print "%02s: %s" % (x, y)

    print sm.outseq
上一篇:用于Java应用程序的锁分析器


下一篇:数据科学问题的类型有哪些