c# - Using LINQ to aggregate text file contents so that they are grouped

Given the following query:

var query = files
            .SelectMany(file => File.ReadAllLines(file))
            .Where(_ => !_.StartsWith("*"))
            .Select(line => new {
                Order = line.Substring(32, 7),
                Delta = line.Substring(40, 3),
                Line = new String[] { line }
            });

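Obviously, this yields a list of objects with the following properties: Order: string, Delta: string, and Line: string[].

I have a list of items that looks like this: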

{ 1, 'A', {'line1'} }, 
{ 1, 'A', {'line2'} }, 
{ 2, 'B', {'line3'} }, 
{ 1, 'B', {'line4'} }

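Is it possible to use LINQ's Aggregate, or a similar functional construct, to gather all adjacent Order and Delta combinations together while accumulating their lines?

The aggregate would then be a list of items, each containing all of its "lines":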

{ 1, 'A', {'line1', 'line2'} }
{ 2, 'B', {'line3'} }
{ 1, 'B', {'line4'} }

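Since Aggregate iterates sequentially, it should be possible to collect all adjacent lines that have the same fields.

This is easy to do with a loop, but I am trying to do it with a set of lambdas.

Solution:

You will need the following variant of GroupBy: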

using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    public class AdjacentGrouping<K, T> : List<T>, IGrouping<K, T>
    {
        public AdjacentGrouping(K key) { Key = key; }
        public K Key { get; private set; }
    }

    public static IEnumerable<IGrouping<K, T>> GroupByAdjacent<T, K>(
                            this IEnumerable<T> sequence, Func<T, K> keySelector)
    {
        using (var it = sequence.GetEnumerator())
        {
            if (!it.MoveNext())
                yield break;
            T curr = it.Current;
            K currKey = keySelector(curr);
            var currentCluster = new AdjacentGrouping<K, T>(currKey) { curr };
            while (it.MoveNext())
            {
                curr = it.Current;
                currKey = keySelector(curr);
                if (!EqualityComparer<K>.Default.Equals(currKey, currentCluster.Key))
                {
                    // start a new cluster
                    yield return currentCluster;
                    currentCluster = new AdjacentGrouping<K, T>(currKey);
                }
                currentCluster.Add(curr);
            }
            // currentCluster is never empty
            yield return currentCluster;
        }
    }
}

With this adjacent grouping in place, your code can be the same as in Chris’s answer:

var query = files
    .SelectMany(file => File.ReadAllLines(file))
    .Where(_ => !_.StartsWith("*"))
    .Select(line => new
    {
        Order = line.Substring(32, 7),
        Delta = line.Substring(40, 3),
        Line = new String[] { line }
    })
    .GroupByAdjacent(o => new { o.Order, o.Delta })
    // SelectMany flattens each one-element string[] so that Lines holds the lines themselves
    .Select(g => new { g.Key.Order, g.Key.Delta, Lines = g.SelectMany(o => o.Line).ToList() });
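
The anonymous-type key works here because GroupByAdjacent compares keys with EqualityComparer<K>.Default, and C# anonymous types compare their properties by value. A minimal sketch of that behaviour, assuming C# 9 top-level statements (the helper name Same and the sample values are made up for illustration):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: the same comparison GroupByAdjacent performs on its keys.
bool Same<K>(K x, K y) => EqualityComparer<K>.Default.Equals(x, y);

// Illustrative keys shaped like new { o.Order, o.Delta } in the query above.
var a = new { Order = "1", Delta = "A" };
var b = new { Order = "1", Delta = "A" };
var c = new { Order = "1", Delta = "B" };

Console.WriteLine(Same(a, b)); // True: every property matches
Console.WriteLine(Same(a, c)); // False: Delta differs
```

A key type that does not override Equals (an ordinary class, for instance) would be compared by reference instead, so each element would end up in its own group.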

Disclaimer: the GroupByAdjacent function comes from my own pet project and was not copied from anywhere.
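
For comparison, the Aggregate-based approach the question asks about could look roughly like the sketch below. It is not part of the original answer. It assumes C# 9 top-level statements and uses an in-memory tuple array (the name items and its values are illustrative) in place of the parsed file lines:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the parsed lines from the question.
var items = new[]
{
    (Order: "1", Delta: "A", Line: "line1"),
    (Order: "1", Delta: "A", Line: "line2"),
    (Order: "2", Delta: "B", Line: "line3"),
    (Order: "1", Delta: "B", Line: "line4"),
};

// Fold the sequence into a list of groups. A new group starts whenever
// the (Order, Delta) pair differs from that of the most recent group.
var groups = items.Aggregate(
    new List<(string Order, string Delta, List<string> Lines)>(),
    (acc, item) =>
    {
        if (acc.Count > 0
            && acc[acc.Count - 1].Order == item.Order
            && acc[acc.Count - 1].Delta == item.Delta)
        {
            // Same key as the previous item: extend the current group.
            acc[acc.Count - 1].Lines.Add(item.Line);
        }
        else
        {
            // Key changed (or first item): open a new group.
            acc.Add((item.Order, item.Delta, new List<string> { item.Line }));
        }
        return acc;
    });

foreach (var g in groups)
    Console.WriteLine($"{g.Order} {g.Delta}: {string.Join(", ", g.Lines)}");
// Prints:
// 1 A: line1, line2
// 2 B: line3
// 1 B: line4
```

Unlike GroupByAdjacent, this version mutates its accumulator and builds the whole result eagerly rather than yielding groups lazily, which is one reason a dedicated extension method may be the cleaner choice.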
