mysql-分组以查找每个组的最小值,最大值

如果我只关心每个组的最小值和最大值,这将是相对容易的,问题是我的要求是找到各种界限.数据集示例如下:

BoundaryColumn  GroupIdentifier
1                  A
3                  A
4                  A
7                  A
8                  B
9                  B  
11                 B  
13                 A
14                 A
15                 A
16                 A

我需要从sql中得到的结果集如下:

min  max  groupid
1    7    A
8    11   B
13   16   A

本质上是为每个组群找到边界.

数据将存储在oracle11g或mysql中,因此可以为任何一个平台提供语法.

解决方法:

免责声明:查询部分结果并使用前端语言处理类似的事情要容易得多.那个…

以下查询适用于Oracle(支持分析查询),不适用于MySQL(不支持).有一个SQL Fiddle here.

WITH BoundX AS (
  SELECT * FROM (
    SELECT
     BoundaryColumn,
     GroupIdentifier,
     LAG(GroupIdentifier) OVER (ORDER BY BoundaryColumn) AS GIDLag,
     LEAD(GroupIdentifier) OVER (ORDER BY BoundaryColumn) AS GIDLead
    FROM MyTable
    ORDER BY BoundaryColumn
  )
  WHERE GIDLag IS NULL OR GroupIdentifier <> GIDLag
     OR GIDLead IS NULL OR GroupIdentifier <> GIDLead
)
SELECT MIN, MAX, GROUPID
FROM (
  SELECT
    BoundaryColumn AS MIN,
    LEAD(BoundaryColumn) OVER (ORDER BY BoundaryColumn) AS MAX,
    GroupIdentifier AS GROUPID,
    GIDLag,
    GIDLead
  FROM BoundX
)
WHERE GROUPID = GIDLead

这是逐步的逻辑.您可能可以对此进行改进,因为我觉得这里有一个子查询太多了…

此查询将之前和之后的GroupIdentifier值拉入每一行:

SELECT
 BoundaryColumn,
 GroupIdentifier,
 LAG(GroupIdentifier) OVER (ORDER BY BoundaryColumn) AS GIDLag,
 LEAD(GroupIdentifier) OVER (ORDER BY BoundaryColumn) AS GIDLead
FROM MyTable
ORDER BY BoundaryColumn

结果看起来像这样:

BoundaryColumn  GroupIdentifier  GIDLag  GIDLead
1                  A                         A
3                  A                A        A
4                  A                A        A
7                  A                A        B
8                  B                A        B
9                  B                B        B
11                 B                B        A
13                 A                B        A
14                 A                A        A
15                 A                A        A
16                 A                A

如果添加逻辑以消除GIDLag = GIDLead = GroupIdentifier的所有行,则最终将得到边界:

WITH BoundX AS (
  SELECT * FROM (
    SELECT
     BoundaryColumn,
     GroupIdentifier,
     LAG(GroupIdentifier) OVER (ORDER BY BoundaryColumn) AS GIDLag,
     LEAD(GroupIdentifier) OVER (ORDER BY BoundaryColumn) AS GIDLead
    FROM MyTable
    ORDER BY BoundaryColumn
  )
  WHERE GIDLag IS NULL OR GroupIdentifier <> GIDLag
     OR GIDLead IS NULL OR GroupIdentifier <> GIDLead
)
SELECT
  BoundaryColumn AS MIN,
  LEAD(BoundaryColumn) OVER (ORDER BY BoundaryColumn) AS MAX,
  GroupIdentifier AS GROUPID,
  GIDLag,
  GIDLead
FROM BoundX

加上此结果,结果是:

MIN MAX GROUPID GIDLAG GIDLEAD
--- --- ------- ------ -------
  1   7 A              A
  7   8 A       A      B
  8  11 B       A      B
 11  13 B       B      A
 13  16 A       B      A
 16     A       A

最后,仅包括GroupID = GIDLead的那些行.这是此答案顶部的查询.结果是:

MIN MAX GROUPID
--- --- -------
  1   7 A
  8  11 B
 13  16 A
上一篇:c#-ORA-21500:内部错误代码,参数:[%s],[%s],[%s],[%s],[%s],[%s],[%s],[%s ]


下一篇:完全卸载oracle11g步骤