0%

SQL-Count-by-Block

分块分别计数

按理说SQL是并行对数据处理的,对于有序的数据处理非常困难。其中窗口函数可以实现对数据的串行操作,如对表格中数据按照成绩排名,分组排名等。这些操作其实是块间计数,要实现块内计数比较复杂。如下图,目标是实现,对A,B,C等块内计数。

实现这种计数的原理还是窗口函数,但是需要定位每块的起始位置,具体代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
SELECT  num
,seq
,class
,la
,diff
,cum
,ROW_NUMBER() OVER(PARTITION BY num, cum ORDER BY seq) AS rn
FROM (
SELECT num
,seq
,class
,la
,diff
,SUM(diff) OVER(PARTITION BY num ORDER BY seq) AS cum
FROM (
SELECT num
,seq
,class
,LAG(class, 1) OVER(PARTITION BY num ORDER BY seq) AS la
,IF( class = LAG(class, 1) OVER(PARTITION BY num ORDER BY seq) ,0 ,1 ) AS diff
FROM (
SELECT 1 AS num
,seq
,class
FROM alihealth_algo_dev.toy_yaoguang
) org_tab
) block_tb
) res_tb
;