.開窗函數
1.聚合函數 : 多行數據 按照一定規則 進行聚合 為一行
sum avg max 。。。
理論上: 聚合後的行數 <= 聚合前的行數 [看維度選取 groupby 裏面的字段]2.需求:
既要顯示 聚合前的數據 又要顯示 聚合後的數據 ?
eg:
id name sal dt sal_all
1 zs 1000 2022-4 1000
2 ls 2000 2022-4 2000
3 ww 3000 2022-4 3000
1 zs 1000 2022-5 2000
2 ls 2000 2022-5 4000
3 ww 3000 2022-5 6000窗口函數:
窗口 + 函數
窗口: 函數運行時 計算的數據集的範圍
函數:運行時的函數
1.聚合函數
2.內置窗口函數語法結構:
函數 over([partition by xxx,...] [order by xxx,...] )
over() :以誰進行開窗 【table】
partition by: 以誰進行分組 【group by column】
order by: 以誰進行排序 【column】窗口函數:
數據:
服務器 每天的啓動 次數linux01,2022-04-15,1
linux01,2022-04-16,5
linux01,2022-04-17,7
linux01,2022-04-18,2
linux01,2022-04-19,3
linux01,2022-04-20,10
linux01,2022-04-21,4統計累計問題:
需求:
每個服務器每天的累計 啓動次數
name dt cnt
linux01,2022-04-15,1
linux01,2022-04-16,5
linux01,2022-04-17,7
linux01,2022-04-18,2
linux01,2022-04-19,3
linux01,2022-04-20,10
linux01,2022-04-21,4name dt cnt cnt_all
linux01,2022-04-15,1 1
linux01,2022-04-16,5 6
linux01,2022-04-17,7 13
linux01,2022-04-18,2 15
linux01,2022-04-19,3 18
linux01,2022-04-20,10 28
linux01,2022-04-21,4 32create table window01(
name varchar(50),
dt varchar(20),
cnt int
);每個服務器每天的累計 啓動次數
1.聚合函數
COUNT,SUM,MIN,MAX,AVG
select
name,
dt,
cnt,
sum(cnt) over(partition by name order by dt ) as cnt_all
from window01;2.內置窗口函數
窗口大小:
select
name,
dt,
cnt,
sum(cnt) over(partition by name order by dt ) as sum_all,
– sum(cnt) over(partition by name order by dt ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) as sum_all1
– sum(cnt) over(partition by name order by dt ROWS BETWEEN 3 PRECEDING AND CURRENT ROW ) as sum_all2
– sum(cnt) over(partition by name order by dt ROWS BETWEEN 3 PRECEDING AND 1 FOLLOWING ) as sum_all3
– sum(cnt) over(partition by name order by dt ROWS BETWEEN 3 PRECEDING AND UNBOUNDED FOLLOWING) as sum_all4
sum(cnt) over(partition by name order by dt ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) as sum_all5from window01;
2.內置窗口函數
1.取值 串行2.排序
RANK
ROW_NUMBER
DENSE_RANK
CUME_DIST
PERCENT_RANK
NTILENTILE:
ntile
需求:
把數據按照某個字段進行排序 ,把數據分成幾分select
name,
dt,
cnt,
sum(cnt) over(partition by name order by dt ) as sum_all,
ntile(2) over(partition by name order by dt ) as n2,
ntile(3) over(partition by name order by dt ) as n3
from window01 ;
把數據平均分配 指定 N個桶 ,如果不能平均分配 ,優先分配到 編號 小的裏面RANK
ROW_NUMBER
DENSE_RANKRANK: 從1 開始 , 按照順序 相同會重複 名次會留下 空的位置 生成組內的記錄編號
ROW_NUMBER: 從1 開始 , 按照順序 生成組內的記錄編號
DENSE_RANK:從1 開始 , 按照順序 生成組內的記錄編號 相同會重複 名次不會會留下空的位置select
name,
dt,
cnt,
sum(cnt) over(partition by name order by dt ) as sum_all,
RANK() over(partition by name order by cnt desc ) as rk,
ROW_NUMBER() over(partition by name order by cnt desc) as rw,
DENSE_RANK() over(partition by name order by cnt desc ) as d_rk
from window01 ;1. 串行
LEAD
LAG
1. 串行
LEAD:窗口內 向下 第n行的值
LAG:窗口內 向上 第n行的值
LEAD(column,n,default)
column => 列名
n =》 取 幾行
default =》 取不到就給一個默認值
select
name,
dt,
cnt,
sum(cnt) over(partition by name order by dt ) as sum_all,
LEAD(dt,1,"9999-99-99") over(partition by name order by dt ) as lead_alias,
LAG(dt,1,"9999-99-99") over(partition by name order by dt ) as lag_alias
from window01 ;
2.取值
FIRST_VALUE
LAST_VALUE
本文章為轉載內容,我們尊重原作者對文章享有的著作權。如有內容錯誤或侵權問題,歡迎原作者聯繫我們進行內容更正或刪除文章。