Using the OVER(PARTITION BY) Function
October 26, 2010
Introduction to the OVER(PARTITION BY) function
Window functions
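Oracle has provided analytic functions since 8.1.6. An analytic function computes an aggregate value over a group of rows. It differs from an aggregate function in that it returns multiple rows per group, whereas an aggregate function returns only one row per group.
The windowing clause specifies the size of the data window the analytic function works on. This window can change from row to row, as the following examples show: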
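1. What can follow over:
over(order by salary): accumulates in salary order; with order by and no explicit window, the default window is range between unbounded preceding and current row
over(partition by deptno): partitions the rows by department
over(partition by deptno order by salary): partitions by department and accumulates in salary order within each department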
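The three forms above can be compared side by side. The following is a minimal sketch, assuming SQLite (3.25 or later, through Python's sqlite3 module) as a stand-in for Oracle; the emp table and its four rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: two departments with two employees each.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, deptno INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("a", 10, 1000), ("b", 10, 2000), ("c", 20, 1500), ("d", 20, 3000)],
)

rows = conn.execute("""
    SELECT ename, deptno, salary,
           sum(salary) OVER (ORDER BY salary)                     AS running_all,
           sum(salary) OVER (PARTITION BY deptno)                 AS dept_total,
           sum(salary) OVER (PARTITION BY deptno ORDER BY salary) AS running_dept
    FROM emp
    ORDER BY salary
""").fetchall()

for row in rows:
    print(row)
```

Every input row is kept in the output: running_all grows across the whole table, dept_total repeats the department total on each row, and running_dept restarts its accumulation for each department.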
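2. Window ranges:
over(order by salary range between 5 preceding and 5 following): the window contains the rows whose salary lies between the current row's salary minus 5 and plus 5.
Example:
--sum(s)over(order by s range between 2 preceding and 2 following) sums the rows whose s lies between the current row's s minus 2 and plus 2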
select name,class,s, sum(s)over(order by s range between 2 preceding and 2 following) mm from t2
adf 3 45 45 -- 45 minus 2 to 45 plus 2 is 43 to 47, and the only s in that range is 45
asdf 3 55 55
cfe 2 74 74
3dd 3 78 158 -- for 78 the range is 76 to 80, which contains 78 and 80; their sum is 158
fda 1 80 158
gds 2 92 92
ffd 1 95 190
dss 1 95 190
ddd 3 99 198
gf 3 99 198
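The range example can be reproduced outside Oracle. This is a sketch under the assumption that SQLite 3.28 or later (which accepts numeric range offsets) behaves the same way here; the t2 rows are the ones listed in this article.

```python
import sqlite3

# Rows of the class score table t2 as given in this article: (name, class, s).
T2 = [
    ("cfe", 2, 74), ("dss", 1, 95), ("ffd", 1, 95), ("fda", 1, 80),
    ("gds", 2, 92), ("gf", 3, 99), ("ddd", 3, 99), ("adf", 3, 45),
    ("asdf", 3, 55), ("3dd", 3, 78),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (name TEXT, class INTEGER, s INTEGER)")
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)", T2)

# The window is defined on the value of s: all rows with s in [s - 2, s + 2].
range_sums = dict(conn.execute("""
    SELECT name,
           sum(s) OVER (ORDER BY s RANGE BETWEEN 2 PRECEDING AND 2 FOLLOWING) AS mm
    FROM t2
""").fetchall())

for name, mm in sorted(range_sums.items(), key=lambda kv: kv[1]):
    print(name, mm)
```

Because the window is defined on values rather than on row positions, tied rows such as the two 95s always fall into each other's window.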
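over(order by salary rows between 5 preceding and 5 following): the window extends 5 rows before and 5 rows after the current row.
Example:
--sum(s)over(order by s rows between 2 preceding and 2 following) sums the current row together with the 2 rows before it and the 2 rows after it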
select name,class,s, sum(s)over(order by s rows between 2 preceding and 2 following) mm from t2
adf 3 45 174 (45+55+74=174)
asdf 3 55 252 (45+55+74+78=252)
cfe 2 74 332 (74+55+45+78+80=332)
3dd 3 78 379 (78+74+55+80+92=379)
fda 1 80 419
gds 2 92 440
ffd 1 95 461
dss 1 95 480
ddd 3 99 388
gf 3 99 293
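The rows example can be checked the same way. This is a sketch assuming SQLite 3.25 or later in place of Oracle. The extra tie-breaker on name inside the window is an addition made here so that the row order among equal scores is deterministic; the sums themselves do not depend on it.

```python
import sqlite3

# Rows of the class score table t2 as given in this article: (name, class, s).
T2 = [
    ("cfe", 2, 74), ("dss", 1, 95), ("ffd", 1, 95), ("fda", 1, 80),
    ("gds", 2, 92), ("gf", 3, 99), ("ddd", 3, 99), ("adf", 3, 45),
    ("asdf", 3, 55), ("3dd", 3, 78),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (name TEXT, class INTEGER, s INTEGER)")
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)", T2)

# The window is defined on row positions: 2 rows before through 2 rows after.
rows = conn.execute("""
    SELECT name, s,
           sum(s) OVER (ORDER BY s, name
                        ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) AS mm
    FROM t2
    ORDER BY s, name
""").fetchall()

row_sums = [mm for _, _, mm in rows]
print(row_sums)
```

Near the first and last rows the window simply contains fewer rows, which is why the sums shrink again at the end of the list.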
over(order by salary range between unbounded preceding and unbounded following) or
over(order by salary rows between unbounded preceding and unbounded following): the window is unbounded and covers every row of the partition
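To see what the unbounded window changes, it helps to compare it with the window that applies when only order by is given. This is a sketch assuming SQLite 3.25 or later, which uses the same default window as Oracle (range between unbounded preceding and current row); the scores are the s values of t2 from this article.

```python
import sqlite3

# The s column of t2 as listed in this article.
SCORES = [74, 95, 95, 80, 92, 99, 99, 45, 55, 78]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (s INTEGER)")
conn.executemany("INSERT INTO t2 VALUES (?)", [(s,) for s in SCORES])

rows = conn.execute("""
    SELECT s,
           sum(s) OVER (ORDER BY s) AS default_window,
           sum(s) OVER (ORDER BY s
                        ROWS BETWEEN UNBOUNDED PRECEDING
                                 AND UNBOUNDED FOLLOWING) AS unbounded_window
    FROM t2
    ORDER BY s
""").fetchall()

default_sums = [d for _, d, _ in rows]
unbounded_sums = [u for _, _, u in rows]
print(default_sums)
print(unbounded_sums)
```

The default window gives a running total in which tied rows share one value, while the unbounded window gives the grand total on every row.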
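3. Functions commonly combined with over
Using row_number() over(), rank() over() and dense_rank() over()
The class score table t2 is used below to illustrate them.
t2 contains the following rows (name, class, s):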
cfe 2 74
dss 1 95
ffd 1 95
fda 1 80
gds 2 92
gf 3 99
ddd 3 99
adf 3 45
asdf 3 55
3dd 3 78
select * from
(
select name,class,s,rank()over(partition by class order by s desc) mm from t2
)
where mm=1;
The result is:
dss 1 95 1
ffd 1 95 1
gds 2 92 1
gf 3 99 1
ddd 3 99 1
Notes:
1. Do not use row_number() to find the top score: if two students in the same class tie for first place, row_number() returns only one of them;
select * from
(
select name,class,s,row_number()over(partition by class order by s desc) mm from t2
)
where mm=1;
1 95 1 -- two students scored 95, but only one is shown
2 92 1
3 99 1 -- two students scored 99, but again only one is shown
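2. rank() and dense_rank() return all of the tied rows:
As shown above, rank() returns everyone tied for first place;
Difference between rank() and dense_rank():
--rank() skips ranks after a tie: if two rows tie for second place, the next rank is fourth;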
select name,class,s,rank()over(partition by class order by s desc) mm from t2
dss 1 95 1
ffd 1 95 1
fda 1 80 3 -- jumps straight to 3
gds 2 92 1
cfe 2 74 2
gf 3 99 1
ddd 3 99 1
3dd 3 78 3
asdf 3 55 4
adf 3 45 5
--dense_rank() ranks consecutively: if two rows tie for second place, the next rank is still third
select name,class,s,dense_rank()over(partition by class order by s desc) mm from t2
dss 1 95 1
ffd 1 95 1
fda 1 80 2 -- consecutive ranking (the next rank is 2)
gds 2 92 1
cfe 2 74 2
gf 3 99 1
ddd 3 99 1
3dd 3 78 2
asdf 3 55 3
adf 3 45 4
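The three ranking functions can be put next to each other in one query. This is a sketch assuming SQLite 3.25 or later in place of Oracle; the tie-breaker on name in row_number() is an addition made here so that the output is reproducible.

```python
import sqlite3

# Rows of the class score table t2 as given in this article: (name, class, s).
T2 = [
    ("cfe", 2, 74), ("dss", 1, 95), ("ffd", 1, 95), ("fda", 1, 80),
    ("gds", 2, 92), ("gf", 3, 99), ("ddd", 3, 99), ("adf", 3, 45),
    ("asdf", 3, 55), ("3dd", 3, 78),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (name TEXT, class INTEGER, s INTEGER)")
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)", T2)

rows = conn.execute("""
    SELECT name, class, s,
           row_number() OVER (PARTITION BY class ORDER BY s DESC, name) AS rn,
           rank()       OVER (PARTITION BY class ORDER BY s DESC)       AS rk,
           dense_rank() OVER (PARTITION BY class ORDER BY s DESC)       AS drk
    FROM t2
    ORDER BY class, s DESC, name
""").fetchall()

class3 = [r for r in rows if r[1] == 3]
first_by_row_number = [r[0] for r in rows if r[3] == 1]
first_by_rank = [r[0] for r in rows if r[4] == 1]

for r in class3:
    print(r)
print(first_by_row_number, first_by_rank)
```

Filtering on rn = 1 keeps one row per class, while filtering on rk = 1 keeps every student tied for first place.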
--Using sum() over()
select name,class,s, sum(s)over(partition by class order by s desc) mm from t2 -- running sum of scores within each class
dss 1 95 190 -- the two 95s tie for first place, so the running sum adds both tied rows
ffd 1 95 190
fda 1 80 270 -- the first-place scores plus the second-place score
gds 2 92 92
cfe 2 74 166
gf 3 99 198
ddd 3 99 198
3dd 3 78 276
asdf 3 55 331
adf 3 45 376
Using first_value() over() and last_value() over()
-- Get the first and the last record type for each of these three circuits
SELECT opr_id, res_type,
       first_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) low,
       last_value(res_type) over(PARTITION BY opr_id ORDER BY res_type rows BETWEEN unbounded preceding AND unbounded following) high
  FROM rm_circuit_route
 WHERE opr_id IN ('000100190000000000021311','000100190000000000021355','000100190000000000021339')
 ORDER BY opr_id;
Note on rows BETWEEN unbounded preceding AND unbounded following
-- Result of last_value without rows BETWEEN unbounded preceding AND unbounded following
SELECT opr_id, res_type,
       first_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) low,
       last_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) high
  FROM rm_circuit_route
 WHERE opr_id IN ('000100190000000000021311','000100190000000000021355','000100190000000000021339')
 ORDER BY opr_id;
Without rows BETWEEN unbounded preceding AND unbounded following, the default window applies, and with order by res_type that window ends at the current row (including rows with the same res_type). last_value therefore returns the current row's res_type instead of the last res_type of the whole circuit.
Using ignore nulls with first_value and last_value
When fetching the first record of a circuit, adding ignore nulls means that if the inspected column is null in the first record, the next non-null value is returned instead.
--lag() over() usage (returns the value from the row n rows before the current row)
lag(expression,<offset>,<default>)
with a as (select 1 id,'a' name from dual
           union select 2 id,'b' name from dual
           union select 3 id,'c' name from dual
           union select 4 id,'d' name from dual
           union select 5 id,'e' name from dual)
select id,name,lag(id,1,'')over(order by name) from a;
--lead() over() usage (returns the value from the row n rows after the current row)
lead(expression,<offset>,<default>)
with a as (select 1 id,'a' name from dual
           union select 2 id,'b' name from dual
           union select 3 id,'c' name from dual
           union select 4 id,'d' name from dual
           union select 5 id,'e' name from dual)
select id,name,lead(id,1,'')over(order by name) from a;
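The lag() and lead() queries can be tried without an Oracle instance. This is a sketch assuming SQLite 3.25 or later; the five rows mirror the id/name pairs used in this article, the with clause uses VALUES instead of dual, and the default argument is left out, so rows without a neighbour yield NULL (None in Python).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

rows = conn.execute("""
    WITH a(id, name) AS (
        VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')
    )
    SELECT id, name,
           lag(id, 1)  OVER (ORDER BY name) AS prev_id,  -- id of the previous row
           lead(id, 1) OVER (ORDER BY name) AS next_id   -- id of the next row
    FROM a
    ORDER BY name
""").fetchall()

for row in rows:
    print(row)
```

The first row has no previous row and the last row has no next row, so those two positions come back as None.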
--ratio_to_report(a): the expression in the parentheses of ratio_to_report() is the numerator, and the over() clause defines the denominator
with a as (select 1 a from dual union all
           select 1 a from dual union all
           select 1 a from dual union all
           select 2 a from dual union all
           select 3 a from dual union all
           select 4 a from dual union all
           select 4 a from dual union all
           select 5 a from dual)
select a, ratio_to_report(a)over(partition by a) b from a order by a;
with a as (select 1 a from dual union all
           select 1 a from dual union all
           select 1 a from dual union all
           select 2 a from dual union all
           select 3 a from dual union all
           select 4 a from dual union all
           select 4 a from dual union all
           select 5 a from dual)
select a, ratio_to_report(a)over() b from a -- with an empty over(), the denominator is the total over all rows
order by a;
with a as (select 1 a from dual union all
           select 1 a from dual union all
           select 1 a from dual union all
           select 2 a from dual union all
           select 3 a from dual union all
           select 4 a from dual union all
           select 4 a from dual union all
           select 5 a from dual)
select a, ratio_to_report(a)over() b from a group by a order by a; -- share of each value after grouping
percent_rank usage
How it is computed: (rank within the group - 1) divided by (number of rows in the group - 1). In the query below, the manually computed pr1 equals pr2, the value returned by percent_rank:
SELECT a.deptno, a.ename, a.sal, a.r, b.n,
       (a.r-1)/(n-1) pr1,
       percent_rank() over(PARTITION BY a.deptno ORDER BY a.sal) pr2
  FROM (SELECT deptno, ename, sal,
               rank() over(PARTITION BY deptno ORDER BY sal) r -- rank within the group
          FROM emp
         ORDER BY deptno, sal) a,
       (SELECT deptno, COUNT(1) n FROM emp GROUP BY deptno) b -- number of members in each department
 WHERE a.deptno = b.deptno;
cume_dist function
How it is computed: the rank within the group divided by the number of rows in the group; if there are ties, add (number of tied rows - 1) to the rank first. In the query below, the manually computed pr1 equals pr2, the value returned by cume_dist:
SELECT a.deptno, a.ename, a.sal, a.r, b.n, c.rn,
       (a.r + c.rn - 1) / n pr1,
       cume_dist() over(PARTITION BY a.deptno ORDER BY a.sal) pr2
  FROM (SELECT deptno, ename, sal,
               rank() over(PARTITION BY deptno ORDER BY sal) r
          FROM emp
         ORDER BY deptno, sal) a,
       (SELECT deptno, COUNT(1) n FROM emp GROUP BY deptno) b,
       (SELECT deptno, r, COUNT(1) rn, sal
          FROM (SELECT deptno, sal,
                       rank() over(PARTITION BY deptno ORDER BY sal) r
                  FROM emp)
         GROUP BY deptno, r, sal
         ORDER BY deptno) c -- c gives, per department, the number of employees sharing the same salary
 WHERE a.deptno = b.deptno
   AND a.deptno = c.deptno(+)
   AND a.sal = c.sal;
percentile_cont function
Meaning: given a percentile (on the scale computed by percent_rank), it returns the value at that position, interpolating linearly between the two neighbouring rows when no row matches exactly.
Below, the input is 0.7. Because 0.7 lies halfway between 0.6 and 0.8, the result is the average of 1500 (the sal at 0.6) and 1600 (the sal at 0.8), that is 1550.
SELECT ename, sal, deptno,
       percentile_cont(0.7) within GROUP(ORDER BY sal) over(PARTITION BY deptno) "Percentile_Cont",
       percent_rank() over(PARTITION BY deptno ORDER BY sal) "Percent_Rank"
  FROM emp
 WHERE deptno IN (30, 60);
If the input is 0.6, the sal at 0.6 is returned directly, that is 1500.
SELECT ename, sal, deptno,
       percentile_cont(0.6) within GROUP(ORDER BY sal) over(PARTITION BY deptno) "Percentile_Cont",
       percent_rank() over(PARTITION BY deptno ORDER BY sal) "Percent_Rank"
  FROM emp
 WHERE deptno IN (30, 60);
PERCENTILE_DISC function
Description: returns the data value that corresponds to the given distribution percentage. The distribution percentage is computed as in CUME_DIST. If no data value corresponds to it exactly, the value at the next higher distribution value is returned.
Note: this function differs from PERCENTILE_CONT in how the substitute value is computed when there is no exactly matching distribution value.
SAMPLE: in the example below, 0.7 has no matching Cume_Dist value in department 30, so the SALARY at the next distribution value, 0.83333333, is returned instead.
SELECT ename, sal, deptno,
       percentile_disc(0.7) within GROUP(ORDER BY sal) over(PARTITION BY deptno) "Percentile_Disc",
       cume_dist() over(PARTITION BY deptno ORDER BY sal) "Cume_Dist"
  FROM emp
 WHERE deptno IN (30, 60);
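The formulas for percent_rank, cume_dist, percentile_cont and percentile_disc can be checked numerically. This is a sketch with several assumptions: SQLite 3.25 or later stands in for Oracle for percent_rank() and cume_dist(); the six salaries are assumed to be those of department 30 in the classic Oracle emp demo table; and percentile_cont and percentile_disc are small Python helpers written here to follow the descriptions in this article, not Oracle's implementation.

```python
import sqlite3

# Assumed department 30 salaries from the classic emp demo table.
sals = [950, 1250, 1250, 1500, 1600, 2850]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?)", [(s,) for s in sals])

rows = conn.execute("""
    SELECT sal,
           rank()         OVER (ORDER BY sal) AS r,
           percent_rank() OVER (ORDER BY sal) AS pr,
           cume_dist()    OVER (ORDER BY sal) AS cd
    FROM emp
    ORDER BY sal
""").fetchall()

# Compare the built-in values with the manual formulas from the text.
n = len(sals)
formulas_match = all(
    abs(pr - (r - 1) / (n - 1)) < 1e-9
    and abs(cd - (r + sals.count(sal) - 1) / n) < 1e-9
    for sal, r, pr, cd in rows
)


def percentile_cont(values, p):
    """Hypothetical helper: linear interpolation at percent_rank position p."""
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)          # zero-based fractional row position
    lo = int(pos)                          # row at or below the position
    hi = min(lo + 1, len(ordered) - 1)     # row above it (clamped at the end)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def percentile_disc(values, p):
    """Hypothetical helper: first value whose cumulative share reaches p."""
    ordered = sorted(values)
    for i, value in enumerate(ordered, start=1):
        if i / len(ordered) >= p:
            return value
    return ordered[-1]


print(formulas_match)
print(percentile_cont(sals, 0.7), percentile_cont(sals, 0.6))
print(percentile_disc(sals, 0.7))
```

With this data, percentile_cont(0.7) falls midway between 1500 and 1600, while percentile_disc(0.7) picks an existing salary (1600) because it never interpolates.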