hive:函数:排名函数:Rank(笔记)
hive:函数:排名函数:Rank(笔记)Rank
1.函数说明
RANK() 排序(排名)相同时会重复,总数不会变
DENSE_RANK() 排序(排名)相同时会重复,总数会减少
ROW_NUMBER() 依次进⾏排名
2.数据准备
表6-7 数据准备
name subject score
孙悟空语⽂87
孙悟空数学95
孙悟空英语68
⼤海语⽂94
⼤海数学56
⼤海英语84
宋宋语⽂64
宋宋数学86
宋宋英语84
婷婷语⽂65
婷婷数学85
婷婷英语78
3.需求
计算每门学科成绩排名
rank函数的用法
4.创建本地,导⼊数据
5.创建hive表并导⼊数据
create table score(
name string,
subject string,
score int)
row format delimited fields terminated by "\t";
load data local inpath '/opt/module/' into table score;
6.按需求查询数据
select name,
subject,
score,
rank() over(partition by subject order by score desc) rank,
dense_rank() over(partition by subject order by score desc) dense_rank,
row_number() over(partition by subject order by score desc) row_number
from tmp.score;
扩展:求出每门学科前三名的学⽣?
select * from
(
select
name,
subject,
score,
rank() over(partition by subject order by score desc ) rank,
dense_rank() over(partition by subject order by score desc) dense_rank,
row_number() over(partition by subject order by score desc) row_number
from tmp.score
) s where row_number<=3
注意:partition后边有哪个字段只会影响最后的rank的排名,跟去重没关系,select后边字段可以指定多个,不⼀定跟partition后边的字段⼀致。
举例:
=================sql1:
select
create_time,creator_id,fullname,expectCity,positionName,keywords,row_number() over(partition by creator_id,fullname order by create_time desc) as rank
from (
select from_unixtime(ate_time/1000),'yyyy-MM-dd') create_ator_id,e.fullname,
from tmp.t_conditions c
--ator_id,explode(split(regexp_replace(regexp_replace(expectCity,'\\[',''),'\\]',''),',')) city from tmp.t_conditions c
left join ods.ods_aimsen_walre_anlle_base_employees e
ator_id=e.number
ator_id not in ('130000062','130000063','130000071','430000003')
and c.tag in(2)
and c.phone = ''
sumeId = ''
and c.name = ''
and from_unixtime(ate_time/1000),'yyyy-MM-dd') > '2020-07-20'
--and expectCity !='' and degree !=''  and (maxAge!='' or minAge!='') and (positionName !='' or keyWords!='')
and creator_id=100400287
group by create_ator_id,e.pectCity,c.positionName,c.keywords,c.degree,c.minage,c.maxage,c.isNationalUnified,c.isFamous,cpany    ) x
=================sql2:
select
create_time,creator_id,fullname,expectCity,positionName,keywords,row_number() over(partition by creator_id,fullname,expectCity order by create_time desc)    from (
select from_unixtime(ate_time/1000),'yyyy-MM-dd') create_ator_id,e.fullname,
from tmp.t_conditions c
--ator_id,explode(split(regexp_replace(regexp_replace(expectCity,'\\[',''),'\\]',''),',')) city from tmp.t_conditions c
left join ods.ods_aimsen_walre_anlle_base_employees e
ator_id=e.number
ator_id not in ('130000062','130000063','130000071','430000003')
and c.tag in(2)
and c.phone = ''
sumeId = ''
and c.name = ''
and from_unixtime(ate_time/1000),'yyyy-MM-dd') > '2020-07-20'
--and expectCity !='' and degree !=''  and (maxAge!='' or minAge!='') and (positionName !='' or keyWords!='')
and creator_id=100400287
group by create_ator_id,e.pectCity,c.positionName,c.keywords,c.degree,c.minage,c.maxage,c.isNationalUnified,c.isFamous,cpany    ) x
sql1和sql2区别为:
sql2多个expectCity字段
create_time,creator_id,fullname,expectCity,positionName,keywords,row_number() over(partition by
creator_id,fullname,expectCity order by create_time desc) as rank
sql1查询效果:
sql2查询效果:
r ank排名发⽣了变化,但最终查询出的结果条数并没有改变。

版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系QQ:729038198,我们将在24小时内删除。