Hive: Expression Not In Group By Key
I created a table in Hive. It has the following columns:
id bigint, rank bigint, date string
I want to get avg(rank) per month. I can use this command. It works.
select a.lens_id, avg(a.rank)
from tableA a
group by a.lens_id, year(a.date_saved), month(a.date_saved);
However, I also want to get date information. I use this command:
select a.lens_id, avg(a.rank), a.date_saved
from lensrank_archive a
group by a.lens_id, year(a.date_saved), month(a.date_saved);
It complains: Expression Not In Group By Key
answer:
The full error message should be in the format Expression Not In Group By Key [value]. The [value] will tell you what expression needs to be in the Group By.
Just looking at the two queries, I’d say that you need to add a.date_saved explicitly to the Group By.
question:
Yes. After adding a.date_saved, it works. However, it does not do what I want. I want avg(rank) per month, but now it no longer averages: because the query also groups by a.date_saved, it just shows every record.
answer:
You can’t select a column unless you either group by it or wrap it in an aggregate function. If you want to display a.date_saved as-is, you need to group by it. You might be able to display year(a.date_saved) and month(a.date_saved), since those are in the Group By, but I’m not 100% sure on that.
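As a sketch of that last suggestion: the query below selects the same year(...) and month(...) expressions that appear in the Group By, which Hive accepts because they are grouping keys. It assumes date_saved holds strings that year() and month() can parse (for example yyyy-MM-dd). The aliases yr, mon and avg_rank are illustrative names, not from the original post.

```sql
-- Average rank per lens per calendar month, with the month visible in the output.
-- yr, mon and avg_rank are illustrative aliases.
select a.lens_id,
       year(a.date_saved)  as yr,
       month(a.date_saved) as mon,
       avg(a.rank)         as avg_rank
from lensrank_archive a
group by a.lens_id, year(a.date_saved), month(a.date_saved);
```

This keeps one row per lens per month, so the average is still computed over the whole month rather than per individual date.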
Solution 2:
A workaround is to put the additional field in a collect_set and return the first element of the set. For example:
select a.lens_id, avg(a.rank), collect_set(a.date_saved)[0]
from lensrank_archive a
group by a.lens_id, year(a.date_saved), month(a.date_saved);
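Since collect_set makes no promise about the order of its elements, the "first" element is effectively an arbitrary date from the group. If a predictable value is preferred, one option is an ordinary aggregate such as min(). The sketch below assumes date_saved strings sort chronologically (for example yyyy-MM-dd); the aliases are illustrative.

```sql
-- Same grouping as above, but returns the earliest date_saved in each month
-- instead of an arbitrary one. avg_rank and first_date_saved are illustrative aliases.
select a.lens_id,
       avg(a.rank)       as avg_rank,
       min(a.date_saved) as first_date_saved
from lensrank_archive a
group by a.lens_id, year(a.date_saved), month(a.date_saved);
```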
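-- Average order value per customer (including returned and rejected orders). In HQL, to display a field in a select…group by… query, you must group by that field; otherwise, do not select that field.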
select user_id,sum(order_money)/count(order_money) from us_order;
Error:
Corrected answer:
select user_id,sum(order_money)/count(order_money) from us_order group by user_id;
Another answer:
select collect_set(user_id)[0],sum(order_money)/count(order_money) from us_order;