自Hive 0.70之后,HAVING 子句可以用于对聚集结果进行条件过滤。该用法和SQL中类似,这样我们就不需要把聚集语句作为子查询来过滤相关结果。请看以下示例,
> SELECT
> gender_age.age
> FROM employee
> GROUP BY gender_age.age
> HAVING count(*)=1;
+----------------+
| gender_age.age |
+----------------+
| 27 |
| 30 |
| 35 |
| 57 |
+----------------+
4 rows selected (25.829 seconds)
> SELECT
> gender_age.age,
> count(*) as cnt -- Support use column alias in HAVING, like ORDER BY
> FROM employee
> GROUP BY gender_age.age HAVING cnt=1;
+----------------+-----+
| gender_age.age | cnt |
+----------------+-----+
| 27 | 1 |
| 30 | 1 |
| 35 | 1 |
| 57 | 1 |
+----------------+-----+
4 rows selected (25.804 seconds)
此外,HAVING子句支持针对非聚集列进行过滤,这一功能在SQL语句中没有。不过,为了更好的性能,不推荐这样使用,推荐在WHERE子句中加入相关过滤。以下为不使用HAVING,而用子查询的示例,
> SELECT
> a.age
> FROM (
> SELECT count(*) as cnt, gender_age.age
> FROM employee GROUP BY gender_age.age
> ) a WHERE a.cnt <= 1;
+--------+| a.age |+--------+
| 57 |
| 27 |
| 35 |
+--------+
3 rows selected (87.298 seconds)