How to limit the number of rows per field value in SQL?

2021-12-28 00:00:00 sql hive mysql greatest-n-per-group

For example, I have a table like this in Hive:

1 1
1 4
1 8
2 1
2 5
3 1
3 2

I want to return only the first two rows for each unique value of the first column, so that I can limit the amount of data I transfer from Hive into MySQL for reporting purposes. I'd like a single HiveQL query that gives me this:

1 1
1 4
2 1
2 5
3 1
3 2

Recommended answer

Unfortunately, MySQL has no analytic (window) functions before version 8.0, so you have to use user variables instead. Assuming you have an auto-increment field:

mysql> create table mytab (
    -> id int not null auto_increment primary key,
    -> first_column int,
    -> second_column int
    -> ) engine = myisam;
Query OK, 0 rows affected (0.05 sec)

mysql> insert into mytab (first_column,second_column)
    -> values
    -> (1,1),(1,4),(2,10),(3,4),(1,4),(2,5),(1,6);
Query OK, 7 rows affected (0.00 sec)
Records: 7  Duplicates: 0  Warnings: 0

mysql> select * from mytab order by id;
+----+--------------+---------------+
| id | first_column | second_column |
+----+--------------+---------------+
|  1 |            1 |             1 |
|  2 |            1 |             4 |
|  3 |            2 |            10 |
|  4 |            3 |             4 |
|  5 |            1 |             4 |
|  6 |            2 |             5 |
|  7 |            1 |             6 |
+----+--------------+---------------+
7 rows in set (0.00 sec)

mysql> select
    -> id,
    -> first_column,
    -> second_column,
    -> row_num
    -> from (
    -> select *,
    -> @num := if(@first_column = first_column, @num:= @num + 1, 1) as row_num,
    -> @first_column:=first_column as c
    -> from mytab order by first_column,id) as t,(select @first_column:='',@num:=0) as r;
+----+--------------+---------------+---------+
| id | first_column | second_column | row_num |
+----+--------------+---------------+---------+
|  1 |            1 |             1 |       1 |
|  2 |            1 |             4 |       2 |
|  5 |            1 |             4 |       3 |
|  7 |            1 |             6 |       4 |
|  3 |            2 |            10 |       1 |
|  6 |            2 |             5 |       2 |
|  4 |            3 |             4 |       1 |
+----+--------------+---------------+---------+
7 rows in set (0.00 sec)

mysql> select
    -> id,
    -> first_column,
    -> second_column,
    -> row_num
    -> from (
    -> select *,
    -> @num := if(@first_column = first_column, @num:= @num + 1, 1) as row_num,
    -> @first_column:=first_column as c
    -> from mytab order by first_column,id) as t,(select @first_column:='',@num:=0) as r
    -> having row_num<=2;
+----+--------------+---------------+---------+
| id | first_column | second_column | row_num |
+----+--------------+---------------+---------+
|  1 |            1 |             1 |       1 |
|  2 |            1 |             4 |       2 |
|  3 |            2 |            10 |       1 |
|  6 |            2 |             5 |       2 |
|  4 |            3 |             4 |       1 |
+----+--------------+---------------+---------+
5 rows in set (0.02 sec)
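
The session above numbers the rows inside each first_column group and keeps those numbered 1 or 2. On engines that do have window functions (Hive 0.11 and later, MySQL 8.0 and later), the same numbering can be written with ROW_NUMBER(). The following is only a sketch under assumptions: it runs the query through Python's built-in sqlite3 module (which needs SQLite 3.25 or newer) purely so it can be tried without a server. It reuses the mytab table and column names from this answer, and the helper name top_n_per_group is made up for illustration. In Hive or MySQL you would replace the ? placeholder with a literal such as 2.

```python
import sqlite3

# Sample rows from the answer above: (first_column, second_column).
ROWS = [(1, 1), (1, 4), (2, 10), (3, 4), (1, 4), (2, 5), (1, 6)]

# Assumed query text: number rows within each first_column group by id,
# then keep only the rows whose number is at most n.
TOP_N_SQL = """
SELECT id, first_column, second_column, row_num
FROM (
    SELECT id, first_column, second_column,
           ROW_NUMBER() OVER (PARTITION BY first_column ORDER BY id) AS row_num
    FROM mytab
) AS t
WHERE row_num <= ?
ORDER BY first_column, id
"""


def top_n_per_group(rows, n):
    """Hypothetical helper: load rows into an in-memory table and
    return the first n rows (by id) for each first_column value."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE mytab ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "first_column INTEGER, second_column INTEGER)"
        )
        conn.executemany(
            "INSERT INTO mytab (first_column, second_column) VALUES (?, ?)",
            rows,
        )
        return conn.execute(TOP_N_SQL, (n,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in top_n_per_group(ROWS, 2):
        print(row)
```

With n set to 2 this keeps the same five rows as the HAVING row_num<=2 query above. If there is no id column to order by, the ORDER BY inside OVER (...) has to name whichever column defines "first" for your data.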
