Hive: Chapter 2 (Basic Hive Operations)

2020-07-01 00:00:00 Topics: data, commands, import, partitioned tables

Background:

Common Hive commands

1. Create a new table

CREATE TABLE t_hive (a int, b int, c int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

For example:

create table user_info (user_id int, cid string, ckid string, username string)

row format delimited

fields terminated by '\t'

lines terminated by '\n';

The data file loaded into this table must use a tab character between fields and a newline between rows.

2. Load the data file t_hive.txt into the t_hive table

LOAD DATA LOCAL INPATH '/home/cos/demo/t_hive.txt' OVERWRITE INTO TABLE t_hive ;

3. List tables whose names match a pattern

show tables '*t*';
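In Hive releases contemporary with this walkthrough, the pattern accepted by SHOW TABLES is a restricted wildcard rather than a full regular expression: `*` matches any run of characters and `|` separates alternatives. A small sketch, assuming the tables t_hive and user_info from the examples above exist:

```sql
-- List tables whose names start with "t_".
SHOW TABLES 't_*';

-- List tables matching either alternative.
SHOW TABLES 't_hive*|user_*';
```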

4. Add a column

ALTER TABLE t_hive ADD COLUMNS (new_col String);

5. Rename a table

ALTER TABLE t_hive RENAME TO t_hadoop;

6. Load data from HDFS

LOAD DATA INPATH '/user/hive/warehouse/t_hive/t_hive.txt' OVERWRITE INTO TABLE t_hive2;
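The two LOAD DATA forms differ in what happens to the source file: with LOCAL the file is copied from the client machine, while without LOCAL the file already in HDFS is moved into the table's directory. A sketch of the difference, assuming t_hive2 has the same layout as t_hive; /tmp/staging/t_hive.txt is a hypothetical HDFS staging path, not one taken from the commands above:

```sql
-- Assumed: t_hive2 mirrors t_hive's columns and delimiter.
CREATE TABLE IF NOT EXISTS t_hive2 LIKE t_hive;

-- Copies the local file; the original stays on the local disk.
LOAD DATA LOCAL INPATH '/home/cos/demo/t_hive.txt' INTO TABLE t_hive2;

-- Moves the HDFS file (hypothetical path); it is gone from the source location afterwards.
LOAD DATA INPATH '/tmp/staging/t_hive.txt' OVERWRITE INTO TABLE t_hive2;
```

Without OVERWRITE the new file is added alongside the existing data; with OVERWRITE the existing contents of the table are replaced.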

7. Import data from another table

INSERT OVERWRITE TABLE t_hive2 SELECT * FROM t_hive ;

8. Create a table and populate it from another table

CREATE TABLE t_hive AS SELECT * FROM t_hive2 ;

9. Copy only the table structure, without the data

CREATE TABLE t_hive3 LIKE t_hive;

10. Export query results to the local file system

INSERT OVERWRITE LOCAL DIRECTORY '/tmp/t_hive' SELECT * FROM t_hive;
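By default the exported files use Hive's default field separator (Ctrl-A, \001), which is awkward to read in a text editor. Since Hive 0.11 a row format can be given for the export. A sketch, where /tmp/t_hive_tsv is a hypothetical output directory:

```sql
-- Write tab-separated text files; the target directory is overwritten.
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/t_hive_tsv'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT * FROM t_hive;
```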

11. Query with HiveQL

from ( select b,c as c2 from t_hive) t select t.b, t.c2 limit 2;

select b,c from t_hive limit 2;

12. Create a view

CREATE VIEW v_hive AS SELECT a,b FROM t_hive;
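A view stores only the query definition, so it is read like a table and reflects the current contents of t_hive. A minimal sketch using the view defined above:

```sql
-- Query the view as if it were a table.
SELECT a, b FROM v_hive LIMIT 2;

-- Remove the view; the underlying table t_hive is unaffected.
DROP VIEW IF EXISTS v_hive;
```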

13. Drop a table

drop table if exists t_hft;

14. Create a partitioned table

DROP TABLE IF EXISTS t_hft;
CREATE TABLE t_hft(
SecurityID STRING,
tradeTime STRING,
PreClosePx DOUBLE
) PARTITIONED BY (tradeDate INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

15. Load data into a partition

load data local inpath '/home/BlueBreeze/data/t_hft_1.csv' overwrite into table t_hft partition(tradeDate=20130627);

16. List the partitions of a table

SHOW PARTITIONS t_hft;
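Each partition is stored in its own subdirectory, and the partition column can be used in a WHERE clause so that only the matching partition is read. A sketch, assuming a second file t_hft_2.csv (a hypothetical name) holding the data for 20130628:

```sql
-- Load another day's file into its own partition (file name is hypothetical).
LOAD DATA LOCAL INPATH '/home/BlueBreeze/data/t_hft_2.csv'
OVERWRITE INTO TABLE t_hft PARTITION (tradeDate=20130628);

-- Filter on the partition column to read a single partition.
SELECT SecurityID, tradeTime, PreClosePx
FROM t_hft
WHERE tradeDate = 20130627
LIMIT 10;

-- Drop a partition and its data when it is no longer needed.
ALTER TABLE t_hft DROP IF EXISTS PARTITION (tradeDate=20130628);
```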

2.1 Basic Hive operations

1. Test: load a local file into a table

1) Create a local "vocabulary list" file

Command and file contents:

vim vocab.txt

------------------------------ Contents ------------------------------

1.ability

2.ambition

3.headquarters

4.industrialize

------------------------------ Contents ------------------------------

2) Enter the Hive shell

Command:

hive

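Note: this command can be run directly only after the Hive environment variables have been set (for example, with Hive's bin directory on PATH).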

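3) Create the table, then check that it exists and inspect its structure

Create a table to hold the words of the vocabulary list; fields are separated by a period (".").

Commands: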

create table VOCAB(num int,word string)row format delimited fields terminated by '.';

show tables;

desc VOCAB;

4) Load the data into the table

Command:

load data local inpath '/home/hadoop/vocab.txt' overwrite into table VOCAB;

5) Query the contents of the table

Command:

select * from VOCAB;
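If the delimiter in the file does not match the table definition, Hive does not reject the load; values that cannot be parsed simply come back as NULL. A quick check along these lines can catch that, assuming the VOCAB table above:

```sql
-- Number of rows loaded; expected to equal the number of lines in vocab.txt.
SELECT count(*) FROM VOCAB;

-- Rows where a field failed to parse; ideally this returns nothing.
SELECT * FROM VOCAB WHERE num IS NULL OR word IS NULL;
```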

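2. Word-frequency statistics

1) Create a local word-frequency file in which each word has a different count

Command and file contents: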

vim wordCount.txt

------------------------------ Contents ------------------------------

I,100

have,1000

a,200

pen,3000

you,2222

are,777

amazing,9999

------------------------------ Contents ------------------------------

2) Enter the Hive shell

Command:

hive

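3) Create the table, then check that it exists and inspect its structure

Create a table to hold the word-frequency entries; fields are separated by a comma (",").

Commands: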

create table WOCO(word string,count int)row format delimited fields terminated by ',';

show tables;

desc WOCO;

4) Load the data into the table

Command:

load data local inpath '/home/hadoop/wordCount.txt' overwrite into table WOCO;

5) Query the contents of the table

Command:

select * from WOCO;

6) Run filtering queries, which Hive may execute as MapReduce jobs

Commands:

select WOCO.word from WOCO;

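-- keep only the words that occur more than 1000 times
select * from WOCO where WOCO.count>1000;

-- take 3 rows after sorting by count in descending order (SORT BY sorts within each reducer; ORDER BY gives a total order)
select * from WOCO sort by count desc limit 3;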
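SORT BY only orders rows within each reducer, whereas ORDER BY produces a single total order, which is usually what a top-N query wants. A sketch, assuming the WOCO table above; the column name count is quoted with backticks here purely to keep it visually distinct from the count() function:

```sql
-- Three most frequent words, using a total order.
SELECT word, `count`
FROM WOCO
ORDER BY `count` DESC
LIMIT 3;

-- Sum of all counts, as a simple aggregate.
SELECT sum(`count`) AS total FROM WOCO;
```

With the sample file loaded as shown, the first query should list amazing, pen, and you.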
