1. Hive默认显示当前使用库
、需要用时,即时配置,在cli执行属性设置,这种配置方式,当重新打开cli时,就会生效:
hive> set hive.cli.print.current.db=true; 、一次配置,永久生效,在当前用户的HOME目录下,新建.hiverc文件,把属性设置命令放置到该文件中,每次打开cli时,都会先执行该文件。
[yun@mini01 ~]$ pwd
/app
[yun@mini01 ~]$ cat .hiverc
set hive.cli.print.current.db=true; 、在hive配置文件中添加配置【推荐】,上一篇文章hive配置中已经有了该配置项
<!-- 显示当前使用的数据库 -->
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
<description>Whether to include the current database in the Hive prompt.</description>
</property>
2. 创建库
# 没有显示当前使用库
[yun@mini01 ~]$ hive Logging initialized using configuration in jar:file:/app/hive-1.2./lib/hive-common-1.2..jar!/hive-log4j.properties
hive> show databases; # 默认库为default
OK
default
Time taken: 0.774 seconds, Fetched: row(s)
# 创建库
hive> create database zhang;
OK
Time taken: 0.168 seconds
hive> show databases;
OK
default
zhang
Time taken: 0.02 seconds, Fetched: row(s)
浏览器访问
3. 创建表
# 默认显示当前使用库
hive (default)> show databases;
OK
default
zhang
Time taken: 0.729 seconds, Fetched: row(s)
hive (default)> use zhang;
OK
Time taken: 0.036 seconds
hive (zhang)> create table t_sz01(id int, name string)
> row format delimited
> fields terminated by ',';
OK
Time taken: 0.187 seconds
hive (zhang)> show tables;
OK
t_sz01
Time taken: 0.031 seconds, Fetched: row(s)
浏览器访问
4. 创建数据并上传
[yun@mini01 hive]$ cat sz.dat
,zhangsan
,李四
,wangwu
,赵六
,zhouqi
,孙八
[yun@mini01 hive]$ hadoop fs -put sz.dat /user/hive/warehouse/zhang.db/t_sz01 # 上传
[yun@mini01 hive]$ hadoop fs -ls /user/hive/warehouse/zhang.db/t_sz01/
Found items
-rw-r--r-- yun supergroup -- : /user/hive/warehouse/zhang.db/t_sz01/sz.dat
[yun@mini01 hive]$ hadoop fs -cat /user/hive/warehouse/zhang.db/t_sz01/sz.dat
,zhangsan
,李四
,wangwu
,赵六
,zhouqi
,孙八
5. Hive查询数据
hive (zhang)> show tables;
OK
t_sz01
Time taken: 0.028 seconds, Fetched: row(s)
hive (zhang)> select * from t_sz01; # 全表查询
OK
zhangsan
李四
wangwu
赵六
zhouqi
孙八
Time taken: 0.264 seconds, Fetched: row(s)
hive (zhang)> select count() from t_sz01; # 表数据条数
Query ID = yun_20180703213443_ebca743c--405a--59359e1566c2
Total jobs =
Launching Job out of
Number of reduce tasks determined at compile time:
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1530619476293_0001, Tracking URL = http://mini02:8088/proxy/application_1530619476293_0001/
Kill Command = /app/hadoop/bin/hadoop job -kill job_1530619476293_0001
Hadoop job information for Stage-: number of mappers: ; number of reducers:
-- ::, Stage- map = %, reduce = %
-- ::, Stage- map = %, reduce = %, Cumulative CPU 2.5 sec
-- ::, Stage- map = %, reduce = %, Cumulative CPU 6.37 sec
MapReduce Total cumulative CPU time: seconds msec
Ended Job = job_1530619476293_0001
MapReduce Jobs Launched:
Stage-Stage-: Map: Reduce: Cumulative CPU: 6.37 sec HDFS Read: HDFS Write: SUCCESS
Total MapReduce CPU Time Spent: seconds msec
OK Time taken: 25.312 seconds, Fetched: row(s)
hive (zhang)> select id,name from t_sz01 where id >; # 查询id>
OK
李四
zhouqi
孙八
Time taken: 0.126 seconds, Fetched: row(s)
hive (zhang)> select id,name from t_sz01 where id > limit ; # 不能使用 limit m,n
OK
李四
zhouqi
Time taken: 0.072 seconds, Fetched: row(s)
hive (zhang)> select id, name from t_sz01 order by name; # 使用order by 排序
Query ID = yun_20180703214314_db222afe--4c8e--73aa4fec62ef
Total jobs =
Launching Job out of
Number of reduce tasks determined at compile time:
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1530619476293_0002, Tracking URL = http://mini02:8088/proxy/application_1530619476293_0002/
Kill Command = /app/hadoop/bin/hadoop job -kill job_1530619476293_0002
Hadoop job information for Stage-: number of mappers: ; number of reducers:
-- ::, Stage- map = %, reduce = %
-- ::, Stage- map = %, reduce = %, Cumulative CPU 2.64 sec
-- ::, Stage- map = %, reduce = %, Cumulative CPU 4.85 sec
MapReduce Total cumulative CPU time: seconds msec
Ended Job = job_1530619476293_0002
MapReduce Jobs Launched:
Stage-Stage-: Map: Reduce: Cumulative CPU: 4.85 sec HDFS Read: HDFS Write: SUCCESS
Total MapReduce CPU Time Spent: seconds msec
OK
wangwu
zhangsan
zhouqi
孙八
李四
赵六
Time taken: 26.768 seconds, Fetched: row(s)
MapReduce信息
http://mini02:8088
6. Hive的访问方式
6.1. Hive交互shell
# 之间已经添加环境变量
[yun@mini01 ~]$ hive
6.2. Hive thrift服务
启动方式,(例如是在mini01上):
# 之间已经添加环境变量
启动为前台: hiveserver2
启动为后台: nohup hiveserver2 >/app/hive/logs/hiveserver.log >/app/hive/logs/hiveserver.err &
# 没有 /app/hive/logs 目录就创建
启动成功后,可以在别的节点上用beeline去连接
方式1
# 由于没有在其他机器安装,所以还是在本机用beeline去连接
[yun@mini01 bin]$ beeline
Beeline version 1.2. by Apache Hive
beeline> !connect jdbc:hive2://mini01:10000 # jdbc连接 可以是mini01、127.0.0.0、10.0.0.11、172.16.1.11
Connecting to jdbc:hive2://mini01:10000
Enter username for jdbc:hive2://mini01:10000: yun
Enter password for jdbc:hive2://mini01:10000:
Connected to: Apache Hive (version 1.2.)
Driver: Hive JDBC (version 1.2.)
Transaction isolation: TRANSACTION_REPEATABLE_READ
: jdbc:hive2://mini01:10000>
方式2
# 或者启动就连接:
[yun@mini01 ~]$ beeline -u jdbc:hive2://mini01:10000 -n yun
接下来就可以做正常sql查询了
例如:
: jdbc:hive2://mini01:10000> show databases;
+----------------+--+
| database_name |
+----------------+--+
| default |
| zhang |
+----------------+--+
rows selected (0.437 seconds)
: jdbc:hive2://mini01:10000> use zhang;
No rows affected (0.058 seconds)
: jdbc:hive2://mini01:10000> show tables;
+-----------+--+
| tab_name |
+-----------+--+
| t_sz01 |
+-----------+--+
row selected (0.054 seconds)
: jdbc:hive2://mini01:10000> select * from t_sz01;
+------------+--------------+--+
| t_sz01.id | t_sz01.name |
+------------+--------------+--+
| | zhangsan |
| | 李四 |
| | wangwu |
| | 赵六 |
| | zhouqi |
| | 孙八 |
+------------+--------------+--+
rows selected (0.641 seconds)
: jdbc:hive2://10.0.0.11:10000> select count(1) from t_sz01; # 条数查询
INFO : Number of reduce tasks determined at compile time:
INFO : In order to change the average load for a reducer (in bytes):
INFO : set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO : set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO : set mapreduce.job.reduces=<number>
INFO : number of splits:
INFO : Submitting tokens for job: job_1530619476293_0003
INFO : The url to track the job: http://mini02:8088/proxy/application_1530619476293_0003/
INFO : Starting Job = job_1530619476293_0003, Tracking URL = http://mini02:8088/proxy/application_1530619476293_0003/
INFO : Kill Command = /app/hadoop/bin/hadoop job -kill job_1530619476293_0003
INFO : Hadoop job information for Stage-: number of mappers: ; number of reducers:
INFO : -- ::, Stage- map = %, reduce = %
INFO : -- ::, Stage- map = %, reduce = %, Cumulative CPU 2.56 sec
INFO : -- ::, Stage- map = %, reduce = %, Cumulative CPU 5.28 sec
INFO : MapReduce Total cumulative CPU time: seconds msec
INFO : Ended Job = job_1530619476293_0003
+------+--+
| _c0 |
+------+--+
| |
+------+--+
row selected (25.433 seconds)
6.3. hive -e "HiveQL"
适用于写脚本
[yun@mini01 ~]$ hive -e "use exercise; select * from student;" Logging initialized using configuration in jar:file:/app/hive-1.2./lib/hive-common-1.2..jar!/hive-log4j.properties
OK
Time taken: 1.109 seconds
OK
李勇 男 CS
刘晨 女 IS
王敏 女 MA
张立 男 IS
刘刚 男 MA
孙庆 男 CS
易思玲 女 MA
李娜 女 CS
梦圆圆 女 MA
孔小涛 男 CS
Time taken: 0.786 seconds, Fetched: row(s)
6.4. hive -f 'test.sql'
适用于hive直接调用一个脚本,该脚本中全是hive的类SQL语句。
# aztest.sql 脚本名称
CREATE DATABASE IF NOT EXISTS azkaban;
use azkaban;
DROP TABLE IF EXISTS aztest;
DROP TABLE IF EXISTS azres;
create table aztest(id int,name string) row format delimited fields terminated by ',';
load data inpath '/aztest/hiveinput/azkb.dat' into table aztest;
create table azres row format delimited fields terminated by '#' as select * from aztest;
insert overwrite directory '/aztest/hiveoutput' select count(1) from aztest;
7. 文章参考
1、Hive创建表格报【Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.