开发者学堂课程【大数据Hive教程精讲:Apache Hive--DDL--创建表--数据类型&分隔符】学习笔记,与课程紧密联系,让用户快速学习知识。
课程地址:https://developer.aliyun.com/learning/course/90/detail/1367
Apache Hive--DDL--创建表--数据类型&分隔符
内容介绍:
一、Apache Hive--DDL--创建表--数据类型&分隔符
一、Apache Hive--DDL--创建表--数据类型&分隔符
1、创建表
建表语法
CREATE[EXTERNAL]TABLE [IF NOT EXISTS] table_name
[(col_name data_type [COMMENT col_comment],..)]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], .)]
[CLUSTERED BY (col_name, col_name, ...)
[SORTED BY (col_name [ASC|DESC], .
…)] INTO num_buckets BUCKETS]
[ROW FORMAT row_format]
[STORED AS file_format]
[LOCATION hdfs_path]
2、说明:
CREATE TABLE 创建一个指定名字的表。如果相同名字的表已经存在,则抛出异常;
用户可以用 IF NOT EXISTS 选项来忽略这个异常。
Misc Types
l BOOLEAN
l BINARY(Note: Only available starting with Hive 0.8.0)
complex Types:
l arrays: ARRAY(Note: negative values and non-constant expressions are allowed as of Hive 0.14.).
l maps: MAP,data_type〉(Note: negative values and non-constant expressions are allowed asof Hive 0.14.)
l structs: STRUCT
l union:UNIONTYPE,data_type,...>(Note: Only available starting with Hive 0.7.0.)
Column Types
lntegral Types (TINYINT,SMALLINT,INT/INTEGER, BIGINT)
Integral literals are assumed to be INT by default, unless the number exceeds the range of INT in which case it isinterpreted as a BIGINT, or if one of the following postfixes is present on the number.
3、Strings
String literals can be expressed with either single quotes () or double quotes (").Hive uses C-style escaping within thestrings.
Node-1:
[root@node-1 hivedata]# hadoop fs -cp /hivedata/1.txt /user/hive/warehouse/itcast.db/t_t1
[root@node-1 hivedata]# hadoop fs -cp /hivedata/1.txt /user/hive/warehouse/itcast.db/t_t2
[root@node-1 hivedataj# vi a.txt
[FIELDS TERMINATED BY char]
[COLLECTION ITEMS TERMINATED BY char][MAP KEYS TERMINATED BY char]
[LINES TERMINATED BY char] | SERDE serde_name[WITH SERDEPROPERTIES
(property_name=property_value, property_name=property_value,.
…)]
ROW FORMAT DELIMITED
hive建表的时候默认的分割符是’\001',若在建表的时候没有指明分隔符,load 文件的时候文件的分隔符需要是'\001';若文件分隔符不是'001',程序不会报错,但表查询的结果会全部为'null';
Hive sql 语句:
4、复杂类型的数据表指定分隔符:
create table complex_array(name string ,work_locations array) ROW FORMAT DELIMITED FIELDS
zhangsanbeijing, shanghai,tianjin,hangzhouwangwu shanghai,chengdu, wuhan, haerbin
create table t_map(id int,name string, hobby map)row format delimited
fields terminated by ','
collection items terminated by '-'map keys terminated by ':' ;
1,zhangsan,唱歌:非常喜欢-跳舞:喜欢-游泳:一般般
2,lisi,打游戏:非常喜欢-篮球:不喜欢
5、内部表、外部表
建内部表
create table student (Sno int,Sname string,sex string,sage int , sdept string) row format delimited 建外部表
课堂笔记:
1、hive 建立一张表跟已经存在的结构化的数据文件产生映射关系
映射成功之后,就可以通过写 sql 来分析这结构化的数据避免了写mr程序的麻烦
2、数据库---》/user/hive/warehouse 下的一个文件夹对应
表---》数据库文件夹下面的子文件夹/user/hive/warehouse/itcast.db/t_t1表的数据位置目前不能随便存放一定要在指定的数据库表的文件夹下面
建立表的时候可能还需要指定分隔符否则有可能映射不成功
3、建表的时候一定要根据结构化数据文件的分隔符类型指定分隔符
建表的字段个数和字段类型要跟结构化数据中的个数类型一致
分隔符一般使用内置的来指定 ROW FORMAT DELIMITED 分割字段还是分割集合