HDP 11. Importing MySQL data into Hive with sqoop

12. Importing MySQL data into Hive with sqoop

1. Import the ambari.users table into the dim_v8sp database

On the client-v01 server, create a HOME directory for the etl user, generate the Kerberos keytab file, and add a crontab entry to refresh the ticket cache periodically.

cd /home/;

cp -r ./bigdata ./etl;

chown -R etl:etl ./etl;

rm -rf /home/etl/security/keytabs/bigdata.keytab;

su - etl

kinit admin

ipa-getkeytab -s ipa-v01.tianlingqun.com -p ${USER}@tianlingqun.COM -k /home/${USER}/security/keytabs/${USER}.keytab

kinit -k -t /home/etl/security/keytabs/etl.keytab etl@tianlingqun.COM

klist

crontab -e

# kerberos kinit, add by luoyoumou, 20191129

0 0 * * * kinit -k -t /home/etl/security/keytabs/etl.keytab etl@tianlingqun.COM

cat ~/.bash_profile

export PATH

export USER=etl

# beeline

zk_nodes="master-v01.tianlingqun.com:2181,master-v02.tianlingqun.com:2181,datanode-v01.tianlingqun.com:2181"

hive_password=etl123

hive2_db_url="jdbc:hive2://${zk_nodes}/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2?tez.queue.name=${USER}"

alias hive2='/usr/hdp/current/hive-client/bin/beeline -u "${hive2_db_url}" -n ${USER} -p ${hive_password} --showDbInPrompt=true --color=true --incremental=false --hiveconf tez.queue.name=${USER}'

# phoenix

alias phoenix='/usr/hdp/current/phoenix-client/bin/sqlline.py master-v01.tianlingqun.com,master-v02.tianlingqun.com,datanode-v01.tianlingqun.com:2181:/hbase-secure'

On the ambari server, configure a database account and grant it read-only access:

PASSWD="Bee#"$(head -c 100 /dev/urandom | md5sum | head -c 15)

echo $PASSWD

mysql -uroot -p123456 -e "grant select on ambari.* to 'bigdata_read'@'client-v01.tianlingqun.com' identified by '${PASSWD}';"

mysql -uroot -p123456 -e "grant select on ambari.* to 'bigdata_read'@'client-v02.tianlingqun.com' identified by '${PASSWD}';"

mysql -uroot -p123456 -e "grant select on ambari.* to 'bigdata_read'@'192.168.0.%' identified by '${PASSWD}';"
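As a quick sanity check, the new account can be verified from client-v01. A minimal sketch, assuming the host and database names from the grants above; the command is composed and printed rather than executed, since running it needs a live MySQL server:

```shell
#!/bin/sh
# Sketch: verify the read-only account from client-v01.
# Host/user/database follow the grants above; PASSWD is the value
# generated on the ambari server.
DB_HOST=ambari.tianlingqun.com
DB_USER=bigdata_read
DB_NAME=ambari
# Compose the check command; printing it keeps the sketch runnable
# without a live MySQL server.
CHECK_CMD="mysql -h ${DB_HOST} -u${DB_USER} -p\"\${PASSWD}\" -e 'select count(*) from ${DB_NAME}.users;'"
echo "${CHECK_CMD}"
```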


On the client-v01 server, run the following commands to create the database connection configuration file:

mkdir -p /home/configure

cd /home/configure/

vi db_ambari_metadata.conf

db_host=ambari.tianlingqun.com

db_port=3306

db_user=bigdata_read

db_user_pwd=Bee#d80aab52f2e859e

db_name=ambari
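Job scripts can load this file simply by sourcing it, since it is plain key=value lines. A minimal sketch (it writes a sample copy to a temp file so it runs standalone; the jdbc_url variable name is an assumption, not taken from the real scripts):

```shell
#!/bin/sh
# Sketch: load db_ambari_metadata.conf and build a JDBC URL from it.
# A sample copy is written to a temp file so the sketch is self-contained;
# in the real scripts CONF would be /home/configure/db_ambari_metadata.conf.
CONF=$(mktemp)
cat > "${CONF}" <<'EOF'
db_host=ambari.tianlingqun.com
db_port=3306
db_user=bigdata_read
db_user_pwd=Bee#d80aab52f2e859e
db_name=ambari
EOF
. "${CONF}"            # plain key=value lines, so sourcing is enough
jdbc_url="jdbc:mysql://${db_host}:${db_port}/${db_name}"
echo "${jdbc_url}"
rm -f "${CONF}"
```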


su - etl

mkdir -p ~/code/hive/dim_v8sp/;

mkdir -p ~/code/hive/dim_v8sp_tmp/;

mkdir ~/data;

mkdir ~/logs;

cd ~/code/;

ln -s /home/configure ./configure

Notes:

~/code is used to store project code;

~/data is used to store temporary data needed while jobs run;

~/logs is used to store logs produced while jobs run.

In Ranger, grant the etl user full permissions on the HDFS directory /warehouse/tablespace/external/hive/dim_v8sp_tmp.db.


Create the related configuration files:

Database configuration file /home/configure/db_ambari_metadata.conf (see db_ambari_metadata.conf)

Hive configuration file /home/configure/hive.conf (see hive.conf)

Azkaban parameter script /home/configure/parameter_dayhour.sh (see parameter_dayhour.sh)
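The real parameter_dayhour.sh is only referenced above, not shown. As an illustration, a script of that name typically computes the day/hour values the Azkaban flow passes to downstream scripts. A hypothetical sketch (variable names are assumptions; the -d option requires GNU date):

```shell
#!/bin/sh
# Hypothetical sketch of a day/hour parameter script for Azkaban jobs.
# data_day / data_hour are assumed names, not taken from the real file.
data_day=$(date -d "-1 day" +%Y%m%d)   # previous day, e.g. 20191128
data_hour=$(date +%H)                  # current hour, zero-padded
echo "data_day=${data_day} data_hour=${data_hour}"
```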

Create the related Hive tables:

hive2

create external table dim_v8sp_tmp.users(

user_id int,

principal_id bigint,

user_name varchar(255),

active int,

consecutive_failures int,

active_widget_layouts varchar(1024),

display_name varchar(255),

local_username varchar(255),

create_time bigint,

version bigint

) stored as orc;

create table dim_v8sp.users(

user_id int,

principal_id bigint,

user_name varchar(255),

active int,

consecutive_failures int,

active_widget_layouts varchar(1024),

display_name varchar(255),

local_username varchar(255),

create_time bigint,

version bigint

) stored as orc;

Write the sqoop-related scripts.

Replica-lag check script /home/etl/code/hive/dim_v8sp_tmp/check_ambari_db.sh (see check_ambari_db.sh)
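check_ambari_db.sh itself is not shown; a replica-lag check of this kind usually reads Seconds_Behind_Master from SHOW SLAVE STATUS and aborts the import when the replica is too far behind. A hypothetical sketch (the threshold and the canned status line are assumptions; the real script would query MySQL directly):

```shell
#!/bin/sh
# Hypothetical replica-lag check. In the real script slave_status would
# come from: mysql -h"${db_host}" -u"${db_user}" -p"${db_user_pwd}" \
#              -e 'SHOW SLAVE STATUS\G'
# A canned line is used here so the sketch runs standalone.
MAX_LAG=300
slave_status="        Seconds_Behind_Master: 12"
lag=$(echo "${slave_status}" | awk -F': ' '/Seconds_Behind_Master/ {print $2}')
if [ "${lag}" -gt "${MAX_LAG}" ]; then
  echo "replica lag ${lag}s exceeds ${MAX_LAG}s, aborting import"
  exit 1
fi
echo "replica lag ${lag}s, OK to import"
```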

Full-import script /home/etl/code/hive/dim_v8sp_tmp/users.imp.full.sh (see users.imp.full.sh)
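users.imp.full.sh is only referenced; since the staging table is stored as ORC, a full import of this kind would normally use sqoop's HCatalog integration. A hypothetical sketch that composes and prints the command rather than running it (connection values mirror db_ambari_metadata.conf; the exact options of the real script are unknown):

```shell
#!/bin/sh
# Hypothetical full-import command for ambari.users into dim_v8sp_tmp.users.
# Values mirror db_ambari_metadata.conf; in the real script they would be
# sourced from /home/configure/db_ambari_metadata.conf.
db_host=ambari.tianlingqun.com
db_port=3306
db_user=bigdata_read
db_name=ambari

# Composed only; running it needs a live cluster. The password-file path
# is an assumption (safer than --password on the command line).
sqoop_cmd="sqoop import \
 --connect jdbc:mysql://${db_host}:${db_port}/${db_name} \
 --username ${db_user} --password-file file:///home/etl/.sqoop.pwd \
 --table users \
 --hcatalog-database dim_v8sp_tmp --hcatalog-table users \
 --num-mappers 1"
echo "${sqoop_cmd}"
```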

Full-insert script /home/etl/code/hive/dim_v8sp_tmp/users.insert.full.sh (see users.insert.full.sh)
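users.insert.full.sh is likewise only referenced; the step it represents is moving the staged rows into the target table. A hypothetical sketch that prints the HiveQL instead of executing it (the real script would presumably run it through beeline with the hive2_db_url from .bash_profile):

```shell
#!/bin/sh
# Hypothetical full-insert step: dim_v8sp_tmp.users -> dim_v8sp.users.
# In the real script the SQL would be executed via something like:
#   beeline -u "${hive2_db_url}" -n etl -p "${hive_password}" -e "${sql}"
sql="insert overwrite table dim_v8sp.users select * from dim_v8sp_tmp.users;"
echo "${sql}"
```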

Deploy the "import MySQL data into Hive with sqoop" job to Azkaban.

Log in to the Azkaban Web UI and create an Azkaban project.
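A minimal project for this flow could consist of three type=command job files zipped and uploaded to the project. The file and job names below are hypothetical; Azkaban's standard .job property format is assumed:

```
# check_ambari_db.job
type=command
command=sh /home/etl/code/hive/dim_v8sp_tmp/check_ambari_db.sh

# users_imp_full.job
type=command
dependencies=check_ambari_db
command=sh /home/etl/code/hive/dim_v8sp_tmp/users.imp.full.sh

# users_insert_full.job
type=command
dependencies=users_imp_full
command=sh /home/etl/code/hive/dim_v8sp_tmp/users.insert.full.sh
```

Chaining the jobs with dependencies ensures the import never runs against a lagging replica and the insert never runs before the staging table is refreshed.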
