12. Importing MySQL Data into Hive with Sqoop
1. Import the ambari.users table into dim_v8sp
On the client-v01 server, create a HOME directory for the etl user, generate the Kerberos keytab file, and add a crontab entry to periodically refresh the ticket in the credential cache.
cd /home/;
cp -r ./bigdata ./etl;
chown -R etl:etl ./etl;
rm -rf /home/etl/security/keytabs/bigdata.keytab;
su - etl
kinit admin
ipa-getkeytab -s ipa-v01.tianlingqun.com -p ${USER}@tianlingqun.COM -k /home/${USER}/security/keytabs/${USER}.keytab
kinit -k -t /home/etl/security/keytabs/etl.keytab etl@tianlingqun.COM
klist
crontab -e
# kerberos kinit, added by luoyoumou, 2019-11-29
0 0 * * * kinit -k -t /home/etl/security/keytabs/etl.keytab etl@tianlingqun.COM
cat ~/.bash_profile
export PATH
export USER=etl
# beeline
zk_nodes="master-v01.tianlingqun.com:2181,master-v02.tianlingqun.com:2181,datanode-v01.tianlingqun.com:2181"
hive_password=etl123
hive2_db_url="jdbc:hive2://${zk_nodes}/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2?tez.queue.name=${USER}"
alias hive2='/usr/hdp/current/hive-client/bin/beeline -u "${hive2_db_url}" -n ${USER} -p ${hive_password} --showDbInPrompt=true --color=true --incremental=false --hiveconf tez.queue.name=${USER}'
# phoenix
alias phoenix='/usr/hdp/current/phoenix-client/bin/sqlline.py master-v01.tianlingqun.com,master-v02.tianlingqun.com,datanode-v01.tianlingqun.com:2181:/hbase-secure'
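With the aliases in place, a quick smoke test from the etl account confirms that the Kerberos ticket and the ZooKeeper-based HiveServer2 discovery both work (the show databases query is only an illustration):
su - etl
hive2
# inside Beeline, run: show databases;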
On the ambari server, configure the database account and grant a read-only account:
PASSWD="Bee#"$(cat /dev/urandom | head -n 10 | md5sum | head -c 15)
echo $PASSWD
mysql -uroot -p123456 -e "grant select on ambari.* to 'bigdata_read'@'client-v01.tianlingqun.com' identified by '${PASSWD}' WITH GRANT OPTION;"
mysql -uroot -p123456 -e "grant select on ambari.* to 'bigdata_read'@'client-v02.tianlingqun.com' identified by '${PASSWD}' WITH GRANT OPTION;"
mysql -uroot -p123456 -e "grant select on ambari.* to 'bigdata_read'@'192.168.0.%' identified by '${PASSWD}' WITH GRANT OPTION;"
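To confirm the grants work before wiring up Sqoop, a minimal check from client-v01 (reusing the generated ${PASSWD}; the count query is only an illustration):
mysql -h ambari.tianlingqun.com -P 3306 -u bigdata_read -p"${PASSWD}" -e "select count(*) from ambari.users;"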
Run the following commands on the client-v01 server to create the database-connection configuration file:
mkdir -p /home/configure
cd /home/configure/
vi db_ambari_metadata.conf
db_host=ambari.tianlingqun.com
db_port=3306
db_user=bigdata_read
db_user_pwd=Bee#d80aab52f2e859e
db_name=ambari
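Because the .conf file uses plain key=value shell syntax, downstream scripts can simply source it; a minimal sketch (the select 1 probe is only an illustration):
#!/bin/bash
source /home/configure/db_ambari_metadata.conf
# probe the connection with the values loaded from the config file
mysql -h "${db_host}" -P "${db_port}" -u "${db_user}" -p"${db_user_pwd}" "${db_name}" -e "select 1;"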
su - etl
mkdir -p ~/code/hive/dim_v8sp/;
mkdir -p ~/code/hive/dim_v8sp_tmp/;
mkdir ~/data;
mkdir ~/logs;
cd ~/code/;
ln -s /home/configure ./configure
Notes:
~/code stores the code for the various projects;
~/data stores the "temporary data" that projects need at runtime;
~/logs stores the logs that projects write at runtime.
In Ranger, grant the etl user all permissions on the HDFS directory /warehouse/tablespace/external/hive/dim_v8sp_tmp.db.
Create the related configuration files:
Database connection file: /home/configure/db_ambari_metadata.conf (see db_ambari_metadata.conf)
Hive configuration file: /home/configure/hive.conf (see hive.conf)
Azkaban parameter-passing script: /home/configure/parameter_dayhour.sh (see parameter_dayhour.sh; a sketch follows below)
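parameter_dayhour.sh is referenced but not reproduced here. Judging by its name, it derives the day/hour parameters that Azkaban passes to the ETL jobs; the following is a minimal sketch under that assumption (the variable names p_day and p_hour are hypothetical):
#!/bin/bash
# default to yesterday's date and the current hour when no arguments are passed
p_day=${1:-$(date -d "1 day ago" +%Y%m%d)}
p_hour=${2:-$(date +%H)}
echo "p_day=${p_day} p_hour=${p_hour}"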
Create the related Hive tables (using the hive2 alias defined above):
hive2
create external table dim_v8sp_tmp.users(
user_id int,
principal_id bigint,
user_name varchar(255),
active int,
consecutive_failures int,
active_widget_layouts varchar(1024),
display_name varchar(255),
local_username varchar(255),
create_time bigint,
version bigint
) stored as orc;
create table dim_v8sp.users(
user_id int,
principal_id bigint,
user_name varchar(255),
active int,
consecutive_failures int,
active_widget_layouts varchar(1024),
display_name varchar(255),
local_username varchar(255),
create_time bigint,
version bigint
) stored as orc;
Write the Sqoop-related scripts.
Replica-lag check script: /home/etl/code/hive/dim_v8sp_tmp/check_ambari_db.sh (see check_ambari_db.sh; a hedged sketch follows)
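check_ambari_db.sh is not reproduced here; its name and role suggest it verifies that the MySQL replica is not lagging before the import starts. A minimal sketch under that assumption (the 300-second threshold is hypothetical, and show slave status requires the account to hold the REPLICATION CLIENT privilege):
#!/bin/bash
source /home/configure/db_ambari_metadata.conf
# read the replication lag; abort if it is unknown or above the threshold
lag=$(mysql -h "${db_host}" -P "${db_port}" -u "${db_user}" -p"${db_user_pwd}" \
      -e "show slave status\G" | awk '/Seconds_Behind_Master/ {print $2}')
if [ -z "${lag}" ] || [ "${lag}" = "NULL" ] || [ "${lag}" -gt 300 ]; then
    echo "replica lag check failed: lag=${lag}"
    exit 1
fi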
Full-import script: /home/etl/code/hive/dim_v8sp_tmp/users.imp.full.sh (see users.imp.full.sh; a hedged sketch follows)
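users.imp.full.sh is likewise only referenced. Because the staging table is stored as ORC, a common pattern is a Sqoop HCatalog import; the following is a minimal sketch under that assumption (the single mapper and the explicit kinit are illustrative):
#!/bin/bash
source /home/configure/db_ambari_metadata.conf
kinit -k -t /home/etl/security/keytabs/etl.keytab etl@tianlingqun.COM
# note: empty dim_v8sp_tmp.users first, since HCatalog imports append rows
sqoop import \
  --connect "jdbc:mysql://${db_host}:${db_port}/${db_name}" \
  --username "${db_user}" --password "${db_user_pwd}" \
  --table users \
  --hcatalog-database dim_v8sp_tmp \
  --hcatalog-table users \
  --num-mappers 1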
Full-insert script: /home/etl/code/hive/dim_v8sp_tmp/users.insert.full.sh (see users.insert.full.sh; a hedged sketch follows)
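users.insert.full.sh then moves the staged rows into the target table. A minimal sketch assuming a full overwrite (the beeline connection reuses hive2_db_url and hive_password from the .bash_profile above):
#!/bin/bash
source /home/etl/.bash_profile
# replace the dimension table's contents with the freshly staged data
/usr/hdp/current/hive-client/bin/beeline -u "${hive2_db_url}" -n etl -p "${hive_password}" \
  -e "insert overwrite table dim_v8sp.users select * from dim_v8sp_tmp.users;"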
Deploy the "Sqoop imports MySQL data into Hive" job to Azkaban.
Log in to the Azkaban web UI and create an Azkaban project.
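The job definitions themselves are not shown in this document. With classic Azkaban flows, each step is a .job properties file of type=command; the following is a minimal sketch (job names and the dependency chain are assumptions), packaged as a zip for upload through the web UI:
cat > check_ambari_db.job <<'EOF'
type=command
command=bash /home/etl/code/hive/dim_v8sp_tmp/check_ambari_db.sh
EOF
cat > users_imp_full.job <<'EOF'
type=command
dependencies=check_ambari_db
command=bash /home/etl/code/hive/dim_v8sp_tmp/users.imp.full.sh
EOF
cat > users_insert_full.job <<'EOF'
type=command
dependencies=users_imp_full
command=bash /home/etl/code/hive/dim_v8sp_tmp/users.insert.full.sh
EOF
zip users_full.zip *.job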