Setting up an environment to connect Python 2.7 to Hive (using impyla)

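Existing environment:

Hive is deployed on a node behind a Linux jump host. To connect to Hive, your VPN address must first be added to the whitelist (ask the ops team for help).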
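Before installing anything, it can help to confirm that the whitelist is actually in effect. The sketch below only checks that a TCP connection to the Hive host can be opened. The function name can_reach is made up for this illustration, and a successful connection does not prove that authentication will work.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Refused, filtered, or timed out: treat all of these as unreachable.
        return False
    sock.close()
    return True
```

For the setup in this post the call would be can_reach('192.168.97.47', 10000). A False result usually means the whitelist or VPN still needs attention.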

 

Connecting Python 2.7 to Hive

First install the required packages:

1. pip install six

2. pip install bitarray

3. pip install thriftpy         Note: impyla requires thrift on Python 2.x or thriftpy on Python 3.x; install the one that matches your interpreter

4. pip install thrift_sasl

5. pip install sasl
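After running the installs, a quick import check shows which packages are still missing. This is a minimal sketch: missing_modules is a helper invented here, and the names in REQUIRED are the import names assumed for the packages in the list (add thrift or thriftpy depending on your Python version).

```python
import importlib


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Assumed import names for the packages installed above.
REQUIRED = ["six", "bitarray", "thrift_sasl", "sasl"]
```

Calling missing_modules(REQUIRED) returns an empty list once everything is importable.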

If you get the error: error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": http://landinghub.visualstudio.com/visual-cpp-build-tools

This error means the Visual C++ compiler is missing, so Visual Studio needs to be installed.

Download link: https://pan.baidu.com/s/1URHzFhpsYA06Ck7FJzu4yw

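After installing it, another error may appear: error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\cl.exe' failed with exit status 2

This happens because the sasl version being installed does not suit Python 2.7.
Download sasl again from https://pan.baidu.com/s/14-ubkI6YdHBrjKQC2DCY3Q
Extract the downloaded file into the site-packages directory. It is an already-built package, so placing it in that directory is all that is needed.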
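A prebuilt package only works if it matches the interpreter's version and bitness, and it has to land in the right site-packages directory. The sketch below prints both using only the standard library. The function name interpreter_summary is made up for this illustration.

```python
import struct
import sys
import sysconfig


def interpreter_summary():
    """Return (description, site_packages_path) for the running interpreter."""
    bits = struct.calcsize("P") * 8  # pointer size in bits: 32 or 64
    description = "Python %d.%d, %d-bit" % (
        sys.version_info[0], sys.version_info[1], bits)
    site_packages = sysconfig.get_paths()["purelib"]
    return description, site_packages
```

Run it with the same python executable that pip uses, otherwise the reported directory may belong to a different installation.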

If you get the error: 'TSocket' object has no attribute 'isOpen'
the installed thrift-sasl version (0.3.0) is too new, so downgrade thrift-sasl to 0.2.1:
pip install thrift-sasl==0.2.1
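To see whether the downgrade is needed, the installed version can be compared against 0.2.1. This is a rough sketch that assumes plain numeric version strings such as 0.3.0. The helper names are invented here, and installed_version relies on pkg_resources from setuptools being available.

```python
def parse_version(text):
    """Turn a plain numeric version such as '0.2.1' into (0, 2, 1)."""
    return tuple(int(part) for part in text.split(".")[:3])


def needs_downgrade(installed, known_good="0.2.1"):
    """Return True if `installed` is newer than the version that worked here."""
    return parse_version(installed) > parse_version(known_good)


def installed_version(dist_name="thrift-sasl"):
    """Look up the installed version of a distribution via setuptools."""
    import pkg_resources  # imported lazily; requires setuptools
    return pkg_resources.get_distribution(dist_name).version
```

For example, needs_downgrade(installed_version()) would report True for the 0.3.0 release that triggered the error above.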

Now install impyla:
pip install impyla
The installation should succeed at this point. Next, test the connection to Hive.

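If you get the error TypeError: can't concat str to bytes

Look at the last frame of the traceback, which points to line 94 of __init__.py in the thrift_sasl package: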
header = struct.pack(">BI", status, len(body))
self._trans.write(header + body)

Change it to:
header = struct.pack(">BI", status, len(body))
# Encode text to bytes so it can be concatenated with the packed header
if isinstance(body, str):
    body = body.encode()
self._trans.write(header + body)
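The reason for the error is that struct.pack returns bytes, and bytes cannot be concatenated with a text string. The standalone sketch below reproduces the framing outside the library. frame_message is a made-up name, not part of thrift_sasl. Unlike the patch above, it encodes before measuring the length so that the header counts bytes, which gives the same result for ASCII-only bodies.

```python
import struct


def frame_message(status, body):
    """Build a 1-byte status + 4-byte big-endian length header, then the body."""
    if not isinstance(body, bytes):
        body = body.encode()  # text -> bytes, so concatenation below is valid
    header = struct.pack(">BI", status, len(body))
    return header + body
```

Passing either "ab" or b"ab" yields the same 7-byte frame, which is the behavior the patch is after.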
 

 

# coding=utf8
from impala.dbapi import connect

ip = '192.168.97.47'
db = 'aaa'
sql = 'select * from a limit 5'

# Port 10000 is the HiveServer2 port used here; PLAIN selects SASL PLAIN authentication
conn = connect(host=ip, port=10000, database=db, auth_mechanism='PLAIN')
cursor = conn.cursor()
try:
    cursor.execute(sql)
    result = cursor.fetchone()
    print(result)
except Exception as e:
    # Print the real error instead of hiding it
    print('Hive query failed: %s' % e)
finally:
    cursor.close()
    conn.close()

If a result is returned, the connection works.
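Once the connection works, rows come back as plain tuples. A small helper can pair them with column names through the DB-API description attribute, which impyla cursors are expected to provide as a DB-API 2.0 implementation. rows_as_dicts is a name invented for this sketch.

```python
def rows_as_dicts(cursor):
    """Fetch all remaining rows and return them as dicts keyed by column name."""
    names = [column[0] for column in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]
```

Call it right after cursor.execute(sql) in place of fetchone(). Because it relies only on DB-API attributes, it can be tried against any DB-API cursor first.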

For reference, see this blog post: https://blog.csdn.net/Xiblade/article/details/82318294?utm_source=copy
