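1. How to install the Thrift compiler
2. A small example showing how to call Hive from Python through Thrift
Prerequisites: 1. The JDK is installed and configured. 2. Hadoop and Hive are installed, running, and usable. 3. gcc and g++ are installed and usable.
Local environment: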
jdk1.7
hadoop-1.1.2
hive-0.11.0
Operating system: Ubuntu 12.04
I. Install Thrift
1. Install the required tools and libraries
sudo apt-get install libboost-dev libboost-test-dev libboost-program-options-dev libevent-dev automake libtool flex bison pkg-config g++ libssl-dev
2. Download, extract, build, and install Thrift
wget https://pypi.python.org/packages/source/t/thrift/thrift-0.9.1.tar.gz
tar -zxvf thrift-0.9.1.tar.gz
cd thrift-0.9.1
./configure
sudo make && sudo make install
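To confirm that the compiler ended up on the PATH, it can be asked for its version with `thrift -version`. The snippet below is a minimal sketch that wraps that call; the helper name `thrift_version` is made up for illustration, and the exit code is deliberately ignored because it may differ between Thrift releases.

```python
from __future__ import print_function

import subprocess


def thrift_version(executable='thrift'):
    """Return the text printed by `<executable> -version`, or None if it cannot be run."""
    try:
        proc = subprocess.Popen([executable, '-version'],
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
    except OSError:
        # The executable was not found or could not be started.
        return None
    output, _ = proc.communicate()
    return output.decode('utf-8', 'replace').strip()


if __name__ == '__main__':
    version = thrift_version()
    if version is None:
        print('thrift compiler not found on PATH')
    else:
        print(version)
```

If the helper returns None, the install step most likely did not complete or the install prefix is not on the PATH.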
II. Copy the files under $HIVE_HOME/lib/py to a directory that Python loads modules from
cp -r $HIVE_HOME/lib/py/* /usr/local/lib/python2.7/dist-packages/
The environment is now set up.
III. Start hiveserver
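A quick way to sanity-check the copy step is to see whether the modules the test program imports can actually be loaded. This is a minimal sketch, not part of Hive or Thrift; the helper name `missing_modules` is hypothetical.

```python
from __future__ import print_function

import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == '__main__':
    # These are the top-level packages the test program imports.
    missing = missing_modules(['thrift', 'hive_service'])
    if missing:
        print('not importable:', ', '.join(missing))
    else:
        print('all modules importable')
```

An empty result suggests the files landed in a directory on the Python path; any name that is reported points to a path that needs another look.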
hive --service hiveserver -p 10028 -v
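Run this command with the HIVE_HOME/bin directory on the system PATH; otherwise, run it from inside the HIVE_HOME/bin directory.
IV. Test program
The test program follows. It ran successfully in the local environment described above. In production, be sure to catch every exception that needs catching, so that all error conditions stay under your control.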
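Before running a client, it can help to check that something is listening on the server port. The sketch below only tests TCP reachability with the standard `socket` module and says nothing about whether the listener is really hiveserver; the helper name `port_is_open` is hypothetical, and 127.0.0.1 with port 10028 are the values used by the test program.

```python
from __future__ import print_function

import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


if __name__ == '__main__':
    print('hiveserver reachable:', port_is_open('127.0.0.1', 10028))
```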
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# Author: JohnWang
# Date: 2014-01-15
# Version: 1.0

import sys

from hive_service import ThriftHive
from hive_service.ttypes import HiveServerException
from thrift import Thrift
from thrift.transport import TSocket, TTransport
from thrift.transport.TTransport import TTransportException
from thrift.protocol import TBinaryProtocol


class HiveClient(object):
    def __init__(self, conInfo):
        # Expose ip, port, user, passwd, and db as instance attributes.
        self.__dict__.update(conInfo)

    def connect(self):
        transport = None
        client = None
        try:
            transport = TSocket.TSocket(self.ip, self.port)
            transport = TTransport.TBufferedTransport(transport)
            protocol = TBinaryProtocol.TBinaryProtocol(transport)
            client = ThriftHive.Client(protocol)
            transport.open()
        except TTransportException as error:
            print error
            return False
        self.client = client
        self.transport = transport
        if self.db:
            # Switch to the configured database once connected.
            return self.execute('use %s' % self.db)
        return True

    def execute(self, sql):
        try:
            self.client.execute(sql)
        except HiveServerException as error:
            print error
            return False
        return True

    def getOne(self):
        pass

    def getAll(self):
        pass

    def close(self):
        if self.transport:
            self.transport.close()
            return True
        else:
            return False


if __name__ == '__main__':
    conInfo = {'ip': '127.0.0.1', 'port': 10028, 'user': '', 'passwd': '', 'db': 'test_hive'}
    client = HiveClient(conInfo)
    if client.connect():
        print 'connect hive success'
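The getOne and getAll methods in the test program are left as stubs. The sketch below shows one way results might be read back. It assumes the generated ThriftHive.Client offers fetchAll() and that each row comes back as a single tab-separated string, which should be verified against the installed hive_service package. The helper `query_rows`, the `FakeClient` stand-in, and the `users` table are hypothetical and exist only so the sketch runs without a server.

```python
from __future__ import print_function


def query_rows(client, sql):
    """Run sql on client and return the result as a list of column lists.

    Assumes client provides execute(sql) and fetchAll(), and that
    fetchAll() returns one tab-separated string per row.
    """
    client.execute(sql)
    return [row.split('\t') for row in client.fetchAll()]


class FakeClient(object):
    """Hypothetical stand-in for ThriftHive.Client, for illustration only."""

    def __init__(self, rows):
        self._rows = rows
        self.last_sql = None

    def execute(self, sql):
        # Remember the statement instead of sending it anywhere.
        self.last_sql = sql

    def fetchAll(self):
        return list(self._rows)


if __name__ == '__main__':
    fake = FakeClient(['1\talice', '2\tbob'])
    print(query_rows(fake, 'select id, name from users'))
```

With a live connection, the object stored in the `client` attribute of a connected HiveClient would take the place of FakeClient, and the call would need the same HiveServerException handling that execute already has.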