Hadoop Study Notes: Learning ZooKeeper (Part 1)

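  In earlier posts I mentioned several times how important ZooKeeper is for building distributed systems, so it is well worth studying. This post covers installing ZooKeeper and some of its basic usage, and also shows how to set up a pseudo-distributed cluster. I could not get the pseudo-distributed setup working on Windows, so I did it on Linux, running in a virtual machine on my computer. The details follow.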

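  First, download ZooKeeper from:

  http://www.apache.org/dyn/closer.cgi/zookeeper/

  It is usually best to pick a stable release. I downloaded zookeeper-3.4.5.

  My laptop runs Windows 7. Windows can serve as a development platform for ZooKeeper, but not as a production platform. We start by installing a standalone ZooKeeper on Windows.

  Unpack the ZooKeeper archive. I placed the unpacked files at:

  E:\zookeeper\zookeeper-3.4.5

  The figure below shows the ZooKeeper directory layout: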

[Figure: ZooKeeper directory structure]

  Go into the conf directory, make a copy of zoo_sample.cfg, and rename the copy to zoo.cfg. Open zoo.cfg and edit it so that it reads as follows:

#initLimit=10
#syncLimit=5
tickTime=2000
dataDir=E:/zookeeper/zookeeper-3.4.5/data
clientPort=2181

  Here is what each parameter means:

  initLimit and syncLimit only apply to a cluster; I explain them in the pseudo-distributed section later in this post.

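  tickTime: the length of one tick in milliseconds, the basic time unit ZooKeeper uses for heartbeats. As in web development, ZooKeeper has a notion of a session between client and server, and the minimum session timeout is twice tickTime.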
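  To make the relationship between tickTime and the session timeout concrete, here is a minimal sketch. The class and method names are hypothetical, not part of ZooKeeper. The lower bound of 2 ticks comes from the paragraph above; the upper bound of 20 ticks is the default described in the ZooKeeper administrator guide and is an addition to what this post states.

```java
/** Sketch: how a requested session timeout is bounded by tickTime (hypothetical helper). */
final class SessionTimeoutBounds {

    /** Smallest session timeout granted by default: 2 ticks. */
    static int minSessionTimeout(int tickTimeMs) {
        return 2 * tickTimeMs;
    }

    /** Largest session timeout granted by default: 20 ticks (assumption, see lead-in). */
    static int maxSessionTimeout(int tickTimeMs) {
        return 20 * tickTimeMs;
    }

    /** Clamps the timeout a client asks for into the range [min, max]. */
    static int negotiate(int tickTimeMs, int requestedMs) {
        int min = minSessionTimeout(tickTimeMs);
        int max = maxSessionTimeout(tickTimeMs);
        return Math.max(min, Math.min(max, requestedMs));
    }

    public static void main(String[] args) {
        int tickTime = 2000; // tickTime from zoo.cfg
        System.out.println(negotiate(tickTime, 1000));  // below the minimum -> 4000
        System.out.println(negotiate(tickTime, 30000)); // within bounds     -> 30000
        System.out.println(negotiate(tickTime, 60000)); // above the maximum -> 40000
    }
}
```

  With tickTime=2000, a client asking for a 30000 ms session keeps that value, since it lies between 4000 ms and 40000 ms.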

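  dataDir: the comment in the sample file describes it as the directory where snapshots of the in-memory database are stored. We can look at what ends up in the dataDir directory after the server has run, as the figure below shows: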

[Figure: contents of the dataDir directory after running ZooKeeper]

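  As it turns out, dataDir also holds the transaction logs (unless dataLogDir is set separately). The sample file warns against using /tmp for dataDir; /tmp appears there only as an example, and its contents are not meant to be persistent.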

  clientPort: the port on which the server listens for client connections.
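  The key=value layout of zoo.cfg is Java properties syntax, so the file can be inspected with java.util.Properties. The sketch below is only an illustration under that assumption; ZooCfgReader is a hypothetical name, and the configuration text is embedded as a string so the example runs without touching the file system.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Sketch: read the standalone zoo.cfg shown above as Java properties (hypothetical helper). */
final class ZooCfgReader {

    /** Parses configuration text; lines starting with '#' are treated as comments. */
    static Properties parse(String cfgText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(cfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String cfg = String.join("\n",
                "#initLimit=10",
                "#syncLimit=5",
                "tickTime=2000",
                "dataDir=E:/zookeeper/zookeeper-3.4.5/data",
                "clientPort=2181");
        Properties props = parse(cfg);
        System.out.println("tickTime   = " + Integer.parseInt(props.getProperty("tickTime")));
        System.out.println("dataDir    = " + props.getProperty("dataDir"));
        System.out.println("clientPort = " + Integer.parseInt(props.getProperty("clientPort")));
        // Commented-out keys are simply absent.
        System.out.println("initLimit  = " + props.getProperty("initLimit"));
    }
}
```

  One practical consequence of properties syntax is that a backslash starts an escape sequence, which is a good reason to write Windows paths in dataDir with forward slashes, as the file above does.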

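  Next, add ZooKeeper to the Windows environment variables. Right-click "My Computer", choose Properties, click Advanced system settings, then click the Environment Variables button. Under System variables, click New and add:

Variable name: ZOOKEEPER_HOME
Variable value: E:\zookeeper\zookeeper-3.4.5

  Still under System variables, find Path, click Edit, and append the following to its value (with no spaces inside the percent signs): %ZOOKEEPER_HOME%\bin;%ZOOKEEPER_HOME%\conf;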

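  ZooKeeper is written in Java, so a JDK must be installed before ZooKeeper, and its version must be 1.6 or later.

  That completes the standalone installation. Now let's run ZooKeeper.

  Open a Windows command prompt, change to the bin directory under the ZooKeeper installation directory, type zkServer, and press Enter. This runs the zkServer.cmd script and starts the ZooKeeper server.

  Next, connect to the server with the client. Open a second command prompt and enter: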

zkCli -server localhost:2181

  The figure below shows a short test session:

[Figure: test session in the zkCli client]
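  Besides zkCli, a quick way to confirm that the server is listening on clientPort is ZooKeeper's four-letter command ruok, to which a healthy server answers imok. The following is a minimal sketch using plain sockets; RuokCheck and its methods are hypothetical names, and the 3-second timeouts are arbitrary. It assumes a release such as 3.4.5, where these commands are available by default (newer releases may require them to be whitelisted first).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Sketch: send a four-letter command to a ZooKeeper server (hypothetical helper). */
final class RuokCheck {

    /** Sends the command and returns everything the server writes before closing. */
    static String fourLetterWord(String host, int port, String command) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 3000);
            socket.setSoTimeout(3000);
            OutputStream out = socket.getOutputStream();
            out.write(command.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            // The server replies and then closes the connection, so read until EOF.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                reply.write(buf, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.US_ASCII);
        }
    }

    /** A server that is running without errors answers "imok". */
    static boolean isHealthy(String reply) {
        return "imok".equals(reply.trim());
    }

    public static void main(String[] args) {
        try {
            String reply = fourLetterWord("localhost", 2181, "ruok");
            System.out.println("reply: " + reply + ", healthy: " + isHealthy(reply));
        } catch (IOException e) {
            System.out.println("no server reachable on localhost:2181: " + e.getMessage());
        }
    }
}
```

  If nothing is listening on the port, the program reports the connection failure instead of throwing, which makes it usable as a simple smoke test after starting zkServer.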

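  Pseudo-distributed installation. Like Hadoop, ZooKeeper can be installed in pseudo-distributed mode, with several server instances running on one machine. Here is how.

  I first tried the pseudo-distributed setup on Windows without success, and eventually got it working on Linux. Start from the downloaded ZooKeeper distribution and create three configuration files, listed below. The dataDir and dataLogDir values in them are written as Windows-style paths; on Linux, substitute directories that exist on your machine.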

zoo1.cfg:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=E:/zookeeper/zookeeper-3.4.5/d_1
# the port at which the clients will connect
clientPort=2181
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
dataLogDir=E:/zookeeper/zookeeper-3.4.5/log1_2
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889

zoo2.cfg:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=E:/zookeeper/zookeeper-3.4.5/d_2
# the port at which the clients will connect
clientPort=2182
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
dataLogDir=E:/zookeeper/zookeeper-3.4.5/logs_2
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889

zoo3.cfg:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=E:/zookeeper/zookeeper-3.4.5/d_3
# the port at which the clients will connect
clientPort=2183
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
dataLogDir=E:/zookeeper/zookeeper-3.4.5/logs_3
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889

  Each file gets its own clientPort, and the dataDir is changed in the same way, so the three instances do not collide on one machine. The files also add some new entries:

server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889

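  Here localhost is the host name or IP address of the machine running that server. The first port (2887) is the one followers use to connect to the leader, that is, for communication between the servers in the cluster; the second port (3887) is used for leader election. Because all three servers run on the same machine, each one needs its own pair of ports.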
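  The structure of a server.N entry can be spelled out in a few lines of code. This is only an illustrative sketch (ServerEntry is a hypothetical class, not ZooKeeper's own parser): N is the server id, followed by the host, the port followers use to reach the leader, and the leader-election port.

```java
/** Sketch: split a "server.N=host:port:port" entry into its parts (hypothetical helper). */
final class ServerEntry {
    final int id;
    final String host;
    final int quorumPort;   // followers connect to the leader on this port
    final int electionPort; // used for leader election

    ServerEntry(int id, String host, int quorumPort, int electionPort) {
        this.id = id;
        this.host = host;
        this.quorumPort = quorumPort;
        this.electionPort = electionPort;
    }

    /** Parses one line such as "server.1=localhost:2887:3887". */
    static ServerEntry parse(String line) {
        String[] keyValue = line.trim().split("=", 2);
        if (keyValue.length != 2 || !keyValue[0].startsWith("server.")) {
            throw new IllegalArgumentException("not a server entry: " + line);
        }
        int id = Integer.parseInt(keyValue[0].substring("server.".length()));
        String[] parts = keyValue[1].split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected host:port:port in: " + line);
        }
        return new ServerEntry(id, parts[0],
                Integer.parseInt(parts[1]), Integer.parseInt(parts[2]));
    }

    public static void main(String[] args) {
        ServerEntry e = parse("server.1=localhost:2887:3887");
        // prints: id=1 host=localhost quorumPort=2887 electionPort=3887
        System.out.println("id=" + e.id + " host=" + e.host
                + " quorumPort=" + e.quorumPort + " electionPort=" + e.electionPort);
    }
}
```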

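  initLimit: the time allowed for a follower to connect to the leader and complete its initial synchronization, expressed as a multiple of tickTime. In the configuration above it is 10 times tickTime; if the initial connection takes longer than that, it fails.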

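  syncLimit: the maximum time allowed between a request and its acknowledgement when the leader and a follower exchange messages, also expressed as a multiple of tickTime. A follower that cannot communicate with the leader within this time is dropped.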
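  Since both limits are counted in ticks, the actual durations follow from simple multiplication. A small sketch with the values from the configuration files above (QuorumTimeouts is a hypothetical name):

```java
/** Sketch: convert initLimit and syncLimit from ticks to milliseconds (hypothetical helper). */
final class QuorumTimeouts {

    /** A limit expressed in ticks, converted to milliseconds. */
    static long toMillis(int ticks, int tickTimeMs) {
        return (long) ticks * tickTimeMs;
    }

    public static void main(String[] args) {
        int tickTime = 2000; // ms, from the zooN.cfg files
        int initLimit = 10;  // ticks
        int syncLimit = 5;   // ticks
        System.out.println("initLimit = " + toMillis(initLimit, tickTime) + " ms"); // 20000 ms
        System.out.println("syncLimit = " + toMillis(syncLimit, tickTime) + " ms"); // 10000 ms
    }
}
```

  So with these settings a follower has 20 seconds for its initial synchronization and 10 seconds to get each request acknowledged.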

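  dataLogDir: the directory where ZooKeeper writes its transaction logs. Once it is set, the logs no longer go into dataDir. Giving the transaction log its own location is good for ZooKeeper's performance and stability.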
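  One step the configuration files do not show: in a replicated setup each server identifies itself through a file named myid in its dataDir, containing only its number from the server.N entries (1, 2 or 3 here). The file can be created by hand; the sketch below does the same thing in code. MyIdWriter is a hypothetical helper, and a temporary directory stands in for the real parent of d_1, d_2 and d_3.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch: create the myid file each server expects in its dataDir (hypothetical helper). */
final class MyIdWriter {

    /** Creates dataDir if needed and writes the server id into dataDir/myid. */
    static Path writeMyId(Path dataDir, int serverId) throws IOException {
        Files.createDirectories(dataDir);
        Path myid = dataDir.resolve("myid");
        Files.write(myid, String.valueOf(serverId).getBytes(StandardCharsets.US_ASCII));
        return myid;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temp directory replaces the real installation path.
        Path base = Files.createTempDirectory("zk-demo");
        for (int id = 1; id <= 3; id++) {
            Path myid = writeMyId(base.resolve("d_" + id), id);
            String content = new String(Files.readAllBytes(myid), StandardCharsets.US_ASCII);
            System.out.println(myid + " contains " + content);
        }
    }
}
```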

  Each configuration file represents one ZooKeeper server. Now start the pseudo-distributed cluster:

  zkServer.sh start zoo1.cfg

  zkServer.sh start zoo2.cfg

  zkServer.sh start zoo3.cfg

  Next is a Java program that acts as a client and calls the ZooKeeper service:

package cn.com.test;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

public class zkClient {

    public static void main(String[] args) throws Exception {
        // Default watcher: print every event the session receives.
        Watcher wh = new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                System.out.println(event.toString());
            }
        };
        // Connect to the server on port 2181 with a 30-second session timeout.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, wh);

        System.out.println("=========create the node===========");
        zk.create("/sharpxiajun", "znode1".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        System.out.println("=============check that the node was created===============");
        System.out.println(new String(zk.getData("/sharpxiajun", false, null)));

        System.out.println("=========update the node's data==========");
        // Version -1 matches any version of the node.
        zk.setData("/sharpxiajun", "sharpxiajun130901".getBytes(), -1);

        System.out.println("========check that the update succeeded=========");
        System.out.println(new String(zk.getData("/sharpxiajun", false, null)));

        System.out.println("=======delete the node==========");
        zk.delete("/sharpxiajun", -1);

        System.out.println("==========check that the node was deleted============");
        // exists() returns null when the node is gone.
        System.out.println("node status: " + zk.exists("/sharpxiajun", false));

        zk.close();
    }
}

  The output is as follows:

log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
=========create the node===========
WatchedEvent state:SyncConnected type:None path:null
=============check that the node was created===============
znode1
=========update the node's data==========
========check that the update succeeded=========
sharpxiajun130901
=======delete the node==========
==========check that the node was deleted============
node status: null

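  I won't walk through the program today; it is only meant to show what using ZooKeeper looks like. This post may not contain anything novel, but it lays the groundwork we need before really working with ZooKeeper.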
