"OD Big Data in Practice" — Flume Getting-Started Examples

I. netcat source + memory channel + logger sink

1. Modify the configuration

1) Edit the flume-env.sh file under $FLUME_HOME/conf as follows:

export JAVA_HOME=/opt/modules/jdk1..0_67

2) Under $FLUME_HOME/conf, create an agent subdirectory and create netcat-memory-logger.conf with the following content:

# netcat-memory-logger

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = beifeng-hadoop-
a1.sources.r1.port =

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity =
a1.channels.c1.transactionCapacity =

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
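In the config above, capacity caps how many events the memory channel can buffer at once, while transactionCapacity caps how many events a single source put or sink take may move in one transaction. A rough Python sketch of how those two bounds interact (this is an illustrative toy model, not Flume's actual implementation; the class name and the demo values 1000/100 are assumptions):

```python
from collections import deque

class MemoryChannelSketch:
    """Toy model of a Flume memory channel's two bounds (illustrative only)."""

    def __init__(self, capacity, transaction_capacity):
        self.capacity = capacity
        self.transaction_capacity = transaction_capacity
        self.queue = deque()

    def put_batch(self, events):
        # A source-side transaction: rejected whole if either bound is exceeded.
        if len(events) > self.transaction_capacity:
            raise ValueError("batch exceeds transactionCapacity")
        if len(self.queue) + len(events) > self.capacity:
            raise ValueError("channel full: capacity exceeded")
        self.queue.extend(events)

    def take_batch(self, n):
        # A sink-side transaction: takes up to n buffered events.
        n = min(n, self.transaction_capacity, len(self.queue))
        return [self.queue.popleft() for _ in range(n)]

# Example with assumed demo values: capacity 1000, transactionCapacity 100.
ch = MemoryChannelSketch(capacity=1000, transaction_capacity=100)
ch.put_batch([f"event-{i}" for i in range(100)])
print(ch.take_batch(100)[:2])  # → ['event-0', 'event-1']
```

This is also why a memory channel loses buffered events if the agent dies; the file channel in Part II trades throughput for durability.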

2. Start Flume and test

1) Start

bin/flume-ng agent -n a1 -c conf/ -f conf/agent/netcat-memory-logger.conf -Dflume.root.logger=INFO,console

2) Test

nc beifeng-hadoop- 

Type any string and check the agent's log output on the server.

This test uses the Linux nc command; if it is not available, install it first.

Install netcat: sudo yum -y install nc
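Under the hood, nc simply opens a TCP connection and writes each line; the Flume netcat source normally acknowledges every line with "OK". A minimal Python equivalent, in case nc is unavailable (the host and port in the usage comment are placeholders for your agent's bind address and port):

```python
import socket

def send_line(host, port, line):
    """Send one newline-terminated line over TCP and return the reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("utf-8") + b"\n")
        return sock.recv(1024).decode("utf-8").strip()

# Usage (fill in your agent's bind host and port from the config above):
# print(send_line("<agent-host>", <port>, "hello flume"))
```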

II. agent: avro source + file channel + hdfs sink

1. Add the configuration

Under $FLUME_HOME/conf, create an agent subdirectory (if it does not already exist) and create avro-file-hdfs.conf with the following content:

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind = beifeng-hadoop-
a1.sources.r1.port =

# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://beifeng-hadoop-02:9000/flume/events/%Y-%m-%d
# default: FlumeData
a1.sinks.k1.hdfs.filePrefix = FlumeData
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.rollInterval =
a1.sinks.k1.hdfs.rollCount =
# usually set close to the HDFS block size
a1.sinks.k1.hdfs.rollSize =
a1.sinks.k1.hdfs.fileType = DataStream
#a1.sinks.k1.hdfs.round = true
#a1.sinks.k1.hdfs.roundValue =
#a1.sinks.k1.hdfs.roundUnit = minute

# Use a channel which buffers events on disk
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /opt/modules/cdh/apache-flume-1.5.-cdh5.3.6-bin/checkpoint
a1.channels.c1.dataDirs = /opt/modules/cdh/apache-flume-1.5.-cdh5.3.6-bin/data

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
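The %Y-%m-%d escapes in hdfs.path are expanded from each event's timestamp header (supplied here by useLocalTimeStamp = true), so events land in a per-day directory. A small Python sketch of that expansion (illustrative only; Flume does this internally, and the example timestamp is arbitrary):

```python
import time

def expand_hdfs_path(pattern, epoch_seconds):
    """Expand strftime-style escapes the way a dated HDFS sink path resolves."""
    return time.strftime(pattern, time.localtime(epoch_seconds))

path = expand_hdfs_path(
    "hdfs://beifeng-hadoop-02:9000/flume/events/%Y-%m-%d",
    1436486400,  # an arbitrary example timestamp
)
print(path)  # e.g. hdfs://beifeng-hadoop-02:9000/flume/events/2015-07-10
```

A new day therefore starts a new directory; rollInterval/rollCount/rollSize control when files within a directory are closed and rolled.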

2. Start and test

1) Start the Flume agent

bin/flume-ng agent -n a1 -c conf/ -f conf/agent/avro-file-hdfs.conf -Dflume.root.logger=INFO,console

2) Test with Flume's built-in avro-client

bin/flume-ng avro-client --host beifeng-hadoop- --port  --filename /home/beifeng/order_info.txt
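avro-client reads the given file and ships each line to the avro source as one event. If you need something to send, you can generate a small sample file first (a hedged sketch: the temp path and the tab-separated record layout are placeholders, not the real order_info.txt format):

```python
import os
import tempfile

def write_sample_file(path, n=5):
    """Write n placeholder order lines; avro-client sends one event per line."""
    with open(path, "w", encoding="utf-8") as f:
        for i in range(n):
            f.write(f"order-{i}\tsku-{i}\t{9.99 + i}\n")
    return path

sample = write_sample_file(os.path.join(tempfile.gettempdir(), "order_info.txt"))
with open(sample, encoding="utf-8") as f:
    print(f.readline().strip())
```

After sending, list the dated directory under /flume/events/ with hdfs dfs -ls to confirm the sink wrote the events.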