欢迎来到尧图网

客户服务 关于我们

您的位置:首页 > 科技 > 名人名企 > 数据采集工具之Flume

数据采集工具之Flume

2024/11/30 10:53:03 来源:https://blog.csdn.net/weixin_44996457/article/details/140993503  浏览:    关键词:数据采集工具之Flume

本文主要实现数据到datahub的采集过程

1、下载

Index of /dist/flume/1.11.0

datahub插件下载

https://aliyun-datahub.oss-cn-hangzhou.aliyuncs.com/tools/aliyun-flume-datahub-sink-2.0.9.tar.gz

2、安装

$ tar aliyun-flume-datahub-sink-x.x.x.tar.gz
$ cd aliyun-flume-datahub-sink-x.x.x
$ mkdir ${FLUME_HOME}/plugins.d
$ mv aliyun-flume-datahub-sink ${FLUME_HOME}/plugins.d

3、编写配置文件 

# A single-node Flume configuration for DataHub
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /soft/data/test.csv
# Describe the sink
a1.sinks.k1.type = com.aliyun.datahub.flume.sink.DatahubSink
a1.sinks.k1.datahub.accessId = 2Z8tAOpDPBm5LEkA
a1.sinks.k1.datahub.accessKey = Tlupsw2G0PdKGCRyPLucHjeESqoCla
a1.sinks.k1.datahub.endPoint = https://datahub.cn-beijing-tbdg-d01.dh.res.bigdata.tbea.com
a1.sinks.k1.datahub.project = bigdata
a1.sinks.k1.datahub.topic = txt_flume
a1.sinks.k1.serializer = DELIMITED
a1.sinks.k1.serializer.delimiter = ,
a1.sinks.k1.serializer.fieldnames = id,name,gender,salary,my_time,decimal
a1.sinks.k1.serializer.charset = UTF-8
a1.sinks.k1.datahub.retryTimes = 5
a1.sinks.k1.datahub.retryInterval = 5
a1.sinks.k1.datahub.batchSize = 100
a1.sinks.k1.datahub.batchTimeout = 5
a1.sinks.k1.datahub.enablePb = true
a1.sinks.k1.datahub.compressType = DEFLATE
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 10000
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

4、启动

flume-ng agent -n a1 -c conf -f ./conf/flume-txt2datahub.conf -Dflume.root.logger=INFO,console

Q:启动报错

[root@hadoop2 apache-flume-1.11.0-bin]# flume-ng agent -n a1 -c conf -f ./conf/flume-txt2datahub.conf -Dflume.root.logger=INFO,console
Info: Including Hive libraries found via () for Hive access
+ exec /soft/jdk1.8.0_421/bin/java -Xmx20m -Dflume.root.logger=INFO,console -cp '/soft/apache-flume-1.11.0-bin/conf:/soft/apache-flume-1.11.0-bin/lib/*:/soft/apache-flume-1.11.0-bin/plugins.d/aliyun-flume-datahub-sink/lib/*:/soft/apache-flume-1.11.0-bin/plugins.d/aliyun-flume-datahub-sink/libext/*:/lib/*' -Djava.library.path= org.apache.flume.node.Application -n a1 -f ./conf/flume-txt2datahub.conf
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/apache-flume-1.11.0-bin/lib/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-flume-1.11.0-bin/plugins.d/aliyun-flume-datahub-sink/libext/slf4j-log4j12-1.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkNotNull(Ljava/lang/Object;Ljava/lang/String;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;at com.aliyun.datahub.flume.sink.DatahubSink.configure(DatahubSink.java:59)at org.apache.flume.conf.Configurables.configure(Configurables.java:41)at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:456)at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:109)at org.apache.flume.node.Application.main(Application.java:491)

A:删除Flume lib文件夹中的guava  jar包文件,重新启动

版权声明:

本网仅为发布的内容提供存储空间,不对发表、转载的内容提供任何形式的保证。凡本网注明“来源:XXX网络”的作品,均转载自其它媒体,著作权归作者所有,商业转载请联系作者获得授权,非商业转载请注明出处。

我们尊重并感谢每一位作者,均已注明文章来源和作者。如因作品内容、版权或其它问题,请及时与我们联系,联系邮箱:809451989@qq.com,投稿邮箱:809451989@qq.com