This article describes how to integrate Storm with Kafka programmatically.
I. Implementation Model
Data flow:
1. A Kafka producer publishes messages to the topic1 topic.
2. A Storm topology contains three components: KafkaSpout, SenqueceBolt, and KafkaBolt. KafkaSpout subscribes to topic1 and emits each message to SenqueceBolt for processing; KafkaBolt then publishes the processed data back to Kafka under the topic2 topic.
3. A Kafka consumer consumes the messages on topic2.
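Stripped of the Kafka and Storm plumbing, the per-message logic of this pipeline is a simple transformation chain. The following plain-Java sketch (the class and method names are illustrative, not part of the real topology) mirrors what the spout's scheme and the bolt do to a single message, assuming the bolt only prepends a prefix as described in the implementation below:

```java
import java.nio.charset.StandardCharsets;

public class PipelineSketch {
    // Mirrors MessageScheme: decode the raw Kafka message bytes as UTF-8
    static String deserialize(byte[] ser) {
        return new String(ser, StandardCharsets.UTF_8);
    }

    // Mirrors SenqueceBolt: prepend "I'm " and append "!"
    static String transform(String word) {
        return "I'm " + word + "!";
    }

    public static void main(String[] args) {
        // Simulate: producer bytes on topic1 -> spout -> bolt -> payload for topic2
        byte[] fromTopic1 = "storm".getBytes(StandardCharsets.UTF_8);
        String toTopic2 = transform(deserialize(fromTopic1));
        System.out.println(toTopic2); // prints "I'm storm!"
    }
}
```
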
II. Topology Implementation
1. Create a Maven project and configure pom.xml
The project needs three dependencies: storm-core, kafka_2.10, and storm-kafka (the pom excludes zookeeper and log4j from the kafka artifact so they do not conflict with the versions Storm already brings in).
```xml
<dependencies>
    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-core</artifactId>
        <version>0.9.2-incubating</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.10</artifactId>
        <version>0.8.1.1</version>
        <exclusions>
            <exclusion>
                <groupId>org.apache.zookeeper</groupId>
                <artifactId>zookeeper</artifactId>
            </exclusion>
            <exclusion>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-kafka</artifactId>
        <version>0.9.2-incubating</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>2.4</version>
            <configuration>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
```
2. KafkaSpout
KafkaSpout is a spout that ships with Storm; its source lives at https://github.com/apache/incubator-storm/tree/master/external
To use KafkaSpout you must implement the Scheme interface yourself; it is responsible for parsing the fields you need out of the raw message stream.
```java
public class MessageScheme implements Scheme {

    /* (non-Javadoc)
     * @see backtype.storm.spout.Scheme#deserialize(byte[])
     */
    public List<Object> deserialize(byte[] ser) {
        try {
            // Decode the raw Kafka message bytes as a UTF-8 string
            String msg = new String(ser, "UTF-8");
            return new Values(msg);
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported, but log just in case
            e.printStackTrace();
        }
        return null;
    }

    /* (non-Javadoc)
     * @see backtype.storm.spout.Scheme#getOutputFields()
     */
    public Fields getOutputFields() {
        // The spout emits a single field named "msg"
        return new Fields("msg");
    }
}
```
3. SenqueceBolt
SenqueceBolt's implementation is trivial: it prepends "I'm " to every message received from the spout.
```java
public class SenqueceBolt extends BaseBasicBolt {

    /* (non-Javadoc)
     * @see backtype.storm.topology.IBasicBolt#execute(backtype.storm.tuple.Tuple, backtype.storm.topology.BasicOutputCollector)
     */
    public void execute(Tuple input, BasicOutputCollector collector) {
        String word = (String) input.getValue(0);
        String out = "I'm " + word + "!";
        System.out.println("out=" + out);
        collector.emit(new Values(out));
    }

    /* (non-Javadoc)
     * @see backtype.storm.topology.IComponent#declareOutputFields(backtype.storm.topology.OutputFieldsDeclarer)
     */
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Declared as "message" because that is the field KafkaBolt reads
        declarer.declare(new Fields("message"));
    }
}
```
4. KafkaBolt
KafkaBolt is a bolt that ships with Storm and publishes messages to a Kafka topic. It reads the target topic and the producer settings from the topology configuration (the "topic" and "kafka.broker.properties" entries set below), and takes the outgoing payload from the tuple field named "message", which is why SenqueceBolt declares that output field.
5. Topology
```java
public class StormKafkaTopo {
    public static void main(String[] args) throws Exception {
        // ZooKeeper ensemble that Kafka registers its brokers in
        BrokerHosts brokerHosts = new ZkHosts("node04:2181,node05:2181,node06:2181");
        // Topic to subscribe to, plus the ZooKeeper root path and id used to store the spout's offsets
        SpoutConfig spoutConfig = new SpoutConfig(brokerHosts, "topic1", "/zkkafkaspout", "kafkaspout");

        // kafka.broker.properties consumed by KafkaBolt
        Config conf = new Config();
        Map<String, String> map = new HashMap<String, String>();
        // Kafka broker address
        map.put("metadata.broker.list", "node04:9092");
        // serializer.class is the message serializer
        map.put("serializer.class", "kafka.serializer.StringEncoder");
        conf.put("kafka.broker.properties", map);
        // Topic that KafkaBolt publishes to
        conf.put("topic", "topic2");

        spoutConfig.scheme = new SchemeAsMultiScheme(new MessageScheme());
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new KafkaSpout(spoutConfig));
        builder.setBolt("bolt", new SenqueceBolt()).shuffleGrouping("spout");
        builder.setBolt("kafkabolt", new KafkaBolt<String, Integer>()).shuffleGrouping("bolt");

        if (args != null && args.length > 0) {
            // Submit to a real cluster when a topology name is passed on the command line
            conf.setNumWorkers(3);
            StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
        } else {
            // Otherwise run in local mode for testing
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("Topo", conf, builder.createTopology());
            Utils.sleep(100000);
            cluster.killTopology("Topo");
            cluster.shutdown();
        }
    }
}
```
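For reference, KafkaBolt hands the kafka.broker.properties map above to the 0.8-era Kafka producer API, which is configured through a java.util.Properties object. A minimal sketch of the equivalent configuration (the key names and values are taken from the topology code above, not verified against a running cluster):

```java
import java.util.Properties;

public class BrokerProps {
    public static Properties brokerProperties() {
        Properties props = new Properties();
        // Same entries as the map stored under "kafka.broker.properties"
        props.put("metadata.broker.list", "node04:9092");               // broker address (0.8 producer key)
        props.put("serializer.class", "kafka.serializer.StringEncoder"); // message serializer
        return props;
    }

    public static void main(String[] args) {
        System.out.println(brokerProperties());
    }
}
```
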
III. Testing and Verification
1. Use the Kafka console client as the producer, publishing to topic1:
bin/kafka-console-producer.sh --broker-list node04:9092 --topic topic1
2. Use the Kafka console client as the consumer, subscribed to topic2:
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic topic2 --from-beginning
3. Run the Storm topology:
bin/storm jar storm-kafka-0.0.1-SNAPSHOT-jar-with-dependencies.jar StormKafkaTopo KafkaStorm
4. Results: each line typed into the producer console comes back on the consumer side with the "I'm " prefix; typing spark, for example, yields I'm spark!
Source: http://www.tuicool.com/articles/f6RVvq
http://www.sxt.cn/u/756/blog/4584
