Integrating Storm with Kafka


    This article describes how to integrate Storm with Kafka programmatically.

      I. Implementation Model

       Data flow:

        1. A Kafka producer publishes messages to the topic1 topic.

        2. A Storm topology contains three components: KafkaSpout, SenqueceBolt, and KafkaBolt. KafkaSpout subscribes to topic1 and hands each message to SenqueceBolt for processing; KafkaBolt then publishes the processed data to Kafka as topic2 messages.

        3. A Kafka consumer consumes the messages on topic2.

        

      II. Topology Implementation

        1. Create a Maven project and configure pom.xml

          The project needs three dependencies: storm-core, kafka_2.10, and storm-kafka.

    <dependencies>
        <dependency>
            <groupId>org.apache.storm</groupId>
            <artifactId>storm-core</artifactId>
            <version>0.9.2-incubating</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka_2.10</artifactId>
            <version>0.8.1.1</version>
            <exclusions>
                <exclusion>
                    <groupId>org.apache.zookeeper</groupId>
                    <artifactId>zookeeper</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>log4j</groupId>
                    <artifactId>log4j</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.storm</groupId>
            <artifactId>storm-kafka</artifactId>
            <version>0.9.2-incubating</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.4</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

     

        2. KafkaSpout

          KafkaSpout is a spout that ships with Storm; its source code is at https://github.com/apache/incubator-storm/tree/master/external

          To use KafkaSpout you must implement the Scheme interface yourself; it is responsible for parsing the needed data out of the raw message stream.

    import java.io.UnsupportedEncodingException;
    import java.util.List;

    import backtype.storm.spout.Scheme;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Values;

    public class MessageScheme implements Scheme {

        /* (non-Javadoc)
         * @see backtype.storm.spout.Scheme#deserialize(byte[])
         */
        public List<Object> deserialize(byte[] ser) {
            try {
                // Decode the raw Kafka message bytes as a UTF-8 string
                String msg = new String(ser, "UTF-8");
                return new Values(msg);
            } catch (UnsupportedEncodingException e) {
                // UTF-8 is always available, so there is nothing useful to do here
            }
            return null;
        }

        /* (non-Javadoc)
         * @see backtype.storm.spout.Scheme#getOutputFields()
         */
        public Fields getOutputFields() {
            // The spout emits a single field named "msg"
            return new Fields("msg");
        }
    }
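          As a quick sanity check of the scheme (not part of the original post), a throwaway class such as the following can be run locally; MessageSchemeCheck is a made-up name and it only assumes MessageScheme is on the classpath:

    import java.util.List;

    public class MessageSchemeCheck {
        public static void main(String[] args) throws Exception {
            MessageScheme scheme = new MessageScheme();
            // deserialize() decodes the raw bytes and wraps them in a one-field tuple
            List<Object> values = scheme.deserialize("hello".getBytes("UTF-8"));
            System.out.println(values); // prints [hello]
        }
    }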

        3. SenqueceBolt

           SenqueceBolt is trivial: it prepends "I'm " to each message received from the spout.

    import backtype.storm.topology.BasicOutputCollector;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseBasicBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;

    public class SenqueceBolt extends BaseBasicBolt {

        /* (non-Javadoc)
         * @see backtype.storm.topology.IBasicBolt#execute(backtype.storm.tuple.Tuple, backtype.storm.topology.BasicOutputCollector)
         */
        public void execute(Tuple input, BasicOutputCollector collector) {
            // Prepend "I'm " to the incoming message and emit the result
            String word = (String) input.getValue(0);
            String out = "I'm " + word + "!";
            System.out.println("out=" + out);
            collector.emit(new Values(out));
        }

        /* (non-Javadoc)
         * @see backtype.storm.topology.IComponent#declareOutputFields(backtype.storm.topology.OutputFieldsDeclarer)
         */
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // The field name "message" is what the downstream KafkaBolt looks up
            declarer.declare(new Fields("message"));
        }
    }

        4. KafkaBolt

          KafkaBolt is a bolt that ships with Storm (in the storm-kafka module); it is responsible for publishing messages to a Kafka topic.
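          For intuition only, KafkaBolt behaves roughly like the hand-rolled bolt sketched below: it reads the "kafka.broker.properties" map and the "topic" entry from the topology configuration and forwards the tuple field named "message" using the Kafka 0.8 producer API. This is an illustrative sketch, not the actual KafkaBolt source, and SimpleKafkaBolt is a made-up name.

    import java.util.Map;
    import java.util.Properties;

    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.BasicOutputCollector;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseBasicBolt;
    import backtype.storm.tuple.Tuple;

    public class SimpleKafkaBolt extends BaseBasicBolt {

        private Producer<String, String> producer;
        private String topic;

        @Override
        public void prepare(Map stormConf, TopologyContext context) {
            // Build a producer from the "kafka.broker.properties" map placed into the Config
            Properties props = new Properties();
            props.putAll((Map) stormConf.get("kafka.broker.properties"));
            producer = new Producer<String, String>(new ProducerConfig(props));
            topic = (String) stormConf.get("topic");
        }

        public void execute(Tuple input, BasicOutputCollector collector) {
            // Forward the upstream "message" field to the configured Kafka topic
            String message = input.getStringByField("message");
            producer.send(new KeyedMessage<String, String>(topic, message));
        }

        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // Terminal bolt: nothing is emitted downstream
        }
    }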

        5. Topology

    import java.util.HashMap;
    import java.util.Map;

    import storm.kafka.BrokerHosts;
    import storm.kafka.KafkaSpout;
    import storm.kafka.SpoutConfig;
    import storm.kafka.ZkHosts;
    import storm.kafka.bolt.KafkaBolt;
    import backtype.storm.Config;
    import backtype.storm.LocalCluster;
    import backtype.storm.StormSubmitter;
    import backtype.storm.spout.SchemeAsMultiScheme;
    import backtype.storm.topology.TopologyBuilder;
    import backtype.storm.utils.Utils;

    public class StormKafkaTopo {

        public static void main(String[] args) throws Exception {
            // ZooKeeper ensemble used by Kafka
            BrokerHosts brokerHosts = new ZkHosts("node04:2181,node05:2181,node06:2181");
            // Topic the KafkaSpout subscribes to, plus the ZooKeeper path and id used to store its offsets
            SpoutConfig spoutConfig = new SpoutConfig(brokerHosts, "topic1", "/zkkafkaspout", "kafkaspout");

            // kafka.broker.properties consumed by the KafkaBolt
            Config conf = new Config();
            Map<String, String> map = new HashMap<String, String>();
            // Kafka broker address
            map.put("metadata.broker.list", "node04:9092");
            // serializer.class is the message serializer class
            map.put("serializer.class", "kafka.serializer.StringEncoder");
            conf.put("kafka.broker.properties", map);
            // Topic the KafkaBolt publishes to
            conf.put("topic", "topic2");

            spoutConfig.scheme = new SchemeAsMultiScheme(new MessageScheme());
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("spout", new KafkaSpout(spoutConfig));
            builder.setBolt("bolt", new SenqueceBolt()).shuffleGrouping("spout");
            builder.setBolt("kafkabolt", new KafkaBolt<String, Integer>()).shuffleGrouping("bolt");

            if (args != null && args.length > 0) {
                // Submit to a cluster; the first argument is the topology name
                conf.setNumWorkers(3);
                StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
            } else {
                // Run in local mode for a while, then shut down
                LocalCluster cluster = new LocalCluster();
                cluster.submitTopology("Topo", conf, builder.createTopology());
                Utils.sleep(100000);
                cluster.killTopology("Topo");
                cluster.shutdown();
            }
        }
    }

     

      III. Testing and Verification

        1. Use the Kafka console client as a producer to publish messages to topic1

          bin/kafka-console-producer.sh --broker-list node04:9092 --topic topic1
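          If a standalone Java producer is preferred over the console script, a minimal sketch along these lines (Topic1Producer is a made-up class name) works against the same broker and serializer settings used in the topology configuration:

    import java.util.Properties;

    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class Topic1Producer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("metadata.broker.list", "node04:9092");
            props.put("serializer.class", "kafka.serializer.StringEncoder");

            Producer<String, String> producer = new Producer<String, String>(new ProducerConfig(props));
            // Each message should come back on topic2 as "I'm <message>!"
            for (int i = 0; i < 5; i++) {
                producer.send(new KeyedMessage<String, String>("topic1", "storm" + i));
            }
            producer.close();
        }
    }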

        2. Use the Kafka console client as a consumer to subscribe to topic2

          bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic topic2 --from-beginning
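          Equivalently, and purely as an illustration (Topic2Consumer is a made-up name), the Kafka 0.8 high-level consumer API can be used to watch topic2 from Java:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class Topic2Consumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "localhost:2181");
            props.put("group.id", "topic2-checker");
            props.put("auto.offset.reset", "smallest");

            ConsumerConnector consumer = Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
            // Ask for a single stream on topic2 and print every message as it arrives
            Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
            topicCountMap.put("topic2", 1);
            Map<String, List<KafkaStream<byte[], byte[]>>> streams = consumer.createMessageStreams(topicCountMap);

            ConsumerIterator<byte[], byte[]> it = streams.get("topic2").get(0).iterator();
            while (it.hasNext()) {
                System.out.println(new String(it.next().message()));
            }
        }
    }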

        3. Run the Storm topology

          bin/storm jar storm-kafka-0.0.1-SNAPSHOT-jar-with-dependencies.jar  StormKafkaTopo KafkaStorm

        4. Results: each message sent to topic1 should appear on the topic2 consumer prefixed with "I'm " (for example, sending "storm" yields "I'm storm!").

         

    Reposted from: http://www.tuicool.com/articles/f6RVvq

    http://www.sxt.cn/u/756/blog/4584

