40. Top 10 Popular Categories: Performing the Secondary Sort

    xiaoxiao  2021-04-15

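    This article is part of the "Spark Large-Scale E-Commerce Project in Practice" (《Spark大型电商项目实战》) series. It implements step 5 of the top 10 popular categories module: the secondary sort.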

    Code implementation

    /**
     * Step 5: map the data to an RDD of <SortKey, info> pairs,
     * then perform the secondary sort (descending).
     */
    JavaPairRDD<CategorySortKey, String> sortKey2countRDD = categoryid2countRDD.mapToPair(

            new PairFunction<Tuple2<Long, String>, CategorySortKey, String>() {

                private static final long serialVersionUID = 1L;

                public Tuple2<CategorySortKey, String> call(
                        Tuple2<Long, String> tuple) throws Exception {
                    // countInfo is the "|"-delimited string holding the three counts
                    String countInfo = tuple._2;
                    long clickCount = Long.valueOf(StringUtils.getFieldFromConcatString(
                            countInfo, "\\|", Constants.FIELD_CLICK_COUNT));
                    long orderCount = Long.valueOf(StringUtils.getFieldFromConcatString(
                            countInfo, "\\|", Constants.FIELD_ORDER_COUNT));
                    long payCount = Long.valueOf(StringUtils.getFieldFromConcatString(
                            countInfo, "\\|", Constants.FIELD_PAY_COUNT));

                    // Wrap the three counts in the custom sort key
                    CategorySortKey sortKey = new CategorySortKey(clickCount,
                            orderCount, payCount);

                    return new Tuple2<CategorySortKey, String>(sortKey, countInfo);
                }

            });

    // false = descending order
    JavaPairRDD<CategorySortKey, String> sortedCategoryCountRDD =
            sortKey2countRDD.sortByKey(false);
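    The code above depends on CategorySortKey, which this article does not show. The following is only a minimal sketch of what such a key could look like, assuming it compares click count first, then order count, then pay count. The class body, the SecondarySortSketch wrapper, and the sortDescending helper are all hypothetical, and a plain Java list stands in for the RDD so that the sketch runs without Spark.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SecondarySortSketch {

    /**
     * Hypothetical sort key (not the project's actual class): orders by
     * click count, then order count, then pay count.
     */
    public static final class CategorySortKey
            implements Comparable<CategorySortKey>, Serializable {

        private static final long serialVersionUID = 1L;

        public final long clickCount;
        public final long orderCount;
        public final long payCount;

        public CategorySortKey(long clickCount, long orderCount, long payCount) {
            this.clickCount = clickCount;
            this.orderCount = orderCount;
            this.payCount = payCount;
        }

        @Override
        public int compareTo(CategorySortKey other) {
            // Later fields are consulted only when the earlier ones tie.
            int byClick = Long.compare(clickCount, other.clickCount);
            if (byClick != 0) {
                return byClick;
            }
            int byOrder = Long.compare(orderCount, other.orderCount);
            if (byOrder != 0) {
                return byOrder;
            }
            return Long.compare(payCount, other.payCount);
        }

        @Override
        public String toString() {
            return "(" + clickCount + "," + orderCount + "," + payCount + ")";
        }
    }

    /** Returns a descending-sorted copy, analogous to sortByKey(false). */
    public static List<CategorySortKey> sortDescending(List<CategorySortKey> keys) {
        List<CategorySortKey> sorted = new ArrayList<>(keys);
        sorted.sort(Comparator.reverseOrder());
        return sorted;
    }

    public static void main(String[] args) {
        List<CategorySortKey> keys = new ArrayList<>();
        keys.add(new CategorySortKey(10, 5, 1));
        keys.add(new CategorySortKey(12, 0, 0));
        keys.add(new CategorySortKey(10, 5, 3));
        keys.add(new CategorySortKey(10, 7, 0));

        // prints [(12,0,0), (10,7,0), (10,5,3), (10,5,1)]
        System.out.println(sortDescending(keys));
    }
}
```

    In the sample data, three keys tie on click count (10), so order count decides between them, and pay count breaks the remaining tie. A key used with sortByKey also has to be serializable, which is why the sketch implements Serializable.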

    Source code for "Spark Large-Scale E-Commerce Project in Practice": https://github.com/Erik-ly/SprakProject

    More articles in the "Spark Large-Scale E-Commerce Project in Practice" series: http://blog.csdn.net/u012318074/article/category/6744423

