Using row_number: the data row_number receives has already been partitioned and sorted, for example row_number() OVER (PARTITION BY c ORDER BY d)
Hive version: 2.1.0
description = @Description(
  name = "row_number",
  value = "_FUNC_() - The ROW_NUMBER function assigns a unique number (sequentially, starting "
      + "from 1, as defined by ORDER BY) to each row within the partition."
),
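To make the description concrete, here is a minimal Java sketch of what the query computes: rows are grouped by c, sorted by d inside each group, and numbered from 1 in every group. The class RowNumberSemantics, the Row type and the number method are hypothetical names invented for illustration; this is not how Hive executes the query, only a model of the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RowNumberSemantics {
    /** Hypothetical row with the two columns used in the query: c and d. */
    static final class Row {
        final String c;
        final int d;
        Row(String c, int d) { this.c = c; this.d = d; }
    }

    /** Returns "c,d,rowNumber" strings, partitioned by c and ordered by d. */
    static List<String> number(List<Row> rows) {
        // PARTITION BY c: group rows by key, keeping first-seen key order.
        Map<String, List<Row>> partitions = new LinkedHashMap<>();
        for (Row r : rows) {
            partitions.computeIfAbsent(r.c, k -> new ArrayList<>()).add(r);
        }
        List<String> out = new ArrayList<>();
        for (List<Row> partition : partitions.values()) {
            // ORDER BY d: sort inside the partition.
            partition.sort(Comparator.comparingInt((Row r) -> r.d));
            int next = 1; // numbering restarts at 1 for every partition
            for (Row r : partition) {
                out.add(r.c + "," + r.d + "," + next++);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(new Row("a", 3), new Row("b", 1), new Row("a", 1));
        // Prints a,1,1 then a,3,2 then b,1,1 (one per line).
        number(rows).forEach(System.out::println);
    }
}
```

The function itself never sees the PARTITION BY or ORDER BY clauses; it only has to count rows in the order they arrive.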
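The source class behind row_number: org.apache.hadoop.hive.ql.udf.generic.GenericUDAFRowNumber
It is a UDAF. Hive's built-in functions are registered in FunctionRegistry, and the corresponding package is under the ql directory.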
system.registerGenericUDAF("row_number", new GenericUDAFRowNumber());
2. The iterate method
It calls RowNumberBuffer.incr(), which advances the count by 1 for each incoming record.
public void iterate(AggregationBuffer agg, Object[] parameters) throws HiveException {
  ((RowNumberBuffer) agg).incr();
}
3. The terminate method
It returns the final result: the list held in RowNumberBuffer.
public Object terminate(AggregationBuffer agg) throws HiveException {
  return ((RowNumberBuffer) agg).rowNums;
}
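The interplay of iterate and terminate can be sketched without any Hive dependency. The following is a simplified stand-in, not Hive's code: the Buffer class imitates RowNumberBuffer under the assumption that incr() appends the next sequential number to the list, and it uses plain Integer values rather than Hive's writable types. The names RowNumberSketch, Buffer, nextRow and numberPartition are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class RowNumberSketch {
    /** Simplified stand-in for RowNumberBuffer (an assumption, not Hive's class). */
    static final class Buffer {
        final List<Integer> rowNums = new ArrayList<>();
        int nextRow = 1; // the first row of a partition gets number 1

        void incr() {
            rowNums.add(nextRow++); // record the current number, then advance
        }
    }

    /** Mirrors iterate: the row's values are ignored, only its arrival counts. */
    static void iterate(Buffer agg, Object[] parameters) {
        agg.incr();
    }

    /** Mirrors terminate: hands back the whole list, one number per row. */
    static List<Integer> terminate(Buffer agg) {
        return agg.rowNums;
    }

    /** Feeds one already-sorted partition through a fresh buffer. */
    static List<Integer> numberPartition(List<Object[]> rows) {
        Buffer agg = new Buffer();
        for (Object[] row : rows) {
            iterate(agg, row);
        }
        return terminate(agg);
    }

    public static void main(String[] args) {
        List<Object[]> partition = new ArrayList<>();
        partition.add(new Object[] {"a", 1});
        partition.add(new Object[] {"a", 3});
        partition.add(new Object[] {"a", 7});
        System.out.println(numberPartition(partition)); // prints [1, 2, 3]
    }
}
```

Because terminate returns one list for the whole partition instead of a single aggregate value, the i-th entry lines up with the i-th row of the sorted partition.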