datafu.pig.stats
Class HyperLogLogPlusPlus
java.lang.Object
org.apache.pig.EvalFunc<T>
org.apache.pig.AccumulatorEvalFunc<java.lang.Long>
datafu.pig.stats.HyperLogLogPlusPlus
- All Implemented Interfaces:
- org.apache.pig.Accumulator<java.lang.Long>
public class HyperLogLogPlusPlus
- extends org.apache.pig.AccumulatorEvalFunc<java.lang.Long>
A UDF that applies the HyperLogLog++ cardinality estimation algorithm.
This uses the implementation of HyperLogLog++ from stream-lib.
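HyperLogLog++ is an enhanced version of the HyperLogLog algorithm, described in the paper "HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm" (Heule, Nunkesser, and Hall).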
This is a streaming implementation, and therefore the input data does not need to be sorted.
- Author:
- mhayes
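To illustrate why a streaming estimator does not need sorted input, here is a minimal sketch of the classic HyperLogLog idea in plain Java. It is a simplification for illustration only: it is not HyperLogLog++ and not the stream-lib implementation this UDF relies on. The class name `SimpleHyperLogLog`, the `offer` and `cardinality` methods, and the hash function are all hypothetical choices made for this example.

```java
import java.nio.charset.StandardCharsets;

// Simplified classic HyperLogLog, for illustration only.
// NOT HyperLogLog++ and NOT the stream-lib code used by the UDF.
class SimpleHyperLogLog {
    private final int p;            // precision: number of index bits
    private final int m;            // number of registers, 2^p
    private final byte[] registers; // max rank seen per register

    SimpleHyperLogLog(int p) {
        // The alpha constant used below is only valid for m >= 128.
        if (p < 7 || p > 16) {
            throw new IllegalArgumentException("p must be in [7, 16]");
        }
        this.p = p;
        this.m = 1 << p;
        this.registers = new byte[m];
    }

    // 64-bit FNV-1a followed by a mixing step (an assumed hash choice).
    static long hash64(String s) {
        long h = 0xcbf29ce484222325L;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    // Each item only ever raises one register to a maximum, so the
    // final state is independent of input order and of duplicates.
    void offer(String item) {
        long h = hash64(item);
        int idx = (int) (h >>> (64 - p)); // top p bits select a register
        long rest = h << p;               // remaining bits, left-aligned
        int rank = Math.min(Long.numberOfLeadingZeros(rest), 64 - p) + 1;
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
        }
    }

    long cardinality() {
        double sum = 0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2, -r);
            if (r == 0) {
                zeros++;
            }
        }
        double alpha = 0.7213 / (1 + 1.079 / m);
        double raw = alpha * m * m / sum;
        // Small-range correction: fall back to linear counting.
        if (raw <= 2.5 * m && zeros > 0) {
            return Math.round(m * Math.log((double) m / zeros));
        }
        return Math.round(raw);
    }

    public static void main(String[] args) {
        SimpleHyperLogLog hll = new SimpleHyperLogLog(14);
        // Unsorted input with every value offered twice.
        for (int i = 99999; i >= 0; i--) {
            hll.offer("item-" + i);
            hll.offer("item-" + i);
        }
        System.out.println("estimated distinct count: " + hll.cardinality());
    }
}
```

Because every update is a per-register maximum, feeding the same values in a different order, or repeating them, leaves the estimate unchanged. HyperLogLog++ adds refinements on top of this scheme that the sketch above does not attempt to reproduce.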
Fields inherited from class org.apache.pig.EvalFunc
- log, pigLogger, reporter, returnType
Method Summary
- void accumulate(org.apache.pig.data.Tuple arg0)
- void cleanup()
- java.lang.Long getValue()
- org.apache.pig.impl.logicalLayer.schema.Schema outputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input)
Methods inherited from class org.apache.pig.AccumulatorEvalFunc
- exec
Methods inherited from class org.apache.pig.EvalFunc
- finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, isAsynchronous, progress, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn
Methods inherited from class java.lang.Object
- clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
HyperLogLogPlusPlus
public HyperLogLogPlusPlus()
- Constructs a HyperLogLog++ estimator with the default precision.
HyperLogLogPlusPlus
public HyperLogLogPlusPlus(java.lang.String p)
- Constructs a HyperLogLog++ estimator with the given precision.
- Parameters:
p - the precision value, passed as a string
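As a rough guide to choosing `p`, the sketch below relates a precision value to the register count (2^p) and to the approximate relative standard error of classic HyperLogLog (1.04 / sqrt(m)). The class `HllPrecision` and its methods are hypothetical helpers for this example, and parsing the string with `Integer.parseInt` is an assumption about how the constructor argument is interpreted. HyperLogLog++ may behave differently from this formula at small cardinalities.

```java
// Hypothetical helper relating a precision value to sketch size and accuracy.
class HllPrecision {
    // Number of registers for precision p: m = 2^p.
    static int registers(int p) {
        return 1 << p;
    }

    // Approximate relative standard error of classic HyperLogLog.
    static double standardError(int p) {
        return 1.04 / Math.sqrt(registers(p));
    }

    public static void main(String[] args) {
        // Precision arrives as a string, as in the constructor above;
        // Integer.parseInt is an assumed way to interpret it.
        for (String arg : new String[] {"10", "14", "20"}) {
            int p = Integer.parseInt(arg);
            System.out.printf("p=%d registers=%d stderr~%.5f%n",
                    p, registers(p), standardError(p));
        }
    }
}
```

Each increment of `p` doubles the number of registers and reduces the expected error by a factor of about 1.41, so higher precision trades memory for accuracy.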
accumulate
public void accumulate(org.apache.pig.data.Tuple arg0)
throws java.io.IOException
- Specified by: accumulate in interface org.apache.pig.Accumulator<java.lang.Long>
- Specified by: accumulate in class org.apache.pig.AccumulatorEvalFunc<java.lang.Long>
- Throws: java.io.IOException
cleanup
public void cleanup()
- Specified by: cleanup in interface org.apache.pig.Accumulator<java.lang.Long>
- Specified by: cleanup in class org.apache.pig.AccumulatorEvalFunc<java.lang.Long>
getValue
public java.lang.Long getValue()
- Specified by: getValue in interface org.apache.pig.Accumulator<java.lang.Long>
- Specified by: getValue in class org.apache.pig.AccumulatorEvalFunc<java.lang.Long>
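The three methods above follow Pig's accumulator contract: accumulate is called one or more times with batches of input for a key, getValue returns the result once all batches have been seen, and cleanup resets state before the next key. The following plain-Java sketch mimics that lifecycle without depending on Pig. The class `DistinctCountAccumulator` is hypothetical, takes lists of strings rather than Pig tuples, and uses an exact `HashSet` count as a stand-in for the HyperLogLog++ estimator.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in that mirrors the accumulate/getValue/cleanup lifecycle.
// It counts distinct values exactly instead of estimating them.
class DistinctCountAccumulator {
    private Set<String> seen = new HashSet<>();

    // Called once per batch; batches for the same key may arrive in any order.
    void accumulate(List<String> batch) {
        seen.addAll(batch);
    }

    // Called after all batches for the current key have been accumulated.
    Long getValue() {
        return (long) seen.size();
    }

    // Called after getValue to reset state for the next key.
    void cleanup() {
        seen = new HashSet<>();
    }

    public static void main(String[] args) {
        DistinctCountAccumulator acc = new DistinctCountAccumulator();
        acc.accumulate(Arrays.asList("a", "b"));
        acc.accumulate(Arrays.asList("b", "c"));
        System.out.println("distinct in first group: " + acc.getValue());
        acc.cleanup();
        acc.accumulate(Arrays.asList("x"));
        System.out.println("distinct in second group: " + acc.getValue());
    }
}
```

Processing input in batches keeps memory bounded by the accumulator's own state rather than by the size of the group, which is what makes a fixed-size sketch such as HyperLogLog++ a natural fit for this interface.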
outputSchema
public org.apache.pig.impl.logicalLayer.schema.Schema outputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input)
- Overrides: outputSchema in class org.apache.pig.EvalFunc<java.lang.Long>
Matthew Hayes, Sam Shah