Class StreamingMedian

  extended by org.apache.pig.EvalFunc<T>
      extended by org.apache.pig.AccumulatorEvalFunc<>
          extended by datafu.pig.stats.StreamingQuantile
              extended by datafu.pig.stats.StreamingMedian

public class StreamingMedian
extends StreamingQuantile

Computes the approximate median for a (not necessarily sorted) input bag, using the Munro-Paterson algorithm. This is a convenience wrapper around StreamingQuantile.

N.B.: all the data for a given key is pushed to a single reducer, so if the data is too large, make sure it is partitioned first (e.g., group by 'day'). In other words, this is not a distributed median computation.
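
The per-key behaviour described above can be illustrated with a small, self-contained Java sketch. It is not DataFu's implementation: it sorts each group and returns an exact median rather than using the Munro-Paterson approximation, and the class and method names (MedianPerGroupSketch, median, medianPerKey) are hypothetical. It only shows that each key's values are gathered in one place, so the size of the largest group bounds the work done for that key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of DataFu or Pig.
class MedianPerGroupSketch {

    // Exact median of an unsorted, non-empty list. For an even number of
    // values this returns the lower of the two middle elements.
    static double median(List<Double> values) {
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        return sorted.get((sorted.size() - 1) / 2);
    }

    // Groups values by key (e.g. a 'day' string), then computes one median
    // per group. All values of a key end up in a single list, mirroring the
    // "single reducer per key" note above.
    static Map<String, Double> medianPerKey(List<String> keys, List<Double> values) {
        Map<String, List<Double>> groups = new TreeMap<>();
        for (int i = 0; i < keys.size(); i++) {
            groups.computeIfAbsent(keys.get(i), k -> new ArrayList<>()).add(values.get(i));
        }
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, List<Double>> e : groups.entrySet()) {
            result.put(e.getKey(), median(e.getValue()));
        }
        return result;
    }
}
```

Grouping by a finer key (such as day rather than month) keeps each per-key list smaller, which is the kind of partitioning the note recommends.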

Field Summary
Fields inherited from class org.apache.pig.EvalFunc
log, pigLogger, reporter, returnType
Constructor Summary
Method Summary
Methods inherited from class datafu.pig.stats.StreamingQuantile
accumulate, cleanup, getValue, outputSchema
Methods inherited from class org.apache.pig.AccumulatorEvalFunc
Methods inherited from class org.apache.pig.EvalFunc
finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, isAsynchronous, progress, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail


public StreamingMedian()

Matthew Hayes, Sam Shah