datafu.pig.stats
Class WilsonBinConf
java.lang.Object
org.apache.pig.EvalFunc<T>
datafu.pig.util.SimpleEvalFunc<org.apache.pig.data.Tuple>
datafu.pig.stats.WilsonBinConf
public class WilsonBinConf
- extends SimpleEvalFunc<org.apache.pig.data.Tuple>
Computes the Wilson score confidence interval for a binomial proportion.
The constructor takes the alpha parameter, which sets the confidence level
(under the usual convention, an alpha of 0.10 gives a 90% interval). The
UDF's inputs are the number of positive (success) outcomes and the total
number of observations. The UDF returns the confidence interval as a
(lower,upper) tuple.
Example:
-- score each row by the lower bound of its Wilson binomial proportion confidence interval
%declare WILSON_ALPHA 0.10
define WilsonBinConf datafu.pig.stats.WilsonBinConf('$WILSON_ALPHA');
bar = FOREACH foo GENERATE WilsonBinConf(successes, totals).lower as score;
quux = ORDER bar BY score DESC;
top = LIMIT quux 10;
Fields inherited from class org.apache.pig.EvalFunc
log, pigLogger, reporter, returnType
Method Summary
org.apache.pig.data.Tuple binconf(java.lang.Long x, java.lang.Long n)
org.apache.pig.data.Tuple call(java.lang.Number x, java.lang.Number n)
org.apache.pig.impl.logicalLayer.schema.Schema outputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input)
Override outputSchema so we can verify the input schema at pig compile time, instead of runtime
Methods inherited from class org.apache.pig.EvalFunc
finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLogger, getPigLogger, getReporter, getSchemaName, isAsynchronous, progress, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
WilsonBinConf
public WilsonBinConf(double alpha)
WilsonBinConf
public WilsonBinConf(java.lang.String alpha)
call
public org.apache.pig.data.Tuple call(java.lang.Number x,
java.lang.Number n)
throws java.io.IOException
- Throws:
java.io.IOException
binconf
public org.apache.pig.data.Tuple binconf(java.lang.Long x,
java.lang.Long n)
throws java.io.IOException
- Parameters:
x
- The number of positive (success) outcomes
n
- The number of observations
- Returns:
- The (lower,upper) confidence interval
- Throws:
java.io.IOException
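To make the (lower,upper) result concrete, the following is a minimal sketch of the standard Wilson score interval formula. It is not the DataFu source: the class name WilsonSketch and the method interval are hypothetical, and the sketch assumes the caller supplies z, the standard normal quantile at 1 - alpha/2, rather than deriving it from alpha the way the UDF's constructor argument would require.

```java
/** Hypothetical stand-alone sketch; not the DataFu implementation. */
final class WilsonSketch {

    /**
     * Wilson score interval for x successes out of n observations.
     * z is the standard normal quantile at 1 - alpha/2, supplied by the
     * caller (an assumption of this sketch).
     */
    static double[] interval(long x, long n, double z) {
        if (n <= 0 || x < 0 || x > n) {
            throw new IllegalArgumentException("require 0 <= x <= n and n > 0");
        }
        double p = (double) x / n;            // observed proportion
        double z2 = z * z;
        double denom = 1.0 + z2 / n;          // shared denominator
        double center = (p + z2 / (2.0 * n)) / denom;
        double half = z * Math.sqrt(p * (1.0 - p) / n + z2 / (4.0 * n * n)) / denom;
        // Clamp to [0, 1] to absorb floating-point rounding at the extremes.
        double lower = Math.max(0.0, center - half);
        double upper = Math.min(1.0, center + half);
        return new double[] { lower, upper };
    }

    public static void main(String[] args) {
        // Normal quantile at 0.95, i.e. alpha = 0.10 (two-sided 90% interval).
        double z90 = 1.6448536269514722;
        double[] ci = interval(45, 100, z90);
        System.out.println("lower=" + ci[0] + " upper=" + ci[1]);
    }
}
```

Unlike the simple normal-approximation interval, the Wilson interval stays within [0, 1] and remains informative when x is 0 or n, which is why its lower bound is a common ranking score for items with few observations.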
outputSchema
public org.apache.pig.impl.logicalLayer.schema.Schema outputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input)
- Description copied from class:
SimpleEvalFunc
- Override outputSchema so we can verify the input schema at pig compile time, instead of runtime
- Overrides:
outputSchema
in class SimpleEvalFunc<org.apache.pig.data.Tuple>
- Parameters:
input
- input schema
- Returns:
- call to super.outputSchema in case schema was defined elsewhere
Matthew Hayes, Sam Shah