accumulate(Tuple) - Method in class datafu.pig.bags.CountEach
accumulate(Tuple) - Method in class datafu.pig.bags.DistinctBy
accumulate(Tuple) - Method in class datafu.pig.bags.Enumerate
accumulate(Tuple) - Method in class datafu.pig.linkanalysis.PageRank
accumulate(Tuple) - Method in class datafu.pig.sampling.ReservoirSample
accumulate(Tuple) - Method in class datafu.pig.sessions.SessionCount
accumulate(Tuple) - Method in class datafu.pig.sessions.Sessionize
accumulate(Tuple) - Method in class datafu.pig.stats.HyperLogLogPlusPlus
accumulate(Tuple) - Method in class datafu.pig.stats.StreamingQuantile
accumulate(Tuple) - Method in class datafu.pig.stats.VAR
addNode(Integer, ArrayList<Map<String, Object>>) - Method in class datafu.pig.linkanalysis.PageRankImpl
addNode(Integer, ArrayList<Map<String, Object>>, float) - Method in class datafu.pig.linkanalysis.PageRankImpl
AliasableEvalFunc<T> - Class in datafu.pig.util
Makes implementing and using UDFs easier by enabling named parameters.
AliasableEvalFunc() - Constructor for class datafu.pig.util.AliasableEvalFunc
all_equal(PriorityQueue<SetIntersect.pair>) - Method in class datafu.pig.sets.SetIntersect
AppendToBag - Class in datafu.pig.bags
Appends a tuple to a bag.
AppendToBag() - Constructor for class datafu.pig.bags.AppendToBag
Assert - Class in datafu.pig.util
Deprecated. Use AssertUDF instead.
Assert() - Constructor for class datafu.pig.util.Assert
AssertUDF - Class in datafu.pig.util
Filter function which asserts that a value is true.
AssertUDF() - Constructor for class datafu.pig.util.AssertUDF


BagConcat - Class in datafu.pig.bags
Unions all input bags to produce a single bag containing all tuples.
BagConcat() - Constructor for class datafu.pig.bags.BagConcat
bagFactory - Static variable in class datafu.pig.sampling.SimpleRandomSampleWithReplacementElect
BagGroup - Class in datafu.pig.bags
Performs an in-memory group operation on a bag.
BagGroup() - Constructor for class datafu.pig.bags.BagGroup
BagLeftOuterJoin - Class in datafu.pig.bags
Performs an in-memory left outer join across multiple bags.
BagLeftOuterJoin() - Constructor for class datafu.pig.bags.BagLeftOuterJoin
BagSplit - Class in datafu.pig.bags
Splits a bag of tuples into a bag of bags, where the inner bags collectively contain the tuples from the original bag.
BagSplit() - Constructor for class datafu.pig.bags.BagSplit
BagSplit(String) - Constructor for class datafu.pig.bags.BagSplit
binconf(Long, Long) - Method in class datafu.pig.stats.WilsonBinConf
BoolToInt - Class in datafu.pig.util
UDF which converts a Boolean to an Integer.
BoolToInt() - Constructor for class datafu.pig.util.BoolToInt


call(DataBag, Tuple) - Method in class datafu.pig.bags.AppendToBag
call(DataBag, Tuple) - Method in class datafu.pig.bags.FirstTupleFromBag
call(DataBag, Tuple) - Method in class datafu.pig.bags.PrependToBag
call(DataBag) - Method in class datafu.pig.bags.ReverseEnumerate
call(Double, Double, Double, Double) - Method in class datafu.pig.geo.HaversineDistInMiles
call(String) - Method in class datafu.pig.hash.MD5
call(String) - Method in class datafu.pig.hash.SHA
call(Integer, Integer) - Method in class datafu.pig.random.RandInt
call(DataBag) - Method in class datafu.pig.stats.Quantile
call(Number, Number) - Method in class datafu.pig.stats.WilsonBinConf
call(String) - Method in class datafu.pig.urls.UserAgentClassify
call(Boolean) - Method in class datafu.pig.util.BoolToInt
call(Integer) - Method in class datafu.pig.util.IntToBool
CANDIDATE_FIELD_NAME - Static variable in class datafu.pig.sampling.SimpleRandomSampleWithReplacementVote
cleanup() - Method in class datafu.pig.bags.CountEach
cleanup() - Method in class datafu.pig.bags.DistinctBy
cleanup() - Method in class datafu.pig.bags.Enumerate
cleanup() - Method in class datafu.pig.linkanalysis.PageRank
cleanup() - Method in class datafu.pig.sampling.ReservoirSample
cleanup() - Method in class datafu.pig.sessions.SessionCount
cleanup() - Method in class datafu.pig.sessions.Sessionize
cleanup() - Method in class datafu.pig.stats.HyperLogLogPlusPlus
cleanup() - Method in class datafu.pig.stats.StreamingQuantile
cleanup() - Method in class datafu.pig.stats.VAR
clear() - Method in class datafu.pig.linkanalysis.PageRankImpl
Coalesce - Class in datafu.pig.util
Returns the first non-null value from a tuple, just like COALESCE in SQL.
Coalesce() - Constructor for class datafu.pig.util.Coalesce
Coalesce(String) - Constructor for class datafu.pig.util.Coalesce
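The COALESCE behavior described above can be sketched in plain Java. This is a minimal illustration, not the DataFu implementation: the class name `CoalesceSketch` and the varargs signature are assumptions made for the example, and the real UDF operates on Pig tuples.

```java
// Hypothetical helper for illustration only; not part of datafu.pig.util.
final class CoalesceSketch {

    /** Returns the first non-null argument, or null when every argument is null. */
    @SafeVarargs
    static <T> T coalesce(T... values) {
        for (T value : values) {
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String first = coalesce(null, "b", "c");
        System.out.println(first); // prints "b"
    }
}
```

As in SQL, later arguments are only consulted when every earlier one is null.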
combine(DataBag) - Static method in class datafu.pig.stats.VAR
commit(ProgressIndicator) - Method in class datafu.pig.linkanalysis.PageRankImpl
ContextualEvalFunc<T> - Class in datafu.pig.util
An abstract class which enables UDFs to store instance properties on the front end which will be available on the back end.
ContextualEvalFunc() - Constructor for class datafu.pig.util.ContextualEvalFunc
count(Tuple) - Static method in class datafu.pig.stats.VAR
CountEach - Class in datafu.pig.bags
Generates a count of the number of times each distinct tuple appears in a bag.
CountEach() - Constructor for class datafu.pig.bags.CountEach
CountEach(String) - Constructor for class datafu.pig.bags.CountEach
countMatches(PriorityQueue<SetDifference.Pair>) - Method in class datafu.pig.sets.SetDifference
Counts how many elements in the priority queue match the element at the front of the queue, which should be from the first bag.


datafu.pig.bags - package datafu.pig.bags
A collection of general purpose UDFs for operating on bags.
datafu.pig.geo - package datafu.pig.geo
UDFs for geographic computations.
datafu.pig.hash - package datafu.pig.hash
UDFs for computing hashes from data.
datafu.pig.linkanalysis - package datafu.pig.linkanalysis
UDFs for performing link analysis, such as PageRank.
datafu.pig.random - package datafu.pig.random
UDFs dealing with randomness.
datafu.pig.sampling - package datafu.pig.sampling
Sampling UDFs, including weighted sample, reservoir sampling, sampling by key, etc.
datafu.pig.sessions - package datafu.pig.sessions
UDFs for sessionizing data.
datafu.pig.sets - package datafu.pig.sets
UDFs for set operations such as intersect and union.
datafu.pig.stats - package datafu.pig.stats
Statistics UDFs for computing median, quantiles, variance, confidence intervals, etc.
datafu.pig.urls - package datafu.pig.urls
UDFs for processing URLs.
datafu.pig.util - package datafu.pig.util
Other useful utilities.
DataFuException - Exception in datafu.pig.util
DataFuException() - Constructor for exception datafu.pig.util.DataFuException
DataFuException(String) - Constructor for exception datafu.pig.util.DataFuException
DataFuException(String, Throwable) - Constructor for exception datafu.pig.util.DataFuException
DataFuException(Throwable) - Constructor for exception datafu.pig.util.DataFuException
disableDanglingNodeHandling() - Method in class datafu.pig.linkanalysis.PageRankImpl
Disables dangling node handling (disabled by default).
disableEdgeDiskCaching() - Method in class datafu.pig.linkanalysis.PageRankImpl
Disables disk caching of edges once there are too many (disabled by default).
disableNodeBiasing() - Method in class datafu.pig.linkanalysis.PageRankImpl
DistinctBy - Class in datafu.pig.bags
Get distinct elements in a bag by a given set of field positions.
DistinctBy(String...) - Constructor for class datafu.pig.bags.DistinctBy
distribute(ProgressIndicator) - Method in class datafu.pig.linkanalysis.PageRankImpl


EARTH_RADIUS - Static variable in class datafu.pig.geo.HaversineDistInMiles
edgeCount() - Method in class datafu.pig.linkanalysis.PageRankImpl
EmptyBagToNull - Class in datafu.pig.bags
Returns null if the input is an empty bag; otherwise, returns the input bag unchanged.
EmptyBagToNull() - Constructor for class datafu.pig.bags.EmptyBagToNull
EmptyBagToNullFields - Class in datafu.pig.bags
For an empty bag, inserts a tuple having null values for all fields; otherwise, the input bag is returned unchanged.
EmptyBagToNullFields() - Constructor for class datafu.pig.bags.EmptyBagToNullFields
enableDanglingNodeHandling() - Method in class datafu.pig.linkanalysis.PageRankImpl
Enables dangling node handling (disabled by default).
enableEdgeDiskCaching() - Method in class datafu.pig.linkanalysis.PageRankImpl
Enables disk caching of edges once there are too many (disabled by default).
enableNodeBiasing() - Method in class datafu.pig.linkanalysis.PageRankImpl
Enumerate - Class in datafu.pig.bags
Enumerate a bag, appending to each tuple its index within the bag.
Enumerate() - Constructor for class datafu.pig.bags.Enumerate
Enumerate(String) - Constructor for class datafu.pig.bags.Enumerate
exec(Tuple) - Method in class datafu.pig.bags.BagConcat
exec(Tuple) - Method in class datafu.pig.bags.BagGroup
exec(Tuple) - Method in class datafu.pig.bags.BagLeftOuterJoin
exec(Tuple) - Method in class datafu.pig.bags.BagSplit
exec(Tuple) - Method in class datafu.pig.bags.EmptyBagToNull
exec(Tuple) - Method in class datafu.pig.bags.EmptyBagToNullFields
exec(Tuple) - Method in class datafu.pig.bags.NullToEmptyBag
exec(Tuple) - Method in class datafu.pig.bags.UnorderedPairs
exec(Tuple) - Method in class datafu.pig.sampling.ReservoirSample
exec(Tuple) - Method in class datafu.pig.sampling.ReservoirSample.Final
exec(Tuple) - Method in class datafu.pig.sampling.ReservoirSample.Initial
exec(Tuple) - Method in class datafu.pig.sampling.ReservoirSample.Intermediate
exec(Tuple) - Method in class datafu.pig.sampling.SampleByKey
exec(Tuple) - Method in class datafu.pig.sampling.SimpleRandomSample.Final
exec(Tuple) - Method in class datafu.pig.sampling.SimpleRandomSample.Initial
exec(Tuple) - Method in class datafu.pig.sampling.SimpleRandomSample.Intermediate
exec(Tuple) - Method in class datafu.pig.sampling.SimpleRandomSampleWithReplacementElect.Final
exec(Tuple) - Method in class datafu.pig.sampling.SimpleRandomSampleWithReplacementElect.Initial
exec(Tuple) - Method in class datafu.pig.sampling.SimpleRandomSampleWithReplacementElect.Intermediate
exec(Tuple) - Method in class datafu.pig.sampling.SimpleRandomSampleWithReplacementVote
exec(Tuple) - Method in class datafu.pig.sampling.WeightedSample
exec(Tuple) - Method in class datafu.pig.sets.SetDifference
exec(Tuple) - Method in class datafu.pig.sets.SetIntersect
exec(Tuple) - Method in class datafu.pig.sets.SetUnion
exec(Tuple) - Method in class datafu.pig.stats.MarkovPairs
exec(Tuple) - Method in class datafu.pig.stats.VAR
exec(Tuple) - Method in class datafu.pig.stats.VAR.Final
exec(Tuple) - Method in class datafu.pig.stats.VAR.Initial
exec(Tuple) - Method in class datafu.pig.stats.VAR.Intermediate
exec(Tuple) - Method in class datafu.pig.util.AssertUDF
exec(Tuple) - Method in class datafu.pig.util.Coalesce
exec(Tuple) - Method in class datafu.pig.util.InUDF
exec(Tuple) - Method in class datafu.pig.util.SimpleEvalFunc
exec(Tuple) - Method in class datafu.pig.util.TransposeTupleToBag


FAILURE_RATE - Static variable in class datafu.pig.sampling.SimpleRandomSampleWithReplacementVote
FieldNotFound - Exception in datafu.pig.util
Thrown by AliasableEvalFunc when attempting to access an unknown field by name.
FieldNotFound() - Constructor for exception datafu.pig.util.FieldNotFound
FieldNotFound(String) - Constructor for exception datafu.pig.util.FieldNotFound
find_cumsum_interval(double[], double, int, int) - Method in class datafu.pig.sampling.WeightedSample
FirstTupleFromBag - Class in datafu.pig.bags
Returns the first tuple from a bag.
FirstTupleFromBag() - Constructor for class datafu.pig.bags.FirstTupleFromBag


getAlpha() - Method in class datafu.pig.linkanalysis.PageRankImpl
Gets the page rank alpha value.
getArgToFuncMapping() - Method in class datafu.pig.stats.VAR
getBag(Tuple, String) - Method in class datafu.pig.util.AliasableEvalFunc
getBoolean(Tuple, String) - Method in class datafu.pig.util.AliasableEvalFunc
getContextProperties() - Method in class datafu.pig.util.ContextualEvalFunc
Helper method to return the context properties for this class.
getData() - Method in exception datafu.pig.util.DataFuException
Gets data relevant to this exception.
getDouble(Tuple, String) - Method in class datafu.pig.util.AliasableEvalFunc
getDouble(Tuple, String, Double) - Method in class datafu.pig.util.AliasableEvalFunc
getEdgeCachingThreshold() - Method in class datafu.pig.linkanalysis.PageRankImpl
Gets the number of edges past which they will be cached on disk instead of in memory.
getFieldAliases() - Method in class datafu.pig.util.AliasableEvalFunc
Field aliases are generated from the input schema.
Each alias maps to a bag position.
Inner bags and tuples have aliases of the form outer.inner.foo.
getFieldAliases() - Method in exception datafu.pig.util.DataFuException
Gets field aliases for a UDF which may be relevant to this exception.
getFinal() - Method in class datafu.pig.sampling.ReservoirSample
getFinal() - Method in class datafu.pig.sampling.SimpleRandomSample
getFinal() - Method in class datafu.pig.sampling.SimpleRandomSampleWithReplacementElect
getFinal() - Method in class datafu.pig.stats.VAR
getFloat(Tuple, String) - Method in class datafu.pig.util.AliasableEvalFunc
getFloat(Tuple, String, Float) - Method in class datafu.pig.util.AliasableEvalFunc
getInitial() - Method in class datafu.pig.sampling.ReservoirSample
getInitial() - Method in class datafu.pig.sampling.SimpleRandomSample
getInitial() - Method in class datafu.pig.sampling.SimpleRandomSampleWithReplacementElect
getInitial() - Method in class datafu.pig.stats.VAR
getInstanceName() - Method in class datafu.pig.util.ContextualEvalFunc
getInstanceProperties() - Method in class datafu.pig.util.ContextualEvalFunc
Helper method to return the context properties for this instance of this class.
getInteger(Tuple, String) - Method in class datafu.pig.util.AliasableEvalFunc
getInteger(Tuple, String, Integer) - Method in class datafu.pig.util.AliasableEvalFunc
getIntermed() - Method in class datafu.pig.sampling.ReservoirSample
getIntermed() - Method in class datafu.pig.sampling.SimpleRandomSample
getIntermed() - Method in class datafu.pig.sampling.SimpleRandomSampleWithReplacementElect
getIntermed() - Method in class datafu.pig.stats.VAR
getLong(Tuple, String) - Method in class datafu.pig.util.AliasableEvalFunc
getLong(Tuple, String, Long) - Method in class datafu.pig.util.AliasableEvalFunc
getNodeBias(int) - Method in class datafu.pig.linkanalysis.PageRankImpl
getNodeIds() - Method in class datafu.pig.linkanalysis.PageRankImpl
getNodeRank(int) - Method in class datafu.pig.linkanalysis.PageRankImpl
getNQuantiles(int) - Static method in class datafu.pig.stats.QuantileUtil
getObject(Tuple, String) - Method in class datafu.pig.util.AliasableEvalFunc
getOutputSchema(Schema) - Method in class datafu.pig.bags.BagGroup
getOutputSchema(Schema) - Method in class datafu.pig.bags.BagLeftOuterJoin
getOutputSchema(Schema) - Method in class datafu.pig.util.AliasableEvalFunc
Specify the output schema as in EvalFunc.outputSchema(Schema).
getOutputSchema(Schema) - Method in class datafu.pig.util.Coalesce
getOutputSchema(Schema) - Method in class datafu.pig.util.TransposeTupleToBag
getPosition(String) - Method in class datafu.pig.util.AliasableEvalFunc
getPosition(String, String) - Method in class datafu.pig.util.AliasableEvalFunc
getPrefixedAliasName(String, String) - Method in class datafu.pig.util.AliasableEvalFunc
getQuantilesFromParams(String...) - Static method in class datafu.pig.stats.QuantileUtil
getReturnType() - Method in class datafu.pig.util.SimpleEvalFunc
getString(Tuple, String) - Method in class datafu.pig.util.AliasableEvalFunc
getString(Tuple, String, String) - Method in class datafu.pig.util.AliasableEvalFunc
getTotalRankChange() - Method in class datafu.pig.linkanalysis.PageRankImpl
getValue() - Method in class datafu.pig.bags.CountEach
getValue() - Method in class datafu.pig.bags.DistinctBy
getValue() - Method in class datafu.pig.bags.Enumerate
getValue() - Method in class datafu.pig.linkanalysis.PageRank
getValue() - Method in class datafu.pig.sampling.ReservoirSample
getValue() - Method in class datafu.pig.sessions.SessionCount
getValue() - Method in class datafu.pig.sessions.Sessionize
getValue() - Method in class datafu.pig.stats.HyperLogLogPlusPlus
getValue() - Method in class datafu.pig.stats.StreamingQuantile
getValue() - Method in class datafu.pig.stats.VAR


HaversineDistInMiles - Class in datafu.pig.geo
Computes the distance (in miles) between two latitude-longitude pairs using the Haversine formula.
HaversineDistInMiles() - Constructor for class datafu.pig.geo.HaversineDistInMiles
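A minimal sketch of the Haversine formula in plain Java follows. The class name `HaversineSketch` and the radius value of 3958.8 miles are assumptions for illustration; the value of the library's EARTH_RADIUS constant is not stated in this index and may differ.

```java
// Hypothetical helper for illustration only; not the DataFu UDF.
final class HaversineSketch {

    // Assumed mean Earth radius in miles; the library's EARTH_RADIUS may use another value.
    static final double EARTH_RADIUS_MILES = 3958.8;

    /** Great-circle distance in miles between two points given in decimal degrees. */
    static double distanceInMiles(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double sinLat = Math.sin(dLat / 2);
        double sinLon = Math.sin(dLon / 2);
        // a is the squared half-chord length between the two points.
        double a = sinLat * sinLat
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2)) * sinLon * sinLon;
        // Clamp 1 - a at zero to guard against rounding slightly past 1.
        double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(Math.max(0.0, 1 - a)));
        return EARTH_RADIUS_MILES * c;
    }

    public static void main(String[] args) {
        // Roughly San Francisco to New York.
        System.out.println(distanceInMiles(37.7749, -122.4194, 40.7128, -74.0060));
    }
}
```

The argument order (latitude, longitude, latitude, longitude) is chosen for the sketch and is not taken from the UDF's documentation.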
HyperLogLogPlusPlus - Class in datafu.pig.stats
A UDF that applies the HyperLogLog++ cardinality estimation algorithm.
HyperLogLogPlusPlus() - Constructor for class datafu.pig.stats.HyperLogLogPlusPlus
Constructs a HyperLogLog++ estimator.
HyperLogLogPlusPlus(String) - Constructor for class datafu.pig.stats.HyperLogLogPlusPlus
Constructs a HyperLogLog++ estimator.


In - Class in datafu.pig.util
Deprecated. Use InUDF instead.
In() - Constructor for class datafu.pig.util.In
init() - Method in class datafu.pig.linkanalysis.PageRankImpl
init(ProgressIndicator) - Method in class datafu.pig.linkanalysis.PageRankImpl
IntToBool - Class in datafu.pig.util
UDF which converts an Integer to a Boolean.
IntToBool() - Constructor for class datafu.pig.util.IntToBool
InUDF - Class in datafu.pig.util
Similar to the SQL IN function, this function provides a convenient way to filter using a logical disjunction over many values.
InUDF() - Constructor for class datafu.pig.util.InUDF
isEdgeDiskCachingEnabled() - Method in class datafu.pig.linkanalysis.PageRankImpl
Gets whether edge disk caching is enabled.
isNodeBiasingEnabled() - Method in class datafu.pig.linkanalysis.PageRankImpl
isUsingEdgeDiskCache() - Method in class datafu.pig.linkanalysis.PageRankImpl
Gets whether disk is being used to cache edges.


MarkovPairs - Class in datafu.pig.stats
Accepts a bag of tuples, with user supplied ordering, and generates pairs that can be used for a Markov chain analysis.
MarkovPairs() - Constructor for class datafu.pig.stats.MarkovPairs
MarkovPairs(String) - Constructor for class datafu.pig.stats.MarkovPairs
MD5 - Class in datafu.pig.hash
Computes the MD5 value of a string and outputs it in hex (by default).
MD5() - Constructor for class datafu.pig.hash.MD5
MD5(String) - Constructor for class datafu.pig.hash.MD5
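Hashing a string to a hex-encoded MD5 digest can be sketched with the standard `java.security.MessageDigest` API. The class name `Md5HexSketch` and the choice of UTF-8 as the input encoding are assumptions for this example; the index does not say which encoding the UDF uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not the DataFu UDF.
final class Md5HexSketch {

    /** Returns the MD5 digest of the input as a 32-character lowercase hex string. */
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            // UTF-8 is an assumption made for this sketch.
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to an unsigned value so negative bytes format as two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every conforming Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("hello")); // prints 5d41402abc4b2a76b9719d911017c592
    }
}
```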
Median - Class in datafu.pig.stats
Computes the median for a sorted input bag, using type R-2 estimation.
Median() - Constructor for class datafu.pig.stats.Median


nextIteration(ProgressIndicator) - Method in class datafu.pig.linkanalysis.PageRankImpl
nextIteration() - Method in class datafu.pig.linkanalysis.PageRankImpl
nodeCount() - Method in class datafu.pig.linkanalysis.PageRankImpl
NullToEmptyBag - Class in datafu.pig.bags
Returns an empty bag if the input is null; otherwise, returns the input bag unchanged.
NullToEmptyBag() - Constructor for class datafu.pig.bags.NullToEmptyBag


OUTPUT_BAG_NAME_PREFIX - Static variable in class datafu.pig.sampling.SimpleRandomSampleWithReplacementElect
Prefix for the output bag name.
OUTPUT_BAG_NAME_PREFIX - Static variable in class datafu.pig.sampling.SimpleRandomSampleWithReplacementVote
outputSchema(Schema) - Method in class datafu.pig.bags.AppendToBag
outputSchema(Schema) - Method in class datafu.pig.bags.BagConcat
outputSchema(Schema) - Method in class datafu.pig.bags.BagSplit
outputSchema(Schema) - Method in class datafu.pig.bags.CountEach
outputSchema(Schema) - Method in class datafu.pig.bags.DistinctBy
outputSchema(Schema) - Method in class datafu.pig.bags.EmptyBagToNull
outputSchema(Schema) - Method in class datafu.pig.bags.EmptyBagToNullFields
outputSchema(Schema) - Method in class datafu.pig.bags.Enumerate
outputSchema(Schema) - Method in class datafu.pig.bags.FirstTupleFromBag
outputSchema(Schema) - Method in class datafu.pig.bags.NullToEmptyBag
outputSchema(Schema) - Method in class datafu.pig.bags.PrependToBag
outputSchema(Schema) - Method in class datafu.pig.bags.ReverseEnumerate
outputSchema(Schema) - Method in class datafu.pig.bags.UnorderedPairs
outputSchema(Schema) - Method in class datafu.pig.geo.HaversineDistInMiles
outputSchema(Schema) - Method in class datafu.pig.linkanalysis.PageRank
outputSchema(Schema) - Method in class datafu.pig.random.RandInt
outputSchema(Schema) - Method in class datafu.pig.sampling.ReservoirSample
outputSchema(Schema) - Method in class datafu.pig.sampling.SimpleRandomSample
outputSchema(Schema) - Method in class datafu.pig.sampling.SimpleRandomSampleWithReplacementElect
outputSchema(Schema) - Method in class datafu.pig.sampling.SimpleRandomSampleWithReplacementVote
outputSchema(Schema) - Method in class datafu.pig.sampling.WeightedSample
outputSchema(Schema) - Method in class datafu.pig.sessions.Sessionize
outputSchema(Schema) - Method in class datafu.pig.stats.HyperLogLogPlusPlus
outputSchema(Schema) - Method in class datafu.pig.stats.MarkovPairs
outputSchema(Schema) - Method in class datafu.pig.stats.Quantile
outputSchema(Schema) - Method in class datafu.pig.stats.StreamingQuantile
outputSchema(Schema) - Method in class datafu.pig.stats.VAR
outputSchema(Schema) - Method in class datafu.pig.stats.WilsonBinConf
outputSchema(Schema) - Method in class datafu.pig.util.AliasableEvalFunc
A wrapper method which captures the schema and then calls getOutputSchema.
outputSchema(Schema) - Method in class datafu.pig.util.SimpleEvalFunc
Overrides outputSchema so that the input schema can be verified at Pig compile time instead of at runtime.


PageRank - Class in datafu.pig.linkanalysis
A UDF which implements PageRank.
PageRank() - Constructor for class datafu.pig.linkanalysis.PageRank
PageRank(String...) - Constructor for class datafu.pig.linkanalysis.PageRank
PageRankImpl - Class in datafu.pig.linkanalysis
An implementation of PageRank, used by the PageRank UDF.
PageRankImpl() - Constructor for class datafu.pig.linkanalysis.PageRankImpl
POSITION_FIELD_NAME - Static variable in class datafu.pig.sampling.SimpleRandomSampleWithReplacementVote
PrependToBag - Class in datafu.pig.bags
Prepends a tuple to a bag.
PrependToBag() - Constructor for class datafu.pig.bags.PrependToBag


Quantile - Class in datafu.pig.stats
Computes quantiles for a sorted input bag, using type R-2 estimation.
Quantile(String...) - Constructor for class datafu.pig.stats.Quantile
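Type R-2 estimation, mentioned for both Median and Quantile, takes the inverse of the empirical distribution function and averages the two neighbouring values wherever n*p lands exactly on an integer. The sketch below illustrates that rule on a sorted array; the class name `QuantileR2Sketch` and the array-based signature are assumptions, and this is not the library's code.

```java
// Hypothetical helper for illustration only; not the DataFu UDF.
final class QuantileR2Sketch {

    /** Type R-2 quantile of an ascending-sorted, non-empty array for p in [0, 1]. */
    static double quantile(double[] sorted, double p) {
        if (sorted.length == 0) {
            throw new IllegalArgumentException("input must not be empty");
        }
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1]");
        }
        int n = sorted.length;
        double np = n * p;
        // 1-based ranks ceil(np) and floor(np + 1); they differ only when np is an integer.
        int lo = (int) Math.ceil(np);
        int hi = (int) Math.floor(np + 1.0);
        // Clamp so that p = 0 yields the minimum and p = 1 yields the maximum.
        lo = Math.max(1, Math.min(n, lo));
        hi = Math.max(1, Math.min(n, hi));
        return (sorted[lo - 1] + sorted[hi - 1]) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(quantile(new double[] {1, 2, 3, 4}, 0.5)); // prints 2.5
    }
}
```

The input must already be sorted, which matches the "sorted input bag" requirement stated for Median and Quantile.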
QuantileUtil - Class in datafu.pig.stats
Methods used by Quantile.
QuantileUtil() - Constructor for class datafu.pig.stats.QuantileUtil


RandInt - Class in datafu.pig.random
Generates a uniformly distributed integer between two bounds.
RandInt() - Constructor for class datafu.pig.random.RandInt
ReservoirSample - Class in datafu.pig.sampling
Performs a simple random sample using an in-memory reservoir to produce a uniformly random sample of a given size.
ReservoirSample(String) - Constructor for class datafu.pig.sampling.ReservoirSample
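Reservoir sampling keeps a fixed-size buffer and replaces entries with decreasing probability as more items arrive, so the input never needs to fit in memory. A minimal sketch of the classic single-pass variant (often called Algorithm R) follows; the class `ReservoirSketch` and its signature are assumptions for illustration, and the UDF's actual implementation may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not the DataFu UDF.
final class ReservoirSketch {

    /** Returns a uniform random sample of up to k items from a single pass over the input. */
    static <T> List<T> sample(Iterable<T> items, int k, Random rng) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        List<T> reservoir = new ArrayList<>(k);
        long seen = 0;
        for (T item : items) {
            seen++;
            if (reservoir.size() < k) {
                // Fill phase: the first k items are always kept.
                reservoir.add(item);
            } else if (k > 0) {
                // Draw a slot uniformly from [0, seen); the item survives with probability k / seen.
                long slot = (long) (rng.nextDouble() * seen);
                if (slot < k) {
                    reservoir.set((int) slot, item);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> input = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            input.add(i);
        }
        System.out.println(sample(input, 5, new Random()));
    }
}
```

When the input holds fewer than k items, the sample is simply the whole input in its original order.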
ReservoirSample.Final - Class in datafu.pig.sampling
ReservoirSample.Final() - Constructor for class datafu.pig.sampling.ReservoirSample.Final
ReservoirSample.Final(String) - Constructor for class datafu.pig.sampling.ReservoirSample.Final
ReservoirSample.Initial - Class in datafu.pig.sampling
ReservoirSample.Initial() - Constructor for class datafu.pig.sampling.ReservoirSample.Initial
ReservoirSample.Initial(String) - Constructor for class datafu.pig.sampling.ReservoirSample.Initial
ReservoirSample.Intermediate - Class in datafu.pig.sampling
ReservoirSample.Intermediate() - Constructor for class datafu.pig.sampling.ReservoirSample.Intermediate
ReservoirSample.Intermediate(String) - Constructor for class datafu.pig.sampling.ReservoirSample.Intermediate
ReverseEnumerate - Class in datafu.pig.bags
Enumerate a bag, appending to each tuple its index within the bag, with indices being produced in descending order.
ReverseEnumerate() - Constructor for class datafu.pig.bags.ReverseEnumerate
ReverseEnumerate(String) - Constructor for class datafu.pig.bags.ReverseEnumerate


SampleByKey - Class in datafu.pig.sampling
Provides a way of sampling tuples based on certain fields.
SampleByKey(String) - Constructor for class datafu.pig.sampling.SampleByKey
SampleByKey(String, String) - Constructor for class datafu.pig.sampling.SampleByKey
SCORE_FIELD_NAME - Static variable in class datafu.pig.sampling.SimpleRandomSampleWithReplacementVote
SessionCount - Class in datafu.pig.sessions
Performs a count of events, ignoring events which occur within the same time window.
SessionCount(String) - Constructor for class datafu.pig.sessions.SessionCount
Sessionize - Class in datafu.pig.sessions
Sessionizes an input stream, appending a session ID to each tuple.
Sessionize(String) - Constructor for class datafu.pig.sessions.Sessionize
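Sessionization can be sketched as a single pass over time-ordered events that starts a new session whenever the gap to the previous event exceeds a threshold. The example below is a simplified illustration under stated assumptions: the class `SessionizeSketch` is hypothetical, timestamps are epoch milliseconds sorted ascending, a gap strictly greater than the threshold starts a new session, and sessions are numbered with integers rather than the session IDs the UDF appends.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not the DataFu UDF.
final class SessionizeSketch {

    /**
     * Assigns a session index to each event. Timestamps must be sorted ascending.
     * A new session starts when the gap to the previous event exceeds maxGapMillis.
     */
    static int[] assignSessions(long[] sortedTimestamps, long maxGapMillis) {
        int[] sessionIds = new int[sortedTimestamps.length];
        int current = 0;
        for (int i = 1; i < sortedTimestamps.length; i++) {
            if (sortedTimestamps[i] - sortedTimestamps[i - 1] > maxGapMillis) {
                current++;
            }
            sessionIds[i] = current;
        }
        return sessionIds;
    }

    public static void main(String[] args) {
        long[] events = {0L, 1_000L, 5_000L, 5_500L};
        System.out.println(Arrays.toString(assignSessions(events, 2_000L))); // prints [0, 0, 1, 1]
    }
}
```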
setAlpha(float) - Method in class datafu.pig.linkanalysis.PageRankImpl
Sets the PageRank alpha value (default is 0.85).
setData(Object) - Method in exception datafu.pig.util.DataFuException
Sets data relevant to this exception.
SetDifference - Class in datafu.pig.sets
Computes the set difference of two or more bags.
SetDifference() - Constructor for class datafu.pig.sets.SetDifference
setEdgeCachingThreshold(long) - Method in class datafu.pig.linkanalysis.PageRankImpl
Set the number of edges past which they will be cached on disk instead of in memory.
setFieldAliases(Map<String, Integer>) - Method in exception datafu.pig.util.DataFuException
Sets field aliases for a UDF which may be relevant to this exception.
SetIntersect - Class in datafu.pig.sets
Computes the set intersection of two or more bags.
SetIntersect() - Constructor for class datafu.pig.sets.SetIntersect
setNodeBias(int, float) - Method in class datafu.pig.linkanalysis.PageRankImpl
setUDFContextSignature(String) - Method in class datafu.pig.sampling.SampleByKey
setUDFContextSignature(String) - Method in class datafu.pig.util.ContextualEvalFunc
SetUnion - Class in datafu.pig.sets
Computes the set union of two or more bags.
SetUnion() - Constructor for class datafu.pig.sets.SetUnion
SHA - Class in datafu.pig.hash
SHA() - Constructor for class datafu.pig.hash.SHA
SHA(String) - Constructor for class datafu.pig.hash.SHA
SimpleEvalFunc<T> - Class in datafu.pig.util
Uses reflection to make writing simple wrapper Pig UDFs easier.
SimpleEvalFunc() - Constructor for class datafu.pig.util.SimpleEvalFunc
SimpleRandomSample - Class in datafu.pig.sampling
Scalable simple random sampling.
SimpleRandomSample() - Constructor for class datafu.pig.sampling.SimpleRandomSample
SimpleRandomSample(String) - Constructor for class datafu.pig.sampling.SimpleRandomSample
SimpleRandomSample.Final - Class in datafu.pig.sampling
SimpleRandomSample.Final() - Constructor for class datafu.pig.sampling.SimpleRandomSample.Final
SimpleRandomSample.Final(String) - Constructor for class datafu.pig.sampling.SimpleRandomSample.Final
SimpleRandomSample.Initial - Class in datafu.pig.sampling
SimpleRandomSample.Initial() - Constructor for class datafu.pig.sampling.SimpleRandomSample.Initial
SimpleRandomSample.Initial(String) - Constructor for class datafu.pig.sampling.SimpleRandomSample.Initial
SimpleRandomSample.Intermediate - Class in datafu.pig.sampling
SimpleRandomSample.Intermediate() - Constructor for class datafu.pig.sampling.SimpleRandomSample.Intermediate
SimpleRandomSample.Intermediate(String) - Constructor for class datafu.pig.sampling.SimpleRandomSample.Intermediate
SimpleRandomSampleWithReplacementElect - Class in datafu.pig.sampling
Select the candidate with the smallest score for each position from the candidates proposed by SimpleRandomSampleWithReplacementVote.
SimpleRandomSampleWithReplacementElect() - Constructor for class datafu.pig.sampling.SimpleRandomSampleWithReplacementElect
SimpleRandomSampleWithReplacementElect.Final - Class in datafu.pig.sampling
SimpleRandomSampleWithReplacementElect.Final() - Constructor for class datafu.pig.sampling.SimpleRandomSampleWithReplacementElect.Final
SimpleRandomSampleWithReplacementElect.Initial - Class in datafu.pig.sampling
SimpleRandomSampleWithReplacementElect.Initial() - Constructor for class datafu.pig.sampling.SimpleRandomSampleWithReplacementElect.Initial
SimpleRandomSampleWithReplacementElect.Intermediate - Class in datafu.pig.sampling
SimpleRandomSampleWithReplacementElect.Intermediate() - Constructor for class datafu.pig.sampling.SimpleRandomSampleWithReplacementElect.Intermediate
SimpleRandomSampleWithReplacementVote - Class in datafu.pig.sampling
Scalable simple random sampling with replacement (ScaSRSWR).
SimpleRandomSampleWithReplacementVote() - Constructor for class datafu.pig.sampling.SimpleRandomSampleWithReplacementVote
StreamingMedian - Class in datafu.pig.stats
Computes the approximate median for a (not necessarily sorted) input bag, using the Munro-Paterson algorithm.
StreamingMedian() - Constructor for class datafu.pig.stats.StreamingMedian
StreamingQuantile - Class in datafu.pig.stats
Computes approximate quantiles for a (not necessarily sorted) input bag, using the Munro-Paterson algorithm.
StreamingQuantile(String...) - Constructor for class datafu.pig.stats.StreamingQuantile
sum(Tuple) - Static method in class datafu.pig.stats.VAR
sumSquare(Tuple) - Static method in class datafu.pig.stats.VAR


toString() - Method in exception datafu.pig.util.DataFuException
TransposeTupleToBag - Class in datafu.pig.util
Performs a transpose on a tuple, resulting in a bag of key, value fields where the key is the column name and the value is the value of that column in the tuple.
TransposeTupleToBag() - Constructor for class datafu.pig.util.TransposeTupleToBag
tupleFactory - Static variable in class datafu.pig.sampling.SimpleRandomSampleWithReplacementElect


UnorderedPairs - Class in datafu.pig.bags
Generates pairs of all items in a bag.
UnorderedPairs() - Constructor for class datafu.pig.bags.UnorderedPairs
UserAgentClassify - Class in datafu.pig.urls
Given a user agent string, this UDF classifies the client as 'mobile' or 'desktop'.
UserAgentClassify() - Constructor for class datafu.pig.urls.UserAgentClassify


VAR - Class in datafu.pig.stats
Generates the variance of a set of values.
VAR() - Constructor for class datafu.pig.stats.VAR
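The count, sum, sumSquare and combine methods listed for VAR, together with its Initial, Intermediate and Final classes, suggest an algebraic computation built from partial aggregates. The sketch below shows one way such partials can be merged and turned into a variance. It is an assumption-laden illustration: the class `VarianceSketch`, the array layout of a partial, and the choice of population (rather than sample) variance are all made up for the example.

```java
// Hypothetical helper for illustration only; not the DataFu UDF.
final class VarianceSketch {

    /** Builds a partial aggregate {count, sum, sumOfSquares} from raw values. */
    static double[] partial(double[] values) {
        double sum = 0.0;
        double sumSquares = 0.0;
        for (double v : values) {
            sum += v;
            sumSquares += v * v;
        }
        return new double[] {values.length, sum, sumSquares};
    }

    /** Merges two partial aggregates by adding them component-wise. */
    static double[] combine(double[] a, double[] b) {
        return new double[] {a[0] + b[0], a[1] + b[1], a[2] + b[2]};
    }

    /** Population variance, computed as E[x^2] - (E[x])^2. */
    static double variance(double[] partial) {
        double n = partial[0];
        if (n == 0) {
            throw new IllegalArgumentException("no values");
        }
        double mean = partial[1] / n;
        return partial[2] / n - mean * mean;
    }

    public static void main(String[] args) {
        double[] left = partial(new double[] {1, 2});
        double[] right = partial(new double[] {3, 4});
        System.out.println(variance(combine(left, right))); // prints 1.25
    }
}
```

Because partials merge by simple addition, they can be computed independently on separate chunks of data and combined afterwards. The sum-of-squares form can lose precision when the mean is large relative to the spread, so this sketch is not suitable for every input.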
VAR.Final - Class in datafu.pig.stats
VAR.Final() - Constructor for class datafu.pig.stats.VAR.Final
VAR.Initial - Class in datafu.pig.stats
VAR.Initial() - Constructor for class datafu.pig.stats.VAR.Initial
VAR.Intermediate - Class in datafu.pig.stats
VAR.Intermediate() - Constructor for class datafu.pig.stats.VAR.Intermediate


WeightedSample - Class in datafu.pig.sampling
Performs weighted Bernoulli sampling on a bag.
WeightedSample() - Constructor for class datafu.pig.sampling.WeightedSample
WeightedSample(String) - Constructor for class datafu.pig.sampling.WeightedSample
WilsonBinConf - Class in datafu.pig.stats
Computes the Wilson score confidence interval for a binomial proportion.
WilsonBinConf(double) - Constructor for class datafu.pig.stats.WilsonBinConf
WilsonBinConf(String) - Constructor for class datafu.pig.stats.WilsonBinConf
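The Wilson score interval for a binomial proportion can be sketched as follows. This is an illustration under assumptions: the class `WilsonSketch` is hypothetical, and the normal quantile z is passed in directly (for example 1.96 for a 95% interval) because the Java standard library has no inverse normal CDF. The UDF's constructors instead take a single double or String argument, whose meaning is not described in this index.

```java
// Hypothetical helper for illustration only; not the DataFu UDF.
final class WilsonSketch {

    /** Returns {lower, upper} bounds of the Wilson score interval. */
    static double[] interval(long successes, long trials, double z) {
        if (trials <= 0 || successes < 0 || successes > trials) {
            throw new IllegalArgumentException("require 0 <= successes <= trials and trials > 0");
        }
        double n = trials;
        double pHat = successes / n;
        double z2 = z * z;
        double denom = 1.0 + z2 / n;
        // The interval is centred on a value pulled from pHat towards 1/2.
        double center = (pHat + z2 / (2.0 * n)) / denom;
        double halfWidth = z * Math.sqrt(pHat * (1.0 - pHat) / n + z2 / (4.0 * n * n)) / denom;
        return new double[] {center - halfWidth, center + halfWidth};
    }

    public static void main(String[] args) {
        double[] ci = interval(5, 10, 1.96);
        System.out.println(ci[0] + " .. " + ci[1]);
    }
}
```

Unlike the simple normal approximation, the bounds stay within [0, 1] even when the observed proportion is 0 or 1.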


Matthew Hayes, Sam Shah