Package datafu.pig.sampling

Sampling UDFs, including weighted sampling, reservoir sampling, and sampling by key.


Class Summary
ReservoirSample Maintains an in-memory reservoir to produce a uniformly random sample of a given size.
ReservoirSample.Final  
ReservoirSample.Initial  
ReservoirSample.Intermediate  
SampleByKey Provides a way of sampling tuples based on certain fields.
WeightedSample Creates a new bag by weighted sampling without replacement from the input bag.
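
To illustrate the idea behind an in-memory reservoir, here is a minimal, self-contained sketch of classic reservoir sampling (often called Algorithm R). It is not the ReservoirSample source, and it leaves out the Initial/Intermediate/Final stages entirely. The class name ReservoirSketch, the method sample, and its parameters are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration class; not part of datafu.pig.sampling.
final class ReservoirSketch {

    // Returns a uniformly random sample of at most k items from the stream,
    // holding no more than k items in memory at any time.
    static <T> List<T> sample(Iterable<T> stream, int k, Random rng) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        List<T> reservoir = new ArrayList<>(k);
        long seen = 0;
        for (T item : stream) {
            seen++;
            if (reservoir.size() < k) {
                // The first k items fill the reservoir directly.
                reservoir.add(item);
            } else {
                // Draw j uniformly from [0, seen); the new item replaces
                // slot j when j < k, which happens with probability k / seen.
                long j = (long) (rng.nextDouble() * seen);
                if (j < k) {
                    reservoir.set((int) j, item);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            data.add(i);
        }
        // Prints five values drawn from 0..999.
        System.out.println(sample(data, 5, new Random(42)));
    }
}
```

If the stream holds fewer than k items, the sketch simply returns all of them in their original order.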
 

Package datafu.pig.sampling Description

Sampling UDFs, including weighted sampling, reservoir sampling, and sampling by key.
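
As a rough illustration of weighted sampling without replacement, the sketch below repeatedly draws one of the remaining items with probability proportional to its weight, then removes it from the pool. This is one straightforward way to do it and is not taken from the WeightedSample source. The class name WeightedSampleSketch, the limit parameter, and the choice to skip zero-weight items are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration class; not part of datafu.pig.sampling.
final class WeightedSampleSketch {

    // Returns up to `limit` distinct indices into `weights`, in draw order.
    // At each draw, a remaining index i is chosen with probability
    // weights[i] / (sum of the remaining weights).
    // Assumption for this sketch: items with zero weight are never drawn.
    static List<Integer> sample(double[] weights, int limit, Random rng) {
        List<Integer> remaining = new ArrayList<>();
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] < 0) {
                throw new IllegalArgumentException("negative weight at index " + i);
            }
            if (weights[i] > 0) {
                remaining.add(i);
                total += weights[i];
            }
        }

        List<Integer> picked = new ArrayList<>();
        while (!remaining.isEmpty() && picked.size() < limit) {
            double target = rng.nextDouble() * total;
            // Default to the last position in case floating-point rounding
            // leaves target slightly above the accumulated sum.
            int pos = remaining.size() - 1;
            double acc = 0.0;
            for (int p = 0; p < remaining.size(); p++) {
                acc += weights[remaining.get(p)];
                if (target < acc) {
                    pos = p;
                    break;
                }
            }
            int idx = remaining.remove(pos); // remove by position, not by value
            total -= weights[idx];
            picked.add(idx);
        }
        return picked;
    }

    public static void main(String[] args) {
        double[] weights = {1.0, 2.0, 3.0, 4.0};
        // Prints two distinct indices; heavier items tend to appear earlier.
        System.out.println(sample(weights, 2, new Random(7)));
    }
}
```

Each draw scans the remaining items, so the sketch costs O(n) per draw. That is fine for small bags but would need a smarter structure for large ones.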



Author: Matthew Hayes, Sam Shah