public abstract class MetricUDF
extends org.apache.pig.EvalFunc<org.apache.pig.data.Tuple>
A UDF that, given a query vector, a threshold, and a bag of vectors, returns one of the tuples of the bag of vectors. For an example of its use, see datafu.pig.hash.lsh.CosineDistanceHash.
See Also: CosineDistanceHash
Modifier and Type | Field and Description |
---|---|
protected int | dim |
Constructor and Description |
---|
MetricUDF(java.lang.String sDim) Create a new Metric UDF with a given dimension. |
Modifier and Type | Method and Description |
---|---|
protected abstract double | dist(org.apache.commons.math.linear.RealVector v1, org.apache.commons.math.linear.RealVector v2) The distance metric used. |
org.apache.pig.data.Tuple | exec(org.apache.pig.data.Tuple input) This UDF expects a query vector as the first element, a threshold (double) as the second, and a bag of vectors as the third. |
org.apache.pig.impl.logicalLayer.schema.Schema | outputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input) Create the output schema, based on the input schema. |
Methods inherited from class org.apache.pig.EvalFunc: allowCompileTimeCalculation, finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, getSchemaType, getShipFiles, isAsynchronous, progress, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn
public MetricUDF(java.lang.String sDim)
sDim - dimension
protected abstract double dist(org.apache.commons.math.linear.RealVector v1, org.apache.commons.math.linear.RealVector v2)
v1 - first vector
v2 - second vector
public org.apache.pig.data.Tuple exec(org.apache.pig.data.Tuple input) throws java.io.IOException
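Expects a query vector as the first element, a threshold (double) as the second, and a bag of vectors as the third. Returns one of the tuples of the bag of vectors. For an example of its use, see datafu.pig.hash.lsh.CosineDistanceHash.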
Specified by: exec in class org.apache.pig.EvalFunc<org.apache.pig.data.Tuple>
Throws: java.io.IOException
See Also: CosineDistanceHash
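The contract described for exec can be sketched in plain Java. This is a minimal illustration, not DataFu's implementation: the class name ThresholdMatchSketch, the method firstWithin, and the Euclidean metric are hypothetical, plain double[] arrays stand in for Pig tuples and RealVector, and the selection rule (the first vector whose distance is below the threshold, or null if there is none) is an assumption, since the documentation only states that one tuple of the bag is returned.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a MetricUDF subclass.
// Plain double[] arrays replace Pig tuples and RealVector.
abstract class ThresholdMatchSketch {

    // Plays the role of MetricUDF.dist: the distance metric used.
    protected abstract double dist(double[] v1, double[] v2);

    // ASSUMED selection rule: return the first candidate whose distance
    // to the query is below the threshold, or null if none qualifies.
    double[] firstWithin(double[] query, double threshold, List<double[]> candidates) {
        for (double[] candidate : candidates) {
            if (dist(query, candidate) < threshold) {
                return candidate;
            }
        }
        return null;
    }

    // Example metric (Euclidean distance), chosen only for illustration.
    static ThresholdMatchSketch euclidean() {
        return new ThresholdMatchSketch() {
            @Override
            protected double dist(double[] v1, double[] v2) {
                double sum = 0.0;
                for (int i = 0; i < v1.length; i++) {
                    double d = v1[i] - v2[i];
                    sum += d * d;
                }
                return Math.sqrt(sum);
            }
        };
    }

    public static void main(String[] args) {
        double[] query = {0.0, 0.0};
        List<double[]> bag = Arrays.asList(
                new double[] {3.0, 4.0},   // distance 5.0, outside the threshold
                new double[] {1.0, 0.0},   // distance 1.0, first match
                new double[] {0.5, 0.0});  // distance 0.5, never reached
        double[] hit = euclidean().firstWithin(query, 2.0, bag);
        System.out.println(Arrays.toString(hit)); // prints [1.0, 0.0]
    }
}
```

Keeping dist abstract, as MetricUDF does, lets the same selection loop work with any metric a subclass supplies.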
public org.apache.pig.impl.logicalLayer.schema.Schema outputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input)
Overrides: outputSchema in class org.apache.pig.EvalFunc<org.apache.pig.data.Tuple>
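Returning to dist: subclasses supply the metric, and the class description points to CosineDistanceHash as a usage example. As a rough illustration of what such a metric might compute, the sketch below implements one common definition of cosine distance (1 minus cosine similarity) on plain arrays. The class and method names are hypothetical, double[] replaces RealVector, and DataFu's own cosine metric may differ in detail.

```java
// Hypothetical helper; not part of DataFu.
final class CosineDistanceSketch {

    private CosineDistanceSketch() {}

    // Cosine distance, taken here as 1 - (v1 . v2) / (|v1| * |v2|).
    // Yields NaN if either vector has zero length (norm 0).
    static double cosineDistance(double[] v1, double[] v2) {
        if (v1.length != v2.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0;
        double normSq1 = 0.0;
        double normSq2 = 0.0;
        for (int i = 0; i < v1.length; i++) {
            dot += v1[i] * v2[i];
            normSq1 += v1[i] * v1[i];
            normSq2 += v2[i] * v2[i];
        }
        return 1.0 - dot / (Math.sqrt(normSq1) * Math.sqrt(normSq2));
    }

    public static void main(String[] args) {
        // Orthogonal vectors have cosine similarity 0.
        System.out.println(cosineDistance(new double[] {1.0, 0.0}, new double[] {0.0, 1.0}));
        // Parallel vectors have cosine similarity 1.
        System.out.println(cosineDistance(new double[] {1.0, 0.0}, new double[] {2.0, 0.0}));
    }
}
```

Under this definition the result ranges from 0 for vectors pointing in the same direction to 2 for vectors pointing in opposite directions, which is the scale a threshold passed to exec would be compared against.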