edu.washington.cs.knowitall.extractor
Class Extractor<S,T>

java.lang.Object
  extended by edu.washington.cs.knowitall.extractor.Extractor<S,T>
Type Parameters:
S - the type of the source object to extract from
T - the type of the extractions
Direct Known Subclasses:
ChunkedArgumentExtractor, ExtractorComposition, ExtractorUnion, NpListExtractor, OrdinalPhraseExtractor, RegexRelationExtractor, RelationFirstNpChunkExtractor, SentenceExtractor

public abstract class Extractor<S,T>
extends java.lang.Object

An abstract class that defines the basic behavior of an extractor. An Extractor object extracts objects of type T from a source object of type S. Candidate extractions are first obtained by calling the extractCandidates(S source) method, which returns an Iterable object over extractions of type T. These extractions are then passed through a list of Mapper objects, each of which can filter or modify the extractions.

Other objects can use an Extractor object by calling the extract(S source) method, which returns an Iterable object of extractions after the Mappers have been applied. Mapper objects can be added to the list of Mappers by calling the addMapper(Mapper mapper) method. This adds the Mapper to the end of the list (i.e. it is the last one to be applied to the extractions).

Subclasses extending Extractor must implement the abstract extractCandidates method. As an example, this class can be used to implement a class for extracting String sentences from a String block of text. Mapper objects can then be added to filter the sentences by length, or to remove brackets from the sentences.
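The candidate-then-mappers flow described above can be sketched as follows. This is a minimal, self-contained illustration and not the library's source: `SketchMapper`, `SketchExtractor`, `SketchSentenceExtractor`, and `SketchDemo` are hypothetical stand-ins, and the real `Mapper` class may expose a different interface. The sentence splitting is deliberately naive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Mapper: filters or modifies a stream of extractions.
interface SketchMapper<T> {
    Iterable<T> map(Iterable<T> extractions);
}

// Hypothetical stand-in mirroring the behavior described for Extractor<S,T>.
abstract class SketchExtractor<S, T> {
    private final List<SketchMapper<T>> mappers = new ArrayList<>();

    // Appends a mapper; it runs after all previously added mappers.
    public void addMapper(SketchMapper<T> mapper) {
        mappers.add(mapper);
    }

    // Subclasses produce the raw candidate extractions here.
    protected abstract Iterable<T> extractCandidates(S source);

    // Passes the candidates through each mapper in insertion order.
    public Iterable<T> extract(S source) {
        Iterable<T> result = extractCandidates(source);
        for (SketchMapper<T> mapper : mappers) {
            result = mapper.map(result);
        }
        return result;
    }
}

// Naive sentence splitter: breaks the text after '.', '!' or '?'.
class SketchSentenceExtractor extends SketchExtractor<String, String> {
    @Override
    protected Iterable<String> extractCandidates(String source) {
        List<String> sentences = new ArrayList<>();
        for (String s : source.split("(?<=[.!?])\\s+")) {
            if (!s.isEmpty()) {
                sentences.add(s);
            }
        }
        return sentences;
    }
}

class SketchDemo {
    static List<String> run(String text) {
        SketchSentenceExtractor extractor = new SketchSentenceExtractor();
        // First mapper: strip square brackets from each sentence.
        extractor.addMapper(in -> {
            List<String> out = new ArrayList<>();
            for (String s : in) {
                out.add(s.replaceAll("[\\[\\]]", ""));
            }
            return out;
        });
        // Second mapper: keep only sentences of at least 10 characters.
        extractor.addMapper(in -> {
            List<String> out = new ArrayList<>();
            for (String s : in) {
                if (s.length() >= 10) {
                    out.add(s);
                }
            }
            return out;
        });
        List<String> result = new ArrayList<>();
        for (String s : extractor.extract(text)) {
            result.add(s);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [This is a longer sentence.]
        System.out.println(run("Hi. This is [a] longer sentence. Ok!"));
    }
}
```

Because mappers run in the order they were added, the length filter here sees sentences with the brackets already removed.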

Author:
afader

Constructor Summary
Extractor()
          Constructs a new extractor with no Mapper objects.
 
Method Summary
 void addMapper(Mapper<T> mapper)
          Adds a Mapper object to the end of the list of mappers.
 static <R,S,T> Extractor<R,T> compose(Extractor<R,S> rsExtractor, Extractor<S,T> stExtractor)
          Composes an R->S extractor with an S->T extractor to create an R->T extractor.
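 java.lang.Iterable<T> extract(java.lang.Iterable<S> sources)
          Returns an Iterable object over extractions from each of the sources.
 java.lang.Iterable<T> extract(S source)
          Returns an Iterable object over extractions from source.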
protected abstract  java.lang.Iterable<T> extractCandidates(S source)
          Extracts candidate extractions from the given source object.
 MapperList<T> getMappers()
          Returns a MapperList object containing the mappers assigned to this extractor.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Extractor

public Extractor()
Constructs a new extractor with no Mapper objects.

Method Detail

getMappers

public MapperList<T> getMappers()
Returns:
A MapperList object containing the mappers assigned to this extractor.

addMapper

public void addMapper(Mapper<T> mapper)
Adds a Mapper object to the end of the list of mappers. It will be the new final Mapper object applied to the extractions, after the existing Mappers have been applied.

Parameters:
mapper - The mapper to add.

extractCandidates

protected abstract java.lang.Iterable<T> extractCandidates(S source)
Extracts candidate extractions from the given source object. When the user calls the extract(S source) method, the extractCandidates(S source) method is used to generate a set of candidate extractions, which are then passed through each Mapper object assigned to the extractor.

Parameters:
source - The source to extract from.
Returns:
An Iterable object over the candidate extractions.

extract

public java.lang.Iterable<T> extract(S source)
Parameters:
source - the source object to extract from.
Returns:
an Iterable object over extractions from source.

extract

public java.lang.Iterable<T> extract(java.lang.Iterable<S> sources)
Parameters:
sources - a collection of source objects to extract from.
Returns:
an Iterable object over extractions from each of the sources.
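One way to picture the multi-source overload is as a concatenation of the per-source results. The sketch below makes that assumption explicit: `ExtractAllSketch` and `extractAll` are hypothetical names, a plain `Function` stands in for an extractor, and the results are collected eagerly for simplicity, whereas the library may well evaluate lazily.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class ExtractAllSketch {
    // Applies a single-source extraction function to every source and
    // concatenates the results in source order (eager, for simplicity).
    static <S, T> List<T> extractAll(Function<S, Iterable<T>> extractOne,
                                     Iterable<S> sources) {
        List<T> all = new ArrayList<>();
        for (S source : sources) {
            for (T extraction : extractOne.apply(source)) {
                all.add(extraction);
            }
        }
        return all;
    }

    public static void main(String[] args) {
        // Toy "extractor": splits a string into words on single spaces.
        Function<String, Iterable<String>> words = s -> Arrays.asList(s.split(" "));
        // prints [a, b, c]
        System.out.println(extractAll(words, Arrays.asList("a b", "c")));
    }
}
```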

compose

public static <R,S,T> Extractor<R,T> compose(Extractor<R,S> rsExtractor,
                                             Extractor<S,T> stExtractor)
Composes an R->S extractor with an S->T extractor to create an R->T extractor.

Type Parameters:
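R - the source type of the first extractor
S - the intermediate type, extracted by rsExtractor and used as the source for stExtractor
T - the type of the final extractions
Parameters:
rsExtractor - the extractor from R to S
stExtractor - the extractor from S to T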
Returns:
an extractor taking objects of type R and returning objects of type T
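Composition can be read as: run the R->S step, then feed every intermediate S into the S->T step and collect the results. The sketch below illustrates that reading only. `ComposeSketch` is a hypothetical name, plain `Function` objects stand in for extractors, and how the library's composition handles mappers or laziness is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class ComposeSketch {
    // Builds an R->T function by applying st to every S produced by rs.
    static <R, S, T> Function<R, Iterable<T>> compose(
            Function<R, Iterable<S>> rs, Function<S, Iterable<T>> st) {
        return r -> {
            List<T> out = new ArrayList<>();
            for (S intermediate : rs.apply(r)) {
                for (T extraction : st.apply(intermediate)) {
                    out.add(extraction);
                }
            }
            return out;
        };
    }

    public static void main(String[] args) {
        // Toy R->S step: split a text into "sentences" on ". ".
        Function<String, Iterable<String>> sentences =
                text -> Arrays.asList(text.split("\\. "));
        // Toy S->T step: split a sentence into words on single spaces.
        Function<String, Iterable<String>> words =
                s -> Arrays.asList(s.split(" "));
        // prints [a, b, c, d]
        System.out.println(compose(sentences, words).apply("a b. c d"));
    }
}
```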