All Peregrine libraries are available as Maven modules. The Maven Getting Started Guide provides a good introduction to Maven. Here you can find a sample property configuration file and sample client code that uses the Maven artifacts.
Building the Peregrine modules
All Peregrine modules can be built locally by checking out the source and building the projects. This is a three-step process.
- Check out the source code
svn checkout https://trac.nbic.nl/svn/data-mining/trunk peregrine
- Build the third-party modules
cd peregrine/3rd-party/lvg
mvn clean install
- Build Peregrine
cd ../../data-mining/
mvn clean install
From version 1.3 onwards, stable Peregrine artifacts will be published to Maven Central.
pom.xml configuration
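Insert the following dependencies into your project's pom.xml to use Peregrine. The two ontology implementations marked with comments (DB and File) are alternatives for where the ontology is loaded from.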
<dependency>
  <groupId>org.erasmusmc.data-mining.peregrine</groupId>
  <artifactId>peregrine-api</artifactId>
  <version>1.1-SNAPSHOT</version>
</dependency>
<dependency>
  <groupId>org.erasmusmc.data-mining.ontology</groupId>
  <artifactId>ontology-api</artifactId>
  <version>1.1-SNAPSHOT</version>
</dependency>
<dependency>
  <groupId>org.erasmusmc.data-mining.peregrine</groupId>
  <artifactId>peregrine-normalizer</artifactId>
  <version>1.1-SNAPSHOT</version>
  <scope>runtime</scope>
</dependency>
<dependency>
  <groupId>org.erasmusmc.data-mining.peregrine</groupId>
  <artifactId>peregrine-tokenizer</artifactId>
  <version>1.1-SNAPSHOT</version>
  <scope>runtime</scope>
</dependency>
<dependency>
  <groupId>org.erasmusmc.data-mining.peregrine</groupId>
  <artifactId>peregrine-disambiguator</artifactId>
  <version>1.1-SNAPSHOT</version>
  <scope>runtime</scope>
</dependency>
<dependency>
  <groupId>org.erasmusmc.data-mining.peregrine</groupId>
  <artifactId>peregrine-impl-hash</artifactId>
  <version>1.1-SNAPSHOT</version>
  <scope>runtime</scope>
</dependency>
<!-- using DB ontology -->
<dependency>
  <groupId>org.erasmusmc.data-mining.ontology</groupId>
  <artifactId>ontology-impl-db</artifactId>
  <version>1.1-SNAPSHOT</version>
  <scope>runtime</scope>
</dependency>
<!-- using File ontology -->
<dependency>
  <groupId>org.erasmusmc.data-mining.ontology</groupId>
  <artifactId>ontology-impl-file</artifactId>
  <version>1.1-SNAPSHOT</version>
  <scope>runtime</scope>
</dependency>
<dependency>
  <groupId>org.erasmusmc.data-mining.ontology</groupId>
  <artifactId>ontology-impl-cache</artifactId>
  <version>1.1-SNAPSHOT</version>
  <scope>runtime</scope>
</dependency>
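Because every dependency above uses the same version, one common Maven convention is to keep that version in a single property so it can be changed in one place. The following is only a sketch: the property name peregrine.version is an assumption made for this example and is not defined by the Peregrine project.

```xml
<properties>
  <!-- Hypothetical property name chosen for this example -->
  <peregrine.version>1.1-SNAPSHOT</peregrine.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.erasmusmc.data-mining.peregrine</groupId>
    <artifactId>peregrine-api</artifactId>
    <!-- Resolved by Maven from the property declared above -->
    <version>${peregrine.version}</version>
  </dependency>
  <!-- The remaining Peregrine and ontology dependencies follow the same pattern -->
</dependencies>
```

With this layout, moving to a later release should only require editing the property value.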
Spring configuration
Peregrine is built on the Spring application framework, so it is easy to integrate into new or existing Spring applications.
Here is an example Spring configuration snippet that integrates Peregrine, running with a single-file ontology, into your Spring project.
<import resource="classpath:/org/erasmusmc/data_mining/ontology/impl/file/ontology-file.context.xml" />
<alias name="singleFileOntology" alias="ontology" />
<import resource="classpath:/org/erasmusmc/data_mining/peregrine/tokenizer/impl/tokenizer.context.xml" />
<import resource="classpath:/org/erasmusmc/data_mining/peregrine/normalizer/impl/normalizer.context.xml" />
<import resource="classpath:/org/erasmusmc/data_mining/peregrine/disambiguator/impl/rule_based/disambiguator-complete.context.xml" />
<import resource="classpath:/org/erasmusmc/data_mining/peregrine/disambiguator/impl/disambiguation-decision-maker-complete.context.xml" />
<import resource="classpath:/org/erasmusmc/data_mining/peregrine/impl/hash/peregrine.context.xml" />
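The import and alias elements above are fragments of a Spring XML configuration, so they need to sit inside a beans root element. For a project that does not yet have a Spring configuration file, a minimal wrapper might look like the sketch below. The wrapper uses the standard Spring beans namespace; which file it lives in and how your application loads it are left open here, and only two of the entries from the snippet are repeated for brevity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">

  <!-- Ontology loaded from a single file, exposed under the name "ontology" -->
  <import resource="classpath:/org/erasmusmc/data_mining/ontology/impl/file/ontology-file.context.xml" />
  <alias name="singleFileOntology" alias="ontology" />

  <!-- The remaining tokenizer, normalizer, disambiguator and peregrine imports go here -->

</beans>
```

The alias lets the other imported contexts refer to the bean simply as "ontology", whichever ontology implementation provides it.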
Notes
- lvg normalizer: the lvg normalizer is packaged as a Maven artifact. You must set the correct value for "LVG_DIR" in lvg2006lite/data/config/lvg.properties.
- ontology file: you can get a sample ontology file from the Download section.