Changes between Version 8 and Version 9 of DisambiguationSteps


Timestamp:
Sep 2, 2011, 1:42:23 PM (11 years ago)
Author:
rob.hooft@…
Comment:

--

  • DisambiguationSteps

    v8 v9  
    4040[[File(PeregrineStrictDisambiguator.png,500px)]]
    4141
    42 
    43 
    44 === Keyword ===
    45 
    46 A keyword is a token that is rarely used across all concepts. A good keyword:
    47  1. should be part of a multi-token term of the concept being disambiguated.
    48    * set in !IndexedOntology.java: `if (normalizedToken == term.getText() || !isComplex.isAComplexKeyword(normalizedToken)) {IsNotAKeywordToken = true;}`
    49  1. should be [wiki:"complex term" complex] (i.e. longer than 5 characters, or containing at least one digit and one letter)
    50    * set in !IsComplexRule.java: `isAComplexKeyword()`.
    51  1. should appear fewer than a threshold number of times (e.g. 100) in the ontology
    52    * set in !PeregrineImpl.java as `DEFAULT_KEYWORD_THRESHOLD`
    53  1. should not come from a homonym term: when concepts are [wiki:homonym]s (i.e. they share at least one term), none of the tokens belonging to that shared term should be used as keywords for those concepts. (These tokens may still be used as keywords for other concepts that do not have this homonym term as one of their terms.)
    54    * implement a map (!isPartOfHomonyms@IndexedOntology.java) and a flag (`TokensShouldBeSkippedAsKeywordForThisConcept`) to filter out keywords that are part of a homonym.
    55 
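The four keyword criteria above can be sketched as a single filter. This is a minimal illustration, not the Peregrine implementation: the class `KeywordSketch`, the method `keywordsFor`, the whitespace tokenization, and the plain `Map` of token counts are all hypothetical stand-ins, and the threshold of 100 is simply the example value mentioned in the list.

```java
import java.util.*;

class KeywordSketch {
    // Assumed value, taken from the "e.g. 100" in the list above.
    static final int KEYWORD_THRESHOLD = 100;

    // Criterion 2: longer than 5 characters, or at least one digit and one letter.
    static boolean isComplex(String token) {
        if (token.length() > 5) {
            return true;
        }
        boolean hasDigit = false;
        boolean hasLetter = false;
        for (char c : token.toCharArray()) {
            if (Character.isDigit(c)) hasDigit = true;
            if (Character.isLetter(c)) hasLetter = true;
        }
        return hasDigit && hasLetter;
    }

    // Returns the tokens of one concept that satisfy all four criteria.
    static Set<String> keywordsFor(List<String> conceptTerms,
                                   Set<String> homonymTerms,
                                   Map<String, Integer> ontologyCounts) {
        // Criterion 4: tokens of any homonym term are blocked for this concept.
        Set<String> blocked = new HashSet<>();
        for (String term : conceptTerms) {
            if (homonymTerms.contains(term)) {
                blocked.addAll(Arrays.asList(term.split("\\s+")));
            }
        }
        Set<String> keywords = new LinkedHashSet<>();
        for (String term : conceptTerms) {
            String[] tokens = term.split("\\s+");
            if (tokens.length < 2) {
                continue; // Criterion 1: only multi-token terms contribute keywords.
            }
            for (String token : tokens) {
                int count = ontologyCounts.getOrDefault(token, 0);
                // Criterion 3: the token must be rare in the ontology.
                if (isComplex(token) && count < KEYWORD_THRESHOLD && !blocked.contains(token)) {
                    keywords.add(token);
                }
            }
        }
        return keywords;
    }
}
```

For a concept with the terms `tumor protein p53` and `p53`, only `p53` would qualify in this sketch, provided it is rare in the ontology: `tumor` is not complex, and `protein` would typically exceed the threshold. If `p53` were also a homonym term shared with another concept, it would be blocked as well and the concept would have no keywords.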
    5642== Disambiguation decision maker ==
    5743Currently there is only a trivial disambiguation decision maker implementation: it removes an indexing result if the corresponding disambiguation result has a weight less than `[0.5]`.
    5844
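The threshold rule described above can be illustrated with a short sketch. The class `ThresholdDecisionSketch`, the method `keepAccepted`, and the representation of disambiguation results as a map from concept identifier to weight are assumptions made for this example; they are not taken from the Peregrine code base.

```java
import java.util.*;

class ThresholdDecisionSketch {
    // Cut-off from the description above: results weighing less than 0.5 are removed.
    static final double MIN_WEIGHT = 0.5;

    // Returns the concept identifiers whose disambiguation weight is at least MIN_WEIGHT,
    // in the iteration order of the input map.
    static List<String> keepAccepted(Map<String, Double> weightByConcept) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Double> entry : weightByConcept.entrySet()) {
            if (entry.getValue() >= MIN_WEIGHT) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }
}
```

Note that in this sketch a weight of exactly 0.5 is kept, since only weights strictly below the cut-off are removed.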
     45== See also ==
     46
     47 * [wiki:Keyword]