A keyword is a token that is rarely used across all concepts. A good keyword:

  1. should be part of a multi-token term of the concept to be disambiguated.
    • set in if (normalizedToken == term.getText() || !isComplex.isAComplexKeyword(normalizedToken)) {IsNotAKeywordToken = true;}
  2. should be complex (i.e. longer than 5 characters, or containing at least one digit and one letter).
    • set in isAComplexKeyword().
  3. should appear fewer than a threshold number of times (e.g. 100) in the ontology.
  4. should not belong to a homonym term. When concepts are homonyms (i.e. they share at least one term), no token belonging to the shared homonym term should be used as a keyword for those concepts. (These tokens may still be used as keywords for other concepts that do not have this homonym term among their terms.)
    • implement a map and a flag (TokensShouldBeSkippedAsKeywordForThisConcept) to filter out keywords that are part of a homonym.
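
The four criteria above can be combined into a single filter. The following is a minimal sketch under stated assumptions, not the project's actual code: the class name KeywordCriteriaSketch, the method selectKeywords, the constant MAX_ONTOLOGY_FREQUENCY, and the tokenFrequency and homonymTokens parameters are all hypothetical. Only the name isAComplexKeyword comes from the notes above, and its body here is a guess based on criterion 2. Tokens are assumed to be whitespace-separated and normalized by lower-casing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the four keyword criteria; not the project's actual code. */
public class KeywordCriteriaSketch {

    // Assumed threshold taken from criterion 3 ("e.g. 100").
    static final int MAX_ONTOLOGY_FREQUENCY = 100;

    /** Criterion 2: longer than 5 characters, or at least one digit and one letter. */
    static boolean isAComplexKeyword(String token) {
        if (token.length() > 5) {
            return true;
        }
        boolean hasDigit = false;
        boolean hasLetter = false;
        for (char c : token.toCharArray()) {
            if (Character.isDigit(c)) hasDigit = true;
            if (Character.isLetter(c)) hasLetter = true;
        }
        return hasDigit && hasLetter;
    }

    /**
     * Returns the tokens of one term that qualify as keywords for one concept.
     *
     * @param term           a term of the concept being disambiguated
     * @param tokenFrequency occurrences of each token in the ontology (assumed precomputed)
     * @param homonymTokens  tokens of terms this concept shares with other concepts
     */
    static List<String> selectKeywords(String term,
                                       Map<String, Integer> tokenFrequency,
                                       Set<String> homonymTokens) {
        List<String> keywords = new ArrayList<>();
        String[] tokens = term.trim().toLowerCase(Locale.ROOT).split("\\s+");
        if (tokens.length < 2) {
            return keywords; // criterion 1: only multi-token terms yield keywords
        }
        for (String token : tokens) {
            if (!isAComplexKeyword(token)) continue;                      // criterion 2
            if (tokenFrequency.getOrDefault(token, 0)
                    >= MAX_ONTOLOGY_FREQUENCY) continue;                  // criterion 3
            if (homonymTokens.contains(token)) continue;                  // criterion 4
            keywords.add(token);
        }
        return keywords;
    }

    public static void main(String[] args) {
        // Made-up frequencies, for illustration only.
        Map<String, Integer> freq = Map.of("interleukin", 40, "2", 900, "receptor", 5000);
        System.out.println(selectKeywords("interleukin 2 receptor", freq, Set.of()));
    }
}
```

In this sketch the homonym rule is applied per concept: the caller passes only the tokens of terms that this particular concept shares with others, so the same token can still be selected as a keyword for a concept that does not have the homonym term.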
Last modified on Sep 2, 2011, 1:42:37 PM