Message 12247 of the SUO list

Subject: Re: SUO: Multi-Source Ontology (MSO) Draft Ballot Question
Date: Wed, 29 Jan 2004 19:48:07
From: Philippe Martin
In-reply-to: msg12232 by Adam Pease and msg12246 by John Bateman
Follow-up: msg12250 by Adam Pease 


> WordNet synsets are language constructs, not concepts, and the links among
> synsets are correct linguistically, but not philosophically.

A synset represents "one of the meanings of the words in the set". Thus,
it does represent a concept. WordNet has 2 kinds of links: lexical links
between words (e.g. "antonym") and semantic links between synsets (e.g.
"hypernym"). To re-use WordNet for knowledge representation, you have to
give a logical interpretation to each kind of semantic link and thus
assume that most hypernym links are genuine "generalization" links
(i.e. subtypeOf links or instanceOf links), possibly distinguish
between subtypeOf and instanceOf links (which I did), and correct 
the links that you find incorrect from a logical and semantic viewpoint.

This is stressed by the first sentence of my article about my integration
of WordNet (http://www.webkb.org/doc/papers/iccs03/):
 This article presents the transformation of the noun-related part of WordNet
 into a genuine "lexical ontology" to support knowledge representation,
 sharing and retrieval within a knowledge base or on the Web, i.e.
 to support "knowledge creation and communication".

Thus, I gave category identifiers (as concise and clear as possible) to the
synsets (and that alone is important; many teams, including the DOLCE team,
had to generate identifiers for WordNet categories).
For what I call the content-oriented links of WordNet (e.g. substance and
meronym), I assumed the following interpretation:
 when such a link L (say the meronym, which I interpret as pm#spatial_part)
 connects a source category S to a destination category D, it means that
 "S (or 'a S' if S is a type) may have for L D (or 'a D' if D is a type)";
 "may have" corresponds to the relation cardinality [0..*]; for example
 in FT: "wn#bird p #wing" means that "a wn#bird may have for pm#spatial_part
 a wn#wing", i.e. also in FT: pm#spatial_part(wn#bird [0..*],wn#wing [0..*]);
Of course, this interpretation may be incorrect or imprecise (here, I'd
have preferred: pm#spatial_part(pm#healthy_bird [1],wn#wing [2]);). Then,
when incorrect links are found, they should be corrected.
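
To make the cardinality reading above concrete, here is a minimal Python
sketch. It is not WebKB-2 code: the class names (Cardinality, Link), the
MAY_HAVE constant and the to_ft method are assumptions made for illustration
only. It shows the default "may have" reading [0..*] next to the more precise
variant mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinality:
    """Inclusive bounds; upper=None stands for '*' (unbounded)."""
    lower: int
    upper: Optional[int]

    def __str__(self) -> str:
        if self.upper is None:
            return f"[{self.lower}..*]"
        if self.lower == self.upper:
            return f"[{self.lower}]"
        return f"[{self.lower}..{self.upper}]"

    def allows(self, count: int) -> bool:
        """True if 'count' related objects are compatible with the bounds."""
        return count >= self.lower and (self.upper is None or count <= self.upper)


# "may have" is read as the cardinality [0..*]
MAY_HAVE = Cardinality(0, None)


@dataclass(frozen=True)
class Link:
    """A content-oriented link (hypothetical structure, for illustration)."""
    relation: str
    source: str
    destination: str
    source_card: Cardinality = MAY_HAVE
    dest_card: Cardinality = MAY_HAVE

    def to_ft(self) -> str:
        """Render the link in the relation(source card,destination card); form."""
        return (f"{self.relation}({self.source} {self.source_card},"
                f"{self.destination} {self.dest_card});")


# Default interpretation of the WordNet link between bird and wing
default_link = Link("pm#spatial_part", "wn#bird", "wn#wing")

# The more precise statement: one healthy bird, exactly two wings
precise_link = Link("pm#spatial_part", "pm#healthy_bird", "wn#wing",
                    Cardinality(1, 1), Cardinality(2, 2))

print(default_link.to_ft())
print(precise_link.to_ft())
```

The default is deliberately weak: [0..*] never rules anything out, so a link
imported with it can later be tightened by hand without contradicting what
was stated before.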

I am not sure that these are the "upper level commitments" that John Bateman
was asking for. However, when I subtype a well-defined category (say, from
DOLCE or SUMO) by another category (say, from WordNet), the associated axioms
are inherited, so you get a more precise account of what the specialized
category means. On the other hand, this does impose some added semantics on
the fuzzy semantics of WordNet categories (something which may be regrettable
but is necessary for re-using them for knowledge representation purposes).
Unless (or until) there is a detected inconsistency (detected automatically
or by people) between the inherited axioms, my position is to assume that
things are fine (and for most inference engines, they will certainly be).
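
The inheritance argument above can be sketched as follows. This is a toy
Python model, not an actual inference engine: axioms are reduced to
(property, truth value) pairs, and the category names "upper#event",
"wn#example_event" and "wn#odd_event" as well as the property "has_duration"
are hypothetical placeholders, not taken from DOLCE, SUMO or WordNet.

```python
from typing import Dict, Iterable, Set, Tuple

# Toy stand-in for a real axiom: (property name, asserted truth value)
Axiom = Tuple[str, bool]


class Taxonomy:
    """Categories with supertype links and locally stated axioms."""

    def __init__(self) -> None:
        self.supertypes: Dict[str, Set[str]] = {}
        self.axioms: Dict[str, Set[Axiom]] = {}

    def add_category(self, name: str,
                     supertypes: Iterable[str] = (),
                     axioms: Iterable[Axiom] = ()) -> None:
        self.supertypes[name] = set(supertypes)
        self.axioms[name] = set(axioms)

    def inherited_axioms(self, name: str) -> Set[Axiom]:
        """Collect the axioms of a category and of all its supertypes."""
        collected: Set[Axiom] = set()
        seen: Set[str] = set()
        stack = [name]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            collected |= self.axioms.get(current, set())
            stack.extend(self.supertypes.get(current, ()))
        return collected

    def conflicts(self, name: str) -> Set[str]:
        """Properties asserted both true and false along the supertype chain."""
        axioms = self.inherited_axioms(name)
        return {prop for (prop, value) in axioms if (prop, not value) in axioms}


taxonomy = Taxonomy()
# Hypothetical well-defined upper-level category with one axiom
taxonomy.add_category("upper#event", axioms={("has_duration", True)})
# Hypothetical WordNet category subtyped under it: it inherits the axiom
taxonomy.add_category("wn#example_event", supertypes={"upper#event"})
# Hypothetical category whose own axiom clashes with the inherited one
taxonomy.add_category("wn#odd_event", supertypes={"upper#event"},
                      axioms={("has_duration", False)})

print(taxonomy.inherited_axioms("wn#example_event"))
print(taxonomy.conflicts("wn#odd_event"))
```

As long as conflicts() returns an empty set for a category, the sketch treats
the subtype link as acceptable, which mirrors the "assume things are fine
until an inconsistency is detected" position.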

Now, as John Sowa and my article hinted, I'd be pleased to have a more 
structured source for a (much needed for my goals) "large lexical ontology"
than WordNet. But WordNet (and, a fortiori, my re-use of WordNet) is an 
ontology (plus a thesaurus), not just a lexicon.



> But it also now gives at least 3 restructured top-levels for WordNet:
> * EuroWordNet * Ontocleaned * MSO ...
> And, my question to all, has anyone yet done a detailed comparison of 
> these alternatives?

My article also discusses the connections between DOLCE (Ontoclean's
top-level ontology) and WordNet.
I have looked at EuroWordNet but have not had the time to do anything about it.



>    As an example, take the inference path that is possible from the SUMO 
> term Motion to the WordNet synset "motion, movement, move" then up the 
> hypernym links to "change", "action" and then "act, human action, human 
> activity".  Through that faulty inference chain, one could conclude that 
> any Motion is an intentional human action, which of course is false.

There is indeed a faulty link somewhere but WordNet is not necessarily to
blame: since the WordNet motion category you refer to (which is referred to
by #human_motion (or wn#human_motion) in the MSO of WebKB-2 and which has
for gloss "the act of changing your location ...") is a subtype of
wn#human_action, it can be argued that it does refer to a human movement
(even if such a name does not appear in the synset) and hence sumo#motion
should not have been set as a subtype of #human_motion; another category
should have been chosen, e.g. #movement (which is the choice I made when
integrating the NSM).
Here are extracts of the FT descriptions of #movement and #human_motion
(checkable via http://www.webkb.org/interface/categSearch.html).
Please note that I (pm) have set a subtype link between them.

#movement__motion (^a natural event that involves a change in the position
                    or location of something^)
   > #approach.movement  #passing.movement  #deflection.movement ...
     #motion  #movement.change (pm)  #human_motion (pm), 
   =  nsm#move (pm), 
   <  #happening__occurrence__natural_event;
 
#human_motion__motion__movement__move  (^the act of changing your ...)
   >  #approach  #forward_motion  #locomotion  #lunge  #travel  ...,
   <  #change  #movement (pm);
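
The effect of choosing one attachment point or the other can be checked with
a small transitive-closure sketch in Python. The links below are taken from
the two FT extracts above and from the quoted chain ("change", "action",
"act, human action"); the intermediate identifier "#action", the shortened
name "#happening", and the helper names (ancestors, with_mapping) are
assumptions made for illustration.

```python
from typing import Dict, Set


def ancestors(category: str, supertypes: Dict[str, Set[str]]) -> Set[str]:
    """All categories reachable by following supertype links upward."""
    found: Set[str] = set()
    stack = list(supertypes.get(category, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(supertypes.get(current, ()))
    return found


# Supertype links, child -> parents
BASE_LINKS: Dict[str, Set[str]] = {
    "#human_motion": {"#change", "#movement"},  # from the FT extract
    "#movement": {"#happening"},                # from the FT extract
    "#change": {"#action"},                     # from the quoted chain (id assumed)
    "#action": {"#human_action"},               # from the quoted chain (id assumed)
}


def with_mapping(target: str) -> Dict[str, Set[str]]:
    """Copy of the base links with sumo#motion attached under 'target'."""
    links = {child: set(parents) for child, parents in BASE_LINKS.items()}
    links["sumo#motion"] = {target}
    return links


# Attaching sumo#motion to the synset "motion, movement, move"
faulty = ancestors("sumo#motion", with_mapping("#human_motion"))

# Attaching it to the natural-event category instead
repaired = ancestors("sumo#motion", with_mapping("#movement"))

print("#human_action" in faulty)    # the unwanted conclusion is derivable
print("#human_action" in repaired)  # it is no longer derivable
```

With the second attachment, the only inherited supertypes are #movement and
its natural-event parent, so no "every motion is a human action" conclusion
follows.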

The "SUMO Search tool" at
http://ontology.teknowledge.com/cgi-bin/SUMO-browser-verbs.pl?term=motion&POS=1
gives 5 WordNet "equivalents" to sumo#motion (&%Motion), in fact all 
the WordNet synsets that include the word "motion": these are indeed 
lexical links and not semantic links like those I propose. 
Furthermore, these lexical links are between SUMO and WordNet 1.6 so
they will have to be checked and may be updated one day (the latest
version of WordNet is 2.0).
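
The difference between the two kinds of links can be sketched in Python as
follows. Only the two categories quoted in the FT extracts above are used, so
this does not reproduce the five synsets returned by the SUMO tool; the
dictionary layout and the function names are assumptions for illustration.

```python
from typing import Dict, List, Optional

# Synset words per category, copied from the two FT headers quoted above
# ("human_motion" is an identifier added in WebKB-2, not a synset word).
SYNSET_WORDS: Dict[str, List[str]] = {
    "#movement": ["movement", "motion"],
    "#human_motion": ["motion", "movement", "move"],
}

# One explicit semantic link from the FT extract: "= nsm#move (pm)"
SEMANTIC_EQUIVALENTS: Dict[str, str] = {
    "nsm#move": "#movement",
}


def lexical_candidates(word: str, synsets: Dict[str, List[str]]) -> List[str]:
    """Every category whose synset contains the word (a lexical link)."""
    return sorted(cat for cat, words in synsets.items() if word in words)


def semantic_equivalent(term: str,
                        links: Dict[str, str]) -> Optional[str]:
    """The single category a term was explicitly declared equivalent to."""
    return links.get(term)


# A word-based lookup is ambiguous: both senses come back
print(lexical_candidates("motion", SYNSET_WORDS))

# A semantic link names exactly one category
print(semantic_equivalent("nsm#move", SEMANTIC_EQUIVALENTS))
```

A word-based lookup cannot choose between senses, which is why its results
need a manual check before they are used as subtype or equivalence links.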
 

> If the MSO lacks information from the ontologies it uses as sources,
> and adds no new information

As noted in my previous e-mail, you lose nothing if not all the
definitions of the ontological primitives are included (since you can
retrieve them anyway), and the MSO does add a lot, e.g. my top-level
and my taxonomy of primitive relation types which are vital for my work
(natural language representation) whereas, for that goal, I have no 
direct use of the SUMO.


Philippe