Knowledge (re-)presentation and retrieval

Dr Philippe Martin

Lecturer at ESIROI (University of La Réunion) since Sept. 2009

PhD obtained in 1996 at Acacia/Edelweiss, INRIA Sophia Antipolis
(then: researcher/lecturer in Australia for 11 years and at Eurecom for 18 months)

Research focus: supports for scalable+precise knowledge sharing
(hence, for scalable+precise knowledge representation/comparison/... and cooperation)
-> enabling the representation + loss-less scalable integration
of any knowledge, best practice, language, cooperation protocol, ...
into ONE well-organized KB (possibly composed of distributed KBs of various persons)
and its re-use by tools for flexible (re-)presentation+comparison+exploitation of knowledge

Goal of this talk:
- presenting some basic principles and examples of a research avenue
which may be of interest to you and might lead to cooperation
- getting feedback on new ideas -> please comment or ask questions during the talk

Plan
1. Basic rules for information (re-)presentation, comparison and retrieval
2. Examples of notations for representing knowledge
3. Towards an ontology for knowledge import/export directives/preferences
4. Example of notation for a menu
5. Examples of notations for querying knowledge
6. Example of scalable format for knowledge comparison

1. Basic rules for information (re-)presentation, comparison and retrieval

Given any information object (content/presentation object), the more other information objects
it can be compared with, associated with or retrieved by, the better
=> the more precise, semantically organized, and dynamically accessible+modifiable
   the stored and presented information objects are
   - e.g., within KBs, documents, menus or query results -
   the better (for knowledge comparison/retrieval/understanding)
=> the more "best practices" and normalizing+organizing rules are followed, the better
=> the more rules/methodologies and ontologies are integrated, the better
=> the more rules the protocols propose, guide or force the user to follow, the better
=> the more the used languages are expressive, normalizing, guide/enforce the following of
   "best practices" and allow objects to be "first order entities"
   (for these objects to be referable via a name or a description), the better
   => normal evolution: binary representation -> SGML/XML -> RDF/XML -> decent KRL
                        procedural languages -> functional/declarative languages
=> the more information objects the interfaces/menus permit a user to directly or indirectly select
   for accessing a cascading menu that lists and organizes all the actual/potential relations
   and commands about (from/to/object/agent/...) the selected object, the better
   (indirect selection: via the menu of a related object, e.g.,
    the selection of a process via the menu of the objects that it modifies)
   => normal evolution: classic interfaces -> structured document editor interfaces
=> for the semantic organization of information objects, the more the information objects
   are categorized into ONE specialization+identity hierarchy with no implicit
   redundancies/contradictions and where every object has a unique place, the better
   => need for a very "general" specialization relation type, and many subtypes,
      e.g., for specializing informal objects.

Ideal agent for a collaboration (in French).
Ideal tool for collaboration (in French).

2. Examples of notations for representing knowledge

En: English; FE: Formalized English; FCG: Frame CG;
FL: For Links; FLDF: FL Display Form;
default creator of terms: pm; default creator of sentences: ; //none
default interpretation for collections: distributive exclusive sets;
italic font -> relation nodes; small font -> contexts (meta-statements)

En: According to John, any human_body is a body and has at most 1 head and 2 arms.
En: According to Jack, every human_body happens to have exactly 1 head and conversely.
En: According to Jo, male_body and female_body is a type partition of human_body.

FE: John#`any human_body is a body and has for part {at most 1 head, at most 2 arm}´.
FE: Jack#`every human_body has for part 1 head´.
FE: `any head is part of 1 human_body´_[Jack].
FE: Jo#`human_body has for subtype {male_body female_body}_[complete]´.

FCG: John#[any human_body, type: body, part: {at most 1 head, at most 2 arm}];
FCG: Jack#[every human_body, part: 1 head].
FCG: Jack#[ [any head, part of: 1 human_body] believer: Jack];
FCG: Jo#[human_body, subtype: {male_body female_body}_[complete] ];

FL: human_body supertype: body __[believer: John],
FL: human_body part : arm __[ every->0..1 _[John]]
FL: human_body part : head __[ any->0..1 _[John], every->1 _[Jack], 1<-every _[Jack] ],
FL: human_body subtype : {male_body female_body}_[complete, Jo] __[Jo];

FLDF: human_body --- supertype _[John] --> body
FLDF: human_body |-- part _[ every->0..1 _[John] ] --> arm
FLDF: human_body |-- part _[ any->0..1 _[John] ] --> head
FLDF: human_body |-- part _[ every->1 _[Jack], 1<-every _[Jack] ] --> head
FLDF: human_body |-- subtype _[Jo] --> male_body
FLDF: human_body |-- subtype _[Jo] --> female_body
FLDF: human_body |-- subtype _[Jo] --> {male_body female_body}_[complete]

An FL-like notation is helpful to visualize (and hence navigate or edit) a realistic quantity of knowledge.
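
To illustrate how such a display form can be generated from stored links, here is a minimal Python sketch. It is not WebKB code: the `Link` record and the `to_fldf` function are hypothetical names, and the context (quantifiers, believers) is kept as an opaque string rather than parsed.

```python
from typing import NamedTuple

class Link(NamedTuple):
    """One contextualized relation (hypothetical structure, for illustration only)."""
    source: str
    relation: str
    context: str   # quantifiers and believers, kept as an opaque string
    target: str

def to_fldf(links):
    """Render one FLDF-like line per link: '---' for the first link, '|--' for the others."""
    lines = []
    for index, link in enumerate(links):
        connector = "---" if index == 0 else "|--"
        lines.append(
            f"{link.source} {connector} {link.relation} _[{link.context}] --> {link.target}"
        )
    return lines

LINKS = [
    Link("human_body", "supertype", "John", "body"),
    Link("human_body", "part", "any->0..1 _[John]", "head"),
    Link("human_body", "subtype", "Jo", "male_body"),
]

for line in to_fldf(LINKS):
    print(line)
```

Keeping the context as a string is a simplification: a real tool would need structured contexts to filter or compare statements by believer.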

Examples of features of FL, FCG and FE that guide and ease
knowledge representation/normalization/organization/sharing:
numerical quantifiers, lambda-abstraction for numerical quantities or for
the use of certain kinds of concept types as if they were relation types.
Example of best practices enforced by the quantifiers of FL, FCG and FE:
the use of nouns (or nominal expressions) for object identifiers and names.

2. Examples of notations for representing knowledge

N3: Notation3+OWL.
OWL makes the representation of cardinalities complex and unreadable.
The Notation3 syntax is not regular, will be hard to extend and, like KIF, is
low-level: no shortcut to ease knowledge representation/organization/sharing,
e.g., numerical quantifiers, lambda-abstraction for numerical quantities or for
the use of certain kinds of concept types as if they were relation types.

N3:  :armPart  a rdf:Property;  rdfs:subPropertyOf :part;  rdfs:range :Arm;
                                owl:inverseOf :armPartOf.
     :headPart a rdf:Property;  rdfs:subPropertyOf :part;  rdfs:range :Head;
                                owl:inverseOf :headPartOf.
     :StuffWith1Head
        rdfs:subClassOf [a owl:Restriction;  owl:onProperty :headPart;  owl:cardinality 1].
     :StuffOf1HumanBody
        rdfs:subClassOf [a owl:Restriction;  owl:onProperty :headPartOf;  owl:cardinality 1].
     {:human_body  a owl:Class;  rdfs:subClassOf :body;}  :believer :John.
     {:human_body  a owl:Class;  rdfs:subClassOf :body; 
         rdfs:subClassOf  [a owl:Restriction;  owl:onProperty :headPart;  owl:maxCardinality 1];
         rdfs:subClassOf  [a owl:Restriction;  owl:onProperty :armPart;   owl:maxCardinality 2]
     } :believer :John.
     {@forAll :b . {:b rdf:type :human_body} => {:b rdf:type :StuffWith1Head} } :believer :Jack.
     {@forAll :h . {:h rdf:type :head} => {:h rdf:type :StuffOf1HumanBody} } :believer :Jack.
     {:male_body  rdfs:subClassOf :body;   owl:disjointWith :female_body} :believer :Jo.
     {:female_body  rdfs:subClassOf :body;} :believer :Jo.


FL:  human_body  supertype:  body __[believer: John],
                 part     :  0..2 arm __[John]
                             head __[ any->0..1 _[John],  every->1 _[Jack],  1<-every _[Jack]],
                 subtype  :  {( male_body  female_body )} __[Jo];

2. Examples of notations for representing knowledge

In KIF (Knowledge Interchange Format; low-level but expressive;
CL is not expressive enough for knowledge sharing):

(believer '(defconcept human_body (?b)  (body ?b)) John)
(believer '(defconcept human_body (?b)  (atMostN  1 '?a head (part ?b '?a))) John)
(believer '(defconcept human_body (?b)  (atMostN  2 '?a arm  (part ?b '?a))) John)
(believer '(forall ((?b human_body))    (exactlyN 1 '?a head (part ?b '?a))) Jack)
(believer '(forall ((?a head))          (atMostN  1 '?b human_body (part '?b ?a)))  John)
(believer '(forall ((?a head))          (exactlyN 1 '?b human_body (part '?b ?a)))  Jack)
(believer '(defconcept male_body (?b)   (and (human_body ?b) (not (female_body ?b))))  Jo)
(believer '(defconcept female_body (?b) (and (human_body ?b) (not (male_body ?b))))    Jo)

with

(defrelation atMostN (?num ?var ?type ?predicate)  :=
  (exists ((?s set)(?n)) (and (size ?s ?n) (=< ?n ?num)
    (truth ^(forall (,?var) (=> (member ,?var ,?s)
                                (and (,?type ,?var) ,?predicate)))))))
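
As a rough illustration of what the numerical quantifiers are meant to check, here is a minimal Python sketch over a toy set of part facts. It follows the intended reading ("the number of parts of a given type is at most / exactly N"), not the KIF definition term by term; the function names and the sample data are hypothetical.

```python
def count_parts(part_facts, type_of, whole, part_type):
    """Count the distinct parts of `whole` that have type `part_type`."""
    parts = {p for (w, p) in part_facts if w == whole and type_of.get(p) == part_type}
    return len(parts)

def at_most_n(n, part_facts, type_of, whole, part_type):
    """True if `whole` has at most `n` parts of type `part_type`."""
    return count_parts(part_facts, type_of, whole, part_type) <= n

def exactly_n(n, part_facts, type_of, whole, part_type):
    """True if `whole` has exactly `n` parts of type `part_type`."""
    return count_parts(part_facts, type_of, whole, part_type) == n

# Toy data (assumed for the example): one body with one head and two arms.
PART_FACTS = {("body1", "head1"), ("body1", "arm1"), ("body1", "arm2")}
TYPE_OF = {"head1": "head", "arm1": "arm", "arm2": "arm"}

print(at_most_n(1, PART_FACTS, TYPE_OF, "body1", "head"))  # John's constraint on heads
print(at_most_n(2, PART_FACTS, TYPE_OF, "body1", "arm"))   # John's constraint on arms
print(exactly_n(1, PART_FACTS, TYPE_OF, "body1", "head"))  # Jack's constraint on heads
```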

2. Examples of notations for representing knowledge

Ad-hoc translation into RDF+OWL (every -> any; believer -> dc#creator):

<rdf:Property rdf:ID="armPart"><rdfs:subPropertyOf rdf:resource="Part"/>
                               <owl:inverseOf rdf:ID="armPartOf"/>
                               <rdfs:range rdf:resource="Arm"/></rdf:Property>
<rdf:Property rdf:ID="headPart"><rdfs:subPropertyOf rdf:resource="Part"/>
                               <owl:inverseOf rdf:ID="headPartOf"/>
                               <rdfs:range rdf:resource="Head"/></rdf:Property>

<owl:Class rdf:about="HumanBody"><rdfs:subClassOf rdf:resource="Body" dc:creator="John"/>
  <rdfs:subClassOf><owl:Restriction><owl:onProperty rdf:resource="#headPart"/>
        <owl:cardinality rdf:datatype="&xsd;nonNegativeInteger" dc:creator="Jack">1
        </owl:cardinality></owl:Restriction> </rdfs:subClassOf>
  <rdfs:subClassOf><owl:Restriction><owl:onProperty rdf:resource="#headPart"/>
        <owl:maxCardinality rdf:datatype="&xsd;nonNegativeInteger" dc:creator="John">1
        </owl:maxCardinality></owl:Restriction> </rdfs:subClassOf>
  <rdfs:subClassOf><owl:Restriction><owl:onProperty rdf:resource="#armPart"/>
        <owl:maxCardinality rdf:datatype="&xsd;nonNegativeInteger" dc:creator="John">2
        </owl:maxCardinality></owl:Restriction> </rdfs:subClassOf> </owl:Class>
<owl:Class rdf:about="Head">
  <rdfs:subClassOf><owl:Restriction><owl:onProperty rdf:resource="#headPartOf"/>
        <owl:maxCardinality rdf:datatype="&xsd;nonNegativeInteger" dc:creator="Jack">1
        </owl:maxCardinality></owl:Restriction> </rdfs:subClassOf> </owl:Class>

<owl:Class rdf:about="MaleBody"><rdfs:subClassOf rdf:resource="Body" dc:creator="Jo"/>
        <owl:disjointWith rdf:resource="FemaleBody" dc:creator="Jo"/> </owl:Class>
<owl:Class rdf:about="FemaleBody"><rdfs:subClassOf rdf:resource="Body" dc:creator="Jo"/>
        </owl:Class>

3. Towards an ontology for knowledge import/export directives/preferences

WebKB uses FC (For Control), a language that
- is composed of assertion/querying commands and import/export directives
(various query operators, the assertion operator is implicit,
the parameters can be in FL, FE, FCG or CGLF; the language is detected
from the syntax or is specified: [_ language: FCG][...] )
- has shell-like control structures (if, while, pipe) to combine commands.

The import/export directives are ad-hoc. This is unsatisfactory. Examples for outputs:
use HTML tags; show explanations; default creator of terms: pm; output_language: FL; hide statements from: pm wn;

For genericity/scalability purposes, the input/output directives should look like
(or be abbreviations of):

En: "until further notice, for statements exported by pm, the default term creator is pm
     and the default language is FL except that the begin/end delimiters for statements
     are respectively #[ and ]# and may be optional"
    (note: this directive may be derived from "pm's preferred output format is FL, ...")
FL: term_part_of_statement_exported_by_pm   //in FL, the default cardinality is 'any->?'
      part of: a (. statement_exported_by_pm := (statement, result of: (an export, agent: pm)),
                    default language: FL
                    default begin_delimiter: 0..1 "#[",
                    default end_delimiter: 0..1 "]#" ),
      default creator: pm;
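
A minimal Python sketch of how a tool might apply such a directive when exporting a statement. This is only an illustration under assumptions: `ExportDirective` and `export_statement` are hypothetical names, and "default term creator" is taken to mean that the `creator#` prefix of that creator may be omitted in the output.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExportDirective:
    """Export preferences of one agent (hypothetical structure, for illustration)."""
    exporter: str
    default_term_creator: str
    default_language: str
    begin_delimiter: str = "#["
    end_delimiter: str = "]#"
    use_delimiters: bool = True   # the delimiters are optional (cardinality 0..1)

def export_statement(statement, directive):
    """Drop the default creator prefix from terms, then add the delimiters if requested."""
    prefix = directive.default_term_creator + "#"
    tokens = [t[len(prefix):] if t.startswith(prefix) else t for t in statement.split()]
    body = " ".join(tokens)
    if directive.use_delimiters:
        return f"{directive.begin_delimiter}{body}{directive.end_delimiter}"
    return body

PM_DIRECTIVE = ExportDirective(exporter="pm", default_term_creator="pm", default_language="FL")

print(export_statement("pm#human_body supertype: wn#body;", PM_DIRECTIVE))
```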

To allow such statements, I have represented
- the data models (abstract structures) of KIF-like languages and CG-like languages
into an integrated language ontology
- an ontology of the presentation (concrete structures) of such languages.
However, this last ontology has to be redesigned since it is too complex.
Furthermore, it does not yet permit specifying input/output directives that are not notation related.

This approach costs less work than - but could be combined with -
- GRDDL (-> XML to RDF via XSLT), and
- Fresnel (~ CSS/XSLT for RDF).
It is also much more flexible for end-users, especially when contextual menus are proposed for many objects.

4. Example of notation for a menu

Examples of 1st level entries in a menu for the selected informal word "cat"
(menu automatically generated from the content of a KB
and here using an FL-like notation where collections are "distributive" by default
and where ">" refers to "specialization"):

(has for meaning:
   { ({wn#true_cat wn#domestic_cat wn#big_cat} < wn#feline)
     ({wn#female_gossiper wn#guy} < wn#person)
     ({wn#Caterpillar wn#cat-o'-nine-tails} < wn#instrumentation)
     ({wn#computed_tomography wn#vomit} < wn#activity)
   } __[language: English]
)
(has for structural-characteristics:
   > (has for embedding-element: a paragraph)
     (has for embedded-element: ...)        //"..." -> clicking on this node expands it
)
(has for presentation-characteristics:
   > (has for font-characteristics: ...)    //"..." -> clicking on this node expands it
)
(may be parameter of:  a search/comparison_command   an update_command
)

5. Examples of notations for querying knowledge

SP: SPARQL1.1; F-SP: F-SPARQL (SPARQL with macros; to design);

En: List statements implying that "most cars are red"
FC+FE:  ?s ` `most car color a red´ <= ?s´
FC+FCG: ?s [ [at least 51% of car, color a red] <= ?s]
SP: //no translation since there is no "most" or "at least 51%" in RDF

En: In most descriptions of cars in the KB, are the cars red?
FC+FE: ? `a car´ | nbArguments | set nbCars;
       ? `a car with color a red´ | nbArguments | set nbRedCars;
       set nbRedCars_div_nbCars `expr $nbRedCars / $nbCars`;
       if ($nbRedCars_div_nbCars >= 0.51) { echo "True"; } else { echo "False"; }
SP: ASK { { SELECT (COUNT(DISTINCT ?car) AS ?nbCars)
            WHERE { ?car a :car } }
          { SELECT (COUNT(DISTINCT ?redCar) AS ?nbRedCars)
            WHERE { ?redCar a :car;  :color [a :red] } }
          BIND (?nbRedCars / ?nbCars AS ?nbRedCars_div_nbCars).
          FILTER (?nbRedCars_div_nbCars >= 0.51).
        }
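
The second query boils down to a ratio test. A minimal Python sketch of that computation over a toy KB follows; the dictionary layout, the `most_are` function and the sample objects are assumptions made for the example, and the 0.51 threshold mirrors the "at least 51%" reading of "most" used above.

```python
def most_are(kb, type_name, color, threshold=0.51):
    """True if at least `threshold` of the objects of type `type_name` have this color."""
    objects = {name for name, desc in kb.items() if desc.get("type") == type_name}
    if not objects:
        return False   # no object of this type: nothing can be said about "most"
    matching = {name for name in objects if kb[name].get("color") == color}
    return len(matching) / len(objects) >= threshold

# Toy KB (assumed data): three cars, two of them red, plus one red bike.
KB = {
    "car1": {"type": "car", "color": "red"},
    "car2": {"type": "car", "color": "red"},
    "car3": {"type": "car", "color": "blue"},
    "bike1": {"type": "bike", "color": "red"},
}

print(most_are(KB, "car", "red"))
print(most_are(KB, "car", "blue"))
```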

5. Examples of notations for querying knowledge

En: List statements (with their "contextualizing" meta-statements) specializing (implying)
    the sentence "some car weighs at least 100 kg"
FC+FE: ?s ` `a car has for weight at least 100 kg´ <= ?s´
SP: //The following assumes that information is stored in named graphs
    //  except for information about the named graphs
    //  (the contextualizing information is in the default graph,
    //   as illustrated in Section 13.2.3 of the SPARQL1.1 documentation).
    //It also assumes that all used "contextualizing relation types" and their
    //  inverse have been declared as subtypes of pm#contextualizing_relation,
    //  and that the SPARQL endpoint exploits subsumption on relation types.
    CONSTRUCT { GRAPH ?g { ?car :attribute [a :weight; :unit :kg; :value ?v1; :value ?v2].
                           ?v2 :superior_or_equal 100.   //if ?v2 exists -> if ?v1 does not
                         }
                ?g :contextualizing_relation ?co.   //we'd like to write: ?g ?(cr|^cr)+ ?co.
              }                                     //?cr rdfs:subPropertyOf* :contextualizing_relation.
    WHERE { ?g (:contextualizing_relation)+ ?co.
            GRAPH ?g { ?car :attribute [a :weight; :unit :kg; :value ?v].
                       { FILTER(?v >= 100) } UNION { ?v :superior_or_equal 100 }
                     }
          }
F-SP: SPECIALIZATIONS-OF { ?car a :car;
                                :attribute [a :weight; :unit :kg;
                                            :value [?w :superior_or_equal 100] ] }
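
For this particular sentence, "specializing (implying)" reduces to comparing lower bounds: "a car weighs at least 150 kg" implies "a car weighs at least 100 kg". A toy Python sketch under that assumption follows; `WeightStatement` and `specializations_of` are hypothetical names, and the contextualizing meta-statement is reduced to an author field.

```python
from typing import NamedTuple

class WeightStatement(NamedTuple):
    """'A car has for weight at least <min_weight_kg> kg', according to <author>."""
    author: str            # stands for the contextualizing meta-statement
    min_weight_kg: float

def specializations_of(statements, min_weight_kg):
    """Return the statements implying 'a car weighs at least min_weight_kg kg'."""
    return [s for s in statements if s.min_weight_kg >= min_weight_kg]

# Toy statements (assumed data).
STATEMENTS = [
    WeightStatement("John", 150),
    WeightStatement("Jack", 80),
    WeightStatement("Jo", 100),
]

for statement in specializations_of(STATEMENTS, 100):
    print(statement.author, statement.min_weight_kg)
```

A general implementation would need graph matching with subsumption on types and relations; only the numeric part is shown here.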

5. Examples of notations for querying knowledge

En: Display the network (-> specialization hierarchy) of statements specializing that
    "a car has for weight at least 100 kg"
FC+FE: ? `a car has for weight at least 100 kg´   //optional: <= ?s
SP: //Same graph as for the previous query, with the same assumptions plus these ones:
    //  1) the metadata about named graphs (in the default graph) should include
    //     specialization relations between the named graphs
    //  2) the specialization relation types are declared as subtypes of
    //     pm#contextualizing_relation (this permits having the same query as before
    //     but this new query should actually use pm#metastatement_relation_to_display
    //     with pm#contextualizing_relation and specialization relation types as subtypes)
F-SP: NETWORK-SPECIALIZING { ?car a :car;
                                  :attribute [a :weight; :unit :kg;
                                              :value [?v :superior_or_equal 100] ] }

5. Examples of notations for querying knowledge

En: Display the network (-> specialization hierarchy) of statements generalizing that
    "a car has for weight at least 100 kg"
FC+FE: ? ` `a car has for weight at least 100 kg´ => ?s´
SP: CONSTRUCT   //with the same assumptions as for the previous query
      { GRAPH ?g { ?car :attribute [a :weight; :unit :kg; :value [?v :inferior 100] ].
                 }
        ?g :metastatement_relation_to_display ?co.
      }   //we'd like to write: ?g (?(cr|^cr) | :generalization)+ ?co.
    WHERE { ?g (:metastatement_relation_to_display)+ ?co.
            GRAPH ?g { ?g_car rdf:type/(^rdfs:subClassOf)* :car.
                       ?car (:attribute)? [a :weight; :unit :kg; :value [?v (:inferior)? 100] ].
                       ?g_attribute (^rdfs:subPropertyOf)* :attribute.
                       ?g_weight rdf:type/(^rdfs:subClassOf)* :weight.
                     }
          }
F-SP: NETWORK-GENERALIZING { ?car a :car;
                                  :attribute [a :weight; :unit :kg;
                                              :value [?v :inferior 100] ] }

6. Example of scalable format for knowledge comparison

compare pm#WebKB-2 km#Ontolingua on (support of: a is#IR_task, output_language: a km#KR_notation, part: a is#user_interface), maxdepth 5

                                               WebKB-2         Ontolingua 
support of: 
  is#IR_task                                      +                 + 
    is#lexical_search                             +                 + 
      is#regular_expression_based_search          +                 . 
    km#knowledge_retrieval_task                   +                 . 
      km#generalization_structural_retrieval      +                 . 
    ...
output_language: 
  km#KR_notation                                  +                 + 
    (expressivity: km#FOL)                        +                 + 
      km#FCG                                      +                 . 
      km#KIF                                      .                 + 
    km#XML-based notation                         +                 . 
      km#RDF                                      +                 - 
    ...
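
A minimal Python sketch of how such a table could be produced from a specialization hierarchy and recorded features, with a depth limit like `maxdepth`. The `compare` function, the dictionaries and the sample data are assumptions made for the example. Only "+" (recorded) and "." (nothing recorded) are produced; the "-" mark of the table above is not modeled.

```python
def compare(root, tools, hierarchy, support, maxdepth=5):
    """Walk the specialization hierarchy from `root` and mark each tool on each row."""
    rows = []

    def visit(node, depth):
        if depth > maxdepth:
            return
        marks = ["+" if node in support[tool] else "." for tool in tools]
        rows.append(("  " * depth + node, marks))
        for child in hierarchy.get(node, []):
            visit(child, depth + 1)

    visit(root, 0)
    return rows

# Toy data (assumed), loosely following the first rows of the table.
HIERARCHY = {
    "is#IR_task": ["is#lexical_search", "km#knowledge_retrieval_task"],
    "is#lexical_search": ["is#regular_expression_based_search"],
    "km#knowledge_retrieval_task": ["km#generalization_structural_retrieval"],
}
SUPPORT = {
    "WebKB-2": {"is#IR_task", "is#lexical_search", "is#regular_expression_based_search",
                "km#knowledge_retrieval_task", "km#generalization_structural_retrieval"},
    "Ontolingua": {"is#IR_task", "is#lexical_search"},
}

for label, marks in compare("is#IR_task", ["WebKB-2", "Ontolingua"], HIERARCHY, SUPPORT):
    print(f"{label:<48}{marks[0]:<16}{marks[1]}")
```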

SELECT ?tool ?r ?chrc   //attempt at doing something roughly similar in SPARQL
WHERE { { ?tool = pm:WebKB-2 } UNION { ?tool = km:Ontolingua }.
        ?tool a tool;  ?r{0,5} ?chrc.
        {?r rdfs:subClassOf [owl:inverseOf :support] } UNION
        {?r rdfs:subClassOf output_language } UNION
        {?r rdfs:subClassOf :part }.
        {?chrc rdf:type/rdfs:subClassOf* is:IR_task } UNION
        {?chrc rdf:type/rdfs:subClassOf* km:KR_notation } UNION
        {?chrc rdf:type/rdfs:subClassOf* is:user_interface }.
      }
GROUP BY ?tool ?r   //possible?

/* SELECT *   //older attempt
   WHERE { { ?t = pm:WebKB-2 } UNION { ?t = km:Ontolingua }.
           [?t a tool;  ^support{1,5} [a is:IR_task];
                        output_language{0,5} [a km:KR_notation];
                        part{0,5} [a is:user_interface] ]
         }
*/