How should the knowledge in a KG be modeled?
– Which classes of entities do we have?
– Which relations connect them?
– Which constraints hold for them?
→ these questions are answered by the ontology of the knowledge graph
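The three questions above can be sketched as a small data structure. This is only an illustrative sketch: the `Ontology` class, the relation `locatedIn`, and the sample instances are hypothetical names, and the check treats domain and range as closed-world constraints, which is a simplification (RDFS itself uses domain and range for inference, not validation).

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy ontology: classes, relations with domain/range, and instance typing."""
    classes: set = field(default_factory=set)
    relations: dict = field(default_factory=dict)  # relation name -> (domain, range)
    types: dict = field(default_factory=dict)      # instance name -> class name

    def violations(self, triples):
        """Return the triples whose subject or object does not match domain/range."""
        bad = []
        for s, p, o in triples:
            domain, range_ = self.relations[p]
            if self.types.get(s) != domain or self.types.get(o) != range_:
                bad.append((s, p, o))
        return bad


# Hypothetical example data
onto = Ontology(
    classes={"City", "Country"},
    relations={"locatedIn": ("City", "Country")},
    types={"Mannheim": "City", "Germany": "Country"},
)
triples = [
    ("Mannheim", "locatedIn", "Germany"),
    ("Germany", "locatedIn", "Mannheim"),  # subject and object swapped
]
print(onto.violations(triples))
```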
How we have built ontologies so far
– Read the requirements
– Pick a starting point at random
– Start playing around in Protégé
– Trial and error driven
That was “Ontology Hacking” rather than “Ontology Engineering”
How to build ontologies?
Methodologies
How to build good ontologies?
– Best Practices
– Design Patterns
– Anti Patterns
– Top Level Ontologies
The SECI Model
Two types of knowledge
Tacit Knowledge
intuitive, hard to formalize
e.g., riding a bike, playing improvised music
Explicit knowledge
formalized
e.g., kinematics, music theory
Tacit knowledge is created from explicit knowledge and vice versa. Knowledge creation is usually a cooperative process
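The SECI model, also known as the knowledge conversion model, describes how knowledge is created and converted within an organization. SECI stands for Socialization, Externalization, Combination, and Internalization.
Socialization:
This mode involves sharing tacit knowledge through direct interaction and shared experience. Tacit knowledge is personal and hard to formalize, e.g., skills, insight, and intuition. Socialization happens when individuals interact with, observe, and learn from one another. This mode emphasizes learning through social activities such as mentoring, apprenticeship, and shared experience.
Externalization:
Externalization is the process of expressing tacit knowledge as explicit concepts. It converts an individual's tacit knowledge into shared concepts and models that can be communicated and understood. This mode typically uses narratives, metaphors, or analogies to turn tacit knowledge into an explicit, communicable form.
Combination:
In the combination mode, explicit knowledge is combined and organized. This involves systematically collecting, classifying, and reorganizing explicit knowledge. The goal is to create new knowledge by organizing existing explicit knowledge in a meaningful way. This mode is typically realized by creating databases, manuals, or other structured knowledge bases.
Internalization:
Internalization is the process of embodying explicit knowledge as tacit knowledge. In this mode, individuals acquire explicit knowledge and internalize it by applying it and integrating it into their personal tacit knowledge. This involves learning through practice, doing, and gaining personal experience. Internalization completes the cycle by converting explicit knowledge back into an individual's tacit knowledge.
A methodology is a system of principles, norms, procedures, techniques, and methods used within a field of research or discipline to solve problems, advance research, or carry out work. It is a systematic, organized body of methods that guides systematic, scientific, and repeatable work in a specific field or activity.
Grüninger & Fox’s Methodology is a methodology for ontology modeling that focuses in particular on formalizing the description and representation of domain knowledge.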
Step by step from less to more formal ontologies
Stepping back is allowed
Documentation is produced along the way
Glossary: Terms, descriptions, synonyms, antonyms
Taxonomy: Sub class relations
Ad hoc binary relations: a.k.a. ObjectProperties
Concept dictionary contains: terms, descriptions, relations, instances (optional)
Methodology
A collection of analysis methods and tests
– Does my class hierarchy make sense?
Rule1: Rigidity
OntoClean distinguishes rigid and non-rigid classes
If an entity belongs to a rigid class, this holds once and for all, i.e.: if the entity does not belong to that class anymore, it ceases to exist
This does not hold for non-rigid classes
Examples of rigid classes: person, mountain, company
Examples of non-rigid classes: student, stock company, town, caterpillar and butterfly
OntoClean rule: Rigid classes must not be subclasses of non-rigid classes
Other typical rigidity problems
PhysicalObject > Animal
An entity may die and thus no longer be an animal, if we consider “living” a necessary property of animals. The physical object (i.e., the body), however, still exists.
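The rigidity rule can be checked mechanically once every class carries a rigidity tag. The following is a minimal sketch, assuming the tags are assigned by hand; the function name `rigidity_violations` and the sample hierarchy are hypothetical, and only direct subclass edges are inspected.

```python
def rigidity_violations(subclass_of, rigid):
    """Return (sub, super) pairs where a rigid class sits below a non-rigid class."""
    return [(sub, sup) for sub, sup in subclass_of if rigid[sub] and not rigid[sup]]


# Rigidity tags, following the examples in the notes
rigid = {"Person": True, "Company": True, "Student": False, "StockCompany": False}

# Hypothetical subclass edges, written as (subclass, superclass)
subclass_of = [
    ("Student", "Person"),        # fine: non-rigid below rigid
    ("StockCompany", "Company"),  # fine
    ("Person", "Student"),        # violates the rule: rigid below non-rigid
]
print(rigidity_violations(subclass_of, rigid))
```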
Rule2: Identity
Let us look at some instances
– :1h a :Duration . :2h a :Duration . …
– :Mo10-11 a :Interval . :Mo11-12 a :Interval . …
Obviously, there are more instances of Interval than there are instances of Duration [contradiction]
How do we know that two entities are the same?
Some classes have criteria for identity
– Matriculation number of students
– Tax number for citizens and companies
– Country codes
– …
Observation: The identity criteria of the two classes are different
OntoClean rule: If p is a subclass of q, then p must not have any identity criteria that q does not have
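The identity rule amounts to a set comparison per subclass edge. The sketch below assumes, as the Duration/Interval example suggests, that Interval was modeled as a subclass of Duration; the criteria names (`length`, `start`, `end`) and the function name are illustrative assumptions, not part of OntoClean itself.

```python
def identity_violations(subclass_of, identity_criteria):
    """Map each (sub, super) edge to the identity criteria that only the subclass has."""
    result = {}
    for sub, sup in subclass_of:
        extra = identity_criteria.get(sub, set()) - identity_criteria.get(sup, set())
        if extra:
            result[(sub, sup)] = extra
    return result


# Assumed identity criteria: two durations are the same if they have the same length,
# two intervals are the same if they have the same start and end.
identity_criteria = {
    "Duration": {"length"},
    "Interval": {"start", "end"},
}
subclass_of = [("Interval", "Duration")]  # assumed modeling under test
print(identity_violations(subclass_of, identity_criteria))
```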
Rule3: Unity
For some classes, entities can be decomposed into instances of the same class. We call them “anti unity classes”
Examples:
An amount of water into two amounts of water
A group into two sub groups
Other classes only have “whole” instances → “unity classes”, e.g., people, cities
For “whole” individuals, there is always a functional relation unambiguously relating a part to the whole
Examples:
relating a body part to a person
relating a district to a city
OntoClean rule: Unity classes may only have unity classes as their subclasses. Anti unity classes may only have anti unity classes as their subclasses
In our example:
– OrganicMatter is an anti unity class
– Animal is a unity class
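The unity rule can be tested the same way: a subclass edge is suspicious whenever the two classes carry different unity tags. This sketch assumes that Animal was modeled as a subclass of OrganicMatter in the example; the class `Person` and the function name are hypothetical additions for illustration.

```python
def unity_violations(subclass_of, unity):
    """Return (sub, super) pairs whose unity tags differ."""
    return [(sub, sup) for sub, sup in subclass_of if unity[sub] != unity[sup]]


unity = {
    "OrganicMatter": "anti-unity",
    "Animal": "unity",
    "Person": "unity",  # hypothetical extra class
}
subclass_of = [
    ("Animal", "OrganicMatter"),  # unity class below anti unity class: flagged
    ("Person", "Animal"),         # unity below unity: fine
]
print(unity_violations(subclass_of, unity))
```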
Summarizing OntoClean
A number of tests that can be carried out on ontologies
– Rigidity, Identity, Unity
– Reveal possible mismodeling issues
– Avoid nonsensical reasoning consequences
Origin of the term “design pattern”
Architecture
– Recurring problems
– Standard solutions with certain degrees of freedom
Example
– Problem: rain falls into the building
– Solution: roof
– Degrees of freedom: shed roof, saddle roof, hip roof…
Types of Ontology Design Patterns
Anti patterns: things that should not be done, but are often done anyway and cause problems
Possible causes
– Not every consequence was thought through
– Little/wrong understanding of RDF/OWL principles
Rampant Classism
How to distinguish classes and instances?
For every class, there must be (one or more) instance(s)
– What should be instances of Goethe?
– Are there any sentences like “X is a Goethe”?
Sub class relations must make sense
– Pattern: “Every X is a Y”
– “Every Goethe is a Writer”?
No, so Goethe is not a sub class of writer.
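As a rough heuristic for spotting Rampant Classism, one can list the classes that have no instances and then apply the “X is a Y” test to them by hand. The sketch below is only a heuristic under that assumption: an empty class in a data set is not necessarily wrong, and the function name and sample data are hypothetical.

```python
def classes_without_instances(classes, instance_of):
    """Return the classes that no instance is typed with, sorted by name."""
    populated = set(instance_of.values())
    return sorted(c for c in classes if c not in populated)


classes = {"Writer", "Goethe"}           # "Goethe" was wrongly modeled as a class
instance_of = {"Schiller": "Writer"}     # instance name -> class name
print(classes_without_instances(classes, instance_of))
```

A class flagged here, such as `Goethe`, is a candidate for being remodeled as an instance (`Goethe` is a `Writer`).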
Exclusivity
What is happening here?
The ontology was built exclusively for one domain, e.g., cities. It breaks if used in another context (here: countries)
Semantic Web Principles
AAA (Anybody can say Anything about Anything), i.e., statements should work in different contexts
Another example:
Every person is married to at most one other person
Top Level Ontologies (Very general)
Aristotle’s Ontological Square
One of the oldest top level ontologies
Basic Categories for Top Level Ontologies
Abstract vs. concrete entities
Abstract entities have neither a temporal nor a spatial dimension, e.g., numbers, units of measure
Concrete entities have at least a temporal dimension, i.e., a time span during which they exist (a spatial dimension is optional), e.g., things (books, tables, …), events (lectures, tournaments, …)
3D vs. 4D view
3D view: Things extend in space. At every point in time, they are completely present
4D view: Things extend in time and space. At a given point in time, they can also be partially present
Actual vs. possible entities
– Actualism: only existing entities are included in an ontology
– Possibilism: all possible entities are included in an ontology
Co-location
Can multiple entities exist in the same place?
This should be easy…
– 3D view: no
– 4D view: yes, but not at the same time
…but it is not that trivial
Example: a statue and the amount of clay from which it was made
Do statues even exist?
– Or is there only clay in the shape of a statue?
– …and if both exist, should they belong to the same category?
– Another example: a hole in a piece of Swiss cheese
Do holes even exist? Or are there only perforated objects?
John Sowa’s Top Level Ontology
An “older” top level ontology (1990s)
Three distinctions form twelve basic categories
Physical vs. Abstract
Physical: Things that exist in time (and potentially in space)
Abstract: Things that do not
Continuant vs. Occurrent
Continuant: Things that exist as a whole at each point in time
Occurrent: Things that partially exist at each point in time (e.g., a lecture)
Independent vs. Relative vs. Mediating
Independent: Things that can exist on their own
Relative: Things that require other things to exist
Mediating: “Third” things that relate two others
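The count of twelve follows from combining the three distinctions: 2 × 2 × 3 = 12. A minimal sketch that enumerates the combinations (only the combinations are generated; the names Sowa gives to the individual categories are not reproduced here):

```python
from itertools import product

distinctions = [
    ["Physical", "Abstract"],
    ["Continuant", "Occurrent"],
    ["Independent", "Relative", "Mediating"],
]

# Every basic category is one choice from each of the three distinctions
categories = list(product(*distinctions))
for category in categories:
    print(" / ".join(category))
print(len(categories))
```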
DOLCE
Descriptive Ontology for Linguistic and Cognitive Engineering
One of the most well known top level ontologies
Particulars, universals, and qualities
Universals (think: categories): can have instances
– “City”, “University”
Particulars (think: individuals): cannot have instances
– “Mannheim”, “Mannheim University”
Qualities: describe an instance
– e.g., color of a book, height of a person
– Are neither particulars nor universals
– Cannot exist without an instance
A top level ontology of particulars
For both actual and possible entities (possibilistic view)
4D: Some entities may have a temporal dimension
Co-location
Is allowed
Restriction: no two entities of the same kind at the same spatial and temporal location
Not: two statues. But: a statue and an amount of clay
Top Hierarchy of DOLCE
Endurants vs. Perdurants
Endurants exist in time
Think: things like people, books, … May also be non-physical: organizations, pieces of information
Are always fully present at each point in time during their existence
Perdurants “happen” in time
Think: events and processes
Only exist partially at each point in time during their existence, i.e., at a given point in time, past parts of the perdurant may no longer exist and future parts may not yet exist
Qualities are attached to endurants and perdurants
Abstracts: numbers, units of measure, etc.
Endurants take part in perdurants
– Actively (Reader and reading)
– Passively (Book and reading)
– DOLCE defines various types of participation
Endurants only consist of endurants, perdurants only consist of perdurants
– Books consist of pages, cover, …
– Reading consists of perceiving, turning pages, …
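The parthood constraint above (endurants only consist of endurants, perdurants only of perdurants) can be sketched as a simple check. The function name, the entity names, and the deliberately wrong `("Page", "Reading")` pair are hypothetical illustrations; DOLCE itself states this as an axiom, not as code.

```python
def mixed_parthood(part_of, kind):
    """Return (part, whole) pairs that mix endurants and perdurants."""
    return [(part, whole) for part, whole in part_of if kind[part] != kind[whole]]


kind = {
    "Book": "endurant",
    "Page": "endurant",
    "Cover": "endurant",
    "Reading": "perdurant",
    "TurningPages": "perdurant",
}
part_of = [
    ("Page", "Book"),             # endurant part of endurant: fine
    ("Cover", "Book"),            # fine
    ("TurningPages", "Reading"),  # perdurant part of perdurant: fine
    ("Page", "Reading"),          # mixes the two kinds: flagged
]
print(mixed_parthood(part_of, kind))
```

A page relates to a reading by participation, not by parthood, which is why the last pair is flagged.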
Endurants in DOLCE
Distinguishing Endurants
Amount of Matter vs. Physical Object
Amount of Matter is “mereologically invariant”, i.e., a part of an AoM is still an AoM
– A part of “some water” is still “some water”
– But a part of a cup is (likely) not a cup
– cf. unity/anti unity in OntoClean
Features
Cannot exist without a physical endurant, e.g., holes, fringes
Perdurants in DOLCE
Distinguishing Perdurants
Qualities
Basic distinction
Quality is a property of an entity
Quality space is the set of possible values of the quality
Qualities need entities
In general, all particulars can have qualities. Qualities only exist as long as the entity exists
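The relation between a quality, its bearer, and its quality space can be sketched as follows. This is an illustrative simplification: the class `Quality`, the finite `COLOR_SPACE`, and the bearer name `book1` are hypothetical, and real quality spaces (e.g., for height) are usually continuous rather than finite sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quality:
    """A quality of one bearer entity, with a value taken from its quality space."""
    name: str
    bearer: str        # qualities need an entity they belong to
    space: frozenset   # the quality space: all possible values
    value: str

    def __post_init__(self):
        # A quality value must lie inside the quality space
        if self.value not in self.space:
            raise ValueError(f"{self.value!r} is not in the quality space of {self.name}")


COLOR_SPACE = frozenset({"red", "green", "blue"})  # assumed finite space for illustration
color = Quality(name="color", bearer="book1", space=COLOR_SPACE, value="red")
print(color.value)
```

Constructing `Quality("color", "book1", COLOR_SPACE, "loud")` raises a `ValueError`, because `"loud"` is not a value in the color space.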
SUMO: Suggested Upper Merged Ontology
– Around 1,000 classes
– Strong formalization in KIF (Knowledge Interchange Format)
Cyc: stems from EnCyClopedia
– Own language (CycL)
– Top Level and deep general ontology
– ~250,000 classes
– OpenCyc: discontinued, but still available
PROTON: PROTo ONtology
– General top level + upper level, different domain extensions
– ~300 classes, ~100 relations