# Knowledge Representation and Expert Systems

![Summary of Symbolic AI content](../sketchnotes/ai-symbolic.png)

> Sketchnote by [Tomomi Imura](https://twitter.com/girlie_mac)

The quest for artificial intelligence is based on a search for knowledge, to make sense of the world similar to how humans do. But how can you go about doing this?

## [Pre-lecture quiz](https://black-ground-0cc93280f.1.azurestaticapps.net/quiz/3)

In the early days of AI, the top-down approach to creating intelligent systems (discussed in the previous lesson) was popular. The idea was to extract the knowledge from people into some machine-readable form, and then use it to automatically solve problems. This approach was based on two big ideas:

* Knowledge Representation
* Reasoning

## Knowledge Representation

One of the important concepts in Symbolic AI is **knowledge**. It is important to differentiate knowledge from *information* or *data*. For example, one can say that books contain knowledge, because one can study books and become an expert. However, what books contain is actually called *data*, and by reading books and integrating this data into our world model we convert this data to knowledge.

> ✅ **Knowledge** is something which is contained in our head and represents our understanding of the world. It is obtained by an active **learning** process, which integrates pieces of information that we receive into our active model of the world.

Most often, we do not strictly define knowledge, but we align it with other related concepts using the [DIKW Pyramid](https://en.wikipedia.org/wiki/DIKW_pyramid). It contains the following concepts:

* **Data** is something represented in physical media, such as written text or spoken words. Data exists independently of human beings and can be passed between people.
* **Information** is how we interpret data in our head. For example, when we hear the word *computer*, we have some understanding of what it is.
* **Knowledge** is information being integrated into our world model. For example, once we learn what a computer is, we start having some ideas about how it works, how much it costs, and what it can be used for. This network of interrelated concepts forms our knowledge.
* **Wisdom** is yet one more level of our understanding of the world, and it represents *meta-knowledge*, e.g. some notion of how and when the knowledge should be used.

<img src="images/DIKW_Pyramid.png" width="30%"/>

*Image [from Wikipedia](https://commons.wikimedia.org/w/index.php?curid=37705247), By Longlivetheux - Own work, CC BY-SA 4.0*

Thus, the problem of **knowledge representation** is to find some effective way to represent knowledge inside a computer in the form of data, to make it automatically usable. This can be seen as a spectrum:

![Knowledge representation spectrum](images/knowledge-spectrum.png)

> Image by [Dmitry Soshnikov](http://soshnikov.com)

* On the left, there are very simple types of knowledge representations that can be effectively used by computers. The simplest one is algorithmic, when knowledge is represented by a computer program. This, however, is not the best way to represent knowledge, because it is not flexible. Knowledge inside our head is often non-algorithmic.
* On the right, there are representations such as natural text. It is the most powerful, but cannot be used for automatic reasoning.

> ✅ Think for a minute about how you represent knowledge in your head and convert it to notes. Is there a particular format that works well for you to aid in retention?

## Classifying Computer Knowledge Representations

We can classify different computer knowledge representation methods into the following categories:

* **Network representations** are based on the fact that we have a network of interrelated concepts inside our head. We can try to reproduce the same networks as a graph inside a computer - a so-called **semantic network**.

1. **Object-Attribute-Value triplets** or **attribute-value pairs**. Since a graph can be represented inside a computer as a list of nodes and edges, we can represent a semantic network by a list of triplets, containing objects, attributes, and values. For example, we build the following triplets about programming languages:

Object | Attribute | Value
-------|-----------|------
Python | is | Untyped-Language
Python | invented-by | Guido van Rossum
Python | block-syntax | indentation
Untyped-Language | doesn't have | type definitions

> ✅ Think how triplets can be used to represent other types of knowledge.

2. **Hierarchical representations** emphasize the fact that we often create a hierarchy of objects inside our head. For example, we know that a canary is a bird, and all birds have wings. We also have some idea about what colour a canary usually is, and what its flight speed is.

   - **Frame representation** is based on representing each object or class of objects as a **frame** which contains **slots**. Slots have possible default values, value restrictions, or stored procedures that can be called to obtain the value of a slot. All frames form a hierarchy similar to an object hierarchy in object-oriented programming languages.
   - **Scenarios** are a special kind of frame that represents complex situations that can unfold in time.

**Python**

Slot | Value | Default value | Interval |
-----|-------|---------------|----------|
Name | Python | | |
Is-A | Untyped-Language | | |
Variable Case | | CamelCase | |
Program Length | | | 5-5000 lines |
Block Syntax | Indent | | |

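To make the idea of slots, defaults, and value restrictions more concrete, here is a small sketch of the **Python** frame. The `Frame` class, its method names, and the `Has Type Definitions` slot on the parent frame are assumptions made for this example; real frame systems also support stored procedures, which are left out here.

```python
class Frame:
    """A frame with slots, default values, interval restrictions and an Is-A parent."""

    def __init__(self, name, is_a=None, slots=None, defaults=None, intervals=None):
        self.name = name
        self.is_a = is_a                          # parent frame, or None
        self.slots = dict(slots or {})            # explicitly stored values
        self.defaults = dict(defaults or {})      # fallback values
        self.intervals = dict(intervals or {})    # slot -> (low, high) restriction

    def get(self, slot):
        # Design choice for this sketch: look at a frame's own value, then its
        # own default, and only then move up the Is-A hierarchy.
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            if slot in frame.defaults:
                return frame.defaults[slot]
            frame = frame.is_a
        return None

    def set(self, slot, value):
        # Enforce the interval restriction, if the slot has one.
        if slot in self.intervals:
            low, high = self.intervals[slot]
            if not low <= value <= high:
                raise ValueError(f"{slot} must be between {low} and {high}")
        self.slots[slot] = value


# Hypothetical parent frame; its slot restates the triplet
# "Untyped-Language doesn't have type definitions".
untyped_language = Frame("Untyped-Language", slots={"Has Type Definitions": False})

python = Frame(
    "Python",
    is_a=untyped_language,
    slots={"Name": "Python", "Block Syntax": "Indent"},
    defaults={"Variable Case": "CamelCase"},
    intervals={"Program Length": (5, 5000)},
)

variable_case = python.get("Variable Case")       # comes from the default value
has_types = python.get("Has Type Definitions")    # inherited through Is-A
python.set("Program Length", 120)                 # accepted: inside 5-5000
```

Setting `Program Length` to a value outside the interval raises a `ValueError`, which is one simple way to model a value restriction.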
3. **Procedural representations** are based on representing knowledge by a list of actions that can be executed when a certain condition occurs.
   - Production rules are if-then statements that allow us to draw conclusions. For example, a doctor can have a rule saying that **IF** a patient has high fever **OR** high level of C-reactive protein in blood test **THEN** the patient has an inflammation. Once we encounter one of the conditions, we can draw a conclusion about inflammation, and then use it in further reasoning.
   - Algorithms can be considered another form of procedural representation, although they are almost never used directly in knowledge-based systems.

4. **Logic** was originally proposed by Aristotle as a way to represent universal human knowledge.
   - Predicate Logic as a mathematical theory is too rich to be computable; therefore, some subset of it is normally used, such as Horn clauses used in Prolog.
   - Description Logic is a family of logical systems used to represent and reason about hierarchies of objects in distributed knowledge representations such as the *semantic web*.

## Expert Systems

One of the early successes of symbolic AI was the so-called **expert systems** - computer systems that were designed to act as an expert in some limited problem domain. They were based on a **knowledge base** extracted from one or more human experts, and they contained an **inference engine** that performed some reasoning on top of it.

![Human Architecture](images/arch-human.png) | ![Knowledge-Based System](images/arch-kbs.png)
---------------------------------------------|------------------------------------------------
Simplified structure of a human neural system | Architecture of a knowledge-based system

Expert systems are built like the human reasoning system, which contains **short-term memory** and **long-term memory**. Similarly, in knowledge-based systems we distinguish the following components:
* **Problem memory**: contains the knowledge about the problem being currently solved, e.g. the temperature or blood pressure of a patient, whether the patient has inflammation or not, etc. This knowledge is also called **static knowledge**, because it contains a snapshot of what we currently know about the problem - the so-called *problem state*.
* **Knowledge base**: represents long-term knowledge about a problem domain. It is extracted manually from human experts, and does not change from consultation to consultation. Because it allows us to navigate from one problem state to another, it is also called **dynamic knowledge**.
* **Inference engine**: orchestrates the whole process of searching in the problem state space, asking questions of the user when necessary. It is also responsible for finding the right rules to be applied to each state.

As an example, let's consider the following expert system for identifying an animal based on its physical characteristics:

![AND-OR Tree](images/AND-OR-Tree.png)

> Image by [Dmitry Soshnikov](http://soshnikov.com)

This diagram is called an **AND-OR tree**, and it is a graphical representation of a set of production rules. Drawing a tree is useful at the beginning of extracting knowledge from the expert. To represent the knowledge inside the computer it is more convenient to use rules:
```
IF the animal eats meat
OR (animal has sharp teeth
    AND animal has claws
    AND animal has forward-looking eyes
)
THEN the animal is a carnivore
```
Notice that each condition on the left-hand side (LHS) of the rule, as well as the action, is essentially an object-attribute-value (OAV) triplet. **Working memory** contains the set of OAV triplets that correspond to the problem currently being solved. A **rules engine** looks for rules for which a condition is satisfied and applies them, adding another triplet to the working memory.

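One possible way to encode the carnivore rule and a working memory in Python is sketched below. The nested `("AND"/"OR", [...])` encoding, the attribute names such as `teeth` and `eyes`, and the helper functions `holds` and `apply_rule` are all choices made for this illustration, not a standard format.

```python
# Working memory: the OAV triplets known about the animal being examined.
working_memory = {
    ("animal", "teeth", "sharp"),
    ("animal", "has", "claws"),
    ("animal", "eyes", "forward-looking"),
}

# The rule from the text. Inner nodes are (operator, [sub-conditions]),
# leaves are plain OAV triplets.
carnivore_rule = {
    "if": ("OR", [
        ("animal", "eats", "meat"),
        ("AND", [
            ("animal", "teeth", "sharp"),
            ("animal", "has", "claws"),
            ("animal", "eyes", "forward-looking"),
        ]),
    ]),
    "then": ("animal", "is", "carnivore"),
}


def holds(condition, memory):
    """Evaluate a condition tree against the working memory."""
    if len(condition) == 2:  # inner node: (operator, [sub-conditions])
        operator, parts = condition
        combine = all if operator == "AND" else any
        return combine(holds(part, memory) for part in parts)
    return condition in memory  # leaf: a single OAV triplet


def apply_rule(rule, memory):
    """Add the rule's conclusion to memory if its condition holds."""
    if holds(rule["if"], memory):
        memory.add(rule["then"])
        return True
    return False


fired = apply_rule(carnivore_rule, working_memory)
print(fired, ("animal", "is", "carnivore") in working_memory)
```

Here the rule fires through the `AND` branch, because the working memory says nothing about what the animal eats.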
> ✅ Write your own AND-OR tree on a topic you like!

### Forward vs. Backward Inference

The process described above is called **forward inference**. It starts with some initial data about the problem available in the working memory, and then executes the following reasoning loop:

1. If the target attribute is present in the working memory - stop and give the result.
2. Look for all the rules whose condition is currently satisfied - obtain the **conflict set** of rules.
3. Perform **conflict resolution** - select one rule that will be executed at this step. There could be different conflict resolution strategies:
   - Select the first applicable rule in the knowledge base
   - Select a random rule
   - Select a *more specific* rule, i.e. the one meeting the most conditions in the "left-hand-side" (LHS)
4. Apply the selected rule and insert the new piece of knowledge into the problem state.
5. Repeat from step 1.

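The reasoning loop above can be sketched in a few lines of Python. This is a simplified illustration under several assumptions: each rule is a pair of (set of ANDed conditions, conclusion), an `OR` is expressed as several rules with the same conclusion, conflict resolution always picks the first applicable rule, and the sample rules about tigers are made up for the example rather than taken from the diagram.

```python
def forward_chain(rules, facts, goal_attribute):
    """Forward inference over OAV triplets.

    rules: list of (set_of_condition_triplets, conclusion_triplet)
    facts: initial working memory (an iterable of triplets)
    Returns (value of the goal attribute or None, final working memory).
    """
    memory = set(facts)
    while True:
        # Step 1: stop as soon as the target attribute is known.
        for (_obj, attr, value) in memory:
            if attr == goal_attribute:
                return value, memory
        # Step 2: conflict set = rules whose conditions hold and add something new.
        conflict_set = [
            (conditions, conclusion)
            for (conditions, conclusion) in rules
            if conditions <= memory and conclusion not in memory
        ]
        if not conflict_set:
            return None, memory  # nothing more can be derived
        # Step 3: conflict resolution, here simply the first applicable rule.
        _, conclusion = conflict_set[0]
        # Step 4: apply the rule; step 5 is the next loop iteration.
        memory.add(conclusion)


# Illustrative rules (hypothetical, for this sketch only)
rules = [
    ({("animal", "eats", "meat")}, ("animal", "type", "carnivore")),
    ({("animal", "type", "carnivore"),
      ("animal", "colour", "tawny"),
      ("animal", "has", "stripes")}, ("animal", "species", "tiger")),
]
facts = [
    ("animal", "eats", "meat"),
    ("animal", "colour", "tawny"),
    ("animal", "has", "stripes"),
]

species, memory = forward_chain(rules, facts, "species")
print(species)
```

Skipping rules whose conclusion is already in memory keeps the loop from firing the same rule forever, so the function always terminates.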
However, in some cases we might want to start with no knowledge about the problem, and ask questions that will help us arrive at the conclusion. For example, when doing medical diagnosis, we usually do not perform all medical analyses in advance before starting diagnosing the patient. We rather want to perform analyses when a decision needs to be made.

This process can be modeled using **backward inference**. It is driven by the **goal** - the attribute value that we are looking to find:
1. Select all rules that can give us the value of a goal (i.e. with the goal on the RHS ("right-hand-side")) - a conflict set
2. If there are no rules for this attribute, or there is a rule saying that we should ask the user for the value - ask for it, otherwise:
3. Use a conflict resolution strategy to select one rule that we will use as a *hypothesis* - we will try to prove it
4. Recursively repeat the process for all attributes in the LHS of the rule, trying to prove them as goals
5. If at any point the process fails - use another rule at step 3.

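The steps above can be sketched as a short recursive function. To keep it small, this version makes some simplifying assumptions: it proves one specific OAV triplet instead of searching for the value of an attribute, rules are tried in the order they are listed, the rule set contains no cycles, and the user is simulated by a dictionary of prepared answers. The names `prove` and `ask` are introduced here for illustration.

```python
def prove(goal, rules, known, ask):
    """Backward inference: try to establish the OAV triplet `goal`.

    rules: list of (list_of_condition_triplets, conclusion_triplet)
    known: dict caching triplet -> True/False (answers and derived results)
    ask:   callback used when no rule concludes the goal
    """
    if goal in known:
        return known[goal]
    # Step 1: conflict set = rules with the goal on the right-hand side.
    conflict_set = [conds for (conds, concl) in rules if concl == goal]
    if not conflict_set:
        # Step 2: no rule can derive the goal, so ask the user.
        known[goal] = ask(goal)
        return known[goal]
    # Steps 3 and 5: use each rule in turn as a hypothesis; on failure try the next.
    for conditions in conflict_set:
        # Step 4: recursively prove every condition on the LHS.
        if all(prove(c, rules, known, ask) for c in conditions):
            known[goal] = True
            return True
    known[goal] = False
    return False


# The carnivore rule from the text, with the OR split into two rules.
rules = [
    ([("animal", "eats", "meat")], ("animal", "type", "carnivore")),
    ([("animal", "teeth", "sharp"),
      ("animal", "has", "claws"),
      ("animal", "eyes", "forward-looking")], ("animal", "type", "carnivore")),
]

# Simulated user answers (hypothetical data for this example)
answers = {
    ("animal", "eats", "meat"): False,
    ("animal", "teeth", "sharp"): True,
    ("animal", "has", "claws"): True,
    ("animal", "eyes", "forward-looking"): True,
}
asked = []


def ask(triplet):
    asked.append(triplet)
    return answers.get(triplet, False)


is_carnivore = prove(("animal", "type", "carnivore"), rules, {}, ask)
print(is_carnivore, len(asked))
```

Note how questions are asked only when a hypothesis needs them: if the first answer had been positive, the questions about teeth, claws and eyes would never have been asked.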
> ✅ In which situations is forward inference more appropriate? How about backward inference?

### Implementing Expert Systems

Expert systems can be implemented using different tools:
* Programming them directly in some high-level programming language. This is not the best idea, because the main advantage of a knowledge-based system is that knowledge is separated from inference, and potentially a problem domain expert should be able to write rules without understanding the details of the inference process.
* Using an **expert system shell**, i.e. a system specifically designed to be populated by knowledge using some knowledge representation language.

## ✍️ Exercise: Animal Inference

See [Animals.ipynb](Animals.ipynb) for an example of implementing an expert system with forward and backward inference.

> **Note**: This example is rather simple, and only gives an idea of what an expert system looks like. Once you start creating such a system, you will only notice some *intelligent* behaviour from it once you reach a certain number of rules, around 200+. At some point, rules become too complex to keep all of them in mind, and at this point you may start wondering why a system makes certain decisions. However, an important characteristic of knowledge-based systems is that you can always *explain* exactly how any of the decisions were made.

## Ontologies and the Semantic Web

At the end of the 20th century there was an initiative to use knowledge representation to annotate Internet resources, so that it would be possible to find resources that correspond to very specific queries. This movement was called the **Semantic Web**, and it relied on several concepts:
- A special knowledge representation based on **[description logics](https://en.wikipedia.org/wiki/Description_logic)** (DL). It is similar to frame knowledge representation, because it builds a hierarchy of objects with properties, but it has formal logical semantics and inference. There is a whole family of DLs which balance between expressiveness and algorithmic complexity of inference.
- Distributed knowledge representation, where all concepts are represented by a global URI identifier, making it possible to create knowledge hierarchies that span the internet.
- A family of XML-based languages for knowledge description: RDF (Resource Description Framework), RDFS (RDF Schema), OWL (Web Ontology Language).

A core concept in the Semantic Web is the concept of an **ontology**. It refers to an explicit specification of a problem domain using some formal knowledge representation. The simplest ontology can be just a hierarchy of objects in a problem domain, but more complex ontologies will include rules that can be used for inference.

In the semantic web, all representations are based on triplets. Each object and each relation are uniquely identified by a URI. For example, if we want to state the fact that this AI Curriculum has been developed by Dmitry Soshnikov on Jan 1st, 2022 - here are the triplets we can use:

<img src="images/triplet.png" width="30%"/>

```
http://github.com/microsoft/ai-for-beginners http://www.example.com/terms/creation-date "Jan 1st, 2022"
http://github.com/microsoft/ai-for-beginners http://purl.org/dc/elements/1.1/creator http://soshnikov.com
```
> ✅ Here `http://www.example.com/terms/creation-date` and `http://purl.org/dc/elements/1.1/creator` are URIs that express the concepts of *creation date* and *creator*. The second one comes from the widely used Dublin Core vocabulary.

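To show how such triplets might be written down in a machine-readable way, here is a small sketch that serializes them in the N-Triples style, where URIs are wrapped in angle brackets, literals are quoted, and each statement ends with a dot. The helper names and the rule "anything starting with `http://` or `https://` is a URI" are simplifying assumptions of this example; a real project would normally rely on an RDF library instead.

```python
def term(value):
    """Format one term: URIs go in angle brackets, anything else becomes a literal."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    # Escape backslashes and double quotes inside literals.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


triples = [
    ("http://github.com/microsoft/ai-for-beginners",
     "http://www.example.com/terms/creation-date",
     "Jan 1st, 2022"),
    ("http://github.com/microsoft/ai-for-beginners",
     "http://purl.org/dc/elements/1.1/creator",
     "http://soshnikov.com"),
]

document = to_ntriples(triples)
print(document)
```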
In a more complex case, if we want to define a list of creators, we can use some data structures defined in RDF.

<img src="images/triplet-complex.png" width="40%"/>

> Diagrams above by [Dmitry Soshnikov](http://soshnikov.com)

The progress of building the Semantic Web was somewhat slowed down by the success of search engines and natural language processing techniques, which allow extracting structured data from text. However, in some areas there are still significant efforts to maintain ontologies and knowledge bases. A few projects worth noting:
* [WikiData](https://wikidata.org/) is a collection of machine-readable knowledge bases associated with Wikipedia. Most of the data is mined from Wikipedia *InfoBoxes*, pieces of structured content inside Wikipedia pages. You can [query](https://query.wikidata.org/) WikiData in SPARQL, a special query language for the Semantic Web. Here is a sample query that displays the most popular eye colors among humans:
```sparql
#defaultView:BubbleChart
SELECT ?eyeColorLabel (COUNT(?human) AS ?count)
WHERE
{
  ?human wdt:P31 wd:Q5.       # human instance-of homo sapiens
  ?human wdt:P1340 ?eyeColor. # human eye-color ?eyeColor
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?eyeColorLabel
```
* [DBpedia](https://www.dbpedia.org/) is another effort similar to WikiData.

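The WikiData eye color query above can also be sent from a program instead of the interactive query page. The sketch below uses only the Python standard library and the public SPARQL endpoint at `https://query.wikidata.org/sparql`; the function names and the `User-Agent` string are placeholders chosen for this example, and `fetch_counts` needs network access to run.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# The same query as above, without the chart directive and comments.
QUERY = """
SELECT ?eyeColorLabel (COUNT(?human) AS ?count)
WHERE
{
  ?human wdt:P31 wd:Q5.
  ?human wdt:P1340 ?eyeColor.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?eyeColorLabel
"""


def build_url(query):
    """Build a GET URL that asks the endpoint for JSON results."""
    return ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})


def parse_counts(payload):
    """Turn a SPARQL JSON result into a {label: count} dictionary."""
    rows = payload["results"]["bindings"]
    return {row["eyeColorLabel"]["value"]: int(row["count"]["value"]) for row in rows}


def fetch_counts(query=QUERY):
    """Run the query over HTTP (requires network access)."""
    request = urllib.request.Request(
        build_url(query),
        headers={"User-Agent": "ai-for-beginners-example/0.1"},  # placeholder value
    )
    with urllib.request.urlopen(request) as response:
        return parse_counts(json.load(response))
```

Calling `fetch_counts()` should return a dictionary that maps eye color labels to the number of people recorded with that color; the exact numbers depend on the current state of WikiData.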
> ✅ If you want to experiment with building your own ontologies, or opening existing ones, there is a great visual ontology editor called [Protégé](https://protege.stanford.edu/). Download it, or use it online.

<img src="images/protege.png" width="70%"/>

*Web Protégé editor open with the Romanov Family ontology. Screenshot by Dmitry Soshnikov*

## ✍️ Exercise: A Family Ontology

See [FamilyOntology.ipynb](FamilyOntology.ipynb) for an example of using Semantic Web techniques to reason about family relationships. We will take a family tree represented in the common GEDCOM format and an ontology of family relationships and build a graph of all family relationships for a given set of individuals.

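To get a feeling for what "building a graph of all family relationships" means, here is a tiny stand-alone sketch. It does not read GEDCOM and does not use an ontology language, so it is not the approach taken in the notebook; it only shows how new triplets can be derived from basic parent-child facts with two hand-written rules. The function name and the people in the example are made up.

```python
def infer_relations(parent_of):
    """Derive sibling and grandparent triplets from (parent, child) pairs."""
    triples = {(parent, "parent-of", child) for (parent, child) in parent_of}

    # Index children by parent for quick lookup.
    children = {}
    for parent, child in parent_of:
        children.setdefault(parent, set()).add(child)

    for parent, kids in children.items():
        # Rule: two different children of the same parent are siblings.
        for a in kids:
            for b in kids:
                if a != b:
                    triples.add((a, "sibling-of", b))
        # Rule: a parent of a parent is a grandparent.
        for kid in kids:
            for grandchild in children.get(kid, ()):
                triples.add((parent, "grandparent-of", grandchild))
    return triples


# Hypothetical family: Anna has two children, and one grandchild through Boris.
family = [("Anna", "Boris"), ("Anna", "Clara"), ("Boris", "Dina")]

relations = infer_relations(family)
for triple in sorted(relations):
    print(triple)
```

An ontology expresses rules like these declaratively, so that a general-purpose reasoner can apply them instead of custom code.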
## Microsoft Concept Graph

In most cases, ontologies are carefully created by hand. However, it is also possible to **mine** ontologies from unstructured data, for example, from natural language texts.

One such attempt was made by Microsoft Research, and resulted in [Microsoft Concept Graph](https://blogs.microsoft.com/ai/microsoft-researchers-release-graph-that-helps-machines-conceptualize/?WT.mc_id=academic-57639-dmitryso).

It is a large collection of entities grouped together using the `is-a` inheritance relationship. It allows answering questions like "What is Microsoft?" - the answer being something like "a company with probability 0.87, and a brand with probability 0.75".

The Graph is available either as a REST API, or as a large downloadable text file that lists all entity pairs.

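Conceptually, such a graph can be thought of as a lookup table from an entity to its possible concepts with scores. The sketch below is only an in-memory illustration using the two values quoted above; the dictionary layout and the `conceptualize` function are assumptions of this example and do not reflect the actual file format or REST API.

```python
# entity -> list of (concept, score) pairs; the scores are the ones quoted in the text.
is_a = {
    "microsoft": [("company", 0.87), ("brand", 0.75)],
}


def conceptualize(entity, graph, top=1):
    """Return the `top` highest-scoring concepts for an entity, best first."""
    candidates = sorted(
        graph.get(entity.lower(), []),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return candidates[:top]


best = conceptualize("Microsoft", is_a)
both = conceptualize("Microsoft", is_a, top=2)
print(best, both)
```

Picking the best-scoring concept for each entity found in a text is one simple way to assign the text to a category.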
## ✍️ Exercise: A Concept Graph

Try the [MSConceptGraph.ipynb](MSConceptGraph.ipynb) notebook to see how we can use Microsoft Concept Graph to group news articles into several categories.

## Conclusion

Nowadays, AI is often considered to be a synonym for *Machine Learning* or *Neural Networks*. However, a human being also exhibits explicit reasoning, which is something currently not being handled by neural networks. In real-world projects, explicit reasoning is still used to perform tasks that require explanations, or being able to modify the behavior of the system in a controlled way.

## 🚀 Challenge

In the three notebooks associated with this lesson, there are challenges at the end - pick one, and try to solve it!

## [Post-lecture quiz](https://black-ground-0cc93280f.1.azurestaticapps.net/quiz/4)

## Review & Self Study

Do some research on the internet to discover areas where humans have tried to quantify and codify knowledge. Take a look at Bloom's Taxonomy, and go back in history to learn how humans tried to make sense of their world. Explore the work of Linnaeus to create a taxonomy of organisms, and observe the way Dmitri Mendeleev created a way for chemical elements to be described and grouped. What other interesting examples can you find?

**Assignment**: [Build an Ontology](assignment.md)