microsoft/AI-For-Beginners
Mirrored from https://github.com/microsoft/AI-For-Beginners
lessons/2-Symbolic/FamilyOntology.ipynb
| 1 | { |
| 2 | "cells": [ |
| 3 | { |
| 4 | "cell_type": "markdown", |
| 5 | "metadata": { |
| 6 | "collapsed": true |
| 7 | }, |
| 8 | "source": [ |
| 9 | "# Family Relationships Ontology\n", |
| 10 | "\n", |
| 11 | "This example is part of the [AI for Beginners Curriculum](http://github.com/microsoft/ai-for-beginners), and it was inspired by [this blog post](https://habr.com/post/270857/).\n", |
| 12 | "\n", |
| 13 | "I always find it difficult to remember the different relationships between people in a family. In this example, we will take an ontology that defines family relationships and an actual genealogical tree, and show how automatic inference can then find all relatives.\n", |
| 14 | "\n", |
| 15 | "### Getting the Genealogical Tree\n", |
| 16 | "\n", |
| 17 | "As an example, we will take the genealogical tree of the [Romanov Tsar Family](https://en.wikipedia.org/wiki/House_of_Romanov). The most common format for describing family relationships is [GEDCOM](https://en.wikipedia.org/wiki/GEDCOM). We will take the Romanov family tree in GEDCOM format:" |
| 18 | ] |
| 19 | }, |
| 20 | { |
| 21 | "cell_type": "code", |
| 22 | "execution_count": 1, |
| 23 | "metadata": { |
| 24 | "trusted": true |
| 25 | }, |
| 26 | "outputs": [ |
| 27 | { |
| 28 | "name": "stdout", |
| 29 | "output_type": "stream", |
| 30 | "text": [ |
| 31 | "0 HEAD\n", |
| 32 | "1 CHAR UTF8\n", |
| 33 | "1 GEDC\n", |
| 34 | "2 VERS 5.5\n", |
| 35 | "0 @0@ INDI\n", |
| 36 | "1 NAME Mihail Fedorovich /Romanov/\n", |
| 37 | "1 SEX M\n", |
| 38 | "1 BIRT\n", |
| 39 | "2 DATE 1613\n", |
| 40 | "1 DEAT \n", |
| 41 | "2 DATE 1645\n", |
| 42 | "1 FAMS @41@\n", |
| 43 | "0 @1@ INDI\n", |
| 44 | "1 NAME Evdokija Lukjanovna /Streshneva/\n", |
| 45 | "1 SEX F\n" |
| 46 | ] |
| 47 | } |
| 48 | ], |
| 49 | "source": [ |
| 50 | "!head -15 data/tsars.ged" |
| 51 | ] |
| 52 | }, |
| 53 | { |
| 54 | "cell_type": "markdown", |
| 55 | "metadata": {}, |
| 56 | "source": [ |
| 57 | "To work with the GEDCOM file, we can use the `python-gedcom` library:" |
| 58 | ] |
| 59 | }, |
| 60 | { |
| 61 | "cell_type": "code", |
| 62 | "execution_count": 2, |
| 63 | "metadata": { |
| 64 | "trusted": true |
| 65 | }, |
| 66 | "outputs": [ |
| 67 | { |
| 68 | "name": "stdout", |
| 69 | "output_type": "stream", |
| 70 | "text": [ |
| 71 | "Collecting python-gedcom\n", |
| 72 | " Downloading python_gedcom-1.0.0-py2.py3-none-any.whl (35 kB)\n", |
| 73 | "Installing collected packages: python-gedcom\n", |
| 74 | "Successfully installed python-gedcom-1.0.0\n" |
| 75 | ] |
| 76 | } |
| 77 | ], |
| 78 | "source": [ |
| 79 | "import sys\n", |
| 80 | "!{sys.executable} -m pip install python-gedcom" |
| 81 | ] |
| 82 | }, |
| 83 | { |
| 84 | "cell_type": "markdown", |
| 85 | "metadata": {}, |
| 86 | "source": [ |
| 87 | "This library takes away some of the technical problems with file parsing, but it still gives us pretty low-level access to all individuals and families in the tree. Here is how we can parse the file, and show the list of all individuals:" |
| 88 | ] |
| 89 | }, |
| 90 | { |
| 91 | "cell_type": "code", |
| 92 | "execution_count": 3, |
| 93 | "metadata": { |
| 94 | "trusted": true |
| 95 | }, |
| 96 | "outputs": [], |
| 97 | "source": [ |
| 98 | "from gedcom.parser import Parser\n", |
| 99 | "from gedcom.element.individual import IndividualElement\n", |
| 100 | "from gedcom.element.family import FamilyElement\n", |
| 101 | "g = Parser()\n", |
| 102 | "g.parse_file('data/tsars.ged')" |
| 103 | ] |
| 104 | }, |
| 105 | { |
| 106 | "cell_type": "code", |
| 107 | "execution_count": 4, |
| 108 | "metadata": { |
| 109 | "scrolled": true, |
| 110 | "trusted": true |
| 111 | }, |
| 112 | "outputs": [ |
| 113 | { |
| 114 | "data": { |
| 115 | "text/plain": [ |
| 116 | "[('@0@', ('Mihail Fedorovich', 'Romanov')),\n", |
| 117 | " ('@1@', ('Evdokija Lukjanovna', 'Streshneva')),\n", |
| 118 | " ('@2@', ('Aleksej Mihajlovich', 'Romanov')),\n", |
| 119 | " ('@3@', ('Marija Ilinichna', 'Miloslavskaja')),\n", |
| 120 | " ('@4@', ('Natalja Kirillovna', 'Naryshkina')),\n", |
| 121 | " ('@5@', ('Marfa Matveevna', 'Apraksina')),\n", |
| 122 | " ('@6@', ('Fedor Alekseevich', 'Romanov')),\n", |
| 123 | " ('@7@', ('Sofja Aleksevna', 'Romanova')),\n", |
| 124 | " ('@8@', ('Ivan V Alekseevich', 'Romanov')),\n", |
| 125 | " ('@9@', ('Praskovja Fedorovna', 'Saltykova')),\n", |
| 126 | " ('@10@', ('Ekaterina Ivanovna', 'Romanova')),\n", |
| 127 | " ('@11@', ('Anna Ivanovna', 'Romanova')),\n", |
| 128 | " ('@12@', ('Fridrih Vilgelm', 'Kurlandskij')),\n", |
| 129 | " ('@13@', ('Karl Leopold', 'Meklenburg-Shverinskij')),\n", |
| 130 | " ('@14@', ('Anna Leopoldovna', 'Meklenburg-Shverinskaja')),\n", |
| 131 | " ('@15@', ('Anton Ulrih', 'Braunshvejg-Volfenbjuttelskij')),\n", |
| 132 | " ('@16@', ('Ivan VI Antonovich', 'Braunshvejg-Volfenbjuttelskij')),\n", |
| 133 | " ('@17@', ('Petr I Alekseevich', 'Romanov')),\n", |
| 134 | " ('@18@', ('Evdokija Fedorovna', 'Lopuhina')),\n", |
| 135 | " ('@19@', ('Ekaterina I Alekseevna', 'Mihajlova')),\n", |
| 136 | " ('@20@', ('Aleksej Petrovich', 'Romanov')),\n", |
| 137 | " ('@21@', ('Sharlotta Kristina', 'Braunshvejg-Volfenbjuttelskaja')),\n", |
| 138 | " ('@22@', ('Petr II Alekseevich', 'Romanov')),\n", |
| 139 | " ('@23@', ('Anna Petrovna', 'Romanova')),\n", |
| 140 | " ('@24@', ('Elizaveta Petrovna', 'Romanova')),\n", |
| 141 | " ('@25@', ('Karl Fridrih', 'Golshtejn-Gottorpskij')),\n", |
| 142 | " ('@26@', ('Petr III Fedorovich', 'Romanov')),\n", |
| 143 | " ('@27@', ('Ekaterina II', 'Alekseevna')),\n", |
| 144 | " ('@28@', ('Pavel I Petrovich', 'Romanov')),\n", |
| 145 | " ('@29@', ('Natalja Alekseevna', 'Gessen-Darmshtadskaja')),\n", |
| 146 | " ('@30@', ('Marija Fedorovna', 'Vjurtembergskaja')),\n", |
| 147 | " ('@31@', ('Aleksandr I Pavlovich', 'Romanov')),\n", |
| 148 | " ('@32@', ('Elizaveta Alekseevna', 'Baden-Durlahskaja')),\n", |
| 149 | " ('@33@', ('Nikolaj I Pavlovich', 'Romanov')),\n", |
| 150 | " ('@34@', ('Aleksandra Fedorovna', 'Prusskaja')),\n", |
| 151 | " ('@35@', ('Aleksandr II Nikolaevich', 'Romanov')),\n", |
| 152 | " ('@36@', ('Marija Aleksandrovna', 'Gessenskaja')),\n", |
| 153 | " ('@37@', ('Aleksandr III Aleksandrovich', 'Romanov')),\n", |
| 154 | " ('@38@', ('Marija Fedorovna', 'Datskaja')),\n", |
| 155 | " ('@39@', ('Nikolaj II Aleksandrovich', 'Romanov')),\n", |
| 156 | " ('@40@', ('Aleksandra Fedorovna', 'Gessenskaja'))]" |
| 157 | ] |
| 158 | }, |
| 159 | "execution_count": 4, |
| 160 | "metadata": {}, |
| 161 | "output_type": "execute_result" |
| 162 | } |
| 163 | ], |
| 164 | "source": [ |
| 165 | "d = g.get_element_dictionary()\n", |
| 166 | "[ (k,v.get_name()) for k,v in d.items() if isinstance(v,IndividualElement)]" |
| 167 | ] |
| 168 | }, |
| 169 | { |
| 170 | "cell_type": "markdown", |
| 171 | "metadata": {}, |
| 172 | "source": [ |
| 173 | "Here is how we can get information about families. Note that it gives us a list of **identifiers**, and we need to convert them to names if we want more clarity:" |
| 174 | ] |
| 175 | }, |
| 176 | { |
| 177 | "cell_type": "code", |
| 178 | "execution_count": 5, |
| 179 | "metadata": {}, |
| 180 | "outputs": [ |
| 181 | { |
| 182 | "data": { |
| 183 | "text/plain": [ |
| 184 | "[('@41@', ['@0@', '@1@', '@2@']),\n", |
| 185 | " ('@42@', ['@2@', '@3@', '@6@', '@7@', '@8@']),\n", |
| 186 | " ('@43@', ['@8@', '@9@', '@10@', '@11@']),\n", |
| 187 | " ('@44@', ['@13@', '@10@', '@14@']),\n", |
| 188 | " ('@45@', ['@15@', '@14@', '@16@']),\n", |
| 189 | " ('@46@', ['@2@', '@4@', '@17@']),\n", |
| 190 | " ('@47@', ['@17@', '@18@', '@20@']),\n", |
| 191 | " ('@48@', ['@20@', '@21@', '@22@']),\n", |
| 192 | " ('@49@', ['@17@', '@19@', '@23@', '@24@']),\n", |
| 193 | " ('@50@', ['@25@', '@23@', '@26@']),\n", |
| 194 | " ('@51@', ['@26@', '@27@', '@28@']),\n", |
| 195 | " ('@52@', ['@28@', '@30@', '@31@', '@33@']),\n", |
| 196 | " ('@53@', ['@33@', '@34@', '@35@']),\n", |
| 197 | " ('@54@', ['@35@', '@36@', '@37@']),\n", |
| 198 | " ('@55@', ['@37@', '@38@', '@39@'])]" |
| 199 | ] |
| 200 | }, |
| 201 | "execution_count": 5, |
| 202 | "metadata": {}, |
| 203 | "output_type": "execute_result" |
| 204 | } |
| 205 | ], |
| 206 | "source": [ |
| 207 | "d = g.get_element_dictionary()\n", |
| 208 | "[ (k,[x.get_value() for x in v.get_child_elements()]) for k,v in d.items() if isinstance(v,FamilyElement)]" |
| 209 | ] |
| 210 | }, |
| 211 | { |
| 212 | "cell_type": "markdown", |
| 213 | "metadata": {}, |
| 214 | "source": [ |
| 215 | "### Getting Family Ontology\n", |
| 216 | "\n", |
| 217 | "Next, let's have a look at a [family ontology](https://raw.githubusercontent.com/blokhin/genealogical-trees/master/data/header.ttl) defined as a set of Semantic Web triplets. This ontology defines relationships such as `isUncleOf`, `isCousinOf`, and many others. All those relationships are defined in terms of the basic predicates `isMotherOf`, `isFatherOf`, `isBrotherOf` and `isSisterOf`. We will use automatic reasoning to deduce all other relationships using the ontology.\n", |
| 218 | "\n", |
| 219 | "Here is a sample definition of the `isAuntOf` property, which is defined as a composition of `isSisterOf` and `isParentOf` (*an aunt is a sister of one's parent*).\n", |
| 220 | "\n", |
| 221 | "```\n", |
| 222 | "fhkb:isAuntOf a owl:ObjectProperty ;\n", |
| 223 | " rdfs:domain fhkb:Woman ;\n", |
| 224 | " rdfs:range fhkb:Person ;\n", |
| 225 | " owl:propertyChainAxiom ( fhkb:isSisterOf fhkb:isParentOf ) .\n", |
| 226 | "```" |
| 227 | ] |
| 228 | }, |
| 229 | { |
| 230 | "cell_type": "code", |
| 231 | "execution_count": 6, |
| 232 | "metadata": { |
| 233 | "trusted": true |
| 234 | }, |
| 235 | "outputs": [ |
| 236 | { |
| 237 | "name": "stdout", |
| 238 | "output_type": "stream", |
| 239 | "text": [ |
| 240 | "@prefix fhkb: <http://www.example.com/genealogy.owl#> .\n", |
| 241 | "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n", |
| 242 | "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n", |
| 243 | "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n", |
| 244 | "@prefix xml: <http://www.w3.org/XML/1998/namespace> .\n", |
| 245 | "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n", |
| 246 | "\n", |
| 247 | "<http://www.example.com/genealogy.owl#> a owl:Ontology .\n", |
| 248 | "\n", |
| 249 | "fhkb:DomainEntity a owl:Class .\n", |
| 250 | "\n", |
| 251 | "fhkb:Man a owl:Class ;\n", |
| 252 | " owl:equivalentClass [ a owl:Class ;\n", |
| 253 | " owl:intersectionOf ( fhkb:Person [ a owl:Restriction ;\n", |
| 254 | " owl:onProperty fhkb:hasSex ;\n", |
| 255 | " owl:someValuesFrom fhkb:Male ] ) ] .\n", |
| 256 | "\n", |
| 257 | "fhkb:Woman a owl:Class ;\n", |
| 258 | " owl:equivalentClass [ a owl:Class ;\n", |
| 259 | " owl:intersectionOf ( fhkb:Person [ a owl:Restriction ;\n" |
| 260 | ] |
| 261 | } |
| 262 | ], |
| 263 | "source": [ |
| 264 | "!head -20 data/onto.ttl" |
| 265 | ] |
| 266 | }, |
| 267 | { |
| 268 | "cell_type": "markdown", |
| 269 | "metadata": {}, |
| 270 | "source": [ |
| 271 | "### Constructing Ontology for Inference\n", |
| 272 | "\n", |
| 273 | "For simplicity, we will create one ontology file that includes the original rules from the family ontology and the facts about individuals from our GEDCOM file. We will go through the GEDCOM file, extract information about families and individuals, and convert it to triplets." |
| 274 | ] |
| 275 | }, |
| 276 | { |
| 277 | "cell_type": "code", |
| 278 | "execution_count": 7, |
| 279 | "metadata": { |
| 280 | "trusted": true |
| 281 | }, |
| 282 | "outputs": [], |
| 283 | "source": [ |
| 284 | "!cp data/onto.ttl .\n", |
| 285 | "\n", |
| 286 | "gedcom_dict = g.get_element_dictionary()\n", |
| 287 | "individuals, marriages = {}, {}\n", |
| 288 | "\n", |
| 289 | "def term2id(el):\n", |
| 290 | " return \"i\" + el.get_pointer().replace('@', '').lower()\n", |
| 291 | "\n", |
| 292 | "out = open(\"onto.ttl\",\"a\")\n", |
| 293 | "\n", |
| 294 | "for k, v in gedcom_dict.items():\n", |
| 295 | " if isinstance(v,IndividualElement):\n", |
| 296 | " children, siblings = set(), set()\n", |
| 297 | " idx = term2id(v)\n", |
| 298 | "\n", |
| 299 | " title = v.get_name()[0] + \" \" + v.get_name()[1]\n", |
| 300 | " title = title.replace('\"', '').replace('[', '').replace(']', '').replace('(', '').replace(')', '').strip()\n", |
| 301 | "\n", |
| 302 | " own_families = g.get_families(v, 'FAMS')\n", |
| 303 | " for fam in own_families:\n", |
| 304 | " children |= set(term2id(i) for i in g.get_family_members(fam, \"CHIL\"))\n", |
| 305 | "\n", |
| 306 | " parent_families = g.get_families(v, 'FAMC')\n", |
| 307 | " if len(parent_families):\n", |
| 308 | " for member in g.get_family_members(parent_families[0], \"CHIL\"): # NB adoptive families i.e len(parent_families)>1 are not considered (TODO?)\n", |
| 309 | " if member.get_pointer() == v.get_pointer():\n", |
| 310 | " continue\n", |
| 311 | " siblings.add(term2id(member))\n", |
| 312 | "\n", |
| 313 | " if idx in individuals:\n", |
| 314 | " children |= individuals[idx].get('children', set())\n", |
| 315 | " siblings |= individuals[idx].get('siblings', set())\n", |
| 316 | " individuals[idx] = {'sex': v.get_gender().lower(), 'children': children, 'siblings': siblings, 'title': title}\n", |
| 317 | "\n", |
| 318 | " elif isinstance(v,FamilyElement):\n", |
| 319 | " wife, husb, children = None, None, set()\n", |
| 320 | " children = set(term2id(i) for i in g.get_family_members(v, \"CHIL\"))\n", |
| 321 | "\n", |
| 322 | " try:\n", |
| 323 | " wife = g.get_family_members(v, \"WIFE\")[0]\n", |
| 324 | " wife = term2id(wife)\n", |
| 325 | " if wife in individuals: individuals[wife]['children'] |= children\n", |
| 326 | " else: individuals[wife] = {'children': children}\n", |
| 327 | " except IndexError: pass\n", |
| 328 | " try:\n", |
| 329 | " husb = g.get_family_members(v, \"HUSB\")[0]\n", |
| 330 | " husb = term2id(husb)\n", |
| 331 | " if husb in individuals: individuals[husb]['children'] |= children\n", |
| 332 | " else: individuals[husb] = {'children': children}\n", |
| 333 | " except IndexError: pass\n", |
| 334 | "\n", |
| 335 | " if wife and husb: marriages[wife + husb] = (term2id(v), wife, husb)\n", |
| 336 | "\n", |
| 337 | "for idx, val in individuals.items():\n", |
| 338 | " added_terms = ''\n", |
| 339 | " if val['sex'] == 'f':\n", |
| 340 | " parent_predicate, sibl_predicate = \"isMotherOf\", \"isSisterOf\"\n", |
| 341 | " else:\n", |
| 342 | " parent_predicate, sibl_predicate = \"isFatherOf\", \"isBrotherOf\"\n", |
| 343 | " if len(val['children']):\n", |
| 344 | " added_terms += \" ;\\n fhkb:\" + parent_predicate + \" \" + \", \".join([\"fhkb:\" + i for i in val['children']])\n", |
| 345 | " if len(val['siblings']):\n", |
| 346 | " added_terms += \" ;\\n fhkb:\" + sibl_predicate + \" \" + \", \".join([\"fhkb:\" + i for i in val['siblings']])\n", |
| 347 | " out.write(\"fhkb:%s a owl:NamedIndividual, owl:Thing%s ;\\n rdfs:label \\\"%s\\\" .\\n\" % (idx, added_terms, val['title']))\n", |
| 348 | "\n", |
| 349 | "for k, v in marriages.items():\n", |
| 350 | " out.write(\"fhkb:%s a owl:NamedIndividual, owl:Thing ;\\n fhkb:hasFemalePartner fhkb:%s ;\\n fhkb:hasMalePartner fhkb:%s .\\n\" % v)\n", |
| 351 | "\n", |
| 352 | "out.write(\"[] a owl:AllDifferent ;\\n owl:distinctMembers (\")\n", |
| 353 | "for idx in individuals.keys():\n", |
| 354 | " out.write(\" fhkb:\" + idx)\n", |
| 355 | "for k, v in marriages.items():\n", |
| 356 | " out.write(\" fhkb:\" + v[0])\n", |
| 357 | "out.write(\" ) .\")\n", |
| 358 | "out.close()" |
| 359 | ] |
| 360 | }, |
| 361 | { |
| 362 | "cell_type": "code", |
| 363 | "execution_count": 8, |
| 364 | "metadata": { |
| 365 | "trusted": true |
| 366 | }, |
| 367 | "outputs": [ |
| 368 | { |
| 369 | "name": "stdout", |
| 370 | "output_type": "stream", |
| 371 | "text": [ |
| 372 | " fhkb:hasFemalePartner fhkb:i34 ;\n", |
| 373 | " fhkb:hasMalePartner fhkb:i33 .\n", |
| 374 | "fhkb:i54 a owl:NamedIndividual, owl:Thing ;\n", |
| 375 | " fhkb:hasFemalePartner fhkb:i36 ;\n", |
| 376 | " fhkb:hasMalePartner fhkb:i35 .\n", |
| 377 | "fhkb:i55 a owl:NamedIndividual, owl:Thing ;\n", |
| 378 | " fhkb:hasFemalePartner fhkb:i38 ;\n", |
| 379 | " fhkb:hasMalePartner fhkb:i37 .\n", |
| 380 | "[] a owl:AllDifferent ;\n", |
| 381 | " owl:distinctMembers ( fhkb:i0 fhkb:i1 fhkb:i2 fhkb:i3 fhkb:i4 fhkb:i5 fhkb:i6 fhkb:i7 fhkb:i8 fhkb:i9 fhkb:i10 fhkb:i11 fhkb:i12 fhkb:i13 fhkb:i14 fhkb:i15 fhkb:i16 fhkb:i17 fhkb:i18 fhkb:i19 fhkb:i20 fhkb:i21 fhkb:i22 fhkb:i23 fhkb:i24 fhkb:i25 fhkb:i26 fhkb:i27 fhkb:i28 fhkb:i29 fhkb:i30 fhkb:i31 fhkb:i32 fhkb:i33 fhkb:i34 fhkb:i35 fhkb:i36 fhkb:i37 fhkb:i38 fhkb:i39 fhkb:i40 fhkb:i41 fhkb:i42 fhkb:i43 fhkb:i44 fhkb:i45 fhkb:i46 fhkb:i47 fhkb:i48 fhkb:i49 fhkb:i50 fhkb:i51 fhkb:i52 fhkb:i53 fhkb:i54 fhkb:i55 ) ." |
| 382 | ] |
| 383 | } |
| 384 | ], |
| 385 | "source": [ |
| 386 | "!tail onto.ttl" |
| 387 | ] |
| 388 | }, |
| 389 | { |
| 390 | "cell_type": "markdown", |
| 391 | "metadata": {}, |
| 392 | "source": [ |
| 393 | "### Doing Inference \n", |
| 394 | "\n", |
| 395 | "Now we want to use this ontology for inference and querying. We will use [RDFLib](https://github.com/RDFLib), a library for reading RDF graphs in different formats, querying them, etc. \n", |
| 396 | "\n", |
| 397 | "For logical inference, we will use the [OWL-RL](https://github.com/RDFLib/OWL-RL) library, which allows us to build the **closure** of the RDF graph, i.e. add all concepts and relations that can be inferred." |
| 398 | ] |
| 399 | }, |
| 400 | { |
| 401 | "cell_type": "code", |
| 402 | "execution_count": 10, |
| 403 | "metadata": { |
| 404 | "trusted": true |
| 405 | }, |
| 406 | "outputs": [ |
| 407 | { |
| 408 | "name": "stdout", |
| 409 | "output_type": "stream", |
| 410 | "text": [ |
| 411 | "Requirement already satisfied: rdflib in /home/rg/anaconda3/envs/ai4beg/lib/python3.11/site-packages (6.3.2)\n", |
| 412 | "Requirement already satisfied: isodate<0.7.0,>=0.6.0 in /home/rg/anaconda3/envs/ai4beg/lib/python3.11/site-packages (from rdflib) (0.6.1)\n", |
| 413 | "Requirement already satisfied: pyparsing<4,>=2.1.0 in /home/rg/anaconda3/envs/ai4beg/lib/python3.11/site-packages (from rdflib) (3.0.9)\n", |
| 414 | "Requirement already satisfied: six in /home/rg/anaconda3/envs/ai4beg/lib/python3.11/site-packages (from isodate<0.7.0,>=0.6.0->rdflib) (1.16.0)\n", |
| 415 | "Collecting git+https://github.com/RDFLib/OWL-RL.git\n", |
| 416 | " Cloning https://github.com/RDFLib/OWL-RL.git to /tmp/pip-req-build-lbfzwi3m\n", |
| 417 | " Running command git clone --filter=blob:none --quiet https://github.com/RDFLib/OWL-RL.git /tmp/pip-req-build-lbfzwi3m\n", |
| 418 | " Resolved https://github.com/RDFLib/OWL-RL.git to commit a77e1791b88b54aace609bc6000aac14c7add4ff\n", |
| 419 | " Preparing metadata (setup.py) ... \u001b[?25ldone\n", |
| 420 | "\u001b[?25hRequirement already satisfied: rdflib>=6.0.2 in /home/rg/anaconda3/envs/ai4beg/lib/python3.11/site-packages (from owlrl==6.0.2) (6.3.2)\n", |
| 421 | "Requirement already satisfied: isodate<0.7.0,>=0.6.0 in /home/rg/anaconda3/envs/ai4beg/lib/python3.11/site-packages (from rdflib>=6.0.2->owlrl==6.0.2) (0.6.1)\n", |
| 422 | "Requirement already satisfied: pyparsing<4,>=2.1.0 in /home/rg/anaconda3/envs/ai4beg/lib/python3.11/site-packages (from rdflib>=6.0.2->owlrl==6.0.2) (3.0.9)\n", |
| 423 | "Requirement already satisfied: six in /home/rg/anaconda3/envs/ai4beg/lib/python3.11/site-packages (from isodate<0.7.0,>=0.6.0->rdflib>=6.0.2->owlrl==6.0.2) (1.16.0)\n" |
| 424 | ] |
| 425 | } |
| 426 | ], |
| 427 | "source": [ |
| 428 | "!{sys.executable} -m pip install rdflib\n", |
| 429 | "!{sys.executable} -m pip install git+https://github.com/RDFLib/OWL-RL.git" |
| 430 | ] |
| 431 | }, |
| 432 | { |
| 433 | "cell_type": "markdown", |
| 434 | "metadata": {}, |
| 435 | "source": [ |
| 436 | "Let's open the ontology file and see how many triplets it contains:" |
| 437 | ] |
| 438 | }, |
| 439 | { |
| 440 | "cell_type": "code", |
| 441 | "execution_count": 11, |
| 442 | "metadata": { |
| 443 | "trusted": true |
| 444 | }, |
| 445 | "outputs": [ |
| 446 | { |
| 447 | "name": "stdout", |
| 448 | "output_type": "stream", |
| 449 | "text": [ |
| 450 | "Triplets found:669\n" |
| 451 | ] |
| 452 | } |
| 453 | ], |
| 454 | "source": [ |
| 455 | "import rdflib\n", |
| 456 | "from owlrl import DeductiveClosure, OWLRL_Extension\n", |
| 457 | "\n", |
| 458 | "g = rdflib.Graph()\n", |
| 459 | "g.parse(\"onto.ttl\", format=\"turtle\")\n", |
| 460 | "\n", |
| 461 | "print(\"Triplets found:%d\" % len(g))" |
| 462 | ] |
| 463 | }, |
| 464 | { |
| 465 | "cell_type": "markdown", |
| 466 | "metadata": {}, |
| 467 | "source": [ |
| 468 | "Now let's build the closure, and see how the number of triplets increases:" |
| 469 | ] |
| 470 | }, |
| 471 | { |
| 472 | "cell_type": "code", |
| 473 | "execution_count": 12, |
| 474 | "metadata": { |
| 475 | "trusted": true |
| 476 | }, |
| 477 | "outputs": [ |
| 478 | { |
| 479 | "name": "stdout", |
| 480 | "output_type": "stream", |
| 481 | "text": [ |
| 482 | "Triplets after inference:4246\n" |
| 483 | ] |
| 484 | } |
| 485 | ], |
| 486 | "source": [ |
| 487 | "DeductiveClosure(OWLRL_Extension).expand(g)\n", |
| 488 | "print(\"Triplets after inference:%d\" % len(g))" |
| 489 | ] |
| 490 | }, |
| 491 | { |
| 492 | "cell_type": "markdown", |
| 493 | "metadata": {}, |
| 494 | "source": [ |
| 495 | "### Querying for Relatives \n", |
| 496 | "\n", |
| 497 | "Now we can query the graph to see different relations between people. We can use the **SPARQL** language together with the `query` method. In our case, let's see all **uncles** in our family tree:" |
| 498 | ] |
| 499 | }, |
| 500 | { |
| 501 | "cell_type": "code", |
| 502 | "execution_count": 13, |
| 503 | "metadata": { |
| 504 | "trusted": true |
| 505 | }, |
| 506 | "outputs": [ |
| 507 | { |
| 508 | "name": "stdout", |
| 509 | "output_type": "stream", |
| 510 | "text": [ |
| 511 | "Fedor Alekseevich Romanov is uncle of Ekaterina Ivanovna Romanova\n", |
| 512 | "Aleksandr I Pavlovich Romanov is uncle of Aleksandr II Nikolaevich Romanov\n", |
| 513 | "Fedor Alekseevich Romanov is uncle of Anna Ivanovna Romanova\n" |
| 514 | ] |
| 515 | } |
| 516 | ], |
| 517 | "source": [ |
| 518 | "qres = g.query(\n", |
| 519 | " \"\"\"SELECT DISTINCT ?aname ?bname\n", |
| 520 | " WHERE {\n", |
| 521 | " ?a fhkb:isUncleOf ?b .\n", |
| 522 | " ?a rdfs:label ?aname .\n", |
| 523 | " ?b rdfs:label ?bname .\n", |
| 524 | " }\"\"\")\n", |
| 525 | "\n", |
| 526 | "for row in qres:\n", |
| 527 | " print(\"%s is uncle of %s\" % row)" |
| 528 | ] |
| 529 | }, |
| 530 | { |
| 531 | "cell_type": "markdown", |
| 532 | "metadata": {}, |
| 533 | "source": [ |
| 534 | "Feel free to experiment with other family relations. For example, you can have a look at the `isAncestorOf` relation, which recursively defines all ancestors of a given person.\n", |
| 535 | "\n", |
| 536 | "Finally, let's clean up!" |
| 537 | ] |
| 538 | }, |
| 539 | { |
| 540 | "cell_type": "code", |
| 541 | "execution_count": 14, |
| 542 | "metadata": { |
| 543 | "trusted": true |
| 544 | }, |
| 545 | "outputs": [], |
| 546 | "source": [ |
| 547 | "!rm onto.ttl" |
| 548 | ] |
| 549 | }, |
| 550 | { |
| 551 | "cell_type": "code", |
| 552 | "execution_count": null, |
| 553 | "metadata": {}, |
| 554 | "outputs": [], |
| 555 | "source": [] |
| 556 | } |
| 557 | ], |
| 558 | "metadata": { |
| 559 | "interpreter": { |
| 560 | "hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5" |
| 561 | }, |
| 562 | "kernelspec": { |
| 563 | "display_name": "Python 3.6", |
| 564 | "language": "python", |
| 565 | "name": "python3" |
| 566 | }, |
| 567 | "language_info": { |
| 568 | "codemirror_mode": { |
| 569 | "name": "ipython", |
| 570 | "version": 3 |
| 571 | }, |
| 572 | "file_extension": ".py", |
| 573 | "mimetype": "text/x-python", |
| 574 | "name": "python", |
| 575 | "nbconvert_exporter": "python", |
| 576 | "pygments_lexer": "ipython3", |
| 577 | "version": "3.11.2" |
| 578 | } |
| 579 | }, |
| 580 | "nbformat": 4, |
| 581 | "nbformat_minor": 2 |
| 582 | } |
| 583 | |
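
The notebook relies on OWL-RL to build the closure of the graph. As a rough illustration of what that step does, here is a minimal plain-Python sketch of forward chaining to a fixed point. It is not how OWL-RL is implemented; the helper names `chain` and `closure` are hypothetical, the three rules are a simplified stand-in for the ontology (a property chain for `isAuntOf`, plus an assumed `isParentOf` → `isAncestorOf` rule with transitivity), and the facts are hand-copied from families `@42@` and `@43@` in the listing above, written directly as `isParentOf` rather than `isFatherOf`.

```python
# Facts as (subject, predicate, object) triples, hand-copied from the
# family listing in the notebook (families @42@ and @43@). Assumption:
# parenthood is written directly as isParentOf to keep the sketch short.
facts = {
    ("i2", "isParentOf", "i7"),   # Aleksej -> Sofja
    ("i2", "isParentOf", "i8"),   # Aleksej -> Ivan V
    ("i7", "isSisterOf", "i8"),   # Sofja is a sister of Ivan V
    ("i8", "isParentOf", "i10"),  # Ivan V -> Ekaterina Ivanovna
    ("i8", "isParentOf", "i11"),  # Ivan V -> Anna Ivanovna
}


def chain(triples, first, second, result):
    """Property chain: (x first y) and (y second z) imply (x result z)."""
    return {
        (x, result, z)
        for (x, p, y) in triples if p == first
        for (y2, q, z) in triples if q == second and y2 == y
    }


def closure(triples):
    """Apply the rules repeatedly until no new triple appears."""
    known = set(triples)
    while True:
        new = set()
        # Assumed rule: every parent is an ancestor.
        new |= {(x, "isAncestorOf", y) for (x, p, y) in known if p == "isParentOf"}
        # Assumed rule: an ancestor of an ancestor is an ancestor.
        new |= chain(known, "isAncestorOf", "isAncestorOf", "isAncestorOf")
        # Rule from the notebook text: an aunt is a sister of one's parent.
        new |= chain(known, "isSisterOf", "isParentOf", "isAuntOf")
        if new <= known:  # fixed point reached, nothing left to infer
            return known
        known |= new


inferred = closure(facts)
aunts = sorted((s, o) for (s, p, o) in inferred if p == "isAuntOf")
print(aunts)
print("Triplets before:", len(facts), "after:", len(inferred))
```

The loop stops once a pass adds nothing new, which is the same idea as the triplet count growing after `DeductiveClosure(...).expand(g)` in the notebook, only with three hand-written rules instead of the full OWL 2 RL rule set.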