mockup #7
@ -6,12 +6,16 @@
|date | when the user wants to watch a movie | 2022-04-06
|time | play time | 20:30
|quantity | number of tickets | 2 OR two
|location | location of the cinema | Poznań Plaza OR Multikino 51
|seats | which seats are reserved | [h1, h2, h3]
|reservation_id | reservation number | 32453758
|goal | user's goal in the system | reservation OR cancel
|area | preferred place to sit | [front, middle] OR [random, aisle]
|tickets_type | ticket types and quantities | [normal, 1] OR [[student, 2], [normal, 1]]
|interval | time interval | w tym tygodniu OR w następnym tygodniu
|goal | user's goal in the system | chciałbym zarezerwować (optional) OR jakie filmy gracie
|tickets_type - | ticket types and quantities | [normal, 1] OR [[student, 2], [normal, 1]]
|location - | location of the cinema | Poznań Plaza OR Multikino 51
|reservation_id - | reservation number | 32453758

# System speech acts
@ -1,228 +0,0 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "![Logo 1](https://git.wmi.amu.edu.pl/AITech/Szablon/raw/branch/master/Logotyp_AITech1.jpg)\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "<h1> Dialogue Systems </h1>\n",
    "<h2> 7. <i>Semantic parsing with grammars</i> [labs]</h2> \n",
    "<h3> Marek Kubis (2021)</h3>\n",
    "</div>\n",
    "\n",
    "![Logo 2](https://git.wmi.amu.edu.pl/AITech/Szablon/raw/branch/master/Logotyp_AITech2.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Semantic parsing with grammars\n",
    "==============================\n",
    "\n",
    "Slot values can be extracted from the user's utterance with techniques such as:\n",
    "\n",
    " - keyword spotting in the text,\n",
    "\n",
    " - matching patterns built with regular expressions,\n",
    "\n",
    " - rule-based parsers (the topic of today's class),\n",
    "\n",
    " - machine learning (the topic of upcoming classes)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Examples of rule-based parsers\n",
    "------------------------------\n",
    "\n",
    " - [Phoenix](http://wiki.speech.cs.cmu.edu/olympus/index.php/Phoenix_Server), a context-free grammar parser\n",
    "   that is part of the [Olympus](http://wiki.speech.cs.cmu.edu/olympus/index.php/Olympus) dialogue system\n",
    "\n",
    " - [DCG](https://www.swi-prolog.org/pldoc/man?section=DCG) (Definite Clause Grammars) parsers of the [Prolog](https://www.swi-prolog.org/) language\n",
    "\n",
    " - [JSpeech Grammar Format](https://www.w3.org/TR/jsgf/) (JSGF)\n",
    "\n",
    "Example\n",
    "-------\n",
    "Let us write a JSGF semantic grammar for the dialogue act that represents the intent to book\n",
    "a table in a restaurant."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile book.jsgf\n",
    "#JSGF V1.0 UTF-8 pl;\n",
    "\n",
    "grammar book;\n",
    "\n",
    "public <rezerwuj> = chcia\u0142bym zarezerwowa\u0107 stolik <dzien_rezerwacji> <godzina_rezerwacji> <liczba_osob> ;\n",
    "\n",
    "<dzien_rezerwacji> = na <dzien> {day};\n",
    "\n",
    "<dzien> = dzisiaj | jutro | poniedzia\u0142ek | wtorek | \u015brod\u0119 | czwartek | pi\u0105tek | sobot\u0119 | niedziel\u0119;\n",
    "\n",
    "<godzina_rezerwacji> = na [godzin\u0119] <godzina_z_minutami> {hour};\n",
    "\n",
    "<godzina_z_minutami> = <godzina> [<minuty>];\n",
    "\n",
    "<godzina> = dziewi\u0105t\u0105 | dziesi\u0105t\u0105 | jedenast\u0105 | dwunast\u0105;\n",
    "\n",
    "<minuty> = pi\u0119tna\u015bcie | trzydzie\u015bci;\n",
    "\n",
    "<liczba_osob> = (na | dla) <liczba> {size} os\u00f3b;\n",
    "\n",
    "<liczba> = dwie | dw\u00f3ch | trzy | trzech | cztery | czterech | pi\u0119\u0107 | pi\u0119ciu;\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will create a parser that accepts the grammar above using the [pyjsgf](https://github.com/Danesprite/pyjsgf) library."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import jsgf\n",
    "\n",
    "book_grammar = jsgf.parse_grammar_file('book.jsgf')\n",
    "book_grammar"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us use the `book.jsgf` grammar to analyze the following utterance:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "utterance = 'chcia\u0142bym zarezerwowa\u0107 stolik na jutro na godzin\u0119 dwunast\u0105 trzydzie\u015bci na pi\u0119\u0107 os\u00f3b'\n",
    "matched = book_grammar.find_matching_rules(utterance)\n",
    "matched"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A meaning representation can be extracted from the parsed utterance in many ways. To extract\n",
    "slots we will use the JSGF tag mechanism, and we will take the name of the grammar as the name\n",
    "of the dialogue act. Following [DSTC2](https://github.com/matthen/dstc), we will store the resulting frame in a dictionary with the fields `act` and `slots`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_dialog_act(rule):\n",
    "    # The grammar name serves as the act name; slots come from tagged expansions.\n",
    "    slots = []\n",
    "    get_slots(rule.expansion, slots)\n",
    "    return {'act': rule.grammar.name, 'slots': slots}\n",
    "\n",
    "def get_slots(expansion, slots):\n",
    "    # A tagged expansion marks a slot: store (tag, matched text) and stop descending.\n",
    "    if expansion.tag != '':\n",
    "        slots.append((expansion.tag, expansion.current_match))\n",
    "        return\n",
    "\n",
    "    for child in expansion.children:\n",
    "        get_slots(child, slots)\n",
    "\n",
    "    # A rule reference has no children, so follow it into the referenced rule.\n",
    "    if not expansion.children and isinstance(expansion, jsgf.NamedRuleRef):\n",
    "        get_slots(expansion.referenced_rule.expansion, slots)\n",
    "\n",
    "get_dialog_act(matched[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Combining the functions above, we can build a simple NLU module."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def nlu(utterance):\n",
    "    matched = book_grammar.find_matching_rules(utterance)\n",
    "\n",
    "    if matched:\n",
    "        return get_dialog_act(matched[0])\n",
    "    else:\n",
    "        return {'act': 'null', 'slots': []}\n",
    "\n",
    "nlu('chcia\u0142bym zarezerwowa\u0107 stolik na jutro na godzin\u0119 dziesi\u0105t\u0105 dla trzech os\u00f3b')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Problems\n",
    "--------\n",
    "\n",
    " - How should numeric expressions such as times, dates, or phone numbers be normalized?\n",
    "\n",
    " - What should happen when more than one rule matches?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Task\n",
    "----\n",
    "Implement a natural language understanding (NLU) module for the dialogue agent being developed.\n",
    "\n",
    "The module should be built using rule-based parsing and/or machine learning techniques.\n",
    "\n",
    "Prepare an `evaluate.py` script that computes the *accuracy* of the NLU module on the collected experimental corpus,\n",
    "i.e. the ratio of the number of user utterances in which the dialogue acts were recognized correctly to the number of all user utterances in the corpus.\n",
    "\n",
    "Place the NLU module in the `master` branch of the project repository. Place the `evaluate.py` script in the root directory of that branch.\n",
    "\n",
    "Deadline: 4 May 2022, 23:59."
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "cell_metadata_filter": "-all",
   "main_language": "python",
   "notebook_metadata_filter": "-all"
  },
  "author": "Marek Kubis",
  "email": "mkubis@amu.edu.pl",
  "lang": "pl",
  "subtitle": "7.Parsing semantyczny z wykorzystaniem gramatyk[laboratoria]",
  "title": "Systemy Dialogowe",
  "year": "2021"
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
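
The notebook's "Problems" cell asks how numeric expressions such as times should be normalized. One possible approach, sketched here under assumptions, is a lookup table over the hour and minute words that `book.jsgf` accepts. The names `HOURS`, `MINUTES`, and `normalize_hour` are hypothetical and do not appear in the notebook, and the sketch assumes the `hour` slot value is the matched text as plain space-separated words.

```python
# Hypothetical lookup tables covering only the words listed in book.jsgf.
HOURS = {'dziewiątą': 9, 'dziesiątą': 10, 'jedenastą': 11, 'dwunastą': 12}
MINUTES = {'piętnaście': 15, 'trzydzieści': 30}


def normalize_hour(text):
    """Map a matched hour phrase to 'HH:MM', or return None if no hour word is found."""
    hour = None
    minutes = 0
    for token in text.split():
        if token in HOURS:
            hour = HOURS[token]
        elif token in MINUTES:
            minutes = MINUTES[token]
    if hour is None:
        return None
    return f'{hour:02d}:{minutes:02d}'


print(normalize_hour('na godzinę dwunastą trzydzieści'))
print(normalize_hour('na dziesiątą'))
```

A table like this only scales to small vocabularies; dates and phone numbers would likely need a dedicated normalizer.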
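
The accuracy metric described in the notebook's task cell (correctly recognized user utterances divided by all user utterances in the corpus) could be computed roughly as follows. This is a minimal sketch, not the required `evaluate.py`: the corpus format, the `accuracy` helper, and the keyword-based `toy_nlu` stand-in are all assumptions made for illustration, and a real script would call the grammar-based `nlu` function instead.

```python
def accuracy(examples, nlu):
    """Share of (utterance, expected_act) pairs for which nlu predicts the expected act."""
    if not examples:
        return 0.0
    correct = sum(1 for utterance, expected in examples
                  if nlu(utterance)['act'] == expected)
    return correct / len(examples)


def toy_nlu(utterance):
    # Hypothetical stand-in for the grammar-based nlu(): a single keyword check.
    if 'zarezerwować' in utterance:
        return {'act': 'book', 'slots': []}
    return {'act': 'null', 'slots': []}


# Hypothetical mini-corpus; the real corpus layout is not given in the notebook.
examples = [
    ('chciałbym zarezerwować stolik na jutro', 'book'),
    ('jakie filmy gracie', 'null'),
    ('chcę odwołać rezerwację', 'cancel'),
]

print(accuracy(examples, toy_nlu))
```

Passing the NLU function as a parameter keeps the metric independent of how the analyzer is built, so the same helper works for rule-based and learned models.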