{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Logo 1](https://git.wmi.amu.edu.pl/AITech/Szablon/raw/branch/master/Logotyp_AITech1.jpg)\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "<h1> Systemy Dialogowe </h1>\n",
    "<h2> 8. <i>Semantic parsing using machine learning techniques</i> [lab]</h2>\n",
    "<h3> Marek Kubis (2021)</h3>\n",
    "</div>\n",
    "\n",
    "![Logo 2](https://git.wmi.amu.edu.pl/AITech/Szablon/raw/branch/master/Logotyp_AITech2.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Semantic parsing using machine learning techniques\n",
    "===================================================\n",
    "\n",
    "Introduction\n",
    "------------\n",
    "The problem of detecting slots and their values in user utterances can be formulated as the task\n",
    "of predicting, for each word, a label that indicates whether and to which slot the word belongs.\n",
    "\n",
    "> chciałbym zarezerwować stolik na jutro**/day** na godzinę dwunastą**/hour** czterdzieści**/hour** pięć**/hour** na pięć**/size** osób\n",
    "\n",
    "Slot boundaries are marked using a tagging scheme of choice.\n",
    "\n",
    "### The IOB scheme\n",
    "\n",
    "| Prefix | Meaning                     |\n",
    "|:------:|:----------------------------|\n",
    "| I      | inside a slot               |\n",
    "| O      | outside any slot            |\n",
    "| B      | beginning of a slot         |\n",
    "\n",
    "> chciałbym zarezerwować stolik na jutro**/B-day** na godzinę dwunastą**/B-hour** czterdzieści**/I-hour** pięć**/I-hour** na pięć**/B-size** osób\n",
    "\n",
    "### The IOBES scheme\n",
    "\n",
    "| Prefix | Meaning                     |\n",
    "|:------:|:----------------------------|\n",
    "| I      | inside a slot               |\n",
    "| O      | outside any slot            |\n",
    "| B      | beginning of a slot         |\n",
    "| E      | end of a slot               |\n",
    "| S      | single-word slot            |\n",
    "\n",
    "> chciałbym zarezerwować stolik na jutro**/S-day** na godzinę dwunastą**/B-hour** czterdzieści**/I-hour** pięć**/E-hour** na pięć**/S-size** osób"
   ]
  },
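  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two schemes are mechanically interconvertible. As a minimal, illustrative sketch (the `iob2iobes` helper below is our own example, not part of the dataset tooling), an IOB sequence can be rewritten into IOBES by looking one tag ahead:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def iob2iobes(tags):\n",
    "    # rewrite an IOB tag sequence into IOBES by inspecting the next tag\n",
    "    iobes = []\n",
    "    for i, tag in enumerate(tags):\n",
    "        nxt = tags[i + 1] if i + 1 < len(tags) else 'O'\n",
    "        if tag == 'O':\n",
    "            iobes.append(tag)\n",
    "        elif tag.startswith('B-'):\n",
    "            # B- stays B- if the same slot continues, otherwise the slot is single-word\n",
    "            iobes.append(tag if nxt == 'I-' + tag[2:] else 'S-' + tag[2:])\n",
    "        else:\n",
    "            # I- stays I- if the slot continues, otherwise the slot ends here\n",
    "            iobes.append(tag if nxt == 'I-' + tag[2:] else 'E-' + tag[2:])\n",
    "    return iobes\n",
    "\n",
    "iob2iobes(['O', 'B-day', 'O', 'B-hour', 'I-hour', 'I-hour', 'B-size', 'O'])"
   ]
  },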
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If for the task formulated this way we prepare a dataset\n",
    "of user utterances with annotated slots (a so-called *training set*),\n",
    "we can apply (supervised) machine learning techniques to build a model\n",
    "that annotates user utterances with slot labels.\n",
    "\n",
    "Such a model can be built using, among others:\n",
    "\n",
    " 1. conditional random fields (Lafferty et al., 2001),\n",
    "\n",
    " 2. recurrent neural networks, e.g. LSTM networks (Hochreiter and Schmidhuber, 1997),\n",
    "\n",
    " 3. transformers (Vaswani et al., 2017).\n",
    "\n",
    "Example\n",
    "-------\n",
    "We will use the dataset prepared by Schuster et al. (2019)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!mkdir -p l07\n",
    "%cd l07\n",
    "!curl -L -C - https://fb.me/multilingual_task_oriented_data -o data.zip\n",
    "!unzip data.zip\n",
    "%cd .."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This dataset collects utterances in three languages, annotated with slots for twelve frames belonging to three domains: `Alarm`, `Reminder`, and `Weather`. We will load the data using the [conllu](https://pypi.org/project/conllu/) library."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from conllu import parse_incr\n",
    "fields = ['id', 'form', 'frame', 'slot']\n",
    "\n",
    "def nolabel2o(line, i):\n",
    "    # map the dataset's 'NoLabel' marker to the 'O' (outside) tag\n",
    "    return 'O' if line[i] == 'NoLabel' else line[i]\n",
    "\n",
    "with open('./train_data/train.conllu') as trainfile:\n",
    "    trainset = list(parse_incr(trainfile, fields=fields, field_parsers={'slot': nolabel2o}))\n",
    "with open('./train_data/test.conllu') as testfile:\n",
    "    testset = list(parse_incr(testfile, fields=fields, field_parsers={'slot': nolabel2o}))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's look at a few sample utterances from this dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<tbody>\n",
       "<tr><td style=\"text-align: right;\">1</td><td>chciałbym</td><td></td><td>O</td></tr>\n",
       "<tr><td style=\"text-align: right;\">2</td><td>kupić </td><td></td><td>O</td></tr>\n",
       "<tr><td style=\"text-align: right;\">3</td><td>popcorn </td><td></td><td>O</td></tr>\n",
       "</tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "'<table>\\n<tbody>\\n<tr><td style=\"text-align: right;\">1</td><td>chciałbym</td><td></td><td>O</td></tr>\\n<tr><td style=\"text-align: right;\">2</td><td>kupić </td><td></td><td>O</td></tr>\\n<tr><td style=\"text-align: right;\">3</td><td>popcorn </td><td></td><td>O</td></tr>\\n</tbody>\\n</table>'"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from tabulate import tabulate\n",
    "tabulate(trainset[26], tablefmt='html')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "tabulate(trainset[1000], tablefmt='html')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "tabulate(trainset[2000], tablefmt='html')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lines_to_next_cell": 0
   },
   "source": [
    "To keep the training process easy to present in a Jupyter notebook, we will narrow the dataset down to the first 100 examples."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "trainset = trainset[:100]\n",
    "testset = testset[:100]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To build the model, we will use an architecture based on recurrent neural networks,\n",
    "implemented in the [flair](https://github.com/flairNLP/flair) library (Akbik et al., 2018)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\Adrian\\AppData\\Roaming\\Python\\Python37\\site-packages\\tqdm\\auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
      "  from .autonotebook import tqdm as notebook_tqdm\n"
     ]
    }
   ],
   "source": [
    "from flair.data import Corpus, Sentence, Token\n",
    "from flair.datasets import SentenceDataset\n",
    "from flair.embeddings import StackedEmbeddings\n",
    "from flair.embeddings import WordEmbeddings\n",
    "from flair.embeddings import CharacterEmbeddings\n",
    "from flair.embeddings import FlairEmbeddings\n",
    "from flair.models import SequenceTagger\n",
    "from flair.trainers import ModelTrainer\n",
    "\n",
    "# make the computations deterministic\n",
    "import random\n",
    "import torch\n",
    "random.seed(42)\n",
    "torch.manual_seed(42)\n",
    "\n",
    "if torch.cuda.is_available():\n",
    "    torch.cuda.manual_seed(0)\n",
    "    torch.cuda.manual_seed_all(0)\n",
    "    torch.backends.cudnn.enabled = False\n",
    "    torch.backends.cudnn.benchmark = False\n",
    "    torch.backends.cudnn.deterministic = True"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will convert the data to the format used by `flair` with the following function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Corpus: 194 train + 22 dev + 33 test sentences\n",
      "Dictionary with 12 tags: <unk>, O, B-time, I-time, B-area, I-area, B-quantity, B-date, I-quantity, I-date, <START>, <STOP>\n"
     ]
    }
   ],
   "source": [
    "def conllu2flair(sentences, label=None):\n",
    "    # convert CoNLL-U token lists to flair Sentences, optionally copying one field as a tag\n",
    "    fsentences = []\n",
    "\n",
    "    for sentence in sentences:\n",
    "        fsentence = Sentence()\n",
    "\n",
    "        for token in sentence:\n",
    "            ftoken = Token(token['form'])\n",
    "\n",
    "            if label:\n",
    "                ftoken.add_tag(label, token[label])\n",
    "\n",
    "            fsentence.add_token(ftoken)\n",
    "\n",
    "        fsentences.append(fsentence)\n",
    "\n",
    "    return SentenceDataset(fsentences)\n",
    "\n",
    "corpus = Corpus(train=conllu2flair(trainset, 'slot'), test=conllu2flair(testset, 'slot'))\n",
    "print(corpus)\n",
    "tag_dictionary = corpus.make_tag_dictionary(tag_type='slot')\n",
    "print(tag_dictionary)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our model will use vector representations of words (see [Word Embeddings](https://github.com/flairNLP/flair/blob/master/resources/docs/TUTORIAL_3_WORD_EMBEDDING.md))."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2022-05-17 23:51:31,428 https://flair.informatik.hu-berlin.de/resources/embeddings/token/pl-wiki-fasttext-300d-1M.vectors.npy not found in cache, downloading to C:\\Users\\Adrian\\AppData\\Local\\Temp\\tmpdtf6je0q\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 1199998928/1199998928 [00:38<00:00, 31059047.54B/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2022-05-17 23:52:10,221 copying C:\\Users\\Adrian\\AppData\\Local\\Temp\\tmpdtf6je0q to cache at C:\\Users\\Adrian\\.flair\\embeddings\\pl-wiki-fasttext-300d-1M.vectors.npy\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2022-05-17 23:52:11,581 removing temp file C:\\Users\\Adrian\\AppData\\Local\\Temp\\tmpdtf6je0q\n",
      "2022-05-17 23:52:11,834 https://flair.informatik.hu-berlin.de/resources/embeddings/token/pl-wiki-fasttext-300d-1M not found in cache, downloading to C:\\Users\\Adrian\\AppData\\Local\\Temp\\tmpncdt74ud\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 40874795/40874795 [00:01<00:00, 25496548.48B/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2022-05-17 23:52:13,623 copying C:\\Users\\Adrian\\AppData\\Local\\Temp\\tmpncdt74ud to cache at C:\\Users\\Adrian\\.flair\\embeddings\\pl-wiki-fasttext-300d-1M\n",
      "2022-05-17 23:52:13,678 removing temp file C:\\Users\\Adrian\\AppData\\Local\\Temp\\tmpncdt74ud\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2022-05-17 23:52:21,696 https://flair.informatik.hu-berlin.de/resources/embeddings/flair/lm-polish-forward-v0.2.pt not found in cache, downloading to C:\\Users\\Adrian\\AppData\\Local\\Temp\\tmp6okeka8n\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 84244196/84244196 [00:02<00:00, 35143826.68B/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2022-05-17 23:52:24,338 copying C:\\Users\\Adrian\\AppData\\Local\\Temp\\tmp6okeka8n to cache at C:\\Users\\Adrian\\.flair\\embeddings\\lm-polish-forward-v0.2.pt\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2022-05-17 23:52:24,435 removing temp file C:\\Users\\Adrian\\AppData\\Local\\Temp\\tmp6okeka8n\n",
      "2022-05-17 23:52:24,857 https://flair.informatik.hu-berlin.de/resources/embeddings/flair/lm-polish-backward-v0.2.pt not found in cache, downloading to C:\\Users\\Adrian\\AppData\\Local\\Temp\\tmp_6ut1zi9\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 84244196/84244196 [00:02<00:00, 35815492.94B/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2022-05-17 23:52:27,375 copying C:\\Users\\Adrian\\AppData\\Local\\Temp\\tmp_6ut1zi9 to cache at C:\\Users\\Adrian\\.flair\\embeddings\\lm-polish-backward-v0.2.pt\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2022-05-17 23:52:27,460 removing temp file C:\\Users\\Adrian\\AppData\\Local\\Temp\\tmp_6ut1zi9\n"
     ]
    }
   ],
   "source": [
    "embedding_types = [\n",
    "    WordEmbeddings('pl'),\n",
    "    FlairEmbeddings('polish-forward'),\n",
    "    FlairEmbeddings('polish-backward'),\n",
    "    CharacterEmbeddings(),\n",
    "]\n",
    "\n",
    "embeddings = StackedEmbeddings(embeddings=embedding_types)\n",
    "tagger = SequenceTagger(hidden_size=256, embeddings=embeddings,\n",
    "                        tag_dictionary=tag_dictionary,\n",
    "                        tag_type='slot', use_crf=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's see the architecture of the neural network that will be responsible for predicting\n",
    "slots in utterances."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SequenceTagger(\n",
      "  (embeddings): StackedEmbeddings(\n",
      "    (list_embedding_0): WordEmbeddings('pl')\n",
      "    (list_embedding_1): FlairEmbeddings(\n",
      "      (lm): LanguageModel(\n",
      "        (drop): Dropout(p=0.25, inplace=False)\n",
      "        (encoder): Embedding(1602, 100)\n",
      "        (rnn): LSTM(100, 2048)\n",
      "        (decoder): Linear(in_features=2048, out_features=1602, bias=True)\n",
      "      )\n",
      "    )\n",
      "    (list_embedding_2): FlairEmbeddings(\n",
      "      (lm): LanguageModel(\n",
      "        (drop): Dropout(p=0.25, inplace=False)\n",
      "        (encoder): Embedding(1602, 100)\n",
      "        (rnn): LSTM(100, 2048)\n",
      "        (decoder): Linear(in_features=2048, out_features=1602, bias=True)\n",
      "      )\n",
      "    )\n",
      "    (list_embedding_3): CharacterEmbeddings(\n",
      "      (char_embedding): Embedding(275, 25)\n",
      "      (char_rnn): LSTM(25, 25, bidirectional=True)\n",
      "    )\n",
      "  )\n",
      "  (word_dropout): WordDropout(p=0.05)\n",
      "  (locked_dropout): LockedDropout(p=0.5)\n",
      "  (embedding2nn): Linear(in_features=4446, out_features=4446, bias=True)\n",
      "  (rnn): LSTM(4446, 256, batch_first=True, bidirectional=True)\n",
      "  (linear): Linear(in_features=512, out_features=12, bias=True)\n",
      "  (beta): 1.0\n",
      "  (weights): None\n",
      "  (weight_tensor) None\n",
      ")\n"
     ]
    }
   ],
   "source": [
    "print(tagger)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will run ten training iterations (epochs) and save the resulting model in the `slot-model` directory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2022-05-17 23:52:57,432 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:52:57,433 Model: \"SequenceTagger(\n",
      "  (embeddings): StackedEmbeddings(\n",
      "    (list_embedding_0): WordEmbeddings('pl')\n",
      "    (list_embedding_1): FlairEmbeddings(\n",
      "      (lm): LanguageModel(\n",
      "        (drop): Dropout(p=0.25, inplace=False)\n",
      "        (encoder): Embedding(1602, 100)\n",
      "        (rnn): LSTM(100, 2048)\n",
      "        (decoder): Linear(in_features=2048, out_features=1602, bias=True)\n",
      "      )\n",
      "    )\n",
      "    (list_embedding_2): FlairEmbeddings(\n",
      "      (lm): LanguageModel(\n",
      "        (drop): Dropout(p=0.25, inplace=False)\n",
      "        (encoder): Embedding(1602, 100)\n",
      "        (rnn): LSTM(100, 2048)\n",
      "        (decoder): Linear(in_features=2048, out_features=1602, bias=True)\n",
      "      )\n",
      "    )\n",
      "    (list_embedding_3): CharacterEmbeddings(\n",
      "      (char_embedding): Embedding(275, 25)\n",
      "      (char_rnn): LSTM(25, 25, bidirectional=True)\n",
      "    )\n",
      "  )\n",
      "  (word_dropout): WordDropout(p=0.05)\n",
      "  (locked_dropout): LockedDropout(p=0.5)\n",
      "  (embedding2nn): Linear(in_features=4446, out_features=4446, bias=True)\n",
      "  (rnn): LSTM(4446, 256, batch_first=True, bidirectional=True)\n",
      "  (linear): Linear(in_features=512, out_features=12, bias=True)\n",
      "  (beta): 1.0\n",
      "  (weights): None\n",
      "  (weight_tensor) None\n",
      ")\"\n",
      "2022-05-17 23:52:57,434 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:52:57,435 Corpus: \"Corpus: 194 train + 22 dev + 33 test sentences\"\n",
      "2022-05-17 23:52:57,435 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:52:57,435 Parameters:\n",
      "2022-05-17 23:52:57,436 - learning_rate: \"0.1\"\n",
      "2022-05-17 23:52:57,437 - mini_batch_size: \"32\"\n",
      "2022-05-17 23:52:57,437 - patience: \"3\"\n",
      "2022-05-17 23:52:57,437 - anneal_factor: \"0.5\"\n",
      "2022-05-17 23:52:57,438 - max_epochs: \"10\"\n",
      "2022-05-17 23:52:57,439 - shuffle: \"True\"\n",
      "2022-05-17 23:52:57,440 - train_with_dev: \"False\"\n",
      "2022-05-17 23:52:57,440 - batch_growth_annealing: \"False\"\n",
      "2022-05-17 23:52:57,441 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:52:57,441 Model training base path: \"slot-model\"\n",
      "2022-05-17 23:52:57,442 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:52:57,443 Device: cpu\n",
      "2022-05-17 23:52:57,443 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:52:57,444 Embeddings storage mode: cpu\n",
      "2022-05-17 23:52:57,446 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:52:59,206 epoch 1 - iter 1/7 - loss 16.77810669 - samples/sec: 18.23 - lr: 0.100000\n",
      "2022-05-17 23:53:01,036 epoch 1 - iter 2/7 - loss 15.17136908 - samples/sec: 17.51 - lr: 0.100000\n",
      "2022-05-17 23:53:02,450 epoch 1 - iter 3/7 - loss 13.45863914 - samples/sec: 22.63 - lr: 0.100000\n",
      "2022-05-17 23:53:04,163 epoch 1 - iter 4/7 - loss 11.81387305 - samples/sec: 18.70 - lr: 0.100000\n",
      "2022-05-17 23:53:06,030 epoch 1 - iter 5/7 - loss 10.41218300 - samples/sec: 17.14 - lr: 0.100000\n",
      "2022-05-17 23:53:07,655 epoch 1 - iter 6/7 - loss 9.20362504 - samples/sec: 19.70 - lr: 0.100000\n",
      "2022-05-17 23:53:07,968 epoch 1 - iter 7/7 - loss 8.10721644 - samples/sec: 102.61 - lr: 0.100000\n",
      "2022-05-17 23:53:07,969 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:07,970 EPOCH 1 done: loss 8.1072 - lr 0.1000000\n",
      "2022-05-17 23:53:09,606 DEV : loss 3.991352081298828 - score 0.2\n",
      "2022-05-17 23:53:09,607 BAD EPOCHS (no improvement): 0\n",
      "saving best model\n",
      "2022-05-17 23:53:14,975 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:15,484 epoch 2 - iter 1/7 - loss 3.58558130 - samples/sec: 63.03 - lr: 0.100000\n",
      "2022-05-17 23:53:15,865 epoch 2 - iter 2/7 - loss 3.12797976 - samples/sec: 84.20 - lr: 0.100000\n",
      "2022-05-17 23:53:16,267 epoch 2 - iter 3/7 - loss 2.60615242 - samples/sec: 79.80 - lr: 0.100000\n",
      "2022-05-17 23:53:16,738 epoch 2 - iter 4/7 - loss 2.71958175 - samples/sec: 68.18 - lr: 0.100000\n",
      "2022-05-17 23:53:17,170 epoch 2 - iter 5/7 - loss 2.70331609 - samples/sec: 74.26 - lr: 0.100000\n",
      "2022-05-17 23:53:17,603 epoch 2 - iter 6/7 - loss 2.51522466 - samples/sec: 74.01 - lr: 0.100000\n",
      "2022-05-17 23:53:17,748 epoch 2 - iter 7/7 - loss 2.19215042 - samples/sec: 221.61 - lr: 0.100000\n",
      "2022-05-17 23:53:17,749 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:17,750 EPOCH 2 done: loss 2.1922 - lr 0.1000000\n",
      "2022-05-17 23:53:17,844 DEV : loss 3.9842920303344727 - score 0.3636\n",
      "2022-05-17 23:53:17,846 BAD EPOCHS (no improvement): 0\n",
      "saving best model\n",
      "2022-05-17 23:53:22,865 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:23,305 epoch 3 - iter 1/7 - loss 2.19582605 - samples/sec: 72.76 - lr: 0.100000\n",
      "2022-05-17 23:53:23,741 epoch 3 - iter 2/7 - loss 1.85529530 - samples/sec: 73.58 - lr: 0.100000\n",
      "2022-05-17 23:53:24,212 epoch 3 - iter 3/7 - loss 1.91948136 - samples/sec: 68.09 - lr: 0.100000\n",
      "2022-05-17 23:53:24,717 epoch 3 - iter 4/7 - loss 2.11527669 - samples/sec: 63.50 - lr: 0.100000\n",
      "2022-05-17 23:53:25,129 epoch 3 - iter 5/7 - loss 2.12587404 - samples/sec: 77.75 - lr: 0.100000\n",
      "2022-05-17 23:53:25,630 epoch 3 - iter 6/7 - loss 2.01592445 - samples/sec: 63.92 - lr: 0.100000\n",
      "2022-05-17 23:53:25,755 epoch 3 - iter 7/7 - loss 1.73551549 - samples/sec: 258.75 - lr: 0.100000\n",
      "2022-05-17 23:53:25,756 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:25,757 EPOCH 3 done: loss 1.7355 - lr 0.1000000\n",
      "2022-05-17 23:53:25,854 DEV : loss 3.3194284439086914 - score 0.3077\n",
      "2022-05-17 23:53:25,855 BAD EPOCHS (no improvement): 1\n",
      "2022-05-17 23:53:25,856 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:26,274 epoch 4 - iter 1/7 - loss 1.46010232 - samples/sec: 76.66 - lr: 0.100000\n",
      "2022-05-17 23:53:26,734 epoch 4 - iter 2/7 - loss 1.18807647 - samples/sec: 69.66 - lr: 0.100000\n",
      "2022-05-17 23:53:27,229 epoch 4 - iter 3/7 - loss 1.33144226 - samples/sec: 64.87 - lr: 0.100000\n",
      "2022-05-17 23:53:27,775 epoch 4 - iter 4/7 - loss 1.64428358 - samples/sec: 58.69 - lr: 0.100000\n",
      "2022-05-17 23:53:28,243 epoch 4 - iter 5/7 - loss 1.62551130 - samples/sec: 68.71 - lr: 0.100000\n",
      "2022-05-17 23:53:28,727 epoch 4 - iter 6/7 - loss 1.74551653 - samples/sec: 66.25 - lr: 0.100000\n",
      "2022-05-17 23:53:28,856 epoch 4 - iter 7/7 - loss 1.53921426 - samples/sec: 248.73 - lr: 0.100000\n",
      "2022-05-17 23:53:28,857 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:28,858 EPOCH 4 done: loss 1.5392 - lr 0.1000000\n",
      "2022-05-17 23:53:28,962 DEV : loss 2.8986825942993164 - score 0.2857\n",
      "2022-05-17 23:53:28,963 BAD EPOCHS (no improvement): 2\n",
      "2022-05-17 23:53:28,965 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:29,417 epoch 5 - iter 1/7 - loss 1.72827125 - samples/sec: 70.90 - lr: 0.100000\n",
      "2022-05-17 23:53:29,902 epoch 5 - iter 2/7 - loss 1.51951337 - samples/sec: 66.07 - lr: 0.100000\n",
      "2022-05-17 23:53:30,355 epoch 5 - iter 3/7 - loss 1.55555471 - samples/sec: 70.83 - lr: 0.100000\n",
      "2022-05-17 23:53:30,840 epoch 5 - iter 4/7 - loss 1.31492138 - samples/sec: 66.16 - lr: 0.100000\n",
      "2022-05-17 23:53:31,257 epoch 5 - iter 5/7 - loss 1.46497860 - samples/sec: 76.92 - lr: 0.100000\n",
      "2022-05-17 23:53:31,768 epoch 5 - iter 6/7 - loss 1.60987592 - samples/sec: 62.75 - lr: 0.100000\n",
      "2022-05-17 23:53:31,929 epoch 5 - iter 7/7 - loss 2.72113044 - samples/sec: 200.53 - lr: 0.100000\n",
      "2022-05-17 23:53:31,930 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:31,931 EPOCH 5 done: loss 2.7211 - lr 0.1000000\n",
      "2022-05-17 23:53:32,024 DEV : loss 2.766446590423584 - score 0.3077\n",
      "2022-05-17 23:53:32,025 BAD EPOCHS (no improvement): 3\n",
      "2022-05-17 23:53:32,026 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:32,475 epoch 6 - iter 1/7 - loss 1.68398678 - samples/sec: 71.62 - lr: 0.100000\n",
      "2022-05-17 23:53:32,971 epoch 6 - iter 2/7 - loss 1.67541099 - samples/sec: 64.62 - lr: 0.100000\n",
      "2022-05-17 23:53:33,400 epoch 6 - iter 3/7 - loss 1.58060956 - samples/sec: 74.78 - lr: 0.100000\n",
      "2022-05-17 23:53:33,878 epoch 6 - iter 4/7 - loss 1.55456299 - samples/sec: 66.92 - lr: 0.100000\n",
      "2022-05-17 23:53:34,278 epoch 6 - iter 5/7 - loss 1.50003145 - samples/sec: 80.28 - lr: 0.100000\n",
      "2022-05-17 23:53:34,813 epoch 6 - iter 6/7 - loss 1.46878848 - samples/sec: 60.04 - lr: 0.100000\n",
      "2022-05-17 23:53:34,951 epoch 6 - iter 7/7 - loss 1.66172016 - samples/sec: 233.22 - lr: 0.100000\n",
      "2022-05-17 23:53:34,952 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:34,952 EPOCH 6 done: loss 1.6617 - lr 0.1000000\n",
      "2022-05-17 23:53:35,040 DEV : loss 2.2595832347869873 - score 0.2857\n",
      "Epoch 6: reducing learning rate of group 0 to 5.0000e-02.\n",
      "2022-05-17 23:53:35,041 BAD EPOCHS (no improvement): 4\n",
      "2022-05-17 23:53:35,043 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:35,461 epoch 7 - iter 1/7 - loss 1.14667833 - samples/sec: 76.93 - lr: 0.050000\n",
      "2022-05-17 23:53:35,976 epoch 7 - iter 2/7 - loss 1.11618459 - samples/sec: 62.22 - lr: 0.050000\n",
      "2022-05-17 23:53:36,416 epoch 7 - iter 3/7 - loss 1.24378494 - samples/sec: 72.88 - lr: 0.050000\n",
      "2022-05-17 23:53:36,880 epoch 7 - iter 4/7 - loss 1.31663331 - samples/sec: 69.14 - lr: 0.050000\n",
      "2022-05-17 23:53:37,298 epoch 7 - iter 5/7 - loss 1.39581544 - samples/sec: 76.75 - lr: 0.050000\n",
      "2022-05-17 23:53:37,714 epoch 7 - iter 6/7 - loss 1.34690581 - samples/sec: 77.09 - lr: 0.050000\n",
      "2022-05-17 23:53:37,860 epoch 7 - iter 7/7 - loss 1.46004195 - samples/sec: 220.36 - lr: 0.050000\n",
      "2022-05-17 23:53:37,861 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:37,861 EPOCH 7 done: loss 1.4600 - lr 0.0500000\n",
      "2022-05-17 23:53:37,954 DEV : loss 2.200728416442871 - score 0.2857\n",
      "2022-05-17 23:53:37,955 BAD EPOCHS (no improvement): 1\n",
      "2022-05-17 23:53:37,956 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:38,423 epoch 8 - iter 1/7 - loss 1.14459288 - samples/sec: 68.83 - lr: 0.050000\n",
      "2022-05-17 23:53:38,805 epoch 8 - iter 2/7 - loss 0.95714736 - samples/sec: 83.88 - lr: 0.050000\n",
      "2022-05-17 23:53:39,302 epoch 8 - iter 3/7 - loss 1.17704646 - samples/sec: 64.42 - lr: 0.050000\n",
      "2022-05-17 23:53:39,781 epoch 8 - iter 4/7 - loss 1.29963121 - samples/sec: 66.92 - lr: 0.050000\n",
      "2022-05-17 23:53:40,256 epoch 8 - iter 5/7 - loss 1.34262223 - samples/sec: 67.59 - lr: 0.050000\n",
      "2022-05-17 23:53:40,704 epoch 8 - iter 6/7 - loss 1.33356750 - samples/sec: 71.53 - lr: 0.050000\n",
      "2022-05-17 23:53:40,846 epoch 8 - iter 7/7 - loss 1.20113390 - samples/sec: 226.59 - lr: 0.050000\n",
      "2022-05-17 23:53:40,847 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:40,848 EPOCH 8 done: loss 1.2011 - lr 0.0500000\n",
      "2022-05-17 23:53:40,941 DEV : loss 2.4227261543273926 - score 0.2857\n",
      "2022-05-17 23:53:40,942 BAD EPOCHS (no improvement): 2\n",
      "2022-05-17 23:53:40,943 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:41,389 epoch 9 - iter 1/7 - loss 1.12297106 - samples/sec: 71.73 - lr: 0.050000\n",
      "2022-05-17 23:53:41,800 epoch 9 - iter 2/7 - loss 0.92356640 - samples/sec: 78.01 - lr: 0.050000\n",
      "2022-05-17 23:53:42,249 epoch 9 - iter 3/7 - loss 1.02407436 - samples/sec: 71.37 - lr: 0.050000\n",
      "2022-05-17 23:53:42,667 epoch 9 - iter 4/7 - loss 1.04805315 - samples/sec: 76.71 - lr: 0.050000\n",
      "2022-05-17 23:53:43,215 epoch 9 - iter 5/7 - loss 1.33371143 - samples/sec: 58.59 - lr: 0.050000\n",
      "2022-05-17 23:53:43,661 epoch 9 - iter 6/7 - loss 1.27829826 - samples/sec: 71.89 - lr: 0.050000\n",
      "2022-05-17 23:53:43,796 epoch 9 - iter 7/7 - loss 1.10260926 - samples/sec: 240.25 - lr: 0.050000\n",
      "2022-05-17 23:53:43,797 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:43,798 EPOCH 9 done: loss 1.1026 - lr 0.0500000\n",
      "2022-05-17 23:53:43,895 DEV : loss 2.1707162857055664 - score 0.3077\n",
      "2022-05-17 23:53:43,896 BAD EPOCHS (no improvement): 3\n",
      "2022-05-17 23:53:43,903 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:44,338 epoch 10 - iter 1/7 - loss 1.34320462 - samples/sec: 73.74 - lr: 0.050000\n",
      "2022-05-17 23:53:44,808 epoch 10 - iter 2/7 - loss 0.96772069 - samples/sec: 68.25 - lr: 0.050000\n",
      "2022-05-17 23:53:45,207 epoch 10 - iter 3/7 - loss 1.06257542 - samples/sec: 80.34 - lr: 0.050000\n",
      "2022-05-17 23:53:45,729 epoch 10 - iter 4/7 - loss 0.92318819 - samples/sec: 61.50 - lr: 0.050000\n",
      "2022-05-17 23:53:46,202 epoch 10 - iter 5/7 - loss 1.08295707 - samples/sec: 67.82 - lr: 0.050000\n",
      "2022-05-17 23:53:46,707 epoch 10 - iter 6/7 - loss 1.18012399 - samples/sec: 63.49 - lr: 0.050000\n",
      "2022-05-17 23:53:46,841 epoch 10 - iter 7/7 - loss 1.01267667 - samples/sec: 239.34 - lr: 0.050000\n",
      "2022-05-17 23:53:46,842 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:46,842 EPOCH 10 done: loss 1.0127 - lr 0.0500000\n",
      "2022-05-17 23:53:46,942 DEV : loss 1.9863343238830566 - score 0.3077\n",
      "Epoch 10: reducing learning rate of group 0 to 2.5000e-02.\n",
      "2022-05-17 23:53:46,943 BAD EPOCHS (no improvement): 4\n",
      "2022-05-17 23:53:51,951 ----------------------------------------------------------------------------------------------------\n",
      "2022-05-17 23:53:51,952 Testing using best model ...\n",
      "2022-05-17 23:53:51,953 loading file slot-model\\best-model.pt\n",
      "2022-05-17 23:53:57,745 0.8000\t0.2667\t0.4000\n",
      "2022-05-17 23:53:57,746 \n",
      "Results:\n",
      "- F1-score (micro) 0.4000\n",
      "- F1-score (macro) 0.2424\n",
      "\n",
      "By class:\n",
      "date tp: 0 - fp: 0 - fn: 4 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000\n",
      "quantity tp: 4 - fp: 1 - fn: 2 - precision: 0.8000 - recall: 0.6667 - f1-score: 0.7273\n",
      "time tp: 0 - fp: 0 - fn: 5 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000\n",
      "2022-05-17 23:53:57,747 ----------------------------------------------------------------------------------------------------\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{'test_score': 0.4,\n",
       " 'dev_score_history': [0.2,\n",
       " 0.36363636363636365,\n",
       " 0.3076923076923077,\n",
       " 0.28571428571428575,\n",
       " 0.3076923076923077,\n",
       " 0.28571428571428575,\n",
       " 0.28571428571428575,\n",
       " 0.28571428571428575,\n",
       " 0.3076923076923077,\n",
       " 0.3076923076923077],\n",
       " 'train_loss_history': [8.107216443334307,\n",
       " 2.19215042250497,\n",
       " 1.735515492303031,\n",
       " 1.5392142619405473,\n",
       " 2.721130439213344,\n",
       " 1.6617201566696167,\n",
       " 1.460041948727199,\n",
       " 1.2011338983263289,\n",
       " 1.1026092597416468,\n",
       " 1.012676673276084],\n",
       " 'dev_loss_history': [3.991352081298828,\n",
       " 3.9842920303344727,\n",
       " 3.3194284439086914,\n",
       " 2.8986825942993164,\n",
       " 2.766446590423584,\n",
       " 2.2595832347869873,\n",
       " 2.200728416442871,\n",
       " 2.4227261543273926,\n",
       " 2.1707162857055664,\n",
       " 1.9863343238830566]}"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "trainer = ModelTrainer(tagger, corpus)\n",
    "trainer.train('slot-model',\n",
    "              learning_rate=0.1,\n",
    "              mini_batch_size=32,\n",
    "              max_epochs=10,\n",
    "              train_with_dev=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can assess the quality of the trained model using the metrics reported above, i.e.:\n",
    "\n",
    " - *tp (true positives)*\n",
    "\n",
    "   > the number of words labeled $e$ in the test set that the model also labeled $e$\n",
    "\n",
    " - *fp (false positives)*\n",
    "\n",
    "   > the number of words not labeled $e$ in the test set that the model labeled $e$\n",
    "\n",
    " - *fn (false negatives)*\n",
    "\n",
    "   > the number of words labeled $e$ in the test set that the model did not label $e$\n",
    "\n",
    " - *precision*\n",
    "\n",
    "   > $$\\frac{tp}{tp + fp}$$\n",
    "\n",
    " - *recall*\n",
    "\n",
    "   > $$\\frac{tp}{tp + fn}$$\n",
    "\n",
    " - $F_1$\n",
    "\n",
    "   > $$\\frac{2 \\cdot precision \\cdot recall}{precision + recall}$$\n",
    "\n",
    " - *micro* $F_1$\n",
    "\n",
    "   > $F_1$ in which $tp$, $fp$, and $fn$ are counted jointly over all labels, i.e. $tp = \\sum_{e}{{tp}_e}$, $fn = \\sum_{e}{{fn}_e}$, $fp = \\sum_{e}{{fp}_e}$\n",
    "\n",
    " - *macro* $F_1$\n",
    "\n",
    "   > the arithmetic mean of the $F_1$ scores computed for each label separately (see the sketch below)."
   ]
  },
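  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick check of these definitions, the minimal sketch below recomputes the micro and macro $F_1$ reported in the test log above from the per-class $tp$, $fp$, and $fn$ counts (the `counts` dictionary is copied by hand from that log; this is illustrative arithmetic, not flair's evaluation code):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# per-class (tp, fp, fn) counts copied from the test report above\n",
    "counts = {'date': (0, 0, 4), 'quantity': (4, 1, 2), 'time': (0, 0, 5)}\n",
    "\n",
    "def f1(tp, fp, fn):\n",
    "    precision = tp / (tp + fp) if tp + fp else 0.0\n",
    "    recall = tp / (tp + fn) if tp + fn else 0.0\n",
    "    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0\n",
    "\n",
    "# micro F1: sum the counts over all labels before applying the formulas\n",
    "micro = f1(*[sum(c[i] for c in counts.values()) for i in range(3)])\n",
    "# macro F1: average the per-label F1 scores\n",
    "macro = sum(f1(*c) for c in counts.values()) / len(counts)\n",
    "print(f'micro F1: {micro:.4f}, macro F1: {macro:.4f}')\n",
    "# expected: micro F1: 0.4000, macro F1: 0.2424"
   ]
  },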
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The trained model can be loaded from a file using the `load` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2022-05-17 23:57:03,014 loading file slot-model/final-model.pt\n"
     ]
    }
   ],
   "source": [
    "model = SequenceTagger.load('slot-model/final-model.pt')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The loaded model can be used to predict slots in user utterances with the `predict`\n",
    "function shown below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "def predict(model, sentence):\n",
    "    # wrap the tokens in the CoNLL-U-like dicts expected by conllu2flair\n",
    "    csentence = [{'form': word} for word in sentence]\n",
    "    fsentence = conllu2flair([csentence])[0]\n",
    "    model.predict(fsentence)\n",
    "    return [(token, ftoken.get_tag('slot').value) for token, ftoken in zip(sentence, fsentence)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As the example below shows, a model trained on only 100 examples makes mistakes even on a fairly\n",
    "simple utterance, e.g. labeling the hour `19:30` with the `B-quantity` tag."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<tbody>\n",
       "<tr><td>chciałbym </td><td>O </td></tr>\n",
       "<tr><td>zarezerwować</td><td>O </td></tr>\n",
       "<tr><td>2 </td><td>B-quantity</td></tr>\n",
       "<tr><td>bilety </td><td>O </td></tr>\n",
       "<tr><td>na </td><td>O </td></tr>\n",
       "<tr><td>batman </td><td>O </td></tr>\n",
       "<tr><td>na </td><td>O </td></tr>\n",
       "<tr><td>19:30 </td><td>B-quantity</td></tr>\n",
       "<tr><td>na </td><td>O </td></tr>\n",
       "<tr><td>środku </td><td>O </td></tr>\n",
       "<tr><td>z </td><td>O </td></tr>\n",
       "<tr><td>tyłun </td><td>O </td></tr>\n",
       "<tr><td>po </td><td>O </td></tr>\n",
       "<tr><td>prawej </td><td>O </td></tr>\n",
       "<tr><td>i </td><td>O </td></tr>\n",
       "<tr><td>po </td><td>O </td></tr>\n",
       "<tr><td>lewej </td><td>O </td></tr>\n",
       "<tr><td>nie </td><td>O </td></tr>\n",
       "<tr><td>chce </td><td>O </td></tr>\n",
       "<tr><td>z </td><td>O </td></tr>\n",
       "<tr><td>przodu </td><td>O </td></tr>\n",
       "</tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "'<table>\\n<tbody>\\n<tr><td>chciałbym </td><td>O </td></tr>\\n<tr><td>zarezerwować</td><td>O </td></tr>\\n<tr><td>2 </td><td>B-quantity</td></tr>\\n<tr><td>bilety </td><td>O </td></tr>\\n<tr><td>na </td><td>O </td></tr>\\n<tr><td>batman </td><td>O </td></tr>\\n<tr><td>na </td><td>O </td></tr>\\n<tr><td>19:30 </td><td>B-quantity</td></tr>\\n<tr><td>na </td><td>O </td></tr>\\n<tr><td>środku </td><td>O </td></tr>\\n<tr><td>z </td><td>O </td></tr>\\n<tr><td>tyłun </td><td>O </td></tr>\\n<tr><td>po </td><td>O </td></tr>\\n<tr><td>prawej </td><td>O </td></tr>\\n<tr><td>i </td><td>O </td></tr>\\n<tr><td>po </td><td>O </td></tr>\\n<tr><td>lewej </td><td>O </td></tr>\\n<tr><td>nie </td><td>O </td></tr>\\n<tr><td>chce </td><td>O </td></tr>\\n<tr><td>z </td><td>O </td></tr>\\n<tr><td>przodu </td><td>O </td></tr>\\n</tbody>\\n</table>'"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tabulate(predict(model, 'chciałbym zarezerwować 2 bilety na batman na 19:30 na środku z tyłun po prawej i po lewej nie chce z przodu'.split()), tablefmt='html')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "References\n",
    "----------\n",
    " 1. Sebastian Schuster, Sonal Gupta, Rushin Shah, Mike Lewis. 2019. Cross-lingual Transfer Learning for Multilingual Task Oriented Dialog. NAACL-HLT (1) 2019, pp. 3795–3805.\n",
    " 2. John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML '01). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp. 282–289, https://repository.upenn.edu/cgi/viewcontent.cgi?article=1162&context=cis_papers\n",
    " 3. Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Computation 9, 8 (November 15, 1997), pp. 1735–1780, https://doi.org/10.1162/neco.1997.9.8.1735\n",
    " 4. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. 2017. Attention is All you Need. NIPS 2017, pp. 5998–6008, https://arxiv.org/abs/1706.03762\n",
    " 5. Alan Akbik, Duncan Blythe, Roland Vollgraf. 2018. Contextual String Embeddings for Sequence Labeling. Proceedings of the 27th International Conference on Computational Linguistics, pp. 1638–1649, https://www.aclweb.org/anthology/C18-1139.pdf"
   ]
  }
 ],
 "metadata": {
  "author": "Marek Kubis",
  "email": "mkubis@amu.edu.pl",
  "jupytext": {
   "cell_metadata_filter": "-all",
   "main_language": "python",
   "notebook_metadata_filter": "-all"
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "lang": "en",
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.13"
  },
  "subtitle": "8. Semantic parsing using machine learning techniques [lab]",
  "title": "Systemy Dialogowe",
  "year": "2021"
 },
 "nbformat": 4,
 "nbformat_minor": 4
}