{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Parsing semantyczny z wykorzystaniem technik uczenia maszynowego\n",
"================================================================\n",
"\n",
"Wprowadzenie\n",
"------------\n",
"Problem wykrywania slotów i ich wartości w wypowiedziach użytkownika można sformułować jako zadanie\n",
"polegające na przewidywaniu dla poszczególnych słów etykiet wskazujących na to czy i do jakiego\n",
"slotu dane słowo należy.\n",
"\n",
"> chciałbym zarezerwować stolik na jutro**/day** na godzinę dwunastą**/hour** czterdzieści**/hour** pięć**/hour** na pięć**/size** osób\n",
"\n",
"Granice slotów oznacza się korzystając z wybranego schematu etykietowania.\n",
"\n",
"### Schemat IOB\n",
"\n",
"| Prefix | Znaczenie |\n",
"|:------:|:---------------------------|\n",
"| I | wnętrze slotu (inside) |\n",
"| O | poza slotem (outside) |\n",
"| B | początek slotu (beginning) |\n",
"\n",
"> chciałbym zarezerwować stolik na jutro**/B-day** na godzinę dwunastą**/B-hour** czterdzieści**/I-hour** pięć**/I-hour** na pięć**/B-size** osób\n",
"\n",
"### Schemat IOBES\n",
"\n",
"| Prefix | Znaczenie |\n",
"|:------:|:---------------------------|\n",
"| I | wnętrze slotu (inside) |\n",
"| O | poza slotem (outside) |\n",
"| B | początek slotu (beginning) |\n",
"| E | koniec slotu (ending) |\n",
"| S | pojedyncze słowo (single) |\n",
"\n",
"> chciałbym zarezerwować stolik na jutro**/S-day** na godzinę dwunastą**/B-hour** czterdzieści**/I-hour** pięć**/E-hour** na pięć**/S-size** osób\n",
"\n",
"Jeżeli dla tak sformułowanego zadania przygotujemy zbiór danych\n",
"złożony z wypowiedzi użytkownika z oznaczonymi slotami (tzw. *zbiór uczący*),\n",
"to możemy zastosować techniki (nadzorowanego) uczenia maszynowego w celu zbudowania modelu\n",
"annotującego wypowiedzi użytkownika etykietami slotów.\n",
"\n",
"Do zbudowania takiego modelu można wykorzystać między innymi:\n",
"\n",
" 1. warunkowe pola losowe (Lafferty i in.; 2001),\n",
"\n",
" 2. rekurencyjne sieci neuronowe, np. sieci LSTM (Hochreiter i Schmidhuber; 1997),\n",
"\n",
" 3. transformery (Vaswani i in., 2017).\n",
"\n",
"Przykład\n",
"--------\n",
"Skorzystamy ze zbioru danych przygotowanego przez Schustera (2019)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C:\\Users\\Ania\\Desktop\\System_Dialogowy_Janet\\l07\n",
"C:\\Users\\Ania\\Desktop\\System_Dialogowy_Janet\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
" % Total % Received % Xferd Average Speed Time Time Time Current\n",
" Dload Upload Total Spent Left Speed\n",
"\n",
" 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\n",
" 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\n",
" 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\n",
"\n",
" 0 8714k 0 1656 0 0 886 0 2:47:51 0:00:01 2:47:50 886\n",
" 4 8714k 4 406k 0 0 167k 0 0:00:52 0:00:02 0:00:50 721k\n",
" 33 8714k 33 2957k 0 0 863k 0 0:00:10 0:00:03 0:00:07 1898k\n",
" 69 8714k 69 6035k 0 0 1387k 0 0:00:06 0:00:04 0:00:02 2429k\n",
"100 8714k 100 8714k 0 0 1703k 0 0:00:05 0:00:05 --:--:-- 2683k\n"
]
}
],
"source": [
"!mkdir -p l07\n",
"%cd l07\n",
"!curl -L -C - https://fb.me/multilingual_task_oriented_data -o data.zip\n",
"%cd .."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Zbiór ten gromadzi wypowiedzi w trzech językach opisane slotami dla dwunastu ram należących do trzech dziedzin `Alarm`, `Reminder` oraz `Weather`. Dane wczytamy korzystając z biblioteki [conllu](https://pypi.org/project/conllu/)."
]
},
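{
"cell_type": "markdown",
"metadata": {},
"source": [
"The code below reads annotated utterances from `Janet.conllu`, a CoNLL-U-style file with four tab-separated columns matching the `fields` list in the next cell. The snippet below is only an assumed sketch of that layout (the `NoLabel` value marks tokens outside any slot and is mapped to `O` while loading):\n",
"\n",
"```\n",
"1\tchciałem\tappointment/request_prescription\tNoLabel\n",
"2\tprosić\tappointment/request_prescription\tNoLabel\n",
"3\to\tappointment/request_prescription\tNoLabel\n",
"4\twypisanie\tappointment/request_prescription\tNoLabel\n",
"5\tkolejnej\tappointment/request_prescription\tNoLabel\n",
"6\trecepty\tappointment/request_prescription\tB-prescription\n",
"```"
]
},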
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: conllu in c:\\programdata\\anaconda3\\lib\\site-packages (4.4)\n"
]
}
],
"source": [
"!pip3 install conllu\n",
"import codecs\n",
"from conllu import parse_incr\n",
"fields = ['id', 'form', 'frame', 'slot']\n",
"\n",
"def nolabel2o(line, i):\n",
" return 'O' if line[i] == 'NoLabel' else line[i]\n",
"\n",
"with open('Janet.conllu', encoding='utf-8') as trainfile:\n",
" trainset = list(parse_incr(trainfile, fields=fields, field_parsers={'slot': nolabel2o}))\n",
"with open('Janet.conllu', encoding='utf-8') as testfile:\n",
" testset = list(parse_incr(testfile, fields=fields, field_parsers={'slot': nolabel2o}))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Zobaczmy kilka przykładowych wypowiedzi z tego zbioru."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: tabulate in c:\\programdata\\anaconda3\\lib\\site-packages (0.8.9)\n"
]
},
{
"data": {
"text/html": [
"
\n",
"\n",
"1 | chciałem | appointment/request_prescription | O |
\n",
"2 | prosić | appointment/request_prescription | O |
\n",
"3 | o | appointment/request_prescription | O |
\n",
"4 | wypisanie | appointment/request_prescription | O |
\n",
"5 | kolejnej | appointment/request_prescription | O |
\n",
"6 | recepty | appointment/request_prescription | B-prescription |
\n",
"7 | na | appointment/request_prescription | O |
\n",
"8 | lek | appointment/request_prescription | B-prescription/type |
\n",
"9 | x | appointment/request_prescription | I-prescription/type |
\n",
"\n",
"
"
],
"text/plain": [
"'\\n\\n1 | chciałem | appointment/request_prescription | O |
\\n2 | prosić | appointment/request_prescription | O |
\\n3 | o | appointment/request_prescription | O |
\\n4 | wypisanie | appointment/request_prescription | O |
\\n5 | kolejnej | appointment/request_prescription | O |
\\n6 | recepty | appointment/request_prescription | B-prescription |
\\n7 | na | appointment/request_prescription | O |
\\n8 | lek | appointment/request_prescription | B-prescription/type |
\\n9 | x | appointment/request_prescription | I-prescription/type |
\\n\\n
'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"!pip3 install tabulate\n",
"from tabulate import tabulate\n",
"tabulate(trainset[0], tablefmt='html')"
]
},
{
"cell_type": "markdown",
"metadata": {
"lines_to_next_cell": 0
},
"source": [
"Na potrzeby prezentacji procesu uczenia w jupyterowym notatniku zawęzimy zbiór danych do początkowych przykładów."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Budując model skorzystamy z architektury opartej o rekurencyjne sieci neuronowe\n",
"zaimplementowanej w bibliotece [flair](https://github.com/flairNLP/flair) (Akbik i in. 2018)."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: flair in c:\\programdata\\anaconda3\\lib\\site-packages (0.8.0.post1)\n",
"Requirement already satisfied: deprecated>=1.2.4 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (1.2.12)\n",
"Requirement already satisfied: janome in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (0.4.1)\n",
"Requirement already satisfied: langdetect in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (1.0.9)\n",
"Requirement already satisfied: hyperopt>=0.1.1 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (0.2.5)\n",
"Requirement already satisfied: sentencepiece==0.1.95 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (0.1.95)\n",
"Requirement already satisfied: python-dateutil>=2.6.1 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (2.8.1)\n",
"Requirement already satisfied: regex in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (2020.10.15)\n",
"Requirement already satisfied: segtok>=1.5.7 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (1.5.10)\n",
"Requirement already satisfied: numpy<1.20.0 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (1.19.2)\n",
"Requirement already satisfied: mpld3==0.3 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (0.3)\n",
"Requirement already satisfied: bpemb>=0.3.2 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (0.3.3)\n",
"Requirement already satisfied: scikit-learn>=0.21.3 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (0.23.2)\n",
"Requirement already satisfied: matplotlib>=2.2.3 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (3.3.2)\n",
"Requirement already satisfied: sqlitedict>=1.6.0 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (1.7.0)\n",
"Requirement already satisfied: tabulate in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (0.8.9)\n",
"Requirement already satisfied: ftfy in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (6.0.1)\n",
"Requirement already satisfied: konoha<5.0.0,>=4.0.0 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (4.6.4)\n",
"Requirement already satisfied: torch<=1.7.1,>=1.5.0 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (1.7.1)\n",
"Requirement already satisfied: gensim<=3.8.3,>=3.4.0 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (3.8.3)\n",
"Requirement already satisfied: transformers>=4.0.0 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (4.6.0)\n",
"Requirement already satisfied: gdown==3.12.2 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (3.12.2)\n",
"Requirement already satisfied: tqdm>=4.26.0 in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (4.50.2)\n",
"Requirement already satisfied: lxml in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (4.6.1)\n",
"Requirement already satisfied: huggingface-hub in c:\\programdata\\anaconda3\\lib\\site-packages (from flair) (0.0.8)\n",
"Requirement already satisfied: wrapt<2,>=1.10 in c:\\users\\ania\\appdata\\roaming\\python\\python38\\site-packages (from deprecated>=1.2.4->flair) (1.12.1)\n",
"Requirement already satisfied: six in c:\\programdata\\anaconda3\\lib\\site-packages (from langdetect->flair) (1.15.0)\n",
"Requirement already satisfied: cloudpickle in c:\\programdata\\anaconda3\\lib\\site-packages (from hyperopt>=0.1.1->flair) (1.6.0)\n",
"Requirement already satisfied: scipy in c:\\programdata\\anaconda3\\lib\\site-packages (from hyperopt>=0.1.1->flair) (1.5.2)\n",
"Requirement already satisfied: networkx>=2.2 in c:\\programdata\\anaconda3\\lib\\site-packages (from hyperopt>=0.1.1->flair) (2.5)\n",
"Requirement already satisfied: future in c:\\programdata\\anaconda3\\lib\\site-packages (from hyperopt>=0.1.1->flair) (0.18.2)\n",
"Requirement already satisfied: requests in c:\\programdata\\anaconda3\\lib\\site-packages (from bpemb>=0.3.2->flair) (2.24.0)\n",
"Requirement already satisfied: joblib>=0.11 in c:\\programdata\\anaconda3\\lib\\site-packages (from scikit-learn>=0.21.3->flair) (0.17.0)\n",
"Requirement already satisfied: threadpoolctl>=2.0.0 in c:\\programdata\\anaconda3\\lib\\site-packages (from scikit-learn>=0.21.3->flair) (2.1.0)\n",
"Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in c:\\programdata\\anaconda3\\lib\\site-packages (from matplotlib>=2.2.3->flair) (2.4.7)\n",
"Requirement already satisfied: pillow>=6.2.0 in c:\\programdata\\anaconda3\\lib\\site-packages (from matplotlib>=2.2.3->flair) (8.0.1)\n",
"Requirement already satisfied: kiwisolver>=1.0.1 in c:\\programdata\\anaconda3\\lib\\site-packages (from matplotlib>=2.2.3->flair) (1.3.0)\n",
"Requirement already satisfied: certifi>=2020.06.20 in c:\\programdata\\anaconda3\\lib\\site-packages (from matplotlib>=2.2.3->flair) (2020.6.20)\n",
"Requirement already satisfied: cycler>=0.10 in c:\\programdata\\anaconda3\\lib\\site-packages (from matplotlib>=2.2.3->flair) (0.10.0)\n",
"Requirement already satisfied: wcwidth in c:\\programdata\\anaconda3\\lib\\site-packages (from ftfy->flair) (0.2.5)\n",
"Requirement already satisfied: importlib-metadata<4.0.0,>=3.7.0 in c:\\programdata\\anaconda3\\lib\\site-packages (from konoha<5.0.0,>=4.0.0->flair) (3.10.1)\n",
"Requirement already satisfied: overrides<4.0.0,>=3.0.0 in c:\\programdata\\anaconda3\\lib\\site-packages (from konoha<5.0.0,>=4.0.0->flair) (3.1.0)\n",
"Requirement already satisfied: typing-extensions in c:\\programdata\\anaconda3\\lib\\site-packages (from torch<=1.7.1,>=1.5.0->flair) (3.7.4.3)\n",
"Requirement already satisfied: smart-open>=1.8.1 in c:\\programdata\\anaconda3\\lib\\site-packages (from gensim<=3.8.3,>=3.4.0->flair) (5.0.0)\n",
"Requirement already satisfied: Cython==0.29.14 in c:\\programdata\\anaconda3\\lib\\site-packages (from gensim<=3.8.3,>=3.4.0->flair) (0.29.14)\n",
"Requirement already satisfied: packaging in c:\\programdata\\anaconda3\\lib\\site-packages (from transformers>=4.0.0->flair) (20.4)\n",
"Requirement already satisfied: filelock in c:\\programdata\\anaconda3\\lib\\site-packages (from transformers>=4.0.0->flair) (3.0.12)\n",
"Requirement already satisfied: sacremoses in c:\\programdata\\anaconda3\\lib\\site-packages (from transformers>=4.0.0->flair) (0.0.45)\n",
"Requirement already satisfied: tokenizers<0.11,>=0.10.1 in c:\\programdata\\anaconda3\\lib\\site-packages (from transformers>=4.0.0->flair) (0.10.2)\n",
"Requirement already satisfied: decorator>=4.3.0 in c:\\programdata\\anaconda3\\lib\\site-packages (from networkx>=2.2->hyperopt>=0.1.1->flair) (4.4.2)\n",
"Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in c:\\programdata\\anaconda3\\lib\\site-packages (from requests->bpemb>=0.3.2->flair) (1.25.11)\n",
"Requirement already satisfied: idna<3,>=2.5 in c:\\programdata\\anaconda3\\lib\\site-packages (from requests->bpemb>=0.3.2->flair) (2.10)\n",
"Requirement already satisfied: chardet<4,>=3.0.2 in c:\\programdata\\anaconda3\\lib\\site-packages (from requests->bpemb>=0.3.2->flair) (3.0.4)\n",
"Requirement already satisfied: zipp>=0.5 in c:\\programdata\\anaconda3\\lib\\site-packages (from importlib-metadata<4.0.0,>=3.7.0->konoha<5.0.0,>=4.0.0->flair) (3.4.0)\n",
"Requirement already satisfied: click in c:\\programdata\\anaconda3\\lib\\site-packages (from sacremoses->transformers>=4.0.0->flair) (7.1.2)\n"
]
}
],
"source": [
"!pip3 install flair"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: torch in c:\\programdata\\anaconda3\\lib\\site-packages (1.7.1)\n",
"Requirement already satisfied: numpy in c:\\programdata\\anaconda3\\lib\\site-packages (from torch) (1.19.2)\n",
"Requirement already satisfied: typing-extensions in c:\\programdata\\anaconda3\\lib\\site-packages (from torch) (3.7.4.3)\n"
]
}
],
"source": [
"from flair.data import Corpus, Sentence, Token\n",
"from flair.datasets import SentenceDataset\n",
"from flair.embeddings import StackedEmbeddings\n",
"from flair.embeddings import WordEmbeddings\n",
"from flair.embeddings import CharacterEmbeddings\n",
"from flair.embeddings import FlairEmbeddings\n",
"from flair.models import SequenceTagger\n",
"from flair.trainers import ModelTrainer\n",
"\n",
"!pip3 install torch\n",
"# determinizacja obliczeń\n",
"import random\n",
"import torch\n",
"random.seed(42)\n",
"torch.manual_seed(42)\n",
"\n",
"if torch.cuda.is_available():\n",
" torch.cuda.manual_seed(0)\n",
" torch.cuda.manual_seed_all(0)\n",
" torch.backends.cudnn.enabled = False\n",
" torch.backends.cudnn.benchmark = False\n",
" torch.backends.cudnn.deterministic = True"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Dane skonwertujemy do formatu wykorzystywanego przez `flair`, korzystając z następującej funkcji."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Corpus: 99 train + 11 dev + 110 test sentences\n",
"Dictionary with 31 tags: , O, B-prescription, B-prescription/type, I-prescription/type, B-end_conversation, B-deny, I-end_conversation, B-greeting, I-greeting, B-appointment, B-appointment/doctor, I-appointment/doctor, B-datetime, NoLabel I-end_conversation, I-datetime, B-affirm, B-appointment/office, I-B-datetime, B-results, B-appointment/type, I-appointment/type, B-register/email, B-doctor, I-affirm, B-appoinment/doctor, B-appoinment, B-register/name, I-register/name, \n"
]
}
],
"source": [
"def conllu2flair(sentences, label=None):\n",
" fsentences = []\n",
"\n",
" for sentence in sentences:\n",
" fsentence = Sentence()\n",
"\n",
" for token in sentence:\n",
" ftoken = Token(token['form'])\n",
"\n",
" if label:\n",
" ftoken.add_tag(label, token[label])\n",
"\n",
" fsentence.add_token(ftoken)\n",
"\n",
" fsentences.append(fsentence)\n",
"\n",
" return SentenceDataset(fsentences)\n",
"\n",
"corpus = Corpus(train=conllu2flair(trainset, 'slot'), test=conllu2flair(testset, 'slot'))\n",
"print(corpus)\n",
"tag_dictionary = corpus.make_tag_dictionary(tag_type='slot')\n",
"print(tag_dictionary)"
]
},
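{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick, illustrative sanity check, we can print the first converted sentence together with its slot tags using flair's `to_tagged_string` (a minimal example; any sentence index would do)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Show the first training sentence with the slot annotations recovered from its tags.\n",
"print(corpus.train[0].to_tagged_string('slot'))"
]
},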
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Nasz model będzie wykorzystywał wektorowe reprezentacje słów (zob. [Word Embeddings](https://github.com/flairNLP/flair/blob/master/resources/docs/TUTORIAL_3_WORD_EMBEDDING.md))."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"embedding_types = [\n",
" WordEmbeddings('pl'),\n",
" FlairEmbeddings('pl-forward'),\n",
" FlairEmbeddings('pl-backward'),\n",
" CharacterEmbeddings(),\n",
"]\n",
"\n",
"embeddings = StackedEmbeddings(embeddings=embedding_types)\n",
"tagger = SequenceTagger(hidden_size=256, embeddings=embeddings,\n",
" tag_dictionary=tag_dictionary,\n",
" tag_type='slot', use_crf=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Zobaczmy jak wygląda architektura sieci neuronowej, która będzie odpowiedzialna za przewidywanie\n",
"slotów w wypowiedziach."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SequenceTagger(\n",
" (embeddings): StackedEmbeddings(\n",
" (list_embedding_0): WordEmbeddings('pl')\n",
" (list_embedding_1): FlairEmbeddings(\n",
" (lm): LanguageModel(\n",
" (drop): Dropout(p=0.25, inplace=False)\n",
" (encoder): Embedding(1602, 100)\n",
" (rnn): LSTM(100, 2048)\n",
" (decoder): Linear(in_features=2048, out_features=1602, bias=True)\n",
" )\n",
" )\n",
" (list_embedding_2): FlairEmbeddings(\n",
" (lm): LanguageModel(\n",
" (drop): Dropout(p=0.25, inplace=False)\n",
" (encoder): Embedding(1602, 100)\n",
" (rnn): LSTM(100, 2048)\n",
" (decoder): Linear(in_features=2048, out_features=1602, bias=True)\n",
" )\n",
" )\n",
" (list_embedding_3): CharacterEmbeddings(\n",
" (char_embedding): Embedding(275, 25)\n",
" (char_rnn): LSTM(25, 25, bidirectional=True)\n",
" )\n",
" )\n",
" (word_dropout): WordDropout(p=0.05)\n",
" (locked_dropout): LockedDropout(p=0.5)\n",
" (embedding2nn): Linear(in_features=4446, out_features=4446, bias=True)\n",
" (rnn): LSTM(4446, 256, batch_first=True, bidirectional=True)\n",
" (linear): Linear(in_features=512, out_features=31, bias=True)\n",
" (beta): 1.0\n",
" (weights): None\n",
" (weight_tensor) None\n",
")\n"
]
}
],
"source": [
"print(tagger)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Wykonamy dziesięć iteracji (epok) uczenia a wynikowy model zapiszemy w katalogu `slot-model`."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2021-05-16 19:11:09,838 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:09,846 Model: \"SequenceTagger(\n",
" (embeddings): StackedEmbeddings(\n",
" (list_embedding_0): WordEmbeddings('pl')\n",
" (list_embedding_1): FlairEmbeddings(\n",
" (lm): LanguageModel(\n",
" (drop): Dropout(p=0.25, inplace=False)\n",
" (encoder): Embedding(1602, 100)\n",
" (rnn): LSTM(100, 2048)\n",
" (decoder): Linear(in_features=2048, out_features=1602, bias=True)\n",
" )\n",
" )\n",
" (list_embedding_2): FlairEmbeddings(\n",
" (lm): LanguageModel(\n",
" (drop): Dropout(p=0.25, inplace=False)\n",
" (encoder): Embedding(1602, 100)\n",
" (rnn): LSTM(100, 2048)\n",
" (decoder): Linear(in_features=2048, out_features=1602, bias=True)\n",
" )\n",
" )\n",
" (list_embedding_3): CharacterEmbeddings(\n",
" (char_embedding): Embedding(275, 25)\n",
" (char_rnn): LSTM(25, 25, bidirectional=True)\n",
" )\n",
" )\n",
" (word_dropout): WordDropout(p=0.05)\n",
" (locked_dropout): LockedDropout(p=0.5)\n",
" (embedding2nn): Linear(in_features=4446, out_features=4446, bias=True)\n",
" (rnn): LSTM(4446, 256, batch_first=True, bidirectional=True)\n",
" (linear): Linear(in_features=512, out_features=31, bias=True)\n",
" (beta): 1.0\n",
" (weights): None\n",
" (weight_tensor) None\n",
")\"\n",
"2021-05-16 19:11:09,846 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:09,846 Corpus: \"Corpus: 99 train + 11 dev + 110 test sentences\"\n",
"2021-05-16 19:11:09,846 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:09,846 Parameters:\n",
"2021-05-16 19:11:09,854 - learning_rate: \"0.1\"\n",
"2021-05-16 19:11:09,854 - mini_batch_size: \"32\"\n",
"2021-05-16 19:11:09,854 - patience: \"3\"\n",
"2021-05-16 19:11:09,854 - anneal_factor: \"0.5\"\n",
"2021-05-16 19:11:09,854 - max_epochs: \"100\"\n",
"2021-05-16 19:11:09,854 - shuffle: \"True\"\n",
"2021-05-16 19:11:09,854 - train_with_dev: \"False\"\n",
"2021-05-16 19:11:09,854 - batch_growth_annealing: \"False\"\n",
"2021-05-16 19:11:09,862 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:09,862 Model training base path: \"slot-model\"\n",
"2021-05-16 19:11:09,862 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:09,862 Device: cpu\n",
"2021-05-16 19:11:09,862 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:09,862 Embeddings storage mode: cpu\n",
"2021-05-16 19:11:09,870 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:12,779 epoch 1 - iter 1/4 - loss 23.51556206 - samples/sec: 11.00 - lr: 0.100000\n",
"2021-05-16 19:11:16,270 epoch 1 - iter 2/4 - loss 19.95522118 - samples/sec: 9.17 - lr: 0.100000\n",
"2021-05-16 19:11:19,989 epoch 1 - iter 3/4 - loss 18.64025307 - samples/sec: 8.64 - lr: 0.100000\n",
"2021-05-16 19:11:20,665 epoch 1 - iter 4/4 - loss 16.56225991 - samples/sec: 47.34 - lr: 0.100000\n",
"2021-05-16 19:11:20,665 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:20,665 EPOCH 1 done: loss 16.5623 - lr 0.1000000\n",
"2021-05-16 19:11:23,175 DEV : loss 12.217952728271484 - score 0.0\n",
"2021-05-16 19:11:23,175 BAD EPOCHS (no improvement): 0\n",
"saving best model\n",
"2021-05-16 19:11:31,472 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:32,200 epoch 2 - iter 1/4 - loss 13.48146439 - samples/sec: 44.15 - lr: 0.100000\n",
"2021-05-16 19:11:32,902 epoch 2 - iter 2/4 - loss 13.13387251 - samples/sec: 45.60 - lr: 0.100000\n",
"2021-05-16 19:11:33,485 epoch 2 - iter 3/4 - loss 12.05493037 - samples/sec: 54.92 - lr: 0.100000\n",
"2021-05-16 19:11:33,672 epoch 2 - iter 4/4 - loss 10.83767450 - samples/sec: 170.46 - lr: 0.100000\n",
"2021-05-16 19:11:33,672 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:33,672 EPOCH 2 done: loss 10.8377 - lr 0.1000000\n",
"2021-05-16 19:11:33,768 DEV : loss 8.176359176635742 - score 0.0\n",
"2021-05-16 19:11:33,771 BAD EPOCHS (no improvement): 0\n",
"saving best model\n",
"2021-05-16 19:11:42,363 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:43,054 epoch 3 - iter 1/4 - loss 9.78410912 - samples/sec: 46.31 - lr: 0.100000\n",
"2021-05-16 19:11:43,672 epoch 3 - iter 2/4 - loss 9.88690376 - samples/sec: 51.75 - lr: 0.100000\n",
"2021-05-16 19:11:44,405 epoch 3 - iter 3/4 - loss 9.67457644 - samples/sec: 43.69 - lr: 0.100000\n",
"2021-05-16 19:11:44,589 epoch 3 - iter 4/4 - loss 8.94925010 - samples/sec: 173.35 - lr: 0.100000\n",
"2021-05-16 19:11:44,589 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:44,589 EPOCH 3 done: loss 8.9493 - lr 0.1000000\n",
"2021-05-16 19:11:44,693 DEV : loss 7.451809883117676 - score 0.0\n",
"2021-05-16 19:11:44,693 BAD EPOCHS (no improvement): 0\n",
"saving best model\n",
"2021-05-16 19:11:53,845 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:54,437 epoch 4 - iter 1/4 - loss 8.59626198 - samples/sec: 55.55 - lr: 0.100000\n",
"2021-05-16 19:11:55,150 epoch 4 - iter 2/4 - loss 8.40540457 - samples/sec: 44.85 - lr: 0.100000\n",
"2021-05-16 19:11:55,995 epoch 4 - iter 3/4 - loss 8.39408366 - samples/sec: 37.88 - lr: 0.100000\n",
"2021-05-16 19:11:56,222 epoch 4 - iter 4/4 - loss 7.31822419 - samples/sec: 141.22 - lr: 0.100000\n",
"2021-05-16 19:11:56,222 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:56,222 EPOCH 4 done: loss 7.3182 - lr 0.1000000\n",
"2021-05-16 19:11:56,309 DEV : loss 7.464598178863525 - score 0.0\n",
"2021-05-16 19:11:56,309 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:11:56,309 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:57,036 epoch 5 - iter 1/4 - loss 7.71572590 - samples/sec: 44.96 - lr: 0.100000\n",
"2021-05-16 19:11:57,744 epoch 5 - iter 2/4 - loss 8.43728781 - samples/sec: 45.20 - lr: 0.100000\n",
"2021-05-16 19:11:58,488 epoch 5 - iter 3/4 - loss 7.66639407 - samples/sec: 43.01 - lr: 0.100000\n",
"2021-05-16 19:11:58,705 epoch 5 - iter 4/4 - loss 8.57210910 - samples/sec: 147.23 - lr: 0.100000\n",
"2021-05-16 19:11:58,705 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:11:58,705 EPOCH 5 done: loss 8.5721 - lr 0.1000000\n",
"2021-05-16 19:11:58,801 DEV : loss 7.330676555633545 - score 0.0645\n",
"2021-05-16 19:11:58,809 BAD EPOCHS (no improvement): 0\n",
"saving best model\n",
"2021-05-16 19:12:09,132 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:10,066 epoch 6 - iter 1/4 - loss 6.82695341 - samples/sec: 34.26 - lr: 0.100000\n",
"2021-05-16 19:12:10,923 epoch 6 - iter 2/4 - loss 6.71814942 - samples/sec: 37.31 - lr: 0.100000\n",
"2021-05-16 19:12:11,835 epoch 6 - iter 3/4 - loss 7.02111626 - samples/sec: 35.09 - lr: 0.100000\n",
"2021-05-16 19:12:12,029 epoch 6 - iter 4/4 - loss 8.55612421 - samples/sec: 165.49 - lr: 0.100000\n",
"2021-05-16 19:12:12,029 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:12,029 EPOCH 6 done: loss 8.5561 - lr 0.1000000\n",
"2021-05-16 19:12:12,117 DEV : loss 5.898077011108398 - score 0.0\n",
"2021-05-16 19:12:12,117 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:12:12,117 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:12,829 epoch 7 - iter 1/4 - loss 3.95063305 - samples/sec: 45.47 - lr: 0.100000\n",
"2021-05-16 19:12:13,605 epoch 7 - iter 2/4 - loss 4.73969674 - samples/sec: 41.22 - lr: 0.100000\n",
"2021-05-16 19:12:14,424 epoch 7 - iter 3/4 - loss 6.22298797 - samples/sec: 39.08 - lr: 0.100000\n",
"2021-05-16 19:12:14,648 epoch 7 - iter 4/4 - loss 7.01634419 - samples/sec: 142.74 - lr: 0.100000\n",
"2021-05-16 19:12:14,648 ----------------------------------------------------------------------------------------------------\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"2021-05-16 19:12:14,648 EPOCH 7 done: loss 7.0163 - lr 0.1000000\n",
"2021-05-16 19:12:14,745 DEV : loss 5.496520519256592 - score 0.1538\n",
"2021-05-16 19:12:14,745 BAD EPOCHS (no improvement): 0\n",
"saving best model\n",
"2021-05-16 19:12:24,553 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:25,305 epoch 8 - iter 1/4 - loss 5.84166050 - samples/sec: 43.01 - lr: 0.100000\n",
"2021-05-16 19:12:26,009 epoch 8 - iter 2/4 - loss 5.58190751 - samples/sec: 45.43 - lr: 0.100000\n",
"2021-05-16 19:12:26,803 epoch 8 - iter 3/4 - loss 6.09121291 - samples/sec: 40.28 - lr: 0.100000\n",
"2021-05-16 19:12:27,011 epoch 8 - iter 4/4 - loss 5.20219183 - samples/sec: 153.85 - lr: 0.100000\n",
"2021-05-16 19:12:27,011 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:27,011 EPOCH 8 done: loss 5.2022 - lr 0.1000000\n",
"2021-05-16 19:12:27,099 DEV : loss 5.2129292488098145 - score 0.3478\n",
"2021-05-16 19:12:27,099 BAD EPOCHS (no improvement): 0\n",
"saving best model\n",
"2021-05-16 19:12:37,200 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:37,968 epoch 9 - iter 1/4 - loss 6.38291883 - samples/sec: 41.64 - lr: 0.100000\n",
"2021-05-16 19:12:38,703 epoch 9 - iter 2/4 - loss 6.26358747 - samples/sec: 43.56 - lr: 0.100000\n",
"2021-05-16 19:12:39,284 epoch 9 - iter 3/4 - loss 5.50593615 - samples/sec: 55.03 - lr: 0.100000\n",
"2021-05-16 19:12:39,476 epoch 9 - iter 4/4 - loss 4.59320381 - samples/sec: 166.66 - lr: 0.100000\n",
"2021-05-16 19:12:39,476 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:39,476 EPOCH 9 done: loss 4.5932 - lr 0.1000000\n",
"2021-05-16 19:12:39,580 DEV : loss 4.9869303703308105 - score 0.2609\n",
"2021-05-16 19:12:39,580 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:12:39,590 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:40,387 epoch 10 - iter 1/4 - loss 4.83267832 - samples/sec: 40.15 - lr: 0.100000\n",
"2021-05-16 19:12:41,158 epoch 10 - iter 2/4 - loss 4.78956985 - samples/sec: 41.52 - lr: 0.100000\n",
"2021-05-16 19:12:41,792 epoch 10 - iter 3/4 - loss 4.80196079 - samples/sec: 50.47 - lr: 0.100000\n",
"2021-05-16 19:12:41,993 epoch 10 - iter 4/4 - loss 4.40808117 - samples/sec: 158.79 - lr: 0.100000\n",
"2021-05-16 19:12:42,001 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:42,001 EPOCH 10 done: loss 4.4081 - lr 0.1000000\n",
"2021-05-16 19:12:42,089 DEV : loss 4.855195045471191 - score 0.3077\n",
"2021-05-16 19:12:42,089 BAD EPOCHS (no improvement): 2\n",
"2021-05-16 19:12:42,097 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:42,780 epoch 11 - iter 1/4 - loss 3.66451931 - samples/sec: 46.87 - lr: 0.100000\n",
"2021-05-16 19:12:43,666 epoch 11 - iter 2/4 - loss 4.65244174 - samples/sec: 36.16 - lr: 0.100000\n",
"2021-05-16 19:12:44,456 epoch 11 - iter 3/4 - loss 4.58611314 - samples/sec: 40.51 - lr: 0.100000\n",
"2021-05-16 19:12:44,648 epoch 11 - iter 4/4 - loss 4.86016536 - samples/sec: 166.85 - lr: 0.100000\n",
"2021-05-16 19:12:44,656 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:44,656 EPOCH 11 done: loss 4.8602 - lr 0.1000000\n",
"2021-05-16 19:12:44,737 DEV : loss 4.352779865264893 - score 0.3478\n",
"2021-05-16 19:12:44,745 BAD EPOCHS (no improvement): 0\n",
"saving best model\n",
"2021-05-16 19:12:53,094 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:53,668 epoch 12 - iter 1/4 - loss 3.02415586 - samples/sec: 55.76 - lr: 0.100000\n",
"2021-05-16 19:12:54,381 epoch 12 - iter 2/4 - loss 3.78920162 - samples/sec: 44.90 - lr: 0.100000\n",
"2021-05-16 19:12:55,097 epoch 12 - iter 3/4 - loss 4.02983785 - samples/sec: 44.67 - lr: 0.100000\n",
"2021-05-16 19:12:55,304 epoch 12 - iter 4/4 - loss 3.44744644 - samples/sec: 154.91 - lr: 0.100000\n",
"2021-05-16 19:12:55,304 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:55,304 EPOCH 12 done: loss 3.4474 - lr 0.1000000\n",
"2021-05-16 19:12:55,402 DEV : loss 4.364665508270264 - score 0.3333\n",
"2021-05-16 19:12:55,402 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:12:55,414 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:56,003 epoch 13 - iter 1/4 - loss 4.21208715 - samples/sec: 54.32 - lr: 0.100000\n",
"2021-05-16 19:12:56,765 epoch 13 - iter 2/4 - loss 4.02075458 - samples/sec: 42.01 - lr: 0.100000\n",
"2021-05-16 19:12:57,528 epoch 13 - iter 3/4 - loss 3.93069355 - samples/sec: 41.92 - lr: 0.100000\n",
"2021-05-16 19:12:57,757 epoch 13 - iter 4/4 - loss 4.47141653 - samples/sec: 139.66 - lr: 0.100000\n",
"2021-05-16 19:12:57,757 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:12:57,757 EPOCH 13 done: loss 4.4714 - lr 0.1000000\n",
"2021-05-16 19:12:57,856 DEV : loss 4.251131057739258 - score 0.4615\n",
"2021-05-16 19:12:57,856 BAD EPOCHS (no improvement): 0\n",
"saving best model\n",
"2021-05-16 19:13:07,766 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:08,603 epoch 14 - iter 1/4 - loss 4.07004356 - samples/sec: 38.23 - lr: 0.100000\n",
"2021-05-16 19:13:09,137 epoch 14 - iter 2/4 - loss 3.58775365 - samples/sec: 60.00 - lr: 0.100000\n",
"2021-05-16 19:13:09,805 epoch 14 - iter 3/4 - loss 3.37540340 - samples/sec: 49.04 - lr: 0.100000\n",
"2021-05-16 19:13:10,017 epoch 14 - iter 4/4 - loss 3.30140239 - samples/sec: 150.99 - lr: 0.100000\n",
"2021-05-16 19:13:10,017 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:10,017 EPOCH 14 done: loss 3.3014 - lr 0.1000000\n",
"2021-05-16 19:13:10,108 DEV : loss 3.9291062355041504 - score 0.4348\n",
"2021-05-16 19:13:10,108 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:13:10,126 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:10,799 epoch 15 - iter 1/4 - loss 4.12087154 - samples/sec: 47.53 - lr: 0.100000\n",
"2021-05-16 19:13:11,479 epoch 15 - iter 2/4 - loss 3.45777619 - samples/sec: 47.09 - lr: 0.100000\n",
"2021-05-16 19:13:12,230 epoch 15 - iter 3/4 - loss 3.44035808 - samples/sec: 42.59 - lr: 0.100000\n",
"2021-05-16 19:13:12,392 epoch 15 - iter 4/4 - loss 2.90269253 - samples/sec: 197.83 - lr: 0.100000\n",
"2021-05-16 19:13:12,408 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:12,408 EPOCH 15 done: loss 2.9027 - lr 0.1000000\n",
"2021-05-16 19:13:12,498 DEV : loss 4.368889808654785 - score 0.6923\n",
"2021-05-16 19:13:12,498 BAD EPOCHS (no improvement): 0\n",
"saving best model\n",
"2021-05-16 19:13:22,020 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:22,716 epoch 16 - iter 1/4 - loss 2.49819446 - samples/sec: 45.95 - lr: 0.100000\n",
"2021-05-16 19:13:23,466 epoch 16 - iter 2/4 - loss 3.36824119 - samples/sec: 43.59 - lr: 0.100000\n",
"2021-05-16 19:13:24,067 epoch 16 - iter 3/4 - loss 3.36522110 - samples/sec: 53.20 - lr: 0.100000\n",
"2021-05-16 19:13:24,253 epoch 16 - iter 4/4 - loss 3.36765742 - samples/sec: 188.42 - lr: 0.100000\n",
"2021-05-16 19:13:24,253 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:24,253 EPOCH 16 done: loss 3.3677 - lr 0.1000000\n",
"2021-05-16 19:13:24,348 DEV : loss 3.6790337562561035 - score 0.5833\n",
"2021-05-16 19:13:24,348 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:13:24,356 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:24,905 epoch 17 - iter 1/4 - loss 3.17663288 - samples/sec: 58.35 - lr: 0.100000\n",
"2021-05-16 19:13:25,620 epoch 17 - iter 2/4 - loss 3.24819005 - samples/sec: 44.73 - lr: 0.100000\n",
"2021-05-16 19:13:26,267 epoch 17 - iter 3/4 - loss 2.86507106 - samples/sec: 49.44 - lr: 0.100000\n",
"2021-05-16 19:13:26,483 epoch 17 - iter 4/4 - loss 4.03450483 - samples/sec: 160.21 - lr: 0.100000\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"2021-05-16 19:13:26,483 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:26,483 EPOCH 17 done: loss 4.0345 - lr 0.1000000\n",
"2021-05-16 19:13:26,579 DEV : loss 3.864961862564087 - score 0.6154\n",
"2021-05-16 19:13:26,580 BAD EPOCHS (no improvement): 2\n",
"2021-05-16 19:13:26,583 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:27,322 epoch 18 - iter 1/4 - loss 3.06332946 - samples/sec: 43.30 - lr: 0.100000\n",
"2021-05-16 19:13:27,901 epoch 18 - iter 2/4 - loss 3.11640310 - samples/sec: 55.27 - lr: 0.100000\n",
"2021-05-16 19:13:28,698 epoch 18 - iter 3/4 - loss 2.99107130 - samples/sec: 40.18 - lr: 0.100000\n",
"2021-05-16 19:13:28,898 epoch 18 - iter 4/4 - loss 2.94846284 - samples/sec: 160.00 - lr: 0.100000\n",
"2021-05-16 19:13:28,898 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:28,898 EPOCH 18 done: loss 2.9485 - lr 0.1000000\n",
"2021-05-16 19:13:28,986 DEV : loss 3.8492608070373535 - score 0.48\n",
"2021-05-16 19:13:28,994 BAD EPOCHS (no improvement): 3\n",
"2021-05-16 19:13:28,994 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:29,622 epoch 19 - iter 1/4 - loss 2.81688428 - samples/sec: 50.89 - lr: 0.100000\n",
"2021-05-16 19:13:30,354 epoch 19 - iter 2/4 - loss 2.99261010 - samples/sec: 44.72 - lr: 0.100000\n",
"2021-05-16 19:13:30,979 epoch 19 - iter 3/4 - loss 2.85697055 - samples/sec: 51.15 - lr: 0.100000\n",
"2021-05-16 19:13:31,139 epoch 19 - iter 4/4 - loss 2.25571273 - samples/sec: 200.02 - lr: 0.100000\n",
"2021-05-16 19:13:31,139 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:31,139 EPOCH 19 done: loss 2.2557 - lr 0.1000000\n",
"2021-05-16 19:13:31,235 DEV : loss 3.9649171829223633 - score 0.5185\n",
"Epoch 19: reducing learning rate of group 0 to 5.0000e-02.\n",
"2021-05-16 19:13:31,235 BAD EPOCHS (no improvement): 4\n",
"2021-05-16 19:13:31,242 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:31,906 epoch 20 - iter 1/4 - loss 3.35270214 - samples/sec: 48.22 - lr: 0.050000\n",
"2021-05-16 19:13:32,555 epoch 20 - iter 2/4 - loss 2.56608105 - samples/sec: 49.28 - lr: 0.050000\n",
"2021-05-16 19:13:33,131 epoch 20 - iter 3/4 - loss 2.33327313 - samples/sec: 56.37 - lr: 0.050000\n",
"2021-05-16 19:13:33,332 epoch 20 - iter 4/4 - loss 2.89689222 - samples/sec: 165.52 - lr: 0.050000\n",
"2021-05-16 19:13:33,340 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:33,340 EPOCH 20 done: loss 2.8969 - lr 0.0500000\n",
"2021-05-16 19:13:33,421 DEV : loss 3.6375184059143066 - score 0.56\n",
"2021-05-16 19:13:33,421 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:13:33,421 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:34,102 epoch 21 - iter 1/4 - loss 2.03401089 - samples/sec: 47.65 - lr: 0.050000\n",
"2021-05-16 19:13:34,750 epoch 21 - iter 2/4 - loss 2.45254445 - samples/sec: 49.40 - lr: 0.050000\n",
"2021-05-16 19:13:35,405 epoch 21 - iter 3/4 - loss 2.02827569 - samples/sec: 48.84 - lr: 0.050000\n",
"2021-05-16 19:13:35,652 epoch 21 - iter 4/4 - loss 2.53652957 - samples/sec: 129.49 - lr: 0.050000\n",
"2021-05-16 19:13:35,652 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:35,652 EPOCH 21 done: loss 2.5365 - lr 0.0500000\n",
"2021-05-16 19:13:35,756 DEV : loss 3.636472463607788 - score 0.56\n",
"2021-05-16 19:13:35,756 BAD EPOCHS (no improvement): 2\n",
"2021-05-16 19:13:35,763 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:36,461 epoch 22 - iter 1/4 - loss 2.35593867 - samples/sec: 45.85 - lr: 0.050000\n",
"2021-05-16 19:13:37,157 epoch 22 - iter 2/4 - loss 1.78290999 - samples/sec: 45.97 - lr: 0.050000\n",
"2021-05-16 19:13:37,821 epoch 22 - iter 3/4 - loss 2.12207437 - samples/sec: 48.21 - lr: 0.050000\n",
"2021-05-16 19:13:38,014 epoch 22 - iter 4/4 - loss 2.15731788 - samples/sec: 165.55 - lr: 0.050000\n",
"2021-05-16 19:13:38,014 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:38,021 EPOCH 22 done: loss 2.1573 - lr 0.0500000\n",
"2021-05-16 19:13:38,108 DEV : loss 3.7137885093688965 - score 0.6667\n",
"2021-05-16 19:13:38,116 BAD EPOCHS (no improvement): 3\n",
"2021-05-16 19:13:38,116 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:38,822 epoch 23 - iter 1/4 - loss 1.83278751 - samples/sec: 45.53 - lr: 0.050000\n",
"2021-05-16 19:13:39,736 epoch 23 - iter 2/4 - loss 2.04161525 - samples/sec: 35.03 - lr: 0.050000\n",
"2021-05-16 19:13:40,684 epoch 23 - iter 3/4 - loss 2.19689337 - samples/sec: 33.76 - lr: 0.050000\n",
"2021-05-16 19:13:40,933 epoch 23 - iter 4/4 - loss 1.73538903 - samples/sec: 128.34 - lr: 0.050000\n",
"2021-05-16 19:13:40,934 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:40,934 EPOCH 23 done: loss 1.7354 - lr 0.0500000\n",
"2021-05-16 19:13:41,043 DEV : loss 3.495877265930176 - score 0.5833\n",
"Epoch 23: reducing learning rate of group 0 to 2.5000e-02.\n",
"2021-05-16 19:13:41,043 BAD EPOCHS (no improvement): 4\n",
"2021-05-16 19:13:41,051 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:41,949 epoch 24 - iter 1/4 - loss 2.58249235 - samples/sec: 35.62 - lr: 0.025000\n",
"2021-05-16 19:13:42,545 epoch 24 - iter 2/4 - loss 2.33847690 - samples/sec: 53.73 - lr: 0.025000\n",
"2021-05-16 19:13:43,209 epoch 24 - iter 3/4 - loss 2.05386758 - samples/sec: 48.20 - lr: 0.025000\n",
"2021-05-16 19:13:43,426 epoch 24 - iter 4/4 - loss 1.69814771 - samples/sec: 147.27 - lr: 0.025000\n",
"2021-05-16 19:13:43,426 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:43,426 EPOCH 24 done: loss 1.6981 - lr 0.0250000\n",
"2021-05-16 19:13:43,514 DEV : loss 3.547339677810669 - score 0.5833\n",
"2021-05-16 19:13:43,514 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:13:43,514 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:44,502 epoch 25 - iter 1/4 - loss 2.63612175 - samples/sec: 32.67 - lr: 0.025000\n",
"2021-05-16 19:13:45,551 epoch 25 - iter 2/4 - loss 2.28528547 - samples/sec: 30.49 - lr: 0.025000\n",
"2021-05-16 19:13:46,368 epoch 25 - iter 3/4 - loss 2.18019919 - samples/sec: 39.20 - lr: 0.025000\n",
"2021-05-16 19:13:46,585 epoch 25 - iter 4/4 - loss 1.82882562 - samples/sec: 147.22 - lr: 0.025000\n",
"2021-05-16 19:13:46,585 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:46,585 EPOCH 25 done: loss 1.8288 - lr 0.0250000\n",
"2021-05-16 19:13:46,681 DEV : loss 3.695451259613037 - score 0.6667\n",
"2021-05-16 19:13:46,681 BAD EPOCHS (no improvement): 2\n",
"2021-05-16 19:13:46,681 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:47,435 epoch 26 - iter 1/4 - loss 2.46649575 - samples/sec: 42.90 - lr: 0.025000\n",
"2021-05-16 19:13:48,195 epoch 26 - iter 2/4 - loss 1.86319947 - samples/sec: 42.09 - lr: 0.025000\n",
"2021-05-16 19:13:49,101 epoch 26 - iter 3/4 - loss 1.99375129 - samples/sec: 35.34 - lr: 0.025000\n",
"2021-05-16 19:13:49,350 epoch 26 - iter 4/4 - loss 2.51209539 - samples/sec: 132.64 - lr: 0.025000\n",
"2021-05-16 19:13:49,350 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:49,350 EPOCH 26 done: loss 2.5121 - lr 0.0250000\n",
"2021-05-16 19:13:49,454 DEV : loss 3.5949974060058594 - score 0.6667\n",
"2021-05-16 19:13:49,457 BAD EPOCHS (no improvement): 3\n",
"2021-05-16 19:13:49,457 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:50,194 epoch 27 - iter 1/4 - loss 1.67152703 - samples/sec: 43.40 - lr: 0.025000\n",
"2021-05-16 19:13:50,906 epoch 27 - iter 2/4 - loss 1.81827271 - samples/sec: 44.95 - lr: 0.025000\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"2021-05-16 19:13:51,642 epoch 27 - iter 3/4 - loss 1.91284267 - samples/sec: 43.46 - lr: 0.025000\n",
"2021-05-16 19:13:51,834 epoch 27 - iter 4/4 - loss 2.51718122 - samples/sec: 166.65 - lr: 0.025000\n",
"2021-05-16 19:13:51,834 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:51,834 EPOCH 27 done: loss 2.5172 - lr 0.0250000\n",
"2021-05-16 19:13:51,930 DEV : loss 3.624786376953125 - score 0.6667\n",
"Epoch 27: reducing learning rate of group 0 to 1.2500e-02.\n",
"2021-05-16 19:13:51,930 BAD EPOCHS (no improvement): 4\n",
"2021-05-16 19:13:51,930 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:52,650 epoch 28 - iter 1/4 - loss 2.06657982 - samples/sec: 44.45 - lr: 0.012500\n",
"2021-05-16 19:13:53,405 epoch 28 - iter 2/4 - loss 2.16739893 - samples/sec: 42.42 - lr: 0.012500\n",
"2021-05-16 19:13:54,234 epoch 28 - iter 3/4 - loss 1.87206562 - samples/sec: 38.60 - lr: 0.012500\n",
"2021-05-16 19:13:54,402 epoch 28 - iter 4/4 - loss 1.53354126 - samples/sec: 190.48 - lr: 0.012500\n",
"2021-05-16 19:13:54,410 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:54,410 EPOCH 28 done: loss 1.5335 - lr 0.0125000\n",
"2021-05-16 19:13:54,498 DEV : loss 3.486685276031494 - score 0.6667\n",
"2021-05-16 19:13:54,498 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:13:54,498 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:55,514 epoch 29 - iter 1/4 - loss 1.94683826 - samples/sec: 31.74 - lr: 0.012500\n",
"2021-05-16 19:13:56,355 epoch 29 - iter 2/4 - loss 1.87296987 - samples/sec: 38.03 - lr: 0.012500\n",
"2021-05-16 19:13:57,018 epoch 29 - iter 3/4 - loss 1.93602276 - samples/sec: 48.88 - lr: 0.012500\n",
"2021-05-16 19:13:57,202 epoch 29 - iter 4/4 - loss 1.87588742 - samples/sec: 173.70 - lr: 0.012500\n",
"2021-05-16 19:13:57,202 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:57,202 EPOCH 29 done: loss 1.8759 - lr 0.0125000\n",
"2021-05-16 19:13:57,298 DEV : loss 3.5309135913848877 - score 0.6667\n",
"2021-05-16 19:13:57,298 BAD EPOCHS (no improvement): 2\n",
"2021-05-16 19:13:57,298 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:58,250 epoch 30 - iter 1/4 - loss 2.16732407 - samples/sec: 33.90 - lr: 0.012500\n",
"2021-05-16 19:13:58,931 epoch 30 - iter 2/4 - loss 1.72622716 - samples/sec: 46.96 - lr: 0.012500\n",
"2021-05-16 19:13:59,781 epoch 30 - iter 3/4 - loss 1.93175316 - samples/sec: 37.65 - lr: 0.012500\n",
"2021-05-16 19:13:59,982 epoch 30 - iter 4/4 - loss 1.60670690 - samples/sec: 159.08 - lr: 0.012500\n",
"2021-05-16 19:13:59,990 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:13:59,990 EPOCH 30 done: loss 1.6067 - lr 0.0125000\n",
"2021-05-16 19:14:00,088 DEV : loss 3.4875831604003906 - score 0.6667\n",
"2021-05-16 19:14:00,096 BAD EPOCHS (no improvement): 3\n",
"2021-05-16 19:14:00,096 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:01,011 epoch 31 - iter 1/4 - loss 2.39419317 - samples/sec: 34.99 - lr: 0.012500\n",
"2021-05-16 19:14:01,826 epoch 31 - iter 2/4 - loss 1.94124657 - samples/sec: 39.64 - lr: 0.012500\n",
"2021-05-16 19:14:02,676 epoch 31 - iter 3/4 - loss 1.81396655 - samples/sec: 37.62 - lr: 0.012500\n",
"2021-05-16 19:14:02,876 epoch 31 - iter 4/4 - loss 1.78971809 - samples/sec: 166.69 - lr: 0.012500\n",
"2021-05-16 19:14:02,884 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:02,884 EPOCH 31 done: loss 1.7897 - lr 0.0125000\n",
"2021-05-16 19:14:02,961 DEV : loss 3.4355287551879883 - score 0.5833\n",
"Epoch 31: reducing learning rate of group 0 to 6.2500e-03.\n",
"2021-05-16 19:14:02,961 BAD EPOCHS (no improvement): 4\n",
"2021-05-16 19:14:02,976 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:03,838 epoch 32 - iter 1/4 - loss 1.18405724 - samples/sec: 37.13 - lr: 0.006250\n",
"2021-05-16 19:14:04,727 epoch 32 - iter 2/4 - loss 1.78029823 - samples/sec: 35.98 - lr: 0.006250\n",
"2021-05-16 19:14:05,416 epoch 32 - iter 3/4 - loss 1.71468850 - samples/sec: 46.96 - lr: 0.006250\n",
"2021-05-16 19:14:05,673 epoch 32 - iter 4/4 - loss 1.98795196 - samples/sec: 124.99 - lr: 0.006250\n",
"2021-05-16 19:14:05,673 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:05,673 EPOCH 32 done: loss 1.9880 - lr 0.0062500\n",
"2021-05-16 19:14:05,768 DEV : loss 3.4302756786346436 - score 0.5833\n",
"2021-05-16 19:14:05,776 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:14:05,776 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:06,493 epoch 33 - iter 1/4 - loss 1.43548059 - samples/sec: 44.69 - lr: 0.006250\n",
"2021-05-16 19:14:07,307 epoch 33 - iter 2/4 - loss 1.70211828 - samples/sec: 39.28 - lr: 0.006250\n",
"2021-05-16 19:14:08,082 epoch 33 - iter 3/4 - loss 1.72906860 - samples/sec: 41.30 - lr: 0.006250\n",
"2021-05-16 19:14:08,343 epoch 33 - iter 4/4 - loss 2.12577587 - samples/sec: 122.39 - lr: 0.006250\n",
"2021-05-16 19:14:08,343 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:08,343 EPOCH 33 done: loss 2.1258 - lr 0.0062500\n",
"2021-05-16 19:14:08,431 DEV : loss 3.4519147872924805 - score 0.6667\n",
"2021-05-16 19:14:08,439 BAD EPOCHS (no improvement): 2\n",
"2021-05-16 19:14:08,439 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:09,154 epoch 34 - iter 1/4 - loss 1.07441115 - samples/sec: 44.79 - lr: 0.006250\n",
"2021-05-16 19:14:09,975 epoch 34 - iter 2/4 - loss 1.89638603 - samples/sec: 38.96 - lr: 0.006250\n",
"2021-05-16 19:14:10,993 epoch 34 - iter 3/4 - loss 1.81038960 - samples/sec: 31.45 - lr: 0.006250\n",
"2021-05-16 19:14:11,289 epoch 34 - iter 4/4 - loss 1.82815674 - samples/sec: 108.11 - lr: 0.006250\n",
"2021-05-16 19:14:11,289 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:11,289 EPOCH 34 done: loss 1.8282 - lr 0.0062500\n",
"2021-05-16 19:14:11,393 DEV : loss 3.4468681812286377 - score 0.6667\n",
"2021-05-16 19:14:11,393 BAD EPOCHS (no improvement): 3\n",
"2021-05-16 19:14:11,393 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:12,314 epoch 35 - iter 1/4 - loss 1.71202326 - samples/sec: 34.74 - lr: 0.006250\n",
"2021-05-16 19:14:13,347 epoch 35 - iter 2/4 - loss 2.02234995 - samples/sec: 30.99 - lr: 0.006250\n",
"2021-05-16 19:14:13,977 epoch 35 - iter 3/4 - loss 1.83293974 - samples/sec: 51.40 - lr: 0.006250\n",
"2021-05-16 19:14:14,155 epoch 35 - iter 4/4 - loss 1.40346918 - samples/sec: 188.15 - lr: 0.006250\n",
"2021-05-16 19:14:14,155 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:14,155 EPOCH 35 done: loss 1.4035 - lr 0.0062500\n",
"2021-05-16 19:14:14,251 DEV : loss 3.4555253982543945 - score 0.6667\n",
"Epoch 35: reducing learning rate of group 0 to 3.1250e-03.\n",
"2021-05-16 19:14:14,251 BAD EPOCHS (no improvement): 4\n",
"2021-05-16 19:14:14,251 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:15,020 epoch 36 - iter 1/4 - loss 1.60199451 - samples/sec: 41.61 - lr: 0.003125\n",
"2021-05-16 19:14:15,758 epoch 36 - iter 2/4 - loss 1.76909965 - samples/sec: 43.41 - lr: 0.003125\n",
"2021-05-16 19:14:16,694 epoch 36 - iter 3/4 - loss 1.96563844 - samples/sec: 34.46 - lr: 0.003125\n",
"2021-05-16 19:14:16,926 epoch 36 - iter 4/4 - loss 2.04810312 - samples/sec: 137.94 - lr: 0.003125\n",
"2021-05-16 19:14:16,926 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:16,926 EPOCH 36 done: loss 2.0481 - lr 0.0031250\n",
"2021-05-16 19:14:17,022 DEV : loss 3.467947483062744 - score 0.6667\n",
"2021-05-16 19:14:17,022 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:14:17,022 ----------------------------------------------------------------------------------------------------\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"2021-05-16 19:14:17,771 epoch 37 - iter 1/4 - loss 1.59361398 - samples/sec: 42.71 - lr: 0.003125\n",
"2021-05-16 19:14:18,573 epoch 37 - iter 2/4 - loss 1.86242718 - samples/sec: 39.93 - lr: 0.003125\n",
"2021-05-16 19:14:19,367 epoch 37 - iter 3/4 - loss 1.84938045 - samples/sec: 40.27 - lr: 0.003125\n",
"2021-05-16 19:14:19,575 epoch 37 - iter 4/4 - loss 1.94639012 - samples/sec: 159.98 - lr: 0.003125\n",
"2021-05-16 19:14:19,575 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:19,575 EPOCH 37 done: loss 1.9464 - lr 0.0031250\n",
"2021-05-16 19:14:19,663 DEV : loss 3.4721953868865967 - score 0.6667\n",
"2021-05-16 19:14:19,663 BAD EPOCHS (no improvement): 2\n",
"2021-05-16 19:14:19,663 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:20,420 epoch 38 - iter 1/4 - loss 1.87127459 - samples/sec: 42.26 - lr: 0.003125\n",
"2021-05-16 19:14:21,214 epoch 38 - iter 2/4 - loss 1.65014571 - samples/sec: 40.34 - lr: 0.003125\n",
"2021-05-16 19:14:22,201 epoch 38 - iter 3/4 - loss 1.78922117 - samples/sec: 32.41 - lr: 0.003125\n",
"2021-05-16 19:14:22,409 epoch 38 - iter 4/4 - loss 1.57039295 - samples/sec: 153.84 - lr: 0.003125\n",
"2021-05-16 19:14:22,417 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:22,417 EPOCH 38 done: loss 1.5704 - lr 0.0031250\n",
"2021-05-16 19:14:22,522 DEV : loss 3.4747495651245117 - score 0.6667\n",
"2021-05-16 19:14:22,522 BAD EPOCHS (no improvement): 3\n",
"2021-05-16 19:14:22,522 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:23,532 epoch 39 - iter 1/4 - loss 1.71339095 - samples/sec: 31.94 - lr: 0.003125\n",
"2021-05-16 19:14:24,351 epoch 39 - iter 2/4 - loss 1.87997061 - samples/sec: 39.07 - lr: 0.003125\n",
"2021-05-16 19:14:25,353 epoch 39 - iter 3/4 - loss 1.93014069 - samples/sec: 31.93 - lr: 0.003125\n",
"2021-05-16 19:14:25,553 epoch 39 - iter 4/4 - loss 1.66254094 - samples/sec: 166.68 - lr: 0.003125\n",
"2021-05-16 19:14:25,561 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:25,561 EPOCH 39 done: loss 1.6625 - lr 0.0031250\n",
"2021-05-16 19:14:25,650 DEV : loss 3.4640121459960938 - score 0.6667\n",
"Epoch 39: reducing learning rate of group 0 to 1.5625e-03.\n",
"2021-05-16 19:14:25,650 BAD EPOCHS (no improvement): 4\n",
"2021-05-16 19:14:25,650 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:26,482 epoch 40 - iter 1/4 - loss 1.51390183 - samples/sec: 38.46 - lr: 0.001563\n",
"2021-05-16 19:14:27,268 epoch 40 - iter 2/4 - loss 1.62989253 - samples/sec: 40.73 - lr: 0.001563\n",
"2021-05-16 19:14:28,116 epoch 40 - iter 3/4 - loss 1.59191600 - samples/sec: 37.73 - lr: 0.001563\n",
"2021-05-16 19:14:28,389 epoch 40 - iter 4/4 - loss 1.58031228 - samples/sec: 116.91 - lr: 0.001563\n",
"2021-05-16 19:14:28,389 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:28,389 EPOCH 40 done: loss 1.5803 - lr 0.0015625\n",
"2021-05-16 19:14:28,493 DEV : loss 3.464979648590088 - score 0.6667\n",
"2021-05-16 19:14:28,493 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:14:28,493 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:29,395 epoch 41 - iter 1/4 - loss 2.09950924 - samples/sec: 35.51 - lr: 0.001563\n",
"2021-05-16 19:14:30,198 epoch 41 - iter 2/4 - loss 2.02299452 - samples/sec: 39.85 - lr: 0.001563\n",
"2021-05-16 19:14:30,959 epoch 41 - iter 3/4 - loss 1.83912905 - samples/sec: 42.02 - lr: 0.001563\n",
"2021-05-16 19:14:31,168 epoch 41 - iter 4/4 - loss 2.28552222 - samples/sec: 152.95 - lr: 0.001563\n",
"2021-05-16 19:14:31,176 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:31,176 EPOCH 41 done: loss 2.2855 - lr 0.0015625\n",
"2021-05-16 19:14:31,256 DEV : loss 3.46785044670105 - score 0.6667\n",
"2021-05-16 19:14:31,256 BAD EPOCHS (no improvement): 2\n",
"2021-05-16 19:14:31,264 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:31,960 epoch 42 - iter 1/4 - loss 2.07870221 - samples/sec: 45.98 - lr: 0.001563\n",
"2021-05-16 19:14:32,809 epoch 42 - iter 2/4 - loss 1.80660170 - samples/sec: 38.05 - lr: 0.001563\n",
"2021-05-16 19:14:33,486 epoch 42 - iter 3/4 - loss 1.86924104 - samples/sec: 47.31 - lr: 0.001563\n",
"2021-05-16 19:14:33,738 epoch 42 - iter 4/4 - loss 2.06889942 - samples/sec: 126.97 - lr: 0.001563\n",
"2021-05-16 19:14:33,738 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:33,738 EPOCH 42 done: loss 2.0689 - lr 0.0015625\n",
"2021-05-16 19:14:33,827 DEV : loss 3.464182138442993 - score 0.6667\n",
"2021-05-16 19:14:33,835 BAD EPOCHS (no improvement): 3\n",
"2021-05-16 19:14:33,835 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:34,689 epoch 43 - iter 1/4 - loss 2.16509676 - samples/sec: 37.68 - lr: 0.001563\n",
"2021-05-16 19:14:35,420 epoch 43 - iter 2/4 - loss 1.79616153 - samples/sec: 44.27 - lr: 0.001563\n",
"2021-05-16 19:14:36,298 epoch 43 - iter 3/4 - loss 1.79792849 - samples/sec: 36.44 - lr: 0.001563\n",
"2021-05-16 19:14:36,517 epoch 43 - iter 4/4 - loss 1.78867936 - samples/sec: 146.19 - lr: 0.001563\n",
"2021-05-16 19:14:36,517 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:36,517 EPOCH 43 done: loss 1.7887 - lr 0.0015625\n",
"2021-05-16 19:14:36,589 DEV : loss 3.464967966079712 - score 0.6667\n",
"Epoch 43: reducing learning rate of group 0 to 7.8125e-04.\n",
"2021-05-16 19:14:36,589 BAD EPOCHS (no improvement): 4\n",
"2021-05-16 19:14:36,603 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:37,308 epoch 44 - iter 1/4 - loss 1.60833621 - samples/sec: 45.36 - lr: 0.000781\n",
"2021-05-16 19:14:38,140 epoch 44 - iter 2/4 - loss 1.45758373 - samples/sec: 38.48 - lr: 0.000781\n",
"2021-05-16 19:14:38,983 epoch 44 - iter 3/4 - loss 1.52034609 - samples/sec: 37.96 - lr: 0.000781\n",
"2021-05-16 19:14:39,226 epoch 44 - iter 4/4 - loss 2.32687372 - samples/sec: 131.50 - lr: 0.000781\n",
"2021-05-16 19:14:39,235 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:39,236 EPOCH 44 done: loss 2.3269 - lr 0.0007813\n",
"2021-05-16 19:14:39,343 DEV : loss 3.467527151107788 - score 0.6667\n",
"2021-05-16 19:14:39,343 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:14:39,343 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:40,254 epoch 45 - iter 1/4 - loss 2.09789848 - samples/sec: 35.42 - lr: 0.000781\n",
"2021-05-16 19:14:41,142 epoch 45 - iter 2/4 - loss 1.90345168 - samples/sec: 36.05 - lr: 0.000781\n",
"2021-05-16 19:14:41,828 epoch 45 - iter 3/4 - loss 1.76009802 - samples/sec: 46.62 - lr: 0.000781\n",
"2021-05-16 19:14:42,079 epoch 45 - iter 4/4 - loss 1.94041607 - samples/sec: 127.70 - lr: 0.000781\n",
"2021-05-16 19:14:42,079 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:42,079 EPOCH 45 done: loss 1.9404 - lr 0.0007813\n",
"2021-05-16 19:14:42,174 DEV : loss 3.4680516719818115 - score 0.6667\n",
"2021-05-16 19:14:42,174 BAD EPOCHS (no improvement): 2\n",
"2021-05-16 19:14:42,174 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:42,949 epoch 46 - iter 1/4 - loss 2.13200164 - samples/sec: 41.69 - lr: 0.000781\n",
"2021-05-16 19:14:43,628 epoch 46 - iter 2/4 - loss 1.92884541 - samples/sec: 47.13 - lr: 0.000781\n",
"2021-05-16 19:14:44,188 epoch 46 - iter 3/4 - loss 1.86859485 - samples/sec: 57.14 - lr: 0.000781\n",
"2021-05-16 19:14:44,420 epoch 46 - iter 4/4 - loss 2.23936662 - samples/sec: 137.82 - lr: 0.000781\n",
"2021-05-16 19:14:44,420 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:44,420 EPOCH 46 done: loss 2.2394 - lr 0.0007813\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"2021-05-16 19:14:44,516 DEV : loss 3.467272996902466 - score 0.6667\n",
"2021-05-16 19:14:44,516 BAD EPOCHS (no improvement): 3\n",
"2021-05-16 19:14:44,516 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:45,083 epoch 47 - iter 1/4 - loss 1.17524457 - samples/sec: 57.22 - lr: 0.000781\n",
"2021-05-16 19:14:45,804 epoch 47 - iter 2/4 - loss 1.69363821 - samples/sec: 44.40 - lr: 0.000781\n",
"2021-05-16 19:14:46,515 epoch 47 - iter 3/4 - loss 1.80291025 - samples/sec: 45.00 - lr: 0.000781\n",
"2021-05-16 19:14:46,744 epoch 47 - iter 4/4 - loss 1.68751404 - samples/sec: 139.56 - lr: 0.000781\n",
"2021-05-16 19:14:46,744 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:46,744 EPOCH 47 done: loss 1.6875 - lr 0.0007813\n",
"2021-05-16 19:14:46,841 DEV : loss 3.4656827449798584 - score 0.6667\n",
"Epoch 47: reducing learning rate of group 0 to 3.9063e-04.\n",
"2021-05-16 19:14:46,841 BAD EPOCHS (no improvement): 4\n",
"2021-05-16 19:14:46,845 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:47,512 epoch 48 - iter 1/4 - loss 1.40106690 - samples/sec: 47.97 - lr: 0.000391\n",
"2021-05-16 19:14:48,126 epoch 48 - iter 2/4 - loss 1.41452271 - samples/sec: 52.10 - lr: 0.000391\n",
"2021-05-16 19:14:48,882 epoch 48 - iter 3/4 - loss 1.74593834 - samples/sec: 42.34 - lr: 0.000391\n",
"2021-05-16 19:14:49,064 epoch 48 - iter 4/4 - loss 1.58755332 - samples/sec: 176.07 - lr: 0.000391\n",
"2021-05-16 19:14:49,064 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:49,064 EPOCH 48 done: loss 1.5876 - lr 0.0003906\n",
"2021-05-16 19:14:49,149 DEV : loss 3.467986822128296 - score 0.6667\n",
"2021-05-16 19:14:49,149 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:14:49,149 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:49,930 epoch 49 - iter 1/4 - loss 1.38971734 - samples/sec: 40.97 - lr: 0.000391\n",
"2021-05-16 19:14:50,510 epoch 49 - iter 2/4 - loss 1.67799520 - samples/sec: 55.24 - lr: 0.000391\n",
"2021-05-16 19:14:51,137 epoch 49 - iter 3/4 - loss 1.69751259 - samples/sec: 51.05 - lr: 0.000391\n",
"2021-05-16 19:14:51,356 epoch 49 - iter 4/4 - loss 1.83348897 - samples/sec: 145.87 - lr: 0.000391\n",
"2021-05-16 19:14:51,356 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:51,356 EPOCH 49 done: loss 1.8335 - lr 0.0003906\n",
"2021-05-16 19:14:51,446 DEV : loss 3.4678850173950195 - score 0.6667\n",
"2021-05-16 19:14:51,446 BAD EPOCHS (no improvement): 2\n",
"2021-05-16 19:14:51,462 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:52,179 epoch 50 - iter 1/4 - loss 1.13970292 - samples/sec: 44.71 - lr: 0.000391\n",
"2021-05-16 19:14:52,916 epoch 50 - iter 2/4 - loss 1.94286901 - samples/sec: 43.40 - lr: 0.000391\n",
"2021-05-16 19:14:53,640 epoch 50 - iter 3/4 - loss 1.91910776 - samples/sec: 44.19 - lr: 0.000391\n",
"2021-05-16 19:14:53,807 epoch 50 - iter 4/4 - loss 1.56437027 - samples/sec: 191.98 - lr: 0.000391\n",
"2021-05-16 19:14:53,807 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:53,807 EPOCH 50 done: loss 1.5644 - lr 0.0003906\n",
"2021-05-16 19:14:53,886 DEV : loss 3.4673101902008057 - score 0.6667\n",
"2021-05-16 19:14:53,886 BAD EPOCHS (no improvement): 3\n",
"2021-05-16 19:14:53,898 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:54,525 epoch 51 - iter 1/4 - loss 1.64230800 - samples/sec: 50.99 - lr: 0.000391\n",
"2021-05-16 19:14:55,323 epoch 51 - iter 2/4 - loss 1.66435432 - samples/sec: 40.11 - lr: 0.000391\n",
"2021-05-16 19:14:56,158 epoch 51 - iter 3/4 - loss 1.76997383 - samples/sec: 38.33 - lr: 0.000391\n",
"2021-05-16 19:14:56,348 epoch 51 - iter 4/4 - loss 1.45529963 - samples/sec: 168.77 - lr: 0.000391\n",
"2021-05-16 19:14:56,348 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:56,348 EPOCH 51 done: loss 1.4553 - lr 0.0003906\n",
"2021-05-16 19:14:56,451 DEV : loss 3.46675705909729 - score 0.6667\n",
"Epoch 51: reducing learning rate of group 0 to 1.9531e-04.\n",
"2021-05-16 19:14:56,451 BAD EPOCHS (no improvement): 4\n",
"2021-05-16 19:14:56,451 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:57,134 epoch 52 - iter 1/4 - loss 1.39893460 - samples/sec: 47.38 - lr: 0.000195\n",
"2021-05-16 19:14:57,904 epoch 52 - iter 2/4 - loss 1.95114291 - samples/sec: 41.57 - lr: 0.000195\n",
"2021-05-16 19:14:58,589 epoch 52 - iter 3/4 - loss 1.87273510 - samples/sec: 46.70 - lr: 0.000195\n",
"2021-05-16 19:14:58,814 epoch 52 - iter 4/4 - loss 1.66518828 - samples/sec: 142.21 - lr: 0.000195\n",
"2021-05-16 19:14:58,814 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:58,814 EPOCH 52 done: loss 1.6652 - lr 0.0001953\n",
"2021-05-16 19:14:58,898 DEV : loss 3.4661099910736084 - score 0.6667\n",
"2021-05-16 19:14:58,898 BAD EPOCHS (no improvement): 1\n",
"2021-05-16 19:14:58,898 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:14:59,621 epoch 53 - iter 1/4 - loss 1.52661002 - samples/sec: 44.90 - lr: 0.000195\n",
"2021-05-16 19:15:00,323 epoch 53 - iter 2/4 - loss 1.72744888 - samples/sec: 45.60 - lr: 0.000195\n",
"2021-05-16 19:15:01,033 epoch 53 - iter 3/4 - loss 1.67759216 - samples/sec: 45.09 - lr: 0.000195\n",
"2021-05-16 19:15:01,186 epoch 53 - iter 4/4 - loss 1.46851297 - samples/sec: 208.70 - lr: 0.000195\n",
"2021-05-16 19:15:01,186 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:15:01,186 EPOCH 53 done: loss 1.4685 - lr 0.0001953\n",
"2021-05-16 19:15:01,282 DEV : loss 3.466641426086426 - score 0.6667\n",
"2021-05-16 19:15:01,282 BAD EPOCHS (no improvement): 2\n",
"2021-05-16 19:15:01,282 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:15:01,903 epoch 54 - iter 1/4 - loss 1.67276871 - samples/sec: 51.56 - lr: 0.000195\n",
"2021-05-16 19:15:02,720 epoch 54 - iter 2/4 - loss 1.84151357 - samples/sec: 39.15 - lr: 0.000195\n",
"2021-05-16 19:15:03,497 epoch 54 - iter 3/4 - loss 1.79460196 - samples/sec: 41.16 - lr: 0.000195\n",
"2021-05-16 19:15:03,697 epoch 54 - iter 4/4 - loss 1.73617950 - samples/sec: 160.20 - lr: 0.000195\n",
"2021-05-16 19:15:03,697 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:15:03,697 EPOCH 54 done: loss 1.7362 - lr 0.0001953\n",
"2021-05-16 19:15:03,791 DEV : loss 3.4663610458374023 - score 0.6667\n",
"2021-05-16 19:15:03,807 BAD EPOCHS (no improvement): 3\n",
"2021-05-16 19:15:03,809 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:15:04,563 epoch 55 - iter 1/4 - loss 2.19241428 - samples/sec: 42.46 - lr: 0.000195\n",
"2021-05-16 19:15:05,206 epoch 55 - iter 2/4 - loss 1.68816346 - samples/sec: 49.73 - lr: 0.000195\n",
"2021-05-16 19:15:05,899 epoch 55 - iter 3/4 - loss 1.67743218 - samples/sec: 46.20 - lr: 0.000195\n",
"2021-05-16 19:15:06,147 epoch 55 - iter 4/4 - loss 1.62165421 - samples/sec: 129.04 - lr: 0.000195\n",
"2021-05-16 19:15:06,147 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:15:06,147 EPOCH 55 done: loss 1.6217 - lr 0.0001953\n",
"2021-05-16 19:15:06,243 DEV : loss 3.4659790992736816 - score 0.6667\n",
"Epoch 55: reducing learning rate of group 0 to 9.7656e-05.\n",
"2021-05-16 19:15:06,243 BAD EPOCHS (no improvement): 4\n",
"2021-05-16 19:15:06,243 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:15:06,243 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:15:06,243 learning rate too small - quitting training!\n",
"2021-05-16 19:15:06,243 ----------------------------------------------------------------------------------------------------\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"2021-05-16 19:15:14,421 ----------------------------------------------------------------------------------------------------\n",
"2021-05-16 19:15:14,421 Testing using best model ...\n",
"2021-05-16 19:15:14,426 loading file slot-model\\best-model.pt\n",
"2021-05-16 19:15:34,103 0.6759\t0.6901\t0.6829\n",
"2021-05-16 19:15:34,103 \n",
"Results:\n",
"- F1-score (micro) 0.6829\n",
"- F1-score (macro) 0.3185\n",
"\n",
"By class:\n",
"NoLabel I-end_conversation tp: 0 - fp: 0 - fn: 2 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000\n",
"affirm tp: 11 - fp: 6 - fn: 2 - precision: 0.6471 - recall: 0.8462 - f1-score: 0.7333\n",
"appoinment tp: 0 - fp: 0 - fn: 2 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000\n",
"appoinment/doctor tp: 0 - fp: 0 - fn: 2 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000\n",
"appointment tp: 19 - fp: 4 - fn: 1 - precision: 0.8261 - recall: 0.9500 - f1-score: 0.8837\n",
"appointment/doctor tp: 16 - fp: 13 - fn: 5 - precision: 0.5517 - recall: 0.7619 - f1-score: 0.6400\n",
"appointment/office tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000\n",
"appointment/type tp: 4 - fp: 2 - fn: 2 - precision: 0.6667 - recall: 0.6667 - f1-score: 0.6667\n",
"datetime tp: 12 - fp: 7 - fn: 6 - precision: 0.6316 - recall: 0.6667 - f1-score: 0.6486\n",
"deny tp: 0 - fp: 0 - fn: 4 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000\n",
"doctor tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000\n",
"end_conversation tp: 14 - fp: 8 - fn: 6 - precision: 0.6364 - recall: 0.7000 - f1-score: 0.6667\n",
"greeting tp: 18 - fp: 3 - fn: 2 - precision: 0.8571 - recall: 0.9000 - f1-score: 0.8780\n",
"prescription tp: 4 - fp: 3 - fn: 2 - precision: 0.5714 - recall: 0.6667 - f1-score: 0.6154\n",
"prescription/type tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000\n",
"register/email tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000\n",
"register/name tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000\n",
"results tp: 0 - fp: 1 - fn: 3 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000\n",
"2021-05-16 19:15:34,103 ----------------------------------------------------------------------------------------------------\n"
]
},
{
"data": {
"text/plain": [
"{'test_score': 0.6829268292682927,\n",
" 'dev_score_history': [0.0,\n",
" 0.0,\n",
" 0.0,\n",
" 0.0,\n",
" 0.06451612903225808,\n",
" 0.0,\n",
" 0.15384615384615383,\n",
" 0.34782608695652173,\n",
" 0.2608695652173913,\n",
" 0.30769230769230765,\n",
" 0.34782608695652173,\n",
" 0.3333333333333333,\n",
" 0.4615384615384615,\n",
" 0.43478260869565216,\n",
" 0.6923076923076924,\n",
" 0.5833333333333334,\n",
" 0.6153846153846153,\n",
" 0.48000000000000004,\n",
" 0.5185185185185186,\n",
" 0.5599999999999999,\n",
" 0.5599999999999999,\n",
" 0.6666666666666666,\n",
" 0.5833333333333334,\n",
" 0.5833333333333334,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.5833333333333334,\n",
" 0.5833333333333334,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666,\n",
" 0.6666666666666666],\n",
" 'train_loss_history': [16.562259912490845,\n",
" 10.837674498558044,\n",
" 8.949250102043152,\n",
" 7.318224191665649,\n",
" 8.57210910320282,\n",
" 8.556124210357666,\n",
" 7.01634418964386,\n",
" 5.2021918296813965,\n",
" 4.593203812837601,\n",
" 4.4080811738967896,\n",
" 4.860165357589722,\n",
" 3.447446435689926,\n",
" 4.471416532993317,\n",
" 3.3014023900032043,\n",
" 2.902692526578903,\n",
" 3.367657423019409,\n",
" 4.03450483083725,\n",
" 2.9484628438949585,\n",
" 2.2557127252221107,\n",
" 2.8968922197818756,\n",
" 2.5365295708179474,\n",
" 2.157317876815796,\n",
" 1.735389031469822,\n",
" 1.698147714138031,\n",
" 1.8288256227970123,\n",
" 2.5120953917503357,\n",
" 2.5171812176704407,\n",
" 1.5335412621498108,\n",
" 1.8758874237537384,\n",
" 1.606706902384758,\n",
" 1.7897180914878845,\n",
" 1.9879519641399384,\n",
" 2.1257758736610413,\n",
" 1.828156739473343,\n",
" 1.4034691751003265,\n",
" 2.0481031239032745,\n",
" 1.9463901221752167,\n",
" 1.5703929513692856,\n",
" 1.6625409424304962,\n",
" 1.5803122818470001,\n",
" 2.285522222518921,\n",
" 2.0688994228839874,\n",
" 1.7886793613433838,\n",
" 2.3268737196922302,\n",
" 1.9404160678386688,\n",
" 2.2393666207790375,\n",
" 1.6875140368938446,\n",
" 1.587553322315216,\n",
" 1.8334889709949493,\n",
" 1.5643702745437622,\n",
" 1.4552996307611465,\n",
" 1.6651882827281952,\n",
" 1.4685129672288895,\n",
" 1.7361795008182526,\n",
" 1.621654212474823],\n",
" 'dev_loss_history': [12.217952728271484,\n",
" 8.176359176635742,\n",
" 7.451809883117676,\n",
" 7.464598178863525,\n",
" 7.330676555633545,\n",
" 5.898077011108398,\n",
" 5.496520519256592,\n",
" 5.2129292488098145,\n",
" 4.9869303703308105,\n",
" 4.855195045471191,\n",
" 4.352779865264893,\n",
" 4.364665508270264,\n",
" 4.251131057739258,\n",
" 3.9291062355041504,\n",
" 4.368889808654785,\n",
" 3.6790337562561035,\n",
" 3.864961862564087,\n",
" 3.8492608070373535,\n",
" 3.9649171829223633,\n",
" 3.6375184059143066,\n",
" 3.636472463607788,\n",
" 3.7137885093688965,\n",
" 3.495877265930176,\n",
" 3.547339677810669,\n",
" 3.695451259613037,\n",
" 3.5949974060058594,\n",
" 3.624786376953125,\n",
" 3.486685276031494,\n",
" 3.5309135913848877,\n",
" 3.4875831604003906,\n",
" 3.4355287551879883,\n",
" 3.4302756786346436,\n",
" 3.4519147872924805,\n",
" 3.4468681812286377,\n",
" 3.4555253982543945,\n",
" 3.467947483062744,\n",
" 3.4721953868865967,\n",
" 3.4747495651245117,\n",
" 3.4640121459960938,\n",
" 3.464979648590088,\n",
" 3.46785044670105,\n",
" 3.464182138442993,\n",
" 3.464967966079712,\n",
" 3.467527151107788,\n",
" 3.4680516719818115,\n",
" 3.467272996902466,\n",
" 3.4656827449798584,\n",
" 3.467986822128296,\n",
" 3.4678850173950195,\n",
" 3.4673101902008057,\n",
" 3.46675705909729,\n",
" 3.4661099910736084,\n",
" 3.466641426086426,\n",
" 3.4663610458374023,\n",
" 3.4659790992736816]}"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"trainer = ModelTrainer(tagger, corpus)\n",
"trainer.train('slot-model',\n",
" learning_rate=0.1,\n",
" mini_batch_size=32,\n",
" max_epochs=100,\n",
" train_with_dev=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Jakość wyuczonego modelu możemy ocenić, korzystając z zaraportowanych powyżej metryk, tj.:\n",
"\n",
" - *tp (true positives)*\n",
"\n",
" > liczba słów oznaczonych w zbiorze testowym etykietą $e$, które model oznaczył tą etykietą\n",
"\n",
" - *fp (false positives)*\n",
"\n",
" > liczba słów nieoznaczonych w zbiorze testowym etykietą $e$, które model oznaczył tą etykietą\n",
"\n",
" - *fn (false negatives)*\n",
"\n",
" > liczba słów oznaczonych w zbiorze testowym etykietą $e$, którym model nie nadał etykiety $e$\n",
"\n",
" - *precision*\n",
"\n",
" > $$\\frac{tp}{tp + fp}$$\n",
"\n",
" - *recall*\n",
"\n",
" > $$\\frac{tp}{tp + fn}$$\n",
"\n",
" - $F_1$\n",
"\n",
" > $$\\frac{2 \\cdot precision \\cdot recall}{precision + recall}$$\n",
"\n",
" - *micro* $F_1$\n",
"\n",
" > $F_1$ w którym $tp$, $fp$ i $fn$ są liczone łącznie dla wszystkich etykiet, tj. $tp = \\sum_{e}{{tp}_e}$, $fn = \\sum_{e}{{fn}_e}$, $fp = \\sum_{e}{{fp}_e}$\n",
"\n",
" - *macro* $F_1$\n",
"\n",
" > średnia arytmetyczna z $F_1$ obliczonych dla poszczególnych etykiet z osobna.\n",
"\n",
"Wyuczony model możemy wczytać z pliku korzystając z metody `load`."
]
},
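{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an illustration (an addition, not part of the original notebook), the minimal sketch below recomputes precision, recall and $F_1$ with a hypothetical helper `prf`, using the *tp*/*fp*/*fn* counts of a handful of classes copied from the test report above. Since only a subset of the classes is included, the aggregate values are not expected to match the full report."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A minimal sketch: recompute precision, recall and F1 from per-class counts\n",
"# in the way they are reported above. Only a few classes from the test report\n",
"# are included, so the aggregates will differ from the full report.\n",
"counts = {\n",
"    'affirm':      {'tp': 11, 'fp': 6, 'fn': 2},\n",
"    'appointment': {'tp': 19, 'fp': 4, 'fn': 1},\n",
"    'datetime':    {'tp': 12, 'fp': 7, 'fn': 6},\n",
"    'greeting':    {'tp': 18, 'fp': 3, 'fn': 2},\n",
"}\n",
"\n",
"def prf(tp, fp, fn):\n",
"    precision = tp / (tp + fp) if tp + fp else 0.0\n",
"    recall = tp / (tp + fn) if tp + fn else 0.0\n",
"    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0\n",
"    return precision, recall, f1\n",
"\n",
"# macro F1: the arithmetic mean of the per-class F1 scores\n",
"macro_f1 = sum(prf(**c)[2] for c in counts.values()) / len(counts)\n",
"\n",
"# micro F1: pool tp, fp and fn over all classes first, then apply the formulas\n",
"micro_f1 = prf(sum(c['tp'] for c in counts.values()),\n",
"               sum(c['fp'] for c in counts.values()),\n",
"               sum(c['fn'] for c in counts.values()))[2]\n",
"\n",
"print(f'macro F1: {macro_f1:.4f}')\n",
"print(f'micro F1: {micro_f1:.4f}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The trained model can be loaded from a file using the `load` method."
]
},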
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2021-05-16 19:15:34,133 loading file slot-model/final-model.pt\n"
]
}
],
"source": [
"model = SequenceTagger.load('slot-model/final-model.pt')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Wczytany model możemy wykorzystać do przewidywania slotów w wypowiedziach użytkownika, korzystając\n",
"z przedstawionej poniżej funkcji `predict`."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"def predict(model, sentence):\n",
" csentence = [{'form': word} for word in sentence]\n",
" fsentence = conllu2flair([csentence])[0]\n",
" model.predict(fsentence)\n",
" return [(token, ftoken.get_tag('slot').value) for token, ftoken in zip(sentence, fsentence)]\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Jak pokazuje przykład poniżej model wyuczony tylko na 100 przykładach popełnia w dosyć prostej\n",
"wypowiedzi błąd etykietując słowo `alarm` tagiem `B-weather/noun`."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"dzien | B-greeting |
\n",
"dobry | I-greeting |
\n",
"poprosze | O |
\n",
"wizytę | B-appointment |
\n",
"do | O |
\n",
"doktor | B-appointment/doctor |
\n",
"lekarza | B-appointment/doctor |
\n",
"rodzinnego | I-appointment/doctor |
\n",
"najlepiej | O |
\n",
"dzisiaj | O |
\n",
"w | O |
\n",
"godzinach | I-datetime |
\n",
"popołudniowych | I-datetime |
\n",
"dziś | B-datetime |
\n",
"albo | O |
\n",
"jutro | I-datetime |
\n",
"internisty | I-appointment/doctor |
\n",
"\n",
"
"
],
"text/plain": [
"'\\n\\ndzien | B-greeting |
\\ndobry | I-greeting |
\\npoprosze | O |
\\nwizytę | B-appointment |
\\ndo | O |
\\ndoktor | B-appointment/doctor |
\\nlekarza | B-appointment/doctor |
\\nrodzinnego | I-appointment/doctor |
\\nnajlepiej | O |
\\ndzisiaj | O |
\\nw | O |
\\ngodzinach | I-datetime |
\\npopołudniowych | I-datetime |
\\ndziś | B-datetime |
\\nalbo | O |
\\njutro | I-datetime |
\\ninternisty | I-appointment/doctor |
\\n\\n
'"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tabulate(predict(model, ' dzien dobry poprosze wizytę do doktor lekarza rodzinnego najlepiej dzisiaj w godzinach popołudniowych dziś albo jutro internisty'.split()), tablefmt='html')"
]
},
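{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a side note (an addition, not part of the original notebook), the predicted tags are usually turned back into slot values by grouping consecutive `B-`/`I-` labels into spans. The minimal sketch below, using a hypothetical helper `iob_to_slots`, decodes leniently: a stray `I-` tag that does not continue a span of the same slot, as in the output above, simply starts a new span."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A minimal sketch: group the (token, tag) pairs returned by predict() into\n",
"# slot spans. Decoding is lenient: an I- tag that does not directly continue\n",
"# a span of the same slot starts a new span instead of being dropped.\n",
"def iob_to_slots(tagged):\n",
"    slots = []\n",
"    prev = 'O'\n",
"    for token, tag in tagged:\n",
"        if tag == 'O':\n",
"            prev = tag\n",
"            continue\n",
"        prefix, name = tag.split('-', 1)\n",
"        if prefix == 'I' and prev != 'O' and prev.split('-', 1)[1] == name:\n",
"            slots[-1][1].append(token)\n",
"        else:\n",
"            slots.append((name, [token]))\n",
"        prev = tag\n",
"    return [(name, ' '.join(tokens)) for name, tokens in slots]\n",
"\n",
"# e.g. the fragment of the output above with a stray I-datetime tag\n",
"iob_to_slots([('dziś', 'B-datetime'), ('albo', 'O'), ('jutro', 'I-datetime')])\n",
"# -> [('datetime', 'dziś'), ('datetime', 'jutro')]"
]
},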
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Literatura\n",
"----------\n",
" 1. Sebastian Schuster, Sonal Gupta, Rushin Shah, Mike Lewis, Cross-lingual Transfer Learning for Multilingual Task Oriented Dialog. NAACL-HLT (1) 2019, pp. 3795-3805\n",
" 2. John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML '01). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 282–289, https://repository.upenn.edu/cgi/viewcontent.cgi?article=1162&context=cis_papers\n",
" 3. Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Comput. 9, 8 (November 15, 1997), 1735–1780, https://doi.org/10.1162/neco.1997.9.8.1735\n",
" 4. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, Attention is All you Need, NIPS 2017, pp. 5998-6008, https://arxiv.org/abs/1706.03762\n",
" 5. Alan Akbik, Duncan Blythe, Roland Vollgraf, Contextual String Embeddings for Sequence Labeling, Proceedings of the 27th International Conference on Computational Linguistics, pp. 1638–1649, https://www.aclweb.org/anthology/C18-1139.pdf\n"
]
}
],
"metadata": {
"jupytext": {
"cell_metadata_filter": "-all",
"main_language": "python",
"notebook_metadata_filter": "-all"
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}