Semantic Parsing with Machine Learning Techniques
Introduction
The problem of detecting slots and their values in user utterances can be formulated as the task of predicting, for each word, a label indicating whether the word belongs to a slot and, if so, to which one.
I'd like to book a table for tomorrow**/day** at twelve**/hour** forty**/hour** five**/hour** for five**/size** people
Slot boundaries are marked using a chosen labelling scheme.
The IOB scheme
Prefix | Meaning |
---|---|
I | inside a slot |
O | outside any slot |
B | beginning of a slot |
I'd like to book a table for tomorrow**/B-day** at twelve**/B-hour** forty**/I-hour** five**/I-hour** for five**/B-size** people
The IOBES scheme
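IOB labels can be derived mechanically from slot spans. A minimal sketch (the `iob_encode` helper and the span triples are hypothetical, not part of any dataset tooling):

```python
def iob_encode(tokens, spans):
    """Assign IOB labels to tokens given slot spans as
    (start, end, name) tuples, with end exclusive."""
    labels = ['O'] * len(tokens)
    for start, end, name in spans:
        labels[start] = 'B-' + name          # first word of the slot
        for i in range(start + 1, end):
            labels[i] = 'I-' + name          # remaining words of the slot
    return labels

print(iob_encode(['at', 'twelve', 'forty', 'five'], [(1, 4, 'hour')]))
# ['O', 'B-hour', 'I-hour', 'I-hour']
```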
Prefix | Meaning |
---|---|
I | inside a slot |
O | outside any slot |
B | beginning of a slot |
E | end of a slot |
S | single-word slot |
I'd like to book a table for tomorrow**/S-day** at twelve**/B-hour** forty**/I-hour** five**/E-hour** for five**/S-size** people
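Since IOBES refines IOB, one scheme can be converted into the other by looking one label ahead. A minimal sketch (the `iob_to_iobes` helper is hypothetical):

```python
def iob_to_iobes(labels):
    """Convert an IOB label sequence to IOBES."""
    out = []
    for i, label in enumerate(labels):
        if label == 'O':
            out.append(label)
            continue
        prefix, name = label.split('-', 1)
        nxt = labels[i + 1] if i + 1 < len(labels) else 'O'
        continues = nxt == 'I-' + name       # does the slot go on?
        if prefix == 'B':
            out.append(('B-' if continues else 'S-') + name)
        else:  # prefix == 'I'
            out.append(('I-' if continues else 'E-') + name)
    return out

print(iob_to_iobes(['O', 'B-hour', 'I-hour', 'I-hour', 'B-size']))
# ['O', 'B-hour', 'I-hour', 'E-hour', 'S-size']
```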
If we prepare a dataset of user utterances with annotated slots for this task (a _training set_), we can apply (supervised) machine learning techniques to build a model that annotates user utterances with slot labels.
Such a model can be built using, among others:
conditional random fields (Lafferty et al., 2001),
recurrent neural networks, e.g. LSTMs (Hochreiter and Schmidhuber, 1997),
transformers (Vaswani et al., 2017).
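What these sequence models share is that they score the whole label sequence jointly rather than each label in isolation, which lets decoding enforce constraints such as "I-hour may only follow B-hour or I-hour". A toy Viterbi decoder illustrates the idea (all labels, scores, and the zero-weight transitions here are made up for illustration; a real CRF also learns transition weights):

```python
def viterbi(emissions, labels, allowed):
    """Best label sequence given per-token emission scores and a set of
    allowed (previous, next) label transitions (all transitions score 0)."""
    # best[l] = (score of best path ending in label l, that path)
    best = {l: (emissions[0][l], [l]) for l in labels}
    for em in emissions[1:]:
        new = {}
        for l in labels:
            candidates = [(score + em[l], path)
                          for p, (score, path) in best.items()
                          if (p, l) in allowed]
            if candidates:
                score, path = max(candidates, key=lambda c: c[0])
                new[l] = (score, path + [l])
        best = new
    return max(best.values(), key=lambda c: c[0])[1]

labels = ['O', 'B-hour', 'I-hour']
# forbid the ill-formed transition O -> I-hour
allowed = {(p, l) for p in labels for l in labels
           if not (p == 'O' and l == 'I-hour')}
emissions = [{'O': 1.0, 'B-hour': 0.2, 'I-hour': 0.0},   # "at"
             {'O': 0.1, 'B-hour': 0.8, 'I-hour': 0.9},   # "twelve"
             {'O': 0.1, 'B-hour': 0.2, 'I-hour': 0.9}]   # "forty"
print(viterbi(emissions, labels, allowed))
# ['O', 'B-hour', 'I-hour']
```

Note that at "twelve" the locally best label is I-hour, but the transition constraint forces the globally consistent B-hour instead.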
Example
We will use the dataset prepared by Schuster (2019).
!mkdir -p l07
%cd l07
!curl -L -C - https://fb.me/multilingual_task_oriented_data -o data.zip
%cd ..
This dataset contains utterances in three languages, annotated with slots for twelve frames belonging to three domains: Alarm, Reminder, and Weather. We will load the data with the conllu library.
!pip3 install conllu
from conllu import parse_incr

fields = ['id', 'form', 'frame', 'slot']

def nolabel2o(line, i):
    return 'O' if line[i] == 'NoLabel' else line[i]

with open('Janet.conllu', encoding='utf-8') as trainfile:
    trainset = list(parse_incr(trainfile, fields=fields, field_parsers={'slot': nolabel2o}))
with open('Janet.conllu', encoding='utf-8') as testfile:
    testset = list(parse_incr(testfile, fields=fields, field_parsers={'slot': nolabel2o}))
Let us look at a few sample utterances from this dataset.
!pip3 install tabulate
from tabulate import tabulate
tabulate(trainset[0], tablefmt='html')
id | form | frame | slot |
---|---|---|---|
1 | chciałem | appointment/request_prescription | O |
2 | prosić | appointment/request_prescription | O |
3 | o | appointment/request_prescription | O |
4 | wypisanie | appointment/request_prescription | O |
5 | kolejnej | appointment/request_prescription | O |
6 | recepty | appointment/request_prescription | B-prescription |
7 | na | appointment/request_prescription | O |
8 | lek | appointment/request_prescription | B-prescription/type |
9 | x | appointment/request_prescription | I-prescription/type |
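Going in the opposite direction, predicted labels can be decoded back into slot values by grouping each B- tag with the I- tags that follow it. A minimal sketch (the `iob_decode` helper is hypothetical, not part of flair):

```python
def iob_decode(tokens, labels):
    """Group B-/I- labelled tokens into (slot_name, value) pairs."""
    slots = []
    for token, label in zip(tokens, labels):
        if label.startswith('B-'):
            slots.append((label[2:], [token]))       # open a new slot
        elif label.startswith('I-') and slots and slots[-1][0] == label[2:]:
            slots[-1][1].append(token)               # extend the open slot
    return [(name, ' '.join(words)) for name, words in slots]

tokens = ['recepty', 'na', 'lek', 'x']
labels = ['B-prescription', 'O', 'B-prescription/type', 'I-prescription/type']
print(iob_decode(tokens, labels))
# [('prescription', 'recepty'), ('prescription/type', 'lek x')]
```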
For the purpose of demonstrating the training process in a Jupyter notebook, we will restrict the dataset to the first few examples.
To build the model we will use an architecture based on recurrent neural networks, implemented in the flair library (Akbik et al., 2018).
!pip3 install flair
from flair.data import Corpus, Sentence, Token
from flair.datasets import SentenceDataset
from flair.embeddings import StackedEmbeddings
from flair.embeddings import WordEmbeddings
from flair.embeddings import CharacterEmbeddings
from flair.embeddings import FlairEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer
!pip3 install torch
# make the computations deterministic
import random
import torch

random.seed(42)
torch.manual_seed(42)
if torch.cuda.is_available():
    torch.cuda.manual_seed(0)
    torch.cuda.manual_seed_all(0)
    torch.backends.cudnn.enabled = False
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
We will convert the data to the format used by flair with the following function.
def conllu2flair(sentences, label=None):
    fsentences = []
    for sentence in sentences:
        fsentence = Sentence()
        for token in sentence:
            ftoken = Token(token['form'])
            if label:
                ftoken.add_tag(label, token[label])
            fsentence.add_token(ftoken)
        fsentences.append(fsentence)
    return SentenceDataset(fsentences)
corpus = Corpus(train=conllu2flair(trainset, 'slot'), test=conllu2flair(testset, 'slot'))
print(corpus)
tag_dictionary = corpus.make_tag_dictionary(tag_type='slot')
print(tag_dictionary)
Corpus: 99 train + 11 dev + 110 test sentences
Dictionary with 31 tags: <unk>, O, B-prescription, B-prescription/type, I-prescription/type, B-end_conversation, B-deny, I-end_conversation, B-greeting, I-greeting, B-appointment, B-appointment/doctor, I-appointment/doctor, B-datetime, NoLabel I-end_conversation, I-datetime, B-affirm, B-appointment/office, I-B-datetime, B-results, B-appointment/type, I-appointment/type, B-register/email, B-doctor, I-affirm, B-appoinment/doctor, B-appoinment, B-register/name, I-register/name, <START>
Our model will use vector representations of words (see Word Embeddings).
embedding_types = [
    WordEmbeddings('pl'),
    FlairEmbeddings('pl-forward'),
    FlairEmbeddings('pl-backward'),
    CharacterEmbeddings(),
]
embeddings = StackedEmbeddings(embeddings=embedding_types)
tagger = SequenceTagger(hidden_size=256, embeddings=embeddings,
                        tag_dictionary=tag_dictionary,
                        tag_type='slot', use_crf=True)
Let us look at the architecture of the neural network that will be responsible for predicting slots in utterances.
print(tagger)
SequenceTagger(
  (embeddings): StackedEmbeddings(
    (list_embedding_0): WordEmbeddings('pl')
    (list_embedding_1): FlairEmbeddings(
      (lm): LanguageModel(
        (drop): Dropout(p=0.25, inplace=False)
        (encoder): Embedding(1602, 100)
        (rnn): LSTM(100, 2048)
        (decoder): Linear(in_features=2048, out_features=1602, bias=True)
      )
    )
    (list_embedding_2): FlairEmbeddings(
      (lm): LanguageModel(
        (drop): Dropout(p=0.25, inplace=False)
        (encoder): Embedding(1602, 100)
        (rnn): LSTM(100, 2048)
        (decoder): Linear(in_features=2048, out_features=1602, bias=True)
      )
    )
    (list_embedding_3): CharacterEmbeddings(
      (char_embedding): Embedding(275, 25)
      (char_rnn): LSTM(25, 25, bidirectional=True)
    )
  )
  (word_dropout): WordDropout(p=0.05)
  (locked_dropout): LockedDropout(p=0.5)
  (embedding2nn): Linear(in_features=4446, out_features=4446, bias=True)
  (rnn): LSTM(4446, 256, batch_first=True, bidirectional=True)
  (linear): Linear(in_features=512, out_features=31, bias=True)
  (beta): 1.0
  (weights): None
  (weight_tensor) None
)
We will train for up to 100 epochs (with patience-based learning-rate annealing) and save the resulting model in the slot-model directory.
trainer = ModelTrainer(tagger, corpus)
trainer.train('slot-model',
              learning_rate=0.1,
              mini_batch_size=32,
              max_epochs=100,
              train_with_dev=False)
2021-05-16 19:11:09,846 Corpus: "Corpus: 99 train + 11 dev + 110 test sentences"
2021-05-16 19:11:09,846 Parameters:
2021-05-16 19:11:09,854 - learning_rate: "0.1"
2021-05-16 19:11:09,854 - mini_batch_size: "32"
2021-05-16 19:11:09,854 - patience: "3"
2021-05-16 19:11:09,854 - anneal_factor: "0.5"
2021-05-16 19:11:09,854 - max_epochs: "100"
2021-05-16 19:11:09,854 - shuffle: "True"
2021-05-16 19:11:09,854 - train_with_dev: "False"
2021-05-16 19:11:09,862 Model training base path: "slot-model"
2021-05-16 19:11:09,862 Device: cpu
2021-05-16 19:11:20,665 EPOCH 1 done: loss 16.5623 - lr 0.1000000
2021-05-16 19:11:23,175 DEV : loss 12.217952728271484 - score 0.0
2021-05-16 19:11:33,672 EPOCH 2 done: loss 10.8377 - lr 0.1000000
2021-05-16 19:11:33,768 DEV : loss 8.176359176635742 - score 0.0
[...]
2021-05-16 19:13:12,408 EPOCH 15 done: loss 2.9027 - lr 0.1000000
2021-05-16 19:13:12,498 DEV : loss 4.368889808654785 - score 0.6923
[...]
2021-05-16 19:13:40,934 EPOCH 23 done: loss 1.7354 - lr 0.0500000
2021-05-16 19:13:41,043 DEV : loss 3.495877265930176 - score 0.5833
Epoch 23: reducing learning rate of group 0 to 2.5000e-02.
[log truncated]
2021-05-16 19:13:41,043 BAD EPOCHS (no improvement): 4 2021-05-16 19:13:41,051 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:13:41,949 epoch 24 - iter 1/4 - loss 2.58249235 - samples/sec: 35.62 - lr: 0.025000 2021-05-16 19:13:42,545 epoch 24 - iter 2/4 - loss 2.33847690 - samples/sec: 53.73 - lr: 0.025000 2021-05-16 19:13:43,209 epoch 24 - iter 3/4 - loss 2.05386758 - samples/sec: 48.20 - lr: 0.025000 2021-05-16 19:13:43,426 epoch 24 - iter 4/4 - loss 1.69814771 - samples/sec: 147.27 - lr: 0.025000 2021-05-16 19:13:43,426 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:13:43,426 EPOCH 24 done: loss 1.6981 - lr 0.0250000 2021-05-16 19:13:43,514 DEV : loss 3.547339677810669 - score 0.5833 2021-05-16 19:13:43,514 BAD EPOCHS (no improvement): 1 2021-05-16 19:13:43,514 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:13:44,502 epoch 25 - iter 1/4 - loss 2.63612175 - samples/sec: 32.67 - lr: 0.025000 2021-05-16 19:13:45,551 epoch 25 - iter 2/4 - loss 2.28528547 - samples/sec: 30.49 - lr: 0.025000 2021-05-16 19:13:46,368 epoch 25 - iter 3/4 - loss 2.18019919 - samples/sec: 39.20 - lr: 0.025000 2021-05-16 19:13:46,585 epoch 25 - iter 4/4 - loss 1.82882562 - samples/sec: 147.22 - lr: 0.025000 2021-05-16 19:13:46,585 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:13:46,585 EPOCH 25 done: loss 1.8288 - lr 0.0250000 2021-05-16 19:13:46,681 DEV : loss 3.695451259613037 - score 0.6667 2021-05-16 19:13:46,681 BAD EPOCHS (no improvement): 2 2021-05-16 19:13:46,681 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:13:47,435 epoch 26 - iter 1/4 - loss 2.46649575 - samples/sec: 42.90 - lr: 0.025000 2021-05-16 19:13:48,195 epoch 26 - iter 2/4 - loss 
1.86319947 - samples/sec: 42.09 - lr: 0.025000 2021-05-16 19:13:49,101 epoch 26 - iter 3/4 - loss 1.99375129 - samples/sec: 35.34 - lr: 0.025000 2021-05-16 19:13:49,350 epoch 26 - iter 4/4 - loss 2.51209539 - samples/sec: 132.64 - lr: 0.025000 2021-05-16 19:13:49,350 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:13:49,350 EPOCH 26 done: loss 2.5121 - lr 0.0250000 2021-05-16 19:13:49,454 DEV : loss 3.5949974060058594 - score 0.6667 2021-05-16 19:13:49,457 BAD EPOCHS (no improvement): 3 2021-05-16 19:13:49,457 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:13:50,194 epoch 27 - iter 1/4 - loss 1.67152703 - samples/sec: 43.40 - lr: 0.025000 2021-05-16 19:13:50,906 epoch 27 - iter 2/4 - loss 1.81827271 - samples/sec: 44.95 - lr: 0.025000 2021-05-16 19:13:51,642 epoch 27 - iter 3/4 - loss 1.91284267 - samples/sec: 43.46 - lr: 0.025000 2021-05-16 19:13:51,834 epoch 27 - iter 4/4 - loss 2.51718122 - samples/sec: 166.65 - lr: 0.025000 2021-05-16 19:13:51,834 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:13:51,834 EPOCH 27 done: loss 2.5172 - lr 0.0250000 2021-05-16 19:13:51,930 DEV : loss 3.624786376953125 - score 0.6667 Epoch 27: reducing learning rate of group 0 to 1.2500e-02. 
2021-05-16 19:13:51,930 BAD EPOCHS (no improvement): 4 2021-05-16 19:13:51,930 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:13:52,650 epoch 28 - iter 1/4 - loss 2.06657982 - samples/sec: 44.45 - lr: 0.012500 2021-05-16 19:13:53,405 epoch 28 - iter 2/4 - loss 2.16739893 - samples/sec: 42.42 - lr: 0.012500 2021-05-16 19:13:54,234 epoch 28 - iter 3/4 - loss 1.87206562 - samples/sec: 38.60 - lr: 0.012500 2021-05-16 19:13:54,402 epoch 28 - iter 4/4 - loss 1.53354126 - samples/sec: 190.48 - lr: 0.012500 2021-05-16 19:13:54,410 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:13:54,410 EPOCH 28 done: loss 1.5335 - lr 0.0125000 2021-05-16 19:13:54,498 DEV : loss 3.486685276031494 - score 0.6667 2021-05-16 19:13:54,498 BAD EPOCHS (no improvement): 1 2021-05-16 19:13:54,498 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:13:55,514 epoch 29 - iter 1/4 - loss 1.94683826 - samples/sec: 31.74 - lr: 0.012500 2021-05-16 19:13:56,355 epoch 29 - iter 2/4 - loss 1.87296987 - samples/sec: 38.03 - lr: 0.012500 2021-05-16 19:13:57,018 epoch 29 - iter 3/4 - loss 1.93602276 - samples/sec: 48.88 - lr: 0.012500 2021-05-16 19:13:57,202 epoch 29 - iter 4/4 - loss 1.87588742 - samples/sec: 173.70 - lr: 0.012500 2021-05-16 19:13:57,202 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:13:57,202 EPOCH 29 done: loss 1.8759 - lr 0.0125000 2021-05-16 19:13:57,298 DEV : loss 3.5309135913848877 - score 0.6667 2021-05-16 19:13:57,298 BAD EPOCHS (no improvement): 2 2021-05-16 19:13:57,298 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:13:58,250 epoch 30 - iter 1/4 - loss 2.16732407 - samples/sec: 33.90 - lr: 0.012500 2021-05-16 19:13:58,931 epoch 30 - iter 2/4 - loss 
1.72622716 - samples/sec: 46.96 - lr: 0.012500 2021-05-16 19:13:59,781 epoch 30 - iter 3/4 - loss 1.93175316 - samples/sec: 37.65 - lr: 0.012500 2021-05-16 19:13:59,982 epoch 30 - iter 4/4 - loss 1.60670690 - samples/sec: 159.08 - lr: 0.012500 2021-05-16 19:13:59,990 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:13:59,990 EPOCH 30 done: loss 1.6067 - lr 0.0125000 2021-05-16 19:14:00,088 DEV : loss 3.4875831604003906 - score 0.6667 2021-05-16 19:14:00,096 BAD EPOCHS (no improvement): 3 2021-05-16 19:14:00,096 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:01,011 epoch 31 - iter 1/4 - loss 2.39419317 - samples/sec: 34.99 - lr: 0.012500 2021-05-16 19:14:01,826 epoch 31 - iter 2/4 - loss 1.94124657 - samples/sec: 39.64 - lr: 0.012500 2021-05-16 19:14:02,676 epoch 31 - iter 3/4 - loss 1.81396655 - samples/sec: 37.62 - lr: 0.012500 2021-05-16 19:14:02,876 epoch 31 - iter 4/4 - loss 1.78971809 - samples/sec: 166.69 - lr: 0.012500 2021-05-16 19:14:02,884 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:02,884 EPOCH 31 done: loss 1.7897 - lr 0.0125000 2021-05-16 19:14:02,961 DEV : loss 3.4355287551879883 - score 0.5833 Epoch 31: reducing learning rate of group 0 to 6.2500e-03. 
2021-05-16 19:14:02,961 BAD EPOCHS (no improvement): 4 2021-05-16 19:14:02,976 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:03,838 epoch 32 - iter 1/4 - loss 1.18405724 - samples/sec: 37.13 - lr: 0.006250 2021-05-16 19:14:04,727 epoch 32 - iter 2/4 - loss 1.78029823 - samples/sec: 35.98 - lr: 0.006250 2021-05-16 19:14:05,416 epoch 32 - iter 3/4 - loss 1.71468850 - samples/sec: 46.96 - lr: 0.006250 2021-05-16 19:14:05,673 epoch 32 - iter 4/4 - loss 1.98795196 - samples/sec: 124.99 - lr: 0.006250 2021-05-16 19:14:05,673 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:05,673 EPOCH 32 done: loss 1.9880 - lr 0.0062500 2021-05-16 19:14:05,768 DEV : loss 3.4302756786346436 - score 0.5833 2021-05-16 19:14:05,776 BAD EPOCHS (no improvement): 1 2021-05-16 19:14:05,776 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:06,493 epoch 33 - iter 1/4 - loss 1.43548059 - samples/sec: 44.69 - lr: 0.006250 2021-05-16 19:14:07,307 epoch 33 - iter 2/4 - loss 1.70211828 - samples/sec: 39.28 - lr: 0.006250 2021-05-16 19:14:08,082 epoch 33 - iter 3/4 - loss 1.72906860 - samples/sec: 41.30 - lr: 0.006250 2021-05-16 19:14:08,343 epoch 33 - iter 4/4 - loss 2.12577587 - samples/sec: 122.39 - lr: 0.006250 2021-05-16 19:14:08,343 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:08,343 EPOCH 33 done: loss 2.1258 - lr 0.0062500 2021-05-16 19:14:08,431 DEV : loss 3.4519147872924805 - score 0.6667 2021-05-16 19:14:08,439 BAD EPOCHS (no improvement): 2 2021-05-16 19:14:08,439 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:09,154 epoch 34 - iter 1/4 - loss 1.07441115 - samples/sec: 44.79 - lr: 0.006250 2021-05-16 19:14:09,975 epoch 34 - iter 2/4 - loss 
1.89638603 - samples/sec: 38.96 - lr: 0.006250 2021-05-16 19:14:10,993 epoch 34 - iter 3/4 - loss 1.81038960 - samples/sec: 31.45 - lr: 0.006250 2021-05-16 19:14:11,289 epoch 34 - iter 4/4 - loss 1.82815674 - samples/sec: 108.11 - lr: 0.006250 2021-05-16 19:14:11,289 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:11,289 EPOCH 34 done: loss 1.8282 - lr 0.0062500 2021-05-16 19:14:11,393 DEV : loss 3.4468681812286377 - score 0.6667 2021-05-16 19:14:11,393 BAD EPOCHS (no improvement): 3 2021-05-16 19:14:11,393 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:12,314 epoch 35 - iter 1/4 - loss 1.71202326 - samples/sec: 34.74 - lr: 0.006250 2021-05-16 19:14:13,347 epoch 35 - iter 2/4 - loss 2.02234995 - samples/sec: 30.99 - lr: 0.006250 2021-05-16 19:14:13,977 epoch 35 - iter 3/4 - loss 1.83293974 - samples/sec: 51.40 - lr: 0.006250 2021-05-16 19:14:14,155 epoch 35 - iter 4/4 - loss 1.40346918 - samples/sec: 188.15 - lr: 0.006250 2021-05-16 19:14:14,155 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:14,155 EPOCH 35 done: loss 1.4035 - lr 0.0062500 2021-05-16 19:14:14,251 DEV : loss 3.4555253982543945 - score 0.6667 Epoch 35: reducing learning rate of group 0 to 3.1250e-03. 
2021-05-16 19:14:14,251 BAD EPOCHS (no improvement): 4 2021-05-16 19:14:14,251 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:15,020 epoch 36 - iter 1/4 - loss 1.60199451 - samples/sec: 41.61 - lr: 0.003125 2021-05-16 19:14:15,758 epoch 36 - iter 2/4 - loss 1.76909965 - samples/sec: 43.41 - lr: 0.003125 2021-05-16 19:14:16,694 epoch 36 - iter 3/4 - loss 1.96563844 - samples/sec: 34.46 - lr: 0.003125 2021-05-16 19:14:16,926 epoch 36 - iter 4/4 - loss 2.04810312 - samples/sec: 137.94 - lr: 0.003125 2021-05-16 19:14:16,926 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:16,926 EPOCH 36 done: loss 2.0481 - lr 0.0031250 2021-05-16 19:14:17,022 DEV : loss 3.467947483062744 - score 0.6667 2021-05-16 19:14:17,022 BAD EPOCHS (no improvement): 1 2021-05-16 19:14:17,022 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:17,771 epoch 37 - iter 1/4 - loss 1.59361398 - samples/sec: 42.71 - lr: 0.003125 2021-05-16 19:14:18,573 epoch 37 - iter 2/4 - loss 1.86242718 - samples/sec: 39.93 - lr: 0.003125 2021-05-16 19:14:19,367 epoch 37 - iter 3/4 - loss 1.84938045 - samples/sec: 40.27 - lr: 0.003125 2021-05-16 19:14:19,575 epoch 37 - iter 4/4 - loss 1.94639012 - samples/sec: 159.98 - lr: 0.003125 2021-05-16 19:14:19,575 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:19,575 EPOCH 37 done: loss 1.9464 - lr 0.0031250 2021-05-16 19:14:19,663 DEV : loss 3.4721953868865967 - score 0.6667 2021-05-16 19:14:19,663 BAD EPOCHS (no improvement): 2 2021-05-16 19:14:19,663 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:20,420 epoch 38 - iter 1/4 - loss 1.87127459 - samples/sec: 42.26 - lr: 0.003125 2021-05-16 19:14:21,214 epoch 38 - iter 2/4 - loss 
1.65014571 - samples/sec: 40.34 - lr: 0.003125 2021-05-16 19:14:22,201 epoch 38 - iter 3/4 - loss 1.78922117 - samples/sec: 32.41 - lr: 0.003125 2021-05-16 19:14:22,409 epoch 38 - iter 4/4 - loss 1.57039295 - samples/sec: 153.84 - lr: 0.003125 2021-05-16 19:14:22,417 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:22,417 EPOCH 38 done: loss 1.5704 - lr 0.0031250 2021-05-16 19:14:22,522 DEV : loss 3.4747495651245117 - score 0.6667 2021-05-16 19:14:22,522 BAD EPOCHS (no improvement): 3 2021-05-16 19:14:22,522 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:23,532 epoch 39 - iter 1/4 - loss 1.71339095 - samples/sec: 31.94 - lr: 0.003125 2021-05-16 19:14:24,351 epoch 39 - iter 2/4 - loss 1.87997061 - samples/sec: 39.07 - lr: 0.003125 2021-05-16 19:14:25,353 epoch 39 - iter 3/4 - loss 1.93014069 - samples/sec: 31.93 - lr: 0.003125 2021-05-16 19:14:25,553 epoch 39 - iter 4/4 - loss 1.66254094 - samples/sec: 166.68 - lr: 0.003125 2021-05-16 19:14:25,561 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:25,561 EPOCH 39 done: loss 1.6625 - lr 0.0031250 2021-05-16 19:14:25,650 DEV : loss 3.4640121459960938 - score 0.6667 Epoch 39: reducing learning rate of group 0 to 1.5625e-03. 
2021-05-16 19:14:25,650 BAD EPOCHS (no improvement): 4 2021-05-16 19:14:25,650 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:26,482 epoch 40 - iter 1/4 - loss 1.51390183 - samples/sec: 38.46 - lr: 0.001563 2021-05-16 19:14:27,268 epoch 40 - iter 2/4 - loss 1.62989253 - samples/sec: 40.73 - lr: 0.001563 2021-05-16 19:14:28,116 epoch 40 - iter 3/4 - loss 1.59191600 - samples/sec: 37.73 - lr: 0.001563 2021-05-16 19:14:28,389 epoch 40 - iter 4/4 - loss 1.58031228 - samples/sec: 116.91 - lr: 0.001563 2021-05-16 19:14:28,389 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:28,389 EPOCH 40 done: loss 1.5803 - lr 0.0015625 2021-05-16 19:14:28,493 DEV : loss 3.464979648590088 - score 0.6667 2021-05-16 19:14:28,493 BAD EPOCHS (no improvement): 1 2021-05-16 19:14:28,493 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:29,395 epoch 41 - iter 1/4 - loss 2.09950924 - samples/sec: 35.51 - lr: 0.001563 2021-05-16 19:14:30,198 epoch 41 - iter 2/4 - loss 2.02299452 - samples/sec: 39.85 - lr: 0.001563 2021-05-16 19:14:30,959 epoch 41 - iter 3/4 - loss 1.83912905 - samples/sec: 42.02 - lr: 0.001563 2021-05-16 19:14:31,168 epoch 41 - iter 4/4 - loss 2.28552222 - samples/sec: 152.95 - lr: 0.001563 2021-05-16 19:14:31,176 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:31,176 EPOCH 41 done: loss 2.2855 - lr 0.0015625 2021-05-16 19:14:31,256 DEV : loss 3.46785044670105 - score 0.6667 2021-05-16 19:14:31,256 BAD EPOCHS (no improvement): 2 2021-05-16 19:14:31,264 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:31,960 epoch 42 - iter 1/4 - loss 2.07870221 - samples/sec: 45.98 - lr: 0.001563 2021-05-16 19:14:32,809 epoch 42 - iter 2/4 - loss 
1.80660170 - samples/sec: 38.05 - lr: 0.001563 2021-05-16 19:14:33,486 epoch 42 - iter 3/4 - loss 1.86924104 - samples/sec: 47.31 - lr: 0.001563 2021-05-16 19:14:33,738 epoch 42 - iter 4/4 - loss 2.06889942 - samples/sec: 126.97 - lr: 0.001563 2021-05-16 19:14:33,738 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:33,738 EPOCH 42 done: loss 2.0689 - lr 0.0015625 2021-05-16 19:14:33,827 DEV : loss 3.464182138442993 - score 0.6667 2021-05-16 19:14:33,835 BAD EPOCHS (no improvement): 3 2021-05-16 19:14:33,835 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:34,689 epoch 43 - iter 1/4 - loss 2.16509676 - samples/sec: 37.68 - lr: 0.001563 2021-05-16 19:14:35,420 epoch 43 - iter 2/4 - loss 1.79616153 - samples/sec: 44.27 - lr: 0.001563 2021-05-16 19:14:36,298 epoch 43 - iter 3/4 - loss 1.79792849 - samples/sec: 36.44 - lr: 0.001563 2021-05-16 19:14:36,517 epoch 43 - iter 4/4 - loss 1.78867936 - samples/sec: 146.19 - lr: 0.001563 2021-05-16 19:14:36,517 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:36,517 EPOCH 43 done: loss 1.7887 - lr 0.0015625 2021-05-16 19:14:36,589 DEV : loss 3.464967966079712 - score 0.6667 Epoch 43: reducing learning rate of group 0 to 7.8125e-04. 
2021-05-16 19:14:36,589 BAD EPOCHS (no improvement): 4 2021-05-16 19:14:36,603 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:37,308 epoch 44 - iter 1/4 - loss 1.60833621 - samples/sec: 45.36 - lr: 0.000781 2021-05-16 19:14:38,140 epoch 44 - iter 2/4 - loss 1.45758373 - samples/sec: 38.48 - lr: 0.000781 2021-05-16 19:14:38,983 epoch 44 - iter 3/4 - loss 1.52034609 - samples/sec: 37.96 - lr: 0.000781 2021-05-16 19:14:39,226 epoch 44 - iter 4/4 - loss 2.32687372 - samples/sec: 131.50 - lr: 0.000781 2021-05-16 19:14:39,235 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:39,236 EPOCH 44 done: loss 2.3269 - lr 0.0007813 2021-05-16 19:14:39,343 DEV : loss 3.467527151107788 - score 0.6667 2021-05-16 19:14:39,343 BAD EPOCHS (no improvement): 1 2021-05-16 19:14:39,343 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:40,254 epoch 45 - iter 1/4 - loss 2.09789848 - samples/sec: 35.42 - lr: 0.000781 2021-05-16 19:14:41,142 epoch 45 - iter 2/4 - loss 1.90345168 - samples/sec: 36.05 - lr: 0.000781 2021-05-16 19:14:41,828 epoch 45 - iter 3/4 - loss 1.76009802 - samples/sec: 46.62 - lr: 0.000781 2021-05-16 19:14:42,079 epoch 45 - iter 4/4 - loss 1.94041607 - samples/sec: 127.70 - lr: 0.000781 2021-05-16 19:14:42,079 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:42,079 EPOCH 45 done: loss 1.9404 - lr 0.0007813 2021-05-16 19:14:42,174 DEV : loss 3.4680516719818115 - score 0.6667 2021-05-16 19:14:42,174 BAD EPOCHS (no improvement): 2 2021-05-16 19:14:42,174 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:42,949 epoch 46 - iter 1/4 - loss 2.13200164 - samples/sec: 41.69 - lr: 0.000781 2021-05-16 19:14:43,628 epoch 46 - iter 2/4 - loss 
1.92884541 - samples/sec: 47.13 - lr: 0.000781 2021-05-16 19:14:44,188 epoch 46 - iter 3/4 - loss 1.86859485 - samples/sec: 57.14 - lr: 0.000781 2021-05-16 19:14:44,420 epoch 46 - iter 4/4 - loss 2.23936662 - samples/sec: 137.82 - lr: 0.000781 2021-05-16 19:14:44,420 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:44,420 EPOCH 46 done: loss 2.2394 - lr 0.0007813 2021-05-16 19:14:44,516 DEV : loss 3.467272996902466 - score 0.6667 2021-05-16 19:14:44,516 BAD EPOCHS (no improvement): 3 2021-05-16 19:14:44,516 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:45,083 epoch 47 - iter 1/4 - loss 1.17524457 - samples/sec: 57.22 - lr: 0.000781 2021-05-16 19:14:45,804 epoch 47 - iter 2/4 - loss 1.69363821 - samples/sec: 44.40 - lr: 0.000781 2021-05-16 19:14:46,515 epoch 47 - iter 3/4 - loss 1.80291025 - samples/sec: 45.00 - lr: 0.000781 2021-05-16 19:14:46,744 epoch 47 - iter 4/4 - loss 1.68751404 - samples/sec: 139.56 - lr: 0.000781 2021-05-16 19:14:46,744 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:46,744 EPOCH 47 done: loss 1.6875 - lr 0.0007813 2021-05-16 19:14:46,841 DEV : loss 3.4656827449798584 - score 0.6667 Epoch 47: reducing learning rate of group 0 to 3.9063e-04. 
2021-05-16 19:14:46,841 BAD EPOCHS (no improvement): 4 2021-05-16 19:14:46,845 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:47,512 epoch 48 - iter 1/4 - loss 1.40106690 - samples/sec: 47.97 - lr: 0.000391 2021-05-16 19:14:48,126 epoch 48 - iter 2/4 - loss 1.41452271 - samples/sec: 52.10 - lr: 0.000391 2021-05-16 19:14:48,882 epoch 48 - iter 3/4 - loss 1.74593834 - samples/sec: 42.34 - lr: 0.000391 2021-05-16 19:14:49,064 epoch 48 - iter 4/4 - loss 1.58755332 - samples/sec: 176.07 - lr: 0.000391 2021-05-16 19:14:49,064 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:49,064 EPOCH 48 done: loss 1.5876 - lr 0.0003906 2021-05-16 19:14:49,149 DEV : loss 3.467986822128296 - score 0.6667 2021-05-16 19:14:49,149 BAD EPOCHS (no improvement): 1 2021-05-16 19:14:49,149 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:49,930 epoch 49 - iter 1/4 - loss 1.38971734 - samples/sec: 40.97 - lr: 0.000391 2021-05-16 19:14:50,510 epoch 49 - iter 2/4 - loss 1.67799520 - samples/sec: 55.24 - lr: 0.000391 2021-05-16 19:14:51,137 epoch 49 - iter 3/4 - loss 1.69751259 - samples/sec: 51.05 - lr: 0.000391 2021-05-16 19:14:51,356 epoch 49 - iter 4/4 - loss 1.83348897 - samples/sec: 145.87 - lr: 0.000391 2021-05-16 19:14:51,356 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:51,356 EPOCH 49 done: loss 1.8335 - lr 0.0003906 2021-05-16 19:14:51,446 DEV : loss 3.4678850173950195 - score 0.6667 2021-05-16 19:14:51,446 BAD EPOCHS (no improvement): 2 2021-05-16 19:14:51,462 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:52,179 epoch 50 - iter 1/4 - loss 1.13970292 - samples/sec: 44.71 - lr: 0.000391 2021-05-16 19:14:52,916 epoch 50 - iter 2/4 - loss 
1.94286901 - samples/sec: 43.40 - lr: 0.000391 2021-05-16 19:14:53,640 epoch 50 - iter 3/4 - loss 1.91910776 - samples/sec: 44.19 - lr: 0.000391 2021-05-16 19:14:53,807 epoch 50 - iter 4/4 - loss 1.56437027 - samples/sec: 191.98 - lr: 0.000391 2021-05-16 19:14:53,807 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:53,807 EPOCH 50 done: loss 1.5644 - lr 0.0003906 2021-05-16 19:14:53,886 DEV : loss 3.4673101902008057 - score 0.6667 2021-05-16 19:14:53,886 BAD EPOCHS (no improvement): 3 2021-05-16 19:14:53,898 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:54,525 epoch 51 - iter 1/4 - loss 1.64230800 - samples/sec: 50.99 - lr: 0.000391 2021-05-16 19:14:55,323 epoch 51 - iter 2/4 - loss 1.66435432 - samples/sec: 40.11 - lr: 0.000391 2021-05-16 19:14:56,158 epoch 51 - iter 3/4 - loss 1.76997383 - samples/sec: 38.33 - lr: 0.000391 2021-05-16 19:14:56,348 epoch 51 - iter 4/4 - loss 1.45529963 - samples/sec: 168.77 - lr: 0.000391 2021-05-16 19:14:56,348 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:56,348 EPOCH 51 done: loss 1.4553 - lr 0.0003906 2021-05-16 19:14:56,451 DEV : loss 3.46675705909729 - score 0.6667 Epoch 51: reducing learning rate of group 0 to 1.9531e-04. 
2021-05-16 19:14:56,451 BAD EPOCHS (no improvement): 4 2021-05-16 19:14:56,451 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:57,134 epoch 52 - iter 1/4 - loss 1.39893460 - samples/sec: 47.38 - lr: 0.000195 2021-05-16 19:14:57,904 epoch 52 - iter 2/4 - loss 1.95114291 - samples/sec: 41.57 - lr: 0.000195 2021-05-16 19:14:58,589 epoch 52 - iter 3/4 - loss 1.87273510 - samples/sec: 46.70 - lr: 0.000195 2021-05-16 19:14:58,814 epoch 52 - iter 4/4 - loss 1.66518828 - samples/sec: 142.21 - lr: 0.000195 2021-05-16 19:14:58,814 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:58,814 EPOCH 52 done: loss 1.6652 - lr 0.0001953 2021-05-16 19:14:58,898 DEV : loss 3.4661099910736084 - score 0.6667 2021-05-16 19:14:58,898 BAD EPOCHS (no improvement): 1 2021-05-16 19:14:58,898 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:14:59,621 epoch 53 - iter 1/4 - loss 1.52661002 - samples/sec: 44.90 - lr: 0.000195 2021-05-16 19:15:00,323 epoch 53 - iter 2/4 - loss 1.72744888 - samples/sec: 45.60 - lr: 0.000195 2021-05-16 19:15:01,033 epoch 53 - iter 3/4 - loss 1.67759216 - samples/sec: 45.09 - lr: 0.000195 2021-05-16 19:15:01,186 epoch 53 - iter 4/4 - loss 1.46851297 - samples/sec: 208.70 - lr: 0.000195 2021-05-16 19:15:01,186 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:15:01,186 EPOCH 53 done: loss 1.4685 - lr 0.0001953 2021-05-16 19:15:01,282 DEV : loss 3.466641426086426 - score 0.6667 2021-05-16 19:15:01,282 BAD EPOCHS (no improvement): 2 2021-05-16 19:15:01,282 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:15:01,903 epoch 54 - iter 1/4 - loss 1.67276871 - samples/sec: 51.56 - lr: 0.000195 2021-05-16 19:15:02,720 epoch 54 - iter 2/4 - loss 
1.84151357 - samples/sec: 39.15 - lr: 0.000195 2021-05-16 19:15:03,497 epoch 54 - iter 3/4 - loss 1.79460196 - samples/sec: 41.16 - lr: 0.000195 2021-05-16 19:15:03,697 epoch 54 - iter 4/4 - loss 1.73617950 - samples/sec: 160.20 - lr: 0.000195 2021-05-16 19:15:03,697 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:15:03,697 EPOCH 54 done: loss 1.7362 - lr 0.0001953 2021-05-16 19:15:03,791 DEV : loss 3.4663610458374023 - score 0.6667 2021-05-16 19:15:03,807 BAD EPOCHS (no improvement): 3 2021-05-16 19:15:03,809 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:15:04,563 epoch 55 - iter 1/4 - loss 2.19241428 - samples/sec: 42.46 - lr: 0.000195 2021-05-16 19:15:05,206 epoch 55 - iter 2/4 - loss 1.68816346 - samples/sec: 49.73 - lr: 0.000195 2021-05-16 19:15:05,899 epoch 55 - iter 3/4 - loss 1.67743218 - samples/sec: 46.20 - lr: 0.000195 2021-05-16 19:15:06,147 epoch 55 - iter 4/4 - loss 1.62165421 - samples/sec: 129.04 - lr: 0.000195 2021-05-16 19:15:06,147 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:15:06,147 EPOCH 55 done: loss 1.6217 - lr 0.0001953 2021-05-16 19:15:06,243 DEV : loss 3.4659790992736816 - score 0.6667 Epoch 55: reducing learning rate of group 0 to 9.7656e-05. 2021-05-16 19:15:06,243 BAD EPOCHS (no improvement): 4 2021-05-16 19:15:06,243 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:15:06,243 ---------------------------------------------------------------------------------------------------- 2021-05-16 19:15:06,243 learning rate too small - quitting training! 
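The training log above illustrates an anneal-on-plateau strategy: after a run of epochs with no improvement of the development-set score, the learning rate is multiplied by an annealing factor, and training stops once it falls below a minimum threshold. Below is a minimal sketch of that logic; the parameter names (`anneal_factor`, `patience`, `min_lr`) are chosen to mirror the arguments of flair's `ModelTrainer.train`, but this is an illustration of the behaviour visible in the log, not flair's actual implementation.

```python
def anneal_schedule(dev_scores, lr=0.1, anneal_factor=0.5,
                    patience=3, min_lr=1e-4):
    """Simulate anneal-on-plateau over a sequence of dev-set scores."""
    best, bad_epochs = float('-inf'), 0
    for epoch, score in enumerate(dev_scores, start=1):
        if score > best:                    # strict improvement resets the counter
            best, bad_epochs = score, 0
        else:
            bad_epochs += 1                 # log: "BAD EPOCHS (no improvement): n"
        if bad_epochs > patience:
            lr *= anneal_factor             # log: "reducing learning rate of group 0 ..."
            bad_epochs = 0
        if lr < min_lr:
            # log: "learning rate too small - quitting training!"
            return epoch, lr
    return len(dev_scores), lr
```

With a completely flat score history this reproduces the geometric decay seen in the log (0.1, 0.05, 0.025, ..., 9.7656e-05), after which training is aborted.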
2021-05-16 19:15:06,243 ----------------------------------------------------------------------------------------------------
2021-05-16 19:15:14,421 ----------------------------------------------------------------------------------------------------
2021-05-16 19:15:14,421 Testing using best model ...
2021-05-16 19:15:14,426 loading file slot-model\best-model.pt
2021-05-16 19:15:34,103 0.6759 0.6901 0.6829
2021-05-16 19:15:34,103 Results:
- F1-score (micro) 0.6829
- F1-score (macro) 0.3185

By class:
NoLabel
I-end_conversation tp: 0 - fp: 0 - fn: 2 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
affirm tp: 11 - fp: 6 - fn: 2 - precision: 0.6471 - recall: 0.8462 - f1-score: 0.7333
appoinment tp: 0 - fp: 0 - fn: 2 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
appoinment/doctor tp: 0 - fp: 0 - fn: 2 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
appointment tp: 19 - fp: 4 - fn: 1 - precision: 0.8261 - recall: 0.9500 - f1-score: 0.8837
appointment/doctor tp: 16 - fp: 13 - fn: 5 - precision: 0.5517 - recall: 0.7619 - f1-score: 0.6400
appointment/office tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
appointment/type tp: 4 - fp: 2 - fn: 2 - precision: 0.6667 - recall: 0.6667 - f1-score: 0.6667
datetime tp: 12 - fp: 7 - fn: 6 - precision: 0.6316 - recall: 0.6667 - f1-score: 0.6486
deny tp: 0 - fp: 0 - fn: 4 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
doctor tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
end_conversation tp: 14 - fp: 8 - fn: 6 - precision: 0.6364 - recall: 0.7000 - f1-score: 0.6667
greeting tp: 18 - fp: 3 - fn: 2 - precision: 0.8571 - recall: 0.9000 - f1-score: 0.8780
prescription tp: 4 - fp: 3 - fn: 2 - precision: 0.5714 - recall: 0.6667 - f1-score: 0.6154
prescription/type tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
register/email tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
register/name tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
results tp: 0 - fp: 1 - fn: 3 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
2021-05-16 19:15:34,103 ----------------------------------------------------------------------------------------------------
{'test_score': 0.6829268292682927, 'dev_score_history': [0.0, 0.0, 0.0, 0.0, 0.06451612903225808, 0.0, 0.15384615384615383, 0.34782608695652173, 0.2608695652173913, 0.30769230769230765, 0.34782608695652173, 0.3333333333333333, 0.4615384615384615, 0.43478260869565216, 0.6923076923076924, 0.5833333333333334, 0.6153846153846153, 0.48000000000000004, 0.5185185185185186, 0.5599999999999999, 0.5599999999999999, 0.6666666666666666, 0.5833333333333334, 0.5833333333333334, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.5833333333333334, 0.5833333333333334, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666, 0.6666666666666666], 'train_loss_history': [16.562259912490845, 10.837674498558044, 8.949250102043152, 7.318224191665649, 8.57210910320282, 8.556124210357666, 7.01634418964386, 5.2021918296813965, 4.593203812837601, 4.4080811738967896, 4.860165357589722, 3.447446435689926, 4.471416532993317, 3.3014023900032043, 2.902692526578903, 3.367657423019409, 4.03450483083725, 2.9484628438949585, 2.2557127252221107, 2.8968922197818756, 2.5365295708179474, 2.157317876815796, 1.735389031469822, 1.698147714138031, 1.8288256227970123, 2.5120953917503357, 2.5171812176704407, 1.5335412621498108, 1.8758874237537384, 1.606706902384758, 1.7897180914878845, 1.9879519641399384, 2.1257758736610413, 1.828156739473343, 1.4034691751003265, 2.0481031239032745, 1.9463901221752167, 1.5703929513692856, 1.6625409424304962, 1.5803122818470001, 2.285522222518921, 2.0688994228839874, 1.7886793613433838, 2.3268737196922302, 1.9404160678386688, 
2.2393666207790375, 1.6875140368938446, 1.587553322315216, 1.8334889709949493, 1.5643702745437622, 1.4552996307611465, 1.6651882827281952, 1.4685129672288895, 1.7361795008182526, 1.621654212474823], 'dev_loss_history': [12.217952728271484, 8.176359176635742, 7.451809883117676, 7.464598178863525, 7.330676555633545, 5.898077011108398, 5.496520519256592, 5.2129292488098145, 4.9869303703308105, 4.855195045471191, 4.352779865264893, 4.364665508270264, 4.251131057739258, 3.9291062355041504, 4.368889808654785, 3.6790337562561035, 3.864961862564087, 3.8492608070373535, 3.9649171829223633, 3.6375184059143066, 3.636472463607788, 3.7137885093688965, 3.495877265930176, 3.547339677810669, 3.695451259613037, 3.5949974060058594, 3.624786376953125, 3.486685276031494, 3.5309135913848877, 3.4875831604003906, 3.4355287551879883, 3.4302756786346436, 3.4519147872924805, 3.4468681812286377, 3.4555253982543945, 3.467947483062744, 3.4721953868865967, 3.4747495651245117, 3.4640121459960938, 3.464979648590088, 3.46785044670105, 3.464182138442993, 3.464967966079712, 3.467527151107788, 3.4680516719818115, 3.467272996902466, 3.4656827449798584, 3.467986822128296, 3.4678850173950195, 3.4673101902008057, 3.46675705909729, 3.4661099910736084, 3.466641426086426, 3.4663610458374023, 3.4659790992736816]}
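The dictionary returned by training (printed above) records the dev score and loss per epoch, which makes it easy to see where the model stopped improving. A minimal sketch, assuming `result` holds that dictionary (abbreviated here to a handful of the values reported above):

```python
# Sketch: locate the best dev-set epoch in the training history.
# `result` is assumed to be the dictionary returned by training,
# abbreviated here to a few of the values printed above.
result = {
    'test_score': 0.6829268292682927,
    'dev_score_history': [0.0, 0.1538, 0.3478, 0.6923, 0.6667, 0.6667],
}

history = result['dev_score_history']
# epochs are 1-indexed in the log, hence the +1
best_epoch = max(range(len(history)), key=lambda i: history[i]) + 1
print(f"best dev score: {max(history):.4f} (epoch {best_epoch})")
```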
The quality of the trained model can be assessed using the metrics reported above, i.e.:

*tp* (true positives)
the number of words labeled in the test set with label $e$ that the model also labeled with $e$

*fp* (false positives)
the number of words not labeled in the test set with label $e$ that the model labeled with $e$

*fn* (false negatives)
the number of words labeled in the test set with label $e$ to which the model did not assign label $e$

*precision*
$$\frac{tp}{tp + fp}$$

*recall*
$$\frac{tp}{tp + fn}$$

$F_1$
$$\frac{2 \cdot precision \cdot recall}{precision + recall}$$

*micro* $F_1$
$F_1$ in which $tp$, $fp$, and $fn$ are counted jointly over all labels, i.e. $tp = \sum_{e}{{tp}_e}$, $fn = \sum_{e}{{fn}_e}$, $fp = \sum_{e}{{fp}_e}$

*macro* $F_1$
the arithmetic mean of the $F_1$ scores computed for each label separately.
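The definitions above can be verified directly against the per-class counts in the report. A minimal sketch, using three of the classes reported earlier (the micro/macro values below therefore cover only this subset, not the full report):

```python
# Sketch: micro- and macro-averaged F1 computed from tp/fp/fn counts,
# following the definitions above. The counts are a subset of the
# per-class results reported by the evaluation.
counts = {
    'greeting':    {'tp': 18, 'fp': 3, 'fn': 2},
    'appointment': {'tp': 19, 'fp': 4, 'fn': 1},
    'datetime':    {'tp': 12, 'fp': 7, 'fn': 6},
}

def f1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

# micro F1: pool the counts over all labels, then compute one F1
tp = sum(c['tp'] for c in counts.values())
fp = sum(c['fp'] for c in counts.values())
fn = sum(c['fn'] for c in counts.values())
micro_f1 = f1(tp, fp, fn)

# macro F1: average the per-label F1 scores
macro_f1 = sum(f1(**c) for c in counts.values()) / len(counts)
```

The per-label F1 values produced by `f1` here (0.8780, 0.8837, 0.6486) match the ones in the report above.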
The trained model can be read back from a file using the load method.
model = SequenceTagger.load('slot-model/final-model.pt')
2021-05-16 19:15:34,133 loading file slot-model/final-model.pt
The loaded model can be used to predict slots in user utterances with the predict function shown below.
def predict(model, sentence):
    # wrap each word in a CoNLL-U-style token dictionary
    csentence = [{'form': word} for word in sentence]
    # convert it to a flair Sentence using the conllu2flair helper defined earlier
    fsentence = conllu2flair([csentence])[0]
    # annotate the sentence in place
    model.predict(fsentence)
    # pair each input word with its predicted slot tag
    return [(token, ftoken.get_tag('slot').value) for token, ftoken in zip(sentence, fsentence)]
As the example below shows, a model trained on only 100 examples still makes mistakes even in a fairly simple utterance: it leaves the word dzisiaj outside the datetime slot and opens slots with I- tags that have no preceding B- tag.
tabulate(predict(model, ' dzien dobry poprosze wizytę do doktor lekarza rodzinnego najlepiej dzisiaj w godzinach popołudniowych dziś albo jutro internisty'.split()), tablefmt='html')
dzien | B-greeting |
dobry | I-greeting |
poprosze | O |
wizytę | B-appointment |
do | O |
doktor | B-appointment/doctor |
lekarza | B-appointment/doctor |
rodzinnego | I-appointment/doctor |
najlepiej | O |
dzisiaj | O |
w | O |
godzinach | I-datetime |
popołudniowych | I-datetime |
dziś | B-datetime |
albo | O |
jutro | I-datetime |
internisty | I-appointment/doctor |
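To act on the dialog, the token-level tags returned by predict still have to be collapsed into (slot, value) pairs following the IOB scheme introduced above. A minimal sketch, assuming a hypothetical helper named `extract_slots` (tags that violate the scheme, such as the dangling I-datetime tags in the output above, are simply skipped):

```python
# Sketch: collapse a (token, IOB-tag) sequence, as returned by predict(),
# into (slot, value) pairs.
def extract_slots(tagged):
    slots, current = [], None
    for token, tag in tagged:
        if tag.startswith('B-'):
            # a new slot begins; flush any slot that is still open
            if current:
                slots.append(current)
            current = [tag[2:], [token]]
        elif tag.startswith('I-') and current and tag[2:] == current[0]:
            # continuation of the currently open slot
            current[1].append(token)
        else:
            # 'O', or an I- tag that does not continue the open slot
            if current:
                slots.append(current)
            current = None
    if current:
        slots.append(current)
    return [(slot, ' '.join(words)) for slot, words in slots]

extract_slots([('dzien', 'B-greeting'), ('dobry', 'I-greeting'),
               ('wizytę', 'B-appointment'), ('na', 'O'),
               ('jutro', 'B-datetime')])
# → [('greeting', 'dzien dobry'), ('appointment', 'wizytę'), ('datetime', 'jutro')]
```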
Literature
- Sebastian Schuster, Sonal Gupta, Rushin Shah, Mike Lewis, Cross-lingual Transfer Learning for Multilingual Task Oriented Dialog. NAACL-HLT (1) 2019, pp. 3795-3805
- John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML '01). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 282–289, https://repository.upenn.edu/cgi/viewcontent.cgi?article=1162&context=cis_papers
- Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Comput. 9, 8 (November 15, 1997), 1735–1780, https://doi.org/10.1162/neco.1997.9.8.1735
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, Attention is All you Need, NIPS 2017, pp. 5998-6008, https://arxiv.org/abs/1706.03762
- Alan Akbik, Duncan Blythe, Roland Vollgraf, Contextual String Embeddings for Sequence Labeling, Proceedings of the 27th International Conference on Computational Linguistics, pp. 1638–1649, https://www.aclweb.org/anthology/C18-1139.pdf