System_Dialogowy_Janet/07-parsing-semantyczny-uczenie.ipynb
2021-05-16 19:26:52 +02:00


Semantic parsing using machine learning techniques

Introduction

The problem of detecting slots and their values in user utterances can be formulated as a sequence labelling task: for each word, predict a label indicating whether the word belongs to a slot and, if so, to which one.

chciałbym zarezerwować stolik na jutro**/day** na godzinę dwunastą**/hour** czterdzieści**/hour** pięć**/hour** na pięć**/size** osób

(English: "I would like to book a table for tomorrow at twelve forty-five for five people.")

Slot boundaries are marked using a chosen labelling scheme.

IOB scheme

| Prefix | Meaning |
|--------|---------|
| I | inside a slot |
| O | outside any slot |
| B | beginning of a slot |

chciałbym zarezerwować stolik na jutro**/B-day** na godzinę dwunastą**/B-hour** czterdzieści**/I-hour** pięć**/I-hour** na pięć**/B-size** osób
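
The IOB labelling of the utterance above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original notebook: the helper `spans_to_iob` and the `(start, end, slot)` span format with an exclusive end index are assumptions made for illustration.

```python
def spans_to_iob(tokens, spans):
    """Assign IOB tags to tokens given (start, end, slot) spans; end is exclusive."""
    tags = ['O'] * len(tokens)          # every token starts outside any slot
    for start, end, slot in spans:
        tags[start] = f'B-{slot}'       # first token of the slot
        for i in range(start + 1, end):
            tags[i] = f'I-{slot}'       # remaining tokens of the slot
    return tags

tokens = ('chciałbym zarezerwować stolik na jutro na godzinę '
          'dwunastą czterdzieści pięć na pięć osób').split()
# Hypothetical span annotation matching the example utterance
spans = [(4, 5, 'day'), (7, 10, 'hour'), (11, 12, 'size')]

iob_tags = spans_to_iob(tokens, spans)
for token, tag in zip(tokens, iob_tags):
    print(f'{token}/{tag}')
```

Words outside any span keep the tag `O`, so the model only needs to predict one label per word.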

IOBES scheme

| Prefix | Meaning |
|--------|---------|
| I | inside a slot |
| O | outside any slot |
| B | beginning of a slot |
| E | end of a slot |
| S | single-word slot |

chciałbym zarezerwować stolik na jutro**/S-day** na godzinę dwunastą**/B-hour** czterdzieści**/I-hour** pięć**/E-hour** na pięć**/S-size** osób
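
An IOB sequence carries enough information to derive the IOBES labels. The sketch below shows one possible conversion; the function name `iob_to_iobes` is hypothetical, and the code assumes a well-formed IOB input in which every `I-` tag continues a slot of the same name.

```python
def iob_to_iobes(tags):
    """Convert a well-formed IOB tag sequence to IOBES."""
    iobes = []
    for i, tag in enumerate(tags):
        if tag == 'O':
            iobes.append(tag)
            continue
        prefix, slot = tag.split('-', 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else 'O'
        continues = next_tag == f'I-{slot}'   # does the slot extend to the next word?
        if prefix == 'B':
            iobes.append(f'B-{slot}' if continues else f'S-{slot}')
        else:
            iobes.append(f'I-{slot}' if continues else f'E-{slot}')
    return iobes

iob = ['O', 'O', 'O', 'O', 'B-day', 'O', 'O',
       'B-hour', 'I-hour', 'I-hour', 'O', 'B-size', 'O']
iobes = iob_to_iobes(iob)
print(iobes)
```

The extra `E` and `S` prefixes make slot ends explicit, at the cost of a larger label set.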

If we prepare a dataset of user utterances with annotated slots for this task (a so-called _training set_), we can apply (supervised) machine learning techniques to build a model that annotates user utterances with slot labels.

Such a model can be built using, among others:

  1. conditional random fields (Lafferty et al., 2001),

  2. recurrent neural networks, e.g. LSTM networks (Hochreiter and Schmidhuber, 1997),

  3. transformers (Vaswani et al., 2017).

Example

We will use the dataset prepared by Schuster (2019).

!mkdir -p l07
%cd l07
!curl -L -C -  https://fb.me/multilingual_task_oriented_data  -o data.zip
%cd ..
C:\Users\Ania\Desktop\System_Dialogowy_Janet\l07
C:\Users\Ania\Desktop\System_Dialogowy_Janet

This dataset contains utterances in three languages annotated with slots for twelve frames belonging to three domains: Alarm, Reminder and Weather. We will load the data using the conllu library.

!pip3 install conllu
from conllu import parse_incr

# Columns of the input file: token id, word form, frame label, slot label
fields = ['id', 'form', 'frame', 'slot']

def nolabel2o(line, i):
    # Map the 'NoLabel' placeholder to the outside tag 'O'
    return 'O' if line[i] == 'NoLabel' else line[i]

with open('Janet.conllu', encoding='utf-8') as trainfile:
    trainset = list(parse_incr(trainfile, fields=fields, field_parsers={'slot': nolabel2o}))
# Note: the test set is read from the same file as the training set
with open('Janet.conllu', encoding='utf-8') as testfile:
    testset = list(parse_incr(testfile, fields=fields, field_parsers={'slot': nolabel2o}))

Let us look at an example utterance from this dataset.

!pip3 install tabulate
from tabulate import tabulate
tabulate(trainset[0], tablefmt='html')
1  chciałem   appointment/request_prescription  O
2  prosić     appointment/request_prescription  O
3  o          appointment/request_prescription  O
4  wypisanie  appointment/request_prescription  O
5  kolejnej   appointment/request_prescription  O
6  recepty    appointment/request_prescription  B-prescription
7  na         appointment/request_prescription  O
8  lek        appointment/request_prescription  B-prescription/type
9  x          appointment/request_prescription  I-prescription/type
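
To turn per-word labels such as those in the table above back into slot values, consecutive `B-`/`I-` tags can be grouped. The following is a minimal sketch under the assumption that the tags follow the IOB scheme; the helper `iob_to_slots` is hypothetical and not part of the notebook. An `I-` tag that does not continue a slot of the same name is treated as outside.

```python
def iob_to_slots(tokens, tags):
    """Group IOB-tagged tokens into (slot name, slot value) pairs."""
    slots = []
    current = None  # [slot name, list of words] of the slot being built
    for token, tag in zip(tokens, tags):
        if tag.startswith('B-'):
            current = [tag[2:], [token]]
            slots.append(current)
        elif tag.startswith('I-') and current is not None and current[0] == tag[2:]:
            current[1].append(token)
        else:
            current = None  # 'O' or an I- tag without a matching B- tag
    return [(name, ' '.join(words)) for name, words in slots]

tokens = ['chciałem', 'prosić', 'o', 'wypisanie', 'kolejnej',
          'recepty', 'na', 'lek', 'x']
tags = ['O', 'O', 'O', 'O', 'O',
        'B-prescription', 'O', 'B-prescription/type', 'I-prescription/type']

slots = iob_to_slots(tokens, tags)
print(slots)
```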

To demonstrate the training process in a Jupyter notebook, we will restrict the dataset to its initial examples.

To build the model we will use an architecture based on recurrent neural networks, as implemented in the flair library (Akbik et al., 2018).

!pip3 install flair
from flair.data import Corpus, Sentence, Token
from flair.datasets import SentenceDataset
from flair.embeddings import StackedEmbeddings
from flair.embeddings import WordEmbeddings
from flair.embeddings import CharacterEmbeddings
from flair.embeddings import FlairEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

!pip3 install torch
# make the computations deterministic
import random
import torch
random.seed(42)
torch.manual_seed(42)

if torch.cuda.is_available():
    torch.cuda.manual_seed(0)
    torch.cuda.manual_seed_all(0)
    torch.backends.cudnn.enabled = False
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True

We will convert the data to the format used by flair with the following function.

def conllu2flair(sentences, label=None):
    fsentences = []

    for sentence in sentences:
        fsentence = Sentence()

        for token in sentence:
            ftoken = Token(token['form'])

            if label:
                ftoken.add_tag(label, token[label])

            fsentence.add_token(ftoken)

        fsentences.append(fsentence)

    return SentenceDataset(fsentences)

corpus = Corpus(train=conllu2flair(trainset, 'slot'), test=conllu2flair(testset, 'slot'))
print(corpus)
tag_dictionary = corpus.make_tag_dictionary(tag_type='slot')
print(tag_dictionary)
Corpus: 99 train + 11 dev + 110 test sentences
Dictionary with 31 tags: <unk>, O, B-prescription, B-prescription/type, I-prescription/type, B-end_conversation, B-deny, I-end_conversation, B-greeting, I-greeting, B-appointment, B-appointment/doctor, I-appointment/doctor, B-datetime, NoLabel I-end_conversation, I-datetime, B-affirm, B-appointment/office, I-B-datetime, B-results, B-appointment/type, I-appointment/type, B-register/email, B-doctor, I-affirm, B-appoinment/doctor, B-appoinment, B-register/name, I-register/name, <START>
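
The tag dictionary printed above contains a few labels that do not look like valid IOB tags, for example `I-B-datetime` and `NoLabel I-end_conversation`. A simple check along the following lines could flag such annotation errors before training. This is a sketch only: the helper `malformed_tags`, the regular expression, and the set of special tags are assumptions, and the tag list is a subset copied from the output above.

```python
import re

# A well-formed tag is 'O' or a B-/I- prefix followed by a slot name
# that contains no whitespace and does not itself start with B- or I-.
WELL_FORMED = re.compile(r'O|[BI]-(?![BI]-)\S+')
SPECIAL = {'<unk>', '<START>', '<STOP>'}  # assumed technical tags, skipped by the check

def malformed_tags(tags):
    """Return the tags that are neither special nor well-formed IOB labels."""
    return [t for t in tags
            if t not in SPECIAL and WELL_FORMED.fullmatch(t) is None]

tags = ['<unk>', 'O', 'B-prescription', 'B-prescription/type',
        'I-prescription/type', 'B-datetime', 'NoLabel I-end_conversation',
        'I-datetime', 'I-B-datetime', 'B-appoinment', '<START>']

bad = malformed_tags(tags)
print(bad)
```

A purely syntactic check like this does not catch misspelled slot names; `B-appoinment` passes even though the dictionary also contains `B-appointment`.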

Our model will use vector representations of words (see Word Embeddings).

embedding_types = [
    WordEmbeddings('pl'),
    FlairEmbeddings('pl-forward'),
    FlairEmbeddings('pl-backward'),
    CharacterEmbeddings(),
]

embeddings = StackedEmbeddings(embeddings=embedding_types)
tagger = SequenceTagger(hidden_size=256, embeddings=embeddings,
                        tag_dictionary=tag_dictionary,
                        tag_type='slot', use_crf=True)

Let us see the architecture of the neural network that will be responsible for predicting slots in utterances.

print(tagger)
SequenceTagger(
  (embeddings): StackedEmbeddings(
    (list_embedding_0): WordEmbeddings('pl')
    (list_embedding_1): FlairEmbeddings(
      (lm): LanguageModel(
        (drop): Dropout(p=0.25, inplace=False)
        (encoder): Embedding(1602, 100)
        (rnn): LSTM(100, 2048)
        (decoder): Linear(in_features=2048, out_features=1602, bias=True)
      )
    )
    (list_embedding_2): FlairEmbeddings(
      (lm): LanguageModel(
        (drop): Dropout(p=0.25, inplace=False)
        (encoder): Embedding(1602, 100)
        (rnn): LSTM(100, 2048)
        (decoder): Linear(in_features=2048, out_features=1602, bias=True)
      )
    )
    (list_embedding_3): CharacterEmbeddings(
      (char_embedding): Embedding(275, 25)
      (char_rnn): LSTM(25, 25, bidirectional=True)
    )
  )
  (word_dropout): WordDropout(p=0.05)
  (locked_dropout): LockedDropout(p=0.5)
  (embedding2nn): Linear(in_features=4446, out_features=4446, bias=True)
  (rnn): LSTM(4446, 256, batch_first=True, bidirectional=True)
  (linear): Linear(in_features=512, out_features=31, bias=True)
  (beta): 1.0
  (weights): None
  (weight_tensor) None
)
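
The layer sizes in the printout above follow from the stacked embeddings. The arithmetic can be checked as follows; the 300-dimensional size of `WordEmbeddings('pl')` is an assumption (it does not appear in the printout), while the remaining numbers are read directly from the model description.

```python
word_dim = 300        # assumption: size of the vectors behind WordEmbeddings('pl')
flair_dim = 2048      # hidden size of each character language model: LSTM(100, 2048)
char_dim = 2 * 25     # bidirectional character LSTM(25, 25): two directions of 25

# StackedEmbeddings concatenates all embeddings for every token
stacked_dim = word_dim + 2 * flair_dim + char_dim   # forward + backward flair models

# The tagger's bidirectional LSTM with hidden_size=256 outputs both directions
rnn_out_dim = 2 * 256

print(stacked_dim)   # input size of embedding2nn and of the tagger LSTM
print(rnn_out_dim)   # input size of the final linear layer
```

Under that assumption the sum matches `in_features=4446` of `embedding2nn`, and the doubled hidden size matches `in_features=512` of the final linear layer, which maps each token to scores over the 31 tags.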

We will train for at most 100 iterations (epochs), as set by max_epochs, and save the resulting model in the slot-model directory.

trainer = ModelTrainer(tagger, corpus)
trainer.train('slot-model',
              learning_rate=0.1,
              mini_batch_size=32,
              max_epochs=100,
              train_with_dev=False)
2021-05-16 19:11:09,838 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:09,846 Corpus: "Corpus: 99 train + 11 dev + 110 test sentences"
2021-05-16 19:11:09,846 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:09,846 Parameters:
2021-05-16 19:11:09,854  - learning_rate: "0.1"
2021-05-16 19:11:09,854  - mini_batch_size: "32"
2021-05-16 19:11:09,854  - patience: "3"
2021-05-16 19:11:09,854  - anneal_factor: "0.5"
2021-05-16 19:11:09,854  - max_epochs: "100"
2021-05-16 19:11:09,854  - shuffle: "True"
2021-05-16 19:11:09,854  - train_with_dev: "False"
2021-05-16 19:11:09,854  - batch_growth_annealing: "False"
2021-05-16 19:11:09,862 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:09,862 Model training base path: "slot-model"
2021-05-16 19:11:09,862 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:09,862 Device: cpu
2021-05-16 19:11:09,862 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:09,862 Embeddings storage mode: cpu
2021-05-16 19:11:09,870 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:12,779 epoch 1 - iter 1/4 - loss 23.51556206 - samples/sec: 11.00 - lr: 0.100000
2021-05-16 19:11:16,270 epoch 1 - iter 2/4 - loss 19.95522118 - samples/sec: 9.17 - lr: 0.100000
2021-05-16 19:11:19,989 epoch 1 - iter 3/4 - loss 18.64025307 - samples/sec: 8.64 - lr: 0.100000
2021-05-16 19:11:20,665 epoch 1 - iter 4/4 - loss 16.56225991 - samples/sec: 47.34 - lr: 0.100000
2021-05-16 19:11:20,665 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:20,665 EPOCH 1 done: loss 16.5623 - lr 0.1000000
2021-05-16 19:11:23,175 DEV : loss 12.217952728271484 - score 0.0
2021-05-16 19:11:23,175 BAD EPOCHS (no improvement): 0
saving best model
2021-05-16 19:11:31,472 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:32,200 epoch 2 - iter 1/4 - loss 13.48146439 - samples/sec: 44.15 - lr: 0.100000
2021-05-16 19:11:32,902 epoch 2 - iter 2/4 - loss 13.13387251 - samples/sec: 45.60 - lr: 0.100000
2021-05-16 19:11:33,485 epoch 2 - iter 3/4 - loss 12.05493037 - samples/sec: 54.92 - lr: 0.100000
2021-05-16 19:11:33,672 epoch 2 - iter 4/4 - loss 10.83767450 - samples/sec: 170.46 - lr: 0.100000
2021-05-16 19:11:33,672 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:33,672 EPOCH 2 done: loss 10.8377 - lr 0.1000000
2021-05-16 19:11:33,768 DEV : loss 8.176359176635742 - score 0.0
2021-05-16 19:11:33,771 BAD EPOCHS (no improvement): 0
saving best model
2021-05-16 19:11:42,363 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:43,054 epoch 3 - iter 1/4 - loss 9.78410912 - samples/sec: 46.31 - lr: 0.100000
2021-05-16 19:11:43,672 epoch 3 - iter 2/4 - loss 9.88690376 - samples/sec: 51.75 - lr: 0.100000
2021-05-16 19:11:44,405 epoch 3 - iter 3/4 - loss 9.67457644 - samples/sec: 43.69 - lr: 0.100000
2021-05-16 19:11:44,589 epoch 3 - iter 4/4 - loss 8.94925010 - samples/sec: 173.35 - lr: 0.100000
2021-05-16 19:11:44,589 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:44,589 EPOCH 3 done: loss 8.9493 - lr 0.1000000
2021-05-16 19:11:44,693 DEV : loss 7.451809883117676 - score 0.0
2021-05-16 19:11:44,693 BAD EPOCHS (no improvement): 0
saving best model
2021-05-16 19:11:53,845 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:54,437 epoch 4 - iter 1/4 - loss 8.59626198 - samples/sec: 55.55 - lr: 0.100000
2021-05-16 19:11:55,150 epoch 4 - iter 2/4 - loss 8.40540457 - samples/sec: 44.85 - lr: 0.100000
2021-05-16 19:11:55,995 epoch 4 - iter 3/4 - loss 8.39408366 - samples/sec: 37.88 - lr: 0.100000
2021-05-16 19:11:56,222 epoch 4 - iter 4/4 - loss 7.31822419 - samples/sec: 141.22 - lr: 0.100000
2021-05-16 19:11:56,222 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:56,222 EPOCH 4 done: loss 7.3182 - lr 0.1000000
2021-05-16 19:11:56,309 DEV : loss 7.464598178863525 - score 0.0
2021-05-16 19:11:56,309 BAD EPOCHS (no improvement): 1
2021-05-16 19:11:56,309 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:57,036 epoch 5 - iter 1/4 - loss 7.71572590 - samples/sec: 44.96 - lr: 0.100000
2021-05-16 19:11:57,744 epoch 5 - iter 2/4 - loss 8.43728781 - samples/sec: 45.20 - lr: 0.100000
2021-05-16 19:11:58,488 epoch 5 - iter 3/4 - loss 7.66639407 - samples/sec: 43.01 - lr: 0.100000
2021-05-16 19:11:58,705 epoch 5 - iter 4/4 - loss 8.57210910 - samples/sec: 147.23 - lr: 0.100000
2021-05-16 19:11:58,705 ----------------------------------------------------------------------------------------------------
2021-05-16 19:11:58,705 EPOCH 5 done: loss 8.5721 - lr 0.1000000
2021-05-16 19:11:58,801 DEV : loss 7.330676555633545 - score 0.0645
2021-05-16 19:11:58,809 BAD EPOCHS (no improvement): 0
saving best model
2021-05-16 19:12:09,132 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:10,066 epoch 6 - iter 1/4 - loss 6.82695341 - samples/sec: 34.26 - lr: 0.100000
2021-05-16 19:12:10,923 epoch 6 - iter 2/4 - loss 6.71814942 - samples/sec: 37.31 - lr: 0.100000
2021-05-16 19:12:11,835 epoch 6 - iter 3/4 - loss 7.02111626 - samples/sec: 35.09 - lr: 0.100000
2021-05-16 19:12:12,029 epoch 6 - iter 4/4 - loss 8.55612421 - samples/sec: 165.49 - lr: 0.100000
2021-05-16 19:12:12,029 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:12,029 EPOCH 6 done: loss 8.5561 - lr 0.1000000
2021-05-16 19:12:12,117 DEV : loss 5.898077011108398 - score 0.0
2021-05-16 19:12:12,117 BAD EPOCHS (no improvement): 1
2021-05-16 19:12:12,117 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:12,829 epoch 7 - iter 1/4 - loss 3.95063305 - samples/sec: 45.47 - lr: 0.100000
2021-05-16 19:12:13,605 epoch 7 - iter 2/4 - loss 4.73969674 - samples/sec: 41.22 - lr: 0.100000
2021-05-16 19:12:14,424 epoch 7 - iter 3/4 - loss 6.22298797 - samples/sec: 39.08 - lr: 0.100000
2021-05-16 19:12:14,648 epoch 7 - iter 4/4 - loss 7.01634419 - samples/sec: 142.74 - lr: 0.100000
2021-05-16 19:12:14,648 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:14,648 EPOCH 7 done: loss 7.0163 - lr 0.1000000
2021-05-16 19:12:14,745 DEV : loss 5.496520519256592 - score 0.1538
2021-05-16 19:12:14,745 BAD EPOCHS (no improvement): 0
saving best model
2021-05-16 19:12:24,553 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:25,305 epoch 8 - iter 1/4 - loss 5.84166050 - samples/sec: 43.01 - lr: 0.100000
2021-05-16 19:12:26,009 epoch 8 - iter 2/4 - loss 5.58190751 - samples/sec: 45.43 - lr: 0.100000
2021-05-16 19:12:26,803 epoch 8 - iter 3/4 - loss 6.09121291 - samples/sec: 40.28 - lr: 0.100000
2021-05-16 19:12:27,011 epoch 8 - iter 4/4 - loss 5.20219183 - samples/sec: 153.85 - lr: 0.100000
2021-05-16 19:12:27,011 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:27,011 EPOCH 8 done: loss 5.2022 - lr 0.1000000
2021-05-16 19:12:27,099 DEV : loss 5.2129292488098145 - score 0.3478
2021-05-16 19:12:27,099 BAD EPOCHS (no improvement): 0
saving best model
2021-05-16 19:12:37,200 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:37,968 epoch 9 - iter 1/4 - loss 6.38291883 - samples/sec: 41.64 - lr: 0.100000
2021-05-16 19:12:38,703 epoch 9 - iter 2/4 - loss 6.26358747 - samples/sec: 43.56 - lr: 0.100000
2021-05-16 19:12:39,284 epoch 9 - iter 3/4 - loss 5.50593615 - samples/sec: 55.03 - lr: 0.100000
2021-05-16 19:12:39,476 epoch 9 - iter 4/4 - loss 4.59320381 - samples/sec: 166.66 - lr: 0.100000
2021-05-16 19:12:39,476 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:39,476 EPOCH 9 done: loss 4.5932 - lr 0.1000000
2021-05-16 19:12:39,580 DEV : loss 4.9869303703308105 - score 0.2609
2021-05-16 19:12:39,580 BAD EPOCHS (no improvement): 1
2021-05-16 19:12:39,590 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:40,387 epoch 10 - iter 1/4 - loss 4.83267832 - samples/sec: 40.15 - lr: 0.100000
2021-05-16 19:12:41,158 epoch 10 - iter 2/4 - loss 4.78956985 - samples/sec: 41.52 - lr: 0.100000
2021-05-16 19:12:41,792 epoch 10 - iter 3/4 - loss 4.80196079 - samples/sec: 50.47 - lr: 0.100000
2021-05-16 19:12:41,993 epoch 10 - iter 4/4 - loss 4.40808117 - samples/sec: 158.79 - lr: 0.100000
2021-05-16 19:12:42,001 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:42,001 EPOCH 10 done: loss 4.4081 - lr 0.1000000
2021-05-16 19:12:42,089 DEV : loss 4.855195045471191 - score 0.3077
2021-05-16 19:12:42,089 BAD EPOCHS (no improvement): 2
2021-05-16 19:12:42,097 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:42,780 epoch 11 - iter 1/4 - loss 3.66451931 - samples/sec: 46.87 - lr: 0.100000
2021-05-16 19:12:43,666 epoch 11 - iter 2/4 - loss 4.65244174 - samples/sec: 36.16 - lr: 0.100000
2021-05-16 19:12:44,456 epoch 11 - iter 3/4 - loss 4.58611314 - samples/sec: 40.51 - lr: 0.100000
2021-05-16 19:12:44,648 epoch 11 - iter 4/4 - loss 4.86016536 - samples/sec: 166.85 - lr: 0.100000
2021-05-16 19:12:44,656 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:44,656 EPOCH 11 done: loss 4.8602 - lr 0.1000000
2021-05-16 19:12:44,737 DEV : loss 4.352779865264893 - score 0.3478
2021-05-16 19:12:44,745 BAD EPOCHS (no improvement): 0
saving best model
2021-05-16 19:12:53,094 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:53,668 epoch 12 - iter 1/4 - loss 3.02415586 - samples/sec: 55.76 - lr: 0.100000
2021-05-16 19:12:54,381 epoch 12 - iter 2/4 - loss 3.78920162 - samples/sec: 44.90 - lr: 0.100000
2021-05-16 19:12:55,097 epoch 12 - iter 3/4 - loss 4.02983785 - samples/sec: 44.67 - lr: 0.100000
2021-05-16 19:12:55,304 epoch 12 - iter 4/4 - loss 3.44744644 - samples/sec: 154.91 - lr: 0.100000
2021-05-16 19:12:55,304 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:55,304 EPOCH 12 done: loss 3.4474 - lr 0.1000000
2021-05-16 19:12:55,402 DEV : loss 4.364665508270264 - score 0.3333
2021-05-16 19:12:55,402 BAD EPOCHS (no improvement): 1
2021-05-16 19:12:55,414 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:56,003 epoch 13 - iter 1/4 - loss 4.21208715 - samples/sec: 54.32 - lr: 0.100000
2021-05-16 19:12:56,765 epoch 13 - iter 2/4 - loss 4.02075458 - samples/sec: 42.01 - lr: 0.100000
2021-05-16 19:12:57,528 epoch 13 - iter 3/4 - loss 3.93069355 - samples/sec: 41.92 - lr: 0.100000
2021-05-16 19:12:57,757 epoch 13 - iter 4/4 - loss 4.47141653 - samples/sec: 139.66 - lr: 0.100000
2021-05-16 19:12:57,757 ----------------------------------------------------------------------------------------------------
2021-05-16 19:12:57,757 EPOCH 13 done: loss 4.4714 - lr 0.1000000
2021-05-16 19:12:57,856 DEV : loss 4.251131057739258 - score 0.4615
2021-05-16 19:12:57,856 BAD EPOCHS (no improvement): 0
saving best model
2021-05-16 19:13:07,766 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:08,603 epoch 14 - iter 1/4 - loss 4.07004356 - samples/sec: 38.23 - lr: 0.100000
2021-05-16 19:13:09,137 epoch 14 - iter 2/4 - loss 3.58775365 - samples/sec: 60.00 - lr: 0.100000
2021-05-16 19:13:09,805 epoch 14 - iter 3/4 - loss 3.37540340 - samples/sec: 49.04 - lr: 0.100000
2021-05-16 19:13:10,017 epoch 14 - iter 4/4 - loss 3.30140239 - samples/sec: 150.99 - lr: 0.100000
2021-05-16 19:13:10,017 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:10,017 EPOCH 14 done: loss 3.3014 - lr 0.1000000
2021-05-16 19:13:10,108 DEV : loss 3.9291062355041504 - score 0.4348
2021-05-16 19:13:10,108 BAD EPOCHS (no improvement): 1
2021-05-16 19:13:10,126 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:10,799 epoch 15 - iter 1/4 - loss 4.12087154 - samples/sec: 47.53 - lr: 0.100000
2021-05-16 19:13:11,479 epoch 15 - iter 2/4 - loss 3.45777619 - samples/sec: 47.09 - lr: 0.100000
2021-05-16 19:13:12,230 epoch 15 - iter 3/4 - loss 3.44035808 - samples/sec: 42.59 - lr: 0.100000
2021-05-16 19:13:12,392 epoch 15 - iter 4/4 - loss 2.90269253 - samples/sec: 197.83 - lr: 0.100000
2021-05-16 19:13:12,408 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:12,408 EPOCH 15 done: loss 2.9027 - lr 0.1000000
2021-05-16 19:13:12,498 DEV : loss 4.368889808654785 - score 0.6923
2021-05-16 19:13:12,498 BAD EPOCHS (no improvement): 0
saving best model
2021-05-16 19:13:22,020 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:22,716 epoch 16 - iter 1/4 - loss 2.49819446 - samples/sec: 45.95 - lr: 0.100000
2021-05-16 19:13:23,466 epoch 16 - iter 2/4 - loss 3.36824119 - samples/sec: 43.59 - lr: 0.100000
2021-05-16 19:13:24,067 epoch 16 - iter 3/4 - loss 3.36522110 - samples/sec: 53.20 - lr: 0.100000
2021-05-16 19:13:24,253 epoch 16 - iter 4/4 - loss 3.36765742 - samples/sec: 188.42 - lr: 0.100000
2021-05-16 19:13:24,253 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:24,253 EPOCH 16 done: loss 3.3677 - lr 0.1000000
2021-05-16 19:13:24,348 DEV : loss 3.6790337562561035 - score 0.5833
2021-05-16 19:13:24,348 BAD EPOCHS (no improvement): 1
2021-05-16 19:13:24,356 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:24,905 epoch 17 - iter 1/4 - loss 3.17663288 - samples/sec: 58.35 - lr: 0.100000
2021-05-16 19:13:25,620 epoch 17 - iter 2/4 - loss 3.24819005 - samples/sec: 44.73 - lr: 0.100000
2021-05-16 19:13:26,267 epoch 17 - iter 3/4 - loss 2.86507106 - samples/sec: 49.44 - lr: 0.100000
2021-05-16 19:13:26,483 epoch 17 - iter 4/4 - loss 4.03450483 - samples/sec: 160.21 - lr: 0.100000
2021-05-16 19:13:26,483 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:26,483 EPOCH 17 done: loss 4.0345 - lr 0.1000000
2021-05-16 19:13:26,579 DEV : loss 3.864961862564087 - score 0.6154
2021-05-16 19:13:26,580 BAD EPOCHS (no improvement): 2
2021-05-16 19:13:26,583 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:27,322 epoch 18 - iter 1/4 - loss 3.06332946 - samples/sec: 43.30 - lr: 0.100000
2021-05-16 19:13:27,901 epoch 18 - iter 2/4 - loss 3.11640310 - samples/sec: 55.27 - lr: 0.100000
2021-05-16 19:13:28,698 epoch 18 - iter 3/4 - loss 2.99107130 - samples/sec: 40.18 - lr: 0.100000
2021-05-16 19:13:28,898 epoch 18 - iter 4/4 - loss 2.94846284 - samples/sec: 160.00 - lr: 0.100000
2021-05-16 19:13:28,898 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:28,898 EPOCH 18 done: loss 2.9485 - lr 0.1000000
2021-05-16 19:13:28,986 DEV : loss 3.8492608070373535 - score 0.48
2021-05-16 19:13:28,994 BAD EPOCHS (no improvement): 3
2021-05-16 19:13:28,994 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:29,622 epoch 19 - iter 1/4 - loss 2.81688428 - samples/sec: 50.89 - lr: 0.100000
2021-05-16 19:13:30,354 epoch 19 - iter 2/4 - loss 2.99261010 - samples/sec: 44.72 - lr: 0.100000
2021-05-16 19:13:30,979 epoch 19 - iter 3/4 - loss 2.85697055 - samples/sec: 51.15 - lr: 0.100000
2021-05-16 19:13:31,139 epoch 19 - iter 4/4 - loss 2.25571273 - samples/sec: 200.02 - lr: 0.100000
2021-05-16 19:13:31,139 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:31,139 EPOCH 19 done: loss 2.2557 - lr 0.1000000
2021-05-16 19:13:31,235 DEV : loss 3.9649171829223633 - score 0.5185
Epoch    19: reducing learning rate of group 0 to 5.0000e-02.
2021-05-16 19:13:31,235 BAD EPOCHS (no improvement): 4
2021-05-16 19:13:31,242 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:31,906 epoch 20 - iter 1/4 - loss 3.35270214 - samples/sec: 48.22 - lr: 0.050000
2021-05-16 19:13:32,555 epoch 20 - iter 2/4 - loss 2.56608105 - samples/sec: 49.28 - lr: 0.050000
2021-05-16 19:13:33,131 epoch 20 - iter 3/4 - loss 2.33327313 - samples/sec: 56.37 - lr: 0.050000
2021-05-16 19:13:33,332 epoch 20 - iter 4/4 - loss 2.89689222 - samples/sec: 165.52 - lr: 0.050000
2021-05-16 19:13:33,340 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:33,340 EPOCH 20 done: loss 2.8969 - lr 0.0500000
2021-05-16 19:13:33,421 DEV : loss 3.6375184059143066 - score 0.56
2021-05-16 19:13:33,421 BAD EPOCHS (no improvement): 1
2021-05-16 19:13:33,421 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:34,102 epoch 21 - iter 1/4 - loss 2.03401089 - samples/sec: 47.65 - lr: 0.050000
2021-05-16 19:13:34,750 epoch 21 - iter 2/4 - loss 2.45254445 - samples/sec: 49.40 - lr: 0.050000
2021-05-16 19:13:35,405 epoch 21 - iter 3/4 - loss 2.02827569 - samples/sec: 48.84 - lr: 0.050000
2021-05-16 19:13:35,652 epoch 21 - iter 4/4 - loss 2.53652957 - samples/sec: 129.49 - lr: 0.050000
2021-05-16 19:13:35,652 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:35,652 EPOCH 21 done: loss 2.5365 - lr 0.0500000
2021-05-16 19:13:35,756 DEV : loss 3.636472463607788 - score 0.56
2021-05-16 19:13:35,756 BAD EPOCHS (no improvement): 2
2021-05-16 19:13:35,763 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:36,461 epoch 22 - iter 1/4 - loss 2.35593867 - samples/sec: 45.85 - lr: 0.050000
2021-05-16 19:13:37,157 epoch 22 - iter 2/4 - loss 1.78290999 - samples/sec: 45.97 - lr: 0.050000
2021-05-16 19:13:37,821 epoch 22 - iter 3/4 - loss 2.12207437 - samples/sec: 48.21 - lr: 0.050000
2021-05-16 19:13:38,014 epoch 22 - iter 4/4 - loss 2.15731788 - samples/sec: 165.55 - lr: 0.050000
2021-05-16 19:13:38,014 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:38,021 EPOCH 22 done: loss 2.1573 - lr 0.0500000
2021-05-16 19:13:38,108 DEV : loss 3.7137885093688965 - score 0.6667
2021-05-16 19:13:38,116 BAD EPOCHS (no improvement): 3
2021-05-16 19:13:38,116 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:38,822 epoch 23 - iter 1/4 - loss 1.83278751 - samples/sec: 45.53 - lr: 0.050000
2021-05-16 19:13:39,736 epoch 23 - iter 2/4 - loss 2.04161525 - samples/sec: 35.03 - lr: 0.050000
2021-05-16 19:13:40,684 epoch 23 - iter 3/4 - loss 2.19689337 - samples/sec: 33.76 - lr: 0.050000
2021-05-16 19:13:40,933 epoch 23 - iter 4/4 - loss 1.73538903 - samples/sec: 128.34 - lr: 0.050000
2021-05-16 19:13:40,934 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:40,934 EPOCH 23 done: loss 1.7354 - lr 0.0500000
2021-05-16 19:13:41,043 DEV : loss 3.495877265930176 - score 0.5833
Epoch    23: reducing learning rate of group 0 to 2.5000e-02.
2021-05-16 19:13:41,043 BAD EPOCHS (no improvement): 4
2021-05-16 19:13:41,051 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:41,949 epoch 24 - iter 1/4 - loss 2.58249235 - samples/sec: 35.62 - lr: 0.025000
2021-05-16 19:13:42,545 epoch 24 - iter 2/4 - loss 2.33847690 - samples/sec: 53.73 - lr: 0.025000
2021-05-16 19:13:43,209 epoch 24 - iter 3/4 - loss 2.05386758 - samples/sec: 48.20 - lr: 0.025000
2021-05-16 19:13:43,426 epoch 24 - iter 4/4 - loss 1.69814771 - samples/sec: 147.27 - lr: 0.025000
2021-05-16 19:13:43,426 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:43,426 EPOCH 24 done: loss 1.6981 - lr 0.0250000
2021-05-16 19:13:43,514 DEV : loss 3.547339677810669 - score 0.5833
2021-05-16 19:13:43,514 BAD EPOCHS (no improvement): 1
2021-05-16 19:13:43,514 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:44,502 epoch 25 - iter 1/4 - loss 2.63612175 - samples/sec: 32.67 - lr: 0.025000
2021-05-16 19:13:45,551 epoch 25 - iter 2/4 - loss 2.28528547 - samples/sec: 30.49 - lr: 0.025000
2021-05-16 19:13:46,368 epoch 25 - iter 3/4 - loss 2.18019919 - samples/sec: 39.20 - lr: 0.025000
2021-05-16 19:13:46,585 epoch 25 - iter 4/4 - loss 1.82882562 - samples/sec: 147.22 - lr: 0.025000
2021-05-16 19:13:46,585 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:46,585 EPOCH 25 done: loss 1.8288 - lr 0.0250000
2021-05-16 19:13:46,681 DEV : loss 3.695451259613037 - score 0.6667
2021-05-16 19:13:46,681 BAD EPOCHS (no improvement): 2
2021-05-16 19:13:46,681 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:47,435 epoch 26 - iter 1/4 - loss 2.46649575 - samples/sec: 42.90 - lr: 0.025000
2021-05-16 19:13:48,195 epoch 26 - iter 2/4 - loss 1.86319947 - samples/sec: 42.09 - lr: 0.025000
2021-05-16 19:13:49,101 epoch 26 - iter 3/4 - loss 1.99375129 - samples/sec: 35.34 - lr: 0.025000
2021-05-16 19:13:49,350 epoch 26 - iter 4/4 - loss 2.51209539 - samples/sec: 132.64 - lr: 0.025000
2021-05-16 19:13:49,350 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:49,350 EPOCH 26 done: loss 2.5121 - lr 0.0250000
2021-05-16 19:13:49,454 DEV : loss 3.5949974060058594 - score 0.6667
2021-05-16 19:13:49,457 BAD EPOCHS (no improvement): 3
2021-05-16 19:13:49,457 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:50,194 epoch 27 - iter 1/4 - loss 1.67152703 - samples/sec: 43.40 - lr: 0.025000
2021-05-16 19:13:50,906 epoch 27 - iter 2/4 - loss 1.81827271 - samples/sec: 44.95 - lr: 0.025000
2021-05-16 19:13:51,642 epoch 27 - iter 3/4 - loss 1.91284267 - samples/sec: 43.46 - lr: 0.025000
2021-05-16 19:13:51,834 epoch 27 - iter 4/4 - loss 2.51718122 - samples/sec: 166.65 - lr: 0.025000
2021-05-16 19:13:51,834 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:51,834 EPOCH 27 done: loss 2.5172 - lr 0.0250000
2021-05-16 19:13:51,930 DEV : loss 3.624786376953125 - score 0.6667
Epoch    27: reducing learning rate of group 0 to 1.2500e-02.
2021-05-16 19:13:51,930 BAD EPOCHS (no improvement): 4
2021-05-16 19:13:51,930 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:52,650 epoch 28 - iter 1/4 - loss 2.06657982 - samples/sec: 44.45 - lr: 0.012500
2021-05-16 19:13:53,405 epoch 28 - iter 2/4 - loss 2.16739893 - samples/sec: 42.42 - lr: 0.012500
2021-05-16 19:13:54,234 epoch 28 - iter 3/4 - loss 1.87206562 - samples/sec: 38.60 - lr: 0.012500
2021-05-16 19:13:54,402 epoch 28 - iter 4/4 - loss 1.53354126 - samples/sec: 190.48 - lr: 0.012500
2021-05-16 19:13:54,410 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:54,410 EPOCH 28 done: loss 1.5335 - lr 0.0125000
2021-05-16 19:13:54,498 DEV : loss 3.486685276031494 - score 0.6667
2021-05-16 19:13:54,498 BAD EPOCHS (no improvement): 1
2021-05-16 19:13:54,498 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:55,514 epoch 29 - iter 1/4 - loss 1.94683826 - samples/sec: 31.74 - lr: 0.012500
2021-05-16 19:13:56,355 epoch 29 - iter 2/4 - loss 1.87296987 - samples/sec: 38.03 - lr: 0.012500
2021-05-16 19:13:57,018 epoch 29 - iter 3/4 - loss 1.93602276 - samples/sec: 48.88 - lr: 0.012500
2021-05-16 19:13:57,202 epoch 29 - iter 4/4 - loss 1.87588742 - samples/sec: 173.70 - lr: 0.012500
2021-05-16 19:13:57,202 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:57,202 EPOCH 29 done: loss 1.8759 - lr 0.0125000
2021-05-16 19:13:57,298 DEV : loss 3.5309135913848877 - score 0.6667
2021-05-16 19:13:57,298 BAD EPOCHS (no improvement): 2
2021-05-16 19:13:57,298 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:58,250 epoch 30 - iter 1/4 - loss 2.16732407 - samples/sec: 33.90 - lr: 0.012500
2021-05-16 19:13:58,931 epoch 30 - iter 2/4 - loss 1.72622716 - samples/sec: 46.96 - lr: 0.012500
2021-05-16 19:13:59,781 epoch 30 - iter 3/4 - loss 1.93175316 - samples/sec: 37.65 - lr: 0.012500
2021-05-16 19:13:59,982 epoch 30 - iter 4/4 - loss 1.60670690 - samples/sec: 159.08 - lr: 0.012500
2021-05-16 19:13:59,990 ----------------------------------------------------------------------------------------------------
2021-05-16 19:13:59,990 EPOCH 30 done: loss 1.6067 - lr 0.0125000
2021-05-16 19:14:00,088 DEV : loss 3.4875831604003906 - score 0.6667
2021-05-16 19:14:00,096 BAD EPOCHS (no improvement): 3
2021-05-16 19:14:00,096 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:01,011 epoch 31 - iter 1/4 - loss 2.39419317 - samples/sec: 34.99 - lr: 0.012500
2021-05-16 19:14:01,826 epoch 31 - iter 2/4 - loss 1.94124657 - samples/sec: 39.64 - lr: 0.012500
2021-05-16 19:14:02,676 epoch 31 - iter 3/4 - loss 1.81396655 - samples/sec: 37.62 - lr: 0.012500
2021-05-16 19:14:02,876 epoch 31 - iter 4/4 - loss 1.78971809 - samples/sec: 166.69 - lr: 0.012500
2021-05-16 19:14:02,884 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:02,884 EPOCH 31 done: loss 1.7897 - lr 0.0125000
2021-05-16 19:14:02,961 DEV : loss 3.4355287551879883 - score 0.5833
Epoch    31: reducing learning rate of group 0 to 6.2500e-03.
2021-05-16 19:14:02,961 BAD EPOCHS (no improvement): 4
2021-05-16 19:14:02,976 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:03,838 epoch 32 - iter 1/4 - loss 1.18405724 - samples/sec: 37.13 - lr: 0.006250
2021-05-16 19:14:04,727 epoch 32 - iter 2/4 - loss 1.78029823 - samples/sec: 35.98 - lr: 0.006250
2021-05-16 19:14:05,416 epoch 32 - iter 3/4 - loss 1.71468850 - samples/sec: 46.96 - lr: 0.006250
2021-05-16 19:14:05,673 epoch 32 - iter 4/4 - loss 1.98795196 - samples/sec: 124.99 - lr: 0.006250
2021-05-16 19:14:05,673 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:05,673 EPOCH 32 done: loss 1.9880 - lr 0.0062500
2021-05-16 19:14:05,768 DEV : loss 3.4302756786346436 - score 0.5833
2021-05-16 19:14:05,776 BAD EPOCHS (no improvement): 1
2021-05-16 19:14:05,776 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:06,493 epoch 33 - iter 1/4 - loss 1.43548059 - samples/sec: 44.69 - lr: 0.006250
2021-05-16 19:14:07,307 epoch 33 - iter 2/4 - loss 1.70211828 - samples/sec: 39.28 - lr: 0.006250
2021-05-16 19:14:08,082 epoch 33 - iter 3/4 - loss 1.72906860 - samples/sec: 41.30 - lr: 0.006250
2021-05-16 19:14:08,343 epoch 33 - iter 4/4 - loss 2.12577587 - samples/sec: 122.39 - lr: 0.006250
2021-05-16 19:14:08,343 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:08,343 EPOCH 33 done: loss 2.1258 - lr 0.0062500
2021-05-16 19:14:08,431 DEV : loss 3.4519147872924805 - score 0.6667
2021-05-16 19:14:08,439 BAD EPOCHS (no improvement): 2
2021-05-16 19:14:08,439 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:09,154 epoch 34 - iter 1/4 - loss 1.07441115 - samples/sec: 44.79 - lr: 0.006250
2021-05-16 19:14:09,975 epoch 34 - iter 2/4 - loss 1.89638603 - samples/sec: 38.96 - lr: 0.006250
2021-05-16 19:14:10,993 epoch 34 - iter 3/4 - loss 1.81038960 - samples/sec: 31.45 - lr: 0.006250
2021-05-16 19:14:11,289 epoch 34 - iter 4/4 - loss 1.82815674 - samples/sec: 108.11 - lr: 0.006250
2021-05-16 19:14:11,289 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:11,289 EPOCH 34 done: loss 1.8282 - lr 0.0062500
2021-05-16 19:14:11,393 DEV : loss 3.4468681812286377 - score 0.6667
2021-05-16 19:14:11,393 BAD EPOCHS (no improvement): 3
2021-05-16 19:14:11,393 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:12,314 epoch 35 - iter 1/4 - loss 1.71202326 - samples/sec: 34.74 - lr: 0.006250
2021-05-16 19:14:13,347 epoch 35 - iter 2/4 - loss 2.02234995 - samples/sec: 30.99 - lr: 0.006250
2021-05-16 19:14:13,977 epoch 35 - iter 3/4 - loss 1.83293974 - samples/sec: 51.40 - lr: 0.006250
2021-05-16 19:14:14,155 epoch 35 - iter 4/4 - loss 1.40346918 - samples/sec: 188.15 - lr: 0.006250
2021-05-16 19:14:14,155 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:14,155 EPOCH 35 done: loss 1.4035 - lr 0.0062500
2021-05-16 19:14:14,251 DEV : loss 3.4555253982543945 - score 0.6667
Epoch    35: reducing learning rate of group 0 to 3.1250e-03.
2021-05-16 19:14:14,251 BAD EPOCHS (no improvement): 4
2021-05-16 19:14:14,251 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:15,020 epoch 36 - iter 1/4 - loss 1.60199451 - samples/sec: 41.61 - lr: 0.003125
2021-05-16 19:14:15,758 epoch 36 - iter 2/4 - loss 1.76909965 - samples/sec: 43.41 - lr: 0.003125
2021-05-16 19:14:16,694 epoch 36 - iter 3/4 - loss 1.96563844 - samples/sec: 34.46 - lr: 0.003125
2021-05-16 19:14:16,926 epoch 36 - iter 4/4 - loss 2.04810312 - samples/sec: 137.94 - lr: 0.003125
2021-05-16 19:14:16,926 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:16,926 EPOCH 36 done: loss 2.0481 - lr 0.0031250
2021-05-16 19:14:17,022 DEV : loss 3.467947483062744 - score 0.6667
2021-05-16 19:14:17,022 BAD EPOCHS (no improvement): 1
2021-05-16 19:14:17,022 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:17,771 epoch 37 - iter 1/4 - loss 1.59361398 - samples/sec: 42.71 - lr: 0.003125
2021-05-16 19:14:18,573 epoch 37 - iter 2/4 - loss 1.86242718 - samples/sec: 39.93 - lr: 0.003125
2021-05-16 19:14:19,367 epoch 37 - iter 3/4 - loss 1.84938045 - samples/sec: 40.27 - lr: 0.003125
2021-05-16 19:14:19,575 epoch 37 - iter 4/4 - loss 1.94639012 - samples/sec: 159.98 - lr: 0.003125
2021-05-16 19:14:19,575 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:19,575 EPOCH 37 done: loss 1.9464 - lr 0.0031250
2021-05-16 19:14:19,663 DEV : loss 3.4721953868865967 - score 0.6667
2021-05-16 19:14:19,663 BAD EPOCHS (no improvement): 2
2021-05-16 19:14:19,663 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:20,420 epoch 38 - iter 1/4 - loss 1.87127459 - samples/sec: 42.26 - lr: 0.003125
2021-05-16 19:14:21,214 epoch 38 - iter 2/4 - loss 1.65014571 - samples/sec: 40.34 - lr: 0.003125
2021-05-16 19:14:22,201 epoch 38 - iter 3/4 - loss 1.78922117 - samples/sec: 32.41 - lr: 0.003125
2021-05-16 19:14:22,409 epoch 38 - iter 4/4 - loss 1.57039295 - samples/sec: 153.84 - lr: 0.003125
2021-05-16 19:14:22,417 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:22,417 EPOCH 38 done: loss 1.5704 - lr 0.0031250
2021-05-16 19:14:22,522 DEV : loss 3.4747495651245117 - score 0.6667
2021-05-16 19:14:22,522 BAD EPOCHS (no improvement): 3
2021-05-16 19:14:22,522 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:23,532 epoch 39 - iter 1/4 - loss 1.71339095 - samples/sec: 31.94 - lr: 0.003125
2021-05-16 19:14:24,351 epoch 39 - iter 2/4 - loss 1.87997061 - samples/sec: 39.07 - lr: 0.003125
2021-05-16 19:14:25,353 epoch 39 - iter 3/4 - loss 1.93014069 - samples/sec: 31.93 - lr: 0.003125
2021-05-16 19:14:25,553 epoch 39 - iter 4/4 - loss 1.66254094 - samples/sec: 166.68 - lr: 0.003125
2021-05-16 19:14:25,561 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:25,561 EPOCH 39 done: loss 1.6625 - lr 0.0031250
2021-05-16 19:14:25,650 DEV : loss 3.4640121459960938 - score 0.6667
Epoch    39: reducing learning rate of group 0 to 1.5625e-03.
2021-05-16 19:14:25,650 BAD EPOCHS (no improvement): 4
2021-05-16 19:14:25,650 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:26,482 epoch 40 - iter 1/4 - loss 1.51390183 - samples/sec: 38.46 - lr: 0.001563
2021-05-16 19:14:27,268 epoch 40 - iter 2/4 - loss 1.62989253 - samples/sec: 40.73 - lr: 0.001563
2021-05-16 19:14:28,116 epoch 40 - iter 3/4 - loss 1.59191600 - samples/sec: 37.73 - lr: 0.001563
2021-05-16 19:14:28,389 epoch 40 - iter 4/4 - loss 1.58031228 - samples/sec: 116.91 - lr: 0.001563
2021-05-16 19:14:28,389 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:28,389 EPOCH 40 done: loss 1.5803 - lr 0.0015625
2021-05-16 19:14:28,493 DEV : loss 3.464979648590088 - score 0.6667
2021-05-16 19:14:28,493 BAD EPOCHS (no improvement): 1
2021-05-16 19:14:28,493 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:29,395 epoch 41 - iter 1/4 - loss 2.09950924 - samples/sec: 35.51 - lr: 0.001563
2021-05-16 19:14:30,198 epoch 41 - iter 2/4 - loss 2.02299452 - samples/sec: 39.85 - lr: 0.001563
2021-05-16 19:14:30,959 epoch 41 - iter 3/4 - loss 1.83912905 - samples/sec: 42.02 - lr: 0.001563
2021-05-16 19:14:31,168 epoch 41 - iter 4/4 - loss 2.28552222 - samples/sec: 152.95 - lr: 0.001563
2021-05-16 19:14:31,176 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:31,176 EPOCH 41 done: loss 2.2855 - lr 0.0015625
2021-05-16 19:14:31,256 DEV : loss 3.46785044670105 - score 0.6667
2021-05-16 19:14:31,256 BAD EPOCHS (no improvement): 2
2021-05-16 19:14:31,264 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:31,960 epoch 42 - iter 1/4 - loss 2.07870221 - samples/sec: 45.98 - lr: 0.001563
2021-05-16 19:14:32,809 epoch 42 - iter 2/4 - loss 1.80660170 - samples/sec: 38.05 - lr: 0.001563
2021-05-16 19:14:33,486 epoch 42 - iter 3/4 - loss 1.86924104 - samples/sec: 47.31 - lr: 0.001563
2021-05-16 19:14:33,738 epoch 42 - iter 4/4 - loss 2.06889942 - samples/sec: 126.97 - lr: 0.001563
2021-05-16 19:14:33,738 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:33,738 EPOCH 42 done: loss 2.0689 - lr 0.0015625
2021-05-16 19:14:33,827 DEV : loss 3.464182138442993 - score 0.6667
2021-05-16 19:14:33,835 BAD EPOCHS (no improvement): 3
2021-05-16 19:14:33,835 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:34,689 epoch 43 - iter 1/4 - loss 2.16509676 - samples/sec: 37.68 - lr: 0.001563
2021-05-16 19:14:35,420 epoch 43 - iter 2/4 - loss 1.79616153 - samples/sec: 44.27 - lr: 0.001563
2021-05-16 19:14:36,298 epoch 43 - iter 3/4 - loss 1.79792849 - samples/sec: 36.44 - lr: 0.001563
2021-05-16 19:14:36,517 epoch 43 - iter 4/4 - loss 1.78867936 - samples/sec: 146.19 - lr: 0.001563
2021-05-16 19:14:36,517 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:36,517 EPOCH 43 done: loss 1.7887 - lr 0.0015625
2021-05-16 19:14:36,589 DEV : loss 3.464967966079712 - score 0.6667
Epoch    43: reducing learning rate of group 0 to 7.8125e-04.
2021-05-16 19:14:36,589 BAD EPOCHS (no improvement): 4
2021-05-16 19:14:36,603 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:37,308 epoch 44 - iter 1/4 - loss 1.60833621 - samples/sec: 45.36 - lr: 0.000781
2021-05-16 19:14:38,140 epoch 44 - iter 2/4 - loss 1.45758373 - samples/sec: 38.48 - lr: 0.000781
2021-05-16 19:14:38,983 epoch 44 - iter 3/4 - loss 1.52034609 - samples/sec: 37.96 - lr: 0.000781
2021-05-16 19:14:39,226 epoch 44 - iter 4/4 - loss 2.32687372 - samples/sec: 131.50 - lr: 0.000781
2021-05-16 19:14:39,235 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:39,236 EPOCH 44 done: loss 2.3269 - lr 0.0007813
2021-05-16 19:14:39,343 DEV : loss 3.467527151107788 - score 0.6667
2021-05-16 19:14:39,343 BAD EPOCHS (no improvement): 1
2021-05-16 19:14:39,343 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:40,254 epoch 45 - iter 1/4 - loss 2.09789848 - samples/sec: 35.42 - lr: 0.000781
2021-05-16 19:14:41,142 epoch 45 - iter 2/4 - loss 1.90345168 - samples/sec: 36.05 - lr: 0.000781
2021-05-16 19:14:41,828 epoch 45 - iter 3/4 - loss 1.76009802 - samples/sec: 46.62 - lr: 0.000781
2021-05-16 19:14:42,079 epoch 45 - iter 4/4 - loss 1.94041607 - samples/sec: 127.70 - lr: 0.000781
2021-05-16 19:14:42,079 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:42,079 EPOCH 45 done: loss 1.9404 - lr 0.0007813
2021-05-16 19:14:42,174 DEV : loss 3.4680516719818115 - score 0.6667
2021-05-16 19:14:42,174 BAD EPOCHS (no improvement): 2
2021-05-16 19:14:42,174 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:42,949 epoch 46 - iter 1/4 - loss 2.13200164 - samples/sec: 41.69 - lr: 0.000781
2021-05-16 19:14:43,628 epoch 46 - iter 2/4 - loss 1.92884541 - samples/sec: 47.13 - lr: 0.000781
2021-05-16 19:14:44,188 epoch 46 - iter 3/4 - loss 1.86859485 - samples/sec: 57.14 - lr: 0.000781
2021-05-16 19:14:44,420 epoch 46 - iter 4/4 - loss 2.23936662 - samples/sec: 137.82 - lr: 0.000781
2021-05-16 19:14:44,420 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:44,420 EPOCH 46 done: loss 2.2394 - lr 0.0007813
2021-05-16 19:14:44,516 DEV : loss 3.467272996902466 - score 0.6667
2021-05-16 19:14:44,516 BAD EPOCHS (no improvement): 3
2021-05-16 19:14:44,516 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:45,083 epoch 47 - iter 1/4 - loss 1.17524457 - samples/sec: 57.22 - lr: 0.000781
2021-05-16 19:14:45,804 epoch 47 - iter 2/4 - loss 1.69363821 - samples/sec: 44.40 - lr: 0.000781
2021-05-16 19:14:46,515 epoch 47 - iter 3/4 - loss 1.80291025 - samples/sec: 45.00 - lr: 0.000781
2021-05-16 19:14:46,744 epoch 47 - iter 4/4 - loss 1.68751404 - samples/sec: 139.56 - lr: 0.000781
2021-05-16 19:14:46,744 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:46,744 EPOCH 47 done: loss 1.6875 - lr 0.0007813
2021-05-16 19:14:46,841 DEV : loss 3.4656827449798584 - score 0.6667
Epoch    47: reducing learning rate of group 0 to 3.9063e-04.
2021-05-16 19:14:46,841 BAD EPOCHS (no improvement): 4
2021-05-16 19:14:46,845 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:47,512 epoch 48 - iter 1/4 - loss 1.40106690 - samples/sec: 47.97 - lr: 0.000391
2021-05-16 19:14:48,126 epoch 48 - iter 2/4 - loss 1.41452271 - samples/sec: 52.10 - lr: 0.000391
2021-05-16 19:14:48,882 epoch 48 - iter 3/4 - loss 1.74593834 - samples/sec: 42.34 - lr: 0.000391
2021-05-16 19:14:49,064 epoch 48 - iter 4/4 - loss 1.58755332 - samples/sec: 176.07 - lr: 0.000391
2021-05-16 19:14:49,064 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:49,064 EPOCH 48 done: loss 1.5876 - lr 0.0003906
2021-05-16 19:14:49,149 DEV : loss 3.467986822128296 - score 0.6667
2021-05-16 19:14:49,149 BAD EPOCHS (no improvement): 1
2021-05-16 19:14:49,149 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:49,930 epoch 49 - iter 1/4 - loss 1.38971734 - samples/sec: 40.97 - lr: 0.000391
2021-05-16 19:14:50,510 epoch 49 - iter 2/4 - loss 1.67799520 - samples/sec: 55.24 - lr: 0.000391
2021-05-16 19:14:51,137 epoch 49 - iter 3/4 - loss 1.69751259 - samples/sec: 51.05 - lr: 0.000391
2021-05-16 19:14:51,356 epoch 49 - iter 4/4 - loss 1.83348897 - samples/sec: 145.87 - lr: 0.000391
2021-05-16 19:14:51,356 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:51,356 EPOCH 49 done: loss 1.8335 - lr 0.0003906
2021-05-16 19:14:51,446 DEV : loss 3.4678850173950195 - score 0.6667
2021-05-16 19:14:51,446 BAD EPOCHS (no improvement): 2
2021-05-16 19:14:51,462 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:52,179 epoch 50 - iter 1/4 - loss 1.13970292 - samples/sec: 44.71 - lr: 0.000391
2021-05-16 19:14:52,916 epoch 50 - iter 2/4 - loss 1.94286901 - samples/sec: 43.40 - lr: 0.000391
2021-05-16 19:14:53,640 epoch 50 - iter 3/4 - loss 1.91910776 - samples/sec: 44.19 - lr: 0.000391
2021-05-16 19:14:53,807 epoch 50 - iter 4/4 - loss 1.56437027 - samples/sec: 191.98 - lr: 0.000391
2021-05-16 19:14:53,807 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:53,807 EPOCH 50 done: loss 1.5644 - lr 0.0003906
2021-05-16 19:14:53,886 DEV : loss 3.4673101902008057 - score 0.6667
2021-05-16 19:14:53,886 BAD EPOCHS (no improvement): 3
2021-05-16 19:14:53,898 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:54,525 epoch 51 - iter 1/4 - loss 1.64230800 - samples/sec: 50.99 - lr: 0.000391
2021-05-16 19:14:55,323 epoch 51 - iter 2/4 - loss 1.66435432 - samples/sec: 40.11 - lr: 0.000391
2021-05-16 19:14:56,158 epoch 51 - iter 3/4 - loss 1.76997383 - samples/sec: 38.33 - lr: 0.000391
2021-05-16 19:14:56,348 epoch 51 - iter 4/4 - loss 1.45529963 - samples/sec: 168.77 - lr: 0.000391
2021-05-16 19:14:56,348 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:56,348 EPOCH 51 done: loss 1.4553 - lr 0.0003906
2021-05-16 19:14:56,451 DEV : loss 3.46675705909729 - score 0.6667
Epoch    51: reducing learning rate of group 0 to 1.9531e-04.
2021-05-16 19:14:56,451 BAD EPOCHS (no improvement): 4
2021-05-16 19:14:56,451 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:57,134 epoch 52 - iter 1/4 - loss 1.39893460 - samples/sec: 47.38 - lr: 0.000195
2021-05-16 19:14:57,904 epoch 52 - iter 2/4 - loss 1.95114291 - samples/sec: 41.57 - lr: 0.000195
2021-05-16 19:14:58,589 epoch 52 - iter 3/4 - loss 1.87273510 - samples/sec: 46.70 - lr: 0.000195
2021-05-16 19:14:58,814 epoch 52 - iter 4/4 - loss 1.66518828 - samples/sec: 142.21 - lr: 0.000195
2021-05-16 19:14:58,814 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:58,814 EPOCH 52 done: loss 1.6652 - lr 0.0001953
2021-05-16 19:14:58,898 DEV : loss 3.4661099910736084 - score 0.6667
2021-05-16 19:14:58,898 BAD EPOCHS (no improvement): 1
2021-05-16 19:14:58,898 ----------------------------------------------------------------------------------------------------
2021-05-16 19:14:59,621 epoch 53 - iter 1/4 - loss 1.52661002 - samples/sec: 44.90 - lr: 0.000195
2021-05-16 19:15:00,323 epoch 53 - iter 2/4 - loss 1.72744888 - samples/sec: 45.60 - lr: 0.000195
2021-05-16 19:15:01,033 epoch 53 - iter 3/4 - loss 1.67759216 - samples/sec: 45.09 - lr: 0.000195
2021-05-16 19:15:01,186 epoch 53 - iter 4/4 - loss 1.46851297 - samples/sec: 208.70 - lr: 0.000195
2021-05-16 19:15:01,186 ----------------------------------------------------------------------------------------------------
2021-05-16 19:15:01,186 EPOCH 53 done: loss 1.4685 - lr 0.0001953
2021-05-16 19:15:01,282 DEV : loss 3.466641426086426 - score 0.6667
2021-05-16 19:15:01,282 BAD EPOCHS (no improvement): 2
2021-05-16 19:15:01,282 ----------------------------------------------------------------------------------------------------
2021-05-16 19:15:01,903 epoch 54 - iter 1/4 - loss 1.67276871 - samples/sec: 51.56 - lr: 0.000195
2021-05-16 19:15:02,720 epoch 54 - iter 2/4 - loss 1.84151357 - samples/sec: 39.15 - lr: 0.000195
2021-05-16 19:15:03,497 epoch 54 - iter 3/4 - loss 1.79460196 - samples/sec: 41.16 - lr: 0.000195
2021-05-16 19:15:03,697 epoch 54 - iter 4/4 - loss 1.73617950 - samples/sec: 160.20 - lr: 0.000195
2021-05-16 19:15:03,697 ----------------------------------------------------------------------------------------------------
2021-05-16 19:15:03,697 EPOCH 54 done: loss 1.7362 - lr 0.0001953
2021-05-16 19:15:03,791 DEV : loss 3.4663610458374023 - score 0.6667
2021-05-16 19:15:03,807 BAD EPOCHS (no improvement): 3
2021-05-16 19:15:03,809 ----------------------------------------------------------------------------------------------------
2021-05-16 19:15:04,563 epoch 55 - iter 1/4 - loss 2.19241428 - samples/sec: 42.46 - lr: 0.000195
2021-05-16 19:15:05,206 epoch 55 - iter 2/4 - loss 1.68816346 - samples/sec: 49.73 - lr: 0.000195
2021-05-16 19:15:05,899 epoch 55 - iter 3/4 - loss 1.67743218 - samples/sec: 46.20 - lr: 0.000195
2021-05-16 19:15:06,147 epoch 55 - iter 4/4 - loss 1.62165421 - samples/sec: 129.04 - lr: 0.000195
2021-05-16 19:15:06,147 ----------------------------------------------------------------------------------------------------
2021-05-16 19:15:06,147 EPOCH 55 done: loss 1.6217 - lr 0.0001953
2021-05-16 19:15:06,243 DEV : loss 3.4659790992736816 - score 0.6667
Epoch    55: reducing learning rate of group 0 to 9.7656e-05.
2021-05-16 19:15:06,243 BAD EPOCHS (no improvement): 4
2021-05-16 19:15:06,243 ----------------------------------------------------------------------------------------------------
2021-05-16 19:15:06,243 ----------------------------------------------------------------------------------------------------
2021-05-16 19:15:06,243 learning rate too small - quitting training!
2021-05-16 19:15:06,243 ----------------------------------------------------------------------------------------------------
2021-05-16 19:15:14,421 ----------------------------------------------------------------------------------------------------
2021-05-16 19:15:14,421 Testing using best model ...
2021-05-16 19:15:14,426 loading file slot-model\best-model.pt
2021-05-16 19:15:34,103 0.6759	0.6901	0.6829
2021-05-16 19:15:34,103 
Results:
- F1-score (micro) 0.6829
- F1-score (macro) 0.3185

By class:
NoLabel I-end_conversation tp: 0 - fp: 0 - fn: 2 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
affirm     tp: 11 - fp: 6 - fn: 2 - precision: 0.6471 - recall: 0.8462 - f1-score: 0.7333
appoinment tp: 0 - fp: 0 - fn: 2 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
appoinment/doctor tp: 0 - fp: 0 - fn: 2 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
appointment tp: 19 - fp: 4 - fn: 1 - precision: 0.8261 - recall: 0.9500 - f1-score: 0.8837
appointment/doctor tp: 16 - fp: 13 - fn: 5 - precision: 0.5517 - recall: 0.7619 - f1-score: 0.6400
appointment/office tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
appointment/type tp: 4 - fp: 2 - fn: 2 - precision: 0.6667 - recall: 0.6667 - f1-score: 0.6667
datetime   tp: 12 - fp: 7 - fn: 6 - precision: 0.6316 - recall: 0.6667 - f1-score: 0.6486
deny       tp: 0 - fp: 0 - fn: 4 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
doctor     tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
end_conversation tp: 14 - fp: 8 - fn: 6 - precision: 0.6364 - recall: 0.7000 - f1-score: 0.6667
greeting   tp: 18 - fp: 3 - fn: 2 - precision: 0.8571 - recall: 0.9000 - f1-score: 0.8780
prescription tp: 4 - fp: 3 - fn: 2 - precision: 0.5714 - recall: 0.6667 - f1-score: 0.6154
prescription/type tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
register/email tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
register/name tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
results    tp: 0 - fp: 1 - fn: 3 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
2021-05-16 19:15:34,103 ----------------------------------------------------------------------------------------------------
{'test_score': 0.6829268292682927,
 'dev_score_history': [0.0,
  0.0,
  0.0,
  0.0,
  0.06451612903225808,
  0.0,
  0.15384615384615383,
  0.34782608695652173,
  0.2608695652173913,
  0.30769230769230765,
  0.34782608695652173,
  0.3333333333333333,
  0.4615384615384615,
  0.43478260869565216,
  0.6923076923076924,
  0.5833333333333334,
  0.6153846153846153,
  0.48000000000000004,
  0.5185185185185186,
  0.5599999999999999,
  0.5599999999999999,
  0.6666666666666666,
  0.5833333333333334,
  0.5833333333333334,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.5833333333333334,
  0.5833333333333334,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666,
  0.6666666666666666],
 'train_loss_history': [16.562259912490845,
  10.837674498558044,
  8.949250102043152,
  7.318224191665649,
  8.57210910320282,
  8.556124210357666,
  7.01634418964386,
  5.2021918296813965,
  4.593203812837601,
  4.4080811738967896,
  4.860165357589722,
  3.447446435689926,
  4.471416532993317,
  3.3014023900032043,
  2.902692526578903,
  3.367657423019409,
  4.03450483083725,
  2.9484628438949585,
  2.2557127252221107,
  2.8968922197818756,
  2.5365295708179474,
  2.157317876815796,
  1.735389031469822,
  1.698147714138031,
  1.8288256227970123,
  2.5120953917503357,
  2.5171812176704407,
  1.5335412621498108,
  1.8758874237537384,
  1.606706902384758,
  1.7897180914878845,
  1.9879519641399384,
  2.1257758736610413,
  1.828156739473343,
  1.4034691751003265,
  2.0481031239032745,
  1.9463901221752167,
  1.5703929513692856,
  1.6625409424304962,
  1.5803122818470001,
  2.285522222518921,
  2.0688994228839874,
  1.7886793613433838,
  2.3268737196922302,
  1.9404160678386688,
  2.2393666207790375,
  1.6875140368938446,
  1.587553322315216,
  1.8334889709949493,
  1.5643702745437622,
  1.4552996307611465,
  1.6651882827281952,
  1.4685129672288895,
  1.7361795008182526,
  1.621654212474823],
 'dev_loss_history': [12.217952728271484,
  8.176359176635742,
  7.451809883117676,
  7.464598178863525,
  7.330676555633545,
  5.898077011108398,
  5.496520519256592,
  5.2129292488098145,
  4.9869303703308105,
  4.855195045471191,
  4.352779865264893,
  4.364665508270264,
  4.251131057739258,
  3.9291062355041504,
  4.368889808654785,
  3.6790337562561035,
  3.864961862564087,
  3.8492608070373535,
  3.9649171829223633,
  3.6375184059143066,
  3.636472463607788,
  3.7137885093688965,
  3.495877265930176,
  3.547339677810669,
  3.695451259613037,
  3.5949974060058594,
  3.624786376953125,
  3.486685276031494,
  3.5309135913848877,
  3.4875831604003906,
  3.4355287551879883,
  3.4302756786346436,
  3.4519147872924805,
  3.4468681812286377,
  3.4555253982543945,
  3.467947483062744,
  3.4721953868865967,
  3.4747495651245117,
  3.4640121459960938,
  3.464979648590088,
  3.46785044670105,
  3.464182138442993,
  3.464967966079712,
  3.467527151107788,
  3.4680516719818115,
  3.467272996902466,
  3.4656827449798584,
  3.467986822128296,
  3.4678850173950195,
  3.4673101902008057,
  3.46675705909729,
  3.4661099910736084,
  3.466641426086426,
  3.4663610458374023,
  3.4659790992736816]}

The quality of the trained model can be assessed using the metrics reported above, i.e.:

  • tp (true positives)

    the number of words labeled $e$ in the test set that the model also labeled $e$

  • fp (false positives)

    the number of words not labeled $e$ in the test set that the model labeled $e$

  • fn (false negatives)

    the number of words labeled $e$ in the test set to which the model did not assign the label $e$

  • precision

    $$\frac{tp}{tp + fp}$$

  • recall

    $$\frac{tp}{tp + fn}$$

  • $F_1$

    $$\frac{2 \cdot precision \cdot recall}{precision + recall}$$

  • micro $F_1$

    $F_1$ in which $tp$, $fp$ and $fn$ are counted jointly over all labels, i.e. $tp = \sum_{e}{{tp}_e}$, $fn = \sum_{e}{{fn}_e}$, $fp = \sum_{e}{{fp}_e}$

  • macro $F_1$

    the arithmetic mean of the $F_1$ scores computed separately for each label.
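The definitions above can be checked against the report printed by the trainer. The sketch below is not part of the original notebook: the helper name `prf` and the `counts` dictionary are introduced here for illustration only, and the per-label counts are copied by hand from the "By class" output above.

```python
def prf(tp, fp, fn):
    """Return (precision, recall, F1) for the given counts; 0.0 where undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# (tp, fp, fn) per label, copied from the "By class" report above
counts = {
    'NoLabel I-end_conversation': (0, 0, 2),
    'affirm': (11, 6, 2),
    'appoinment': (0, 0, 2),
    'appoinment/doctor': (0, 0, 2),
    'appointment': (19, 4, 1),
    'appointment/doctor': (16, 13, 5),
    'appointment/office': (0, 0, 1),
    'appointment/type': (4, 2, 2),
    'datetime': (12, 7, 6),
    'deny': (0, 0, 4),
    'doctor': (0, 0, 1),
    'end_conversation': (14, 8, 6),
    'greeting': (18, 3, 2),
    'prescription': (4, 3, 2),
    'prescription/type': (0, 0, 1),
    'register/email': (0, 0, 1),
    'register/name': (0, 0, 1),
    'results': (0, 1, 3),
}

# micro F1: sum tp, fp and fn over all labels first, then compute F1 once
total_tp = sum(tp for tp, _, _ in counts.values())
total_fp = sum(fp for _, fp, _ in counts.values())
total_fn = sum(fn for _, _, fn in counts.values())
micro_precision, micro_recall, micro_f1 = prf(total_tp, total_fp, total_fn)

# macro F1: compute F1 per label, then take the arithmetic mean
macro_f1 = sum(prf(*c)[2] for c in counts.values()) / len(counts)

print(f'micro: P={micro_precision:.4f} R={micro_recall:.4f} F1={micro_f1:.4f}')
print(f'macro: F1={macro_f1:.4f}')
```

The results agree with the reported values (0.6759, 0.6901 and 0.6829 for micro, 0.3185 for macro). The macro score is much lower because the many labels with no true positives each contribute an $F_1$ of 0 to the mean, whereas in the micro score they add only a few false negatives to the totals.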

The trained model can be loaded from a file using the load method.

model = SequenceTagger.load('slot-model/final-model.pt')
2021-05-16 19:15:34,133 loading file slot-model/final-model.pt

The loaded model can be used to predict slots in user utterances with the predict function shown below.

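def predict(model, sentence):
    # wrap each word in a CoNLL-U-style token dict, the input format of the conllu2flair helper
    csentence = [{'form': word} for word in sentence]
    fsentence = conllu2flair([csentence])[0]
    # tag the flair sentence in place
    model.predict(fsentence)
    # pair each input word with its predicted slot tag
    return [(token, ftoken.get_tag('slot').value) for token, ftoken in zip(sentence, fsentence)]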

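As the example below shows, a model trained on only 100 examples makes errors even on a fairly simple utterance: for instance, it labels dzisiaj as O and opens slots with an I- tag (godzinach/I-datetime, internisty/I-appointment/doctor) without a preceding B- tag.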

tabulate(predict(model, ' dzien dobry poprosze wizytę do doktor lekarza rodzinnego najlepiej dzisiaj w godzinach popołudniowych dziś albo jutro internisty'.split()), tablefmt='html')
dzien B-greeting
dobry I-greeting
poprosze O
wizytę B-appointment
do O
doktor B-appointment/doctor
lekarza B-appointment/doctor
rodzinnego I-appointment/doctor
najlepiej O
dzisiaj O
w O
godzinach I-datetime
popołudniowych I-datetime
dziś B-datetime
albo O
jutro I-datetime
internisty I-appointment/doctor
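The (word, tag) pairs returned by predict can be grouped into slot values by following the IOB prefixes. The sketch below is an illustration rather than part of the original notebook: the function name `iob_to_slots` is hypothetical, and the choice to treat an I- tag with no matching open slot as the start of a new slot is an assumption made here to cope with output such as the one above.

```python
def iob_to_slots(tagged):
    """Group (word, IOB tag) pairs into (slot name, slot text) pairs."""
    slots = []
    name, words = None, []

    def flush():
        # close the currently open slot, if any
        if name is not None:
            slots.append((name, ' '.join(words)))

    for word, tag in tagged:
        prefix, _, tag_name = tag.partition('-')
        if prefix == 'B' or (prefix == 'I' and tag_name != name):
            # B- always opens a new slot; an I- that does not continue
            # the open slot is treated as an opening tag (assumption)
            flush()
            name, words = tag_name, [word]
        elif prefix == 'I':
            # continuation of the open slot
            words.append(word)
        else:
            # 'O': the word is outside any slot
            flush()
            name, words = None, []
    flush()
    return slots


example = [
    ('dzien', 'B-greeting'),
    ('dobry', 'I-greeting'),
    ('poprosze', 'O'),
    ('wizytę', 'B-appointment'),
    ('do', 'O'),
    ('doktor', 'B-appointment/doctor'),
    ('lekarza', 'B-appointment/doctor'),
    ('rodzinnego', 'I-appointment/doctor'),
]
slots = iob_to_slots(example)
print(slots)
```

On the first eight words of the utterance above this yields four slots: a greeting, an appointment, and two separate appointment/doctor values, because the model tagged both doktor and lekarza with a B- prefix.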

References

  1. Sebastian Schuster, Sonal Gupta, Rushin Shah, Mike Lewis, Cross-lingual Transfer Learning for Multilingual Task Oriented Dialog. NAACL-HLT (1) 2019, pp. 3795-3805
  2. John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML '01). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 282-289, https://repository.upenn.edu/cgi/viewcontent.cgi?article=1162&context=cis_papers
  3. Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Comput. 9, 8 (November 15, 1997), 1735-1780, https://doi.org/10.1162/neco.1997.9.8.1735
  4. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, Attention is All you Need, NIPS 2017, pp. 5998-6008, https://arxiv.org/abs/1706.03762
  5. Alan Akbik, Duncan Blythe, Roland Vollgraf, Contextual String Embeddings for Sequence Labeling, Proceedings of the 27th International Conference on Computational Linguistics, pp. 1638-1649, https://www.aclweb.org/anthology/C18-1139.pdf