
Dialogue Systems

8. Semantic parsing using machine learning techniques [lab]

Marek Kubis (2021)


Semantic parsing using machine learning techniques

Introduction

The problem of detecting slots and their values in user utterances can be formulated as the task of predicting, for each word, a label that indicates whether the word belongs to a slot and, if so, to which one.

chciałbym zarezerwować stolik na jutro/day na godzinę dwunastą/hour czterdzieści/hour pięć/hour na pięć/size osób

Slot boundaries are marked using a chosen labeling scheme.

IOB scheme

Prefix  Meaning
I       inside a slot (inside)
O       outside the slots (outside)
B       beginning of a slot (beginning)

chciałbym zarezerwować stolik na jutro/B-day na godzinę dwunastą/B-hour czterdzieści/I-hour pięć/I-hour na pięć/B-size osób

IOBES scheme

Prefix  Meaning
I       inside a slot (inside)
O       outside the slots (outside)
B       beginning of a slot (beginning)
E       end of a slot (ending)
S       single-word slot (single)

chciałbym zarezerwować stolik na jutro/S-day na godzinę dwunastą/B-hour czterdzieści/I-hour pięć/E-hour na pięć/S-size osób
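
Both schemes are easy to derive from per-token slot labels. Below is a minimal sketch (not part of the original lab code) that converts a sequence of per-token slot names into IOB and IOBES tags, using the example utterance above.

def to_iob(labels):
    # Convert per-token slot names (None = no slot) into IOB tags.
    tags = []
    for i, label in enumerate(labels):
        if label is None:
            tags.append('O')
        elif i == 0 or labels[i - 1] != label:
            tags.append('B-' + label)  # first token of a slot
        else:
            tags.append('I-' + label)  # continuation of a slot
    return tags

def to_iobes(labels):
    # Convert per-token slot names (None = no slot) into IOBES tags.
    tags = []
    for i, label in enumerate(labels):
        if label is None:
            tags.append('O')
            continue
        starts = i == 0 or labels[i - 1] != label
        ends = i == len(labels) - 1 or labels[i + 1] != label
        if starts and ends:
            tags.append('S-' + label)  # single-token slot
        elif starts:
            tags.append('B-' + label)
        elif ends:
            tags.append('E-' + label)
        else:
            tags.append('I-' + label)
    return tags

tokens = ['chciałbym', 'zarezerwować', 'stolik', 'na', 'jutro', 'na', 'godzinę',
          'dwunastą', 'czterdzieści', 'pięć', 'na', 'pięć', 'osób']
labels = [None, None, None, None, 'day', None, None,
          'hour', 'hour', 'hour', None, 'size', None]
print(list(zip(tokens, to_iob(labels))))    # ... jutro/B-day ... pięć/B-size ...
print(list(zip(tokens, to_iobes(labels))))  # ... jutro/S-day ... pięć/S-size ...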

If we prepare a dataset of user utterances with annotated slots for the task formulated this way (a so-called training set), we can apply (supervised) machine learning techniques to build a model that annotates user utterances with slot labels.

Such a model can be built using, among others:

  1. conditional random fields (Lafferty et al., 2001), sketched briefly after this list,

  2. recurrent neural networks, e.g. LSTM networks (Hochreiter and Schmidhuber, 1997),

  3. transformers (Vaswani et al., 2017).
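
For illustration, option 1 could look roughly as follows. This is a minimal sketch assuming the sklearn-crfsuite package is installed; the features and the two-sentence training set are purely illustrative toy examples (the lab itself uses option 2 via flair).

import sklearn_crfsuite

def token_features(tokens, i):
    # Simple hand-crafted features for the i-th token.
    return {
        'word': tokens[i].lower(),
        'suffix3': tokens[i][-3:],
        'is_first': i == 0,
        'is_last': i == len(tokens) - 1,
        'prev_word': tokens[i - 1].lower() if i > 0 else '<BOS>',
        'next_word': tokens[i + 1].lower() if i < len(tokens) - 1 else '<EOS>',
    }

# Toy training data in IOB format.
train_sentences = [
    (['poproszę', 'stolik', 'na', 'jutro'], ['O', 'O', 'O', 'B-day']),
    (['stolik', 'na', 'pięć', 'osób'], ['O', 'O', 'B-size', 'O']),
]
X_train = [[token_features(toks, i) for i in range(len(toks))] for toks, _ in train_sentences]
y_train = [tags for _, tags in train_sentences]

crf = sklearn_crfsuite.CRF(algorithm='lbfgs', max_iterations=100)
crf.fit(X_train, y_train)
print(crf.predict(X_train))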

Example

We will use the dataset prepared by Schuster et al. (2019).

!mkdir -p l07
%cd l07
!curl -L -C -  https://fb.me/multilingual_task_oriented_data  -o data.zip
!unzip data.zip
%cd ..
c:\Develop\wmi\AITECH\sem1\Systemy dialogowe\lab\l07
A subdirectory or file -p already exists.
Error occurred while processing: -p.
A subdirectory or file l07 already exists.
Error occurred while processing: l07.
** Resuming transfer from byte position 8923190
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0

100    49  100    49    0     0    118      0 --:--:-- --:--:-- --:--:--   118
c:\Develop\wmi\AITECH\sem1\Systemy dialogowe\lab
'unzip' is not recognized as an internal or external command,
operable program or batch file.

This dataset contains utterances in three languages, annotated with slots for twelve frames from three domains: Alarm, Reminder and Weather. We will load the data using the conllu library.

from conllu import parse_incr
fields = ['id', 'form', 'frame', 'slot']

def nolabel2o(line, i):
    return 'O' if line[i] == 'NoLabel' else line[i]
# pathTrain = '../tasks/zad8/en/train-en.conllu'
# pathTest = '../tasks/zad8/en/test-en.conllu'

pathTrain = '../tasks/zad8/pl/train.conllu'
pathTest = '../tasks/zad8/pl/test.conllu'

with open(pathTrain, encoding="UTF-8") as trainfile:
    i=0
    for line in trainfile:
        print(line)
        i+=1
        if i==15: break 
    trainset = list(parse_incr(trainfile, fields=fields, field_parsers={'slot': nolabel2o}))
with open(pathTest,  encoding="UTF-8") as testfile:
    testset = list(parse_incr(testfile, fields=fields, field_parsers={'slot': nolabel2o}))
    
# text: halo

# intent: hello

# slots: 

1	halo	hello	NoLabel



# text: chaciałbym pójść na premierę filmu jakie premiery są w tym tygodniu

# intent: reqmore

# slots: 

1	chaciałbym	reqmore	NoLabel

2	pójść	reqmore	NoLabel

3	na	reqmore	NoLabel

4	premierę	reqmore	NoLabel

5	filmu	reqmore	NoLabel

6	jakie	reqmore	NoLabel

7	premiery	reqmore	NoLabel

Let's look at a few example utterances from this dataset.

from tabulate import tabulate
tabulate(trainset[1], tablefmt='html')

1  wybieram  inform  O
2  batmana   inform  B-title

tabulate(trainset[16], tablefmt='html')

1  chcę          inform  O
2  zarezerwować  inform  O
3  bilety        inform  O

tabulate(trainset[20], tablefmt='html')

1  chciałbym   inform  O
2  anulować    inform  O
3  rezerwację  inform  O
4  biletu      inform  O

To build the model we will use an architecture based on recurrent neural networks, as implemented in the flair library (Akbik et al., 2018).

from flair.data import Corpus, Sentence, Token
from flair.datasets import SentenceDataset
from flair.embeddings import StackedEmbeddings
from flair.embeddings import WordEmbeddings
from flair.embeddings import CharacterEmbeddings
from flair.embeddings import FlairEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# make the computations deterministic
import random
import torch
random.seed(42)
torch.manual_seed(42)

if torch.cuda.is_available():
    torch.cuda.manual_seed(0)
    torch.cuda.manual_seed_all(0)
    torch.backends.cudnn.enabled = False
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True

We will convert the data to the format used by flair with the following function.

def conllu2flair(sentences, label=None):
    fsentences = []

    for sentence in sentences:
        fsentence = Sentence()

        for token in sentence:
            ftoken = Token(token['form'])

            if label:
                ftoken.add_tag(label, token[label])

            fsentence.add_token(ftoken)

        fsentences.append(fsentence)

    return SentenceDataset(fsentences)

corpus = Corpus(train=conllu2flair(trainset, 'slot'), test=conllu2flair(testset, 'slot'))
print(corpus)
tag_dictionary = corpus.make_tag_dictionary(tag_type='slot')
print(tag_dictionary)
Corpus: 297 train + 33 dev + 33 test sentences
Dictionary with 14 tags: <unk>, O, B-date, I-date, B-time, I-time, B-area, I-area, B-title, B-quantity, I-title, I-quantity, <START>, <STOP>

Our model will use vector representations of words (see Word Embeddings).

embedding_types = [
    WordEmbeddings('pl'),
    FlairEmbeddings('polish-forward'),
    FlairEmbeddings('polish-backward'),
    CharacterEmbeddings(),
]

embeddings = StackedEmbeddings(embeddings=embedding_types)
tagger = SequenceTagger(hidden_size=256, embeddings=embeddings,
                        tag_dictionary=tag_dictionary,
                        tag_type='slot', use_crf=True)
2022-04-28 22:14:01,525 https://flair.informatik.hu-berlin.de/resources/embeddings/token/pl-wiki-fasttext-300d-1M.vectors.npy not found in cache, downloading to C:\Users\48516\AppData\Local\Temp\tmp8ekygs88
100%|██████████| 1199998928/1199998928 [01:00<00:00, 19734932.13B/s]
2022-04-28 22:15:02,505 copying C:\Users\48516\AppData\Local\Temp\tmp8ekygs88 to cache at C:\Users\48516\.flair\embeddings\pl-wiki-fasttext-300d-1M.vectors.npy
2022-04-28 22:15:03,136 removing temp file C:\Users\48516\AppData\Local\Temp\tmp8ekygs88
2022-04-28 22:15:03,420 https://flair.informatik.hu-berlin.de/resources/embeddings/token/pl-wiki-fasttext-300d-1M not found in cache, downloading to C:\Users\48516\AppData\Local\Temp\tmp612sxdgl
100%|██████████| 40874795/40874795 [00:02<00:00, 18943852.55B/s]
2022-04-28 22:15:05,807 copying C:\Users\48516\AppData\Local\Temp\tmp612sxdgl to cache at C:\Users\48516\.flair\embeddings\pl-wiki-fasttext-300d-1M
2022-04-28 22:15:05,830 removing temp file C:\Users\48516\AppData\Local\Temp\tmp612sxdgl
2022-04-28 22:15:13,095 https://flair.informatik.hu-berlin.de/resources/embeddings/flair/lm-polish-forward-v0.2.pt not found in cache, downloading to C:\Users\48516\AppData\Local\Temp\tmp05k_xff8
100%|██████████| 84244196/84244196 [00:04<00:00, 19653900.77B/s]
2022-04-28 22:15:17,599 copying C:\Users\48516\AppData\Local\Temp\tmp05k_xff8 to cache at C:\Users\48516\.flair\embeddings\lm-polish-forward-v0.2.pt
2022-04-28 22:15:17,640 removing temp file C:\Users\48516\AppData\Local\Temp\tmp05k_xff8
2022-04-28 22:15:18,034 https://flair.informatik.hu-berlin.de/resources/embeddings/flair/lm-polish-backward-v0.2.pt not found in cache, downloading to C:\Users\48516\AppData\Local\Temp\tmpbjevekqx
100%|██████████| 84244196/84244196 [00:04<00:00, 19850177.72B/s]
2022-04-28 22:15:22,467 copying C:\Users\48516\AppData\Local\Temp\tmpbjevekqx to cache at C:\Users\48516\.flair\embeddings\lm-polish-backward-v0.2.pt
2022-04-28 22:15:22,518 removing temp file C:\Users\48516\AppData\Local\Temp\tmpbjevekqx

Let's see what the architecture of the neural network responsible for predicting slots in utterances looks like.

print(tagger)
SequenceTagger(
  (embeddings): StackedEmbeddings(
    (list_embedding_0): WordEmbeddings('pl')
    (list_embedding_1): FlairEmbeddings(
      (lm): LanguageModel(
        (drop): Dropout(p=0.25, inplace=False)
        (encoder): Embedding(1602, 100)
        (rnn): LSTM(100, 2048)
        (decoder): Linear(in_features=2048, out_features=1602, bias=True)
      )
    )
    (list_embedding_2): FlairEmbeddings(
      (lm): LanguageModel(
        (drop): Dropout(p=0.25, inplace=False)
        (encoder): Embedding(1602, 100)
        (rnn): LSTM(100, 2048)
        (decoder): Linear(in_features=2048, out_features=1602, bias=True)
      )
    )
    (list_embedding_3): CharacterEmbeddings(
      (char_embedding): Embedding(275, 25)
      (char_rnn): LSTM(25, 25, bidirectional=True)
    )
  )
  (word_dropout): WordDropout(p=0.05)
  (locked_dropout): LockedDropout(p=0.5)
  (embedding2nn): Linear(in_features=4446, out_features=4446, bias=True)
  (rnn): LSTM(4446, 256, batch_first=True, bidirectional=True)
  (linear): Linear(in_features=512, out_features=14, bias=True)
  (beta): 1.0
  (weights): None
  (weight_tensor) None
)

We will run ten training iterations (epochs) and save the resulting model in the slot-model directory.

trainer = ModelTrainer(tagger, corpus)
trainer.train('slot-model',
              learning_rate=0.1,
              mini_batch_size=32,
              max_epochs=10,
              train_with_dev=False)
2022-04-28 22:15:23,085 ----------------------------------------------------------------------------------------------------
2022-04-28 22:15:23,086 Model: "SequenceTagger(
  (embeddings): StackedEmbeddings(
    (list_embedding_0): WordEmbeddings('pl')
    (list_embedding_1): FlairEmbeddings(
      (lm): LanguageModel(
        (drop): Dropout(p=0.25, inplace=False)
        (encoder): Embedding(1602, 100)
        (rnn): LSTM(100, 2048)
        (decoder): Linear(in_features=2048, out_features=1602, bias=True)
      )
    )
    (list_embedding_2): FlairEmbeddings(
      (lm): LanguageModel(
        (drop): Dropout(p=0.25, inplace=False)
        (encoder): Embedding(1602, 100)
        (rnn): LSTM(100, 2048)
        (decoder): Linear(in_features=2048, out_features=1602, bias=True)
      )
    )
    (list_embedding_3): CharacterEmbeddings(
      (char_embedding): Embedding(275, 25)
      (char_rnn): LSTM(25, 25, bidirectional=True)
    )
  )
  (word_dropout): WordDropout(p=0.05)
  (locked_dropout): LockedDropout(p=0.5)
  (embedding2nn): Linear(in_features=4446, out_features=4446, bias=True)
  (rnn): LSTM(4446, 256, batch_first=True, bidirectional=True)
  (linear): Linear(in_features=512, out_features=14, bias=True)
  (beta): 1.0
  (weights): None
  (weight_tensor) None
)"
2022-04-28 22:15:23,087 ----------------------------------------------------------------------------------------------------
2022-04-28 22:15:23,088 Corpus: "Corpus: 297 train + 33 dev + 33 test sentences"
2022-04-28 22:15:23,088 ----------------------------------------------------------------------------------------------------
2022-04-28 22:15:23,089 Parameters:
2022-04-28 22:15:23,089  - learning_rate: "0.1"
2022-04-28 22:15:23,090  - mini_batch_size: "32"
2022-04-28 22:15:23,090  - patience: "3"
2022-04-28 22:15:23,091  - anneal_factor: "0.5"
2022-04-28 22:15:23,092  - max_epochs: "10"
2022-04-28 22:15:23,093  - shuffle: "True"
2022-04-28 22:15:23,093  - train_with_dev: "False"
2022-04-28 22:15:23,094  - batch_growth_annealing: "False"
2022-04-28 22:15:23,094 ----------------------------------------------------------------------------------------------------
2022-04-28 22:15:23,095 Model training base path: "slot-model"
2022-04-28 22:15:23,095 ----------------------------------------------------------------------------------------------------
2022-04-28 22:15:23,096 Device: cpu
2022-04-28 22:15:23,096 ----------------------------------------------------------------------------------------------------
2022-04-28 22:15:23,097 Embeddings storage mode: cpu
2022-04-28 22:15:23,100 ----------------------------------------------------------------------------------------------------
2022-04-28 22:15:25,051 epoch 1 - iter 1/10 - loss 15.67058754 - samples/sec: 16.40 - lr: 0.100000
2022-04-28 22:15:27,334 epoch 1 - iter 2/10 - loss 13.01803017 - samples/sec: 14.02 - lr: 0.100000
2022-04-28 22:15:29,132 epoch 1 - iter 3/10 - loss 11.16305335 - samples/sec: 17.81 - lr: 0.100000
2022-04-28 22:15:30,629 epoch 1 - iter 4/10 - loss 9.23769999 - samples/sec: 21.39 - lr: 0.100000
2022-04-28 22:15:32,614 epoch 1 - iter 5/10 - loss 7.94914236 - samples/sec: 16.13 - lr: 0.100000
2022-04-28 22:15:34,081 epoch 1 - iter 6/10 - loss 7.05464562 - samples/sec: 21.83 - lr: 0.100000
2022-04-28 22:15:35,257 epoch 1 - iter 7/10 - loss 6.28502292 - samples/sec: 27.26 - lr: 0.100000
2022-04-28 22:15:37,386 epoch 1 - iter 8/10 - loss 5.74554797 - samples/sec: 15.04 - lr: 0.100000
2022-04-28 22:15:39,009 epoch 1 - iter 9/10 - loss 5.48559354 - samples/sec: 19.73 - lr: 0.100000
2022-04-28 22:15:39,892 epoch 1 - iter 10/10 - loss 5.10890775 - samples/sec: 36.28 - lr: 0.100000
2022-04-28 22:15:39,893 ----------------------------------------------------------------------------------------------------
2022-04-28 22:15:39,894 EPOCH 1 done: loss 5.1089 - lr 0.1000000
2022-04-28 22:15:41,651 DEV : loss 1.1116931438446045 - score 0.0
2022-04-28 22:15:41,654 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 22:15:54,970 ----------------------------------------------------------------------------------------------------
2022-04-28 22:15:55,703 epoch 2 - iter 1/10 - loss 2.39535546 - samples/sec: 48.71 - lr: 0.100000
2022-04-28 22:15:56,276 epoch 2 - iter 2/10 - loss 3.14594960 - samples/sec: 55.94 - lr: 0.100000
2022-04-28 22:15:56,849 epoch 2 - iter 3/10 - loss 2.96723008 - samples/sec: 55.94 - lr: 0.100000
2022-04-28 22:15:57,326 epoch 2 - iter 4/10 - loss 2.72414619 - samples/sec: 67.23 - lr: 0.100000
2022-04-28 22:15:57,799 epoch 2 - iter 5/10 - loss 2.52746274 - samples/sec: 67.80 - lr: 0.100000
2022-04-28 22:15:58,255 epoch 2 - iter 6/10 - loss 2.41920217 - samples/sec: 70.33 - lr: 0.100000
2022-04-28 22:15:58,770 epoch 2 - iter 7/10 - loss 2.48535442 - samples/sec: 62.26 - lr: 0.100000
2022-04-28 22:15:59,324 epoch 2 - iter 8/10 - loss 2.40343314 - samples/sec: 57.87 - lr: 0.100000
2022-04-28 22:15:59,827 epoch 2 - iter 9/10 - loss 2.41345758 - samples/sec: 63.74 - lr: 0.100000
2022-04-28 22:16:00,052 epoch 2 - iter 10/10 - loss 2.63766205 - samples/sec: 142.86 - lr: 0.100000
2022-04-28 22:16:00,053 ----------------------------------------------------------------------------------------------------
2022-04-28 22:16:00,054 EPOCH 2 done: loss 2.6377 - lr 0.1000000
2022-04-28 22:16:00,234 DEV : loss 1.2027416229248047 - score 0.0
2022-04-28 22:16:00,238 BAD EPOCHS (no improvement): 1
2022-04-28 22:16:00,241 ----------------------------------------------------------------------------------------------------
2022-04-28 22:16:00,771 epoch 3 - iter 1/10 - loss 2.07519531 - samples/sec: 60.61 - lr: 0.100000
2022-04-28 22:16:01,297 epoch 3 - iter 2/10 - loss 2.21946335 - samples/sec: 60.95 - lr: 0.100000
2022-04-28 22:16:01,826 epoch 3 - iter 3/10 - loss 2.32372427 - samples/sec: 60.61 - lr: 0.100000
2022-04-28 22:16:02,304 epoch 3 - iter 4/10 - loss 2.18133342 - samples/sec: 67.23 - lr: 0.100000
2022-04-28 22:16:02,727 epoch 3 - iter 5/10 - loss 2.10553741 - samples/sec: 75.83 - lr: 0.100000
2022-04-28 22:16:03,215 epoch 3 - iter 6/10 - loss 1.99518015 - samples/sec: 65.84 - lr: 0.100000
2022-04-28 22:16:03,670 epoch 3 - iter 7/10 - loss 2.03174150 - samples/sec: 70.64 - lr: 0.100000
2022-04-28 22:16:04,239 epoch 3 - iter 8/10 - loss 2.19520997 - samples/sec: 56.34 - lr: 0.100000
2022-04-28 22:16:04,686 epoch 3 - iter 9/10 - loss 2.15986861 - samples/sec: 71.75 - lr: 0.100000
2022-04-28 22:16:04,919 epoch 3 - iter 10/10 - loss 2.02860461 - samples/sec: 137.93 - lr: 0.100000
2022-04-28 22:16:04,920 ----------------------------------------------------------------------------------------------------
2022-04-28 22:16:04,921 EPOCH 3 done: loss 2.0286 - lr 0.1000000
2022-04-28 22:16:05,067 DEV : loss 0.9265440702438354 - score 0.0
2022-04-28 22:16:05,069 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 22:16:10,882 ----------------------------------------------------------------------------------------------------
2022-04-28 22:16:11,339 epoch 4 - iter 1/10 - loss 2.63443780 - samples/sec: 70.33 - lr: 0.100000
2022-04-28 22:16:11,858 epoch 4 - iter 2/10 - loss 2.35905457 - samples/sec: 61.78 - lr: 0.100000
2022-04-28 22:16:12,523 epoch 4 - iter 3/10 - loss 2.23206981 - samples/sec: 48.19 - lr: 0.100000
2022-04-28 22:16:13,026 epoch 4 - iter 4/10 - loss 2.28027773 - samples/sec: 63.75 - lr: 0.100000
2022-04-28 22:16:13,610 epoch 4 - iter 5/10 - loss 2.22129200 - samples/sec: 54.98 - lr: 0.100000
2022-04-28 22:16:14,074 epoch 4 - iter 6/10 - loss 2.10545621 - samples/sec: 69.11 - lr: 0.100000
2022-04-28 22:16:14,646 epoch 4 - iter 7/10 - loss 2.10457425 - samples/sec: 56.04 - lr: 0.100000
2022-04-28 22:16:15,144 epoch 4 - iter 8/10 - loss 2.04774940 - samples/sec: 64.38 - lr: 0.100000
2022-04-28 22:16:15,698 epoch 4 - iter 9/10 - loss 1.99643935 - samples/sec: 57.97 - lr: 0.100000
2022-04-28 22:16:15,935 epoch 4 - iter 10/10 - loss 1.81641705 - samples/sec: 136.14 - lr: 0.100000
2022-04-28 22:16:15,936 ----------------------------------------------------------------------------------------------------
2022-04-28 22:16:15,937 EPOCH 4 done: loss 1.8164 - lr 0.1000000
2022-04-28 22:16:16,092 DEV : loss 0.8311207890510559 - score 0.0
2022-04-28 22:16:16,094 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 22:16:21,938 ----------------------------------------------------------------------------------------------------
2022-04-28 22:16:22,424 epoch 5 - iter 1/10 - loss 1.31467295 - samples/sec: 66.12 - lr: 0.100000
2022-04-28 22:16:22,852 epoch 5 - iter 2/10 - loss 1.87177873 - samples/sec: 74.94 - lr: 0.100000
2022-04-28 22:16:23,440 epoch 5 - iter 3/10 - loss 1.83717314 - samples/sec: 54.51 - lr: 0.100000
2022-04-28 22:16:23,991 epoch 5 - iter 4/10 - loss 2.06565040 - samples/sec: 58.18 - lr: 0.100000
2022-04-28 22:16:24,364 epoch 5 - iter 5/10 - loss 1.95749507 - samples/sec: 86.25 - lr: 0.100000
2022-04-28 22:16:24,832 epoch 5 - iter 6/10 - loss 1.84727591 - samples/sec: 68.67 - lr: 0.100000
2022-04-28 22:16:25,238 epoch 5 - iter 7/10 - loss 1.79978011 - samples/sec: 79.21 - lr: 0.100000
2022-04-28 22:16:25,679 epoch 5 - iter 8/10 - loss 1.69797329 - samples/sec: 72.73 - lr: 0.100000
2022-04-28 22:16:26,173 epoch 5 - iter 9/10 - loss 1.70765987 - samples/sec: 64.84 - lr: 0.100000
2022-04-28 22:16:26,364 epoch 5 - iter 10/10 - loss 1.76581790 - samples/sec: 169.31 - lr: 0.100000
2022-04-28 22:16:26,366 ----------------------------------------------------------------------------------------------------
2022-04-28 22:16:26,367 EPOCH 5 done: loss 1.7658 - lr 0.1000000
2022-04-28 22:16:26,509 DEV : loss 0.7797471880912781 - score 0.2222
2022-04-28 22:16:26,510 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 22:16:32,211 ----------------------------------------------------------------------------------------------------
2022-04-28 22:16:32,666 epoch 6 - iter 1/10 - loss 2.04772544 - samples/sec: 70.64 - lr: 0.100000
2022-04-28 22:16:33,172 epoch 6 - iter 2/10 - loss 1.61218661 - samples/sec: 63.37 - lr: 0.100000
2022-04-28 22:16:33,673 epoch 6 - iter 3/10 - loss 1.55716117 - samples/sec: 64.00 - lr: 0.100000
2022-04-28 22:16:34,183 epoch 6 - iter 4/10 - loss 1.54974008 - samples/sec: 62.87 - lr: 0.100000
2022-04-28 22:16:34,687 epoch 6 - iter 5/10 - loss 1.50827932 - samples/sec: 63.62 - lr: 0.100000
2022-04-28 22:16:35,155 epoch 6 - iter 6/10 - loss 1.46459270 - samples/sec: 68.52 - lr: 0.100000
2022-04-28 22:16:35,658 epoch 6 - iter 7/10 - loss 1.50249643 - samples/sec: 63.87 - lr: 0.100000
2022-04-28 22:16:36,094 epoch 6 - iter 8/10 - loss 1.51979375 - samples/sec: 73.56 - lr: 0.100000
2022-04-28 22:16:36,548 epoch 6 - iter 9/10 - loss 1.56509953 - samples/sec: 70.64 - lr: 0.100000
2022-04-28 22:16:36,744 epoch 6 - iter 10/10 - loss 1.55241492 - samples/sec: 164.10 - lr: 0.100000
2022-04-28 22:16:36,746 ----------------------------------------------------------------------------------------------------
2022-04-28 22:16:36,746 EPOCH 6 done: loss 1.5524 - lr 0.1000000
2022-04-28 22:16:36,884 DEV : loss 0.9345423579216003 - score 0.3333
2022-04-28 22:16:36,885 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 22:16:42,377 ----------------------------------------------------------------------------------------------------
2022-04-28 22:16:42,856 epoch 7 - iter 1/10 - loss 2.15539050 - samples/sec: 67.09 - lr: 0.100000
2022-04-28 22:16:43,336 epoch 7 - iter 2/10 - loss 1.68949413 - samples/sec: 66.95 - lr: 0.100000
2022-04-28 22:16:43,781 epoch 7 - iter 3/10 - loss 1.81478349 - samples/sec: 72.07 - lr: 0.100000
2022-04-28 22:16:44,241 epoch 7 - iter 4/10 - loss 1.68033907 - samples/sec: 69.87 - lr: 0.100000
2022-04-28 22:16:44,730 epoch 7 - iter 5/10 - loss 1.64062953 - samples/sec: 65.57 - lr: 0.100000
2022-04-28 22:16:45,227 epoch 7 - iter 6/10 - loss 1.59568199 - samples/sec: 64.78 - lr: 0.100000
2022-04-28 22:16:45,663 epoch 7 - iter 7/10 - loss 1.46137918 - samples/sec: 73.39 - lr: 0.100000
2022-04-28 22:16:46,169 epoch 7 - iter 8/10 - loss 1.41721664 - samples/sec: 63.36 - lr: 0.100000
2022-04-28 22:16:46,734 epoch 7 - iter 9/10 - loss 1.39811980 - samples/sec: 56.74 - lr: 0.100000
2022-04-28 22:16:46,937 epoch 7 - iter 10/10 - loss 1.38412433 - samples/sec: 159.20 - lr: 0.100000
2022-04-28 22:16:46,938 ----------------------------------------------------------------------------------------------------
2022-04-28 22:16:46,939 EPOCH 7 done: loss 1.3841 - lr 0.1000000
2022-04-28 22:16:47,081 DEV : loss 0.6798948049545288 - score 0.5
2022-04-28 22:16:47,083 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 22:16:52,628 ----------------------------------------------------------------------------------------------------
2022-04-28 22:16:53,137 epoch 8 - iter 1/10 - loss 1.08732188 - samples/sec: 63.12 - lr: 0.100000
2022-04-28 22:16:53,606 epoch 8 - iter 2/10 - loss 1.29048711 - samples/sec: 68.38 - lr: 0.100000
2022-04-28 22:16:54,039 epoch 8 - iter 3/10 - loss 1.04415214 - samples/sec: 74.07 - lr: 0.100000
2022-04-28 22:16:54,568 epoch 8 - iter 4/10 - loss 1.02857886 - samples/sec: 60.60 - lr: 0.100000
2022-04-28 22:16:55,148 epoch 8 - iter 5/10 - loss 1.26690668 - samples/sec: 55.27 - lr: 0.100000
2022-04-28 22:16:55,602 epoch 8 - iter 6/10 - loss 1.30797880 - samples/sec: 70.80 - lr: 0.100000
2022-04-28 22:16:56,075 epoch 8 - iter 7/10 - loss 1.22035806 - samples/sec: 67.72 - lr: 0.100000
2022-04-28 22:16:56,494 epoch 8 - iter 8/10 - loss 1.23306625 - samples/sec: 76.51 - lr: 0.100000
2022-04-28 22:16:56,933 epoch 8 - iter 9/10 - loss 1.18903442 - samples/sec: 73.15 - lr: 0.100000
2022-04-28 22:16:57,147 epoch 8 - iter 10/10 - loss 1.31105986 - samples/sec: 150.24 - lr: 0.100000
2022-04-28 22:16:57,148 ----------------------------------------------------------------------------------------------------
2022-04-28 22:16:57,149 EPOCH 8 done: loss 1.3111 - lr 0.1000000
2022-04-28 22:16:57,289 DEV : loss 0.5563207864761353 - score 0.5
2022-04-28 22:16:57,290 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 22:17:02,550 ----------------------------------------------------------------------------------------------------
2022-04-28 22:17:03,134 epoch 9 - iter 1/10 - loss 1.32691610 - samples/sec: 54.89 - lr: 0.100000
2022-04-28 22:17:03,595 epoch 9 - iter 2/10 - loss 1.16159409 - samples/sec: 69.57 - lr: 0.100000
2022-04-28 22:17:04,014 epoch 9 - iter 3/10 - loss 1.10929267 - samples/sec: 76.56 - lr: 0.100000
2022-04-28 22:17:04,518 epoch 9 - iter 4/10 - loss 1.05318102 - samples/sec: 63.62 - lr: 0.100000
2022-04-28 22:17:04,966 epoch 9 - iter 5/10 - loss 1.07275693 - samples/sec: 71.75 - lr: 0.100000
2022-04-28 22:17:05,432 epoch 9 - iter 6/10 - loss 1.02824855 - samples/sec: 68.82 - lr: 0.100000
2022-04-28 22:17:05,909 epoch 9 - iter 7/10 - loss 1.04051120 - samples/sec: 67.23 - lr: 0.100000
2022-04-28 22:17:06,404 epoch 9 - iter 8/10 - loss 1.00513531 - samples/sec: 64.78 - lr: 0.100000
2022-04-28 22:17:06,831 epoch 9 - iter 9/10 - loss 1.03960636 - samples/sec: 75.29 - lr: 0.100000
2022-04-28 22:17:07,019 epoch 9 - iter 10/10 - loss 1.07805606 - samples/sec: 171.12 - lr: 0.100000
2022-04-28 22:17:07,020 ----------------------------------------------------------------------------------------------------
2022-04-28 22:17:07,021 EPOCH 9 done: loss 1.0781 - lr 0.1000000
2022-04-28 22:17:07,151 DEV : loss 0.909138560295105 - score 0.7143
2022-04-28 22:17:07,153 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 22:17:12,454 ----------------------------------------------------------------------------------------------------
2022-04-28 22:17:12,906 epoch 10 - iter 1/10 - loss 1.49117911 - samples/sec: 70.96 - lr: 0.100000
2022-04-28 22:17:13,334 epoch 10 - iter 2/10 - loss 1.23203236 - samples/sec: 74.94 - lr: 0.100000
2022-04-28 22:17:13,789 epoch 10 - iter 3/10 - loss 1.12988973 - samples/sec: 70.48 - lr: 0.100000
2022-04-28 22:17:14,275 epoch 10 - iter 4/10 - loss 1.07148103 - samples/sec: 65.98 - lr: 0.100000
2022-04-28 22:17:14,795 epoch 10 - iter 5/10 - loss 1.08848752 - samples/sec: 61.66 - lr: 0.100000
2022-04-28 22:17:15,328 epoch 10 - iter 6/10 - loss 1.05938606 - samples/sec: 60.26 - lr: 0.100000
2022-04-28 22:17:15,730 epoch 10 - iter 7/10 - loss 1.00324091 - samples/sec: 79.80 - lr: 0.100000
2022-04-28 22:17:16,245 epoch 10 - iter 8/10 - loss 0.93657552 - samples/sec: 62.26 - lr: 0.100000
2022-04-28 22:17:16,681 epoch 10 - iter 9/10 - loss 0.95801387 - samples/sec: 73.56 - lr: 0.100000
2022-04-28 22:17:16,901 epoch 10 - iter 10/10 - loss 0.87346228 - samples/sec: 146.77 - lr: 0.100000
2022-04-28 22:17:16,902 ----------------------------------------------------------------------------------------------------
2022-04-28 22:17:16,903 EPOCH 10 done: loss 0.8735 - lr 0.1000000
2022-04-28 22:17:17,047 DEV : loss 0.5443210601806641 - score 0.7143
2022-04-28 22:17:17,050 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 22:17:27,557 ----------------------------------------------------------------------------------------------------
2022-04-28 22:17:27,557 Testing using best model ...
2022-04-28 22:17:27,566 loading file slot-model\best-model.pt
2022-04-28 22:17:33,102 0.6429	0.4500	0.5294
2022-04-28 22:17:33,103 
Results:
- F1-score (micro) 0.5294
- F1-score (macro) 0.4533

By class:
area       tp: 0 - fp: 0 - fn: 1 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
date       tp: 1 - fp: 1 - fn: 0 - precision: 0.5000 - recall: 1.0000 - f1-score: 0.6667
quantity   tp: 3 - fp: 1 - fn: 3 - precision: 0.7500 - recall: 0.5000 - f1-score: 0.6000
time       tp: 2 - fp: 2 - fn: 4 - precision: 0.5000 - recall: 0.3333 - f1-score: 0.4000
title      tp: 3 - fp: 1 - fn: 3 - precision: 0.7500 - recall: 0.5000 - f1-score: 0.6000
2022-04-28 22:17:33,104 ----------------------------------------------------------------------------------------------------
{'test_score': 0.5294117647058824,
 'dev_score_history': [0.0,
  0.0,
  0.0,
  0.0,
  0.2222222222222222,
  0.3333333333333333,
  0.5,
  0.5,
  0.7142857142857143,
  0.7142857142857143],
 'train_loss_history': [5.108907747268677,
  2.6376620531082153,
  2.0286046147346495,
  1.816417047381401,
  1.7658178985118866,
  1.5524149179458617,
  1.384124332666397,
  1.3110598623752594,
  1.0780560612678527,
  0.8734622806310653],
 'dev_loss_history': [1.1116931438446045,
  1.2027416229248047,
  0.9265440702438354,
  0.8311207890510559,
  0.7797471880912781,
  0.9345423579216003,
  0.6798948049545288,
  0.5563207864761353,
  0.909138560295105,
  0.5443210601806641]}

We can assess the quality of the trained model using the metrics reported above (recomputed in the sketch after this list), i.e.:

  • tp (true positives)

    the number of words labeled with $e$ in the test set which the model also labeled with $e$

  • fp (false positives)

    the number of words not labeled with $e$ in the test set which the model labeled with $e$

  • fn (false negatives)

    the number of words labeled with $e$ in the test set to which the model did not assign the label $e$

  • precision

    $$\frac{tp}{tp + fp}$$

  • recall

    $$\frac{tp}{tp + fn}$$

  • $F_1$

    $$\frac{2 \cdot precision \cdot recall}{precision + recall}$$

  • micro $F_1$

    $F_1$ in which $tp$, $fp$ and $fn$ are counted jointly over all labels, i.e. $tp = \sum_{e}{tp_e}$, $fn = \sum_{e}{fn_e}$, $fp = \sum_{e}{fp_e}$

  • macro $F_1$

    the arithmetic mean of the $F_1$ scores computed for each label separately.
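
As a quick sanity check (a minimal sketch, not part of the lab code), the snippet below recomputes the micro and macro $F_1$ reported above from the per-class tp/fp/fn counts.

counts = {            # label: (tp, fp, fn), taken from the report above
    'area':     (0, 0, 1),
    'date':     (1, 1, 0),
    'quantity': (3, 1, 3),
    'time':     (2, 2, 4),
    'title':    (3, 1, 3),
}

def f1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# micro F1: sum tp/fp/fn over all labels first, then compute F1 once
tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
print('micro F1:', round(f1(tp, fp, fn), 4))   # 0.5294

# macro F1: average the per-label F1 scores
print('macro F1:', round(sum(f1(*c) for c in counts.values()) / len(counts), 4))   # 0.4533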

The trained model can be loaded from a file using the load method.

model = SequenceTagger.load('slot-model/final-model.pt')
2022-04-28 22:17:33,278 loading file slot-model/final-model.pt

The loaded model can be used to predict slots in user utterances with the predict function shown below.

def predict(model, sentence):
    csentence = [{'form': word} for word in sentence]
    fsentence = conllu2flair([csentence])[0]
    model.predict(fsentence)
    return [(token, ftoken.get_tag('slot').value) for token, ftoken in zip(sentence, fsentence)]

As the example below shows, a model trained on only a few hundred examples still makes mistakes on fairly simple utterances; here it does not mark any slot in the utterance.

tabulate(predict(model, 'co gracie popołudniu'.split()), tablefmt='html')

co          O
gracie      O
popołudniu  O
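
In a dialogue system the predicted tags still have to be grouped into slot values. Here is a minimal sketch (not part of the lab code) that merges the (word, IOB tag) pairs returned by predict into (slot, value) pairs.

def tags_to_slots(predictions):
    # Group (word, IOB tag) pairs into (slot name, slot value) pairs.
    slots = []
    for word, tag in predictions:
        if tag.startswith('B-'):
            slots.append((tag[2:], [word]))        # start a new slot
        elif tag.startswith('I-') and slots and slots[-1][0] == tag[2:]:
            slots[-1][1].append(word)              # continue the current slot
    return [(name, ' '.join(words)) for name, words in slots]

print(tags_to_slots([('dwunastą', 'B-hour'), ('czterdzieści', 'I-hour'),
                     ('pięć', 'I-hour'), ('na', 'O'), ('pięć', 'B-size')]))
# [('hour', 'dwunastą czterdzieści pięć'), ('size', 'pięć')]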

References

  1. Sebastian Schuster, Sonal Gupta, Rushin Shah, Mike Lewis, Cross-lingual Transfer Learning for Multilingual Task Oriented Dialog. NAACL-HLT (1) 2019, pp. 3795-3805
  2. John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML '01). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp. 282-289, https://repository.upenn.edu/cgi/viewcontent.cgi?article=1162&context=cis_papers
  3. Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Comput. 9, 8 (November 15, 1997), pp. 1735-1780, https://doi.org/10.1162/neco.1997.9.8.1735
  4. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, Attention is All you Need, NIPS 2017, pp. 5998-6008, https://arxiv.org/abs/1706.03762
  5. Alan Akbik, Duncan Blythe, Roland Vollgraf, Contextual String Embeddings for Sequence Labeling, Proceedings of the 27th International Conference on Computational Linguistics, 2018, pp. 1638-1649, https://www.aclweb.org/anthology/C18-1139.pdf