projekt-glebokie/GPT_2.ipynb

Setup

Requirements

!pip install torch
!pip install datasets
!pip install transformers
!pip install scikit-learn
!pip install evaluate
!pip install accelerate
!pip install sentencepiece
!pip install protobuf
!pip install sacrebleu
!pip install py7zr
All requirements already satisfied in the Colab environment: torch 1.13.1+cu116, datasets 2.9.0, transformers 4.26.1, scikit-learn 1.0.2, evaluate 0.4.0, accelerate 0.16.0, sentencepiece 0.1.97, protobuf 3.19.6, sacrebleu 2.3.1, py7zr 0.20.4.

Imports

import os
import json
import torch
from google.colab import drive
from pathlib import Path
from typing import Dict, List
from datasets import load_dataset
from transformers import T5Tokenizer  # imported but not used in this notebook

Loading data

loaded_data = load_dataset('emotion')
!mkdir -v -p data
train_path = Path('data/train.json')
valid_path = Path('data/valid.json')
test_path = Path('data/test.json')
data_train, data_valid, data_test = [], [], []
WARNING:datasets.builder:No config specified, defaulting to: emotion/split
WARNING:datasets.builder:Found cached dataset emotion (/root/.cache/huggingface/datasets/emotion/split/1.0.0/cca5efe2dfeb58c1d098e0f9eeb200e9927d889b5a03c67097275dfb5fe463bd)
  0%|          | 0/3 [00:00<?, ?it/s]
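
The emotion dataset comes with ready-made train/validation/test splits. Before converting, the splits can be inspected directly; a minimal sketch using the standard datasets API:

# Inspect the loaded DatasetDict: split sizes, features, and a raw example
print(loaded_data)
print(loaded_data['train'].features)  # 'text' (string), 'label' (ClassLabel with 6 names)
print(loaded_data['train'][0])
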
# Copy each split into a plain list of {'label', 'text'} records;
# max_size (None = no limit) can truncate a split for quick experiments.
for source_data, dataset, max_size in [
  (loaded_data['train'], data_train, None),
  (loaded_data['validation'], data_valid, None),
  (loaded_data['test'], data_test, None),
]:
  for i, data in enumerate(source_data):
    if max_size is not None and i >= max_size:
      break
    data_line = {
      'label': int(data['label']),
      'text': data['text'],
    }
    dataset.append(data_line)

print(f'Train: {len(data_train):6d}')
print(f'Valid: {len(data_valid):6d}')
print(f'Test: {len(data_test):6d}')
Train:  16000
Valid:   2000
Test:   2000
MAP_LABEL_TRANSLATION = {
    0: 'sadness',
    1: 'joy',
    2: 'love',
    3: 'anger',
    4: 'fear',
    5: 'surprise',
}
def save_as_translations(original_save_path: Path, data_to_save: List[Dict]) -> None:
    """Save a copy with integer labels replaced by emotion names, prefixed 's2s-'
    (for the seq2seq variant). Note: mutates data_to_save in place, so it must
    run after the numeric-label file has been written."""
    file_name = 's2s-' + original_save_path.name
    file_path = original_save_path.parent / file_name

    print(f'Saving into: {file_path}')
    with open(file_path, 'wt') as f_write:
        for data_line in data_to_save:
            label = data_line['label']
            new_label = MAP_LABEL_TRANSLATION[label]
            data_line['label'] = new_label
            data_line_str = json.dumps(data_line)
            f_write.write(f'{data_line_str}\n')
for file_path, data_to_save in [(train_path, data_train), (valid_path, data_valid), (test_path, data_test)]:
  print(f'Saving into: {file_path}')
  with open(file_path, 'wt') as f_write:
    for data_line in data_to_save:
      data_line_str = json.dumps(data_line)
      f_write.write(f'{data_line_str}\n')
  
  save_as_translations(file_path, data_to_save)
Saving into: data/train.json
Saving into: data/s2s-train.json
Saving into: data/valid.json
Saving into: data/s2s-valid.json
Saving into: data/test.json
Saving into: data/s2s-test.json
!head data/train.json
{"label": 0, "text": "i didnt feel humiliated"}
{"label": 0, "text": "i can go from feeling so hopeless to so damned hopeful just from being around someone who cares and is awake"}
{"label": 3, "text": "im grabbing a minute to post i feel greedy wrong"}
{"label": 2, "text": "i am ever feeling nostalgic about the fireplace i will know that it is still on the property"}
{"label": 3, "text": "i am feeling grouchy"}
{"label": 0, "text": "ive been feeling a little burdened lately wasnt sure why that was"}
{"label": 5, "text": "ive been taking or milligrams or times recommended amount and ive fallen asleep a lot faster but i also feel like so funny"}
{"label": 4, "text": "i feel as confused about life as a teenager or as jaded as a year old man"}
{"label": 1, "text": "i have been with petronas for years i feel that petronas has performed well and made a huge profit"}
{"label": 2, "text": "i feel romantic too"}
!head data/s2s-train.json
{"label": "sadness", "text": "i didnt feel humiliated"}
{"label": "sadness", "text": "i can go from feeling so hopeless to so damned hopeful just from being around someone who cares and is awake"}
{"label": "anger", "text": "im grabbing a minute to post i feel greedy wrong"}
{"label": "love", "text": "i am ever feeling nostalgic about the fireplace i will know that it is still on the property"}
{"label": "anger", "text": "i am feeling grouchy"}
{"label": "sadness", "text": "ive been feeling a little burdened lately wasnt sure why that was"}
{"label": "surprise", "text": "ive been taking or milligrams or times recommended amount and ive fallen asleep a lot faster but i also feel like so funny"}
{"label": "fear", "text": "i feel as confused about life as a teenager or as jaded as a year old man"}
{"label": "joy", "text": "i have been with petronas for years i feel that petronas has performed well and made a huge profit"}
{"label": "love", "text": "i feel romantic too"}
# Create tiny datasets for debugging purposes.
# read_text() ends with '\n', so split('\n') leaves a trailing empty string;
# each 500-line file therefore shows up as 499 lines in wc below.
for file_name in ["train", "valid", "test"]:
  print(f"=== {file_name} ===")
  all_text = Path(f"data/{file_name}.json").read_text().split('\n')
  text = all_text[:250] + all_text[-250:]
  Path(f"data/{file_name}-500.json").write_text("\n".join(text))
=== train ===
=== valid ===
=== test ===
!wc -l data/*
   2000 data/s2s-test.json
  16000 data/s2s-train.json
   2000 data/s2s-valid.json
    499 data/test-500.json
   2000 data/test.json
    499 data/train-500.json
  16000 data/train.json
    499 data/valid-500.json
   2000 data/valid.json
  41497 total

GPU Info

!nvidia-smi
Sun Feb 12 23:30:18 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   56C    P0    26W /  70W |      0MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
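
The same check can be done from PyTorch before launching training; a minimal sketch:

# Confirm PyTorch sees the GPU that nvidia-smi reported
print(torch.cuda.is_available())      # True
print(torch.cuda.get_device_name(0))  # Tesla T4
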
os.environ['TOKENIZERS_PARALLELISM'] = 'true'  # keep tokenizer parallelism on and silence the fork warning

Run

!wget 'https://git.wmi.amu.edu.pl/s444465/projekt-glebokie/raw/branch/master/run_glue.py' -O 'run_glue.py'
--2023-02-12 23:30:18--  https://git.wmi.amu.edu.pl/s444465/projekt-glebokie/raw/branch/master/run_glue.py
Resolving git.wmi.amu.edu.pl (git.wmi.amu.edu.pl)... 150.254.78.40
Connecting to git.wmi.amu.edu.pl (git.wmi.amu.edu.pl)|150.254.78.40|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 30601 (30K) [text/plain]
Saving to: run_glue.py

run_glue.py         100%[===================>]  29.88K  --.-KB/s    in 0.03s   

2023-02-12 23:30:18 (982 KB/s) - run_glue.py saved [30601/30601]

!wget 'https://git.wmi.amu.edu.pl/s444465/projekt-glebokie/raw/branch/master/roberta.py' -O 'roberta.py'
--2023-02-12 23:30:18--  https://git.wmi.amu.edu.pl/s444465/projekt-glebokie/raw/branch/master/roberta.py
Resolving git.wmi.amu.edu.pl (git.wmi.amu.edu.pl)... 150.254.78.40
Connecting to git.wmi.amu.edu.pl (git.wmi.amu.edu.pl)|150.254.78.40|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12783 (12K) [text/plain]
Saving to: roberta.py

roberta.py          100%[===================>]  12.48K  --.-KB/s    in 0s      

2023-02-12 23:30:18 (263 MB/s) - roberta.py saved [12783/12783]

!wget 'https://git.wmi.amu.edu.pl/s444465/projekt-glebokie/raw/branch/master/gpt2.py' -O 'gpt2.py'
--2023-02-12 23:30:18--  https://git.wmi.amu.edu.pl/s444465/projekt-glebokie/raw/branch/master/gpt2.py
Resolving git.wmi.amu.edu.pl (git.wmi.amu.edu.pl)... 150.254.78.40
Connecting to git.wmi.amu.edu.pl (git.wmi.amu.edu.pl)|150.254.78.40|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8017 (7.8K) [text/plain]
Saving to: gpt2.py

gpt2.py             100%[===================>]   7.83K  --.-KB/s    in 0s      

2023-02-12 23:30:19 (1.42 GB/s) - gpt2.py saved [8017/8017]

torch.cuda.empty_cache()
! python run_glue.py \
  --cache_dir .cache_training \
  --model_name_or_path gpt2 \
  --custom_model gpt2_hidden \
  --train_file data/train.json  \
  --validation_file data/valid.json \
  --test_file data/test.json \
  --per_device_train_batch_size 8 \
  --per_device_eval_batch_size 8 \
  --do_train \
  --do_eval \
  --do_predict \
  --max_seq_length 128 \
  --num_train_epochs 1 \
  --metric_for_best_model accuracy \
  --greater_is_better True \
  --overwrite_output_dir \
  --output_dir out/emotion/gpt2
2023-02-12 23:30:29.286531: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2023-02-12 23:30:29.287316: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2023-02-12 23:30:29.287348: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
WARNING:__main__:Process rank: -1, device: cuda:0, n_gpu: 1, distributed training: False, 16-bits training: False
INFO:__main__:Training/evaluation parameters TrainingArguments(
_n_gpu=1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
do_eval=True,
do_predict=True,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=no,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
gradient_accumulation_steps=1,
gradient_checkpointing=False,
greater_is_better=True,
group_by_length=False,
half_precision_backend=auto,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=every_save,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
include_inputs_for_metrics=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=5e-05,
length_column_name=length,
load_best_model_at_end=False,
local_rank=-1,
log_level=passive,
log_level_replica=passive,
log_on_each_node=True,
logging_dir=out/emotion/gpt2/runs/Feb12_23-30-34_2740c0a1a5dc,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=500,
logging_strategy=steps,
lr_scheduler_type=linear,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=accuracy,
mp_parameters=,
no_cuda=False,
num_train_epochs=1.0,
optim=adamw_hf,
optim_args=None,
output_dir=out/emotion/gpt2,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=8,
per_device_train_batch_size=8,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
ray_scope=last,
remove_unused_columns=True,
report_to=['tensorboard'],
resume_from_checkpoint=None,
run_name=out/emotion/gpt2,
save_on_each_node=False,
save_steps=500,
save_strategy=steps,
save_total_limit=None,
seed=42,
sharded_ddp=[],
skip_memory_metrics=True,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
xpu_backend=None,
)
INFO:__main__:load a local file for train: data/train.json
INFO:__main__:load a local file for validation: data/valid.json
INFO:__main__:load a local file for test: data/test.json
WARNING:datasets.builder:Using custom data configuration default-79a9e082059ced07
INFO:datasets.info:Loading Dataset Infos from /usr/local/lib/python3.8/dist-packages/datasets/packaged_modules/json
INFO:datasets.builder:Generating dataset json (/content/.cache_training/json/default-79a9e082059ced07/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51)
Downloading and preparing dataset json/default to /content/.cache_training/json/default-79a9e082059ced07/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51...
Downloading data files: 100% 3/3 [00:00<00:00, 12384.76it/s]
INFO:datasets.download.download_manager:Downloading took 0.0 min
INFO:datasets.download.download_manager:Checksum Computation took 0.0 min
Extracting data files: 100% 3/3 [00:00<00:00, 1936.13it/s]
INFO:datasets.utils.info_utils:Unable to verify checksums.
INFO:datasets.builder:Generating train split
INFO:datasets.builder:Generating validation split
INFO:datasets.builder:Generating test split
INFO:datasets.utils.info_utils:Unable to verify splits sizes.
Dataset json downloaded and prepared to /content/.cache_training/json/default-79a9e082059ced07/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51. Subsequent calls will reuse this data.
100% 3/3 [00:00<00:00, 989.92it/s]
[INFO|configuration_utils.py:660] 2023-02-12 23:30:36,613 >> loading configuration file config.json from cache at .cache_training/models--gpt2/snapshots/e7da7f221d5bf496a48136c0cd264e630fe9fcc8/config.json
[INFO|configuration_utils.py:712] 2023-02-12 23:30:36,614 >> Model config GPT2Config {
  "_name_or_path": "gpt2",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2",
    "3": "LABEL_3",
    "4": "LABEL_4",
    "5": "LABEL_5"
  },
  "initializer_range": 0.02,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "LABEL_2": 2,
    "LABEL_3": 3,
    "LABEL_4": 4,
    "LABEL_5": 5
  },
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "transformers_version": "4.26.1",
  "use_cache": true,
  "vocab_size": 50257
}

[INFO|tokenization_auto.py:458] 2023-02-12 23:30:36,976 >> Could not locate the tokenizer configuration file, will try to use the model config instead.
[INFO|configuration_utils.py:660] 2023-02-12 23:30:37,341 >> loading configuration file config.json from cache at .cache_training/models--gpt2/snapshots/e7da7f221d5bf496a48136c0cd264e630fe9fcc8/config.json
[INFO|configuration_utils.py:712] 2023-02-12 23:30:37,342 >> Model config GPT2Config {
  "_name_or_path": "gpt2",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "transformers_version": "4.26.1",
  "use_cache": true,
  "vocab_size": 50257
}

[INFO|tokenization_utils_base.py:1802] 2023-02-12 23:30:38,088 >> loading file vocab.json from cache at .cache_training/models--gpt2/snapshots/e7da7f221d5bf496a48136c0cd264e630fe9fcc8/vocab.json
[INFO|tokenization_utils_base.py:1802] 2023-02-12 23:30:38,088 >> loading file merges.txt from cache at .cache_training/models--gpt2/snapshots/e7da7f221d5bf496a48136c0cd264e630fe9fcc8/merges.txt
[INFO|tokenization_utils_base.py:1802] 2023-02-12 23:30:38,088 >> loading file tokenizer.json from cache at .cache_training/models--gpt2/snapshots/e7da7f221d5bf496a48136c0cd264e630fe9fcc8/tokenizer.json
[INFO|tokenization_utils_base.py:1802] 2023-02-12 23:30:38,089 >> loading file added_tokens.json from cache at None
[INFO|tokenization_utils_base.py:1802] 2023-02-12 23:30:38,089 >> loading file special_tokens_map.json from cache at None
[INFO|tokenization_utils_base.py:1802] 2023-02-12 23:30:38,089 >> loading file tokenizer_config.json from cache at None
[INFO|configuration_utils.py:660] 2023-02-12 23:30:38,089 >> loading configuration file config.json from cache at .cache_training/models--gpt2/snapshots/e7da7f221d5bf496a48136c0cd264e630fe9fcc8/config.json
[INFO|configuration_utils.py:712] 2023-02-12 23:30:38,090 >> Model config GPT2Config {
  "_name_or_path": "gpt2",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "transformers_version": "4.26.1",
  "use_cache": true,
  "vocab_size": 50257
}

INFO:__main__:Using hidden states in model: True
INFO:__main__:Using implementation from class: GPT2ForSequenceClassificationCustom
[INFO|modeling_utils.py:2275] 2023-02-12 23:30:38,214 >> loading weights file pytorch_model.bin from cache at .cache_training/models--gpt2/snapshots/e7da7f221d5bf496a48136c0cd264e630fe9fcc8/pytorch_model.bin
[INFO|modeling_utils.py:2857] 2023-02-12 23:30:43,108 >> All model checkpoint weights were used when initializing GPT2ForSequenceClassificationCustom.

[WARNING|modeling_utils.py:2859] 2023-02-12 23:30:43,108 >> Some weights of GPT2ForSequenceClassificationCustom were not initialized from the model checkpoint at gpt2 and are newly initialized: ['score.out_proj.weight', 'score.dense_1_hidden.bias', 'score.dense_3.bias', 'score.dense_1_input.weight', 'score.dense_1_hidden.weight', 'score.dense_3.weight', 'score.dense_1_input.bias', 'score.dense_2.weight', 'score.dense_2.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[ERROR|tokenization_utils_base.py:1042] 2023-02-12 23:30:43,118 >> Using pad_token, but it is not set yet.
INFO:__main__:Set PAD token to EOS: <|endoftext|>
Running tokenizer on dataset:   0% 0/16 [00:00<?, ?ba/s]INFO:datasets.arrow_dataset:Caching processed dataset at /content/.cache_training/json/default-79a9e082059ced07/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51/cache-5493d7f118c94c16.arrow
Running tokenizer on dataset: 100% 16/16 [00:01<00:00,  9.52ba/s]
Running tokenizer on dataset:   0% 0/2 [00:00<?, ?ba/s]INFO:datasets.arrow_dataset:Caching processed dataset at /content/.cache_training/json/default-79a9e082059ced07/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51/cache-b591b09c51834ed3.arrow
Running tokenizer on dataset: 100% 2/2 [00:00<00:00, 10.61ba/s]
Running tokenizer on dataset:   0% 0/2 [00:00<?, ?ba/s]INFO:datasets.arrow_dataset:Caching processed dataset at /content/.cache_training/json/default-79a9e082059ced07/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51/cache-a7b29ded24225be9.arrow
Running tokenizer on dataset: 100% 2/2 [00:00<00:00, 10.91ba/s]
INFO:__main__:Sample 10476 of the training set: {'label': 0, 'text': 'i do find new friends i m going to try extra hard to make them stay and if i decide that i don t want to feel hurt again and just ride out the last year of school on my own i m going to have to try extra hard not to care what people think of me being a loner', 'input_ids': [72, 466, 1064, 649, 2460, 1312, 285, 1016, 284, 1949, 3131, 1327, 284, 787, 606, 2652, 290, 611, 1312, 5409, 326, 1312, 836, 256, 765, 284, 1254, 5938, 757, 290, 655, 6594, 503, 262, 938, 614, 286, 1524, 319, 616, 898, 1312, 285, 1016, 284, 423, 284, 1949, 3131, 1327, 407, 284, 1337, 644, 661, 892, 286, 502, 852, 257, 300, 14491, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}.
INFO:__main__:Sample 1824 of the training set: {'label': 1, 'text': 'i asked them to join me in creating a world where all year old girls could grow up feeling hopeful and powerful', 'input_ids': [72, 1965, 606, 284, 4654, 502, 287, 4441, 257, 995, 810, 477, 614, 1468, 4813, 714, 1663, 510, 4203, 17836, 290, 3665, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}.
INFO:__main__:Sample 409 of the training set: {'label': 2, 'text': 'i feel when you are a caring person you attract other caring people into your life', 'input_ids': [72, 1254, 618, 345, 389, 257, 18088, 1048, 345, 4729, 584, 18088, 661, 656, 534, 1204, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}.
[INFO|trainer.py:710] 2023-02-12 23:30:52,243 >> The following columns in the training set don't have a corresponding argument in `GPT2ForSequenceClassificationCustom.forward` and have been ignored: text. If text are not expected by `GPT2ForSequenceClassificationCustom.forward`,  you can safely ignore this message.
/usr/local/lib/python3.8/dist-packages/transformers/optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
[INFO|trainer.py:1650] 2023-02-12 23:30:52,252 >> ***** Running training *****
[INFO|trainer.py:1651] 2023-02-12 23:30:52,253 >>   Num examples = 16000
[INFO|trainer.py:1652] 2023-02-12 23:30:52,253 >>   Num Epochs = 1
[INFO|trainer.py:1653] 2023-02-12 23:30:52,253 >>   Instantaneous batch size per device = 8
[INFO|trainer.py:1654] 2023-02-12 23:30:52,253 >>   Total train batch size (w. parallel, distributed & accumulation) = 8
[INFO|trainer.py:1655] 2023-02-12 23:30:52,253 >>   Gradient Accumulation steps = 1
[INFO|trainer.py:1656] 2023-02-12 23:30:52,253 >>   Total optimization steps = 2000
[INFO|trainer.py:1657] 2023-02-12 23:30:52,254 >>   Number of trainable parameters = 137425920
{'loss': 0.9449, 'learning_rate': 3.7500000000000003e-05, 'epoch': 0.25}
 25% 500/2000 [02:07<06:07,  4.08it/s][INFO|trainer.py:2709] 2023-02-12 23:32:59,613 >> Saving model checkpoint to out/emotion/gpt2/checkpoint-500
[INFO|configuration_utils.py:453] 2023-02-12 23:32:59,615 >> Configuration saved in out/emotion/gpt2/checkpoint-500/config.json
[INFO|modeling_utils.py:1704] 2023-02-12 23:33:01,554 >> Model weights saved in out/emotion/gpt2/checkpoint-500/pytorch_model.bin
[INFO|tokenization_utils_base.py:2160] 2023-02-12 23:33:01,555 >> tokenizer config file saved in out/emotion/gpt2/checkpoint-500/tokenizer_config.json
[INFO|tokenization_utils_base.py:2167] 2023-02-12 23:33:01,555 >> Special tokens file saved in out/emotion/gpt2/checkpoint-500/special_tokens_map.json
{'loss': 0.3705, 'learning_rate': 2.5e-05, 'epoch': 0.5}
 50% 1000/2000 [04:17<04:09,  4.01it/s][INFO|trainer.py:2709] 2023-02-12 23:35:09,781 >> Saving model checkpoint to out/emotion/gpt2/checkpoint-1000
[INFO|configuration_utils.py:453] 2023-02-12 23:35:09,783 >> Configuration saved in out/emotion/gpt2/checkpoint-1000/config.json
[INFO|modeling_utils.py:1704] 2023-02-12 23:35:11,881 >> Model weights saved in out/emotion/gpt2/checkpoint-1000/pytorch_model.bin
[INFO|tokenization_utils_base.py:2160] 2023-02-12 23:35:11,882 >> tokenizer config file saved in out/emotion/gpt2/checkpoint-1000/tokenizer_config.json
[INFO|tokenization_utils_base.py:2167] 2023-02-12 23:35:11,882 >> Special tokens file saved in out/emotion/gpt2/checkpoint-1000/special_tokens_map.json
{'loss': 0.264, 'learning_rate': 1.25e-05, 'epoch': 0.75}
 75% 1500/2000 [06:27<02:03,  4.06it/s][INFO|trainer.py:2709] 2023-02-12 23:37:20,141 >> Saving model checkpoint to out/emotion/gpt2/checkpoint-1500
[INFO|configuration_utils.py:453] 2023-02-12 23:37:20,142 >> Configuration saved in out/emotion/gpt2/checkpoint-1500/config.json
[INFO|modeling_utils.py:1704] 2023-02-12 23:37:22,060 >> Model weights saved in out/emotion/gpt2/checkpoint-1500/pytorch_model.bin
[INFO|tokenization_utils_base.py:2160] 2023-02-12 23:37:22,061 >> tokenizer config file saved in out/emotion/gpt2/checkpoint-1500/tokenizer_config.json
[INFO|tokenization_utils_base.py:2167] 2023-02-12 23:37:22,061 >> Special tokens file saved in out/emotion/gpt2/checkpoint-1500/special_tokens_map.json
{'loss': 0.2223, 'learning_rate': 0.0, 'epoch': 1.0}
100% 2000/2000 [08:38<00:00,  4.06it/s][INFO|trainer.py:2709] 2023-02-12 23:39:30,550 >> Saving model checkpoint to out/emotion/gpt2/checkpoint-2000
[INFO|configuration_utils.py:453] 2023-02-12 23:39:30,551 >> Configuration saved in out/emotion/gpt2/checkpoint-2000/config.json
[INFO|modeling_utils.py:1704] 2023-02-12 23:39:32,522 >> Model weights saved in out/emotion/gpt2/checkpoint-2000/pytorch_model.bin
[INFO|tokenization_utils_base.py:2160] 2023-02-12 23:39:32,523 >> tokenizer config file saved in out/emotion/gpt2/checkpoint-2000/tokenizer_config.json
[INFO|tokenization_utils_base.py:2167] 2023-02-12 23:39:32,524 >> Special tokens file saved in out/emotion/gpt2/checkpoint-2000/special_tokens_map.json
[INFO|trainer.py:1901] 2023-02-12 23:39:36,929 >> 

Training completed. Do not forget to share your model on huggingface.co/models =)


{'train_runtime': 524.6759, 'train_samples_per_second': 30.495, 'train_steps_per_second': 3.812, 'train_loss': 0.4504347610473633, 'epoch': 1.0}
100% 2000/2000 [08:44<00:00,  3.81it/s]
[INFO|trainer.py:2709] 2023-02-12 23:39:36,932 >> Saving model checkpoint to out/emotion/gpt2
[INFO|configuration_utils.py:453] 2023-02-12 23:39:36,934 >> Configuration saved in out/emotion/gpt2/config.json
[INFO|modeling_utils.py:1704] 2023-02-12 23:39:39,121 >> Model weights saved in out/emotion/gpt2/pytorch_model.bin
[INFO|tokenization_utils_base.py:2160] 2023-02-12 23:39:39,122 >> tokenizer config file saved in out/emotion/gpt2/tokenizer_config.json
[INFO|tokenization_utils_base.py:2167] 2023-02-12 23:39:39,122 >> Special tokens file saved in out/emotion/gpt2/special_tokens_map.json
***** train metrics *****
  epoch                    =        1.0
  train_loss               =     0.4504
  train_runtime            = 0:08:44.67
  train_samples            =      16000
  train_samples_per_second =     30.495
  train_steps_per_second   =      3.812
INFO:__main__:*** Evaluate ***
[INFO|trainer.py:710] 2023-02-12 23:39:39,296 >> The following columns in the evaluation set don't have a corresponding argument in `GPT2ForSequenceClassificationCustom.forward` and have been ignored: text. If text are not expected by `GPT2ForSequenceClassificationCustom.forward`,  you can safely ignore this message.
[INFO|trainer.py:2964] 2023-02-12 23:39:39,300 >> ***** Running Evaluation *****
[INFO|trainer.py:2966] 2023-02-12 23:39:39,301 >>   Num examples = 2000
[INFO|trainer.py:2969] 2023-02-12 23:39:39,301 >>   Batch size = 8
100% 250/250 [00:16<00:00, 14.71it/s]
***** eval metrics *****
  epoch                   =        1.0
  eval_accuracy           =     0.9355
  eval_loss               =     0.1925
  eval_runtime            = 0:00:17.11
  eval_samples            =       2000
  eval_samples_per_second =    116.846
  eval_steps_per_second   =     14.606
INFO:__main__:*** Predict ***
[INFO|trainer.py:710] 2023-02-12 23:39:56,431 >> The following columns in the test set don't have a corresponding argument in `GPT2ForSequenceClassificationCustom.forward` and have been ignored: text. If text are not expected by `GPT2ForSequenceClassificationCustom.forward`,  you can safely ignore this message.
[INFO|trainer.py:2964] 2023-02-12 23:39:56,432 >> ***** Running Prediction *****
[INFO|trainer.py:2966] 2023-02-12 23:39:56,433 >>   Num examples = 2000
[INFO|trainer.py:2969] 2023-02-12 23:39:56,433 >>   Batch size = 8
100% 250/250 [00:17<00:00, 14.46it/s]
INFO:__main__:***** Predict results None *****
[INFO|modelcard.py:449] 2023-02-12 23:40:14,252 >> Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Text Classification', 'type': 'text-classification'}, 'metrics': [{'name': 'Accuracy', 'type': 'accuracy', 'value': 0.9355000257492065}]}
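
With eval accuracy around 0.94, the checkpoint in out/emotion/gpt2 is usable for inference. A minimal sketch, assuming the downloaded gpt2.py exposes the GPT2ForSequenceClassificationCustom class named in the log above (as a PreTrainedModel subclass it inherits from_pretrained):

import torch
from transformers import AutoTokenizer
from gpt2 import GPT2ForSequenceClassificationCustom  # assumed import path; class name taken from the log

tokenizer = AutoTokenizer.from_pretrained('out/emotion/gpt2')
model = GPT2ForSequenceClassificationCustom.from_pretrained('out/emotion/gpt2')
model.eval()

inputs = tokenizer('i feel romantic too', return_tensors='pt',
                   truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits
pred = int(logits.argmax(dim=-1))
print(MAP_LABEL_TRANSLATION[pred])  # the training data labels this text 'love'
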

Save model

drive.mount('/content/drive')
!cp -r /content/out/emotion /content/drive/MyDrive/models
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
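
In a later session the model can be restored from Drive; a minimal sketch, assuming the same Drive layout as the copy above:

drive.mount('/content/drive')
!mkdir -p /content/out
!cp -r /content/drive/MyDrive/models/emotion /content/out/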