
"cells": [
"cell_type": "markdown",
"metadata": {},
"source": [
"cell_type": "markdown",
"metadata": {},
"source": [
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from transformers import T5Tokenizer, T5ForConditionalGeneration"
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"text = \"translate English to French: My name is Azeem and I live in India\""
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"text = \"summarize: Machine learning involves computers discovering how they can perform tasks without being explicitly programmed to do so. It involves computers learning from data provided so that they carry out certain tasks. For simple tasks assigned to computers, it is possible to program algorithms telling the machine how to execute all steps required to solve the problem at hand; on the computer's part, no learning is needed. For more advanced tasks, it can be challenging for a human to manually create the needed algorithms. In practice, it can turn out to be more effective to help the machine develop its own algorithm, rather than having human programmers specify every needed step.\""
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": true
"outputs": [
"name": "stdout",
"output_type": "stream",
"text": [
"machine learning involves computers learning from data provided so that they carry out certain tasks without being explicitly programme\n"
"source": [
"from transformers import T5Tokenizer, T5ForConditionalGeneration\n",
"tokenizer = T5Tokenizer.from_pretrained('t5-small')\n",
"model = T5ForConditionalGeneration.from_pretrained('t5-small', return_dict=True,).to('cuda')\n",
"# You can also use \"translate English to French\" and \"translate English to Romanian\"\n",
"input_ids = tokenizer(text, return_tensors=\"pt\")'cuda') # Batch size 1\n",
"outputs = model.generate(input_ids)\n",
"decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)\n",
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
"data": {
"text/plain": [
" (shared): Embedding(32128, 512)\n",
" (encoder): T5Stack(\n",
" (embed_tokens): Embedding(32128, 512)\n",
" (block): ModuleList(\n",
" (0): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" (relative_attention_bias): Embedding(32, 8)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (1): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (2): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (3): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (4): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (5): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" )\n",
" (final_layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (decoder): T5Stack(\n",
" (embed_tokens): Embedding(32128, 512)\n",
" (block): ModuleList(\n",
" (0): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" (relative_attention_bias): Embedding(32, 8)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (1): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (2): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (3): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (4): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (5): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" )\n",
" (final_layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (lm_head): Linear(in_features=512, out_features=32128, bias=False)\n",
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
"source": [
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"KLEISTER_PATH = '/media/kuba/ssdsam/Syncthing/Syncthing/przedmioty/2020-02/IE/applica/kleister-nda/'"
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"train_exp_f = open(KLEISTER_PATH + 'train/expected.tsv')\n",
"train_exp = []\n",
"for line in train_exp_f:\n",
" line_splitted = line.strip('\\n').split(' ')\n",
" found = False\n",
" for elem in line_splitted:\n",
" if 'jurisdiction=' in elem:\n",
" train_exp.append('jurisdiction: ' + elem.split('=')[1])\n",
" found = True\n",
" break\n",
" if not found:\n",
" train_exp.append('jurisdiction: NONE')"
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"dev_exp_f = open(KLEISTER_PATH + 'dev-0/expected.tsv')\n",
"dev_exp = []\n",
"for line in dev_exp_f:\n",
" line_splitted = line.strip('\\n').split(' ')\n",
" found = False\n",
" for elem in line_splitted:\n",
" if 'jurisdiction=' in elem:\n",
" dev_exp.append('jurisdiction: ' + elem.split('=')[1])\n",
" found = True\n",
" break\n",
" if not found:\n",
" dev_exp.append('jurisdiction: NONE')"
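The train and dev cells above repeat the same extraction loop. A minimal sketch of a shared helper follows; the names `extract_field` and `load_expected` are hypothetical and not part of the notebook. Note one deliberate difference: it matches with `startswith` rather than the notebook's substring test, so it is slightly stricter about where `jurisdiction=` may appear.

```python
def extract_field(line, key="jurisdiction", missing="NONE"):
    """Return 'key: value' for the first key=value pair in a space-separated line.

    Hypothetical helper: mirrors the notebook's loops, but uses startswith()
    instead of a substring test.
    """
    prefix = key + "="
    for elem in line.rstrip("\n").split(" "):
        if elem.startswith(prefix):
            # Everything after the first '=' is the value
            return f"{key}: {elem[len(prefix):]}"
    return f"{key}: {missing}"


def load_expected(path, key="jurisdiction"):
    """Read an expected.tsv-style file and extract `key` from every line."""
    with open(path, encoding="utf-8") as f:
        return [extract_field(line, key) for line in f]
```

Under these assumptions, `train_exp = load_expected(KLEISTER_PATH + 'train/expected.tsv')` and the matching call for `dev-0` would replace the two loops.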
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
"data": {
"text/plain": [
"['jurisdiction: Oregon',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Florida',\n",
" 'jurisdiction: Pennsylvania',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Illinois',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Iowa',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Indiana',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Michigan',\n",
" 'jurisdiction: Indiana',\n",
" 'jurisdiction: Colorado',\n",
" 'jurisdiction: Georgia',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Oregon',\n",
" 'jurisdiction: Pennsylvania',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Florida',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Illinois',\n",
" 'jurisdiction: Illinois',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Missouri',\n",
" 'jurisdiction: Oregon',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Connecticut',\n",
" 'jurisdiction: Nevada',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Illinois',\n",
" 'jurisdiction: Idaho',\n",
" 'jurisdiction: Florida',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Minnesota',\n",
" 'jurisdiction: Virginia',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Nevada',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Washington',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Ohio',\n",
" 'jurisdiction: Nevada',\n",
" 'jurisdiction: Georgia',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: Texas',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Virginia',\n",
" 'jurisdiction: Wisconsin',\n",
" 'jurisdiction: Colorado',\n",
" 'jurisdiction: Oregon',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Ohio',\n",
" 'jurisdiction: Missouri',\n",
" 'jurisdiction: South_Dakota',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Indiana',\n",
" 'jurisdiction: Minnesota',\n",
" 'jurisdiction: Maine',\n",
" 'jurisdiction: Missouri',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Illinois',\n",
" 'jurisdiction: Indiana',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: Illinois',\n",
" 'jurisdiction: New_Jersey',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Maine',\n",
" 'jurisdiction: North_Carolina',\n",
" 'jurisdiction: Missouri',\n",
" 'jurisdiction: Georgia',\n",
" 'jurisdiction: Missouri',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Georgia',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Kansas',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Oregon',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Connecticut',\n",
" 'jurisdiction: Utah',\n",
" 'jurisdiction: Texas',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Ohio',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: South_Carolina',\n",
" 'jurisdiction: Texas',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: New_Jersey',\n",
" 'jurisdiction: Georgia',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: Texas',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Pennsylvania',\n",
" 'jurisdiction: Pennsylvania',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Florida',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Oregon',\n",
" 'jurisdiction: North_Carolina',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Missouri',\n",
" 'jurisdiction: Virginia',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: Wisconsin',\n",
" 'jurisdiction: Washington',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Illinois',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Ohio',\n",
" 'jurisdiction: Illinois',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: New_Jersey',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: Utah',\n",
" 'jurisdiction: Washington',\n",
" 'jurisdiction: Texas',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Colorado',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Ohio',\n",
" 'jurisdiction: Pennsylvania',\n",
" 'jurisdiction: New_Jersey',\n",
" 'jurisdiction: Virginia',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Nevada',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Texas',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: New_Jersey',\n",
" 'jurisdiction: Missouri',\n",
" 'jurisdiction: Illinois',\n",
" 'jurisdiction: Texas',\n",
" 'jurisdiction: New_Jersey',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Missouri',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Nevada',\n",
" 'jurisdiction: Florida',\n",
" 'jurisdiction: Kansas',\n",
" 'jurisdiction: Oregon',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Texas',\n",
" 'jurisdiction: New_Jersey',\n",
" 'jurisdiction: Florida',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Oregon',\n",
" 'jurisdiction: Minnesota',\n",
" 'jurisdiction: Texas',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Colorado',\n",
" 'jurisdiction: Pennsylvania',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Indiana',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Pennsylvania',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Ohio',\n",
" 'jurisdiction: Illinois',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Oregon',\n",
" 'jurisdiction: Texas',\n",
" 'jurisdiction: Texas',\n",
" 'jurisdiction: Michigan',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Florida',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Ohio',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Georgia',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: Massachusetts',\n",
" 'jurisdiction: Texas',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Pennsylvania',\n",
" 'jurisdiction: Michigan',\n",
" 'jurisdiction: Washington',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Missouri',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Texas',\n",
" 'jurisdiction: Florida',\n",
" 'jurisdiction: Ohio',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Pennsylvania',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Rhode_Island',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Florida',\n",
" 'jurisdiction: New_York',\n",
" 'jurisdiction: Delaware',\n",
" 'jurisdiction: California',\n",
" 'jurisdiction: Delaware']"
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
"source": [
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"train_in_f = open(KLEISTER_PATH + 'train/in.tsv')\n",
"train_in = []\n",
"for line in train_in_f:\n",
" line = line.rstrip('\\n')\n",
" train_in.append(line)"
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"dev_in_f = open(KLEISTER_PATH + 'dev-0/in.tsv')\n",
"dev_in = []\n",
"for line in dev_in_f:\n",
" line = line.rstrip('\\n')\n",
" dev_in.append(line)"
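Each `in.tsv` line is stored whole by the cells above. Judging from the first training record displayed in this notebook, a line appears to hold a file name, a space-separated list of requested keys, and the document text with line breaks written as a literal backslash followed by `n`. The sketch below splits a line on that assumption; `parse_record` is a hypothetical name, and any further tab-separated columns are simply ignored.

```python
def parse_record(line):
    """Split one in.tsv line into file name, requested keys, and document text.

    Assumes at least three tab-separated columns, as seen in the first
    training record; columns after the third are ignored.
    """
    fields = line.rstrip("\n").split("\t")
    filename, keys, text = fields[0], fields[1].split(" "), fields[2]
    # The text column encodes line breaks as a literal backslash + 'n'
    return {"filename": filename, "keys": keys, "text": text.replace("\\n", "\n")}
```

For example, `parse_record(train_in[0])["text"]` would give the contract text alone, without the file name and key list that the model otherwise sees at the start of its input.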
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
"data": {
"text/plain": [
"'00a1d238e37ac225b8045a97953e845d.pdf\\teffective_date jurisdiction party term\\tEX-10.23 5 dex1023.htm COVENANT NOT TO COMPETE AND NON-DISCLOSURE AGREEMENT\\\\nExhibit 10.23\\\\nCOVENANT NOT TO COMPETE\\\\nAND NON-DISCLOSURE AGREEMENT\\\\nPARTIES:\\\\nEric Dean Sprunk (“EMPLOYEE”)\\\\nand\\\\nNIKE, Inc., divisions, subsidiaries\\\\nand affiliates. (“NIKE”):\\\\nRECITALS:\\\\nA. This Covenant Not to Compete and Non-Disclosure Agreement is executed upon initial employment or upon the EMPLOYEEs\\\\nadvancement with NIKE and is a condition of such employment or advancement.\\\\nB. Over the course of EMPLOYEEs employment with NIKE, EMPLOYEE will be or has been exposed to and/or is in a position to\\\\ndevelop confidential information peculiar to NIKEs business and not generally known to the public as defined below (“Protected Information”). It is\\\\nanticipated that EMPLOYEE will continue to be exposed to Protected Information of greater sensitivity as EMPLOYEE advances in the company.\\\\nC. The nature of NIKEs business is highly competitive and disclosure of any Protected Information would result in severe damage to NIKE\\\\nand be difficult to measure.\\\\nD. NIKE makes use of its Protected Information throughout the world. Protected Information of NIKE can be used to NIKEs detriment\\\\nanywhere in the world.\\\\nAGREEMENT:\\\\nIn consideration of the foregoing, and the terms and conditions set forth below, the parties agree as follows:\\\\n1. Covenant Not to Compete.\\\\n(a) Competition Restriction. 
During EMPLOYEEs employment by NIKE, under the terms of any employment contract or\\\\notherwise, and for one year thereafter, (the “Restriction Period”), EMPLOYEE will not directly or indirectly, own, manage, control, or participate in\\\\nthe ownership,\\\\nmanagement or control of, or be employed by, consult for, or be connected in any manner with, any business engaged anywhere in the world in the\\\\nathletic footwear, athletic apparel or sports equipment and accessories business, or any other business which directly competes with NIKE or any of\\\\nits parent, subsidiaries or affiliated corporations ( “Competitor”). By way of illustration only, examples of NIKE competitors include, but are not\\\\nlimited to: Adidas, FILA, Reebok, Puma, Champion, Oakley, DKNY, Converse, Asics, Saucony, New Balance, Ralph Lauren/Polo Sport, B.U.M,\\\\nFUBU, The Gap, Tommy Hilfiger, Umbro, Northface, Venator (Foot lockers), Sports Authority, Columbia Sportswear, Wilson, Mizuno, Callaway\\\\nGolf and Titleist. This provision is subject to NIKEs option to waive all or any portion of the Restriction Period as more specifically provided\\\\nbelow.\\\\n(b) Extension of Time. In the event EMPLOYEE breaches this covenant not to compete, the Restriction Period shall automatically\\\\ntoll from the date of the first breach, and all subsequent breaches, until the resolution of the breach through private settlement, judicial or other\\\\naction, including all appeals. The Restriction Period shall continue upon the effective date of any such settlement judicial or other resolution. NIKE\\\\nshall not be obligated to pay EMPLOYEE the additional compensation described in paragraph 1(d) below during any period of time in which this\\\\nAgreement is tolled due to EMPLOYEEs breach. 
In the event EMPLOYEE receives such additional compensation after any such breach,\\\\nEMPLOYEE must immediately reimburse NIKE in the amount of all such compensation upon the receipt of a written request by NIKE.\\\\n(c) Waiver of Non-Compete. NIKE has the option, in its sole discretion, to elect to waive all or a portion of the Restriction Period or\\\\nto limit the definition of Competitor, by giving EMPLOYEE seven (7) days prior notice of such election. In the event all or a portion of the\\\\nRestriction Period is waived, NIKE shall not be obligated to pay EMPLOYEE for any period of time as to which the covenant not to compete has\\\\nbeen waived.\\\\n(d) Additional Consideration. As additional consideration for the cov
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
"source": [
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
"data": {
"text/plain": [
"device(type='cuda', index=0)"
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
"source": [
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
"name": "stderr",
"output_type": "stream",
"text": [
"Token indices sequence length is longer than the specified maximum sequence length for this model (11717 > 512). Running this sequence through the model will result in indexing errors\n"
"name": "stdout",
"output_type": "stream",
"text": [
"and non-disclosure Agreement.n(a) Competition Restriction.\n"
"source": [
"input = train_in[0]\n",
"# You can also use \"translate English to French\" and \"translate English to Romanian\"\n",
"input_ids = tokenizer(input, return_tensors=\"pt\").input_ids[:,:512].to('cuda') # Batch size 1\n",
"outputs = model.generate(input_ids)\n",
"decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)\n",
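The cell above keeps only the first 512 token ids of an 11717-token document, so anything later in the contract never reaches the model. One alternative, which the notebook does not use, is to split the id sequence into overlapping windows and run the model on each. The sketch below works on a plain Python list of ids; `chunk_ids` and the `stride` default are assumptions chosen for illustration.

```python
def chunk_ids(ids, max_len=512, stride=384):
    """Split a list of token ids into windows of at most max_len ids.

    Consecutive windows start `stride` ids apart, so they overlap by
    max_len - stride ids. Hypothetical helper, not part of the notebook.
    """
    if not 0 < stride <= max_len:
        raise ValueError("stride must be in the range 1..max_len")
    chunks = []
    for start in range(0, len(ids), stride):
        chunks.append(ids[start:start + max_len])
        if start + max_len >= len(ids):
            break  # this window already reaches the end of the document
    return chunks
```

Each window could then be wrapped in a tensor and passed to `model.generate` on its own; how to combine the per-window predictions into a single answer is left open here.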
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"input_ids = tokenizer('translate English to German: The house is wonderful.', return_tensors='pt')'cuda')\n",
"labels = tokenizer('Das Haus ist wunderbar.', return_tensors='pt')'cuda')\n",
"# the forward function automatically creates the correct decoder_input_ids\n",
"loss = model(input_ids=input_ids, labels=labels).loss"
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
"data": {
"text/plain": [
"tensor(0.2543, device='cuda:0', grad_fn=<NllLossBackward>)"
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
"source": [
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"from transformers import AdamW\n",
"optimizer = AdamW(model.parameters(), lr=5e-5)"
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
"data": {
"text/plain": [
" (shared): Embedding(32128, 512)\n",
" (encoder): T5Stack(\n",
" (embed_tokens): Embedding(32128, 512)\n",
" (block): ModuleList(\n",
" (0): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" (relative_attention_bias): Embedding(32, 8)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (1): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (2): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (3): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (4): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (5): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" )\n",
" (final_layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (decoder): T5Stack(\n",
" (embed_tokens): Embedding(32128, 512)\n",
" (block): ModuleList(\n",
" (0): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" (relative_attention_bias): Embedding(32, 8)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (1): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (2): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (3): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (4): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (5): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" )\n",
" (final_layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (lm_head): Linear(in_features=512, out_features=32128, bias=False)\n",
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
"source": [
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
"name": "stdout",
"output_type": "stream",
"text": [
"source": [
"for line_in, line_exp in zip(train_in, train_exp):\n",
" input_ids = tokenizer(line_in, return_tensors='pt').input_ids[:,:512].to('cuda')\n",
" labels = tokenizer(line_exp, return_tensors='pt')'cuda')\n",
" # the forward function automatically creates the correct decoder_input_ids\n",
" loss = model(input_ids=input_ids, labels=labels).loss\n",
" loss.backward()\n",
" optimizer.step()\n",
" optimizer.zero_grad()\n",
" print(loss.item())"
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
"data": {
"text/plain": [
" (shared): Embedding(32128, 512)\n",
" (encoder): T5Stack(\n",
" (embed_tokens): Embedding(32128, 512)\n",
" (block): ModuleList(\n",
" (0): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" (relative_attention_bias): Embedding(32, 8)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (1): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (2): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (3): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (4): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (5): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" )\n",
" (final_layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (decoder): T5Stack(\n",
" (embed_tokens): Embedding(32128, 512)\n",
" (block): ModuleList(\n",
" (0): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" (relative_attention_bias): Embedding(32, 8)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (1): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (2): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (3): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (4): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" (5): T5Block(\n",
" (layer): ModuleList(\n",
" (0): T5LayerSelfAttention(\n",
" (SelfAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (1): T5LayerCrossAttention(\n",
" (EncDecAttention): T5Attention(\n",
" (q): Linear(in_features=512, out_features=512, bias=False)\n",
" (k): Linear(in_features=512, out_features=512, bias=False)\n",
" (v): Linear(in_features=512, out_features=512, bias=False)\n",
" (o): Linear(in_features=512, out_features=512, bias=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (2): T5LayerFF(\n",
" (DenseReluDense): T5DenseReluDense(\n",
" (wi): Linear(in_features=512, out_features=2048, bias=False)\n",
" (wo): Linear(in_features=2048, out_features=512, bias=False)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" )\n",
" (final_layer_norm): T5LayerNorm()\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (lm_head): Linear(in_features=512, out_features=32128, bias=False)\n",
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
"source": [
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
"name": "stdout",
"output_type": "stream",
"text": [
"jurisdiction: Colorado\n"
"source": [
"input = dev_in[0]\n",
"input_ids = tokenizer(input, return_tensors=\"pt\").input_ids[:,:512].to('cuda') # Batch size 1\n",
"outputs = model.generate(input_ids)\n",
"decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)\n",
"cell_type": "code",
"execution_count": 22,
"metadata": {
"scrolled": true
"outputs": [
"data": {
"text/plain": [
"'jurisdiction: New_York'"
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
"source": [
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
"name": "stdout",
"output_type": "stream",
"text": [
"jurisdiction: Delaware\n"
"source": [
"input = dev_in[2]\n",
"input_ids = tokenizer(input, return_tensors=\"pt\").input_ids[:,:512].to('cuda') # Batch size 1\n",
"outputs = model.generate(input_ids)\n",
"decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)\n",
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
"data": {
"text/plain": [
"'jurisdiction: Delaware'"
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
"source": [
"cell_type": "markdown",
"metadata": {},
"source": [
"## pytanie:\n",
"- co można poprawić w istniejącym rozwiązaniu?"
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
"nbformat": 4,
"nbformat_minor": 4