added all experiments

2023-02-10 12:42:56 +01:00 · 2023-02-10 12:42:56 +01:00 · 1d9457baaa
commit 1d9457baaa
parent 14a6a9b3c3
7 changed files with 23895 additions and 33 deletions
--- a/.gitignore
+++ b/.gitignore
@ -1 +1,2 @@
-.DS_Store
+.DS_Store
+models/
--- a/projekt/BERT_sms_spam.ipynb
+++ b/projekt/BERT_sms_spam.ipynb
--- a/projekt/FLAN_T5_sms_spam.ipynb
+++ b/projekt/FLAN_T5_sms_spam.ipynb
--- a/projekt/GPT2_sms_spam.ipynb
+++ b/projekt/GPT2_sms_spam.ipynb
--- a/projekt/README.md
+++ b/projekt/README.md
@ -0,0 +1,73 @@
+# Project
+Detecting whether a given SMS message is spam (binary classification).
+
+## Dataset
+We used the [sms spam](https://huggingface.co/datasets/sms_spam) dataset. It ships with a training split only, so while training the models we randomly divided it into 3 subsets:
+- test set: 1,000 examples
+- training set: 4,116 examples
+- validation set: 458 examples
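
A minimal sketch of how a random three-way split with these sizes could be produced. The seed value and the helper name `split_indices` are assumptions for illustration; the notebooks may split the data differently.

```python
import random

# Split sizes as stated in the README above.
TEST_SIZE, TRAIN_SIZE, VALID_SIZE = 1000, 4116, 458


def split_indices(n_examples, seed=42):
    """Shuffle example indices and cut them into train/validation/test parts.

    The seed is an assumption, used here only to make the split reproducible.
    """
    if n_examples != TEST_SIZE + TRAIN_SIZE + VALID_SIZE:
        raise ValueError("split sizes do not add up to the dataset size")
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    test = indices[:TEST_SIZE]
    train = indices[TEST_SIZE:TEST_SIZE + TRAIN_SIZE]
    valid = indices[TEST_SIZE + TRAIN_SIZE:]
    return train, valid, test


# The three sizes sum to 5,574 examples.
train_idx, valid_idx, test_idx = split_indices(5574)
```

The resulting index lists are disjoint, so no example can leak from the training set into the validation or test set.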
+
+## Evaluation
+Each model is evaluated on the test set after training. Metrics:
+- accuracy, 0-100%
+- Matthews correlation coefficient (MCC): similar in spirit to accuracy, but it accounts for class imbalance. Values: -1 means fully inverted predictions, 0 means random predictions, 1 means perfect predictions.
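
To make the difference between the two metrics concrete, here is a small sketch that computes both from binary labels. The function names `accuracy` and `mcc` are hypothetical, and returning 0.0 when the MCC denominator is zero is a common convention assumed here, not something the README specifies.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def mcc(y_true, y_pred, positive=1):
    """Matthews correlation coefficient for binary labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: undefined MCC is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denom


# An imbalanced toy example: always predicting the majority class.
y_true = [1, 0, 0, 0]
y_pred = [0, 0, 0, 0]
print(accuracy(y_true, y_pred))  # 0.75
print(mcc(y_true, y_pred))       # 0.0
```

The toy example shows why MCC is reported alongside accuracy: a classifier that never predicts the minority class still reaches 75% accuracy here, while its MCC is 0.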
+
+## Solutions
+We used 4 models: BERT, GPT2, T5, and FLAN-T5.
+
+### Transformer Encoder - BERT
+Key characteristics:
+- pretrained model: bert-base-uncased
+- model type: transformers.BertForSequenceClassification
+- model input: SMS text
+- model output: class 1 or 2
+- fine-tuning on the training set
+  - AdamW optimizer
+  - learning rate 2e-5
+  - batch size 32
+  - 4 epochs
+- Accuracy: 99%
+- MCC: 0.973
+
+### Transformer Decoder - GPT2
+Key characteristics:
+- pretrained model: gpt2
+- model type: transformers.GPT2ForSequenceClassification
+- model input: SMS text
+- model output: class 1 or 2
+- fine-tuning on the training set
+  - AdamW optimizer
+  - learning rate 2e-5
+  - batch size 8 (because of OOM)
+  - 4 epochs
+- Accuracy: 99%
+- MCC: 0.960
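
The BERT and GPT2 runs listed above share the learning rate and epoch count and differ in batch size. As a rough sketch, the number of optimizer updates this implies can be worked out as follows, assuming no gradient accumulation and that the last partial batch of each epoch is kept (neither is stated in the README). The helper name `optimizer_steps` is hypothetical.

```python
import math

TRAIN_SIZE = 4116  # training examples, from the dataset section
EPOCHS = 4         # epochs, as listed for both models


def optimizer_steps(batch_size, train_size=TRAIN_SIZE, epochs=EPOCHS):
    """Return (steps per epoch, total steps) for a given batch size.

    Assumes one optimizer update per batch and a kept final partial batch.
    """
    steps_per_epoch = math.ceil(train_size / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


bert_steps = optimizer_steps(32)  # (129, 516)
gpt2_steps = optimizer_steps(8)   # (515, 2060)
```

Under these assumptions the GPT2 run performs roughly four times as many updates as the BERT run at the same learning rate.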
+
+### Transformer Encoder-Decoder - T5
+Key characteristics:
+- pretrained model: t5-base
+- model type: transformers.T5ForConditionalGeneration
+- model input: SMS text
+- model output: text, 'conversation' for class 1 or 'advertising' for class 2
+- fine-tuning on the training set
+  - AdamW optimizer
+  - learning rate 3e-4
+  - batch size 16
+  - 4 epochs
+- Accuracy: 0%
+- MCC: 0
+
+### Zero-shot Transformer Encoder-Decoder - FLAN-T5
+Key characteristics:
+- pretrained model: google/flan-t5-base
+- model type: transformers.AutoModelForSeq2SeqLM
+- model input: task description + SMS text
+  - Example: "Answer the question in one word - true if provided text is spam or false, if provided text is not spam. \nQ: Is this text spam? \nText: <SMS text> \nA:"
+- model output: text, 'true' for class 1 or 'false' for class 2
+- fine-tuning on the training set
+  - AdamW optimizer
+  - learning rate 3e-4
+  - batch size 8
+  - 4 epochs
+- Accuracy: 43%
+- MCC: -0.033
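
A minimal sketch of how the prompt shown in the FLAN-T5 section could be assembled and how a generated answer might be mapped back to a label. The helper names `build_prompt` and `parse_answer` are hypothetical, and treating `\n` as real newline characters and unparseable answers as `None` are assumptions, not the notebook's confirmed behavior.

```python
# Prompt wording copied from the README example; "\n" is assumed to be a newline.
PROMPT_TEMPLATE = (
    "Answer the question in one word - true if provided text is spam "
    "or false, if provided text is not spam. \n"
    "Q: Is this text spam? \n"
    "Text: {sms} \n"
    "A:"
)


def build_prompt(sms_text):
    """Wrap the SMS text in the task description."""
    return PROMPT_TEMPLATE.format(sms=sms_text)


def parse_answer(generated):
    """Map generated text to True (spam), False (not spam), or None.

    None marks an answer that is neither 'true' nor 'false'; how such
    cases are counted is left to the evaluation code.
    """
    normalized = generated.strip().lower()
    if normalized.startswith("true"):
        return True
    if normalized.startswith("false"):
        return False
    return None


prompt = build_prompt("Win a free prize now!")
label = parse_answer(" True.")
```

Normalizing case and whitespace before comparing keeps small formatting differences in the generated text from being scored as wrong answers.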
--- a/projekt/T5_sms_spam.ipynb
+++ b/projekt/T5_sms_spam.ipynb
--- a/projekt/transformer_encoder.ipynb
+++ b/projekt/transformer_encoder.ipynb
@ -1,32 +0,0 @@
-{
- "cells": [
-  {
-   "attachments": {},
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# Rozwiązanie oparte na modelu transformer encoder\n",
-    "https://colab.research.google.com/drive/1lbwSUqLABIfcPwFhD5iSMR0v5Tv0yLGI?usp=sharing"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "name": "python",
-   "version": "3.10.9 (main, Dec 15 2022, 18:18:30) [Clang 14.0.0 (clang-1400.0.29.202)]"
-  },
-  "orig_nbformat": 4,
-  "vscode": {
-   "interpreter": {
-    "hash": "aee8b7b246df8f9039afb4144a1f6fd8d2ca17a180786b69acc140d282b71a49"
-   }
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}