# Project for the course Deep Learning in Text Processing
## Group members:
- Michał Kozłowski 444415
- Szymon Jadczak 444386
## Models:
- RobertaForSequenceClassification
- GPT2ForSequenceClassification
- T5
- FLAN-T5
## Accuracy on the test split
- RobertaForSequenceClassification -> 0.9392201834862385
- GPT2ForSequenceClassification -> 0.9174311926605505
- T5 -> 0.9129464285714286
- FLAN-T5 -> 0.903114186851211
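
The scores above are classification accuracies, i.e. the fraction of test examples whose predicted label matches the gold label. A minimal sketch of that calculation follows; the function name `accuracy` and the toy labels are illustrative assumptions, not code or data taken from the project notebooks.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Return the fraction of predictions that match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty split")
    # Count positions where the predicted label equals the gold label.
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical toy data (1 = positive, 0 = negative), not from the project.
toy_labels = [1, 0, 1, 1]
toy_predictions = [1, 0, 0, 1]
print(accuracy(toy_predictions, toy_labels))  # 3 of 4 correct -> 0.75
```

For the generative models (T5, FLAN-T5), the generated text would first have to be mapped to a class label before such a comparison; how the notebooks do this is not stated here.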
## Training
- Google Colab
## Link to the models on Google Drive
https://drive.google.com/drive/folders/1GWNah7-LZI7jrFzUpL9Le1E7b73RutbU?usp=sharing
### Links to the individual models on Hugging Face and to the TensorBoard plots are in the notebooks
## Dataset:
- Original dataset: https://huggingface.co/datasets/sst2
- Processed dataset: https://huggingface.co/datasets/Zombely/sst2-project-dataset