kubapok 2021-11-01 19:46:45 +01:00
commit fbedb57538
7 changed files with 1700384 additions and 0 deletions

1
.gitignore vendored Normal file

@@ -0,0 +1 @@
fsdgsgsg

23
README.md Normal file

@@ -0,0 +1,23 @@
twitter 140 temporal word gap filling
=====================================
The dataset comes from the paper "Twitter Sentiment Classification using Distant Supervision".
The dev set contains 100k samples taken from the train set.
The test set has its neutral samples removed and 100k samples from the train set added.
Directory structure
-------------------
* `README.md` — this file
* `config.txt` — configuration file
* `train/` — directory with training data
* `train/in.tsv` — input data for the train set
* `train/expected.tsv` — expected (reference) data for the train set
* `dev-0/` — directory with dev (test) data
* `dev-0/in.tsv` — input data for the dev set
* `dev-0/expected.tsv` — expected (reference) data for the dev set
* `test-A/` — directory with test data
* `test-A/in.tsv` — input data for the test set
* `test-A/expected.tsv` — expected (reference) data for the test set
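
The layout above pairs each `in.tsv` with an `expected.tsv` line by line. A minimal sketch of reading one split follows, assuming both files are UTF-8 text with one record per line and the same number of lines. The helper name `read_pairs` is hypothetical and not part of this repository, and the column layout inside each line is not specified here, so lines are returned unparsed.

```python
from itertools import zip_longest
from pathlib import Path
from typing import Iterator, Tuple


def read_pairs(split_dir: str) -> Iterator[Tuple[str, str]]:
    """Yield (input_line, expected_line) pairs for one split directory.

    Assumption: in.tsv and expected.tsv are aligned line by line.
    """
    split = Path(split_dir)
    # newline="\n" makes Python split lines only on "\n", so a stray "\r"
    # inside a tweet does not start a new record.
    with open(split / "in.tsv", encoding="utf-8", newline="\n") as f_in, \
         open(split / "expected.tsv", encoding="utf-8", newline="\n") as f_exp:
        for in_line, exp_line in zip_longest(f_in, f_exp):
            if in_line is None or exp_line is None:
                raise ValueError(f"{split_dir}: files differ in line count")
            yield in_line.rstrip("\n"), exp_line.rstrip("\n")
```

For example, `next(read_pairs("dev-0"))` would return the first input line of the dev set together with its reference line. Raising on a length mismatch is a design choice of this sketch: a silently truncated pairing would shift every later reference.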

1
config.txt Normal file

@@ -0,0 +1 @@
--metric LikelihoodHashed --precision 5

100000
dev-0/expected.tsv Normal file

File diff suppressed because it is too large

100000
dev-0/in.tsv Normal file

File diff suppressed because it is too large

100359
test-A/in.tsv Normal file

File diff suppressed because it is too large

1400000
train/train.tsv Normal file

File diff suppressed because it is too large