twitter 140 temporal word gap filling
=====================================

Dataset from the paper "Twitter Sentiment Classification using Distant Supervision".

The dev set contains 100k samples taken from the train set.
The test set has its neutral samples removed and 100k samples from the train set added.

Directory structure
-------------------

* `README.md` — this file
* `config.txt` — configuration file
* `train/` — directory with training data
* `train/in.tsv` — input data for the train set
* `train/expected.tsv` — expected (reference) data for the train set
* `dev-0/` — directory with dev (test) data
* `dev-0/in.tsv` — input data for the dev set
* `dev-0/expected.tsv` — expected (reference) data for the dev set
* `test-A/` — directory with test data
* `test-A/in.tsv` — input data for the test set
* `test-A/expected.tsv` — expected (reference) data for the test set
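
Reading a split
---------------

The layout above pairs each `in.tsv` with an `expected.tsv`. Below is a minimal Python sketch for reading one split. It assumes that line *i* of `in.tsv` corresponds to line *i* of `expected.tsv`, and that both files are UTF-8 encoded. Neither is stated in this README, so check the files first. The function name `read_split` is hypothetical, and the meaning of the individual columns in `in.tsv` is not assumed here.

```python
from pathlib import Path
from typing import Iterator, List, Tuple


def read_split(root: str, split: str) -> Iterator[Tuple[List[str], str]]:
    """Yield (input_fields, expected) pairs for one split, e.g. "train" or "dev-0".

    Assumption: in.tsv and expected.tsv are line-aligned and UTF-8 encoded.
    """
    split_dir = Path(root) / split
    with open(split_dir / "in.tsv", encoding="utf-8", newline="") as fin, \
         open(split_dir / "expected.tsv", encoding="utf-8", newline="") as fexp:
        for in_line, exp_line in zip(fin, fexp):
            # Split the input row on tabs; column meaning is left to the caller.
            fields = in_line.rstrip("\r\n").split("\t")
            yield fields, exp_line.rstrip("\r\n")
```

For example, `next(read_split(".", "dev-0"))` would return the first input row of the dev set together with its reference value, provided the script is run from the repository root.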