kubapok/sentiment140-word-gap

kubapok fbedb57538 init

2021-11-01 19:46:45 +01:00


Sentiment140 (Twitter) temporal word gap filling

The dataset comes from the paper "Twitter Sentiment Classification using Distant Supervision" (Go, Bhayani, and Huang, 2009).

The dev set contains 100k samples taken from the train set. The test set has its neutral samples removed and 100k samples from the train set added.

Directory structure

README.md — this file
config.txt — configuration file
train/ — directory with training data
train/in.tsv — input data for the train set
train/expected.tsv — expected (reference) data for the train set
dev-0/ — directory with dev (test) data
dev-0/in.tsv — input data for the dev set
dev-0/expected.tsv — expected (reference) data for the dev set
test-A/ — directory with test data
test-A/in.tsv — input data for the test set
test-A/expected.tsv — expected (reference) data for the test set
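
The layout above can be read with a few lines of Python. This is only a minimal sketch: the helper names `read_tsv` and `load_split` are hypothetical, and it assumes that each line of `in.tsv` lines up with the same line of `expected.tsv` and that fields are separated by tabs. The number and meaning of the columns are not described in this README, so rows are returned as plain lists of strings.

```python
from pathlib import Path


def read_tsv(path):
    """Read a tab-separated file into a list of rows (lists of strings).

    Lines are split on "\n" only and fields on tabs, with no quote
    handling, so quote characters inside a tweet stay untouched.
    """
    with open(path, encoding="utf-8", newline="\n") as f:
        return [line.rstrip("\n").split("\t") for line in f]


def load_split(root, split):
    """Pair every input row with its expected row for one split.

    `split` is a directory name such as "train", "dev-0" or "test-A".
    Assumption: line i of in.tsv corresponds to line i of expected.tsv.
    """
    split_dir = Path(root) / split
    inputs = read_tsv(split_dir / "in.tsv")
    expected = read_tsv(split_dir / "expected.tsv")
    if len(inputs) != len(expected):
        raise ValueError(
            f"{split}: {len(inputs)} input rows but {len(expected)} expected rows"
        )
    return list(zip(inputs, expected))
```

For example, `load_split(".", "dev-0")` run from the repository root would return a list of `(input_fields, expected_fields)` pairs for the dev set. The length check is there to catch a truncated or misaligned file early.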