2021-11-01 19:46:45 +01:00

Twitter 140 temporal word gap filling

The dataset comes from the paper "Twitter Sentiment Classification using Distant Supervision".

The dev set contains 100k samples taken from the train set. The test set is the original test set with the neutral samples removed and 100k samples from the train set added.

Directory structure

  • README.md — this file
  • config.txt — configuration file
  • train/ — directory with training data
  • train/in.tsv — input data for the train set
  • train/expected.tsv — expected (reference) data for the train set
  • dev-0/ — directory with dev (test) data
  • dev-0/in.tsv — input data for the dev set
  • dev-0/expected.tsv — expected (reference) data for the dev set
  • test-A/ — directory with test data
  • test-A/in.tsv — input data for the test set
  • test-A/expected.tsv — expected (reference) data for the test set
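
The layout above pairs each `in.tsv` with an `expected.tsv` in the same directory. A minimal sketch of reading one split in Python follows. It assumes that both files hold one record per line and that line i of `in.tsv` corresponds to line i of `expected.tsv`; this README does not describe the column layout, so the input columns are returned as raw strings. The function name `load_split` is hypothetical and not part of this repository.

```python
from pathlib import Path


def load_split(root, split):
    """Read <root>/<split>/in.tsv and expected.tsv as aligned pairs.

    Assumption: one record per line, and the two files are aligned
    line by line. Input columns are not interpreted here.
    """
    split_dir = Path(root) / split

    with open(split_dir / "in.tsv", encoding="utf-8") as f:
        # Split each input line on tabs; column meaning is left to the caller.
        inputs = [line.rstrip("\n").split("\t") for line in f]

    with open(split_dir / "expected.tsv", encoding="utf-8") as f:
        # Keep each expected line as a single string.
        expected = [line.rstrip("\n") for line in f]

    if len(inputs) != len(expected):
        raise ValueError(
            f"{split}: in.tsv has {len(inputs)} lines, "
            f"expected.tsv has {len(expected)}"
        )

    return list(zip(inputs, expected))
```

For example, `load_split(".", "dev-0")` would return a list of `(input_columns, expected_value)` tuples for the dev set when run from the repository root. The length check is there to catch a truncated or misaligned file early.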