s452101/TAU_22_sane_words_torch_nn_onehot

ksanu dd0e6c1fba test5		2019-12-04 01:07:41 +01:00
.idea	test5	2019-12-04 01:07:41 +01:00
dev-0	test5	2019-12-04 01:07:41 +01:00
test-A	test5	2019-12-04 01:07:41 +01:00
train	TAU_22_sane_words_torch_nn	2019-12-02 14:41:07 +01:00
.gitignore	TAU_22_sane_words_torch_nn	2019-12-02 14:41:07 +01:00
config.txt	TAU_22_sane_words_torch_nn	2019-12-02 14:41:07 +01:00
README.md	TAU_22_sane_words_torch_nn	2019-12-02 14:41:07 +01:00
s.py	test4	2019-12-04 00:20:52 +01:00

README.md

Sane words challenge

Decide whether a given word is a correct Polish word in a given domain. The reported frequency of the word in the source texts is also provided.

Each entry in the training data set has the form: Sane (0 or 1), Domain, Word, Frequency. The evaluation metric is the F2-score.
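
As a rough illustration of the entry format and the metric, here is a minimal Python sketch. It assumes the four fields are tab-separated in the order listed above and that Frequency is numeric; the names `Entry`, `parse_train_line` and `f_beta` are hypothetical helpers, not part of this repository. The F2-score is the F-beta score with beta = 2, which weights recall more heavily than precision.

```python
from typing import NamedTuple, Sequence


class Entry(NamedTuple):
    sane: int         # 1 if the word is a correct word in the domain, else 0
    domain: str
    word: str
    frequency: float  # assumption: the frequency field is numeric


def parse_train_line(line: str) -> Entry:
    """Parse one training line, assuming tab-separated fields in README order."""
    sane, domain, word, frequency = line.rstrip("\n").split("\t")
    return Entry(int(sane), domain, word, float(frequency))


def f_beta(expected: Sequence[int], predicted: Sequence[int], beta: float = 2.0) -> float:
    """F-beta score for binary labels; beta=2 gives the F2-score."""
    pairs = list(zip(expected, predicted))
    tp = sum(1 for e, p in pairs if e == 1 and p == 1)  # true positives
    fp = sum(1 for e, p in pairs if e == 0 and p == 1)  # false positives
    fn = sum(1 for e, p in pairs if e == 1 and p == 0)  # false negatives
    b2 = beta * beta
    denominator = (1 + b2) * tp + b2 * fn + fp
    # By convention here, return 0.0 when there are no positives at all.
    return (1 + b2) * tp / denominator if denominator else 0.0
```

With one true positive, one false positive and one false negative, `f_beta([1, 1, 0, 0], [1, 0, 1, 0])` evaluates to 5 / (5 + 4 + 1) = 0.5.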

Directory structure

README.md — this file
config.txt — configuration file
train/ — directory with training data
train/train.tsv — train set
dev-0/ — directory with dev (test) data
dev-0/in.tsv — input data for the dev set
dev-0/expected.tsv — expected (reference) data for the dev set
test-A/ — directory with test data
test-A/in.tsv — input data for the test set
test-A/expected.tsv — expected (reference) data for the test set
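
The layout above could be read with a small loader such as the following sketch. It assumes that `in.tsv` and `expected.tsv` are line-aligned, UTF-8, tab-separated files and that `expected.tsv` holds one 0/1 label per line; `read_tsv` and `load_split` are hypothetical names used only for illustration.

```python
import csv
from pathlib import Path


def read_tsv(path):
    """Read a tab-separated file into a list of rows, skipping blank lines."""
    with open(path, encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [row for row in reader if row]


def load_split(root, split):
    """Load inputs and expected labels for a split such as "dev-0" or "test-A"."""
    split_dir = Path(root) / split
    inputs = read_tsv(split_dir / "in.tsv")
    # Assumption: the label is the first (and only) column of expected.tsv.
    expected = [int(row[0]) for row in read_tsv(split_dir / "expected.tsv")]
    if len(inputs) != len(expected):
        raise ValueError(f"{split}: {len(inputs)} inputs but {len(expected)} labels")
    return inputs, expected
```

For example, `load_split(".", "dev-0")` run from the repository root would return the dev inputs together with their reference labels, under the assumptions stated above.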