
Diachronic normalisation of Polish texts

Transform old Polish texts into modern spelling.

The CharMatch metric is used for evaluation. It is an F-score computed over the expected corrections, i.e. the changes between the input text and the expected output.
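To illustrate the idea of scoring corrections rather than whole strings, here is a minimal sketch of an F-score over character-level edits. It is not the official CharMatch implementation: the edit extraction via `difflib`, the function names `edits` and `correction_f_score`, the `beta` parameter, and the sample strings are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def edits(source: str, target: str) -> set:
    """Collect the non-equal edit operations that turn source into target.

    Each edit is (start, end, replacement), with positions given in source.
    """
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    return {
        (i1, i2, target[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    }


def correction_f_score(source: str, expected: str, output: str,
                       beta: float = 1.0) -> float:
    """F-score of the corrections made in output against those in expected."""
    expected_edits = edits(source, expected)
    actual_edits = edits(source, output)
    if not expected_edits and not actual_edits:
        return 1.0  # nothing to correct and nothing was changed
    true_positives = len(expected_edits & actual_edits)
    precision = true_positives / len(actual_edits) if actual_edits else 0.0
    recall = true_positives / len(expected_edits) if expected_edits else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Made-up example: two corrections are expected, the output makes one of them.
score = correction_f_score("ktory byl", "który był", "który byl")
```

In this sketch an output that copies the input unchanged scores 0 whenever corrections are expected, which is the point of measuring corrections instead of overall string similarity.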

Directory structure

  • README.md — this file
  • config.txt — configuration file
  • dev-0/ — directory with dev (test) data
  • dev-0/in.tsv — input text for the dev set
  • dev-0/expected.tsv — reference text for the dev set
  • dev-1/ — directory with another dev (test) set
  • dev-1/in.tsv — input text for the second dev set
  • dev-1/expected.tsv — reference text for the second dev set
  • test-A/ — directory with test data
  • test-A/in.tsv — input text for the test set
  • test-A/expected.tsv — reference text for the test set
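
The layout above can be read with a few lines of Python. This is a sketch under the assumption that `in.tsv` and `expected.tsv` are UTF-8 files aligned line by line (line N of one corresponds to line N of the other); the helper names `read_tsv_lines` and `load_pairs` are hypothetical.

```python
from pathlib import Path


def read_tsv_lines(path):
    """Read a UTF-8 file and return its lines without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def load_pairs(dataset_dir):
    """Return (input, expected) pairs for one data directory, e.g. dev-0.

    Assumes in.tsv and expected.tsv have the same number of lines.
    """
    directory = Path(dataset_dir)
    inputs = read_tsv_lines(directory / "in.tsv")
    expected = read_tsv_lines(directory / "expected.tsv")
    if len(inputs) != len(expected):
        raise ValueError(
            f"{directory}: {len(inputs)} input lines "
            f"but {len(expected)} expected lines"
        )
    return list(zip(inputs, expected))
```

For example, `load_pairs("dev-0")` would return the dev-0 pairs, ready to be compared against a system's output.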