Mechanism for preparing corpora for Concordia, built around the Fast-aligner software.
Name                    Last commit message        Date
bad-words               redesign                   2019-06-13 12:34:19 +02:00
dgt                     process dgt                2019-06-27 14:40:43 +02:00
dictionaries            dictionaries, paths        2019-06-13 12:44:16 +02:00
censor_sources.py       redesign                   2019-06-13 12:34:19 +02:00
collect_dict.py         dictionaries, paths        2019-06-13 12:44:16 +02:00
get_alignments.py       redesign                   2019-06-13 12:34:19 +02:00
Makefile                generating src_clean.tok   2019-08-29 21:08:15 +02:00
prepare_corpus.py       generating src_clean.tok   2019-08-29 21:08:15 +02:00
sentence_lemmatizer.py  lemmatizer                 2019-06-26 09:08:00 +02:00