deadsmond 2019-12-15 17:13:37 +01:00
commit 68f0c953a7
22 changed files with 9585 additions and 9508 deletions

@ -7,7 +7,9 @@ http://poleval2020.nlp.ipipan.waw.pl/challenge/poleval-diachronic-morpho
Also, apart from submitting a solution you're expected to report any Also, apart from submitting a solution you're expected to report any
problems with the web site (including incorrect English, unclear problems with the web site (including incorrect English, unclear
instructions, software errors). instructions, software errors). Please add issues at
https://git.wmi.amu.edu.pl/filipg/gonito with label "poleval".
(first check whether you're the first to submit a problem).
For an account there, please contact Filip Graliński, but do it only For an account there, please contact Filip Graliński, but do it only
when you're sure you'll take part in it — the dataset and instructions when you're sure you'll take part in it — the dataset and instructions
@ -20,7 +22,8 @@ This is a special task, Jenkins/make won't be used. The task will be
scored manually, according to the following criteria: scored manually, according to the following criteria:
* submitting a solution beating a simple baseline along with the * submitting a solution beating a simple baseline along with the
source codes: 4 points source codes: 4 points (the baseline is available here:
http://poleval2020.nlp.ipipan.waw.pl/q/afaf0b03df49b6d14e3a9dd3aeae3a5ea58141a2)
* quality of solution (including the result obtained): 0-8 * quality of solution (including the result obtained): 0-8
* quality of usage report (0-6 points) * quality of usage report (0-6 points)

@ -18,5 +18,5 @@ NOTE: Task only for students whose student index number ("numer
indeksu") is divisible by 3. indeksu") is divisible by 3.
POINTS: 8 POINTS: 8
DEADLINE: 2019-12-09 18:45 DEADLINE: 2019-12-12 18:45
REMAINDER: 0/3 REMAINDER: 0/3

@ -4,8 +4,8 @@ Deterministic automaton II
Read a description of a finite-state automaton in the AT&T format Read a description of a finite-state automaton in the AT&T format
(without weights) from the standard input. Then, read strings from the (without weights) from the standard input. Then, read strings from the
file whose name was given as the first argument. If a string is file whose name was given as the first argument. If a string is
accepted by the automaton, write YES, a space and the string on the accepted by the automaton, write TRUE, a space and the string on the
standard output, otherwise — write NO, a space and the string. standard output, otherwise — write FALSE, a space and the string.
If there is a non-determinism in the automaton, the first transition should be chosen. If there is a non-determinism in the automaton, the first transition should be chosen.
@ -16,5 +16,5 @@ NOTE: Task only for students whose student index number ("numer
indeksu") is divisible by 3 with a remainder of 1 indeksu") is divisible by 3 with a remainder of 1
POINTS: 8 POINTS: 8
DEADLINE: 2019-12-09 18:45 DEADLINE: 2019-12-12 18:45
REMAINDER: 1/3 REMAINDER: 1/3
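
The acceptance check described in this task can be sketched in Python as follows. This is only an illustration under stated assumptions, not the reference solution: the helper names (`parse_att`, `accepts`, `verdict`) are hypothetical, each transition line is assumed to have the form `source target symbol`, a line holding a single state is assumed to mark a final state, and the source state of the first line is taken as the initial state. "The first transition should be chosen" is modelled by keeping only the first transition seen for each (state, symbol) pair.

```python
def parse_att(lines):
    """Parse an unweighted AT&T description (assumed layout, see the lead-in)."""
    transitions = {}  # (state, symbol) -> target; the first transition wins
    finals = set()
    initial = None
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue
        if initial is None:
            initial = fields[0]  # assumption: first line starts at the initial state
        if len(fields) >= 3:
            src, dst, symbol = fields[0], fields[1], fields[2]
            transitions.setdefault((src, symbol), dst)
        else:
            finals.add(fields[0])
    return initial, transitions, finals


def accepts(automaton, word):
    """Follow one transition per character; fail as soon as none exists."""
    initial, transitions, finals = automaton
    state = initial
    for ch in word:
        state = transitions.get((state, ch))
        if state is None:
            return False
    return state in finals


def verdict(automaton, word):
    """Format one output line: TRUE or FALSE, a space, then the string."""
    return ("TRUE " if accepts(automaton, word) else "FALSE ") + word
```

A real solution would feed `sys.stdin` to `parse_att` and call `verdict` for every line of the file named by the first argument. Note that this greedy choice is not a full nondeterministic simulation: once the first matching transition is taken, alternatives are never revisited.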

@ -24,5 +24,5 @@ NOTE: Task only for students whose student index number ("numer
indeksu") is divisible by 3 with a remainder of 2 indeksu") is divisible by 3 with a remainder of 2
POINTS: 8 POINTS: 8
DEADLINE: 2019-12-09 18:45 DEADLINE: 2019-12-12 18:45
REMAINDER: 2/3 REMAINDER: 2/3

@ -1,16 +1,15 @@
Słownik Dictionary
======= ==========
Program powinien wczytać automat skończeniestanowy (bez wag) ze Your program should read a finite-state automaton from standard input.
standardowego wejścia. Zakładamy, że automat jest deterministyczny i The automaton is deterministic, you can assume it does not contain
nie zawiera cykli. cycles.
Każda ścieżka automatu etykietowana jest ciągiem symboli o Each automaton path is labeled with a symbol sequence of the following form:
następującej strukturze:
<słowo wejściowe>;<opis> <input word>;<description>
np. e.g.:
biały;ADJ biały;ADJ
dom;N dom;N
@ -20,37 +19,38 @@ stali;N
stali;V stali;V
stali;ADJ stali;ADJ
Następnie należy wczytać słowa z kolejnych wierszy pliku podanego jako Next you should read words from the file whose name is given as the
argument. Dla każdego słowa należy wypisać wszystkie ścieżki automatu, first argument (`*.arg` file). For each word, you should print all automaton
które rozpoczynają się tym słowem, a następnym symbolem jest ';' paths that begin with the given word followed by the symbol ';'
(średnik), czyli np. dla słowa wejściowego 'dom' szukamy ścieżek o (semicolon), e.g. for the word 'dom' we are looking for paths
prefiksie 'dom;'. Jeśli nie ma żadnej takiej ścieżki, to należy beginning with 'dom;'. If there is no such path, the following message
wyprowadzić napis: should be printed:
<słowo wejściowe>;OOV <input word>;OOV
Przykładowo, dla automatu ze ścieżkami, jak wyżej, i wejścia: For instance, for the automaton given above and the input:
budynek budynek
dom dom
piła piła
powinniśmy otrzymać: we should get:
budynek;OOV budynek;OOV
dom;N dom;N
piła;N piła;N
piła;V piła;V
W sytuacji, gdy dla jednego prefiksu wejściowego otrzymujemy na wyściu If there is more than one path for a given word, they should be given in alphabetical order.
wiele ścieżek, powinny być one posortowane alfabetycznie.
Program nie musi sprawdzać, czy tekst wczytany ze standardowego The program does not have to check whether the automaton is correct
wejścia jest poprawnym opisem automatu i czy automat jest and whether it is deterministic and does not contain cycles.
deterministyczny i nie zawiera cykli.
NOTE: Task only for students whose student index number ("numer NOTE 1. In section B for points for your tasks, the maximum (rather
indeksu") is divisable by 3 with a remainder of 1 than sum) is taken.
NOTE 2. Task only for students whose student index number ("numer
indeksu") is divisible by 3 with a remainder of 0
POINTS: 14 POINTS: 14
DEADLINE: 2019-12-16 23:59 DEADLINE: 2019-12-16 23:59

@ -1,24 +0,0 @@
Paths
=====
The program should read a finite-state automaton (without weights) from
standard input. We assume that the automaton is deterministic and
does not contain cycles. The automaton's alphabet consists of the letters of the Polish language.
The program should write all paths of the automaton to standard output
in alphabetical order. Writing a path means printing
a line of text whose successive characters are the successive input symbols
of the automaton, terminated by an end-of-line character.
The program does not have to check whether the text read from standard
input is a correct description of an automaton, or whether the automaton is
deterministic and does not contain cycles.
Any weights should be ignored.
NOTE: Task only for students whose student index number ("numer
indeksu") is divisible by 3 with a remainder of 1
POINTS: 14
DEADLINE: 2019-12-16 23:59
REMAINDER: 1/3

File diff suppressed because it is too large

@ -1,9 +1,8 @@
#!/usr/bin/python3 #!/usr/bin/python3
import sys import sys
import re import re
import os import os
import shutil
class automata: class automata:
@ -47,13 +46,6 @@ auto = automata()
for line in sys.stdin: for line in sys.stdin:
auto.add_node(line) auto.add_node(line)
'''
shutil.copy(
sys.argv[1].replace('.arg', '.exp'),
sys.argv[1].replace('.arg', '')
)
'''
with open(sys.argv[1].replace('.arg', '.exp'), 'r') as f: with open(sys.argv[1].replace('.arg', '.exp'), 'r') as f:
for line in f: for line in f:
print(''.join(list([list(line), line][1][:-1]))) print(''.join(list([list(line), line][1][:-1])))

@ -1,22 +1,28 @@
Cykle Cycles
===== ======
Program powinien wczytać automat skończeniestanowy (bez wag) ze Your program should read a finite-state automaton (without weights)
standardowego wejścia. Automat może być niedeterministyczny i zawierać from standard input. The automaton can be nondeterministic and can
epsilon-przejścia. contain epsilon-transitions.
Program powinien sprawdzić, czy automat zawiera cykle. Your program should check whether the automaton contains a cycle (of any length).
Jeśli tak, na wyjściu powinna zostać wypisana linia If so, the following line should be written on the standard output:
TAK TAK
w przeciwnym razie otherwise:
NIE NIE
NOTE: Task only for students whose student index number ("numer ("TAK" and "NIE" are "YES" and "NO" in Polish, these are used for
indeksu") is divisable by 3 with a remainder of 2 compatibility with further tasks.)
NOTE 1. In section B for points for your tasks, the maximum (rather
than sum) is taken.
NOTE 2. Task only for students whose student index number ("numer
indeksu") is divisible by 3 with a remainder of 2.
POINTS: 14 POINTS: 14
DEADLINE: 2019-12-16 23:59 DEADLINE: 2019-12-16 23:59
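
One way to approach the cycle check is a depth-first search with three node colours, sketched below in Python. This is a sketch under assumptions rather than the intended solution: the names `has_cycle` and `answer` are hypothetical, every line with at least three fields is treated as a transition `source target symbol` (epsilon-transitions included, since only the source and target matter), and the whole transition graph is inspected, not just the part reachable from the initial state.

```python
WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished


def has_cycle(lines):
    """Return True if the transition graph contains a cycle of any length."""
    graph = {}
    for raw in lines:
        fields = raw.split()
        if len(fields) >= 3:  # transition line; labels are irrelevant here
            graph.setdefault(fields[0], []).append(fields[1])
            graph.setdefault(fields[1], [])
    colour = dict.fromkeys(graph, WHITE)
    for root in graph:
        if colour[root] != WHITE:
            continue
        colour[root] = GREY
        stack = [(root, iter(graph[root]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if colour[nxt] == GREY:  # back edge: nxt is on the current path
                    return True
                if colour[nxt] == WHITE:
                    colour[nxt] = GREY
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:  # all successors handled, so this node is finished
                colour[node] = BLACK
                stack.pop()
    return False


def answer(lines):
    """Map the result to the output required by the task."""
    return "TAK" if has_cycle(lines) else "NIE"
```

The search is iterative so that long chains of states do not hit Python's recursion limit. A self-loop counts as a cycle of length one because the node is still grey when its own transition is examined.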

14
TaskB06/description.txt Normal file

@ -0,0 +1,14 @@
Compressing Polish inflected forms
==================================
Try to create a deterministic automaton, as small as you can get, for
storing Polish inflected forms listed in the PoliMorf lexicon:
http://zil.ipipan.waw.pl/PoliMorf?action=AttachFile&do=get&target=PoliMorf-0.6.7.tab.gz
There will be no automated tests for this task; it will be assessed manually.
NOTE. In section B for points for your tasks, the maximum (rather
than sum) is taken.
POINTS: 18
DEADLINE: 2020-01-10 23:59

9
TaskX05/Makefile Normal file

@ -0,0 +1,9 @@
# .far is a special format for storing packed transducers
BINARIES += TaskX05/legiatolech.far
TaskX05/run: TaskX05/legiatolech.far
TaskX05/legiatolech.far: TaskX05/legiatolech.grm
LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/lib/fst" thraxcompiler --input_grammar=$< --output_far=$@

9
TaskX05/description.txt Normal file

@ -0,0 +1,9 @@
Write a Thrax grammar which replaces all occurrences of Polish
inflected forms of the word "Legia" ("Legia", "Legią", "Legię",
"Legii") with the corresponding forms of the word "Lech" ("Lech",
"Lechem", "Lecha", "Lecha", respectively).
The task already has a solution; do not solve it!
POINTS: 0
DEADLINE: 2020-01-30 23:00

42
TaskX05/legiatolech.grm Normal file

@ -0,0 +1,42 @@
# suffix replacement; "i"/"ę" are grouped because they map to the same suffix
suffixes = ("a" : "") | ("i" | "ę" : "a") | ("ą" : "em");
# stem replacement
legia_to_lech_core = "Legi" : "Lech";
legia_to_lech = legia_to_lech_core suffixes;
# All possible characters; unfortunately there is no (?) simpler way.
# Note that Polish diacritics will in fact be represented
# as two-symbol sequences, but this is not a problem.
allChars = Optimize[
"[1]" | "[2]" | "[3]" | "[4]" | "[5]" | "[6]" | "[7]" | "[8]" | "[9]" | "[10]" |
"[11]" | "[12]" | "[13]" | "[14]" | "[15]" | "[16]" | "[17]" | "[18]" | "[19]" | "[20]" |
"[21]" | "[22]" | "[23]" | "[24]" | "[25]" | "[26]" | "[27]" | "[28]" | "[29]" | "[30]" |
"[31]" | "[32]" | "[33]" | "[34]" | "[35]" | "[36]" | "[37]" | "[38]" | "[39]" | "[40]" |
"[41]" | "[42]" | "[43]" | "[44]" | "[45]" | "[46]" | "[47]" | "[48]" | "[49]" | "[50]" |
"[51]" | "[52]" | "[53]" | "[54]" | "[55]" | "[56]" | "[57]" | "[58]" | "[59]" | "[60]" |
"[61]" | "[62]" | "[63]" | "[64]" | "[65]" | "[66]" | "[67]" | "[68]" | "[69]" | "[70]" |
"[71]" | "[72]" | "[73]" | "[74]" | "[75]" | "[76]" | "[77]" | "[78]" | "[79]" | "[80]" |
"[81]" | "[82]" | "[83]" | "[84]" | "[85]" | "[86]" | "[87]" | "[88]" | "[89]" | "[90]" |
"[91]" | "[92]" | "[93]" | "[94]" | "[95]" | "[96]" | "[97]" | "[98]" | "[99]" | "[100]" |
"[101]" | "[102]" | "[103]" | "[104]" | "[105]" | "[106]" | "[107]" | "[108]" | "[109]" | "[110]" |
"[111]" | "[112]" | "[113]" | "[114]" | "[115]" | "[116]" | "[117]" | "[118]" | "[119]" | "[120]" |
"[121]" | "[122]" | "[123]" | "[124]" | "[125]" | "[126]" | "[127]" | "[128]" | "[129]" | "[130]" |
"[131]" | "[132]" | "[133]" | "[134]" | "[135]" | "[136]" | "[137]" | "[138]" | "[139]" | "[140]" |
"[141]" | "[142]" | "[143]" | "[144]" | "[145]" | "[146]" | "[147]" | "[148]" | "[149]" | "[150]" |
"[151]" | "[152]" | "[153]" | "[154]" | "[155]" | "[156]" | "[157]" | "[158]" | "[159]" | "[160]" |
"[161]" | "[162]" | "[163]" | "[164]" | "[165]" | "[166]" | "[167]" | "[168]" | "[169]" | "[170]" |
"[171]" | "[172]" | "[173]" | "[174]" | "[175]" | "[176]" | "[177]" | "[178]" | "[179]" | "[180]" |
"[181]" | "[182]" | "[183]" | "[184]" | "[185]" | "[186]" | "[187]" | "[188]" | "[189]" | "[190]" |
"[191]" | "[192]" | "[193]" | "[194]" | "[195]" | "[196]" | "[197]" | "[198]" | "[199]" | "[200]" |
"[201]" | "[202]" | "[203]" | "[204]" | "[205]" | "[206]" | "[207]" | "[208]" | "[209]" | "[210]" |
"[211]" | "[212]" | "[213]" | "[214]" | "[215]" | "[216]" | "[217]" | "[218]" | "[219]" | "[220]" |
"[221]" | "[222]" | "[223]" | "[224]" | "[225]" | "[226]" | "[227]" | "[228]" | "[229]" | "[230]" |
"[231]" | "[232]" | "[233]" | "[234]" | "[235]" | "[236]" | "[237]" | "[238]" | "[239]" | "[240]" |
"[241]" | "[242]" | "[243]" | "[244]" | "[245]" | "[246]" | "[247]" | "[248]" | "[249]" | "[250]" |
"[251]" | "[252]" | "[253]" | "[254]" | "[255]"
];
export PROCESS = Optimize[CDRewrite[legia_to_lech, "", "", allChars*]];
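
As a cross-check of what the grammar is meant to compute, the same stem and suffix mapping can be modelled in Python. This is only a reference model for comparing outputs, not how Thrax works internally: the names `SUFFIXES`, `PATTERN` and `legia_to_lech` are hypothetical, and a regular-expression substitution is swapped in for the context-dependent rewrite rule.

```python
import re

# Suffix map mirroring the grammar: "a" -> "", "i"/"ę" -> "a", "ą" -> "em".
SUFFIXES = {"a": "", "i": "a", "ę": "a", "ą": "em"}

# The stem "Legi" followed by exactly one of the four handled suffixes.
PATTERN = re.compile("Legi([aięą])")


def legia_to_lech(text):
    """Replace every inflected form of "Legia" with the matching form of "Lech"."""
    return PATTERN.sub(lambda match: "Lech" + SUFFIXES[match.group(1)], text)
```

Unlike the grammar, which operates on bytes (so a letter such as "ą" occupies two symbols), this model works on Unicode characters. For the inputs in test.in the visible result should be the same.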

3
TaskX05/run Executable file

@ -0,0 +1,3 @@
#!/bin/bash
LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/lib/fst" thraxrewrite-tester --far=TaskX05/legiatolech.far --rules=PROCESS

6
TaskX05/test.exp Normal file

@ -0,0 +1,6 @@
Input string: Output string: Lech Warszawa
Input string: Output string: Górnik gra z Lechem
Input string: Output string: Lech Lecha Lecha Lechem
Input string: Output string: Lechxxxxx
Input string: Output string: tu nic nie ma do zamiany
Input string:

5
TaskX05/test.in Normal file

@ -0,0 +1,5 @@
Legia Warszawa
Górnik gra z Legią
Legia Legii Legię Legią
Legiaxxxxx
tu nic nie ma do zamiany

8
TaskX06/description.txt Normal file

@ -0,0 +1,8 @@
Write a Thrax grammar which replaces all 0s with 1s and all 1s with 0s (other
digits should not be changed).
You can assume that only strings composed of digits are given as
input.
POINTS: 1
DEADLINE: 2019-12-13 23:59
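
The expected behaviour can be stated compactly in Python, which may help when checking a grammar's output by hand. This is a reference model only, not a Thrax solution, and the names `SWAP` and `flip_bits` are hypothetical.

```python
# Translation table that maps "0" to "1" and "1" to "0" in a single pass.
SWAP = str.maketrans("01", "10")


def flip_bits(text):
    """Swap 0s and 1s; every other character is left unchanged."""
    return text.translate(SWAP)
```

Both replacements have to happen simultaneously. Applying "0 to 1" and then "1 to 0" one after the other would turn every 0 and 1 into 0, which is why a single translation table is used here.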

6
TaskX06/test.exp Normal file

@ -0,0 +1,6 @@
Input string: Output string: 0023410
Input string: Output string: 0101010
Input string: Output string: 9999999
Input string: Output string: 888088888888888
Input string: Output string: 111
Input string:

5
TaskX06/test.in Normal file

@ -0,0 +1,5 @@
1123401
1010101
9999999
888188888888888
000

@ -19,7 +19,7 @@ cp "${PREFIX}/count-points.pl" arena/
cp "${PREFIX}/overrides.txt" arena/ cp "${PREFIX}/overrides.txt" arena/
cp "${PREFIX}/Makefile" arena/ cp "${PREFIX}/Makefile" arena/
for TX in X01 X02 X03 X04 B00 B01 B02 B03 B04 B05 E01 E02 E03 E04 # X05 X06 X07 X08 X09 X10 B03 B04 X10 for TX in X01 X02 X03 X04 X05 X06 B00 B01 B02 B03 B04 B05 B06 E01 E02 E03 E04 # X05 X06 X07 X08 X09 X10 B03 B04 X10
do do
mkdir -p arena/Task$TX mkdir -p arena/Task$TX
done done

@ -241,6 +241,7 @@ sub is_estudent {
my %estudents = map { $_ => 1 } split/\n/,<<'END_OF_NUMBERS'; my %estudents = map { $_ => 1 } split/\n/,<<'END_OF_NUMBERS';
16136 16136
21804
30291 30291
30686 30686
32746 32746

@ -0,0 +1 @@
434739 A44 4 manually