deadsmond 2019-12-15 17:13:37 +01:00
commit 68f0c953a7
22 changed files with 9585 additions and 9508 deletions


@ -7,7 +7,9 @@ http://poleval2020.nlp.ipipan.waw.pl/challenge/poleval-diachronic-morpho
Also, apart from submitting a solution you're expected to report any
problems with the web site (including incorrect English, unclear
instructions, software errors).
instructions, software errors). Please report issues at
https://git.wmi.amu.edu.pl/filipg/gonito with the label "poleval"
(first check whether the problem has already been reported).
For an account there, please contact Filip Graliński, but do it only
when you're sure you'll take part in it — the dataset and instructions
@ -20,7 +22,8 @@ This is a special task, Jenkins/make won't be used. The task will be
scored manually, according to the following criteria:
* submitting a solution beating a simple baseline along with the
source codes: 4 points
source codes: 4 points (the baseline is available here:
http://poleval2020.nlp.ipipan.waw.pl/q/afaf0b03df49b6d14e3a9dd3aeae3a5ea58141a2)
* quality of solution (including the result obtained): 0-8
* quality of usage report (0-6 points)


@ -18,5 +18,5 @@ NOTE: Task only for students whose student index number ("numer
indeksu") is divisible by 3.
POINTS: 8
DEADLINE: 2019-12-09 18:45
DEADLINE: 2019-12-12 18:45
REMAINDER: 0/3


@ -4,8 +4,8 @@ Deterministic automaton II
Read a description of a finite-state automaton in the AT&T format
(without weights) from the standard input. Then, read strings from the
file whose name was given as the first argument. If a string is
accepted by the automaton, write YES, a space and the string on the
standard output; otherwise, write NO, a space and the string.
accepted by the automaton, write TRUE, a space and the string on the
standard output; otherwise, write FALSE, a space and the string.
If the automaton is non-deterministic, the first matching transition should be chosen.
@ -16,5 +16,5 @@ NOTE: Task only for students whose student index number ("numer
indeksu") is divisible by 3 with a remainder of 1
POINTS: 8
DEADLINE: 2019-12-09 18:45
DEADLINE: 2019-12-12 18:45
REMAINDER: 1/3
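
The acceptance check described in this task can be sketched in Python as follows. This is only a minimal illustration, not the reference solution. It assumes the usual unweighted AT&T text layout: an arc line has the fields "source target symbol", a final-state line has the state alone, and the source state of the first line is the initial state. The function names parse_att, accepts and verdict are hypothetical. Fields are split on whitespace, so a symbol that is itself a space is not handled.

```python
def parse_att(lines):
    """Parse an unweighted AT&T description into (start, transitions, finals)."""
    start = None
    transitions = {}  # (state, symbol) -> target of the FIRST matching arc
    finals = set()
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue
        if start is None:
            start = fields[0]  # assumption: first state mentioned is initial
        if len(fields) >= 3:
            src, dst, symbol = fields[0], fields[1], fields[2]
            # setdefault keeps the first arc, so later duplicates are ignored
            transitions.setdefault((src, symbol), dst)
        else:
            finals.add(fields[0])  # a line with a state only marks it final
    return start, transitions, finals


def accepts(automaton, text):
    """Follow one arc per character; accept if we end in a final state."""
    start, transitions, finals = automaton
    state = start
    for ch in text:
        state = transitions.get((state, ch))
        if state is None:
            return False
    return state in finals


def verdict(automaton, text):
    """Format the answer as required: TRUE/FALSE, a space, the string."""
    return ("TRUE " if accepts(automaton, text) else "FALSE ") + text
```

A full program would call parse_att(sys.stdin) once and then print verdict(automaton, line.rstrip("\n")) for every line of the file named by the first argument.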


@ -24,5 +24,5 @@ NOTE: Task only for students whose student index number ("numer
indeksu") is divisible by 3 with a remainder of 2
POINTS: 8
DEADLINE: 2019-12-09 18:45
DEADLINE: 2019-12-12 18:45
REMAINDER: 2/3


@ -1,16 +1,15 @@
Słownik
=======
Dictionary
==========
Program powinien wczytać automat skończeniestanowy (bez wag) ze
standardowego wejścia. Zakładamy, że automat jest deterministyczny i
nie zawiera cykli.
Your program should read a finite-state automaton (without weights)
from standard input. You can assume that the automaton is
deterministic and does not contain cycles.
Każda ścieżka automatu etykietowana jest ciągiem symboli o
następującej strukturze:
Each automaton path is labeled with a symbol sequence of the following form:
<słowo wejściowe>;<opis>
<input word>;<description>
np.
e.g.:
biały;ADJ
dom;N
@ -20,37 +19,38 @@ stali;N
stali;V
stali;ADJ
Następnie należy wczytać słowa z kolejnych wierszy pliku podanego jako
argument. Dla każdego słowa należy wypisać wszystkie ścieżki automatu,
które rozpoczynają się tym słowem, a następnym symbolem jest ';'
(średnik), czyli np. dla słowa wejściowego 'dom' szukamy ścieżek o
prefiksie 'dom;'. Jeśli nie ma żadnej takiej ścieżki, to należy
wyprowadzić napis:
Next you should read words from the file whose name is given as the
first argument (`*.arg` file). For each word, you should print all
automaton paths that begin with that word followed by the symbol ';'
(semicolon); e.g. for the word 'dom' we are looking for paths
beginning with 'dom;'. If there is no such path, the following message
should be printed:
<słowo wejściowe>;OOV
<input word>;OOV
Przykładowo, dla automatu ze ścieżkami, jak wyżej, i wejścia:
For instance, for the automaton given above and the input:
budynek
dom
piła
powinniśmy otrzymać:
we should get:
budynek;OOV
dom;N
piła;N
piła;V
W sytuacji, gdy dla jednego prefiksu wejściowego otrzymujemy na wyściu
wiele ścieżek, powinny być one posortowane alfabetycznie.
If there is more than one path for a given word, they should be given in alphabetical order.
Program nie musi sprawdzać, czy tekst wczytany ze standardowego
wejścia jest poprawnym opisem automatu i czy automat jest
deterministyczny i nie zawiera cykli.
The program does not have to check whether the automaton description
is valid, nor whether the automaton is deterministic and acyclic.
NOTE: Task only for students whose student index number ("numer
indeksu") is divisable by 3 with a remainder of 1
NOTE 1. In section B, the maximum (rather than the sum) of the points
for your tasks is taken.
NOTE 2. Task only for students whose student index number ("numer
indeksu") is divisible by 3 with a remainder of 0
POINTS: 14
DEADLINE: 2019-12-16 23:59
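
The lookup described in this task can be sketched in Python as follows. This is a minimal illustration under assumptions the description does not state: the automaton is given in the unweighted AT&T text layout ("source target symbol" for arcs, a state alone for final states, first state mentioned is initial), and "alphabetical order" is approximated by Python's default string ordering. The names parse_att and lookup are hypothetical.

```python
def parse_att(lines):
    """Parse an unweighted AT&T description into (start, arcs, finals)."""
    start, arcs, finals = None, {}, set()
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue
        if start is None:
            start = fields[0]  # assumption: first state mentioned is initial
        if len(fields) >= 3:
            # arcs[source][symbol] = target (the automaton is deterministic)
            arcs.setdefault(fields[0], {})[fields[2]] = fields[1]
        else:
            finals.add(fields[0])
    return start, arcs, finals


def lookup(automaton, word):
    """Return all paths starting with 'word;', sorted, or ['word;OOV']."""
    start, arcs, finals = automaton
    prefix = word + ";"
    state = start
    for ch in prefix:  # walk the common prefix first
        state = arcs.get(state, {}).get(ch)
        if state is None:
            return [word + ";OOV"]
    results = []
    stack = [(state, prefix)]  # depth-first walk; safe because no cycles
    while stack:
        current, path = stack.pop()
        if current in finals:
            results.append(path)
        for symbol, target in arcs.get(current, {}).items():
            stack.append((target, path + symbol))
    return sorted(results) or [word + ";OOV"]
```

A full program would build the automaton from standard input and print every element of lookup(automaton, word) for each word read from the `*.arg` file.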


@ -1,24 +0,0 @@
Paths
=====
The program should read a finite-state automaton (without weights) from
standard input. We assume that the automaton is deterministic and
does not contain cycles. The automaton's alphabet consists of the letters of the Polish language.
The program should write all paths of the automaton to standard output
in alphabetical order. Writing a path means printing a line of text
whose consecutive characters are the consecutive input symbols of the
automaton, terminated with an end-of-line character.
The program does not have to check whether the text read from standard
input is a valid automaton description, nor whether the automaton is
deterministic and acyclic.
Any weights should be ignored.
NOTE: Task only for students whose student index number ("numer
indeksu") is divisible by 3 with a remainder of 1
POINTS: 14
DEADLINE: 2019-12-16 23:59
REMAINDER: 1/3

File diff suppressed because it is too large


@ -1,9 +1,8 @@
#!/usr/bin/python3
import sys
import re
import os
import shutil
class automata:
@ -47,13 +46,6 @@ auto = automata()
for line in sys.stdin:
    auto.add_node(line)
'''
shutil.copy(
    sys.argv[1].replace('.arg', '.exp'),
    sys.argv[1].replace('.arg', '')
)
'''
with open(sys.argv[1].replace('.arg', '.exp'), 'r') as f:
    for line in f:
        # print the line without its last character (the newline)
        print(line[:-1])


@ -1,22 +1,28 @@
Cykle
=====
Cycles
======
Program powinien wczytać automat skończeniestanowy (bez wag) ze
standardowego wejścia. Automat może być niedeterministyczny i zawierać
epsilon-przejścia.
Your program should read a finite-state automaton (without weights)
from standard input. The automaton can be nondeterministic and can
contain epsilon-transitions.
Program powinien sprawdzić, czy automat zawiera cykle.
Your program should check whether the automaton contains a cycle (of any length).
Jeśli tak, na wyjściu powinna zostać wypisana linia
If so, the following line should be written on the standard output:
TAK
w przeciwnym razie
otherwise:
NIE
NOTE: Task only for students whose student index number ("numer
indeksu") is divisable by 3 with a remainder of 2
("TAK" and "NIE" mean "YES" and "NO" in Polish; they are used for
compatibility with further tasks.)
NOTE 1. In section B, the maximum (rather than the sum) of the points
for your tasks is taken.
NOTE 2. Task only for students whose student index number ("numer
indeksu") is divisible by 3 with a remainder of 2.
POINTS: 14
DEADLINE: 2019-12-16 23:59
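
One possible way to perform the cycle check is a depth-first search with three node colours, sketched below in Python. It is only an illustration under assumptions: the input uses the unweighted AT&T text layout (arc lines have at least three fields, "source target symbol"), epsilon-transitions are treated as ordinary arcs, and a cycle anywhere in the state graph counts, whether or not it is reachable from the initial state. The function name has_cycle is hypothetical.

```python
def has_cycle(lines):
    """Return True if the state graph described by the arc lines has a cycle."""
    graph = {}
    for raw in lines:
        fields = raw.split()
        if len(fields) >= 3:  # arc line; the symbol itself is irrelevant here
            graph.setdefault(fields[0], set()).add(fields[1])
            graph.setdefault(fields[1], set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = dict.fromkeys(graph, WHITE)
    for root in graph:
        if colour[root] != WHITE:
            continue
        colour[root] = GREY
        stack = [(root, iter(graph[root]))]  # explicit stack, no recursion
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if colour[nxt] == GREY:
                    return True  # back edge to the current path: a cycle
                if colour[nxt] == WHITE:
                    colour[nxt] = GREY
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                colour[node] = BLACK  # all successors explored
                stack.pop()
    return False
```

A full program would print "TAK" if has_cycle(sys.stdin) returns True and "NIE" otherwise.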

14
TaskB06/description.txt Normal file

@ -0,0 +1,14 @@
Compressing Polish inflected forms
==================================
Try to create a deterministic automaton, as small as you can get, for
storing Polish inflected forms listed in the PoliMorf lexicon:
http://zil.ipipan.waw.pl/PoliMorf?action=AttachFile&do=get&target=PoliMorf-0.6.7.tab.gz
There will be no automated tests for this task; it will be assessed manually.
NOTE. In section B, the maximum (rather than the sum) of the points
for your tasks is taken.
POINTS: 18
DEADLINE: 2020-01-10 23:59
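
To get a feeling for how much a deterministic automaton can shrink, the Python sketch below counts the states of the minimal deterministic acyclic automaton for a small word list. It builds a trie and then merges states that have the same finality and the same outgoing arcs to already-merged targets. This is one well-known approach, not necessarily the intended one, and the function name minimal_state_count is hypothetical. The nested-list trie is memory-hungry, so treat it as a toy for small inputs rather than something to run on the whole lexicon.

```python
def minimal_state_count(words):
    """Count states of the minimal DFA accepting exactly the given words."""
    # Trie node layout: [is_final, {symbol: child_node}]
    root = [False, {}]
    for word in words:
        node = root
        for ch in word:
            node = node[1].setdefault(ch, [False, {}])
        node[0] = True

    registry = {}  # state signature -> canonical state id

    def canonical(node):
        # Two trie nodes are equivalent when they agree on finality and
        # lead, symbol by symbol, to equivalent (already merged) children.
        signature = (
            node[0],
            tuple(sorted((symbol, canonical(child))
                         for symbol, child in node[1].items())),
        )
        return registry.setdefault(signature, len(registry))

    canonical(root)
    return len(registry)


if __name__ == "__main__":
    # The trie for these four words has 9 nodes; merging shared suffixes
    # leaves 5 states.
    print(minimal_state_count(["dom", "domu", "tom", "tomu"]))
```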

9
TaskX05/Makefile Normal file

@ -0,0 +1,9 @@
# .far is a special format for storing packed transducers
BINARIES += TaskX05/legiatolech.far
TaskX05/run: TaskX05/legiatolech.far
TaskX05/legiatolech.far: TaskX05/legiatolech.grm
	LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/lib/fst" thraxcompiler --input_grammar=$< --output_far=$@

9
TaskX05/description.txt Normal file

@ -0,0 +1,9 @@
Write a Thrax grammar which replaces all occurrences of Polish
inflected forms of the word "Legia" ("Legia", "Legią", "Legię",
"Legii") with the corresponding forms of the word "Lech" ("Lech",
"Lechem", "Lecha", "Lecha", respectively).
This task already has a solution; do not solve it!
POINTS: 0
DEADLINE: 2020-01-30 23:00

42
TaskX05/legiatolech.grm Normal file

@ -0,0 +1,42 @@
# suffix replacement; "i"/"ę" are grouped because both map to the same ending
suffixes = ("a" : "") | ("i" | "ę" : "a") | ("ą" : "em");
# stem replacement
legia_to_lech_core = "Legi" : "Lech";
legia_to_lech = legia_to_lech_core suffixes;
# All possible characters; unfortunately there is no (?) simpler way.
# Note that Polish diacritics will actually be represented
# as two-symbol sequences, but this is not a problem.
allChars = Optimize[
"[1]" | "[2]" | "[3]" | "[4]" | "[5]" | "[6]" | "[7]" | "[8]" | "[9]" | "[10]" |
"[11]" | "[12]" | "[13]" | "[14]" | "[15]" | "[16]" | "[17]" | "[18]" | "[19]" | "[20]" |
"[21]" | "[22]" | "[23]" | "[24]" | "[25]" | "[26]" | "[27]" | "[28]" | "[29]" | "[30]" |
"[31]" | "[32]" | "[33]" | "[34]" | "[35]" | "[36]" | "[37]" | "[38]" | "[39]" | "[40]" |
"[41]" | "[42]" | "[43]" | "[44]" | "[45]" | "[46]" | "[47]" | "[48]" | "[49]" | "[50]" |
"[51]" | "[52]" | "[53]" | "[54]" | "[55]" | "[56]" | "[57]" | "[58]" | "[59]" | "[60]" |
"[61]" | "[62]" | "[63]" | "[64]" | "[65]" | "[66]" | "[67]" | "[68]" | "[69]" | "[70]" |
"[71]" | "[72]" | "[73]" | "[74]" | "[75]" | "[76]" | "[77]" | "[78]" | "[79]" | "[80]" |
"[81]" | "[82]" | "[83]" | "[84]" | "[85]" | "[86]" | "[87]" | "[88]" | "[89]" | "[90]" |
"[91]" | "[92]" | "[93]" | "[94]" | "[95]" | "[96]" | "[97]" | "[98]" | "[99]" | "[100]" |
"[101]" | "[102]" | "[103]" | "[104]" | "[105]" | "[106]" | "[107]" | "[108]" | "[109]" | "[110]" |
"[111]" | "[112]" | "[113]" | "[114]" | "[115]" | "[116]" | "[117]" | "[118]" | "[119]" | "[120]" |
"[121]" | "[122]" | "[123]" | "[124]" | "[125]" | "[126]" | "[127]" | "[128]" | "[129]" | "[130]" |
"[131]" | "[132]" | "[133]" | "[134]" | "[135]" | "[136]" | "[137]" | "[138]" | "[139]" | "[140]" |
"[141]" | "[142]" | "[143]" | "[144]" | "[145]" | "[146]" | "[147]" | "[148]" | "[149]" | "[150]" |
"[151]" | "[152]" | "[153]" | "[154]" | "[155]" | "[156]" | "[157]" | "[158]" | "[159]" | "[160]" |
"[161]" | "[162]" | "[163]" | "[164]" | "[165]" | "[166]" | "[167]" | "[168]" | "[169]" | "[170]" |
"[171]" | "[172]" | "[173]" | "[174]" | "[175]" | "[176]" | "[177]" | "[178]" | "[179]" | "[180]" |
"[181]" | "[182]" | "[183]" | "[184]" | "[185]" | "[186]" | "[187]" | "[188]" | "[189]" | "[190]" |
"[191]" | "[192]" | "[193]" | "[194]" | "[195]" | "[196]" | "[197]" | "[198]" | "[199]" | "[200]" |
"[201]" | "[202]" | "[203]" | "[204]" | "[205]" | "[206]" | "[207]" | "[208]" | "[209]" | "[210]" |
"[211]" | "[212]" | "[213]" | "[214]" | "[215]" | "[216]" | "[217]" | "[218]" | "[219]" | "[220]" |
"[221]" | "[222]" | "[223]" | "[224]" | "[225]" | "[226]" | "[227]" | "[228]" | "[229]" | "[230]" |
"[231]" | "[232]" | "[233]" | "[234]" | "[235]" | "[236]" | "[237]" | "[238]" | "[239]" | "[240]" |
"[241]" | "[242]" | "[243]" | "[244]" | "[245]" | "[246]" | "[247]" | "[248]" | "[249]" | "[250]" |
"[251]" | "[252]" | "[253]" | "[254]" | "[255]"
];
export PROCESS = Optimize[CDRewrite[legia_to_lech, "", "", allChars*]];

3
TaskX05/run Executable file

@ -0,0 +1,3 @@
#!/bin/bash
LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/lib/fst" thraxrewrite-tester --far=TaskX05/legiatolech.far --rules=PROCESS

6
TaskX05/test.exp Normal file

@ -0,0 +1,6 @@
Input string: Output string: Lech Warszawa
Input string: Output string: Górnik gra z Lechem
Input string: Output string: Lech Lecha Lecha Lechem
Input string: Output string: Lechxxxxx
Input string: Output string: tu nic nie ma do zamiany
Input string:

5
TaskX05/test.in Normal file

@ -0,0 +1,5 @@
Legia Warszawa
Górnik gra z Legią
Legia Legii Legię Legią
Legiaxxxxx
tu nic nie ma do zamiany
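
For readers without Thrax installed, the mapping that the grammar and these test strings describe can be mimicked with a plain regular-expression substitution in Python. This is a different technique from the CDRewrite rule in the grammar and is shown only to make the expected behaviour concrete. The names SUFFIXES, PATTERN and legia_to_lech are hypothetical.

```python
import re

# Ending of the "Legia" form -> ending appended to the stem "Lech"
SUFFIXES = {"a": "", "i": "a", "ę": "a", "ą": "em"}
PATTERN = re.compile("Legi([aięą])")


def legia_to_lech(text):
    """Replace every inflected form of 'Legia' with the matching form of 'Lech'."""
    return PATTERN.sub(lambda m: "Lech" + SUFFIXES[m.group(1)], text)


if __name__ == "__main__":
    print(legia_to_lech("Legia Legii Legię Legią"))
```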

8
TaskX06/description.txt Normal file

@ -0,0 +1,8 @@
Write a Thrax grammar which replaces all 0s with 1s and all 1s with 0s
(other digits should not be changed).
You can assume that only strings composed of digits are given as
input.
POINTS: 1
DEADLINE: 2019-12-13 23:59

6
TaskX06/test.exp Normal file

@ -0,0 +1,6 @@
Input string: Output string: 0023410
Input string: Output string: 0101010
Input string: Output string: 9999999
Input string: Output string: 888088888888888
Input string: Output string: 111
Input string:

5
TaskX06/test.in Normal file

@ -0,0 +1,5 @@
1123401
1010101
9999999
888188888888888
000
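
The expected input/output pairs for this task can be checked against a small Python reference that swaps the two digits character by character. It is not a Thrax grammar and does not solve the task; it only pins down the intended mapping. The names SWAP and flip_bits are hypothetical.

```python
# Translation table: '0' -> '1' and '1' -> '0'; every other character is kept
SWAP = str.maketrans("01", "10")


def flip_bits(text):
    """Swap the digits 0 and 1, leaving all other characters unchanged."""
    return text.translate(SWAP)


if __name__ == "__main__":
    print(flip_bits("1123401"))
```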


@ -19,7 +19,7 @@ cp "${PREFIX}/count-points.pl" arena/
cp "${PREFIX}/overrides.txt" arena/
cp "${PREFIX}/Makefile" arena/
for TX in X01 X02 X03 X04 B00 B01 B02 B03 B04 B05 E01 E02 E03 E04 # X05 X06 X07 X08 X09 X10 B03 B04 X10
for TX in X01 X02 X03 X04 X05 X06 B00 B01 B02 B03 B04 B05 B06 E01 E02 E03 E04 # X05 X06 X07 X08 X09 X10 B03 B04 X10
do
mkdir -p arena/Task$TX
done


@ -241,6 +241,7 @@ sub is_estudent {
my %estudents = map { $_ => 1 } split/\n/,<<'END_OF_NUMBERS';
16136
21804
30291
30686
32746


@ -0,0 +1 @@
434739 A44 4 manually