41 changed files with 344 additions and 947532 deletions
--- a/README.md
+++ b/README.md
@ -1,11 +1,6 @@
 ## Class 1
 Copyright AMU Poznan
 Made by multiple people
 ### Course information
 Instructor: Jacek Kałużny
@ -23,3 +18,347 @@ W ten sposób będziemy aktualizować zadania co zajęcia.
 Tasks are due by the end of the Saturday preceding the class.
 Save the solution in the file run.py.
 ## Class 2: Regular expressions
 Python 3 regular expression documentation: https://docs.python.org/3/library/re.html
 ### Basic functions
 search - returns the first match in the string
 findall - returns a list of all non-overlapping matches
 match - returns a match only at the beginning of the string
 These are only the basic functions we will use. The documentation describes all of them.
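The difference between the three functions can be sketched on a single string. This is only an illustration; the sample text `banana` is not part of the course material.

```python
import re

text = 'banana'

# search: the first match anywhere in the string
first = re.search('na', text)
print(first.span())                   # (2, 4)

# findall: all non-overlapping matches, returned as a list of strings
print(re.findall('ana', text))        # ['ana'] (the second 'ana' overlaps the first)

# match: succeeds only if the pattern matches at position 0
print(re.match('na', text))           # None
print(re.match('ba', text).group())   # ba
```

Note that `search` and `match` return a match object (or `None`), while `findall` returns plain strings.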
 ### The match object
 ```
 import re
 answer = re.search('na','banan')
 print(answer)
 print(answer.start())
 print(answer.end())
 print(answer.group())
 answer = re.search('na','kabanos')
 print(answer)
 type(answer)
 if answer:
    print(answer.group())
 else:
    pass
 ```
 ### Metacharacters
 - [] - a set of characters
 - . - any character (except a newline, by default)
 - ^ - start of the string
 - $ - end of the string
 - ? - the character occurs once or not at all
 - \* - zero or more occurrences
 - \+ - one or more occurrences
 - {} - exactly the given number of occurrences
 - | - or
 - () - a group
 - \ - escape character
 - \d - a digit
 - \D - a non-digit
 - \s - whitespace
 - \S - non-whitespace
 ### Flags
 Special flags can be passed, e.g.:
 `re.search('ma', 'AlA Ma KoTa', re.IGNORECASE)`.
 ### Examples (explained in the labs)
 For learning, it is better to use Python interactively, ideally ipython.
 ```
 import re
 text = 'Ala ma kota i hamak, oraz 150 bananów.'
 re.search('ma',text)
 re.match('ma',text)
 re.match('Ala ma',text)
 re.findall('ma',text)
 re.findall('[mn]a',text)
 re.findall('[0-9]',text)
 re.findall('[0-9abc]',text)
 re.findall('[a-z][a-z]ma[a-z]',text)
 re.findall('[a-zA-Z][a-zA-Z]ma[a-zA-Z0-9]',text)
 re.findall('\d',text)
 re.search('[0-9][0-9][0-9]',text)
 re.search('[\d][\d][\d]',text)
 re.search('\d{2}',text)
 re.search('\d{3}',text)
 re.search('\d+',text)
 re.search('\d+ bananów',text)
 re.search('\d* bananów','Ala ma dużo bananów')
 re.search('\d* bananów',text)
 re.search('ma \d? bananów','Ala ma 5 bananów')
 re.search('ma ?\d? bananów','Ala ma bananów')
 re.search('ma( \d)? bananów','Ala ma bananów') 
 re.search('\d+ bananów','Ala ma 10 bananów albo 20 bananów')
 re.search('\d+ bananów$','Ala ma 10 bananów albo 20 bananów')
 text = 'Ala ma kota i hamak, oraz 150	bananów.'
 re.search('\d+ bananów',text)
 re.search('\d+\sbananów',text)
 re.search('kota . hamak',text)
 re.search('kota . hamak','Ala ma kota z hamakiem')
 re.search('kota .* hamak','Ala ma kota lub hamak')
 re.search('\.',text)
 re.search('kota|psa','Ala ma kota lub hamak')
 re.findall('kota|psa','Ala ma kota lub psa')
 re.search('kota (i|lub) psa','Ala ma kota lub psa')
 re.search('mam (kota).*(kota|psa)','Ja mam kota. Ala ma psa.').group(0)
 re.search('mam (kota).*(kota|psa)','Ja mam kota. Ala ma psa.').group(1)
 re.search('mam (kota).*(kota|psa)','Ja mam kota. Ala ma psa.').group(2)
 ```
 ### Regular expression examples 2 (explained in the labs)
 ####  ^
 ```
 re.search('[0-9]+', '123-456-789')
 re.search('[^0-9][0-9]+[^0-9]', '123-456-789')
 ```
 #### quotes
 '' and "" mean the same thing in Python
 ' ala ma psa o imieniu "Burek"'
 " ala ma psa o imieniu 'Burek' "
 ' ala ma psa o imieniu \'Burek\' '
 " ala ma psa o imieniu \"Burek\" "
 #### multiline string
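A minimal sketch of a multiline string, assuming the heading refers to Python's triple-quoted literals; the variable name `poem` is only illustrative.

```python
# A triple-quoted literal may span several lines;
# the line breaks become '\n' characters inside the string.
poem = """Ala ma kota
Ala ma psa"""

print(poem.count('\n'))    # 1
print(poem.splitlines())   # ['Ala ma kota', 'Ala ma psa']
```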
 #### raw string
 In a raw string, \ characters are treated as ordinary \ characters,
 although a \ before a quote still escapes it even in a raw string (the \ then also stays in the string unchanged).
 https://docs.python.org/3/reference/lexical_analysis.html
 Good practice: escape everywhere.
 ```
 '\\'
 print('\\')
 r'\\'
 print(r'\\')
 print("abcd")
 print("ab\cd")
 print(r"ab\cd")
 print("ab\nd")
 print(r"ab\nd")
 print("\"")
 print(r"\"")
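 print("\")   # SyntaxError: the backslash escapes the closing quote
 print(r"\")  # SyntaxError: a raw string cannot end with a single backslash
 re.search('\\', r'a\bc')  # re.error: the pattern is a lone backslash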
 re.search(r'\\', r'a\bc')
 re.search('\\\\', r'a\bc')
 ```
 #### RE SUB
 ```
 re.sub(pattern, replacement, string)
 re.sub('a','b', 'ala ma kota')
 ```
 #### backreferences:
 ```
 re.search(r' \d+ \d+', 'ala ma 41 41 kota')
 re.search(r' \d+ \d+', 'ala ma 41 123 kota')
 re.search(r' (\d+) \1', 'ala ma 41 41 kota')
 re.search(r' (\d+) \1', 'ala ma 41 123 kota')
 ```
 #### lookahead (a kind of assertion):
 ```
 re.search(r'ma kot', 'ala ma kot')
 re.search(r'ma kot(?=[ay])', 'ala ma kot')
 re.search(r'ma kot(?=[ay])', 'ala ma kotka')
 re.search(r'ma kot(?=[ay])', 'ala ma koty')
 re.search(r'ma kot(?=[ay])', 'ala ma kota')
 re.search(r'ma kot(?![ay])', 'ala ma kot')
 re.search(r'ma kot(?![ay])', 'ala ma kotka')
 re.search(r'ma kot(?![ay])', 'ala ma koty')
 re.search(r'ma kot(?![ay])', 'ala ma kota')
 ```
 #### named groups
 ```
 r = re.search(r'ma (?P<ilekotow>\d+) kotow i (?P<ilepsow>\d+) psow', 'ala ma 100 kotow i 200 psow')
 r.groups()
 r.group('ilekotow')
 r.group('ilepsow')
 ```
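A small sketch of how a named group can be read back from the match object. The group names `year` and `month` and the sample string are made up for illustration.

```python
import re

m = re.search(r'(?P<year>\d{4})-(?P<month>\d{2})', 'released 2021-11')

print(m.group('year'))   # 2021
print(m.group(2))        # 11 (named groups are still numbered)
print(m.groups())        # ('2021', '11')
print(m.groupdict())     # {'year': '2021', 'month': '11'}
```

`group(name)` returns one group, while `groups()` always returns all of them as a tuple; its optional argument is only a default value for groups that did not take part in the match.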
 #### re.split
 ```
 ('a,b.c,d').split(',')
 ('a,b.c,d').split(',.')
 re.split(r',', 'a,b.c,d') 
 re.split(r'[.,]', 'a,b.c,d') 
 ```
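For comparison, the calls above can be run with their results written out. `str.split` treats its argument as one literal separator, while `re.split` accepts a pattern.

```python
import re

s = 'a,b.c,d'

print(s.split(','))            # ['a', 'b.c', 'd']
print(s.split(',.'))           # ['a,b.c,d'] (the literal separator ',.' never occurs)
print(re.split(r'[.,]', s))    # ['a', 'b', 'c', 'd']
```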
 #### \w word character
 ```
 # \w - matches a Unicode word character; with the ASCII flag it is [a-zA-Z0-9_]
 # \W - the opposite of \w; with the ASCII flag it is [^a-zA-Z0-9_]
 re.findall(r'\w+', 'ala ma 3 koty.')
 re.findall(r'\W+', 'ala ma 3 koty.')
 ```
 #### start or end of a word | word boundary
 ```
 re.search(r'\bkot\b', 'Ala ma kota')
 re.search(r'\bkot\b', 'Ala ma kot')
 re.search(r'\bkot\b', 'Ala ma kot.')
 re.search(r'\bkot\b', 'Ala ma kot ')
 re.search(r'\Bot\B', 'Ala ma kot ')
 re.search(r'\Bot\B', 'Ala ma kota ')
 ```
 #### MULTILINE
 ```
 re.findall(r'^Ma', 'Ma kota Ala\nMa psa Jacek') 
 re.findall(r'^Ma', 'Ma kota Ala\nMa psa Jacek', re.MULTILINE)
 ```
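The effect of the flag is easier to see with the results written out. The last line is an extra illustration showing that `$` is affected in the same way as `^`.

```python
import re

text = 'Ma kota Ala\nMa psa Jacek'

# Without the flag, ^ matches only at the very start of the string.
print(re.findall(r'^Ma', text))                  # ['Ma']

# With re.MULTILINE, ^ also matches right after every newline.
print(re.findall(r'^Ma', text, re.MULTILINE))    # ['Ma', 'Ma']

# Likewise, $ then matches before every newline and at the end.
print(re.findall(r'\w+$', text, re.MULTILINE))   # ['Ala', 'Jacek']
```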
 #### RE.COMPILE
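A minimal sketch of `re.compile`, which the heading names without giving an example. The variable name `number` and the sample strings are assumptions made for illustration.

```python
import re

# Compile the pattern once and reuse it; the pattern object offers
# the same operations as the module-level functions.
number = re.compile(r'\d+')

print(number.search('Ala has 150 bananas').group())   # 150
print(number.findall('10 cats and 20 dogs'))          # ['10', '20']
print(number.sub('N', 'from 3 to 5'))                 # from N to N
```

Compiling is mainly a convenience when the same pattern is applied many times, for example to every line of an input file.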
 ## Class 6
 installation: https://pypi.org/project/google-re2/
 ### DFA and NDFA
 ```
 import re2 as re
 n = 50
 regexp =  "a?"*n+"a"*n
 s = "a"*n
 re.match(regexp, s)
 ```
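The pattern above is a well-known hard case for backtracking matchers, which may try exponentially many ways of skipping the optional `a?` parts. An automaton-based matcher instead tracks the set of states it could be in. Below is a hand-written sketch of that idea for this one pattern only. The function name `matches_optional_then_required` is hypothetical and this is not how re2 is implemented internally.

```python
def matches_optional_then_required(n, s):
    """Full-match s against 'a?'*n + 'a'*n by tracking a set of NFA states.

    State i means "the first i pattern elements are already consumed";
    elements 0..n-1 are optional, elements n..2n-1 are required.
    """
    def closure(states):
        # An optional element may be skipped without reading any input.
        result = set(states)
        for state in states:
            while state < n:
                state += 1
                result.add(state)
        return result

    active = closure({0})
    for ch in s:
        # Every pattern element consumes exactly one 'a'.
        advanced = {state + 1 for state in active if state < 2 * n and ch == 'a'}
        active = closure(advanced)
    return 2 * n in active


print(matches_optional_then_required(50, 'a' * 50))   # True
print(matches_optional_then_required(50, 'a' * 49))   # False
```

The work per input character is bounded by the number of states (at most 2n + 1), so the running time does not blow up as n grows.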
 ```
 re.match(r"(\d)abc\1", "3abc3") # re2 does not support backreferences
 ```
 re2 max memory - raising the limit
 time # measuring running time
 for anyone who wants to read more:
 https://swtch.com/~rsc/regexp/regexp1.html
 ### UTF-8
 ```
 c = "ℋ"
 ord(c)
 chr(8459)
 8* 16**2 + 0 * 16**(1) + 0*16**(0)
 15*16**3 + 15* 16**2 + 15 * 16**(1) + 15*16**(0)
 ```
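The two sums above evaluate to 2048 (0x800) and 65535 (0xFFFF), which bound the code point range that UTF-8 encodes in three bytes. A short sketch connecting this to the example character:

```python
c = 'ℋ'
code_point = ord(c)

print(code_point)                      # 8459
print(hex(code_point))                 # 0x210b
print(0x800 <= code_point <= 0xFFFF)   # True: this range needs three bytes in UTF-8

encoded = c.encode('utf-8')
print(encoded)                         # b'\xe2\x84\x8b'
print(len(encoded))                    # 3
print(encoded.decode('utf-8') == c)    # True
```

The same three bytes are what `xxd` shows when the character is saved to a UTF-8 file.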
 ```
 xxd -b file
 xxd  file
 ```
 task submission deadline: 15 November
 ## Class 7
 https://www.openfst.org/twiki/bin/view/GRM/Thrax
 https://www.cs.jhu.edu/~jason/465/hw-ofst/hw-ofst.pdf
 Please model all tasks on `TaskH00`. Put the grammar in the file `grammar.grm` and
 name the final rule `FinalRule`.
 ## TEST
 Operators covered by the test
 =============================
 * quantifiers: `-` `*` `+` `?` `{n}` `{n,}` `{n, m}`
 * alternation: `|`
 * character classes: `[...]`
 * negated character classes: `[^...]`
 * any character: `.`
 * escaping special characters: `\`
 * anchors: `^` `$`
 On the test, each of the 4 questions has 3 sub-items. Each sub-item is answered YES/NO. The test lasts 15 minutes.
 - always use the caret and the dollar sign
 - capturing is not taken into account (in questions about equivalent expressions)
 - I suggest printing the whole test in the version without answers and checking your answers
 To pass, you need at least 10 points.
--- a/TaskB00/description.txt
+++ b/TaskB00/description.txt
@ -1,8 +0,0 @@
 Read a description of a deterministic finite-state automaton in the AT&T format
 (without weights) from the file in the first argument.
 Read strings from the standard input.
 If a string is accepted by the
 automaton, write YES; otherwise, write NO.
 The program is invoked like this: ./run.py fsa_description.arg < test1.in > test1.out
--- a/TaskB00/fsa_description.arg
+++ b/TaskB00/fsa_description.arg
@ -1,16 +0,0 @@
 0	1	x
 1	2	y
 2	3	z
 0	4	y
 0	4	z
 1	4	x
 1	4	z
 2	4	x
 2	4	y
 3	4	x
 3	4	y
 3	4	z
 4	4	x
 4	4	y
 4	4	z
 3
--- a/TaskB00/run.py
+++ b/TaskB00/run.py
@ -1,29 +0,0 @@
 import sys
 def load_fsa(path):
     """Read an AT&T-format FSA: 'src<TAB>dst<TAB>symbol' arcs; one-field lines are final states."""
     transitions, final_states = {}, set()
     with open(path, encoding='utf-8') as fsa_file:
         for row in fsa_file:
             fields = row.rstrip('\n').split('\t')
             if len(fields) == 1 and fields[0]:
                 final_states.add(fields[0])
             elif len(fields) >= 3:
                 transitions[(fields[0], fields[2])] = fields[1]
     return transitions, final_states
 def accepts(transitions, final_states, word):
     """Return True if the automaton ends in a final state after reading word."""
     state = '0'  # the start state is assumed to be 0, as in fsa_description.arg
     for character in word:
         state = transitions.get((state, character))
         if state is None:  # no matching arc: reject
             return False
     return state in final_states
 if __name__ == '__main__':
     # Invoked as: ./run.py fsa_description.arg < test1.in > test1.out
     transitions, final_states = load_fsa(sys.argv[1])
     for line in sys.stdin:
         word = line.rstrip('\n')
         print('YES' if accepts(transitions, final_states, word) else 'NO')
--- a/TaskB00/test1.exp
+++ b/TaskB00/test1.exp
@ -1,9 +0,0 @@
 NO
 YES
 NO
 NO
 NO
 NO
 NO
 NO
 NO
--- a/TaskB00/test1.in
+++ b/TaskB00/test1.in
@ -1,9 +0,0 @@
 xxyz
 xyz
 xy
 zz
 xxy
 yzx
 x
 xyzz
--- a/TaskB00/test1.out
+++ b/TaskB00/test1.out
@ -1,8 +0,0 @@
 NO
 YES
 NO
 NO
 NO
 NO
 NO
 NO
--- a/TaskB01/description.txt
+++ b/TaskB01/description.txt
@ -1,10 +0,0 @@
 Use a deterministic finite-state automaton (FSA) engine from the TaskE00.
 Create your own FSA description to check whether the string starts with "01" and ends with "01".
 Save it to the fsa_description.arg file.
 The alphabet is "0", "1".
 Read strings from the standard input.
 If a string is accepted by the
 automaton, write YES; otherwise, write NO.
--- a/TaskB01/test.exp
+++ b/TaskB01/test.exp
@ -1,14 +0,0 @@
 YES
 NO
 YES
 NO
 YES
 NO
 NO
 YES
 NO
 NO
 NO
 NO
 NO
 NO
--- a/TaskB01/test.in
+++ b/TaskB01/test.in
@ -1,14 +0,0 @@
 01
 10
 0101
 1010
 011101
 101010
 100010
 0100001
 00110
 0000
 10101
 0
 1
--- a/TaskB02/description.txt
+++ b/TaskB02/description.txt
@ -1,9 +0,0 @@
 Use a deterministic finite-state automaton (FSA) engine from the TaskE00.
 Create your own FSA description to check whether the string starts with "10" and ends with "10".
 Save it to the fsa_description.arg file.
 The alphabet is "0", "1".
 Read strings from the standard input.
 If a string is accepted by the
 automaton, write YES; otherwise, write NO.
--- a/TaskB02/test.exp
+++ b/TaskB02/test.exp
@ -1,14 +0,0 @@
 NO
 YES
 NO
 YES
 NO
 YES
 YES
 NO
 NO
 NO
 NO
 NO
 NO
 NO
--- a/TaskB02/test.in
+++ b/TaskB02/test.in
@ -1,14 +0,0 @@
 01
 10
 0101
 1010
 011101
 101010
 100010
 0100001
 00110
 0000
 10101
 0
 1
--- a/TaskB03/description.txt
+++ b/TaskB03/description.txt
@ -1,11 +0,0 @@
 Use a deterministic finite-state automaton (FSA) engine from the TaskE00.
 Create your own FSA description to check whether the string contains "0"
 an even number of times.
 Save it to the fsa_description.arg file.
 The alphabet is "0", "1".
 Read strings from the standard input.
 If a string is accepted by the
 automaton, write YES; otherwise, write NO.
--- a/TaskB03/test.exp
+++ b/TaskB03/test.exp
@ -1,14 +0,0 @@
 NO
 NO
 YES
 YES
 YES
 NO
 YES
 NO
 YES
 NO
 YES
 YES
 NO
 YES
--- a/TaskB03/test.in
+++ b/TaskB03/test.in
@ -1,14 +0,0 @@
 01
 10
 0101
 1010
 011101
 101010
 100010
 0100001
 00110
 0000
 10101
 0
 1
--- a/TaskB04/description.txt
+++ b/TaskB04/description.txt
@ -1,11 +0,0 @@
 Use a deterministic finite-state automaton (FSA) engine from the TaskE00.
 Create your own FSA description to check whether the string contains "0"
 an odd number of times.
 Save it to the fsa_description.arg file.
 The alphabet is "0", "1".
 Read strings from the standard input.
 If a string is accepted by the
 automaton, write YES; otherwise, write NO.
--- a/TaskB04/test.exp
+++ b/TaskB04/test.exp
@ -1,14 +0,0 @@
 YES
 YES
 NO
 NO
 NO
 YES
 NO
 YES
 NO
 YES
 NO
 NO
 YES
 NO
--- a/TaskB04/test.in
+++ b/TaskB04/test.in
@ -1,14 +0,0 @@
 01
 10
 0101
 1010
 011101
 101010
 100010
 0100001
 00110
 0000
 10101
 0
 1
--- a/TaskB05/description.txt
+++ b/TaskB05/description.txt
@ -1,10 +0,0 @@
 Use a deterministic finite-state automaton (FSA) engine from the TaskB00.
 Create your own FSA description to check whether the line contains the string '19DD', where D is a digit.
 Save it to the fsa_description.arg file.
 The FSA alphabet is '0123456789x'.
 Read strings from the standard input.
 If a string is accepted by the
 automaton, write YES; otherwise, write NO.
--- a/TaskB05/polish_wiki_excerpt_only_digits.exp
+++ b/TaskB05/polish_wiki_excerpt_only_digits.exp
--- a/TaskB05/polish_wiki_excerpt_only_digits.in
+++ b/TaskB05/polish_wiki_excerpt_only_digits.in
--- a/TaskB05/simple.exp
+++ b/TaskB05/simple.exp
@ -1,6 +0,0 @@
 NO
 YES
 NO
 NO
 YES
 YES
--- a/TaskB05/simple.in
+++ b/TaskB05/simple.in
@ -1,6 +0,0 @@
 3214545443
 1910
 19
 xxx2190x
 xxx21905x
 1905x54545
--- a/TaskB06/description.txt
+++ b/TaskB06/description.txt
@ -1,9 +0,0 @@
 Use a deterministic finite-state automaton (FSA) engine from the previous task.
 Create your own FSA description to check whether the word "hamlet" is in the given line.
 Save it to the fsa_description.arg file.
 The FSA alphabet is 'abcdefghijklmnopqrstuvwxyz '.
 Read strings from the standard input.
 If a string is accepted by the
 automaton, write YES; otherwise, write NO.
--- a/TaskB06/shakespeare_ascii_lower.exp
+++ b/TaskB06/shakespeare_ascii_lower.exp
--- a/TaskB06/shakespeare_ascii_lower.in
+++ b/TaskB06/shakespeare_ascii_lower.in
--- a/TaskB06/simple.exp
+++ b/TaskB06/simple.exp
@ -1,3 +0,0 @@
 NO
 YES
 YES
--- a/TaskB06/simple.in
+++ b/TaskB06/simple.in
@ -1,3 +0,0 @@
 haml
 hamlet
 aaahamletbbb
--- a/TaskB07/description.txt
+++ b/TaskB07/description.txt
@ -1,10 +0,0 @@
 Use a deterministic finite-state automaton (FSA) engine from the previous task.
 Create your own FSA description to check whether the word "ophelia" is in the given line.
 Save it to the fsa_description.arg file.
 The FSA alphabet is 'abcdefghijklmnopqrstuvwxyz '.
 Read strings from the standard input.
 If a string is accepted by the
 automaton, write YES; otherwise, write NO.
--- a/TaskB07/shakespeare_ascii_lower.exp
+++ b/TaskB07/shakespeare_ascii_lower.exp
--- a/TaskB07/simple.exp
+++ b/TaskB07/simple.exp
@ -1,3 +0,0 @@
 NO
 YES
 YES
--- a/TaskB07/simple.in
+++ b/TaskB07/simple.in
@ -1,3 +0,0 @@
 oph
 ophelia
 xfdfdopheliafff
--- a/TaskB08/description.txt
+++ b/TaskB08/description.txt
@ -1,10 +0,0 @@
 Use a deterministic finite-state automaton (FSA) engine from the previous task.
 Create your own FSA description to check whether the word "juliet" is in the given line.
 Save it to the fsa_description.arg file.
 The FSA alphabet is 'abcdefghijklmnopqrstuvwxyz '.
 Read strings from the standard input.
 If a string is accepted by the
 automaton, write YES; otherwise, write NO.
--- a/TaskB08/shakespeare_ascii_lower.exp
+++ b/TaskB08/shakespeare_ascii_lower.exp
--- a/TaskB08/simple.exp
+++ b/TaskB08/simple.exp
@ -1,3 +0,0 @@
 NO
 YES
 YES
--- a/TaskB08/simple.in
+++ b/TaskB08/simple.in
@ -1,3 +0,0 @@
 juli
 juliet
 dgfdgjulietaaa
--- a/TaskB09/description.txt
+++ b/TaskB09/description.txt
@ -1,10 +0,0 @@
 Use a deterministic finite-state automaton (FSA) engine from the previous task.
 Create your own FSA description to check whether the word "macbeth" is in the given line.
 Save it to the fsa_description.arg file.
 The FSA alphabet is 'abcdefghijklmnopqrstuvwxyz '.
 Read strings from the standard input.
 If a string is accepted by the
 automaton, write YES; otherwise, write NO.
--- a/TaskB09/shakespeare_ascii_lower.exp
+++ b/TaskB09/shakespeare_ascii_lower.exp
--- a/TaskB09/simple.exp
+++ b/TaskB09/simple.exp
@ -1,3 +0,0 @@
 NO
 YES
 YES
--- a/TaskB09/simple.in
+++ b/TaskB09/simple.in
@ -1,3 +0,0 @@
 macb
 macbeth
 xadadamacbethrff