DJFZ 2020 Laboratory 2
Update the repo
First, please update your repo. There are new tasks.
git pull git@git.wmi.amu.edu.pl:filipg/djfz-2020.git
Tasks to do on your own
Tasks from sections B and C: regular expressions. The deadline is the end of November 22nd.
Please do all B tasks. These are the same tasks as in section A, but they differ in the solution method: this time, please use regular expressions. When writing the solutions, pay attention to the difference in complexity and execution time between regular expressions (B) and solutions based on basic mechanisms (A).
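One way to compare execution time is the standard timeit module. The sketch below is only an illustration: the "does the line contain a digit" check and the function names has_digit_basic and has_digit_regex are hypothetical and are not taken from the repository tasks.

```python
import re
import timeit

line = 'Ala has a cat and hammock, and 150 bananas.'

# Hypothetical "A-style" solution: basic string mechanisms only
def has_digit_basic(text):
    return any(ch.isdigit() for ch in text)

# Hypothetical "B-style" solution: a precompiled regular expression
DIGIT = re.compile(r'\d')

def has_digit_regex(text):
    return DIGIT.search(text) is not None

# Both approaches should agree before their speed is compared
same_answer = has_digit_basic(line) == has_digit_regex(line)

# Time 10000 calls of each variant; the numbers depend on the machine
basic_time = timeit.timeit(lambda: has_digit_basic(line), number=10000)
regex_time = timeit.timeit(lambda: has_digit_regex(line), number=10000)

print(same_answer)
print(f'basic: {basic_time:.4f} s, regex: {regex_time:.4f} s')
```

Which variant is faster may differ from task to task, so it is worth measuring rather than guessing.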
Each of you is assigned exactly 4 C tasks (together they count as one "easy task" for section C, although you don't have to do all of them). Note: which tasks fall to you depends on your student index number! The tasks are grouped into 4 blocks:
- TaskC00-TaskC09 - the remainder of the index number divided by 10,
- TaskC10-TaskC36 - the remainder of the index number divided by 27,
- TaskC37-TaskC43 - the remainder of the index number divided by 7,
- TaskC44-TaskC48 - the remainder of the index number divided by 5.
Each of you therefore has exactly one task from each of these 4 blocks. Please check in the repository which task from each block is assigned to you.
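The remainders for the four blocks can be computed with a few lines of Python. This is a minimal sketch: the index number 123456 is made up, and the rule "task number = first task in the block + remainder" is an assumption that happens to fit the block sizes. The repository remains the authoritative source for the assignment.

```python
# (first task number in the block, divisor) for each of the 4 blocks
BLOCKS = [(0, 10), (10, 27), (37, 7), (44, 5)]

def candidate_tasks(index_number):
    """Return (remainder, candidate task name) for every block.

    ASSUMPTION: the task number is the first task of the block plus the
    remainder. Verify the result against the repository.
    """
    result = []
    for first_task, divisor in BLOCKS:
        remainder = index_number % divisor
        result.append((remainder, f'TaskC{first_task + remainder:02d}'))
    return result

# 123456 is a made-up index number used only for illustration
for remainder, task in candidate_tasks(123456):
    print(remainder, task)
```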
You can get a total of 16 points for the second laboratory.
Regular expressions
We will work with regular expressions in Python 3. Documentation: https://docs.python.org/3/library/re.html.
Basic functions
search - returns the first match found anywhere in the string
findall - returns a list of all non-overlapping matches
match - returns a match only if it starts at the beginning of the string
These are just the basic functions that we will use. The documentation describes them all.
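The difference between the three functions is easiest to see on one short string. A small sketch (the variable names are only illustrative):

```python
import re

word = 'banana'

first = re.search('na', word)      # first match anywhere in the string
at_start = re.match('na', word)    # None: 'banana' does not start with 'na'
prefix = re.match('ba', word)      # a match, because 'ba' is at position 0
all_ana = re.findall('ana', word)  # matches do not overlap

print(first.span())    # (2, 4)
print(at_start)        # None
print(prefix.group())  # ba
print(all_ana)         # ['ana']
```

Note that findall reports 'ana' only once: the second possible 'ana' overlaps the first one, so it is skipped.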
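Match object
import re

answer = re.search('na', 'banana')
print(answer)          # <re.Match object; span=(2, 4), match='na'>
print(answer.start())  # 2
print(answer.end())    # 4
print(answer.group())  # na

answer = re.search('na', 'kabanos')
print(answer)          # None, because 'kabanos' does not contain 'na'
print(type(answer))    # <class 'NoneType'>

# A failed search returns None, so check the result before calling group()
if answer:
    print(answer.group())
else:
    pass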
Metacharacters
- [] - a set of characters
- . - any character (except a newline, by default)
- ^ - beginning of the string
- $ - end of the string
- ? - zero or one occurrence of the preceding element
- * - zero or more occurrences of the preceding element
- + - one or more occurrences of the preceding element
- {n} - exactly n occurrences of the preceding element
- | - or (alternative)
- () - group
- \ - escape character
- \d - a digit
- \D - any character except a digit
- \s - a whitespace character
- \S - any character except whitespace
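The metacharacters above can be tried out on one sample string. The string and the variable names below are made up for illustration; patterns are written as raw strings (r'...') so that Python does not interpret the backslashes itself.

```python
import re

sample = 'cat, cot, coat, ct, 42 dogs.'

any_char = re.findall(r'c.t', sample)           # ['cat', 'cot']
optional = re.findall(r'co?t', sample)          # ['cot', 'ct']
one_or_more = re.findall(r'c[ao]+t', sample)    # ['cat', 'cot', 'coat']
zero_or_more = re.findall(r'c[ao]*t', sample)   # ['cat', 'cot', 'coat', 'ct']
exactly_two = re.findall(r'c[ao]{2}t', sample)  # ['coat']
digits = re.findall(r'\d+', sample)             # ['42']
non_digits = re.findall(r'\D+', 'a1b22c')       # ['a', 'b', 'c']
spaces = re.findall(r'\s', sample)              # five single spaces
either = re.findall(r'cat|dogs', sample)        # ['cat', 'dogs']
starts = re.search(r'^cat', sample)             # matches at the beginning
ends = re.search(r'dogs\.$', sample)            # escaped dot, end of string
pair = re.search(r'(\d+) (dogs)', sample)       # two groups

print(any_char, optional, one_or_more, zero_or_more, exactly_two)
print(digits, non_digits, len(spaces), either)
print(starts.group(), ends.group(), pair.groups())
```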
Flags
You can pass special flags as an extra argument, for example:
re.search('ma', 'AlA Ma KoTa', re.IGNORECASE)
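A short sketch of how the flag changes the result, using the same phrase. re.I is the documented short alias of re.IGNORECASE, and (?i) is the equivalent inline form; the variable names are only illustrative.

```python
import re

phrase = 'AlA Ma KoTa'

no_flag = re.search('ma', phrase)                    # None: case matters
with_flag = re.search('ma', phrase, re.IGNORECASE)   # finds 'Ma'
short_flag = re.search('ma', phrase, re.I)           # re.I is the same flag
inline_flag = re.search('(?i)ma', phrase)            # flag inside the pattern
all_a = re.findall('a', phrase, re.IGNORECASE)       # both 'A' and 'a'

print(no_flag)              # None
print(with_flag.group())    # Ma
print(short_flag.group())   # Ma
print(inline_flag.group())  # Ma
print(all_a)                # ['A', 'A', 'a', 'a']
```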
Examples (explained during labs)
For experimenting, it is best to use an interactive Python interpreter, preferably IPython.
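import re
text = 'Ala has a cat and hammock, and 150 bananas.'
re.search('has', text)                 # finds 'has' anywhere in the string
re.match('has', text)                  # None: the string does not start with 'has'
re.match('Ala has', text)              # matches 'Ala has' at the beginning
re.findall('has', text)                # ['has']
re.findall('[mn]a', text)              # ['na', 'na']
re.findall('[0-9]', text)              # ['1', '5', '0']
re.findall('[0-9abc]', text)           # every digit and every letter a, b or c
re.findall('[a-z][a-z]na[a-z]', text)  # ['banan']
re.findall('[a-zA-Z][a-zA-Z]na[a-zA-Z0-9]', text)  # ['banan']
re.findall(r'\d', text)                # ['1', '5', '0']
re.search('[0-9][0-9][0-9]', text)     # '150'
re.search(r'[\d][\d][\d]', text)       # '150'
re.search(r'\d{2}', text)              # '15'
re.search(r'\d{3}', text)              # '150'
re.search(r'\d+', text)                # '150'
re.search(r'\d+ bananas', text)        # '150 bananas'
re.search(r'\d* bananas', 'Ala has a lot of bananas')  # ' bananas' (zero digits)
re.search(r'\d* bananas', text)        # '150 bananas'
re.search(r'has \d? bananas', 'Ala has 5 bananas')   # 'has 5 bananas'
re.search(r'has ?\d? bananas', 'Ala has bananas')    # 'has bananas'
re.search(r'has(\d)? bananas', 'Ala has bananas')    # 'has bananas', group 1 is None
re.search(r'\d+ bananas', 'Ala has 10 bananas or 20 bananas')   # '10 bananas'
re.search(r'\d+ bananas$', 'Ala has 10 bananas or 20 bananas')  # '20 bananas'
text = 'Ala has a cat and hammock, and 150 bananas.'
re.search(r'\d+ bananas', text)        # '150 bananas'
re.search(r'\d+\sbananas', text)       # '150 bananas'
re.search('cat . hammock', text)       # None: '.' is exactly one character, 'and' has three
re.search('cat . hammock', 'Ala has a cat with a hammock')  # None for the same reason
re.search('cat .* hammock', 'Ala has a cat or hammock')     # 'cat or hammock'
re.search(r'\.', text)                 # the literal '.' at the end of the text
re.search('cat|dog', 'Ala has a cat or hammock')       # 'cat'
re.findall('cat|dog', 'Ala has a cat or a dog')        # ['cat', 'dog']
re.search('cat (and|or) dog', 'Ala has a cat or dog')  # 'cat or dog'
re.search('have a (cat).*(cat|dog)', 'I have a cat. Ala has a dog.').group(0)  # 'have a cat. Ala has a dog'
re.search('have a (cat).*(cat|dog)', 'I have a cat. Ala has a dog.').group(1)  # 'cat'
re.search('have a (cat).*(cat|dog)', 'I have a cat. Ala has a dog.').group(2)  # 'dog'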
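The last three lines rely on the fact that .* is greedy: it takes as many characters as possible, so the second group is matched as far to the right as it can be. A minimal sketch with English sentences made up for illustration:

```python
import re

pattern = r'have a (cat).*(cat|dog)'

one_dog = re.search(pattern, 'I have a cat. Ala has a dog.')
cat_and_dog = re.search(pattern, 'I have a cat. Ala has a cat and a dog.')

# group(0) is the whole match, groups() returns all numbered groups at once
print(one_dog.group(0))      # have a cat. Ala has a dog
print(one_dog.groups())      # ('cat', 'dog')

# Greedy .* skips the second 'cat' and stops at the last alternative
print(cat_and_dog.group(0))  # have a cat. Ala has a cat and a dog
print(cat_and_dog.group(2))  # dog
```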