Use regular expressions to mark Polish first-person masculine forms. You should handle the following types of expressions:

* first-person masculine past forms of verbs ("zrobiłem", "pisałem", etc.),
* first-person singular masculine forms of the verb "być" ("be") combined with singular masculine nominative forms of adjectives ("wysoki", "sprytny", etc.), assuming that the form of "być" is to the left of the adjective, with no more than 3 other words in between,
* the verb "będę" combined with the past participle (i.e. the 3rd person masculine imperfect form, e.g. "robił", "pisał"), assuming that "będę" is either to the left of the participle, with no more than 3 other words in between, OR directly to the right of the participle ("robił będę").

The first-person masculine forms should be marked with curly brackets. You should mark only the masculine form. Do not mark the form of "być" (unless it is clearly a masculine form, i.e. for "byłem"). The match should be case-insensitive.

The PoliMorf dictionary of inflected forms should be applied: http://zil.ipipan.waw.pl/PoliMorf?action=AttachFile&do=get&target=PoliMorf-0.6.7.tab.gz

Suggested steps:

1. Extract all the needed forms from the PoliMorf dictionary:
   * 1st person masculine past forms of verbs; unfortunately, this form is not directly present in the lexicon, so you need to add "em" to the 3rd person masculine form ("zrobił" => "zrobiłem"),
   * singular masculine nominative forms of adjectives,
   * masculine past participles (3rd person masculine imperfect forms of verbs).

   You could do this using grep/cut commands to obtain simple text files with one word per line. You can do this once and commit the 3 files to your repository.
2. In your `run` script/program, read the 3 files and create a large expression with alternatives. Use a regexp library based on DFAs (deterministic finite automata).
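
The marking rules above can be sketched roughly as follows. This is only an illustration under several assumptions: the tiny in-memory lists stand in for the three files extracted from PoliMorf, all names (`alternation`, `mark`, `BYC_1SG`, etc.) are hypothetical, the set of "być" forms is a guess, and words are assumed to be separated by whitespace only. It also uses Python's built-in `re` module, which is a backtracking engine rather than a DFA-based one; it is used here purely to keep the sketch self-contained, and for the actual task a DFA-based library should be substituted. Because DFA-based engines such as RE2 do not offer look-behind, the left context is checked with a separate anchored pattern instead.

```python
import re

# Hypothetical miniature word lists; the real ones would be read from the
# three files extracted from PoliMorf (one word per line).
PAST_3SG_M = ["zrobił", "pisał", "robił", "był"]
ADJ_SG_M_NOM = ["wysoki", "sprytny"]
BYC_1SG = ["jestem", "byłem", "będę"]  # assumed set of 1st person forms of "być"

FLAGS = re.IGNORECASE  # the match should be case-insensitive


def alternation(words):
    """Join escaped words into one alternation, longest alternatives first."""
    unique = sorted(set(words), key=len, reverse=True)
    return "|".join(re.escape(w) for w in unique)


PAST = alternation(PAST_3SG_M)
ADJ = alternation(ADJ_SG_M_NOM)
BYC = alternation(BYC_1SG)

# Rule 1: 3rd person masculine past form + "em" ("zrobił" => "zrobiłem").
RULE1 = re.compile(rf"\b(?:{PAST})em\b", FLAGS)
# Candidates for rules 2 and 3; their context is verified separately.
ADJ_RE = re.compile(rf"\b(?:{ADJ})\b", FLAGS)
PART_RE = re.compile(rf"\b(?:{PAST})\b", FLAGS)
# Left context: a trigger word, then at most 3 other words, ending the prefix.
LEFT_BYC = re.compile(rf"\b(?:{BYC})(?:\s+\w+){{0,3}}\s+$", FLAGS)
LEFT_BEDE = re.compile(r"\bbędę(?:\s+\w+){0,3}\s+$", FLAGS)
# Right context: "będę" directly after the participle.
RIGHT_BEDE = re.compile(r"\s+będę\b", FLAGS)


def mark(text):
    """Return text with first-person masculine forms wrapped in {}."""
    spans = set()
    for m in RULE1.finditer(text):
        spans.add(m.span())
    for m in ADJ_RE.finditer(text):
        if LEFT_BYC.search(text[:m.start()]):
            spans.add(m.span())
    for m in PART_RE.finditer(text):
        left = LEFT_BEDE.search(text[:m.start()])
        right = RIGHT_BEDE.match(text, m.end())
        if left or right:
            spans.add(m.span())
    # Rebuild the text, inserting braces around every collected span.
    out, pos = [], 0
    for start, end in sorted(spans):
        out.append(text[pos:start] + "{" + text[start:end] + "}")
        pos = end
    return "".join(out) + text[pos:]


if __name__ == "__main__":
    for line in ["Byłem wysoki", "Będę jutro pisał", "Robił będę"]:
        print(mark(line))
```

Collecting spans on the original text and inserting the braces afterwards keeps the rules independent of each other, so a form such as "byłem" can be marked by the first rule and still serve as the left context for an adjective.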