2024-programowanie-w-python.../zajecia6/1.ipynb

8490 lines
678 KiB
Plaintext
Raw Normal View History

2024-12-08 13:26:08 +01:00
{
"cells": [
{
"cell_type": "markdown",
"id": "4426f633-9980-425c-ab4d-54c578fac9e1",
"metadata": {},
"source": [
"## Dzielenie tekstu"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "b1959d8c-f8bf-4bf3-82df-53ed113c1c7f",
"metadata": {},
"outputs": [],
"source": [
"text = \"\"\"Python is dynamically-typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly, procedural), object-oriented and functional programming. It is often described as a \"batteries included\" language due to its comprehensive ...\"\"\"\n"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "63f1426f-fa90-48cb-b35c-0a3c5ca4354f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['Python is dynamically-typed and garbage-collected',\n",
" ' It supports multiple programming paradigms, including structured (particularly, procedural), object-oriented and functional programming',\n",
" ' It is often described as a \"batteries included\" language due to its comprehensive ',\n",
" '',\n",
" '',\n",
" '']"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"text.split('.')"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "5b8bcda8-622e-46ae-af75-67eb68f5882d",
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"['Python',\n",
" 'is',\n",
" 'dynamically-typed',\n",
" 'and',\n",
" 'garbage-collected.',\n",
" 'It',\n",
" 'supports',\n",
" 'multiple',\n",
" 'programming',\n",
" 'paradigms,',\n",
" 'including',\n",
" 'structured',\n",
" '(particularly,',\n",
" 'procedural),',\n",
" 'object-oriented',\n",
" 'and',\n",
" 'functional',\n",
" 'programming.',\n",
" 'It',\n",
" 'is',\n",
" 'often',\n",
" 'described',\n",
" 'as',\n",
" 'a',\n",
" '\"batteries',\n",
" 'included\"',\n",
" 'language',\n",
" 'due',\n",
" 'to',\n",
" 'its',\n",
" 'comprehensive',\n",
" '...']"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"text.split(' ')"
]
},
{
"cell_type": "markdown",
"id": "64319dd5-2d01-465c-a97a-ab77f6ee2651",
"metadata": {},
"source": [
"### Inne podejście do dzielenia tekstu"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "d8c2983c-0f94-4385-bcf7-c3e03e606d65",
"metadata": {},
"outputs": [],
"source": [
"import nltk"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "bc3227a4-136b-4647-a0cc-03b7303d5060",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['Python is dynamically-typed and garbage-collected.',\n",
" 'It supports multiple programming paradigms, including structured (particularly, procedural), object-oriented and functional programming.',\n",
" 'It is often described as a \"batteries included\" language due to its comprehensive ...']"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nltk.tokenize.sent_tokenize(text)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "4357ecc4-533a-4df5-b6f9-c72ecef31ea0",
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"['Python',\n",
" 'is',\n",
" 'dynamically-typed',\n",
" 'and',\n",
" 'garbage-collected',\n",
" '.',\n",
" 'It',\n",
" 'supports',\n",
" 'multiple',\n",
" 'programming',\n",
" 'paradigms',\n",
" ',',\n",
" 'including',\n",
" 'structured',\n",
" '(',\n",
" 'particularly',\n",
" ',',\n",
" 'procedural',\n",
" ')',\n",
" ',',\n",
" 'object-oriented',\n",
" 'and',\n",
" 'functional',\n",
" 'programming',\n",
" '.',\n",
" 'It',\n",
" 'is',\n",
" 'often',\n",
" 'described',\n",
" 'as',\n",
" 'a',\n",
" '``',\n",
" 'batteries',\n",
" 'included',\n",
" \"''\",\n",
" 'language',\n",
" 'due',\n",
" 'to',\n",
" 'its',\n",
" 'comprehensive',\n",
" '...']"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nltk.tokenize.word_tokenize(text)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "724cca10-d2a3-4de8-a398-5c4b9c3e6041",
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"[['Python', 'is', 'dynamically-typed', 'and', 'garbage-collected', '.'],\n",
" ['It',\n",
" 'supports',\n",
" 'multiple',\n",
" 'programming',\n",
" 'paradigms',\n",
" ',',\n",
" 'including',\n",
" 'structured',\n",
" '(',\n",
" 'particularly',\n",
" ',',\n",
" 'procedural',\n",
" ')',\n",
" ',',\n",
" 'object-oriented',\n",
" 'and',\n",
" 'functional',\n",
" 'programming',\n",
" '.'],\n",
" ['It',\n",
" 'is',\n",
" 'often',\n",
" 'described',\n",
" 'as',\n",
" 'a',\n",
" '``',\n",
" 'batteries',\n",
" 'included',\n",
" \"''\",\n",
" 'language',\n",
" 'due',\n",
" 'to',\n",
" 'its',\n",
" 'comprehensive',\n",
" '...']]"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[nltk.tokenize.word_tokenize(sentence) for sentence in nltk.tokenize.sent_tokenize(text) ]"
]
},
{
"cell_type": "markdown",
"id": "e2c94ae2-5827-4e9c-9909-c798d05f9995",
"metadata": {},
"source": [
"### Odmiana słów - lematyzer i stemmer"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "786bb0f2-f87b-4b56-8dc9-c8ccabe2d1f5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"program : program\n",
"programs : program\n",
"programmer : programm\n",
"programming : program\n",
"programmers : programm\n"
]
}
],
"source": [
"ps = nltk.stem.PorterStemmer()\n",
"\n",
"for w in [\"program\", \"programs\", \"programmer\", \"programming\", \"programmers\"]:\n",
" print(w, \" : \", ps.stem(w))"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "677db8af-3540-4429-9aa5-1d973d133866",
"metadata": {},
"outputs": [],
"source": [
"from nltk.stem import WordNetLemmatizer"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "88b2a20c-addb-41ff-b4ec-ec02084ce110",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"program : program\n",
"programs : program\n",
"programmer : programmer\n",
"programming : programming\n",
"programmers : programmer\n"
]
}
],
"source": [
"for w in [\"program\", \"programs\", \"programmer\", \"programming\", \"programmers\"]:\n",
" lemmatizer = WordNetLemmatizer()\n",
" print(w, \" : \", lemmatizer.lemmatize(w))"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "643bcbf0-1c60-49d3-8b79-3330173e8a15",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"wolves : wolv\n"
]
}
],
"source": [
"w = 'wolves'\n",
"print(w, \" : \", ps.stem(w))"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "e8031260-afb7-4ed1-9c2f-b8dafd7ced21",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"wolves : wolf\n"
]
}
],
"source": [
"w = 'wolves'\n",
"print(w, \" : \", lemmatizer.lemmatize(w))"
]
},
{
"cell_type": "markdown",
"id": "5cc66f80-6a60-4133-8f95-fd1a4132701a",
"metadata": {},
"source": [
"### wykres częstości słów"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "1e9f52b4-eace-4542-8347-ae80b6f86155",
"metadata": {},
"outputs": [],
"source": [
"with open('antek.txt') as f:\n",
" antek = f.read()"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "e2c9d49a-1a35-45e4-bd2a-7ad6f3ae76a6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Bolesław Prus\\n\\nAntek\\n\\nAntek urodził się we wsi nad Wisłą.\\n\\nWieś leżała w niewielkiej dolinie. Od północy otaczały ją wzgórza spadziste, porosłe sosnowym lasem, a od południa wzgórza garbate, zasypane leszczyną, tarniną i głogiem. Tam najgłośniej śpiewały ptaki i najczęściej chodziły wiejskie dzieci rwać orzechy albo wybierać gniazda.\\n\\nKiedyś stanął na środku wsi, zdawało ci się, że oba pasma gór biegną ku sobie, ażeby zetknąć się tam, gdzie z rana wstaje czerwone słońce. Ale było to złudzenie.\\n\\nZa wsią bowiem ciągnęła się między wzgórzami dolina przecięta rzeczułką i przykryta zieloną łąką.\\n\\nTam pasano bydlątka i tam cienkonogie bociany chodziły polować na żaby kukające wieczorami.\\n\\nOd zachodu wieś miała tamę, za tamą Wisłę, a za Wisłą znowu wzgórza wapienne, nagie.\\n\\nKażdy chłopski dom, szarą słomą pokryty, miał ogródek, a w ogródku śliwki węgierki, spomiędzy których widać było komin sadzą uczerniony i pożarną drabinkę. Drabiny te zaprowadzono nie od dawna, a ludzie myśleli, że one lepiej chronić będą chaty od ognia niż dawniej bocianie gniazda. Toteż gdy płonął jaki budynek, dziwili się bardzo, ale go nie ratowali.\\n\\n— Widać, że na tego gospodarza był dopust boski — mówili między sobą. — Spalił się, choć miał przecie nową drabinę i choć zapłacił *śtraf* za starą, co to były u niej połamane szczeble.\\n\\nW takiej wsi urodził się Antek. Położyli go w niemalowanej kołysce, co została po zmarłym bracie, i sypiał w niej przez dwa lata. Potem przyszła mu na świat siostra, Rozalia, więc musiał jej miejsca ustąpić, a sam, jako osoba dorosła, przenieść się na ławę.\\n\\nPrzez ten rok kołysał siostrę, a przez cały następny — rozglądał się po świecie. Raz wpadł w rzekę, drugi raz dostał batem od przejezdnego furmana za to, że go o mało konie nie stratowały, a trzeci raz psy tak go pogryzły, że dwa tygodnie leżał na piecu. Doświadczył więc niemało. Za to w czwartym roku życia ojciec podarował mu swoją sukienną kamizelkę z mosiężnym guzikiem, a matka — kazała mu siostrę nosić.\\n\\nGdy miał pięć lat, użyto go już — do pasania świń. Ale Antek nie bardzo się za nimi oglądał. Wolał patrzeć na drugą stronę Wisły, gdzie za wapiennym wzgórzem raz na raz pokazywało się coś wysokiego i czarnego. Wyłaziło to z lewej strony jakby spod ziemi, szło w górę i upadało na prawo. Za tym pierwszym szło zaraz drugie i trzecie, takie samo czarne i wysokie.\\n\\nTymczasem świnie swoim obyczajem wlazły w kartofle. Matka spostrzegłszy to zawinęła się wedle sukiennej kamizelki Antkowej, tak że chłopiec prawie tchu nie mógł złapać. Ale że nie miał w sercu zawziętości, bo było z niego dziecko dobre, więc wykrzyczawszy się i wydrapawszy — kamizelkę, zapytał matki:\\n\\n— Matulu! A co to takie czarne chodzi za Wisłą?\\n\\nMatka spojrzała w kierunku Antkowego palca, przysłoniła oczy ręką i odparła:\\n\\n— Tam za Wisłą? Cóż to, nie widzisz, że wiatrak chodzi? A na drugi raz pilnuj świń, bo cię pokrzywami wysmaruję.\\n\\n— Acha, wiatrak! A co on, matulu, za jeden?\\n\\n— At, głupiś! — odparła matka i uciekła do swojej roboty. Gdzie ona miała czas i rozum do udzielania objaśnień o wiatrakach!\\n\\nAle chłopcu wiatrak spokojności nie dawał. Antek widywał go przecie co dzień. Widywał go i w nocy przez sen. Więc taka straszna urosła w chłopcu ciekawość, że jednego dnia zakradł się do promu, co ludzi na drugą stronę rzeki przewoził, i popłynął za Wisłę.\\n\\nPopłynął, wdrapał się na wapienną górę, akurat w tym miejscu, gdzie stało ogłoszenie, aby tędy nie chodzić, i zobaczył wiatrak. Wydał mu się budynek ten jakby dzwonnica, tylko w sobie był grubszy, a tam gdzie na dzwonnicy jest okno, miał cztery tęgie skrzydła ustawione na krzyż. Z początku nie rozumiał nic — co to i na co to? Ale wnet objaśnili mu rzecz pastu
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"antek"
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "9a778f84-4c71-48ae-a78c-95dfadaaf4b0",
"metadata": {},
"outputs": [],
"source": [
"tokenized_text = nltk.tokenize.word_tokenize(antek)"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "4db07ac7-51b4-4dbd-be48-0a0d445df267",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['Bolesław',\n",
" 'Prus',\n",
" 'Antek',\n",
" 'Antek',\n",
" 'urodził',\n",
" 'się',\n",
" 'we',\n",
" 'wsi',\n",
" 'nad',\n",
" 'Wisłą',\n",
" '.',\n",
" 'Wieś',\n",
" 'leżała',\n",
" 'w',\n",
" 'niewielkiej',\n",
" 'dolinie',\n",
" '.',\n",
" 'Od',\n",
" 'północy',\n",
" 'otaczały',\n",
" 'ją',\n",
" 'wzgórza',\n",
" 'spadziste',\n",
" ',',\n",
" 'porosłe',\n",
" 'sosnowym',\n",
" 'lasem',\n",
" ',',\n",
" 'a',\n",
" 'od',\n",
" 'południa',\n",
" 'wzgórza',\n",
" 'garbate',\n",
" ',',\n",
" 'zasypane',\n",
" 'leszczyną',\n",
" ',',\n",
" 'tarniną',\n",
" 'i',\n",
" 'głogiem',\n",
" '.',\n",
" 'Tam',\n",
" 'najgłośniej',\n",
" 'śpiewały',\n",
" 'ptaki',\n",
" 'i',\n",
" 'najczęściej',\n",
" 'chodziły',\n",
" 'wiejskie',\n",
" 'dzieci',\n",
" 'rwać',\n",
" 'orzechy',\n",
" 'albo',\n",
" 'wybierać',\n",
" 'gniazda',\n",
" '.',\n",
" 'Kiedyś',\n",
" 'stanął',\n",
" 'na',\n",
" 'środku',\n",
" 'wsi',\n",
" ',',\n",
" 'zdawało',\n",
" 'ci',\n",
" 'się',\n",
" ',',\n",
" 'że',\n",
" 'oba',\n",
" 'pasma',\n",
" 'gór',\n",
" 'biegną',\n",
" 'ku',\n",
" 'sobie',\n",
" ',',\n",
" 'ażeby',\n",
" 'zetknąć',\n",
" 'się',\n",
" 'tam',\n",
" ',',\n",
" 'gdzie',\n",
" 'z',\n",
" 'rana',\n",
" 'wstaje',\n",
" 'czerwone',\n",
" 'słońce',\n",
" '.',\n",
" 'Ale',\n",
" 'było',\n",
" 'to',\n",
" 'złudzenie',\n",
" '.',\n",
" 'Za',\n",
" 'wsią',\n",
" 'bowiem',\n",
" 'ciągnęła',\n",
" 'się',\n",
" 'między',\n",
" 'wzgórzami',\n",
" 'dolina',\n",
" 'przecięta',\n",
" 'rzeczułką',\n",
" 'i',\n",
" 'przykryta',\n",
" 'zieloną',\n",
" 'łąką',\n",
" '.',\n",
" 'Tam',\n",
" 'pasano',\n",
" 'bydlątka',\n",
" 'i',\n",
" 'tam',\n",
" 'cienkonogie',\n",
" 'bociany',\n",
" 'chodziły',\n",
" 'polować',\n",
" 'na',\n",
" 'żaby',\n",
" 'kukające',\n",
" 'wieczorami',\n",
" '.',\n",
" 'Od',\n",
" 'zachodu',\n",
" 'wieś',\n",
" 'miała',\n",
" 'tamę',\n",
" ',',\n",
" 'za',\n",
" 'tamą',\n",
" 'Wisłę',\n",
" ',',\n",
" 'a',\n",
" 'za',\n",
" 'Wisłą',\n",
" 'znowu',\n",
" 'wzgórza',\n",
" 'wapienne',\n",
" ',',\n",
" 'nagie',\n",
" '.',\n",
" 'Każdy',\n",
" 'chłopski',\n",
" 'dom',\n",
" ',',\n",
" 'szarą',\n",
" 'słomą',\n",
" 'pokryty',\n",
" ',',\n",
" 'miał',\n",
" 'ogródek',\n",
" ',',\n",
" 'a',\n",
" 'w',\n",
" 'ogródku',\n",
" 'śliwki',\n",
" 'węgierki',\n",
" ',',\n",
" 'spomiędzy',\n",
" 'których',\n",
" 'widać',\n",
" 'było',\n",
" 'komin',\n",
" 'sadzą',\n",
" 'uczerniony',\n",
" 'i',\n",
" 'pożarną',\n",
" 'drabinkę',\n",
" '.',\n",
" 'Drabiny',\n",
" 'te',\n",
" 'zaprowadzono',\n",
" 'nie',\n",
" 'od',\n",
" 'dawna',\n",
" ',',\n",
" 'a',\n",
" 'ludzie',\n",
" 'myśleli',\n",
" ',',\n",
" 'że',\n",
" 'one',\n",
" 'lepiej',\n",
" 'chronić',\n",
" 'będą',\n",
" 'chaty',\n",
" 'od',\n",
" 'ognia',\n",
" 'niż',\n",
" 'dawniej',\n",
" 'bocianie',\n",
" 'gniazda',\n",
" '.',\n",
" 'Toteż',\n",
" 'gdy',\n",
" 'płonął',\n",
" 'jaki',\n",
" 'budynek',\n",
" ',',\n",
" 'dziwili',\n",
" 'się',\n",
" 'bardzo',\n",
" ',',\n",
" 'ale',\n",
" 'go',\n",
" 'nie',\n",
" 'ratowali',\n",
" '.',\n",
" '—',\n",
" 'Widać',\n",
" ',',\n",
" 'że',\n",
" 'na',\n",
" 'tego',\n",
" 'gospodarza',\n",
" 'był',\n",
" 'dopust',\n",
" 'boski',\n",
" '—',\n",
" 'mówili',\n",
" 'między',\n",
" 'sobą',\n",
" '.',\n",
" '—',\n",
" 'Spalił',\n",
" 'się',\n",
" ',',\n",
" 'choć',\n",
" 'miał',\n",
" 'przecie',\n",
" 'nową',\n",
" 'drabinę',\n",
" 'i',\n",
" 'choć',\n",
" 'zapłacił',\n",
" '*',\n",
" 'śtraf',\n",
" '*',\n",
" 'za',\n",
" 'starą',\n",
" ',',\n",
" 'co',\n",
" 'to',\n",
" 'były',\n",
" 'u',\n",
" 'niej',\n",
" 'połamane',\n",
" 'szczeble',\n",
" '.',\n",
" 'W',\n",
" 'takiej',\n",
" 'wsi',\n",
" 'urodził',\n",
" 'się',\n",
" 'Antek',\n",
" '.',\n",
" 'Położyli',\n",
" 'go',\n",
" 'w',\n",
" 'niemalowanej',\n",
" 'kołysce',\n",
" ',',\n",
" 'co',\n",
" 'została',\n",
" 'po',\n",
" 'zmarłym',\n",
" 'bracie',\n",
" ',',\n",
" 'i',\n",
" 'sypiał',\n",
" 'w',\n",
" 'niej',\n",
" 'przez',\n",
" 'dwa',\n",
" 'lata',\n",
" '.',\n",
" 'Potem',\n",
" 'przyszła',\n",
" 'mu',\n",
" 'na',\n",
" 'świat',\n",
" 'siostra',\n",
" ',',\n",
" 'Rozalia',\n",
" ',',\n",
" 'więc',\n",
" 'musiał',\n",
" 'jej',\n",
" 'miejsca',\n",
" 'ustąpić',\n",
" ',',\n",
" 'a',\n",
" 'sam',\n",
" ',',\n",
" 'jako',\n",
" 'osoba',\n",
" 'dorosła',\n",
" ',',\n",
" 'przenieść',\n",
" 'się',\n",
" 'na',\n",
" 'ławę',\n",
" '.',\n",
" 'Przez',\n",
" 'ten',\n",
" 'rok',\n",
" 'kołysał',\n",
" 'siostrę',\n",
" ',',\n",
" 'a',\n",
" 'przez',\n",
" 'cały',\n",
" 'następny',\n",
" '—',\n",
" 'rozglądał',\n",
" 'się',\n",
" 'po',\n",
" 'świecie',\n",
" '.',\n",
" 'Raz',\n",
" 'wpadł',\n",
" 'w',\n",
" 'rzekę',\n",
" ',',\n",
" 'drugi',\n",
" 'raz',\n",
" 'dostał',\n",
" 'batem',\n",
" 'od',\n",
" 'przejezdnego',\n",
" 'furmana',\n",
" 'za',\n",
" 'to',\n",
" ',',\n",
" 'że',\n",
" 'go',\n",
" 'o',\n",
" 'mało',\n",
" 'konie',\n",
" 'nie',\n",
" 'stratowały',\n",
" ',',\n",
" 'a',\n",
" 'trzeci',\n",
" 'raz',\n",
" 'psy',\n",
" 'tak',\n",
" 'go',\n",
" 'pogryzły',\n",
" ',',\n",
" 'że',\n",
" 'dwa',\n",
" 'tygodnie',\n",
" 'leżał',\n",
" 'na',\n",
" 'piecu',\n",
" '.',\n",
" 'Doświadczył',\n",
" 'więc',\n",
" 'niemało',\n",
" '.',\n",
" 'Za',\n",
" 'to',\n",
" 'w',\n",
" 'czwartym',\n",
" 'roku',\n",
" 'życia',\n",
" 'ojciec',\n",
" 'podarował',\n",
" 'mu',\n",
" 'swoją',\n",
" 'sukienną',\n",
" 'kamizelkę',\n",
" 'z',\n",
" 'mosiężnym',\n",
" 'guzikiem',\n",
" ',',\n",
" 'a',\n",
" 'matka',\n",
" '—',\n",
" 'kazała',\n",
" 'mu',\n",
" 'siostrę',\n",
" 'nosić',\n",
" '.',\n",
" 'Gdy',\n",
" 'miał',\n",
" 'pięć',\n",
" 'lat',\n",
" ',',\n",
" 'użyto',\n",
" 'go',\n",
" 'już',\n",
" '—',\n",
" 'do',\n",
" 'pasania',\n",
" 'świń',\n",
" '.',\n",
" 'Ale',\n",
" 'Antek',\n",
" 'nie',\n",
" 'bardzo',\n",
" 'się',\n",
" 'za',\n",
" 'nimi',\n",
" 'oglądał',\n",
" '.',\n",
" 'Wolał',\n",
" 'patrzeć',\n",
" 'na',\n",
" 'drugą',\n",
" 'stronę',\n",
" 'Wisły',\n",
" ',',\n",
" 'gdzie',\n",
" 'za',\n",
" 'wapiennym',\n",
" 'wzgórzem',\n",
" 'raz',\n",
" 'na',\n",
" 'raz',\n",
" 'pokazywało',\n",
" 'się',\n",
" 'coś',\n",
" 'wysokiego',\n",
" 'i',\n",
" 'czarnego',\n",
" '.',\n",
" 'Wyłaziło',\n",
" 'to',\n",
" 'z',\n",
" 'lewej',\n",
" 'strony',\n",
" 'jakby',\n",
" 'spod',\n",
" 'ziemi',\n",
" ',',\n",
" 'szło',\n",
" 'w',\n",
" 'górę',\n",
" 'i',\n",
" 'upadało',\n",
" 'na',\n",
" 'prawo',\n",
" '.',\n",
" 'Za',\n",
" 'tym',\n",
" 'pierwszym',\n",
" 'szło',\n",
" 'zaraz',\n",
" 'drugie',\n",
" 'i',\n",
" 'trzecie',\n",
" ',',\n",
" 'takie',\n",
" 'samo',\n",
" 'czarne',\n",
" 'i',\n",
" 'wysokie',\n",
" '.',\n",
" 'Tymczasem',\n",
" 'świnie',\n",
" 'swoim',\n",
" 'obyczajem',\n",
" 'wlazły',\n",
" 'w',\n",
" 'kartofle',\n",
" '.',\n",
" 'Matka',\n",
" 'spostrzegłszy',\n",
" 'to',\n",
" 'zawinęła',\n",
" 'się',\n",
" 'wedle',\n",
" 'sukiennej',\n",
" 'kamizelki',\n",
" 'Antkowej',\n",
" ',',\n",
" 'tak',\n",
" 'że',\n",
" 'chłopiec',\n",
" 'prawie',\n",
" 'tchu',\n",
" 'nie',\n",
" 'mógł',\n",
" 'złapać',\n",
" '.',\n",
" 'Ale',\n",
" 'że',\n",
" 'nie',\n",
" 'miał',\n",
" 'w',\n",
" 'sercu',\n",
" 'zawziętości',\n",
" ',',\n",
" 'bo',\n",
" 'było',\n",
" 'z',\n",
" 'niego',\n",
" 'dziecko',\n",
" 'dobre',\n",
" ',',\n",
" 'więc',\n",
" 'wykrzyczawszy',\n",
" 'się',\n",
" 'i',\n",
" 'wydrapawszy',\n",
" '—',\n",
" 'kamizelkę',\n",
" ',',\n",
" 'zapytał',\n",
" 'matki',\n",
" ':',\n",
" '—',\n",
" 'Matulu',\n",
" '!',\n",
" 'A',\n",
" 'co',\n",
" 'to',\n",
" 'takie',\n",
" 'czarne',\n",
" 'chodzi',\n",
" 'za',\n",
" 'Wisłą',\n",
" '?',\n",
" 'Matka',\n",
" 'spojrzała',\n",
" 'w',\n",
" 'kierunku',\n",
" 'Antkowego',\n",
" 'palca',\n",
" ',',\n",
" 'przysłoniła',\n",
" 'oczy',\n",
" 'ręką',\n",
" 'i',\n",
" 'odparła',\n",
" ':',\n",
" '—',\n",
" 'Tam',\n",
" 'za',\n",
" 'Wisłą',\n",
" '?',\n",
" 'Cóż',\n",
" 'to',\n",
" ',',\n",
" 'nie',\n",
" 'widzisz',\n",
" ',',\n",
" 'że',\n",
" 'wiatrak',\n",
" 'chodzi',\n",
" '?',\n",
" 'A',\n",
" 'na',\n",
" 'drugi',\n",
" 'raz',\n",
" 'pilnuj',\n",
" 'świń',\n",
" ',',\n",
" 'bo',\n",
" 'cię',\n",
" 'pokrzywami',\n",
" 'wysmaruję',\n",
" '.',\n",
" '—',\n",
" 'Acha',\n",
" ',',\n",
" 'wiatrak',\n",
" '!',\n",
" 'A',\n",
" 'co',\n",
" 'on',\n",
" ',',\n",
" 'matulu',\n",
" ',',\n",
" 'za',\n",
" 'jeden',\n",
" '?',\n",
" '—',\n",
" 'At',\n",
" ',',\n",
" 'głupiś',\n",
" '!',\n",
" '—',\n",
" 'odparła',\n",
" 'matka',\n",
" 'i',\n",
" 'uciekła',\n",
" 'do',\n",
" 'swojej',\n",
" 'roboty',\n",
" '.',\n",
" 'Gdzie',\n",
" 'ona',\n",
" 'miała',\n",
" 'czas',\n",
" 'i',\n",
" 'rozum',\n",
" 'do',\n",
" 'udzielania',\n",
" 'objaśnień',\n",
" 'o',\n",
" 'wiatrakach',\n",
" '!',\n",
" 'Ale',\n",
" 'chłopcu',\n",
" 'wiatrak',\n",
" 'spokojności',\n",
" 'nie',\n",
" 'dawał',\n",
" '.',\n",
" 'Antek',\n",
" 'widywał',\n",
" 'go',\n",
" 'przecie',\n",
" 'co',\n",
" 'dzień',\n",
" '.',\n",
" 'Widywał',\n",
" 'go',\n",
" 'i',\n",
" 'w',\n",
" 'nocy',\n",
" 'przez',\n",
" 'sen.',\n",
" 'Więc',\n",
" 'taka',\n",
" 'straszna',\n",
" 'urosła',\n",
" 'w',\n",
" 'chłopcu',\n",
" 'ciekawość',\n",
" ',',\n",
" 'że',\n",
" 'jednego',\n",
" 'dnia',\n",
" 'zakradł',\n",
" 'się',\n",
" 'do',\n",
" 'promu',\n",
" ',',\n",
" 'co',\n",
" 'ludzi',\n",
" 'na',\n",
" 'drugą',\n",
" 'stronę',\n",
" 'rzeki',\n",
" 'przewoził',\n",
" ',',\n",
" 'i',\n",
" 'popłynął',\n",
" 'za',\n",
" 'Wisłę',\n",
" '.',\n",
" 'Popłynął',\n",
" ',',\n",
" 'wdrapał',\n",
" 'się',\n",
" 'na',\n",
" 'wapienną',\n",
" 'górę',\n",
" ',',\n",
" 'akurat',\n",
" 'w',\n",
" 'tym',\n",
" 'miejscu',\n",
" ',',\n",
" 'gdzie',\n",
" 'stało',\n",
" 'ogłoszenie',\n",
" ',',\n",
" 'aby',\n",
" 'tędy',\n",
" 'nie',\n",
" 'chodzić',\n",
" ',',\n",
" 'i',\n",
" 'zobaczył',\n",
" 'wiatrak',\n",
" '.',\n",
" 'Wydał',\n",
" 'mu',\n",
" 'się',\n",
" 'budynek',\n",
" 'ten',\n",
" 'jakby',\n",
" 'dzwonnica',\n",
" ',',\n",
" 'tylko',\n",
" 'w',\n",
" 'sobie',\n",
" 'był',\n",
" 'grubszy',\n",
" ',',\n",
" 'a',\n",
" 'tam',\n",
" 'gdzie',\n",
" 'na',\n",
" 'dzwonnicy',\n",
" 'jest',\n",
" 'okno',\n",
" ',',\n",
" 'miał',\n",
" 'cztery',\n",
" 'tęgie',\n",
" 'skrzydła',\n",
" 'ustawione',\n",
" 'na',\n",
" 'krzyż',\n",
" '.',\n",
" 'Z',\n",
" 'początku',\n",
" 'nie',\n",
" 'rozumiał',\n",
" 'nic',\n",
" '—',\n",
" 'co',\n",
" 'to',\n",
" 'i',\n",
" 'na',\n",
" 'co',\n",
" 'to',\n",
" '?',\n",
" 'Ale',\n",
" 'wnet',\n",
" 'objaśnili',\n",
" 'mu',\n",
" 'rzecz',\n",
" 'pastuchowie',\n",
" ',',\n",
" 'więc',\n",
" 'dowiedział',\n",
" 'się',\n",
" 'o',\n",
" 'wszystkim',\n",
" '.',\n",
" 'Naprzód',\n",
" 'o',\n",
" 'tym',\n",
" ',',\n",
" 'że',\n",
" 'na',\n",
" 'skrzydła',\n",
" 'dmucha',\n",
" 'wiater',\n",
" 'i',\n",
" 'kręci',\n",
" 'nimi',\n",
" 'jak',\n",
" 'liśćmi',\n",
" '.',\n",
" 'Dalej',\n",
" 'o',\n",
" 'tym',\n",
" ',',\n",
" 'że',\n",
" 'w',\n",
" 'wiatraku',\n",
" 'miele',\n",
" 'się',\n",
" 'zboże',\n",
" 'na',\n",
" 'mąkę',\n",
" ',',\n",
" 'i',\n",
" 'nareszcie',\n",
" 'o',\n",
" 'tym',\n",
" ',',\n",
" 'że',\n",
" 'przy',\n",
" 'wiatraku',\n",
" 'siedzi',\n",
" 'młynarz',\n",
" ',',\n",
" 'co',\n",
" 'żonę',\n",
" 'bije',\n",
" ',',\n",
" 'a',\n",
" 'taki',\n",
" 'jest',\n",
" 'mądry',\n",
" ',',\n",
" 'że',\n",
" 'wie',\n",
" ',',\n",
" 'jakim',\n",
" 'sposobem',\n",
" 'ze',\n",
" 'śpichrzów',\n",
" 'wyprowadza',\n",
" 'się',\n",
" 'szczury',\n",
" '.',\n",
" 'Po',\n",
" 'takiej',\n",
" 'poglądowej',\n",
" 'lekcji',\n",
" 'Antek',\n",
" 'wrócił',\n",
" 'do',\n",
" 'domu',\n",
" 'tą',\n",
" 'samą',\n",
" 'drogą',\n",
" 'co',\n",
" 'pierwej',\n",
" '.',\n",
" 'Dali',\n",
" 'mu',\n",
" 'tam',\n",
" 'przewoźnicy',\n",
" 'parę',\n",
" 'razy',\n",
" 'w',\n",
" 'łeb',\n",
" 'za',\n",
" 'swoją',\n",
" 'krwawą',\n",
" 'pracę',\n",
" ',',\n",
" 'dała',\n",
" 'mu',\n",
" 'i',\n",
" 'matka',\n",
" 'coś',\n",
" 'na',\n",
" 'sukienną',\n",
" 'kamizelkę',\n",
" ',',\n",
" 'ale',\n",
" 'to',\n",
" 'nic',\n",
" ':',\n",
" 'Antek',\n",
" 'był',\n",
" 'kontent',\n",
" ',',\n",
" 'bo',\n",
" 'zaspokoił',\n",
" 'ciekawość',\n",
" '.',\n",
" 'Więc',\n",
" 'choć',\n",
" 'położył',\n",
" 'się',\n",
" 'spać',\n",
" 'o',\n",
" 'głodzie',\n",
" ',',\n",
" 'marzył',\n",
" 'całą',\n",
" 'noc',\n",
" 'to',\n",
" 'o',\n",
" 'wiatraku',\n",
" ',',\n",
" 'co',\n",
" 'mieli',\n",
" 'zboże',\n",
" ',',\n",
" 'to',\n",
" 'o',\n",
" 'młynarzu',\n",
" ',',\n",
" 'co',\n",
" 'bije',\n",
" 'żonę',\n",
" 'i',\n",
" 'szczury',\n",
" 'wyprowadza',\n",
" 'ze',\n",
" 'śpichrzów',\n",
" '.',\n",
" 'Drobny',\n",
" 'ten',\n",
" 'wypadek',\n",
" 'stanowczo',\n",
" 'wpłynął',\n",
" 'na',\n",
" 'całe',\n",
" 'życie',\n",
" 'chłopca',\n",
" '.',\n",
" 'Od',\n",
" 'tej',\n",
" 'pory',\n",
" '—',\n",
" 'od',\n",
" 'wschodu',\n",
" 'do',\n",
" 'zachodu',\n",
" 'słońca',\n",
" '—',\n",
" 'strugał',\n",
" 'on',\n",
" 'patyki',\n",
" 'i',\n",
" 'układał',\n",
" 'je',\n",
" 'na',\n",
" 'krzyż',\n",
" '.',\n",
" 'Potem',\n",
" 'wystrugał',\n",
" 'sobie',\n",
" 'kolumnę',\n",
" ';',\n",
" 'próbował',\n",
" ',',\n",
" 'obciosywał',\n",
" ',',\n",
" 'ustawiał',\n",
" ',',\n",
" 'aż',\n",
" 'nareszcie',\n",
" 'wybudował',\n",
" 'mały',\n",
" 'wiatraczek',\n",
" ',',\n",
" 'który',\n",
" 'na',\n",
" 'wietrze',\n",
" 'obracał',\n",
" 'mu',\n",
" 'się',\n",
" 'jak',\n",
" 'tamten',\n",
" 'za',\n",
" 'Wisłą',\n",
" '.',\n",
" 'Cóż',\n",
" 'to',\n",
" 'była',\n",
" 'za',\n",
" 'radość',\n",
" '!',\n",
" 'Teraz',\n",
" 'brakowało',\n",
" 'Antkowi',\n",
" 'tylko',\n",
" 'żony',\n",
" ',',\n",
" 'żeby',\n",
" 'mógł',\n",
" 'ją',\n",
" 'bić',\n",
" ',',\n",
" 'i',\n",
" 'już',\n",
" 'byłby',\n",
" 'z',\n",
" 'niego',\n",
" 'prawdziwy',\n",
" 'młynarz',\n",
" '!',\n",
" 'Do',\n",
" 'dziesiątego',\n",
" 'roku',\n",
" 'życia',\n",
" 'zepsuł',\n",
" 'ze',\n",
" 'cztery',\n",
" 'koziki',\n",
" ',',\n",
" 'ale',\n",
" 'też',\n",
" 'strugał',\n",
" 'nimi',\n",
" 'dziwne',\n",
" 'rzeczy',\n",
" '.',\n",
" 'Robił',\n",
" 'wiatraki',\n",
" ',',\n",
" 'płoty',\n",
" ',',\n",
" 'drabiny',\n",
" ',',\n",
" 'studnie',\n",
" ',',\n",
" 'a',\n",
" 'nawet',\n",
" 'całe',\n",
" 'chałupy',\n",
" '.',\n",
" 'Aż',\n",
" 'się',\n",
" 'ludzie',\n",
" 'zastanawiali',\n",
" 'i',\n",
" 'mówili',\n",
" 'do',\n",
" 'matki',\n",
" ',',\n",
" 'że',\n",
" 'z',\n",
" 'Antka',\n",
" 'albo',\n",
" 'będzie',\n",
" ...]"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tokenized_text"
]
},
{
"cell_type": "code",
"execution_count": 49,
"id": "9e35a161-fbd7-484d-85ed-8282995f9cbe",
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"from collections import Counter\n",
"import string"
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "d3b24a92-7f08-4bb6-9746-8c56eecfcf13",
"metadata": {},
"outputs": [],
"source": [
"word_counts = Counter(tokenized_text)\n",
"\n",
"words = list(word_counts.keys())\n",
"frequencies = list(word_counts.values())"
]
},
{
"cell_type": "code",
"execution_count": 48,
"id": "547ff718-24f0-461a-b1ce-349f09ff4d62",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Counter({',': 798,\n",
" '.': 455,\n",
" 'i': 277,\n",
" '—': 244,\n",
" 'się': 208,\n",
" 'na': 166,\n",
" 'w': 143,\n",
" 'z': 124,\n",
" 'nie': 117,\n",
" 'do': 107,\n",
" 'to': 89,\n",
" '!': 89,\n",
" 'a': 82,\n",
" 'że': 74,\n",
" 'Antek': 61,\n",
" 'za': 61,\n",
" 'co': 61,\n",
" 'jak': 58,\n",
" '?': 53,\n",
" 'mu': 52,\n",
" 'go': 50,\n",
" 'A': 50,\n",
" ':': 45,\n",
" '…': 35,\n",
" 'ale': 34,\n",
" 'tak': 34,\n",
" 'już': 32,\n",
" 'Ale': 26,\n",
" 'po': 26,\n",
" 'matka': 26,\n",
" 'o': 24,\n",
" 'jeszcze': 23,\n",
" 'sobie': 22,\n",
" 'tam': 21,\n",
" 'więc': 21,\n",
" 'od': 19,\n",
" 'choć': 19,\n",
" 'raz': 19,\n",
" 'nim': 19,\n",
" 'kiedy': 19,\n",
" 'było': 18,\n",
" 'tym': 18,\n",
" 'tylko': 18,\n",
" 'nawet': 18,\n",
" 'jego': 18,\n",
" 'wsi': 17,\n",
" 'był': 17,\n",
" 'W': 17,\n",
" 'bo': 17,\n",
" 'niego': 17,\n",
" 'gdzie': 16,\n",
" 'przez': 16,\n",
" 'on': 16,\n",
" 'ze': 16,\n",
" 'chłopca': 16,\n",
" 'aż': 16,\n",
" 'ją': 15,\n",
" 'miał': 15,\n",
" 'Antka': 15,\n",
" 'nad': 14,\n",
" '*': 14,\n",
" 'domu': 14,\n",
" 'ja': 14,\n",
" 'trochę': 14,\n",
" 'pod': 14,\n",
" 'gdy': 13,\n",
" 'przy': 13,\n",
" 'który': 13,\n",
" 'Andrzej': 13,\n",
" 'ty': 13,\n",
" 'przecie': 12,\n",
" 'u': 12,\n",
" 'jej': 12,\n",
" 'chłopiec': 12,\n",
" 'oczy': 12,\n",
" 'nic': 12,\n",
" 'Na': 12,\n",
" 'we': 11,\n",
" 'między': 11,\n",
" 'niej': 11,\n",
" 'cię': 11,\n",
" 'jest': 11,\n",
" ';': 11,\n",
" 'wciąż': 11,\n",
" 'szkoły': 11,\n",
" 'potem': 11,\n",
" 'ci': 10,\n",
" 'znowu': 10,\n",
" 'Potem': 10,\n",
" 'ten': 10,\n",
" 'była': 10,\n",
" 'żeby': 10,\n",
" 'ma': 10,\n",
" 'ino': 10,\n",
" 'wdowa': 10,\n",
" 'No': 10,\n",
" 'Jak': 10,\n",
" 'wójtowej': 10,\n",
" 'dzieci': 9,\n",
" 'drugi': 9,\n",
" 'jakby': 9,\n",
" 'taki': 9,\n",
" 'będzie': 9,\n",
" 'chłopak': 9,\n",
" 'I': 9,\n",
" 'nauczyciel': 9,\n",
" 'dobrze': 9,\n",
" 'poszedł': 9,\n",
" 'kuźni': 9,\n",
" 'albo': 8,\n",
" 'Za': 8,\n",
" 'dom': 8,\n",
" 'chaty': 8,\n",
" 'świat': 8,\n",
" 'swoją': 8,\n",
" 'lat': 8,\n",
" 'dzień': 8,\n",
" 'tej': 8,\n",
" 'też': 8,\n",
" 'chacie': 8,\n",
" 'Co': 8,\n",
" 'więcej': 8,\n",
" 'chyba': 8,\n",
" 'rzekł': 8,\n",
" 'ich': 8,\n",
" 'odparł': 8,\n",
" 'Nie': 8,\n",
" 'kowala': 8,\n",
" 'sołtys': 8,\n",
" 'wieś': 7,\n",
" 'ludzie': 7,\n",
" 'tego': 7,\n",
" 'nimi': 7,\n",
" 'Matka': 7,\n",
" 'mógł': 7,\n",
" 'matki': 7,\n",
" 'matulu': 7,\n",
" 'jeden': 7,\n",
" 'parę': 7,\n",
" 'majster': 7,\n",
" 'swoje': 7,\n",
" 'ani': 7,\n",
" 'czy': 7,\n",
" 'groszy': 7,\n",
" 'izby': 7,\n",
" 'wszystko': 7,\n",
" '„': 7,\n",
" '”': 7,\n",
" 'dla': 7,\n",
" 'mówił': 7,\n",
" 'Bo': 7,\n",
" 'rękę': 7,\n",
" 'bez': 7,\n",
" 'swój': 7,\n",
" 'krzyżyk': 7,\n",
" 'Wisłą': 6,\n",
" 'ażeby': 6,\n",
" 'bardzo': 6,\n",
" 'dwa': 6,\n",
" 'Przez': 6,\n",
" 'trzeci': 6,\n",
" 'roku': 6,\n",
" 'ona': 6,\n",
" 'czas': 6,\n",
" 'chłopcu': 6,\n",
" 'Po': 6,\n",
" 'drogą': 6,\n",
" 'wiatraki': 6,\n",
" 'kto': 6,\n",
" 'żal': 6,\n",
" 'naukę': 6,\n",
" 'stary': 6,\n",
" 'wziął': 6,\n",
" 'niby': 6,\n",
" 'dopiero': 6,\n",
" 'be': 6,\n",
" 'tu': 6,\n",
" 'szedł': 6,\n",
" 'kowal': 6,\n",
" 'wójtowa': 6,\n",
" 'Od': 5,\n",
" 'Tam': 5,\n",
" 'ku': 5,\n",
" 'miała': 5,\n",
" 'których': 5,\n",
" 'te': 5,\n",
" 'były': 5,\n",
" 'takiej': 5,\n",
" 'miejsca': 5,\n",
" 'sam': 5,\n",
" 'coś': 5,\n",
" 'prawie': 5,\n",
" 'dziecko': 5,\n",
" 'wiatrak': 5,\n",
" 'Więc': 5,\n",
" 'stało': 5,\n",
" 'krzyż': 5,\n",
" 'wrócił': 5,\n",
" 'razy': 5,\n",
" 'razem': 5,\n",
" 'ręce': 5,\n",
" 'przed': 5,\n",
" 'Jużci': 5,\n",
" 'by': 5,\n",
" 'wy': 5,\n",
" 'być': 5,\n",
" 'czasem': 5,\n",
" 'patrzył': 5,\n",
" 'mnie': 5,\n",
" 'tyle': 5,\n",
" 'może': 5,\n",
" 'To': 5,\n",
" 'myślał': 5,\n",
" 'czterdzieści': 5,\n",
" 'mi': 5,\n",
" 'nauki': 5,\n",
" 'prawda': 5,\n",
" 'spytał': 5,\n",
" 'gminy': 5,\n",
" 'góry': 5,\n",
" 'kościele': 5,\n",
" 'Nauczyciel': 5,\n",
" 'tablicy': 5,\n",
" 'jakiś': 5,\n",
" 'rozgrzewkę': 5,\n",
" 'nagle': 5,\n",
" 'kilka': 5,\n",
" 'jednak': 5,\n",
" 'figury': 5,\n",
" 'począł': 5,\n",
" 'której': 5,\n",
" 'wszyscy': 5,\n",
" 'głowę': 5,\n",
" 'Bóg': 5,\n",
" 'ramieniu': 5,\n",
" 'urodził': 4,\n",
" 'chodziły': 4,\n",
" 'lepiej': 4,\n",
" 'niż': 4,\n",
" 'siostra': 4,\n",
" 'jako': 4,\n",
" 'cały': 4,\n",
" 'świecie': 4,\n",
" 'piecu': 4,\n",
" 'drugą': 4,\n",
" 'szło': 4,\n",
" 'samo': 4,\n",
" 'Tymczasem': 4,\n",
" 'zapytał': 4,\n",
" 'Matulu': 4,\n",
" 'chodzi': 4,\n",
" 'spojrzała': 4,\n",
" 'ręką': 4,\n",
" 'widzisz': 4,\n",
" 'roboty': 4,\n",
" 'dnia': 4,\n",
" 'ludzi': 4,\n",
" 'zboże': 4,\n",
" 'sposobem': 4,\n",
" 'patyki': 4,\n",
" 'je': 4,\n",
" 'Teraz': 4,\n",
" 'rzeczy': 4,\n",
" 'matką': 4,\n",
" 'Już': 4,\n",
" 'Andrzeju': 4,\n",
" 'widział': 4,\n",
" 'Ja': 4,\n",
" 'odpowiadał': 4,\n",
" 'zawsze': 4,\n",
" 'wówczas': 4,\n",
" 'Wtedy': 4,\n",
" 'rzekła': 4,\n",
" 'trzy': 4,\n",
" 'moja': 4,\n",
" 'O': 4,\n",
" 'zawołał': 4,\n",
" 'tydzień': 4,\n",
" 'szkole': 4,\n",
" 'pytał': 4,\n",
" 'nóg': 4,\n",
" 'profesorowi': 4,\n",
" 'ręki': 4,\n",
" 'ładny': 4,\n",
" 'nauczył': 4,\n",
" 'Przecie': 4,\n",
" 'matce': 4,\n",
" 'odezwał': 4,\n",
" 'uczy': 4,\n",
" 'profesor': 4,\n",
" 'nasz': 4,\n",
" 'ta': 4,\n",
" 'inni': 4,\n",
" 'Kiedy': 4,\n",
" 'wyraz': 4,\n",
" 'siebie': 4,\n",
" 'szczęście': 4,\n",
" 'miesiąc': 4,\n",
" 'poczęli': 4,\n",
" 'nas': 4,\n",
" 'duszy': 4,\n",
" 'rzeźbił': 4,\n",
" 'poszli': 4,\n",
" 'Kowal': 4,\n",
" 'dziś': 4,\n",
" 'pracy': 4,\n",
" 'karczmy': 4,\n",
" 'chłop': 4,\n",
" 'zwykle': 4,\n",
" 'wójtową': 4,\n",
" 'temu': 4,\n",
" 'Antku': 4,\n",
" 'dolinie': 3,\n",
" 'wzgórza': 3,\n",
" 'ptaki': 3,\n",
" 'środku': 3,\n",
" 'widać': 3,\n",
" 'jaki': 3,\n",
" 'budynek': 3,\n",
" 'Widać': 3,\n",
" 'mówili': 3,\n",
" 'starą': 3,\n",
" 'rok': 3,\n",
" 'siostrę': 3,\n",
" 'życia': 3,\n",
" 'kamizelkę': 3,\n",
" 'Gdy': 3,\n",
" 'patrzeć': 3,\n",
" 'Wisły': 3,\n",
" 'ziemi': 3,\n",
" 'górę': 3,\n",
" 'prawo': 3,\n",
" 'takie': 3,\n",
" 'kartofle': 3,\n",
" 'sercu': 3,\n",
" 'dobre': 3,\n",
" 'odparła': 3,\n",
" 'Cóż': 3,\n",
" 'wiatrakach': 3,\n",
" 'taka': 3,\n",
" 'jednego': 3,\n",
" 'aby': 3,\n",
" 'chodzić': 3,\n",
" 'początku': 3,\n",
" 'wszystkim': 3,\n",
" 'wiatraku': 3,\n",
" 'nareszcie': 3,\n",
" 'bije': 3,\n",
" 'szczury': 3,\n",
" 'noc': 3,\n",
" 'całe': 3,\n",
" 'życie': 3,\n",
" 'strugał': 3,\n",
" 'mały': 3,\n",
" 'Antkowi': 3,\n",
" 'byłby': 3,\n",
" 'Aż': 3,\n",
" 'wielki': 3,\n",
" 'brat': 3,\n",
" 'Wojtek': 3,\n",
" 'drzewo': 3,\n",
" 'wodę': 3,\n",
" 'nigdy': 3,\n",
" 'Chłopak': 3,\n",
" 'Dopiero': 3,\n",
" 'aniżeli': 3,\n",
" 'gospodarz': 3,\n",
" 'świata': 3,\n",
" 'majstra': 3,\n",
" 'rzemiosła': 3,\n",
" 'jeżeli': 3,\n",
" 'Oj': 3,\n",
" 'strasznie': 3,\n",
" 'trudno': 3,\n",
" 'dziewczyna': 3,\n",
" 'gorzej': 3,\n",
" 'sześć': 3,\n",
" 'wielką': 3,\n",
" 'chleba': 3,\n",
" 'Wdowa': 3,\n",
" 'patrzy': 3,\n",
" 'piec': 3,\n",
" 'pieca': 3,\n",
" 'zawołała': 3,\n",
" 'Cicho': 3,\n",
" 'matkę': 3,\n",
" 'Pan': 3,\n",
" 'ledwie': 3,\n",
" 'wyście': 3,\n",
" 'prędko': 3,\n",
" 'nikt': 3,\n",
" 'wiedział': 3,\n",
" 'drodze': 3,\n",
" 'źle': 3,\n",
" 'polne': 3,\n",
" 'koniki': 3,\n",
" 'gospodarstwie': 3,\n",
" 'Andrzeja': 3,\n",
" 'kancelarii': 3,\n",
" 'pan': 3,\n",
" 'umie': 3,\n",
" 'Jakże': 3,\n",
" 'czego': 3,\n",
" 'Te': 3,\n",
" 'gadać': 3,\n",
" 'abecadło': 3,\n",
" 'umiał': 3,\n",
" 'pisarz': 3,\n",
" 'dni': 3,\n",
" 'ławki': 3,\n",
" 'jedna': 3,\n",
" 'znak': 3,\n",
" 'literę': 3,\n",
" 'a…': 3,\n",
" 'oddziału': 3,\n",
" 'precel': 3,\n",
" 'nazywa': 3,\n",
" 'Chłopiec': 3,\n",
" 'ramiona': 3,\n",
" 'uczuł': 3,\n",
" 'usłyszał': 3,\n",
" 'druga': 3,\n",
" 'hardy': 3,\n",
" 'taką': 3,\n",
" 'bądź': 3,\n",
" 'później': 3,\n",
" 'gospodyni': 3,\n",
" 'im': 3,\n",
" 'także': 3,\n",
" 'jakoś': 3,\n",
" 'Wreszcie': 3,\n",
" 'tę': 3,\n",
" 'łgarstwo': 3,\n",
" 'sama': 3,\n",
" 'Tak': 3,\n",
" 'rubla': 3,\n",
" 'przyszedł': 3,\n",
" 'Nawet': 3,\n",
" 'wyszedł': 3,\n",
" 'człowiek': 3,\n",
" 'mówi': 3,\n",
" 'chłopaku': 3,\n",
" 'pieniędzy': 3,\n",
" 'wiadomo': 3,\n",
" 'swoich': 3,\n",
" 'świętych': 3,\n",
" 'kum': 3,\n",
" 'najwięcej': 3,\n",
" 'Idzie': 3,\n",
" 'polu': 3,\n",
" 'Bogu': 3,\n",
" 'sołtysem': 3,\n",
" 'butelkę': 3,\n",
" 'jutro': 3,\n",
" 'ktoś': 3,\n",
" 'terminatorzy': 3,\n",
" 'trzymał': 3,\n",
" 'chwilę': 3,\n",
" 'kobiety': 3,\n",
" 'mówiła': 3,\n",
" 'Ledwie': 3,\n",
" 'życiu': 3,\n",
" 'ludu': 3,\n",
" 'święta': 3,\n",
" 'sukmanę': 3,\n",
" 'drogi': 3,\n",
" 'kobiałkę': 3,\n",
" 'torbę': 3,\n",
" 'Zdawało': 3,\n",
" 'lasem': 2,\n",
" 'południa': 2,\n",
" 'śpiewały': 2,\n",
" 'wiejskie': 2,\n",
" 'gniazda': 2,\n",
" 'czerwone': 2,\n",
" 'wsią': 2,\n",
" 'zachodu': 2,\n",
" 'tamę': 2,\n",
" 'Wisłę': 2,\n",
" 'ognia': 2,\n",
" 'dawniej': 2,\n",
" 'dziwili': 2,\n",
" 'sobą': 2,\n",
" 'została': 2,\n",
" 'lata': 2,\n",
" 'Rozalia': 2,\n",
" 'musiał': 2,\n",
" 'przenieść': 2,\n",
" 'kołysał': 2,\n",
" 'wpadł': 2,\n",
" 'dostał': 2,\n",
" 'mało': 2,\n",
" 'psy': 2,\n",
" 'leżał': 2,\n",
" 'ojciec': 2,\n",
" 'sukienną': 2,\n",
" 'świń': 2,\n",
" 'oglądał': 2,\n",
" 'stronę': 2,\n",
" 'spod': 2,\n",
" 'pierwszym': 2,\n",
" 'zaraz': 2,\n",
" 'drugie': 2,\n",
" 'czarne': 2,\n",
" 'swoim': 2,\n",
" 'swojej': 2,\n",
" 'nocy': 2,\n",
" 'ciekawość': 2,\n",
" 'zobaczył': 2,\n",
" 'cztery': 2,\n",
" 'skrzydła': 2,\n",
" 'rozumiał': 2,\n",
" 'wnet': 2,\n",
" 'rzecz': 2,\n",
" 'Naprzód': 2,\n",
" 'młynarz': 2,\n",
" 'żonę': 2,\n",
" 'jakim': 2,\n",
" 'śpichrzów': 2,\n",
" 'wyprowadza': 2,\n",
" 'samą': 2,\n",
" 'pierwej': 2,\n",
" 'łeb': 2,\n",
" 'pracę': 2,\n",
" 'dała': 2,\n",
" 'położył': 2,\n",
" 'głodzie': 2,\n",
" 'wypadek': 2,\n",
" 'pory': 2,\n",
" 'słońca': 2,\n",
" 'żony': 2,\n",
" 'bić': 2,\n",
" 'zepsuł': 2,\n",
" 'chałupy': 2,\n",
" 'ojca': 2,\n",
" 'przytłukło': 2,\n",
" 'wielka': 2,\n",
" 'Dziewczyna': 2,\n",
" 'krupnik': 2,\n",
" 'bydła': 2,\n",
" 'Antkiem': 2,\n",
" 'nabili': 2,\n",
" 'płakał': 2,\n",
" 'robił': 2,\n",
" 'bydło': 2,\n",
" 'widząc': 2,\n",
" 'dziewucha': 2,\n",
" 'rozumu': 2,\n",
" 'mój': 2,\n",
" 'ludziom': 2,\n",
" 'darmo': 2,\n",
" 'naprzód': 2,\n",
" 'książki': 2,\n",
" 'kumie': 2,\n",
" 'wstyd': 2,\n",
" 'dziecku': 2,\n",
" 'robotę': 2,\n",
" 'robić': 2,\n",
" 'ławie': 2,\n",
" 'chcesz': 2,\n",
" 'będę': 2,\n",
" 'Miał': 2,\n",
" 'razu': 2,\n",
" 'myślała': 2,\n",
" 'pomogło': 2,\n",
" 'dziewuchę': 2,\n",
" 'szmaty': 2,\n",
" 'znachorkę': 2,\n",
" 'baba': 2,\n",
" 'koło': 2,\n",
" 'podłogę': 2,\n",
" 'węgle': 2,\n",
" 'teraz': 2,\n",
" 'sosnowej': 2,\n",
" 'desce': 2,\n",
" 'zdrowaśki': 2,\n",
" 'Rozalię': 2,\n",
" '(': 2,\n",
" 'rogu': 2,\n",
" ')': 2,\n",
" 'gorąco': 2,\n",
" 'mną': 2,\n",
" 'wyjdzie': 2,\n",
" 'zdrowie': 2,\n",
" 'baby': 2,\n",
" 'poczęła': 2,\n",
" 'rzucać': 2,\n",
" 'rękami': 2,\n",
" 'dyć': 2,\n",
" 'całkiem': 2,\n",
" 'odmawiać': 2,\n",
" 'płaczem': 2,\n",
" 'deskę': 2,\n",
" 'uklękła': 2,\n",
" 'głową': 2,\n",
" 'klepisko': 2,\n",
" 'choroba': 2,\n",
" 'Rozalii': 2,\n",
" 'Alboż': 2,\n",
" 'jedno': 2,\n",
" 'trumnę': 2,\n",
" 'krzyże': 2,\n",
" 'bok': 2,\n",
" 'idąc': 2,\n",
" 'Musi': 2,\n",
" 'ksiądz': 2,\n",
" 'czterech': 2,\n",
" 'jednej': 2,\n",
" 'dziewuchy': 2,\n",
" 'którym': 2,\n",
" 'wzdychał': 2,\n",
" 'spadł': 2,\n",
" 'pomocy': 2,\n",
" 'kuma': 2,\n",
" 'oddać': 2,\n",
" 'budować': 2,\n",
" 'Oho': 2,\n",
" 'tedy': 2,\n",
" 'poszła': 2,\n",
" 'kożuch': 2,\n",
" 'pieniądze': 2,\n",
" 'któremu': 2,\n",
" 'popatrzył': 2,\n",
" 'Ładny': 2,\n",
" 'musi': 2,\n",
" 'bym': 2,\n",
" 'wiedzieć': 2,\n",
" 'tych': 2,\n",
" 'uczył': 2,\n",
" 'drwa': 2,\n",
" 'paru': 2,\n",
" 'On': 2,\n",
" 'dołu': 2,\n",
" 'znaczy': 2,\n",
" 'pierwszy': 2,\n",
" 'wójt': 2,\n",
" 'Żebym': 2,\n",
" 'karczmie': 2,\n",
" 'stoi': 2,\n",
" 'Tylko': 2,\n",
" 'drzwi': 2,\n",
" 'miały': 2,\n",
" 'twarze': 2,\n",
" 'chodził': 2,\n",
" 'mróz': 2,\n",
" 'liter': 2,\n",
" 'zaczęła': 2,\n",
" 'Patrzcie': 2,\n",
" 'Tę': 2,\n",
" 'łatwo': 2,\n",
" 'wygląda': 2,\n",
" 'zawołali': 2,\n",
" 'chórem': 2,\n",
" 'pierwszego': 2,\n",
" 'głos': 2,\n",
" 'wyrysował': 2,\n",
" 'która': 2,\n",
" 'krzyknął': 2,\n",
" 'środek': 2,\n",
" 'położyli': 2,\n",
" 'Jeszcze': 2,\n",
" 'becz': 2,\n",
" 'hultaju': 2,\n",
" 'miejsce': 2,\n",
" 'trzecią': 2,\n",
" 'Pierwszy': 2,\n",
" 'milczał': 2,\n",
" 'Powtórz': 2,\n",
" 'ośle': 2,\n",
" 'wolno': 2,\n",
" 'nauka': 2,\n",
" 'kuchni': 2,\n",
" 'profesora': 2,\n",
" 'pokarm': 2,\n",
" 'zapytała': 2,\n",
" 'początek': 2,\n",
" 'będziesz': 2,\n",
" 'Ha': 2,\n",
" 'Jednego': 2,\n",
" 'serce': 2,\n",
" 'pożytek': 2,\n",
" 'jaka': 2,\n",
" 'niewiele': 2,\n",
" 'oczami': 2,\n",
" 'okna': 2,\n",
" 'ławy': 2,\n",
" 'zobaczyć': 2,\n",
" 'swego': 2,\n",
" 'Widzisz': 2,\n",
" 'widzę': 2,\n",
" 'znaleźć': 2,\n",
" 'kominie': 2,\n",
" 'baraszkuje': 2,\n",
" 'Jeżeli': 2,\n",
" 'nieszczęście': 2,\n",
" 'miesiące': 2,\n",
" 'idzie': 2,\n",
" 'grosz': 2,\n",
" 'czapkę': 2,\n",
" 'izbie': 2,\n",
" 'patrząc': 2,\n",
" 'zaś': 2,\n",
" 'mógłby': 2,\n",
" 'oboje': 2,\n",
" 'wtrąciła': 2,\n",
" 'jeść': 2,\n",
" 'odzienie': 2,\n",
" 'Kto': 2,\n",
" 'se': 2,\n",
" 'szkolnemu': 2,\n",
" 'pieśni': 2,\n",
" 'gospodarstwa': 2,\n",
" 'drugiej': 2,\n",
" 'stamtąd': 2,\n",
" 'Strugał': 2,\n",
" 'nowy': 2,\n",
" 'koni': 2,\n",
" 'siwej': 2,\n",
" 'Mordce': 2,\n",
" 'wypoczywał': 2,\n",
" 'roli': 2,\n",
" 'resztę': 2,\n",
" 'musiała': 2,\n",
" 'chłopców': 2,\n",
" 'komina': 2,\n",
" 'twarzy': 2,\n",
" 'Nareszcie': 2,\n",
" 'przyjaciel': 2,\n",
" 'Jednej': 2,\n",
" 'niedzieli': 2,\n",
" 'przyjął': 2,\n",
" 'niezgorzej': 2,\n",
" 'wcale': 2,\n",
" 'terminu': 2,\n",
" 'którzy': 2,\n",
" 'zjedli': 2,\n",
" 'dali': 2,\n",
" 'żelazo': 2,\n",
" 'innym': 2,\n",
" 'żelaza': 2,\n",
" 'robota': 2,\n",
" 'walił': 2,\n",
" 'końcu': 2,\n",
" 'urzędzie': 2,\n",
" 'sosnową': 2,\n",
" 'Pochwalony': 2,\n",
" 'odpowiada': 2,\n",
" 'Niczego': 2,\n",
" 'was': 2,\n",
" 'zęby': 2,\n",
" 'Może': 2,\n",
" 'mogli': 2,\n",
" 'chłopcy': 2,\n",
" 'płukania': 2,\n",
" 'zarobku': 2,\n",
" 'półtora': 2,\n",
" 'znać': 2,\n",
" 'powoli': 2,\n",
" 'nią': 2,\n",
" 'płukanie': 2,\n",
" 'potrafi': 2,\n",
" 'najstarszy': 2,\n",
" 'garści': 2,\n",
" 'miewał': 2,\n",
" 'czasie': 2,\n",
" 'podkowę': 2,\n",
" 'gęby': 2,\n",
" 'obejrzał': 2,\n",
" 'narzędzi': 2,\n",
" 'uciekł': 2,\n",
" 'oddała': 2,\n",
" 'progu': 2,\n",
" 'Antkowa': 2,\n",
" 'bystry': 2,\n",
" 'poznał': 2,\n",
" 'pilnik': 2,\n",
" 'święci': 2,\n",
" 'nich': 2,\n",
" 'coraz': 2,\n",
" 'bestyja': 2,\n",
" 'Był': 2,\n",
" 'zręczny': 2,\n",
" 'włosy': 2,\n",
" 'znam': 2,\n",
" 'kobieta': 2,\n",
" 'małżeństwa': 2,\n",
" 'wydał': 2,\n",
" 'Wójtowa': 2,\n",
" 'weselu': 2,\n",
" 'wójta': 2,\n",
" 'goście': 2,\n",
" 'strzelcy': 2,\n",
" 'miesięczną': 2,\n",
" 'pensję': 2,\n",
" 'czwarty': 2,\n",
" 'zaczęło': 2,\n",
" 'pomiędzy': 2,\n",
" 'sumę': 2,\n",
" 'ciasno': 2,\n",
" 'modlił': 2,\n",
" 'świętego': 2,\n",
" 'wielkim': 2,\n",
" 'ołtarzu': 2,\n",
" 'wyżej': 2,\n",
" 'wszystkich': 2,\n",
" 'Wtem': 2,\n",
" 'ciżbą': 2,\n",
" 'ramion': 2,\n",
" 'paciorków': 2,\n",
" 'znikła': 2,\n",
" 'ławce': 2,\n",
" 'kościół': 2,\n",
" 'podniesienie': 2,\n",
" 'zrobiło': 2,\n",
" 'twarz': 2,\n",
" 'sumie': 2,\n",
" 'gorzelnik': 2,\n",
" 'trzeciej': 2,\n",
" 'pole': 2,\n",
" 'chleb': 2,\n",
" 'wiele': 2,\n",
" 'ostatni': 2,\n",
" 'mignęło': 2,\n",
" 'oczach': 2,\n",
" 'wśród': 2,\n",
" 'niebo': 2,\n",
" 'zamiast': 2,\n",
" 'znalazł': 2,\n",
" 'kościoła': 2,\n",
" 'myśląc': 2,\n",
" 'wtedy': 2,\n",
" 'spojrzy': 2,\n",
" 'Wówczas': 2,\n",
" 'piękny': 2,\n",
" 'ofiarować': 2,\n",
" 'lewym': 2,\n",
" 'Wisła': 2,\n",
" 'opuścić': 2,\n",
" 'narzędziami': 2,\n",
" 'dar': 2,\n",
" 'drogę': 2,\n",
" 'Czy': 2,\n",
" 'ciebie': 2,\n",
" 'swoimi': 2,\n",
" 'rękoma': 2,\n",
" 'takiego': 2,\n",
" 'powiedział': 2,\n",
" 'szmer': 2,\n",
" 'jedną': 2,\n",
" 'chwili': 2,\n",
" 'tajemną': 2,\n",
" 'miłość': 2,\n",
" 'ucałował': 2,\n",
" 'daj': 2,\n",
" 'Kumie': 2,\n",
" 'synusiu': 2,\n",
" 'Boże': 2,\n",
" 'święty': 2,\n",
" 'miłościw': 2,\n",
" 'Bądź': 2,\n",
" 'obcych': 2,\n",
" 'uszedł': 2,\n",
" 'Niech': 2,\n",
" 'błogosławi': 2,\n",
" 'Zostańcie': 2,\n",
" 'Bogiem': 2,\n",
" 'szarym': 2,\n",
" 'Bolesław': 1,\n",
" 'Prus': 1,\n",
" 'Wieś': 1,\n",
" 'leżała': 1,\n",
" 'niewielkiej': 1,\n",
" 'północy': 1,\n",
" 'otaczały': 1,\n",
" 'spadziste': 1,\n",
" 'porosłe': 1,\n",
" 'sosnowym': 1,\n",
" 'garbate': 1,\n",
" 'zasypane': 1,\n",
" 'leszczyną': 1,\n",
" 'tarniną': 1,\n",
" 'głogiem': 1,\n",
" 'najgłośniej': 1,\n",
" 'najczęściej': 1,\n",
" 'rwać': 1,\n",
" 'orzechy': 1,\n",
" 'wybierać': 1,\n",
" 'Kiedyś': 1,\n",
" 'stanął': 1,\n",
" 'zdawało': 1,\n",
" 'oba': 1,\n",
" 'pasma': 1,\n",
" 'gór': 1,\n",
" 'biegną': 1,\n",
" 'zetknąć': 1,\n",
" 'rana': 1,\n",
" 'wstaje': 1,\n",
" 'słońce': 1,\n",
" 'złudzenie': 1,\n",
" 'bowiem': 1,\n",
" 'ciągnęła': 1,\n",
" 'wzgórzami': 1,\n",
" 'dolina': 1,\n",
" 'przecięta': 1,\n",
" 'rzeczułką': 1,\n",
" 'przykryta': 1,\n",
" 'zieloną': 1,\n",
" 'łąką': 1,\n",
" 'pasano': 1,\n",
" 'bydlątka': 1,\n",
" 'cienkonogie': 1,\n",
" 'bociany': 1,\n",
" 'polować': 1,\n",
" 'żaby': 1,\n",
" 'kukające': 1,\n",
" 'wieczorami': 1,\n",
" 'tamą': 1,\n",
" 'wapienne': 1,\n",
" 'nagie': 1,\n",
" 'Każdy': 1,\n",
" 'chłopski': 1,\n",
" 'szarą': 1,\n",
" 'słomą': 1,\n",
" 'pokryty': 1,\n",
" 'ogródek': 1,\n",
" 'ogródku': 1,\n",
" 'śliwki': 1,\n",
" 'węgierki': 1,\n",
" 'spomiędzy': 1,\n",
" 'komin': 1,\n",
" 'sadzą': 1,\n",
" 'uczerniony': 1,\n",
" 'pożarną': 1,\n",
" 'drabinkę': 1,\n",
" 'Drabiny': 1,\n",
" 'zaprowadzono': 1,\n",
" 'dawna': 1,\n",
" 'myśleli': 1,\n",
" 'one': 1,\n",
" 'chronić': 1,\n",
" 'będą': 1,\n",
" 'bocianie': 1,\n",
" 'Toteż': 1,\n",
" 'płonął': 1,\n",
" 'ratowali': 1,\n",
" 'gospodarza': 1,\n",
" 'dopust': 1,\n",
" 'boski': 1,\n",
" 'Spalił': 1,\n",
" 'nową': 1,\n",
" 'drabinę': 1,\n",
" 'zapłacił': 1,\n",
" 'śtraf': 1,\n",
" 'połamane': 1,\n",
" 'szczeble': 1,\n",
" 'Położyli': 1,\n",
" 'niemalowanej': 1,\n",
" 'kołysce': 1,\n",
" 'zmarłym': 1,\n",
" 'bracie': 1,\n",
" 'sypiał': 1,\n",
" 'przyszła': 1,\n",
" 'ustąpić': 1,\n",
" 'osoba': 1,\n",
" 'dorosła': 1,\n",
" 'ławę': 1,\n",
" 'następny': 1,\n",
" 'rozglądał': 1,\n",
" 'Raz': 1,\n",
" 'rzekę': 1,\n",
" 'batem': 1,\n",
" 'przejezdnego': 1,\n",
" 'furmana': 1,\n",
" 'konie': 1,\n",
" 'stratowały': 1,\n",
" 'pogryzły': 1,\n",
" 'tygodnie': 1,\n",
" 'Doświadczył': 1,\n",
" 'niemało': 1,\n",
" 'czwartym': 1,\n",
" 'podarował': 1,\n",
" 'mosiężnym': 1,\n",
" 'guzikiem': 1,\n",
" 'kazała': 1,\n",
" 'nosić': 1,\n",
" 'pięć': 1,\n",
" 'użyto': 1,\n",
" 'pasania': 1,\n",
" 'Wolał': 1,\n",
" 'wapiennym': 1,\n",
" 'wzgórzem': 1,\n",
" 'pokazywało': 1,\n",
" 'wysokiego': 1,\n",
" 'czarnego': 1,\n",
" 'Wyłaziło': 1,\n",
" 'lewej': 1,\n",
" 'strony': 1,\n",
" 'upadało': 1,\n",
" 'trzecie': 1,\n",
" 'wysokie': 1,\n",
" ...})"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"word_counts"
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "eef24a64-6dd2-4bfb-945a-d7cf69037a4d",
"metadata": {},
"outputs": [],
"source": [
"tokenized_text = [t for t in tokenized_text if t not in string.punctuation]"
]
},
{
"cell_type": "code",
"execution_count": 51,
"id": "9c7ed634-a731-4a91-a1e4-ffc315b18068",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['Bolesław',\n",
" 'Prus',\n",
" 'Antek',\n",
" 'Antek',\n",
" 'urodził',\n",
" 'się',\n",
" 'we',\n",
" 'wsi',\n",
" 'nad',\n",
" 'Wisłą',\n",
" 'Wieś',\n",
" 'leżała',\n",
" 'w',\n",
" 'niewielkiej',\n",
" 'dolinie',\n",
" 'Od',\n",
" 'północy',\n",
" 'otaczały',\n",
" 'ją',\n",
" 'wzgórza',\n",
" 'spadziste',\n",
" 'porosłe',\n",
" 'sosnowym',\n",
" 'lasem',\n",
" 'a',\n",
" 'od',\n",
" 'południa',\n",
" 'wzgórza',\n",
" 'garbate',\n",
" 'zasypane',\n",
" 'leszczyną',\n",
" 'tarniną',\n",
" 'i',\n",
" 'głogiem',\n",
" 'Tam',\n",
" 'najgłośniej',\n",
" 'śpiewały',\n",
" 'ptaki',\n",
" 'i',\n",
" 'najczęściej',\n",
" 'chodziły',\n",
" 'wiejskie',\n",
" 'dzieci',\n",
" 'rwać',\n",
" 'orzechy',\n",
" 'albo',\n",
" 'wybierać',\n",
" 'gniazda',\n",
" 'Kiedyś',\n",
" 'stanął',\n",
" 'na',\n",
" 'środku',\n",
" 'wsi',\n",
" 'zdawało',\n",
" 'ci',\n",
" 'się',\n",
" 'że',\n",
" 'oba',\n",
" 'pasma',\n",
" 'gór',\n",
" 'biegną',\n",
" 'ku',\n",
" 'sobie',\n",
" 'ażeby',\n",
" 'zetknąć',\n",
" 'się',\n",
" 'tam',\n",
" 'gdzie',\n",
" 'z',\n",
" 'rana',\n",
" 'wstaje',\n",
" 'czerwone',\n",
" 'słońce',\n",
" 'Ale',\n",
" 'było',\n",
" 'to',\n",
" 'złudzenie',\n",
" 'Za',\n",
" 'wsią',\n",
" 'bowiem',\n",
" 'ciągnęła',\n",
" 'się',\n",
" 'między',\n",
" 'wzgórzami',\n",
" 'dolina',\n",
" 'przecięta',\n",
" 'rzeczułką',\n",
" 'i',\n",
" 'przykryta',\n",
" 'zieloną',\n",
" 'łąką',\n",
" 'Tam',\n",
" 'pasano',\n",
" 'bydlątka',\n",
" 'i',\n",
" 'tam',\n",
" 'cienkonogie',\n",
" 'bociany',\n",
" 'chodziły',\n",
" 'polować',\n",
" 'na',\n",
" 'żaby',\n",
" 'kukające',\n",
" 'wieczorami',\n",
" 'Od',\n",
" 'zachodu',\n",
" 'wieś',\n",
" 'miała',\n",
" 'tamę',\n",
" 'za',\n",
" 'tamą',\n",
" 'Wisłę',\n",
" 'a',\n",
" 'za',\n",
" 'Wisłą',\n",
" 'znowu',\n",
" 'wzgórza',\n",
" 'wapienne',\n",
" 'nagie',\n",
" 'Każdy',\n",
" 'chłopski',\n",
" 'dom',\n",
" 'szarą',\n",
" 'słomą',\n",
" 'pokryty',\n",
" 'miał',\n",
" 'ogródek',\n",
" 'a',\n",
" 'w',\n",
" 'ogródku',\n",
" 'śliwki',\n",
" 'węgierki',\n",
" 'spomiędzy',\n",
" 'których',\n",
" 'widać',\n",
" 'było',\n",
" 'komin',\n",
" 'sadzą',\n",
" 'uczerniony',\n",
" 'i',\n",
" 'pożarną',\n",
" 'drabinkę',\n",
" 'Drabiny',\n",
" 'te',\n",
" 'zaprowadzono',\n",
" 'nie',\n",
" 'od',\n",
" 'dawna',\n",
" 'a',\n",
" 'ludzie',\n",
" 'myśleli',\n",
" 'że',\n",
" 'one',\n",
" 'lepiej',\n",
" 'chronić',\n",
" 'będą',\n",
" 'chaty',\n",
" 'od',\n",
" 'ognia',\n",
" 'niż',\n",
" 'dawniej',\n",
" 'bocianie',\n",
" 'gniazda',\n",
" 'Toteż',\n",
" 'gdy',\n",
" 'płonął',\n",
" 'jaki',\n",
" 'budynek',\n",
" 'dziwili',\n",
" 'się',\n",
" 'bardzo',\n",
" 'ale',\n",
" 'go',\n",
" 'nie',\n",
" 'ratowali',\n",
" '—',\n",
" 'Widać',\n",
" 'że',\n",
" 'na',\n",
" 'tego',\n",
" 'gospodarza',\n",
" 'był',\n",
" 'dopust',\n",
" 'boski',\n",
" '—',\n",
" 'mówili',\n",
" 'między',\n",
" 'sobą',\n",
" '—',\n",
" 'Spalił',\n",
" 'się',\n",
" 'choć',\n",
" 'miał',\n",
" 'przecie',\n",
" 'nową',\n",
" 'drabinę',\n",
" 'i',\n",
" 'choć',\n",
" 'zapłacił',\n",
" 'śtraf',\n",
" 'za',\n",
" 'starą',\n",
" 'co',\n",
" 'to',\n",
" 'były',\n",
" 'u',\n",
" 'niej',\n",
" 'połamane',\n",
" 'szczeble',\n",
" 'W',\n",
" 'takiej',\n",
" 'wsi',\n",
" 'urodził',\n",
" 'się',\n",
" 'Antek',\n",
" 'Położyli',\n",
" 'go',\n",
" 'w',\n",
" 'niemalowanej',\n",
" 'kołysce',\n",
" 'co',\n",
" 'została',\n",
" 'po',\n",
" 'zmarłym',\n",
" 'bracie',\n",
" 'i',\n",
" 'sypiał',\n",
" 'w',\n",
" 'niej',\n",
" 'przez',\n",
" 'dwa',\n",
" 'lata',\n",
" 'Potem',\n",
" 'przyszła',\n",
" 'mu',\n",
" 'na',\n",
" 'świat',\n",
" 'siostra',\n",
" 'Rozalia',\n",
" 'więc',\n",
" 'musiał',\n",
" 'jej',\n",
" 'miejsca',\n",
" 'ustąpić',\n",
" 'a',\n",
" 'sam',\n",
" 'jako',\n",
" 'osoba',\n",
" 'dorosła',\n",
" 'przenieść',\n",
" 'się',\n",
" 'na',\n",
" 'ławę',\n",
" 'Przez',\n",
" 'ten',\n",
" 'rok',\n",
" 'kołysał',\n",
" 'siostrę',\n",
" 'a',\n",
" 'przez',\n",
" 'cały',\n",
" 'następny',\n",
" '—',\n",
" 'rozglądał',\n",
" 'się',\n",
" 'po',\n",
" 'świecie',\n",
" 'Raz',\n",
" 'wpadł',\n",
" 'w',\n",
" 'rzekę',\n",
" 'drugi',\n",
" 'raz',\n",
" 'dostał',\n",
" 'batem',\n",
" 'od',\n",
" 'przejezdnego',\n",
" 'furmana',\n",
" 'za',\n",
" 'to',\n",
" 'że',\n",
" 'go',\n",
" 'o',\n",
" 'mało',\n",
" 'konie',\n",
" 'nie',\n",
" 'stratowały',\n",
" 'a',\n",
" 'trzeci',\n",
" 'raz',\n",
" 'psy',\n",
" 'tak',\n",
" 'go',\n",
" 'pogryzły',\n",
" 'że',\n",
" 'dwa',\n",
" 'tygodnie',\n",
" 'leżał',\n",
" 'na',\n",
" 'piecu',\n",
" 'Doświadczył',\n",
" 'więc',\n",
" 'niemało',\n",
" 'Za',\n",
" 'to',\n",
" 'w',\n",
" 'czwartym',\n",
" 'roku',\n",
" 'życia',\n",
" 'ojciec',\n",
" 'podarował',\n",
" 'mu',\n",
" 'swoją',\n",
" 'sukienną',\n",
" 'kamizelkę',\n",
" 'z',\n",
" 'mosiężnym',\n",
" 'guzikiem',\n",
" 'a',\n",
" 'matka',\n",
" '—',\n",
" 'kazała',\n",
" 'mu',\n",
" 'siostrę',\n",
" 'nosić',\n",
" 'Gdy',\n",
" 'miał',\n",
" 'pięć',\n",
" 'lat',\n",
" 'użyto',\n",
" 'go',\n",
" 'już',\n",
" '—',\n",
" 'do',\n",
" 'pasania',\n",
" 'świń',\n",
" 'Ale',\n",
" 'Antek',\n",
" 'nie',\n",
" 'bardzo',\n",
" 'się',\n",
" 'za',\n",
" 'nimi',\n",
" 'oglądał',\n",
" 'Wolał',\n",
" 'patrzeć',\n",
" 'na',\n",
" 'drugą',\n",
" 'stronę',\n",
" 'Wisły',\n",
" 'gdzie',\n",
" 'za',\n",
" 'wapiennym',\n",
" 'wzgórzem',\n",
" 'raz',\n",
" 'na',\n",
" 'raz',\n",
" 'pokazywało',\n",
" 'się',\n",
" 'coś',\n",
" 'wysokiego',\n",
" 'i',\n",
" 'czarnego',\n",
" 'Wyłaziło',\n",
" 'to',\n",
" 'z',\n",
" 'lewej',\n",
" 'strony',\n",
" 'jakby',\n",
" 'spod',\n",
" 'ziemi',\n",
" 'szło',\n",
" 'w',\n",
" 'górę',\n",
" 'i',\n",
" 'upadało',\n",
" 'na',\n",
" 'prawo',\n",
" 'Za',\n",
" 'tym',\n",
" 'pierwszym',\n",
" 'szło',\n",
" 'zaraz',\n",
" 'drugie',\n",
" 'i',\n",
" 'trzecie',\n",
" 'takie',\n",
" 'samo',\n",
" 'czarne',\n",
" 'i',\n",
" 'wysokie',\n",
" 'Tymczasem',\n",
" 'świnie',\n",
" 'swoim',\n",
" 'obyczajem',\n",
" 'wlazły',\n",
" 'w',\n",
" 'kartofle',\n",
" 'Matka',\n",
" 'spostrzegłszy',\n",
" 'to',\n",
" 'zawinęła',\n",
" 'się',\n",
" 'wedle',\n",
" 'sukiennej',\n",
" 'kamizelki',\n",
" 'Antkowej',\n",
" 'tak',\n",
" 'że',\n",
" 'chłopiec',\n",
" 'prawie',\n",
" 'tchu',\n",
" 'nie',\n",
" 'mógł',\n",
" 'złapać',\n",
" 'Ale',\n",
" 'że',\n",
" 'nie',\n",
" 'miał',\n",
" 'w',\n",
" 'sercu',\n",
" 'zawziętości',\n",
" 'bo',\n",
" 'było',\n",
" 'z',\n",
" 'niego',\n",
" 'dziecko',\n",
" 'dobre',\n",
" 'więc',\n",
" 'wykrzyczawszy',\n",
" 'się',\n",
" 'i',\n",
" 'wydrapawszy',\n",
" '—',\n",
" 'kamizelkę',\n",
" 'zapytał',\n",
" 'matki',\n",
" '—',\n",
" 'Matulu',\n",
" 'A',\n",
" 'co',\n",
" 'to',\n",
" 'takie',\n",
" 'czarne',\n",
" 'chodzi',\n",
" 'za',\n",
" 'Wisłą',\n",
" 'Matka',\n",
" 'spojrzała',\n",
" 'w',\n",
" 'kierunku',\n",
" 'Antkowego',\n",
" 'palca',\n",
" 'przysłoniła',\n",
" 'oczy',\n",
" 'ręką',\n",
" 'i',\n",
" 'odparła',\n",
" '—',\n",
" 'Tam',\n",
" 'za',\n",
" 'Wisłą',\n",
" 'Cóż',\n",
" 'to',\n",
" 'nie',\n",
" 'widzisz',\n",
" 'że',\n",
" 'wiatrak',\n",
" 'chodzi',\n",
" 'A',\n",
" 'na',\n",
" 'drugi',\n",
" 'raz',\n",
" 'pilnuj',\n",
" 'świń',\n",
" 'bo',\n",
" 'cię',\n",
" 'pokrzywami',\n",
" 'wysmaruję',\n",
" '—',\n",
" 'Acha',\n",
" 'wiatrak',\n",
" 'A',\n",
" 'co',\n",
" 'on',\n",
" 'matulu',\n",
" 'za',\n",
" 'jeden',\n",
" '—',\n",
" 'At',\n",
" 'głupiś',\n",
" '—',\n",
" 'odparła',\n",
" 'matka',\n",
" 'i',\n",
" 'uciekła',\n",
" 'do',\n",
" 'swojej',\n",
" 'roboty',\n",
" 'Gdzie',\n",
" 'ona',\n",
" 'miała',\n",
" 'czas',\n",
" 'i',\n",
" 'rozum',\n",
" 'do',\n",
" 'udzielania',\n",
" 'objaśnień',\n",
" 'o',\n",
" 'wiatrakach',\n",
" 'Ale',\n",
" 'chłopcu',\n",
" 'wiatrak',\n",
" 'spokojności',\n",
" 'nie',\n",
" 'dawał',\n",
" 'Antek',\n",
" 'widywał',\n",
" 'go',\n",
" 'przecie',\n",
" 'co',\n",
" 'dzień',\n",
" 'Widywał',\n",
" 'go',\n",
" 'i',\n",
" 'w',\n",
" 'nocy',\n",
" 'przez',\n",
" 'sen.',\n",
" 'Więc',\n",
" 'taka',\n",
" 'straszna',\n",
" 'urosła',\n",
" 'w',\n",
" 'chłopcu',\n",
" 'ciekawość',\n",
" 'że',\n",
" 'jednego',\n",
" 'dnia',\n",
" 'zakradł',\n",
" 'się',\n",
" 'do',\n",
" 'promu',\n",
" 'co',\n",
" 'ludzi',\n",
" 'na',\n",
" 'drugą',\n",
" 'stronę',\n",
" 'rzeki',\n",
" 'przewoził',\n",
" 'i',\n",
" 'popłynął',\n",
" 'za',\n",
" 'Wisłę',\n",
" 'Popłynął',\n",
" 'wdrapał',\n",
" 'się',\n",
" 'na',\n",
" 'wapienną',\n",
" 'górę',\n",
" 'akurat',\n",
" 'w',\n",
" 'tym',\n",
" 'miejscu',\n",
" 'gdzie',\n",
" 'stało',\n",
" 'ogłoszenie',\n",
" 'aby',\n",
" 'tędy',\n",
" 'nie',\n",
" 'chodzić',\n",
" 'i',\n",
" 'zobaczył',\n",
" 'wiatrak',\n",
" 'Wydał',\n",
" 'mu',\n",
" 'się',\n",
" 'budynek',\n",
" 'ten',\n",
" 'jakby',\n",
" 'dzwonnica',\n",
" 'tylko',\n",
" 'w',\n",
" 'sobie',\n",
" 'był',\n",
" 'grubszy',\n",
" 'a',\n",
" 'tam',\n",
" 'gdzie',\n",
" 'na',\n",
" 'dzwonnicy',\n",
" 'jest',\n",
" 'okno',\n",
" 'miał',\n",
" 'cztery',\n",
" 'tęgie',\n",
" 'skrzydła',\n",
" 'ustawione',\n",
" 'na',\n",
" 'krzyż',\n",
" 'Z',\n",
" 'początku',\n",
" 'nie',\n",
" 'rozumiał',\n",
" 'nic',\n",
" '—',\n",
" 'co',\n",
" 'to',\n",
" 'i',\n",
" 'na',\n",
" 'co',\n",
" 'to',\n",
" 'Ale',\n",
" 'wnet',\n",
" 'objaśnili',\n",
" 'mu',\n",
" 'rzecz',\n",
" 'pastuchowie',\n",
" 'więc',\n",
" 'dowiedział',\n",
" 'się',\n",
" 'o',\n",
" 'wszystkim',\n",
" 'Naprzód',\n",
" 'o',\n",
" 'tym',\n",
" 'że',\n",
" 'na',\n",
" 'skrzydła',\n",
" 'dmucha',\n",
" 'wiater',\n",
" 'i',\n",
" 'kręci',\n",
" 'nimi',\n",
" 'jak',\n",
" 'liśćmi',\n",
" 'Dalej',\n",
" 'o',\n",
" 'tym',\n",
" 'że',\n",
" 'w',\n",
" 'wiatraku',\n",
" 'miele',\n",
" 'się',\n",
" 'zboże',\n",
" 'na',\n",
" 'mąkę',\n",
" 'i',\n",
" 'nareszcie',\n",
" 'o',\n",
" 'tym',\n",
" 'że',\n",
" 'przy',\n",
" 'wiatraku',\n",
" 'siedzi',\n",
" 'młynarz',\n",
" 'co',\n",
" 'żonę',\n",
" 'bije',\n",
" 'a',\n",
" 'taki',\n",
" 'jest',\n",
" 'mądry',\n",
" 'że',\n",
" 'wie',\n",
" 'jakim',\n",
" 'sposobem',\n",
" 'ze',\n",
" 'śpichrzów',\n",
" 'wyprowadza',\n",
" 'się',\n",
" 'szczury',\n",
" 'Po',\n",
" 'takiej',\n",
" 'poglądowej',\n",
" 'lekcji',\n",
" 'Antek',\n",
" 'wrócił',\n",
" 'do',\n",
" 'domu',\n",
" 'tą',\n",
" 'samą',\n",
" 'drogą',\n",
" 'co',\n",
" 'pierwej',\n",
" 'Dali',\n",
" 'mu',\n",
" 'tam',\n",
" 'przewoźnicy',\n",
" 'parę',\n",
" 'razy',\n",
" 'w',\n",
" 'łeb',\n",
" 'za',\n",
" 'swoją',\n",
" 'krwawą',\n",
" 'pracę',\n",
" 'dała',\n",
" 'mu',\n",
" 'i',\n",
" 'matka',\n",
" 'coś',\n",
" 'na',\n",
" 'sukienną',\n",
" 'kamizelkę',\n",
" 'ale',\n",
" 'to',\n",
" 'nic',\n",
" 'Antek',\n",
" 'był',\n",
" 'kontent',\n",
" 'bo',\n",
" 'zaspokoił',\n",
" 'ciekawość',\n",
" 'Więc',\n",
" 'choć',\n",
" 'położył',\n",
" 'się',\n",
" 'spać',\n",
" 'o',\n",
" 'głodzie',\n",
" 'marzył',\n",
" 'całą',\n",
" 'noc',\n",
" 'to',\n",
" 'o',\n",
" 'wiatraku',\n",
" 'co',\n",
" 'mieli',\n",
" 'zboże',\n",
" 'to',\n",
" 'o',\n",
" 'młynarzu',\n",
" 'co',\n",
" 'bije',\n",
" 'żonę',\n",
" 'i',\n",
" 'szczury',\n",
" 'wyprowadza',\n",
" 'ze',\n",
" 'śpichrzów',\n",
" 'Drobny',\n",
" 'ten',\n",
" 'wypadek',\n",
" 'stanowczo',\n",
" 'wpłynął',\n",
" 'na',\n",
" 'całe',\n",
" 'życie',\n",
" 'chłopca',\n",
" 'Od',\n",
" 'tej',\n",
" 'pory',\n",
" '—',\n",
" 'od',\n",
" 'wschodu',\n",
" 'do',\n",
" 'zachodu',\n",
" 'słońca',\n",
" '—',\n",
" 'strugał',\n",
" 'on',\n",
" 'patyki',\n",
" 'i',\n",
" 'układał',\n",
" 'je',\n",
" 'na',\n",
" 'krzyż',\n",
" 'Potem',\n",
" 'wystrugał',\n",
" 'sobie',\n",
" 'kolumnę',\n",
" 'próbował',\n",
" 'obciosywał',\n",
" 'ustawiał',\n",
" 'aż',\n",
" 'nareszcie',\n",
" 'wybudował',\n",
" 'mały',\n",
" 'wiatraczek',\n",
" 'który',\n",
" 'na',\n",
" 'wietrze',\n",
" 'obracał',\n",
" 'mu',\n",
" 'się',\n",
" 'jak',\n",
" 'tamten',\n",
" 'za',\n",
" 'Wisłą',\n",
" 'Cóż',\n",
" 'to',\n",
" 'była',\n",
" 'za',\n",
" 'radość',\n",
" 'Teraz',\n",
" 'brakowało',\n",
" 'Antkowi',\n",
" 'tylko',\n",
" 'żony',\n",
" 'żeby',\n",
" 'mógł',\n",
" 'ją',\n",
" 'bić',\n",
" 'i',\n",
" 'już',\n",
" 'byłby',\n",
" 'z',\n",
" 'niego',\n",
" 'prawdziwy',\n",
" 'młynarz',\n",
" 'Do',\n",
" 'dziesiątego',\n",
" 'roku',\n",
" 'życia',\n",
" 'zepsuł',\n",
" 'ze',\n",
" 'cztery',\n",
" 'koziki',\n",
" 'ale',\n",
" 'też',\n",
" 'strugał',\n",
" 'nimi',\n",
" 'dziwne',\n",
" 'rzeczy',\n",
" 'Robił',\n",
" 'wiatraki',\n",
" 'płoty',\n",
" 'drabiny',\n",
" 'studnie',\n",
" 'a',\n",
" 'nawet',\n",
" 'całe',\n",
" 'chałupy',\n",
" 'Aż',\n",
" 'się',\n",
" 'ludzie',\n",
" 'zastanawiali',\n",
" 'i',\n",
" 'mówili',\n",
" 'do',\n",
" 'matki',\n",
" 'że',\n",
" 'z',\n",
" 'Antka',\n",
" 'albo',\n",
" 'będzie',\n",
" 'majster',\n",
" 'albo',\n",
" 'wielki',\n",
" 'gałgan',\n",
" 'Przez',\n",
" 'ten',\n",
" 'czas',\n",
" 'urodził',\n",
" 'mu',\n",
" 'się',\n",
" 'jeszcze',\n",
" 'jeden',\n",
" 'brat',\n",
" 'Wojtek',\n",
" 'siostra',\n",
" 'podrosła',\n",
" 'a',\n",
" 'ojca',\n",
" 'drzewo',\n",
" 'przytłukło',\n",
" '—',\n",
" 'w',\n",
" 'lesie',\n",
" 'W',\n",
" 'chacie',\n",
" 'była',\n",
" 'z',\n",
" 'Rozalią',\n",
" 'wielka',\n",
" 'wygoda',\n",
" 'Dziewczyna',\n",
" 'zimą',\n",
" 'zamiatała',\n",
" 'izbę',\n",
" 'nosiła',\n",
" 'wodę',\n",
" 'a',\n",
" 'nawet',\n",
" 'potrafiła',\n",
" 'krupnik',\n",
" 'ugotować',\n",
" 'Latem',\n",
" 'posyłano',\n",
" 'ją',\n",
" 'do',\n",
" 'bydła',\n",
" 'z',\n",
" 'Antkiem',\n",
" 'bo',\n",
" 'chłopak',\n",
" 'zajęty',\n",
" 'struganiem',\n",
" 'nigdy',\n",
" 'się',\n",
" 'nie',\n",
" 'dopilnował',\n",
" 'Co',\n",
" 'go',\n",
" 'nie',\n",
" 'nabili',\n",
" 'nie',\n",
" 'naprosili',\n",
" 'nie',\n",
" 'napłakali',\n",
" 'się',\n",
" 'nad',\n",
" 'nim',\n",
" 'Chłopak',\n",
" 'krzyczał',\n",
" 'obiecywał',\n",
" 'płakał',\n",
" 'nawet',\n",
" 'razem',\n",
" 'z',\n",
" 'matką',\n",
" 'ale',\n",
" 'robił',\n",
" 'swoje',\n",
" 'a',\n",
" 'bydło',\n",
" 'wciąż',\n",
" 'w',\n",
" 'szkodę',\n",
" 'właziło',\n",
" 'Dopiero',\n",
" 'gdy',\n",
" 'siostra',\n",
" 'razem',\n",
" 'z',\n",
" 'nim',\n",
" 'pasła',\n",
" 'było',\n",
" 'lepiej',\n",
" 'on',\n",
" 'strugał',\n",
" 'patyki',\n",
" 'a',\n",
" 'ona',\n",
" 'pilnowała',\n",
" 'krów',\n",
" 'Nieraz',\n",
" 'matka',\n",
" 'widząc',\n",
" 'że',\n",
" 'dziewucha',\n",
" 'choć',\n",
" 'młodsza',\n",
" 'ma',\n",
" 'więcej',\n",
" 'rozumu',\n",
" 'i',\n",
" 'chęci',\n",
" 'aniżeli',\n",
" 'Antek',\n",
" 'załamywała',\n",
" 'z',\n",
" 'żalu',\n",
" 'ręce',\n",
" 'i',\n",
" 'lamentowała',\n",
" 'przed',\n",
" 'starym',\n",
" 'kumem',\n",
" 'Andrzejem',\n",
" '—',\n",
" 'Co',\n",
" 'ja',\n",
" 'pocznę',\n",
" 'nieszczęsna',\n",
" 'z',\n",
" 'tym',\n",
" 'Antkiem',\n",
" 'odmieńcem',\n",
" 'Ani',\n",
" 'to',\n",
" 'w',\n",
" 'chacie',\n",
" 'nic',\n",
" 'nie',\n",
" 'zrobi',\n",
" 'ani',\n",
" 'bydła',\n",
" 'doglądnie',\n",
" 'ino',\n",
" 'wciąż',\n",
" 'kraje',\n",
" 'te',\n",
" 'patyki',\n",
" 'jakby',\n",
" 'co',\n",
" 'w',\n",
" 'niego',\n",
" 'wstąpiło',\n",
" ...]"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tokenized_text"
]
},
{
"cell_type": "code",
"execution_count": 52,
"id": "046bbc28-a13f-49d8-b662-b28c9d16807c",
"metadata": {},
"outputs": [],
"source": [
"word_counts = Counter(tokenized_text)\n",
"\n",
"words = list(word_counts.keys())\n",
"frequencies = list(word_counts.values())"
]
},
{
"cell_type": "code",
"execution_count": 53,
"id": "7028b29a-a3fb-4c8e-887a-1ac67e1b07f7",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'i': 277,\n",
" '—': 244,\n",
" 'się': 208,\n",
" 'na': 166,\n",
" 'w': 143,\n",
" 'z': 124,\n",
" 'nie': 117,\n",
" 'do': 107,\n",
" 'to': 89,\n",
" 'a': 82,\n",
" 'że': 74,\n",
" 'Antek': 61,\n",
" 'za': 61,\n",
" 'co': 61,\n",
" 'jak': 58,\n",
" 'mu': 52,\n",
" 'go': 50,\n",
" 'A': 50,\n",
" '…': 35,\n",
" 'ale': 34,\n",
" 'tak': 34,\n",
" 'już': 32,\n",
" 'Ale': 26,\n",
" 'po': 26,\n",
" 'matka': 26,\n",
" 'o': 24,\n",
" 'jeszcze': 23,\n",
" 'sobie': 22,\n",
" 'tam': 21,\n",
" 'więc': 21,\n",
" 'od': 19,\n",
" 'choć': 19,\n",
" 'raz': 19,\n",
" 'nim': 19,\n",
" 'kiedy': 19,\n",
" 'było': 18,\n",
" 'tym': 18,\n",
" 'tylko': 18,\n",
" 'nawet': 18,\n",
" 'jego': 18,\n",
" 'wsi': 17,\n",
" 'był': 17,\n",
" 'W': 17,\n",
" 'bo': 17,\n",
" 'niego': 17,\n",
" 'gdzie': 16,\n",
" 'przez': 16,\n",
" 'on': 16,\n",
" 'ze': 16,\n",
" 'chłopca': 16,\n",
" 'aż': 16,\n",
" 'ją': 15,\n",
" 'miał': 15,\n",
" 'Antka': 15,\n",
" 'nad': 14,\n",
" 'domu': 14,\n",
" 'ja': 14,\n",
" 'trochę': 14,\n",
" 'pod': 14,\n",
" 'gdy': 13,\n",
" 'przy': 13,\n",
" 'który': 13,\n",
" 'Andrzej': 13,\n",
" 'ty': 13,\n",
" 'przecie': 12,\n",
" 'u': 12,\n",
" 'jej': 12,\n",
" 'chłopiec': 12,\n",
" 'oczy': 12,\n",
" 'nic': 12,\n",
" 'Na': 12,\n",
" 'we': 11,\n",
" 'między': 11,\n",
" 'niej': 11,\n",
" 'cię': 11,\n",
" 'jest': 11,\n",
" 'wciąż': 11,\n",
" 'szkoły': 11,\n",
" 'potem': 11,\n",
" 'ci': 10,\n",
" 'znowu': 10,\n",
" 'Potem': 10,\n",
" 'ten': 10,\n",
" 'była': 10,\n",
" 'żeby': 10,\n",
" 'ma': 10,\n",
" 'ino': 10,\n",
" 'wdowa': 10,\n",
" 'No': 10,\n",
" 'Jak': 10,\n",
" 'wójtowej': 10,\n",
" 'dzieci': 9,\n",
" 'drugi': 9,\n",
" 'jakby': 9,\n",
" 'taki': 9,\n",
" 'będzie': 9,\n",
" 'chłopak': 9,\n",
" 'I': 9,\n",
" 'nauczyciel': 9,\n",
" 'dobrze': 9,\n",
" 'poszedł': 9,\n",
" 'kuźni': 9,\n",
" 'albo': 8,\n",
" 'Za': 8,\n",
" 'dom': 8,\n",
" 'chaty': 8,\n",
" 'świat': 8,\n",
" 'swoją': 8,\n",
" 'lat': 8,\n",
" 'dzień': 8,\n",
" 'tej': 8,\n",
" 'też': 8,\n",
" 'chacie': 8,\n",
" 'Co': 8,\n",
" 'więcej': 8,\n",
" 'chyba': 8,\n",
" 'rzekł': 8,\n",
" 'ich': 8,\n",
" 'odparł': 8,\n",
" 'Nie': 8,\n",
" 'kowala': 8,\n",
" 'sołtys': 8,\n",
" 'wieś': 7,\n",
" 'ludzie': 7,\n",
" 'tego': 7,\n",
" 'nimi': 7,\n",
" 'Matka': 7,\n",
" 'mógł': 7,\n",
" 'matki': 7,\n",
" 'matulu': 7,\n",
" 'jeden': 7,\n",
" 'parę': 7,\n",
" 'majster': 7,\n",
" 'swoje': 7,\n",
" 'ani': 7,\n",
" 'czy': 7,\n",
" 'groszy': 7,\n",
" 'izby': 7,\n",
" 'wszystko': 7,\n",
" '„': 7,\n",
" '”': 7,\n",
" 'dla': 7,\n",
" 'mówił': 7,\n",
" 'Bo': 7,\n",
" 'rękę': 7,\n",
" 'bez': 7,\n",
" 'swój': 7,\n",
" 'krzyżyk': 7,\n",
" 'Wisłą': 6,\n",
" 'ażeby': 6,\n",
" 'bardzo': 6,\n",
" 'dwa': 6,\n",
" 'Przez': 6,\n",
" 'trzeci': 6,\n",
" 'roku': 6,\n",
" 'ona': 6,\n",
" 'czas': 6,\n",
" 'chłopcu': 6,\n",
" 'Po': 6,\n",
" 'drogą': 6,\n",
" 'wiatraki': 6,\n",
" 'kto': 6,\n",
" 'żal': 6,\n",
" 'naukę': 6,\n",
" 'stary': 6,\n",
" 'wziął': 6,\n",
" 'niby': 6,\n",
" 'dopiero': 6,\n",
" 'be': 6,\n",
" 'tu': 6,\n",
" 'szedł': 6,\n",
" 'kowal': 6,\n",
" 'wójtowa': 6,\n",
" 'Od': 5,\n",
" 'Tam': 5,\n",
" 'ku': 5,\n",
" 'miała': 5,\n",
" 'których': 5,\n",
" 'te': 5,\n",
" 'były': 5,\n",
" 'takiej': 5,\n",
" 'miejsca': 5,\n",
" 'sam': 5,\n",
" 'coś': 5,\n",
" 'prawie': 5,\n",
" 'dziecko': 5,\n",
" 'wiatrak': 5,\n",
" 'Więc': 5,\n",
" 'stało': 5,\n",
" 'krzyż': 5,\n",
" 'wrócił': 5,\n",
" 'razy': 5,\n",
" 'razem': 5,\n",
" 'ręce': 5,\n",
" 'przed': 5,\n",
" 'Jużci': 5,\n",
" 'by': 5,\n",
" 'wy': 5,\n",
" 'być': 5,\n",
" 'czasem': 5,\n",
" 'patrzył': 5,\n",
" 'mnie': 5,\n",
" 'tyle': 5,\n",
" 'może': 5,\n",
" 'To': 5,\n",
" 'myślał': 5,\n",
" 'czterdzieści': 5,\n",
" 'mi': 5,\n",
" 'nauki': 5,\n",
" 'prawda': 5,\n",
" 'spytał': 5,\n",
" 'gminy': 5,\n",
" 'góry': 5,\n",
" 'kościele': 5,\n",
" 'Nauczyciel': 5,\n",
" 'tablicy': 5,\n",
" 'jakiś': 5,\n",
" 'rozgrzewkę': 5,\n",
" 'nagle': 5,\n",
" 'kilka': 5,\n",
" 'jednak': 5,\n",
" 'figury': 5,\n",
" 'począł': 5,\n",
" 'której': 5,\n",
" 'wszyscy': 5,\n",
" 'głowę': 5,\n",
" 'Bóg': 5,\n",
" 'ramieniu': 5,\n",
" 'urodził': 4,\n",
" 'chodziły': 4,\n",
" 'lepiej': 4,\n",
" 'niż': 4,\n",
" 'siostra': 4,\n",
" 'jako': 4,\n",
" 'cały': 4,\n",
" 'świecie': 4,\n",
" 'piecu': 4,\n",
" 'drugą': 4,\n",
" 'szło': 4,\n",
" 'samo': 4,\n",
" 'Tymczasem': 4,\n",
" 'zapytał': 4,\n",
" 'Matulu': 4,\n",
" 'chodzi': 4,\n",
" 'spojrzała': 4,\n",
" 'ręką': 4,\n",
" 'widzisz': 4,\n",
" 'roboty': 4,\n",
" 'dnia': 4,\n",
" 'ludzi': 4,\n",
" 'zboże': 4,\n",
" 'sposobem': 4,\n",
" 'patyki': 4,\n",
" 'je': 4,\n",
" 'Teraz': 4,\n",
" 'rzeczy': 4,\n",
" 'matką': 4,\n",
" 'Już': 4,\n",
" 'Andrzeju': 4,\n",
" 'widział': 4,\n",
" 'Ja': 4,\n",
" 'odpowiadał': 4,\n",
" 'zawsze': 4,\n",
" 'wówczas': 4,\n",
" 'Wtedy': 4,\n",
" 'rzekła': 4,\n",
" 'trzy': 4,\n",
" 'moja': 4,\n",
" 'O': 4,\n",
" 'zawołał': 4,\n",
" 'tydzień': 4,\n",
" 'szkole': 4,\n",
" 'pytał': 4,\n",
" 'nóg': 4,\n",
" 'profesorowi': 4,\n",
" 'ręki': 4,\n",
" 'ładny': 4,\n",
" 'nauczył': 4,\n",
" 'Przecie': 4,\n",
" 'matce': 4,\n",
" 'odezwał': 4,\n",
" 'uczy': 4,\n",
" 'profesor': 4,\n",
" 'nasz': 4,\n",
" 'ta': 4,\n",
" 'inni': 4,\n",
" 'Kiedy': 4,\n",
" 'wyraz': 4,\n",
" 'siebie': 4,\n",
" 'szczęście': 4,\n",
" 'miesiąc': 4,\n",
" 'poczęli': 4,\n",
" 'nas': 4,\n",
" 'duszy': 4,\n",
" 'rzeźbił': 4,\n",
" 'poszli': 4,\n",
" 'Kowal': 4,\n",
" 'dziś': 4,\n",
" 'pracy': 4,\n",
" 'karczmy': 4,\n",
" 'chłop': 4,\n",
" 'zwykle': 4,\n",
" 'wójtową': 4,\n",
" 'temu': 4,\n",
" 'Antku': 4,\n",
" 'dolinie': 3,\n",
" 'wzgórza': 3,\n",
" 'ptaki': 3,\n",
" 'środku': 3,\n",
" 'widać': 3,\n",
" 'jaki': 3,\n",
" 'budynek': 3,\n",
" 'Widać': 3,\n",
" 'mówili': 3,\n",
" 'starą': 3,\n",
" 'rok': 3,\n",
" 'siostrę': 3,\n",
" 'życia': 3,\n",
" 'kamizelkę': 3,\n",
" 'Gdy': 3,\n",
" 'patrzeć': 3,\n",
" 'Wisły': 3,\n",
" 'ziemi': 3,\n",
" 'górę': 3,\n",
" 'prawo': 3,\n",
" 'takie': 3,\n",
" 'kartofle': 3,\n",
" 'sercu': 3,\n",
" 'dobre': 3,\n",
" 'odparła': 3,\n",
" 'Cóż': 3,\n",
" 'wiatrakach': 3,\n",
" 'taka': 3,\n",
" 'jednego': 3,\n",
" 'aby': 3,\n",
" 'chodzić': 3,\n",
" 'początku': 3,\n",
" 'wszystkim': 3,\n",
" 'wiatraku': 3,\n",
" 'nareszcie': 3,\n",
" 'bije': 3,\n",
" 'szczury': 3,\n",
" 'noc': 3,\n",
" 'całe': 3,\n",
" 'życie': 3,\n",
" 'strugał': 3,\n",
" 'mały': 3,\n",
" 'Antkowi': 3,\n",
" 'byłby': 3,\n",
" 'Aż': 3,\n",
" 'wielki': 3,\n",
" 'brat': 3,\n",
" 'Wojtek': 3,\n",
" 'drzewo': 3,\n",
" 'wodę': 3,\n",
" 'nigdy': 3,\n",
" 'Chłopak': 3,\n",
" 'Dopiero': 3,\n",
" 'aniżeli': 3,\n",
" 'gospodarz': 3,\n",
" 'świata': 3,\n",
" 'majstra': 3,\n",
" 'rzemiosła': 3,\n",
" 'jeżeli': 3,\n",
" 'Oj': 3,\n",
" 'strasznie': 3,\n",
" 'trudno': 3,\n",
" 'dziewczyna': 3,\n",
" 'gorzej': 3,\n",
" 'sześć': 3,\n",
" 'wielką': 3,\n",
" 'chleba': 3,\n",
" 'Wdowa': 3,\n",
" 'patrzy': 3,\n",
" 'piec': 3,\n",
" 'pieca': 3,\n",
" 'zawołała': 3,\n",
" 'Cicho': 3,\n",
" 'matkę': 3,\n",
" 'Pan': 3,\n",
" 'ledwie': 3,\n",
" 'wyście': 3,\n",
" 'prędko': 3,\n",
" 'nikt': 3,\n",
" 'wiedział': 3,\n",
" 'drodze': 3,\n",
" 'źle': 3,\n",
" 'polne': 3,\n",
" 'koniki': 3,\n",
" 'gospodarstwie': 3,\n",
" 'Andrzeja': 3,\n",
" 'kancelarii': 3,\n",
" 'pan': 3,\n",
" 'umie': 3,\n",
" 'Jakże': 3,\n",
" 'czego': 3,\n",
" 'Te': 3,\n",
" 'gadać': 3,\n",
" 'abecadło': 3,\n",
" 'umiał': 3,\n",
" 'pisarz': 3,\n",
" 'dni': 3,\n",
" 'ławki': 3,\n",
" 'jedna': 3,\n",
" 'znak': 3,\n",
" 'literę': 3,\n",
" 'a…': 3,\n",
" 'oddziału': 3,\n",
" 'precel': 3,\n",
" 'nazywa': 3,\n",
" 'Chłopiec': 3,\n",
" 'ramiona': 3,\n",
" 'uczuł': 3,\n",
" 'usłyszał': 3,\n",
" 'druga': 3,\n",
" 'hardy': 3,\n",
" 'taką': 3,\n",
" 'bądź': 3,\n",
" 'później': 3,\n",
" 'gospodyni': 3,\n",
" 'im': 3,\n",
" 'także': 3,\n",
" 'jakoś': 3,\n",
" 'Wreszcie': 3,\n",
" 'tę': 3,\n",
" 'łgarstwo': 3,\n",
" 'sama': 3,\n",
" 'Tak': 3,\n",
" 'rubla': 3,\n",
" 'przyszedł': 3,\n",
" 'Nawet': 3,\n",
" 'wyszedł': 3,\n",
" 'człowiek': 3,\n",
" 'mówi': 3,\n",
" 'chłopaku': 3,\n",
" 'pieniędzy': 3,\n",
" 'wiadomo': 3,\n",
" 'swoich': 3,\n",
" 'świętych': 3,\n",
" 'kum': 3,\n",
" 'najwięcej': 3,\n",
" 'Idzie': 3,\n",
" 'polu': 3,\n",
" 'Bogu': 3,\n",
" 'sołtysem': 3,\n",
" 'butelkę': 3,\n",
" 'jutro': 3,\n",
" 'ktoś': 3,\n",
" 'terminatorzy': 3,\n",
" 'trzymał': 3,\n",
" 'chwilę': 3,\n",
" 'kobiety': 3,\n",
" 'mówiła': 3,\n",
" 'Ledwie': 3,\n",
" 'życiu': 3,\n",
" 'ludu': 3,\n",
" 'święta': 3,\n",
" 'sukmanę': 3,\n",
" 'drogi': 3,\n",
" 'kobiałkę': 3,\n",
" 'torbę': 3,\n",
" 'Zdawało': 3,\n",
" 'lasem': 2,\n",
" 'południa': 2,\n",
" 'śpiewały': 2,\n",
" 'wiejskie': 2,\n",
" 'gniazda': 2,\n",
" 'czerwone': 2,\n",
" 'wsią': 2,\n",
" 'zachodu': 2,\n",
" 'tamę': 2,\n",
" 'Wisłę': 2,\n",
" 'ognia': 2,\n",
" 'dawniej': 2,\n",
" 'dziwili': 2,\n",
" 'sobą': 2,\n",
" 'została': 2,\n",
" 'lata': 2,\n",
" 'Rozalia': 2,\n",
" 'musiał': 2,\n",
" 'przenieść': 2,\n",
" 'kołysał': 2,\n",
" 'wpadł': 2,\n",
" 'dostał': 2,\n",
" 'mało': 2,\n",
" 'psy': 2,\n",
" 'leżał': 2,\n",
" 'ojciec': 2,\n",
" 'sukienną': 2,\n",
" 'świń': 2,\n",
" 'oglądał': 2,\n",
" 'stronę': 2,\n",
" 'spod': 2,\n",
" 'pierwszym': 2,\n",
" 'zaraz': 2,\n",
" 'drugie': 2,\n",
" 'czarne': 2,\n",
" 'swoim': 2,\n",
" 'swojej': 2,\n",
" 'nocy': 2,\n",
" 'ciekawość': 2,\n",
" 'zobaczył': 2,\n",
" 'cztery': 2,\n",
" 'skrzydła': 2,\n",
" 'rozumiał': 2,\n",
" 'wnet': 2,\n",
" 'rzecz': 2,\n",
" 'Naprzód': 2,\n",
" 'młynarz': 2,\n",
" 'żonę': 2,\n",
" 'jakim': 2,\n",
" 'śpichrzów': 2,\n",
" 'wyprowadza': 2,\n",
" 'samą': 2,\n",
" 'pierwej': 2,\n",
" 'łeb': 2,\n",
" 'pracę': 2,\n",
" 'dała': 2,\n",
" 'położył': 2,\n",
" 'głodzie': 2,\n",
" 'wypadek': 2,\n",
" 'pory': 2,\n",
" 'słońca': 2,\n",
" 'żony': 2,\n",
" 'bić': 2,\n",
" 'zepsuł': 2,\n",
" 'chałupy': 2,\n",
" 'ojca': 2,\n",
" 'przytłukło': 2,\n",
" 'wielka': 2,\n",
" 'Dziewczyna': 2,\n",
" 'krupnik': 2,\n",
" 'bydła': 2,\n",
" 'Antkiem': 2,\n",
" 'nabili': 2,\n",
" 'płakał': 2,\n",
" 'robił': 2,\n",
" 'bydło': 2,\n",
" 'widząc': 2,\n",
" 'dziewucha': 2,\n",
" 'rozumu': 2,\n",
" 'mój': 2,\n",
" 'ludziom': 2,\n",
" 'darmo': 2,\n",
" 'naprzód': 2,\n",
" 'książki': 2,\n",
" 'kumie': 2,\n",
" 'wstyd': 2,\n",
" 'dziecku': 2,\n",
" 'robotę': 2,\n",
" 'robić': 2,\n",
" 'ławie': 2,\n",
" 'chcesz': 2,\n",
" 'będę': 2,\n",
" 'Miał': 2,\n",
" 'razu': 2,\n",
" 'myślała': 2,\n",
" 'pomogło': 2,\n",
" 'dziewuchę': 2,\n",
" 'szmaty': 2,\n",
" 'znachorkę': 2,\n",
" 'baba': 2,\n",
" 'koło': 2,\n",
" 'podłogę': 2,\n",
" 'węgle': 2,\n",
" 'teraz': 2,\n",
" 'sosnowej': 2,\n",
" 'desce': 2,\n",
" 'zdrowaśki': 2,\n",
" 'Rozalię': 2,\n",
" 'rogu': 2,\n",
" 'gorąco': 2,\n",
" 'mną': 2,\n",
" 'wyjdzie': 2,\n",
" 'zdrowie': 2,\n",
" 'baby': 2,\n",
" 'poczęła': 2,\n",
" 'rzucać': 2,\n",
" 'rękami': 2,\n",
" 'dyć': 2,\n",
" 'całkiem': 2,\n",
" 'odmawiać': 2,\n",
" 'płaczem': 2,\n",
" 'deskę': 2,\n",
" 'uklękła': 2,\n",
" 'głową': 2,\n",
" 'klepisko': 2,\n",
" 'choroba': 2,\n",
" 'Rozalii': 2,\n",
" 'Alboż': 2,\n",
" 'jedno': 2,\n",
" 'trumnę': 2,\n",
" 'krzyże': 2,\n",
" 'bok': 2,\n",
" 'idąc': 2,\n",
" 'Musi': 2,\n",
" 'ksiądz': 2,\n",
" 'czterech': 2,\n",
" 'jednej': 2,\n",
" 'dziewuchy': 2,\n",
" 'którym': 2,\n",
" 'wzdychał': 2,\n",
" 'spadł': 2,\n",
" 'pomocy': 2,\n",
" 'kuma': 2,\n",
" 'oddać': 2,\n",
" 'budować': 2,\n",
" 'Oho': 2,\n",
" 'tedy': 2,\n",
" 'poszła': 2,\n",
" 'kożuch': 2,\n",
" 'pieniądze': 2,\n",
" 'któremu': 2,\n",
" 'popatrzył': 2,\n",
" 'Ładny': 2,\n",
" 'musi': 2,\n",
" 'bym': 2,\n",
" 'wiedzieć': 2,\n",
" 'tych': 2,\n",
" 'uczył': 2,\n",
" 'drwa': 2,\n",
" 'paru': 2,\n",
" 'On': 2,\n",
" 'dołu': 2,\n",
" 'znaczy': 2,\n",
" 'pierwszy': 2,\n",
" 'wójt': 2,\n",
" 'Żebym': 2,\n",
" 'karczmie': 2,\n",
" 'stoi': 2,\n",
" 'Tylko': 2,\n",
" 'drzwi': 2,\n",
" 'miały': 2,\n",
" 'twarze': 2,\n",
" 'chodził': 2,\n",
" 'mróz': 2,\n",
" 'liter': 2,\n",
" 'zaczęła': 2,\n",
" 'Patrzcie': 2,\n",
" 'Tę': 2,\n",
" 'łatwo': 2,\n",
" 'wygląda': 2,\n",
" 'zawołali': 2,\n",
" 'chórem': 2,\n",
" 'pierwszego': 2,\n",
" 'głos': 2,\n",
" 'wyrysował': 2,\n",
" 'która': 2,\n",
" 'krzyknął': 2,\n",
" 'środek': 2,\n",
" 'położyli': 2,\n",
" 'Jeszcze': 2,\n",
" 'becz': 2,\n",
" 'hultaju': 2,\n",
" 'miejsce': 2,\n",
" 'trzecią': 2,\n",
" 'Pierwszy': 2,\n",
" 'milczał': 2,\n",
" 'Powtórz': 2,\n",
" 'ośle': 2,\n",
" 'wolno': 2,\n",
" 'nauka': 2,\n",
" 'kuchni': 2,\n",
" 'profesora': 2,\n",
" 'pokarm': 2,\n",
" 'zapytała': 2,\n",
" 'początek': 2,\n",
" 'będziesz': 2,\n",
" 'Ha': 2,\n",
" 'Jednego': 2,\n",
" 'serce': 2,\n",
" 'pożytek': 2,\n",
" 'jaka': 2,\n",
" 'niewiele': 2,\n",
" 'oczami': 2,\n",
" 'okna': 2,\n",
" 'ławy': 2,\n",
" 'zobaczyć': 2,\n",
" 'swego': 2,\n",
" 'Widzisz': 2,\n",
" 'widzę': 2,\n",
" 'znaleźć': 2,\n",
" 'kominie': 2,\n",
" 'baraszkuje': 2,\n",
" 'Jeżeli': 2,\n",
" 'nieszczęście': 2,\n",
" 'miesiące': 2,\n",
" 'idzie': 2,\n",
" 'grosz': 2,\n",
" 'czapkę': 2,\n",
" 'izbie': 2,\n",
" 'patrząc': 2,\n",
" 'zaś': 2,\n",
" 'mógłby': 2,\n",
" 'oboje': 2,\n",
" 'wtrąciła': 2,\n",
" 'jeść': 2,\n",
" 'odzienie': 2,\n",
" 'Kto': 2,\n",
" 'se': 2,\n",
" 'szkolnemu': 2,\n",
" 'pieśni': 2,\n",
" 'gospodarstwa': 2,\n",
" 'drugiej': 2,\n",
" 'stamtąd': 2,\n",
" 'Strugał': 2,\n",
" 'nowy': 2,\n",
" 'koni': 2,\n",
" 'siwej': 2,\n",
" 'Mordce': 2,\n",
" 'wypoczywał': 2,\n",
" 'roli': 2,\n",
" 'resztę': 2,\n",
" 'musiała': 2,\n",
" 'chłopców': 2,\n",
" 'komina': 2,\n",
" 'twarzy': 2,\n",
" 'Nareszcie': 2,\n",
" 'przyjaciel': 2,\n",
" 'Jednej': 2,\n",
" 'niedzieli': 2,\n",
" 'przyjął': 2,\n",
" 'niezgorzej': 2,\n",
" 'wcale': 2,\n",
" 'terminu': 2,\n",
" 'którzy': 2,\n",
" 'zjedli': 2,\n",
" 'dali': 2,\n",
" 'żelazo': 2,\n",
" 'innym': 2,\n",
" 'żelaza': 2,\n",
" 'robota': 2,\n",
" 'walił': 2,\n",
" 'końcu': 2,\n",
" 'urzędzie': 2,\n",
" 'sosnową': 2,\n",
" 'Pochwalony': 2,\n",
" 'odpowiada': 2,\n",
" 'Niczego': 2,\n",
" 'was': 2,\n",
" 'zęby': 2,\n",
" 'Może': 2,\n",
" 'mogli': 2,\n",
" 'chłopcy': 2,\n",
" 'płukania': 2,\n",
" 'zarobku': 2,\n",
" 'półtora': 2,\n",
" 'znać': 2,\n",
" 'powoli': 2,\n",
" 'nią': 2,\n",
" 'płukanie': 2,\n",
" 'potrafi': 2,\n",
" 'najstarszy': 2,\n",
" 'garści': 2,\n",
" 'miewał': 2,\n",
" 'czasie': 2,\n",
" 'podkowę': 2,\n",
" 'gęby': 2,\n",
" 'obejrzał': 2,\n",
" 'narzędzi': 2,\n",
" 'uciekł': 2,\n",
" 'oddała': 2,\n",
" 'progu': 2,\n",
" 'Antkowa': 2,\n",
" 'bystry': 2,\n",
" 'poznał': 2,\n",
" 'pilnik': 2,\n",
" 'święci': 2,\n",
" 'nich': 2,\n",
" 'coraz': 2,\n",
" 'bestyja': 2,\n",
" 'Był': 2,\n",
" 'zręczny': 2,\n",
" 'włosy': 2,\n",
" 'znam': 2,\n",
" 'kobieta': 2,\n",
" 'małżeństwa': 2,\n",
" 'wydał': 2,\n",
" 'Wójtowa': 2,\n",
" 'weselu': 2,\n",
" 'wójta': 2,\n",
" 'goście': 2,\n",
" 'strzelcy': 2,\n",
" 'miesięczną': 2,\n",
" 'pensję': 2,\n",
" 'czwarty': 2,\n",
" 'zaczęło': 2,\n",
" 'pomiędzy': 2,\n",
" 'sumę': 2,\n",
" 'ciasno': 2,\n",
" 'modlił': 2,\n",
" 'świętego': 2,\n",
" 'wielkim': 2,\n",
" 'ołtarzu': 2,\n",
" 'wyżej': 2,\n",
" 'wszystkich': 2,\n",
" 'Wtem': 2,\n",
" 'ciżbą': 2,\n",
" 'ramion': 2,\n",
" 'paciorków': 2,\n",
" 'znikła': 2,\n",
" 'ławce': 2,\n",
" 'kościół': 2,\n",
" 'podniesienie': 2,\n",
" 'zrobiło': 2,\n",
" 'twarz': 2,\n",
" 'sumie': 2,\n",
" 'gorzelnik': 2,\n",
" 'trzeciej': 2,\n",
" 'pole': 2,\n",
" 'chleb': 2,\n",
" 'wiele': 2,\n",
" 'ostatni': 2,\n",
" 'mignęło': 2,\n",
" 'oczach': 2,\n",
" 'wśród': 2,\n",
" 'niebo': 2,\n",
" 'zamiast': 2,\n",
" 'znalazł': 2,\n",
" 'kościoła': 2,\n",
" 'myśląc': 2,\n",
" 'wtedy': 2,\n",
" 'spojrzy': 2,\n",
" 'Wówczas': 2,\n",
" 'piękny': 2,\n",
" 'ofiarować': 2,\n",
" 'lewym': 2,\n",
" 'Wisła': 2,\n",
" 'opuścić': 2,\n",
" 'narzędziami': 2,\n",
" 'dar': 2,\n",
" 'drogę': 2,\n",
" 'Czy': 2,\n",
" 'ciebie': 2,\n",
" 'swoimi': 2,\n",
" 'rękoma': 2,\n",
" 'takiego': 2,\n",
" 'powiedział': 2,\n",
" 'szmer': 2,\n",
" 'jedną': 2,\n",
" 'chwili': 2,\n",
" 'tajemną': 2,\n",
" 'miłość': 2,\n",
" 'ucałował': 2,\n",
" 'daj': 2,\n",
" 'Kumie': 2,\n",
" 'synusiu': 2,\n",
" 'Boże': 2,\n",
" 'święty': 2,\n",
" 'miłościw': 2,\n",
" 'Bądź': 2,\n",
" 'obcych': 2,\n",
" 'uszedł': 2,\n",
" 'Niech': 2,\n",
" 'błogosławi': 2,\n",
" 'Zostańcie': 2,\n",
" 'Bogiem': 2,\n",
" 'szarym': 2,\n",
" 'Bolesław': 1,\n",
" 'Prus': 1,\n",
" 'Wieś': 1,\n",
" 'leżała': 1,\n",
" 'niewielkiej': 1,\n",
" 'północy': 1,\n",
" 'otaczały': 1,\n",
" 'spadziste': 1,\n",
" 'porosłe': 1,\n",
" 'sosnowym': 1,\n",
" 'garbate': 1,\n",
" 'zasypane': 1,\n",
" 'leszczyną': 1,\n",
" 'tarniną': 1,\n",
" 'głogiem': 1,\n",
" 'najgłośniej': 1,\n",
" 'najczęściej': 1,\n",
" 'rwać': 1,\n",
" 'orzechy': 1,\n",
" 'wybierać': 1,\n",
" 'Kiedyś': 1,\n",
" 'stanął': 1,\n",
" 'zdawało': 1,\n",
" 'oba': 1,\n",
" 'pasma': 1,\n",
" 'gór': 1,\n",
" 'biegną': 1,\n",
" 'zetknąć': 1,\n",
" 'rana': 1,\n",
" 'wstaje': 1,\n",
" 'słońce': 1,\n",
" 'złudzenie': 1,\n",
" 'bowiem': 1,\n",
" 'ciągnęła': 1,\n",
" 'wzgórzami': 1,\n",
" 'dolina': 1,\n",
" 'przecięta': 1,\n",
" 'rzeczułką': 1,\n",
" 'przykryta': 1,\n",
" 'zieloną': 1,\n",
" 'łąką': 1,\n",
" 'pasano': 1,\n",
" 'bydlątka': 1,\n",
" 'cienkonogie': 1,\n",
" 'bociany': 1,\n",
" 'polować': 1,\n",
" 'żaby': 1,\n",
" 'kukające': 1,\n",
" 'wieczorami': 1,\n",
" 'tamą': 1,\n",
" 'wapienne': 1,\n",
" 'nagie': 1,\n",
" 'Każdy': 1,\n",
" 'chłopski': 1,\n",
" 'szarą': 1,\n",
" 'słomą': 1,\n",
" 'pokryty': 1,\n",
" 'ogródek': 1,\n",
" 'ogródku': 1,\n",
" 'śliwki': 1,\n",
" 'węgierki': 1,\n",
" 'spomiędzy': 1,\n",
" 'komin': 1,\n",
" 'sadzą': 1,\n",
" 'uczerniony': 1,\n",
" 'pożarną': 1,\n",
" 'drabinkę': 1,\n",
" 'Drabiny': 1,\n",
" 'zaprowadzono': 1,\n",
" 'dawna': 1,\n",
" 'myśleli': 1,\n",
" 'one': 1,\n",
" 'chronić': 1,\n",
" 'będą': 1,\n",
" 'bocianie': 1,\n",
" 'Toteż': 1,\n",
" 'płonął': 1,\n",
" 'ratowali': 1,\n",
" 'gospodarza': 1,\n",
" 'dopust': 1,\n",
" 'boski': 1,\n",
" 'Spalił': 1,\n",
" 'nową': 1,\n",
" 'drabinę': 1,\n",
" 'zapłacił': 1,\n",
" 'śtraf': 1,\n",
" 'połamane': 1,\n",
" 'szczeble': 1,\n",
" 'Położyli': 1,\n",
" 'niemalowanej': 1,\n",
" 'kołysce': 1,\n",
" 'zmarłym': 1,\n",
" 'bracie': 1,\n",
" 'sypiał': 1,\n",
" 'przyszła': 1,\n",
" 'ustąpić': 1,\n",
" 'osoba': 1,\n",
" 'dorosła': 1,\n",
" 'ławę': 1,\n",
" 'następny': 1,\n",
" 'rozglądał': 1,\n",
" 'Raz': 1,\n",
" 'rzekę': 1,\n",
" 'batem': 1,\n",
" 'przejezdnego': 1,\n",
" 'furmana': 1,\n",
" 'konie': 1,\n",
" 'stratowały': 1,\n",
" 'pogryzły': 1,\n",
" 'tygodnie': 1,\n",
" 'Doświadczył': 1,\n",
" 'niemało': 1,\n",
" 'czwartym': 1,\n",
" 'podarował': 1,\n",
" 'mosiężnym': 1,\n",
" 'guzikiem': 1,\n",
" 'kazała': 1,\n",
" 'nosić': 1,\n",
" 'pięć': 1,\n",
" 'użyto': 1,\n",
" 'pasania': 1,\n",
" 'Wolał': 1,\n",
" 'wapiennym': 1,\n",
" 'wzgórzem': 1,\n",
" 'pokazywało': 1,\n",
" 'wysokiego': 1,\n",
" 'czarnego': 1,\n",
" 'Wyłaziło': 1,\n",
" 'lewej': 1,\n",
" 'strony': 1,\n",
" 'upadało': 1,\n",
" 'trzecie': 1,\n",
" 'wysokie': 1,\n",
" 'świnie': 1,\n",
" 'obyczajem': 1,\n",
" 'wlazły': 1,\n",
" 'spostrzegłszy': 1,\n",
" 'zawinęła': 1,\n",
" 'wedle': 1,\n",
" 'sukiennej': 1,\n",
" 'kamizelki': 1,\n",
" 'Antkowej': 1,\n",
" ...})"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"word_counts"
]
},
{
"cell_type": "code",
"execution_count": 59,
"id": "e5fd7da7-420d-4fd5-af68-3ea5fd3189e7",
"metadata": {},
"outputs": [],
"source": [
"with open('polish_stopwords.txt') as f:\n",
" stopwords = [x.strip() for x in f.readlines()]"
]
},
{
"cell_type": "code",
"execution_count": 60,
"id": "862e4d48-aace-4504-95fe-f08a981d243f",
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"['a',\n",
" 'aby',\n",
" 'ach',\n",
" 'acz',\n",
" 'aczkolwiek',\n",
" 'aj',\n",
" 'albo',\n",
" 'ale',\n",
" 'alez',\n",
" 'ależ',\n",
" 'ani',\n",
" 'az',\n",
" 'aż',\n",
" 'bardziej',\n",
" 'bardzo',\n",
" 'beda',\n",
" 'bedzie',\n",
" 'bez',\n",
" 'deda',\n",
" 'będą',\n",
" 'bede',\n",
" 'będę',\n",
" 'będzie',\n",
" 'bo',\n",
" 'bowiem',\n",
" 'by',\n",
" 'byc',\n",
" 'być',\n",
" 'byl',\n",
" 'byla',\n",
" 'byli',\n",
" 'bylo',\n",
" 'byly',\n",
" 'był',\n",
" 'była',\n",
" 'było',\n",
" 'były',\n",
" 'bynajmniej',\n",
" 'cala',\n",
" 'cali',\n",
" 'caly',\n",
" 'cała',\n",
" 'cały',\n",
" 'ci',\n",
" 'cie',\n",
" 'ciebie',\n",
" 'cię',\n",
" 'co',\n",
" 'cokolwiek',\n",
" 'cos',\n",
" 'coś',\n",
" 'czasami',\n",
" 'czasem',\n",
" 'czemu',\n",
" 'czy',\n",
" 'czyli',\n",
" 'daleko',\n",
" 'dla',\n",
" 'dlaczego',\n",
" 'dlatego',\n",
" 'do',\n",
" 'dobrze',\n",
" 'dokad',\n",
" 'dokąd',\n",
" 'dosc',\n",
" 'dość',\n",
" 'duzo',\n",
" 'dużo',\n",
" 'dwa',\n",
" 'dwaj',\n",
" 'dwie',\n",
" 'dwoje',\n",
" 'dzis',\n",
" 'dzisiaj',\n",
" 'dziś',\n",
" 'gdy',\n",
" 'gdyby',\n",
" 'gdyz',\n",
" 'gdyż',\n",
" 'gdzie',\n",
" 'gdziekolwiek',\n",
" 'gdzies',\n",
" 'gdzieś',\n",
" 'go',\n",
" 'i',\n",
" 'ich',\n",
" 'ile',\n",
" 'im',\n",
" 'inna',\n",
" 'inne',\n",
" 'inny',\n",
" 'innych',\n",
" 'iz',\n",
" 'iż',\n",
" 'ja',\n",
" 'jak',\n",
" 'jakas',\n",
" 'jakaś',\n",
" 'jakby',\n",
" 'jaki',\n",
" 'jakichs',\n",
" 'jakichś',\n",
" 'jakie',\n",
" 'jakis',\n",
" 'jakiś',\n",
" 'jakiz',\n",
" 'jakiż',\n",
" 'jakkolwiek',\n",
" 'jako',\n",
" 'jakos',\n",
" 'jakoś',\n",
" 'ją',\n",
" 'je',\n",
" 'jeden',\n",
" 'jedna',\n",
" 'jednak',\n",
" 'jednakze',\n",
" 'jednakże',\n",
" 'jedno',\n",
" 'jego',\n",
" 'jej',\n",
" 'jemu',\n",
" 'jesli',\n",
" 'jest',\n",
" 'jestem',\n",
" 'jeszcze',\n",
" 'jeśli',\n",
" 'jezeli',\n",
" 'jeżeli',\n",
" 'juz',\n",
" 'już',\n",
" 'kazdy',\n",
" 'każdy',\n",
" 'kiedy',\n",
" 'kilka',\n",
" 'kims',\n",
" 'kimś',\n",
" 'kto',\n",
" 'ktokolwiek',\n",
" 'ktora',\n",
" 'ktore',\n",
" 'ktorego',\n",
" 'ktorej',\n",
" 'ktory',\n",
" 'ktorych',\n",
" 'ktorym',\n",
" 'ktorzy',\n",
" 'ktos',\n",
" 'ktoś',\n",
" 'która',\n",
" 'które',\n",
" 'którego',\n",
" 'której',\n",
" 'który',\n",
" 'których',\n",
" 'którym',\n",
" 'którzy',\n",
" 'ku',\n",
" 'lat',\n",
" 'lecz',\n",
" 'lub',\n",
" 'ma',\n",
" 'mają',\n",
" 'mało',\n",
" 'mam',\n",
" 'mi',\n",
" 'miedzy',\n",
" 'między',\n",
" 'mimo',\n",
" 'mna',\n",
" 'mną',\n",
" 'mnie',\n",
" 'moga',\n",
" 'mogą',\n",
" 'moi',\n",
" 'moim',\n",
" 'moj',\n",
" 'moja',\n",
" 'moje',\n",
" 'moze',\n",
" 'mozliwe',\n",
" 'mozna',\n",
" 'może',\n",
" 'możliwe',\n",
" 'można',\n",
" 'mój',\n",
" 'mu',\n",
" 'musi',\n",
" 'my',\n",
" 'na',\n",
" 'nad',\n",
" 'nam',\n",
" 'nami',\n",
" 'nas',\n",
" 'nasi',\n",
" 'nasz',\n",
" 'nasza',\n",
" 'nasze',\n",
" 'naszego',\n",
" 'naszych',\n",
" 'natomiast',\n",
" 'natychmiast',\n",
" 'nawet',\n",
" 'nia',\n",
" 'nią',\n",
" 'nic',\n",
" 'nich',\n",
" 'nie',\n",
" 'niech',\n",
" 'niego',\n",
" 'niej',\n",
" 'niemu',\n",
" 'nigdy',\n",
" 'nim',\n",
" 'nimi',\n",
" 'niz',\n",
" 'niż',\n",
" 'no',\n",
" 'o',\n",
" 'obok',\n",
" 'od',\n",
" 'około',\n",
" 'on',\n",
" 'ona',\n",
" 'one',\n",
" 'oni',\n",
" 'ono',\n",
" 'oraz',\n",
" 'oto',\n",
" 'owszem',\n",
" 'pan',\n",
" 'pana',\n",
" 'pani',\n",
" 'po',\n",
" 'pod',\n",
" 'podczas',\n",
" 'pomimo',\n",
" 'ponad',\n",
" 'poniewaz',\n",
" 'ponieważ',\n",
" 'powinien',\n",
" 'powinna',\n",
" 'powinni',\n",
" 'powinno',\n",
" 'poza',\n",
" 'prawie',\n",
" 'przeciez',\n",
" 'przecież',\n",
" 'przed',\n",
" 'przede',\n",
" 'przedtem',\n",
" 'przez',\n",
" 'przy',\n",
" 'roku',\n",
" 'rowniez',\n",
" 'również',\n",
" 'sam',\n",
" 'sama',\n",
" 'są',\n",
" 'sie',\n",
" 'się',\n",
" 'skad',\n",
" 'skąd',\n",
" 'soba',\n",
" 'sobą',\n",
" 'sobie',\n",
" 'sposob',\n",
" 'sposób',\n",
" 'swoje',\n",
" 'ta',\n",
" 'tak',\n",
" 'taka',\n",
" 'taki',\n",
" 'takie',\n",
" 'takze',\n",
" 'także',\n",
" 'tam',\n",
" 'te',\n",
" 'tego',\n",
" 'tej',\n",
" 'ten',\n",
" 'teraz',\n",
" 'też',\n",
" 'to',\n",
" 'toba',\n",
" 'tobą',\n",
" 'tobie',\n",
" 'totez',\n",
" 'toteż',\n",
" 'totobą',\n",
" 'trzeba',\n",
" 'tu',\n",
" 'tutaj',\n",
" 'twoi',\n",
" 'twoim',\n",
" 'twoj',\n",
" 'twoja',\n",
" 'twoje',\n",
" 'twój',\n",
" 'twym',\n",
" 'ty',\n",
" 'tych',\n",
" 'tylko',\n",
" 'tym',\n",
" 'u',\n",
" 'w',\n",
" 'wam',\n",
" 'wami',\n",
" 'was',\n",
" 'wasz',\n",
" 'wasza',\n",
" 'wasze',\n",
" 'we',\n",
" 'według',\n",
" 'wiele',\n",
" 'wielu',\n",
" 'więc',\n",
" 'więcej',\n",
" 'wlasnie',\n",
" 'właśnie',\n",
" 'wszyscy',\n",
" 'wszystkich',\n",
" 'wszystkie',\n",
" 'wszystkim',\n",
" 'wszystko',\n",
" 'wtedy',\n",
" 'wy',\n",
" 'z',\n",
" 'za',\n",
" 'zaden',\n",
" 'zadna',\n",
" 'zadne',\n",
" 'zadnych',\n",
" 'zapewne',\n",
" 'zawsze',\n",
" 'ze',\n",
" 'zeby',\n",
" 'zeznowu',\n",
" 'zł',\n",
" 'znow',\n",
" 'znowu',\n",
" 'znów',\n",
" 'zostal',\n",
" 'został',\n",
" 'żaden',\n",
" 'żadna',\n",
" 'żadne',\n",
" 'żadnych',\n",
" 'że',\n",
" 'żeby']"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"stopwords"
]
},
{
"cell_type": "code",
"execution_count": 66,
"id": "b13ad981-c0c3-4611-9b6d-63e91af96bf8",
"metadata": {},
"outputs": [],
"source": [
"tokenized_text = [t for t in tokenized_text if t not in stopwords]\n",
"word_counts = Counter(tokenized_text)\n"
]
},
{
"cell_type": "code",
"execution_count": 67,
"id": "55247747-c7f0-491d-82c3-f699a8354541",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'—': 244,\n",
" 'Antek': 61,\n",
" 'A': 50,\n",
" '…': 35,\n",
" 'Ale': 26,\n",
" 'matka': 26,\n",
" 'choć': 19,\n",
" 'raz': 19,\n",
" 'wsi': 17,\n",
" 'W': 17,\n",
" 'chłopca': 16,\n",
" 'miał': 15,\n",
" 'Antka': 15,\n",
" 'domu': 14,\n",
" 'trochę': 14,\n",
" 'Andrzej': 13,\n",
" 'przecie': 12,\n",
" 'chłopiec': 12,\n",
" 'oczy': 12,\n",
" 'Na': 12,\n",
" 'wciąż': 11,\n",
" 'szkoły': 11,\n",
" 'potem': 11,\n",
" 'Potem': 10,\n",
" 'ino': 10,\n",
" 'wdowa': 10,\n",
" 'No': 10,\n",
" 'Jak': 10,\n",
" 'wójtowej': 10,\n",
" 'dzieci': 9,\n",
" 'drugi': 9,\n",
" 'chłopak': 9,\n",
" 'I': 9,\n",
" 'nauczyciel': 9,\n",
" 'poszedł': 9,\n",
" 'kuźni': 9,\n",
" 'Za': 8,\n",
" 'dom': 8,\n",
" 'chaty': 8,\n",
" 'świat': 8,\n",
" 'swoją': 8,\n",
" 'dzień': 8,\n",
" 'chacie': 8,\n",
" 'Co': 8,\n",
" 'chyba': 8,\n",
" 'rzekł': 8,\n",
" 'odparł': 8,\n",
" 'Nie': 8,\n",
" 'kowala': 8,\n",
" 'sołtys': 8,\n",
" 'wieś': 7,\n",
" 'ludzie': 7,\n",
" 'Matka': 7,\n",
" 'mógł': 7,\n",
" 'matki': 7,\n",
" 'matulu': 7,\n",
" 'parę': 7,\n",
" 'majster': 7,\n",
" 'groszy': 7,\n",
" 'izby': 7,\n",
" '„': 7,\n",
" '”': 7,\n",
" 'mówił': 7,\n",
" 'Bo': 7,\n",
" 'rękę': 7,\n",
" 'swój': 7,\n",
" 'krzyżyk': 7,\n",
" 'Wisłą': 6,\n",
" 'ażeby': 6,\n",
" 'Przez': 6,\n",
" 'trzeci': 6,\n",
" 'czas': 6,\n",
" 'chłopcu': 6,\n",
" 'Po': 6,\n",
" 'drogą': 6,\n",
" 'wiatraki': 6,\n",
" 'żal': 6,\n",
" 'naukę': 6,\n",
" 'stary': 6,\n",
" 'wziął': 6,\n",
" 'niby': 6,\n",
" 'dopiero': 6,\n",
" 'be': 6,\n",
" 'szedł': 6,\n",
" 'kowal': 6,\n",
" 'wójtowa': 6,\n",
" 'Od': 5,\n",
" 'Tam': 5,\n",
" 'miała': 5,\n",
" 'takiej': 5,\n",
" 'miejsca': 5,\n",
" 'dziecko': 5,\n",
" 'wiatrak': 5,\n",
" 'Więc': 5,\n",
" 'stało': 5,\n",
" 'krzyż': 5,\n",
" 'wrócił': 5,\n",
" 'razy': 5,\n",
" 'razem': 5,\n",
" 'ręce': 5,\n",
" 'Jużci': 5,\n",
" 'patrzył': 5,\n",
" 'tyle': 5,\n",
" 'To': 5,\n",
" 'myślał': 5,\n",
" 'czterdzieści': 5,\n",
" 'nauki': 5,\n",
" 'prawda': 5,\n",
" 'spytał': 5,\n",
" 'gminy': 5,\n",
" 'góry': 5,\n",
" 'kościele': 5,\n",
" 'Nauczyciel': 5,\n",
" 'tablicy': 5,\n",
" 'rozgrzewkę': 5,\n",
" 'nagle': 5,\n",
" 'figury': 5,\n",
" 'począł': 5,\n",
" 'głowę': 5,\n",
" 'Bóg': 5,\n",
" 'ramieniu': 5,\n",
" 'urodził': 4,\n",
" 'chodziły': 4,\n",
" 'lepiej': 4,\n",
" 'siostra': 4,\n",
" 'świecie': 4,\n",
" 'piecu': 4,\n",
" 'drugą': 4,\n",
" 'szło': 4,\n",
" 'samo': 4,\n",
" 'Tymczasem': 4,\n",
" 'zapytał': 4,\n",
" 'Matulu': 4,\n",
" 'chodzi': 4,\n",
" 'spojrzała': 4,\n",
" 'ręką': 4,\n",
" 'widzisz': 4,\n",
" 'roboty': 4,\n",
" 'dnia': 4,\n",
" 'ludzi': 4,\n",
" 'zboże': 4,\n",
" 'sposobem': 4,\n",
" 'patyki': 4,\n",
" 'Teraz': 4,\n",
" 'rzeczy': 4,\n",
" 'matką': 4,\n",
" 'Już': 4,\n",
" 'Andrzeju': 4,\n",
" 'widział': 4,\n",
" 'Ja': 4,\n",
" 'odpowiadał': 4,\n",
" 'wówczas': 4,\n",
" 'Wtedy': 4,\n",
" 'rzekła': 4,\n",
" 'trzy': 4,\n",
" 'O': 4,\n",
" 'zawołał': 4,\n",
" 'tydzień': 4,\n",
" 'szkole': 4,\n",
" 'pytał': 4,\n",
" 'nóg': 4,\n",
" 'profesorowi': 4,\n",
" 'ręki': 4,\n",
" 'ładny': 4,\n",
" 'nauczył': 4,\n",
" 'Przecie': 4,\n",
" 'matce': 4,\n",
" 'odezwał': 4,\n",
" 'uczy': 4,\n",
" 'profesor': 4,\n",
" 'inni': 4,\n",
" 'Kiedy': 4,\n",
" 'wyraz': 4,\n",
" 'siebie': 4,\n",
" 'szczęście': 4,\n",
" 'miesiąc': 4,\n",
" 'poczęli': 4,\n",
" 'duszy': 4,\n",
" 'rzeźbił': 4,\n",
" 'poszli': 4,\n",
" 'Kowal': 4,\n",
" 'pracy': 4,\n",
" 'karczmy': 4,\n",
" 'chłop': 4,\n",
" 'zwykle': 4,\n",
" 'wójtową': 4,\n",
" 'temu': 4,\n",
" 'Antku': 4,\n",
" 'dolinie': 3,\n",
" 'wzgórza': 3,\n",
" 'ptaki': 3,\n",
" 'środku': 3,\n",
" 'widać': 3,\n",
" 'budynek': 3,\n",
" 'Widać': 3,\n",
" 'mówili': 3,\n",
" 'starą': 3,\n",
" 'rok': 3,\n",
" 'siostrę': 3,\n",
" 'życia': 3,\n",
" 'kamizelkę': 3,\n",
" 'Gdy': 3,\n",
" 'patrzeć': 3,\n",
" 'Wisły': 3,\n",
" 'ziemi': 3,\n",
" 'górę': 3,\n",
" 'prawo': 3,\n",
" 'kartofle': 3,\n",
" 'sercu': 3,\n",
" 'dobre': 3,\n",
" 'odparła': 3,\n",
" 'Cóż': 3,\n",
" 'wiatrakach': 3,\n",
" 'jednego': 3,\n",
" 'chodzić': 3,\n",
" 'początku': 3,\n",
" 'wiatraku': 3,\n",
" 'nareszcie': 3,\n",
" 'bije': 3,\n",
" 'szczury': 3,\n",
" 'noc': 3,\n",
" 'całe': 3,\n",
" 'życie': 3,\n",
" 'strugał': 3,\n",
" 'mały': 3,\n",
" 'Antkowi': 3,\n",
" 'byłby': 3,\n",
" 'Aż': 3,\n",
" 'wielki': 3,\n",
" 'brat': 3,\n",
" 'Wojtek': 3,\n",
" 'drzewo': 3,\n",
" 'wodę': 3,\n",
" 'Chłopak': 3,\n",
" 'Dopiero': 3,\n",
" 'aniżeli': 3,\n",
" 'gospodarz': 3,\n",
" 'świata': 3,\n",
" 'majstra': 3,\n",
" 'rzemiosła': 3,\n",
" 'Oj': 3,\n",
" 'strasznie': 3,\n",
" 'trudno': 3,\n",
" 'dziewczyna': 3,\n",
" 'gorzej': 3,\n",
" 'sześć': 3,\n",
" 'wielką': 3,\n",
" 'chleba': 3,\n",
" 'Wdowa': 3,\n",
" 'patrzy': 3,\n",
" 'piec': 3,\n",
" 'pieca': 3,\n",
" 'zawołała': 3,\n",
" 'Cicho': 3,\n",
" 'matkę': 3,\n",
" 'Pan': 3,\n",
" 'ledwie': 3,\n",
" 'wyście': 3,\n",
" 'prędko': 3,\n",
" 'nikt': 3,\n",
" 'wiedział': 3,\n",
" 'drodze': 3,\n",
" 'źle': 3,\n",
" 'polne': 3,\n",
" 'koniki': 3,\n",
" 'gospodarstwie': 3,\n",
" 'Andrzeja': 3,\n",
" 'kancelarii': 3,\n",
" 'umie': 3,\n",
" 'Jakże': 3,\n",
" 'czego': 3,\n",
" 'Te': 3,\n",
" 'gadać': 3,\n",
" 'abecadło': 3,\n",
" 'umiał': 3,\n",
" 'pisarz': 3,\n",
" 'dni': 3,\n",
" 'ławki': 3,\n",
" 'znak': 3,\n",
" 'literę': 3,\n",
" 'a…': 3,\n",
" 'oddziału': 3,\n",
" 'precel': 3,\n",
" 'nazywa': 3,\n",
" 'Chłopiec': 3,\n",
" 'ramiona': 3,\n",
" 'uczuł': 3,\n",
" 'usłyszał': 3,\n",
" 'druga': 3,\n",
" 'hardy': 3,\n",
" 'taką': 3,\n",
" 'bądź': 3,\n",
" 'później': 3,\n",
" 'gospodyni': 3,\n",
" 'Wreszcie': 3,\n",
" 'tę': 3,\n",
" 'łgarstwo': 3,\n",
" 'Tak': 3,\n",
" 'rubla': 3,\n",
" 'przyszedł': 3,\n",
" 'Nawet': 3,\n",
" 'wyszedł': 3,\n",
" 'człowiek': 3,\n",
" 'mówi': 3,\n",
" 'chłopaku': 3,\n",
" 'pieniędzy': 3,\n",
" 'wiadomo': 3,\n",
" 'swoich': 3,\n",
" 'świętych': 3,\n",
" 'kum': 3,\n",
" 'najwięcej': 3,\n",
" 'Idzie': 3,\n",
" 'polu': 3,\n",
" 'Bogu': 3,\n",
" 'sołtysem': 3,\n",
" 'butelkę': 3,\n",
" 'jutro': 3,\n",
" 'terminatorzy': 3,\n",
" 'trzymał': 3,\n",
" 'chwilę': 3,\n",
" 'kobiety': 3,\n",
" 'mówiła': 3,\n",
" 'Ledwie': 3,\n",
" 'życiu': 3,\n",
" 'ludu': 3,\n",
" 'święta': 3,\n",
" 'sukmanę': 3,\n",
" 'drogi': 3,\n",
" 'kobiałkę': 3,\n",
" 'torbę': 3,\n",
" 'Zdawało': 3,\n",
" 'lasem': 2,\n",
" 'południa': 2,\n",
" 'śpiewały': 2,\n",
" 'wiejskie': 2,\n",
" 'gniazda': 2,\n",
" 'czerwone': 2,\n",
" 'wsią': 2,\n",
" 'zachodu': 2,\n",
" 'tamę': 2,\n",
" 'Wisłę': 2,\n",
" 'ognia': 2,\n",
" 'dawniej': 2,\n",
" 'dziwili': 2,\n",
" 'została': 2,\n",
" 'lata': 2,\n",
" 'Rozalia': 2,\n",
" 'musiał': 2,\n",
" 'przenieść': 2,\n",
" 'kołysał': 2,\n",
" 'wpadł': 2,\n",
" 'dostał': 2,\n",
" 'psy': 2,\n",
" 'leżał': 2,\n",
" 'ojciec': 2,\n",
" 'sukienną': 2,\n",
" 'świń': 2,\n",
" 'oglądał': 2,\n",
" 'stronę': 2,\n",
" 'spod': 2,\n",
" 'pierwszym': 2,\n",
" 'zaraz': 2,\n",
" 'drugie': 2,\n",
" 'czarne': 2,\n",
" 'swoim': 2,\n",
" 'swojej': 2,\n",
" 'nocy': 2,\n",
" 'ciekawość': 2,\n",
" 'zobaczył': 2,\n",
" 'cztery': 2,\n",
" 'skrzydła': 2,\n",
" 'rozumiał': 2,\n",
" 'wnet': 2,\n",
" 'rzecz': 2,\n",
" 'Naprzód': 2,\n",
" 'młynarz': 2,\n",
" 'żonę': 2,\n",
" 'jakim': 2,\n",
" 'śpichrzów': 2,\n",
" 'wyprowadza': 2,\n",
" 'samą': 2,\n",
" 'pierwej': 2,\n",
" 'łeb': 2,\n",
" 'pracę': 2,\n",
" 'dała': 2,\n",
" 'położył': 2,\n",
" 'głodzie': 2,\n",
" 'wypadek': 2,\n",
" 'pory': 2,\n",
" 'słońca': 2,\n",
" 'żony': 2,\n",
" 'bić': 2,\n",
" 'zepsuł': 2,\n",
" 'chałupy': 2,\n",
" 'ojca': 2,\n",
" 'przytłukło': 2,\n",
" 'wielka': 2,\n",
" 'Dziewczyna': 2,\n",
" 'krupnik': 2,\n",
" 'bydła': 2,\n",
" 'Antkiem': 2,\n",
" 'nabili': 2,\n",
" 'płakał': 2,\n",
" 'robił': 2,\n",
" 'bydło': 2,\n",
" 'widząc': 2,\n",
" 'dziewucha': 2,\n",
" 'rozumu': 2,\n",
" 'ludziom': 2,\n",
" 'darmo': 2,\n",
" 'naprzód': 2,\n",
" 'książki': 2,\n",
" 'kumie': 2,\n",
" 'wstyd': 2,\n",
" 'dziecku': 2,\n",
" 'robotę': 2,\n",
" 'robić': 2,\n",
" 'ławie': 2,\n",
" 'chcesz': 2,\n",
" 'Miał': 2,\n",
" 'razu': 2,\n",
" 'myślała': 2,\n",
" 'pomogło': 2,\n",
" 'dziewuchę': 2,\n",
" 'szmaty': 2,\n",
" 'znachorkę': 2,\n",
" 'baba': 2,\n",
" 'koło': 2,\n",
" 'podłogę': 2,\n",
" 'węgle': 2,\n",
" 'sosnowej': 2,\n",
" 'desce': 2,\n",
" 'zdrowaśki': 2,\n",
" 'Rozalię': 2,\n",
" 'rogu': 2,\n",
" 'gorąco': 2,\n",
" 'wyjdzie': 2,\n",
" 'zdrowie': 2,\n",
" 'baby': 2,\n",
" 'poczęła': 2,\n",
" 'rzucać': 2,\n",
" 'rękami': 2,\n",
" 'dyć': 2,\n",
" 'całkiem': 2,\n",
" 'odmawiać': 2,\n",
" 'płaczem': 2,\n",
" 'deskę': 2,\n",
" 'uklękła': 2,\n",
" 'głową': 2,\n",
" 'klepisko': 2,\n",
" 'choroba': 2,\n",
" 'Rozalii': 2,\n",
" 'Alboż': 2,\n",
" 'trumnę': 2,\n",
" 'krzyże': 2,\n",
" 'bok': 2,\n",
" 'idąc': 2,\n",
" 'Musi': 2,\n",
" 'ksiądz': 2,\n",
" 'czterech': 2,\n",
" 'jednej': 2,\n",
" 'dziewuchy': 2,\n",
" 'wzdychał': 2,\n",
" 'spadł': 2,\n",
" 'pomocy': 2,\n",
" 'kuma': 2,\n",
" 'oddać': 2,\n",
" 'budować': 2,\n",
" 'Oho': 2,\n",
" 'tedy': 2,\n",
" 'poszła': 2,\n",
" 'kożuch': 2,\n",
" 'pieniądze': 2,\n",
" 'któremu': 2,\n",
" 'popatrzył': 2,\n",
" 'Ładny': 2,\n",
" 'bym': 2,\n",
" 'wiedzieć': 2,\n",
" 'uczył': 2,\n",
" 'drwa': 2,\n",
" 'paru': 2,\n",
" 'On': 2,\n",
" 'dołu': 2,\n",
" 'znaczy': 2,\n",
" 'pierwszy': 2,\n",
" 'wójt': 2,\n",
" 'Żebym': 2,\n",
" 'karczmie': 2,\n",
" 'stoi': 2,\n",
" 'Tylko': 2,\n",
" 'drzwi': 2,\n",
" 'miały': 2,\n",
" 'twarze': 2,\n",
" 'chodził': 2,\n",
" 'mróz': 2,\n",
" 'liter': 2,\n",
" 'zaczęła': 2,\n",
" 'Patrzcie': 2,\n",
" 'Tę': 2,\n",
" 'łatwo': 2,\n",
" 'wygląda': 2,\n",
" 'zawołali': 2,\n",
" 'chórem': 2,\n",
" 'pierwszego': 2,\n",
" 'głos': 2,\n",
" 'wyrysował': 2,\n",
" 'krzyknął': 2,\n",
" 'środek': 2,\n",
" 'położyli': 2,\n",
" 'Jeszcze': 2,\n",
" 'becz': 2,\n",
" 'hultaju': 2,\n",
" 'miejsce': 2,\n",
" 'trzecią': 2,\n",
" 'Pierwszy': 2,\n",
" 'milczał': 2,\n",
" 'Powtórz': 2,\n",
" 'ośle': 2,\n",
" 'wolno': 2,\n",
" 'nauka': 2,\n",
" 'kuchni': 2,\n",
" 'profesora': 2,\n",
" 'pokarm': 2,\n",
" 'zapytała': 2,\n",
" 'początek': 2,\n",
" 'będziesz': 2,\n",
" 'Ha': 2,\n",
" 'Jednego': 2,\n",
" 'serce': 2,\n",
" 'pożytek': 2,\n",
" 'jaka': 2,\n",
" 'niewiele': 2,\n",
" 'oczami': 2,\n",
" 'okna': 2,\n",
" 'ławy': 2,\n",
" 'zobaczyć': 2,\n",
" 'swego': 2,\n",
" 'Widzisz': 2,\n",
" 'widzę': 2,\n",
" 'znaleźć': 2,\n",
" 'kominie': 2,\n",
" 'baraszkuje': 2,\n",
" 'Jeżeli': 2,\n",
" 'nieszczęście': 2,\n",
" 'miesiące': 2,\n",
" 'idzie': 2,\n",
" 'grosz': 2,\n",
" 'czapkę': 2,\n",
" 'izbie': 2,\n",
" 'patrząc': 2,\n",
" 'zaś': 2,\n",
" 'mógłby': 2,\n",
" 'oboje': 2,\n",
" 'wtrąciła': 2,\n",
" 'jeść': 2,\n",
" 'odzienie': 2,\n",
" 'Kto': 2,\n",
" 'se': 2,\n",
" 'szkolnemu': 2,\n",
" 'pieśni': 2,\n",
" 'gospodarstwa': 2,\n",
" 'drugiej': 2,\n",
" 'stamtąd': 2,\n",
" 'Strugał': 2,\n",
" 'nowy': 2,\n",
" 'koni': 2,\n",
" 'siwej': 2,\n",
" 'Mordce': 2,\n",
" 'wypoczywał': 2,\n",
" 'roli': 2,\n",
" 'resztę': 2,\n",
" 'musiała': 2,\n",
" 'chłopców': 2,\n",
" 'komina': 2,\n",
" 'twarzy': 2,\n",
" 'Nareszcie': 2,\n",
" 'przyjaciel': 2,\n",
" 'Jednej': 2,\n",
" 'niedzieli': 2,\n",
" 'przyjął': 2,\n",
" 'niezgorzej': 2,\n",
" 'wcale': 2,\n",
" 'terminu': 2,\n",
" 'zjedli': 2,\n",
" 'dali': 2,\n",
" 'żelazo': 2,\n",
" 'innym': 2,\n",
" 'żelaza': 2,\n",
" 'robota': 2,\n",
" 'walił': 2,\n",
" 'końcu': 2,\n",
" 'urzędzie': 2,\n",
" 'sosnową': 2,\n",
" 'Pochwalony': 2,\n",
" 'odpowiada': 2,\n",
" 'Niczego': 2,\n",
" 'zęby': 2,\n",
" 'Może': 2,\n",
" 'mogli': 2,\n",
" 'chłopcy': 2,\n",
" 'płukania': 2,\n",
" 'zarobku': 2,\n",
" 'półtora': 2,\n",
" 'znać': 2,\n",
" 'powoli': 2,\n",
" 'płukanie': 2,\n",
" 'potrafi': 2,\n",
" 'najstarszy': 2,\n",
" 'garści': 2,\n",
" 'miewał': 2,\n",
" 'czasie': 2,\n",
" 'podkowę': 2,\n",
" 'gęby': 2,\n",
" 'obejrzał': 2,\n",
" 'narzędzi': 2,\n",
" 'uciekł': 2,\n",
" 'oddała': 2,\n",
" 'progu': 2,\n",
" 'Antkowa': 2,\n",
" 'bystry': 2,\n",
" 'poznał': 2,\n",
" 'pilnik': 2,\n",
" 'święci': 2,\n",
" 'coraz': 2,\n",
" 'bestyja': 2,\n",
" 'Był': 2,\n",
" 'zręczny': 2,\n",
" 'włosy': 2,\n",
" 'znam': 2,\n",
" 'kobieta': 2,\n",
" 'małżeństwa': 2,\n",
" 'wydał': 2,\n",
" 'Wójtowa': 2,\n",
" 'weselu': 2,\n",
" 'wójta': 2,\n",
" 'goście': 2,\n",
" 'strzelcy': 2,\n",
" 'miesięczną': 2,\n",
" 'pensję': 2,\n",
" 'czwarty': 2,\n",
" 'zaczęło': 2,\n",
" 'pomiędzy': 2,\n",
" 'sumę': 2,\n",
" 'ciasno': 2,\n",
" 'modlił': 2,\n",
" 'świętego': 2,\n",
" 'wielkim': 2,\n",
" 'ołtarzu': 2,\n",
" 'wyżej': 2,\n",
" 'Wtem': 2,\n",
" 'ciżbą': 2,\n",
" 'ramion': 2,\n",
" 'paciorków': 2,\n",
" 'znikła': 2,\n",
" 'ławce': 2,\n",
" 'kościół': 2,\n",
" 'podniesienie': 2,\n",
" 'zrobiło': 2,\n",
" 'twarz': 2,\n",
" 'sumie': 2,\n",
" 'gorzelnik': 2,\n",
" 'trzeciej': 2,\n",
" 'pole': 2,\n",
" 'chleb': 2,\n",
" 'ostatni': 2,\n",
" 'mignęło': 2,\n",
" 'oczach': 2,\n",
" 'wśród': 2,\n",
" 'niebo': 2,\n",
" 'zamiast': 2,\n",
" 'znalazł': 2,\n",
" 'kościoła': 2,\n",
" 'myśląc': 2,\n",
" 'spojrzy': 2,\n",
" 'Wówczas': 2,\n",
" 'piękny': 2,\n",
" 'ofiarować': 2,\n",
" 'lewym': 2,\n",
" 'Wisła': 2,\n",
" 'opuścić': 2,\n",
" 'narzędziami': 2,\n",
" 'dar': 2,\n",
" 'drogę': 2,\n",
" 'Czy': 2,\n",
" 'swoimi': 2,\n",
" 'rękoma': 2,\n",
" 'takiego': 2,\n",
" 'powiedział': 2,\n",
" 'szmer': 2,\n",
" 'jedną': 2,\n",
" 'chwili': 2,\n",
" 'tajemną': 2,\n",
" 'miłość': 2,\n",
" 'ucałował': 2,\n",
" 'daj': 2,\n",
" 'Kumie': 2,\n",
" 'synusiu': 2,\n",
" 'Boże': 2,\n",
" 'święty': 2,\n",
" 'miłościw': 2,\n",
" 'Bądź': 2,\n",
" 'obcych': 2,\n",
" 'uszedł': 2,\n",
" 'Niech': 2,\n",
" 'błogosławi': 2,\n",
" 'Zostańcie': 2,\n",
" 'Bogiem': 2,\n",
" 'szarym': 2,\n",
" 'Bolesław': 1,\n",
" 'Prus': 1,\n",
" 'Wieś': 1,\n",
" 'leżała': 1,\n",
" 'niewielkiej': 1,\n",
" 'północy': 1,\n",
" 'otaczały': 1,\n",
" 'spadziste': 1,\n",
" 'porosłe': 1,\n",
" 'sosnowym': 1,\n",
" 'garbate': 1,\n",
" 'zasypane': 1,\n",
" 'leszczyną': 1,\n",
" 'tarniną': 1,\n",
" 'głogiem': 1,\n",
" 'najgłośniej': 1,\n",
" 'najczęściej': 1,\n",
" 'rwać': 1,\n",
" 'orzechy': 1,\n",
" 'wybierać': 1,\n",
" 'Kiedyś': 1,\n",
" 'stanął': 1,\n",
" 'zdawało': 1,\n",
" 'oba': 1,\n",
" 'pasma': 1,\n",
" 'gór': 1,\n",
" 'biegną': 1,\n",
" 'zetknąć': 1,\n",
" 'rana': 1,\n",
" 'wstaje': 1,\n",
" 'słońce': 1,\n",
" 'złudzenie': 1,\n",
" 'ciągnęła': 1,\n",
" 'wzgórzami': 1,\n",
" 'dolina': 1,\n",
" 'przecięta': 1,\n",
" 'rzeczułką': 1,\n",
" 'przykryta': 1,\n",
" 'zieloną': 1,\n",
" 'łąką': 1,\n",
" 'pasano': 1,\n",
" 'bydlątka': 1,\n",
" 'cienkonogie': 1,\n",
" 'bociany': 1,\n",
" 'polować': 1,\n",
" 'żaby': 1,\n",
" 'kukające': 1,\n",
" 'wieczorami': 1,\n",
" 'tamą': 1,\n",
" 'wapienne': 1,\n",
" 'nagie': 1,\n",
" 'Każdy': 1,\n",
" 'chłopski': 1,\n",
" 'szarą': 1,\n",
" 'słomą': 1,\n",
" 'pokryty': 1,\n",
" 'ogródek': 1,\n",
" 'ogródku': 1,\n",
" 'śliwki': 1,\n",
" 'węgierki': 1,\n",
" 'spomiędzy': 1,\n",
" 'komin': 1,\n",
" 'sadzą': 1,\n",
" 'uczerniony': 1,\n",
" 'pożarną': 1,\n",
" 'drabinkę': 1,\n",
" 'Drabiny': 1,\n",
" 'zaprowadzono': 1,\n",
" 'dawna': 1,\n",
" 'myśleli': 1,\n",
" 'chronić': 1,\n",
" 'bocianie': 1,\n",
" 'Toteż': 1,\n",
" 'płonął': 1,\n",
" 'ratowali': 1,\n",
" 'gospodarza': 1,\n",
" 'dopust': 1,\n",
" 'boski': 1,\n",
" 'Spalił': 1,\n",
" 'nową': 1,\n",
" 'drabinę': 1,\n",
" 'zapłacił': 1,\n",
" 'śtraf': 1,\n",
" 'połamane': 1,\n",
" 'szczeble': 1,\n",
" 'Położyli': 1,\n",
" 'niemalowanej': 1,\n",
" 'kołysce': 1,\n",
" 'zmarłym': 1,\n",
" 'bracie': 1,\n",
" 'sypiał': 1,\n",
" 'przyszła': 1,\n",
" 'ustąpić': 1,\n",
" 'osoba': 1,\n",
" 'dorosła': 1,\n",
" 'ławę': 1,\n",
" 'następny': 1,\n",
" 'rozglądał': 1,\n",
" 'Raz': 1,\n",
" 'rzekę': 1,\n",
" 'batem': 1,\n",
" 'przejezdnego': 1,\n",
" 'furmana': 1,\n",
" 'konie': 1,\n",
" 'stratowały': 1,\n",
" 'pogryzły': 1,\n",
" 'tygodnie': 1,\n",
" 'Doświadczył': 1,\n",
" 'niemało': 1,\n",
" 'czwartym': 1,\n",
" 'podarował': 1,\n",
" 'mosiężnym': 1,\n",
" 'guzikiem': 1,\n",
" 'kazała': 1,\n",
" 'nosić': 1,\n",
" 'pięć': 1,\n",
" 'użyto': 1,\n",
" 'pasania': 1,\n",
" 'Wolał': 1,\n",
" 'wapiennym': 1,\n",
" 'wzgórzem': 1,\n",
" 'pokazywało': 1,\n",
" 'wysokiego': 1,\n",
" 'czarnego': 1,\n",
" 'Wyłaziło': 1,\n",
" 'lewej': 1,\n",
" 'strony': 1,\n",
" 'upadało': 1,\n",
" 'trzecie': 1,\n",
" 'wysokie': 1,\n",
" 'świnie': 1,\n",
" 'obyczajem': 1,\n",
" 'wlazły': 1,\n",
" 'spostrzegłszy': 1,\n",
" 'zawinęła': 1,\n",
" 'wedle': 1,\n",
" 'sukiennej': 1,\n",
" 'kamizelki': 1,\n",
" 'Antkowej': 1,\n",
" 'tchu': 1,\n",
" 'złapać': 1,\n",
" 'zawziętości': 1,\n",
" 'wykrzyczawszy': 1,\n",
" 'wydrapawszy': 1,\n",
" 'kierunku': 1,\n",
" 'Antkowego': 1,\n",
" 'palca': 1,\n",
" 'przysłoniła': 1,\n",
" 'pilnuj': 1,\n",
" 'pokrzywami': 1,\n",
" 'wysmaruję': 1,\n",
" 'Acha': 1,\n",
" 'At': 1,\n",
" 'głupiś': 1,\n",
" 'uciekła': 1,\n",
" 'Gdzie': 1,\n",
" 'rozum': 1,\n",
" 'udzielania': 1,\n",
" 'objaśnień': 1,\n",
" 'spokojności': 1,\n",
" 'dawał': 1,\n",
" 'widywał': 1,\n",
" 'Widywał': 1,\n",
" 'sen.': 1,\n",
" 'straszna': 1,\n",
" 'urosła': 1,\n",
" 'zakradł': 1,\n",
" 'promu': 1,\n",
" 'rzeki': 1,\n",
" 'przewoził': 1,\n",
" 'popłynął': 1,\n",
" 'Popłynął': 1,\n",
" 'wdrapał': 1,\n",
" 'wapienną': 1,\n",
" 'akurat': 1,\n",
" 'miejscu': 1,\n",
" 'ogłoszenie': 1,\n",
" 'tędy': 1,\n",
" 'Wydał': 1,\n",
" 'dzwonnica': 1,\n",
" 'grubszy': 1,\n",
" 'dzwonnicy': 1,\n",
" 'okno': 1,\n",
" 'tęgie': 1,\n",
" 'ustawione': 1,\n",
" 'Z': 1,\n",
" 'objaśnili': 1,\n",
" 'pastuchowie': 1,\n",
" 'dowiedział': 1,\n",
" 'dmucha': 1,\n",
" 'wiater': 1,\n",
" 'kręci': 1,\n",
" 'liśćmi': 1,\n",
" 'Dalej': 1,\n",
" 'miele': 1,\n",
" 'mąkę': 1,\n",
" 'siedzi': 1,\n",
" 'mądry': 1,\n",
" 'wie': 1,\n",
" 'poglądowej': 1,\n",
" 'lekcji': 1,\n",
" 'tą': 1,\n",
" 'Dali': 1,\n",
" 'przewoźnicy': 1,\n",
" 'krwawą': 1,\n",
" 'kontent': 1,\n",
" 'zaspokoił': 1,\n",
" 'spać': 1,\n",
" 'marzył': 1,\n",
" 'całą': 1,\n",
" 'mieli': 1,\n",
" 'młynarzu': 1,\n",
" 'Drobny': 1,\n",
" 'stanowczo': 1,\n",
" 'wpłynął': 1,\n",
" 'wschodu': 1,\n",
" 'układał': 1,\n",
" 'wystrugał': 1,\n",
" 'kolumnę': 1,\n",
" 'próbował': 1,\n",
" 'obciosywał': 1,\n",
" 'ustawiał': 1,\n",
" 'wybudował': 1,\n",
" 'wiatraczek': 1,\n",
" 'wietrze': 1,\n",
" 'obracał': 1,\n",
" 'tamten': 1,\n",
" 'radość': 1,\n",
" 'brakowało': 1,\n",
" 'prawdziwy': 1,\n",
" 'Do': 1,\n",
" 'dziesiątego': 1,\n",
" 'koziki': 1,\n",
" 'dziwne': 1,\n",
" 'Robił': 1,\n",
" 'płoty': 1,\n",
" 'drabiny': 1,\n",
" 'studnie': 1,\n",
" 'zastanawiali': 1,\n",
" 'gałgan': 1,\n",
" 'podrosła': 1,\n",
" 'lesie': 1,\n",
" 'Rozalią': 1,\n",
" 'wygoda': 1,\n",
" 'zimą': 1,\n",
" 'zamiatała': 1,\n",
" 'izbę': 1,\n",
" 'nosiła': 1,\n",
" 'potrafiła': 1,\n",
" 'ugotować': 1,\n",
" 'Latem': 1,\n",
" 'posyłano': 1,\n",
" 'zajęty': 1,\n",
" 'struganiem': 1,\n",
" 'dopilnował': 1,\n",
" 'naprosili': 1,\n",
" 'napłakali': 1,\n",
" 'krzyczał': 1,\n",
" 'obiecywał': 1,\n",
" 'szkodę': 1,\n",
" 'właziło': 1,\n",
" 'pasła': 1,\n",
" 'pilnowała': 1,\n",
" 'krów': 1,\n",
" 'Nieraz': 1,\n",
" 'młodsza': 1,\n",
" 'chęci': 1,\n",
" 'załamywała': 1,\n",
" 'żalu': 1,\n",
" 'lamentowała': 1,\n",
" 'starym': 1,\n",
" 'kumem': 1,\n",
" 'Andrzejem': 1,\n",
" 'pocznę': 1,\n",
" 'nieszczęsna': 1,\n",
" 'odmieńcem': 1,\n",
" 'Ani': 1,\n",
" 'zrobi': 1,\n",
" 'doglądnie': 1,\n",
" 'kraje': 1,\n",
" 'wstąpiło': 1,\n",
" 'parobek': 1,\n",
" 'darmozjad': 1,\n",
" 'śmiech': 1,\n",
" 'obrazę': 1,\n",
" 'boską': 1,\n",
" 'młodu': 1,\n",
" 'praktykował': 1,\n",
" 'flisactwo': 1,\n",
" 'pocieszał': 1,\n",
" 'strapioną': 1,\n",
" 'wdowę': 1,\n",
" ...})"
]
},
"execution_count": 67,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"word_counts"
]
},
{
"cell_type": "code",
"execution_count": 68,
"id": "3ba3ca17-b2f9-45d8-863f-bb6ff86b68d5",
"metadata": {},
"outputs": [],
"source": [
"word_counts = Counter(tokenized_text)\n",
"\n",
"words = list(word_counts.keys())\n",
"frequencies = list(word_counts.values())"
]
},
{
"cell_type": "code",
"execution_count": 69,
"id": "aa1acf8f-eaf3-430c-a67e-ec1b97d4ebd2",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA2EAAAKOCAYAAAAvaRHcAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy81sbWrAAAACXBIWXMAAA9hAAAPYQGoP6dpAAEAAElEQVR4nOzdd3RUdf7/8dedmWRSSCGEJIQECEV6FURBBUGwoih21l6wgIuK2HfBRdT92nWta3cRdW2sKCusK10FBCEoCFJEek1IIW3evz/4zd0MSWgJE8Dn45x7IJ97597P+869M/c1d+4dx8xMAAAAAICw8NR2BwAAAADg94QQBgAAAABhRAgDAAAAgDAihAEAAABAGBHCAAAAACCMCGEAAAAAEEaEMAAAAAAII0IYAAAAAIQRIQwAAAAAwogQBuB34aqrrpLjOHrjjTdquyuopiZNmshxHK1ataq2uxLi66+/luM46t27d213pdreeOMNOY6jq666qra7UqlRo0bJcRyNGjWqtrtSqcN1GwVw+CCEAajg559/luM48ng82rp1a6XTvPnmm3IcR47j6IMPPqh0mnXr1rnTHIkHI8ED0b0NiYmJtd1N/M7sa5usbDgaguHRqnfv3gf1nB5qq1at0qhRo/jgCjhEfLXdAQCHn2OOOUapqanauHGjZs6cqXPOOafCNDNmzHD/P336dF144YUVppk+fbokKSMjQ02aNDlk/T3U/H6/unbtWum4uLi4MPcGzZo1U1RUlCIiImq7KyFiYmLUsmVLNWrU6JAup2fPnhXacnJylJ2dXeX49u3bH9I+hVtycrJatmyp5OTk2u5KtbVv316lpaUV2mfOnClJateunRISEsLdLa1atUqjR49Wr169DtszosCRjBAGoFInnXSS/vnPf2r69OlVhrDgWaBg2KpsmuC8jmRpaWkhoRO16z//+U9td6FSxx13nJYsWXLIl1PZtvj111/rlFNOqXL80Wbo0KEaOnRobXejRjz77LOVtgfPdj377LOcyQSOQnwdEUClgsGpsgO6LVu2aMmSJerRo4dOOOEELVy4ULm5uRWmC4azk08++dB2FgAA4AhCCANQqWAImzdvngoKCkLGBYPZiSeeqJ49eyoQCGjWrFkh0+Tm5mrRokUh8wpavHixLr/8cmVkZCgyMlKpqakaNGiQvvnmm0r7Uv6mGitXrtRVV12lhg0byufzhVyYn5+fr3vuuUdZWVmKiopSkyZNdMcddygvL69a62J/lL+RQX5+vu69914dc8wxioqKqvAp9nfffadLLrlEDRs2dOu/8MILNX/+/Crnv3r1av3hD39QSkqKYmJi1KFDB/3tb3+TmVV5E4B9XTuyt5sHmJnGjx+vfv36qV69evL7/WratKluvfVWbdiwocL05W9KEQgE9PTTT6tdu3aKiopSamqqrr32Wm3evLnKvmzbtk1//vOf1blzZ8XHx6tOnTpq3bq1brzxxgrrZV83Pfj3v/+tc845R6mpqfL7/crIyNDVV1+tX375pdLps7OzNXjwYGVmZioyMlKJiYlq0aKFLrvsMk2aNKnKPu9tHZS3atUqOY7jfiX3nXfeUdeuXRUTE6OkpCRdeOGFWrFixX4v52Bs3bpVI0eOVMuWLRUdHa26deuqd+/e+sc//iEzO6B5/fbbb2rdurUcx9GQIUMUCATccdu2bdN9992ndu3aKTY2VnFxcTr++OP1yiuvhEwXVH7fXrduna655ho1aNBAUVFRatu2rf72t79V2ofKbswRXM/7Giq7xqmm98lw2N/t/Oeff1ZsbKy8Xq+mTp1aYT6bNm1SSkqKHMfRuHHjJO2+Ti14ZnXq1Kkh6+9I/mo5cFgxAKhEWVmZJSQkmCT76quvQsbdcccdJsmmTZtmX3/9tUmye++9N2SaL774wiRZvXr1LBAIuO2ffvqp+f1+k2SJiYnWtWtXq1+/vkkyj8djL7/8coW+XHnllSbJ7r77bktMTDS/329dunSxVq1a2ahRo8zMLC8vz4477jiTZI7jWLt27axNmzbmOI516dLFLrnkEpNkr7/++n6vg9dff90kWePGjfd72osuusi6dOlijuNY69atrXPnzta/f393uieeeMIcxzFJlpSUZJ07d7Z69eqZJIuIiLAPP/ywwrx//PFHd5qoqCg79thjrVGjRibJbr75ZmvcuLFJspUrV4Y8TpLt7WW+qscVFxfbhRde6D4+PT3dOnbsaDExMSbJGjRoYEuXLg15zH//+1+TZL169bLLLrvMJFmLFi2sbdu25vP5TJK1bdvWdu3aVaEfCxYssPT0dHcbaNOmjXXq1Mni4+NNkl155ZX71W8zsz/+8Y9uv1NSUqxz587ufOLj423mzJkh03/77bcWHR1tkiwhIcE6duxo7dq1c7f9c889t8r1t6fy66C8lStXutvR3Xff7f6/Y8eO7r7QoEED27x5834vq6plV/Z8L1u2zDIzM02SRUZGWpcuXaxp06bu9FdccUXIPmr2v+15z3W/fPlya9KkiUmyESNGhIzLzs62hg0bustp06aNNWvWzN3eL7jgggrLCe7bo0aNsrS0NIuKirIuXbq424MkGzNmTIWa/vznP5sk+/Of/+y2rV+/3nr27Fnl4PF4Kn0NCOc+eSCC9f/3v/+tMO5At/O//e1vJskaNWpkO3bsCBk3YMAAk2QXX3yx2zZ06FBr166dO7/y6/GCCy446JoA/A8hDECVzjjjDJNkDz74YEj7cccdZ5GRkVZYWGgFBQUWERFhJ598csg09957b4WD2LVr17oHCn/84x+tqKjIzHYHvoceesg96Pnhhx9C5hU8UPN6vXbOOefY1q1b3XGFhYVmZnbbbbe5B7fZ2dnu+AULFljDhg0tIiIiLCHM6/XaMcccYz/++GOFPn7xxRfmOI4lJydXOLD7+9//bj6fz+Li4mzdunVueyAQsC5dupgkO+2000Jqf/fddy0iIsINOTUVwoJBoXPnzjZ//ny3vaCgwG6++WaTZF27dg15TDAEREREWHp6un377bfuuKVLl1pGRoZJshdeeCHkcTk5Oe7B6+mnn25r1qwJGT9t2jR755139qvfL774okmyrKyskAPX0tJSGzNmjEmyjIwM9/kwMzv77LPdDxGC22PQnDlz7B//+Eel664y+wphPp/P4uPj7fPPP3fHrV+/3jp06GCS7K677trvZVW17D2f70AgYF27dnX7tWHDBnfcF198YbGxsSbJnn/++ZDHVRbCFi1aZGlpaZW+JuTl5VmzZs1Mkt16662Wk5Pjjlu8eLG1bdvWJNlzzz0X8rjgvh0REWEXXHCBbd++3R33/PPPuyGnfLtZ5SFsb8aOHeuGkE2bNoWsg3DukweiqhB2MNu5mdnpp59ukuwPf/iD2/bSSy+ZJGvYsKFt27YtZPqqtmcANYMQBqBKwQOXfv36uW35+fkWERFhPXr0cNu6d+9uUVFRIQexJ598skmyxx57zG277777TJJ16tSp0uWdeeaZJskuv/zykPbggVpaWprl5eVVeFxubq57lmbixIkVxn/00UfuAc3BhLC9DcGDoPLTzps3r9L5BQ/cPv3000rHB88wlj/AnTJlikmy6OjoSs+U3Hrrre5yayKEbdq0yfx+v8XHx1cIRGa7A3O3bt1M2n0mNKh8CKjszMEzzzxjkuycc84Jaf/rX/9qkqx169aVniXb334XFRVZWlqaeb1e+/777yt93KBBg0ySvfXWW25by5YtTVJIaDhY+wphkuzxxx+v8LgJEyaYJOvQoUO1l73n8z158mSTZH6/39avX1/hccH137hx45CzVHuGsO+++86SkpLMcRx7+umnK8wn+Pyed955lfbvhx9+MMdxrGnTpiHt+9q3g/vMRx99FNJ+ICHss88+M4/HYzExMRW2jXDvkweishB2sNu5mdm6devcs3fvvfeeLV++3OrUqWOO49iXX35ZYT6EMODQ4powAFUKXss1e/ZslZWVSZK++eYblZSU6MQTT3Sn69mzp3bt2qU5c+ZIkoqLi/Xdd99JCr0px5dffilJVd7V7I9//GPIdHs
"text/plain": [
"<Figure size 1000x600 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(10, 6))\n",
"plt.bar(words, frequencies)\n",
"\n",
"# Customize the plot\n",
"plt.title(\"Word Frequencies in Tokenized Text\", fontsize=16)\n",
"plt.xlabel(\"Words\", fontsize=14)\n",
"plt.ylabel(\"Frequencies\", fontsize=14)\n",
"plt.xticks(rotation=45, fontsize=12)\n",
"plt.grid(axis='y', linestyle='--', alpha=0.7)"
]
},
{
"cell_type": "code",
"execution_count": 92,
"id": "0e81b1fe-478d-4d87-9d19-33b07ef733cd",
"metadata": {},
"outputs": [],
"source": [
"most_common = word_counts.most_common(30)"
]
},
{
"cell_type": "code",
"execution_count": 93,
"id": "2a5acec4-ee38-4e49-aeb3-c19462597371",
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"[('—', 244),\n",
" ('Antek', 61),\n",
" ('A', 50),\n",
" ('…', 35),\n",
" ('Ale', 26),\n",
" ('matka', 26),\n",
" ('choć', 19),\n",
" ('raz', 19),\n",
" ('wsi', 17),\n",
" ('W', 17),\n",
" ('chłopca', 16),\n",
" ('miał', 15),\n",
" ('Antka', 15),\n",
" ('domu', 14),\n",
" ('trochę', 14),\n",
" ('Andrzej', 13),\n",
" ('przecie', 12),\n",
" ('chłopiec', 12),\n",
" ('oczy', 12),\n",
" ('Na', 12),\n",
" ('wciąż', 11),\n",
" ('szkoły', 11),\n",
" ('potem', 11),\n",
" ('Potem', 10),\n",
" ('ino', 10),\n",
" ('wdowa', 10),\n",
" ('No', 10),\n",
" ('Jak', 10),\n",
" ('wójtowej', 10),\n",
" ('dzieci', 9)]"
]
},
"execution_count": 93,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"most_common"
]
},
{
"cell_type": "code",
"execution_count": 94,
"id": "52a620b4-554f-4769-83ff-b971d5fa12b2",
"metadata": {},
"outputs": [],
"source": [
"words = [x[0] for x in most_common]\n",
"frequencies = [x[1] for x in most_common]"
]
},
{
"cell_type": "code",
"execution_count": 95,
"id": "035efcdc-1811-4fa3-99f1-add11b7ce319",
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"['—',\n",
" 'Antek',\n",
" 'A',\n",
" '…',\n",
" 'Ale',\n",
" 'matka',\n",
" 'choć',\n",
" 'raz',\n",
" 'wsi',\n",
" 'W',\n",
" 'chłopca',\n",
" 'miał',\n",
" 'Antka',\n",
" 'domu',\n",
" 'trochę',\n",
" 'Andrzej',\n",
" 'przecie',\n",
" 'chłopiec',\n",
" 'oczy',\n",
" 'Na',\n",
" 'wciąż',\n",
" 'szkoły',\n",
" 'potem',\n",
" 'Potem',\n",
" 'ino',\n",
" 'wdowa',\n",
" 'No',\n",
" 'Jak',\n",
" 'wójtowej',\n",
" 'dzieci']"
]
},
"execution_count": 95,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"words"
]
},
{
"cell_type": "code",
"execution_count": 96,
"id": "8b46b5b0-9666-4af3-920b-fd394ced9e4d",
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"[244,\n",
" 61,\n",
" 50,\n",
" 35,\n",
" 26,\n",
" 26,\n",
" 19,\n",
" 19,\n",
" 17,\n",
" 17,\n",
" 16,\n",
" 15,\n",
" 15,\n",
" 14,\n",
" 14,\n",
" 13,\n",
" 12,\n",
" 12,\n",
" 12,\n",
" 12,\n",
" 11,\n",
" 11,\n",
" 11,\n",
" 10,\n",
" 10,\n",
" 10,\n",
" 10,\n",
" 10,\n",
" 10,\n",
" 9]"
]
},
"execution_count": 96,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"frequencies"
]
},
{
"cell_type": "code",
"execution_count": 97,
"id": "fee70bf9-0011-4a9e-b5a8-b02557a58010",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA1cAAAJeCAYAAABYuv+MAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy81sbWrAAAACXBIWXMAAA9hAAAPYQGoP6dpAADNRklEQVR4nOzdd3gUVfvw8XtSaSkEkpCQQoAAgdBD7x0EEaSoKE0QGygiKuoPCYjYsWB9HhAQRFCxPiiCKL03KVKlSJWeQAJpe79/8O64m2xCwg4Q8Pu5Li7NmbNnzpmyO/ecM2cMVVUBAAAAALjF40ZXAAAAAABuBQRXAAAAAGABgisAAAAAsADBFQAAAABYgOAKAAAAACxAcAUAAAAAFiC4AgAAAAALEFwBAAAAgAUIrgAAAADAAgRXAP4VBgwYIIZhyLRp0250VeCmcuXKiWEYcuDAgRtdFSeLFy8WwzCkZcuWN7oqbps2bZoYhiEDBgy40VVxKTExUQzDkMTExBtdFZcK6zEK4NojuAKQw+7du8UwDPHw8JDTp0+7zDN9+nQxDEMMw5Avv/zSZZ6jR4+aeW7Giwz7BWZe/wIDA290NfEvc6Vj0tW/WyHgu1W1bNnyqvbptXbgwAFJTEzkhhRQQF43ugIACp9KlSpJaGio/P3337JixQrp2rVrjjzLly83/3/ZsmXSq1evHHmWLVsmIiIRERFSrly5a1bfa83X11cSEhJcLvPz87vOtUGFChWkSJEi4u3tfaOr4qRYsWJSuXJliYqKuqbradKkSY60pKQk2bZtW67Lq1evfk3rdL2VLl1aKleuLKVLl77RVXFb9erVJTMzM0f6ihUrREQkPj5eAgICrne15MCBAzJ27Fhp0aJFoe3BBAojgisALjVr1ky++uorWbZsWa7Blb3Xxh5EucpjL+tmVqZMGadgEjfWokWLbnQVXKpfv77s3Lnzmq/H1bG4ePFiadWqVa7LbzVDhw6VoUOH3uhqWGLSpEku0+29U5MmTaLnEbiJMCwQgEv2gMjVhdqpU6dk586d0rhxY2nUqJFs2bJFkpOTc+SzB13Nmze/tpUFAAAoBAiuALhkD642bNggqampTsvsAVfTpk2lSZMmYrPZZOXKlU55kpOTZevWrU5l2W3fvl369u0rERER4uPjI6GhodKjRw9ZvXq1y7o4Tkaxf/9+GTBggJQtW1a8vLycHmhPSUmRZ599VmJiYqRIkSJSrlw5efLJJ+XChQtubYv8cJwAICUlRZ577jmpVKmSFClSJMdd57Vr18rdd98tZcuWNdvfq1cv2bRpU67lHzx4UO677z4JCQmRYsWKSY0aNeT9998XVc314fkrPZuR10P3qiqzZ8+Wdu3aSalSpcTX11fKly8vjz32mBw/fjxHfsfJHGw2m7zzzjsSHx8vRYoUkdDQUBk0aJCcPHky17qcOXNGxowZI7Vr1xZ/f38pUaKExMXFyUMPPZRju1xpsoCff/5ZunbtKqGhoeLr6ysREREycOBA+fPPP13m37Ztm9x7770SGRkpPj4+EhgYKLGxsdKnTx+ZP39+rnXOaxs4OnDggBiGYQ6NnTlzpiQkJEixYsUkKChIevXqJfv27cv3eq7G6dOn5emnn5bKlStL0aJFpWTJktKyZUv57LPPRFULVNbhw4clLi5ODMOQBx98UGw2m7nszJkz8vzzz0t8fLwUL15c/Pz8pGHDhvLf//7XKZ+d47l99OhRuf/++yUsLEyKFCki1apVk/fff99lHVxNaGHfzlf65+oZIqvPyeshv8f57t27pXjx4uLp6SlLlizJUc6JEyckJCREDMOQWbNmicjl58DsPaFLlixx2n438xBv4LpQAHAhKytLAwICVET0119/dVr25JNPqojo0qVLdfHixSoi+txzzznl+emnn1REtFSpUmqz2cz07777Tn19fVVENDAwUBMSEjQ4OFhFRD08PPQ///lPjrr0799fRURHjRqlgYGB6uvrq3Xq1NEqVapoYmKiqqpeuHBB69evryKihmFofHy8Vq1aVQ3D0Dp16ujdd9+tIqJTp07N9zaYOnWqiohGR0fnO2/v3r21Tp06ahiGxsXFae3atbV9+/ZmvokTJ6phGCoiGhQUpLVr19ZSpUqpiKi3t7fOnTs3R9l//PGHmadIkSJat25djYqKUhHRRx55RKOjo1VEdP/+/U6fExHN62s+t8+lp6drr169zM+Hh4drzZo1tVixYioiGhYWprt27XL6zG+//aYioi1atNA+ffqoiGhsbKxWq1ZNvby8VES0WrVqeunSpRz12Lx5s4aHh5vHQNWqVbVWrVrq7++vIqL9+/fPV71VVR9//HGz3iEhIVq7dm2zHH9/f12xYoVT/jVr1mjRokVVRDQgIEBr1qyp8fHx5rF/xx135Lr9snPcBo72799vHkejRo0y/79mzZrmuRAWFqYnT57M97pyW7er/b1nzx6NjIxUEVEfHx+tU6eOli9f3szfr18/p3NU9Z/jOfu237t3r5YrV05FREeOHOm0bNu2bVq2bFlzPVWrVtUKFSqYx3vPnj1zrMd+bicmJmqZMmW0SJEiWqdOHfN4EBEdP358jjaNGTNGRUTHjBljph07dkybNGmS6z8PDw+X3wHX85wsCHv7f/vttxzLCnqcv//++yoiGhUVpefOnXNadvvtt6uI6F133WWmDR06VOPj483yHLdjz549r7pNwL8BwRWAXHXq1ElFRMeNG+eUXr9+ffXx8dGLFy9qamqqent7a/PmzZ3yPPfcczkuTo8cOWJeADz++OOalpamqpcDuZdeesm8mPn999+dyrJfgHl6emrXrl319OnT5rKLFy+qquoTTzxhXrRu27bNXL5582YtW7asent7X5fgytPTUytVqqR//PFHjjr+9NNPahiGli5dOscF2+TJk9XLy0v9/Pz06NGjZrrNZtM6deqoiGiHDh2c2v7555+rt7e3GbxYFVzZA4DatWvrpk2bzPTU1FR95JFHVEQ0ISHB6TP2i3tvb28NDw/XNWvWmMt27dqlERERKiL64YcfOn0uKSnJvCjt2LGjHjp0yGn50qVLdebMmfmq90cffaQiojExMU4XpJmZmTp+/HgVEY2IiDD3h6pqly5dzJsD9uPRbt26dfrZZ5+53HauXCm48vLyUn9/f/3xxx/NZceOHdMaNWqoiOgzzzyT73Xltu7s+9tms2lCQoJZr+PHj5vLfvrpJy1evLiKiH7wwQdOn3MVXG3dulXLlCnj8jvhwoULWqFCBRURfeyxxzQpKclctn37dq1WrZqKiL733ntOn7Of297e3tqzZ089e/asueyDDz4wgxfHdFXXwVVeJkyYYAYXJ06ccNoG1/OcLIjcgqurOc5VVTt27Kgiovfdd5+Z9vHHH6uIaNmyZfXMmTNO+XM7ngHkjeAKQK7sFyTt2rUz01JSUtTb21sbN25spjVo0ECLFCnidHHavHlzFRF94403zLTnn39eRURr1arlcn233Xabioj27dvXKd1+AVamTBm9cOFCjs8lJyebvSrz5s3Lsfzrr782L1SuJrjK65/94sYx74YNG1yWZ78g++6771wut/cIOl64/vLLLyoiWrRoUZc9G4899pi5XiuCqxMnTqivr6/6+/vnCHRULwfC9erVU5HLPZd2jhf3ru70v/vuuyoi2rVrV6f01157TUVE4+LiXPZq5bfeaWlpWqZMGfX09NSNGze6/FyPHj1URPTTTz810ypXrqwi4hQMXK0rBVciom+++WaOz33//fcqIlqjRg231519fy9cuFBFRH19ffXYsWM5Pmff/tHR0U69StmDq7Vr12pQUJAahqHvvPNOjnLs+7d79+4u6/f777+rYRhavnx5p/Qrndv2c+brr792Si9IcPW///1PPTw8tFixYjmOjet9ThaEq+Dqao9zVdWjR4+avW1z5szRvXv3aokSJdQwDF2wYEGOcgiugKvDM1cAcmV/VmrVqlWSlZUlIiKrV6+WjIwMadq0qZmvSZMmcunSJVm3bp2IiKSnp8vatWtFxHkyiwULFoiI5DrL1+OPP+6UL7s
"text/plain": [
"<Figure size 1000x600 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(10, 6))\n",
"plt.bar(words, frequencies)\n",
"\n",
"# Customize the plot\n",
"plt.title(\"Word Frequencies in Tokenized Text\", fontsize=16)\n",
"plt.xlabel(\"Words\", fontsize=14)\n",
"plt.ylabel(\"Frequencies\", fontsize=14)\n",
"plt.xticks(rotation=45, fontsize=12)\n",
"plt.grid(axis='y', linestyle='--', alpha=0.7)"
]
},
{
"cell_type": "code",
"execution_count": 100,
"id": "13d5b67b-591a-4f2c-804d-1bdd646d03ad",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting wordcloud\n",
" Downloading wordcloud-1.9.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.4 kB)\n",
"Requirement already satisfied: numpy>=1.6.1 in /home/anaconda3/anaconda3/lib/python3.11/site-packages (from wordcloud) (1.26.4)\n",
"Requirement already satisfied: pillow in /home/anaconda3/anaconda3/lib/python3.11/site-packages (from wordcloud) (10.2.0)\n",
"Requirement already satisfied: matplotlib in /home/anaconda3/anaconda3/lib/python3.11/site-packages (from wordcloud) (3.8.0)\n",
"Requirement already satisfied: contourpy>=1.0.1 in /home/anaconda3/anaconda3/lib/python3.11/site-packages (from matplotlib->wordcloud) (1.2.0)\n",
"Requirement already satisfied: cycler>=0.10 in /home/anaconda3/anaconda3/lib/python3.11/site-packages (from matplotlib->wordcloud) (0.11.0)\n",
"Requirement already satisfied: fonttools>=4.22.0 in /home/anaconda3/anaconda3/lib/python3.11/site-packages (from matplotlib->wordcloud) (4.25.0)\n",
"Requirement already satisfied: kiwisolver>=1.0.1 in /home/anaconda3/anaconda3/lib/python3.11/site-packages (from matplotlib->wordcloud) (1.4.4)\n",
"Requirement already satisfied: packaging>=20.0 in /home/anaconda3/anaconda3/lib/python3.11/site-packages (from matplotlib->wordcloud) (23.1)\n",
"Requirement already satisfied: pyparsing>=2.3.1 in /home/anaconda3/anaconda3/lib/python3.11/site-packages (from matplotlib->wordcloud) (3.0.9)\n",
"Requirement already satisfied: python-dateutil>=2.7 in /home/anaconda3/anaconda3/lib/python3.11/site-packages (from matplotlib->wordcloud) (2.8.2)\n",
"Requirement already satisfied: six>=1.5 in /home/anaconda3/anaconda3/lib/python3.11/site-packages (from python-dateutil>=2.7->matplotlib->wordcloud) (1.16.0)\n",
"Downloading wordcloud-1.9.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (547 kB)\n",
"\u001b[2K \u001b[38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m547.9/547.9 kB\u001b[0m \u001b[31m4.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m MB/s\u001b[0m eta \u001b[36m0:00:01\u001b[0m01\u001b[0m\n",
"\u001b[?25hInstalling collected packages: wordcloud\n",
"Successfully installed wordcloud-1.9.4\n"
]
}
],
"source": [
"#!pip install wordcloud"
]
},
{
"cell_type": "code",
"execution_count": 101,
"id": "4feddd82-22b6-45ac-8a35-870a7bdc1147",
"metadata": {},
"outputs": [],
"source": [
"from wordcloud import WordCloud"
]
},
{
"cell_type": "code",
"execution_count": 105,
"id": "0c3b403a-a791-4560-a7cd-bec66a2befec",
"metadata": {},
"outputs": [],
"source": [
"wordcloud = WordCloud().generate_from_frequencies(word_counts)"
]
},
{
"cell_type": "code",
"execution_count": 106,
"id": "1fcc99f3-274a-47d1-bd63-81ba58b8cc38",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.image.AxesImage at 0x74eba002c210>"
]
},
"execution_count": 106,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAigAAAEoCAYAAABy5QoYAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy81sbWrAAAACXBIWXMAAA9hAAAPYQGoP6dpAAEAAElEQVR4nOz9d5RlyXndif7i+OtNem/KV1d7j24AjQYIQwcSJAhqKIp4eiM90b3FgfQkkRy9IaXFgbQ0M0t6ksgRJQ6GRvQACBAgTMM0Go32vrrL2/T++nuPjXh/nMysysqbWVmmDcTcaxXQee8xcc49J2LHF/vbn1BKKXaxi13sYhe72MUu3kHQ3u4G7GIXu9jFLnaxi11ciV2Csotd7GIXu9jFLt5x2CUou9jFLnaxi13s4h2HXYKyi13sYhe72MUu3nHYJSi72MUudrGLXeziHYddgrKLXexiF7vYxS7ecdglKLvYxS52sYtd7OIdh12Csotd7GIXu9jFLt5x2CUou9jFLnaxi13s4h2HXYKyi13sYhe72MUu3nF4WwnKb//2bzM2NobjONx9991897vffTubs4td7GIXu9jFLt4heNsIyp/92Z/xK7/yK/z6r/86L7/8Mu9+97v5yEc+wsTExNvVpF3sYhe72MUudvEOgXi7igXef//93HXXXfzO7/zO+meHDh3ix37sx/j0pz+97b5SSmZmZshkMggh3uym7mIXu9jFLnaxi5sApRS1Wo3+/n40bfsYifEWtWkDfN/nxRdf5J//83++4fMPfvCDPPXUU5u29zwPz/PW/56enubw4cNvejt3sYtd7GIXu9jFzcfk5CSDg4PbbvO2EJSlpSWiKKKnp2fD5z09PczNzW3a/tOf/jS/+Zu/ucXRBAgdTdOQUYCmmygZoVBoWnx56dwgbqtE4NUBgZIBmm4hZRgfQWgoGaLpJjIKgJ0HlTK9SQ7+8CgARz97Frfkbb/DLt4W7B/9CJXaFJX6FK5XIZPqpafzCFHoUa5PUqqc27C9bqcwkxmE0Ai9Jkop7FQB3bJpluZAKQwnCUDkewSN0ttxWZcgNFDy7W1DOwgBug6agDCK/1tKUGr9n0g4qEjG0VAhUL4PUiIcBy2RQAhBuLLypjdV0yHbYbLn9jQXjzeIAoWV0HCSOvVKSLMa0j3koBsa516rk+swyHVbuM0IrxEhI0XnoEN5wcdrSTIFg45+m2NPV8kWDTIdJkpCoxxQWQ7f9OvZGQSaZiCEThT58ScintUqFZFK9dBqLQOg6w6mmaDZXETTjNX+c2NfqWfy6I4DQhC1mvFnloOSksh3IYqw8kXCVgMZBAhNoCfSqNAn8jx020YYJlGrifRccrfcibswizs/DVG04VxGoYP0/sMIx6Z18Rze3CzKbb3J92t76JksQ5/8BbRUitk/+79pnT/7trbnrYCeyZIc3UvXh3+UykvPUHnpOcLS1d/XTCZz1W3eFoKyhiuXZ5RSbZdsfvVXf5VPfepT639Xq1WGhoYAMMwkppVEaDpR5CEQKBQCgdAMlAzJFobRDQvPcNB0A9+toBsOSoaAQGg6UoYIwHdrSBnssP2QHUjx4C/cBsC5x6fxKj5Kvi2rZm8LdN3CMtMI4sFFCI0w9PD8KqySRNNIYRg2AFHkE4YuYeQCYJsZdN1e7RQVUkW03PjhNo0EppGKOSgCXbfwvCpeUAPAMlMYRgKBhkISRf7qecEwEpi6s0pYJaaRRNON1fNIao0ZMqkeTCO13iFfDiuZJVHsJwpczEQmfqaEBggSuW6UklipPCiJ3yi/dQRF0xCahgpXBzghEJaFkckRrCzFg/+Vu6TToBSy0Xhr2ngZ9EwGzXFQAmSjiZHPIV0PFQQxCdF1hGkgXQ9h22i2jWw2UUGAME2Mrk6EbtxUgpJKCjo6NHq6dZZXIqZnIjwPDFOj0G2TyBgk0gaWo1Hssyj22sydazF5skmmaGKYGpoOgwdSdPTbNKshjWpI4EvyXRa+q0CFJLMmhZ74ue/fl6TYZyNDxfKMR2W5dtOu57qh62jCwNBsbDtLGMaEQtMMlIIwbJHJ9KNUSBh6GEaCVLoHKUNMM4nrlgmCBpeTFLvQgZHOIgMflc7GJFTXQSlUGKLCACOTQwFCc9EME6dvCL+0hGa2MDI59ESSsF6jefEsmpNEt+J+W15GUISu4/QNUHzvB9BTacrPPYlsNvHfZoIihECzbXTbQWj629qWtwxSIt0W/uI8Ub2+iUhuhZ3IM94WgtLZ2Ymu65uiJQsLC5uiKgC2bWPbdttjJbM9ZHKDyMhHaDr1yjSWncG0UkSRT6u+hOeWSaa7yBZHMQybWnmKKPQwrMRq9ERimAmqKxcI/Aa8Ayei70wI0sleBnruRtdMQGAaCSr1KS5OP0kYuSTsIj2dt5LLxKG8emOelco5lsunAOjtup1segDDcJAqwvMqnDz/ZZSS5LMj9HTeiiYMNM0gk+rj4vSTTMw+hRAaXcXDdOT2oBs2YeRSb8xxfuoJQFHIjtKR30vS6cQP6thWFk1sftyF0BDExOPyjlazHIRuUDl3jK5DDyGERmPhAkGzSmHsdgK3AUoShQFRuDNCu/2tFHFnfvnfsPEzXUdLJNBtm7BSQYVhPIgXimTuuofyt76BvLKDFoLk/oOoIKBx9NUbb+cqDCMedwDCcOs+yR4dwSgWkc0W4dISyduOECwtE1WrEEmMzk7cs2fBDzDyeay+XqJmC+V5BEtLaIkE2hbv/urlrd+6narp9u8z+PjHkvyD/2eKv/x8i3/7f1Q5dz6+AInCSenICOykTke/Tc+wg9uUnH65RhQqrITAMDUOPpBl4aJHumhS7LOJQsW3/3QOJeMnKZnT0U2BELD/ngy1lZAoVHQO2px+aRuCcuWz8GbAMNBTCUQEhrTI5UcQQqBkhK7bKBTNxhKaZpLLjVKpTBCGLkLo9PTcjm1nWVw8SqUysR6JBtCcJNJ3aU5dIH/HAygZUT/1OrqTJDm6D5Si9MKTKBnfbz2ZRpgG7twUVqETIglS4XT14c5OoSdSKED6/sbmZ3NYHZ3oqTQAiZFx6seOvrn3bBdtETXqNE4fp3H6+E0/9ttCUCzL4u677+axxx7jx3/8x9c/f+yxx/joRz96TccS6yFKjcCv06jOQRZMK41uJAj8OrphoZTE92pEoY/vVsh37kPKAKUUmmERhi716vT6i7OLncO2cswuvMT80usYus0dh/8eS6VTSBmSSfWSTBR5+dgfAjA2+B7y2WG8oEq9MUdPxy1Mzj1HqXIO16+uRsBihri4coLFlZMIobF/9MMsrBynUr2IJgwcp0BP5xGOnfk8rlchmx5gbPARMqk+IumTz44Qhi4vH/t9EIK7b/n7aJtmNIJ8doSkUwQky+Uz69+oKESGlzrF0G1gZzqws50EzSoyCrAyHUT+IpF37ZEJo6MT3XGQQYAKAszubtyzZ1BRhJ7NYhY70NNpvIkJwloNoWuYxQ6SR25DCIE/N4M3PUV0RVREz+YQpolsNpGtJs7YOFG5TFitXr1Rq1Gsqw2OvT0av/bPsvzUTyRxPcV/+E81fvf/alCrbd5PNhqECFQYYI+PoaL4tzVyObREgqjZxCgUsHp6UEoSLq8Qlss4e/cSVquI1YhRu0Fb02DPuM6RWyxefsXnwsUbfHcF6JqgXg4Y2JcgUzSRITz75SUO3Jele9hhYF+SVN5gZdZj+lQLJ62xPOPhry7pfODv9nLutTqtekTfniSDB5IMHUoyd94lkdZpVkPmzrvbNsM+NE5UrhLOLr1pREVYBlomRVLkcIIknlchleolVC7N5iJRFJBMdRMGTUBhWWlsJ0cy2YHnVmg0Fmi1VjaQEwDNNDGyXejJNLLVQIYhicFRQBDWKqjAJ3/Xg3iLcwhNw0hn0QwTq9CB0z8UTxiVRKEwszk0Q8fM5DDzBYLypSia3TeEPTACUuIvzGJ392F1dOHPzRK13vpI4S7eHLxtSzyf+tSn+Nmf/VnuueceHnzwQX73d3+XiYkJ/tE/+kfXdJwwdGnUZqmWJlBKEoUe9co0zdpCvA4auPhePQ7PC+J
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plt.imshow(wordcloud)"
]
},
{
"cell_type": "markdown",
"id": "9840b3ca-d8af-46fa-8cb4-69a7405c3e22",
"metadata": {},
"source": [
"# inne statystyki"
]
},
{
"cell_type": "code",
"execution_count": 109,
"id": "48f77326-bac4-4921-8001-9a57ff00ee69",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Bolesław Prus\\n\\nAntek\\n\\nAntek urodził się we wsi nad Wisłą.\\n\\nWieś leżała w niewielkiej dolinie. Od północy otaczały ją wzgórza spadziste, porosłe sosnowym lasem, a od południa wzgórza garbate, zasypane leszczyną, tarniną i głogiem. Tam najgłośniej śpiewały ptaki i najczęściej chodziły wiejskie dzieci rwać orzechy albo wybierać gniazda.\\n\\nKiedyś stanął na środku wsi, zdawało ci się, że oba pasma gór biegną ku sobie, ażeby zetknąć się tam, gdzie z rana wstaje czerwone słońce. Ale było to złudzenie.\\n\\nZa wsią bowiem ciągnęła się między wzgórzami dolina przecięta rzeczułką i przykryta zieloną łąką.\\n\\nTam pasano bydlątka i tam cienkonogie bociany chodziły polować na żaby kukające wieczorami.\\n\\nOd zachodu wieś miała tamę, za tamą Wisłę, a za Wisłą znowu wzgórza wapienne, nagie.\\n\\nKażdy chłopski dom, szarą słomą pokryty, miał ogródek, a w ogródku śliwki węgierki, spomiędzy których widać było komin sadzą uczerniony i pożarną drabinkę. Drabiny te zaprowadzono nie od dawna, a ludzie myśleli, że one lepiej chronić będą chaty od ognia niż dawniej bocianie gniazda. Toteż gdy płonął jaki budynek, dziwili się bardzo, ale go nie ratowali.\\n\\n— Widać, że na tego gospodarza był dopust boski — mówili między sobą. — Spalił się, choć miał przecie nową drabinę i choć zapłacił *śtraf* za starą, co to były u niej połamane szczeble.\\n\\nW takiej wsi urodził się Antek. Położyli go w niemalowanej kołysce, co została po zmarłym bracie, i sypiał w niej przez dwa lata. Potem przyszła mu na świat siostra, Rozalia, więc musiał jej miejsca ustąpić, a sam, jako osoba dorosła, przenieść się na ławę.\\n\\nPrzez ten rok kołysał siostrę, a przez cały następny — rozglądał się po świecie. Raz wpadł w rzekę, drugi raz dostał batem od przejezdnego furmana za to, że go o mało konie nie stratowały, a trzeci raz psy tak go pogryzły, że dwa tygodnie leżał na piecu. Doświadczył więc niemało. Za to w czwartym roku życia ojciec podarował mu swoją sukienną kamizelkę z mosiężnym guzikiem, a matka — kazała mu siostrę nosić.\\n\\nGdy miał pięć lat, użyto go już — do pasania świń. Ale Antek nie bardzo się za nimi oglądał. Wolał patrzeć na drugą stronę Wisły, gdzie za wapiennym wzgórzem raz na raz pokazywało się coś wysokiego i czarnego. Wyłaziło to z lewej strony jakby spod ziemi, szło w górę i upadało na prawo. Za tym pierwszym szło zaraz drugie i trzecie, takie samo czarne i wysokie.\\n\\nTymczasem świnie swoim obyczajem wlazły w kartofle. Matka spostrzegłszy to zawinęła się wedle sukiennej kamizelki Antkowej, tak że chłopiec prawie tchu nie mógł złapać. Ale że nie miał w sercu zawziętości, bo było z niego dziecko dobre, więc wykrzyczawszy się i wydrapawszy — kamizelkę, zapytał matki:\\n\\n— Matulu! A co to takie czarne chodzi za Wisłą?\\n\\nMatka spojrzała w kierunku Antkowego palca, przysłoniła oczy ręką i odparła:\\n\\n— Tam za Wisłą? Cóż to, nie widzisz, że wiatrak chodzi? A na drugi raz pilnuj świń, bo cię pokrzywami wysmaruję.\\n\\n— Acha, wiatrak! A co on, matulu, za jeden?\\n\\n— At, głupiś! — odparła matka i uciekła do swojej roboty. Gdzie ona miała czas i rozum do udzielania objaśnień o wiatrakach!\\n\\nAle chłopcu wiatrak spokojności nie dawał. Antek widywał go przecie co dzień. Widywał go i w nocy przez sen. Więc taka straszna urosła w chłopcu ciekawość, że jednego dnia zakradł się do promu, co ludzi na drugą stronę rzeki przewoził, i popłynął za Wisłę.\\n\\nPopłynął, wdrapał się na wapienną górę, akurat w tym miejscu, gdzie stało ogłoszenie, aby tędy nie chodzić, i zobaczył wiatrak. Wydał mu się budynek ten jakby dzwonnica, tylko w sobie był grubszy, a tam gdzie na dzwonnicy jest okno, miał cztery tęgie skrzydła ustawione na krzyż. Z początku nie rozumiał nic — co to i na co to? Ale wnet objaśnili mu rzecz pastu
]
},
"execution_count": 109,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"antek"
]
},
{
"cell_type": "code",
"execution_count": 110,
"id": "dea9229c-ecd8-4e70-aca3-d4a38b0a4e7a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['Bolesław',\n",
" 'Prus',\n",
" 'Antek',\n",
" 'Antek',\n",
" 'urodził',\n",
" 'się',\n",
" 'we',\n",
" 'wsi',\n",
" 'nad',\n",
" 'Wisłą',\n",
" '.',\n",
" 'Wieś',\n",
" 'leżała',\n",
" 'w',\n",
" 'niewielkiej',\n",
" 'dolinie',\n",
" '.',\n",
" 'Od',\n",
" 'północy',\n",
" 'otaczały',\n",
" 'ją',\n",
" 'wzgórza',\n",
" 'spadziste',\n",
" ',',\n",
" 'porosłe',\n",
" 'sosnowym',\n",
" 'lasem',\n",
" ',',\n",
" 'a',\n",
" 'od',\n",
" 'południa',\n",
" 'wzgórza',\n",
" 'garbate',\n",
" ',',\n",
" 'zasypane',\n",
" 'leszczyną',\n",
" ',',\n",
" 'tarniną',\n",
" 'i',\n",
" 'głogiem',\n",
" '.',\n",
" 'Tam',\n",
" 'najgłośniej',\n",
" 'śpiewały',\n",
" 'ptaki',\n",
" 'i',\n",
" 'najczęściej',\n",
" 'chodziły',\n",
" 'wiejskie',\n",
" 'dzieci',\n",
" 'rwać',\n",
" 'orzechy',\n",
" 'albo',\n",
" 'wybierać',\n",
" 'gniazda',\n",
" '.',\n",
" 'Kiedyś',\n",
" 'stanął',\n",
" 'na',\n",
" 'środku',\n",
" 'wsi',\n",
" ',',\n",
" 'zdawało',\n",
" 'ci',\n",
" 'się',\n",
" ',',\n",
" 'że',\n",
" 'oba',\n",
" 'pasma',\n",
" 'gór',\n",
" 'biegną',\n",
" 'ku',\n",
" 'sobie',\n",
" ',',\n",
" 'ażeby',\n",
" 'zetknąć',\n",
" 'się',\n",
" 'tam',\n",
" ',',\n",
" 'gdzie',\n",
" 'z',\n",
" 'rana',\n",
" 'wstaje',\n",
" 'czerwone',\n",
" 'słońce',\n",
" '.',\n",
" 'Ale',\n",
" 'było',\n",
" 'to',\n",
" 'złudzenie',\n",
" '.',\n",
" 'Za',\n",
" 'wsią',\n",
" 'bowiem',\n",
" 'ciągnęła',\n",
" 'się',\n",
" 'między',\n",
" 'wzgórzami',\n",
" 'dolina',\n",
" 'przecięta',\n",
" 'rzeczułką',\n",
" 'i',\n",
" 'przykryta',\n",
" 'zieloną',\n",
" 'łąką',\n",
" '.',\n",
" 'Tam',\n",
" 'pasano',\n",
" 'bydlątka',\n",
" 'i',\n",
" 'tam',\n",
" 'cienkonogie',\n",
" 'bociany',\n",
" 'chodziły',\n",
" 'polować',\n",
" 'na',\n",
" 'żaby',\n",
" 'kukające',\n",
" 'wieczorami',\n",
" '.',\n",
" 'Od',\n",
" 'zachodu',\n",
" 'wieś',\n",
" 'miała',\n",
" 'tamę',\n",
" ',',\n",
" 'za',\n",
" 'tamą',\n",
" 'Wisłę',\n",
" ',',\n",
" 'a',\n",
" 'za',\n",
" 'Wisłą',\n",
" 'znowu',\n",
" 'wzgórza',\n",
" 'wapienne',\n",
" ',',\n",
" 'nagie',\n",
" '.',\n",
" 'Każdy',\n",
" 'chłopski',\n",
" 'dom',\n",
" ',',\n",
" 'szarą',\n",
" 'słomą',\n",
" 'pokryty',\n",
" ',',\n",
" 'miał',\n",
" 'ogródek',\n",
" ',',\n",
" 'a',\n",
" 'w',\n",
" 'ogródku',\n",
" 'śliwki',\n",
" 'węgierki',\n",
" ',',\n",
" 'spomiędzy',\n",
" 'których',\n",
" 'widać',\n",
" 'było',\n",
" 'komin',\n",
" 'sadzą',\n",
" 'uczerniony',\n",
" 'i',\n",
" 'pożarną',\n",
" 'drabinkę',\n",
" '.',\n",
" 'Drabiny',\n",
" 'te',\n",
" 'zaprowadzono',\n",
" 'nie',\n",
" 'od',\n",
" 'dawna',\n",
" ',',\n",
" 'a',\n",
" 'ludzie',\n",
" 'myśleli',\n",
" ',',\n",
" 'że',\n",
" 'one',\n",
" 'lepiej',\n",
" 'chronić',\n",
" 'będą',\n",
" 'chaty',\n",
" 'od',\n",
" 'ognia',\n",
" 'niż',\n",
" 'dawniej',\n",
" 'bocianie',\n",
" 'gniazda',\n",
" '.',\n",
" 'Toteż',\n",
" 'gdy',\n",
" 'płonął',\n",
" 'jaki',\n",
" 'budynek',\n",
" ',',\n",
" 'dziwili',\n",
" 'się',\n",
" 'bardzo',\n",
" ',',\n",
" 'ale',\n",
" 'go',\n",
" 'nie',\n",
" 'ratowali',\n",
" '.',\n",
" '—',\n",
" 'Widać',\n",
" ',',\n",
" 'że',\n",
" 'na',\n",
" 'tego',\n",
" 'gospodarza',\n",
" 'był',\n",
" 'dopust',\n",
" 'boski',\n",
" '—',\n",
" 'mówili',\n",
" 'między',\n",
" 'sobą',\n",
" '.',\n",
" '—',\n",
" 'Spalił',\n",
" 'się',\n",
" ',',\n",
" 'choć',\n",
" 'miał',\n",
" 'przecie',\n",
" 'nową',\n",
" 'drabinę',\n",
" 'i',\n",
" 'choć',\n",
" 'zapłacił',\n",
" '*',\n",
" 'śtraf',\n",
" '*',\n",
" 'za',\n",
" 'starą',\n",
" ',',\n",
" 'co',\n",
" 'to',\n",
" 'były',\n",
" 'u',\n",
" 'niej',\n",
" 'połamane',\n",
" 'szczeble',\n",
" '.',\n",
" 'W',\n",
" 'takiej',\n",
" 'wsi',\n",
" 'urodził',\n",
" 'się',\n",
" 'Antek',\n",
" '.',\n",
" 'Położyli',\n",
" 'go',\n",
" 'w',\n",
" 'niemalowanej',\n",
" 'kołysce',\n",
" ',',\n",
" 'co',\n",
" 'została',\n",
" 'po',\n",
" 'zmarłym',\n",
" 'bracie',\n",
" ',',\n",
" 'i',\n",
" 'sypiał',\n",
" 'w',\n",
" 'niej',\n",
" 'przez',\n",
" 'dwa',\n",
" 'lata',\n",
" '.',\n",
" 'Potem',\n",
" 'przyszła',\n",
" 'mu',\n",
" 'na',\n",
" 'świat',\n",
" 'siostra',\n",
" ',',\n",
" 'Rozalia',\n",
" ',',\n",
" 'więc',\n",
" 'musiał',\n",
" 'jej',\n",
" 'miejsca',\n",
" 'ustąpić',\n",
" ',',\n",
" 'a',\n",
" 'sam',\n",
" ',',\n",
" 'jako',\n",
" 'osoba',\n",
" 'dorosła',\n",
" ',',\n",
" 'przenieść',\n",
" 'się',\n",
" 'na',\n",
" 'ławę',\n",
" '.',\n",
" 'Przez',\n",
" 'ten',\n",
" 'rok',\n",
" 'kołysał',\n",
" 'siostrę',\n",
" ',',\n",
" 'a',\n",
" 'przez',\n",
" 'cały',\n",
" 'następny',\n",
" '—',\n",
" 'rozglądał',\n",
" 'się',\n",
" 'po',\n",
" 'świecie',\n",
" '.',\n",
" 'Raz',\n",
" 'wpadł',\n",
" 'w',\n",
" 'rzekę',\n",
" ',',\n",
" 'drugi',\n",
" 'raz',\n",
" 'dostał',\n",
" 'batem',\n",
" 'od',\n",
" 'przejezdnego',\n",
" 'furmana',\n",
" 'za',\n",
" 'to',\n",
" ',',\n",
" 'że',\n",
" 'go',\n",
" 'o',\n",
" 'mało',\n",
" 'konie',\n",
" 'nie',\n",
" 'stratowały',\n",
" ',',\n",
" 'a',\n",
" 'trzeci',\n",
" 'raz',\n",
" 'psy',\n",
" 'tak',\n",
" 'go',\n",
" 'pogryzły',\n",
" ',',\n",
" 'że',\n",
" 'dwa',\n",
" 'tygodnie',\n",
" 'leżał',\n",
" 'na',\n",
" 'piecu',\n",
" '.',\n",
" 'Doświadczył',\n",
" 'więc',\n",
" 'niemało',\n",
" '.',\n",
" 'Za',\n",
" 'to',\n",
" 'w',\n",
" 'czwartym',\n",
" 'roku',\n",
" 'życia',\n",
" 'ojciec',\n",
" 'podarował',\n",
" 'mu',\n",
" 'swoją',\n",
" 'sukienną',\n",
" 'kamizelkę',\n",
" 'z',\n",
" 'mosiężnym',\n",
" 'guzikiem',\n",
" ',',\n",
" 'a',\n",
" 'matka',\n",
" '—',\n",
" 'kazała',\n",
" 'mu',\n",
" 'siostrę',\n",
" 'nosić',\n",
" '.',\n",
" 'Gdy',\n",
" 'miał',\n",
" 'pięć',\n",
" 'lat',\n",
" ',',\n",
" 'użyto',\n",
" 'go',\n",
" 'już',\n",
" '—',\n",
" 'do',\n",
" 'pasania',\n",
" 'świń',\n",
" '.',\n",
" 'Ale',\n",
" 'Antek',\n",
" 'nie',\n",
" 'bardzo',\n",
" 'się',\n",
" 'za',\n",
" 'nimi',\n",
" 'oglądał',\n",
" '.',\n",
" 'Wolał',\n",
" 'patrzeć',\n",
" 'na',\n",
" 'drugą',\n",
" 'stronę',\n",
" 'Wisły',\n",
" ',',\n",
" 'gdzie',\n",
" 'za',\n",
" 'wapiennym',\n",
" 'wzgórzem',\n",
" 'raz',\n",
" 'na',\n",
" 'raz',\n",
" 'pokazywało',\n",
" 'się',\n",
" 'coś',\n",
" 'wysokiego',\n",
" 'i',\n",
" 'czarnego',\n",
" '.',\n",
" 'Wyłaziło',\n",
" 'to',\n",
" 'z',\n",
" 'lewej',\n",
" 'strony',\n",
" 'jakby',\n",
" 'spod',\n",
" 'ziemi',\n",
" ',',\n",
" 'szło',\n",
" 'w',\n",
" 'górę',\n",
" 'i',\n",
" 'upadało',\n",
" 'na',\n",
" 'prawo',\n",
" '.',\n",
" 'Za',\n",
" 'tym',\n",
" 'pierwszym',\n",
" 'szło',\n",
" 'zaraz',\n",
" 'drugie',\n",
" 'i',\n",
" 'trzecie',\n",
" ',',\n",
" 'takie',\n",
" 'samo',\n",
" 'czarne',\n",
" 'i',\n",
" 'wysokie',\n",
" '.',\n",
" 'Tymczasem',\n",
" 'świnie',\n",
" 'swoim',\n",
" 'obyczajem',\n",
" 'wlazły',\n",
" 'w',\n",
" 'kartofle',\n",
" '.',\n",
" 'Matka',\n",
" 'spostrzegłszy',\n",
" 'to',\n",
" 'zawinęła',\n",
" 'się',\n",
" 'wedle',\n",
" 'sukiennej',\n",
" 'kamizelki',\n",
" 'Antkowej',\n",
" ',',\n",
" 'tak',\n",
" 'że',\n",
" 'chłopiec',\n",
" 'prawie',\n",
" 'tchu',\n",
" 'nie',\n",
" 'mógł',\n",
" 'złapać',\n",
" '.',\n",
" 'Ale',\n",
" 'że',\n",
" 'nie',\n",
" 'miał',\n",
" 'w',\n",
" 'sercu',\n",
" 'zawziętości',\n",
" ',',\n",
" 'bo',\n",
" 'było',\n",
" 'z',\n",
" 'niego',\n",
" 'dziecko',\n",
" 'dobre',\n",
" ',',\n",
" 'więc',\n",
" 'wykrzyczawszy',\n",
" 'się',\n",
" 'i',\n",
" 'wydrapawszy',\n",
" '—',\n",
" 'kamizelkę',\n",
" ',',\n",
" 'zapytał',\n",
" 'matki',\n",
" ':',\n",
" '—',\n",
" 'Matulu',\n",
" '!',\n",
" 'A',\n",
" 'co',\n",
" 'to',\n",
" 'takie',\n",
" 'czarne',\n",
" 'chodzi',\n",
" 'za',\n",
" 'Wisłą',\n",
" '?',\n",
" 'Matka',\n",
" 'spojrzała',\n",
" 'w',\n",
" 'kierunku',\n",
" 'Antkowego',\n",
" 'palca',\n",
" ',',\n",
" 'przysłoniła',\n",
" 'oczy',\n",
" 'ręką',\n",
" 'i',\n",
" 'odparła',\n",
" ':',\n",
" '—',\n",
" 'Tam',\n",
" 'za',\n",
" 'Wisłą',\n",
" '?',\n",
" 'Cóż',\n",
" 'to',\n",
" ',',\n",
" 'nie',\n",
" 'widzisz',\n",
" ',',\n",
" 'że',\n",
" 'wiatrak',\n",
" 'chodzi',\n",
" '?',\n",
" 'A',\n",
" 'na',\n",
" 'drugi',\n",
" 'raz',\n",
" 'pilnuj',\n",
" 'świń',\n",
" ',',\n",
" 'bo',\n",
" 'cię',\n",
" 'pokrzywami',\n",
" 'wysmaruję',\n",
" '.',\n",
" '—',\n",
" 'Acha',\n",
" ',',\n",
" 'wiatrak',\n",
" '!',\n",
" 'A',\n",
" 'co',\n",
" 'on',\n",
" ',',\n",
" 'matulu',\n",
" ',',\n",
" 'za',\n",
" 'jeden',\n",
" '?',\n",
" '—',\n",
" 'At',\n",
" ',',\n",
" 'głupiś',\n",
" '!',\n",
" '—',\n",
" 'odparła',\n",
" 'matka',\n",
" 'i',\n",
" 'uciekła',\n",
" 'do',\n",
" 'swojej',\n",
" 'roboty',\n",
" '.',\n",
" 'Gdzie',\n",
" 'ona',\n",
" 'miała',\n",
" 'czas',\n",
" 'i',\n",
" 'rozum',\n",
" 'do',\n",
" 'udzielania',\n",
" 'objaśnień',\n",
" 'o',\n",
" 'wiatrakach',\n",
" '!',\n",
" 'Ale',\n",
" 'chłopcu',\n",
" 'wiatrak',\n",
" 'spokojności',\n",
" 'nie',\n",
" 'dawał',\n",
" '.',\n",
" 'Antek',\n",
" 'widywał',\n",
" 'go',\n",
" 'przecie',\n",
" 'co',\n",
" 'dzień',\n",
" '.',\n",
" 'Widywał',\n",
" 'go',\n",
" 'i',\n",
" 'w',\n",
" 'nocy',\n",
" 'przez',\n",
" 'sen.',\n",
" 'Więc',\n",
" 'taka',\n",
" 'straszna',\n",
" 'urosła',\n",
" 'w',\n",
" 'chłopcu',\n",
" 'ciekawość',\n",
" ',',\n",
" 'że',\n",
" 'jednego',\n",
" 'dnia',\n",
" 'zakradł',\n",
" 'się',\n",
" 'do',\n",
" 'promu',\n",
" ',',\n",
" 'co',\n",
" 'ludzi',\n",
" 'na',\n",
" 'drugą',\n",
" 'stronę',\n",
" 'rzeki',\n",
" 'przewoził',\n",
" ',',\n",
" 'i',\n",
" 'popłynął',\n",
" 'za',\n",
" 'Wisłę',\n",
" '.',\n",
" 'Popłynął',\n",
" ',',\n",
" 'wdrapał',\n",
" 'się',\n",
" 'na',\n",
" 'wapienną',\n",
" 'górę',\n",
" ',',\n",
" 'akurat',\n",
" 'w',\n",
" 'tym',\n",
" 'miejscu',\n",
" ',',\n",
" 'gdzie',\n",
" 'stało',\n",
" 'ogłoszenie',\n",
" ',',\n",
" 'aby',\n",
" 'tędy',\n",
" 'nie',\n",
" 'chodzić',\n",
" ',',\n",
" 'i',\n",
" 'zobaczył',\n",
" 'wiatrak',\n",
" '.',\n",
" 'Wydał',\n",
" 'mu',\n",
" 'się',\n",
" 'budynek',\n",
" 'ten',\n",
" 'jakby',\n",
" 'dzwonnica',\n",
" ',',\n",
" 'tylko',\n",
" 'w',\n",
" 'sobie',\n",
" 'był',\n",
" 'grubszy',\n",
" ',',\n",
" 'a',\n",
" 'tam',\n",
" 'gdzie',\n",
" 'na',\n",
" 'dzwonnicy',\n",
" 'jest',\n",
" 'okno',\n",
" ',',\n",
" 'miał',\n",
" 'cztery',\n",
" 'tęgie',\n",
" 'skrzydła',\n",
" 'ustawione',\n",
" 'na',\n",
" 'krzyż',\n",
" '.',\n",
" 'Z',\n",
" 'początku',\n",
" 'nie',\n",
" 'rozumiał',\n",
" 'nic',\n",
" '—',\n",
" 'co',\n",
" 'to',\n",
" 'i',\n",
" 'na',\n",
" 'co',\n",
" 'to',\n",
" '?',\n",
" 'Ale',\n",
" 'wnet',\n",
" 'objaśnili',\n",
" 'mu',\n",
" 'rzecz',\n",
" 'pastuchowie',\n",
" ',',\n",
" 'więc',\n",
" 'dowiedział',\n",
" 'się',\n",
" 'o',\n",
" 'wszystkim',\n",
" '.',\n",
" 'Naprzód',\n",
" 'o',\n",
" 'tym',\n",
" ',',\n",
" 'że',\n",
" 'na',\n",
" 'skrzydła',\n",
" 'dmucha',\n",
" 'wiater',\n",
" 'i',\n",
" 'kręci',\n",
" 'nimi',\n",
" 'jak',\n",
" 'liśćmi',\n",
" '.',\n",
" 'Dalej',\n",
" 'o',\n",
" 'tym',\n",
" ',',\n",
" 'że',\n",
" 'w',\n",
" 'wiatraku',\n",
" 'miele',\n",
" 'się',\n",
" 'zboże',\n",
" 'na',\n",
" 'mąkę',\n",
" ',',\n",
" 'i',\n",
" 'nareszcie',\n",
" 'o',\n",
" 'tym',\n",
" ',',\n",
" 'że',\n",
" 'przy',\n",
" 'wiatraku',\n",
" 'siedzi',\n",
" 'młynarz',\n",
" ',',\n",
" 'co',\n",
" 'żonę',\n",
" 'bije',\n",
" ',',\n",
" 'a',\n",
" 'taki',\n",
" 'jest',\n",
" 'mądry',\n",
" ',',\n",
" 'że',\n",
" 'wie',\n",
" ',',\n",
" 'jakim',\n",
" 'sposobem',\n",
" 'ze',\n",
" 'śpichrzów',\n",
" 'wyprowadza',\n",
" 'się',\n",
" 'szczury',\n",
" '.',\n",
" 'Po',\n",
" 'takiej',\n",
" 'poglądowej',\n",
" 'lekcji',\n",
" 'Antek',\n",
" 'wrócił',\n",
" 'do',\n",
" 'domu',\n",
" 'tą',\n",
" 'samą',\n",
" 'drogą',\n",
" 'co',\n",
" 'pierwej',\n",
" '.',\n",
" 'Dali',\n",
" 'mu',\n",
" 'tam',\n",
" 'przewoźnicy',\n",
" 'parę',\n",
" 'razy',\n",
" 'w',\n",
" 'łeb',\n",
" 'za',\n",
" 'swoją',\n",
" 'krwawą',\n",
" 'pracę',\n",
" ',',\n",
" 'dała',\n",
" 'mu',\n",
" 'i',\n",
" 'matka',\n",
" 'coś',\n",
" 'na',\n",
" 'sukienną',\n",
" 'kamizelkę',\n",
" ',',\n",
" 'ale',\n",
" 'to',\n",
" 'nic',\n",
" ':',\n",
" 'Antek',\n",
" 'był',\n",
" 'kontent',\n",
" ',',\n",
" 'bo',\n",
" 'zaspokoił',\n",
" 'ciekawość',\n",
" '.',\n",
" 'Więc',\n",
" 'choć',\n",
" 'położył',\n",
" 'się',\n",
" 'spać',\n",
" 'o',\n",
" 'głodzie',\n",
" ',',\n",
" 'marzył',\n",
" 'całą',\n",
" 'noc',\n",
" 'to',\n",
" 'o',\n",
" 'wiatraku',\n",
" ',',\n",
" 'co',\n",
" 'mieli',\n",
" 'zboże',\n",
" ',',\n",
" 'to',\n",
" 'o',\n",
" 'młynarzu',\n",
" ',',\n",
" 'co',\n",
" 'bije',\n",
" 'żonę',\n",
" 'i',\n",
" 'szczury',\n",
" 'wyprowadza',\n",
" 'ze',\n",
" 'śpichrzów',\n",
" '.',\n",
" 'Drobny',\n",
" 'ten',\n",
" 'wypadek',\n",
" 'stanowczo',\n",
" 'wpłynął',\n",
" 'na',\n",
" 'całe',\n",
" 'życie',\n",
" 'chłopca',\n",
" '.',\n",
" 'Od',\n",
" 'tej',\n",
" 'pory',\n",
" '—',\n",
" 'od',\n",
" 'wschodu',\n",
" 'do',\n",
" 'zachodu',\n",
" 'słońca',\n",
" '—',\n",
" 'strugał',\n",
" 'on',\n",
" 'patyki',\n",
" 'i',\n",
" 'układał',\n",
" 'je',\n",
" 'na',\n",
" 'krzyż',\n",
" '.',\n",
" 'Potem',\n",
" 'wystrugał',\n",
" 'sobie',\n",
" 'kolumnę',\n",
" ';',\n",
" 'próbował',\n",
" ',',\n",
" 'obciosywał',\n",
" ',',\n",
" 'ustawiał',\n",
" ',',\n",
" 'aż',\n",
" 'nareszcie',\n",
" 'wybudował',\n",
" 'mały',\n",
" 'wiatraczek',\n",
" ',',\n",
" 'który',\n",
" 'na',\n",
" 'wietrze',\n",
" 'obracał',\n",
" 'mu',\n",
" 'się',\n",
" 'jak',\n",
" 'tamten',\n",
" 'za',\n",
" 'Wisłą',\n",
" '.',\n",
" 'Cóż',\n",
" 'to',\n",
" 'była',\n",
" 'za',\n",
" 'radość',\n",
" '!',\n",
" 'Teraz',\n",
" 'brakowało',\n",
" 'Antkowi',\n",
" 'tylko',\n",
" 'żony',\n",
" ',',\n",
" 'żeby',\n",
" 'mógł',\n",
" 'ją',\n",
" 'bić',\n",
" ',',\n",
" 'i',\n",
" 'już',\n",
" 'byłby',\n",
" 'z',\n",
" 'niego',\n",
" 'prawdziwy',\n",
" 'młynarz',\n",
" '!',\n",
" 'Do',\n",
" 'dziesiątego',\n",
" 'roku',\n",
" 'życia',\n",
" 'zepsuł',\n",
" 'ze',\n",
" 'cztery',\n",
" 'koziki',\n",
" ',',\n",
" 'ale',\n",
" 'też',\n",
" 'strugał',\n",
" 'nimi',\n",
" 'dziwne',\n",
" 'rzeczy',\n",
" '.',\n",
" 'Robił',\n",
" 'wiatraki',\n",
" ',',\n",
" 'płoty',\n",
" ',',\n",
" 'drabiny',\n",
" ',',\n",
" 'studnie',\n",
" ',',\n",
" 'a',\n",
" 'nawet',\n",
" 'całe',\n",
" 'chałupy',\n",
" '.',\n",
" 'Aż',\n",
" 'się',\n",
" 'ludzie',\n",
" 'zastanawiali',\n",
" 'i',\n",
" 'mówili',\n",
" 'do',\n",
" 'matki',\n",
" ',',\n",
" 'że',\n",
" 'z',\n",
" 'Antka',\n",
" 'albo',\n",
" 'będzie',\n",
" ...]"
]
},
"execution_count": 110,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"antek_tokenized"
]
},
{
"cell_type": "code",
"execution_count": 111,
"id": "ea9e99bb-9a3f-46cf-89d7-5ec72ddac58a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"47025"
]
},
"execution_count": 111,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(antek)"
]
},
{
"cell_type": "code",
"execution_count": 112,
"id": "52ad3197-1907-4253-b02d-7e821cfea33d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"9350"
]
},
"execution_count": 112,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(antek_tokenized)"
]
},
{
"cell_type": "code",
"execution_count": 113,
"id": "cf17f78a-f434-4bf2-9ffc-e3eec96bc851",
"metadata": {},
"outputs": [],
"source": [
"words_len = [len(w) for w in antek_tokenized]"
]
},
{
"cell_type": "code",
"execution_count": 115,
"id": "432bd283-be93-473b-80b8-a3620c984662",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[8,\n",
" 4,\n",
" 5,\n",
" 5,\n",
" 7,\n",
" 3,\n",
" 2,\n",
" 3,\n",
" 3,\n",
" 5,\n",
" 1,\n",
" 4,\n",
" 6,\n",
" 1,\n",
" 11,\n",
" 7,\n",
" 1,\n",
" 2,\n",
" 7,\n",
" 8,\n",
" 2,\n",
" 7,\n",
" 9,\n",
" 1,\n",
" 7,\n",
" 8,\n",
" 5,\n",
" 1,\n",
" 1,\n",
" 2,\n",
" 8,\n",
" 7,\n",
" 7,\n",
" 1,\n",
" 8,\n",
" 9,\n",
" 1,\n",
" 7,\n",
" 1,\n",
" 7,\n",
" 1,\n",
" 3,\n",
" 11,\n",
" 8,\n",
" 5,\n",
" 1,\n",
" 11,\n",
" 8,\n",
" 8,\n",
" 6,\n",
" 4,\n",
" 7,\n",
" 4,\n",
" 8,\n",
" 7,\n",
" 1,\n",
" 6,\n",
" 6,\n",
" 2,\n",
" 6,\n",
" 3,\n",
" 1,\n",
" 7,\n",
" 2,\n",
" 3,\n",
" 1,\n",
" 2,\n",
" 3,\n",
" 5,\n",
" 3,\n",
" 6,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 5,\n",
" 7,\n",
" 3,\n",
" 3,\n",
" 1,\n",
" 5,\n",
" 1,\n",
" 4,\n",
" 6,\n",
" 8,\n",
" 6,\n",
" 1,\n",
" 3,\n",
" 4,\n",
" 2,\n",
" 9,\n",
" 1,\n",
" 2,\n",
" 4,\n",
" 6,\n",
" 8,\n",
" 3,\n",
" 6,\n",
" 9,\n",
" 6,\n",
" 9,\n",
" 9,\n",
" 1,\n",
" 9,\n",
" 7,\n",
" 4,\n",
" 1,\n",
" 3,\n",
" 6,\n",
" 8,\n",
" 1,\n",
" 3,\n",
" 11,\n",
" 7,\n",
" 8,\n",
" 7,\n",
" 2,\n",
" 4,\n",
" 8,\n",
" 10,\n",
" 1,\n",
" 2,\n",
" 7,\n",
" 4,\n",
" 5,\n",
" 4,\n",
" 1,\n",
" 2,\n",
" 4,\n",
" 5,\n",
" 1,\n",
" 1,\n",
" 2,\n",
" 5,\n",
" 5,\n",
" 7,\n",
" 8,\n",
" 1,\n",
" 5,\n",
" 1,\n",
" 5,\n",
" 8,\n",
" 3,\n",
" 1,\n",
" 5,\n",
" 5,\n",
" 7,\n",
" 1,\n",
" 4,\n",
" 7,\n",
" 1,\n",
" 1,\n",
" 1,\n",
" 7,\n",
" 6,\n",
" 8,\n",
" 1,\n",
" 9,\n",
" 7,\n",
" 5,\n",
" 4,\n",
" 5,\n",
" 5,\n",
" 10,\n",
" 1,\n",
" 7,\n",
" 8,\n",
" 1,\n",
" 7,\n",
" 2,\n",
" 12,\n",
" 3,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 1,\n",
" 6,\n",
" 7,\n",
" 1,\n",
" 2,\n",
" 3,\n",
" 6,\n",
" 7,\n",
" 4,\n",
" 5,\n",
" 2,\n",
" 5,\n",
" 3,\n",
" 7,\n",
" 8,\n",
" 7,\n",
" 1,\n",
" 5,\n",
" 3,\n",
" 6,\n",
" 4,\n",
" 7,\n",
" 1,\n",
" 7,\n",
" 3,\n",
" 6,\n",
" 1,\n",
" 3,\n",
" 2,\n",
" 3,\n",
" 8,\n",
" 1,\n",
" 1,\n",
" 5,\n",
" 1,\n",
" 2,\n",
" 2,\n",
" 4,\n",
" 10,\n",
" 3,\n",
" 6,\n",
" 5,\n",
" 1,\n",
" 6,\n",
" 6,\n",
" 4,\n",
" 1,\n",
" 1,\n",
" 6,\n",
" 3,\n",
" 1,\n",
" 4,\n",
" 4,\n",
" 7,\n",
" 4,\n",
" 7,\n",
" 1,\n",
" 4,\n",
" 8,\n",
" 1,\n",
" 5,\n",
" 1,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 2,\n",
" 2,\n",
" 4,\n",
" 1,\n",
" 4,\n",
" 8,\n",
" 8,\n",
" 1,\n",
" 1,\n",
" 6,\n",
" 3,\n",
" 7,\n",
" 3,\n",
" 5,\n",
" 1,\n",
" 8,\n",
" 2,\n",
" 1,\n",
" 12,\n",
" 7,\n",
" 1,\n",
" 2,\n",
" 7,\n",
" 2,\n",
" 7,\n",
" 6,\n",
" 1,\n",
" 1,\n",
" 6,\n",
" 1,\n",
" 4,\n",
" 5,\n",
" 3,\n",
" 4,\n",
" 1,\n",
" 5,\n",
" 8,\n",
" 2,\n",
" 2,\n",
" 5,\n",
" 7,\n",
" 1,\n",
" 7,\n",
" 1,\n",
" 4,\n",
" 6,\n",
" 3,\n",
" 7,\n",
" 7,\n",
" 1,\n",
" 1,\n",
" 3,\n",
" 1,\n",
" 4,\n",
" 5,\n",
" 7,\n",
" 1,\n",
" 9,\n",
" 3,\n",
" 2,\n",
" 4,\n",
" 1,\n",
" 5,\n",
" 3,\n",
" 3,\n",
" 7,\n",
" 7,\n",
" 1,\n",
" 1,\n",
" 5,\n",
" 4,\n",
" 8,\n",
" 1,\n",
" 9,\n",
" 3,\n",
" 2,\n",
" 7,\n",
" 1,\n",
" 3,\n",
" 5,\n",
" 1,\n",
" 5,\n",
" 1,\n",
" 5,\n",
" 3,\n",
" 6,\n",
" 5,\n",
" 2,\n",
" 12,\n",
" 7,\n",
" 2,\n",
" 2,\n",
" 1,\n",
" 2,\n",
" 2,\n",
" 1,\n",
" 4,\n",
" 5,\n",
" 3,\n",
" 10,\n",
" 1,\n",
" 1,\n",
" 6,\n",
" 3,\n",
" 3,\n",
" 3,\n",
" 2,\n",
" 8,\n",
" 1,\n",
" 2,\n",
" 3,\n",
" 8,\n",
" 5,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 11,\n",
" 4,\n",
" 7,\n",
" 1,\n",
" 2,\n",
" 2,\n",
" 1,\n",
" 8,\n",
" 4,\n",
" 5,\n",
" 6,\n",
" 9,\n",
" 2,\n",
" 5,\n",
" 8,\n",
" 9,\n",
" 1,\n",
" 9,\n",
" 8,\n",
" 1,\n",
" 1,\n",
" 5,\n",
" 1,\n",
" 6,\n",
" 2,\n",
" 7,\n",
" 5,\n",
" 1,\n",
" 3,\n",
" 4,\n",
" 4,\n",
" 3,\n",
" 1,\n",
" 5,\n",
" 2,\n",
" 3,\n",
" 1,\n",
" 2,\n",
" 7,\n",
" 4,\n",
" 1,\n",
" 3,\n",
" 5,\n",
" 3,\n",
" 6,\n",
" 3,\n",
" 2,\n",
" 4,\n",
" 7,\n",
" 1,\n",
" 5,\n",
" 7,\n",
" 2,\n",
" 5,\n",
" 6,\n",
" 5,\n",
" 1,\n",
" 5,\n",
" 2,\n",
" 9,\n",
" 8,\n",
" 3,\n",
" 2,\n",
" 3,\n",
" 10,\n",
" 3,\n",
" 3,\n",
" 9,\n",
" 1,\n",
" 8,\n",
" 1,\n",
" 8,\n",
" 2,\n",
" 1,\n",
" 5,\n",
" 6,\n",
" 5,\n",
" 4,\n",
" 5,\n",
" 1,\n",
" 4,\n",
" 1,\n",
" 4,\n",
" 1,\n",
" 7,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 2,\n",
" 3,\n",
" 9,\n",
" 4,\n",
" 5,\n",
" 6,\n",
" 1,\n",
" 7,\n",
" 1,\n",
" 5,\n",
" 4,\n",
" 6,\n",
" 1,\n",
" 7,\n",
" 1,\n",
" 9,\n",
" 6,\n",
" 5,\n",
" 9,\n",
" 6,\n",
" 1,\n",
" 8,\n",
" 1,\n",
" 5,\n",
" 13,\n",
" 2,\n",
" 8,\n",
" 3,\n",
" 5,\n",
" 9,\n",
" 9,\n",
" 8,\n",
" 1,\n",
" 3,\n",
" 2,\n",
" 8,\n",
" 6,\n",
" 4,\n",
" 3,\n",
" 4,\n",
" 6,\n",
" 1,\n",
" 3,\n",
" 2,\n",
" 3,\n",
" 4,\n",
" 1,\n",
" 5,\n",
" 11,\n",
" 1,\n",
" 2,\n",
" 4,\n",
" 1,\n",
" 5,\n",
" 7,\n",
" 5,\n",
" 1,\n",
" 4,\n",
" 13,\n",
" 3,\n",
" 1,\n",
" 11,\n",
" 1,\n",
" 9,\n",
" 1,\n",
" 7,\n",
" 5,\n",
" 1,\n",
" 1,\n",
" 6,\n",
" 1,\n",
" 1,\n",
" 2,\n",
" 2,\n",
" 5,\n",
" 6,\n",
" 6,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 5,\n",
" 9,\n",
" 1,\n",
" 8,\n",
" 9,\n",
" 5,\n",
" 1,\n",
" 11,\n",
" 4,\n",
" 4,\n",
" 1,\n",
" 7,\n",
" 1,\n",
" 1,\n",
" 3,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 3,\n",
" 2,\n",
" 1,\n",
" 3,\n",
" 7,\n",
" 1,\n",
" 2,\n",
" 7,\n",
" 6,\n",
" 1,\n",
" 1,\n",
" 2,\n",
" 5,\n",
" 3,\n",
" 6,\n",
" 4,\n",
" 1,\n",
" 2,\n",
" 3,\n",
" 10,\n",
" 9,\n",
" 1,\n",
" 1,\n",
" 4,\n",
" 1,\n",
" 7,\n",
" 1,\n",
" 1,\n",
" 2,\n",
" 2,\n",
" 1,\n",
" 6,\n",
" 1,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 1,\n",
" 2,\n",
" 1,\n",
" 6,\n",
" 1,\n",
" 1,\n",
" 7,\n",
" 5,\n",
" 1,\n",
" 7,\n",
" 2,\n",
" 6,\n",
" 6,\n",
" 1,\n",
" 5,\n",
" 3,\n",
" 5,\n",
" 4,\n",
" 1,\n",
" 5,\n",
" 2,\n",
" 10,\n",
" 9,\n",
" 1,\n",
" 10,\n",
" 1,\n",
" 3,\n",
" 7,\n",
" 7,\n",
" 11,\n",
" 3,\n",
" 5,\n",
" 1,\n",
" 5,\n",
" 7,\n",
" 2,\n",
" 7,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 7,\n",
" 2,\n",
" 1,\n",
" 1,\n",
" 4,\n",
" 5,\n",
" 4,\n",
" 4,\n",
" 4,\n",
" 8,\n",
" 6,\n",
" 1,\n",
" 7,\n",
" 9,\n",
" 1,\n",
" 2,\n",
" 7,\n",
" 4,\n",
" 7,\n",
" 3,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 2,\n",
" 5,\n",
" 2,\n",
" 5,\n",
" 6,\n",
" 5,\n",
" 9,\n",
" 1,\n",
" 1,\n",
" 8,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 8,\n",
" 1,\n",
" 7,\n",
" 3,\n",
" 2,\n",
" 8,\n",
" 4,\n",
" 1,\n",
" 6,\n",
" 1,\n",
" 3,\n",
" 7,\n",
" 1,\n",
" 5,\n",
" 5,\n",
" 10,\n",
" 1,\n",
" 3,\n",
" 4,\n",
" 3,\n",
" 7,\n",
" 1,\n",
" 1,\n",
" 8,\n",
" 7,\n",
" 1,\n",
" 5,\n",
" 2,\n",
" 3,\n",
" 7,\n",
" 3,\n",
" 5,\n",
" 9,\n",
" 1,\n",
" 5,\n",
" 1,\n",
" 5,\n",
" 3,\n",
" 7,\n",
" 1,\n",
" 1,\n",
" 3,\n",
" 5,\n",
" 2,\n",
" 9,\n",
" 4,\n",
" 4,\n",
" 1,\n",
" 4,\n",
" 6,\n",
" 5,\n",
" 8,\n",
" 9,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 1,\n",
" 8,\n",
" 3,\n",
" 8,\n",
" 3,\n",
" 1,\n",
" 2,\n",
" 2,\n",
" 1,\n",
" 2,\n",
" 2,\n",
" 2,\n",
" 1,\n",
" 3,\n",
" 4,\n",
" 9,\n",
" 2,\n",
" 5,\n",
" 11,\n",
" 1,\n",
" 4,\n",
" 10,\n",
" 3,\n",
" 1,\n",
" 9,\n",
" 1,\n",
" 7,\n",
" 1,\n",
" 3,\n",
" 1,\n",
" 2,\n",
" 2,\n",
" 8,\n",
" 6,\n",
" 6,\n",
" 1,\n",
" 5,\n",
" 4,\n",
" 3,\n",
" 6,\n",
" 1,\n",
" 5,\n",
" 1,\n",
" 3,\n",
" 1,\n",
" 2,\n",
" 1,\n",
" 8,\n",
" 5,\n",
" 3,\n",
" 5,\n",
" 2,\n",
" 4,\n",
" 1,\n",
" 1,\n",
" 9,\n",
" 1,\n",
" 3,\n",
" 1,\n",
" 2,\n",
" 4,\n",
" 8,\n",
" 6,\n",
" 7,\n",
" 1,\n",
" 2,\n",
" 4,\n",
" 4,\n",
" 1,\n",
" 1,\n",
" 4,\n",
" 4,\n",
" 5,\n",
" 1,\n",
" 2,\n",
" 3,\n",
" 1,\n",
" 5,\n",
" 8,\n",
" 2,\n",
" 9,\n",
" 10,\n",
" 3,\n",
" 7,\n",
" 1,\n",
" 2,\n",
" 6,\n",
" 10,\n",
" 6,\n",
" 5,\n",
" 6,\n",
" 2,\n",
" 4,\n",
" 2,\n",
" 4,\n",
" 5,\n",
" 2,\n",
" 7,\n",
" 1,\n",
" 4,\n",
" 2,\n",
" 3,\n",
" 11,\n",
" 4,\n",
" 4,\n",
" 1,\n",
" 3,\n",
" 2,\n",
" 5,\n",
" 6,\n",
" 5,\n",
" 1,\n",
" 4,\n",
" 2,\n",
" 1,\n",
" 5,\n",
" 3,\n",
" 2,\n",
" 8,\n",
" 9,\n",
" 1,\n",
" 3,\n",
" 2,\n",
" 3,\n",
" 1,\n",
" 5,\n",
" 3,\n",
" 7,\n",
" 1,\n",
" 2,\n",
" 9,\n",
" 9,\n",
" 1,\n",
" 4,\n",
" 4,\n",
" 7,\n",
" 3,\n",
" 4,\n",
" 1,\n",
" 7,\n",
" 1,\n",
" 6,\n",
" 4,\n",
" 3,\n",
" 2,\n",
" 1,\n",
" 8,\n",
" 1,\n",
" 2,\n",
" 5,\n",
" 5,\n",
" 1,\n",
" 2,\n",
" 1,\n",
" 8,\n",
" 1,\n",
" 2,\n",
" 4,\n",
" 4,\n",
" 1,\n",
" 7,\n",
" 10,\n",
" 2,\n",
" 9,\n",
" 1,\n",
" 6,\n",
" 3,\n",
" 7,\n",
" 9,\n",
" 7,\n",
" 2,\n",
" 4,\n",
" 5,\n",
" 7,\n",
" 1,\n",
" 2,\n",
" 3,\n",
" 4,\n",
" 1,\n",
" 2,\n",
" 7,\n",
" 2,\n",
" 7,\n",
" 6,\n",
" 1,\n",
" 7,\n",
" 2,\n",
" 6,\n",
" 1,\n",
" 7,\n",
" 2,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 5,\n",
" 9,\n",
" 5,\n",
" 7,\n",
" 1,\n",
" 8,\n",
" 1,\n",
" 10,\n",
" 1,\n",
" 8,\n",
" 1,\n",
" 2,\n",
" 9,\n",
" 9,\n",
" 4,\n",
" 10,\n",
" 1,\n",
" 5,\n",
" 2,\n",
" 7,\n",
" 7,\n",
" 2,\n",
" 3,\n",
" 3,\n",
" 6,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 3,\n",
" 2,\n",
" 4,\n",
" 2,\n",
" 6,\n",
" 1,\n",
" 5,\n",
" 9,\n",
" 7,\n",
" 5,\n",
" 4,\n",
" 1,\n",
" 4,\n",
" 4,\n",
" 2,\n",
" 3,\n",
" 1,\n",
" 1,\n",
" 3,\n",
" 5,\n",
" 1,\n",
" 5,\n",
" 9,\n",
" 7,\n",
" 1,\n",
" 2,\n",
" 11,\n",
" 4,\n",
" 5,\n",
" 6,\n",
" 2,\n",
" 6,\n",
" 6,\n",
" 1,\n",
" 3,\n",
" 3,\n",
" 7,\n",
" 4,\n",
" 6,\n",
" 6,\n",
" 1,\n",
" 5,\n",
" 8,\n",
" 1,\n",
" 5,\n",
" 1,\n",
" 7,\n",
" 1,\n",
" 7,\n",
" 1,\n",
" 1,\n",
" 5,\n",
" 4,\n",
" 7,\n",
" 1,\n",
" 2,\n",
" 3,\n",
" 6,\n",
" 12,\n",
" 1,\n",
" 6,\n",
" 2,\n",
" 5,\n",
" 1,\n",
" 2,\n",
" 1,\n",
" 5,\n",
" 4,\n",
" 6,\n",
" ...]"
]
},
"execution_count": 115,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"words_len # można wygenerować barplot długosci słów"
]
},
{
"cell_type": "code",
"execution_count": 116,
"id": "9cef8cf2-620d-44b7-b4ef-1907f82076ce",
"metadata": {},
"outputs": [],
"source": [
"# można wygenerować barplot długości zdań"
]
},
{
"cell_type": "code",
"execution_count": 117,
"id": "8110a80a-c50a-41b7-8ade-9c76b4a20eec",
"metadata": {},
"outputs": [],
"source": [
"# można wygenerować barplot frekwencji znaków"
]
},
{
"cell_type": "markdown",
"id": "65e247f3-1946-488f-ab48-9574ab59d592",
"metadata": {},
"source": [
"# Zadanie"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "55ff3555-d666-432b-92cb-2e35ad1a4ba3",
"metadata": {},
"outputs": [],
"source": [
"Wykonaj analizę tekstu https://ocw.mit.edu/ans7870/6/6.006/s08/lecturenotes/files/t8.shakespeare.txt\n",
"Analizę wykonaj na tekście z użyciem tokenizera, stemmera, bez znaków interpunkcyjnych i bez angielskich stopwordów.\n",
"Wygeneruj podstawowe statystyki tekstu (ile jest znaków, słów, ...) oraz wykresy:\n",
"- częstości występowania słów\n",
"- częstości długości słów\n",
"- częstości długości zdań\n",
"- wordcloud\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}