siulkilulki 9b76f4e8aa Add robust recrawling of not completed data.
Add annotator.py (highlighting hour within context done)
Enhance parish2text.py (enable more flags, convert button)
2018-04-16 23:54:03 +02:00
parishwebsites Add robust recrawling of not completed data. 2018-04-16 23:54:03 +02:00
scraper Modify error logging in get_parishes_url. Enhance crawl_deon.py 2018-04-06 23:33:18 +02:00
.gitignore Initial commit 2017-03-10 16:05:59 +01:00
annotator.py Add robust recrawling of not completed data. 2018-04-16 23:54:03 +02:00
environment.yml Add domain-blacklist.txt, domain filter, modify crawler. 2018-04-09 23:53:36 +02:00
LICENSE Initial commit 2017-03-10 16:05:59 +01:00
Makefile Add robust recrawling of not completed data. 2018-04-16 23:54:03 +02:00
prepare-environment.sh Switch to pure html download. Enhanced urls filtering. 2018-03-11 18:02:31 +01:00
README.md Update README.md 2018-04-06 23:43:14 +02:00
temat.md Update temat.md 2017-03-14 17:11:33 +01:00

mass-scraper

Polish masses project. beeminder update