siulkilulki e9c4dcd743 Tune download settings. Enable dummy cache with 7 days of expiration.
Fix generating spider commands.
Add redirected domain appending to allowed domains.
Configure loggers.
Add more meta info to *processed.txt
Enhance view raw data python jsonline viewer
2018-04-15 12:17:35 +02:00
extractor Prototype rule based masses extractor. 2018-03-01 14:40:13 +01:00
parishwebsites Tune download settings. Enable dummy cache with 7 days of expiration. 2018-04-15 12:17:35 +02:00
scraper Modify error logging in get_parishes_url. Enhance crawl_deon.py 2018-04-06 23:33:18 +02:00
.gitignore Initial commit 2017-03-10 16:05:59 +01:00
environment.yml Add domain-blacklist.txt, domain filter, modify crawler. 2018-04-09 23:53:36 +02:00
LICENSE Initial commit 2017-03-10 16:05:59 +01:00
Makefile Tune download settings. Enable dummy cache with 7 days of expiration. 2018-04-15 12:17:35 +02:00
prepare-environment.sh Switch to pure html download. Enhanced urls filtering. 2018-03-11 18:02:31 +01:00
README.md Update README.md 2018-04-06 23:43:14 +02:00
temat.md Update temat.md 2017-03-14 17:11:33 +01:00

mass-scraper

Polish masses project. beeminder update