mass-scraper/parishwebsites/find-not-completed.sh
siulkilulki 9b76f4e8aa Add robust recrawling of not completed data.
Add annotator.py (highlighting hour within context done)
Enhance parish2text.py (enable more flags, convert button)
2018-04-16 23:54:03 +02:00


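#!/usr/bin/env bash
# Print the names of crawls that did not complete, one per line, deduplicated.
# File names are matched with [^:]* (assumed colon-free) so a later ":20" or ":OSError" is not swallowed.
(
  # Logs that recorded a disk-full OSError.
  grep -r "No space left on device" logs | grep -o "^logs/[^:]*:OSError" | sed -Ee 's@^logs/|:OSError$@@g'
  # Logs of crawls stopped by SIGTERM; ":20" is presumably the start of the timestamp year.
  grep -r 'Received SIGTERM' logs/ | grep -o '^logs/[^:]*:20' | sed -Ee 's@^logs/|:20$@@g'
  find data -empty -type f | sed -e 's@^data/@@'  # empty output files
) | sort -u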
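
The three checks can be tried on a throwaway fixture. The sketch below is illustrative only: the file names (`parish-a` to `parish-d`) and the log lines are invented, and the real log format may differ. It matches the file-name part with `[^:]*`, assuming names contain no colon, so that a timestamp such as `23:20:03` cannot be mistaken for the `:20` that follows the file name.

```shell
#!/usr/bin/env bash
# Hypothetical fixture: every file name and log line here is invented for illustration.
set -eu
tmp=$(mktemp -d)
mkdir -p "$tmp/logs" "$tmp/data"

# A crawl that ran out of disk space.
printf 'OSError: [Errno 28] No space left on device\n' > "$tmp/logs/parish-a"
# A crawl stopped by SIGTERM; the timestamp deliberately contains ":20".
printf '2018-04-16 23:20:03 INFO: Received SIGTERM, shutting down\n' > "$tmp/logs/parish-b"
# A crawl that finished normally and produced output.
printf '2018-04-16 10:00:00 INFO: Spider closed (finished)\n' > "$tmp/logs/parish-c"
printf '{"url": "x"}\n' > "$tmp/data/parish-c"
# A crawl that left an empty output file.
: > "$tmp/data/parish-d"

not_completed=$(
  cd "$tmp"
  {
    grep -r "No space left on device" logs | grep -o "^logs/[^:]*:OSError" | sed -Ee 's@^logs/|:OSError$@@g'
    grep -r 'Received SIGTERM' logs | grep -o '^logs/[^:]*:20' | sed -Ee 's@^logs/|:20$@@g'
    find data -empty -type f | sed -e 's@^data/@@'
  } | sort -u
)
rm -rf "$tmp"
echo "$not_completed"
```

On this fixture the script lists `parish-a`, `parish-b` and `parish-d`, while `parish-c` is left out because its log shows no error and its data file is non-empty.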