paper-cutter/{{cookiecutter.paper_repo_name}}/helpers/pdf-to-plain-text.sh
Filip Gralinski 67c788fe69 Init from an internal repo.
Commit d5b6f8e831fc5c933af5ceb1267f51ef6af6c438
2020-11-24 08:33:07 +01:00

#!/bin/bash
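# Usage: pdf-to-plain-text.sh FILE.pdf
# Prints the PDF's text minus review boilerplate, bare headings/URLs, lines without two adjacent letters, and adjacent duplicates.
pdftotext "$1" - | grep -F -v 'Confidential Review Copy' | grep -P -v '^(ACL 2020 Submission \*\*\*\. Confidential Review Copy\. DO NOT DISTRIBUTE\.|Anonymous ACL submission|Abstract|Results|Conclusions|https?://\S+)\s*$' | grep '[^[:space:]]' | grep -E '[a-zA-Z]{2}' | perl -pe 's/\f//g' | uniq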