Skeptic vs paranormal subreddits
================================
Classify a Reddit text as coming either from the Skeptic subreddit or from one of the
"paranormal" subreddits (Paranormal, UFOs, TheTruthIsHere, Ghosts,
Glitch-in-the-Matrix, conspiracytheories).
The output label is the probability that the text comes from a paranormal subreddit.
# PyTorch logistic regression
The code is in `Logistic.py`.
Trained models are saved with the `.pth` extension.
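
For orientation, here is a minimal sketch of what a single-layer logistic regression in PyTorch and its `.pth` round trip can look like. It is not the contents of `Logistic.py`: the class name, the toy data, the feature count, the hyperparameters, and the file name `logistic.pth` are all assumptions made for illustration.

```python
import os
import tempfile

import torch
from torch import nn


class LogisticRegression(nn.Module):
    """One linear layer followed by a sigmoid: p = sigmoid(Wx + b)."""

    def __init__(self, n_features: int):
        super().__init__()
        self.linear = nn.Linear(n_features, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output shape (batch,): probability of the positive class.
        return torch.sigmoid(self.linear(x)).squeeze(-1)


torch.manual_seed(0)

# Hypothetical toy data: 8 examples, 5 features, binary labels
# (1 = paranormal, 0 = skeptic). Real features would come from the text.
x = torch.rand(8, 5)
y = torch.tensor([0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0])

model = LogisticRegression(n_features=5)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.BCELoss()  # binary cross-entropy on probabilities

for epoch in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# Save only the weights (state_dict) to a .pth file, then restore them.
path = os.path.join(tempfile.mkdtemp(), "logistic.pth")
torch.save(model.state_dict(), path)

restored = LogisticRegression(n_features=5)
restored.load_state_dict(torch.load(path))
restored.eval()

with torch.no_grad():
    probs = restored(x)  # one probability per example
```

Saving the `state_dict` rather than the whole model object keeps the `.pth` file independent of the class definition's module path, at the cost of having to rebuild the model before loading.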
GEval results on `dev-0`:
```
$ ./geval -t dev-0
Likelihood 0.0000
Accuracy 0.7043
F1.0 0.4950
Precision 0.6257
Recall 0.4094
```
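
As a sanity check on how these numbers relate, the sketch below computes accuracy, precision, recall, and F1 from predicted probabilities. It is an illustration, not GEval's implementation: the function name `binary_metrics`, the 0.5 decision threshold, and the toy inputs are assumptions. The last lines confirm that the reported F1 is the harmonic mean of the reported precision and recall.

```python
def binary_metrics(probs, labels, threshold=0.5):
    """Return (accuracy, precision, recall, f1) for binary labels.

    probs: predicted probabilities of the positive ("paranormal") class.
    labels: gold labels, 1 for positive and 0 for negative.
    threshold: assumed cut-off for turning a probability into a label.
    """
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)

    accuracy = (tp + tn) / len(labels)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical toy example: 5 items, 2 of them positive.
toy = binary_metrics([0.9, 0.2, 0.7, 0.4, 0.1], [1, 0, 0, 1, 0])

# Consistency check using the precision and recall reported above.
reported_precision, reported_recall = 0.6257, 0.4094
f1_from_reported = (2 * reported_precision * reported_recall
                    / (reported_precision + reported_recall))
print(round(f1_from_reported, 4))
```

With recall well below precision, the model misses more paranormal texts than it mislabels skeptic ones, which is what pulls F1 under 0.5 despite 70% accuracy.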
Training logs are copied into `l1_epochs.txt` (single-layer model) and `l2_epochs.txt` (two-layer model).

Sources
-------
Data taken from <https://archive.org/details/2015_reddit_comments_corpus>.