geval/test
Filip Gralinski 40bf850423 Add p<X> operations for choosing the most confident items 2021-10-26 06:48:12 +02:00
_submit-tests Implement --submit command 2018-08-27 17:57:07 +02:00
_validation/broken-metric Improve detecting wrong metrics during validation 2021-07-29 14:55:59 +02:00
accuracy-filtering Finish filtering 2020-05-13 15:34:16 +02:00
accuracy-flags-line-by-line Add test for line-by-line 2020-08-08 14:38:21 +02:00
accuracy-multiple-filtering Filtering works on multiple values 2020-05-20 10:50:01 +02:00
accuracy-on-sorted Introduce :S flag (sorting words within a line) 2019-11-25 21:31:17 +01:00
accuracy-probs accuracy can work on probs now 2018-04-07 21:13:37 +02:00
accuracy-simple handle Accuracy 2018-02-20 21:28:12 +01:00
accuracy-with-flags Add substitution operation 2020-01-11 17:02:49 +01:00
bio-f1-complex implement BIO-F1 2018-05-16 10:51:50 +02:00
bio-f1-complex-labels add BIO-F1-Labels metric 2018-05-29 22:04:19 +02:00
bio-f1-error better diagnostic messages for BIO 2018-05-25 14:44:19 +02:00
bio-f1-flags Fix line-by-line mode not working for BIO-F1 2021-10-09 18:28:14 +02:00
bio-f1-perfect implement BIO-F1 2018-05-16 10:51:50 +02:00
bio-f1-simple implement BIO-F1 2018-05-16 10:51:50 +02:00
bio-f1-simple-underscores underscores can be used in the BIO format 2018-05-29 20:59:00 +02:00
bio-weighted-f1-simple Add BIOWeightedF1 metric 2021-06-10 09:21:43 +02:00
bleu-complex BLEU done 2018-02-20 21:28:12 +01:00
bleu-complex-bootstrap Change Bootstrap option name 2020-01-27 22:52:15 +01:00
bleu-empty fix BLEU for empty output 2018-02-20 21:28:13 +01:00
bleu-perfect BLEU done 2018-02-20 21:28:12 +01:00
bleu-trivial add test case data 2018-02-20 21:28:11 +01:00
bleu-with-tokenization add --just-tokenize option 2018-08-17 16:57:47 +02:00
cer-mean-simple Add CER metric 2020-10-17 16:55:40 +02:00
cer-simple Add CER metric 2020-10-17 16:55:40 +02:00
cer-space-escaping Handle escaping spaces in configuration files 2020-10-17 18:56:30 +02:00
charmatch-complex make it possible to cover metrics operating on the input, add CharMatch metric 2018-02-20 21:28:13 +01:00
charmatch-complex-compressed automatic decompression 2018-05-17 08:26:57 +02:00
charmatch-no-input make it possible to cover metrics operating on the input, add CharMatch metric 2018-02-20 21:28:13 +01:00
charmatch-perfect make it possible to cover metrics operating on the input, add CharMatch metric 2018-02-20 21:28:13 +01:00
charmatch-simple make it possible to cover metrics operating on the input, add CharMatch metric 2018-02-20 21:28:13 +01:00
clippeu-simple ClippEU passes tests 2018-02-20 21:28:12 +01:00
dos-end-of-line Handle DOS/Windows end-of-lines 2021-06-30 09:33:07 +02:00
empty-output check emptiness 2018-02-20 21:28:12 +01:00
error-too-few-lines check the number of lines 2018-02-20 21:28:12 +01:00
error-too-many-lines check the number of lines 2018-02-20 21:28:12 +01:00
f-measure-all-false add F-measure 2018-02-20 21:28:13 +01:00
f-measure-perfect add F-measure 2018-02-20 21:28:13 +01:00
f-measure-simple add F-measure 2018-02-20 21:28:13 +01:00
f-measure-stupid add F-measure 2018-02-20 21:28:13 +01:00
f1-with-preprocessing Handle preprocessing operations for metrics 2019-08-12 17:50:48 +02:00
f2-simple add F-measure 2018-02-20 21:28:13 +01:00
files WIP 2018-05-14 10:37:58 +02:00
flags-case-fold Describe flags, add "c" and "t" flags. 2020-08-01 21:27:04 +02:00
flags-filter-and-match Fix filter and match combination 2021-06-10 15:01:10 +02:00
flags-filtering Describe flags, add "c" and "t" flags. 2020-08-01 21:27:04 +02:00
flags-lowercase Describe flags, add "c" and "t" flags. 2020-08-01 21:27:04 +02:00
flags-none Describe flags, add "c" and "t" flags. 2020-08-01 21:27:04 +02:00
flags-regexp-matching Describe flags, add "c" and "t" flags. 2020-08-01 21:27:04 +02:00
flags-regexp-matching-anchor Describe flags, add "c" and "t" flags. 2020-08-01 21:27:04 +02:00
flags-regexp-substitution Describe flags, add "c" and "t" flags. 2020-08-01 21:27:04 +02:00
flags-regexp-substitution-ref Describe flags, add "c" and "t" flags. 2020-08-01 21:27:04 +02:00
flags-regexp-token-matching Describe flags, add "c" and "t" flags. 2020-08-01 21:27:04 +02:00
flags-regexp-token-matching-anchor Describe flags, add "c" and "t" flags. 2020-08-01 21:27:04 +02:00
flags-sort Fixes in README (description of flags) 2020-08-01 21:37:48 +02:00
flags-uppercase Describe flags, add "c" and "t" flags. 2020-08-01 21:27:04 +02:00
flc-f1-multi-overlap Fix tests 2020-02-07 11:34:42 +01:00
flc-f1-simple Fix tests 2020-02-07 11:34:42 +01:00
fuzzy-match-accuracy Matching specification can be used for Accuracy 2021-07-19 16:37:43 +02:00
gleu-empty more tests for GLEU 2018-09-12 20:37:44 +02:00
gleu-perfect more tests for GLEU 2018-09-12 20:37:44 +02:00
gleu-simple Add GLEU 2018-09-11 08:03:07 +02:00
haversine Add Haversine metric 2021-06-06 20:20:25 +02:00
jsonl-simple Handle jsonl files 2019-02-14 10:54:25 +01:00
likelihood-hashed-not-normalized add likelihood as evaluation metrics 2018-05-17 15:21:03 +02:00
likelihood-simple add test for the line-by-line mode 2018-05-26 21:10:22 +02:00
log-loss-hashed-normalization log probs 2018-05-16 20:59:40 +02:00
log-loss-hashed-not-normalized implement softmax in LogLossHashed 2018-02-20 21:28:13 +01:00
log-loss-hashed-probs probs can be given for LogLossHashed 2018-05-15 08:07:47 +02:00
log-loss-hashed-probs-normalized probs can be given for LogLossHashed 2018-05-15 08:07:47 +02:00
log-loss-hashed-simple salt LogLossHashed with line numbers 2018-02-20 21:28:13 +01:00
logloss-perfect add LogLoss 2018-04-07 08:29:58 +02:00
logloss-simple add LogLoss 2018-04-07 08:29:58 +02:00
macro-f-measure-perfect Add Macro-F1 metric 2018-09-27 18:21:56 +02:00
macro-f1-simple Add Macro-F1 metric 2018-09-27 18:21:56 +02:00
mae-simple implement mean absolute error 2018-06-13 12:30:11 +02:00
map-simple add MAP metric 2018-02-20 21:28:13 +01:00
mean-multilabel-f1-simple Introduce :S flag (sorting words within a line) 2019-11-25 21:31:17 +01:00
mse-simple add a function for running with args, reading config file 2018-02-20 21:28:11 +01:00
mse-simple-headers Handle headers 2020-02-22 11:18:34 +01:00
multilabel-f1-ie Add new tests for MultiLabel-F1 2020-06-27 18:15:34 +02:00
multilabel-f1-ie-flags Add new tests for MultiLabel-F1 2020-06-27 18:15:34 +02:00
multilabel-f1-ie-fuzzy A dead-end when working on fuzzy matching 2020-07-01 18:24:45 +02:00
multilabel-f1-ie-fuzzy-harden Add hardening 2020-07-02 18:22:29 +02:00
multilabel-f1-ie-fuzzy-smart Add smart mode 2020-07-02 18:14:56 +02:00
multilabel-f1-ie-probs Fix bug with inconsistent handling of probs in MultiLabel-F1 2021-07-23 17:26:41 +02:00
multilabel-f1-simple Fix line-by-line mode for MultiLabel 2020-08-08 16:59:08 +02:00
multilabel-f1-with-probs MultiLabel-F1 works on labels given with probs now 2018-08-09 14:08:54 +02:00
multilabel-f1-with-probs-and-numbers MultiLabel-F1 works on labels given with probs now 2018-08-09 14:08:54 +02:00
multilabel-f2-simple implement MultiLabel-F metric 2018-07-26 13:01:10 +02:00
multilabel-likelihood-simple implement MultiLabel-LogLoss and MultiLabel-Likelihood 2018-08-09 16:00:19 +02:00
nmi-complex NMI implemented as geval metric 2018-02-20 21:28:13 +01:00
oracle-item-based Add --oracle-item-based option 2019-12-16 11:18:49 +01:00
perplexity-hashed-simple Add PerplexityHashed metric 2021-08-20 19:31:32 +02:00
probabilistic-f1-probs Implement Probabilistic-MultiLabel-F1 2019-09-07 14:16:06 +02:00
probabilistic-f1-simple Implement Probabilistic-MultiLabel-F1 2019-09-07 14:16:06 +02:00
probabilistic-soft-f1-calibrated Implement Probabilistic-Soft-F1 2019-03-12 22:35:19 +01:00
probabilistic-soft-f1-simple Implement Probabilistic-Soft-F1 2019-03-12 22:35:19 +01:00
rmse-simple add a function for running with args, reading config file 2018-02-20 21:28:11 +01:00
segment-accuracy-simple Add SegmentAccuracy metric 2019-11-18 18:35:01 +01:00
smape-simple Fix SMAPE on zero values 2019-02-12 08:36:52 +01:00
soft-f1-perfect Implement soft f-score 2018-10-17 22:41:46 +02:00
soft-f1-simple Implement soft f-score 2018-10-17 22:41:46 +02:00
soft2d-f1-one-pixel Soft2D-F... Metric is inclusive now. 2019-09-03 17:19:05 +02:00
soft2d-f1-simple Add Soft2D-F metric 2019-08-22 13:20:29 +02:00
spearman-simple Add Pearson and Spearman correlation measures 2018-09-27 21:52:02 +02:00
token-accuracy-simple Handle more than one possibility in TokenAccuracy 2018-10-24 08:02:34 +02:00
top-confidence Add p<X> operations for choosing the most confident items 2021-10-26 06:48:12 +02:00
unexpected-data check whether data is OK 2018-02-20 21:28:12 +01:00
unwanted-data handle numbers combined with text 2018-02-20 21:28:12 +01:00
wer-simple Change the meaning of WER 2019-12-21 16:03:52 +01:00
Spec.hs Add p<X> operations for choosing the most confident items 2021-10-26 06:48:12 +02:00
create-test.sh Add helper script for creating tests 2021-06-10 09:22:54 +02:00