Commit Graph

151 Commits

Author SHA1 Message Date
6fa502ccc2 Test challenge creation & validation 2019-08-10 13:00:29 +02:00
b4ad774623 Trying to get references 2019-05-23 16:16:05 +02:00
eb10a4c3b4 Add plotting graphs for selected metrics 2019-03-19 07:31:17 +01:00
ae27029f61 Implement Probabilistic-Soft-F1 2019-03-12 22:35:19 +01:00
8393bec3ae Implement auxiliary calibration function 2019-03-12 08:58:21 +01:00
1a9fe36a9e Handle JSONl (for MultiLabel-F) 2019-02-14 19:01:53 +01:00
709eeec4ef Merge branch 'master' into jsonl 2019-02-14 16:35:41 +01:00
Filip Gralinski
26e9735d31 Handle jsonl files 2019-02-14 10:54:25 +01:00
Filip Gralinski
872724722a Fix SMAPE on zero values 2019-02-12 08:36:52 +01:00
138b77688b Add test for SMAPE 2019-02-01 12:05:22 +01:00
f7bd1b2ccd Add missing file, when generating a challenge 2019-02-01 12:04:52 +01:00
abcce9bf68 Add numerical features 2019-02-01 10:58:29 +01:00
ea5de5c719 Introduce existential features 2019-01-26 17:18:41 +01:00
Filip Gralinski
dbf5c961af Start numerical factors 2019-01-23 13:00:37 +01:00
39bc3964b3 Speed up cartesian features 2019-01-10 22:53:43 +01:00
212457077f Consider word shapes in black-box debugging 2019-01-10 09:58:04 +01:00
5d19fc7585 Add character-by-character tokenization. 2018-12-17 07:54:12 +01:00
60a8c96aa8 Fix tests 2018-12-07 09:22:55 +01:00
Filip Gralinski
9322307813 Handle more than one possibility in TokenAccuracy 2018-10-24 08:02:34 +02:00
Filip Gralinski
2e816c4e38 Add TokenAccuracy metric 2018-10-23 17:01:33 +02:00
Filip Gralinski
30c37c2b40 Merge branch 'master' of git.applica.pl:piotr.halama/geval 2018-10-23 08:50:04 +02:00
Filip Gralinski
f814fc2c79 Merge branch 'warnings' of https://gitlab.com/fintara/geval 2018-10-23 08:46:49 +02:00
Piotr Halama
dc1618a0ec Use correct temporary directory 2018-10-22 13:32:36 +02:00
8735610745 Implement soft f-score 2018-10-17 22:41:46 +02:00
Filip Gralinski
5dc6e13191 Add Pearson and Spearman correlation measures 2018-09-27 21:52:02 +02:00
782c556f8c Add Macro-F1 metric 2018-09-27 18:21:56 +02:00
Filip Gralinski
eb395d9be0 Add WER metric 2018-09-25 08:13:57 +02:00
Filip Gralinski
9b77f08876 more tests for GLEU 2018-09-12 20:37:44 +02:00
Filip Gralinski
5cff29cf06 Add GLEU 2018-09-11 08:03:07 +02:00
eaa791cf2f improvement for "submit" special command 2018-08-28 18:58:51 +02:00
Piotr Halama
bd7c789bae Implement --submit command 2018-08-27 17:57:07 +02:00
Filip Gralinski
0871b57bbc add --just-tokenize option 2018-08-17 16:57:47 +02:00
83550688ce first tokenizer 2018-08-13 10:09:55 +02:00
d3da3a0ca5 WIP 2018-08-13 07:39:06 +02:00
8388ab4d27 towards tokenization 2018-08-11 22:59:43 +02:00
efcceae26a implement MultiLabel-LogLoss and MultiLabel-Likelihood 2018-08-09 16:00:19 +02:00
bd2bfde287 MultiLabel-F1 works on labels given with probs now 2018-08-09 14:08:54 +02:00
6376063a0c more ranking tests 2018-08-03 08:23:55 +02:00
2b1cf80601 implement ranking conduit 2018-08-01 22:39:34 +02:00
4b3a4fa665 implement MultiLabel-F metric 2018-07-26 13:01:10 +02:00
9f5882719b param can take an empty value 2018-07-10 12:10:02 +02:00
ab635f2594 add helper function for parsing params in file paths 2018-07-10 11:18:52 +02:00
656a194f42 start refactoring to enable evaluating multiple outputs 2018-06-28 14:49:44 +02:00
Tsvetan Ovedenski
9c462bdf44 Remove warnings in Spec 2018-06-20 11:57:11 +02:00
Filip Gralinski
012578f32a implement mean absolute error 2018-06-13 12:30:11 +02:00
86d50b92b7 multiple metrics can be specified 2018-06-08 12:38:45 +02:00
ffb24509d7 handle http(s):// 2018-06-02 23:27:49 +02:00
57ee8a1296 switch to smart sources 2018-06-02 20:24:34 +02:00
18ed47322e Merge branch 'master' into smart-conduit 2018-06-02 16:31:36 +02:00
f9dfbc1466 accuracy can work on probablity distributions now 2018-06-02 12:24:14 +02:00
d370e375a0 add --alt-metric option 2018-06-02 11:29:54 +02:00
4768931221 add BIO-F1-Labels metric 2018-05-29 22:04:19 +02:00
65e8d2562e underscores can be used in the BIO format 2018-05-29 20:59:00 +02:00
ab1056301e add sorting for --line-by-line internally 2018-05-28 09:45:08 +02:00
f68223409e add test for the line-by-line mode 2018-05-26 21:10:22 +02:00
881a77e239 better diagnostic messages for BIO 2018-05-25 14:44:19 +02:00
192d531969 add likelihood as evaluation metrics 2018-05-17 15:21:03 +02:00
438f013914 automatic decompression 2018-05-17 08:26:57 +02:00
b01f9439b7 log probs 2018-05-16 20:59:40 +02:00
82e794ae3c implement BIO-F1 2018-05-16 10:51:50 +02:00
06fd093349 probs can be given for LogLossHashed 2018-05-15 08:07:47 +02:00
bdcd26cddc WIP 2018-05-14 10:37:58 +02:00
cea084c789 accuracy can work on probs now 2018-04-07 21:13:37 +02:00
ff8ec8880e add LogLoss 2018-04-07 08:29:58 +02:00
5c00ab6d26 show line number when something wrong 2018-02-20 21:28:14 +01:00
a2814f2d12 add function for evaluating single lines 2018-02-20 21:28:13 +01:00
9643719193 add MAP metric 2018-02-20 21:28:13 +01:00
c10f3579c6 fix BLEU for empty output 2018-02-20 21:28:13 +01:00
54c899ddfc generating sample CharMatch challenge, CharMatch is F0.5 now 2018-02-20 21:28:13 +01:00
72dbf33b8d make it possible to cover metrics operating on the input, add CharMatch metric 2018-02-20 21:28:13 +01:00
b058cd0095 implement softmax in LogLossHashed 2018-02-20 21:28:13 +01:00
e84a14d069 salt LogLossHashed with line numbers 2018-02-20 21:28:13 +01:00
0e9c44a5b5 start working on LogLossHashed 2018-02-20 21:28:13 +01:00
37c31e6075 NMI implemented as geval metric 2018-02-20 21:28:13 +01:00
6f428d6496 add NMI 2018-02-20 21:28:13 +01:00
595b2c9650 fix purity 2018-02-20 21:28:13 +01:00
065a3ce9cd add auxiliary function for calculating purity 2018-02-20 21:28:13 +01:00
8e87e97f2d add F-measure 2018-02-20 21:28:13 +01:00
67f73f420e ClippEU passes tests 2018-02-20 21:28:12 +01:00
c3a6d94d1c start work on ClippEU 2018-02-20 21:28:12 +01:00
0835bc3a4e more tests 2018-02-20 21:28:12 +01:00
c3106a1ad6 finish general procedure for precision, recall and F-measure 2018-02-20 21:28:12 +01:00
Filip Gralinski
ea058c9763 prepare a simple test set for ClippEU 2018-02-20 21:28:12 +01:00
b4e5dcbd9d add getOptions for extracting options without running the evaluation 2018-02-20 21:28:12 +01:00
e66a8d8341 handle numbers combined with text 2018-02-20 21:28:12 +01:00
b52819f67e check whether data is OK 2018-02-20 21:28:12 +01:00
7f3973890d check emptiness 2018-02-20 21:28:12 +01:00
570411b702 refactor tests 2018-02-20 21:28:12 +01:00
17d39c4293 check the number of lines 2018-02-20 21:28:12 +01:00
cf6f287763 handle Accuracy 2018-02-20 21:28:12 +01:00
Filip Gralinski
8a944c17d0 BLEU done 2018-02-20 21:28:12 +01:00
5e6d89a94c add test case data 2018-02-20 21:28:11 +01:00
7b009b048a BLEU cntd. 2018-02-20 21:28:11 +01:00
bf4b91f8f8 start work on BLEU 2018-02-20 21:28:11 +01:00
c8fce1110e fix module names 2018-02-20 21:28:11 +01:00
bbf6b1ec43 add a function for running with args, reading config file 2018-02-20 21:28:11 +01:00
6290250125 introduce GEvalSpecification 2018-02-20 21:28:11 +01:00
33f4af1c38 rename 2018-02-20 21:28:11 +01:00
85ec1fdccb simple test passed 2018-02-20 21:28:11 +01:00
17844b5921 init cntd. 2018-02-20 21:28:11 +01:00