265 Commits

Author SHA1 Message Date
Filip Gralinski
9322307813 Handle more than one possibility in TokenAccuracy 2018-10-24 08:02:34 +02:00
Filip Gralinski
2e816c4e38 Add TokenAccuracy metric 2018-10-23 17:01:33 +02:00
Filip Gralinski
f814fc2c79 Merge branch 'warnings' of https://gitlab.com/fintara/geval 2018-10-23 08:46:49 +02:00
8735610745 Implement soft f-score 2018-10-17 22:41:46 +02:00
Filip Gralinski
5dc6e13191 Add Pearson and Spearman correlation measures 2018-09-27 21:52:02 +02:00
Filip Gralinski
b3800bc1d9 Add Macro-F-measure to help 2018-09-27 18:23:26 +02:00
782c556f8c Add Macro-F1 metric 2018-09-27 18:21:56 +02:00
Filip Gralinski
3a852ed081 Speed up GLEU (cntd.) 2018-09-26 22:27:59 +02:00
Filip Gralinski
eb395d9be0 Add WER metric 2018-09-25 08:13:57 +02:00
Filip Gralinski
4f09a1802f Speed up GLEU 2018-09-25 07:10:17 +02:00
Filip Gralinski
b419aa7b08 Handle dot decimal separator in parameters 2018-09-12 20:52:56 +02:00
83b6b39fca Fix error message 2018-09-12 13:48:31 +02:00
b46caaa702 Fix README.md generated for LikelihoodHashed 2018-09-12 12:44:36 +02:00
Filip Gralinski
5cff29cf06 Add GLEU 2018-09-11 08:03:07 +02:00
c6d48c57f6 improve documentation on geval --submit 2018-09-01 16:39:00 +02:00
e2c3102cc4 check whether the remote tracking branch exists 2018-09-01 14:39:34 +02:00
eaa791cf2f improvement for "submit" special command 2018-08-28 18:58:51 +02:00
Piotr Halama
bd7c789bae Implement --submit command 2018-08-27 17:57:07 +02:00
Filip Gralinski
421d2e9797 add minimalistic tokenizer 2018-08-17 18:13:27 +02:00
Filip Gralinski
c79c4b356e fix some warnings 2018-08-17 17:52:41 +02:00
Filip Gralinski
8b7a18b4c7 v14 tokenizer added 2018-08-17 17:45:01 +02:00
Filip Gralinski
5e5a58210e use tokenization when looking for worst features 2018-08-17 17:27:25 +02:00
Filip Gralinski
0871b57bbc add --just-tokenize option 2018-08-17 16:57:47 +02:00
83550688ce first tokenizer 2018-08-13 10:09:55 +02:00
d3da3a0ca5 WIP 2018-08-13 07:39:06 +02:00
8388ab4d27 towards tokenization 2018-08-11 22:59:43 +02:00
de52a12b03 export some functions from OptionsParser 2018-08-10 16:09:41 +02:00
5098225bc1 improvements in challenge creation 2018-08-10 13:05:42 +02:00
e10f92cf9c create challenge with MultiLabelLikelihood/LogLoss 2018-08-09 16:35:31 +02:00
efcceae26a implement MultiLabel-LogLoss and MultiLabel-Likelihood 2018-08-09 16:00:19 +02:00
bd2bfde287 MultiLabel-F1 works on labels given with probs now 2018-08-09 14:08:54 +02:00
82bdf70031 add missing metric to help 2018-08-09 12:47:52 +02:00
da2114e6d2 reverse sides when diffing 2018-08-07 16:21:37 +02:00
e55b8539f1 option -r can be used with -m 2018-08-07 15:55:04 +02:00
c385710719 showing most worsening features 2018-08-06 22:22:33 +02:00
3f3d1fd287 refactor worst features 2018-08-06 21:34:38 +02:00
7503644bbe sort in --worst-features 2018-08-06 12:09:31 +02:00
bc1de4c3e6 worst features show average score now 2018-08-06 11:59:04 +02:00
51abed6fa4 count the number of lines correctly 2018-08-03 11:16:28 +02:00
8dac79fab2 clean up listing worst features 2018-08-02 22:09:25 +02:00
020b93ccf8 p-value for features counted 2018-08-02 12:50:13 +02:00
f8418894fb Merge branch 'worst-features' of ssh://gonito.net/geval into worst-features 2018-08-02 08:31:08 +02:00
cd30d88998 fix some warnings 2018-08-02 08:29:52 +02:00
2b1cf80601 implement ranking conduit 2018-08-01 22:39:34 +02:00
4b3a4fa665 implement MultiLabel-F metric 2018-07-26 13:01:10 +02:00
c0fd359590 refactor for Gonito 2018-07-14 09:48:45 +02:00
0c6032d166 print params 2018-07-10 16:22:28 +02:00
9f5882719b param can take an empty value 2018-07-10 12:10:02 +02:00
ab635f2594 add helper function for parsing params in file paths 2018-07-10 11:18:52 +02:00
0708b746a9 fix handling compressed files 2018-06-29 16:59:00 +02:00
010f0f46ab export function needed by Gonito 2018-06-28 17:00:18 +02:00
1278081a48 results are sorted in the natural manner when multiple outputs are evaluated 2018-06-28 16:32:46 +02:00
338ddb7fbf fully handle multiple outputs 2018-06-28 16:22:22 +02:00
ba26cdb9e0 multiple outs are recognised but not handled 2018-06-28 15:36:47 +02:00
656a194f42 start refactoring to enable evaluating multiple outputs 2018-06-28 14:49:44 +02:00
Tsvetan Ovedenski
f6ad2f0a85 Remove warnings in Core 2018-06-20 11:48:03 +02:00
Filip Gralinski
0a2e1fcc32 docs on PrecisionAndRecall 2018-06-13 15:36:23 +02:00
e0e06196f0 Merge branch 'handle-version-option' into 'master': Added version flag handling, added changelog. Closes #7. See merge request filipg/geval!1 2018-06-13 10:46:34 +00:00
Filip Gralinski
012578f32a implement mean absolute error 2018-06-13 12:30:11 +02:00
Tomasz Weissbek
964957b1db Added version flag handling, added changelog 2018-06-13 12:19:06 +02:00
1073407760 improve documentation 2018-06-12 21:52:18 +02:00
86d50b92b7 multiple metrics can be specified 2018-06-08 12:38:45 +02:00
ffb24509d7 handle http(s):// 2018-06-02 23:27:49 +02:00
57ee8a1296 switch to smart sources 2018-06-02 20:24:34 +02:00
18ed47322e Merge branch 'master' into smart-conduit 2018-06-02 16:31:36 +02:00
f9dfbc1466 accuracy can work on probability distributions now 2018-06-02 12:24:14 +02:00
d370e375a0 add --alt-metric option 2018-06-02 11:29:54 +02:00
4768931221 add BIO-F1-Labels metric 2018-05-29 22:04:19 +02:00
65e8d2562e underscores can be used in the BIO format 2018-05-29 20:59:00 +02:00
3f7384f467 add --sort and --reverse-sort options 2018-05-28 10:04:27 +02:00
ab1056301e add sorting for --line-by-line internally 2018-05-28 09:45:08 +02:00
f68223409e add test for the line-by-line mode 2018-05-26 21:10:22 +02:00
cb655cd2ae refactor LineByLine 2018-05-26 14:40:26 +02:00
c71c7a019d remove warning in LineByLine.hs 2018-05-26 13:09:06 +02:00
881a77e239 better diagnostic messages for BIO 2018-05-25 14:44:19 +02:00
3e201d11ef update for Stack LTS 11.9 2018-05-19 13:49:53 +02:00
192d531969 add likelihood as evaluation metrics 2018-05-17 15:21:03 +02:00
438f013914 automatic decompression 2018-05-17 08:26:57 +02:00
01b93dd243 improve help for geval --init 2018-05-16 21:00:45 +02:00
b01f9439b7 log probs 2018-05-16 20:59:40 +02:00
82e794ae3c implement BIO-F1 2018-05-16 10:51:50 +02:00
9fc4beaba1 improve sample challenge for LogLossHashed 2018-05-15 08:14:52 +02:00
06fd093349 probs can be given for LogLossHashed 2018-05-15 08:07:47 +02:00
bdcd26cddc WIP 2018-05-14 10:37:58 +02:00
cea084c789 accuracy can work on probs now 2018-04-07 21:13:37 +02:00
ff8ec8880e add LogLoss 2018-04-07 08:29:58 +02:00
9d4aab5f2c diff 2018-02-20 21:28:14 +01:00
88f69156e7 refactor code 2018-02-20 21:28:14 +01:00
5ae8036efc add short options, improve help 2018-02-20 21:28:14 +01:00
b51944b930 --init considers --precision now 2018-02-20 21:28:14 +01:00
6cfefed0c1 precision is part of specification now 2018-02-20 21:28:14 +01:00
f32564a42a write wrong line number correctly in line-by-line mode 2018-02-20 21:28:14 +01:00
5c00ab6d26 show line number when something wrong 2018-02-20 21:28:14 +01:00
b323e6148c refactor parse errors (use Either instead of throwing an error) 2018-02-20 21:28:14 +01:00
c70d49c418 add line-by-line mode 2018-02-20 21:28:13 +01:00
a7d2ed8c21 refactor 2018-02-20 21:28:13 +01:00
11b43b3a2a introduce special command 2018-02-20 21:28:13 +01:00
a2814f2d12 add function for evaluating single lines 2018-02-20 21:28:13 +01:00
8d87ee4c4b refactor Core so that any conduit source could be accepted, not just file names 2018-02-20 21:28:13 +01:00
9643719193 add MAP metric 2018-02-20 21:28:13 +01:00
c10f3579c6 fix BLEU for empty output 2018-02-20 21:28:13 +01:00
f289cafc03 upgrade to Stack LTS 9.5 2018-02-20 21:28:13 +01:00
54c899ddfc generating sample CharMatch challenge, CharMatch is F0.5 now 2018-02-20 21:28:13 +01:00
72dbf33b8d make it possible to cover metrics operating on the input, add CharMatch metric 2018-02-20 21:28:13 +01:00
6144ae6bdf add sample toy challenge for LogLossHashed 2018-02-20 21:28:13 +01:00
59f19cbe18 change default size of hash 2018-02-20 21:28:13 +01:00
b058cd0095 implement softmax in LogLossHashed 2018-02-20 21:28:13 +01:00
e84a14d069 salt LogLossHashed with line numbers 2018-02-20 21:28:13 +01:00
0e9c44a5b5 start working on LogLossHashed 2018-02-20 21:28:13 +01:00
073d92a4e7 add sample challenge for NMI 2018-02-20 21:28:13 +01:00
37c31e6075 NMI implemented as geval metric 2018-02-20 21:28:13 +01:00
6f428d6496 add NMI 2018-02-20 21:28:13 +01:00
595b2c9650 fix purity 2018-02-20 21:28:13 +01:00
065a3ce9cd add auxiliary function for calculating purity 2018-02-20 21:28:13 +01:00
422d7c63d9 make UTF8 decoding lenient 2018-02-20 21:28:13 +01:00
6dc293a4b9 add more explanation for the F-measure example 2018-02-20 21:28:13 +01:00
8e87e97f2d add F-measure 2018-02-20 21:28:13 +01:00
c3857e8183 fix help 2018-02-20 21:28:12 +01:00
67f73f420e ClippEU passes tests 2018-02-20 21:28:12 +01:00
d118e15353 refactor 2018-02-20 21:28:12 +01:00
c3a6d94d1c start work on ClippEU 2018-02-20 21:28:12 +01:00
c3106a1ad6 finish general procedure for precision, recall and F-measure 2018-02-20 21:28:12 +01:00
Filip Gralinski
b14e6a38a2 start implementing general precision/recall 2018-02-20 21:28:12 +01:00
b4e5dcbd9d add getOptions for extracting options without running the evaluation 2018-02-20 21:28:12 +01:00
a86828e3a8 forgotten export 2018-02-20 21:28:12 +01:00
995beb6dbc handle metric ordering 2018-02-20 21:28:12 +01:00
e979d6b2df minor clean-up 2018-02-20 21:28:12 +01:00
fbe71aef92 minor fix 2018-02-20 21:28:12 +01:00
e66a8d8341 handle numbers combined with text 2018-02-20 21:28:12 +01:00
b52819f67e check whether data is OK 2018-02-20 21:28:12 +01:00
7f3973890d check emptiness 2018-02-20 21:28:12 +01:00
17d39c4293 check the number of lines 2018-02-20 21:28:12 +01:00
9e96f5dfe4 generate sample classification challenge 2018-02-20 21:28:12 +01:00
cf6f287763 handle Accuracy 2018-02-20 21:28:12 +01:00
Filip Gralinski
e55708a8f0 small fixes 2018-02-20 21:28:12 +01:00
085ef0ec9e add --precision option 2018-02-20 21:28:12 +01:00
e96f16b829 info about title and description 2018-02-20 21:28:12 +01:00
290d4db503 README.md for sample BLEU challenge 2018-02-20 21:28:12 +01:00
5892952e5b fix typos 2018-02-20 21:28:12 +01:00
1c59d26898 enhance README in sample challenge 2018-02-20 21:28:12 +01:00
d455ffc1bf create .gitignore file with --init 2018-02-20 21:28:12 +01:00
e6b3d92913 --init handles BLEU metric now 2018-02-20 21:28:12 +01:00
Filip Gralinski
8a944c17d0 BLEU done 2018-02-20 21:28:12 +01:00
7b009b048a BLEU cntd. 2018-02-20 21:28:11 +01:00
cc514d085d generalize gevalcore 2018-02-20 21:28:11 +01:00
1e444ca3ec refactor gevalCore 2018-02-20 21:28:11 +01:00
bf4b91f8f8 start work on BLEU 2018-02-20 21:28:11 +01:00
1ebe413d82 --init working 2018-02-20 21:28:11 +01:00
cf5e473eee create a challenge, cntd. 2018-02-20 21:28:11 +01:00
f6abc0f485 creating a challenge continued, fixed a typo 2018-02-20 21:28:11 +01:00