Commit Graph

320 Commits

Author SHA1 Message Date
659b122625 Make Bootstrap work on MultiLabel-F1 2020-01-27 22:14:13 +01:00
1cea36ac93 Helper functions for confidence bounds 2020-01-27 21:54:34 +01:00
ae2769b7b9 Implement bootstrap in GEval 2020-01-25 23:46:33 +01:00
deb14c6702 Add Bootstrap facilities 2020-01-25 22:05:11 +01:00
bfcd5aa631 Most evaluation metrics are handled with dependent types 2020-01-25 19:26:57 +01:00
608b1f9d73 Merge branch 'master' into bootstrap 2020-01-18 18:09:19 +01:00
e170c37864 Add substitution operation 2020-01-11 17:02:49 +01:00
5171cf0ac6 Results are presented as cross tables (if possible) 2020-01-04 20:48:36 +01:00
4ba61b6e6e Prepare helper functions for cross-tabs 2020-01-04 18:17:14 +01:00
01486d23aa Change the meaning of WER 2019-12-21 16:03:52 +01:00
ad30bb9384 Fix bug with preprocessing ops not handled in --line-by-line mode 2019-12-16 12:47:35 +01:00
9a3a28a813 Add --oracle-item-based option 2019-12-16 11:18:49 +01:00
d95e2878a6 Refactor line-by-line mode 2019-12-14 21:10:40 +01:00
2234efa107 Multiple metrics can be packed via "Cartesian" strings 2019-12-14 20:59:00 +01:00
5f532c71c7 Add setting priorities, names can be set multiple times. If more than one is given for a metric, they are concatenated (with spaces). 2019-12-14 19:58:02 +01:00
0826d457b2 Complete move to the new style of train files 2019-12-13 20:31:40 +01:00
26f20ba466 Fixes 2019-12-04 21:15:44 +01:00
6d95dee275 More fixes 2019-12-04 20:41:07 +01:00
76dae8e5c8 Merge branch 'master' into train-new-style 2019-12-04 20:23:03 +01:00
66e2350b1a Remove commented out code 2019-12-04 20:00:45 +01:00
74d999d4bf Towards new-style of train 2019-11-27 16:55:17 +01:00
cb4efe1d6b Introduce :S flag (sorting words within a line) 2019-11-25 21:31:17 +01:00
d83c1eeac1 Continue 2019-11-24 14:30:27 +01:00
839ad5ce47 Merge branch 'master' into bootstrap 2019-11-23 13:17:19 +01:00
3b7fe8a67c Continue work on refactor 2019-11-23 13:07:24 +01:00
03aacdef98 Add SegmentAccuracy metric 2019-11-18 18:35:01 +01:00
234bac19ce WIP 2019-11-07 12:33:46 +01:00
Filip Graliński
41fe0d2283 Make room for storing the results of bootstrap resampling 2019-11-02 16:44:13 +01:00
Filip Graliński
402ed73111 Add option for bootstrap resampling (not implemented yet) 2019-11-02 11:53:52 +01:00
Filip Graliński
d67061c230 Handle itemStep via dependent types 2019-11-02 11:11:53 +01:00
Filip Graliński
32969bb56a OutputParser handled by dependent types 2019-11-02 09:31:16 +01:00
d5d177bc8d Start using dependent types 2019-11-02 09:06:58 +01:00
6e20e79f5b Start working on using dependent types 2019-11-02 09:06:58 +01:00
huntekah
3001803c56 Add GLEU metric description #29 2019-10-29 19:12:26 +00:00
62954bf21e Merge branch 'master' of gitlab.com:filipg/geval 2019-10-15 20:51:48 +02:00
mateusz.hinc
f1e7cc792d fix typos, fix grammar, add --single-branch to git commands 2019-10-15 09:45:42 +02:00
fd12a55bcd Add helper function to options 2019-09-23 18:31:52 +02:00
00addc5620 Refactor code 2019-09-20 20:30:12 +02:00
f68bd8df74 Refactor towards dependent types 2019-09-11 22:39:47 +02:00
601dbcf677 Bump up version 2019-09-07 15:55:36 +02:00
5998f8a316 Documentation on Probabilistic-MultiLabel-F1 metric 2019-09-07 15:48:13 +02:00
b540cba7da Implement Probabilistic-MultiLabel-F1 2019-09-07 14:16:06 +02:00
c011ba3962 Towards generalised soft F-Measure 2019-09-07 13:07:14 +02:00
e5efd90bc3 Refactor - introduce Probability file 2019-09-07 12:34:45 +02:00
3266919da9 Merge branch 'master' of git.applica.pl:gonito/geval 2019-09-03 20:19:22 +02:00
029e3880f7 Soft2D-F... Metric is inclusive now. 2019-09-03 17:19:05 +02:00
636557a772 Fix numbers in description of Soft2D-F1 2019-08-23 07:27:03 +02:00
5f9d2b85c7 Fix typo 2019-08-22 17:26:59 +02:00
e4a6ed347d Change the meaning of Soft2D-F1 metric. Now it is averaged per line. 2019-08-22 17:07:32 +02:00
6b63740c4a Add Soft2D-F metric 2019-08-22 13:20:29 +02:00
dab2646798 Start working on --list-metrics options 2019-08-21 23:44:18 +02:00
4d069e8102 Handle preprocessing operations for metrics 2019-08-12 17:50:48 +02:00
4452095538 More checks in validation 2019-08-10 16:31:54 +02:00
9b79b8761d Check whether the maximum value is obtained during the validation 2019-08-10 15:55:51 +02:00
fcb5c454c4 Add forgotten file 2019-08-10 13:00:10 +02:00
2236899c3d Refactor (introduce GEval.Metric) 2019-08-10 12:30:17 +02:00
Karol Kaczmarek
19d231e140 Add simple validation of a challenge (--validate option) 2019-08-05 09:28:12 +02:00
b4ad774623 Trying to get references 2019-05-23 16:16:05 +02:00
780b7016c5 Refactor feature extraction 2019-05-23 10:03:26 +02:00
e5220d71d8 Clip loess and switch to gaussian in Calibration 2019-03-19 20:41:06 +01:00
de40851b5a Fix corner cases in calibration 2019-03-19 19:44:19 +01:00
Filip Gralinski
feacd5844c Minor fix in plotting the graph 2019-03-19 15:11:04 +01:00
Filip Gralinski
9beb68867a Fix grave error in calculating calibration 2019-03-19 15:10:23 +01:00
Filip Gralinski
c578efe28d Change lambda parameter in Loess 2019-03-19 08:45:58 +01:00
Filip Gralinski
d98003fc0e Introduce lambda constant in Loess 2019-03-19 08:35:41 +01:00
Filip Gralinski
adadf14888 Fix simple mistake when plotting a Loess graph 2019-03-19 07:46:10 +01:00
eb10a4c3b4 Add plotting graphs for selected metrics 2019-03-19 07:31:17 +01:00
816c83f183 Handle Probabilistic-Soft-F1 when creating a challenge 2019-03-12 22:39:32 +01:00
ae27029f61 Implement Probabilistic-Soft-F1 2019-03-12 22:35:19 +01:00
8393bec3ae Implement auxiliary calibration function 2019-03-12 08:58:21 +01:00
19642db43f Add auxiliary functions 2019-02-22 11:22:12 +01:00
fcb16d43f1 Export extensionsHandled 2019-02-14 22:29:44 +01:00
1a9fe36a9e Handle JSONl (for MultiLabel-F) 2019-02-14 19:01:53 +01:00
2ea53f92c7 Refactor gevalCore 2019-02-14 16:48:55 +01:00
709eeec4ef Merge branch 'master' into jsonl 2019-02-14 16:35:41 +01:00
4b85c4c1bb Merge branch 'master' of ssh://gonito.net/geval 2019-02-14 16:26:19 +01:00
b2e3293a12 Refactor line-by-line mode 2019-02-14 16:25:28 +01:00
Filip Gralinski
26e9735d31 Handle jsonl files 2019-02-14 10:54:25 +01:00
Filip Gralinski
872724722a Fix SMAPE on zero values 2019-02-12 08:36:52 +01:00
af21031172 Do not preprocess outputs for some metrics 2019-02-01 13:10:45 +01:00
ad5d614f48 Filter out NaN values so that sorting is not poisoned 2019-02-01 12:32:56 +01:00
f7bd1b2ccd Add missing file, when generating a challenge 2019-02-01 12:04:52 +01:00
abcce9bf68 Add numerical features 2019-02-01 10:58:29 +01:00
d5a8908599 Refactor Features into Factors 2019-01-26 19:26:45 +01:00
1c3908b273 Refactor CartesianFeature type 2019-01-26 18:00:36 +01:00
ea5de5c719 Introduce existential features 2019-01-26 17:18:41 +01:00
Filip Gralinski
dbf5c961af Start numerical factors 2019-01-23 13:00:37 +01:00
1aee476434 Add --filtre option 2019-01-14 23:23:50 +01:00
4003715726 Fix issue with sorting 2019-01-13 12:09:15 +01:00
b0c75cac3a Change features into "factors" (just the terminology was changed) 2019-01-11 16:08:56 +01:00
de901d4c64 Add min-cartesian-feature (as optional value) 2019-01-11 10:16:39 +01:00
dbe1613052 Filter out unwanted Cartesian features 2019-01-11 08:47:11 +01:00
39bc3964b3 Speed up cartesian features 2019-01-10 22:53:43 +01:00
23aad86e72 Add cartesian features to black-box debugging (but it's very slow now, needs to be sped up) 2019-01-10 14:01:29 +01:00
99e3a10791 Add bigram features in black-box debugging 2019-01-10 10:41:55 +01:00
13f9629cbc Minor refactor 2019-01-10 10:00:51 +01:00
212457077f Consider word shapes in black-box debugging 2019-01-10 09:58:04 +01:00
e0cfb9c4b0 Add --min-frequency for black box debugging 2019-01-10 08:15:34 +01:00
1832a23b75 Refactor features 2019-01-09 17:45:06 +01:00
5d19fc7585 Add character-by-character tokenization. 2018-12-17 07:54:12 +01:00