From f1e7cc792d330051f03434f3dc5755db8665532a Mon Sep 17 00:00:00 2001
From: "mateusz.hinc"
Date: Tue, 15 Oct 2019 09:45:42 +0200
Subject: [PATCH] fix typos, fix grammar, add --single-branch to git commands

---
 README.md                  | 24 ++++++++++++------------
 src/GEval/OptionsParser.hs |  2 +-
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/README.md b/README.md
index 61b86dc..dec034e 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
 # GEval
 
 GEval is a Haskell library and a stand-alone tool for evaluating the
-results of solutions to machine learning challenges as defined on the
+results of solutions to machine learning challenges as defined in the
 [Gonito](https://gonito.net) platform. Also could be used outside the
 context of Gonito.net challenges, assuming the test data is given in
 simple TSV (tab-separated values) files.
@@ -50,12 +50,12 @@ If you see a message like this:
     already installed but in a non-standard location then you can use the
     flags --extra-include-dirs= and --extra-lib-dirs= to specify where it is.
     If the header file does exist, it may contain errors that are caught by the C
-    compiler at the preprocessing stage. In this case you can re-run configure
+    compiler at the preprocessing stage. In this case, you can re-run configure
     with the verbosity flag -v3 to see the error messages.
 
 it means that you need to install lzma library on your operating
 system. The same might go for pkg-config. On macOS (it's more likely
-to happen on macOS, as these packages are usually installed out of box on Linux), you need to run:
+to happen on macOS, as these packages are usually installed out of the box on Linux), you need to run:
 
     brew install xz
     brew install pkg-config
@@ -74,7 +74,7 @@ This is a fully static binary, it should work on any 64-bit Linux.
 
 Let's use GEval to evaluate machine translation (MT) systems (but
 keep in mind than GEval could be used for many other machine learning task
-types). We start with simple evaluation, but then we switch to what
+types). We start with a simple evaluation, but then we switch to what
 might be called black-box debugging of ML models.
 
 First, we will run GEval on WMT-2017, a German-to-English machine
@@ -84,7 +84,7 @@ run on other test sets, not just the ones conforming to specific
 Gonito.net standards). Let's download one of the solutions, it's just
 available via git, so you don't have to click anywhere, just type:
 
-    git clone git://gonito.net/wmt-2017 -b submission-01229
+    git clone git://gonito.net/wmt-2017 -b submission-01229 --single-branch
 
 Let's step into the repo and run GEval (I assume you added `geval`
 path to `$PATH`, so that you could just use `geval` instead of
@@ -178,11 +178,11 @@ For instance, the average GLEU score for sentences for which a double quote is e
 is 0.27823151. At first glance, it does not seem much worse than the
 general score (0.30514), but actually…
 4. … it's highly significant. The probability to get it by chance
-(according to [Mann-Whitney _U_ test](https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test))
+(according to the [Mann-Whitney _U_ test](https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test))
 is extremely low (_p_ = 0.000009).
 
 But why were double quotes so problematic in German-English
-translation?! Well, look at the second worst feature — `''`
+translation?! Well, look at the second-worst feature — `''`
 in the _output_! Oops, it seems like a very stupid mistake with
 post-processing was done and no double quote was correctly generated,
 which decreased the score a little bit for each sentence in which the
@@ -313,7 +313,7 @@ have a look at the first 5 items:
 
 Now let's try to evaluate some solution to this challenge. Let's fetch it:
 
-    git fetch git://gonito.net/sentiment-by-emoticons submission-01865
+    git fetch git://gonito.net/sentiment-by-emoticons submission-01865 --single-branch
     git reset --hard FETCH_HEAD
 
 and now run geval:
@@ -327,7 +327,7 @@ be hard to interpret, so you could try other metrics.
     geval -t dev-0 --metric Accuracy --metric Likelihood
 
 So now you can see that the accuracy is over 78% and the likelihood
-(i.e. geometric mean of probabilities of the correct classes) is 0.62.
+(i.e. the geometric mean of probabilities of the correct classes) is 0.62.
 
 ## Yet another example
 
@@ -575,7 +575,7 @@ special `--submit` option:
 where:
 
 * _HOST_ is the name of the host with a Gonito platform
-* _TOKEN_ is a special per-user authorisation token (can be copied
+* _TOKEN_ is a special per-user authorization token (can be copied
   from "your account" page)
 
 _HOST_ must be given when `--submit` is used (unless the creator of the challenge
@@ -622,7 +622,7 @@ Available options:
                            set
   -w,--worst-features      Print a ranking of worst features, i.e. features
                            that worsen the score significantly. Features are sorted
-                           using p-value for Mann-Whitney U test comparing the
+                           using p-value for the Mann-Whitney U test comparing the
                            items with a given feature and without it. For each
                            feature the number of occurrences, average score and
                            p-value is given.
@@ -682,7 +682,7 @@
 
 If you need another metric, let me know, or do it yourself!
 
-## Licence
+## License
 
 Apache License 2.0
 
diff --git a/src/GEval/OptionsParser.hs b/src/GEval/OptionsParser.hs
index f2e7566..c69bb24 100644
--- a/src/GEval/OptionsParser.hs
+++ b/src/GEval/OptionsParser.hs
@@ -64,7 +64,7 @@ optionsParser = GEvalOptions
               (flag' WorstFeatures
                 ( long "worst-features"
                   <> short 'w'
-                  <> help "Print a ranking of worst features, i.e. features that worsen the score significantly. Features are sorted using p-value for Mann-Whitney U test comparing the items with a given feature and without it. For each feature the number of occurrences, average score and p-value is given." ))
+                  <> help "Print a ranking of worst features, i.e. features that worsen the score significantly. Features are sorted using p-value for the Mann-Whitney U test comparing the items with a given feature and without it. For each feature the number of occurrences, average score and p-value is given." ))
       <|> (Diff <$> strOption
             ( long "diff"
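
Note: the `--worst-features` help text touched by the last two hunks refers to
the Mann-Whitney U test. For readers unfamiliar with it, the U statistic can be
computed directly from the two samples being compared (here, scores of items
with a given feature and without it). A self-contained sketch, not taken from
GEval's sources:

    -- U statistic for samples xs and ys: the number of pairs (x, y)
    -- with x > y, counting ties as 1/2. Extreme values of U relative
    -- to its null distribution yield the low p-values reported by
    -- --worst-features. (Illustrative only.)
    uStatistic :: [Double] -> [Double] -> Double
    uStatistic xs ys = sum [score x y | x <- xs, y <- ys]
      where
        score x y
          | x > y     = 1
          | x == y    = 0.5
          | otherwise = 0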
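
Note: the "Likelihood" metric mentioned in the README hunk above is described
there as the geometric mean of the probabilities assigned to the correct
classes. A minimal sketch of that computation (illustrative only; not GEval's
actual implementation), done in log-space to avoid floating-point underflow on
long lists:

    -- Geometric mean of the probabilities of the correct classes,
    -- computed in log-space for numerical stability.
    likelihood :: [Double] -> Double
    likelihood ps = exp (sum (map log ps) / fromIntegral (length ps))

    -- e.g. likelihood [0.9, 0.8, 0.4] ≈ 0.66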