fix typos, fix grammar, add --single-branch to git commands

mateusz.hinc 2019-10-15 09:45:42 +02:00
parent f68bd8df74
commit f1e7cc792d
2 changed files with 13 additions and 13 deletions

@@ -1,7 +1,7 @@
 # GEval
 GEval is a Haskell library and a stand-alone tool for evaluating the
-results of solutions to machine learning challenges as defined on the
+results of solutions to machine learning challenges as defined in the
 [Gonito](https://gonito.net) platform. Also could be used outside the
 context of Gonito.net challenges, assuming the test data is given in
 simple TSV (tab-separated values) files.
@@ -50,12 +50,12 @@ If you see a message like this:
 already installed but in a non-standard location then you can use the flags
 --extra-include-dirs= and --extra-lib-dirs= to specify where it is.
 If the header file does exist, it may contain errors that are caught by the C
-compiler at the preprocessing stage. In this case you can re-run configure
+compiler at the preprocessing stage. In this case, you can re-run configure
 with the verbosity flag -v3 to see the error messages.
 it means that you need to install lzma library on your operating
 system. The same might go for pkg-config. On macOS (it's more likely
-to happen on macOS, as these packages are usually installed out of box on Linux), you need to run:
+to happen on macOS, as these packages are usually installed out of the box on Linux), you need to run:
 brew install xz
 brew install pkg-config
@@ -74,7 +74,7 @@ This is a fully static binary, it should work on any 64-bit Linux.
 Let's use GEval to evaluate machine translation (MT) systems (but keep
 in mind than GEval could be used for many other machine learning task
-types). We start with simple evaluation, but then we switch to what
+types). We start with a simple evaluation, but then we switch to what
 might be called black-box debugging of ML models.
 First, we will run GEval on WMT-2017, a German-to-English machine
@@ -84,7 +84,7 @@ run on other test sets, not just the ones conforming to specific
 Gonito.net standards). Let's download one of the solutions, it's just
 available via git, so you don't have to click anywhere, just type:
-git clone git://gonito.net/wmt-2017 -b submission-01229
+git clone git://gonito.net/wmt-2017 -b submission-01229 --single-branch
 Let's step into the repo and run GEval (I assume you added `geval`
 path to `$PATH`, so that you could just use `geval` instead of
@@ -178,11 +178,11 @@ For instance, the average GLEU score for sentences for which a double quote is e
 is 0.27823151. At first glance, it does not seem much worse than the general score
 (0.30514), but actually…
 4. … it's highly significant. The probability to get it by chance
-(according to [Mann-Whitney _U_ test](https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test))
+(according to the [Mann-Whitney _U_ test](https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test))
 is extremely low (_p_ = 0.000009).
 But why were double quotes so problematic in German-English
-translation?! Well, look at the second worst feature — `''`
+translation?! Well, look at the second-worst feature — `''`
 in the _output_! Oops, it seems like a very stupid mistake with
 post-processing was done and no double quote was correctly generated,
 which decreased the score a little bit for each sentence in which the
@@ -313,7 +313,7 @@ have a look at the first 5 items:
 Now let's try to evaluate some solution to this challenge. Let's fetch it:
-git fetch git://gonito.net/sentiment-by-emoticons submission-01865
+git fetch git://gonito.net/sentiment-by-emoticons submission-01865 --single-branch
 git reset --hard FETCH_HEAD
 and now run geval:
@@ -327,7 +327,7 @@ be hard to interpret, so you could try other metrics.
 geval -t dev-0 --metric Accuracy --metric Likelihood
 So now you can see that the accuracy is over 78% and the likelihood
-(i.e. geometric mean of probabilities of the correct classes) is 0.62.
+(i.e. the geometric mean of probabilities of the correct classes) is 0.62.
 ## Yet another example
@@ -575,7 +575,7 @@ special `--submit` option:
 where:
 * _HOST_ is the name of the host with a Gonito platform
-* _TOKEN_ is a special per-user authorisation token (can be copied
+* _TOKEN_ is a special per-user authorization token (can be copied
 from "your account" page)
 _HOST_ must be given when `--submit` is used (unless the creator of the challenge
@@ -622,7 +622,7 @@ Available options:
 set
 -w,--worst-features Print a ranking of worst features, i.e. features that
 worsen the score significantly. Features are sorted
-using p-value for Mann-Whitney U test comparing the
+using p-value for the Mann-Whitney U test comparing the
 items with a given feature and without it. For each
 feature the number of occurrences, average score and
 p-value is given.
@@ -682,7 +682,7 @@ Available options:
 If you need another metric, let me know, or do it yourself!
-## Licence
+## License
 Apache License 2.0
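
The README text in the diff above describes the Likelihood metric as the geometric mean of the probabilities of the correct classes. The following is a minimal Python sketch of that definition only, not GEval's actual implementation (GEval is written in Haskell). The function name `likelihood`, the sample probabilities, and the choice to return 0.0 when any probability is zero are assumptions made for illustration.

```python
import math


def likelihood(probs_of_correct):
    """Geometric mean of the probabilities assigned to the correct classes.

    Hypothetical helper for illustration; not part of GEval.
    """
    if not probs_of_correct:
        raise ValueError("need at least one probability")
    # Assumption: a single zero probability drives the whole score to zero.
    if any(p <= 0.0 for p in probs_of_correct):
        return 0.0
    # Average the logs instead of multiplying, to avoid underflow on long inputs.
    log_sum = sum(math.log(p) for p in probs_of_correct)
    return math.exp(log_sum / len(probs_of_correct))


# Made-up probabilities that a classifier assigned to the correct class of four items.
sample = [0.9, 0.5, 0.8, 0.4]
print(likelihood(sample))
```

Unlike accuracy, this score penalizes a confident wrong answer heavily, because one very small probability pulls the whole geometric mean down.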

@@ -64,7 +64,7 @@ optionsParser = GEvalOptions
 (flag' WorstFeatures
 ( long "worst-features"
 <> short 'w'
-<> help "Print a ranking of worst features, i.e. features that worsen the score significantly. Features are sorted using p-value for Mann-Whitney U test comparing the items with a given feature and without it. For each feature the number of occurrences, average score and p-value is given." ))
+<> help "Print a ranking of worst features, i.e. features that worsen the score significantly. Features are sorted using p-value for the Mann-Whitney U test comparing the items with a given feature and without it. For each feature the number of occurrences, average score and p-value is given." ))
 <|>
 (Diff <$> strOption
 ( long "diff"