# GEval
GEval is a library (and a stand-alone tool) for evaluating the results
of solutions to machine learning challenges as defined in the Gonito
platform.

Note that GEval is only about machine learning evaluation. No actual
machine learning algorithms are available here.
## Installing
You need [Haskell Stack](https://github.com/commercialhaskell/stack);
then install GEval with:

    git clone https://github.com/filipg/geval
    cd geval
    stack setup
    stack install

By default, the `geval` binary is installed in `$HOME/.local/bin`, so in
order to run `geval` you need to either add `$HOME/.local/bin` to
`$PATH` or type:

    PATH="$HOME/.local/bin:$PATH" geval
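
To avoid repeating the prefix on every call, the directory can be added to `PATH` for the whole shell session. The lines below are a minimal sketch, assuming a POSIX-compatible shell; making the change permanent (for example by putting the `export` line in a shell startup file) depends on your setup.

```shell
# Prepend the default Stack install directory to PATH for this session.
export PATH="$HOME/.local/bin:$PATH"

# Print the location of geval if the shell can now find it.
command -v geval || echo "geval not found on PATH"
```

Prepending rather than replacing `PATH` keeps all other commands reachable.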
## Preparing a Gonito challenge
### Directory structure of a Gonito challenge
A definition of a Gonito challenge should be put in a separate
directory (preferably as a separate Git repo). Such a directory should
have the following structure:
* `config.txt` — simple configuration file with options the same as
the ones accepted by the `geval` binary (see below); usually just a
metric is specified here (e.g. `--metric BLEU`), but non-default
file names can also be given (e.g. `--test-name test-B` for a
non-standard test subdirectory)
* `README.md` — description of a challenge in Markdown
* `train/` — subdirectory with training data (if training data are
supplied for a given Gonito challenge at all)
* `train/train.tsv` — the usual name of training data (this name is
not required, and there may be more than one file); the first column is the
target (predicted) value, the other columns represent features; no
header is assumed
* `dev-0/` — subdirectory with a development set (a sample test set,
which won't be used for the final evaluation)
* `dev-0/in.tsv` — input data (the same format as `train/train.tsv`,
but without the first column)
* `dev-0/expected.tsv` — values to be guessed (note that `paste
dev-0/expected.tsv dev-0/in.tsv` should give the same format as
`train/train.tsv`)
* `dev-1/`, `dev-2/`, ... — other dev sets (if supplied)
* `test-A/` — subdirectory with the test set
* `test-A/in.tsv` — test input (the same format as `dev-0/in.tsv`)
* `test-A/expected.tsv` — values to be guessed (the same format as
`dev-0/expected.tsv`); note that this file should be "hidden" by the
organizers of a Gonito challenge (see the notes on the structure of
commits below)
* `test-B/`, `test-C/`, ... — other alternative test sets (if supplied)
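
The layout above can also be built by hand. The script below is a minimal sketch, assuming a POSIX shell; the directory name `my-challenge`, the `README.md` contents, and all data rows are made-up placeholders, and the same rows are reused for every set only to keep the example short.

```shell
#!/bin/sh
# Sketch: build the skeleton of a challenge directory by hand.
# "my-challenge" and every data row below are made-up placeholders.
set -eu
root=my-challenge
mkdir -p "$root/train" "$root/dev-0" "$root/test-A"

# config.txt holds geval options; here only the metric is given.
printf '%s\n' '--metric BLEU' > "$root/config.txt"
printf '%s\n' '# My challenge' > "$root/README.md"

# train.tsv: target value first, then the feature columns; no header.
printf 'the cat\tle chat\na dog\tun chien\n' > "$root/train/train.tsv"

# For brevity the same rows are reused for dev-0 and test-A:
# column 1 goes to expected.tsv, the remaining columns to in.tsv.
for split in dev-0 test-A; do
    cut -f1  "$root/train/train.tsv" > "$root/$split/expected.tsv"
    cut -f2- "$root/train/train.tsv" > "$root/$split/in.tsv"
done

# Format check from the list above: pasting expected.tsv and in.tsv
# back together should give rows shaped like train.tsv.
paste "$root/dev-0/expected.tsv" "$root/dev-0/in.tsv" \
    | cmp -s - "$root/train/train.tsv" \
    && echo "dev-0 matches the train.tsv layout"
```

In a real challenge the dev and test sets would hold different rows from the training data, and `test-A/expected.tsv` would be kept hidden from participants.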
### Initiating a Gonito challenge with geval
You can use `geval` to initiate a Gonito challenge:

    geval --init --expected-directory my-challenge

(This will generate a sample toy challenge about guessing the mass of a planet.)

A metric (other than the default root-mean-square error) can be given
to generate another type of toy challenge:

    geval --init --expected-directory my-mt-challenge --metric BLEU