Add documentation on git
parent
904489ecc8
commit
603f6c0097
81
README.md
@ -39,6 +39,87 @@ After installing Stack:
The last command will start the Web server with Gonito (go to
http://127.0.0.1:3000 in your browser).

Gonito & git
------------

Gonito relies on git at its core:

* challenges (data sets) are provided as git repositories,
* submissions are uploaded via git repositories and are referred to by
  git commit hashes.

Advantages:

* great flexibility in where you keep your challenges and submissions
  (external, well-known services such as GitHub or GitLab, your own
  git server, e.g. gitolite or Gogs, or just a disk accessible to a
  Gonito instance),
* even if Gonito ceases to exist, the challenges and submissions remain
  available in a standard manner, provided that the git repositories
  (external or local) are accessible,
* data sets can be easily downloaded from the command line
  (e.g. `git clone git://gonito.net/paranormal-or-skeptic`), without
  clicking anything in a Web browser,
* experiment repeatability and reproducibility are facilitated (at worst,
  the system output is easily available via git),
* tools that were used to generate the output can be linked as git submodules,
* some challenge/submission metadata are tracked in a Gonito-independent way
  (within git commits),
* copying data can be avoided with git mechanisms (e.g. when the challenge is
  already cloned, downloading specific submissions should be much quicker),
* large data sets and models can be stored, if needed, using mechanisms such
  as git-annex (see below).

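The point about avoiding copies can be illustrated with a small sketch. It assumes a local directory standing in for a remote challenge repository; the branch name `submission-alice`, the identity settings and the file contents are hypothetical and not part of Gonito. Once the challenge is cloned, fetching a submission only transfers the objects that are still missing.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for a remote challenge repository (a local path in this sketch).
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
mkdir test-A
printf 'input 1\n' > test-A/in.tsv
git add test-A
git commit -q -m "Add challenge"

# A hypothetical submission branch derived from master.
git checkout -q -b submission-alice
printf 'label 1\n' > test-A/out.tsv
git add test-A/out.tsv
git commit -q -m "Add output for test-A"
git checkout -q master
cd ..

# Clone the challenge only (the master branch).
git clone -q --single-branch --branch master "$work/upstream" local
cd local

# Later, fetch one specific submission; the challenge data are already present.
git fetch -q origin submission-alice
git diff --stat master FETCH_HEAD
```

After the fetch, `FETCH_HEAD` points at the submission commit, which can be inspected or checked out without cloning the challenge again.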
### Commit structure

The following flow of git commits is recommended (though not required):

* the challenge without the hidden data for the main test sets (i.e. without files
  such as `test-A/expected.tsv`) should be pushed to the `master` branch,
* the hidden files (`test-A/expected.tsv`) should be added in a
  subsequent commit and pushed either to the `dont-peek` branch or to the
  `master` branch of a separate repository (if access to the hidden
  data must be stricter),
* submissions should be committed with the `master` branch as the
  parent (or at least an ancestor) commit and pushed to the same
  repository as the challenge data (to a user-specific branch) or to any other
  repository (e.g. a user-owned one),
* any subsequent submissions can be derived in a natural way from other git commits
  (e.g. when a submission is improved, or when two approaches are merged),
* new versions of the challenge can be committed to the `master` (and `dont-peek`)
  branches (a challenge can be updated at Gonito).

See also the following picture:

![Recommended commit structure](misc/commits.png)

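The recommended flow can be sketched with plain git commands. This is only an illustration in a throwaway local repository: the branch name `submission-alice`, the identity settings and the file contents are assumptions, and pushing to an actual remote is left out.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Challenge repository; all file contents below are illustrative.
git init -q challenge
cd challenge
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

# 1. The challenge without the hidden test data goes to master.
mkdir -p train test-A
printf 'input 1\n' > train/in.tsv
printf 'label 1\n' > train/expected.tsv
printf 'input 2\n' > test-A/in.tsv
git add train test-A
git commit -q -m "Add challenge without hidden test data"

# 2. The hidden expected data are added in a subsequent commit on dont-peek.
git checkout -q -b dont-peek
printf 'label 2\n' > test-A/expected.tsv
git add test-A/expected.tsv
git commit -q -m "Add hidden expected data for test-A"

# 3. A submission is committed on a user-specific branch with master as parent.
git checkout -q -b submission-alice master
printf 'label 2\n' > test-A/out.tsv
git add test-A/out.tsv
git commit -q -m "Add baseline output for test-A"

# Show the resulting commit graph.
git log --oneline --graph --all
```

In this layout the `master` branch never contains `test-A/expected.tsv`, so a submitter who clones only `master` cannot see the hidden data.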
### git-annex

In some cases, you don't want to store challenge/submission files directly in git:

* very large data files, including textual files (e.g. `train/in.tsv`, even if
  compressed as `train/in.tsv.xz`),
* binary training/testing data (PDF files, images, movies, recordings),
* data that are sensitive due to privacy/security concerns (a scenario where it's OK to store
  metadata and some files in a widely accessible repository, but some files require
  limited access),
* large ML models (note that Gonito does not require models for evaluation, but
  it might still be good practice to commit them along with output files and scripts).

Such cases can be handled in a natural manner using git-annex, a git
extension for handling files and their metadata without committing
their content to the repository. The contents can be stored on a wide
range of [special
remotes](https://git-annex.branchable.com/special_remotes/), e.g. S3
buckets, WebDAV, rsync servers.

It's up to you which files are stored in git in the regular manner and
which are added with `git annex add`, but note that if
challenge/submission files that are stored via git-annex are required
for evaluation (e.g. `expected.tsv` files for the challenge or
`out.tsv` files for submissions), the git-annex special remote must be
given when the challenge is created or the submission is made, and the
Gonito server must have access to that special remote.

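A minimal sketch of the git-annex workflow described above, assuming git-annex is installed. It uses a `directory` special remote on the local disk as a stand-in for S3, WebDAV or rsync; the remote name `mystore`, the paths and the file contents are hypothetical. How the special remote is then passed to Gonito is not shown here.

```shell
set -e
if command -v git-annex >/dev/null 2>&1; then
    work=$(mktemp -d)
    mkdir "$work/store"
    git init -q "$work/challenge"
    cd "$work/challenge"
    git config user.name "Example Author"
    git config user.email "author@example.com"

    # Enable git-annex in this repository.
    git annex init "example clone"

    # Add a (notionally large) file via git-annex instead of plain git.
    mkdir train
    printf 'a large training file would go here\n' > train/in.tsv
    git annex add train/in.tsv
    git commit -q -m "Add training data via git-annex"

    # Set up a special remote and copy the file content there.
    git annex initremote mystore type=directory directory="$work/store" encryption=none
    git annex copy train/in.tsv --to mystore

    # List the locations that hold the content.
    git annex whereis train/in.tsv
    annex_demo=done
else
    echo "git-annex is not installed; skipping the sketch"
    annex_demo=skipped
fi
```

Only a pointer to `train/in.tsv` is committed to git; the content itself lives in the annex and on the special remote.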
Authors
-------

1
misc/commits.drawio
Normal file
@ -0,0 +1 @@
<mxfile host="app.diagrams.net" modified="2020-12-31T10:37:51.079Z" agent="5.0 (X11)" etag="f1XQEwDVaud7Ry-L3Hhg" version="14.1.1" type="device"><diagram id="_L_Cg4mIP0IpHJWha70Y" name="Page-1">7Vpbc5s6EP41zEkfzIC42H5MnOSch3ZOZzKdtk8dGTagBBAj5Ft/fVdGGDCNQ9q6kE6frF2thLT7fasV2HAW6fZfQfP4HQ8hMYgVbg3n2iDEdomFP0qz05q5Mys1kWCh1tWKO/YVtFIPjFYshKJlKDlPJMvbyoBnGQSypaNC8E3b7J4n7afmNIKO4i6gSVf7kYUyLrUzMq31/wGL4urJtj8ve1JaGeudFDEN+aahcm4MZyE4l2Ur3S4gUd6r/FKOu32i97AwAZnsMyD/ED58+T9YvXM/Wvzh6+P2jr2d6FnWNFnpDRvET3C+q5CtsRmpZhDTJIEMHaX7lqLqqjT42MaA78yR0kKC+L75GkTBeIbrsE3LrNwld1UMBF9lIahtWGi9iZmEu5wGqneDsENdLNMEJRubhRT88RArsn/YfosgJGyf9J19iAhiGXgKUuzQRA9wKjhqGHszr5Q3NSbsKtBxAw++1lENw+gwdR0pbOhgvSBwTsdLECJwtciFjHnEM5rc1Nqrth9rm7ec59p7DyDlTrOQriRv+xa9JXaf1Hhz6lXyZz3fXrjetqSdlp4Kilrz6ZDgFvlKBHDCFUSnBSoikM9hvRtiAQmVbN1exy+PF+kQLeSZnOQAj0+z6hRHevLuQkIhJ5doB9scEySEpizWaochQuFNf0aPjpOu3+ak61n9ODk9Fye9gTlJxsNJ9zVw0u1w8jJhAfxTqB2ulikrSsodU2RPHcNBUlnG9CqE9QQRdauJNsV5bvlKljRbqBKGZ0xyc0dTtaKL5swWboSGVNI3pWkRCJbLohSuefAI4p4hTvZyqqor3QcyMA0yxqPS9o5oafekpW2di5fTgXk5Hlr6PWnpDklL/wQtWZoLvsa4nGboGHlB2rwg86GPK7ubPn4DL86K7/lrOHbmHXyfuGA16z57nHcjpw1sx+oJ7LPdjWzyxwHbrt59PIdsMiSy7fkgJ+2WyU/VWYrtz/Whi1J9zCqhOmXHeZPtHeX5oFHuvjT68bus/ZK7bApC5UiL36tUwrAMV5Oo+lktp7wbqx1nqj5Y5dihrBEvVa/9qi+8s6MKwumZaGdnS7T+38q6coXXk7r+oNQd+irkWLNmyCaWafnTZ+K2l96DYOgDEKMK5rB52Ovk4X7vL8rMqHLkPRdlVjJHXGAeXgxU35AGf9F3ANLfvFf5fdxUqVbZoMoVX3aIMjroO57bgv50cOQP8t3pvAgmPRFc5tufgPB+6KUQdNcwyDnL1Bvfw8zvlaKGAHHb2W929JX3yNz2T5ljo3x+jYDDRn4CFO7A6XBEHz56o6nxH4AhEiLpJMRFTEXCutWD+twQ8HTJMpZF2L4AM1LlAmQFpMsElXipsvaVx+Hu1U2uShphgiVHtcVk1jPDzl+eYVGs/+1Rcq/+04xz8w0=</diagram></mxfile>
BIN
misc/commits.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 44 KiB |