Machine Learning Engineering
22 May 2024
10. DVC
DVC - Data Version Control
- dvc.org
- "Version Control System for Machine Learning Projects"
- Open Source
- It makes it possible to:
  - version data and models ("Git for data and models")
  - build pipelines that define how models are built, trained and evaluated ("Makefile for machine learning")
  - track and compare metrics and parameters
- tightly integrated with git
- works independently of the language, libraries and operating system used
- 5-minute introduction: https://www.youtube.com/watch?v=UbL7VUpv1Bs
Tracking files with DVC
- Large files, such as input data files or model files, are hard to manage with git because of problems with:
  - performance
  - repository space
  - limits imposed by the hosting service (e.g. the 100 MB per-file limit on GitHub)
- Git has the LFS (Large File Storage) extension, which offers a partial solution to this problem.
  - The files themselves are stored on a dedicated remote server; the repository holds only references to these files and some metadata
  - GitHub has integrated LFS, with a 1 GB limit for free accounts
- DVC takes an approach similar to LFS, but:
Installation and initialization
- https://dvc.org/doc/install
pip install dvc
pipx install dvc
conda install dvc
!pip3 install dvc
Requirement already satisfied: dvc in ./venv/lib/python3.10/site-packages (3.50.2)
...
Let's create a directory in which we will keep our project:
!rm -r -f IUM_10/sample-ml-project-2024
!mkdir -p IUM_10/sample-ml-project-2024
#Jupyter notebook magic https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-cd
%cd "IUM_10/sample-ml-project-2024"
/home/pawel/ium/IUM_10/sample-ml-project-2024
We initialize an empty Git repository (we can also skip this step and work in an already existing repository):
!git init
Reinitialized existing Git repository in /home/pawel/ium/IUM_10/sample-ml-project-2024/.git/
Now we initialize the DVC repository:
!dvc init
Initialized DVC repository.

You can now commit the changes to git.

+---------------------------------------------------------------------+
|                                                                     |
|        DVC has enabled anonymous aggregate usage analytics.         |
|     Read the analytics documentation (and how to opt-out) here:     |
|             <https://dvc.org/doc/user-guide/analytics>              |
|                                                                     |
+---------------------------------------------------------------------+

What's next?
------------
- Check out the documentation: <https://dvc.org/doc>
- Get help and share ideas: <https://dvc.org/chat>
- Star us on GitHub: <https://github.com/iterative/dvc>
Let's see which files DVC has added (also to the git repository). They are described here: https://dvc.org/doc/user-guide/project-structure/internal-files
!git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   .dvc/.gitignore
        new file:   .dvc/config
        new file:   .dvcignore
- .dvc/config - the main DVC configuration file
- .dvc/config.local - overrides values from config; intended for local changes that are not committed to the repository
- .dvc/.gitignore - DVC files that should not end up in the repository
- .dvcignore - DVC skips the files listed in this file (e.g. to improve performance)
We can now commit the changes to git:
!git commit -m "Initial commit"
[master (root-commit) a9746ad] Initial commit
 3 files changed, 6 insertions(+)
 create mode 100644 .dvc/.gitignore
 create mode 100644 .dvc/config
 create mode 100644 .dvcignore
Let's prepare some sample data by downloading it from Kaggle:
!kaggle datasets download -d uciml/iris
!unzip -o iris.zip
!rm database.sqlite iris.zip
!mkdir -p data
!mv Iris.csv data/
Downloading iris.zip to /home/pawel/ium/IUM_10/sample-ml-project-2024
100%|██████████████████████████████████████| 3.60k/3.60k [00:00<00:00, 8.23MB/s]
Archive:  iris.zip
  inflating: Iris.csv
  inflating: database.sqlite
Now we add the data file(s) to DVC:
!dvc add data/Iris.csv
To track the changes with git, run:

        git add data/.gitignore data/Iris.csv.dvc

To enable auto staging, run:

        dvc config core.autostage true
- DVC created the file data/Iris.csv.dvc and added the original file to .gitignore
- Only the *.dvc file, which contains a reference to the actual file, will be present in the repository
!git status -u
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        data/.gitignore
        data/Iris.csv.dvc

nothing added to commit but untracked files present (use "git add" to track)
Let's add the files data/Iris.csv.dvc and data/.gitignore to the git repository, as DVC suggests:
!git add data/Iris.csv.dvc data/.gitignore
!git commit -m "Dodano dane IRIS (DVC)"
[master 92b2c9d] Dodano dane IRIS (DVC)
 2 files changed, 6 insertions(+)
 create mode 100644 data/.gitignore
 create mode 100644 data/Iris.csv.dvc
A *.dvc file contains, among other things, the hash of the tracked file. More about *.dvc files: link
# %load data/Iris.csv.dvc
The original file Iris.csv was moved to the .dvc/cache directory (under a path derived from the file's hash) and linked back to its original location. How the links are created may vary depending on the file system.
!ls -l .dvc/cache/files/md5/71
total 8
-r--r--r-- 1 pawel pawel 5107 Sep 19  2019 7820ef0af287ff346c5cabfb4c612c
!head -n 3 .dvc/cache/files/md5/71/7820ef0af287ff346c5cabfb4c612c
Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
1,5.1,3.5,1.4,0.2,Iris-setosa
2,4.9,3.0,1.4,0.2,Iris-setosa
dvc remote
- To send the actual files tracked by DVC to a remote location (from which they can later be downloaded, e.g. by a CI system or by other users), we need to have such a location configured.
- The dvc remote add command is used for this.
- We will use a local "remote". Here it will simply be the previously created directory ~/dvcstore. Such a directory also exists on our Jenkins server; it of course has to be mounted into the Docker container.
- In real-world use we would give a path to some resource available over the internet instead, such as an SFTP server or an AWS S3 path.
Supported remote storage types: https://dvc.org/doc/command-reference/remote/add#supported-storage-types
- Amazon S3
- S3-compatible storage
- Microsoft Azure Blob Storage
- Google Drive
- Google Cloud Storage
- Aliyun OSS
- SSH
- HDFS
- WebHDFS
- HTTP
- WebDAV
- local remote
Adding a remote of type local
!dvc remote add -d my_local_remote ~/dvcstore
Setting 'my_local_remote' as a default remote.
!git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   .dvc/config

no changes added to commit (use "git add" and/or "git commit -a")
!git add .dvc/config
!git commit -m "Added DVC remote"
[master 7123494] Added DVC remote
 1 file changed, 1 insertion(+), 1 deletion(-)
dvc push
Once a "remote" is configured, we can push files to it with the dvc push command:
!dvc push
1 file pushed
!tree ~/dvcstore
/home/pawel/dvcstore
└── files
    └── md5
        └── 71
            └── 7820ef0af287ff346c5cabfb4c612c

3 directories, 1 file
dvc pull
To download data from DVC (e.g. in another location or by another user), we need to:
- clone the git repository (to get, among other things, the *.dvc files)
- run dvc pull
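The two steps above can be sketched as a small Python helper that shells out to git and dvc. This is only an illustration: the function name clone_and_pull and its run parameter are invented here, the repository URL in the usage note is a placeholder, and the sketch assumes both git and dvc are available on PATH and that the DVC remote is reachable from the machine running it.

```python
import subprocess


def clone_and_pull(repo_url, target_dir, run=subprocess.run):
    """Clone a Git repository, then fetch its DVC-tracked files."""
    # Step 1: git brings the *.dvc pointer files and .dvc/config (remote settings).
    run(["git", "clone", repo_url, target_dir], check=True)
    # Step 2: dvc downloads the actual data referenced by those pointer files.
    run(["dvc", "pull"], cwd=target_dir, check=True)
```

A call might look like clone_and_pull("https://example.com/our-project.git", "our-project"). The run parameter exists only so that the command runner can be swapped out, for example to log the commands instead of executing them.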
Adding new files and modifying existing ones works much like it does for ordinary files tracked by git, except that we use the dvc command instead of git and also remember to manage the *.dvc files with git:
!head -n -1 data/Iris.csv | tee data/Iris.csv
Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
1,5.1,3.5,1.4,0.2,Iris-setosa
2,4.9,3.0,1.4,0.2,Iris-setosa
...
148,6.5,3.0,5.2,2.0,Iris-virginica
149,6.2,3.4,5.4,2.3,Iris-virginica
!git status
On branch master nothing to commit, working tree clean
!dvc status
data/Iris.csv.dvc:
        changed outs:
                modified:           data/Iris.csv
!dvc add data/Iris.csv
To track the changes with git, run:

        git add data/Iris.csv.dvc

To enable auto staging, run:

        dvc config core.autostage true
!git add data/Iris.csv.dvc
!git commit -m "Removed last line from Iris dataset"
[master 9de24e1] Removed last line from Iris dataset
 1 file changed, 2 insertions(+), 2 deletions(-)
!wc -l .dvc/cache/files/md5/*/*
  151 .dvc/cache/files/md5/71/7820ef0af287ff346c5cabfb4c612c
  150 .dvc/cache/files/md5/bc/cff2e578d76852294184c1dce9fdbf
  301 total
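The wc output shows both versions of the file sitting side by side in the cache, each under its own hash. The toy Python class below illustrates why a content-addressed store behaves this way. It is a simplified in-memory stand-in written for this note, not DVC's actual implementation, and the CSV snippets are made-up sample data.

```python
import hashlib


class ToyCache:
    """In-memory stand-in for a content-addressed cache (not DVC's real code)."""

    def __init__(self):
        self._objects = {}

    def add(self, content: bytes) -> str:
        """Store content under its MD5 hash and return that hash."""
        key = hashlib.md5(content).hexdigest()
        self._objects[key] = content  # identical content always maps to the same key
        return key

    def get(self, key: str) -> bytes:
        """Return the content stored under the given hash."""
        return self._objects[key]

    def __len__(self):
        return len(self._objects)


cache = ToyCache()
v1 = cache.add(b"Id,Species\n1,Iris-setosa\n2,Iris-virginica\n")
v2 = cache.add(b"Id,Species\n1,Iris-setosa\n")  # same data with the last row removed
v1_again = cache.add(b"Id,Species\n1,Iris-setosa\n2,Iris-virginica\n")

print(len(cache), v1 == v1_again, v1 != v2)
```

Because the key depends only on the content, modifying a file never overwrites the old entry, and adding unchanged content again does not create a duplicate.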
dvc checkout
- We use the dvc checkout command together with git checkout to change the branch we are working on.
- DVC replaces the versions of the files it tracks with those from the other branch (provided that the files differ and the *.dvc files differ between the branches).
- Switching branches with git changes the *.dvc files (if they differ), and dvc checkout copies/links files from the .dvc/cache directory whose hashes match those in the *.dvc files.
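The interplay described above (git switches the pointer, DVC restores the matching file from the cache) can be imitated with a few lines of Python. This is a rough sketch under simplifying assumptions: the functions toy_add and toy_checkout are invented for the illustration, the "pointer" is just a hash held in a variable rather than a real *.dvc file, and files are always copied, whereas DVC may link them.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def toy_add(workspace_file: Path, cache_dir: Path) -> str:
    """Copy a file into the cache under its MD5 and return the hash (the 'pointer')."""
    md5_hex = hashlib.md5(workspace_file.read_bytes()).hexdigest()
    target = cache_dir / md5_hex[:2] / md5_hex[2:]
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(workspace_file, target)
    return md5_hex


def toy_checkout(md5_hex: str, cache_dir: Path, workspace_file: Path) -> None:
    """Restore the workspace file from the cache entry named by the pointer hash."""
    shutil.copyfile(cache_dir / md5_hex[:2] / md5_hex[2:], workspace_file)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    cache, data = root / "cache", root / "data.csv"
    data.write_text("a\nb\nc\n")
    pointer_main = toy_add(data, cache)      # hash recorded on one branch
    data.write_text("a\nb\n")
    pointer_feature = toy_add(data, cache)   # hash recorded on another branch
    toy_checkout(pointer_main, cache, data)  # "switch back" to the first branch
    restored = data.read_text()

print(repr(restored))
```

In the real workflow, git checkout is what changes the hash stored in the *.dvc file, and dvc checkout then plays the role of toy_checkout.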
Exchanging data between projects
- With the dvc import and dvc update commands we can add, and later update, files tracked by DVC in another repository
!dvc import https://github.com/iterative/dataset-registry \
get-started/data.xml -o data/data.xml
Importing 'get-started/data.xml (https://github.com/iterative/dataset-registry)' -> 'data/data.xml'
100%|██████████|get-started/data.xml 13.8M/13.8M [00:16<00:00, 1.81MB/s]
To track the changes with git, run:

        git add data/data.xml.dvc data/.gitignore

To enable auto staging, run:

        dvc config core.autostage true
!dvc status
Data and pipelines are up to date.
ls -l data
total 14124
-rw-r--r-- 1 pawel pawel     5072 May 22 07:57 Iris.csv
-rw-r--r-- 1 pawel pawel       88 May 22 07:57 Iris.csv.dvc
-rw-r--r-- 1 pawel pawel 14445097 May 22 07:59 data.xml
-rw-r--r-- 1 pawel pawel      296 May 22 07:59 data.xml.dvc
DVC pipelines
- Introduction: https://youtu.be/71IGzyH95UY
- Getting started: https://dvc.org/doc/start/data-pipelines
- DVC pipelines let you build (with the dvc stage add command, which replaced dvc run, removed in DVC 3.0) or define (by editing the dvc.yaml file) a dependency graph between the steps performed in a project (such as "data preparation", "training", "evaluation").
- A pipeline defined this way can then be run with the dvc repro command.
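The dependency graph can be pictured with a small Python sketch. It is only an illustration of the idea, not DVC's implementation: the stage names, scripts and file paths below are hypothetical, and the dictionary merely mirrors the stages/cmd/deps/outs layout of a dvc.yaml file. A stage depends on another stage when one of its deps is that stage's output.

```python
from graphlib import TopologicalSorter

# Hypothetical stages, mirroring the stages/cmd/deps/outs layout of dvc.yaml.
STAGES = {
    "prepare": {
        "cmd": "python prepare.py",
        "deps": ["prepare.py", "data/data.xml"],
        "outs": ["data/prepared"],
    },
    "train": {
        "cmd": "python train.py",
        "deps": ["train.py", "data/prepared"],
        "outs": ["model.pkl"],
    },
    "evaluate": {
        "cmd": "python evaluate.py",
        "deps": ["evaluate.py", "model.pkl", "data/prepared"],
        "outs": ["metrics.json"],
    },
}


def execution_order(stages):
    """Return stage names ordered so that every stage runs after its producers."""
    # Map each output path to the stage that produces it.
    producer = {out: name for name, stage in stages.items() for out in stage["outs"]}
    # A stage depends on another stage when one of its deps is that stage's output.
    graph = {
        name: {producer[dep] for dep in stage["deps"] if dep in producer}
        for name, stage in stages.items()
    }
    return list(TopologicalSorter(graph).static_order())


print(execution_order(STAGES))
```

In this sketch the order is forced by the files: train needs the output of prepare, and evaluate needs the outputs of both earlier stages.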
Tasks [5 pts + 10 bonus pts]
Deadline: May 29, 2024
- Initialize a DVC repository inside your project repository [1 pt]
- Add the data file(s) from your project to DVC [1 pt]
- Configure a remote (configuration details are given below) [3 pts]
- [Bonus] Create/define and add to the repository a dvc.yaml file describing the steps performed in your project. Separate out at least 2 steps (e.g. data preparation/training) linked to each other by dependencies (use the "Getting started" materials linked above) [10 pts (optional)]
SSH remote
One of the remote types supported by DVC is SFTP/SSH.
To make use of it, a user named ium-sftp has been created on the server tzietkiewicz.vm.wmi.amu.edu.pl and an SFTP server has been configured.
An SSH key has also been generated for this user and added as a "Jenkins credential" (see the Jenkins configuration description below).
Locally
We will need additional dependencies (details)
conda install dvc-ssh
or
pip install dvc[ssh] paramiko
# conda install -c conda-forge dvc-ssh
!pip install dvc[ssh] paramiko
Requirement already satisfied: dvc[ssh] in /home/pawel/ium/venv/lib/python3.10/site-packages (3.50.2)
Collecting paramiko
Downloading paramiko-3.4.0-py3-none-any.whl (225 kB)
Collecting dvc-ssh<5,>=4
Downloading dvc_ssh-4.1.1-py3-none-any.whl (15 kB)
Collecting bcrypt>=3.2
Downloading bcrypt-4.1.3-cp39-abi3-manylinux_2_28_x86_64.whl (283 kB)
Collecting pynacl>=1.5
Downloading PyNaCl-1.5.0-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (856 kB)
Collecting sshfs[bcrypt]>=2023.4.1
Downloading sshfs-2024.4.1-py3-none-any.whl (15 kB)
Installing collected packages: bcrypt, pynacl, paramiko, sshfs, dvc-ssh
Successfully installed bcrypt-4.1.3 dvc-ssh-4.1.1 paramiko-3.4.0 pynacl-1.5.0 sshfs-2024.4.1
## The packages below are needed for the dvc remote commands to work:
!sudo apt install libssl3 libffi7
[sudo] password for pawel:
Add the remote:
!dvc remote add -f -d ium_ssh_remote ssh://ium-sftp@tzietkiewicz.vm.wmi.amu.edu.pl
Setting 'ium_ssh_remote' as a default remote.
!dvc remote list
my_local_remote /home/pawel/dvcstore
ium_ssh_remote ssh://ium-sftp@tzietkiewicz.vm.wmi.amu.edu.pl
Save the password (the --local flag writes it to .dvc/config.local, which is not committed to git):
!dvc remote modify --local ium_ssh_remote password IUM@2021
Push to the configured remote:
!dvc push
Collecting |1.00 [00:00, 252entry/s]
100%|██████████|Pushing to ssh 1/1 [00:00<00:00, 8.63file/s]
1 file pushed
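The local steps above (add the remote, store the password, push) can also be scripted. The sketch below is one possible way to do it, not a prescribed one: it only builds the argument lists for the same dvc commands used in this section, and it reads the password from an environment variable so that it does not have to be typed into the notebook. The function name remote_setup_commands and the variable name IUM_SFTP_PASS are assumptions made for this example.

```python
import os


def remote_setup_commands(name, url, password):
    """Build the argv lists for the DVC commands used in this section."""
    return [
        # -f overwrites an existing remote of the same name, -d makes it the default.
        ["dvc", "remote", "add", "-f", "-d", name, url],
        # --local writes to .dvc/config.local, which is not committed to git.
        ["dvc", "remote", "modify", "--local", name, "password", password],
        ["dvc", "push"],
    ]


# IUM_SFTP_PASS is an assumed variable name; the fallback only keeps the sketch runnable.
password = os.environ.get("IUM_SFTP_PASS", "<password>")
commands = remote_setup_commands(
    "ium_ssh_remote",
    "ssh://ium-sftp@tzietkiewicz.vm.wmi.amu.edu.pl",
    password,
)
for argv in commands:
    # Print with the password masked instead of executing anything.
    print(" ".join("***" if arg == password else arg for arg in argv))
```

To actually run the commands, each list could be passed to subprocess.run(argv, check=True) from inside the project repository.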
Jenkins
In Jenkins, the "Credentials" mechanism can be used to pass a password or a private key securely.
The following credentials have been created on Jenkins for the ium-sftp user:
- of type SSH key: https://tzietkiewicz.vm.wmi.amu.edu.pl:8081/credentials/store/system/domain/_/credential/48ac7004-216e-4260-abba-1fe5db753e18/
- of type "secret text", containing the password of the ium-sftp user: https://tzietkiewicz.vm.wmi.amu.edu.pl:8081/credentials/store/system/domain/_/credential/ium-sftp-password/
A description of using "Credentials" in a Jenkinsfile: https://www.jenkins.io/doc/book/pipeline/jenkinsfile/#for-other-credential-types
The SSH key can be used like this:
withCredentials([sshUserPrivateKey(credentialsId: '48ac7004-216e-4260-abba-1fe5db753e18', keyFileVariable: 'IUM_SFTP_KEY', passphraseVariable: '', usernameVariable: '')]) {
    sh 'dvc remote add -d ium_ssh_remote ssh://ium-sftp@tzietkiewicz.vm.wmi.amu.edu.pl/ium-sftp'
    sh 'dvc remote modify --local ium_ssh_remote keyfile $IUM_SFTP_KEY'
    sh 'dvc pull'
}
And the secret text like this:
withCredentials([string(credentialsId: 'ium-sftp-password', variable: 'IUM_SFTP_PASS')]) {
sh 'dvc remote add -d ium_ssh_remote ssh://ium-sftp@tzietkiewicz.vm.wmi.amu.edu.pl/ium-sftp'
sh 'dvc remote modify --local ium_ssh_remote password $IUM_SFTP_PASS'
sh 'dvc pull'
}
Example configuration: