{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "9d06fc91",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
"## Machine Learning Engineering\n",
"### 29 May 2024\n",
"# 11. GitHub Actions"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "beeb17b2",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
|
|
"<img src=\"img/expcontrol/github-actions.jpeg\">"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "752995e1",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
" - https://docs.github.com/en/actions\n",
" - A continuous integration system “built into” GitHub\n",
" - Free for public repositories (with stricter [resource limits](https://docs.github.com/en/actions/reference/usage-limits-billing-and-administration#usage-limits) than in paid plans)\n",
" - https://youtu.be/cP0I9w2coGU"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "b66dd41f",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
"### GitHub Actions terminology\n",
" - ***Workflow*** - the counterpart of a Jenkins *pipeline*.\n",
" - ***Event*** - something that triggers a *workflow*, e.g. pushing a change to the repository (*push*) or opening a pull request ([full list here](https://docs.github.com/en/actions/reference/events-that-trigger-workflows)).\n",
" - ***Job*** - a task. A workflow consists of one or more jobs. Jobs can run in parallel, each on a different machine (see *runner*).\n",
" - ***Step*** - the counterpart of a Jenkins *stage*; it groups *actions*.\n",
" - ***Action/command*** - the counterpart of a Jenkins *step*; a single command to execute, e.g. adding a comment to a pull request or running a shell command.\n",
" - ***Runner*** - the counterpart of a Jenkins *agent*; a server on which jobs can be executed:\n",
"   - *GitHub-hosted runner* - a server maintained by GitHub (2-core CPU, 7 GB RAM, 14 GB SSD), running Windows, Linux, or macOS.\n",
"   - *Self-hosted runner* - your own server with the [GitHub Actions Runner](https://github.com/actions/runner) application installed."
|
|
]
|
|
},
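{
"cell_type": "markdown",
"id": "ex-terminology-yaml",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"To tie the terms above together, here is a minimal annotated sketch of a workflow file. The names `terminology-demo` and `say-hello` are hypothetical, chosen only for illustration; the comments map each part of the file to the terminology.\n",
"\n",
"```yaml\n",
"name: terminology-demo        # the workflow\n",
"on: [push]                    # the event that triggers it\n",
"jobs:\n",
"  say-hello:                  # a job\n",
"    runs-on: ubuntu-latest    # the runner (GitHub-hosted here)\n",
"    steps:\n",
"      - name: Checkout repo   # a step that uses a ready-made action\n",
"        uses: actions/checkout@v2\n",
"      - name: Greet           # a step that runs a shell command\n",
"        run: echo Hello\n",
"```"
]
},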
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "9f1f6d0a",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
"### Defining a *workflow*\n",
" - A *workflow* is defined in YAML files (with the `*.yml` or `*.yaml` extension) placed in the special `.github/workflows/` folder inside the repository.\n",
" - The full syntax is described [here](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions).\n",
" - Basic fields:\n",
"   - `name` (optional) - the name under which the *workflow*/*step* is shown in the UI. Default for a workflow: the path to the YAML file.\n",
"   - `on` - defines when the workflow should run.\n",
"   - `jobs` - groups the jobs to execute. Each job can run on a different *runner*. By default jobs run in parallel (but we can define [dependencies between jobs](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idneeds), which makes them run sequentially).\n",
"   - `runs-on` - a job parameter defining the virtual machine on which the job should run (e.g. `ubuntu-latest`).\n",
"   - `uses` - lets us use ready-made actions defined by us or by other users, e.g. `- uses: actions/checkout@v2` checks out the files from the repository.\n",
"   - `run` - runs any ([available/installed](https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners#preinstalled-software)) command, e.g. `python3 train.py`\n",
"   - `env` - defines environment variables available to actions, or lets us use the [variables set by GitHub](https://docs.github.com/en/actions/reference/environment-variables#default-environment-variables)."
|
|
]
|
|
},
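{
"cell_type": "markdown",
"id": "ex-needs-env-yaml",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"The `needs` and `env` fields described above could be combined roughly as follows. This is only a sketch: the workflow name, the job names `train` and `evaluate`, and the variable `GREETING` are hypothetical. `GITHUB_SHA` is one of the default variables set by GitHub.\n",
"\n",
"```yaml\n",
"name: needs-and-env-demo\n",
"on: [push]\n",
"env:\n",
"  GREETING: Hello             # workflow-level variable, visible in every job\n",
"jobs:\n",
"  train:\n",
"    runs-on: ubuntu-latest\n",
"    steps:\n",
"      - run: echo \"$GREETING from the train job\"\n",
"  evaluate:\n",
"    runs-on: ubuntu-latest\n",
"    needs: train              # starts only after train has finished successfully\n",
"    steps:\n",
"      - run: echo \"$GREETING from the evaluate job, commit $GITHUB_SHA\"\n",
"```\n",
"\n",
"Without the `needs` line both jobs would be scheduled in parallel."
]
},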
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 9,
|
|
"id": "f4916c1f",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"/home/pawel/ium/IUM_11/github-actions-hello\n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"/home/pawel/ium/venv/lib/python3.10/site-packages/IPython/core/magics/osm.py:417: UserWarning: This is now an optional IPython functionality, setting dhist requires you to install the `pickleshare` library.\n",
|
|
" self.shell.db['dhist'] = compress_dhist(dhist)[-100:]\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"!mkdir -p IUM_11/github-actions-hello\n",
|
|
"%cd IUM_11/github-actions-hello\n",
|
|
"!mkdir -p .github/workflows"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 10,
|
|
"id": "88ce689f",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Initialized empty Git repository in /home/pawel/ium/IUM_11/github-actions-hello/.git/\n",
|
|
"Switched to a new branch 'main'\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"!git init\n",
|
|
"!git checkout -b main\n",
|
|
"!git remote add origin git@github.com:USERNAME/ium-ga-hello.git\n",
|
|
"!git push -u origin main"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 11,
|
|
"id": "dde8d432",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Writing .github/workflows/workflow.yml\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
"%%writefile .github/workflows/workflow.yml\n",
"name: github-actions-hello\n",
"on: [push]\n",
"jobs:\n",
"  hello-job:\n",
"    runs-on: ubuntu-latest\n",
"    steps:\n",
"      - name: Checkout repo\n",
"        uses: actions/checkout@v2\n",
"      - name: Setup Python\n",
"        uses: actions/setup-python@v2.2.2\n",
"        with:\n",
"          python-version: '3.10'\n",
"      - run: python3 --version"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 22,
|
|
"id": "ff1e011e",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"On branch main\n",
|
|
"Your branch is up to date with 'origin/main'.\n",
|
|
"\n",
|
|
"nothing to commit, working tree clean\n",
|
|
"Everything up-to-date\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"!git add .github/workflows/workflow.yml\n",
|
|
"!git commit -m \"Github Actions Workflow\"\n",
|
|
"!git push"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "3e237076",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
"### The *Actions* tab on the repository page:\n",
"https://github.com/skorzewski/ium-ga-hello/actions"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 12,
|
|
"id": "32701383",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"total 24\n",
|
|
"drwxr-xr-x 2 pawel pawel 4096 May 28 10:10 .\n",
|
|
"drwxr-xr-x 3 pawel pawel 4096 May 28 10:10 ..\n",
|
|
"-rw-r--r-- 1 pawel pawel 1451 May 28 10:10 docker-artifact.yml\n",
|
|
"-rw-r--r-- 1 pawel pawel 882 May 28 10:10 docker.yml\n",
|
|
"-rw-r--r-- 1 pawel pawel 603 May 28 10:10 parametrized.yml\n",
|
|
"-rw-r--r-- 1 pawel pawel 306 May 28 10:10 workflow.yml\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"!ls -al .github/workflows"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "1c01acb5",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
"### Manual triggering\n",
"A workflow can also be triggered manually, with parameters.\n",
"More information can be found e.g. here: https://github.blog/changelog/2020-07-06-github-actions-manual-triggers-with-workflow_dispatch/"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 42,
|
|
"id": "a7250bf7",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Overwriting .github/workflows/parametrized.yml\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
"%%writefile .github/workflows/parametrized.yml\n",
"name: github-actions-hello-parametrized\n",
"on:\n",
"  workflow_dispatch:\n",
"    inputs:\n",
"      input_text:\n",
"        description: 'Text to display'\n",
"        required: true\n",
"        default: 'Hello World'\n",
"jobs:\n",
"  hello-job:\n",
"    runs-on: ubuntu-latest\n",
"    steps:\n",
"      - name: Checkout repo\n",
"        uses: actions/checkout@v2\n",
"      - name: Install dependencies\n",
"        run:\n",
"          sudo apt update;\n",
"          sudo apt install -y figlet\n",
"      - name: Write\n",
"        run:\n",
"          figlet \"${{ github.event.inputs.input_text }}\""
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 43,
|
|
"id": "36ddaac0",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"[main a98938d] just dispatch\n",
|
|
" 1 file changed, 6 deletions(-)\n",
|
|
"Enumerating objects: 9, done.\n",
|
|
"Counting objects: 100% (9/9), done.\n",
|
|
"Delta compression using up to 4 threads\n",
|
|
"Compressing objects: 100% (3/3), done.\n",
|
|
"Writing objects: 100% (5/5), 411 bytes | 411.00 KiB/s, done.\n",
|
|
"Total 5 (delta 1), reused 0 (delta 0), pack-reused 0\n",
|
|
"remote: Resolving deltas: 100% (1/1), completed with 1 local object.\u001b[K\n",
|
|
"To github.com:TomekZet/ium-ga-hello.git\n",
|
|
" 6c4a361..a98938d main -> main\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"!git add -u .github/workflows\n",
|
|
"!git commit -m \"just dispatch\"\n",
|
|
"!git push"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "ed780dea",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
"### Dependencies\n",
"\n",
"The virtual machines (*runners*) on which jobs run come with a set of preinstalled tools. Example list for Ubuntu 24.04: https://github.com/actions/runner-images/blob/main/images/ubuntu/Ubuntu2404-Readme.md\n",
"\n",
"Missing dependencies can be installed using:\n",
" - actions\n",
" - system commands such as `apt install` or `pip install`, run via `run`. See the [example](https://docs.github.com/en/actions/using-github-hosted-runners/customizing-github-hosted-runners#installing-software-on-ubuntu-runners)"
|
|
]
|
|
},
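{
"cell_type": "markdown",
"id": "ex-dependencies-yaml",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Both ways of installing dependencies might look like this in a single job. This is a sketch under assumptions: the workflow and job names are hypothetical, and a `requirements.txt` file is assumed to exist in the repository root.\n",
"\n",
"```yaml\n",
"name: dependencies-demo\n",
"on: [push]\n",
"jobs:\n",
"  deps-demo:\n",
"    runs-on: ubuntu-latest\n",
"    steps:\n",
"      - name: Checkout repo\n",
"        uses: actions/checkout@v2\n",
"      - name: Setup Python                        # dependency provided by an action\n",
"        uses: actions/setup-python@v2\n",
"        with:\n",
"          python-version: '3.10'\n",
"      - name: Install system and Python packages  # dependencies installed via run\n",
"        run: |\n",
"          sudo apt-get update\n",
"          sudo apt-get install -y figlet\n",
"          pip install -r requirements.txt\n",
"```"
]
},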
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "28b582c4",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
"### Actions\n",
"With the `uses` keyword we can use previously prepared actions. They can come from:\n",
" - the same repository as the workflow ([more](https://docs.github.com/en/actions/learn-github-actions/finding-and-customizing-actions#referencing-an-action-in-the-same-repository-where-a-workflow-file-uses-the-action))\n",
" - any public GitHub repository (e.g. the [iterative/setup-cml repository](https://github.com/iterative/setup-cml), see the example below)\n",
" - [GitHub Marketplace](https://github.com/marketplace?type=actions)"
|
|
]
|
|
},
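{
"cell_type": "markdown",
"id": "ex-uses-sources-yaml",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"The three sources listed above could be referenced from one job roughly as follows. This is a sketch: the workflow and job names are hypothetical, and the `./` reference assumes that an `action.yml` file exists in the root of the repository.\n",
"\n",
"```yaml\n",
"name: uses-demo\n",
"on: [push]\n",
"jobs:\n",
"  uses-demo:\n",
"    runs-on: ubuntu-latest\n",
"    steps:\n",
"      - uses: actions/checkout@v2       # checkout first, so that the local action is available\n",
"      - uses: ./                        # action from the same repository (assumed action.yml in the root)\n",
"      - uses: iterative/setup-cml@v1    # action from a public GitHub repository\n",
"      - uses: actions/setup-python@v2   # action that is also listed on GitHub Marketplace\n",
"```\n",
"\n",
"Actions from other repositories are referenced as `owner/repository@ref`, where `ref` is a tag, branch, or commit."
]
},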
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "a764cc0d",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
"### Actions run in a Docker container\n",
"An action can be run in a Docker container (pulled from Docker Hub or built from a `Dockerfile`).\n",
"To do this, create your own action in an `action.yml` (or `action.yaml`) file and then use it in a *workflow*."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 59,
|
|
"id": "ff4dab8c",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Overwriting action.yml\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
"%%writefile action.yml\n",
"name: 'Hello World'\n",
"description: 'Greet someone and record the time'\n",
"inputs:\n",
"  who-to-greet:  # id of input\n",
"    description: 'Who to greet'\n",
"    required: true\n",
"    default: 'World'\n",
"outputs:\n",
"  time:  # id of output\n",
"    description: 'The time we greeted you'\n",
"runs:\n",
"  using: 'docker'\n",
"  image: 'Dockerfile'\n",
"  args:\n",
"    - ${{ inputs.who-to-greet }}"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 80,
|
|
"id": "f1aaff7c",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Overwriting Dockerfile\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"%%writefile Dockerfile\n",
|
|
"# Container image that runs your code\n",
|
|
"FROM ubuntu:latest\n",
|
|
" \n",
|
|
"RUN apt update && apt install -y figlet\n",
|
|
"\n",
|
|
"# Copies your code file from your action repository to the filesystem path `/` of the container\n",
|
|
"COPY entrypoint.sh /entrypoint.sh\n",
|
|
"\n",
|
|
"VOLUME /github/workspace/\n",
|
|
"\n",
|
|
"WORKDIR /github/workspace/\n",
|
|
"\n",
|
|
"# Code file to execute when the docker container starts up (`entrypoint.sh`)\n",
|
|
"ENTRYPOINT [\"/entrypoint.sh\"]"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 84,
|
|
"id": "7f778025",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Overwriting entrypoint.sh\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"%%writefile entrypoint.sh\n",
|
|
"#!/bin/sh -l\n",
|
|
"\n",
|
|
"figlet \"Hello $1\" | tee figlet.txt\n",
|
|
"echo \"Entrypoint invoked in: $PWD\"\n",
|
|
"readlink -f figlet.txt\n",
|
|
"time=$(date)\n",
|
|
"echo \"time=$time\" >> $GITHUB_OUTPUT"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 60,
|
|
"id": "911975de",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"!chmod +x entrypoint.sh"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 62,
|
|
"id": "483e0498",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Overwriting .github/workflows/docker.yml\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
"%%writefile .github/workflows/docker.yml\n",
"name: github-actions-hello-docker\n",
"on:\n",
"  workflow_dispatch:\n",
"    inputs:\n",
"      input_text:\n",
"        description: 'Who to greet'\n",
"        required: true\n",
"        default: 'World'\n",
"jobs:\n",
"  hello-job:\n",
"    runs-on: ubuntu-latest\n",
"    steps:\n",
"      - name: Checkout repo\n",
"        uses: actions/checkout@v2\n",
"      - name: Use docker action\n",
"        id: hello\n",
"        uses: ./\n",
"        with:\n",
"          who-to-greet: \"${{ github.event.inputs.input_text }}\"\n",
"      # Use the output from the `hello` step\n",
"      - name: Get the output time\n",
"        run: echo \"The time was ${{ steps.hello.outputs.time }}\""
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 63,
|
|
"id": "bc24dff3",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"[main 22a5094] Fix path\n",
|
|
" 1 file changed, 1 insertion(+)\n",
|
|
"Enumerating objects: 9, done.\n",
|
|
"Counting objects: 100% (9/9), done.\n",
|
|
"Delta compression using up to 4 threads\n",
|
|
"Compressing objects: 100% (5/5), done.\n",
|
|
"Writing objects: 100% (5/5), 570 bytes | 570.00 KiB/s, done.\n",
|
|
"Total 5 (delta 1), reused 0 (delta 0), pack-reused 0\n",
|
|
"remote: Resolving deltas: 100% (1/1), completed with 1 local object.\u001b[K\n",
|
|
"To github.com:TomekZet/ium-ga-hello.git\n",
|
|
" 97c7272..22a5094 main -> main\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"!git add .github entrypoint.sh Dockerfile\n",
|
|
"!git commit -m \"Fix path\"\n",
|
|
"!git push"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "12af9d1b",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
"### Archiving artifacts\n",
"https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts\n",
"\n",
"Artifacts are archived with the `upload-artifact` action:\n",
"\n",
"```yaml\n",
"      - name: Archive artifacts\n",
"        uses: actions/upload-artifact@v4\n",
"        with:\n",
"          name: figlet-output\n",
"          path: figlet.txt\n",
"```"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 11,
|
|
"id": "245f7c8a",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Overwriting .github/workflows/docker-artifact.yml\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
"%%writefile .github/workflows/docker-artifact.yml\n",
"name: github-actions-hello-docker-artifact\n",
"on:\n",
"  workflow_dispatch:\n",
"    inputs:\n",
"      input_text:\n",
"        description: 'Who to greet'\n",
"        required: true\n",
"        default: 'World'\n",
"jobs:\n",
"  hello-job:\n",
"    name: \"Do all the hard stuff\"\n",
"    runs-on: ubuntu-latest\n",
"    steps:\n",
"      - name: Checkout repo\n",
"        uses: actions/checkout@v2\n",
"      - name: Use docker action\n",
"        id: hello\n",
"        uses: ./\n",
"        with:\n",
"          who-to-greet: \"${{ github.event.inputs.input_text }}\"\n",
"      # Use the output from the `hello` step\n",
"      - name: Get the output time\n",
"        run: echo \"The time was ${{ steps.hello.outputs.time }}\" > time.txt\n",
"      - name: Archive artifacts\n",
"        uses: actions/upload-artifact@v4\n",
"        with:\n",
"          name: figlet-output\n",
"          path: |\n",
"            figlet.txt\n",
"            time.txt\n",
"  publish:\n",
"    name: \"Publish as github comment\"\n",
"    runs-on: ubuntu-latest\n",
"    needs: hello-job\n",
"    steps:\n",
"      - uses: actions/checkout@v3\n",
"      # We need to download the artifact first, jobs do not share workflow files\n",
"      - name: get-artifact\n",
"        uses: actions/download-artifact@v4\n",
"        with:\n",
"          name: figlet-output\n",
"      - name: display_artifact_contents\n",
"        run:\n",
"          cat time.txt ; tr ' ' '#' < figlet.txt\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 12,
|
|
"id": "47e301f9",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"[main 5a40228] Archive in one job, use in other\n",
|
|
" 1 file changed, 1 insertion(+)\n",
|
|
"Enumerating objects: 9, done.\n",
|
|
"Counting objects: 100% (9/9), done.\n",
|
|
"Delta compression using up to 4 threads\n",
|
|
"Compressing objects: 100% (5/5), done.\n",
|
|
"Writing objects: 100% (5/5), 622 bytes | 622.00 KiB/s, done.\n",
|
|
"Total 5 (delta 2), reused 0 (delta 0), pack-reused 0\n",
|
|
"remote: Resolving deltas: 100% (2/2), completed with 2 local objects.\u001b[K\n",
|
|
"To github.com:TomekZet/ium-ga-hello.git\n",
|
|
" 4df6dc0..5a40228 main -> main\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"!git add -u\n",
|
|
"!git commit -m \"Archive in one job, use in other\"\n",
|
|
"!git push"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "805622e8",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
"## CML - Continuous Machine Learning"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "e0b3acbf",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
" - Developed by [iterative.ai](https://iterative.ai) (like DVC)\n",
" - https://cml.dev/\n",
" - Documentation: https://dvc.org/doc/cml\n",
" - Works with GitHub Actions or GitLab CI (and also [Bitbucket Pipelines](https://github.com/iterative/cml/wiki/CML-with-Bitbucket-Cloud))\n",
" - CML adds several \"actions\" to GitHub Actions:\n",
"   - `iterative/setup-cml` - provides the ones listed below\n",
"   - `cml-send-comment` - posts the CML report as a comment on a GitHub pull request\n",
"   - `cml-send-github-check` - adds the CML report to the \"Checks\" tab of a GitHub pull request\n",
"   - `cml-publish` - lets you add an image to the report"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "cdb54b38",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
"### Example CML workflow:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 1,
|
|
"id": "07b1035a",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"/home/tomek/AITech/repo/aitech-ium-private/IUM_11\n",
|
|
"Cloning into 'example_cml'...\n",
|
|
"remote: Enumerating objects: 25, done.\u001b[K\n",
|
|
"remote: Total 25 (delta 0), reused 0 (delta 0), pack-reused 25\u001b[K\n",
|
|
"Receiving objects: 100% (25/25), 222.95 KiB | 920.00 KiB/s, done.\n",
|
|
"Resolving deltas: 100% (6/6), done.\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"!git clone git@github.com:TomekZet/example_cml.git"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 5,
|
|
"id": "bf27a2b3",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"/home/tomek/AITech/repo/aitech-ium-private/IUM_11/example_cml\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"%cd example_cml\n",
|
|
"!mkdir -p .github/workflows/"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 14,
|
|
"id": "64f6e21d",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Overwriting .github/workflows/cml.yaml\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
"%%writefile .github/workflows/cml.yaml\n",
"name: model-training\n",
"on: [push]\n",
"jobs:\n",
"  run:\n",
"    runs-on: [ubuntu-latest]\n",
"    steps:\n",
"      - uses: actions/checkout@v2\n",
"      - uses: actions/setup-python@v2\n",
"      - uses: iterative/setup-cml@v1\n",
"      - name: Train model\n",
"        env:\n",
"          REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}\n",
"        run: |\n",
"          pip install -r requirements.txt\n",
"          python train.py\n",
"\n",
"          cat metrics.txt >> report.md\n",
"          cml-publish confusion_matrix.png --md >> report.md\n",
"          cml-send-comment report.md"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "83e49d3b",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
"# %load train.py\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"from sklearn.metrics import plot_confusion_matrix\n",
"import matplotlib.pyplot as plt\n",
"import json\n",
"import os\n",
"import numpy as np\n",
"\n",
"# Read in data\n",
"X_train = np.genfromtxt(\"data/train_features.csv\")\n",
"y_train = np.genfromtxt(\"data/train_labels.csv\")\n",
"X_test = np.genfromtxt(\"data/test_features.csv\")\n",
"y_test = np.genfromtxt(\"data/test_labels.csv\")\n",
"\n",
"\n",
"# Fit a model\n",
"depth = 2\n",
"clf = RandomForestClassifier(max_depth=depth)\n",
"clf.fit(X_train,y_train)\n",
"\n",
"acc = clf.score(X_test, y_test)\n",
"print(acc)\n",
"with open(\"metrics.txt\", 'w') as outfile:\n",
"    outfile.write(\"Accuracy: \" + str(acc) + \"\\n\")\n",
"\n",
"\n",
"# Plot it\n",
"disp = plot_confusion_matrix(clf, X_test, y_test, normalize='true',cmap=plt.cm.Blues)\n",
"plt.savefig('confusion_matrix.png')\n",
"\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "8dc5748f",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
"Let's make a change to the file (line 17: `depth = 6`)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 11,
|
|
"id": "afeaf939",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Overwriting train.py\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
"%%writefile train.py\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"from sklearn.metrics import plot_confusion_matrix\n",
"import matplotlib.pyplot as plt\n",
"import json\n",
"import os\n",
"import numpy as np\n",
"\n",
"# Read in data\n",
"X_train = np.genfromtxt(\"data/train_features.csv\")\n",
"y_train = np.genfromtxt(\"data/train_labels.csv\")\n",
"X_test = np.genfromtxt(\"data/test_features.csv\")\n",
"y_test = np.genfromtxt(\"data/test_labels.csv\")\n",
"\n",
"\n",
"# Fit a model\n",
"depth = 6\n",
"clf = RandomForestClassifier(max_depth=depth)\n",
"clf.fit(X_train,y_train)\n",
"\n",
"acc = clf.score(X_test, y_test)\n",
"print(acc)\n",
"with open(\"metrics.txt\", 'w') as outfile:\n",
"    outfile.write(\"Accuracy: \" + str(acc) + \"\\n\")\n",
"\n",
"\n",
"# Plot it\n",
"disp = plot_confusion_matrix(clf, X_test, y_test, normalize='true',cmap=plt.cm.Blues)\n",
"plt.savefig('confusion_matrix.png')"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "3e4a711a",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
"Let's create a new branch, \"deep_depth\":"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 13,
|
|
"id": "ab019b0b",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Switched to a new branch 'deep_depth'\n",
|
|
"[deep_depth 0df0f2c] Changed depth and added cml workflow\n",
|
|
" 2 files changed, 19 insertions(+), 2 deletions(-)\n",
|
|
" create mode 100644 .github/workflows/cml.yaml\n",
|
|
"Enumerating objects: 8, done.\n",
|
|
"Counting objects: 100% (8/8), done.\n",
|
|
"Delta compression using up to 4 threads\n",
|
|
"Compressing objects: 100% (4/4), done.\n",
|
|
"Writing objects: 100% (6/6), 738 bytes | 738.00 KiB/s, done.\n",
|
|
"Total 6 (delta 2), reused 0 (delta 0)\n",
|
|
"remote: Resolving deltas: 100% (2/2), completed with 2 local objects.\u001b[K\n",
|
|
"remote: \n",
|
|
"remote: Create a pull request for 'deep_depth' on GitHub by visiting:\u001b[K\n",
|
|
"remote: https://github.com/TomekZet/example_cml/pull/new/deep_depth\u001b[K\n",
|
|
"remote: \n",
|
|
"To github.com:TomekZet/example_cml.git\n",
|
|
" * [new branch] deep_depth -> deep_depth\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"!git checkout -b deep_depth\n",
|
|
"!git add train.py .github/workflows/cml.yaml\n",
|
|
"!git commit -m \"Changed depth and added cml workflow\"\n",
|
|
"!git push origin deep_depth"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "b50f46a8",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
|
|
"<img src=\"IUM_11/img/github-pr.png\">"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "c56c8785",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
|
|
"<img src=\"IUM_11/img/github-checks.png\">"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "fb25c587",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
"## Tasks [20 pts] (deadline: 5 June 2024)\n",
"1. Create a GitHub account (if you do not have one yet)\n",
"2. Create a public repository. Paste the link to it into the *Link GitHub (Actions)* column of the `IUM-2024.xlsx` spreadsheet [1 pt]\n",
"3. Create a simple *GitHub workflow* that:\n",
"   - checks out your repository [1 pt]\n",
"   - fetches the training data files (the files can simply be added to the repository, or downloaded with `wget` if they are publicly available) [2 pts]\n",
"   - can be triggered via \"Workflow dispatch\" with training parameters [2 pts]\n",
"   - consists of at least 2 jobs:\n",
"     - model training as a separate action run in Docker [8 pts]\n",
"     - model evaluation [6 pts]"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"author": "Tomasz Ziętkiewicz",
|
|
"celltoolbar": "Slideshow",
|
|
"email": "tomasz.zietkiewicz@amu.edu.pl",
|
|
"kernelspec": {
|
|
"display_name": "Python 3",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"lang": "pl",
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.10.12"
|
|
},
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
},
|
|
"subtitle": "11.CML[laboratoria]",
|
|
"title": "Inżynieria uczenia maszynowego",
|
|
"year": "2021"
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
}
|