Add 1-intro materials
commit
8a2a9643d4
48
0-install/1-system-setup.ipynb
Normal file
@ -0,0 +1,48 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# System setup\n",
|
||||||
|
"\n",
|
||||||
|
"Requirements:\n",
|
||||||
|
"* Ubuntu 20.04\n",
|
||||||
|
"* Python 3.6+\n",
|
||||||
|
"* docker-ce 19.03+\n",
|
||||||
|
"* nvidia-container-toolkit 1.3+\n",
|
||||||
|
"* docker-compose 1.28+ (install docker-compose version that support GPU (https://github.com/docker/compose/pull/7929))\n",
|
||||||
|
"```\n",
|
||||||
|
"sudo pip3 install wheel\n",
|
||||||
|
"sudo pip3 install --upgrade git+https://github.com/docker/compose.git@854c003359bd07d0d3ca137d7a08509cfeab0436#egg=docker-compose\n",
|
||||||
|
"```\n",
|
||||||
|
"\n",
|
||||||
|
"# Tests\n",
|
||||||
|
"\n",
|
||||||
|
"Simple stress test example is avaiable at (it also includes install script):\n",
|
||||||
|
"https://git.wmi.amu.edu.pl/bikol/docker-gpu-tests"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.6.9"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 4
|
||||||
|
}
|
663
1-intro/1_tensor_tutorial.ipynb
Normal file
@ -0,0 +1,663 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 1,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%matplotlib inline"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"\n",
|
||||||
|
"What is PyTorch?\n",
|
||||||
|
"================\n",
|
||||||
|
"\n",
|
||||||
|
"It’s a Python-based scientific computing package targeted at two sets of\n",
|
||||||
|
"audiences:\n",
|
||||||
|
"\n",
|
||||||
|
"- A replacement for NumPy to use the power of GPUs\n",
|
||||||
|
"- a deep learning research platform that provides maximum flexibility\n",
|
||||||
|
" and speed\n",
|
||||||
|
"\n",
|
||||||
|
"Getting Started\n",
|
||||||
|
"---------------\n",
|
||||||
|
"\n",
|
||||||
|
"Tensors\n",
|
||||||
|
"^^^^^^^\n",
|
||||||
|
"\n",
|
||||||
|
"Tensors are similar to NumPy’s ndarrays, with the addition being that\n",
|
||||||
|
"Tensors can also be used on a GPU to accelerate computing.\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 2,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from __future__ import print_function\n",
|
||||||
|
"import torch"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"<div class=\"alert alert-info\"><h4>Note</h4><p>An uninitialized matrix is declared,\n",
|
||||||
|
" but does not contain definite known\n",
|
||||||
|
" values before it is used. When an\n",
|
||||||
|
" uninitialized matrix is created,\n",
|
||||||
|
" whatever values were in the allocated\n",
|
||||||
|
" memory at the time will appear as the initial values.</p></div>\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Construct a 5x3 matrix, uninitialized:\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 3,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([[-1.7501e-10, 4.5822e-41, -1.7501e-10],\n",
|
||||||
|
" [ 4.5822e-41, -9.8701e-38, 4.5822e-41],\n",
|
||||||
|
" [-9.8892e-38, 4.5822e-41, -9.8700e-38],\n",
|
||||||
|
" [ 4.5822e-41, -9.8702e-38, 4.5822e-41],\n",
|
||||||
|
" [-9.8701e-38, 4.5822e-41, -9.8703e-38]])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"x = torch.empty(5, 3)\n",
|
||||||
|
"print(x)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Construct a randomly initialized matrix:\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 4,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([[0.8525, 0.7922, 0.2553],\n",
|
||||||
|
" [0.2792, 0.6800, 0.7858],\n",
|
||||||
|
" [0.4438, 0.6987, 0.0985],\n",
|
||||||
|
" [0.7342, 0.1807, 0.5665],\n",
|
||||||
|
" [0.0847, 0.8206, 0.6820]])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"x = torch.rand(5, 3)\n",
|
||||||
|
"print(x)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Construct a matrix filled zeros and of dtype long:\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 5,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([[0, 0, 0],\n",
|
||||||
|
" [0, 0, 0],\n",
|
||||||
|
" [0, 0, 0],\n",
|
||||||
|
" [0, 0, 0],\n",
|
||||||
|
" [0, 0, 0]])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"x = torch.zeros(5, 3, dtype=torch.long)\n",
|
||||||
|
"print(x)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Construct a tensor directly from data:\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 6,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([5.5000, 3.0000])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"x = torch.tensor([5.5, 3])\n",
|
||||||
|
"print(x)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"or create a tensor based on an existing tensor. These methods\n",
|
||||||
|
"will reuse properties of the input tensor, e.g. dtype, unless\n",
|
||||||
|
"new values are provided by user\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 7,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([[1., 1., 1.],\n",
|
||||||
|
" [1., 1., 1.],\n",
|
||||||
|
" [1., 1., 1.],\n",
|
||||||
|
" [1., 1., 1.],\n",
|
||||||
|
" [1., 1., 1.]], dtype=torch.float64)\n",
|
||||||
|
"tensor([[ 1.0131, 1.4739, -0.2482],\n",
|
||||||
|
" [-1.8965, -1.6178, 0.4807],\n",
|
||||||
|
" [ 0.1839, 0.3258, -0.6664],\n",
|
||||||
|
" [-0.9516, -1.7041, 1.1624],\n",
|
||||||
|
" [-0.4448, -1.1328, -0.5092]])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"x = x.new_ones(5, 3, dtype=torch.double) # new_* methods take in sizes\n",
|
||||||
|
"print(x)\n",
|
||||||
|
"\n",
|
||||||
|
"x = torch.randn_like(x, dtype=torch.float) # override dtype!\n",
|
||||||
|
"print(x) # result has the same size"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Get its size:\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 8,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"torch.Size([5, 3])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"print(x.size())"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"<div class=\"alert alert-info\"><h4>Note</h4><p>``torch.Size`` is in fact a tuple, so it supports all tuple operations.</p></div>\n",
|
||||||
|
"\n",
|
||||||
|
"Operations\n",
|
||||||
|
"^^^^^^^^^^\n",
|
||||||
|
"There are multiple syntaxes for operations. In the following\n",
|
||||||
|
"example, we will take a look at the addition operation.\n",
|
||||||
|
"\n",
|
||||||
|
"Addition: syntax 1\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 9,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([[ 1.6789, 1.8680, -0.0202],\n",
|
||||||
|
" [-1.2243, -1.5905, 0.8047],\n",
|
||||||
|
" [ 0.5959, 0.7308, -0.1883],\n",
|
||||||
|
" [-0.6292, -0.7051, 1.8369],\n",
|
||||||
|
" [-0.0381, -0.2377, -0.1590]])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"y = torch.rand(5, 3)\n",
|
||||||
|
"print(x + y)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Addition: syntax 2\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 10,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([[ 1.6789, 1.8680, -0.0202],\n",
|
||||||
|
" [-1.2243, -1.5905, 0.8047],\n",
|
||||||
|
" [ 0.5959, 0.7308, -0.1883],\n",
|
||||||
|
" [-0.6292, -0.7051, 1.8369],\n",
|
||||||
|
" [-0.0381, -0.2377, -0.1590]])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"print(torch.add(x, y))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Addition: providing an output tensor as argument\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 11,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([[ 1.6789, 1.8680, -0.0202],\n",
|
||||||
|
" [-1.2243, -1.5905, 0.8047],\n",
|
||||||
|
" [ 0.5959, 0.7308, -0.1883],\n",
|
||||||
|
" [-0.6292, -0.7051, 1.8369],\n",
|
||||||
|
" [-0.0381, -0.2377, -0.1590]])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"result = torch.empty(5, 3)\n",
|
||||||
|
"torch.add(x, y, out=result)\n",
|
||||||
|
"print(result)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Addition: in-place\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 12,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([[ 1.6789, 1.8680, -0.0202],\n",
|
||||||
|
" [-1.2243, -1.5905, 0.8047],\n",
|
||||||
|
" [ 0.5959, 0.7308, -0.1883],\n",
|
||||||
|
" [-0.6292, -0.7051, 1.8369],\n",
|
||||||
|
" [-0.0381, -0.2377, -0.1590]])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"# adds x to y\n",
|
||||||
|
"y.add_(x)\n",
|
||||||
|
"print(y)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"<div class=\"alert alert-info\"><h4>Note</h4><p>Any operation that mutates a tensor in-place is post-fixed with an ``_``.\n",
|
||||||
|
" For example: ``x.copy_(y)``, ``x.t_()``, will change ``x``.</p></div>\n",
|
||||||
|
"\n",
|
||||||
|
"You can use standard NumPy-like indexing with all bells and whistles!\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 13,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([ 1.4739, -1.6178, 0.3258, -1.7041, -1.1328])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"print(x[:, 1])"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Resizing: If you want to resize/reshape tensor, you can use ``torch.view``:\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 14,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"torch.Size([4, 4]) torch.Size([16]) torch.Size([2, 8])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"x = torch.randn(4, 4)\n",
|
||||||
|
"y = x.view(16)\n",
|
||||||
|
"z = x.view(-1, 8) # the size -1 is inferred from other dimensions\n",
|
||||||
|
"print(x.size(), y.size(), z.size())"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"If you have a one element tensor, use ``.item()`` to get the value as a\n",
|
||||||
|
"Python number\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 15,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([-0.8622])\n",
|
||||||
|
"-0.8622472882270813\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"x = torch.randn(1)\n",
|
||||||
|
"print(x)\n",
|
||||||
|
"print(x.item())"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"**Read later:**\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
" 100+ Tensor operations, including transposing, indexing, slicing,\n",
|
||||||
|
" mathematical operations, linear algebra, random numbers, etc.,\n",
|
||||||
|
" are described\n",
|
||||||
|
" `here <https://pytorch.org/docs/torch>`_.\n",
|
||||||
|
"\n",
|
||||||
|
"NumPy Bridge\n",
|
||||||
|
"------------\n",
|
||||||
|
"\n",
|
||||||
|
"Converting a Torch Tensor to a NumPy array and vice versa is a breeze.\n",
|
||||||
|
"\n",
|
||||||
|
"The Torch Tensor and NumPy array will share their underlying memory\n",
|
||||||
|
"locations (if the Torch Tensor is on CPU), and changing one will change\n",
|
||||||
|
"the other.\n",
|
||||||
|
"\n",
|
||||||
|
"Converting a Torch Tensor to a NumPy Array\n",
|
||||||
|
"^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 16,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([1., 1., 1., 1., 1.])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"a = torch.ones(5)\n",
|
||||||
|
"print(a)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 17,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"[1. 1. 1. 1. 1.]\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"b = a.numpy()\n",
|
||||||
|
"print(b)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"See how the numpy array changed in value.\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 18,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([2., 2., 2., 2., 2.])\n",
|
||||||
|
"[2. 2. 2. 2. 2.]\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"a.add_(1)\n",
|
||||||
|
"print(a)\n",
|
||||||
|
"print(b)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Converting NumPy Array to Torch Tensor\n",
|
||||||
|
"See how changing the np array changed the Torch Tensor automatically\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 19,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"[2. 2. 2. 2. 2.]\n",
|
||||||
|
"tensor([2., 2., 2., 2., 2.], dtype=torch.float64)\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"import numpy as np\n",
|
||||||
|
"a = np.ones(5)\n",
|
||||||
|
"b = torch.from_numpy(a)\n",
|
||||||
|
"np.add(a, 1, out=a)\n",
|
||||||
|
"print(a)\n",
|
||||||
|
"print(b)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"All the Tensors on the CPU except a CharTensor support converting to\n",
|
||||||
|
"NumPy and back.\n",
|
||||||
|
"\n",
|
||||||
|
"CUDA Tensors\n",
|
||||||
|
"------------\n",
|
||||||
|
"\n",
|
||||||
|
"Tensors can be moved onto any device using the ``.to`` method.\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 20,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"1.7.0\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "stderr",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"/usr/local/lib/python3.6/dist-packages/torch/cuda/__init__.py:81: UserWarning: \n",
|
||||||
|
" Found GPU0 GeForce GTX 760 which is of cuda capability 3.0.\n",
|
||||||
|
" PyTorch no longer supports this GPU because it is too old.\n",
|
||||||
|
" The minimum cuda capability that we support is 3.5.\n",
|
||||||
|
" \n",
|
||||||
|
" warnings.warn(old_gpu_warn % (d, name, major, capability[1]))\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ename": "RuntimeError",
|
||||||
|
"evalue": "CUDA error: no kernel image is available for execution on the device",
|
||||||
|
"output_type": "error",
|
||||||
|
"traceback": [
|
||||||
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||||||
|
"\u001b[0;31mRuntimeError\u001b[0m Traceback (most recent call last)",
|
||||||
|
"\u001b[0;32m<ipython-input-20-9fca8bb14c5b>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcuda\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mis_available\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mdevice\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdevice\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"cuda\"\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# a CUDA device object\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0my\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mones_like\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdevice\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdevice\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# directly create a tensor on GPU\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 7\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mto\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdevice\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# or just use strings ``.to(\"cuda\")``\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 8\u001b[0m \u001b[0mz\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
|
||||||
|
"\u001b[0;31mRuntimeError\u001b[0m: CUDA error: no kernel image is available for execution on the device"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"# let us run this cell only if CUDA is available\n",
|
||||||
|
"# We will use ``torch.device`` objects to move tensors in and out of GPU\n",
|
||||||
|
"print(torch.__version__)\n",
|
||||||
|
"if torch.cuda.is_available():\n",
|
||||||
|
" device = torch.device(\"cuda\") # a CUDA device object\n",
|
||||||
|
" y = torch.ones_like(x, device=device) # directly create a tensor on GPU\n",
|
||||||
|
" x = x.to(device) # or just use strings ``.to(\"cuda\")``\n",
|
||||||
|
" z = x + y\n",
|
||||||
|
" print(z)\n",
|
||||||
|
" print(z.to(\"cpu\", torch.double)) # ``.to`` can also change dtype together!\n",
|
||||||
|
" "
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.6.9"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 1
|
||||||
|
}
|
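The tensor tutorial above notes that in ``x.view(-1, 8)`` the size ``-1`` is inferred from the other dimensions. A minimal pure-Python sketch of that inference follows. It assumes a hypothetical helper named ``infer_view_shape``, which is an illustration only and not PyTorch's actual implementation:

```python
def infer_view_shape(numel, shape):
    """Resolve a single -1 in `shape` so the sizes multiply to `numel`.

    Hypothetical helper for illustration only; not part of PyTorch.
    """
    unknown = [i for i, size in enumerate(shape) if size == -1]
    if len(unknown) > 1:
        raise ValueError("only one dimension can be inferred")

    # Product of all explicitly given sizes.
    known = 1
    for size in shape:
        if size != -1:
            known *= size

    if unknown:
        # The inferred size must divide the element count evenly.
        if known == 0 or numel % known != 0:
            raise ValueError(f"shape {shape} is invalid for input of size {numel}")
        resolved = list(shape)
        resolved[unknown[0]] = numel // known
        return tuple(resolved)

    if known != numel:
        raise ValueError(f"shape {shape} is invalid for input of size {numel}")
    return tuple(shape)


# A 4x4 tensor holds 16 elements.
print(infer_view_shape(16, (-1, 8)))  # (2, 8)
print(infer_view_shape(16, (16,)))    # (16,)
```

With 16 elements, as in the ``torch.randn(4, 4)`` example, ``(-1, 8)`` resolves to ``(2, 8)``, which matches the ``torch.Size([2, 8])`` printed in the tutorial.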
486
1-intro/2_autograd_tutorial.ipynb
Normal file
@ -0,0 +1,486 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 1,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%matplotlib inline"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"\n",
|
||||||
|
"Autograd: Automatic Differentiation\n",
|
||||||
|
"===================================\n",
|
||||||
|
"\n",
|
||||||
|
"Central to all neural networks in PyTorch is the ``autograd`` package.\n",
|
||||||
|
"Let’s first briefly visit this, and we will then go to training our\n",
|
||||||
|
"first neural network.\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"The ``autograd`` package provides automatic differentiation for all operations\n",
|
||||||
|
"on Tensors. It is a define-by-run framework, which means that your backprop is\n",
|
||||||
|
"defined by how your code is run, and that every single iteration can be\n",
|
||||||
|
"different.\n",
|
||||||
|
"\n",
|
||||||
|
"Let us see this in more simple terms with some examples.\n",
|
||||||
|
"\n",
|
||||||
|
"Tensor\n",
|
||||||
|
"--------\n",
|
||||||
|
"\n",
|
||||||
|
"``torch.Tensor`` is the central class of the package. If you set its attribute\n",
|
||||||
|
"``.requires_grad`` as ``True``, it starts to track all operations on it. When\n",
|
||||||
|
"you finish your computation you can call ``.backward()`` and have all the\n",
|
||||||
|
"gradients computed automatically. The gradient for this tensor will be\n",
|
||||||
|
"accumulated into ``.grad`` attribute.\n",
|
||||||
|
"\n",
|
||||||
|
"To stop a tensor from tracking history, you can call ``.detach()`` to detach\n",
|
||||||
|
"it from the computation history, and to prevent future computation from being\n",
|
||||||
|
"tracked.\n",
|
||||||
|
"\n",
|
||||||
|
"To prevent tracking history (and using memory), you can also wrap the code block\n",
|
||||||
|
"in ``with torch.no_grad():``. This can be particularly helpful when evaluating a\n",
|
||||||
|
"model because the model may have trainable parameters with\n",
|
||||||
|
"``requires_grad=True``, but for which we don't need the gradients.\n",
|
||||||
|
"\n",
|
||||||
|
"There’s one more class which is very important for autograd\n",
|
||||||
|
"implementation - a ``Function``.\n",
|
||||||
|
"\n",
|
||||||
|
"``Tensor`` and ``Function`` are interconnected and build up an acyclic\n",
|
||||||
|
"graph, that encodes a complete history of computation. Each tensor has\n",
|
||||||
|
"a ``.grad_fn`` attribute that references a ``Function`` that has created\n",
|
||||||
|
"the ``Tensor`` (except for Tensors created by the user - their\n",
|
||||||
|
"``grad_fn is None``).\n",
|
||||||
|
"\n",
|
||||||
|
"If you want to compute the derivatives, you can call ``.backward()`` on\n",
|
||||||
|
"a ``Tensor``. If ``Tensor`` is a scalar (i.e. it holds a one element\n",
|
||||||
|
"data), you don’t need to specify any arguments to ``backward()``,\n",
|
||||||
|
"however if it has more elements, you need to specify a ``gradient``\n",
|
||||||
|
"argument that is a tensor of matching shape.\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 2,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import torch"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Create a tensor and set ``requires_grad=True`` to track computation with it\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 3,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([[1., 1.],\n",
|
||||||
|
" [1., 1.]], requires_grad=True)\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"x = torch.ones(2, 2, requires_grad=True)\n",
|
||||||
|
"print(x)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Do a tensor operation:\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 4,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([[3., 3.],\n",
|
||||||
|
" [3., 3.]], grad_fn=<AddBackward0>)\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"y = x + 2\n",
|
||||||
|
"print(y)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"``y`` was created as a result of an operation, so it has a ``grad_fn``.\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 5,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"<AddBackward0 object at 0x7f0d183e5160>\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"print(y.grad_fn)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Do more operations on ``y``\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 6,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([[27., 27.],\n",
|
||||||
|
" [27., 27.]], grad_fn=<MulBackward0>) tensor(27., grad_fn=<MeanBackward0>)\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"z = y * y * 3\n",
|
||||||
|
"out = z.mean()\n",
|
||||||
|
"\n",
|
||||||
|
"print(z, out)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"``.requires_grad_( ... )`` changes an existing Tensor's ``requires_grad``\n",
|
||||||
|
"flag in-place. The input flag defaults to ``False`` if not given.\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 7,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"False\n",
|
||||||
|
"True\n",
|
||||||
|
"<SumBackward0 object at 0x7f0cc743b438>\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"a = torch.randn(2, 2)\n",
|
||||||
|
"a = ((a * 3) / (a - 1))\n",
|
||||||
|
"print(a.requires_grad)\n",
|
||||||
|
"a.requires_grad_(True)\n",
|
||||||
|
"print(a.requires_grad)\n",
|
||||||
|
"b = (a * a).sum()\n",
|
||||||
|
"print(b.grad_fn)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Gradients\n",
|
||||||
|
"---------\n",
|
||||||
|
"Let's backprop now.\n",
|
||||||
|
"Because ``out`` contains a single scalar, ``out.backward()`` is\n",
|
||||||
|
"equivalent to ``out.backward(torch.tensor(1.))``.\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 8,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"out.backward()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Print gradients d(out)/dx\n",
|
||||||
|
"\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 9,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([[4.5000, 4.5000],\n",
|
||||||
|
" [4.5000, 4.5000]])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"print(x.grad)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"You should have got a matrix of ``4.5``. Let’s call the ``out``\n",
|
||||||
|
"*Tensor* “$o$”.\n",
|
||||||
|
"We have that $o = \\frac{1}{4}\\sum_i z_i$,\n",
|
||||||
|
"$z_i = 3(x_i+2)^2$ and $z_i\\bigr\\rvert_{x_i=1} = 27$.\n",
|
||||||
|
"Therefore,\n",
|
||||||
|
"$\\frac{\\partial o}{\\partial x_i} = \\frac{3}{2}(x_i+2)$, hence\n",
|
||||||
|
"$\\frac{\\partial o}{\\partial x_i}\\bigr\\rvert_{x_i=1} = \\frac{9}{2} = 4.5$.\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Mathematically, if you have a vector-valued function $\\vec{y}=f(\\vec{x})$,\n",
|
||||||
|
"then the gradient of $\\vec{y}$ with respect to $\\vec{x}$\n",
|
||||||
|
"is a Jacobian matrix:\n",
|
||||||
|
"\n",
|
||||||
|
"\\begin{align}J=\\left(\\begin{array}{ccc}\n",
|
||||||
|
" \\frac{\\partial y_{1}}{\\partial x_{1}} & \\cdots & \\frac{\\partial y_{1}}{\\partial x_{n}}\\\\\n",
|
||||||
|
" \\vdots & \\ddots & \\vdots\\\\\n",
|
||||||
|
" \\frac{\\partial y_{m}}{\\partial x_{1}} & \\cdots & \\frac{\\partial y_{m}}{\\partial x_{n}}\n",
|
||||||
|
" \\end{array}\\right)\\end{align}\n",
|
||||||
|
"\n",
|
||||||
|
"Generally speaking, ``torch.autograd`` is an engine for computing\n",
|
||||||
|
"vector-Jacobian products. That is, given any vector\n",
|
||||||
|
"$v=\\left(\\begin{array}{cccc} v_{1} & v_{2} & \\cdots & v_{m}\\end{array}\\right)^{T}$,\n",
|
||||||
|
"compute the product $v^{T}\\cdot J$. If $v$ happens to be\n",
|
||||||
|
"the gradient of a scalar function $l=g\\left(\\vec{y}\\right)$,\n",
|
||||||
|
"that is,\n",
|
||||||
|
"$v=\\left(\\begin{array}{ccc}\\frac{\\partial l}{\\partial y_{1}} & \\cdots & \\frac{\\partial l}{\\partial y_{m}}\\end{array}\\right)^{T}$,\n",
|
||||||
|
"then by the chain rule, the vector-Jacobian product would be the\n",
|
||||||
|
"gradient of $l$ with respect to $\\vec{x}$:\n",
|
||||||
|
"\n",
|
||||||
|
"\\begin{align}J^{T}\\cdot v=\\left(\\begin{array}{ccc}\n",
|
||||||
|
" \\frac{\\partial y_{1}}{\\partial x_{1}} & \\cdots & \\frac{\\partial y_{m}}{\\partial x_{1}}\\\\\n",
|
||||||
|
" \\vdots & \\ddots & \\vdots\\\\\n",
|
||||||
|
" \\frac{\\partial y_{1}}{\\partial x_{n}} & \\cdots & \\frac{\\partial y_{m}}{\\partial x_{n}}\n",
|
||||||
|
" \\end{array}\\right)\\left(\\begin{array}{c}\n",
|
||||||
|
" \\frac{\\partial l}{\\partial y_{1}}\\\\\n",
|
||||||
|
" \\vdots\\\\\n",
|
||||||
|
" \\frac{\\partial l}{\\partial y_{m}}\n",
|
||||||
|
" \\end{array}\\right)=\\left(\\begin{array}{c}\n",
|
||||||
|
" \\frac{\\partial l}{\\partial x_{1}}\\\\\n",
|
||||||
|
" \\vdots\\\\\n",
|
||||||
|
" \\frac{\\partial l}{\\partial x_{n}}\n",
|
||||||
|
" \\end{array}\\right)\\end{align}\n",
|
||||||
|
"\n",
|
||||||
|
"(Note that $v^{T}\\cdot J$ gives a row vector which can be\n",
|
||||||
|
"treated as a column vector by taking $J^{T}\\cdot v$.)\n",
|
||||||
|
"\n",
|
||||||
|
"This property of the vector-Jacobian product makes it very\n",
|
||||||
|
"convenient to feed external gradients into a model that has\n",
|
||||||
|
"non-scalar output.\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Now let's take a look at an example of a vector-Jacobian product:\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 10,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([1688.6201, -110.9400, 181.5985], grad_fn=<MulBackward0>)\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"x = torch.randn(3, requires_grad=True)\n",
|
||||||
|
"\n",
|
||||||
|
"y = x * 2\n",
|
||||||
|
"while y.detach().norm() < 1000:\n",
|
||||||
|
" y = y * 2\n",
|
||||||
|
"\n",
|
||||||
|
"print(y)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Now in this case ``y`` is no longer a scalar. ``torch.autograd``\n",
|
||||||
|
"does not compute the full Jacobian directly, but if we just\n",
|
||||||
|
"want the vector-Jacobian product, simply pass the vector to\n",
|
||||||
|
"``backward`` as an argument:\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 11,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"tensor([1.0240e+02, 1.0240e+03, 1.0240e-01])\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"v = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)\n",
|
||||||
|
"y.backward(v)\n",
|
||||||
|
"\n",
|
||||||
|
"print(x.grad)"
|
||||||
|
]
|
||||||
|
},
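{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough check of the gradient printed above (a sketch, not part of the original tutorial):\n",
"assume the loop leaves $\\vec{y}=2^{k}\\vec{x}$ after $k$ doublings in total. The value of $k$\n",
"depends on the random ``x``; $k=10$ (so $2^{k}=1024$) is inferred from the recorded output,\n",
"not stated by the source. The Jacobian is then a scaled identity, and the vector-Jacobian\n",
"product simply rescales $v$:\n",
"\n",
"\\begin{align}J=2^{k}I,\\qquad J^{T}\\cdot v=2^{k}v=1024\\left(\\begin{array}{c}\n",
" 0.1\\\\\n",
" 1.0\\\\\n",
" 0.0001\n",
" \\end{array}\\right)=\\left(\\begin{array}{c}\n",
" 102.4\\\\\n",
" 1024\\\\\n",
" 0.1024\n",
" \\end{array}\\right)\\end{align}\n",
"\n",
"A different random ``x`` may need a different number of doublings, in which case every\n",
"entry of ``x.grad`` scales by the corresponding power of two.\n"
]
},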
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"You can also stop autograd from tracking history on Tensors\n",
|
||||||
|
"with ``.requires_grad=True`` either by wrapping the code block in\n",
|
||||||
|
"``with torch.no_grad():``\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 12,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"True\n",
|
||||||
|
"True\n",
|
||||||
|
"False\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"print(x.requires_grad)\n",
|
||||||
|
"print((x ** 2).requires_grad)\n",
|
||||||
|
"\n",
|
||||||
|
"with torch.no_grad():\n",
|
||||||
|
"    print((x ** 2).requires_grad)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Or by using ``.detach()`` to get a new Tensor with the same\n",
|
||||||
|
"content but that does not require gradients:\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 13,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"True\n",
|
||||||
|
"False\n",
|
||||||
|
"tensor(True)\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"print(x.requires_grad)\n",
|
||||||
|
"y = x.detach()\n",
|
||||||
|
"print(y.requires_grad)\n",
|
||||||
|
"print(x.eq(y).all())"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"**Read Later:**\n",
|
||||||
|
"\n",
|
||||||
|
"Documentation for ``autograd.Function`` is at\n",
|
||||||
|
"https://pytorch.org/docs/stable/autograd.html#function\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.6.9"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 1
|
||||||
|
}
|
785
1-intro/3_cnn.ipynb
Normal file
File diff suppressed because one or more lines are too long
712
1-intro/4_cifar10_tutorial.ipynb
Normal file
File diff suppressed because one or more lines are too long