{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# For tips on running notebooks in Google Colab, see\n", "# https://pytorch.org/tutorials/beginner/colab\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "# Neural Transfer Using PyTorch\n", "\n", "\n", "**Author**: [Alexis Jacq](https://alexis-jacq.github.io)\n", " \n", "**Edited by**: [Winston Herring](https://github.com/winston6)\n", "\n", "## Introduction\n", "\n", "This tutorial explains how to implement the [Neural-Style algorithm](https://arxiv.org/abs/1508.06576)\n", "developed by Leon A. Gatys, Alexander S. Ecker and Matthias Bethge.\n", "Neural-Style, or Neural-Transfer, allows you to take an image and\n", "reproduce it with a new artistic style. The algorithm takes three images,\n", "an input image, a content-image, and a style-image, and changes the input\n", "to resemble the content of the content-image and the artistic style of the style-image.\n", "\n", " \n", "![content1](/_static/img/neural-style/neuralstyle.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Underlying Principle\n", "\n", "The principle is simple: we define two distances, one for the content\n", "($D_C$) and one for the style ($D_S$). $D_C$ measures how different the content\n", "is between two images while $D_S$ measures how different the style is\n", "between two images. Then, we take a third image, the input, and\n", "transform it to minimize both its content-distance with the\n", "content-image and its style-distance with the style-image. 
Now we can\n", "import the necessary packages and begin the neural transfer.\n", "\n", "## Importing Packages and Selecting a Device\n", "Below is a list of the packages needed to implement the neural transfer.\n", "\n", "- ``torch``, ``torch.nn``, ``numpy`` (indispensable packages for\n", " neural networks with PyTorch)\n", "- ``torch.optim`` (efficient gradient descent)\n", "- ``PIL``, ``PIL.Image``, ``matplotlib.pyplot`` (load and display\n", " images)\n", "- ``torchvision.transforms`` (transform PIL images into tensors)\n", "- ``torchvision.models`` (train or load pretrained models)\n", "- ``copy`` (to deep copy the models; part of the standard library)\n", "\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import torch\n", "import torch.nn as nn\n", "import torch.nn.functional as F\n", "import torch.optim as optim\n", "\n", "from PIL import Image\n", "import matplotlib.pyplot as plt\n", "\n", "import torchvision.transforms as transforms\n", "from torchvision.models import vgg19, VGG19_Weights\n", "\n", "import copy" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "torch.cuda.is_available()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we need to choose which device to run the network on and import the\n", "content and style images. The neural transfer algorithm is slow on large\n", "images, and it runs much faster on a GPU. We can\n", "use ``torch.cuda.is_available()`` to detect if there is a GPU available.\n", "Then, we set the ``torch.device`` for use throughout the tutorial. Also, the ``.to(device)``\n", "method is used to move tensors or modules to a desired device. 
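
As a minimal sketch of the ``.to(device)`` pattern described above, the snippet below moves one tensor and one small layer to the selected device. The names ``x``, ``x_on_device``, ``layer`` and ``y`` are hypothetical and used only for illustration; they are not part of the tutorial's pipeline.

```python
import torch

# Pick the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Hypothetical input: a batch of one 3-channel 8x8 image, created on the CPU.
x = torch.rand(1, 3, 8, 8, device='cpu')

# For tensors, .to(device) returns a tensor on the target device
# (the same tensor is returned if it already lives there).
x_on_device = x.to(device)

# For modules, .to(device) moves the parameters and returns the module.
layer = torch.nn.Conv2d(3, 4, kernel_size=3).to(device)

# Input and weights now share a device, so the forward pass is valid.
y = layer(x_on_device)
print(y.shape, y.device.type)
```

Keeping the device in a single variable means the same code runs unchanged on machines with or without a GPU.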
\n", "\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [], "source": [ "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n", "torch.set_default_device(device)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "device(type='cuda')" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "device" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading the Images\n", "\n", "Now we will import the style and content images. The original PIL images have values between 0 and 255, but when\n", "transformed into torch tensors, their values are converted to be between\n", "0 and 1. The images also need to be resized to have the same dimensions.\n", "An important detail to note is that neural networks from the\n", "torch library are trained with tensor values ranging from 0 to 1. If you\n", "try to feed the networks with 0 to 255 tensor images, then the activated\n", "feature maps will be unable to sense the intended content and style.\n", "However, pretrained networks from the Caffe library are trained with 0\n", "to 255 tensor images. \n", "\n", "\n", "
Here are links to download the images required to run the tutorial:\n", " [picasso.jpg](https://pytorch.org/tutorials/_static/img/neural-style/picasso.jpg) and\n", " [dancing.jpg](https://pytorch.org/tutorials/_static/img/neural-style/dancing.jpg).\n", " Download these two images and add them to a directory\n", " named ``images`` in your current working directory.