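import torch

# Random inputs: a 1 x 6400 row vector and a 6400 x 5000 matrix (float32 by default)
x = torch.rand(1, 6400)
y = torch.rand(6400, 5000)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
assert device == 'cuda', "This exercise assumes the notebook is on a GPU machine"
# Move both tensors to the GPU before timing the product
x, y = x.to(device), y.to(device)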
%timeit z=(x@y)
The slowest run took 22.35 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 974 µs per loop
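CUDA kernels are launched asynchronously, so a plain `%timeit` on a GPU tensor expression may partly measure launch overhead rather than the finished computation. That may also contribute to the large spread between the slowest and fastest run reported above. The sketch below is one possible way to time with an explicit synchronization point. The helper `best_time` and its parameters `sync`, `warmup`, and `repeats` are hypothetical names introduced here for illustration; they are not part of PyTorch or IPython.

```python
import time

def best_time(fn, sync=None, warmup=3, repeats=10):
    """Return the best wall-clock time (seconds) of fn() over `repeats` runs.

    `sync` is an optional zero-argument callable invoked after each timed call,
    so that asynchronous work is finished before the clock is stopped.
    """
    # Warm-up runs absorb one-off costs such as lazy initialization
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for pending work before reading the clock
        best = min(best, time.perf_counter() - start)
    return best

# CPU-only demonstration that runs anywhere
elapsed = best_time(lambda: sum(range(1000)))
print(f"best of 10: {elapsed * 1e6:.1f} µs")

# On a GPU machine with x and y on the device, one would pass PyTorch's
# synchronization call, e.g.:
#   best_time(lambda: x @ y, sync=torch.cuda.synchronize)
```

Taking the best of several runs mirrors what `%timeit` reports here; the only difference is the synchronization after each call.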
# Move the tensors back to the CPU and time the same product there
x, y = x.cpu(), y.cpu()
%timeit z=(x@y)
100 loops, best of 3: 9.4 ms per loop
import numpy as np

# Same shapes in NumPy; note that np.random.random returns float64 arrays
x = np.random.random((1, 6400))
y = np.random.random((6400, 5000))
%timeit z = np.matmul(x, y)
10 loops, best of 3: 19.9 ms per loop
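To put the three timings side by side, the short sketch below computes the ratios from the figures recorded in the outputs above. The numbers come from a single run on one machine, and the variable names are illustrative only. Two caveats apply: the GPU timing was taken without an explicit synchronization, and the NumPy arrays are float64 while the PyTorch tensors are float32, so the ratios are rough indications rather than a like-for-like comparison.

```python
# Best-of-3 timings copied from the recorded outputs, converted to seconds
gpu_torch_s = 974e-6   # PyTorch on GPU: 974 µs
cpu_torch_s = 9.4e-3   # PyTorch on CPU: 9.4 ms
cpu_numpy_s = 19.9e-3  # NumPy on CPU:   19.9 ms

# How many times longer each CPU run took compared with the GPU run
torch_cpu_vs_gpu = cpu_torch_s / gpu_torch_s
numpy_vs_gpu = cpu_numpy_s / gpu_torch_s
numpy_vs_torch_cpu = cpu_numpy_s / cpu_torch_s

print(f"PyTorch CPU vs GPU: {torch_cpu_vs_gpu:.1f}x")
print(f"NumPy vs GPU: {numpy_vs_gpu:.1f}x")
print(f"NumPy vs PyTorch CPU: {numpy_vs_torch_cpu:.1f}x")
```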