Update README.md
parent 7dbfea19d2
commit 717796e8df
@@ -108,7 +108,7 @@ python -m nanoT5.main \
 optim.lr_scheduler={legacy,cosine}
 ```
 
-We recommend adding `model.compile=true` flag for pre-training, if you are able to install PyTorch 2.0. In our case it effects in 1.33x speedup.
+We recommend adding the `model.compile=true` flag for pre-training if you are able to install PyTorch 2.0. In our case it results in a ~1.33x speedup.
 
 Suppose you don't have access to an 80GB GPU. In that case, you can increase the number of gradient accumulation steps with `optim.grad_acc=steps`, where `batch_size` has to be divisible by `steps`.
 
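For reference, here is a minimal sketch of how the two overrides discussed above combine into a single invocation of the command shown in the hunk header. The `optim.batch_size=128` key and value are illustrative assumptions for this sketch; only `model.compile=true` and `optim.grad_acc` come from the README text itself.

```
# Sketch with assumed values: compile the model (PyTorch 2.0) and accumulate
# gradients over 4 steps. optim.batch_size=128 is hypothetical; batch_size
# must be divisible by grad_acc (128 / 4 = 32 examples per forward/backward pass).
python -m nanoT5.main \
    model.compile=true \
    optim.batch_size=128 \
    optim.grad_acc=4
```

Gradient accumulation trades speed for memory: each optimizer step sums gradients from `steps` smaller forward/backward passes, so the effective batch size stays at `batch_size` while peak GPU memory drops.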