Update README.md
parent 7dbfea19d2
commit 717796e8df
@@ -108,7 +108,7 @@ python -m nanoT5.main \
 optim.lr_scheduler={legacy,cosine}
 ```
 
-We recommend adding `model.compile=true` flag for pre-training, if you are able to install PyTorch 2.0. In our case it effects in 1.33x speedup.
+We recommend adding the `model.compile=true` flag for pre-training if you are able to install PyTorch 2.0. In our case, it results in a ~1.33x speedup.
 
 Suppose you don't have access to an 80GB GPU. In that case, you can increase the number of gradient accumulation steps via `optim.grad_acc=steps`, where `batch_size` must be divisible by `steps`.
 
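For reference, a minimal pre-training invocation with compilation enabled might look like the sketch below. It combines only overrides that appear in this hunk (`optim.lr_scheduler` and `model.compile`); the choice of `cosine` is illustrative, not prescribed by the commit:

```
# Requires PyTorch 2.0 for torch.compile; the commit reports a ~1.33x speedup.
python -m nanoT5.main \
    optim.lr_scheduler=cosine \
    model.compile=true
```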
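Likewise, a hedged sketch of the gradient-accumulation arithmetic: the numbers below are hypothetical, and the `optim.batch_size` key name is an assumption (only `optim.grad_acc` is confirmed by the hunk). The constraint is simply that the batch size divides evenly by the number of accumulation steps:

```
# Hypothetical values: a global batch of 128 split over grad_acc=2
# yields micro-batches of 128 / 2 = 64, lowering peak GPU memory.
# `optim.batch_size` is an assumed key name; `optim.grad_acc` is from the README.
python -m nanoT5.main \
    optim.grad_acc=2 \
    optim.batch_size=128
```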