From 717796e8df4e21927788c6fcb042ec4f8a9141ba Mon Sep 17 00:00:00 2001
From: Piotr Nawrot
Date: Fri, 17 Mar 2023 11:19:53 +0100
Subject: [PATCH] Update README.md
---
README.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/README.md b/README.md
index e25578c..6756537 100644
--- a/README.md
+++ b/README.md
@@ -108,7 +108,7 @@ python -m nanoT5.main \
optim.lr_scheduler={legacy,cosine}
```
-We recommend adding `model.compile=true` flag for pre-training, if you are able to install PyTorch 2.0. In our case it effects in 1.33x speedup.
+We recommend adding the `model.compile=true` flag for pre-training if you are able to install PyTorch 2.0. In our case it results in a ~1.33x speedup.
Suppose you don't have access to an 80GB GPU. In that case, you can increase the number of gradient accumulation steps with `optim.grad_acc=steps`, where `batch_size` has to be divisible by `steps`.
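For illustration only, a sketch of how the flags discussed above might be combined in a single invocation; the value `4` for `optim.grad_acc` is an assumed example, not taken from this patch, and must divide your batch size:
```
# Hypothetical example: enable torch.compile (requires PyTorch 2.0) and
# accumulate gradients over 4 steps (assumed value; must divide batch_size).
python -m nanoT5.main \
    model.compile=true \
    optim.grad_acc=4
```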