Update README.md

Commit 7dbfea19d2 by Piotr Nawrot, 2023-03-16 15:24:25 +01:00 (committed via GitHub); parent 73b2e63aac.


@ -174,6 +174,10 @@ For pre-training we compile our model with PyTorch 2.0 using `model.compile=true
We show that it is possible to successfully pre-train a "Large Language Model" (T5) under a limited budget (1xA100 GPU, ~20 hours) in PyTorch. We make our codebase, configs and training logs publicly available to enhance the accessibility of NLP research. We are keen to hear your suggestions to improve the codebase further.
### Acknowledgements:
Thanks to [Edoardo Maria Ponti](https://ducdauge.github.io) for his feedback!
## References:
- [T5 paper](https://arxiv.org/pdf/1910.10683.pdf)
- [T5 v1.1 paper](https://arxiv.org/pdf/2002.05202.pdf)