Update README.md
commit 7dbfea19d2 (parent 73b2e63aac)
@@ -174,6 +174,10 @@ For pre-training we compile our model with PyTorch 2.0 using `model.compile=true`
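The context line above references the repo's `model.compile=true` flag. As a minimal sketch of what enabling it amounts to, assuming PyTorch >= 2.0 (the stand-in module and the `compile_enabled` variable are illustrative, not the repo's actual code):

```python
import torch
import torch.nn as nn

# Stand-in for the repo's T5 model; any nn.Module is wrapped the same way.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))

# With `model.compile=true` in the config, the model is passed through
# torch.compile (PyTorch >= 2.0), which JIT-compiles the forward pass.
compile_enabled = True  # stands in for the `model.compile` config flag
if compile_enabled:
    model = torch.compile(model)

# Training proceeds as usual; the first forward call triggers compilation.
x = torch.randn(8, 512)
out = model(x)
```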
We show that it is possible to successfully pre-train a "Large Language Model" (T5) under a limited budget (1xA100 GPU, ~20 hours) in PyTorch. We make our codebase, configs and training logs publicly available to enhance the accessibility of NLP research. We are keen to hear your suggestions to improve the codebase further.
### Acknowledgements:
Thanks to [Edoardo Maria Ponti](https://ducdauge.github.io) for his feedback!
## References:
- [T5 paper](https://arxiv.org/pdf/1910.10683.pdf)
- [T5 v1.1 paper](https://arxiv.org/pdf/2002.05202.pdf)