Datasheet for the Pile
This datasheet describes the Pile, an 825 GiB dataset of human-authored text compiled by EleutherAI for use in large-scale language modeling. The Pile comprises 22 different text sources, ranging from original scrapes done for this project, to text data made available by the data owners, to third-party scrapes available online.
Biderman, Stella (author) / Bicheno, Kieran (author) / Gao, Leo (author)
2022
Accompanies "The Pile: An 800GB Dataset of Diverse Text for Language Modeling" arXiv:2101.00027
Preprint
Electronic Resource
English
VS1000 - Preliminary Datasheet
British Library Conference Proceedings | 2016
British Library Online Contents | 1998
Datasheet: HPM, Ethylene Tube Alloy
British Library Online Contents | 1998
Datasheet: Avesta Sheffield 248 SV
British Library Online Contents | 1999
Datasheet: Sandvik 6R35 Austenitic Steel
British Library Online Contents | 1998