To do this, of course, we need a lot of training data. Here, the AI reads 40 gigabytes of internet text, and that is 40 gigabytes of plain text rather than binary data, which is a stupendously large amount of text.