Also, unrelated to the above, I'd like to add a way of getting a corpus from Wikipedia in case you have trouble finding one.
Download WikiExtractor (GitHub: attardi/wikiextractor), a tool for extracting plain text from Wikipedia dumps.
Download a _locale_wiki-latest-pages-articles.xml file from:
https://dumps.wikimedia.org/_locale_wiki/latest/
and run: python3 WikiExtractor.py --infn _locale_wiki-latest-pages-articles.xml
You will get a large .txt file to use as a corpus.
In the above, replace _locale_ with the language code of the Wikipedia you want. For example, for Czech use cs, and so on. (Index of /cswiki/latest/)
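
If you'd rather not click through the index pages, here is a minimal Python sketch that builds the dump URL for a given language code and fetches it. The function names are my own, and the `.xml.bz2` suffix is an assumption on my part (the dumps are normally served compressed), so check the index page for your language and pass a different `suffix` if the file name differs.

```python
import urllib.request

DUMPS_ROOT = "https://dumps.wikimedia.org"


def dump_filename(locale, suffix=".xml.bz2"):
    # Hypothetical helper: file name for a language code, e.g. "cs" for Czech.
    # The ".xml.bz2" default is an assumption; verify it on the index page.
    return f"{locale}wiki-latest-pages-articles{suffix}"


def dump_url(locale, suffix=".xml.bz2"):
    # Same layout as the URL above: /<locale>wiki/latest/<file>
    return f"{DUMPS_ROOT}/{locale}wiki/latest/{dump_filename(locale, suffix)}"


def download_dump(locale, suffix=".xml.bz2"):
    # Saves the dump into the current directory and returns the local file name.
    # These files can be very large, so expect this to take a while.
    filename = dump_filename(locale, suffix)
    urllib.request.urlretrieve(dump_url(locale, suffix), filename)
    return filename


if __name__ == "__main__":
    # Only prints the URL; call download_dump("cs") to actually fetch it.
    print(dump_url("cs"))
```

After downloading, pass the resulting file to WikiExtractor as in the command above.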