
Where can I find (more) pre-trained language models? I am especially interested in neural network-based models for English and German.

So far I am aware only of the Language Model on One Billion Word Benchmark and TF-LM: TensorFlow-based Language Modeling Toolkit.

I am surprised not to find a greater wealth of models for different frameworks and languages.

nbro
Lutz Büch

2 Answers


There has since been a huge development: Hugging Face published pytorch-transformers, a library for the highly successful Transformer models (BERT and its variants, GPT-2, XLNet, etc.), which includes many pretrained models, mostly English or multilingual (docs here). It also includes one German BERT model. spaCy offers a convenient wrapper (blog post).
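As a rough sketch of what using such a pretrained model can look like: the snippet below assumes the library's current package name, `transformers` (pytorch-transformers was later renamed), and assumes the German BERT checkpoint is available under the identifier `bert-base-german-cased`. The helper names `load_german_bert` and `top_predictions` are made up for illustration and are not part of the library.

```python
def load_german_bert():
    """Build a fill-mask pipeline for a German BERT model.

    Assumption: the checkpoint id is "bert-base-german-cased".
    The import is done lazily so the helper below can be used
    (and tested) without the library or a network connection.
    """
    from transformers import pipeline
    return pipeline("fill-mask", model="bert-base-german-cased")


def top_predictions(fill_mask, sentence, k=3):
    """Return the k most likely fillers for the [MASK] token.

    `fill_mask` is any callable with the fill-mask pipeline's
    interface: it takes a sentence containing one [MASK] token
    and returns a list of dicts with "token_str" and "score".
    """
    results = fill_mask(sentence, top_k=k)
    return [(r["token_str"], r["score"]) for r in results]
```

Calling `top_predictions(load_german_bert(), "Berlin ist die [MASK] von Deutschland.")` would download the weights on first use and return the model's most probable completions with their scores. A masked model like BERT is not a left-to-right language model, so for scoring or generating whole sequences a causal model such as GPT-2 would be the closer fit.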

Update: Salesforce has since published CTRL, an English model that supports "control codes" for influencing the style, genre, and content of the generated text.

For completeness, here is the old, now less relevant version of my answer:


Since I posed the question, I found this pretrained German language model: https://lernapparat.de/german-lm/

It is a 3-layer "averaged stochastic gradient descent weight-dropped" LSTM (AWD-LSTM), built on an implementation by Salesforce.

Lutz Büch

This will depend to some extent on what you want to do with the language models.

Some possible resources are:

TensorFlow offers 3 pre-trained language models in the research package.

Caffe's Model Zoo has a single pre-trained model that does video -> captions.

Other packages like Caffe2 offer pre-trained models, but the documentation does not suggest any of them are suitable for language.

Failing this, a good approach might be to email the authors of a paper that adopts an approach you like. Some (but far from all) researchers will be happy to share their models, which you can then use as a starting point for your own.

John Doucette
  • I mean __language model__ in the [standard sense](https://en.wikipedia.org/wiki/Language_model). I updated my question accordingly. – Lutz Büch Aug 27 '18 at 06:31
  • Hmm. It looks like you found a few. I suspect DuttA is correct: if these work well, they are extremely valuable right now. No one is likely to publish them free of charge. Your best bet is to contact the authors of relevant papers and see if they'll share something with you. – John Doucette Aug 27 '18 at 09:27
  • Yeah, in the meantime, HuggingFace has established https://huggingface.co/models, where many research papers and other projects publish their models. – Lutz Büch Jun 16 '21 at 15:06