I would like to train GloVe word embeddings on a very large corpus (trillions of words). However, building the co-occurrence matrix with GloVe's cooccur tool is projected to take weeks. Is there any way to parallelize the construction of the co-occurrence matrix, either with GloVe itself or with another existing tool?