Questions tagged [spacy]

For questions related to spaCy. spaCy is an open-source software library for advanced natural language processing, written in the programming languages Python and Cython. The library is published under the MIT license and its main developers are Matthew Honnibal and Ines Montani, the founders of the software company Explosion.

6 questions
4 votes, 1 answer

How to understand 'losses' in spaCy's custom NER training engine?

From the tidbits I understand of neural networks (NN), the loss function is the difference between the predicted output and the expected output of the NN. I am following this tutorial; the losses are included at line #81 in the nlp.update() function. I am…
2 votes, 0 answers

Extracting "hidden" costs from financial statements using NLP

I'm designing an NLP model to extract various kinds of "hidden" expenses from 10-K and 10-Q financial statements. I've come up with about 7 different expense categories (restructuring costs, mergers and acquisitions, etc.) and for each one I have a…
0 votes, 1 answer

Compare strings composed of 2-3 words using NLP/ML (Python)

I have a database of books. Each book has a list of categories that describe the genre/topics of the book (I use Python models). Most of the time, the categories in the list are composed of 1-3 words. Examples of a book category…
0 votes, 3 answers

How much labelling is required for NER with spaCy?

I have transaction data and I would like to extract the merchant from the transaction description. I am new to this, but I just came across Named Entity Recognition and spaCy. I have hundreds of thousands of different merchants. Some questions that I…
0 votes, 1 answer

Should we use a pre-trained model or a blank model for custom entity training of NER in spaCy?

Further to my last question, I am training a custom entity, FOODITEM, to be recognized by spaCy's Named Entity Recognition engine. I am following tutorials online; the following is the advice given in most of them: Load the model or create an…
0 votes, 1 answer

How to make the spaCy lemmatization process faster?

I am applying spaCy lemmatization to my dataset, but 20-30 minutes have already passed and the code is still running. Is there any way to make it faster? Is there any option to run this process on a GPU? My dataset has 20k rows and 3 columns.
Cathrine