Recognizing emotions in text is fundamental to understanding how people talk about something. People can discuss a new event, but positive/negative labels might not be enough: there is a big difference between being angered by something and being scared by it. This difference is why it is vital to consider both sentiment and emotion in text.
There is a lot of research on sentiment analysis and emotion recognition…for English. A quick Google search will point you to several algorithms that can handle sentiment/emotion prediction for you. …
In this blog post, I discuss our latest published paper on topic modeling:
Bianchi, F., Terragni, S., Hovy, D., Nozza, D., & Fersini, E. (2021). Cross-lingual Contextualized Topic Models with Zero-shot Learning. European Chapter of the Association for Computational Linguistics (EACL). https://arxiv.org/pdf/2004.07737/
Suppose we have a small set of documents in Portuguese that is not large enough to reliably run standard topic modeling algorithms. However, we have enough English documents in the same domain. …
I often do not remember the exact methods for running a quick pre-processing pipeline. And most of the time I just do the bare minimum: remove punctuation and remove stopwords.
First, install NLTK, the toolkit we are going to use to handle the preprocessing.
pip install nltk
I will just write this quick function here, so you can copy and paste it wherever you want.
A few examples:
PyTorch is the cool guy/girl in town. In this blog post I want to give you a brief overview of what I think is really interesting about it. PyTorch is easy to use, and you can implement neural networks with it very quickly. See the original blog post here.
The main objective: quickly show PyTorch and the PyTorch Dataset class, and how to use some of their cool features. Note that I am going to ignore overfitting and related problems here. This is something in between a tutorial and a simple blog post.
I hope you have some neural network background…
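To give a flavor of the Dataset API up front: a map-style Dataset only needs `__len__` and `__getitem__`, and a DataLoader then takes care of batching. The toy data below is made up purely for illustration.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    """A minimal map-style Dataset wrapping a tensor of features and targets."""
    def __init__(self, n=8):
        self.x = torch.arange(n, dtype=torch.float32).unsqueeze(1)  # shape (n, 1)
        self.y = self.x.squeeze(1) * 2.0                            # toy targets

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]

dataset = ToyDataset()
loader = DataLoader(dataset, batch_size=4, shuffle=False)
batches = list(loader)  # two batches of 4 examples each
```

A typical training loop would then iterate over `loader`, getting shuffled mini-batches for free.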
Word embeddings are now used everywhere to capture word meaning and to study language. The general theory in which word embeddings are grounded is distributional semantics, which roughly states that “similar words appear in similar contexts”. Given a collection of textual documents as input, word embedding algorithms generate vector representations of the words.
Essentially, word embedding algorithms put words that appear in similar contexts in close positions in a vector space. Each word, for example the word “amazon”, has its own vector, and the similarity between words can be expressed by the distance between their vectors. …
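A tiny numeric sketch of this idea (the three-dimensional vectors below are made up; real embeddings typically have hundreds of dimensions): cosine similarity scores how close two word vectors point, so words from similar contexts score higher.

```python
import numpy as np

# Toy "embeddings", invented for illustration only.
vectors = {
    "amazon": np.array([0.9, 0.1, 0.3]),
    "ebay":   np.array([0.8, 0.2, 0.25]),
    "banana": np.array([0.1, 0.9, 0.7]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_shop = cosine_similarity(vectors["amazon"], vectors["ebay"])
sim_fruit = cosine_similarity(vectors["amazon"], vectors["banana"])
# "amazon" is closer to "ebay" than to "banana": sim_shop > sim_fruit
```

Cosine similarity is the usual choice over raw Euclidean distance because it ignores vector magnitude and compares direction only.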