Sentiment Analysis on Tweets
Source Code
- Cleaned and preprocessed a dataset of 1.6M tweets, mapping the target variable to sentiment labels (Positive, Negative).
- Applied text preprocessing techniques: lowercasing, URL pattern removal, emoji conversion, tag removal, and stopword elimination using NLTK and regex.
- Designed custom functions for punctuation removal, repetitive character handling, and word lemmatization based on part-of-speech tagging.
- Visualized frequent negative words with WordCloud and analyzed sentiment distribution using Seaborn.
- Conducted topic modeling on negative-sentiment tweets using BERTopic with the Sentence Transformers model 'paraphrase-MiniLM-L6-v2' to extract key topics.
- Identified 5,823 distinct topics and visualized the 7 most frequent topics among negative-sentiment tweets.
- Repeated the analysis and topic modeling on positive-sentiment tweets for comparison.
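
The label mapping and text cleaning steps listed above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it uses only the standard library, so the small stopword set and the emoticon table are hypothetical stand-ins for the NLTK stopword list and the emoji conversion used in the project, and the 0/4 target encoding is an assumption since the dataset's encoding is not stated here. Lemmatization and topic modeling are left out.

```python
import re

# Assumed target encoding (0 = negative, 4 = positive); not stated in the project text.
LABELS = {0: "Negative", 4: "Positive"}

# Tiny illustrative stopword set; a stand-in for NLTK's English stopword list.
STOPWORDS = {"a", "an", "the", "is", "am", "to", "and", "i", "my", "so", "at"}

# Hypothetical emoticon-to-word table standing in for the emoji conversion step.
EMOTICONS = {":)": "smile", ":(": "sad", ":D": "laugh"}

URL_RE = re.compile(r"(?:https?://|www\.)\S+")  # http(s) links and bare www links
MENTION_RE = re.compile(r"@\w+")                # user tags such as @bob
REPEAT_RE = re.compile(r"(.)\1{2,}")            # three or more of the same character
NON_ALPHA_RE = re.compile(r"[^a-z\s]")          # anything except lowercase letters and whitespace


def clean_tweet(text: str) -> str:
    """Return a lowercased, de-noised version of a tweet as a space-joined string."""
    text = URL_RE.sub(" ", text)                 # drop URLs
    text = MENTION_RE.sub(" ", text)             # drop user tags
    for emoticon, word in EMOTICONS.items():     # convert emoticons before lowercasing
        text = text.replace(emoticon, f" {word} ")
    text = text.lower()
    text = REPEAT_RE.sub(r"\1\1", text)          # "sooooo" -> "soo"
    text = NON_ALPHA_RE.sub(" ", text)           # strip punctuation and digits
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)
```

Calling `clean_tweet` on each raw tweet and looking up `LABELS[target]` for each row would give the cleaned text and label columns that the later visualization and topic modeling steps work from. Collapsing repeated characters to two rather than one is a design choice that keeps legitimate double letters such as "good" intact.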