Showing posts with label design. Show all posts
Showing posts with label design. Show all posts

Sentiment Analysis on Twitter Data Using Neo4j and Google Cloud

Thursday, September 19, 2019

In this blog post, we’re going to walk through designing a graph processing algorithm on top of Neo4j that discovers the influence and sentiment of tweets in your Twitter network.

The source code for this reference application is open source. You can find the GitHub project here.

Graph Data Modeling

The first thing we’ll need to do is to design a data model for analyzing the sentiments and influences of users on Twitter. This example iterates from an earlier graph processing example described in another blog post. I recommend taking a look at that post to better understand the concepts I talk about in this one.

The diagram below is the graph data model that we will use to import, analyze, and query data from Twitter.

Twitter graph data model

In the diagram above, the following relationships are described.

  • Users follow other users

  • Users create tweets

  • Tweets contain phrases

  • Phrases are categorized into topics

Twitter User Ranking

For this first blog post we’re going to focus on generating a rank of influential Twitter users in my social network that tells me which topics a user tweets about.

Twitter influencer ranking with topic

The screenshot above is from the results of a Neo4j cypher query. Here we find a list of Twitter users that were discovered using a crawling algorithm based on PageRank. This output is similar to the dashboard that was created in an earlier blog post, but adds in a top category, top phrase, and a sentiment score.