So let's take a look at this raw data. Again, let me zoom in so that we can see a little more clearly. All right, so the first column we get is the date and time of these posts. We also get a URL for the post itself. And if we scroll over, not that quickly, in column C you'll see we have the contents. So this is the text of the post itself, complete with emojis, so we can look at the raw text. We get information about who the author is: the name or the handle attached to the account, if a name has been provided. That's not always going to be the case.
Then there's the Klout score, which is the measure of influence. Klout scores range from 1 to 100. The closer you are to 100, the more influence your account is thought to have. That's based on how much social activity you're driving. And so, in social media, influencers typically have higher Klout scores. One of the ways we might want to look at the data is by focusing on influencers compared to non-influencers.
In our data, we'd said let's just look at the United States. Crimson Hexagon can provide you additional information through their help section as far as how the location data is collected. Only a small fraction of tweets include the latitude and longitude, so the rest of the location data is based on inference. We also have the sentiment, the category that was assigned by Crimson Hexagon, and then the emotion identified in that particular social media post, and the source the post came from. And then the gender, whether that's provided directly by the platform or inferred. That information you will find within the help sections on Crimson Hexagon.
So we do have access to the raw data. Pulling it from the post list is not going to give you a huge chunk of that data. But one of the options we do have: if you click on Manage, you'll see the Bulk Export option. That allows us to export a larger number of social media posts, all right?
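Once posts have been exported, the influencer versus non-influencer comparison could be sketched along these lines in Python. This is only an illustration: the field names, the sample rows, and the cutoff of 60 are all assumptions made up for the example, not Crimson Hexagon's actual export schema or any official definition of an influencer.

```python
# Hypothetical rows standing in for an exported post list. The field names
# and values are invented for illustration only.
posts = [
    {"author": "@news_outlet", "klout": 88, "sentiment": "Neutral"},
    {"author": "@coffee_fan", "klout": 35, "sentiment": "Positive"},
    {"author": "@local_team", "klout": 72, "sentiment": "Positive"},
    {"author": "@no_score", "klout": None, "sentiment": "Negative"},
]

INFLUENCER_CUTOFF = 60  # assumed threshold; choose one that suits your analysis


def split_by_influence(rows, cutoff=INFLUENCER_CUTOFF):
    """Return (influencers, non_influencers), skipping posts with no score."""
    scored = [r for r in rows if r["klout"] is not None]
    influencers = [r for r in scored if r["klout"] >= cutoff]
    non_influencers = [r for r in scored if r["klout"] < cutoff]
    return influencers, non_influencers


influencers, non_influencers = split_by_influence(posts)
print(len(influencers), "influencer posts,", len(non_influencers), "other posts")
```

Posts without a score are dropped rather than counted as non-influencers, so that missing data does not get mixed in with genuinely low scores.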
And if we click on any one of these posts, that's something that we are able to drill down to. It pulls up the post for this particular user, the Klout score, the basic sentiment, and the emotion that's inferred. All right, so lots of opportunities to drill down into the content.
The next section I want to walk through relates to information that we can get about the individuals who are contributing social media content. Right, so if we click on the Author tab, you'll notice, first, we get the most influential Twitter users and, no surprise, the ones considered most influential are typically news outlets. We see some news outlets, we see universities, we see the LA Dodgers baseball team. We also get an indication of total impressions. Now how's that calculated? Well, total impressions are based on the posts coming from a particular author. How many followers does that person have? And then we aggregate up all of the posts that are made and the number of followers that each of those posters had. So that's what's going to give us our impression numbers.
And who are the most prolific? How's that defined? By volume of tweets on a particular topic. All right, so who are the most active users? And if we wanted to gain additional insight here, we might click on one of these users to see what information we can get about them. Here's a list of the posts that they're putting out. This individual looks like they're primarily job related. So we can see veterans, hiring for Starbucks, career arc, join the Starbucks team, hospitality. So who are the users driving that conversation?
There's one other piece that's worth looking at under the Author tab. You see that this is all linked to Twitter activity, because that's where the majority of the data is coming from, and it's public. We get both the number of daily Twitter authors, so how many users are actually participating in the conversation, and the average number of posts from each of those users.
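To make those Author tab numbers concrete, here is a minimal Python sketch of how they might be computed from a list of posts. It assumes impressions are simply the poster's follower count summed over every post, which is how the calculation is described above; the field names and sample data are hypothetical, and the platform's actual calculation may differ in its details.

```python
from collections import Counter

# Hypothetical posts; each carries the poster's follower count at posting time.
posts = [
    {"author": "@a", "followers": 1000},
    {"author": "@a", "followers": 1000},
    {"author": "@b", "followers": 50},
    {"author": "@c", "followers": 200},
]


def total_impressions(rows):
    """Sum the poster's follower count over every post (assumed definition)."""
    return sum(r["followers"] for r in rows)


def author_summary(rows):
    """Return (number of distinct authors, average posts per author)."""
    counts = Counter(r["author"] for r in rows)
    n_authors = len(counts)
    avg_posts = len(rows) / n_authors if n_authors else 0.0
    return n_authors, avg_posts


impressions = total_impressions(posts)
n_authors, avg_posts = author_summary(posts)
print(impressions, n_authors, round(avg_posts, 2))
```

A high average with few authors would point to a small, very active group, while many authors with an average near one would point to broad but light participation.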
So this data is a nice way to get a quick read on whether you have a small number of very active users. We'd see that through a high average number of posts per author and a low number of authors. Or are we primarily getting lots of individuals involved in the conversation, reflected by a high number of authors but a low average number of posts per author? All right? And as with the other pieces on Crimson Hexagon, we have the ability to export that to see how it's tracking over time.
Just taking a quick look at this. At least over the last couple of months, you might say, well, it looks like the number of authors contributing may be stable, and perhaps we're seeing the beginning signs of an uptick. We'll take a look in our next module at statistical ways that we might approach identifying whether there's an increase happening or not. But one thing we do see, especially if we go back to the beginning of 2016, is that the number of posts per author looks like it has tailed off. So that might be something we want to dig into, especially if that's an indication of decreasing engagement.
I think perhaps one of the most powerful tools built into the buzz monitor on Crimson Hexagon is this Affinity tab. And I'll pull up the description of what the Affinity tab is. So this is a comparison of posts by Twitter authors who have participated in this particular monitor. So for anyone who's mentioned Starbucks, what fraction of them have an interest in other topics, compared to the general Twitter population? All right, and so there's the affinity score, which is the comparison of the fraction of people captured by this monitor with a particular interest to the fraction of people in the general Twitter population with that same interest. We'll also see there's a relevance measure that's reported in the table below.
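As a rough sketch of that affinity comparison, assuming it is a simple ratio of the two fractions (the exact formula Crimson Hexagon uses is not spelled out here), it could be written in Python as follows. The function name and the counts are hypothetical.

```python
def affinity(monitor_interested, monitor_total, general_interested, general_total):
    """Ratio of the share of monitor authors with an interest to the share of
    the general population with that interest (assumed definition)."""
    monitor_share = monitor_interested / monitor_total
    general_share = general_interested / general_total
    return monitor_share / general_share


# Made-up counts: 2% of monitor authors share the interest versus 1% of
# the general population, so the interest is about twice as common.
score = affinity(20, 1000, 10, 1000)
print(round(score, 2))
```

Under this reading, a score near 1 means the monitor's authors look like the general population on that topic, a score above 1 means more interest, and a score below 1 means less.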
But if we look at this line going down the middle, this is saying that for topics clustering around this line, the contributors to the Starbucks conversation have the same level of interest as the general Twitter population. If we go to one extreme, this is saying that individuals on Twitter in general are more interested in soccer than the subset of individuals who participate in the Starbucks conversation. So the Starbucks conversation participants have one-thirteenth the level of interest in soccer compared to the general Twitter population. If we go to the other extreme: Chelsea Handler, resume writing, high school, celebrities, blogging, Snapchat. These are the topics where there's higher interest among those individuals participating in the Starbucks conversation. And so if you were using this from a brand management perspective, wanting to think about what other interests our users might have, this is one way of getting at that.
And you can see it in tabular form. The relevancy score is a combination of the affinity and how prevalent the particular monitor is. But we may also just look at this in terms of the affinity, ranked from highest to lowest. What are the highest affinities that we have? Chelsea Handler, home brewing, Snapchat, blogging, celebrity, and a lot focused on jobs. So it might help you get a sense for who your social media user base is, who's contributing to that conversation.
Demographics are going to allow us to drill down a little bit. The information screens will tell you where exactly this data is coming from. So for some of these, the data is going to be inferred. For others, it's provided by the user profile, and the same goes for age. You'll notice that this demographic information is identifiable for only a fraction of the posts. And it's the same thing when it comes to geography. Some of it is going to be based on inference, and some of it is going to be provided.
And so recall, I had restricted the monitor to English language and the United States, because that was really the level that we wanted to drill down to, for ease of exploration. Now zooming in on the United States, so that we can take a look at it. Notice we've got population centers where we can identify where the posts are. But only 5 million, 6 million of these posts have identifiable information. That's allowing for Crimson Hexagon to be doing the inference. If we click on the geotags, this is where the user is providing their location information. Let's see what that number changes to. As I mentioned, a very small fraction of social media activity includes geotagging. So we went from 5 million to just shy of 10,000 posts. And you'll see the distribution across the US. This is looking at it in an unclustered form. You can also look at it in a clustered form.
But this is one of the challenges that we have in trying to get location-specific social media data, if you're requiring the level of detail of geotagging. If you want user-provided latitude and longitude, it's a very small fraction of the total number of posts. All right, and so that's 10,000 geotagged posts. Being able to get it for the cities based on inference, we're just shy of 6 million. Let's just go back to the total volume that fell into this window to see what we're looking at: 13 million. So to get city-level inference, it's less than half. And when we require latitude and longitude being provided by the user, that number plummets, all right? So we can drill down to posts at the geography level and say, what's going on in New York City? What's going on in Los Angeles on social media? But the more resolution you're looking for, the more granular the detail you're looking for, the more posts we're going to lose as we require that.
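To put those coverage numbers side by side, here is a small Python calculation. It uses rounded figures from the walkthrough above (about 13 million posts in total, about 6 million with an inferred city, about 10,000 with user-provided geotags), so the percentages are approximate; the variable and function names are just for illustration.

```python
# Rounded counts taken from the walkthrough; treat them as approximate.
total_posts = 13_000_000
city_inferred = 6_000_000   # location inferred down to the city level
geotagged = 10_000          # user-provided latitude and longitude


def share(part, whole):
    """Fraction of the whole represented by the part."""
    return part / whole


print(f"City-level inference: {share(city_inferred, total_posts):.1%}")  # 46.2%
print(f"User-provided geotag: {share(geotagged, total_posts):.2%}")      # 0.08%
```

So city-level inference covers a bit under half of the posts, while requiring a user-provided geotag leaves well under a tenth of a percent.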