As the explosive popularity of social media creates massive new streams of accessible social, political and economic data, intelligence specialists are turning to analytical software designed to automatically convert the electronic sentiments of millions into the beginnings of actionable intelligence.
While there will always be value in using people to gather and try to make sense out of intelligence, the fast pace and sheer volume of chatter ongoing via social media outlets, such as Facebook, Twitter, LinkedIn and blogs, make it critical for technology to be involved.
“Throwing people at the problem is not enough,” commented Tom Sabo, senior solutions architect at SAS Federal.
“The process needs to be automated,” remarked Daniel Boyle, manager of SAS Federal’s National Security Group. “The volume, variety and veracity of data are just mind boggling. There’s a need to get more adept at analyzing it.”
The appeal of social media analysis is that it offers a window into people’s openly expressed thoughts and feelings about key topics before they become a full-blown social movement or trend. When those ideas emerge in the public arena, as was true during the Arab Spring of 2011, social media can become a key mode of communication during fast-developing events.
While the volume of social media information can be both its opportunity and challenge, analytic technology can make it easier and quicker to gather and process information than ever before.
“There is a huge amount of information in social media, of all kinds: specific events and occurrences, relationships between people and organizations, main topics of interest and concern to populations, and more,” reported Kristen Summers, director, technical, for CACI.
In fact, social media offers a wealth of information for intel analysts. “In some cases, it provides an early warning of events,” Sabo remarked. “By verifying and corroborating it with other sources, it becomes another source for actionable information.”
Making effective use of this information requires analytic technology, because the volume and velocity of the information are simply too great for analysts to read and make sense of it all.
“Analytic technology can sort out the elements that are likely to have value and organize and summarize them, so that analysts can focus their attention on the most meaningful combinations of data and on the most meaningful aspects of that data,” Summers said. “Then analysts can put their main effort where it belongs: applying human judgment and making the interpretations that only an analyst can make.”
The role of human judgment begins even earlier in the process, according to Eileen Ratzer, program manager and lead analyst for BAE Systems’ Advanced Analytics Lab. “From an analysis perspective, manipulating data is secondary to being able to determine what data is relevant to one’s mission area,” she said.
“Social media analysis tradecraft and training have not advanced at the same pace as analytic software,” Ratzer continued. “Many practitioners remain unfamiliar with online subcultures, as evidenced by the FBI’s recently released guidebook on social media vernacular. Just as a police officer knows his beat, the intelligence analyst must understand an ever-expanding social media domain, and the behavioral indicators of the phenomena he or she is studying, such as social unrest, foreign fighter migration and troop movements. A good social media analyst must also be able to deftly maneuver the analytic tools available to them in order to fully exploit their power.
“It may be hard to believe, but many intelligence analysts in classified spaces simply do not have access to the open Web, and those who do have to switch between networks in order to understand what’s happening in the open source domain. Thus, they tend to miss out on a lot of relevant information. Vendors have done a great job of leveraging technology to address intelligence problems, but full acceptance of the social media value proposition within the IC will not be fully realized without a cultural shift proselytized at all levels of management,” she added.
Analytic technology provides unique insights when combined with social media. “Analysts can find trends, patterns and/or anomalies, depending on what they are looking for,” commented Boyle.
“It falls back on tech analysts to identify trends and filter what is actual and what is important,” added Sabo. SAS and a number of other companies offer analytic technology for analyzing social media.
The SAS Social Media Analytics program embodies an exploratory tech analytic approach, Sabo explained. “Analysts may not know what they are looking for, but they will know when they see it,” he said.
Using rules-based analytics (“if-then” statements based on common sense and conventional wisdom), SAS applies subject matter expertise to filter raw social media data down to the portion that is actionable.
The technology allows users to take on different challenges and use different phrases that filter out scenarios. It helps identify important topics and content categories and determine their relevance. The software also pulls together a variety of pertinent online data—from traditional news sites, social media forums or blogs—and allows for deeper, more holistic insights.
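The “if-then” approach described above can be illustrated with a short Python sketch. This is not SAS code; the `Post` and `Rule` types, the `mentions` helper and the two sample rules are hypothetical names invented for illustration, and real rules would come from subject matter experts rather than simple substring checks.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Post:
    """Hypothetical minimal record for one social media post."""
    author: str
    text: str


@dataclass
class Rule:
    """One if-then rule: if `condition` holds for a post, then tag it with `label`."""
    name: str
    condition: Callable[[Post], bool]
    label: str


def mentions(*terms: str) -> Callable[[Post], bool]:
    """Build a condition that is true when every term appears in the post text."""
    lowered = [t.lower() for t in terms]
    return lambda post: all(t in post.text.lower() for t in lowered)


# Illustrative rules only; an analyst would write these from domain knowledge.
RULES = [
    Rule("construction activity", mentions("construction", "site"), "infrastructure"),
    Rule("protest planning", mentions("protest", "tomorrow"), "unrest"),
]


def apply_rules(posts: List[Post], rules: List[Rule]) -> List[Tuple[Post, List[str]]]:
    """Keep only posts that satisfy at least one rule, with the labels that fired."""
    flagged = []
    for post in posts:
        labels = [rule.label for rule in rules if rule.condition(post)]
        if labels:
            flagged.append((post, labels))
    return flagged


posts = [
    Post("a", "A large structure is going up at the construction site near the river"),
    Post("b", "Lunch was great today"),
    Post("c", "Protest tomorrow at the square, near the old construction site"),
]
flagged = apply_rules(posts, RULES)
for post, labels in flagged:
    print(post.author, labels)
```

Posts that match no rule are dropped, which is the filtering step; the labels attached to the rest give the analyst a starting category for each item.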
“We are a big proponent of letting the data show you where to look and determine the network,” explained Boyle. “If you let the data draw the network, then apply your biases and knowledge, you might get a very different picture. You will find the unknown unknowns. It’s not enough to know geography and spots on a map. You are interested in how entities interact with one another.
“For example, let’s say an analyst is interested in a specific area that is hard to access,” he continued. “Social media can give you a leading indicator. There may be people in that region who are talking about, say, a particular construction site. While those people might live in different locations, one might mention that a large structure is being built. That would give an early indication that something is on the drawing board.”
This information can then be augmented with overhead imagery or other information to verify that the structure is, in fact, being built, thereby giving intelligence officials an early indicator that something is happening.
Data is kept long enough to spot trends and change analyses over time. By keeping a repository of conversations in an analysis-ready state, data can be analyzed as frequently and as deeply as needed.
“The idea is to remove the mundane activity and let analysts apply their expertise,” Boyle said. “They are a critical component, but automation has value.”
Another company in this field is CES, which offers PRism, a social media research and analysis tool. “This tool makes assumptions derived from social media faster and easier to process,” commented Blake Hasse, general legal counsel for CES and lead on the design and development of PRism. “Instead of going through files one by one for a single piece of information, a tool like ours will actually pull back all information from multiple locations and allow analysts to go through it all in one place. This makes the process a lot faster.”
Hasse emphasized, however, that analysts still need to recommend what is important and what is a false positive. “It is still complicated,” he added. “But I believe it will get easier and faster as we get better at it.”
CES, which operates primarily as a government health care contractor, developed PRism as a result of its experience with social media. “Two years ago my boss came to me and asked me to build a tool to make social media faster,” Hasse explained. “That’s where PRism came from. It is an all-in-one Web-based media investigation and intelligence platform that brings everything down to one workbench that allows one to go through the data.”
PRism offers two tools, one of which Hasse described as being like “a onetime monitor for emerging events,” while the other “offers long-term investigation functions that allow users to search across media to individual profiles to capture usernames,” he said.
The goal is to qualify the social media footprint of an individual within an organization. “We use the tool to bring all of the data down to one workbench, which archives the data so that all analysts will have it handy,” he said. “Nothing is deleted. We use a tool to preserve the content even if Twitter or such accounts are deleted.”
The tool looks for words that are commonly used as well as connections and monitors those profiles. The data is exported in a structured format so analysts can use it in other tools for more analysis. “It’s very robust and complicated,” Hasse added. “The tool tries to get everything onto one network, which makes it more efficient. Because it is a robust tool, there is a bit of a learning curve, but we offer training services.”
CACI offers a variety of solutions that apply to social media analysis. “We have a streaming architecture, meaning that we can pull in data from a variety of social media sources as it is posted, process it, and produce results as we go, rather than waiting to collect a set of data over time and then process and interpret it,” explained Summers.
CACI partners with other companies for analytics that apply to the data, or produce them directly. “This allows us to find topics of interest, indications of public mood, and the like,” Summers continued. “We can produce reports of the results or show them in an interface, such as placing them on a map according to the geospatial associations that are available.”
CACI executives said their solution is particularly useful for intelligence-related needs. By monitoring the stream for topics or events of interest to analysts, the analytic technology can alert the analyst to increases in the attention paid to these topics, either in general or within a particular geographic area of interest or within a particular organization.
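One simple way to picture this kind of alerting is a monitor that compares each interval's mention count for a topic against a rolling baseline. The Python sketch below is an assumption-laden illustration, not CACI's method: the class name `TopicSpikeMonitor`, the window size, the ratio threshold and the minimum count are all hypothetical choices.

```python
from collections import deque


class TopicSpikeMonitor:
    """Flags an interval whose mention count jumps well above the recent average.

    All thresholds here are illustrative assumptions, not values from any vendor.
    """

    def __init__(self, window: int = 5, ratio: float = 2.0, min_count: int = 3):
        self.history = deque(maxlen=window)  # counts from the most recent intervals
        self.ratio = ratio                   # how far above baseline counts as a spike
        self.min_count = min_count           # ignore spikes built on very few mentions

    def observe(self, count: int) -> bool:
        """Record the mention count for one interval; return True if it is a spike."""
        baseline = sum(self.history) / len(self.history) if self.history else None
        self.history.append(count)
        if baseline is None:
            return False  # nothing to compare against yet
        return count >= self.min_count and count > self.ratio * baseline


# Mentions of one topic per interval (made-up numbers for illustration).
monitor = TopicSpikeMonitor(window=3, ratio=2.0, min_count=3)
counts = [2, 3, 2, 9, 3]
alerts = [monitor.observe(c) for c in counts]
print(alerts)
```

To cover a geographic area or organization of interest, one such monitor could be kept per topic and region pair, with posts counted only toward the pairs they belong to.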
“Monitoring the full social media environment for a given area or population can provide general situational awareness and a picture of the social environment,” Summers said.
SnapTrends, meanwhile, offers “social intelligence” that enables analysts to identify location-based posts, and then leverage a number of built-in analytical tools to gain insight and intelligence. “Analysts can determine the location where posts originate, identify a social media user’s connections, determine patterns of life for a user, translate from more than 80 languages, and understand the context and timing of word use,” reported Eric Klasson, co-founder and chief executive officer of SnapTrends.
The analytics provided in SnapTrends enable analysts to reduce the time required to identify and investigate threats and incidents. “SnapTrends helps reduce the amount of incidents that occur and improve the response when an event does occur,” he said.
According to Klasson, there are three main categories of social software vendors. The first category provides social media monitoring or listening tools that simply mine social networks for posts that contain certain hashtags or keywords, while leaving the user to sift through the posts to gain an understanding of context and meaning. The second category of vendors provides social media marketing tools that are primarily designed to push content out to social networks for the purposes of reaching new or existing customers.
The third category is called social intelligence, which Klasson said is SnapTrends’ focus.
“SnapTrends provides analytics and intelligence that is specially designed for public safety, law enforcement and corporate security and risk management,” Klasson explained. “SnapTrends is used to identify, respond to and investigate (and in some cases prevent) public safety and law enforcement related activities.” Some of the key use cases include gangs, terrorist activities, intelligence gathering, criminal investigations, narcotics, active shooter situations, natural disasters (such as floods, hurricanes and fires) and military intelligence.
Another major player in social media analysis is the BAE Systems Advanced Analytics Lab, which brings together analytic resources and convergence of expertise from the company’s intelligence, IT and GEOINT business areas to shape and respond to advanced analytics requirements. During the 2014 Winter Olympics, for example, the lab studied social media data to convey trends in the public dialogue around security, infrastructure, transportation, cyber-events and environmental concerns.
In studying COTS tools available in this field, the BAE lab has learned two critical lessons, Ratzer explained. There is no single tool that can do everything, she said, and no one understands the mission like trained analysts do. “It’s easy to be impressed by tools that over-engineer a solution by claiming to pull in everything social media has to offer, but an analyst’s ability to identify the most relevant platforms or accounts to his or her area of responsibility, and appropriately filter that content, is often underappreciated.
“If we can’t find the right tool to suit our needs, we have the luxury of talent to build it ourselves. It doesn’t have to be pretty, but it has to accelerate our ability to get the job done or simplify the analysts’ workflow in some way,” Ratzer said.
An example of one of these in-house innovations is the Automated Data Review (ADR) tool, which automates the collection of analyst-vetted sources on the open Web. “It uses natural language processing to apply properties to that unstructured data that are meaningful to clients’ issue set—a process that had previously been performed through manual data entry by the same analysts hired to interpret that content,” she explained. “We are training technology to make decisions as an analyst would so our humans can review the dataset rather than populate it. Instead of trying to find the needle in a haystack, we are trying to create ‘needle-rich hay.’ While we’ve just begun to socialize the tool, the initial response has been that the social media analysis challenges we are trying to tackle are fairly universal to the intelligence community.”
BAE Systems Geospatial eXploitation Products (GXP) have long been a staple in the intelligence analyst’s toolkit, Ratzer added. “Our GXP Xplorer software allows users to perform federated searches across their data environment and visualize the results according to geo-location. Soon, our customers will have the option to gain access to unique, quality-controlled social media data layers made available by our Advanced Analytics Lab. As more software providers move their products to the cloud, there is a growing appetite from the user community to order from a menu of data feeds, and we are excited to be a provider of both those data feeds and the platform to bring it all together.”
As analytic technology in general continues to improve, Sabo predicted, so will the kind of analytics used in social media. “You get a more holistic picture where social media is just one of the many signals,” he said.
Yet there is volatility in data. “For example, how the Arab Spring occurred in one country may not be the same in another,” Sabo explained. Social media continues to evolve as well. “There will be challenges in keeping up with the different forms of social media and what people choose to share and how they choose to organize as events unfold,” he said.
Meanwhile, the models are continuously emerging and getting better. One key issue is the level at which one looks at the narrative. At a macro level, the sheer volume of chatter, or a significant drop in that volume, may indicate what is going on.
“It gets more delicate and intricate at the micro level,” said Boyle. “The process is an ongoing evolution.”
The good news is that analytic hardware and memory software have become better and less expensive. “The ability to make sense of relevant data is where the real argument is,” he added.
This is where the filtering selection comes in and value is added. For Summers, the pitfalls and advantages of open-source data are really two sides of the same coin. “There is a huge amount of data, and it is very easy to produce and to obtain,” she said. “This means that the reliability of the data varies widely and is often unpredictable, and it also means that just identifying the parts that are relevant to an intelligence need can be a daunting task.”
It’s important to remember, she stressed, that social media is also not necessarily representative of the population at large, but is biased towards the demographics that use social media heavily. So using it as an indicator of the general population’s concerns and moods can be problematic.
“However, the sheer quantity of open-source data and the ease of access means that it offers a broad view, where small biases and agendas are likely to cancel each other out, with an accurate view emerging of the participating population,” she said. “It’s worth noting that what people are discussing on social media might not be accurate, but the fact that they are discussing it is accurate and that in itself can provide insight.”
Hasse agreed, pointing out that anyone can write things and proclaim that they are fact. “It’s about spotting the marker that confirms what they are saying is true,” he said.
“There is a wealth of open-source information out there,” he added. “When looking at an individual, you can look at their friends, relatives and the people who are looking at them. By working with other forms of intelligence, you can add context and learn what’s happening behind the scenes.”
Klasson sees open-source data providing a wealth of information and insight for state, local and federal law enforcement and public safety agencies as well as corporate security. The greatest challenge, however, is the sheer volume of data.
“SnapTrends helps address this by finding the ‘needle in the haystack’ by zeroing in on the most relevant posts by specific location, keyword and user handle,” he said.
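Narrowing a large set of posts by location, keyword and user handle can be sketched in a few lines of Python. This is a generic illustration rather than SnapTrends' implementation; the `GeoPost` record, the `filter_posts` function and the bounding-box representation are hypothetical, and the box test deliberately ignores areas that cross the 180-degree meridian.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GeoPost:
    """Hypothetical post record carrying a user handle and a coordinate."""
    handle: str
    text: str
    lat: float
    lon: float


# (min_lat, min_lon, max_lat, max_lon); an assumed, simplified area of interest.
BoundingBox = Tuple[float, float, float, float]


def filter_posts(
    posts: List[GeoPost],
    keyword: Optional[str] = None,
    handle: Optional[str] = None,
    box: Optional[BoundingBox] = None,
) -> List[GeoPost]:
    """Return posts matching every criterion that was supplied."""
    matches = []
    for p in posts:
        if keyword is not None and keyword.lower() not in p.text.lower():
            continue
        if handle is not None and p.handle.lower() != handle.lower():
            continue
        if box is not None:
            min_lat, min_lon, max_lat, max_lon = box
            if not (min_lat <= p.lat <= max_lat and min_lon <= p.lon <= max_lon):
                continue
        matches.append(p)
    return matches


sample = [
    GeoPost("@river_watch", "Flood water rising near the bridge", 30.27, -97.74),
    GeoPost("@foodie", "Best tacos in town", 30.26, -97.75),
    GeoPost("@far_away", "Flood warnings issued here too", 45.52, -122.68),
]
area = (30.0, -98.0, 30.5, -97.5)
nearby_flood = filter_posts(sample, keyword="flood", box=area)
print([p.handle for p in nearby_flood])
```

Each criterion is optional, so the same function serves a keyword-only search, a single-user lookup or a combined location and keyword query.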
Interpretation of Meaning
Looking ahead to the future of social media analytics, Summers suggested that a key area of improvement will be deeper analysis of content.
“We’ve really only been scratching the surface,” she said. “I think we’ll see analytics that focus on more than general phrases and moods in the text of social media. We’ll see interpretation of meaning, sarcasm, indications of relationship and closeness, and we’ll also see analytics that apply to other modalities, such as images and video.”
Summers also believes there will be greater interpretation of the connections and metadata, regardless of the content, with a focus on interpreting networks, influence and patterns of use.
“We may also see a greater connection of the virtual space to the physical space, especially as the Internet of things becomes a reality,” she added. “We may see assessments of the strength of informal communities and how they are distributed geographically, or interpretations of the effect of someone’s current location on the issues they consider a priority.”
Overall, she contended, social media analytics will become more fully integrated as a part of the intelligence analyst’s toolkit; open-source intelligence will simply be another data source, and these analytics will be the way to use that data.
Hasse said he foresees social media analytics becoming more technical, with more data being pushed out. “Behind all the data is the metadata on the backend,” he stressed. “With more and more pieces of metadata, there will be a need for a tool or technology to handle the new tags as they emerge and evolve. Learning how to handle all of that data is the most difficult thing that we do.”
CES is currently looking at a proposal for ways to detect sarcasm in social media. “It is also hard to access information without special tools that go beyond security settings,” Hasse added.
The intelligence community is also apparently interested in detecting sarcasm. A recent Secret Service solicitation sought social media analytics capabilities that included the “ability to detect sarcasm and false positives.”
Klasson pointed out that social media analytics is already evolving toward a specialization by industry that will continue into the future. “Many companies are focused on marketing use cases,” he said.
For example, SnapTrends focuses on public safety, law enforcement, homeland security and corporate security. A software vendor’s focus dramatically affects the types of analytics that are built within the system.
“Tools that are designed to sell products and services to people and/or monitor customer satisfaction are drastically different from tools like SnapTrends that are designed to prevent, identify, respond and investigate incidents and the people involved with them,” he said.
Klasson warned, however, that the more terrorists and other adversaries know about what is possible through social media analytics, the less they will use social media. “It’s always a game of cat and mouse,” Boyle remarked.
One indication of the intelligence community’s interest in social media analytics comes from a recent Secret Service solicitation seeking “computer based annual social media analytics.”
The agency’s request for proposals described a social media software analytics tool with the ability to:
- Automate the social media monitoring process
- Synthesize large sets of social media data
- Identify statistical pattern analysis
- Visually present complex data in a clear, concise manner
- Provide user friendly functionality to multiple staff members.
The capabilities/functionalities sought by the Secret Service included:
- Real-time stream analysis
- Customizable keyword search features
- Sentiment analysis
- Trend analysis
- Audience segmentation
- Geographic segmentation
- Qualitative data visualization representations (heat maps, charts, graphs)
- Access to historical Twitter data
- Influencer identification
- Ability to detect sarcasm and false positives
- Ability to search online content in multiple languages.
- Issue: 6
- Volume: 12