In the commercial world, retailers apply predictive analytics to customer behavior in order to maximize sales and profits, crunching data to find the right time to make the right offer to the right customer. Retailers have such a rich trove of historical customer data that, with the use of the latest technology, consumer behavior can be predicted with considerable accuracy.
Applying predictive analytics to military and national intelligence problems, however, is a different matter. Intelligence analysts don’t have the luxury of neatly organized historical data that can be easily extrapolated out into the future, but instead must deal with a series of observations that may or may not be arranged according to time and place.
More importantly, they often are trying to anticipate unknown actions of an unknown adversary. In contrast to the merchant, who often has relatively complete data on a customer and has the specific goal of making a sale, intelligence analysts are often working with incomplete data and without knowing what the next move is supposed to be.
In fact, the application of predictive analytics to intelligence often focuses on what is missing in the data or how certain observations diverge from the expected norm. Some experts insist on calling this process not predictive analytics but anticipatory analytics, in order to distinguish it from similar processes in other realms.
Predictive analytics is an outgrowth of the branch of information technology known as business intelligence (BI). Traditional BI looks backward to provide visibility into historical data that can explain how and why an organization is doing well or poorly. But in the fast-moving and ever-shifting landscape of today, relying on historical data alone is like driving a car while only looking into the rearview mirror. Predictive analytics promises to uncover challenges and opportunities that are coming down the pike, providing the opportunity to proactively deal with them.
In the geospatial field, predictive analytics involves analyzing events to uncover relevant patterns and relationships related to a place. By illuminating the spatial and temporal factors that relate to a certain type of event, it is possible to statistically anticipate where similar events are most likely to occur in the future. This allows analysts, warfighters and law enforcement officials to focus tightly on those areas. Geospatial predictive analytics relies on a host of data sources, including imaging and full-motion video sensors, location-enabled applications, and open source information such as social media data.
Technology advancements that enable the processing of ever-larger data sets also serve predictive analytics. While predictive analytics can narrow down areas of interest dramatically, it also can consider greater geographical areas than in the past, thus enabling consideration by analysts of a broader set of hypotheses before narrowing them down.
New visualization technologies have also been applied to geospatial predictive analytics. Once a set of locations with similar spatial characteristics is correlated with past events, hundreds of geospatial data layers can be reviewed to identify the physical, cultural and social factors that may correlate with the activity being examined. These hot spots, once identified, can be visualized on a map highlighting where similar events are most likely to occur, allowing users of the intelligence to deploy their resources more effectively.
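The workflow described above, in which past events are characterized by their spatial factors and candidate locations are ranked by how closely they match, can be sketched in a few lines of Python. This is a minimal illustration and not any vendor's method: the feature names (distance to a road, distance to a river), the sample values and the similarity measure (standardized Euclidean distance from the average event profile) are all assumptions made for the example.

```python
import math

# Hypothetical past events, each described by two spatial factors (km).
past_events = [
    {"dist_road": 0.2, "dist_river": 1.5},
    {"dist_road": 0.4, "dist_river": 1.0},
    {"dist_road": 0.3, "dist_river": 2.0},
]

# Hypothetical candidate grid cells to rank.
candidates = {
    "cell_A": {"dist_road": 0.3, "dist_river": 1.4},
    "cell_B": {"dist_road": 5.0, "dist_river": 0.1},
    "cell_C": {"dist_road": 0.6, "dist_river": 2.5},
}

FEATURES = ["dist_road", "dist_river"]


def profile(events):
    """Mean and standard deviation of each feature across past events."""
    stats = {}
    for f in FEATURES:
        values = [e[f] for e in events]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        stats[f] = (mean, math.sqrt(var) or 1.0)  # avoid division by zero
    return stats


def dissimilarity(cell, stats):
    """Standardized Euclidean distance from the mean event profile."""
    return math.sqrt(sum(((cell[f] - m) / s) ** 2 for f, (m, s) in stats.items()))


stats = profile(past_events)
# Sort so that the cell most similar to past events comes first.
ranked = sorted(candidates, key=lambda name: dissimilarity(candidates[name], stats))
print(ranked)
```

In practice the two factors here would be replaced by the many geospatial layers the article mentions, and the ranked cells would be rendered as a hot-spot map rather than printed.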
As intelligence analysts and consumers alike are deluged with increasing volumes of data at an accelerating pace, they need to derive actionable intelligence from the data faster, before the information gets stale.
“The government is looking to be more proactive with respect to the data it is collecting,” said Matt Fahle, a senior executive for intelligence services at Accenture. “The longer it takes to act on intelligence, the more impact is lost over time. The pace at which data is being gathered, and the ability to analyze that data so folks can act quickly, is really driving a world of insights.”
“Traditional forecasting models are based on using historical data to identify future occurrences related to specific data sets and predetermined models,” said Justin Christian, director of technology and innovation at Mercury Intelligence Systems. “Today’s predictive analytics include forecasting, but are much broader in nature. They enable the discovery of previously unknown information, often without well-defined models driving those predictions, and they are usually associated with large volumes of data, much of it unstructured. In addition to forecasting based on observed trends, predictive analytics can also provide knowledge of previously unknown trends or patterns.”
“What is happening with predictive analytics represents a huge paradigm shift,” said Scott Broudy, strategic account manager for defense and the intelligence community at MicroStrategy. “Besides the refocus from backward-looking to forward-looking analysis, we are seeing developments in technology that allow more people who are not data scientists or mathematicians to use predictive analytical tools.”
The goal of any military or intelligence function is to be predictive in nature rather than reactive, noted Matt Hughes, president of Mercury Intelligence Systems. “Most current legacy tools are useful in evaluating events that occurred in the past, but that information is rarely useful at the tactical level for current operations,” he said. “Predictive analytics are the key to understanding events in real time so that commanders and decision makers can be proactive in dealing with events that affect them directly.”
Predictive analytics can provide military and intelligence organizations with a number of important capabilities, according to Barry Barlow, chief technology officer at The SI Organization. “One is that it can highlight correlations that people can’t do manually,” he said. “Second, it shortens the time to decision making. Even with huge data sets and complex models, a reasonable set of recommendations can be generated within minutes. Third, it allows the power of social knowledge to be applied by people of diverse sets of experiences by leveraging the wisdom of the crowd. Fourth, performance improves over time thanks to algorithms that can learn from the feedback loop.”
Law enforcement personnel may be looking for the next spot where an unknown malefactor may commit his next crime, while military intelligence analysts may be looking to see where a group may place its next IED.
“Insurgents, like criminals, do things multiple times, and not just once,” said Sean Bair, president of Bair Analytics. “They keep on doing it until they get caught. With predictive analytics, it is possible to develop insurgents’ modus operandi and where they might emplace the next IED. We can also figure out who their suppliers are and when they move the supplies, so that troops can be allocated to take out the emplacers as well as the suppliers.”
“We worked with the Department of Defense on IED defeat,” said Jim Stokes, vice president of Insight Commercial Solutions at DigitalGlobe. “Once we developed a signature, we would predict other areas they were likely to hit.”
But intelligence analysts are often not dealing with clean data. The data may be sparse and incomplete or it may be bad data interspersed with good data.
“In the national intelligence domain, we often talk about anticipatory analytics,” said Jordan Becker, vice president and general manager for geospatial intelligence and ISR at BAE Systems. “In the case of predictive analytics, we want to know the outcome of a known event such as an election or the reaction of the stock market to a company announcement. In the intelligence world, we don’t necessarily know the future activity or the date of observations that allow us to create hypotheses of what the future event might be. You need to open the aperture to include all observations to form a hypothesis of what the events are and what the observations are pointing to.”
Linking individuals and activities geospatially is an important part of the predictive process, according to Barlow. “Linking diverse intelligence reports describing the movement of objects geospatially is important in making good predictions about future activities and their locations,” he said. “In the case of a chemical or biological threat to a city, for example, we can model traffic patterns and the movements of individuals. The difficult part is that this is often based on imagery rather than on structured data.”
In the case of military missions, geospatial predictive analytics can answer questions on threats and counterthreats or the correct approach to raiding a building. “If a facility that looks like a residence appears to be burning its own garbage, that is abnormal,” said Becker. “This can be detected geospatially. The question then becomes how to use that data to form a hypothesis about what is going on in that residence.”
DigitalGlobe has undertaken a human geography project, which it intends eventually to cover the entire globe. The point is to amass and correlate sufficient geospatial and other types of data to be able to predict human trends on a regional basis.
“For example, our algorithms will look at certain activities and ascertain their distances to roads, rivers, vegetated areas and other geospatially related terrain features that might characterize the environment,” said Ken Campbell, Vice President of National Security Solutions for DigitalGlobe.
“This is also loaded up with demographic data such as languages and tribal affiliations. At the end of the day we have an understanding of the top geospatial factors that influence behavior. That becomes a starting point for a better understanding about what may be driving activity in a given region,” Campbell said, adding that this model has been applied in the Horn of Africa to discover factors that drove local populations to refugee camps.
When the U.S. military and the intelligence community interact with local populations abroad, one key toward understanding the landscape and predicting future developments is to ascertain the sentiments of those local communities. Predictive analytical tools are increasingly using data from social media in order to make that estimation.
“There is a lot of excitement about what can be done with this type of information,” said Campbell. “Combining that with geospatial data can generate a picture of what people are thinking in various parts of the world.”
An analysis of Twitter feeds overlaid on a geospatial background could have predicted the Arab Spring of 2011, Barlow suggested. “That is why social media is being incorporated into the tradecraft,” he said. “It is a very good predictive indicator of certain trends, where they originate, and how they spread. One predictive analysis showed the relationship between social unrest and water shortages in Africa and informed policies on well-development activities and water desalination investments.”
The SAS National Security Group analyzes social media information to help DoD and intelligence agencies plan humanitarian projects.
“We use open source social media and SAS analytics to sift through millions of documents to discover where new refugee camps or hospitals may be located,” said Mark Kriz, a senior account executive. “We have used behavioral analytics on the same type of data to determine the attitudes of populations towards their local politicians. We are able to develop an understanding of the sentiment of native peoples toward a variety of topics. Based on that, we are able to predict how events may unfold in a certain region and how those events may trend.”
The Intelligence Advanced Research Projects Activity (IARPA), a research organization under the Office of the Director of National Intelligence, runs several programs that are investigating new methodologies for predictive analytics. One of those programs, Aggregative Contingent Estimation (ACE), seeks to develop methods for modeling human judgments on geopolitical events.
“As part of that, we run one of the largest experiments in forecasting,” said Jason Matheny, the ACE program manager. “More than 10,000 people have participated in generating over 1 million forecasts on hundreds of political questions over the last two and a half years.”
ACE seeks to improve forecasting by scoring the accuracy of the judgments it collects. “We look at the types of questions people get right and the types they get wrong,” said Matheny. “We study how we can improve the accuracy of forecasts by better understanding patterns of judgment and by using statistical methods to correct predictable errors of judgment.”
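One common way to score probabilistic forecasts is the Brier score, the mean squared difference between the forecast probability and the outcome (1 if the event happened, 0 if it did not). The article does not say which scoring rule ACE uses, so the choice of the Brier score here, along with the forecasters and their numbers, is an assumption made purely for illustration.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better; 0.0 is a perfect score.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical data: how four yes/no questions resolved, and the
# probabilities two forecasters assigned to "yes" on each of them.
outcomes = [1, 0, 1, 0]
forecaster_a = [0.9, 0.2, 0.7, 0.1]
forecaster_b = [0.6, 0.5, 0.5, 0.4]

score_a = brier_score(forecaster_a, outcomes)
score_b = brier_score(forecaster_b, outcomes)
print(f"A: {score_a:.4f}  B: {score_b:.4f}")
```

Forecaster A, who commits to confident and correct probabilities, earns the lower (better) score; forecaster B, who hedges near 50 percent, is penalized even though none of B's forecasts leaned the wrong way.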
Another IARPA program that Matheny runs, Open Source Indicators (OSI), combines human judgments with open source data such as social media, news reports and Wikipedia to try to gain insights on emerging trends such as political instability and disease outbreaks. The program recently has focused on events in Latin America.
“The geospatial component to OSI is that we link open source data to specific cities to forecast political instabilities and disease outbreaks at the city level,” said Matheny. “An important aspect of this process is the geolocation of sources of social media data. Within the last year we have achieved a several-fold improvement in the state of the art by being able to locate the source of social media data within 8 kilometers of its actual location.”
One of the conclusions of IARPA’s research is that predictions are most accurate when multiple methodologies, models, experts and modes of data are employed. “There has been an emphasis in OSI in using multimodal data combinations such as text, metadata and video to generate forecasts for geopolitical and public health events,” said Matheny. “This has achieved unusually good results.”
An important challenge facing predictive analytics practitioners is how to develop a hypothesis even in the absence of complete data. In other words, users must ask an intelligent question in order to get a decent answer out of the predictive process.
“DoD and the intelligence community like dirty data,” said Daniel Boyle, a sales director at SAS. “They are looking for anomalies or outliers in that data. Commercial customers would want us to scrub that data before applying predictive and text analytics to that.”
“Sometimes the missing data points give you the best information,” said Becker. “A change in an observed pattern of behavior may be a clue that something is about to happen. For example, the observation of trucks moving in the desert may represent weapons shipments. When those movements stop is when a group may be preparing to launch an attack.”
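The idea that a break in an established pattern can itself be a signal can be illustrated with a simple baseline comparison. The sketch below flags any day whose observation count falls far below the average of the preceding days. The daily counts, the window length and the threshold are invented for the example, and a real system would need far more careful modeling.

```python
from statistics import mean, stdev


def flag_drops(counts, window=7, threshold=3.0):
    """Return indices of days whose count is more than `threshold` standard
    deviations below the mean of the preceding `window` days."""
    flagged = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma == 0:
            sigma = 1.0  # flat baseline: fall back to a unit scale
        if (mu - counts[i]) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily counts of observed vehicle movements.
daily_counts = [12, 11, 13, 12, 14, 12, 13, 12, 11, 0, 0]
print(flag_drops(daily_counts))
```

Only the first silent day is flagged in this example. Once a zero enters the rolling baseline it inflates the standard deviation, so the second silent day no longer stands out, which is one reason a human analyst still has to interpret what such a detector reports.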
When analysts are deluged with data, inferencing becomes an important analytical method because it is difficult to deduce—that is, to identify by reasoning—the most likely hypotheses from a large list of possibilities. “Too much data becomes noise,” said Becker. “Separating the noise from the signal becomes a big part of the problem. Inferencing allows us to narrow down the range of possible hypotheses based on all the observations. There are also statistical methods that can be applied to highlight or eliminate hypotheses based on the frequency of the observations upon which the hypotheses are based.”
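One standard way to narrow a list of hypotheses as observations accumulate is Bayesian updating: each hypothesis starts with a prior probability, and each observation reweights the hypotheses by how likely that observation would be under each one. The article does not specify which inferencing methods are used in practice, so the following is only a sketch, and the hypotheses, priors and likelihoods are made up.

```python
def update(beliefs, likelihoods):
    """One Bayesian update: multiply each prior by the likelihood of the
    observation under that hypothesis, then renormalize to sum to 1."""
    unnormalized = {h: beliefs[h] * likelihoods[h] for h in beliefs}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation is impossible under every hypothesis")
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical hypotheses, all equally likely before any observation.
beliefs = {"routine_trade": 1 / 3, "weapons_shipment": 1 / 3, "relief_convoy": 1 / 3}

# Hypothetical P(observation | hypothesis) for two successive observations.
observations = [
    {"routine_trade": 0.2, "weapons_shipment": 0.7, "relief_convoy": 0.1},  # moves at night
    {"routine_trade": 0.3, "weapons_shipment": 0.6, "relief_convoy": 0.1},  # avoids main roads
]

for likelihoods in observations:
    beliefs = update(beliefs, likelihoods)

# Print hypotheses from most to least probable.
for hypothesis, prob in sorted(beliefs.items(), key=lambda kv: -kv[1]):
    print(f"{hypothesis}: {prob:.3f}")
```

After two observations most of the probability has shifted to a single hypothesis, while the others remain on the list with small weights rather than being discarded outright.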
Although advances in big data processing, cloud computing, algorithms and machine learning have facilitated advances in geospatial predictive analytics, human judgment still looms large in solving these types of intelligence problems. “To take action you still need eyes on,” said Barlow. “Predictive processes are good at pointing out options. Humans still must make the inferences and decisions that machines really aren’t equipped to make. The consequences of taking action based on robots, drones and computers are fairly high.”
“Human judgment is still huge,” said Bair. “At the end of the day, analysts have to evaluate whether the results of the analytical process really hold water. They are also evaluating how the model is doing at every step. Math doesn’t know what humans know. An algorithm may look at the data and forecast that an attack may occur in the middle of a lake. If something like that happens, an analyst can adjust the data so that the algorithm can look at it in a different way.”
“Human judgment is important in forming hypotheses,” said Becker. “Predictive analytical tools use statistical methods developed a long time ago. Human cognition is necessary to make the associations required in forming hypotheses.”
IARPA’s ACE program treats human judgment as another form of unstructured data to which statistical models can be applied. “We can think of human judgment as a kind of sensor and the correction of human judgment as a sensor fusion problem,” said Matheny. “Humans exhibit inherent biases, but these can be adjusted much like the positions of sensors are adjusted. By statistically combining judgments, we can correct for bias. That is something the ACE program is specifically focused on.”
The biggest difference predictive analytics makes from the human perspective is to transform the job of the analyst. “Typically, analysts spend 80 percent of their time trying to find relevant data, and 20 percent of their time analyzing the data,” said Kriz. “Even then, they are able to find only a small subset of relevant data. Our methodologies allow analysts to acquire much more data than they could manually.”
As a result, the activities in which analysts engage are flipped to 20 percent acquiring data and 80 percent analyzing it, according to Kriz.
“Their jobs are enhanced and not diminished by this process,” he said. “Analysts are constantly tweaking the data and models and are weeding out false positives to refine results over time. The intelligence gained by analysts over the years of doing their jobs is applied to this fine-tuning effort. The value of human judgment grows exponentially when used as part of this analytic process flow.” ♦
- Issue: 1
- Volume: 12