Organizing the "Knowns"

  • Written by  Peter Buxbaum, GIF Correspondent

The effort to make masses of data more useful to the intelligence and military communities has led to the development of a number of methodologies and technologies designed to transform raw data into useful information. Because of the enormous volume of data available to intelligence analysts, a degree of automation to connect entities and events and to build relationships among data is essential to better understanding threats, reducing the fog of war and more quickly countering adversaries.

A key concept in that effort has been activity-based intelligence, which uses a multi-INT approach to analyze activity and transactional data to develop intelligence, drive data collection and resolve what have been called the “unknown unknowns.”

Activity-based intelligence can be used to develop patterns of life, understand the intent of bad actors and formulate responses. More recently, however, interest has grown in a set of capabilities that goes hand in hand with activity-based intelligence: object-based production.

As described by a Defense Intelligence Agency document, object-based production seeks to organize intelligence around objects of interest and to improve the organization of information. Where activity-based intelligence reaches into the realm of the unknowns, the goal of object-based production is to organize the ‘knowns’ in order to improve the usefulness of intelligence. By improving how information about the knowns is organized and accessed, it becomes easier to extrapolate to the unknowns.

What both methodologies also have in common is an effort to replace data-centric intelligence with an analysis that is more knowledge-centric. Instead of swimming in a sea of raw and undifferentiated data and requiring analysts to infer the connections, object-based production seeks to aggregate harvested data around objects—people, places, things or even events—in order to gain greater insights about the nature and attributes of those objects and understand relationships among them. Once those relationships are identified, they also can become objects around which old or new data can be gathered, all of which can be manipulated to produce new intelligence.

The concept of object-based production has drawn increasing interest within the intelligence community, including the National Geospatial-Intelligence Agency. For example, NGA officials described their Map of the World initiative, which offers access to a number of different types of information about any particular spot on the globe, as providing a foundation for the IC’s object-based production environment.

Graph Databases

Among the major technology innovations that have facilitated object-based production are new kinds of databases. Graph databases, as opposed to traditional relational databases, allow for the storage and retrieval of large volumes of unstructured data and observations as well as linkages associated with those observations.

Open-source intelligence and unstructured observations are key data elements that aid in object-based production. Graph databases make it easier to quickly perform analytics on the data and their links without the computationally difficult manipulations necessary in relational databases. The way the data is stored, and the technology deployed to store it, allow information to be queried and used interoperably across multiple systems.

“Intelligence systems until this time have focused on the problem in the wrong way,” said Harry Niedzwiadek, CEO of Image Matters. “They were way too data-centric, meaning that they were focused on building systems that attack the big data problem. The big-data problem involves separating the wheat from the chaff and putting the remainder in front of analysts to make inferences. We needed to get beyond that.”

“In intelligence, object-based production started with a program designed to identify vehicles, equipment and ships,” said Tim Estes, founder and CEO of Digital Reasoning. “We were trying to create a ‘baseball card’ for certain kinds of things—a description with all of the attributes of the object in one place. When you gather the properties of an object, you can identify what sort of thing it is. Once you have created those baseball cards, you can index and search them.”

Object-based production also includes the capabilities of disseminating information across domains for information sharing and collaboration purposes. “Object-based production allows analysts to assemble and disseminate information across a community of interest,” said Idriss Mekrez, public sector chief technology officer at MarkLogic. “The idea is to define every physical object once and then attach properties that come from different sources of intelligence, such as geospatial, human and signal intelligence. Then you allow agencies to share that information to make sure that everyone is talking about the same physical object.”
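The "define once, attach from many sources" idea that Estes and Mekrez describe can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, the object identifier and the sample values are hypothetical and do not come from any fielded system.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    source: str      # intelligence discipline, e.g. "GEOINT", "HUMINT", "SIGINT"
    attribute: str   # name of the property being reported
    value: str       # reported value


@dataclass
class IntelObject:
    object_id: str
    kind: str
    observations: list = field(default_factory=list)

    def attach(self, source, attribute, value):
        """Attach one sourced property to this object."""
        self.observations.append(Observation(source, attribute, value))

    def card(self):
        """Build the 'baseball card': attribute -> value -> list of sources."""
        card = {}
        for obs in self.observations:
            card.setdefault(obs.attribute, {}).setdefault(obs.value, []).append(obs.source)
        return card


class ObjectRegistry:
    """Hypothetical shared registry in which each object is defined only once."""

    def __init__(self):
        self._objects = {}

    def define(self, object_id, kind):
        # Return the existing record if the object is already defined,
        # so every contributor attaches properties to the same object.
        if object_id not in self._objects:
            self._objects[object_id] = IntelObject(object_id, kind)
        return self._objects[object_id]


registry = ObjectRegistry()
truck = registry.define("vehicle-001", "vehicle")
truck.attach("GEOINT", "last_seen", "grid 42S")
truck.attach("SIGINT", "emitter", "radio type A")

# A second contributor "defines" the same object and gets the same record back.
same_truck = registry.define("vehicle-001", "vehicle")
same_truck.attach("HUMINT", "last_seen", "grid 42S")

print(truck.card())
```

Because the registry hands back the existing record, the two reports of the same location end up on one card, each still traceable to its source.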

“Before object-based production, intelligence products were derived from different multi-INT sources and were stored and processed in a linear fashion,” said Bob Palmer, senior director for strategy and innovation at SAP National Security Services. “Each analyst would produce a product along the lines of his own mission needs. That makes the ability to share that information and reuse it in an automated fashion very difficult. Object-based production represents a fundamental shift in how components of information products are stored, increasing productivity and efficiency and allowing analysts to execute their missions better.”

Aggregating data around objects allows systems to transform data into knowledge. “Object-based intelligence focuses on creating knowledge about what we know so we can do a better job of discovering the needle in the haystack,” said Niedzwiadek. “But it doesn’t stop there. We are also able to infer from the needles what we know or need to know about the haystacks. That is a key shift with object-based intelligence.”

Activity-based intelligence and object-based production play off one another. “The goal of activity-based intelligence is to discover and track dynamic activities between objects,” said Mekrez. “Activity-based intelligence makes use of object-based data because object-based production provides the data model that allows systems to store objects not typically found in relational databases.”

Object-based production supports activity-based intelligence and is also informed by it. “There is a feedback loop,” said Palmer. “The activity itself then becomes an object that can be stored in a graph database.”

“The event shows up as an object,” added Niedzwiadek. “Activity-based intelligence deals more with the transient nature of things. There is some activity going on which we are monitoring and trying to understand what it means. Objects play into activities, and out of activities we can extract some facts about objects, so they are interconnected.”

“Object-based production and activity-based intelligence are the left and right hands of intelligence,” said Estes. “Objects are always located in time and space. If you are watching an object as it changes in time and space, that’s activity-based intelligence. They work hand in hand.”

Intelligent Searching

Object-based production posits a data model that allows the linkage of properties of objects derived from different intelligence sources and is built around a conceptual, semantic indexing system different from the kind of intelligence search systems that prevail today.

“What we have now in the intelligence community and Department of Defense are systems that are search-driven,” said Estes. “The analyst enters some type of token or search string and gets back documents or content he can consume. This is a cumbersome process, and it is difficult to scale when tracking a person or a vehicle because it depends on searching for some kind of clue and fusing data together manually.”

Beginning in 2012, databases were created in which analysts could identify data with object tags. “This allowed systems to link data in forms that humans, and not just machines, could understand,” said Estes. “Analysts across systems and agencies could agree about the existence of a given object, and all data relevant to that object could be tagged to it. Analysts could search for objects, not just data, and retrieve all of the data relevant to that object.”
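The difference between searching for strings and searching for objects can be shown with a small Python sketch. It is an illustration only, assuming a simple in-memory index; the `TagIndex` class, the document identifiers and the sample text are all hypothetical.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical index that links documents to the objects they are tagged with."""

    def __init__(self):
        self._docs = {}
        self._by_object = defaultdict(set)

    def add_document(self, doc_id, text, object_tags):
        self._docs[doc_id] = text
        for tag in object_tags:
            self._by_object[tag].add(doc_id)

    def keyword_search(self, token):
        """Search-driven retrieval: match a string anywhere in the text."""
        return sorted(d for d, t in self._docs.items() if token.lower() in t.lower())

    def documents_for(self, object_id):
        """Object-driven retrieval: everything tagged to one agreed-upon object."""
        return sorted(self._by_object.get(object_id, set()))


index = TagIndex()
index.add_document("doc1", "White truck seen near the bridge", ["vehicle-001"])
index.add_document("doc2", "Emitter active overnight", ["vehicle-001"])
index.add_document("doc3", "A truck delivered supplies", ["vehicle-002"])

print(index.keyword_search("truck"))        # mixes two different vehicles
print(index.documents_for("vehicle-001"))   # only data tagged to one object
```

In this toy case the keyword search returns reports about two different trucks and misses the emitter report, while the object lookup returns exactly the data tagged to the one vehicle, including a report that never mentions the word "truck."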

Powering object-based production are technologies that allow for the graph representation of data and search engines that enable the discovery of that data. “Systems have to be set up to accommodate the way data needs to be stored and used,” said Palmer. “With a graph representation of data, the objects and their attributes are stored in such a way that the attributes and relationships between objects and attributes are themselves persistent data objects in the technical system.

“This data schema requires a graph engine to execute queries against that type of data. In a conventional relational database, entities and their attributes can be stored and queried, and connections between objects can be inferred. But the relationships themselves are not stored as discrete objects that can be manipulated in the data structure,” Palmer added.

In a graph representation of data, the relationships between objects are first-class citizens in the data structure. They are persistent and can be queried in multiple ways, or altered and have new attributes added to them.
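What it means for a relationship to be a first-class citizen can be sketched in a few lines of Python. This is a toy in-memory structure written for illustration, not the data model of any graph product named in this article; the class names, node identifiers and attributes are assumptions.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Node:
    node_id: str
    attrs: dict = field(default_factory=dict)


@dataclass
class Relationship:
    rel_id: int          # relationships get their own identity
    kind: str
    source: str
    target: str
    attrs: dict = field(default_factory=dict)


class Graph:
    def __init__(self):
        self.nodes = {}
        self.rels = {}
        self._ids = count(1)

    def add_node(self, node_id, **attrs):
        self.nodes[node_id] = Node(node_id, dict(attrs))
        return self.nodes[node_id]

    def relate(self, source, kind, target, **attrs):
        """Store a relationship as its own persistent record."""
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before they are related")
        rel = Relationship(next(self._ids), kind, source, target, dict(attrs))
        self.rels[rel.rel_id] = rel
        return rel

    def find_rels(self, kind=None, **attr_filter):
        """Query relationships directly, by kind and/or by their attributes."""
        return [
            r for r in self.rels.values()
            if (kind is None or r.kind == kind)
            and all(r.attrs.get(k) == v for k, v in attr_filter.items())
        ]

    def neighbors(self, node_id):
        out = set()
        for r in self.rels.values():
            if r.source == node_id:
                out.add(r.target)
            elif r.target == node_id:
                out.add(r.source)
        return sorted(out)


g = Graph()
g.add_node("person-A", kind="person")
g.add_node("vehicle-001", kind="vehicle")
g.add_node("site-7", kind="place")

seen = g.relate("person-A", "SEEN_WITH", "vehicle-001", reported_by="HUMINT")
g.relate("vehicle-001", "LOCATED_AT", "site-7", reported_by="GEOINT")

# The relationship can later be altered and given new attributes.
seen.attrs["confidence"] = "low"

print(g.find_rels(confidence="low"))
print(g.neighbors("vehicle-001"))
```

In a relational design the link between the person and the vehicle would typically be implied by a join. Here it is a record that can be found, updated and annotated in its own right.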

“That type of flexibility better represents the real world, where relationships between people, places and things are constantly changing and need to be understood for different mission needs,” said Palmer. “Object-based production is a way to store intelligence products that empowers semantic interoperability. Multiple systems can understand the context and meaning of that information and act on the data in an automated fashion.”

That way, it is no longer up to the analyst alone to connect the dots. “When you wrap data up like this, it becomes structured knowledge,” said Niedzwiadek. “This represents a paradigm shift from a data-centric model to one which is knowledge-centric. The object-based production data model infuses the data with meaning and relevance.”

The ultimate goal of intelligence is to make sense out of multiple sources of information holistically. “There is intelligence coming in many different formats,” said Mekrez. “When the intelligence community started using NoSQL [non-relational, a category that includes graph databases] databases three years ago, they were able to link together multiple sources of information in multiple formats from binary to textual to highly structured data.”

Complex Data Profiles

MarkLogic is currently storing data for the two biggest object-based production applications in the intelligence community. “The main capabilities that we bring are the ability to store information in any format, to integrate very complex data profiles and to secure that information,” said Mekrez.

For example, objects of interest may be attached to different names, which are secured at different classification levels. “Our security model enables different people occupying different roles and at different security levels to get different views of an object depending on their roles and security levels,” said Mekrez. “This used to require two different databases and two different applications.”
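The general idea of one stored object yielding different views can be sketched as follows. This is emphatically not MarkLogic's security model, whose details the article does not describe. It is a simplified illustration that assumes a single ordered list of classification levels and ignores roles and compartments; all names and values are hypothetical.

```python
from dataclasses import dataclass

# Assumed ordering of levels, lowest to highest, for illustration only.
LEVELS = {"UNCLASSIFIED": 0, "SECRET": 1, "TOP SECRET": 2}


@dataclass(frozen=True)
class LabeledValue:
    attribute: str
    value: str
    level: str   # classification label carried by this single value


def view_for(values, clearance):
    """Return only the values a reader at the given clearance may see."""
    ceiling = LEVELS[clearance]
    return [v for v in values if LEVELS[v.level] <= ceiling]


# One object, several names, each labeled separately.
names = [
    LabeledValue("name", "Alias One", "UNCLASSIFIED"),
    LabeledValue("name", "Alias Two", "SECRET"),
    LabeledValue("name", "Alias Three", "TOP SECRET"),
]

for clearance in ("UNCLASSIFIED", "SECRET", "TOP SECRET"):
    print(clearance, [v.value for v in view_for(names, clearance)])
```

Labeling each value rather than each record is what lets a single store serve readers at several levels, instead of keeping one copy of the object per level.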

MarkLogic is also able to store complex object profiles such as features or locations and link them together using semantic capabilities. “We also have the ability to attach a geospatial location to an object,” said Mekrez. “This is a unique feature we are providing to the intelligence community.”

Digital Reasoning first attacked object-based production from the standpoint of structured data. “We started by using signal intelligence to help identify objects,” said Estes. “We were able to abstract from the SIGINT that a certain kind of signal belongs to a specific kind of object. That was the easy part.”

The more difficult part came when the company sought to integrate human intelligence, often in the form of unstructured text, into the object-based production universe. “Intelligence agencies may receive reports about a certain person of interest, and they may have other sources that link that person to others,” said Estes. “Some of these sources may be open-source intelligence, such as newspaper articles, that mention a link between one person and another.”

In order to put that information into context, an index of data on the object of the inquiry would have to be available across agencies so that the attributes of the object can be aggregated. This gives the intelligence analyst greater confidence that the name being sought is the one being studied.

If the subject of the inquiry, who may have a common name, can be associated with other objects, such as geospatial locations or people, the analyst can conclude with greater certainty that the information he is studying is relevant to his mission.

Another example of how object-based production works is in the production of specialized, mission-oriented digital maps for warfighters. “The purpose of these specialized maps is to answer questions for warfighters on their missions,” said Niedzwiadek. “For example, if they enter an Afghan village they have never been in before, they may want to know the location of specific water sources.

“The knowledge each warfighter needs is very specialized and specific. It could take thousands of conventional maps to supply this information. But object-based production changes the paradigm by allowing maps to be generated on the spot in connection with warfighter needs,” he added.

HANA, SAP’s computational platform, is “uniquely suitable for supporting and executing activity-based intelligence and the semantic organization of that information, which is object-based production,” according to Palmer. SAP has drawn on advances in in-memory computing to build the HANA platform, which combines database, data management, and multi-core processing capabilities. The platform also provides libraries for predictive, planning, text processing, and spatial and business analytics.

“The very rapid in-memory speeds make the platform very suitable for activity-based intelligence and object-based production,” said Palmer. “The in-memory database has a graph engine that also has the ability to do text and geospatial analysis. All of this processing is done within a graph representation of data on a single platform.”

In-Memory Processing

The in-memory processing also facilitates more accurate predictive analyses by allowing algorithms to run against entire data sets, instead of just a small sampling of data.

Intel, which co-developed the platform, wrote HANA-specific instruction sets into its newest generation of chips. HANA also provides a suite of predictive, spatial and text analytics libraries that can run across multiple data sources. “These new tools enable self-service applications that address the needs of subject-matter experts and mission analysts,” said Palmer.

“There is a large effort to use object-based production across different tradecrafts and agencies,” said Mekrez. “Leveraging the object-based production data model is redefining data capture and dissemination. In terms of capabilities, the community is focusing on capturing objects, profiling and reporting in a way that can be used by different user communities and for different topics.”

“We are working on getting good knowledge from unstructured data,” said Estes. “The incorporation of this knowledge in objects means that it is also data that can be counted, measured and predicted from. Now, it is cumbersome to figure out what is happening in the present. The next step will be to test these models to see if we can predict what will happen next.”

Intelligence tradecraft is already shifting to accommodate an object-based production approach, according to Palmer. “The change is coming primarily in how information derived from multiple sources of intelligence is conceptualized and stored. Capabilities such as in-memory processing, graph engines, predictive analyses and natural language processing will be increasingly useful for the community. There has to be a focal point for multiple INTs to be circling around an object of interest. That object at first may be non-specified and not well-defined.

“Object-based production can help us understand not only what we know now, but what we don’t yet know and need to find out,” he said. ♦

Last modified on Monday, 02 February 2015 13:53

Additional Info

  • Issue: 1
  • Volume: 13