Many defense and intelligence operations now involve big data. It is no longer necessary to speak of big data as a problem in itself, at least to the extent that the building blocks of solutions (distributed computing, cloud infrastructures and storage strategies) are in hand and well known.
But there are always wrinkles in big-data issues that require special attention. One such wrinkle involves performing analytics on data gathered from overhead drones.
"Drones in the sky are like giant vacuum cleaners of data," said Bob Baughman, CEO of HUVR, a Texas-based startup. "Image and video files can be massive. The data must be approached with a strategy to store it properly and categorize and tag it usefully, to develop an analytics engine to create value and to feed information back to users in an easy and simple way." HUVR specializes in performing overhead inspections and analyzing data on behalf of wind energy, oil and gas, pipeline and agriculture companies.
"Drones aren't a garden-variety big data situation. It is a little more complex," said Brian Houston, vice president of engineering at HDS Federal. "You have multiple sensors on drones collecting video, imagery, infrared and sound, flight path data, wind and temperature data. Storage of the data requires a tiered structure that resides online on an optical platform. At the end of the day, you have to make sure the data is in the right place and at the right time when it is needed and that the storage infrastructure is cost-justifiable."
Although HUVR describes itself as an inspection outfit, behind the scenes it is in many ways really a data analytics company. "Customers want to know how many problems they have, what kind they are and how much they are going to cost to fix," said Baughman. "That is why we focus on developing a data structure and a cloud infrastructure."
The data structure must accept the kinds of files gathered in a particular situation and accommodate the data elements that are most useful to users in that scenario. "We typically gather video, still imagery and thermal imagery on behalf of customers," said Baughman. "The data structure needs to place around those images information that is useful to customers now and in the future."
For example, HUVR plans to attach manufacturing information about the specific assets it inspects from the sky. Baughman also sees an opportunity to append weather information to drone-collected data, feeding an analytics process that would draw connections between weather conditions and possible problems with inspected assets.
"The data structure allows us to add information to images and to sort and categorize that information on that basis," said Baughman.
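The idea of wrapping images in sortable metadata can be illustrated with a minimal Python sketch. It is not HUVR's actual schema: the `InspectionImage` class, its field names, the `categorize` helper and the sample file names are all hypothetical, chosen only to show how tags added after capture can drive grouping.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class InspectionImage:
    """One file captured during a drone inspection (hypothetical schema)."""
    file_name: str
    media_type: str   # e.g. "video", "still", "thermal"
    asset_id: str     # hypothetical identifier for the inspected asset
    tags: dict = field(default_factory=dict)  # free-form metadata added later


def categorize(images, tag_name):
    """Group file names by the value of one tag; untagged files get their own group."""
    groups = defaultdict(list)
    for image in images:
        groups[image.tags.get(tag_name, "untagged")].append(image.file_name)
    return {key: sorted(names) for key, names in groups.items()}


# Made-up sample data for illustration only.
images = [
    InspectionImage("blade_07.jpg", "still", "turbine-12", {"finding": "crack"}),
    InspectionImage("blade_03.jpg", "still", "turbine-12", {"finding": "erosion"}),
    InspectionImage("tower_01.tif", "thermal", "turbine-12"),
]

# Metadata can be appended after capture, for example weather at flight time.
images[2].tags["wind_kph"] = 28

by_finding = categorize(images, "finding")
```

Keeping the tags in an open-ended dictionary, rather than fixed columns, is one simple way to leave room for information that becomes useful only later.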
All of this takes place in the cloud. "It needs to be up there for the sake of safety, redundancy and backup capabilities," said Baughman. "We want to provide clients with secure, unencumbered delivery of products almost in real time."
Although HUVR does not work in the defense or intelligence markets, its principles are "100 percent applicable," said Baughman. "Government agencies inspecting things covertly with drones need a complete data strategy. Otherwise they are wasting their time collecting petabytes of information that no one knows about."
Another aspect of the large volumes of data generated by drones involves storage. Data kept for long periods has historically been migrated to tape storage media. Retrieving the correct data then meant physically locating the right tape, a potentially time-consuming process.
Hitachi Data Systems advocates a tiered data storage system, which includes optical media for long-term storage. "That way," said Houston, "the data can be kept online for many years without having to worry about data migration. You can find data from 15 years ago without having to locate a tape and making sure the tape is good."
In a tiered system of data storage, the most recent and usable data is kept on the medium that provides the highest speed, typically flash storage, which is also the most expensive. Data that gets older but may still be accessed in the future is moved down to slower and less expensive media, such as optical storage. All of this is done automatically.
"Data that may not be looked at again for three months shouldn't be in the highest-cost portion of the infrastructure," said Houston.
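A rough Python sketch of such an automatic tiering rule follows. It is an illustration under stated assumptions, not a description of the HDS product: it uses only two tiers, treats time since last access as a proxy for whether data will be needed soon, and borrows a 90-day threshold loosely from Houston's three-month remark. The function names and the sample catalog are hypothetical.

```python
from datetime import date

# Assumed threshold: data idle longer than this leaves the flash tier.
FLASH_MAX_IDLE_DAYS = 90


def choose_tier(last_access, today, flash_max_idle_days=FLASH_MAX_IDLE_DAYS):
    """Pick a storage tier from the number of days since the data was last read."""
    idle_days = (today - last_access).days
    return "flash" if idle_days <= flash_max_idle_days else "optical"


def plan_migrations(catalog, today):
    """Return (name, current_tier, target_tier) for every item on the wrong tier."""
    moves = []
    for name, (current_tier, last_access) in catalog.items():
        target = choose_tier(last_access, today)
        if target != current_tier:
            moves.append((name, current_tier, target))
    return moves


# Made-up catalog: name -> (tier it sits on now, date of last access).
today = date(2016, 6, 1)
catalog = {
    "flight_0412_video": ("flash", date(2016, 5, 20)),    # recently used, stays put
    "flight_0107_thermal": ("flash", date(2015, 1, 1)),   # long idle, should move down
    "flight_0033_stills": ("optical", date(2016, 5, 30)), # read again, should move up
}

moves = plan_migrations(catalog, today)
```

A real policy would likely weigh more than idle time, such as file size or the cost per terabyte of each tier, but the structure of the decision is the same.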
Just as a tiered storage system views data holistically, HDS's analytics capabilities break down data silos to search across data types. Thanks to recent acquisitions, HDS can also run trend analysis to generate forecasts.
"All of the data generated by drone sensors can be searched together," said Houston, "across all data types and independent of data sources."
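What searching "across all data types" might look like can be sketched in a few lines of Python. This is a toy illustration, not the HDS search interface: it assumes every sensor's output is described by a record in one shared index, and the `search` function, the field names and the sample records are hypothetical.

```python
def search(records, **criteria):
    """Return the records whose fields equal every given criterion."""
    return [
        record for record in records
        if all(record.get(name) == value for name, value in criteria.items())
    ]


# Made-up index entries: one record per file, whatever sensor produced it.
records = [
    {"source": "camera", "type": "video", "flight": "F-001", "file": "pass1.mp4"},
    {"source": "infrared", "type": "thermal", "flight": "F-001", "file": "pass1.tif"},
    {"source": "autopilot", "type": "flight_path", "flight": "F-001", "file": "pass1.gpx"},
    {"source": "camera", "type": "video", "flight": "F-002", "file": "pass2.mp4"},
]

# One query returns video, thermal and flight-path data from the same flight.
flight_one = search(records, flight="F-001")
```

Because the query is expressed against shared fields rather than a particular file format, the same call works regardless of which sensor or system produced the data.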