Big data projects are, well, big in size and scope, often very ambitious, and all too often, complete failures. In 2016, Gartner estimated that 60 percent of big data projects failed. A year later, Gartner analyst Nick Heudecker said his company was “too conservative” with its 60 percent estimate and put the failure rate at closer to 85 percent. Today, he says nothing has changed.
Gartner isn’t alone in that assessment. Long-time Microsoft executive and (until recently) Snowflake Computing CEO Bob Muglia told the analytics site Datanami, “I can’t find a happy Hadoop customer. It’s sort of as simple as that. … The number of customers who have actually successfully tamed Hadoop is probably fewer than 20 and it might be fewer than ten. That’s just nuts given how long that product, that technology has been in the market, and how much general industry energy has gone into it.” Hadoop, of course, is the engine that launched the big data mania.
Other people familiar with big data also say the problem remains real, severe, and not entirely one of technology. In fact, technology is a minor cause of failure relative to the real culprits. Here are the four key reasons that big data projects fail, and four key ways in which you can succeed.
Big data problem No. 1: Poor integration
Heudecker said there is one major technological problem behind big data failures: integrating siloed data from multiple sources to get the insights companies want. Building connections to siloed, legacy systems is simply not easy. Integration costs are five to ten times the cost of the software, he said. “The biggest problem is simple integration: How do you link multiple data sources together to get some sort of outcome? A lot go the data lake route and think if I link everything to something magic will happen. That’s not the case,” he said.
Siloed data is part of the problem. Clients have told him they pulled data from systems of record into a common environment like a data lake and couldn’t figure out what the values meant. “When you pull data into a data lake, how do you know what that number 3 means?” Heudecker asked.
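To make the “number 3” problem concrete, here is a minimal Python sketch of one possible mitigation: keeping a data dictionary alongside the lake so that coded values can be traced back to a meaning. Everything in it is a hypothetical illustration rather than anything Heudecker or Gartner describes: the source names (`crm`, `billing`), the `status` field, the code tables, and the helper functions `decode` and `annotate` are all assumptions.

```python
# Hypothetical data dictionary: (source system, field) -> code -> meaning.
# The systems, fields, and codes are invented for illustration only.
DATA_DICTIONARY = {
    ("crm", "status"): {1: "prospect", 2: "active", 3: "churned"},
    ("billing", "status"): {1: "paid", 2: "overdue", 3: "in collections"},
}


def decode(source, field, value):
    """Return the documented meaning of a coded value, or None if undocumented."""
    return DATA_DICTIONARY.get((source, field), {}).get(value)


def annotate(record):
    """Copy a record, adding '<field>_label' for each field with a documented code."""
    source = record["source"]
    annotated = dict(record)
    for field, value in record.items():
        label = decode(source, field, value)
        if label is not None:
            annotated[f"{field}_label"] = label
    return annotated


# Two records landed in the same "lake" with the same raw value, 3.
lake = [
    {"source": "crm", "customer_id": 17, "status": 3},
    {"source": "billing", "customer_id": 17, "status": 3},
]
annotated_lake = [annotate(record) for record in lake]

for record in annotated_lake:
    print(record["source"], record["status"], record["status_label"])
```

In this sketch the same raw value resolves to a different meaning depending on which system it came from, which is exactly the information that is lost when records are copied into a common store without their source context.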