
    Why Good Machine Data Management Optimizes Analytics Tool Costs

    Written by

    Zeus Kerravala
    Published October 5, 2020


      According to Merriam-Webster, the word “paradox” is defined as a statement that is seemingly contradictory or opposed to common sense. In the enterprise world of today, IT professionals might consider machine data a paradox. On one hand, machine data is pure gold; it holds information that, when correlated and analyzed, can yield valuable insights that help IT organizations optimize applications, find security breaches or proactively prevent problems that have yet to occur. 

      On the other hand, machine data is one of the biggest sources of pain. The volume of data, types of information, formats and sources have become so unwieldy that the data is difficult, if not impossible, to parse. To ensure that everyone is on the same page, I’m defining machine data as metrics, events, logs and traces (MELT). If you’re unsure of the difference or what these are, Sysdig’s Apurva Dave does a great job of explaining in this post. 

      Machine data comes in many shapes and sizes 

      To understand the problem, consider how machine data is currently handled. Some log data is pulled off servers and stored in one or more index systems for fast search. Security logs get sent to SIEMs for correlation and threat hunting. Metrics go through a different process and get captured in a time series database for analysis. The massive amount of trace data likely gets dumped into big data lakes theoretically for future processing. I say theoretically, because the information in data lakes is often unusable due to its unstructured nature. The net result is lots of data silos, which leads to incomplete analysis. 

      In data science, there’s an axiom that states “Good data leads to good insights.” The corollary is true as well: Bad data leads to bad insights, and siloed data leads to siloed insights. 

      Also, many of the analytic tools are very expensive and do not work well for unstructured data. I’ve talked to companies that have spent tens of millions on log analytics. These tools can be helpful, but often they aren’t because the volume of data is so large and has so much noise in it that the output isn’t as useful as it could be. The volume of data is certainly on the rise, so this problem isn’t getting addressed any time soon with traditional tools. 

      Analytic and security tools have their own agents that add to the problem 

      Another issue is that each of the tools used to analyze machine data comes with its own agent that often collects the same data from different endpoints, often in a unique format, adding to the data clutter that IT departments need to sort. This also adds a lot of management overhead and increases resource utilization but doesn’t really add much value. Hence the paradox: The insights are hidden in the data, but the overhead required to find those “a-ha”s is often more complicated than whatever the original problem was.

      A new approach to managing the data pipeline is required

      What’s required is a new approach to managing machine data so the various tools can be used effectively. A good analogy for what’s needed is the network packet broker. The network industry has a similar problem with tool sprawl, because the number of network management and security tools has exploded over the past decade. There is no cost-effective way to send all network data to all tools, yet, as with machine data, network managers often do just that, which is expensive and limits the effectiveness of the tools. Sound familiar? In networking, along came the network packet broker, which collects, normalizes and correlates data, then directs only relevant information to the specific tools. 

      Key attributes of machine data management 

      There’s no similar product for machine data, but in an ideal world, the data would flow through some kind of pipeline that could address the following:

      • Gather one set of data that acts as a single source of truth.
      • Pre-process information so each analytic tool processes only the data it requires instead of everything. This would include suppression of duplicate information, removal of null events and dynamic sampling of the stream.
      • Normalize the data so it’s consistent and in a format that’s usable by all the tools.
      • Optimize data flows for performance and cost.
      • Direct only the data required to the specific tools. There’s no point in having a tool process data only to drop it.
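To make the list above concrete, here is a minimal Python sketch of such a pipeline. It is an illustration under stated assumptions, not a description of any existing product: the field names (`msg`, `level`, `value`), the normalized schema, the routing rules and the tool names (`siem`, `tsdb`, `log_index`) are all hypothetical. The sampling step uses a fixed rate with a severity override as a simple stand-in for true dynamic sampling.

```python
import hashlib
import json
import random

def first_present(event, *names):
    """Return the first field in `names` that exists and is not None."""
    for name in names:
        if event.get(name) is not None:
            return event[name]
    return None

def normalize(event):
    """Map agent-specific field names onto one shared (hypothetical) schema."""
    return {
        "source": event.get("source", "unknown"),
        "kind": event.get("kind", "log"),  # "metric", "log" or "trace"
        "severity": str(first_present(event, "severity", "level") or "info").lower(),
        "body": first_present(event, "message", "msg", "value"),
    }

def drop_null_events(events):
    """Remove events that carry no payload at all."""
    for e in events:
        body = e["body"]
        if body is None or (isinstance(body, str) and not body.strip()):
            continue
        yield e

def suppress_duplicates(events):
    """Drop exact repeats. A real pipeline would bound this set with a time window."""
    seen = set()
    for e in events:
        key = hashlib.sha256(json.dumps(e, sort_keys=True).encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            yield e

def sample(events, rate, rng):
    """Always keep high-severity events; keep the rest with probability `rate`."""
    for e in events:
        if e["severity"] in ("error", "critical") or rng.random() < rate:
            yield e

# Hypothetical routing rules: each tool declares which events it needs.
ROUTES = {
    "siem": lambda e: e["kind"] == "log" and e["source"] == "firewall",
    "tsdb": lambda e: e["kind"] == "metric",
    "log_index": lambda e: e["kind"] == "log",
}

def run_pipeline(raw_events, sample_rate=1.0, seed=0):
    """Normalize, clean, sample and route events; returns {tool name: [events]}."""
    rng = random.Random(seed)
    stream = (normalize(e) for e in raw_events)
    stream = drop_null_events(stream)
    stream = suppress_duplicates(stream)
    stream = sample(stream, sample_rate, rng)
    outputs = {name: [] for name in ROUTES}
    for event in stream:
        for name, wants in ROUTES.items():
            if wants(event):
                outputs[name].append(event)
    return outputs

if __name__ == "__main__":
    raw = [
        {"source": "firewall", "kind": "log", "msg": "denied tcp 10.0.0.5", "level": "WARN"},
        {"source": "firewall", "kind": "log", "message": "denied tcp 10.0.0.5", "severity": "warn"},
        {"source": "web01", "kind": "log", "message": "   "},
        {"source": "web01", "kind": "metric", "value": 0.93},
    ]
    for tool, events in run_pipeline(raw).items():
        print(tool, len(events))
```

In this sketch the two firewall records arrive from different agents in different formats, collapse into one event after normalization, and reach only the tools whose rule matches. The blank log line is dropped before any tool has to process it.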

      This kind of machine data pipeline will dramatically reduce costs, particularly with consumption-based tools that charge on the volume of data analyzed. For example, companies incur a lot of expense ingesting data into Splunk that they never actually consume in the tool. That might seem crazy, but unfortunately, that’s the norm. One solution could be to build a unique pipeline per tool, but that might cost more than just sending everything to everything.
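The cost argument for consumption-based tools comes down to simple arithmetic, sketched below. The function name and every figure in the usage line are hypothetical placeholders chosen for illustration; they are not taken from any vendor's price list or from the companies mentioned above.

```python
def annual_ingest_cost(daily_gb, price_per_gb, kept_fraction=1.0):
    """Yearly cost of a tool billed per GB ingested.

    kept_fraction is the share of the raw stream that still reaches the
    tool after de-duplication, null removal and sampling (1.0 = everything).
    """
    if not 0.0 <= kept_fraction <= 1.0:
        raise ValueError("kept_fraction must be between 0 and 1")
    return daily_gb * kept_fraction * price_per_gb * 365

if __name__ == "__main__":
    # Hypothetical inputs: 1,000 GB/day at 2.00 per GB, before and after
    # a pipeline that forwards half of the raw stream.
    before = annual_ingest_cost(1000, 2.0)
    after = annual_ingest_cost(1000, 2.0, kept_fraction=0.5)
    print(before, after, before - after)
```

Because the bill scales linearly with ingested volume, any fraction of the stream that the pipeline filters out translates directly into the same fraction of ingest spend saved.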

      Current tools offer only partial solutions

      I don’t want to make it seem like nothing has been done to improve machine data management. There are a few open source projects, such as Apache NiFi and Fluentd, but they address only part of the problem. Also, Splunk has a product called Data Stream Processor that does close to what I outlined, but in typical Splunk style, it only works well with Splunk. The company would be smart to broaden the use of it to other tools. 

      There is an old saying that every business is a technology business, but I think that narrative has gotten a bit old. Instead, every business is a data-driven business, and competitive advantage is driven by finding those key insights in the data. 

      The problem is that the volume of machine data has grown so much that the ecosystem of tools to analyze it for IT organizations can’t keep pace. CIOs and IT leaders should look to invest in data-processing tools to optimize what the organization has already spent on analytic tools. This will help maximize the return on investment on tool spend and delay having to spend even more.

      Zeus Kerravala is an eWEEK regular contributor and the founder and principal analyst with ZK Research. He spent 10 years at Yankee Group and prior to that held a number of corporate IT positions.

      Kerravala is considered one of the top 10 IT analysts in the world by Apollo Research, which evaluated 3,960 technology analysts and their individual press coverage metrics.
