Apache Foundation embraces real time big data cruncher 'Storm'

Does for real time processing what Hadoop did for batch processing


The Apache Foundation has voted to accept the “Storm” real time data processing tool into its incubator program, the first step towards making it an official part of the Foundation's open source offerings.

Storm aims to do for real time data processing what Hadoop did for batch processing: queue jobs and send them off to a cluster of computers, then pull everything back together into usable form. Nathan Marz, poster of the Storm GitHub repository, believes “The lack of a 'Hadoop of real time' has become the biggest hole in the data processing ecosystem.”

Storm tries to fill that hole with software that “... exposes a set of primitives for doing real time computation. Like how MapReduce greatly eases the writing of parallel batch processing, Storm's primitives greatly ease the writing of parallel real time computation.”

Without Storm, Marz writes, one would have to “manually build a network of queues and workers to do real time processing.” Storm automates that stuff, which should mean better scaling: Marz already claims “one of Storm's initial applications processed 1,000,000 messages per second on a 10 node cluster, including hundreds of database calls per second as part of the topology.”
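For a sense of what that hand-built plumbing looks like, here is a minimal Python sketch of a two-stage network of queues and workers that counts words in a stream of sentences. It is an illustration only and does not use Storm or its API: the function names (`split_worker`, `count_worker`, `run_pipeline`), the word-count task, and the single-machine threading model are all assumptions made for the example.

```python
import queue
import threading
from collections import Counter

STOP = object()  # hypothetical sentinel telling a worker to shut down


def split_worker(inbox, outbox):
    """Stage 1: pull sentences off a queue and emit individual words."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)  # pass the shutdown signal downstream
            break
        for word in item.lower().split():
            outbox.put(word)


def count_worker(inbox, counts):
    """Stage 2: pull words off a queue and keep a running tally."""
    while True:
        item = inbox.get()
        if item is STOP:
            break
        counts[item] += 1


def run_pipeline(sentences):
    """Wire the queues and workers together by hand, then feed in messages."""
    sentence_q = queue.Queue()
    word_q = queue.Queue()
    counts = Counter()
    workers = [
        threading.Thread(target=split_worker, args=(sentence_q, word_q)),
        threading.Thread(target=count_worker, args=(word_q, counts)),
    ]
    for worker in workers:
        worker.start()
    for sentence in sentences:
        sentence_q.put(sentence)
    sentence_q.put(STOP)
    for worker in workers:
        worker.join()
    return counts


if __name__ == "__main__":
    print(run_pipeline(["the storm", "the cluster"]))
```

Even this toy version has to handle shutdown signalling and wiring by hand, and it says nothing about spreading work across machines or recovering from a failed worker, which is the sort of chore Marz says Storm takes on.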

All of which should get high performance computing folks excited.

The Apache Foundation's incubation process isn't technical. One goal is to ensure any software offered with its feathered logo conforms to its preferred license, which should not prove problematic as Storm is currently offered under the Eclipse Public License. The Foundation also likes to ensure proper communities nourish software it offers, and again that should not be a struggle given Storm already has enthusiastic users including Yahoo!, Twitter and business-to-business tat bazaar Alibaba.

Once the Foundation adds its imprimatur to the list of testimonials from current Storm users, that community will doubtless grow. And be joined by Big Data marketers who have run out of things to say about Hadoop, although they'll doubtless soon assert that Storm means sizzling business insights are magically available in real time, with just as little justification for that assertion as for the oft-repeated proposition that Hadoop+data=highly profitable insights in your inbox every afternoon. ®

Biting the hand that feeds IT © 1998–2022