IBM lobs 3,500 staffers at Apache Spark

Big Blue researchers pile into cluster parade


IBM has thrown its full weight behind Spark, Apache’s open-source cluster computing framework.

Spark will form the basis of all of Big Blue's analytics and commerce platforms and its Watson Health Cloud. The framework will also be sold as a service on its Bluemix cloud.

IBM will commit more than 3,500 of its researchers and developers to Spark-related projects and has promised a Spark Technology Center in San Francisco, California, where data scientists and developers can work with IBM designers and architects.

The giant also committed to release, under open source terms, its SystemML family of machine-learning libraries.

Spark was invented by researchers at the University of California at Berkeley in 2009, under Matei Zaharia, and donated to Apache in 2013.

Written in Java, Scala and Python, Spark is an in-memory system for processing large data sets. It consists of a scheduling and dispatching core, a SQL-style programming language, a machine-learning framework and a distributed graph-processing framework.

Spark can scale to more than 8,000 production nodes and, while it works with Hadoop and MapReduce, is claimed to also be faster on certain workloads. Up until last year, Spark had just 465 contributors.

The presence of IBM can make or break open-source projects.

IBM adopted the Eclipse framework early on, making it the basis of its Rational programming tools. Serving as the foundation of IBM’s tools helped establish Eclipse as one of the industry’s biggest development environments, behind Microsoft’s Visual Studio, and guaranteed an entire ecosystem of ISVs building Eclipse plug-ins.

It’s been a virtuous circle: IBM is freed from having to maintain the IDE plumbing, ISVs and devs got an open, pluggable tools platform, and IBM benefits from advances and partners.

On the other extreme, you have Harmony – also an Apache project, for an independent alternative to Java from the now non-existent Sun Microsystems.

IBM threw in its lot because it vied with Sun for stewardship over Java.

When Sun ceased to exist, bought by Oracle, IBM withdrew from Harmony in October 2010 to join the OpenJDK project with Apple and Oracle.

Drained of its biggest backer, Harmony shut down 12 months later.

Oracle sought to make amends of a kind with Apache in 2011 by punting its OpenOffice productivity suite over to the open-source project shop.

Announcing its backing for Apache's Spark Monday, IBM painted Spark as a platform for data and analytics, the analogy being Linux – which IBM also contributes to – as a platform for apps.

The parallel, though, would seem closer to Eclipse. ®

Biting the hand that feeds IT © 1998–2022