Fear of Staxit: What next for ASF's Cassandra as biggest donor cuts back

Fill the DataStax-sized hole with big ideas


I've been a user of Cassandra for quite a number of years. I've suggested fixes for Apache Cassandra and – I believe – was the first to build a small cluster on Raspberry Pi computers. This year I was lucky enough to be voted an Apache Cassandra MVP.

It's for these reasons that I've been saddened by this year's falling out between the Apache Software Foundation (ASF), which is home to the Cassandra Project, and DataStax, the primary contributor to ASF Cassandra since Cassandra shifted there from Facebook in 2009.

To me, it feels like a vibrant, responsive and welcoming community has been turned on its head, and in this case, by the very people who are about community building.

Apache Cassandra is governed by ASF's bylaws and procedures, rules that promote the foundation's principles of openness, innovation and community.

The terms of the Apache licence mean anyone can fork a project, develop it and use it commercially – as long as they don't breach the ASF licence and copyright. A number of ASF projects are marketed by commercial organisations, offering extended versions of the project, technical help services or add-on products that have been developed by the companies themselves.

Examples include Microsoft's HDInsight, which is described as "a managed Apache Hadoop, Spark, R, HBase and Storm cloud service made easy", with no end of companies offering managed Tomcat or Apache web server applications.

In the case of Cassandra, DataStax had its offering.

As an end user, I was pretty happy with the role played by DataStax – it seemed to me that DataStax was giving back a lot to the Cassandra community and making Cassandra a pretty kick-ass database. The project chair for Apache Cassandra, Jonathan Ellis, is also the chief technology officer and co-founder of DataStax, and he and the committers from DataStax were, in my view, responsible for much of the innovation Cassandra has seen over the past couple of years.

All was going swimmingly with Apache's Cassandra project. I never really used DataStax's Enterprise Edition – it was too big and had too many additional features for my use case – but for commercial end users it certainly is an easy way to get into distributed database and analysis engines.

Issues between DataStax and the ASF started in June this year. It began with an email asking a perfectly reasonable question about the ownership of the Java driver for Apache Cassandra. Other questions followed – the role of DataStax in providing cheap training for Apache Cassandra, the role of JIRA (a software development tool) in Apache Cassandra communications – culminating in a request from the ASF for a special report on the potential control by a single company (DataStax) of Apache Cassandra, Planet Cassandra, marketing material and the composition of the Apache Cassandra Project Management Committee.

At this point, as a developer and user of the software, I was beginning to feel a little nervous. Why had the ASF suddenly taken an interest in the management of the project? I don't know the answer to this question but by 19 August Ellis had announced he was stepping down from his role as PMC chair.

This was followed by two announcements from DataStax about its role and the future of the community portal Planet Cassandra. In the first, Ellis stated that DataStax would in future concentrate its effort on the enterprise edition of Cassandra.

In the second, Patrick McFadin, DataStax chief evangelist for Apache Cassandra, announced that one of the major community voices for Apache Cassandra – Planet Cassandra – was shutting down and that DataStax's Developer Relations team was shifting its attention to the DataStax Academy.

This felt like DataStax had left the building as far as Apache Cassandra is concerned and I'm sure it leaves end users uncertain about the future.

McFadin has been quick to point out that DataStax is not "abandoning" Cassandra. He also said it was the ASF that objected to DataStax's heavy involvement: "ASF was very clear. Single corporate control on a project is not OK. Look at all the new PMC and committers added lately."

Where does this leave us? The community is still very strong on the Apache Cassandra development mailing list, plans are still afoot for the release of version 4 with more committers added, and there has been increasing support from companies such as Apple and Instaclustr (which provides managed Cassandra clusters).

What isn't clear is the exact role of the Cassandra big boy, DataStax – the commercial entity with a vested interest whose dominance of the project alarmed ASF in the first place.

Take for instance the annual Apache Cassandra summit that was hosted by DataStax. Ellis has said DataStax will continue to provide sponsorship and meet-up support, but who is going to do the conference organising? As for support for Cassandra's software development, McFadin tweeted recently that DataStax would be still heavily involved, but there would be a shift "from pushing features quickly to OSS C*" – open-source Cassandra to the rest of us – which could signal a slowdown in new features for Apache Cassandra.

Is this in the end a good thing for Apache Cassandra? The heavy involvement of a single company meant that there were a lot of resources that could be drawn upon, meaning the development of Apache Cassandra over the past couple of years has been at a blistering pace, moving it from a niche NoSQL product to one of the go-to solutions for data at scale.

DataStax brought a lot of expertise. It moved Apache Cassandra from a single-model database to a multi-model one, and brought in its SQL-like language. Would this have happened without DataStax? Possibly, but certainly not at the speed it did with DataStax on board.

The pace of development for Apache Cassandra will continue owing to the fact such a community has sprung up over the years. It's just likely to be slower.

A rough roadmap already exists for version 4, but most of the proposed features look pretty technical or are updates to the protocols used (Thrift, for instance). I'd like, however, to see something bolder – perhaps extending the idea of a multi-model database or increasing support for JSON to take on other document databases.

The biggest challenge, though, is to get more community involvement. Without that, development could go from slow to slower and stop in the long run.

Was ASF right to step in? Certainly ASF is correct to protect its copyright and the principles of the foundation, but I can't help but think that ASF may have let dogma get in the way of a pragmatic approach to company involvement. The approach felt heavy-handed and accusatory rather than looking for the good side of the DataStax situation, and I believe there were many advantages. Rather than allowing for a graceful withdrawal from the project, it feels as though a sharp laying down of ASF law produced a rapid DataStax exit.

Overall, this should serve as a warning to other companies involved in ASF projects: be careful about your level of commitment and separate your commercial and non-commercial efforts. No matter how much good work you've done, you could get pushed out.

As for me? I'll continue to support the development of Apache Cassandra and use the project where I can. I also look forward to seeing where DataStax can take its Enterprise Edition. ®

Biting the hand that feeds IT © 1998–2022