AWS Fork Elasticsearch: Introducing OpenSearch


Table of Contents:

  1. AWS Step Up For a “Truly Open Source Elasticsearch”
  2. A Software Fork – The ‘Nuclear Option’
  3. AWS’ Intentions Behind OpenSearch
  4. Open Source in the Cloud Era

 

AWS Step Up For a “Truly Open Source Elasticsearch”

On the 21st of January, AWS announced an open source fork of Elastic’s Elasticsearch and Kibana products, which it has since named OpenSearch. AWS said it was stepping up for a ‘truly open source Elasticsearch’ by committing to support and maintain the Apache 2.0-licensed fork, with security and bug fixes, throughout its lifecycle.

This announcement came as a result of Elastic’s decision to change the license for its Elasticsearch and Kibana products from the Apache 2.0 license to the Server Side Public License (SSPL). The change marked Elastic’s move from open source to a source available license (a type of proprietary license).

The project includes OpenSearch (derived from Elasticsearch 7.10.2) and OpenSearch Dashboards (derived from Kibana 7.10.2). The code for OpenSearch and OpenSearch Dashboards is now available on GitHub.

The goal of this project is to make it easy for as many people and organisations as possible to use OpenSearch in their business, their products, and their projects. This is possible because of the Apache 2.0 license (‘ALv2’).

The ALv2 is an open source license that allows users to use the software as they wish, as long as they fulfil the license obligations. These include providing a copy of the license and retaining all notices (copyright, patent, trademark and attribution notices, and NOTICE files).
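As a rough illustration of how these obligations might be checked before shipping a product, the Python sketch below looks for a license file and a NOTICE file in a distribution folder. The function name and the file names are assumptions made for this example. The ALv2 only requires a NOTICE file to be carried forward where the original work includes one, and a check like this is not a substitute for a proper license review.

```python
from pathlib import Path

# File names commonly used for the license text and attribution notices.
# These names are a convention assumed for this sketch, not a legal requirement.
EXPECTED_FILES = ("LICENSE", "NOTICE")


def missing_notice_files(dist_dir):
    """Return the expected file names that are absent from dist_dir."""
    root = Path(dist_dir)
    # Compare on the part before any extension, so LICENSE.txt counts as LICENSE.
    present = {p.name.split(".")[0].upper() for p in root.iterdir() if p.is_file()}
    return [name for name in EXPECTED_FILES if name not in present]
```

Calling `missing_notice_files("dist")` on a folder that contains only `LICENSE.txt` would return `["NOTICE"]`, flagging that the attribution notices still need to be added.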

 

Figure 1: Amazon Photos complying with the obligations of the Apache 2.0 license, Source – Amazon Photos

It doesn’t matter what type of service provider you are: the ALv2 grants you the right to use, modify, extend, monetise, resell, and offer OpenSearch as part of your products and services, as long as you comply with the license terms. AWS is also allowing users to use the OpenSearch trademark to promote their offerings.

AWS’ OpenSearch project has the support of organisations such as Red Hat, SAP, Capital One and Logz.io. Nureen D’Souza, Senior Manager for Capital One’s Open Source Program Office, said:

“When our teams chose to use Elasticsearch, the freedoms provided by the Apache v2.0 license was central to that choice. We’re very supportive of the OpenSearch project, as it will give us greater control and autonomy over our data platform choices while retaining the freedom afforded by an open source license.”

Deborah Bryant, Senior Director, Open Source Program Office at Red Hat said:

“At Red Hat, we believe in the power of open source, and that community collaboration is the best way to build software”.

Tomer Levy, co-founder and CEO of Logz.io said:

“At Logz.io we have a deep belief that community driven open source is an enabler for innovation and prosperity… We have the highest commitment to our customers and the community that relies on open source to ensure that OpenSearch is available, thriving, and has a strong path forward for the community and led by the community”

The initial code, which is available now on GitHub, is at ‘alpha stage’, meaning that it is not complete, not thoroughly tested, and not suitable for production use. OpenSearch has only 45 contributors compared to the 1,610 contributors of the original Elasticsearch project, and has around one-tenth of the level of commit activity. However, at its current rate, AWS predict it will stabilise and be ready for production by mid-2021.

Figure 2: Comparison of Elasticsearch and OpenSearch’s GitHub code commits, Source – GitHub insights

 

A Software Fork – The ‘Nuclear Option’

To fork software is to take the source code from one software package and start independent development on it to create a distinct, separate, and entirely new program. To be considered a fork, the newer version of the software must have its own name and its own developer community.

In the context of open source software, software packages may be forked from the original development team without prior permission and without violating copyright law as per the open source definition.

A software fork often represents a split in the developer community. This split can occur due to a schism over different goals or personality clashes. In the case of AWS and Elastic, this fork occurred due to the former reason.

Elastic chose to move away from open source to proprietary for its products as the source available licensing model better suited their current business needs. AWS declare that their motive behind their fork was to make:

“a long-term investment in OpenSearch to ensure users continue to have a secure, high-quality, fully open source search and analytics suite”.

Software forks can be incredibly controversial when they duplicate efforts. One commentator has gone as far as to describe a fork as ‘the open source equivalent of the nuclear option’. Eric Raymond, the co-founder of the Open Source Initiative, says in ‘Homesteading the Noosphere’:

 “There is a strong social pressure against forking projects. It does not happen except under plea of dire necessity, with much public self-justification, and requires re-naming”

AWS recognised the contention in their initial announcement back in January:

“Choosing to fork a project is not a decision to be taken lightly, but it can be the right path forward when the needs of a community diverge, as they have here”

 

AWS’ Intentions Behind OpenSearch

In the early development stages of AWS’ fork, doubt was cast on their intentions. In an update in February, before OpenSearch was made public on GitHub, a commentator made a fair observation:

“As an external observer, this is unclear to me why this effort is not done in the open. This does not look like a very good sign that the future community will be fully open.”

Kyle Davis, Senior Developer Advocate of OpenSearch and Open Distro for Elasticsearch at AWS, replied:

“As far as the questions of openness, we’ve talked and written about our plans as far as community, openness, and governance. I totally get talk is cheap and people in the broader community have been burned by similar statements recently. However, I’m confident that we’ll prove fears in this area unfounded over time.”

That time came on the 12th of April when AWS announced and went public with OpenSearch. They made their goals and intentions behind the OpenSearch project clear in the statement:

“Our goal with the OpenSearch project is to make it easy for as many people and organizations as possible to use OpenSearch in their business, their products, and their projects… Whether you are an independent developer, an enterprise IT department, a software vendor, or a managed service provider, the ALv2 license grants you well-understood usage rights for OpenSearch. You can use, modify, extend, embed, monetize, resell, and offer OpenSearch as part of your products and services. We have also published permissive usage guidelines for the OpenSearch trademark, so you can use the name to promote your offerings. Broad adoption benefits all members of the community.”

The fact that OpenSearch has the support of organisations such as Red Hat, SAP, Capital One and Logz.io suggests this forking effort is not to further the ‘AWS Show’. The support doesn’t end there: AWS’ OpenSearch announcement was also welcomed by members of the wider open source community. Adam Jacob, CEO of the start-up System Initiative and co-founder of Chef, tweeted:

“This is good for everyone (except maybe Elastic, but they brought it on themselves). Good on AWS for forking, for taking out a greenfield trademark and collaborating openly with others. Long live OpenSearch…”

However, other members of the community have criticised AWS’ decision. Bruno Borges, Java lead at Microsoft, tweeted:

“While I think it is an ‘OK’ move by AWS Open Source, I also think it was a missed opportunity to improve company’s relationship with FOSS communities,” expressing disappointment that AWS was keeping itself the steward of OpenSearch rather than handing it to a foundation such as Apache.

The former Open Source Initiative president, Simon Phipps, says the tools for the removal of software freedoms are relicensing, as Elastic have done, and contributor licensing agreements (‘CLA’) as these agreements give the company effective rights over the whole codebase.

When contributing to open source projects it is not uncommon to be required to sign a CLA. However, if you wish to contribute to OpenSearch, AWS will not ask you to sign one. Their reasoning behind the absence of CLAs is that it makes it easier for anyone to contribute.

Furthermore, AWS say the project is:

“to be a community endeavour, where anyone can contribute to it, influence it, and make decisions together about its future. Community development, at its best, lets people with diverse interests have a direct hand in guiding and building products they will use; this results in products that meet their needs better than anything else.”

It will be interesting to see how OpenSearch develops throughout its lifecycle and specifically AWS’ actions regarding the project; whether the community will truly be their focus going forward.

 

Open Source in the Cloud Era

Organisations have been moving to the cloud for years due to its numerous benefits. The cloud is more cost-effective, flexible and secure, and offers newer and more advanced applications. A recent global study by LogicMonitor also shows that COVID-19 has become a ‘powerful catalyst’ for cloud migration. This is because:

“the cloud is enabling [organisations] to operate remotely now while also serving as the foundation for digital transformation and ongoing innovation”, says Tej Redkar, chief product officer at LogicMonitor.

Figure 3: Acceleration of Cloud Migration Post COVID-19, Source – LogicMonitor

 

Elasticsearch is a frequently used cloud-based product. In fact, in the past four months, Source Code Control has seen Elasticsearch crop up in multiple audits. The continued investment in cloud solutions means that people are more likely to use cloud-based products such as Elasticsearch, MongoDB, Redis and, by summer 2021, AWS’ OpenSearch.

The license changes the community has seen from cloud providers over the past four years emphasise the importance of knowing what is in your solution. Consistent monitoring of what is in your software is necessary to keep up to date with these changes and, ultimately, to mitigate business risks in your solution.
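One simple way to picture this kind of monitoring is to compare two snapshots of a software inventory and flag any component whose license has changed. The Python sketch below is a minimal illustration under stated assumptions: the function name, the inventory format (component name mapped to a license identifier) and the sample entries are hypothetical, and real audits rely on far richer data than this.

```python
def license_changes(previous, current):
    """Return {component: (old_license, new_license)} for changed licenses."""
    changes = {}
    for name, new_license in current.items():
        old_license = previous.get(name)
        # Only report components present in both snapshots with a new license.
        if old_license is not None and old_license != new_license:
            changes[name] = (old_license, new_license)
    return changes


# Illustrative inventories; "example-lib" is a made-up component.
before = {"elasticsearch": "Apache-2.0", "example-lib": "MIT"}
after = {"elasticsearch": "SSPL-1.0", "example-lib": "MIT"}

print(license_changes(before, after))
# prints {'elasticsearch': ('Apache-2.0', 'SSPL-1.0')}
```

Components that appear only in the newer snapshot are ignored here; a fuller check would also report newly introduced components so their licenses can be reviewed.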

If this article raises questions or concerns, Source Code Control can provide you with support and guidance. Visit our website here or contact us at [email protected]
