The great big open-source census: Most-used libraries revealed – plus 10 things developers should be doing to keep their code secure

Linux Foundation hears your gripes about naming schemes, legacy code, and more

With modern applications now composed of 80 to 90 per cent Free and Open Source Software (FOSS), the Linux Foundation and Laboratory for Innovation Science at Harvard University (LISH) on Wednesday published their second open-source census to promote better security and code management practices.

The first such report appeared in 2015, and focused on enumerating critical components in the Debian GNU/Linux distribution. The latest one, "Vulnerabilities in the Core, a Preliminary Report and Census II of Open Source Software," examines the most commonly used FOSS packages in production applications with an eye toward potential vulnerabilities so organizations can develop better management and security tools.

The reports are part of the Linux Foundation's Core Infrastructure Initiative (CII), a multimillion-dollar project backed by Amazon Web Services, Adobe, Bloomberg, Cisco, Dell, Facebook, Fujitsu, Google, Hitachi, HP, Huawei, IBM, Intel, Microsoft, NetApp, NEC, Qualcomm, RackSpace, and VMware. The CII gives companies a way to fund the open source projects they've come to depend on, like OpenSSL.

Through these reports, the Linux Foundation and LISH aim to promote software ecosystem improvements that will help enterprises and organizations become more active in preventing software vulnerabilities and attacks.

"The report begins to give us an inventory of the most important shared software and potential vulnerabilities and is the first step to understand more about these projects so that we can create tools and standards that results in trust and transparency in software," explained Jim Zemlin, executive director at the Linux Foundation, in a statement.

A companion report, "Open Source Software Supply Chain Security [PDF]," makes the case for concern by recalling a series of software package compromises over the past few years. These include: the 2015 repackaging of Apple's Xcode IDE to enable malicious code distribution; the 2016 npm "left-pad" debacle; the 2017 Python package (PyPI) typosquatting and 2018 "Colourama" crypto-stealing incident; and the 2018 backdooring of the npm "event-stream" library, among others.
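
To illustrate the typosquatting problem, here is a minimal sketch of how a project might flag a requested package name that closely resembles, but does not match, one it actually intends to use. The allow-list, the function name `likely_typosquats`, and the 0.85 similarity threshold are assumptions made for illustration; none of them come from the reports.

```python
from difflib import SequenceMatcher

# Hypothetical allow-list of packages a project intends to depend on.
KNOWN_PACKAGES = ["colorama", "requests", "urllib3", "numpy"]


def likely_typosquats(name, known=KNOWN_PACKAGES, threshold=0.85):
    """Return known packages that `name` resembles without matching exactly."""
    candidate = name.lower()
    if candidate in known:
        return []  # exact match: this is the intended package
    # Keep every known name whose similarity ratio meets the (assumed) threshold.
    return [
        pkg
        for pkg in known
        if SequenceMatcher(None, candidate, pkg).ratio() >= threshold
    ]


if __name__ == "__main__":
    print(likely_typosquats("colourama"))
    print(likely_typosquats("colorama"))
```

A check like this only catches near-misses of names already on the list, so it would complement, rather than replace, the account-security and review practices the reports discuss.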

The primary focus of the Census II report is to identify the ten most used JavaScript and non-JavaScript packages, with attention paid to open issues and the frequency of code commits, and to outline lessons learned.

The JavaScript top 10 consists of:

  • Async: For writing asynchronous JavaScript.
  • Inherits: For implementing inheritance.
  • Isarray: Array testing for older browsers.
  • Kind-of: Get the native type designation of a JavaScript value.
  • Lodash: A utility library.
  • Minimist: For parsing argument options.
  • Natives: Provides access to Node.js’s native JavaScript modules.
  • Qs: A query string parsing and stringifying library.
  • Readable-stream: Node.js core streams module.
  • String_decoder: Node-core string_decoder module.

The non-JavaScript top 10 includes:

  • Com.fasterxml.jackson.core:jackson-core: Part of Jackson, a JSON processor.
  • Com.fasterxml.jackson.core:jackson-databind: A data-binding package for Jackson (2.x).
  • Com.google.guava:guava: Google core libraries for Java.
  • Commons-codec: Apache Commons-Codec encoding software.
  • Commons-io: A library of utilities for IO operations.
  • Httpcomponents-client: Low-level Java components focused on HTTP.
  • Httpcomponents-core: Low-level Java components focused on HTTP.
  • Logback-core: A Java logging framework.
  • Org.apache.commons:commons-lang3: A package of Java utility classes.
  • Slf4j: A Java logging framework abstraction.

The report touches on various findings, specifically the need for a standardized naming schema for software components (so everyone understands the specific code being discussed), the importance of developer account security (so identities and packages can't be hijacked), and the challenge of dealing with legacy code (because moving to a revised package may not be an easy process).
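
As a rough illustration of what a standardized naming schema could buy, the sketch below turns an ecosystem, an optional namespace, and a package name into one canonical identifier, so that "Lodash" and "lodash" refer to the same component. The `Component` type, the `canonical_id` function, and the `pkg:ecosystem/namespace/name` layout are assumptions loosely modeled on package-URL-style identifiers; the report does not prescribe this format, and lower-casing every field is a simplification.

```python
from typing import NamedTuple, Optional


class Component(NamedTuple):
    ecosystem: str                   # e.g. "npm" or "maven"
    name: str                        # e.g. "lodash" or "jackson-core"
    namespace: Optional[str] = None  # e.g. a Maven group ID


def canonical_id(component: Component) -> str:
    """Build one lower-cased identifier per component (hypothetical format)."""
    parts = [component.ecosystem.lower()]
    if component.namespace:
        parts.append(component.namespace.lower())
    parts.append(component.name.lower())
    return "pkg:" + "/".join(parts)


if __name__ == "__main__":
    print(canonical_id(Component("npm", "Lodash")))
    print(canonical_id(Component("maven", "jackson-core", "com.fasterxml.jackson.core")))
```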

A second companion report, "Improving Trust and Security in Open Source Projects," [PDF] offers some actual advice on how to deal with the issues raised in the other two publications. These best practices include:

  • Defining roles and responsibilities in security teams.
  • Having and communicating a security policy.
  • Identifying project contributors and assigning appropriately scoped permissions.
  • Proper use of git for handling security issues, and other toolchain concerns.
  • Maintaining technical security guidance documents.
  • Having incident response and vulnerability management playbooks.
  • Implementing security testing and code review procedures.
  • Defining secure release criteria.

"Hundreds of thousands of open source software packages are in production applications throughout the supply chain, so understanding what we need to be assessing for vulnerabilities is the first step for ensuring long-term security and sustainability of open source software," said Zemlin. ®
