Community
2.8 -
Activity
The activity score measures how recently a project has been updated, based on the number of days since the last commit on its default branch. Regular commits indicate an actively maintained project, responsive to bugs, security issues, and evolving user needs.
A project scores 5 if its last commit was within the past 30 days, 4 if within 6 months, 3 if within a year, and 2 if within 2 years. Projects that have not received any commit in over 2 years score 1, which may signal abandonment or a transition to a maintenance-only phase.
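The mapping above can be sketched as a small lookup (activity_score is a hypothetical name; the exact boundary handling and the 182-day approximation of six months are assumptions):

```python
def activity_score(days_since_last_commit: int) -> int:
    """Map days since the last default-branch commit to a 1-5 score.
    Inclusive (<=) boundaries are assumed; 6 months is approximated as 182 days."""
    if days_since_last_commit <= 30:
        return 5
    if days_since_last_commit <= 182:   # ~6 months
        return 4
    if days_since_last_commit <= 365:   # 1 year
        return 3
    if days_since_last_commit <= 730:   # 2 years
        return 2
    return 1
```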
5.0 -
Popularity
The popularity score is based on the number of GitHub stars the project has received. While stars are an imperfect proxy for adoption, they remain one of the most widely available and comparable signals of community interest and visibility.
A project with fewer than 5,000 stars scores 1. The score increases to 2 at 5,000 stars, 3 at 20,000, 4 at 40,000, and reaches the maximum of 5 at 80,000 stars or more. These thresholds are intentionally high to differentiate among well-known open-source projects.
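Because the star thresholds are strictly increasing, the mapping can be sketched with a sorted-list lookup (popularity_score is a hypothetical helper; inclusive lower bounds are assumed):

```python
import bisect

def popularity_score(stars: int) -> int:
    """GitHub stars -> 1-5 score: 5k -> 2, 20k -> 3, 40k -> 4, 80k -> 5."""
    thresholds = [5_000, 20_000, 40_000, 80_000]
    # bisect_right counts how many thresholds have been reached
    return 1 + bisect.bisect_right(thresholds, stars)
```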
2.0 -
Maturity
The maturity score reflects how long a project has been in existence, measured by the time elapsed since its very first commit. A long track record is a sign of a project that has withstood the test of time, accumulated institutional knowledge, and stabilized through multiple release cycles.
Projects receive a score of 1 if their first commit is less than a year old, indicating a very young project. A score of 2 is given starting at 1 year, 3 at 5 years, 4 at 10 years, and the maximum score of 5 is reserved for projects with 20 years or more of history. These thresholds reward long-term sustainability without penalizing newer projects too harshly.
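A minimal sketch of this mapping, assuming inclusive year boundaries and an average year length of 365.25 days (maturity_score is a hypothetical name):

```python
from datetime import date

def maturity_score(first_commit: date, today: date) -> int:
    """Years since the first commit -> 1-5 score."""
    years = (today - first_commit).days / 365.25
    for threshold, score in ((20, 5), (10, 4), (5, 3), (1, 2)):
        if years >= threshold:
            return score
    return 1  # less than a year old
```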
4.0 -
Number of contributors
The contributors score evaluates how many people are actively working on the project. Rather than counting all historical contributors, it focuses on those who have made more than 3 commits in the last 6 months, excluding bots. This filters out one-time contributors and automated accounts.
A project with no qualifying active contributor scores 1. A score of 2 requires at least 1 active contributor, 3 requires 5, 4 requires 20, and the highest score of 5 is awarded to projects with 50 or more active contributors. This reflects the breadth and health of the development community around the project.
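The filter-then-count logic can be sketched as follows; the (author, days_ago) input shape, the "[bot]" suffix test for bot accounts, and the 182-day window are all assumptions for illustration:

```python
from collections import Counter

def active_contributors(commits) -> int:
    """commits: iterable of (author, days_ago) pairs.
    Counts authors with more than 3 commits in the last ~6 months,
    excluding accounts that follow the GitHub "[bot]" naming convention."""
    per_author = Counter(
        author for author, days_ago in commits
        if days_ago <= 182 and not author.endswith("[bot]")
    )
    return sum(1 for count in per_author.values() if count > 3)

def contributors_score(active: int) -> int:
    """Active contributor count -> 1-5 score."""
    for threshold, score in ((50, 5), (20, 4), (5, 3), (1, 2)):
        if active >= threshold:
            return score
    return 1
```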
A diverse contributor base reduces the "bus factor" risk: the chance that the project stalls if a single maintainer steps away. Projects with many active contributors are generally more resilient, better reviewed, and more likely to sustain long-term development.
2.0 -
Technical documentation
The documentation score is a composite evaluation of how well a project is documented, combining four sub-metrics with different weights: README quality (40%), the presence and richness of a docs directory (30%), the number of key README sections (20%), and accessibility features (10%).
README quality is assessed on length and structure. A comprehensive README with over 3,000 words and at least 8 key sections (such as installation, usage, configuration, contributing, examples, and license) will score highest. The docs directory sub-metric rewards projects that maintain a dedicated documentation folder, with extra points for having more files. Accessibility checks for the presence of a CONTRIBUTING.md file and GitHub issue templates, which lower the barrier for new contributors.
These four sub-metrics are combined into a single weighted value (0-100), then mapped to a 1-5 score using thresholds at 20, 40, 60, and 80 points. A project that only has a short README and no docs directory will score low, while a project with thorough documentation across all dimensions will achieve the maximum score. Good documentation is one of the strongest indicators of a project that cares about its users and contributors.
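A sketch of the weighted combination, assuming each sub-metric is already normalized to a 0-100 scale and that the 20/40/60/80 thresholds are inclusive:

```python
def documentation_score(readme: float, docs_dir: float,
                        sections: float, accessibility: float) -> int:
    """Weighted composite (0-100) of four sub-metrics, mapped to 1-5."""
    composite = (0.4 * readme + 0.3 * docs_dir
                 + 0.2 * sections + 0.1 * accessibility)
    for threshold, score in ((80, 5), (60, 4), (40, 3), (20, 2)):
        if composite >= threshold:
            return score
    return 1
```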
2.0
Tech
3.3 -
Technical debt
The technical debt score reflects the structural health of the production code: excessive complexity, tight coupling, coding convention violations, etc.
A score of 5 corresponds to clean, well-structured code; a score of 1 indicates significant debt that is likely to slow the project's evolution. Intermediate values reflect increasing levels of structural degradation.
Unlike size or test coverage, technical debt is a qualitative indicator: it does not say how much code exists, but in what state it is. It directly determines how easy it is to fix a bug, add a feature, or resume a project after a long period of inactivity.
2.8 -
Test coverage
The tests score evaluates the project's investment in automated testing, measured as the ratio of test lines of code to production lines of code. Test files are identified by common naming conventions (_test, .test., .spec.) and directory patterns (/test/, /tests/, /t/, etc.).
A ratio of 0.1 (10% of test code relative to production code) earns a score of 2, 0.4 (40%) earns 3, 0.8 (80%) earns 4, and a ratio of 1.6 or more (160%, meaning more test code than production code) earns the maximum score of 5. Projects with a test ratio below 10% score 1.
A healthy test ratio is one of the strongest signals of software quality. Projects with extensive test suites are more likely to catch regressions, easier to refactor safely, and more welcoming to contributors who can validate their changes. The thresholds are deliberately progressive: achieving a 1:1 test-to-code ratio or beyond demonstrates an exceptional commitment to reliability.
3.0 -
Global size
The size score reflects the volume of production code in the project, measured in lines of code (LOC). Line counting is performed by Tokei, a fast and accurate tool that counts only actual code lines, excluding comments and blanks. Test files are excluded from this count.
Projects with fewer than 1,000 lines of code score 5. The score decreases with size: up to 10,000 lines scores 4, up to 100,000 scores 3, and up to 1,000,000 scores 2; projects exceeding 1,000,000 lines score 1.
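A sketch of this smaller-is-better mapping (size_score is a hypothetical name; exact boundary handling is an assumption):

```python
def size_score(production_loc: int) -> int:
    """Production lines of code -> 1-5 score; smaller codebases score higher."""
    if production_loc < 1_000:
        return 5
    if production_loc <= 10_000:
        return 4
    if production_loc <= 100_000:
        return 3
    if production_loc <= 1_000_000:
        return 2
    return 1
```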
A smaller codebase is generally easier to understand, audit, and maintain. Large codebases tend to accumulate technical debt and are harder for new contributors to approach. However, this score should be interpreted in context: a large project is not inherently bad if its complexity and test coverage remain healthy. Some domains simply require more code.
2.0 -
Complexity
The complexity score measures the proportion of functions in the codebase that exhibit high cyclomatic complexity, as analyzed by Lizard. Cyclomatic complexity counts the number of independent execution paths through a function. A function is considered highly complex when its cyclomatic complexity exceeds 15.
The metric used is the percentage of high-complexity functions relative to the total number of functions. The analysis focuses on the 1,000 largest production files, filtering out vendored code, generated files, build scripts, documentation, and other non-essential directories. This ensures the score reflects the project's own code rather than third-party dependencies.
The score uses a "smaller is better" logic: a project where 5% or fewer of functions are highly complex scores 5. The score decreases to 4 at 10%, 3 at 20%, 2 at 30%, and 1 above 30%. High cyclomatic complexity often correlates with code that is harder to test, more prone to bugs, and more difficult to maintain.
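This smaller-is-better mapping can be sketched as (complexity_score is a hypothetical helper; the zero-function edge case is an assumption):

```python
def complexity_score(high_complexity: int, total: int) -> int:
    """Share of functions with cyclomatic complexity above 15 -> 1-5 score."""
    pct = 100.0 * high_complexity / total if total else 0.0
    for threshold, score in ((5, 5), (10, 4), (20, 3), (30, 2)):
        if pct <= threshold:
            return score
    return 1  # more than 30% of functions are highly complex
```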
5.0
Security
1.7 -
Security policy
We are using Scorecard
This check tries to determine if the project has published a security policy. It works by looking for a file named SECURITY.md (case-insensitive) in a few well-known directories.
A security policy (typically a SECURITY.md file) can give users information about what constitutes a vulnerability and how to report one securely so that information about a bug is not publicly visible.
This check examines the contents of the security policy file, awarding points for policies that describe a vulnerability handling process, give disclosure timelines, and include links (e.g., URLs and email addresses) to support users.
5.0 -
Pinned dependencies
We are using Scorecard
This check tries to determine if the project pins dependencies used during its build and release process. A "pinned dependency" is a dependency that is explicitly set to a specific hash instead of allowing a mutable version or range of versions. It is currently limited to repositories hosted on GitHub, and does not support other source hosting repositories (i.e., Forges).
The check works by looking for unpinned dependencies in Dockerfiles, shell scripts, and GitHub workflows which are used during the build and release process of a project. Special considerations for Go modules treat full semantic versions as pinned due to how the Go tool verifies downloaded content against the hashes when anyone first downloaded the module.
Pinned dependencies reduce several security risks:
- They ensure that checking and deployment are all done with the same software, reducing deployment risks, simplifying debugging, and enabling reproducibility.
- They can help mitigate compromised dependencies from undermining the security of the project (in the case where you've evaluated the pinned dependency, you are confident it's not compromised, and a later version is released that is compromised).
- They are one way to counter dependency confusion (aka substitution) attacks, in which an application uses multiple feeds to acquire software packages (a "hybrid configuration"), and attackers fool the user into using a malicious package via a feed that was not expected for that package.
However, pinning dependencies can inhibit software updates that become necessary, for example when a security vulnerability is found or when the pinned version itself is compromised. Mitigate this risk by:
- using automated tools to notify applications when their dependencies are outdated;
- quickly updating applications that do pin dependencies.
-
Packaging
We are using Scorecard
This check tries to determine if the project is published as a package. It is currently limited to repositories hosted on GitHub, and does not support other source hosting repositories (i.e., Forges).
Packages give users of a project an easy way to download, install, update, and uninstall the software by a package manager. In particular, they make it easy for users to receive security patches as updates.
The check currently looks for GitHub packaging workflows and language-specific GitHub Actions that upload the package to a corresponding hub, e.g., Npm. We plan to add better support to query package manager hubs directly in the future, e.g., for Npm, PyPi.
You can create a package in several ways:
- Many programming language ecosystems have a generally-used packaging format supported by a language-level package manager tool and a public package repository.
- Many operating system platforms also have at least one package format, tool, and public repository (in some cases the source repository generates system-independent source packages, which are then used by others to generate system executable packages).
- Using container images.
-
Vulnerabilities
We are using Scorecard
This check determines whether the project has open, unfixed vulnerabilities in its own codebase or its dependencies, using the OSV (Open Source Vulnerabilities) service. An open vulnerability can be readily exploited by attackers and should be fixed as soon as possible.
1.0 -
Binary artifacts
We are using Scorecard
This check determines whether the project has generated executable (binary) artifacts in the source repository.
Including generated executables in the source repository increases user risk. Many programming language systems can generate executables from source code (e.g., machine code from C/C++, Java .class files, Python .pyc files, and minified JavaScript). Users will often directly use executables if they are included in the source repository, leading to many dangerous behaviors.
Problems with generated executable (binary) artifacts:
- Binary artifacts cannot be reviewed, allowing obsolete or maliciously subverted executables to go unnoticed. Reviews generally cover source code, not executables, since it's difficult to audit executables and ensure they correspond to the source code. Over time the included executables might diverge from the source code.
- Generated executables allow the executable generation process to atrophy, which can lead to an inability to create working executables. These problems can be countered with verified reproducible builds, but it's easier to implement verified reproducible builds when executables are not included in the source repository (since the executable generation process is less likely to have atrophied).
Allowed by Scorecard:
- Files in the source repository that are simultaneously reviewable source code and executables, since these are reviewable. (Some interpretive systems, such as many operating system shells, don't have a mechanism for storing generated executables that are different from the source file.)
- Source code in the source repository generated by other tools (e.g., by bison, yacc, flex, and lex). There are potential downsides to generated source code, but generated source code tends to be much easier to review and thus presents a lower risk. Generated source code is also often difficult for external tools to detect.
- Generated documentation in source repositories. Generated documentation is intended for use by humans (not computers) who can evaluate the context. Thus, generated documentation doesn't pose the same level of risk.
-
Branch protection
We are using Scorecard
This check determines whether a project's default and release branches are protected with GitHub's branch protection or repository rules settings. Branch protection allows maintainers to define rules that enforce certain workflows for branches, such as requiring review or passing certain status checks before acceptance into a main branch, or preventing rewriting of public history.
Different types of branch protection protect against different risks:
- Require code review:
- requires at least one reviewer, which greatly reduces the risk that a compromised contributor can inject malicious code. Review also increases the likelihood that an unintentional vulnerability in a contribution will be detected and fixed before the change is accepted.
- requiring two or more reviewers protects even more against insider risk, whereby an attacker uses a compromised contributor account to approve the attacker's pull request and inject malicious code as if it were legitimate.
- Prevent force push: prevents use of the --force option on public branches, which overwrites history irrevocably. This protection prevents the rewriting of public history without external notice.
- Require status checks: ensures that all required CI tests are met before a change is accepted.
-
Code review
We are using Scorecard
This check determines whether the project requires human code review before pull requests (merge requests) are merged.
Reviews detect various unintentional problems, including vulnerabilities that can be fixed immediately before they are merged, which improves the quality of the code. Reviews may also detect or deter an attacker trying to insert malicious code (either as a malicious contributor or as an attacker who has subverted a contributor's account), because a reviewer might detect the subversion.
The check determines whether the most recent changes (over the last ~30 commits) have an approval on GitHub or whether the merger is different from the committer (an implicit review). It also performs a similar check for reviews using Prow (labels "lgtm" or "approved") and Gerrit ("Reviewed-on" and "Reviewed-by"). If recent changes are solely bot activity (e.g., Dependabot, Renovate, or custom bots), the check returns an inconclusive result.
Scoring is leveled rather than proportional to make the check more predictable. If any bot-originated changes are unreviewed, 3 points are deducted. If any human change is unreviewed, 7 points are deducted, and another 3 are deducted if multiple human changes are unreviewed.
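The leveled deductions can be sketched on Scorecard's 0-10 scale (the function name and clamping at zero are assumptions):

```python
def code_review_points(unreviewed_bot_changes: int,
                       unreviewed_human_changes: int) -> int:
    """Leveled deductions from a 10-point base, per the description above."""
    points = 10
    if unreviewed_bot_changes > 0:
        points -= 3   # any unreviewed bot-originated change
    if unreviewed_human_changes >= 1:
        points -= 7   # a single unreviewed human change
        if unreviewed_human_changes > 1:
            points -= 3   # multiple unreviewed human changes
    return max(points, 0)
```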
Reviews by bots, including bots powered by artificial intelligence / machine learning (AI/ML), do not count as code review. Such reviews do not provide confidence that a second person understands the code change (e.g., if the originator suddenly becomes unavailable). However, analysis by bots may be able to meet (at least in part) the SAST criterion.
1.0 -
Signed releases
We are using Scorecard
This check tries to determine if the project cryptographically signs release artifacts. It is currently limited to repositories hosted on GitHub, and does not support other source hosting repositories (i.e., Forges).
Signed releases attest to the provenance of the artifact.
This check looks for the following filenames in the project's last five release assets:
*.minisig, *.asc (pgp), *.sig, *.sign, *.sigstore, *.sigstore.json, *.intoto.jsonl. If a signature is found in the assets for each release, a score of 8 is given. If a SLSA provenance file is found in the assets for each release (*.intoto.jsonl), the maximum score of 10 is given.
This check looks for the 30 most recent releases associated with an artifact. It ignores the source code-only releases that are created automatically by GitHub.
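A sketch of the all-releases-signed logic described above; the function name, the input shape, and the 0 returned when neither condition holds are assumptions, since partial-credit rules are not stated:

```python
SIGNATURE_SUFFIXES = (".minisig", ".asc", ".sig", ".sign",
                      ".sigstore", ".sigstore.json")

def signed_releases_points(releases) -> int:
    """releases: list of asset-filename lists, one per release.
    10 if every release ships SLSA provenance (*.intoto.jsonl),
    8 if every release ships a signature file, 0 otherwise (assumed)."""
    if not releases:
        return 0
    if all(any(f.endswith(".intoto.jsonl") for f in r) for r in releases):
        return 10
    if all(any(f.endswith(SIGNATURE_SUFFIXES) for f in r) for r in releases):
        return 8
    return 0
```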
0.6
Overview
GoCD is an open‑source continuous delivery server that streamlines the build‑test‑release cycle, letting teams ship software with confidence. Built primarily with Java, TypeScript, Spring, SparkJava, and MithrilJS, it runs on Eclipse Jetty and can even be used to build itself. The project is backed by Thoughtworks and JetBrains, and provides extensive documentation, community support, and a permissive Apache 2.0 license.