Releases: nexB/scancode-toolkit
v30.1.0 - 2021-09-25
This is a bug fix release for these bugs:
We now return the package in the summaries as before.
There is also a minor API change: we no longer return a count of "null" empty
values in the summaries for license, copyrights, etc.
Thank you to:
- Thomas Druez @tdruez
See also https://github.com/nexB/scancode-toolkit/tree/v30.0.0 for details on the main changes in v30.0.x
What's Changed
- Prepare bugfix release 30.0.1 #2713 by @pombredanne in #2715
- Return package details in summary #2717 by @pombredanne in #2718
Full Changelog: v30.0.1...v30.1.0
v30.0.1 - 2021-09-24
This is a minor bug fix release for these bugs:
We now correctly work with all supported Click versions.
Thank you to:
See also https://github.com/nexB/scancode-toolkit/tree/v30.0.0 for details on the main changes in v30.0.x
Full Changelog: v30.0.0...v30.0.1
v30.0.0 - 2021-09-23
This is a major release with new features, and several bug fixes and
improvements including major updates to the license detection.
We have dropped calendar-based versions and switched back to semver
versioning. To ensure that there is no ambiguity, the new major version has been
updated from 21 to 30. The primary reason is that calver was not helping
integrators to track major version changes like semver does.
We also have introduced a new JSON output format version based on semver to
version the JSON output format data structure and have documented the new
versioning approach.
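For integrators, the practical effect of the switch can be sketched as follows. This is only an illustration, not code from ScanCode: the helper names `parse_version` and `is_breaking_upgrade` are hypothetical, and the sketch assumes plain `MAJOR.MINOR.PATCH` version strings with an optional leading "v". It shows that the earlier calendar-based versions still sort before the new 30.x versions, and that only a change of the major number signals a breaking change.

```python
def parse_version(version):
    """Split a version string such as "v30.1.0" into a tuple of integers."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def is_breaking_upgrade(installed, candidate):
    """Under semver, only a change of the major number signals breaking changes."""
    return parse_version(candidate)[0] != parse_version(installed)[0]


# Version strings taken from these release notes.
versions = ["v30.0.0", "v21.8.4", "v30.1.0", "v21.7.30"]

# Because 30 > 21, the old calendar-based versions sort before the semver ones.
print(sorted(versions, key=parse_version))

# A minor bump within the 30.x series is not flagged as breaking.
print(is_breaking_upgrade("v30.0.1", "v30.1.0"))
```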
Here are the key changes for each area:
Package detection:
- The Debian packages declared license detection in machine readable copyright
files and unstructured copyright has been significantly improved with the
tracking of the detection start and end line of a license match. This is not
yet exposed outside of tests but has been essential to help improve detection.
- Debian copyright license detection has been significantly improved with new
license detection rules.
- Support for Windows packages has been improved (and in particular the handling
of Windows packages detection in the Windows registry).
- Support for Cocoapod packages has been significantly revamped and is now
working as expected.
- Support for PyPI packages has been refined, in particular package descriptions.
Copyright detection:
- The copyright detection accuracy has been improved and several bugs have been
fixed.
License detection:
There have been some significant updates in license detection. We now track
34,164 licenses and license notices:
- 84 new licenses have been added,
- 34 existing license metadata have been updated,
- 2765 new license detection rules have been added, and
- 2041 existing license rules have been updated.
- Several license detection bugs have been fixed.
- The SPDX license list 3.14 is now supported and has been synced with the
licensedb. We also include the version of the SPDX license list in the
ScanCode YAML, JSON and the SPDX outputs, as well as display it with the
"--version" command line option.
- Unknown licenses have a new flag "is_unknown" in their metadata to identify
them explicitly. Before that we were just relying on the naming convention of
having "unknown" as part of a license key.
- Rules that match at least one unknown license have a flag "has_unknown" set
and returned in the match results.
- Experimental: License detection can now "follow" license mentions that
reference another file such as "see license in COPYING" where we can relate
this mention to the actual license detected in the COPYING file. Use the new
"--unknown-licenses" command line option to test this new feature.
This feature will evolve significantly in the next version(s).
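The difference between the explicit flag and the older naming convention can be sketched as below. The match records are hypothetical and heavily simplified: only the "has_unknown" flag name comes from these notes, while the dictionary layout, the "key" field and the sample license keys are assumptions made for illustration.

```python
# Hypothetical, simplified match records; real scan results carry more fields.
matches = [
    {"key": "mit", "has_unknown": False},
    {"key": "unknown-license-reference", "has_unknown": True},
    {"key": "unknown"},  # older-style record without the explicit flag
]


def needs_review(match):
    """Return True when a match involves an unknown license."""
    # Prefer the explicit flag when it is present.
    if "has_unknown" in match:
        return bool(match["has_unknown"])
    # Otherwise fall back to the older convention of "unknown" in the key.
    return "unknown" in match.get("key", "")


to_review = [m["key"] for m in matches if needs_review(m)]
print(to_review)
```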
Outputs:
- The SPDX output now has the mandatory ids attribute per SPDX spec. And we
support SPDX 2.2 and SPDX license list 3.14.
Miscellaneous
- There is a new "--no-check-version" CLI option to scancode to bypass the live,
remote outdated version check on PyPI.
- The scan results and the CLI now display an outdated version warning when
the installed ScanCode version is older than 90 days. This is to warn users
that they are relying on outdated, likely buggy, insecure and inaccurate scan
results and encourage them to update to a newer version. This check is made
entirely locally based on date comparisons.
- The command line progressbar counters are displayed correctly again.
- A bug has been fixed in summarization.
- Generated code detection has been improved with several new keywords.
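The local, date-based outdated version check mentioned above can be pictured with a minimal sketch. This is not ScanCode's implementation: the function name `is_outdated` and the way the release date is passed in are assumptions, and only the 90-day threshold comes from these notes.

```python
from datetime import date, timedelta

# Threshold stated in the release notes.
MAX_AGE = timedelta(days=90)


def is_outdated(release_date, today=None):
    """Return True when release_date is more than 90 days before today."""
    today = today or date.today()
    return today - release_date > MAX_AGE


# No network access is needed: only two dates are compared.
print(is_outdated(date(2021, 9, 23), today=date(2021, 10, 1)))
print(is_outdated(date(2021, 9, 23), today=date(2022, 1, 1)))
```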
Thank you!
Many thanks to the many contributors that made this release possible and in
particular:
- Akanksha Garg @akugarg
- Armijn Hemel @armijnhemel
- Ayan Sinha Mahapatra @AyanSinhaMahapatra
- Bryan Sutula @sutula
- Chin-Yeung Li @chinyeungli
- Dennis Clark @DennisClark
- dyh @yunhua-deng
- Dr. Frank Heimes @FrankHeimes
- gunaztar @gunaztar
- Helio Chissini de Castro @heliocastro
- Henrik Sandklef @hesa
- Jiyeong Seok @dd-jy
- John M. Horan @johnmhoran
- Jono Yang @JonoYang
- Joseph Heck @heckj
- Luis Villa @tieguy
- Konrad Weihmann @priv-kweihmann
- mapelpapel @mapelpapel
- Maximilian Huber @maxhbr
- Michael Herzog @mjherzog
- MMarwedel @MMarwedel
- Mikko Murto @mmurto
- Nishchith Shetty @inishchith
- Peter Gardfjäll @petergardfjall
- Philippe Ombredanne @pombredanne
- Rainer Bieniek @rbieniek
- Roshan Thomas @Thomshan
- Sadhana @s4-2
- Sarita Singh @itssingh
- Sebastian Schuberth @sschuberth
- Siddhant Khare @Siddhant-K-code
- Soim Kim @soimkim
- Thorsten Godau @tgodau
- Yunus Rahbar @yns88
What's Changed
- Collect InstalledWindowsProgram installed files #2615 by @JonoYang in #2623
- Improve release creation speed by @pombredanne in #2627
- Omnibus license updates July/Aug 21 by @pombredanne in #2626
- Add new flag in License Data Model definition by @akugarg in #2548
- Update Contributing: Development setup-instructions by @mapelpapel in #2631
- Referenced_filenames should be returned by API function by @akugarg in #2632
- Add emails and urls to HTML output by @sritasngh in #2539
- Avoid misinterpreting MIT license notice as Apache-2.0, issue #2635 by @petergardfjall in #2636
- Add final report for GSoC'21 by @akugarg in #2648
- Add "--no-check-version" CLI option to scancode by @yns88 in #2662
- Align tests for pubspec with latest code by @pombredanne in #2628
- Add new licenses by @akugarg in #2625
- Add podspec.json and podfile.lock parsers by @AyanSinhaMahapatra in #2638
- Add new license Anti-Capitalist Software License #2362 by @sritasngh in #2364
- Do not mistake path for copyright year by @pombredanne in #2666
- Follow license reference to another file by @akugarg in #2616
- Bump commoncode #2583 by @pombredanne in #2676
- Detect only mit license, not boost #2675 by @pombredanne in #2678
- Detect ocb correctly license #2670 by @pombredanne in #2677
- Improve license referenced_filenames handling #1364 by @pombredanne in #2681
- Add script to report rules by @AyanSinhaMahapatra in #2685
- Update Azure CI to not use ubuntu-16.04 images by @AyanSinhaMahapatra in #2688
- Introduce output data format versioning #2653 by @AyanSinhaMahapatra in #2682
- Release preparation for 2021.08 by @pombredanne in #2680
- Improve license detection accuracy by @pombredanne in #2667
- Improve copyright detection by @pombredanne in #2701
- Add CI for Docs and ABOUT files by @AyanSinhaMahapatra in #2695
- Adopt SPDX v2.2 and fix SPDX TV correctness by @pombredanne in #2704
- Improve Copyright detection by @pombredanne in #2707
- Prepare new release by @pombredanne in #2705
New Contributors
- @mapelpapel made their first contribution in #2631
- @yns88 made their first contribution in #2662
Full Changelog: v21.8.4...v30.0.0
v21.8.4
This is a minor bug fix release primarily for Windows installation.
There is no feature change.
Installation:
- Application installation on Windows works again. This fixes #2610
- We now build and test app bundles on all supported Python versions: 3.6 to 3.9
Thank you to @gunaztar for reporting the #2610 bug
Documentation:
- Documentation is updated to reference supported Python versions 3.6 to 3.9
v21.7.30
This is a minor release with several bug fixes, major performance improvements
and support for new and improved package formats
WARNING: the app installation does not work on Windows. Everything else is fine. Use the next version v21.8.1 and later. See #2610
Many thanks to all the contributors who made this possible and in particular:
- Abhigya Verma @abhi27-web
- Ayan Sinha Mahapatra @AyanSinhaMahapatra
- Dennis Clark @DennisClark
- Jono Yang @JonoYang
- Mayur Agarwal @mrmayurgithub
- Philippe Ombredanne @pombredanne
- Pierre Tardy @tardyp
Key changes:
Outputs:
- Add new YAML-formatted output. This is exactly the same data structure as for
the JSON output.
- Add new Debian machine readable copyright output.
- The CSV output "Resource" column has been renamed to "path".
- The SPDX output now has the mandatory DocumentNamespace attribute per SPDX specs #2344
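Scripts that read the CSV output may need to cope with both column names after the rename. A minimal sketch, assuming a consumer that only needs the file path of each row: the helper name `resource_paths`, the sample rows and the "note" column are invented for illustration, and only the "Resource" and "path" column names come from these notes.

```python
import csv
import io


def resource_paths(csv_text):
    """Return the path of each row, accepting the old or the new column name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    paths = []
    for row in reader:
        # Newer files use "path"; older files used "Resource".
        paths.append(row.get("path") or row.get("Resource"))
    return paths


# Invented sample data: the same rows with the old and the new header.
old_csv = "Resource,note\nsrc/a.c,first\nsrc/b.c,second\n"
new_csv = "path,note\nsrc/a.c,first\nsrc/b.c,second\n"

print(resource_paths(old_csv))
print(resource_paths(new_csv))
```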
Copyright detection:
- The copyright detection speed has been significantly improved with the tests
taking roughly 1/2 of the time to run. This is achieved mostly by replacing
NLTK with the minimal and simplified subset we need in a new library named
pygmars.
License detection:
- Add new licenses: now tracking 1763 licenses
- Add new license detection rules: now tracking 29475 license detection rules
- We have also improved license expression parsing and processing
Package detection:
- The Debian packages declared license detection has been significantly improved.
- The Alpine packages declared license detection has been significantly improved.
- There is new support for shell parsing and Alpine packages APKBUILD data collection.
- There is new support for various Windows packages detection using multiple
techniques including MSI, Windows registry and several more.
- There is new support for Distroless Debian-like installed packages.
- There is new support for Dart Pub package manifests.
v21.6.7 - major version with CLI API changes
This is a major new release with important security and bug fixes, as well as
significant improvement in license detection.
WARNING: the app installation does not work on Windows. Everything else is fine. Use the next version v21.8.1 and later. See #2610
Many thanks to all the contributors who made this possible and in particular:
- Akanksha Garg @akugarg
- Ayan Sinha Mahapatra @AyanSinhaMahapatra
- Dennis Clark @DennisClark
- François Granade @farialima
- Hanna Modica @hanna-modica
- Jelmer Vernooij @jelmer
- Jono Yang @JonoYang
- Konrad Weihmann @priv-kweihmann
- Philippe Ombredanne @pombredanne
- Pierre Tardy @tardyp
- Sarita Singh @itssingh
- Sebastian Thomas @sebathomas
- Steven Esser @MaJuRG
- Till Jaeger @LeChasseur
- Thomas Druez @tdruez
Breaking API changes:
- The configure scripts for Linux, macOS and Windows have been entirely
refactored and should be considered as new. These are now only native scripts
(.bat on Windows and .sh on POSIX) and the Python script etc/configure.py
has been removed. Use the PYTHON_EXECUTABLE environment variable to point to
an alternative, non-default Python executable; this works on all OSes.
Security updates:
- Update minimum versions and pinned version of thirdparty dependencies
to benefit from latest improvements and security fixes. This includes in
particular these issues:
- pkg:pypi/pygments: (low severity, limited impact) CVE-2021-20270, CVE-2021-27291
- pkg:pypi/lxml: (low severity, likely no impact) CVE-2021-28957
- pkg:pypi/nltk: (low severity, likely no impact) CVE-2019-14751
- pkg:pypi/jinja2: (low severity, likely no impact) CVE-2020-28493, CVE-2019-10906
- pkg:pypi/pycryptodome: (high severity) CVE-2018-15560 (dropped since no
longer used by pdfminer)
Outputs:
- The JSON output packages section has a new "extra_data" attribute, which is
a JSON object that can contain arbitrary data specific to a package type.
License detection:
- The SPDX license list has been updated to 3.13
- Add 42 new and update 45 existing licenses.
- Over 14,300 new and improved license detection rules have been added. A large
number of these (~13,400) are to avoid false positive detection.
Copyright detection:
- Improved speed and fixed some timeout issues. Fixed minor misc. bugs.
- Allow calling copyright detection from text lines to ease integration.
Package detection:
- A new "extra_data" dictionary is now part of the "packages" data in the
returned JSON. This is used to store arbitrary type-specific data that
cannot fit in the Package data structure.
- The Debian copyright files license detection has been reworked and
significantly improved.
- The PyPI package detection and manifest parsing has been reworked and
significantly improved.
- The detection of Windows executables and DLLs metadata has been enabled.
These metadata are returned as packages.
Other:
- Most third-party libraries have been updated to their newer versions. Some
dependency constraints have been relaxed to help some usage as a library.
- The on-commit CI tests now validate that we can install from PyPI without
problem.
- Fix several installation issues.
- Add new function to detect copyrights from lines.
v21.3.31 - major version with no breaking API changes
This is a major version with no breaking API changes.
Attention: the next version will bring up some significant API changes summarized in the CHANGELOG.
Security:
- Update dependency versions for security.
License scanning:
- Add 22 new and update 71 existing reference licenses
- Update licenses to include the SPDX license list 3.12
- Improve license detection accuracy with over 2300 new and improved license
detection rules
- Undeprecate the regexp license and deprecate the hs-regexp-orig license
- Improve license db initial load time with caching for faster scancode
start time
- Ensure that license short names are no more than 50 characters long
- Thank you to:
- Dennis Clark @DennisClark
- Chin-Yeung Li @chinyeungli
- Armijn Hemel @armijnhemel
- Sarita Singh @itssingh
- Akanksha Garg @akugarg
Copyright scanning:
- Detect SPDX-FileCopyrightText as defined by the FSFE Reuse project
- Fix bug when using the --filter-clues command line option
Thank you to Van Lindberg @VanL
- Allow calling copyright detection from text lines to ease integration
Thank you to Jelmer Vernooij @jelmer
Package scanning:
- Add support for installed RPMs detection internally (not wired to scans)
Thank you to Chin-Yeung Li @chinyeungli
- Improve handling of Debian copyright files with faster and more
accurate license detection
Thank you to Thomas Druez @tdruez
- Add new built-in support for installed_files report. Only available when
used as a library.
- Improve support for RPM, npm, Debian, build scripts (Bazel) and Go packages
Thank you to:
- Divyansh Sharma @Divyansh2512
- Jonothan Yang @JonoYang
- Steven Esser @MaJuRG
- Add new support to collect information from semi-structured Readme files
and related metadata files.
Thank you to:
Outputs:
- Add new Debian copyright-formatted output.
Thank you to Jelmer Vernooij @jelmer
- Fix bug in --include where directories were not skipped correctly
Thank you to Pierre Tardy @tardyp
Misc. and documentation improvements:
- Update the way tests assertions are made
Thank you to Aditya Viki @adityaviki
- Thank you to Aryan Kenchappagol @aryanxk02
The sources of third-party dependencies are available for download here and in:
v21.2.25 - minor release with new licenses and improved installation
This is a minor new release. Some of the highlights include:
Installation:
- Resolve reported installation issues on macOS, Windows and Linux
- Stop using extras for a default wheel installation
- Build new scancode-toolkit-mini package with limited dependencies for use
when packaging in distros and similar
- The new Dockerfile will create smaller images and containers
License scanning:
- Over 150 new and updated licenses
- Support the latest SPDX license list v3.11
- Improve license detection accuracy with over 740 new and improved license
detection rules
- Fix license cache handling issues
Misc.:
- Update extractcode, typecode and their native dependencies for better support
of latest versions of macOS.
Big thank you to all contributors!
v21.2.9
This is a major new release. Some of the highlights include:
Security:
- Update vulnerable LXML to version 4.6.2 to fix
https://nvd.nist.gov/vuln/detail/CVE-2020-27783
This was detected thanks to https://github.com/nexb/vulnerablecode
Operating system support:
- Drop support for Python 2 #295
- Drop support for 32 bits on Windows #335
- Add support for Python 64 bits on Windows 64 bits #335
- Add support for Python 3.6, 3.7, 3.8 and 3.9 on Linux, Windows and macOS.
These are now tested on Azure.
- Add deprecation message for native Windows support #2366
License scanning:
- Improve license detection accuracy with over 8400 new license detection rules
added or updated
- Remove the previously deprecated --license-diag option
- Include pre-built license index in release archives to speed up start #988
- Use SPDX LicenseRef-scancode namespace for all licenses keys not in SPDX
- Replace DEJACODE_LICENSE_URL with SCANCODE_LICENSEDB_URL at
https://scancode-licensedb.aboutcode.org #2165
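The LicenseRef-scancode namespace rule for license keys that are not in SPDX can be sketched as follows. This is an illustration rather than ScanCode's own code: the function name `spdx_identifier`, the small `spdx_keys` mapping and the sample non-SPDX key are assumptions, and the sketch assumes the license key is simply appended to the "LicenseRef-scancode-" prefix.

```python
def spdx_identifier(license_key, spdx_keys):
    """Return an SPDX identifier for a ScanCode license key.

    spdx_keys is a hypothetical mapping of ScanCode keys to SPDX ids for
    licenses that are on the SPDX license list.
    """
    if license_key in spdx_keys:
        return spdx_keys[license_key]
    # Keys that are not in SPDX get an identifier in the scancode namespace.
    return f"LicenseRef-scancode-{license_key}"


# Tiny illustrative mapping; the real SPDX license list is much larger.
spdx_keys = {"mit": "MIT", "apache-2.0": "Apache-2.0"}

print(spdx_identifier("mit", spdx_keys))
print(spdx_identifier("example-vendor-license", spdx_keys))
```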
Package scanning:
- Add detection of package-installed files
- Add analysis of system package installed databases for Debian, OpenWRT and
Alpine Linux packages
- Add support for Alpine Linux, Debian, OpenWRT.
Copyright scanning:
- Improve detection with minor grammar fixes
Misc.:
- Adopt a new calendar date-based versioning for scancode-toolkit version numbers
- Update thirdparty dependencies and built-in plugins
- Allow installation without extractcode and typecode native plugins. Instead
one can elect to install these or not to have a lighter footprint if needed.
- Update configuration and bootstrap scripts to support a new PyPI-like
repository at https://thirdparty.aboutcode.org/pypi/
- Create new release scripts to populate released archives with just the
required wheels of a given OS and Python version.
- Updated scancode.bat to handle % signs in the arguments #1876
Big thank you to all contributors and in particular:
- Abhishek Kumar
- Ayan Sinha Mahapatra
- Ayush Bhardwaj
- Chin Yeung Li
- Dennis Clark
- Duncan Howe
- John Horan
- Jono Yang
- Maximilian Huber
- Michael Herzog
- Philippe Ombredanne
- Sankha Das
- Scott Pakin
- Steven Esser
- Tushar Upadhyay