-
Notifications
You must be signed in to change notification settings - Fork 137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New attributes for license tag #347
base: master
Are you sure you want to change the base?
New attributes for license tag #347
Conversation
Signed-off-by: Ralph Lange <ralph.lange@de.bosch.com>
Signed-off-by: Ralph Lange <ralph.lange@de.bosch.com>
Signed-off-by: Ralph Lange <ralph.lange@de.bosch.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea overall, but I do think the implementation needs some changes to make it more generally useful.
rep-0149.rst
Outdated
@@ -346,8 +346,8 @@ Example | |||
</description> | |||
<maintainer email="someone@example.com">Someone</maintainer> | |||
|
|||
<license>BSD</license> | |||
<license file="LICENSE">LGPL</license> | |||
<license file="LICENSE">BSD-3-Clause</license> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One suggestion I have here is to add a type
attribute to the license tag. The type
attribute would describe what kind of license tag is being used, either an SPDX
identifier, or a freeform
identifier. If no type
attribute is specified, we would assume freeform
.
Thus you could have:
<license>BSD</license>
- a common license tag today, which can be assumed to be a "freeform" license tag (and thus nothing can mechanically be determined about it)<license type='freeform'>BSD</license>
- which is the same as above but more explicit<license type='spdx'>BSD-3-Clause</license>
- which is very explicitly using the SPDX identifier, and thus can be mechanically verified
With this in place, we could put in linters (for instance) that would complain when the "type" is SPDX, but the actual text doesn't match a known SPDX identifier. We could also have linters that would complain if any packages in the workspace don't use the "spdx" type (though we'd have to do a lot of work to enable that by default).
Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it not be simpler to require the use of SPDX identifiers?
Other efforts to standardise licensing information (like REUSE) have gone that way and it makes parsing and validation much more straightforward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, yes, we would require SPDX identifiers.
The issue is backwards compatibility. There are thousands of ROS (1 & 2) packages out there in the wild, and almost none of the package.xml files in them follow SPDX. Explicitly putting a type field in will make it easier for tools (and humans) to determine whether the field is expected to be SPDX or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of introducing a type attribute. There are many ROS 1 packages that simply specify BSD
as license identifier, which is not enough information to infer the exact license.
which can be assumed to be a "freeform" license tag (and thus nothing can mechanically be determined about it)
I would be a bit more optimistic for the linter and in particular for generating a debian/copyright file:
<license file="path_to_license.txt">BSD</license>
: This is enough information to generate a Debian copyright file. The generator (and the linter) should check whetherpath_to_license.txt
really contains a BSD license text.<license>BSD</license>
(or<license source-files="*">BSD</license>
) with a fileLICENSE
in the root of the repo: Again, this is enough information to generate a Debian copyright file in my opinion.<license>Some SPDX full name or SPDX identifier</license>
: Again enough information to generate a Debian copyright file (and to validate it against a potential LICENSE file in the package or repo root).<license file="custom_license_text.txt">MyCustomLicenseIdentifier</license>
: Also this would allow the generation of a debian/copyright file according to https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/. ("If there are licenses present in the package without a standard short name, an arbitrary short name may be assigned for these licenses. These arbitrary names are only guaranteed to be unique within a single copyright file.")
Even the case that a debian/copyright file is generated where the license short name does not match the license text I do not consider dramatic: It simply repeats the inconsistent license names and texts from the source code and should be discovered by relevant tools applied to the debian/copyright file.
As a consequence, the type="spdx"
attribute is not really necessary but its use would introduce a safety layer that prevents that a typo in an SPDX license identifier leads to the interpretation as a custom license identifier (i.e., as case 4).
What we must prevent in any case is that the future debian/copyright file generator refines the license specification for which there is no evidence - which is the case for some ROS 1 packages that specify <license>BSD</license>
without any further license headers or LICENSE
file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, yes, we would require SPDX identifiers.
The issue is backwards compatibility. There are thousands of ROS (1 & 2) packages out there in the wild, and almost none of the package.xml files in them follow SPDX. Explicitly putting a type field in will make it easier for tools (and humans) to determine whether the field is expected to be SPDX or not.
but don't we have the same problem with the attributes and elements proposed here? All the packages in existence today would also not have those.
if the package format version would be increased, tools could just assume "v4 manifest->spdx".
Anything below that does not (have to) conform to that.
If the concern is "there may be licenses in use for which there are no SPDX identifiers", then SPDX has a standard approach for that in place IIRC.
Edit: absence of the type attribute could of course also be used, but it seems much cleaner to me to be able to say "format 4 manifests must use SPDX identifiers" than "if the license
tag doesn't have the type
attribute then it's OK to not use SPDX identifiers".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for the type
attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for completeness: The SPDX Specification also describes a solution for naming custom licenses (cf. https://spdx.dev/spdx-specification-21-web-version/#h.1v1yuxt), but I still prefer the explicit type
attribute to distinguish between licenses from the SPDX License List and custom ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type
attribution with default looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Introduced the type
attribute as proposed by @clalancette. (Sorry, I should have updated the PR before yesterday's discussion in the TSC meeting.)
Is the description of the new attribute sufficient or should I give explicit explanations of the two possible values freeform
and spdx
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@clalancette, are you fine if I mark the discussion on the type attribute as resolved, cf. my last question on the explanation of the attribute in the REP?
This pull request has been mentioned on ROS Discourse. There might be relevant details there: https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2022-04-21/25293/1 |
Signed-off-by: Ralph Lange <ralph.lange@de.bosch.com>
This pull request has been mentioned on ROS Discourse. There might be relevant details there: https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2023-01-19/29423/1 |
rep-0149.rst
Outdated
@@ -346,8 +346,8 @@ Example | |||
</description> | |||
<maintainer email="someone@example.com">Someone</maintainer> | |||
|
|||
<license>BSD</license> | |||
<license file="LICENSE">LGPL</license> | |||
<license file="LICENSE">BSD-3-Clause</license> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type
attribution with default looks good to me.
Signed-off-by: Ralph Lange <ralph.lange@de.bosch.com>
Signed-off-by: Ralph Lange <ralph.lange@de.bosch.com>
This pull request has been mentioned on ROS Discourse. There might be relevant details there: https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-4-20-2023/31087/1 |
<https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/>`_. | ||
Consequently, more general tags should be given first. | ||
|
||
``type="LICENSE-IDENTIFIER-TYPE"`` *(optional)* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to eliminate this type
attribute. I haven't seen it in any other package managers. We can still state that the license SHOULD be an SPDX identifier (and linter can throw a warning).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this is important to have as a way to suppress linter errors. In particular, I think that we should change it to be spdx is the default type. As that should be our recommendation. But if you have a non-spdx license you can set it to be "freeform" explicitly and then the linter will be suppressed and not complain about the string not matching spdx.
This will encourage spdx by it being the shortest path to usage. And if they aren't using an spdx license people can be given a linting warning. And the fix for the linter is to either fix their license declaration to be spdx compatible, or declare that it's freeform with the non-default type attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If spdx is the default that works 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also in favor of making spdx the standard. @clalancette understandably argued that this would lead to very many error messages for older packages.
Idea: We make freeform the default (as currently stated in this PR and approved by the TSC yesterday) but make a prominent info message in the linter tooling if the type-attribute is not specified explicitly and the license identifier is not from the SPDX list. Furthermore, we raise awareness for the license documentation topic by a Discourse post and a ROSCon talk. Then, in about a year, we can consider changing the default of the type attribute to SPDX (by a new PR to this REP) to force the maintainers of the remaining packages to resolve/clarify the license documentation. What do you think, @amacneil, @clalancette ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for doing a followup/tick tock process to make it the default with strong linting/warning. But lets close this out first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the way I would do this is the following:
- Make
freeform
the default for now. - Raise awareness with the Discourse post and ROSCon talk.
- Update the core packages (everything in https://github.com/ros2/ros2/blob/rolling/ros2.repos) to use SPDX.
- Add in a linter warning in
ament_copyright
for packages that don't specify the type and the identifier is not in SPDX. - Later on, consider changing the default to SPDX.
This pull request has been mentioned on ROS Discourse. There might be relevant details there: https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2023-05-18/31587/1 |
Signed-off-by: Ralph Lange <ralph.lange@de.bosch.com>
Signed-off-by: Ralph Lange <ralph.lange@de.bosch.com>
should be a very rare case - have to be replaced by a ``?`` in the filename | ||
pattern as described at the end of Section 6.9 of the `Machine-readable | ||
debian/copyright file specification V1.0 | ||
<https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/>`_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this required? Since you describe above that multiple space-separated paths are not allowed, I don't understand why we cannot support spaces in paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's how the Machine-readable debian/copyright file specification V1.0 states it. We could do differently, but then we would make the process how the package.xml is translated to the copyright file less transparent. And I consider spaces in source file/folder names an absolute exception.
@ct2034 has built a linter for the new package.xml format, which will of course include a check for space characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we would simply replace space characters by ?
during the generation of the copyright file, this could even change the set of matched files. This must be prevented. Therefore I would force those developers, who want to use space characters in the filenames/paths, to think about it themselves.
This pull request has been mentioned on ROS Discourse. There might be relevant details there: https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2023-06-15/32038/1 |
This pull request has been mentioned on ROS Discourse. There might be relevant details there: https://discourse.ros.org/t/changes-in-the-package-xml-about-licensing/32118/1 |
This pull request has been mentioned on ROS Discourse. There might be relevant details there: https://discourse.ros.org/t/changes-in-the-package-xml-about-licensing/32118/3 |
I think increasing schema version is inevitable. Current catkin with format 3 cannot read the files with added |
Signed-off-by: Ralph Lange <ralph.lange@de.bosch.com>
Signed-off-by: Ralph Lange <ralph.lange@de.bosch.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy with these changes. They match what was accepted in the TSC Meeting 2023-06-15. What are the next steps to get this merged?
In short: I think a big mistake was made here, and I was a major part of that mistake. During the discussion for this REP, I pushed hard for making these changes without bumping the package.xml version number. My thinking at the time is that avoiding this bump would save us the (substantial) work across the ROS tooling to make the change. However, I have been convinced in the meantime that it is a bad idea to add in new features like this, even in a backwards-compatible way. It's just too unclear whether a compliant implementation is using the "new" standard, or the "old" one. Thus, I think this needs to be revisited with a thought to bumping the package.xml version format. I'm very sorry for my big mistake here, but I do think that we need to make this correct before implementing and merging this. |
Hi @clalancette, no problem. Please let us know what has to be done for bumping the package.xml version number. Changing this PR to create a new REP document instead of updating REP-149, right? Plus a decision request to the ROS PMC? |
... as basis for automatic creation of copyright files in binary packages according to https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/.
Questions to be discussed in this PR in particular:
copyright
attribute really required? If vendor packages would include the third-party source code directly, the copyright attribute would not be required.