Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPDX reporter: Stop creating dangling relationships to excluded packages #7488

Merged
merged 3 commits into from
Sep 26, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 6 additions & 4 deletions model/src/main/kotlin/OrtResult.kt
Original file line number Diff line number Diff line change
Expand Up @@ -188,17 +188,19 @@ data class OrtResult(
/**
* Return the dependencies of the given [id] (which can refer to a [Project] or a [Package]), up to and including a
* depth of [maxLevel] where counting starts at 0 (for the [Project] or [Package] itself) and 1 are direct
* dependencies etc. A value below 0 means to not limit the depth.
* dependencies etc. A value below 0 means to not limit the depth. If [omitExcluded] is set to true, identifiers of
* excluded projects / packages are omitted from the result.
*/
fun getDependencies(id: Identifier, maxLevel: Int = -1): Set<Identifier> {
fun getDependencies(id: Identifier, maxLevel: Int = -1, omitExcluded: Boolean = false): Set<Identifier> {
val dependencies = mutableSetOf<Identifier>()
val matcher = DependencyNavigator.MATCH_ALL.takeUnless { omitExcluded } ?: { !isExcluded(it.id) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about introducing a DependencyNavigator.MATCH_NON_EXCLUDED for { !isExcluded(it.id) }?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not possible, because the DependencyNavigator does not know about the excluded state.


getProjects().forEach { project ->
if (project.id == id) {
dependencies += dependencyNavigator.projectDependencies(project, maxLevel)
dependencies += dependencyNavigator.projectDependencies(project, maxLevel, matcher)
}

dependencies += dependencyNavigator.packageDependencies(project, id, maxLevel)
dependencies += dependencyNavigator.packageDependencies(project, id, maxLevel, matcher)
}

return dependencies
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -240,10 +240,6 @@
"spdxElementId" : "SPDXRef-Package-Maven-seventh-package-group-seventh-package-0.0.1",
"relationshipType" : "GENERATED_FROM",
"relatedSpdxElement" : "SPDXRef-Package-Maven-seventh-package-group-seventh-package-0.0.1-source-artifact"
}, {
"spdxElementId" : "SPDXRef-Project-Maven-first-project-group-first-project-name-0.0.1",
"relationshipType" : "DEPENDS_ON",
"relatedSpdxElement" : "SPDXRef-Package-Maven-fifth-package-group-fifth-package-0.0.1"
}, {
"spdxElementId" : "SPDXRef-Project-Maven-first-project-group-first-project-name-0.0.1",
"relationshipType" : "DEPENDS_ON",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -241,9 +241,6 @@ relationships:
- spdxElementId: "SPDXRef-Package-Maven-seventh-package-group-seventh-package-0.0.1"
relationshipType: "GENERATED_FROM"
relatedSpdxElement: "SPDXRef-Package-Maven-seventh-package-group-seventh-package-0.0.1-source-artifact"
- spdxElementId: "SPDXRef-Project-Maven-first-project-group-first-project-name-0.0.1"
relationshipType: "DEPENDS_ON"
relatedSpdxElement: "SPDXRef-Package-Maven-fifth-package-group-fifth-package-0.0.1"
- spdxElementId: "SPDXRef-Project-Maven-first-project-group-first-project-name-0.0.1"
relationshipType: "DEPENDS_ON"
relatedSpdxElement: "SPDXRef-Package-Maven-first-package-group-first-package-0.0.1"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,11 @@ internal object SpdxDocumentModelMapper : Logging {
ortResult
)

ortResult.getDependencies(project.id, 1).mapTo(relationships) { dependency ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering, can't the same bug also appear around line 94, if a non-excluded package depends on an excluded package?

In general, I wonder whether a more general solution to the problem would be to also teach getDependencies() an omitExcluded parameter. If we had that, the fix would be a matter of just setting omitExcluded = true in current line 73 and 94, right?

Copy link
Member Author

@fviernau fviernau Sep 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering, can't the same bug also appear around line 94, if a non-excluded package depends on an excluded package?

I believe so as well, good catch. If I'm not mistaken, in that case it is way more of an edge case compared to above, because IIUC it happens only if a package X appears at least twice in dependency tree (one time included, one time excluded), whereas the two subtrees rooted at X differ.

In general, I wonder whether a more general solution to the problem would be to also teach getDependencies() an omitExcluded parameter.

I've decided against that, because it seemed to me that the function as-is is already too use case specific (and in particular encouraging mis-use) for the location it resides. I'm saying that, because to me it's not obvious that it merges the subtrees for identical identifiers. Anyhow, maybe the omitExcluded wouldn't make it that much worse. What do you think?

If we had that, the fix would be a matter of just setting omitExcluded = true in current line 73 and 94, right?

No, because that would not solve point #2 mentioned in the commit message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because that would not solve point #2 mentioned in the commit message.

I anyway have another question regard that point. You write (typo in "depdendencies" BTW):

2. Dependencies which are direct depdendencies of a sub-project, but not
   of any root project are not considered a first level dependency. Such
   dependencies may not be linked into the dependency tree of resulting
   SPDX document at all.

I'm unsure about what "may not be linked ... at all" exactly means. So if I had (all non-excluded) a project structure like

root-project
|
+- sub-project
   |
   +- package (dependency of "sub-project" only)

are you saying there should be no relationship between "sub-project" and "package" being recorded in the SPDX document?

Copy link
Member Author

@fviernau fviernau Sep 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

root-project-1
|
+- sub-project-1
   |
   +- package 1
root-project-2
|
+- sub-project-2
   |
   +- package 2  

If we have above, and consider the original (current) algorithm implemented by you, I guess the intention was to not mention subproject-1 and subproject-2 in the SPDX report, but link the dependencies directly to root-project-1 and root-project-2.

To my understanding the implementation did not match this mentioned intention, which I wanted to fix with this new implementation.

are you saying there should be no relationship between "sub-project" and "package" being recorded in the SPDX document?

Given the above, I'd say no, because "package" should be a direct child of the "root" project, and nested submodules should not be modeled in the output.

I'm unsure about what "may not be linked ... at all"

If I'm not mistaken in my above tree, the current algorithm would produce a SPDX document where there is no path from any root to "package 2" and / or to "package 1". (Due to the limitation of depth==1)

Copy link
Member

@sschuberth sschuberth Sep 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the intention was to not mention subproject-1 and subproject-2 in the SPDX report, but link the dependencies directly to root-project-1 and root-project-2.

Actually no, that was not my intention. The intention was to create all of the following relationships in this case:

  • root-project-1 depends on sub-project-1
  • sub-project-1 depends on package 1

(And the same for the number 2 projects / packages.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, only listing the top-level projects is better than exposing the whole sub-module structure.

Well, that differs from what @tsteenbe and others have expressed here then. The intent of my b471544 was to partly address that linked issue.

And we have the same ask to capture the full dependency tree for CycloneDX, BTW.

Would that be ok with you?

Not really, to be honest, as that takes us a step back again from solving #4099. However, I completely agree that I've introduced a bug in b471544 by disregarding excludes. That should certainly be fixed. And if I have introduced another bug related to sub-projects, that should be fixed, too.

Copy link
Member Author

@fviernau fviernau Sep 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, that differs from what @tsteenbe and others have expressed here then. The intent of my b471544 was to partly address that linked issue.

I understand Thomas' comment differently, however. It's not mentioning sub-modules explicitly, so this is IMO speculation. I believe we should discuss sub-module case explicitly / separately.

And we have the #3906, BTW.

Ok.

Not really, to be honest, as that takes us a step back again

Why is it a step back? My PR does not reduce information reported at all. But fixes this bug:

However, I completely agree that I've introduced a bug in b471544 by disregarding excludes.

So, can we use this PR as-is as a bug fix, and then moving forward discuss how exactly we want sub-module structure be represented in ORT community and then in a next iteration implement that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So even if each entity (project or package) only relates to its direct dependencies, in the end we get a full dependency tree in total.

I'm not sure if this bug was clear to you already, but this sentence illustrates it pretty well. The idea does not work in case there are sub-projects, because then one does not iterate over the sub-projects dependencies at all in some cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

root-project-1
|
+- sub-project-1
   |
   +- package 1
root-project-2
|
+- sub-project-2
   |
   +- package 2  

[...]

Given the above, I'd say no, because "package" should be a direct child of the "root" project, and nested submodules should not be modeled in the output.

To me it would be misleading to report "package-1" as a direct dependency of "root-project-1" here, especially if both "root-project-1" and "sub-project-1" are published. I do not yet understand the issue with the current approach to build the tree, especially:

The idea does not work in case there are sub-projects, because then one does not iterate over the sub-projects dependencies at all in some cases.

Under which circumstances would that be the case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sschuberth @mnonnenmacher -

I've adjusted the implementation to what I understood We agreed upon in todays call.
@sschuberth I've incorporated your proposal from the PR comments to teach getDependencies() to omit excluded IDs.

ortResult.getDependencies(
id = project.id,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message nit: It's not "a type parameter" but "typed parameters" as also id is named now.

maxLevel = 1,
omitExcluded = true
).mapTo(relationships) { dependency ->
SpdxRelationship(
spdxElementId = spdxProjectPackage.spdxId,
relationshipType = SpdxRelationship.Type.DEPENDS_ON,
Expand All @@ -91,7 +95,11 @@ internal object SpdxDocumentModelMapper : Logging {
ortResult
)

ortResult.getDependencies(pkg.id, 1).mapTo(relationships) { dependency ->
ortResult.getDependencies(
id = pkg.id,
maxLevel = 1,
omitExcluded = true
).mapTo(relationships) { dependency ->
SpdxRelationship(
spdxElementId = binaryPackage.spdxId,
relationshipType = SpdxRelationship.Type.DEPENDS_ON,
Expand Down