Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug - Legacy Mend Parser - Locations - type character varying(4000) error #11340

Open
1 of 3 tasks
testaccount90009 opened this issue Nov 27, 2024 · 1 comment
Open
1 of 3 tasks
Labels

Comments

@testaccount90009
Copy link
Contributor

testaccount90009 commented Nov 27, 2024

Slack - https://owasp.slack.com/archives/C2P5BA8MN/p1732583494245009

It's possible when getting Mend Legacy SCA JSON Project vulnerabilities from 1.4 API that Locations will contain many values and error out the Locations 4000 char limit based on the current join. I will make a slight change to the Mend parser that will hopefully reduce the likelihood of this happening.

If you check Mend unit test and scan files, you can see the difference between legacy and platform json schemas. Legacy consists of all Vulnerabilities with a Library as an attribute. Platform consists of all Libraries with vulnerabilities as an attribute.

This means that for Platform the Locations will not be the same style as Legacy Locations.

To confirm, I do not encounter the 4000 char limit error for Locations on Platform schema style imports - only the Legacy style.

Steps to reproduce
Steps to reproduce the behavior:

  1. Download legacy* (non-platform) Mend SCA scan containing many Locations
  2. Attempt to import scan file
  3. Error type character varying(4000).

Expected behavior
Upload the report

Deployment method (select with an X)

  • Docker Compose
  • Kubernetes
  • GoDojo

Environment information

  • Latest DD version, Ubuntu

I will raise a PR to make the change that should solve this fix, but I welcome feedback from the community on the proposed change of 'path' to be replaced with 'dependencyFile' and this should impact only the Legacy parser.

However, instead of 'path' referencing the offending .jar file, it will instead reference the 'dependencyFile' which is typically build.gradle, as an example. When taking into consideration 'Vulnerabilities' in SCA and Direct vs Transitive inclusions -- this can be tricky. Technically the build.gradle is where the issue exists, but the vulnerability is in the offending .jar file.

@mtesauro
Copy link
Contributor

mtesauro commented Nov 28, 2024

@testaccount90009 Based on my reading of ☝️ especially the last paragraph, it seems like choosing to use dependencyFile instead of path is likely to produce less accurate data in DefectDojo:

Technically the build.gradle is where the issue exists, but the vulnerability is in the offending .jar file.

Having read the above and the Slack posts, I think it would be better to reduce the path to a size that fits the DB's 4000 varchar limit plus put the full details into something that can accept more data and doesn't tend to be used for deduplication.

For example, you could remove characters off the front of the path until the path is within the DB limit - or you can get fancy and take off a few more characters so you can add ... to the resulting string. Something like the pseudo code:

If the length of path is > 4000

  • Trim all but the last 3997 characters
  • append '...' to the resulting string from above
  • store this in location for the finding

Additionally add the unmodified value of path to Finding's steps_to_reproduce which is a text type in the DB so the 4k varchar isn't an issue. You can do something nice like the below in steps_to_reproduce:

The vulnerability was found at this location:
/some/really/long/path/that/is/crazy/long/and/more/than/4000/characters
The above was located in the following dependency file:
/path/to/build.gradle

This preserves data from the parsed file while:

  • not making the DB field (location in this case) bigger
  • not adding potentially a long string to something like description which is used somewhat frequently in dedup hashcodes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants