Skip to content

CoinFabrik/scout-substrate-dataset

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

68 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Scout Substrate Dataset: Audited Substrate Projects

https://img.shields.io/badge/license-MIT-green

Scout Entering the Cave of Audits

Welcome to the Scout Substrate Dataset, a collection of thoroughly analyzed audited Substrate pallets, runtime, and node code. This repository serves as a knowledge base for Substrate developers, auditors, and security researchers aiming to identify common Substrate vulnerabilities and improve the security of their projects.

Our goal is to provide a reference point for the community, compiling key security issues found in Substrate projects, facilitating better security practices, and serving as a resource for improving vulnerability detection tools like Scout

Dataset Breakdown

We have structured the Scout Substrate Dataset into the following folders:

  • /audited-projects/:
    • Contains directories for each audited project, labeled by [audited-project-id]-[project-name].
    • Each directory contains:
      • [project-audit].pdf: The original audit report in PDF format.
      • findings-[audited-project-id]-[project-name].json: JSON file containing the project-specific findings.

Generate Dataset

To generate the dataset, run make dataset. It will generate the /dataset/ directory with the following files:

  • findings.json: A comprehensive list of all findings across the audited projects.
  • findings-linear.json: The findings.json file further processed to be imported into Hugging Face.

Accessing Audited Project Code

For access to the complete codebase associated with this dataset, including tagged archives for each audit finding and remediation, visit the Scout Substrate Dataset Code repository. Download bundles by tag or commit hash, enabling a full historical view of each project version.

Audited Projects

This dataset currently contains the following audited Substrate projects:

Audited Project ID Project Name Auditor
1 Parallel Trail of Bits
2 Parallel SlowMist
3 Ava Protocol Slow Mist
5 Nodle Halborn
6 Reef Chain Halborn
7 Manta Network Veridise
8 Manta Network Halborn
9 Manta Network Veridise
10 Astar Security Research Labs
11 Astar Zellic

More projects will be added as new audits are analyzed.

Substrate Issue Classes

As we analyzed various audit reports and their respective findings, we observed a range of issue classes applied by auditing companies, each recorded under the field vulnerability_class_audit in the dataset. Despite some variation in classification, certain categories tend to recur.

To provide a common classification across the reviewed audits, we provide a vulnerability_class_scout field for each finding. Below, we list several issue classes that we find applicable to Substrate pallets, runtime, and node code, and that we applied for this field.

  • Dependency: Issues related to using vulnerable or outdated dependencies in the project. These vulnerabilities could introduce potential risks due to unmaintained or insecure libraries.
    Example Projects with Findings in this Class: [1-Parallel], [2-Parallel], [5-Nodle]

  • Arithmetic: Arithmetic-related vulnerabilities, such as unchecked arithmetic operations, saturating calculations, and overflows. These issues can result in unexpected behaviors or crashes due to incorrect handling of mathematical operations.
    Example Projects with Findings in this Class: [3-AvaProtocol], [6-ReefChain], [7-MantaNetwork]

  • Weight Management: Incorrect or missing weight calculations, including static versus dynamic weight handling, or failure to account for changes in workload. This can lead to DoS vulnerabilities as resource costs are underestimated.
    Example Projects with Findings in this Class: [7-MantaNetwork], [10-Astar], [4-Pendulum]

  • Error Handling and Validation: Inadequate error handling and validation, such as improper use of DispatchError, missing error checks, and insufficient input validation. These issues can cause unexpected program flows and unauthorized access.
    Example Projects with Findings in this Class: [5-Nodle], [4-Pendulum], [1-Parallel]

  • Denial of Service (DoS) and Spamming: Vulnerabilities that could lead to potential denial of service or spamming, often tied to extrinsic calls or weights.
    Example Projects with Findings in this Class: [4-Pendulum], [10-Astar]

  • Business Logic: Issues in project-specific rules or logic, leading to exploitable or unintended behaviors.
    Example Projects with Findings in this Class: [7-MantaNetwork], [5-Nodle]

  • Code Quality: Issues impacting readability, maintainability, or structure, increasing risk of errors.
    Example Projects with Findings in this Class: [7-MantaNetwork], [5-Nodle]

  • TBD: Findings or issues with pending classification.

We understand that this classification depends largely on expert criteria and that a finding could potentially be assigned to multiple classes simultaneously. We plan to further refine this classification as we add more audited projects to the dataset.

About CoinFabrik

We - CoinFabrik - are a research and development company specialized in Web3, with a strong background in cybersecurity. Founded in 2014, we have worked on over 500 blockchain-related projects, EVM-based and also for Solana, Algorand, and Polkadot. Beyond development, we offer security audits through a dedicated in-house team of senior cybersecurity professionals, currently working on code in Substrate, Solidity, Clarity, Rust, and TEAL.

Our team has an academic background in computer science and mathematics, with work experience focused on cybersecurity and software development, including academic publications, patents turned into products, and conference presentations. Furthermore, we have an ongoing collaboration on knowledge transfer and open-source projects with the University of Buenos Aires.

As proud members, and with the support of the Polkadot Assurance Legion (PAL), we are pleased to contribute this audited code dataset to the Substrate community, aiming to enhance vulnerability detection and promote security best practices within the Polkadot ecosystem.

License

The Scout Substrate Dataset is licensed and distributed under the MIT license. Contact us if you're looking for an exception to the terms.