Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance of ActiveRow::related() #312

Open
MichaelPavlista opened this issue Aug 28, 2024 · 0 comments
Open

Performance of ActiveRow::related() #312

MichaelPavlista opened this issue Aug 28, 2024 · 0 comments

Comments

@MichaelPavlista
Copy link

MichaelPavlista commented Aug 28, 2024

During application performance analysis, I discovered that using the ActiveRow::related() method significantly
slows down the application's execution time. The slowdown increases exponentially with the number of calls to this method.

Problem Description

The issue can be replicated with the following simple example,
which compares the same logic using ActiveRow::related() versus manually constructed SQL.

Select batch of main and related rows using ActiveRow:

/** @var Explorer $databaseExplorer */
$databaseExplorer = $container->getByType(type: Explorer::class);

// Load batch rows from database.
$companyIdToRow = $databaseExplorer->table(table: 'company')
    ->select('*')
    ->order('id DESC')
    ->limit(BATCH_SIZE)
    ->fetchPairs('id');

/** @var ActiveRow $row */
foreach ($companyIdToRow as $row)
{
    // Load related rows.
    $relatedRows = $row->related('company_member')->fetchAll();
}

Select batch of main and related rows using manually constructed SQL:

// Select main rows.
$companyIdToRow = $databaseExplorer
    ->query('SELECT * FROM `company` ORDER BY `id` DESC LIMIT ?', BATCH_SIZE)
    ->fetchPairs('id');

// Create main rows ID list.
$companyIdList = array_keys($companyIdToRow);

// Select rows from the related table.
$companyIdToMemberRowList = $databaseExplorer
    ->query('SELECT `id`, `company_id` FROM `company_member` WHERE `company_id` IN (?)', $companyIdList)
    ->fetchAssoc('company_id[]->');

foreach ($companyIdToRow as $row)
{
    // Load related rows.
    $relatedRows = $companyIdToMemberRowList[$row['id']] ?? [];
}

Comparison of the variants

Number of rows (=BATCH_SIZE) Execution Time [related] Execution Time [manual SQL] How much faster is manual SQL?
1 000 206 ms 130 ms 1.6x
2 500 480 ms 142 ms 3.4x
5 000 1 330 ms 160 ms 8.3x
10 000 4 560 ms 200 ms 22.8x
25 000 26 800 ms 328 ms 81.7x
50 000 134 120 ms 550 ms 243.8x
  • The comparison was performed on PHP 8.3 without the Xdebug extension, using MariaDB 10.5.
  • Both services ran on a local machine in Docker.
  • In both cases, the same logic is executed, and the same two SQL queries are sent to the database. The performance issue lies within PHP.
  • In the custom SQL, the columns id, company_id are explicitly defined to match the query sent by ActiveRow::related() exactly.
  • Enabling or disabling the database's debug mode makes almost no difference.
  • A dump of the test data: database-init.sql.gz.

General Notes

  • In real applications, the delay caused by ActiveRow::related() is usually in the hundreds of milliseconds.
    This is because:
    • Queries involving a large number of rows are rarely used.
    • Common queries often target hundreds of rows, but multiple queries per page accumulate the delay.

Affected Versions

The issue is present in the latest version nette/database@3.2.2 and also exists in older versions.

Problematic Area

An analysis of the problematic area using Xdebug suggests that the majority of the script's runtime is spent repeatedly calling the SqlBuilder::getConditionHash method.

Expected Behavior

Using ActiveRow::related() should not significantly increase the script's runtime, at least in cases where no modifications are made to the selection before fetching.

# this code should always be fast
$relatedRows = $row->related('company_member')->fetchAll();

Docker Demo

I have prepared a demo of the issue (including a database), which can be easily run in Docker.

Setup

Test URLs:

You can change settings, such as the number of rows (BATCH_SIZE), in bootstrap.local.php.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant