Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix error when changing force write mode #8146

Open
wants to merge 1 commit into
base: v5.0-release
Choose a base branch
from

Conversation

TreeHunter9
Copy link
Contributor

@TreeHunter9 TreeHunter9 commented Jun 3, 2024

When we try to change force write mode on database that has connections, we can get various types of errors or asserts from PIO_ functions (Super Server mode only). These errors can be reproduced using TPCC load and changing force write mode.
This patch fixes cases where someone tries to change force write mode when database has connections. This is done via RWLock for file descriptor, so we can be sure that some other thread won't close this descriptor while we're using it.
For Classic Server we can change force write mode only with no connections.

…thread won't close it while we're using it.

For Classic Server you can change forced write mode only with no connections.
@hvlad
Copy link
Member

hvlad commented Jun 6, 2024

For Classic Server we can change force write mode only with no connections.

Why ?

@@ -968,3 +968,4 @@ FB_IMPL_MSG(JRD, 965, ods_upgrade_err, -901, "HY", "000", "ODS upgrade failed wh
FB_IMPL_MSG(JRD, 966, bad_par_workers, -924, "HY", "000", "Wrong parallel workers value @1, valid range are from 1 to @2")
FB_IMPL_MSG(JRD, 967, idx_expr_not_found, -902, "42", "000", "Definition of index expression is not found for index @1")
FB_IMPL_MSG(JRD, 968, idx_cond_not_found, -902, "42", "000", "Definition of index condition is not found for index @1")
FB_IMPL_MSG(JRD, 969, forced_write_change_err, -902, "42", "000", "Cannot change forced write mode for classic server while it has connections")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error code is already used in master

@@ -751,13 +764,13 @@ bool PIO_read(thread_db* tdbb, jrd_file* file, BufferDesc* bdb, Ods::pag* page,
SINT64 bytes;
FB_UINT64 offset;

EngineCheckout cout(tdbb, FB_FUNCTION, EngineCheckout::UNNECESSARY);
ReadLockGuard readGuard(file->fil_desc_lock, FB_FUNCTION);

if (file->fil_desc == -1)
return unix_error("read", file, isc_io_read_err, status_vector);

Database* const dbb = tdbb->getDatabase();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be before EngineCheckout ? At least from logical POV

@@ -803,13 +816,13 @@ bool PIO_write(thread_db* tdbb, jrd_file* file, BufferDesc* bdb, Ods::pag* page,
SINT64 bytes;
FB_UINT64 offset;

EngineCheckout cout(tdbb, FB_FUNCTION, EngineCheckout::UNNECESSARY);
ReadLockGuard readGuard(file->fil_desc_lock, FB_FUNCTION);

if (file->fil_desc == -1)
return unix_error("write", file, isc_io_write_err, status_vector);

Database* const dbb = tdbb->getDatabase();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as in PIO_read above

@@ -858,6 +871,8 @@ static jrd_file* seek_file(jrd_file* file, BufferDesc* bdb, FB_UINT64* offset,
break;
}

ReadLockGuard readGuard(file->fil_desc_lock, FB_FUNCTION);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks as not necessary when seek_file() is called from PIO_read() and PIO_write() - the most often cases.
The exception is when caller is PIO_init_data(), so there should be some smart solution.

@TreeHunter9
Copy link
Contributor Author

For Classic Server we can change force write mode only with no connections.

Why ?

I tried to implement this via AST, looking at LCK_monitor as an example, but I couldn't figure it out where and when I should relock my lock after LCK_downgrade in AST. So I stop on restriction for classic server.
Maybe you can give me some advice on this issue?

@hvlad
Copy link
Member

hvlad commented Jun 7, 2024

For Classic Server we can change force write mode only with no connections.

Why ?

I tried to implement this via AST, looking at LCK_monitor as an example, but I couldn't figure it out where and when I should relock my lock after LCK_downgrade in AST. So I stop on restriction for classic server. Maybe you can give me some advice on this issue?

My question was - why different CS processes must run with the same FW mode ?
I.e. why force same mode here ? I highly doubt it can produce

various types of errors or asserts from PIO_ functions

@TreeHunter9
Copy link
Contributor Author

TreeHunter9 commented Jun 7, 2024

why different CS processes must run with the same FW mode ?

I think because it's more expected, if someone changes FW mode, they will expect all connections to change FW mode too. This works with Super, but not with Classic.
But this is how I see it, currently if we change FW mode on Classic, it will not change for already connected users, maybe I should keep this behavior.

I.e. why force same mode here ? I highly doubt it can produce

various types of errors or asserts from PIO_ functions

Yes, these errors can only be reproduced on Super, I should have mentioned that in the PR description, added it now.

@hvlad
Copy link
Member

hvlad commented Jun 7, 2024

The more I think about it the more I consider that solution is not fully correct and excessive, sorry.

There is no (direct) harm when different CS processes use different FW mode. It is not perfect, I agree, but we lived with it many years with no complains.

As for SS, where the real problem happens, I think it is not good to re-open database file with a (lot of) user connections - it could lead to a significant performance loss for a visible time, as OS flushes the file buffers when file is closing and it could take a time. Note, all user connections the require disk IO will wait for this process to finish.

So, instead, I offer to save new FW flag value to the header page (all architectures) and re-open file only if there is no other connections (required for SS only, but could be discussed). This is similar to how change of page buffers number works, btw.

@aafemt
Copy link
Contributor

aafemt commented Jun 7, 2024

So, instead, I offer to save new FW flag value to the header page (all architectures) and re-open file only if there is no other connections (required for SS only, but could be discussed).

I would expect that while file open mode changes are delayed in this case for SS, changes in page cache flush strategy would be immediate. Perhaps file re-opening also may be immediate if the change is done in sole connection.

@hvlad
Copy link
Member

hvlad commented Jun 7, 2024

So, instead, I offer to save new FW flag value to the header page (all architectures) and re-open file only if there is no other connections (required for SS only, but could be discussed).

I would expect that while file open mode changes are delayed in this case for SS, changes in page cache flush strategy would be immediate.

Could you show what you means by 'flush strategy', how it works now and how you would expect it to work ?

Perhaps file re-opening also may be immediate if the change is done in sole connection.

This is what I said above.

@aafemt
Copy link
Contributor

aafemt commented Jun 7, 2024

Could you show what you means by 'flush strategy', how it works now and how you would expect it to work ?

I expect that when I set FW OFF, dirty pages won't be flushed immediately on commit for all connections. Because that's why I do it.

@aafemt
Copy link
Contributor

aafemt commented Jun 7, 2024

My last comment is for SS. CS can do whatever it wants.

@hvlad
Copy link
Member

hvlad commented Jun 7, 2024

Could you show what you means by 'flush strategy', how it works now and how you would expect it to work ?

I expect that when I set FW OFF, dirty pages won't be flushed immediately on commit for all connections. Because that's why I do it.

When engine works with FW=ON (FIL_force_write flag is set) is doesn't call PIO_flush(), because it makes no sence - OS does it for us on every write().

When engine works with FW=OFF (FIL_force_write flag is not set) is does call PIO_flush() (as often as it allowed by MaxUnflushedWrites and MaxUnflushedWriteTime settings) - this is necessary, as OS doesn't do it.

Now, imagine database with FW=ON and admin changed FW to OFF. File mode is not changed and OS still writes every page to disk immediately. You offer to clear FIL_force_write flag and make engine to call PIO_flush() in addition to the OS immediate writes. In best case it will change nothing, in worst case - adds performance penalty. In any case your expectation about perf boost is wrong.

In opposite case, when FW is changed from OFF to ON, engine will never call PIO_flush() and OS also will not write data to disk immediately. This is very bad for on-disk consistency, obviously.

@aafemt
Copy link
Contributor

aafemt commented Jun 7, 2024

as often as it allowed by MaxUnflushedWrites and MaxUnflushedWriteTime settings

Which by default are -1, i.e. "never".

@hvlad
Copy link
Member

hvlad commented Jun 7, 2024

as often as it allowed by MaxUnflushedWrites and MaxUnflushedWriteTime settings

Which by default are -1, i.e. "never".

Not on Windows.
Either way, that doesn't make your expectation/proposal correct.

@ilya071294
Copy link
Contributor

As for SS, where the real problem happens, I think it is not good to re-open database file with a (lot of) user connections - it could lead to a significant performance loss for a visible time, as OS flushes the file buffers when file is closing and it could take a time. Note, all user connections the require disk IO will wait for this process to finish.

I guess the main point was when someone changes the FW mode, he usually wants to get the effect immediately. And there are 2 ways to do this: either force all user connections to close or reopen the file with relatively less time and performance penalty. The second one looks better from this point. Also, if we allow the actual FW mode change to be delayed then we cannot trust its status showed by gstat.

This is similar to how change of page buffers number works, btw.

Similarity is good but changing the FW mode is more critical, I believe.

@dyemanov
Copy link
Member

What I don't like in the "delayed option" is it being unpredictable. DBA turns FW on and expects that crashes will not lead to corruptions after that. The same for CS - if some connections will still be using FW=OFF after that and crash and corrupt the database this is not what DBA expects after changing FW to ON. But they have no idea when the change will be actually applied. So to ensure that they need also to shutdown the database to force everybody to reconnect. This is hardly pleasant under user load. This is the difference between BUFFERS and FW -- one is just about performance but the other is about reliability which is more important, IMO.

@hvlad
Copy link
Member

hvlad commented Jun 10, 2024

What I don't like in the "delayed option" is it being unpredictable. DBA turns FW on and expects that crashes will not lead to corruptions after that. The same for CS - if some connections will still be using FW=OFF after that and crash and corrupt the database this is not what DBA expects after changing FW to ON.

Same for SC.

But they have no idea when the change will be actually applied.

Document it. Issue warning.

So to ensure that they need also to shutdown the database to force everybody to reconnect. This is hardly pleasant under user load.

Explain, please - why reasonable DBA could think about changing FW under real load ever ?
What is real necessity for this and how often it happens ?

PS this is again flame about nothing. The original intent was to fix errors on SS when FW is changing under load.
But it was not about making any uneducated "want-to-be-dba" happy.

@TreeHunter9
Copy link
Contributor Author

So, we stop at the realization proposed by @hvlad?

@romansimakov
Copy link
Contributor

So to ensure that they need also to shutdown the database to force everybody to reconnect. This is hardly pleasant under user load.

Explain, please - why reasonable DBA could think about changing FW under real load ever ? What is real necessity for this and how often it happens ?

Why to change FW at all? Loading, indexing, or even just because someone wants. At least untill we don't require EX lock on DB for this.
So it should either predictbaly work or not.
Moreover, an ability to change FW on the fly looks not so bad option for me, and might be useful for some heavy regilar operations, in combination with nbackup as "global transaction".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants