Coordination of server use #377

IlkaCu · 2021-08-09T08:04:00Z

This issue is meant to coordinate the use of the egondata user/instance on our server in FL.
We already agreed on starting a clean run of the dev branch every Friday. This will (most likely) make some debugging necessary on Mondays. To avoid conflicts while debugging, please comment in this issue before you start and briefly note which datasets or parts of the workflow you will be working on.

ClaraBuettner · 2021-08-09T08:54:29Z

The run started on 6th of August is not finished yet. The task industry.temporal.insert-osm-ind-load is still running.
Two tasks failed:

heat_etrago.supply : This fails because some subst_ids of the MV grids are not in the etrago-bus table. I assume this happens because the MV grids are already versioned and were skipped, but osmTGmod was running again. So even if some id changed in osmTGmod, the subst_ids of the MV grids are not updated. I will check this but will wait until industry.temporal.insert-osm-ind-load is finished because it depends on the MV grids.
power_plants.wind_farms.insert : This is the same problem described in geom-error in generate_wind_farms #354. Since I cannot reproduce this issue in other instances, I will try to debug this in the clean-run instance.

Both problems were caused by subst_ids which were in the mv_grid table but, due to the new run of osmTGmod, no longer part of the etrago buses. When I enforced a re-run of the mv-grid-dataset, the tasks finished successfully.
The migration of osmTGmod to datasets solves this problem. Since this will be merged to dev soon, I will not look for another intermediate solution.
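
A minimal sketch of how such a mismatch could be spotted before dependent tasks start, assuming the subst_ids of both tables have already been loaded into plain Python collections. The function name and the example ids are hypothetical; the real table and column names are not given in this thread.

```python
def find_orphaned_subst_ids(mv_grid_subst_ids, etrago_bus_ids):
    """Return the subst_ids present in the MV grid data that have no
    matching bus id, sorted for stable output."""
    return sorted(set(mv_grid_subst_ids) - set(etrago_bus_ids))


# Hypothetical example data, not taken from the real tables.
mv_grid_ids = [1, 2, 3, 4]
bus_ids = [1, 2, 4, 5]
print(find_orphaned_subst_ids(mv_grid_ids, bus_ids))  # prints [3]
```

A non-empty result would hint that the MV grid dataset is stale relative to the latest osmTGmod run.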

nesnoj · 2021-08-19T15:45:04Z

the new branch for the Fridays' run @gnn was talking about does not exist yet, right?

nesnoj · 2021-08-19T15:54:06Z

@nailend and I would like to have the branch features/#256-hh-load-area-profile-generator tested prior to merging to dev.
@gnn could you please merge it into the Friday-branch before you start? Thx!

ClaraBuettner · 2021-08-20T06:25:34Z

> the new branch for the Fridays' run @gnn was talking about does not exist yet, right?

I think he was talking about this branch: https://github.com/openego/eGon-data/tree/continuous-integration/run-everything-over-the-weekend

nesnoj · 2021-08-20T07:32:47Z

> I think he was talking about this branch: https://github.com/openego/eGon-data/tree/continuous-integration/run-everything-over-the-weekend

Thank you, didn't copy the name during the webco and the docs have not been updated yet.
I merged my branch into continuous-integration/run-everything-over-the-weekend
Ready for takeoff!

nesnoj · 2021-08-23T07:12:34Z

Apparently, there has been no run on Friday?!

nesnoj · 2021-08-23T11:28:45Z

> Apparently, there has been no run on Friday?!

May I start it today? @gnn

AmeliaNadal · 2021-08-23T11:29:52Z

I would find it great yes!

IlkaCu · 2021-08-23T11:59:41Z

gnn told me that he started a clean-run on Friday. But I didn't check the results yet.

nesnoj · 2021-08-23T12:13:58Z

> gnn told me that he started a clean-run on Friday. But I didn't check the results yet.

Ah, I'm just seeing he didn't use the image we used before but created a new one. But I dunno which HTTP port it's listening on.. :(
@gnn ?

nesnoj · 2021-08-23T14:29:54Z

Got it, it's port 9001 (do you know how to reconfigure the tunnel @AmeliaNadal ?).

Apparently, it crashed quite early at tasks
osmtgmod.import-osm-data and
electricity_demand.temporal.insert-cts-load 😞.

It's very likely that the first one is caused by insufficient disk space as there are only 140G free (after cleaning up temp files) and that might not be sufficient for the temp tables created by osmTGmod. So I propose to delete my old setup we used before and re-run the new one. Shall I do so? Any objections @IlkaCu @AmeliaNadal ?
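
A rough sketch of a free-space check that could be run before starting a clean run, using only the Python standard library. The helper names are made up for illustration, and the path to check is an assumption: it would have to point at the file system that actually holds the database's temporary files.

```python
import shutil

GIB = 1024 ** 3


def free_gib(path="."):
    """Return the free space, in GiB, of the file system containing path."""
    return shutil.disk_usage(path).free / GIB


def has_enough_space(path, required_gib):
    """True if at least required_gib GiB are free on path's file system."""
    return free_gib(path) >= required_gib


if __name__ == "__main__":
    # 140 is the free space mentioned above, used only as an illustrative
    # threshold; how much space osmTGmod really needs is not known here.
    if not has_enough_space(".", 140):
        print(f"Only {free_gib('.'):.0f} GiB free, consider cleaning up first.")
```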

AmeliaNadal · 2021-08-23T14:54:18Z

I could access the results (thanks for asking @nesnoj!) and my tasks haven't run. So I have no objection that you re-run the workflow ;)

nesnoj · 2021-08-23T15:26:33Z

Done.

Update: osmtgmod.import-osm-data has been run successfully :D

nesnoj · 2021-08-25T14:46:01Z

I'm done on the server and happy, go ahead @IlkaCu

nesnoj · 2021-08-27T09:05:47Z

@IlkaCu and I decided to restart the weekend run tonight. I merged dev into continuous-integration/run-everything-over-the-weekend and I'm now done with all my stuff ... please go ahead @IlkaCu

IlkaCu · 2021-08-27T11:33:46Z

I merged one bug fix into continuous-integration/run-everything-over-the-weekend

IlkaCu · 2021-08-27T14:11:24Z

I merged another bug fix: ee038e4
@nesnoj: I hope this works now.

nesnoj · 2021-08-27T14:20:40Z

> I merged another bug fix: ee038e4
> @nesnoj: I hope this works now.

Yepp, looks good 👍
Run started 🏃

IlkaCu · 2021-08-27T14:23:51Z

Great, thank you.

IlkaCu · 2021-08-30T07:34:07Z

If I see it right, the server run in normal mode has been successful. 🥳
Which means we are now able to merge the different features and bug fixes into dev via PR. Or could it be an option to merge the whole continuous-integration-Branch into dev (I guess gnn would like this option)?

nesnoj · 2021-08-30T08:25:19Z

> If I see it right, the server run in normal mode has been successful. 🥳

Awesome!

> Which means we are now able to merge the different features and bug fixes into dev via PR. Or could it be an option to merge the whole continuous-integration-Branch into dev (I guess gnn would like this option)?

Generally I'm fine with both options, but I guess that there might be some additional checks necessary (at least in #260) before it can get merged to dev. I reckon there will be some more commits in the branches so separate merging via PRs seems cleaner to me.

nesnoj · 2021-09-06T12:01:29Z

A task of mine failed due to some column name adjustments in 5b7d9f2.
I had to clear some tasks; they're re-running now.

gnn · 2021-09-06T13:18:17Z

I see that I missed an open question last week. Sorry for that.

> Which means we are now able to merge the different features and bug fixes into dev via PR. Or could it be an option to merge the whole continuous-integration-Branch into dev (I guess gnn would like this option)?

Since the CR branch might contain changes which are working but not yet meant to be merged into dev, I'm in favour of merging tested feature branches into dev individually. This also makes it easier to figure out where a change came from, which is important when trying to fix bugs which are discovered later on. Hence my 👍 to @nesnoj's comment. :)
For anybody running into the issue of having to resolve the same conflicts multiple times because of this, have a look at git's rerere.enabled option, which makes git automatically reuse known conflict resolutions. You can switch on that option for all your repositories via
git config --global rerere.enabled true
or, if you only want to switch it on for one particular repository, inside that repository via
git config --local rerere.enabled true

nesnoj · 2021-09-06T21:03:50Z

> For anybody running into the issue of having to resolve the same conflicts multiple times because of this, have a look at git's rerere.enabled option, which makes git automatically reuse known conflict resolutions. You can switch on that option via
> git config --global rerere.enabled true for all your repositories or via git config --local rerere.enabled true inside a repository if you only want to switch it on for that particular repository.

That's exactly what has been most annoying when keeping track of 2 branches. Thx for the hint! 🙏

BTW @IlkaCu : Some of "your" tasks failed in the current run. Also, we get a No space left on device in task power_plants.pv_rooftop.pv-rooftop-per-mv-grid for some reason, but there are 300 GB free 🧐

nesnoj · 2022-10-25T04:46:31Z

We're experiencing some odd stuff: parts of 2 tasks in CtsDemandBuildings that @nailend merged into CI do not show up in the CI. Most likely, they have been overwritten during a merge as only some parts of a commit are missing. To allow the pipeline to continue, we had to stop it and are currently applying the tasks manually.
As soon as this finishes, we will resume the run.

ClaraBuettner · 2022-10-25T11:35:49Z

individual_heating.determine-hp-capacity-pypsa-eur-sec-mvgd-bulk0 failed:

MVGD=30937 | Start
[2022-10-25 10:50:20,279] {saio.py:101} WARNING - Reflection was unable to determine primary key (normal for views), assuming: egon_heat_idp_pool.index
[2022-10-25 10:50:35,811] {local_task_job.py:156} WARNING - State of this instance has been externally set to failed. Taking the poison pill.
[2022-10-25 10:50:35,832] {helpers.py:325} INFO - Sending Signals.SIGTERM to GPID 2729369
[2022-10-25 10:50:36,332] {taskinstance.py:955} ERROR - Received SIGTERM. Terminating subprocesses.
[2022-10-25 10:50:36,486] {helpers.py:291} INFO - Process psutil.Process(pid=2729369, status='terminated', exitcode=0, started='08:33:13') (2729369) terminated with exit code 0
[2022-10-25 10:50:36,486] {local_task_job.py:102} INFO - Task exited with return code 0

nesnoj · 2022-10-25T12:39:33Z

> individual_heating.determine-hp-capacity-pypsa-eur-sec-mvgd-bulk0 failed

Yes, @nailend had to mark the task as failed as it'd take too much time without parallelization. The parallelization was lost the same way as the stuff mentioned above - we assume that someone who merged recently didn't take sufficient care.

This unfortunately cannot be fixed in the current clean run (new tasks are not detected properly). We would have to start a versioned run. Is that ok for you @ClaraBuettner @IlkaCu @AmeliaNadal?

However, this would raise the problem in #979, right?

IlkaCu · 2022-10-25T12:50:40Z

Let's give it a try. #979 will not necessarily appear again, I guess.

AmeliaNadal · 2022-10-25T12:53:59Z

I don't really see another solution, so that's ok for me too :)

nesnoj · 2022-10-25T12:56:36Z

Thanks for the quick replies! I'll take care..

nesnoj · 2022-10-26T13:36:33Z

Uh, the max_con limit I set before was without any consequence as it was overridden by @gnn's manual script 🤦‍♂️. But we agreed to set the HP tasks (individual_heating.determine-hp-capacity-pypsa-eur-sec-mvgd-bulk*) manually to success anyway.

It's running, but gas_neighbours.eGon100RE.insert-gas-neigbours-eGon100RE failed @AmeliaNadal.

AmeliaNadal · 2022-10-26T14:16:29Z

Thanks for the notification, this is solved!

IlkaCu · 2022-10-28T06:47:35Z

@khelfen and @nesnoj: Task power_plants.pv_rooftop_buildings.pv-rooftop-to-buildings failed on the server.

khelfen · 2022-10-28T07:04:22Z

> @khelfen and @nesnoj: Task power_plants.pv_rooftop_buildings.pv-rooftop-to-buildings failed on the server.

I pushed a fix onto the CI!

nesnoj · 2022-10-28T11:07:49Z

> > @khelfen and @nesnoj: Task power_plants.pv_rooftop_buildings.pv-rooftop-to-buildings failed on the server.
>
> I pushed a fix onto the CI!

As the CI will not be pulled (too many changes) it has to be cherry-picked or manually edited.
I cannot support today.

khelfen · 2022-10-28T11:31:25Z

> > > @khelfen and @nesnoj: Task power_plants.pv_rooftop_buildings.pv-rooftop-to-buildings failed on the server.
> >
> > I pushed a fix onto the CI!
>
> As the CI will not be pulled (too many changes) it has to be cherry-picked or manually edited. I cannot support today.

git pull origin features/#684-distribute-pv-rooftop-buildings-3 should be enough, right? Should I do it?

nesnoj · 2022-10-31T07:16:26Z

> > > > @khelfen and @nesnoj: Task power_plants.pv_rooftop_buildings.pv-rooftop-to-buildings failed on the server.
> > >
> > > I pushed a fix onto the CI!
> >
> > As the CI will not be pulled (too many changes) it has to be cherry-picked or manually edited. I cannot support today.
>
> git pull origin features/#684-distribute-pv-rooftop-buildings-3 should be enough, right? Should I do it?

Done and cleared..

nesnoj · 2022-12-22T21:36:10Z

On the prior run which finished yesterday, 1 task failed: sanity_checks.etrago-eGon100RE-gas @AmeliaNadal - not sure whether you are aware of that.
Update: oh, just seeing that this seems to be work in progress (#1067)

AmeliaNadal · 2022-12-23T12:46:51Z

> On the prior run which finished yesterday, 1 task failed: sanity_checks.etrago-eGon100RE-gas @AmeliaNadal - not sure whether you are aware of that. Update: oh, just seeing that this seems to be work in progress (#1067)

Thanks for notifying, I've removed the sanity checks for eGon100RE in CI.

nesnoj · 2022-12-31T14:39:37Z

The run finished today :)

AmeliaNadal · 2023-01-17T11:06:59Z

The following tasks failed (I tried to clear them but they failed again):

tyndp.download ([SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'www.entsos-tyndp2020-scenarios.eu')
osmtgmod.import-osm-dat
osm_buildings_streets.filter-buildings (@nesnoj, ERROR - (psycopg2.errors.DiskFull) could not write to file "base/pgsql_tmp/pgsql_tmp57428.0.sharedfileset/1.0": No space left on device)

nesnoj · 2023-01-17T14:08:59Z

> The following tasks failed (I tried to clear them but they failed again):

Hey @AmeliaNadal!
On which instance did these errors come up?

> tyndp.download ([SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'www.entsos-tyndp2020-scenarios.eu')

Looks like a certificate problem on the provider's side that you wouldn't be able to solve (I guess it is possible to use a param to ignore the certificate temporarily). The cert for that site was renewed 6 days ago so it is supposed to work. Did you stumble across this error today?

> osmtgmod.import-osm-dat
>
> osm_buildings_streets.filter-buildings (@nesnoj, ERROR - (psycopg2.errors.DiskFull) could not write to file "base/pgsql_tmp/pgsql_tmp57428.0.sharedfileset/1.0": No space left on device)

Self-explanatory. Sounds like the Hetzner server, which is 99% full. We have 2 instances (ci-run-container, ci-run-container-2023-01-16) running - probably the run from yesterday blows up the disk space?

(By the way, /home/egon/egon-data/ seems to hold an orphaned run. Can it be deleted? @ClaraBuettner )

I didn't take part in the last meetings so sorry if I'm stating something obvious you're already aware of..

ClaraBuettner · 2023-01-17T14:25:02Z

> (By the way, /home/egon/egon-data/ seems to hold an orphaned run. Can it be deleted? @ClaraBuettner )

Yes, that run can be deleted.

nesnoj · 2023-01-17T17:37:06Z

> > (By the way, /home/egon/egon-data/ seems to hold an orphaned run. Can it be deleted? @ClaraBuettner )
>
> Yes, that run can be deleted.

Done.

@AmeliaNadal I restarted the tasks but the SSL error persists.
This is because the domain changed

The old file is still available but not covered by the cert for some reason.
I manually changed the URL in the datasets.yml. Here's a PR: #1085

So the instance is back running now..

AmeliaNadal · 2023-02-20T08:25:29Z

Hi everyone,
the task demandregio.insert-cts-ind-demands failed with the following error: "ERROR - 'ReadOnlyWorksheet' object has no attribute 'defined_names'" (seems to be a query problem), I'm not completely sure who can fix it, @nesnoj?

IlkaCu · 2023-02-20T08:43:47Z

> Hi everyone, the task demandregio.insert-cts-ind-demands failed with the following error: "ERROR - 'ReadOnlyWorksheet' object has no attribute 'defined_names'" (seems to be a query problem), I'm not completely sure who can fix it, @nesnoj?

I will have a look.

nesnoj · 2023-02-20T08:43:59Z

> Hi everyone, the task demandregio.insert-cts-ind-demands failed with the following error: "ERROR - 'ReadOnlyWorksheet' object has no attribute 'defined_names'" (seems to be a query problem), I'm not completely sure who can fix it, @nesnoj?

Hey @AmeliaNadal, I just stumbled across this error in another project, it was a problem with openpyxl v3.1.1 according to this SO post from last week. A reinstall with pip install openpyxl==3.1.0 fixed it for me. But in the current pipeline we use an older version v3.0.10 so I'm unsure whether this is the origin. Maybe it is caused by a (recently released) dependency of openpyxl? @gnn could you please have a look?

nesnoj · 2023-02-20T08:44:38Z

> > Hi everyone, the task demandregio.insert-cts-ind-demands failed with the following error: "ERROR - 'ReadOnlyWorksheet' object has no attribute 'defined_names'" (seems to be a query problem), I'm not completely sure who can fix it, @nesnoj?
>
> I will have a look.

Thanks @IlkaCu

gnn · 2023-02-24T16:02:19Z

> > Hi everyone, the task demandregio.insert-cts-ind-demands failed with the following error: "ERROR - 'ReadOnlyWorksheet' object has no attribute 'defined_names'" (seems to be a query problem), I'm not completely sure who can fix it, @nesnoj?
>
> Hey @AmeliaNadal, I just stumbled across this error in another project, it was a problem with openpyxl v3.1.1 according to this SO post from last week. A reinstall with pip install openpyxl==3.1.0 fixed it for me. But in the current pipeline we use an older version v3.0.10 so I'm unsure whether this is the origin. Maybe it is caused by a (recently released) dependency of openpyxl? @gnn could you please have a look?

I checked the versions and the current CI run used openpyxl==3.1.1 so I'm pretty sure that's the culprit.
Constraining to !=3.1.1 should fix this.
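
In a dependency list, that constraint would be written as the requirement specifier openpyxl!=3.1.1. As a complement, here is a small sketch of a runtime guard that reports whether the excluded release is installed. The helper names are hypothetical and not part of the pipeline.

```python
from importlib import metadata

BROKEN_OPENPYXL = "3.1.1"  # the release identified above as the culprit


def is_broken_openpyxl(version):
    """True if the given version string is the excluded release."""
    return version.strip() == BROKEN_OPENPYXL


def installed_openpyxl_version():
    """Return the installed openpyxl version, or None if it is missing."""
    try:
        return metadata.version("openpyxl")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_openpyxl_version()
    if version is not None and is_broken_openpyxl(version):
        print(f"openpyxl {version} is installed; this release is excluded.")
```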

IlkaCu added the 🙏 help wanted Extra attention is needed label Aug 9, 2021

IlkaCu assigned gnn, nesnoj, IlkaCu, ClaraBuettner and AmeliaNadal Aug 9, 2021

openego deleted a comment from KathiEsterl Jan 2, 2023

nesnoj mentioned this issue Jan 17, 2023

Hotfix/tyndp url #1085

Merged

6 tasks

IlkaCu mentioned this issue Feb 20, 2023

demandregio.insert-cts-ind-demands fails #1108

Open

gnn added a commit that referenced this issue Feb 24, 2023

Exclude breaking openpyxl version

1340a93

[Apparently][0], this version [breaks][1] the demandregio.insert-cts-ind-demands task.

[0]: #377 (comment)
[1]: #1108

gnn mentioned this issue Feb 24, 2023

Exclude breaking openpyxl version #1113

Draft

6 tasks
