-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-43077: Convert all tasks to use CalibrateImageTask outputs #979
base: main
Are you sure you want to change the base?
Conversation
46aacdd
to
8d09263
Compare
8d09263
to
3923de1
Compare
class WriteSourceTableConnections(pipeBase.PipelineTaskConnections, | ||
defaultTemplates={"catalogType": ""}, | ||
dimensions=("instrument", "visit", "detector")): | ||
|
||
catalog = connectionTypes.Input( | ||
doc="Input full-depth catalog of sources produced by CalibrateTask", | ||
name="{catalogType}src", | ||
name="{catalogType}initial_stars_footprints_detector", | ||
storageClass="SourceCatalog", | ||
dimensions=("instrument", "visit", "detector") | ||
) | ||
outputCatalog = connectionTypes.Output( | ||
doc="Catalog of sources, `src` in DataFrame/Parquet format. The 'id' column is " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Legacy usage of src
in this doc string.
@@ -188,30 +188,33 @@ def run(self, catalogs, tract, patch): | |||
return catalog | |||
|
|||
|
|||
# TODO: should deprecate this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you planning on performing these deprecations on this ticket?
@@ -281,6 +284,7 @@ class WriteRecalibratedSourceTableConfig(WriteSourceTableConfig, | |||
) | |||
|
|||
|
|||
# TODO: deprecate, since reprocessVisitImage does this more thoroughly? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're asking for an opinion, I can see the merit of adding deprecation flags here. If reprocessVisitImage
can do a better job, I don't see the need to keep this around. With that said, is this beyond the scope of this ticket?
If deprecations are set up, please also set up the future removal ticket, blocked by the appropriate version release.
@@ -1122,6 +1126,7 @@ def _combineExposureMetadata(self, visit, dataRefs): | |||
class ConsolidateSourceTableConnections(pipeBase.PipelineTaskConnections, | |||
defaultTemplates={"catalogType": ""}, | |||
dimensions=("instrument", "visit")): | |||
# TODO: Deprecate the dataframe connection? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And replace it with another storage class? Or just a deprecation and removal?
@@ -95,8 +95,9 @@ calexpFlags: | |||
- base_PixelFlags_flag_saturatedCenter | |||
- base_PixelFlags_flag_crCenter | |||
- base_PixelFlags_flag_suspectCenter | |||
- base_PixelFlags_flag_streak | |||
- base_PixelFlags_flag_streakCenter | |||
# Streak flags not yet propagated from difference imaging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these are being left in but commented, please add a TODO
and a DM ticket number for their eventual reinstatement.
@@ -769,8 +769,9 @@ flags: | |||
- base_PixelFlags_flag_sensor_edgeCenter | |||
- base_PixelFlags_flag_suspect | |||
- base_PixelFlags_flag_suspectCenter | |||
- base_PixelFlags_flag_streak | |||
- base_PixelFlags_flag_streakCenter | |||
# Streak flags not yet propagated from difference imaging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO
and ticket link here too please.
@@ -380,8 +381,9 @@ flags: | |||
- base_PixelFlags_flag_saturatedCenter | |||
- base_PixelFlags_flag_suspect | |||
- base_PixelFlags_flag_suspectCenter | |||
- base_PixelFlags_flag_streak | |||
- base_PixelFlags_flag_streakCenter | |||
# Streak flags not yet propagated from difference imaging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO
and ticket link here please.
6fc7e25
to
cd09fdb
Compare
42afd18
to
0d0a9ba
Compare
Now that reprocessVisitImage is directly writing an ArrowAstropy, it no longer has an index column, so we can just treat the id like a normal column.
They will get added back in once we start propagating them from the diffim. Remove STREAK from MeasureMergedCoaddSources.
calib_detected is now meaningless, as the icSource table is no more.
This is needed for compatibility with outputs produced by CalibrateImageTask.
Some doc parameter lists had gotten out of sync with the code, and it's always good to clarify whether you've got a DeferredDatasetHandle or a real object.
Background modeling is done on the image without the calibration applied, and that's how it should be subtracted, too. This was a clear bug in the old code, but probably a very minor one, because the photometric calibration is close to constant, and we were previously only incorrectly multiplying the skyCorr background by the ratio of the photometric calibration to its average.
This is needed for compatibility with outputs produced by CalibrateImageTask.
0d0a9ba
to
ddcd8c9
Compare
@@ -4,7 +4,8 @@ | |||
# See the DPDD for more information about the output: https://lse-163.lsst.io | |||
funcs: | |||
sourceId: | |||
functor: Index | |||
functor: Column | |||
args: id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@parejkoj, do you recall what led you to make this change? I'm finding that I don't need it, which is very convenient because it means that (at least for now) I can use the one Source.yaml
for both calibrateImage
outputs and writeRecalibratedSourceTable
despite one writing ArrowAstropy
and one writing DataFrame
(I do need to make other changes to the file and various plugin configurations, but that's orthogonal). But I want to make sure I'm not missing anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the comment on this commit: 1df4911
The Index
line didn't work with ArrowAstropy when I tried it, and this treats it just like a normal column.
I'm surprised this file works with calibrateImage
outputs, since that task is not configured to produce an output catalog with many of the fields that this Source.yaml
expects. That's why there's a new dedicated .yaml
file for that task's outputs on another commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
Index
line didn't work with ArrowAstropy when I tried it, and this treats it just like a normal column.
Do you remember what didn't work? I also expected it not to work, but unless I didn't test what I thought I did, restoring the original didn't cause a failure in transform[Pre]SourceTable
or consolidate[Pre]SourceTable
for me. I'm running a ci_hsc in Jenkins now with this one bit reverted (on DM-46991) so that'll hopefully check what I tried to do locally.
I'm surprised this file works with calibrateImage outputs, since that task is not configured to produce an output catalog with many of the fields that this Source.yaml expects.
In my incremental approach I've reconfigured the task to produce all of those fields in DRP (except calib_detected
). We've got analysis tools that need some of them, but the pipeline steps we have now are really not amenable to adding the multiple invocations we need of those tools (for initial stars, recalibrated stars, and final sources), so I want to fix that on DM-47320 before trying to review which analysis tools we run when, and I want to do that before trimming down the calibrateImage
output fields back to [close to] the defaults.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you remember what didn't work?
No, I don't remember any of the details at this point. I'm pretty sure one of those listed tasks just failed when I tried to use it with ArrowAstropy inputs. It was probably ci_imsim or ci_hsc where the failure occurred.
I've reconfigured the task to produce all of those fields in DRP
Oof, that's what I was afraid of. Good luck. I hope we really can get it back to the defaults, and/or any non-default values are ones that we can look over with AP, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I figured it out - without this change, transformSourceTable
does misbehave, but only by replacing the real source ID column with range(0, N)
, and it turns out nothing actually raises an exception (even downstream anywhere in the pipeline). A lot of tasks just produce no outputs, because finalizeCharacterization
doesn't produce any PSFs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, that sounds very familiar. I think it took me a while to figure it out; the "there were no outputs for tasks" was the secret.
No description provided.