Weld errored at image-qc #2

Open
gwaybio opened this issue Jul 24, 2021 · 5 comments
Labels
bug Something isn't working

Comments

@gwaybio
Member

gwaybio commented Jul 24, 2021

CP257 errored as described in broadinstitute/pooled-cell-painting-profiling-recipe#73

All sites complete.
Summarizing 4684 sites in batch: 20210422_6W_CP257.
/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/plotnine/facets/facet.py:399: PlotnineWarning: If you need more space for the y-axis tick text use ... + theme(subplots_adjust={'hspace': 0.25}). Choose an appropriate value for 'hspace'
There are a total of 46371251 cells in 20210422_6W_CP257
/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/plotnine/layer.py:401: PlotnineWarning: geom_text : Removed 360 rows containing missing values.
/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/plotnine/layer.py:401: PlotnineWarning: geom_text : Removed 360 rows containing missing values.
/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/plotnine/layer.py:401: PlotnineWarning: geom_text : Removed 360 rows containing missing values.
/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/plotnine/layer.py:401: PlotnineWarning: geom_text : Removed 360 rows containing missing values.
/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/plotnine/layer.py:401: PlotnineWarning: geom_text : Removed 360 rows containing missing values.
/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/plotnine/layer.py:401: PlotnineWarning: geom_text : Removed 360 rows containing missing values.
/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/plotnine/layer.py:401: PlotnineWarning: geom_text : Removed 360 rows containing missing values.
/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/plotnine/layer.py:401: PlotnineWarning: geom_text : Removed 360 rows containing missing values.
/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/plotnine/layer.py:401: PlotnineWarning: geom_text : Removed 360 rows containing missing values.
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3081, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 4554, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 4562, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'level_3'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "recipe/0.preprocess-sites/4.image-and-segmentation-qc.py", line 511, in <module>
    cp_sat_df[["cat", "type", "Ch"]] = cp_sat_df["level_3"].str.split(
  File "/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/pandas/core/frame.py", line 3024, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3083, in get_loc
    raise KeyError(key) from err
KeyError: 'level_3'
Building single file for dataset ALLBATCHES___ALLPLATES___ALLWELLS; combining single cells from site: CP257A-Well2-59...

cc @ErinWeisbart

Looks like this error happens pretty deep into the script (4.image-and-segmentation-qc.py, line 511), so you will likely retain most of your QC figures, but nothing generated after line 511.

P.S. I'm writing these issues on the weekend, since the weld just failed (see #1), so that I don't forget on Monday!

@gwaybio gwaybio added the bug Something isn't working label Jul 24, 2021
@ErinWeisbart
Member

Once you push what has been processed, I can try to replicate this locally and see if I can fix it.

@gwaybio
Member Author

gwaybio commented Jul 26, 2021

Sounds good. I'll tag you once it's pushed

@gwaybio
Member Author

gwaybio commented Jul 27, 2021

@ErinWeisbart #3 adds the image_metadata.tsv file. I think this is all you need to address this? Let me know if you need anything else.

@ErinWeisbart
Member

ErinWeisbart commented Aug 2, 2021

I don't know what's going on, since a local test runs fine for me.

To test locally, I set the following variables:

input_image_file = "data/0.site-qc/20210422_6W_CP257/data/image_metadata.tsv"
intensity_col_prefix = "ImageQuality_StdIntensity_"
saturated_col_prefix = "ImageQuality_PercentMaximal_"
platelist = ["CP257A", "CP257B"]
sites_per_image_grid_side = 10
image_cols = {'well': "Metadata_Well", 'site': "Metadata_Site", 'plate': "Metadata_Plate"}
barcoding_cycles = 12

I then run lines 69-87 to make loc_df, and lines 257-265 to load the image file and add loc_df to it. (We don't use the columns coming from loc_df for the saturation plots, but it's the only time image_df is modified after loading, so I included it.)

Then I run lines 555 to the end, and it works and my plots save. (I simplify the file-save bit at the end to the following, but that shouldn't matter.)

        output_file = pathlib.Path(f"bc_saturation_{well}_{plate}.png")
        bc_saturation_gg.save(
            output_file,
            dpi=300,
            width=5,
            height=(barcoding_cycles + 2),
            verbose=False,
        )

@ErinWeisbart
Member

Taking a stab at what's going on: since it's a KeyError: 'level_3', the line before it, bc_sat_df = bc_sat_df.set_index(image_meta_col_list).stack().reset_index(), must not be working as expected, because that is where the column name level_3 comes from.
image_meta_col_list is made from image_cols at line 259, but I don't see anywhere that image_meta_col_list is modified between its creation and the lines I tested.
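
To illustrate where the name level_3 comes from, here is a minimal sketch on a toy table. It is not the recipe's code: the helper stacked_level_name, the column Metadata_Extra, and the measurement values are all hypothetical. It assumes the columns axis is unnamed, in which case pandas names the stacked level after its position, which equals the number of columns passed to set_index. Three metadata columns give level_3. Any other count gives a different name, which is one possible (unconfirmed) way to end up with KeyError: 'level_3'.

```python
import pandas as pd


def stacked_level_name(meta_cols):
    """Return the auto-generated name of the stacked level after reset_index()."""
    # Toy stand-in for the image table: metadata columns plus two measurements.
    df = pd.DataFrame(
        {
            "Metadata_Well": ["Well1", "Well2"],
            "Metadata_Site": [1, 2],
            "Metadata_Plate": ["CP257A", "CP257A"],
            "Metadata_Extra": ["x", "y"],  # hypothetical extra metadata column
            "ImageQuality_PercentMaximal_DNA": [0.1, 0.2],
            "ImageQuality_PercentMaximal_ER": [0.3, 0.4],
        }
    )
    # Keep only the chosen metadata columns and the measurement columns.
    keep = meta_cols + [c for c in df.columns if c.startswith("ImageQuality_")]
    long_df = df[keep].set_index(meta_cols).stack().reset_index()
    # The unnamed stacked level sits at position len(meta_cols), so pandas
    # labels it "level_<len(meta_cols)>" when the index is reset.
    return [c for c in long_df.columns if str(c).startswith("level_")][0]


three = stacked_level_name(["Metadata_Well", "Metadata_Site", "Metadata_Plate"])
two = stacked_level_name(["Metadata_Well", "Metadata_Site"])
print(three, two)
```

Printing the columns of the stacked frame on the machine where the weld failed, just before the split at line 511, would show which name was actually produced there.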
