Issues with CSV files #18

Open
BaluHarshavardan99 opened this issue Feb 4, 2024 · 2 comments
Comments

@BaluHarshavardan99

Hi,
I am trying to recreate the QUILT dataset. I have a question about some of the columns in the CSV files that you have shared in the repo. Can you please explain how you obtained the "stable_times" column in quilt_recon.csv?

Also, were the images in the "image_path" column of quilt_data.csv extracted using the Static Video Chunk Detection Algorithm? Can you please elaborate on the generation of the quilt_data.csv file?

Thank you

@wisdomikezogwo
Owner

wisdomikezogwo commented Feb 5, 2024

Hi,

Thanks for bringing this up. It's somewhat of a known issue, as highlighted in the README here.

Can you please highlight how you obtained the "stable_times" column in quilt_recon.csv?
To preface: when we were creating QUILT, we weren't planning to provide code to re-create the dataset, so it wasn't obvious at the time that we should save the exact time (or time interval) of each representative frame. As a result, the reconstruction code isn't perfect.

That said, we put into quilt_recon.csv the key_frames (called scene frames in the paper) and the stable regions within the chunks (given by get_histo_srt_im_recon), if any. To make it clearer: we extract images in a cascaded manner, which is illustrated in Figure 7 of the paper's supplementary material. For each valid chunk in a video, we look for stable regions (i.e., continuous frames with little to no visual change), and for every one we find, we take the pixel-wise median frame. If there aren't any stable frames, we take the frames within the chunk and deduplicate them. This means that in the latter case you can have precise timing, but in the former the image is a median of several frames, so you can't peg it to a specific time, only to an interval. That's why we released code to extract said frame (i.e., save_frame_chunks_recon).

All this to say: stable times are the regions in which we think there are small runs of stable frames, from which we can collect representative images because the narrator stops for a bit to explain the image's features. For swaths of frames that have no stable regions, because the narrator was moving around the WSI for an extended time, we collect all the images we can and deduplicate them to represent those chunks.
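For readers trying to picture the cascade, here is a minimal sketch under stated assumptions. It is not the repository's get_histo_srt_im_recon or save_frame_chunks_recon: the function names, the mean-absolute-difference similarity measure, and the `diff_threshold` and `min_length` values are all hypothetical stand-ins chosen for illustration. Frames are assumed to be already decoded into NumPy arrays with one timestamp each.

```python
import numpy as np


def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two frames (illustrative metric)."""
    return float(np.mean(np.abs(a.astype(np.float32) - b.astype(np.float32))))


def find_stable_regions(frames, diff_threshold=2.0, min_length=3):
    """Return (start, end) index pairs, end exclusive, for runs of consecutive
    frames whose frame-to-frame difference stays at or below diff_threshold."""
    regions = []
    start = 0
    for i in range(1, len(frames) + 1):
        # A run ends at the end of the chunk or when the next frame changes too much.
        run_ends = i == len(frames) or mean_abs_diff(frames[i], frames[i - 1]) > diff_threshold
        if run_ends:
            if i - start >= min_length:
                regions.append((start, i))
            start = i
    return regions


def deduplicate(frames, diff_threshold=2.0):
    """Return indices of frames that differ from the previously kept frame."""
    kept = []
    for idx, frame in enumerate(frames):
        if not kept or mean_abs_diff(frame, frames[kept[-1]]) > diff_threshold:
            kept.append(idx)
    return kept


def representative_images(frames, times, diff_threshold=2.0, min_length=3):
    """Cascade for one chunk: median frame per stable region if any exist,
    otherwise the deduplicated individual frames.
    Returns a list of (image, (t_start, t_end)) tuples."""
    regions = find_stable_regions(frames, diff_threshold, min_length)
    if regions:
        results = []
        for start, end in regions:
            # Pixel-wise median over the region: tied to an interval, not one instant.
            median = np.median(np.stack(frames[start:end]), axis=0).astype(frames[0].dtype)
            results.append((median, (times[start], times[end - 1])))
        return results
    # No stable region: individual frames keep their exact timestamps.
    return [(frames[i], (times[i], times[i])) for i in deduplicate(frames, diff_threshold)]
```

In this sketch a median image carries a time interval while a deduplicated frame carries a single timestamp, which mirrors why a stable-region image can only be pegged to an interval.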

Were the images in the "image_path" column of quilt_data.csv extracted using the Static Video Chunk Detection Algorithm?
Yes. All the representative images collected from the process described above (so not just Static Video Chunk Detection, as not all chunks have stable frames) are saved to disk, and their paths are written to file. So quilt_data.csv is just an early dump of the data (text and metadata) from before we released the full and current data here.

Let me know if you need any more clarification or have more questions, thanks.

@BaluHarshavardan99
Author

Hi,
Thank you very much for the clarification. It was really helpful.

I need clarification on one more variable: how is "pair_chunk_time" decided?

Thank you for your time
