Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for JPEG-LS decoding #534

Merged
merged 18 commits into from
Aug 31, 2024
Merged

Add support for JPEG-LS decoding #534

merged 18 commits into from
Aug 31, 2024

Conversation

dougyau
Copy link
Contributor

@dougyau dougyau commented Jul 9, 2024

Add jpeg ls decoding support using charls bindings via charls-sys crate

@dougyau dougyau marked this pull request as draft July 9, 2024 05:06
@dougyau dougyau marked this pull request as ready for review July 9, 2024 22:52
@Enet4
Copy link
Owner

Enet4 commented Jul 10, 2024

Ah, I had entertained the idea of linking to charls for JPEG-LS support, but I wouldn't be able to know if people were interested in it. Good that we have a proof of concept now!

I may take my time to look into the other crate. From what I can already tell, it is odd to find both the low-level API and the high-level memory safe API in the same crate. Usually *-sys crates would only hold the low-level API, leaving the high-level API to another crate.

We can (and should!) also add tests that decode the test files available from WG04.

@Enet4 Enet4 added enhancement A-lib Area: library C-transfer-syntax Crate: dicom-transfer-syntax-registry labels Jul 10, 2024
@dougyau
Copy link
Contributor Author

dougyau commented Jul 10, 2024

JPEG-LS is still useful, a lot of modalities use it (and for some modalities it is the only compressed format available) so I think it is better to have it supported than relying on gdcm (specially when building for windows).

The other crate is using bindgen generated bindings, so I did not think it would be worth it to create a separate crate. But I am open to split it if it helps.

@dougyau
Copy link
Contributor Author

dougyau commented Jul 10, 2024

Updated pixeldata tests to include jpegls samples.

@Enet4
Copy link
Owner

Enet4 commented Jul 10, 2024

JPEG-LS is still useful, a lot of modalities use it (and for some modalities it is the only compressed format available) so I think it is better to have it supported than relying on gdcm (specially when building for windows).

Great to know! 👍

The other crate is using bindgen generated bindings, so I did not think it would be worth it to create a separate crate. But I am open to split it if it helps.

I would really prefer to follow the convention, if you don't mind. I can also assist in the split if it helps. See also the relevant section from the Cargo book: https://doc.rust-lang.org/cargo/reference/build-scripts.html#-sys-packages

@dougyau
Copy link
Contributor Author

dougyau commented Jul 29, 2024

As suggested, the charls codec has been spit into charls-sys and charls-rs.

@Enet4
Copy link
Owner

Enet4 commented Jul 29, 2024

As suggested, the charls codec has been spit into charls-sys and charls-rs.

Hey, great to know! Don't you want to take over the crate named charls instead? It is non-idiomatic for a crate name to end with -rs.

@dougyau
Copy link
Contributor Author

dougyau commented Jul 29, 2024

Done, I have renamed the crate from charls-rs to charls

@Enet4
Copy link
Owner

Enet4 commented Aug 6, 2024

Hello. I've been going through the charls bindings so we can polish it a bit before it's used by DICOM-rs. I will send in a few maintenance PR to the other repos, but in the meantime, there are some other concerns about the API provided by the crate that we should also work on.

  • It is not possible to retrieve the image header information from the high level bindings. This may be important to validate the encoded image with the metadata in the DICOM.
  • We can avoid some allocations on decode if it is changed to write bytes to a &mut Vec<u8> or some other kind of byte writer instead of returning a new Vec<u8>, which automatically requires an allocation.
  • Is the parameter stride useful in the high level decode method? I would assume that it only makes sense in the original low-level API because the output is done through a destination buffer.

@dougyau
Copy link
Contributor Author

dougyau commented Aug 6, 2024

* It is not possible to retrieve the image header information from the high level bindings. This may be important to validate the encoded image with the metadata in the DICOM.

I think there is a call on charls to get data, but this kind of validations are not required per DICOM standard (I could be wrong on this).

* We can avoid some allocations on decode if it is changed to write bytes to a `&mut Vec<u8>` or some other kind of byte writer instead of returning a new `Vec<u8>`, which automatically requires an allocation.

I think it is worth exploring how to lower the amount of allocations.

* Is the parameter `stride` useful in the high level `decode` method? I would assume that it only makes sense in the original low-level API because the output is done through a destination buffer.

I left it there to pass it to the lower API. So if you know the stride, you can pass it, or if you don't, per charls documentation a value of 0 will cause charls to auto-detect it.

@Enet4
Copy link
Owner

Enet4 commented Aug 6, 2024

I think there is a call on charls to get data, but this kind of validations are not required per DICOM standard (I could be wrong on this).

Validation tools may want to cross-check the information from it, but even if we don't use it over here, it may be useful for the the crate itself for testing, and it may help other people interested in working with JPEG-LS for other reasons, so I believe it's worth it. I can help with this if necessary.

I left it there to pass it to the lower API. So if you know the stride, you can pass it, or if you don't, per charls documentation a value of 0 will cause charls to auto-detect it.

I had a better look at the decoder implementation, stride allows for the output to be padded on each encoded row (so the minimum stride is width * samples_per_pixel * bytes_per_sample). So OK, we are unlikely to need it but it might have some use. Something like having decode and decode_with_stride could help simplify the method that we're most likely to call, but it's not very important.

@dougyau
Copy link
Contributor Author

dougyau commented Aug 7, 2024

I am applying the suggestion. I have added decode and decode_with_stride. And get_frame_info. And following from this line of thought, decode and decode_with_stride will return Result<FrameInfo> that way on the frame decode function it will check the frame data against the dicom metadata, and throw a warning if it doesn't match.

@dougyau
Copy link
Contributor Author

dougyau commented Aug 9, 2024

Updated the library and the code on this PR. I did plan to validate the dicom metadata and the data on the spiff header, but according to what I saw, the charls codec doesn't use that header, since it's values are not reliable.

@dougyau dougyau marked this pull request as draft August 9, 2024 21:18
@dougyau
Copy link
Contributor Author

dougyau commented Aug 9, 2024

After doing additional tests, there is a part that can be improved, as it is now, when you decode an image you need to append the results on dst. So the change on the charls codec did not get us any gain, since a new buffer needs to be sent to the codec and then appendend to the dst buffer.

Now, if we want to reduce the amount of allocations, on encoding::PixelDataReader::decode the dst buffer could implement something like

    fn decode(&self, src: &dyn PixelDataObject, dst: &mut Vec<u8>) -> DecodeResult<()> {
        let frames = src.number_of_frames().unwrap_or(1);
        let cols = src
            .cols()
            .context(decode_error::MissingAttributeSnafu { name: "Cols" })?;
        let rows = src
            .rows()
            .context(decode_error::MissingAttributeSnafu { name: "Rows" })?;
        let bits_allocated = src
            .bits_allocated()
            .context(decode_error::MissingAttributeSnafu {
                name: "BitsAllocated",
            })?;
        let samples_per_pixel =
            src.samples_per_pixel()
                .context(decode_error::MissingAttributeSnafu {
                    name: "SamplePerPixel",
                })?;
        let buffer_size = cols * rows * bits_allocated.div(8) * samples_per_pixel * frames;
        dst.reserve(buffer_size);

        for frame in 0..frames {
            self.decode_frame(src, frame, dst)?;
        }
        Ok(())
    }

The idea would be to allocate the necessary memory to store the image, that way the append operations on decode_frame wouldn't need to allocate.

@dougyau dougyau marked this pull request as ready for review August 9, 2024 21:57
@Enet4 Enet4 self-requested a review August 14, 2024 14:04
@Enet4
Copy link
Owner

Enet4 commented Aug 14, 2024

After doing additional tests, there is a part that can be improved, as it is now, when you decode an image you need to append the results on dst. So the change on the charls codec did not get us any gain, since a new buffer needs to be sent to the codec and then appendend to the dst buffer.

The idea would be to allocate the necessary memory to store the image, that way the append operations on decode_frame wouldn't need to allocate.

That makes sense! But I would strongly recommend that try_reserve is used instead of reserve, because we are dealing with external input which can easily trigger a panic.

@Enet4
Copy link
Owner

Enet4 commented Aug 14, 2024

Other than the comment above, I don't see any major issues so far. I'd say we take care of the PRs over at charls-rs

And after any eventual optimizations that you might want to make, we would be ready to merge this one.

@dougyau
Copy link
Contributor Author

dougyau commented Aug 15, 2024

I think further optimization, like the one I proposed would be better on another PR if needed. I do not think it is required, as it will not add much performance and changes on the adapters would be required to use it.

A good optimization would be to use a mutable slice, and send the adapter a mutable window of the slice, that way the codecs could write it inplace. But this can be done at a later time.

And regarding this PR, I don't think further changes will be required.

Copy link
Owner

@Enet4 Enet4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few copy&paste mistakes in the doc-comments, I will push them right away.

transfer-syntax-registry/src/adapters/jpegls.rs Outdated Show resolved Hide resolved
transfer-syntax-registry/src/adapters/jpegls.rs Outdated Show resolved Hide resolved
Copy link
Owner

@Enet4 Enet4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonderful! Thank you for working on JPEG-LS support. 👍

@Enet4 Enet4 merged commit 4f2c0d8 into Enet4:master Aug 31, 2024
4 checks passed
@dougyau dougyau deleted the jpegls-decode branch September 16, 2024 22:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-lib Area: library C-transfer-syntax Crate: dicom-transfer-syntax-registry enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants