We should make clear what is technology type #727

Open
ypriverol opened this issue Oct 9, 2024 · 11 comments

Comments

@ypriverol
Member

technology type is mandatory in SDRF, but no section of the SDRF specification mentions it, and even though it has been added to the templates, there is confusion about where it should go. The validator expects it after the assay name, but the templates place it before. See issue bigbio/sdrf-pipelines#177.

Let us add a section to the SDRF specification about technology type, including where to place it. I prefer before the assay name, as in the templates.

Opinions? @daichengxin @TineClaeys @levitsky

@ypriverol ypriverol changed the title W should make clear what is technology type We should make clear what is technology type Oct 9, 2024
@levitsky
Collaborator

levitsky commented Oct 9, 2024

We definitely should improve documentation of technology type because right now it is listed as a required column in template part of the doc but at the same time it's hard to understand what values it should have.

P.S. I missed the discussion on making technology type required, I am guessing this is for compatibility with non-proteomics formats?

As for the column order, it makes sense to have it close to assay name, but I find it hard to come up with arguments for one particular ordering. I don't know if it's a good idea to leave it unspecified so I suggest we should just pick one and stick with it.

@ypriverol
Member Author

> We definitely should improve documentation of technology type because right now it is listed as a required column in template part of the doc but at the same time it's hard to understand what values it should have.
>
> P.S. I missed the discussion on making technology type required, I am guessing this is for compatibility with non-proteomics formats?

Yes. Also, SDRF-Proteomics may start being used for proteomics technologies other than MS, so it would be good to record which type of technology was used. For example, we are now working on getting affinity proteomics (Olink) data into SDRF, so it would be good to keep this field for that purpose.

> As for the column order, it makes sense to have it close to assay name, but I find it hard to come up with arguments for one particular ordering. I don't know if it's a good idea to leave it unspecified so I suggest we should just pick one and stick with it.

I think it is nice to have it before assay name, because it can serve as an introduction to what type of assay follows, but I agree that it doesn't matter too much where it is as long as it is there. After assay name is also nice, because it can be read as the type of assay that was performed.

So, placed before assay name, it can be read as the technology that was used to process the sample and the type of assays that follow. Placed after assay name, it means the type of assay that was performed.

@levitsky
Collaborator

levitsky commented Oct 9, 2024

One could say that technology type is a characteristic of an assay, together with the comment columns that follow. Then we could put it after assay name, much like the sample characteristics part starts with source name.

@deeptijk
Collaborator

deeptijk commented Oct 9, 2024

I believe it's better to put it before the assay name, because it makes more sense to know the technology used to process the sample. In most PRIDE submissions, submitters state which technology they used in the 'Sample processing protocol'.

@nithujohn
Collaborator

Including the assay name after the technology type in a proteomics experiment can be a helpful practice, especially for maintaining clarity and organization in data records. If you're managing a large dataset with multiple runs under the same technology, adding the assay name after the technology type can aid in distinguishing between them and facilitate data processing or interpretation.

@ypriverol
Member Author

@nithujohn @deeptijk @levitsky What about allowing it right before or right after assay name, but not in any other place?
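
To make the "right before or right after, but nowhere else" rule concrete, here is a minimal sketch of the check in Python. It is only an illustration and not the actual sdrf-pipelines validator code; the function name `technology_type_is_adjacent` and the list-of-headers input are assumptions made for this example.

```python
def technology_type_is_adjacent(columns):
    """Return True if 'technology type' is immediately before or after 'assay name'.

    `columns` is the list of SDRF header names in file order.
    Hypothetical helper for illustration; not part of sdrf-pipelines.
    """
    # Normalise header names so case and stray whitespace do not matter.
    names = [c.strip().lower() for c in columns]
    if "technology type" not in names or "assay name" not in names:
        return False
    # Adjacent means the two column indices differ by exactly one.
    return abs(names.index("technology type") - names.index("assay name")) == 1


header = ["source name", "characteristics[organism]", "assay name",
          "technology type", "comment[data file]"]
print(technology_type_is_adjacent(header))
```

With this rule, both orderings used so far (the templates and the validator) would pass, and only a technology type column placed away from assay name would be rejected.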

@levitsky
Collaborator

levitsky commented Oct 10, 2024

Fine by me. I think column order only matters for human readability. Most humans don't seem to notice the difference as long as the columns are next to each other, which is why we ended up with contradictions between the templates, validation rules and actual annotations in the first place.

@daichengxin
Collaborator

daichengxin commented Oct 10, 2024

There is little difference between before and after assay name. One thing to note, though, is that it is often placed after the assay name, as in this example: https://ftp.ebi.ac.uk/biostudies/fire/E-MTAB-/567/E-MTAB-13567/Files/E-MTAB-13567.sdrf.txt. So I'm not sure whether that is mandatory in ArrayExpress, or whether putting it up front would affect compatibility. I don't see the MAGE-TAB validator enforcing this column order. I also agree that either before or after is fine.

@nithujohn
Collaborator

As in my previous comment, I would suggest placing assay name after technology type.

@deeptijk
Collaborator

I would also suggest having assay name after the technology type. As mentioned by @ypriverol and @nithujohn, it would be easier to handle larger datasets, and conceptually it makes more sense to me.

@ypriverol
Member Author

@deeptijk @levitsky @daichengxin @nithujohn:

I think the best solution is to write it after the assay name, which means changing the specification and templates. @levitsky, I would vote for the validator to issue only a warning for now if it is right before the assay name, to avoid failures for users already using the templates.
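
A rough sketch of how that validator behaviour could look, in Python. This is not the existing sdrf-pipelines implementation: the function name `check_technology_type_position` and the `"ok"` / `"warning"` / `"error"` return values are made up for illustration, and treating any other position (or a missing assay name column) as an error is an assumption that is still open for discussion.

```python
def check_technology_type_position(columns):
    """Classify where 'technology type' sits relative to 'assay name'.

    Returns "ok", "warning" or "error". Hypothetical helper for illustration;
    the severity levels are assumptions, not the sdrf-pipelines API.
    """
    names = [c.strip().lower() for c in columns]
    # technology type is mandatory; a missing assay name is treated
    # as an error here too (assumption).
    if "technology type" not in names or "assay name" not in names:
        return "error"
    offset = names.index("technology type") - names.index("assay name")
    if offset == 1:
        return "ok"        # proposed layout: right after assay name
    if offset == -1:
        return "warning"   # current template layout: right before assay name
    return "error"         # anywhere else (assumed to be rejected)


for header in (
    ["source name", "assay name", "technology type", "comment[data file]"],
    ["source name", "technology type", "assay name", "comment[data file]"],
    ["technology type", "source name", "assay name", "comment[data file]"],
):
    print(check_technology_type_position(header))
```

Files written from the current templates would then get a warning rather than a failure, and the warning could later be turned into an error once the templates are updated.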

Those in favour of the proposal, please vote here with 👍; if not, please continue commenting on the issue. I will update a PR about it.


5 participants