Support for timsTOF data #440

Open
daichengxin opened this issue Nov 17, 2024 · 7 comments

@daichengxin
Collaborator

Description of feature

Last week I tested the identification workflow on a timsTOF dataset. The first step is to run the tdf2mzml module, and then the same analyses are run as for other data (Comet and MSGF+). I then compared the results with MaxQuant, using a PSM FDR of 0.01 and taking the MaxQuant results from evidence.txt. The overlap of identified peptides is above 90% relative to MaxQuant, but the overlap at the PSM level is 0. After comparing precursor m/z, the reason is that the scan numbers (indices) differ between quantms and MQ.
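To illustrate why the peptide-level overlap can be high while the PSM-level overlap is 0, here is a minimal sketch. It assumes PSMs are compared as (sequence, scan number) pairs; the function name `overlap_fraction` is made up for illustration, and the values are the ones from the table in question 1 of this post.

```python
def overlap_fraction(reference, other):
    """Fraction of items in `reference` that also occur in `other`."""
    reference, other = set(reference), set(other)
    return len(reference & other) / len(reference) if reference else 0.0

# (sequence, scan number) pairs taken from the table in this post.
quantms_psms = [
    ("AAAAAAMAEQESAR", 356222),
    ("AAAAAAMAEQESAR", 356647),
    ("AAAAAAMAEQESAR", 356088),
]
maxquant_psms = [("AAAAAAMAEQESAR", 65719)]

# Peptide level: compare sequences only.
peptide_level = overlap_fraction(
    {seq for seq, _ in maxquant_psms},
    {seq for seq, _ in quantms_psms},
)
# PSM level: the scan number is part of the key, so nothing matches.
psm_level = overlap_fraction(maxquant_psms, quantms_psms)

print(peptide_level, psm_level)  # 1.0 0.0
```

Any comparison keyed on the scan number will show no overlap as long as the two tools number scans differently, regardless of how well the identifications agree.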

identified_pept
compare_results.csv
compare_results_pep.zip

Some questions:

  1. For example, this peptide is identified by quantms in three scans whose numbers differ from the one reported by MQ. How can we compare the results and check the differences when the scan numbers (or indices) are different? I manually checked the identifications and they all appear to match well.

| sequence | exp_mass_to_charge | quantms scan_number | MaxQuant MS/MS scan number |
|---|---|---|---|
| AAAAAAMAEQESAR | 695.3256725319 | 356222 | 65719 |
| AAAAAAMAEQESAR | 695.3256725319 | 356647 | 65719 |
| AAAAAAMAEQESAR | 695.3256725319 | 356088 | 65719 |

image
image
image

  2. I was surprised that quantms identified so many peptides! I also manually checked the identifications found only by quantms, and they look reasonable. Further assessment is needed here.

image
image

@daichengxin daichengxin added enhancement New feature or request question Further information is requested labels Nov 17, 2024
@ypriverol
Member

ping @wfondrie @jspaezp

@jpfeuffer
Collaborator

Regarding scan number mismatch: do we not have a scan ID that we could use? Did you run MQ on the tdf or on the converted mzML?

@daichengxin
Collaborator Author

daichengxin commented Nov 17, 2024

I ran MQ on the tdf, so there are differences. But I don't know how to map scan numbers between MQ and quantms.
This is the converted mzML in quantms.
image
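One way to pair PSMs without relying on scan numbers is to match on peptide sequence plus precursor m/z within a tolerance. The sketch below is only an illustration: the dictionary keys (`sequence`, `mz`, `scan`), the function name `match_psms`, the 10 ppm default, and the MaxQuant m/z value are all assumptions, not fields or values from either tool.

```python
from collections import defaultdict

def match_psms(quantms_psms, maxquant_psms, ppm_tol=10.0):
    """Pair PSMs that share a sequence and agree in precursor m/z within ppm_tol.

    Returns a list of (quantms_scan, maxquant_scan) tuples.
    """
    mq_by_seq = defaultdict(list)
    for psm in maxquant_psms:
        mq_by_seq[psm["sequence"]].append(psm)

    pairs = []
    for q in quantms_psms:
        for m in mq_by_seq.get(q["sequence"], []):
            ppm = abs(q["mz"] - m["mz"]) / m["mz"] * 1e6
            if ppm <= ppm_tol:
                pairs.append((q["scan"], m["scan"]))
    return pairs

quantms = [
    {"sequence": "AAAAAAMAEQESAR", "mz": 695.3256725319, "scan": 356222},
    {"sequence": "AAAAAAMAEQESAR", "mz": 695.3256725319, "scan": 356647},
    {"sequence": "AAAAAAMAEQESAR", "mz": 695.3256725319, "scan": 356088},
]
# The MaxQuant m/z here is a hypothetical value chosen for the example.
maxquant = [{"sequence": "AAAAAAMAEQESAR", "mz": 695.3257, "scan": 65719}]

pairs = match_psms(quantms, maxquant)
```

With this data all three quantms scans pair with the single MaxQuant scan, so the mapping is many-to-one. Adding a retention time or ion mobility tolerance would probably be needed to decide which of them corresponds to the same precursor event.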

@jspaezp
Contributor

jspaezp commented Nov 18, 2024

Well, I cannot say anything about how MQ deals with the numbers ... BUT ... I think it is very normal for a "scan" to mean very different things in different software dealing with PASEF data. The main reason is that a real scan in the .d contains very little, and noisy, information. Most of the real information comes from the series of scans that encompass a single elution of the tims funnel (called a frame).

So when converting frames -> scans, the naive approach of just splitting each scan is kind of useless (because it would lead to blocks of ~700 ms1 scans that don't share information with each other but share retention time, followed by a bunch of blocks of ms2 scans that look horrendous).

The more standard approach is to use sections of the frame and squash them into a single new scan. So, DEPENDING ON THE OPTIONS USED FOR EXPORTING, when tdf2mzml says "scan=1" it could actually mean "frame 1, scans 200-250" (similar to how some qtofs have micro-scans that get aggregated) OR actually "scan 234234" in the run.

image
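To make the "squash sections of a frame into one scan" idea concrete, here is a toy sketch. It is not tdf2mzml's actual algorithm: the fixed window width, the function name `squash_frame`, and the peak tuples are all made up for illustration. It only shows how one exported spectrum can stand for a range of tims scans within a frame.

```python
def squash_frame(frame_id, peaks, window=50):
    """Merge the peaks of a frame into one spectrum per window of tims scans.

    peaks: iterable of (tims_scan, mz, intensity) tuples.
    Returns a list of dicts, one per non-empty window, ordered by scan range.
    """
    groups = {}
    for tims_scan, mz, intensity in peaks:
        start = (tims_scan // window) * window
        groups.setdefault(start, []).append((mz, intensity))

    return [
        {
            "frame": frame_id,
            "scan_range": (start, start + window - 1),
            "peaks": sorted(groups[start]),  # sorted by m/z
        }
        for start in sorted(groups)
    ]

# Hypothetical peaks from a single frame.
frame_peaks = [(200, 695.33, 120.0), (210, 695.32, 80.0),
               (249, 401.20, 15.0), (250, 512.77, 40.0)]

# Exported "scan numbers" are just a running index over the squashed spectra.
for new_scan, spectrum in enumerate(squash_frame(1, frame_peaks), start=1):
    print(new_scan, spectrum["frame"], spectrum["scan_range"], len(spectrum["peaks"]))
```

Under this toy scheme "scan=1" means "frame 1, tims scans 200-249", so a different tool that numbers the raw scans or uses different windows would report an unrelated number for the same data.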

so .... I have no idea :P I would need to explore a bit more what the numbers are.
Some things I would like to know:

  1. Are the index numbers contiguous? (in mq/qms, do all scans from 1-N exist? or are there steps like 1,53,134,...N)
  2. What settings were used in tdf2mzml?
  3. What were your acquisition parameters?
  4. How many scans do you have between every ms1 scan in the derived mzml?
  5. How does the IMS section look in the .mzml ?
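Questions 1 and 4 can be checked with a few lines once the scan indices and MS levels have been pulled out of the mzML or the MQ tables. This is a minimal sketch on plain lists; the function names are made up, and how the lists are extracted from the files is left out.

```python
def index_gaps(indices):
    """Return (previous, next) pairs wherever consecutive indices are not adjacent."""
    ordered = sorted(set(indices))
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > 1]

def ms2_between_ms1(ms_levels):
    """Count the non-MS1 spectra that follow each MS1 spectrum, in file order."""
    counts, current = [], None
    for level in ms_levels:
        if level == 1:
            if current is not None:
                counts.append(current)
            current = 0
        elif current is not None:
            current += 1
    if current is not None:
        counts.append(current)
    return counts

print(index_gaps([1, 2, 3, 53, 134]))              # [(3, 53), (53, 134)]
print(ms2_between_ms1([1, 2, 2, 2, 1, 2, 2, 1]))   # [3, 2, 0]
```

An empty result from `index_gaps` means every index from the first to the last exists; the counts from `ms2_between_ms1` give a rough idea of how many derived MS2 spectra each cycle produced.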

@jonasscheid
Contributor

Can you take the mzML from tdf2mzml and run it with MQ? Then compare the results, and you should see whether it's a tdf-handling problem or a pipeline problem.

@jspaezp
Contributor

jspaezp commented Nov 21, 2024

I don't think there is a real problem; there is just a lack of consensus on what the "scan number" should mean, because a scan in the mzML is not the same as a scan in the .d.

Having said that ... I do think it's a good idea to run the same MQ analysis on both the .d and the .mzML to get some idea of how the mappings compare.

@daichengxin
Collaborator Author

I tried running MQ on the mzML converted by tdf2mzml, but an error was reported, even though I set the instrument to Bruker TIMS. It looks like the field MS:1000505 is not recognised.

start	22/11/2024 19:58:44
title	Assemble_run_info (1/1)
description	E:\MSNet\Bat\Bat_MBat_20fraction_DDA_Data\P0031_TOF1_DDA_20241010_Bat10_30_200ng_149min_RA1_1_8030.mzML
error	E:\MSNet\Bat\Bat_MBat_20fraction_DDA_Data\P0031_TOF1_DDA_20241010_Bat10_30_200ng_149min_RA1_1_8030.mzML_The given key 'MS:1000505' was not present in the dictionary._   at System.Collections.Generic.Dictionary`2.get_Item(TKey key)__   at PluginRawMzMl.MzMLRawFile.GetInfoForScanNumber(Int32 scanNumber) in C:\Users\bi\source\repos\net7\net\PluginRawMzMl\MzMLRawFile.cs:line 392__   at MqUtil.Ms.Raw.RawFile.InitFromRawFileImpl()__   at MqUtil.Ms.Raw.RawFile.InitFromRawFile()__   at MqUtil.Ms.Raw.RawFile.Init(String path1)__   at MqUtil.Ms.Raw.RawFileUtil.CreateRawFile(String path)__   at MaxQuantLibS.Domains.Peptides.Features.RunInfo.AssembleRunInfo(String mqparFile, Int32 fileIndex) in C:\Users\bi\source\repos\net7\net\MaxQuantLibS\Domains\Peptides\Features\RunInfo.cs:line 42__   at MaxQuantLibS.Domains.Peptides.Work.AssembleRunInfo.Calculation(String[] args, Responder responder) in C:\Users\bi\source\repos\net7\net\MaxQuantLibS\Domains\Peptides\Work\AssembleRunInfo.cs:line 17__   at MaxQuantLibS.Domains.Peptides.Work.MaxQuantWorkDispatcherUtil.PerformTask(Int32 taskType, String[] args, Responder responder) in C:\Users\bi\source\repos\net7\net\MaxQuantLibS\Domains\Peptides\Work\MaxQuantWorkDispatcherUtil.cs:line 7__   at MaxQuantLibS.Base.MaxQuantUtils.Run(Int32 softwareId, Int32 taskType, String[] args, Responder responder) in C:\Users\bi\source\repos\net7\net\MaxQuantLibS\Base\MaxQuantUtils.cs:line 275__   at MaxQuantTask.Program.Function(String[] args, Responder responder) in C:\Users\bi\source\repos\net7\net\MaxQuantTask\Program.cs:line 17__   at MqUtil.Util.ExternalProcess.Run(String[] args, Boolean debug)
end	22/11/2024 19:58:46
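The message "The given key 'MS:1000505' was not present in the dictionary" suggests that MQ looked up that accession for a spectrum and did not find it. As far as I know, MS:1000505 is "base peak intensity" in the PSI-MS controlled vocabulary. A quick way to check whether the converted mzML carries it is to list the spectra that lack the cvParam. This is a sketch using only the standard library; the function name is made up, and it does not resolve cvParams that are pulled in through `referenceableParamGroupRef`.

```python
import io
import xml.etree.ElementTree as ET

MZML_NS = "{http://psi.hupo.org/ms/mzml}"

def spectra_missing_cv(source, accession="MS:1000505"):
    """Return the ids of spectra that have no cvParam with the given accession.

    source: a file name or a binary file object containing mzML.
    """
    missing = []
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == MZML_NS + "spectrum":
            found = {cv.get("accession") for cv in elem.iter(MZML_NS + "cvParam")}
            if accession not in found:
                missing.append(elem.get("id"))
            elem.clear()  # drop the parsed children to limit memory use
    return missing

# Tiny made-up document, just to show the call.
example = b"""<mzML xmlns="http://psi.hupo.org/ms/mzml"><run><spectrumList>
<spectrum id="scan=1"><cvParam accession="MS:1000505" value="1"/></spectrum>
<spectrum id="scan=2"><cvParam accession="MS:1000511" value="2"/></spectrum>
</spectrumList></run></mzML>"""

print(spectra_missing_cv(io.BytesIO(example)))  # ['scan=2']
```

If every spectrum in the converted file shows up in the result, the converter probably does not write that term at all, which would point to the mzML content rather than the MQ instrument setting.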

6 participants