Semi Successful test for USCS cat 4 to refseq ids #10

Open
childers opened this issue Mar 22, 2017 · 3 comments

Comments

@childers
Collaborator

$ time  seqconv convert --ref felCat4 --out rs cat_felCat4_UCSC_2008.gtf >test_cat_4.rs.gtf
Converting from None to rsStarting Conversion
Cannot convert id: chrM
No corresponding id for chrX from None
FORMAT detected: uc
real	0m24.379s
user	0m1.858s
sys	0m0.188s

@childers childers changed the title Semi Successfule test for USCS cat 4 to refseq ids Semi Successful test for USCS cat 4 to refseq ids Mar 22, 2017
@childers
Collaborator Author

The resulting GTF contains only a link to the assembly report, not the converted records:

$ wc -l cat_felCat4_UCSC_2008.gtf 
    1000 cat_felCat4_UCSC_2008.gtf
$ wc -l test_cat_4.rs.gtf 
       1 test_cat_4.rs.gtf

$ cat test_cat_4.rs.gtf 
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/003/115/GCA_000003115.1_catChrV17e/GCA_000003115.1_catChrV17e_assembly_report.txt


@childers
Collaborator Author

The assembly_report file lists the chromosome as 'X' in its first column, while the UCSC GTF file uses 'chrX'. The 'chr'-prefixed name appears only in the last column of the report.
Assembly_report

F1      assembled-molecule      F1      Chromosome      CM000711.1      <>      na      Primary Assembly        92851383        chrF1
F2      assembled-molecule      F2      Chromosome      CM000712.1      <>      na      Primary Assembly        81418843        chrF2
X       assembled-molecule      X       Chromosome      CM000713.1      <>      na      Primary Assembly        145558876       chrX
chrUn1_1        unplaced-scaffold       na      na      ACBE01511744.1  <>      na      Primary Assembly        3005    chrUn_ACBE01511744
chrUn1_3106     unplaced-scaffold       na      na      ACBE01511745.1  <>      na      Primary Assembly        3953    chrUn_ACBE01511745
chrUn1_7159     unplaced-scaffold       na      na      ACBE01511746.1  <>      na      Primary Assembly        1488    chrUn_ACBE01511746
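
A minimal sketch of how rows like these could be indexed by their last column instead of the first. It assumes the usual NCBI assembly-report layout (tab-separated, GenBank accession in column 5, RefSeq accession in column 7, UCSC-style name in column 10). The function name `index_by_ucsc_name` is hypothetical and is not part of seqconv.

```python
# Column positions (0-based), assumed from the NCBI assembly-report layout.
GENBANK_COL = 4
REFSEQ_COL = 6
UCSC_COL = 9


def index_by_ucsc_name(report_text):
    """Map UCSC-style names (e.g. 'chrX') to (GenBank, RefSeq) accessions.

    A RefSeq value of 'na' in the report is stored as None.
    """
    index = {}
    for line in report_text.splitlines():
        # Skip blank lines and the '#'-prefixed header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) <= UCSC_COL:
            continue  # malformed or truncated row
        ucsc = fields[UCSC_COL].strip()
        if ucsc == "na":
            continue  # no UCSC-style name to key on
        refseq = fields[REFSEQ_COL].strip()
        genbank = fields[GENBANK_COL].strip()
        index[ucsc] = (genbank, None if refseq == "na" else refseq)
    return index
```

With the rows shown above, 'chrX' would resolve to the GenBank accession CM000713.1, but its RefSeq column is 'na', so a lookup by UCSC-style name alone would still leave a RefSeq target empty for this report.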

Cat GTF

$ head cat_felCat4_UCSC_2008.gtf
chrM    felCat4_gold    exon    1       17009   0.000000        +       .       gene_id "NC_001700"; transcript_id "NC_001700"; 
chrX    felCat4_gold    exon    1       3694    0.000000        +       .       gene_id "ACBE01484836.1"; transcript_id "ACBE01484836.1"; 
chrX    felCat4_gold    exon    16589   17861   0.000000        -       .       gene_id "ACBE01484837.1"; transcript_id "ACBE01484837.1"; 
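
One possible way to handle names that have no counterpart, sketched here as an assumption rather than as seqconv's actual behavior: rewrite column 1 where a mapping exists, pass other records through unchanged, and report the unmapped names once at the end. `convert_seqnames` and the `mapping` dictionary are hypothetical.

```python
def convert_seqnames(gtf_lines, mapping):
    """Rewrite the first GTF column using `mapping`.

    Returns (converted_lines, missing_names). Lines whose sequence name
    has no entry in `mapping` are kept unchanged, and the name is
    collected so the caller can report it.
    """
    converted = []
    missing = set()
    for line in gtf_lines:
        # Leave blank lines and comment lines untouched.
        if not line.strip() or line.startswith("#"):
            converted.append(line)
            continue
        seqname, sep, rest = line.partition("\t")
        if seqname in mapping:
            converted.append(mapping[seqname] + sep + rest)
        else:
            missing.add(seqname)
            converted.append(line)
    return converted, sorted(missing)
```

Keeping unmapped records is a design choice made for this sketch. Dropping them, or aborting on the first unknown name, would be equally defensible depending on how strict the conversion should be.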

@childers
Collaborator Author

@guilhemfaure How should we handle this case?
