Safer linking from SRA to BioSample
A strange issue with NCBI was causing the wrong BioSamples to be linked. This has been resolved now.
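
The code comment added in `app/models/genome/external_resources.rb` suspects that the E-Utils backend drops the alphabetic prefix of an accession and looks up only the remaining number. A minimal sketch of that suspected behavior follows. It is an assumption taken from the comment, not confirmed NCBI code, and `numeric_part` is a hypothetical helper. It shows why an SRS accession could collide with an unrelated SAMN accession:

```ruby
# Hypothetical model of the lookup the commit comment suspects: the
# alphabetic prefix is dropped and only the digits are used as the key.
def numeric_part(accession)
  accession.sub(/\A[A-Za-z]+/, '')
end

sra_sample = 'SRS22988103'  # accession sent to db=biosample
wrong_hit  = 'SAMN22988103' # record the comment reports getting back

# Both accessions reduce to the same number, so they become
# indistinguishable once the prefix is gone.
collision = numeric_part(sra_sample) == numeric_part(wrong_hit)
puts collision
```

If the hypothesis holds, querying with the BioSample accession itself avoids the ambiguity, which is the approach this commit takes.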
lmrodriguezr committed Nov 4, 2024
1 parent ec1fe75 commit cc7b358
Showing 3 changed files with 33 additions and 9 deletions.
29 changes: 23 additions & 6 deletions app/models/genome/external_resources.rb
@@ -55,10 +55,27 @@ def external_sra_to_biosamples(acc)
         'XREF_LINK[DB[text() = "ENA-SAMPLE"]]/ID'
       ).map(&:text)
     elsif ng.xpath('//EXPERIMENT_SET').present?
-      ng.xpath(
-        '//EXPERIMENT_SET/EXPERIMENT/DESIGN/SAMPLE_DESCRIPTOR/' \
-        'IDENTIFIERS/PRIMARY_ID'
-      ).map(&:text)
+      # Unfortunately, we should prefer external IDs over primary IDs because
+      # NCBI E-Utils has a strange tendency to return the wrong biosample when
+      # using SRS... accessions. For example, see:
+      #
+      # - https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=biosample
+      #   &id=SRS22988103&rettype=xml&retmode=text
+      # - https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=biosample
+      #   &id=SAMN13193749&rettype=xml&retmode=text
+      #
+      # The first is using the accession SRS22988103 but it (wrongly) retrieves
+      # data for SAMN22988103 (= SRS11001113). Apparently the backend code
+      # simply strips off the alphabetic prefix and uses the numeric part
+      # without checking
+      sample_id =
+        ng.xpath(
+          '//EXPERIMENT_SET/EXPERIMENT/DESIGN/SAMPLE_DESCRIPTOR/IDENTIFIERS'
+        )
+      biosample_id =
+        sample_id.xpath('EXTERNAL_ID[@namespace="BioSample"]').map(&:text)
+      biosample_id.present? ? biosample_id :
+        sample_id.xpath('PRIMARY_ID').map(&:text)
     else
       [] # Unknown XML specification
     end
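
The preference order introduced in this hunk (BioSample external IDs first, primary IDs only as a fallback) can be tried in isolation. The sketch below is not the application's code: it swaps Nokogiri for Ruby's bundled REXML to stay dependency-free, and the XML fragment, the accessions in it, and the `sample_accessions` helper are hypothetical placeholders.

```ruby
require 'rexml/document'

# Hypothetical, trimmed-down EXPERIMENT_SET record for illustration only.
XML_WITH_EXTERNAL = <<~XML
  <EXPERIMENT_SET>
    <EXPERIMENT>
      <DESIGN>
        <SAMPLE_DESCRIPTOR>
          <IDENTIFIERS>
            <PRIMARY_ID>SRS0000001</PRIMARY_ID>
            <EXTERNAL_ID namespace="BioSample">SAMN00000001</EXTERNAL_ID>
          </IDENTIFIERS>
        </SAMPLE_DESCRIPTOR>
      </DESIGN>
    </EXPERIMENT>
  </EXPERIMENT_SET>
XML

# Same record with the EXTERNAL_ID element removed, to exercise the fallback.
XML_WITHOUT_EXTERNAL =
  XML_WITH_EXTERNAL.sub(%r{<EXTERNAL_ID.*</EXTERNAL_ID>}, '')

IDENTIFIERS_PATH =
  '//EXPERIMENT_SET/EXPERIMENT/DESIGN/SAMPLE_DESCRIPTOR/IDENTIFIERS'

# Return BioSample external IDs when any exist, else the primary IDs.
def sample_accessions(xml)
  doc = REXML::Document.new(xml)
  identifiers = REXML::XPath.match(doc, IDENTIFIERS_PATH)
  pick = lambda do |path|
    identifiers.flat_map { |node| REXML::XPath.match(node, path).map(&:text) }
  end
  external = pick.call('EXTERNAL_ID[@namespace="BioSample"]')
  external.empty? ? pick.call('PRIMARY_ID') : external
end

p sample_accessions(XML_WITH_EXTERNAL)
p sample_accessions(XML_WITHOUT_EXTERNAL)
```

As in the hunk above, the fallback applies to the whole result set: primary IDs are used only when no BioSample external ID is found in any `IDENTIFIERS` node.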
@@ -82,7 +99,7 @@ def external_biosample_hash_ebi(acc)

     ng = Nokogiri::XML(body)
     {}.tap do |hash|
-      h = {}
+      h = { api: 'EBI' }
       h[:title] = ng.xpath('//SAMPLE_SET/SAMPLE/TITLE').text
       h[:description] = ng.xpath('//SAMPLE_SET/SAMPLE/DESCRIPTION').text
       h[:attributes] = Hash[
@@ -103,7 +120,7 @@ def external_biosample_hash_ncbi(acc)

     ng = Nokogiri::XML(body)
     {}.tap do |hash|
-      h = {}
+      h = { api: 'NCBI' }
       h[:title] = ng.xpath('//BioSampleSet/BioSample/Description/Title').text
       h[:description] =
         ng.xpath('//BioSampleSet/BioSample/Description/Comment/Paragraph').text
11 changes: 9 additions & 2 deletions app/views/genomes/_samples.html.erb
@@ -108,9 +108,16 @@
 <ul>
   <% @genome.source_hash[:samples].each do |acc, sample| %>
     <% id = modal(acc, size: 'lg') do %>
-      <% if sample[:from_sra] %>
+      <% if sample[:from_sra] || sample[:api] %>
         <div class="alert alert-info">
-          Retrieved via <%= to_sentence(sample[:from_sra]) %>
+          Metadata retrieved
+          <% if sample[:api] %>
+            using the <%= sample[:api] %> API
+          <% end %>
+          <% if sample[:from_sra] %>
+            via <%= to_sentence(sample[:from_sra]) %>
+            (linked through the EBI API)
+          <% end %>
         </div>
       <% end %>
       <p class="lead mx-2">
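
To see which notices the updated template can produce, the branch logic can be mirrored in plain Ruby. This is only a rough sketch: `banner_text` is a hypothetical stand-in for the ERB markup, `join` replaces the Rails `to_sentence` helper, the run accession is a placeholder, and `sample[:from_sra]` is assumed to be an array of accessions.

```ruby
# Plain-Ruby stand-in for the ERB branch logic in _samples.html.erb.
# Returns nil when the alert box would not be rendered at all.
def banner_text(sample)
  return nil unless sample[:from_sra] || sample[:api]

  parts = ['Metadata retrieved']
  parts << "using the #{sample[:api]} API" if sample[:api]
  if sample[:from_sra]
    # Rails' to_sentence is approximated with a simple comma join here.
    parts << "via #{sample[:from_sra].join(', ')}"
    parts << '(linked through the EBI API)'
  end
  parts.join(' ')
end

puts banner_text({ api: 'NCBI' })
puts banner_text({ api: 'NCBI', from_sra: ['SRR0000001'] })
p banner_text({})
```

Because both hash builders now set `:api`, the alert should appear for any sample fetched through them, not only for those reached via SRA.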
2 changes: 1 addition & 1 deletion config/environments/development.rb
@@ -62,7 +62,7 @@
   config.file_watcher = ActiveSupport::EventedFileUpdateChecker

   # Turn off API checks for external services
-  config.bypass_external_apis = true
+  config.bypass_external_apis = !ENV['ALLOW_EXTERNAL_APIS'].present?

   # DataCite access
   config.datacite = {
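
The new development setting depends on ActiveSupport's `present?`, which is true for any non-blank string. The sketch below approximates that check in plain Ruby so the resulting values of `bypass_external_apis` can be inspected without Rails. Both helper names are hypothetical, and the environment is passed in as a hash rather than read from `ENV`.

```ruby
# Approximation of ActiveSupport's `present?` for ENV values (nil or String):
# false when the value is nil, empty, or whitespace only.
def present_value?(value)
  !value.to_s.match?(/\A[[:space:]]*\z/)
end

# Mirrors `config.bypass_external_apis = !ENV['ALLOW_EXTERNAL_APIS'].present?`
def bypass_external_apis?(env)
  !present_value?(env['ALLOW_EXTERNAL_APIS'])
end

puts bypass_external_apis?({})                                   # unset
puts bypass_external_apis?({ 'ALLOW_EXTERNAL_APIS' => '' })      # empty
puts bypass_external_apis?({ 'ALLOW_EXTERNAL_APIS' => '1' })     # set
puts bypass_external_apis?({ 'ALLOW_EXTERNAL_APIS' => 'false' }) # set, too
```

One consequence worth keeping in mind: the variable acts as a presence flag, so under this reading `ALLOW_EXTERNAL_APIS=false` still enables external API calls. Leaving the variable unset is the way to keep the bypass on.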
