Bnd and mod nodes are not merged as expected in the action graph #9

Open
slegare2 opened this issue Mar 6, 2018 · 1 comment

Comments

@slegare2
Collaborator

slegare2 commented Mar 6, 2018

SH2 regions have distinct bnd nodes for each target if the anatomizer is not used.
Kinase regions have distinct mod nodes for each target, independently of anatomizer use.
Here is an example and some KamiStudio images that illustrate the issue.

import json
import pickle
from kami.entities import *
from kami.interactions import (Binding, Modification)
from kami.hierarchy import (KamiHierarchy)
from kami.resolvers.black_box import create_nuggets

inters = []

# ABL1 binds BCR on phosphorylated Y246
sh2_gene = Gene("P00519", hgnc_symbol="ABL1")
ptyr_gene = Gene("P11274", hgnc_symbol="BCR")
ptyr_resi = Residue(aa="Y", loc=246, state=State("phosphorylation", True))
b = Binding(
    [RegionActor(gene=sh2_gene, region=Region(name="SH2"))],
    [SiteActor(gene=ptyr_gene, site=Site(name="Tyr246", start=246, end=246,
                                         residues=[ptyr_resi]))],
)
inters.append(b)

# ABL1 binds BCR on phosphorylated Y279
sh2_gene = Gene("P00519", hgnc_symbol="ABL1")
ptyr_gene = Gene("P11274", hgnc_symbol="BCR")
ptyr_resi = Residue(aa="Y", loc=279, state=State("phosphorylation", True))
b = Binding(
    [RegionActor(gene=sh2_gene, region=Region(name="SH2"))],
    [SiteActor(gene=ptyr_gene, site=Site(name="Tyr279", start=279, end=279,
                                         residues=[ptyr_resi]))],
)
inters.append(b)

# ALK phosphorylates PTPN11 on Y546
kin_gene = Gene("Q9UM73", hgnc_symbol="ALK")
ptyr_gene = Gene("Q06124", hgnc_symbol="PTPN11")
ptyr_resi = Residue(aa="Y", loc=546, state=State("phosphorylation", True))
m = Modification(
    enzyme=RegionActor(gene=kin_gene, region=Region(name="Tyr_kin")),
    substrate=SiteActor(gene=ptyr_gene, site=Site(name="Tyr546", start=546, end=546)),
    mod_target=Residue(aa="Y", loc=546, state=State("phosphorylation", False)),
    mod_value=True,
)
inters.append(m)

# ALK phosphorylates PTPN11 on Y584
kin_gene = Gene("Q9UM73", hgnc_symbol="ALK")
ptyr_gene = Gene("Q06124", hgnc_symbol="PTPN11")
ptyr_resi = Residue(aa="Y", loc=584, state=State("phosphorylation", True))
m = Modification(
    enzyme=RegionActor(gene=kin_gene, region=Region(name="Tyr_kin")),
    substrate=SiteActor(gene=ptyr_gene, site=Site(name="Tyr584", start=584, end=584)),
    mod_target=Residue(aa="Y", loc=584, state=State("phosphorylation", False)),
    mod_value=True,
)
inters.append(m)

hierarchy = KamiHierarchy()
create_nuggets(inters, hierarchy, anatomize=False)

kstudio = hierarchy.get_studio_v1()
# Use a context manager so the JSON file is flushed and closed.
with open('bind-match-noanato.json', 'w') as outfile:
    json.dump(kstudio, outfile, indent=4, sort_keys=False)

[Image: actions-noanato]
[Image: actions-withanato]

@eugeniashurko
Contributor

Hi! For the moment, 'semantic tagging' (SH2, protein kinase) is performed at the anatomization stage, where I match regions from a nugget to the regions found by the anatomizer (by range, name, order, etc.). During anatomization I assign a semantic tag to each region according to its IPR id, see below:

if "IPR000719" in domain.ipr_ids:
    semantic_relations[region_id].add("protein_kinase")
    # ....
if "IPR000980" in domain.ipr_ids:
    semantic_relations[region_id].add("sh2_domain")
    # ....
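
To make the tagging step above concrete, here is a minimal, self-contained sketch of the same idea written as a lookup table. It is not the KAMI implementation: the `Domain` class, the `tag_regions` function and the region ids are hypothetical stand-ins, and only the two IPR ids and tag names come from the snippet above.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# The two mappings quoted in the snippet above; any others are elided there.
IPR_TO_TAG = {
    "IPR000719": "protein_kinase",
    "IPR000980": "sh2_domain",
}


@dataclass
class Domain:
    """Hypothetical stand-in for a domain record returned by the anatomizer."""
    name: str
    ipr_ids: list = field(default_factory=list)


def tag_regions(domains_by_region):
    """Return {region_id: set of semantic tags} derived from InterPro ids."""
    semantic_relations = defaultdict(set)
    for region_id, domain in domains_by_region.items():
        for ipr_id in domain.ipr_ids:
            tag = IPR_TO_TAG.get(ipr_id)
            if tag is not None:
                semantic_relations[region_id].add(tag)
    return dict(semantic_relations)


# Region ids and the unmapped id below are made up for illustration.
domains = {
    "ABL1_SH2": Domain("SH2", ["IPR000980"]),
    "ALK_kinase": Domain("Tyr_kinase_cat_dom", ["IPR000719", "IPR_UNMAPPED"]),
    "untagged": Domain("other", []),
}
tags = tag_regions(domains)
print(tags)
```

Because the tag depends only on the IPR id, a region that never goes through the anatomizer has no `ipr_ids` and therefore receives no tag, which is consistent with the behaviour reported when `anatomize=False`.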

Why doesn't the semantics of the mod example work even with anatomization on?
The reason is that you created a region Region(name="Tyr_kin"), and with only this information the black box has no way to know that you actually meant the region named 'Ser-Thr/Tyr_kinase_cat_dom Prot_kinase_dom Tyr_kinase_cat_dom' in the range 242-493 that was found by the anatomizer. (For example, if you specify a range, or at least call it 'Tyr_kinase', it works, because the black box checks whether the name of the region obtained from the anatomization contains the name of the region from the nugget.)
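
As an illustration of why supplying a range helps, the sketch below shows one plausible way a nugget region could be identified with an anatomizer region by residue range. This is an assumption for illustration only: `RegionSpan` and `within` are hypothetical names, and the real black box may use a different rule (overlap, name, order, or a combination).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegionSpan:
    """Hypothetical region record: a name plus an optional residue range."""
    name: str
    start: Optional[int] = None
    end: Optional[int] = None


def within(nugget, found):
    """True if the nugget region's range lies inside the anatomizer region's range."""
    if None in (nugget.start, nugget.end, found.start, found.end):
        return False  # without a range there is nothing to compare
    return found.start <= nugget.start and nugget.end <= found.end


# Name and range of the anatomizer region as quoted in the comment above.
anatomizer_region = RegionSpan(
    "Ser-Thr/Tyr_kinase_cat_dom Prot_kinase_dom Tyr_kinase_cat_dom", 242, 493
)
name_only = RegionSpan("Tyr_kin")             # what the issue's example provides
with_range = RegionSpan("Tyr_kin", 242, 493)  # same region, range supplied

print(within(name_only, anatomizer_region))
print(within(with_range, anatomizer_region))
```

Under this assumed rule, the name-only region cannot be placed at all, while the region with an explicit range can be identified regardless of how it is named.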

Could we do semantic tagging even if the anatomizer is off?
Here we have the following options:

  • In the case of the SH2 bnd, I can try to assign a semantic tag to a region if its name contains 'SH2' (which I am not sure will always be correct). I can also do the same for the protein kinase (though which name should I look for: 'protein kinase', 'Tyr_kin', 'pro kin', 'kinase'? The list could go on...), but this seems to me more of a hack than a real solution.

  • We can agree on a KAMI convention for naming regions that should be assigned a tag (for example, the name must contain the word 'kinase' or something like this).

  • The last option is to add a parameter called semantics to the Region KAMI-entity, where you can manually specify the region's semantics (from a list of reserved keywords that the black box would know).
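
The options above can be compared with a small sketch. Everything here is hypothetical: `TaggedRegion` is not KAMI's `Region`, the reserved keywords simply reuse the two tags from the anatomization snippet, and the keyword-to-tag convention is an invented example of what such an agreement could look like.

```python
from dataclasses import dataclass, field

# Reserved keywords the black box would recognise (assumed list).
RESERVED_SEMANTICS = {"protein_kinase", "sh2_domain"}

# Assumed naming convention: a word in the region name implies a tag.
NAME_CONVENTION = {"kinase": "protein_kinase", "sh2": "sh2_domain"}


@dataclass
class TaggedRegion:
    """Hypothetical Region variant with a user-supplied `semantics` field."""
    name: str
    semantics: set = field(default_factory=set)

    def __post_init__(self):
        self.semantics = set(self.semantics)
        unknown = self.semantics - RESERVED_SEMANTICS
        if unknown:
            raise ValueError(f"unknown semantics: {sorted(unknown)}")


def semantic_tags(region):
    """Explicit semantics win; otherwise fall back to the naming convention."""
    if region.semantics:
        return set(region.semantics)
    lowered = region.name.lower()
    return {tag for word, tag in NAME_CONVENTION.items() if word in lowered}


explicit = TaggedRegion("Tyr_kin", semantics={"protein_kinase"})
by_name = TaggedRegion("SH2")
ambiguous = TaggedRegion("Tyr_kin")

print(semantic_tags(explicit))
print(semantic_tags(by_name))
print(semantic_tags(ambiguous))
```

In this sketch the name-based fallback tags 'SH2' but misses 'Tyr_kin', which shows the fragility of name matching, whereas the explicit `semantics` field works for any name and rejects tags outside the reserved list.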
