I took the code from a previous thread, #82, and now wish to modify it so that I can assign a value to the previously empty 'score' field. I have some datasets from GEO that have a number in the 4th column, which I believe corresponds to the score rather than the name of the feature. I have reflected this format in creating "b_test.bed" in the code below. I would like to take that 'feature.name' and make it the 'feature.score', and then rename 'feature.name' as I was doing in the previous issue.
I have created a function 'is_number()' to determine if the given string is a number.
When I run this code I receive a segmentation fault, which is also what occurs if I simply try to set feature.score equal to a numerical string (such as '1').
Do you have any insight into this problem? Your time and effort are appreciated.
import pybedtools

# Create two example files so that this example is self-contained.
pybedtools.BedTool('''
chr2L 20 30 TestD1 23.23
chr2L 45 60 TestD2 24.24''', from_string=True).saveas('a_test.bed')

pybedtools.BedTool('''
chr3L 500 600 11.11
chr3L 900 1000 12.12''', from_string=True).saveas('b_test.bed')


def gen_change_name(fn, GSM):
    '''
    This generator accepts a filename and a string, and yields
    pybedtools.Interval features with names changed according to `GSM` and line
    number.
    '''
    for i, feature in enumerate(pybedtools.BedTool(fn)):
        print feature.name
        print feature.score
        if is_number(feature.name):
            if feature.score == '':
                feature.score = feature.name
                print feature.score
        feature.name = GSM + '_{0}'.format(i + 1)
        yield feature


def change_name(fn):
    '''
    This function accepts a filename and creates a new file with changed names.
    It returns a BedTool pointing to this new file.
    '''
    GSM = fn.split('_')[0]

    # This is the key: BedTool objects can be created from a generator of
    # pybedtools.Interval objects...which is what gen_change_name was designed
    # to do.
    return pybedtools.BedTool(gen_change_name(fn, GSM))\
        .saveas(fn + '.changed')


def is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False


for fn in ('a_test.bed', 'b_test.bed'):
    original = pybedtools.BedTool(fn)
    changed = change_name(fn)
    print 'original', fn
    print original
    print 'changed', fn
    print changed
Thanks for pointing this out. This happens because the example BED file (saved as b_test.bed) is BED4 format, and pybedtools allocates memory for a 4-item list. One solution is to use pybedtools.featurefuncs.extend_fields to pad it out to 5 fields, and then it should work as expected. This example demonstrates:
import pybedtools

b = pybedtools.BedTool('''
chr3L 500 600 11.11
chr3L 900 1000 12.12''', from_string=True).saveas('b_test.bed')

# Just get one interval to work with...
i = b[0]

# The following results in a segfault -- `i` does not have enough fields allocated
# because the format is BED4:
# i.score = i.name

# Solution: extend to BED5
from pybedtools.featurefuncs import extend_fields
i = extend_fields(i, 5)
i.score = i.name
print i
I'll play around with the cbedtools.Interval.score setter method to get it to allocate enough room in the list (same with the name and strand setters) so that things work as expected without the extend_fields workaround.
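For anyone who wants to see the padding idea in isolation, here is a minimal sketch of the same BED4-to-BED5 conversion on plain whitespace-separated lines, without pybedtools. It only illustrates the logic and is not how pybedtools handles it internally. `promote_name_to_score` and `convert_lines` are hypothetical helper names made up for this example; `is_number` is the check from the original post. It assumes every record has at least four fields.

```python
def is_number(s):
    """Return True if `s` parses as a float (same check as in the original post)."""
    try:
        float(s)
        return True
    except ValueError:
        return False


def promote_name_to_score(fields, new_name):
    """
    Hypothetical helper, not part of pybedtools.

    Takes one BED record as a list of strings (assumed to have at least four
    fields). If the record is BED4 and its 4th field is numeric, pad it to
    five fields so the number lands in the score column. The name column is
    always replaced with `new_name`.
    """
    fields = list(fields)  # work on a copy, leave the caller's list alone
    if len(fields) == 4 and is_number(fields[3]):
        fields.append(fields[3])  # pad to BED5: old 4th column becomes the score
    fields[3] = new_name
    return fields


def convert_lines(lines, gsm):
    """Yield tab-separated records with features named gsm_1, gsm_2, ..."""
    records = (line.split() for line in lines if line.strip())
    for i, fields in enumerate(records):
        new_name = '{0}_{1}'.format(gsm, i + 1)
        yield '\t'.join(promote_name_to_score(fields, new_name))


# Same records as b_test.bed in the original post.
b_lines = ['chr3L 500 600 11.11', 'chr3L 900 1000 12.12']
for line in convert_lines(b_lines, 'b'):
    print(line)
```

The key step is growing the record to five fields before touching the score column, which is what extend_fields(i, 5) does for a pybedtools Interval in the example above.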