Should unspecified optional strings be the empty string or NaN? #25
Comments
In general, all of '', 'nan', 'NaN', ... should be allowed as input in the dataframe and interpreted as "nothing". I am not sure why the observable name is read in as NaN here; I would have thought the column is interpreted as a string column, but apparently it is not. The question is where to put this conversion. At the moment, we perform little value checking when reading in the CSV files in petab, so one could do it in AMICI, or here, but then also for other potentially problematic columns. Preferences, @dweindl @LeonardSchmiester?
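The normalization suggested above could look roughly like the following. This is only a sketch: the helper name `normalize_optional_column` and the token set `NOTHING_TOKENS` are hypothetical and not part of petab or AMICI, and treating the literal string 'nan' as "nothing" is the assumption made in this comment, not settled behaviour.

```python
import pandas as pd

# Hypothetical set of strings treated as "nothing" (compared case-insensitively).
NOTHING_TOKENS = {"", "nan"}


def normalize_optional_column(column: pd.Series) -> pd.Series:
    """Map real NaN, '', and 'nan'/'NaN' strings to the empty string."""

    def _normalize(value):
        if pd.isna(value):
            return ""
        text = str(value)
        return "" if text.lower() in NOTHING_TOKENS else text

    return column.map(_normalize)


names = pd.Series(["a_name", float("nan"), "NaN", "nan", ""])
normalized = normalize_optional_column(names).tolist()
print(normalized)
```

Note that under this convention an observable that is genuinely named 'nan' can no longer be represented.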
The columns are correctly identified as
>>> df1.dtypes
observableId      object
observableName    object
dtype: object
>>> df3 = pd.read_csv('test_str.csv', sep='\t', dtype={'observableName': str})
>>> df3.dtypes
observableId      object
observableName    object
dtype: object
>>> df3
  observableId observableName
0         a_id         a_name
1         b_id            NaN
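The session above can be reproduced without a file on disk. The sketch below uses an in-memory table in place of `test_str.csv` (its contents are assumed from the printed dataframe) and also shows `keep_default_na=False`, a standard `pandas.read_csv` option that keeps empty fields as empty strings. Whether that option is appropriate for PEtab tables is an open question, since it also disables NaN parsing in numeric columns.

```python
import io

import pandas as pd

# In-memory stand-in for test_str.csv (contents assumed from the output above).
TSV = "observableId\tobservableName\na_id\ta_name\nb_id\t\n"

# Requesting dtype=str does not stop the empty field from being parsed as NaN.
df_default = pd.read_csv(io.StringIO(TSV), sep="\t", dtype={"observableName": str})
missing_is_nan = pd.isna(df_default.loc[1, "observableName"])

# With default NA parsing disabled, the empty field stays an empty string.
df_no_na = pd.read_csv(io.StringIO(TSV), sep="\t", keep_default_na=False)
missing_is_empty = df_no_na.loc[1, "observableName"] == ""

print(missing_is_nan, missing_is_empty)
```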
One way would be to replace
    PARAMETER_DF_OPTIONAL_COLS = [
        PARAMETER_NAME, NOMINAL_VALUE,
        INITIALIZATION_PRIOR_TYPE, INITIALIZATION_PRIOR_PARAMETERS,
        OBJECTIVE_PRIOR_TYPE, OBJECTIVE_PRIOR_PARAMETERS]
at https://github.com/PEtab-dev/PEtab/blob/master/petab/C.py#L78 with
    PARAMETER_DF_OPTIONAL_COLS_STR = [
        PARAMETER_NAME,
        INITIALIZATION_PRIOR_TYPE, INITIALIZATION_PRIOR_PARAMETERS,
        OBJECTIVE_PRIOR_TYPE, OBJECTIVE_PRIOR_PARAMETERS]
    PARAMETER_DF_OPTIONAL_COLS += [NOMINAL_VALUE]
then insert
    parameter_df[PARAMETER_DF_OPTIONAL_COLS_STR] = \
        parameter_df[PARAMETER_DF_OPTIONAL_COLS_STR].fillna('')
here. Similarly for observables/other tables.
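A self-contained sketch of the `fillna('')` idea follows. The helper `fill_optional_str_cols` and the list `OPTIONAL_STR_COLS` are hypothetical stand-ins, and the literal column names are assumptions about the values behind the petab constants; the real constants live in `petab/C.py`. Only columns that are actually present are touched, so numeric optional columns such as the nominal value keep their NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for PARAMETER_DF_OPTIONAL_COLS_STR (column names assumed).
OPTIONAL_STR_COLS = ["parameterName", "initializationPriorType"]


def fill_optional_str_cols(df: pd.DataFrame, str_cols: list) -> pd.DataFrame:
    """Return a copy of df with NaN replaced by '' in the given string columns."""
    present = [col for col in str_cols if col in df.columns]
    filled = df.copy()
    if present:
        filled[present] = filled[present].fillna("")
    return filled


parameter_df = pd.DataFrame({
    "parameterId": ["k1", "k2"],
    "parameterName": ["rate 1", np.nan],
    "nominalValue": [1.0, np.nan],
})
filled_df = fill_optional_str_cols(parameter_df, OPTIONAL_STR_COLS)
print(filled_df)
```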
One option would be to give explicit data types for each column in Otherwise, checking everything with
Fixed in AMICI-dev/AMICI@6dfac95, although I really don't think this should be fixed in AMICI. Ensuring that names are strings would be a good start and shouldn't be too much trouble.
At the moment, there can be NaNs (after pd.read_csv) in optional PEtab string columns, such as observableNames, that, if interpreted as a string, are converted to the string literal 'nan'.
An issue can occur in the AMICI plotting functions. This issue can be fixed by replacing
with
to correctly identify unspecified observable names. However, testing for the string 'nan' seems unintuitive, and this fix might cause another issue if an observable is named 'nan'.
Here's a solution, which could be implemented in PEtab, and might resolve the issue in AMICI.
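As a hedged illustration of the ambiguity described above (a sketch, not the code from the original post): converting NaN with `str` yields 'nan', which cannot be told apart from an observable that is really named 'nan'. Checking for missing values with `pd.isna` before any string conversion avoids this. The function `observable_label` is hypothetical and is not an AMICI or PEtab API.

```python
import pandas as pd


def observable_label(observable_id: str, observable_name) -> str:
    """Use the observable name for a plot label, falling back to the ID if unset."""
    # Test for missing values before converting to str; str(NaN) would be 'nan'.
    if pd.isna(observable_name) or observable_name == "":
        return observable_id
    return str(observable_name)


nan_as_text = str(float("nan"))
labels = [
    observable_label("a_id", "a_name"),
    observable_label("b_id", float("nan")),  # unspecified name
    observable_label("c_id", "nan"),         # observable genuinely named 'nan'
]
print(nan_as_text, labels)
```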