
Why does iterrowmapmany convert each row to a Record instance? #623

Open
jossefaz opened this issue Jun 21, 2022 · 2 comments

jossefaz commented Jun 21, 2022

In this method:

def iterrowmapmany(source, rowgenerator, header, failonerror):

Each row is converted to a Record instance.

it = (Record(row, flds) for row in it)
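(For context, here is roughly what that wrapping looks like, if I read the code correctly; the field names below are made up, not from my real table.)

```python
# Rough sketch of the conversion quoted above (made-up field names).
from petl.util.base import Record

flds = ['id', 'sex', 'age']
row = (1, 'male', 16)
rec = Record(row, flds)  # the raw row is wrapped together with the header fields
```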

In my use case, my "rowgenerator" helper function needs a named tuple rather than a plain row as input. It is much more convenient to access named attributes than to use unclear index notation like row[5].
For that purpose I tried to use rowmapmany this way:

etl.rowmapmany(etl.namedtuples(my_table), rowgenerator=mapper, header=headers)

I thought that using namedtuples would solve my issue, because my rows have more than 100 columns, so it is hard to use indexes (e.g. row[57]) where a named tuple could simply give me the convenience of row.my_target_attribute.

But because of this conversion to a Record instance, each namedtuple from the input is converted back to a plain row of values, which is a bit frustrating since it forces us to use index notation in the mapper function (very hard to read).
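To make it concrete, here is roughly the kind of mapper I am writing (the field names are made up, my real table has 100+ columns):

```python
from collections import namedtuple

# Made-up fields; the real table has far more columns.
MyRow = namedtuple('MyRow', ['id', 'sex', 'age'])

def mapper(row):
    # Readable: access by name...
    yield [row.id, 'age_months', row.age * 12]
    # ...instead of having to remember that column 2 is the age
    # and writing row[2] * 12.
```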

When I remove this line

it = (Record(row, flds) for row in it)

It works like a charm...
Why is this Record conversion important?
If it is not, could we remove it from the iterrowmapmany method?

Please help 🙏


jossefaz commented Jun 21, 2022

Another reason not to convert to a Record: using namedtuples as input for the row mapper frees us from any binding to field order, since attributes in the mapper are accessed by name and not by position.

So no matter what the field order of the input source is, the mapper will work as expected, even if the field order changes between two inputs that share the same output target.
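For example (with made-up field names), the same mapper keeps working when the column order changes, as long as access is by name:

```python
from collections import namedtuple

# Two hypothetical sources with the same fields in a different order.
RowA = namedtuple('RowA', ['id', 'sex', 'age'])
RowB = namedtuple('RowB', ['age', 'id', 'sex'])

def mapper(row):
    # Access by name, so the position of 'age' in the source does not matter.
    yield [row.id, 'age_months', row.age * 12]

print(list(mapper(RowA(1, 'male', 16))))  # [[1, 'age_months', 192]]
print(list(mapper(RowB(16, 1, 'male'))))  # same output
```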

@bmaggard (Contributor)

https://petl.readthedocs.io/en/latest/util.html#petl.util.base.records

"a record is a hybrid object supporting all possible ways of accessing values."

The examples for rowmapmany demonstrate this:

https://petl.readthedocs.io/en/latest/transform.html#petl.transform.maps.rowmapmany
```python
>>> def rowgenerator(row):
...     transmf = {'male': 'M', 'female': 'F'}
...     yield [row[0], 'gender',
...            transmf[row['sex']] if row['sex'] in transmf else None]
...     yield [row[0], 'age_months', row.age * 12]
...     yield [row[0], 'bmi', row.height / row.weight ** 2]
...
>>> table2 = etl.rowmapmany(table1, rowgenerator,
...                         header=['subject_id', 'variable', 'value'])
```
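Note that the generator above already mixes index access (row[0]), key access (row['sex']) and attribute access (row.age) on the same record. A quick sketch of the same idea on a made-up table, along the lines of the records docs:

```python
>>> import petl as etl
>>> table = [['foo', 'bar'], ['a', 1], ['b', 2]]
>>> for rec in etl.records(table):
...     print(rec[0], rec['foo'], rec.foo)  # index, key and attribute access
...
a a a
b b b
```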
