This repository has been archived by the owner on Oct 2, 2023. It is now read-only.

to discuss: logging #12

Open
simonwoerpel opened this issue Dec 14, 2022 · 1 comment
Labels: question (Further information is requested)

Comments

@simonwoerpel
Contributor

Some parsers log progress (e.g. every X parsed rows) and information about the files they parse; others log nothing at all. We should define one pattern and apply it to all parsers.

Suggestion (as seen in the corpwatch parser):

  • log the file name when parsing of a specific file starts
  • then log progress every 100,000 rows/records
  • log the final record count for this file once it is finished
  • repeat for the next file

This makes it possible to track how source file sizes develop over time and to spot data source errors.
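
The pattern above could be sketched as a small generator that wraps any row iterator. This is only an illustration, not the corpwatch parser's actual code: the function name `logged_rows`, the logger name `parsers`, and the message wording are all assumptions.

```python
import logging
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")

# Hypothetical shared logger name; each parser could also use its own.
log = logging.getLogger("parsers")


def logged_rows(
    file_name: str, rows: Iterable[T], every: int = 100_000
) -> Iterator[T]:
    """Yield rows unchanged while logging start, progress, and final count."""
    log.info("Parsing file: %s", file_name)
    count = 0  # stays 0 if the file yields no rows at all
    for count, row in enumerate(rows, start=1):
        if count % every == 0:
            log.info("[%s] parsed %d rows ...", file_name, count)
        yield row
    log.info("[%s] finished: %d rows", file_name, count)
```

A parser would then iterate with `for row in logged_rows(path, reader): ...` instead of looping over `reader` directly, so every parser emits the same three kinds of messages without repeating the counting logic.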
simonwoerpel added the question label on Dec 14, 2022
@pudo
Contributor

pudo commented Dec 19, 2022

One aspect to consider here: GitHub Actions has a tendency to kill jobs that don't print any output for an extended period, so we should output something every now and then.
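
One way to guarantee periodic output regardless of row counts would be a time-based heartbeat alongside the row-based progress messages. The following is a rough sketch under assumptions: the name `with_heartbeat`, the logger name, and the 60-second default are made up for illustration, and no specific GitHub Actions timeout is implied. It also only helps while rows keep arriving; a single step that blocks for a long time would still be silent.

```python
import logging
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")

# Hypothetical logger name, chosen for this example only.
log = logging.getLogger("parsers.heartbeat")


def with_heartbeat(
    rows: Iterable[T],
    interval: float = 60.0,  # assumed value, not a documented limit
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[T]:
    """Yield rows unchanged, logging once `interval` seconds have passed
    since the last heartbeat message."""
    last = clock()
    for count, row in enumerate(rows, start=1):
        now = clock()
        if now - last >= interval:
            log.info("still running: %d rows so far", count)
            last = now
        yield row
```

The `clock` parameter exists only so the timing can be replaced in tests; in normal use the default monotonic clock is enough.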
