Does streamed output make sense? #7

Open
hoijui opened this issue Nov 8, 2022 · 0 comments
hoijui commented Nov 8, 2022

Basically, this is a CLI tool which:

  1. scans files,
  2. collects and filters information, and
  3. writes it to 3 output/log files (that could become a multiple of 3 later on).

The input is usually many files (let's say 100), which are scanned in sequence,
and maybe 3% of the input lines are selected and written to the 3 output files.
The question is which of these methods I should use:

  1. Bulk scan, write at the end:
    Scan and filter all input files, storing the selected data in a variable (in memory),
    and at the end write everything to the output files at once.
  2. Streaming:
    Scan the input, filter it continuously while reading, and write to the outputs as soon as something is selected.
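A minimal sketch of the two methods, assuming Python and a single output file for brevity (the filter predicate and file names here are hypothetical, not from the actual tool):

```python
def bulk_scan(input_paths, output_path, keep):
    """Method 1: collect all selected lines in memory, write once at the end."""
    selected = []
    for path in input_paths:
        with open(path) as f:
            for line in f:
                if keep(line):
                    selected.append(line)
    # Single write phase after all scanning is done.
    with open(output_path, "w") as out:
        out.writelines(selected)

def stream_scan(input_paths, output_path, keep):
    """Method 2: write each selected line as soon as it is found."""
    with open(output_path, "w") as out:
        for path in input_paths:
            with open(path) as f:
                for line in f:
                    if keep(line):
                        out.write(line)  # interleaves reads and writes
```

Both produce identical output; they differ only in peak memory use and in when the writes happen.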

I much prefer the second option: it uses less memory, and as a stream-based approach,
output can start appearing as soon as input scanning starts. The question is whether this potentially decreases overall performance, because we constantly switch between reading input and writing to one of the 3 output files.

I do not expect the higher memory usage of method 1 to ever be a problem,
and I am not sure how often the streaming approach of method 2 is really an advantage in practice.
I do know that file-system access is the main performance bottleneck of this software,
as is generally the case,
but also because the computation done here is very minimal.

Maybe I need not worry, and the OS and I/O buffering will handle the second (stream-based) method just fine?
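That is most likely the case: in Python, for example, `open()` returns a buffered writer by default, so small, frequent `write()` calls accumulate in an in-memory buffer (typically `io.DEFAULT_BUFFER_SIZE` bytes) and only occasionally trigger an actual system call, and the OS page cache buffers again below that. A small sketch (the file name is hypothetical):

```python
import io

# open() in text/write mode buffers writes by default; passing buffering
# explicitly just makes the buffer size visible here.
with open("out.log", "w", buffering=io.DEFAULT_BUFFER_SIZE) as out:
    for i in range(1000):
        # Each call lands in the in-memory buffer; the buffer is flushed
        # to the OS only when full, and fully on close.
        out.write(f"line {i}\n")
```

So with buffered writers on each of the 3 output files, the "switching" between reading and writing mostly happens in memory, not on disk.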

For now, the tool will be run at most 100 times a day, globally,
with ~1 MB of input text per run.
So it is not very critical either way,
but I have come across this question a few times already,
and would like to tackle it and be done with it.
