Replies: 3 comments
-
Hmm .. not sure if it is a good idea to mix aggregation and filtering. I am not a big fan of filtering flows while collecting - however, I see the point of dropping unwanted stuff. I'd rather do it in post-processing. First of all, I need to update the filter engine to work on the new record format without unpacking the record first; it's a question of efficiency and performance. A possible solution I would see is to extend the filter with a function to zero out elements, along the lines of the sketch below.
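(Hypothetical syntax only - no such zero() function exists in the nfdump filter language today; it is just an illustration of the idea:)
dst port 53 zero(srcport)   # match DNS flows, but wipe the ephemeral source port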
-
Maybe my description was not exactly clear... Unfortunately filtering will filter out (hide) flows, while I mean to aggregate. For instance, I have a DNS server running on port 53 and get 10000 requests from a single host - each of them is recorded as a separate flow, while with port aggregation it would be just one record, since srcport is set to 0 wherever dstport is 53. This way I can still see all traffic (volume) from a single IP, but without the need to keep all flows - filtering wouldn't let me do that in one pass. To illustrate: if I want detailed stats for DNS/HTTP/HTTPS and just totals for anything else, I have to run several passes - one filtered pass per service, plus one more for the remainder (see the sketch after the next paragraph).
With port aggregation this could be done in just one pass, producing at most 3 records per srcip/dstip/protocol. The filtering rule you suggest would solve this, of course. As to filtering while collecting - it is a great way to reduce noise: during scans/DoS I can easily get a few million flows/sec, and storing all of this on disk becomes a challenge, especially if it is a RAM disk. But without the option to change the filtering rules dynamically (without restarting the collector), this has little application.
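As a rough sketch of the difference (the file name is hypothetical, and the last command uses the proposed ports: key, which does not exist in nfdump yet):
nfdump -r nfcapd.current -A srcip,dstip,proto 'dst port 53'    # pass 1: DNS detail
nfdump -r nfcapd.current -A srcip,dstip,proto 'dst port 80'    # pass 2: HTTP detail
nfdump -r nfcapd.current -A srcip,dstip,proto 'dst port 443'   # pass 3: HTTPS detail
nfdump -r nfcapd.current -A srcip,dstip,proto 'not (dst port 53 or dst port 80 or dst port 443)'   # pass 4: totals
versus a single pass with the proposed syntax:
nfdump -r nfcapd.current -A srcip,dstip,proto,ports:53:80:443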
-
And while we are talking about zeroing-while-filtering... probably a slightly more generic approach would allow aggregating on src/dst ip/net as well - also quite useful if you have a big network and want better reports, for instance with a rule set like the sketch below:
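(Hypothetical rule syntax, purely illustrative - nothing like this exists in nfdump today:)
aggregate src net 10.1.0.0/16 as 10.1.0.0   # keep the interesting subnet visible
aggregate src net 10.0.0.0/8  as 10.0.0.0   # collapse the rest of the internal space
aggregate src any             as 0.0.0.0    # hide all external traffic behind a single IP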
If we assume that "first match wins," this method helps to create nice reports by keeping the interesting parts separate and hiding everything else (like all external traffic) behind a single IP. While all this can easily be done in post-processing, that usually involves unnecessary work - nfdump converts the binary data into text, then the post-processor parses it back into binary form to apply additional filtering/aggregation - and unless this is done in C/C++, it can be a bit slow (see the rough pipeline sketch below). Certainly adding this to the filter language is a bit of work (and may even be confusing if implemented). Probably dedicated aggregation rules with a simpler syntax would be a better approach.
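For reference, a typical post-processing pipeline today looks roughly like this (the file name and CSV column positions are assumptions - verify them against the header line that nfdump -o csv prints for your version):
nfdump -r nfcapd.current -q -o csv 'proto udp' | \
    awk -F, '{ bytes[$4] += $13 } END { for (ip in bytes) print ip, bytes[ip] }'
# assumed layout: $4 = source address, $13 = bytes; every record is rendered to
# text and parsed again, which is exactly the overhead described above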
-
Busy servers (DNS and web in particular) produce a lot of flows - literally every new request creates a new flow, and the volume can be really huge. When we are not interested in such detail, it makes sense to keep per-port information only for specific ports and aggregate everything else.
The idea is simple: allow aggregation by port numbers, like:
-A srcip,dstip,proto,ports:80:443:900-999
With such a specification, for all protocols that have port numbers, any port number not matching the specification is zeroed - in srcport and/or dstport.
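A worked example of the intended effect (addresses are made up; the option itself does not exist yet):
before: 10.0.0.1:51515 -> 192.0.2.10:443   (two raw flows, differing only in the
        10.0.0.1:51516 -> 192.0.2.10:443    ephemeral source port)
after:  10.0.0.1:0     -> 192.0.2.10:443   (one aggregated record: 51515/51516 match
                                            nothing in 80,443,900-999 and are zeroed;
                                            443 matches and is kept)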
Use case: this significantly reduces the number of records produced by aggregation when we want somewhat detailed stats for a specific application port but don't really care about the details for other ports.
It could even be done at collection time, so we don't have to store all the irrelevant data in the first place (unfortunately, not all collectors allow filtering or aggregation).
Does it sound like a good idea? :)