Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for configuring Gelf encoders in Monolog configuration #492

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

Tib-z
Copy link

@Tib-z Tib-z commented Nov 3, 2024

This commit introduces the ability to specify custom encoders (json and compressed_json) for Gelf publishers in Monolog's configuration.

Overview
This PR introduces an enhancement to the Symfony Monolog Bundle by adding a configuration option for setting the message encoder for the GELF handler's publisher. This change addresses a backward compatibility issue introduced in the 2.0 update of the graylog2/gelf-php package, which shifted the default message encoder from CompressedJsonEncoder to JsonEncoder. Thus an upgrade silently breakes the logging.

bzikarsky/gelf-php@f2625cf#diff-9520be6ea98f5ded11cbe62c500f1a8fef819810d8836d2b15189cdbc073900cR32
bzikarsky/gelf-php@f2625cf#diff-72a9d13c23b7d615833219f0dfc92be317d9a4981de7b4747e287db62aaa1079L69

In a Symfony project, to override this default behavior, users had to manually construct and define the transport, publisher, and encoder as a service and then use it in the Monolog configuration. This approach, while functional, is cumbersome. It would be more user-friendly to enable setting the encoder via a simple configuration option. The proposed change would be:

            gelf:
                type: gelf
                level: debug
                publisher:
                    hostname: '%env(resolve:LOGSTASH_HOSTNAME)%'
                    port: '%env(resolve:LOGSTASH_PORT)%'
                    encoder: compressed_json  # Optional: 'json' or 'compressed_json'

This new encoder configuration is optional and does not have an explicit default value, thereby preserving the existing behavior (albeit the recent default may not align with prior expectations).

While it might be tempting to expose more variability (like using class names or service IDs directly), overly customizable options may not add practical value and could complicate the configuration unnecessarily, 'coz YAGNI :)

Tibor Rac added 2 commits November 3, 2024 11:39
This commit introduces the ability to specify custom encoders (`json` and `compressed_json`) for Gelf publishers in Monolog's configuration.
…ExistsCheck

Replaced instances of \stdClass in class_alias calls with DummyClassForClassExistsCheck as before php 8.3 only user created class can be aliased.
@Tib-z
Copy link
Author

Tib-z commented Nov 3, 2024

Well class_alias(\stdClass::class, 'Gelf\Transport\UdpTransport'); in the tests obviously does not work in <8.3 😓
Generally, workarounds involving class existence checks are hacky, but I wanted to ensure the tests provided meaningful coverage.

To address the compatibility issue, I've added a simple dummy class as a workaround. Another option could be creating the class dynamically with eval, though that also smells... A potential refactor might involve wrapping class_exists checks and mocking them, but I'm unsure if the benefit would justify the effort.

Let me know your thoughts

Copy link
Member

@derrabus derrabus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your PR. Can you elaborate a bit more why I should care about which encoder is used? If the protocol allows both and thus a backend implementing the protocol should support both, I don't understand why it makes a difference for my app.

I also wonder how common your problem is. If this only affects a handful of projects, we could recommend to configure the publisher service manually and configure a service ID instead of connection details.

DependencyInjection/Configuration.php Outdated Show resolved Hide resolved
Comment on lines +54 to +58
* - id: string, service id of a publisher implementation, optional if hostname is given
* - hostname: string, optional if id is given
* - [port]: int, defaults to 12201
* - [chunk_size]: int, defaults to 1420
* - [encoder]: string, its value can be 'json' or 'compressed_json'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before your change, it was clear that I should either provide a service ID or configure the connection by specifying the hostname, port etc. The host name is not "optional if id is given" as you wrote, it's ignored entirely if a service ID has been configured.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I like how it was before, since it is clear and sound, however I it lacks details about the options (type, enum options, defaults, etc) and I wanted to provide more verbose information especially for the new option as it is has predefined values. btw I tried to keep consistent with the other docs, this comment is reused from the mongo section. 😃

Would you add comment just like you said "ignored when service ID has been configured" to all of these options?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

 * - publisher: (one of the following configurations)
 *      # Option 1: Service-based configuration
 *      - id: string, service id of a publisher implementation
 *      
 *      # Option 2: Direct connection configuration
 *      - hostname: string, server hostname
 *      - [port]: int, server port (default: 12201)
 *      - [chunk_size]: int, UDP packet size (default: 1420)
 *      - [encoder]: string, encoding format ('json' or 'compressed_json')

WDYT about this approach?

@Tib-z
Copy link
Author

Tib-z commented Nov 4, 2024

In our infra, in the logstash pipeline comporessed json is a contraint, does not work without comporession, so the change in the gelf package broke the logging. Likely it's a specific setup, not sure exactly how is configured, I'll try to figure out (it's not maintained by me/my team) and see if it could affect others as well.

As the encoder is a dependency and straightforward to change it, it was logical move to expose it as a monolog config option and make the implementation easier. In a microservice architecture with 30-40 services, this approach significantly reduces the need for repetitive boilerplate service definitions in favor of a single configuration option. That said, I understand your point about using the service ID workaround as an alternative.

@derrabus
Copy link
Member

derrabus commented Nov 4, 2024

That makes perfect sense. Thank you for the explanation.

public function testExceptionWhenUsingLegacyGelfImplementationWithUnsupportedEncoder(): void
{
if (!class_exists('Gelf\MessagePublisher')) {
class_alias(DummyClassForClassExistsCheck::class, 'Gelf\MessagePublisher');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing such class aliases is a hack, and won't work properly. If the testExceptionWhenUsingGelfWithInvalidEncoder test is run first, this test will never reach the code path guarded by class_exists('Gelf\MessagePublisher') (because the actual guard to reach that code path is !class_exists('Gelf\Transport\UdpTransport') && class_exists('Gelf\MessagePublisher') and you cannot remove the other alias).

Tests should be written against the installed dependencies, without such hacks, and skipped in case they are not relevant.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback; I completely agree. While this approach is hackish, it did feel wrong that the changes couldn’t be properly tested. I initially used these as part of my testing process and I left them in the PR...

As I mentioned in my previous comment:

Generally, workarounds involving class existence checks are hacky, but I wanted to ensure the tests provided meaningful coverage.

I'll remove them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants