Some applications are dropping trace spans with errors like "multiple errors
during transform" and "span too large to send". I suspect this may be a
consequence of a workaround in #2663 that limits MaxPacketSize to 1472 so that
Thrift packets can be sent over the wire. The affected Thrift packets are
dropped because of this error, resulting in a missing or incomplete trace in
the Jaeger UI.
I suspect this reduced size also caps the maximum payload of a single span, since the exporter doesn't appear to subdivide an oversized span any further.
Here is an example error for a span that contains some attributes and logs, but nothing excessive, IMHO:
2022/03/21 15:15:17 multiple errors during transform: span too large to send: Span({TraceIdLow:-1959931804685451170 TraceIdHigh:9142211311457415581 SpanId:3501129510158026876 ParentSpanId:-3699454375769775617 ... [very long dump truncated by me]
This was unreadable, so I pulled the source and added a JSON dump. Here's one such result from within the "multiple errors during transform" error:
span too large to send:{
"traceIdLow": 8238679333188071647,
"traceIdHigh": 6973323051039728651,
"spanId": 1844865607382183757,
"parentSpanId": -1272813624660893053,
"operationName": "github.com/mailgun/turret/v2/client/golang.(*transaction).Close",
"flags": 1,
"startTime": 1647890117472575,
"duration": 7196,
"tags": [
{
"key": "file",
"vType": "STRING",
"vStr": "/Users/SPoulson/src/turret/client/golang/client.go:228"
},
{
"key": "otel.library.name",
"vType": "STRING",
"vStr": "github.com/mailgun/turret/v2"
},
{
"key": "error",
"vType": "BOOL",
"vBool": true
},
{
"key": "otel.status_code",
"vType": "STRING",
"vStr": "ERROR"
},
{
"key": "otel.status_description",
"vType": "STRING",
"vStr": "code:250 message:\"OK\" utf8_enabled:true mx_host:\"10.5.0.2\" secure:true smtp_log:\"19:15:17.475 0s \u003e- {19:15:17.468, #0, 0}\\n19:15:17.476 0s ** age=8.9931005s, sessionCount=9\\n19:15:17.476 0s \u003c- {#0}\\n19:15:17.476 0s -\u003e MAIL FROM:\u003csender@example.com\u003e BODY=8BITMIME SMTPUTF8\\n19:15:17.477 1ms -\u003c 250 Sender address accepted\\n19:15:17.477 1ms -\u003e RCPT TO:\u003crecipient@example.com\u003e\\n19:15:17.478 2ms -\u003c 250 Recipient address accepted\\n19:15:17.478 2ms -\u003e DATA\\n19:15:17.479 3ms -\u003c 354 Continue\\n19:15:17.480 4ms \u003e- {19:15:17.472, #1, 18}\\n19:15:17.480 4ms \u003e- {19:15:17.472, #2, 0, last}\\n19:15:17.481 5ms \u003c- {#1}\\n19:15:17.483 7ms -\u003c 250 Great success\\n19:15:17.484 8ms \u003c- {#2, last}\\n\" mx_host_ip:\"10.5.0.2\" tls_version:772 tls_cipher_suite:4865"
}
],
"logs": [
{
"timestamp": 1647890117479772,
"fields": [
{
"key": "event",
"vType": "STRING",
"vStr": "exception"
},
{
"key": "exception.type",
"vType": "STRING",
"vStr": "*errors.fundamental"
},
{
"key": "exception.message",
"vType": "STRING",
"vStr": "code:250 message:\"OK\" utf8_enabled:true mx_host:\"10.5.0.2\" secure:true smtp_log:\"19:15:17.475 0s \u003e- {19:15:17.468, #0, 0}\\n19:15:17.476 0s ** age=8.9931005s, sessionCount=9\\n19:15:17.476 0s \u003c- {#0}\\n19:15:17.476 0s -\u003e MAIL FROM:\u003csender@example.com\u003e BODY=8BITMIME SMTPUTF8\\n19:15:17.477 1ms -\u003c 250 Sender address accepted\\n19:15:17.477 1ms -\u003e RCPT TO:\u003crecipient@example.com\u003e\\n19:15:17.478 2ms -\u003c 250 Recipient address accepted\\n19:15:17.478 2ms -\u003e DATA\\n19:15:17.479 3ms -\u003c 354 Continue\\n19:15:17.480 4ms \u003e- {19:15:17.472, #1, 18}\\n19:15:17.480 4ms \u003e- {19:15:17.472, #2, 0, last}\\n19:15:17.481 5ms \u003c- {#1}\\n19:15:17.483 7ms -\u003c 250 Great success\\n19:15:17.484 8ms \u003c- {#2, last}\\n\" mx_host_ip:\"10.5.0.2\" tls_version:772 tls_cipher_suite:4865"
}
]
}
]
}
- OS: Linux and macOS 12.2
- Architecture: amd64
- Go Version: 1.17
- opentelemetry-go version: 1.4.1
- Check out repo: https://github.com/Baliedge/otel-span-too-large
- Run: make run JAEGER_AGENT_HOST=<your_jaeger_agent>
- See the error described above.
- Compare with: make run JAEGER_AGENT_HOST=localhost (this does not generate the error).
I expect the trace in the Jaeger UI to display all span details generated by the code.