Baliedge/otel-span-too-large

Reproduce Error "span too large to send" in OpenTelemetry

Some applications drop trace spans with errors such as "multiple errors during transform" and "span too large to send". I suspect this is a consequence of the workaround in #2663, which limits MaxPacketSize to 1472 so that Thrift packets can be sent over the wire. Spans that hit this error are dropped, which leaves the trace missing or incomplete in the Jaeger UI.

I suspect the reduced packet size also caps the maximum span payload, because an oversized span does not appear to be subdivided any further before sending.

Here is an example error from a span that contains some attributes and logs but, in my opinion, nothing excessive:

2022/03/21 15:15:17 multiple errors during transform: span too large to send: Span({TraceIdLow:-1959931804685451170 TraceIdHigh:9142211311457415581 SpanId:3501129510158026876 ParentSpanId:-3699454375769775617 ... [very long dump truncated by me]

This was unreadable, so I pulled the source and added a JSON dump. Here's one such result from within the "multiple errors during transform" error:

span too large to send:{
  "traceIdLow": 8238679333188071647,
  "traceIdHigh": 6973323051039728651,
  "spanId": 1844865607382183757,
  "parentSpanId": -1272813624660893053,
  "operationName": "github.com/mailgun/turret/v2/client/golang.(*transaction).Close",
  "flags": 1,
  "startTime": 1647890117472575,
  "duration": 7196,
  "tags": [
    {
      "key": "file",
      "vType": "STRING",
      "vStr": "/Users/SPoulson/src/turret/client/golang/client.go:228"
    },
    {
      "key": "otel.library.name",
      "vType": "STRING",
      "vStr": "github.com/mailgun/turret/v2"
    },
    {
      "key": "error",
      "vType": "BOOL",
      "vBool": true
    },
    {
      "key": "otel.status_code",
      "vType": "STRING",
      "vStr": "ERROR"
    },
    {
      "key": "otel.status_description",
      "vType": "STRING",
      "vStr": "code:250  message:\"OK\"  utf8_enabled:true  mx_host:\"10.5.0.2\"  secure:true  smtp_log:\"19:15:17.475      0s \u003e- {19:15:17.468, #0, 0}\\n19:15:17.476      0s ** age=8.9931005s, sessionCount=9\\n19:15:17.476      0s \u003c- {#0}\\n19:15:17.476      0s -\u003e MAIL FROM:\u003csender@example.com\u003e BODY=8BITMIME SMTPUTF8\\n19:15:17.477     1ms -\u003c 250 Sender address accepted\\n19:15:17.477     1ms -\u003e RCPT TO:\u003crecipient@example.com\u003e\\n19:15:17.478     2ms -\u003c 250 Recipient address accepted\\n19:15:17.478     2ms -\u003e DATA\\n19:15:17.479     3ms -\u003c 354 Continue\\n19:15:17.480     4ms \u003e- {19:15:17.472, #1, 18}\\n19:15:17.480     4ms \u003e- {19:15:17.472, #2, 0, last}\\n19:15:17.481     5ms \u003c- {#1}\\n19:15:17.483     7ms -\u003c 250 Great success\\n19:15:17.484     8ms \u003c- {#2, last}\\n\"  mx_host_ip:\"10.5.0.2\"  tls_version:772  tls_cipher_suite:4865"
    }
  ],
  "logs": [
    {
      "timestamp": 1647890117479772,
      "fields": [
        {
          "key": "event",
          "vType": "STRING",
          "vStr": "exception"
        },
        {
          "key": "exception.type",
          "vType": "STRING",
          "vStr": "*errors.fundamental"
        },
        {
          "key": "exception.message",
          "vType": "STRING",
          "vStr": "code:250  message:\"OK\"  utf8_enabled:true  mx_host:\"10.5.0.2\"  secure:true  smtp_log:\"19:15:17.475      0s \u003e- {19:15:17.468, #0, 0}\\n19:15:17.476      0s ** age=8.9931005s, sessionCount=9\\n19:15:17.476      0s \u003c- {#0}\\n19:15:17.476      0s -\u003e MAIL FROM:\u003csender@example.com\u003e BODY=8BITMIME SMTPUTF8\\n19:15:17.477     1ms -\u003c 250 Sender address accepted\\n19:15:17.477     1ms -\u003e RCPT TO:\u003crecipient@example.com\u003e\\n19:15:17.478     2ms -\u003c 250 Recipient address accepted\\n19:15:17.478     2ms -\u003e DATA\\n19:15:17.479     3ms -\u003c 354 Continue\\n19:15:17.480     4ms \u003e- {19:15:17.472, #1, 18}\\n19:15:17.480     4ms \u003e- {19:15:17.472, #2, 0, last}\\n19:15:17.481     5ms \u003c- {#1}\\n19:15:17.483     7ms -\u003c 250 Great success\\n19:15:17.484     8ms \u003c- {#2, last}\\n\"  mx_host_ip:\"10.5.0.2\"  tls_version:772  tls_cipher_suite:4865"
        }
      ]
    }
  ]
}

Environment

  • OS: Linux and macOS 12.2
  • Architecture: amd64
  • Go Version: 1.17
  • opentelemetry-go version: 1.4.1

Steps To Reproduce

  1. Check out the repo: https://github.com/Baliedge/otel-span-too-large
  2. Run make run JAEGER_AGENT_HOST=<your_jaeger_agent>
  3. Observe the error described above.
  4. Compare with make run JAEGER_AGENT_HOST=localhost, which does not generate this error.

Expected behavior

The trace in the Jaeger UI should display all span details generated by the code.