-
Notifications
You must be signed in to change notification settings - Fork 225
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mongoose spans are getting attached to wrong parent transaction #3161
Comments
@sefiros1 Thanks for the excellent report. I can repro. I hope to look into this next week. I'm not a mongoose/mongodb user. Do you happen to know off hand if there is any built in debugging mode for mongoose that could log what it is doing internally? No worries if not. One possibility is that mongoose is using some internal queue that is grouping up / batching multiple Mongo queries to be run under one asynchronous context. (The "background" section of #2498 describes what I mean.) If so, then I'll have to see if that is something we could instrument reasonably with the APM agent. |
Note to self: This might be the same issue that I was hitting when attempting to validate that we instrument mongoose >=5.7 correct in this PR: #2547 |
@trentm Thanks for reply!
I have never used the mongoose/mongodb debug tools, but this might help: // all executed methods log output to console
mongoose.set('debug', true)
// disable colors in debug mode
mongoose.set('debug', { color: false })
// get mongodb-shell friendly output (ISODate)
mongoose.set('debug', { shell: true }) |
Using the setup explained in this issue I did perform tests on several versions. Those tests confirm @trentm suspicions about internal optimisations #3161 (comment). MY guess is that commands may be sent in batches The issue is on the driver level. The Code used for testing mongodb require('elastic-apm-node').start({
serviceName: 'example-trace-mongodb-concurrent',
logUncaughtExceptions: true
})
const http = require('http')
const MongoClient = require('mongodb').MongoClient
const DB_NAME = 'example-trace-mongodb-concurrent'
const url = 'mongodb://localhost:27017'
async function bootstrap () {
const client = new MongoClient(url)
try {
await client.connect()
const database = client.db(DB_NAME)
const catsCollection = database.collection('cats')
const server = http.createServer(function onRequest (req, res) {
console.log('incoming request: %s %s %s', req.method, req.url, req.headers)
req.resume()
req.on('end', async function () {
const pathname = req.url
let resBody = ''
if (pathname === '/create') {
catsCollection.insertOne({ name: 'kitty' })
resBody = 'Meow'
} else if (pathname === '/getAll') {
resBody = JSON.stringify(await catsCollection.find().toArray())
}
// Then reply to the incoming request.
res.writeHead(200, {
server: 'example-trace-http',
'content-type': 'text/plain',
'content-length': Buffer.byteLength(resBody)
})
res.end(resBody)
})
})
server.listen(3000, function () {
console.log('listening on port 3000')
})
server.on('close', async function () {
console.log('closing DB connection')
await client.close()
})
} catch (err) {
console.log('bootstrap error', err)
}
}
bootstrap() |
More tests reveal the change happens between v3 and v4 (when they removed with latest v3 we get the following
We can see the Version v4.0.0 seems to use the queue also for the
|
The async behaviour has been there for a while since the ConnectionPool Class is using Not sure if there is a way to make the connection available in a sync way through configuration but even if we achieve it I believe this will not cover all use cases for |
The previous comment was not complete. Cursors are also involved since they make use of async iterators for their operations. This methods are internal and there is not an easy way to ensure we got the right context when they execute. Elastic APM is not the only affected. I've used OpenTelemetry to check by creating this tracer // File: otel-manual-instrumentation.js
// OpenTelemetry
const { Resource } = require('@opentelemetry/resources')
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions')
const { ConsoleSpanExporter } = require('@opentelemetry/sdk-trace-base')
const { SimpleSpanProcessor } = require('@opentelemetry/sdk-trace-base')
const { NodeTracerProvider } = require('@opentelemetry/sdk-trace-node')
const { trace } = require('@opentelemetry/api')
// instrumentations
const { MongoDBInstrumentation } = require('@opentelemetry/instrumentation-mongodb')
const { HttpInstrumentation } = require('@opentelemetry/instrumentation-http')
const { registerInstrumentations } = require('@opentelemetry/instrumentation')
// Exporter
module.exports = (serviceName) => {
const exporter = new ConsoleSpanExporter()
const provider = new NodeTracerProvider({
resource: new Resource({
[SemanticResourceAttributes.SERVICE_NAME]: serviceName
})
})
provider.addSpanProcessor(new SimpleSpanProcessor(exporter))
provider.register()
registerInstrumentations({
instrumentations: [
new HttpInstrumentation(),
new MongoDBInstrumentation({
enhancedDatabaseReporting: true
})
],
tracerProvider: provider
})
return trace.getTracer(serviceName)
} and replacing our instrumentation in the code above for const tracer = require('./otel-manual-instrumentation')('example-trace-mongodb-http') Sadly the results are similar to Elastic APM, we got some HTTP spans with no MongoDB child spans and others have several. This is the output from OTel
Elastic APM is relying on MongoClient events API to do instrumentation but it has the same issue, the event is fired in a different context. In conclusion we need to change the instrumentation but without going too deep in internal implementations details since they may change in the near future. |
Another test reveals the tracking works if the client is not shared across requests. Making a change to the original test code (moving client creation inside the request handler) produces the desired output. the code require('../').start({
serviceName: 'example-trace-mongodb-concurrent',
logUncaughtExceptions: true,
});
const http = require('http');
const MongoClient = require('mongodb').MongoClient;
const DB_NAME = 'example-trace-mongodb-concurrent';
const url = 'mongodb://localhost:27017';
async function bootstrap() {
try {
const server = http.createServer(async function onRequest(req, res) {
// NOTE: a client is created on each request
const client = await MongoClient.connect(url);
const database = client.db(DB_NAME);
const catsCollection = database.collection('cats');
req.resume();
req.on('end', async function () {
const pathname = req.url;
let resBody = '';
if (pathname === '/create') {
catsCollection.insertOne({ name: 'kitty' });
resBody = 'Meow';
} else if (pathname === '/getAll') {
resBody = JSON.stringify(
await catsCollection.find({ name: 'kitty' }).toArray(),
);
}
// Then reply to the incoming request.
res.writeHead(200, {
server: 'example-trace-http',
'content-type': 'text/plain',
'content-length': Buffer.byteLength(resBody),
});
await client.close();
res.end(resBody);
});
});
server.listen(3000, function () {
console.log('linstening on port 3000');
});
server.on('close', async function () {
console.log('closing DB conneciton');
});
} catch (err) {
console.log('bootstrap error', err);
}
}
bootstrap(); When sending 30 concurrent requests ab -n 30 -c 30 http://localhost:3000/getAll Gives the following traces
|
Took a while to get to the root cause but we could confirm our assumptions about batching. We prepared a repo which reproduces the issue for our agent and Opentelemtry https://github.com/david-luna/node-mongodb-native-async-resource And I've created an issue in MongoDB's issue tracker with a fix proposal and a PR (draft for now) For now I'd say let's wait for their response |
Describe the bug
I'm using Koa with MongoDB via mongoose. With multiple simultaneous requests, mongoose spans begin to attach to the wrong parents. There is no such problem with one simultaneous request.
To Reproduce
Steps to reproduce the behavior:
node index.js
):In kibana you should see normal traces for first test (step 3), but after second test (step 4) you will see
broken traces:
some with no mongo spans at all
some with too many spans
Expected behavior
Spans should be attached to their parents in test with simultaneous requests.
Environment (please complete the following information)
How are you starting the agent? (please tick one of the boxes)
agent.start()
directly (e.g.require('elastic-apm-node').start(...)
)elastic-apm-node/start
from within the source code-r elastic-apm-node/start
Additional context
Agent config options:
Click to expand
Tried with multiple configurations:package.json
dependencies:Click to expand
The text was updated successfully, but these errors were encountered: