Benchmarking
Benchmarking should be done with a release build:
./mach build --release
There is a test runner which downloads and runs all the Dromaeo tests in headless mode:
./mach test-dromaeo
Once the tests are downloaded locally, you can run them inside servo:
./target/release/servo tests/dromaeo/dromaeo/web/index.html
Before running individual tests, you need to create the test harness:
cp tests/dromaeo/dromaeo/web/htmlrunner.js tests/dromaeo/dromaeo/
The default test harness runs each test once. If you want to run each test more than once (and you probably do), edit tests/dromaeo/dromaeo/web/htmlrunner.js, for example:
var startTest = parent.startTest || function(){};
var test = parent.test || function(name, fn){
console.log(name);
for (var i=0; i<1000; i++) { fn(); }
};
var endTest = parent.endTest || function(){};
var prep = parent.prep || function(fn){ fn(); };
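If you also want a rough per-test timing, the loop in the override above could be wrapped in a timer. This is only a sketch: `timeTest` is a hypothetical helper, not part of Dromaeo, and it assumes that the millisecond granularity of `Date.now()` is good enough for the number of iterations you choose.

```javascript
// Hypothetical helper (not part of Dromaeo): call fn `runs` times and
// return the elapsed wall-clock time in milliseconds.
function timeTest(fn, runs) {
  var start = Date.now();
  for (var i = 0; i < runs; i++) { fn(); }
  return Date.now() - start;
}

// Example: time 1000 calls of a trivial function and count the calls.
var calls = 0;
var elapsed = timeTest(function () { calls++; }, 1000);
console.log("1000 runs took " + elapsed + " ms");
```

Inside the harness, the body of `test` could call `timeTest(fn, 1000)` and log the result next to `name`.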
You can then run individual test pages, for example:
./target/release/servo --exit tests/dromaeo/dromaeo/tests/dom-traverse.html
On a Debian-based system, install the google perftools package:
sudo apt-get install google-perftools
Run servo on a test page with the profiling library:
LD_PRELOAD=/usr/lib/libprofiler.so.0 \
CPUPROFILE=/tmp/servo-cpu.log \
target/release/servo --exit tests/dromaeo/dromaeo/tests/dom-traverse.html
Generate a call graph from the log:
google-pprof --svg --focus=dom target/release/servo \
/tmp/servo-cpu.log > /tmp/servo-cpu.svg
The flag --focus=dom filters the call graph to only show calls involving the dom namespace.
Servo and Firefox running the DOM Core tests:
Speculating about the cases where Firefox is getting significantly better performance:
- getAttribute: a difference this significant is probably due to JIT optimization in SpiderMonkey, resulting in the attribute read being hoisted out of the loop. (See https://github.com/servo/servo/pull/8040).
- DOM Query: the implementation of HTMLCollection could benefit from caching to avoid traversing the document tree on every access. The difficulty is triggering cache invalidation when the DOM is modified. (See https://github.com/servo/servo/issues/1916 and https://github.com/servo/servo/issues/3381).
  Caching and cache invalidation are implemented in https://github.com/servo/servo/pull/8227, which gets about a 1000x speed-up on the relevant DOM query tests.
- DOM Traversal: From looking at the call graph, a surprising amount of time is spent in item and len. The issue with len appears to be rooting, since everything else compiles to a field access:
Looking at the generated x86, you can see vector code mixed in with what should really just be a pointer indirection:
gdb target/release/servo
disassemble 'dom::nodelist::_$LT$impl$GT$::len::hfe60577417cf4a8eGC7'
Dump of assembler code for function _ZN3dom8nodelist13_$LT$impl$GT$3len20hfe60577417cf4a8eGC7E:
...
0x0000000000951474 <+228>: mov %r15,%rdi
0x0000000000951477 <+231>: callq 0x62b8d0 <_ZN7raw_vec13_$LT$impl$GT$6double6double21h16855681377121452665E>
0x000000000095147c <+236>: mov 0x10(%r15),%rbp
0x0000000000951480 <+240>: jmpq 0x9513e0 <_ZN3dom8nodelist13_$LT$impl$GT$3len20hfe60577417cf4a8eGC7E+80>
...
End of assembler dump.
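The caching idea mentioned under DOM Query above can be illustrated with a version counter. This is a toy model and not necessarily how the linked pull request implements it: `Doc`, `CachedCollection`, and the `version` and `traversals` fields are all hypothetical names, and a flat array stands in for the document tree.

```javascript
// Hypothetical model, not Servo's code: every mutation bumps `version`.
function Doc() {
  this.version = 0;
  this.nodes = [];     // flat list standing in for the document tree
  this.traversals = 0; // number of full traversals, for illustration only
}
Doc.prototype.append = function (tag) {
  this.nodes.push(tag);
  this.version++; // any mutation invalidates cached results
};

// A collection that recomputes its length only when the document changed.
function CachedCollection(doc, tag) {
  this.doc = doc;
  this.tag = tag;
  this.cachedVersion = -1; // no valid cache yet
  this.cachedLength = 0;
}
CachedCollection.prototype.length = function () {
  if (this.cachedVersion !== this.doc.version) {
    var tag = this.tag;
    this.doc.traversals++;
    this.cachedLength = this.doc.nodes.filter(function (t) {
      return t === tag;
    }).length;
    this.cachedVersion = this.doc.version;
  }
  return this.cachedLength;
};

var doc = new Doc();
doc.append("div"); doc.append("p"); doc.append("div");
var divs = new CachedCollection(doc, "div");
for (var i = 0; i < 1000; i++) { divs.length(); }
var traversalsBefore = doc.traversals; // repeated reads share one traversal
doc.append("div");                     // mutation invalidates the cache
var lengthAfter = divs.length();       // triggers one more traversal
```

The trade-off in this sketch is that any mutation invalidates every collection, even unrelated ones. That keeps invalidation cheap and simple at the cost of some unnecessary recomputation.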