Alan Jeffrey edited this page Nov 3, 2015 · 15 revisions

Dromaeo

Running all the tests

Benchmarking should be done with a release build:

./mach build --release

There is a test runner which downloads and runs all the Dromaeo tests in headless mode:

./mach test-dromaeo

Once the tests are downloaded locally, you can run them inside Servo:

./target/release/servo tests/dromaeo/dromaeo/web/index.html

Running a single test

Before running individual tests, you need to create the test harness:

cp tests/dromaeo/dromaeo/web/htmlrunner.js tests/dromaeo/dromaeo/

The default test harness runs each test once. If you want to run each test more than once (and you probably do), edit tests/dromaeo/dromaeo/web/htmlrunner.js before copying it (or edit the copy directly), for example:

var startTest = parent.startTest || function(){};
var test = parent.test || function(name, fn){ 
  console.log(name);
  for (var i=0; i<1000; i++) { fn(); }
};
var endTest = parent.endTest || function(){};
var prep = parent.prep || function(fn){ fn(); };
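
If you also want rough timings from a single-test run, one option is to wrap the loop in a timer. This is only a minimal sketch and not part of Dromaeo: makeTimedTest and host are hypothetical names introduced here, and Date.now() has millisecond resolution, so the iteration count has to be large enough for the total to be measurable.

```javascript
// Hypothetical helper (not part of Dromaeo): builds a replacement for the
// harness's test() that runs the test body `iterations` times and logs the
// elapsed wall-clock time in milliseconds.
function makeTimedTest(iterations, log) {
  return function (name, fn) {
    var start = Date.now();
    for (var i = 0; i < iterations; i++) { fn(); }
    var elapsed = Date.now() - start;
    log(name + ": " + iterations + " runs in " + elapsed + "ms");
    return elapsed;
  };
}

// Same fallback pattern as the harness; `host` stands in for `parent` so
// the sketch also loads where `parent` is not defined.
var host = (typeof parent !== "undefined") ? parent : {};
var test = host.test || makeTimedTest(1000, function (msg) { console.log(msg); });
```

The other harness functions (startTest, endTest, prep) can stay as they are. Only the definition of test changes.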

You can then run individual test pages, for example:

./target/release/servo --exit tests/dromaeo/dromaeo/tests/dom-traverse.html 

Profiling with google-perftools

On a Debian-based system, install the google-perftools package:

sudo apt-get install google-perftools

Run Servo on a test page with the profiling library preloaded:

LD_PRELOAD=/usr/lib/libprofiler.so.0 \
CPUPROFILE=/tmp/servo-cpu.log \
target/release/servo --exit tests/dromaeo/dromaeo/tests/dom-traverse.html

Generate a call graph from the log:

google-pprof --svg --focus=dom target/release/servo \
/tmp/servo-cpu.log > /tmp/servo-cpu.svg

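The flag --focus=dom restricts the call graph to call paths that pass through a function whose name matches the regular expression dom, which in practice selects calls involving the dom namespace.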

Core DOM performance

Servo and Firefox running the DOM Core tests:

[Figures: Dromaeo DOM Core results for Servo; Dromaeo DOM Core results for Firefox]

Speculating about the cases where Firefox is getting significantly better performance:

Caching and cache invalidation are implemented in https://github.com/servo/servo/pull/8227, which gets about a 1000x speed-up on the relevant DOM query tests.

  • DOM Traversal: From looking at the call graph, a surprising amount of time is spent in item and len. The issue with len appears to be rooting, since everything else compiles to a field access:

[Figure: Dromaeo dom-traverse call graph]

Looking at the generated x86, you can see Vec growth code (a call to raw_vec's double) mixed in with what should really just be a pointer indirection:

gdb target/release/servo
disassemble 'dom::nodelist::_$LT$impl$GT$::len::hfe60577417cf4a8eGC7' 
Dump of assembler code for function _ZN3dom8nodelist13_$LT$impl$GT$3len20hfe60577417cf4a8eGC7E:
...
0x0000000000951474 <+228>:	mov    %r15,%rdi
0x0000000000951477 <+231>:	callq  0x62b8d0 <_ZN7raw_vec13_$LT$impl$GT$6double6double21h16855681377121452665E>
0x000000000095147c <+236>:	mov    0x10(%r15),%rbp
0x0000000000951480 <+240>:	jmpq   0x9513e0 <_ZN3dom8nodelist13_$LT$impl$GT$3len20hfe60577417cf4a8eGC7E+80>
...
End of assembler dump.