forked from servo/servo
-
Notifications
You must be signed in to change notification settings - Fork 0
Css selector matching meeting 2013 07 19
Jack Moffitt edited this page Jul 30, 2013
·
1 revision
- https://wiki.mozilla.org/Servo/StyleUpdateOnDOMChange
- dbaron: rule hash: we look at the selector. if it has an ID, we stick it in a hash keyed on ID. if it doesn't have an ID but has a class, we stick it in a hash based on class. doesn't include things inside and :not. otherwise if it has a tag name we stick it in a hash for tag names otherwise if it has a namespace we stick it in a hash for namespace otherwise it goes in a "could not hash" list. when we go to match:
- look at global rules
- look at rules for classes if it has a class
- look at rules for ids if it has an id
- look at only selectors that matches these rules That optimization is the cause of the common performance advice to authors that you should try to be as specific as you can on the RHS of the selector. Recently authors have stopped following this advice. That is why we use a Bloom filter. Boris would know exactly what goes into the Bloom filter and I don't remember off the top of my head. It's called AncestorFilter somewhere in layout/style. Then the basic process for matching is to match from right to left except that there's some fun stuff with combinators where you want to greedily match but not do too much branching. For some selectors we do branching but we don't do more than necessary.
- ssapin: What do you mean by branching?
- dbaron: body > div p You want to be careful that when you're matching this that if you have the tree structure "body > div > div > p", if you're doing the matching the stupid way, you start at the right end of the selector. We say "OK p matches, we've got a descendant combinator, let's go to the next point. div matches. now we have a child combinator, let's see if the child is 'body'. Nope. Fail." The problem here is that we have to backtrack. Basically we only need to do this thing for certain combinations of combinators in sequence. This code is in SelectorMatchesTree(). The other interesting set of optimizations is the way we do dynamic change handling. This might influence some design decisions. There are a bunch of optimizations that influence the way you might want to do selectors. One is when you change an attribute in the markup. Let's say you have a selector which is p[title="foo"]. You change the attribute from "bar" to "foo". We need to recompute styles b/c that attribute changes. One of the problems here is that attribute changes are quite common and most of them don't cause selector matching. You don't want to do style matching for every attribute change. We store a data structure in the cascade that points to every attribute selector. The data structure is a linked list of these pieces: +----------------- attribute cache points here V [ ] <- [ ] <- [ ] body > | div[foo] | p We have a linked list of objects here. Note that the > goes in the same object as the BODY. Boris implemented this. If it's a descendant combinator we need to restyle the element and the rest of the subtree. If it's a sibling combinator we need to restyle siblings. The reason this wins so much is that otherwise an attribute change requires you to restyle the entire subtree. With this optimization you run the selector matching process for an attribute change but you only run the selector matching process for the element itself but not all of its children. Only if the "did this change?" returns true do we have to run the selector matching algorithm for descendants (or siblings). For pseudo-classes it works basically the same way. I think those are major selector matching optimizations. I talked about the bloom filter but I didn't write that. I talked about the selector matching and the greedy combinators stuff, and I talked about the attributes.
- bz: People remember that I wrote this up: https://wiki.mozilla.org/Servo/StyleUpdateOnDOMChange Also covers how WebKit does it. What we do with attributes has its pluses and minuses. It trades off some performance on every attribute change in favor of faster restyling. It presupposes that restyling is expensive. To the extent that this is false it may not help. It is not as true perhaps with the Bloom filter. But it is still expensive and probably helps. However in Gecko we do too many virtual functions so that could be bad. If you in Gecko change an attribute and there were no selectors selecting on that attribute it takes us a fair amount of work to figure that out just because of all of these calls through rule processors and we ask many objects "hey, do you know anything about this attribute" and these are all virtual function calls. If we do something like this then we should have a SINGLE place all of this is queried. We don't care what cascade everything it is (UA rule, document rule, user rule). In Gecko we have to ask each individually (is it a UA rule? is it a document rule? is it a user rule?) It used to be that WK was way better at this than Gecko. WebKit has regressed so now they're equal to us. In practice when you change attributes it's rare that there are attributes that match on the value... EXCEPT for class selectors. In class selectors it's the case where you want to do a bunch of random class selectors. There are always random class selectors that do things before and after. Checking before and after really tells you anything. The bloom filter. The bloom filter is an attempt to optimize...
- ssapin: The hash pointing to attribute selectors. Does that include IDs and classes?
- bz: That's correct. "Do we have an ID selector before or after?" We actually have a hash table for IDs. There is also a table for everything else and it does a linear scan of selectors that match that attribute.
- bz: We stole the Bloom filter from WK. The idea is to optimize cases where the page author writes a descendant combinator and the thing to the RHS matches a lot e.g. ".foo div". They want all the divs that are a descendant of one element. But now what that means is that for EVERY div we have to walk up the tree. But of course most divs are not descendants of a .foo. This takes a fair amount of time. It was showing very high in profiles. Order of seconds to do a dynamic restyle of the page!! Especially bad if the thing on the RHS included :hover so now every time you move the mouse you waste seconds. Proposal was that instead of every element walking up to the root you keep track of what elements you might have and very quickly be able to filter out lots of selectors. Every time you have a div you have to go find class foo. The way this is implemented is a Bloom filter which basically just stores information about what the classes and IDs and tag names of all your ancestors are. The way we do that is that they're already interned strings which already have a hash code associated with them so we just take the hash code and use it in the bloom filter. The problem is that when you start the selector matching you have to build up the bloom filter for all ancestors up to that point (not too bad) and as you go down the tree you have to add kids' IDs, classes, tag names to the filter and then you have to take them out -- we use a counting Bloom filter so that we can take them out.
- ssapin: Do we have several filters?
- bz: No, we modify the same filter. This is a problem for parallelizing.
- ssapin: Thought Bloom filters couldn't have things removed
- bz: Counting bloom filter. It has a byte not a bit for each location. It might saturate and then you just deoptimize. How do we parallelize? Copying is not that easy because it's a big chunk of memory. I used a page sized filter. 500 bytes. Copying 500 bytes isn't all that cheap. That's something we should think about. How should we do something along the lines of this optimization that's slightly more amenable to parallelized selector matching? Other issue: Really relies on the fact that you're matching on the element tree. Gecko doesn't always traverse the element tree while you're doing selector matching. Not a problem for Servo because styles are matched to element.
- ssapin: What else might we be matching on?
- bz: In Servo styles are put on elements. This is good. It makes this easier.
- ssapin: In Gecko it traverses frames?
- bz: Yes. Can cause problems. I think this is basically it. Some discussion to be had as to the data structure for selectors. We're not too happy about the one we have in Gecko. It's bloated and not too cache friendly.
- dbaron: One thing I didn't talk about is what happens when we add/remove stuff from the tree. What happens to the selector matching then? For content tree modification the interesting piece is sibling selectors. If you add or remove stuff from the content tree you might change whether its later siblings change selectors. Another interesting case is that you might change whether its parent matches :empty. You also might change :nth-child selectors of children. These are all kind of annoying cases. We basically handle all of them by sticking bits on content nodes by saying "the child of this content node has one of these 3 categories of things". The basic gist of this is that we never actually implemented the dynamic change stuff correctly until we added these bits to the content nodes. I'm not sure if WK did this right.
- bz: They fixed most of the cases but they still have edge cases where it's broken.
- dbaron: You can look at the Gecko code for that. I don't think it's been a massive performance problem so it's probably an OK approach. One other thing that might be worth bringing up is that we sort of have a perf problem in Gecko right now and I don't know how to fix it in the current design, though it's not strictly selector matching. It's basically that StyleContexts are immutable in Gecko. If we change a StyleContext's parent or we change the RuleNode (the list of rules it matches) then we have to recreate StyleContexts for all of its descendants. The case in which we've been having perf problems that I'm worried about is stuff like transform changes on a particular element. For example in Gaia when there's multiple screens of the home screen. Swiping on the home screen to move from screen to screen changes the transform. One of the perf problems we've been running into is that when you change a transform we have to recreate style contexts for descendants and we have to recompute styles for all of the style contexts. If any of the descendants defeats any of our rule caching optimizations then this ends up being expensive. What's the Servo approach?
- jack: In the layout case we're putting in some stuff where certain layout tasks...
- pcwalton: In servo we don't have incremental reflow. Every element has separate computed styles that don't change. We could make them mutable during style computation, then immutable for later layout.
- bz: The reason why they're immutable in Gecko is that it lets us compare before style and after style easily. The other obvious approach is to mutate things in place but then we have to keep track of changes in place.
- pcwalton: We could have a mutable style context that has monotonic change sets.
- dbaron: how to deal with storage? Rule tree lets us use less memory by saving computation and sharing that.
- bz: we use style structs. style contexts share pointers to these structs that they may own or not. Individual structs not refcounted because kids never outlive parents. also teh properties are separated in the struct by whether they are inherited or not. for example, color is default inherited and display is not, so those are in different structs in the context.
- dbaron: this separation is key to memory optimization.
- bz: for the reset properties, we only store those in the style context if they can't be shared across elements.
- dbaron: by which we mean can't be shared by all elements that share the (missed)
- bz: we have these thigns split into 20 structs. 10 or 11 words just for all these pointers in the style context.
- bz: the other point there is that the list of rules is represented as a trie. we do sharing up and down the rule tree as well.
- pcwalton: this stuff sounds like it may make it hard to have mutable style structs.
- bz: having mutable style contexts might be easier. but mutable style structs would be hard.
- jet: simon, do you have even here to get started?
- ssapin: yes. which optimizations are the hardest to implement later?
- bz: i think the first thing to figure out is which things will be immutable and which will be mutable (recomputed and replace). the next part is which things will not play well with parallel things. hthis idea of sharing stuff and lazily computing stuff uhas a fundamental tension with doing things in parallel. just identifying which pieces are likely to cause those problems would be a good start. if we're just focused on selector matching, then figuring out the memory representation of selectors is perhaps the right place ot start. the rest of it seems relatively straightforward. and figuring out what we want to do with the bloom filter is osmething we dont' have an answer for (modulo seth's comments about ZOOMM paper).
- jet: these notes should live somewhere better than ehterpad. gecko pieces should get moved to gecko wiki (https://wiki.mozilla.org/Gecko:Overview), and servo stuff should go into servo wiki. even before we lay down any code would be good to start keeping track of that stuff. perhaps making the section on that page more concise and a new see also page.