forked from servo/servo
-
Notifications
You must be signed in to change notification settings - Fork 0
Meeting 2014 09 15
mbrubeck edited this page Sep 16, 2014
·
2 revisions
Please update your status at: http://benjamin.smedbergs.us/weekly-updates.fcgi/
- static suite issues (jack)
- rust upgrade status (kmc)
- bloom filter work (cgaebel)
- improving table code and specs (jdm)
- possible COW DOM implementation (ryanc)
- rust inheritance http://discuss.rust-lang.org/t/summary-of-efficient-inheritance-rfcs/494 (jack)
- laleh, cgaebel, pcwalton, zwarich, gw, mbrubeck, martin, kmc, june0cho, samsungpatrick, jdm, jack, brson
- jack: I think everyone saw the static suite email. We're trying to triage issues blocking power measurement and screenshot of these files. There are so many PRs, I'm not currently sure the overall status. But I believe there's a new variant of the SetLayerClip bug (SetLayerOrigin). Another one that gw was going to look at is, when you run Servo and the profiler on one of these pages, they often have something like 24 layout passes, which seems excessive. The current theory is that each image or iframe causes a full relayout.
- jdm: Also anytime JavaScript modifies the DOM in any way.
- jack: Can we log a reason every time a layout happens?
- pcwalton: One thing we could do is have finer-grained profiler buckets, not just layout but layout:first-layout, layout:increment-layout, iframe, etc.
- jdm: We should build this in now; it would be a lot easier than trying to retrofit later.
- zwarich: Speaking of relayout issues, why do some many pages fail to converge? They go back and forth between layouts indefinitely or at least for a long time.
- pcwalton: Are you talking about things on wikipedia where it jumps around?
- zwarich: Yes, and on some static suite pages.
- pcwalton: I do know there's an issue with incorrect style sharing caching.
- mbrubeck: That previously caused problems with mousing over links, other links would change.
- pcwalton: That's just a bug in can_share_style_with_node. Probably something similar here.
- jdm: I see fonts jumping back and forth a lot.
- pcwalton: Probably a false positive in can_share_style_with_node. You could test that hypothesis by making that function always return false. Could also compare to the Blink function that it's based on. I know bz will say "I told you so." He was nervous about that code but it's such an enormous help on perf.
- jack: Could any of these be a good project for cgaebel?
- pcwalton: It's more of a bug hunt than a big fun intern project, but on the other hand bugs can be fun.
- jack: Anything else?
- pcwalton: There's incremental reflow. I can rebase and post my work if cgaebel wants to take it over.
- jack: That's definitely important, good to have someone work on it.
- zwarich: There's an open question - we get some layout gains relative to existing browsers, but they spend more of their effort these days on incremental layout. Being able to compare those scenarios would be a better comparison than just measuring a single initial layout.
- jack: Yeah, we definitely want to show numbers for incremental numbers over time. We don't have the infrastructure in place to start measuring that yet. The static suite is a stepping stone to that. It might be good if you're going to work on incremental reflow to also put in the profiling buckets we talked about earlier. That'll also help us debug issues with relayout.
- jack: How many bugs will it take to view the wikipedia logo?
- mbrubeck: I have it working locally. It took three (#3296, #3339, and #3352)
- pcwalton: Reviews are gated on Simon, but he's pretty busy.
- jack: He's just back from a series of meetings now, I think. Whose PRs are we waiting for review on?
- pcwalton: Mostly mbrubeck. Simon likes to own that part.
- mbrubeck: Mine are tiny, so it shouldn't take long.
- jack: If he gets to it tomorrow morning, great. Otherwise...
- pcwalton: I did a review pass, but I want to give Simon a chance since he can be very particular.
- jack: Let's merge it tomorrow with your review and we can fix it up after if necessary.
- kmc: IS anyone working on a rust upgrade currently?
- jack: zwarich was working on one to fix an LLVM issue.
- zwarich: I was working at first without realizing that cargo was building non-optimized by default, so I had to re-do my testing. But after testing on an optimized build I think it's working. The main change is that you no longer need to include the standard crate, and the syntax flipped from
use a=b
touse b as a
. - kmc: Can you send a branch to me and I'll finish it up and hopefully land html5ever?
- zwarich: We need to re-do my work in the post-cargo world.
- jack: Make sure you push work to the servo/servo repo, so other people can help, do other platforms, etc.
- cgaebel: So, CSS selectors are normally matched by looking at the rightmost selector, and then looking at an element's parents to see if the left parts match. Over 90% of selectors include descendant selectors. We can use a bloom filter to track elements' ancestors to quickly match these. I did an implementation that works in parallel, throwing away work when it's stolen. It has a pretty big win; I forgot the numbers, but it was something like an 80% reduction in total load time for CNN.
- jack: In the implementation I know one of the issues was large bloom filter size, so you can't copy them. How did you get around that?
- cgaebel: It's in the 10s to 100s of KB range. We don't copy them; there is one per thread. We construct a new one any time a thread sees new stuff.
- zwarich: 100s of KB seems pretty large.
- cgaebel: I just picked a size that seemed good, giving a 90% hit rate at ~4000 entries. The numbers here are from memory, don't hold me to them.
- zwarich: Do you remove things as you go back up the tree?
- cgaebel: Yes
- zwarich: Then 4000 entries seems a bit high. The one in WebKit is only 4KB. How often do you have something 4000 nodes deep?
- cgaebel: One entry per id, class, tag name, etc.
- zwarich: 1000, or 1200 is probably good.
- cgaebel: Order of 10ns for removing an entry...
- zwarich: But it could be a lot different on a mobile CPU with smaller, slower caches. On x86 everything is fast and even L3 cache is fine. But for mobile, smaller might be better. Or maybe those are getting faster too. But at least when WebKit did it, sizing it to fit in cache was important on mobile.
- cgaebel: The false positive rate climbs really fast as it gets small, which scares me a little.
- jack: We should benchmark and tune on Android.
- jdm: There was a PR from... someone... improving table code in some manner. This is an area that's not really specified. Ms2ger pushed back on that, I feel maybe too aggressively, but also maybe with a good idea, that we could start putting together a spec for tables. Or maybe engage with someone else who has spec-writing experience. It could be an opportunity to say, we'll only implement what we can spec.
- pcwalton: dbaron has been working on this, at least when he has time. As I understand, he is still planning to get that done. So the question is, should we only implement what is already in dbaron's draft spec? Is it okay to implement stuff that's in dbaron's head but not written down yet? Or should we just implement things that we know to be web-compatible, as long as we also put them on the spec radar? These are different areas on the spectrum of tenable positions. Obviously it's not tenable to say we're waiting for a W3C Recommendation. It might make sense to do a sort of Intent to Implement thing, like in the Blink and Gecko processes. That is, send around a message letting everyone know we intend to imple
- jack: If we want to take this approach of asking for a spec, as long as dbaron has a web-accessible draft posted and that is what we spec, then that should be fine. Whose PR was it, anyway?
- gw: That was mine about intrinsic widths of table cells.
- pcwalton: Here's the link to dbaron's spec: http://dbaron.org/css/intrinsic/
- jack: That's the nice one that's kind of specced; it's percentage heights that's not good.
- pcwalton: There is also http://dev.w3.org/csswg/css3-tables-algorithms/Overview.src.htm
- gw: The w3 one is considered obsolete, last I looked.
- jack: Well, why don't you check dbaron's spec against your implementation, and see which one needs to change. And we can send out an intent to implement to document that this is what we're doing. And I think that should satisfy Ms2ger. The only other thing I can think of is that we can send someone from our team to adopt the spec work under dbaron's tutelage. It's a lot of work but it's not urgent. We just need to start on doing the work.
- pcwalton: At some point some site we care about is going to use percentage height and then we need to do something about it. We do something today, though I don't know what it is. This spec by dbaron may get us out of the woods for widths, but when we get to heights...
- jdm: I'd like to talk to dbaron and get a sense of his plans and timeline.
- jack: Please CC the list on that. SimonSapin and mbrubeck both have some spec-editing experience. I assume Ms2ger will chime in too.
- mbrubeck: (Speaking of people who have spec-writing experience...)
- jack: ryanc sent an email about Copy-on-Write DOM issues.
- ryanc: I was looking at the code; the basic idea is when whenever we update the DOM, we copy the node along with its ancestors all the way to the root. In the worst case I end up copying the whole DOM tree. That happens when all nodes get updated. I did a preliminary experiment, and the performance looks okay. There's one point to make: In order for the COW DOM to work, I had to modify the DOM structure to not have any sibling pointers; only parent and child pointers. After I make copies, I have to lock the original DOM and the copied DOM so that I can update the parent pointers. When I copy the DOM, the node has two parents. (See my slides for pictures.)
- jack: The big difference between this idea and our previous plan and partial implementation was that our implementation used intrusive pointers, with five pointers in the node and then five extra pointers that were the "dirty" versions. Then there was a cleanup phase to copy the dirty pointers into the clean pointers. This is similar to the external hash table of parent versions in your implementation. Our implementation has one view that is a read-only view of clean pointers only; and then the view from script which only ever touches the dirty pointers.
- jack: In your slides, some operations could be up to twice as bad. In our version, on the other hand, the memory requirement is effectively doubled because every mutable field in the DOM node needs both clean and dirty versions, which is also bad. One thing your slides illustrated is that you hope the subtree that's modified is just a small portion of the document. But in our planned implementation, we paid the memory cost for every node in the document. In both cases I think we came to the conclusion that this optimization is not worth the cost (in our case, memory; in your case, time).
- jack: We've basically decided this is not a fruitful area for further research. Now that we have most of Servo implemented, we find that the COW DOM is only useful in this narrow slice of time before the flow tree is constructed. Basically just CSS selector matching and the cascade. Due to this reduced usefulness, we basically decided we wouldn't bother. But I'm glad you did that research, because I don't think any of us had every written down the whole problem domain. After seeing it written down, it's kind of... why did we ever think this would work.
- pcwalton: It was far from a clear thing that we could have the flow tree off the main thread and do layout without touching the DOM. It seems to work, but we still need to implement incremental reflow in order to know for sure. There's still a question of how to do style recalc without janking the main thread. We may want to look at empirical data to determine why style recalc happens. We could schedule those better, or introduce APIs like getComputedStyleAsync. I think the issues that motivated the COW DOM are still real, even if COW DOM is not the right approach for it.5
- jack: It's definitely not a good approach if you pay a memory cost and cleanup phase cost for every node.
- pcwalton: And if the only benefit is style recalc off the main thread, it's not worth it. But other simpler approaches might work. Style recalc is still very expensive.
- yichoi: I understand your opinion, but I wonder if there is some use case for COW DOM, maybe in some special cases?
- jack: Originally we accessed DOM information everywhere in layout, so the use case was for parallelizing layout. But the architecture has evolved so the DOM is no longer used from the layout thread for most of layout, but only in style calculation. If we want to parallelize that, we still need something like COW DOM that allows access to the required DOM data but won't get modified while script is running. You're right that currently in Servo this use case is quite small.
- zwarich: It seems like with the intrusive approach, the space hit is really hard to justify. With the external approach, with a separate data structure, you add a big constant time factor to every DOM operation or traversal. That'll show up really badly in DOM microbenchmarks like Dromaeo. I feel like approaches that modify the DOM as a datastructure are not going to work for one reason or another.
- jack: It's hard to say. In the case where you have an external structure to save space, you might make up the performance penalty by having layout working concurrently.
- zwarich: On real pages I think that may be the case, but on benchmarks it won't help. Unless you find some clever way to avoid the slow path when there's no need for it. But that would involve more code paths, be more complicated. And it raises the question of how much you care about doing well on Dromaeo, PeaceKeeper, etc. But it's a very big impact on ability to compete on those benchmarks.
- jack: I'm sure we'll keep thinking about it.
- zwarich: Unfortunately I'm a champion of an RFC that basically says Servo's use case is bad and should be rewritten. :/ In general there are a couple of different approaches. Most of them don't use Servo's use case as an example in the RFC. One doesn't have a completed RFC yet. Obviously we want the DOM example to be idiomatic, and from the Servo point of view we'd require that for any approach. Open question, do we require statically safe downcasts?
- jdm: I don't know. We generate all of the downcasting code right now and it's correct to the best of my knowledge. It's not a really big change for us.
- jack: But also not much of a win, either.
- jdm: I guess it would be nice to know it's actually correct and not have to do it ourselves.
- zwarich: What are the ugliest parts of the current DOM implementation?
- jdm: In terms of unsafe code, it would be the casting, and writing out the trait implementations of all ancestor traits for each descendant. (That part isn't unsafe; the unsafe part is the part that says, "Yes if I can do this then I'll transmute.") Getting rid of the manual vtables and boilerplate to call the super function would be nice. It's a big pain.
- jack: Do you have a preferred proposal?
- jdm: No comment yet.
- jack: The reason I brought this up today is there's a Rust meeting tomorrow about this.
- brson: Actually, Wednesday. And jdm should come if he can.
- jdm: Depends on my schedule.
- zwarich: Me too.
- zwarich: I feel like the best options are ones that extend traits. If they can work, one of those two (the ones at the end of the list, RFC 223/91 and eddyb and kimundi's proposal which is nicely orthogonal but maybe hard to implement with a quadratic amount of code generation) seem the nicest. With some minor modifications to Rust, it is theoretically possible to implement this with detection of the duplicate code.
- jdm: The "combining enums and structs" seemed the nicest to me at first glance.
- jack: zwarich, of your two favorites, do either support safe downcasting?
- zwarich: I don't recall off the top of my head. A lot of the more recent ones require implementing a downcasting trait with unsafe code, maybe with some sugar. It seems like an afterthought with regards to usability. Open recursion is seen as the harder problem to solve, so they're focusing on that first.
- jdm: The requirement to manually upcast is also really annoying in current Servo.
- zwarich: I would hope that any solution would solve this, but maybe we should explicitly prioritize it.
- jack: Did kimundi and eddyb's proposal get a champion?
- zwarich: Aaron is the champion of it.
- jack: We'll follow up about the meeting time and attendance. The main thing to keep in mind is, I remember that when Bobby started working on the DOM code he recoiled in horror, and I hope whatever we do allows him and other Gecko people to look at Servo code and be happy.