Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with merging/rebuilding? #8

Open
tmoux opened this issue Jan 22, 2024 · 5 comments
Open

Issue with merging/rebuilding? #8

tmoux opened this issue Jan 22, 2024 · 5 comments

Comments

@tmoux
Copy link
Contributor

tmoux commented Jan 22, 2024

I believe I have found a bug related to egraph merging/rebuilding. It occurs in this test in basic.ml:

let%test "test" =
  let g = EGraph.init () in
  let c1 = EGraph.add_sexp g [%s 1] in
  let c2 = EGraph.add_sexp g [%s 2] in
  let f1 = EGraph.add_sexp g [%s (f 1)] in
  let f2 = EGraph.add_sexp g [%s (f 2)] in
  EGraph.merge g c1 c2;
  EGraph.rebuild g;
  let query = Query.of_sexp [%s f "?a"] in
  let matches = EGraph.ematch g (EGraph.eclasses g) query |> Iter.length in
  Alcotest.(check int) "check num matches" 1 matches

Here, we merge 1 and 2, so f 1 = f 2 by congruence. Thus the query f ?a should return exactly one match. However, two matches are found instead.
This behavior is the same before and after the patch I previously made, and I believe its cause is unrelated to the e-matching algorithm itself.

After some investigating, I determined that initially, there are four e-classes, which I will call c1, c2, f1, f2, which correspond to
the expressions 1, 2, f 1, and f 2, respectively. Since 1 and 2 are merged, the e-classes c1 and c2 are merged, so f1 and f2 should be merged as well. However, they are not merged, and so both f1 and f2 are treated as e-classes we should search, erroneously adding another match. This can be verified by checking that f1 and f2 do not have the same canonical e-class id after rebuilding.

I don't yet fully understand the problem, but it seems that the rebuilding is not propagating to the parent e-classes. In particular, I'm a bit confused about the e-graph's class_members field, which purportedly maps e-classes to its children e-nodes. From my understanding (see egg paper sec. 3.1), maintaining the congruence invariant requires upward merging, which would mean storing the parent e-nodes for each e-class instead. But I could be misunderstanding it.

@kiranandcode
Copy link
Member

Oh, thanks for the bug report and the detailed debugging log!! Yepp, this does seem like a bug.

w.r.t the class_members field, you're right about the natural encoding being storing the parent enodes for each eclass. It's been a while since I looked at the code, I don't quite recall the encoding ego uses, but I remember that some of the deviations from the paper were motivated by looking at egg's source code.

Hm... this needs some more experimentation.

@tmoux
Copy link
Contributor Author

tmoux commented Jan 27, 2024

I tried to determine whether the generic version had the same bug. This analogous unit test in test_prop.ml passes:

(fun () -> let graph = EGraph.init () in
let c1 = EGraph.add_node graph (L.of_sexp [%s x]) in
let c2 = EGraph.add_node graph (L.of_sexp [%s y]) in
let _ = EGraph.add_node graph (L.of_sexp [%s not x]) in
let _ = EGraph.add_node graph (L.of_sexp [%s not y]) in
EGraph.merge graph c1 c2;
EGraph.rebuild graph;
let q = qf [%s not "?a"] in
let matches = EGraph.find_matches (EGraph.freeze graph) q |> Iter.length in
Alcotest.(check int) "1 match" 1 matches)

which seems to indicate that the bug may be limited to the basic version. From my cursory look, the implementation of rebuilding in the generic version is slightly different, and that might explain why the two implementations have different behavior in this case.

@tmoux
Copy link
Contributor Author

tmoux commented Jan 27, 2024

It seems generic.ml stores parent enodes in the eclass similar to egg:

ego/lib/generic.ml

Lines 24 to 29 in 1d9a39a

type ('node, 'data) eclass = {
mutable id: Id.t;
nodes: 'node Vector.vector;
mutable data: 'data;
parents: ('node * Id.t) Vector.vector;
}

ego/lib/generic.ml

Lines 308 to 311 in 1d9a39a

List.iter (fun child ->
let tup = (node, id) in
Vector.push ((self.@[get_class_data] child).parents) tup
) (L.children node);

@kiranandcode
Copy link
Member

oooh, yepp, that could be it!! Good find. I think I wrote the initial basic version from scratch, and then iterated on the design to improve the performance. For the generic one, I ported over the design from egg in rust, but didn't backport those changes to the basic one. Hence the bug.

@tmoux
Copy link
Contributor Author

tmoux commented Feb 2, 2024

Gotcha, the generic version is a more well tested than the basic one, so I can see how the bug might've slipped through. I think I'll try to implement a fix when I have time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants