Process Hangs Due to Recursive Code Transformations in Rule Graphs #709
Comments
Alright! My first suggestion is to avoid using tree-sitter queries and instead use "concrete syntax," unless you truly need the full expressiveness of tree-sitter queries (such as for alternations or similar complex patterns). For example:
Now, regarding the infinite loop issue: Piranha applies transformations iteratively until it reaches a fixpoint, meaning it keeps transforming the code until nothing changes. If the code still matches after a rewrite, the same rule triggers again. This approach is effective for feature-flag cleanup, but not ideal for migrations. To prevent repeated matches, we use what we call filters. Take a look at your code with these changes:
from polyglot_piranha import execute_piranha, PiranhaArguments, Rule, RuleGraph, OutgoingEdges, Filter

def main():
    code = """from tensorflow.keras import layers
sp = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()"""
    find_from_tensorflow_keras_import_layers = Rule(
        name="find_from_tensorflow_keras_import_layers",
        query="""rgx from tensorflow.keras import layers""",  # rgx indicates regex match
        is_seed_rule=True
    )
    extend_method_chain = Rule(
        name="extend_method_chain",
        query="""cs :[builder].config(:[args+])""",  # cs indicates we are using concrete syntax
        replace_node="builder",
        replace=':[builder].config("config3", "1")',
        is_seed_rule=True,
        filters={Filter(enclosing_node="(assignment) @assign",
                        not_contains=['cs :[other].config("config3", "1")'])}  # here is the filter that says "only apply this rule once per assignment"
    )
    edge = OutgoingEdges("find_from_tensorflow_keras_import_layers", to=["extend_method_chain"], scope="Parent")
    # Create Piranha arguments
    piranha_arguments = PiranhaArguments(
        code_snippet=code,
        language="python",
        rule_graph=RuleGraph(rules=[find_from_tensorflow_keras_import_layers, extend_method_chain], edges=[edge])
    )
    # Execute Piranha and print the transformed code
    piranha_summary = execute_piranha(piranha_arguments)
    print(piranha_summary[0].content)

if __name__ == "__main__":
    main()
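To make the fixpoint behavior described above concrete, here is a toy model in plain Python. It is not Piranha's implementation: `rewrite_once`, `run_to_fixpoint`, and the regex are hypothetical stand-ins for a rule and the engine loop, and `max_steps` exists only so the unguarded run stops. The guard plays the role of a `not_contains` filter.

```python
import re

# Hypothetical stand-in for the rule's query: any `.config(...)` call.
PATTERN = re.compile(r'\.config\([^)]*\)')
ADDITION = '.config("config3", "1")'


def rewrite_once(code: str) -> str:
    """Append ADDITION after the first `.config(...)` call (toy rewrite)."""
    return PATTERN.sub(lambda m: m.group(0) + ADDITION, code, count=1)


def run_to_fixpoint(code, guard=None, max_steps=10):
    """Rewrite until nothing changes, the guard rejects, or max_steps is hit."""
    steps = 0
    while steps < max_steps:
        if guard is not None and not guard(code):
            break  # the filter says this code must not be rewritten again
        new_code = rewrite_once(code)
        if new_code == code:
            break  # fixpoint reached: the rewrite no longer changes anything
        code = new_code
        steps += 1
    return code, steps


def not_already_extended(code: str) -> bool:
    """Toy analogue of a not_contains filter."""
    return ADDITION not in code


source = 'sp = layers.builder.config("config1", "5").getOrCreate()'
unguarded, unguarded_steps = run_to_fixpoint(source)
guarded, guarded_steps = run_to_fixpoint(source, guard=not_already_extended)
print(unguarded_steps, guarded_steps)
```

Without the guard the rewritten code still matches the pattern, so the loop only stops because of the artificial step cap. With the guard it stops after a single rewrite.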
Thank you for the suggestions! I'll try using concrete syntaxes for now, but I might need to explore Tree-sitter queries in the future. The looping issue has been resolved with the
from polyglot_piranha import execute_piranha, PiranhaArguments, Rule, RuleGraph, OutgoingEdges, Filter

def main():
    code = """sp = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()"""
    find_from_tensorflow_keras_import_layers = Rule(
        name="find_from_tensorflow_keras_import_layers",
        query="""rgx from tensorflow.keras import layers""",  # rgx indicates regex match
        is_seed_rule=True
    )
    extend_method_chain = Rule(
        name="extend_method_chain",
        query="""cs :[builder].config(:[args+])""",  # cs indicates we are using concrete syntax
        replace_node="builder",
        replace=':[builder].config("config3", "1")',
        is_seed_rule=True,
        filters={Filter(enclosing_node="(assignment) @assign",
                        not_contains=['cs :[other].config("config3", "1")'])}  # here is the filter that says "only apply this rule once per assignment"
    )
    edge = OutgoingEdges("find_from_tensorflow_keras_import_layers", to=["extend_method_chain"], scope="Parent")
    # Create Piranha arguments
    piranha_arguments = PiranhaArguments(
        code_snippet=code,
        language="python",
        rule_graph=RuleGraph(rules=[find_from_tensorflow_keras_import_layers, extend_method_chain], edges=[edge])
    )
    # Execute Piranha and print the transformed code
    piranha_summary = execute_piranha(piranha_arguments)
    print(piranha_summary[0].content)

if __name__ == "__main__":
    main()
Expected output
Actual output
The second rule needs to be declared with is_seed_rule=False.
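As a rough illustration of why the seed flag matters, the sketch below models a seed rule as one that fires on its own and a non-seed rule as one that fires only when a rule pointing to it has matched. This is a simplified toy in plain Python, not the Piranha engine: `transform`, `extend_chain`, and the regex are hypothetical names, and scopes are ignored entirely.

```python
import re

# Hypothetical stand-in for the import-matching rule.
IMPORT_RE = re.compile(r"^from tensorflow\.keras import layers$", re.MULTILINE)
ADDITION = '.config("config3", "1")'


def extend_chain(code: str) -> str:
    """Toy version of extend_method_chain: add the call once, before .getOrCreate()."""
    if ADDITION in code:
        return code
    return code.replace(".getOrCreate()", ADDITION + ".getOrCreate()")


def transform(code: str, extend_is_seed: bool) -> str:
    """Run the toy two-rule graph: import rule -> extend rule."""
    import_matched = IMPORT_RE.search(code) is not None
    # A seed rule runs unconditionally; a non-seed rule needs the edge to fire.
    if extend_is_seed or import_matched:
        return extend_chain(code)
    return code


chain = 'sp = layers.builder.config("config1", "5").getOrCreate()'
with_tf = "from tensorflow.keras import layers\n" + chain
with_torch = "from pytorch.ll import layers\n" + chain

seeded = transform(with_torch, extend_is_seed=True)        # rewritten despite the wrong import
gated_torch = transform(with_torch, extend_is_seed=False)  # left untouched
gated_tf = transform(with_tf, extend_is_seed=False)        # rewritten, import present
```

In this model, marking the second rule as a seed is exactly what makes the transformation apply regardless of the import.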
Thank you so much for your help in figuring this out. I tried setting
from polyglot_piranha import execute_piranha, PiranhaArguments, Rule, RuleGraph, OutgoingEdges, Filter

def main():
    code = """from tensorflow.keras import layers
sp = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()"""
    find_from_tensorflow_keras_import_layers = Rule(
        name="find_from_tensorflow_keras_import_layers",
        query="""rgx from tensorflow.keras import layers""",  # rgx indicates regex match
        is_seed_rule=True
    )
    extend_method_chain = Rule(
        name="extend_method_chain",
        query="""cs :[builder].config(:[args+])""",  # cs indicates we are using concrete syntax
        replace_node="builder",
        replace=':[builder].config("config3", "1")',
        is_seed_rule=False,
        filters={Filter(enclosing_node="(assignment) @assign",
                        not_contains=['cs :[other].config("config3", "1")'])}  # here is the filter that says "only apply this rule once per assignment"
    )
    edge = OutgoingEdges("find_from_tensorflow_keras_import_layers", to=["extend_method_chain"], scope="Parent")
    # Create Piranha arguments
    piranha_arguments = PiranhaArguments(
        code_snippet=code,
        language="python",
        rule_graph=RuleGraph(rules=[find_from_tensorflow_keras_import_layers, extend_method_chain], edges=[edge])
    )
    # Execute Piranha and print the transformed code
    piranha_summary = execute_piranha(piranha_arguments)
    print(piranha_summary[0].content)

if __name__ == "__main__":
    main()
I suspect the reason is the scope. Rules in PolyglotPiranha trigger other rules within scopes. In your code, the rule PolyglotPiranha can support multiple scopes (they are user defined for each language). For example, for
Unfortunately, we don't have scopes defined for Python (we would need to add it to the
piranha_arguments = PiranhaArguments(
    code_snippet=code,
    language="python",
    rule_graph=RuleGraph(rules=[find_from_tensorflow_keras_import_layers, extend_method_chain], edges=[edge]),
    number_of_ancestors_in_parent_scope=30  # set a big number here
)
Let me know if this helps!
Thank you again. I tried increasing the value of
from polyglot_piranha import execute_piranha, PiranhaArguments, Rule, RuleGraph, OutgoingEdges, Filter

def main():
    code = """from tensorflow.keras import layers
sp = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()"""
    find_from_tensorflow_keras_import_layers = Rule(
        name="find_from_tensorflow_keras_import_layers",
        query="""rgx from tensorflow.keras import layers""",  # rgx indicates regex match
        is_seed_rule=True
    )
    extend_method_chain = Rule(
        name="extend_method_chain",
        query="""cs :[builder].config(:[args+])""",  # cs indicates we are using concrete syntax
        replace_node="builder",
        replace=':[builder].config("config3", "1")',
        is_seed_rule=False,
        filters={Filter(enclosing_node="(assignment) @assign",
                        not_contains=['cs :[other].config("config3", "1")'])}  # here is the filter that says "only apply this rule once per assignment"
    )
    edge = OutgoingEdges("find_from_tensorflow_keras_import_layers", to=["extend_method_chain"], scope="Parent")
    piranha_arguments = PiranhaArguments(
        code_snippet=code,
        language="python",
        rule_graph=RuleGraph(rules=[find_from_tensorflow_keras_import_layers, extend_method_chain], edges=[edge]),
        number_of_ancestors_in_parent_scope=255  # set a big number here
    )
    # Execute Piranha and print the transformed code
    piranha_summary = execute_piranha(piranha_arguments)
    print(piranha_summary[0].content)

if __name__ == "__main__":
    main()
You're right; I had forgotten. Unfortunately, parent scope only applies to nodes that are actual parents and not their context recursively, so the current workaround is to set it to global. As for the required changes, I would be happy to review your PRs, but I currently don’t have write access to this repository, so I can't promise it will be merged. The change is pretty straightforward; you just have to create a scope config file and then add its relative path to the corresponding language, as shown here: piranha/src/models/language.rs Line 201 in 6f2e9bc
In
# Scope generator for python files
[[scopes]]
name = "File"
[[scopes.rules]]
enclosing_node = """
(module) @p_m
"""
scope = "(module) @python_module"
You may also add other scopes, such as class, function, or method.
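The remark above that the parent scope covers only actual parents can be pictured with Python's standard `ast` module. This is only an analogy: Piranha works on tree-sitter trees, not on `ast`, and the `ancestors` helper is a hypothetical name introduced for this sketch. The point is that the assignment is a sibling of the import, so it never appears in the import's ancestor chain, however many ancestors are considered.

```python
import ast

code = '''from tensorflow.keras import layers
sp = layers.builder.config("config1", "5").getOrCreate()
'''
tree = ast.parse(code)


def ancestors(root, target):
    """Return the chain of nodes from target's parent up to the root."""
    parents = {}
    for node in ast.walk(root):
        for child in ast.iter_child_nodes(node):
            parents[child] = node
    chain = []
    node = target
    while node in parents:
        node = parents[node]
        chain.append(node)
    return chain


import_node = next(n for n in ast.walk(tree) if isinstance(n, ast.ImportFrom))
assign_node = next(n for n in ast.walk(tree) if isinstance(n, ast.Assign))

chain = ancestors(tree, import_node)
ancestor_types = [type(n).__name__ for n in chain]
assign_is_ancestor = assign_node in chain
print(ancestor_types, assign_is_ancestor)
```

A file-level scope sidesteps this because it targets the whole module, which contains both the import and the assignment.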
@danieltrt Thank you for your help. I created the file and successfully defined
Below is the
# Scope generator for python files
[[scopes]]
[[scopes]]
name = "Function"
[[scopes.rules]]
enclosing_node = """
(
(function_definition
name: (_) @n
parameters: (parameters) @fp
) @xdn
)"""
scope = """
(
[
(function_definition
name: (_) @z
parameters: (parameters) @tp
)
(#eq? @z "@n")
(#eq? @tp "@fp")
]
) @qdn
"""
[[scopes]]
name = "Class"
[[scopes.rules]]
enclosing_node = """
((class_definition name: (_) @n) @c)
"""
scope = """
(
((class_definition
name: (_) @z) @qc)
(#eq? @z "@n")
)
"""
[[scopes]]
name = "File"
[[scopes.rules]]
enclosing_node = """
(module) @p_m
"""
scope = "(module) @python_module"
That seems right to me for your example. PolyglotPiranha cannot find an enclosing class in
The function scope seems to have a little problem. This version works for me:
[[scopes]]
name = "Function"
[[scopes.rules]]
enclosing_node = """
(
(function_definition
name: (_) @n
parameters: (parameters) @fp
) @xdn
)"""
scope = """
(
[
(function_definition
name: (_) @z
parameters: (parameters) @tp
) @qdn
(#eq? @z "@n")
(#eq? @tp "@fp")
]
)
"""
Try running it on this code instead:
def test_fn(x):
from tensorflow.keras import layers
other = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()
first = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()
Hopefully you will observe the expected behavior 😃
It works now. Thanks a lot for your help. However, if PolyglotPiranha can’t find an enclosing class or function, should it print a warning or error message instead of raising an exception? Is this the expected behavior?
I agree, I think a warning would have been a better solution here. I don't remember the rationale for this design decision.
The PR has been opened. Thank you once again for your help.
I think we can add a flag that optionally prints a warning rather than throwing an exception.
Sure, if the expectation is to apply each rule at least once, that makes sense to me.
Thank you again for the great work. I have a question to ensure I fully understand the expectations of the rule graphs. What I'm trying to do is the following change:
before code =
after code =
Objective:
The goal is to add .config("config3", "1") to the config chain only if the import statement from tensorflow.keras import layers exists in the file. The change should not occur if the import is different, for example from pytorch.ll import layers. I created a rule graph with two nodes:
.config("config3", "1") to the chain.
Question 1: Infinite Transformation Loop When Adding .config("config3", "1")
Observation:
The process hangs when I add .config("config3", "1") because it matches the search query again. If I change the replace parameter in the extend_method_chain rule to .config("1"), the code changes successfully.
Question 2: Transformation Applied Regardless of Import Presence
Observation:
The second rule, extend_method_chain, is executed and .config("config3", "1") is added to the method chain even when from tensorflow.keras import layers is not present in the file. What I want is for the change to happen only if the import exists. Additionally, I tried changing the scope of the rule to File, expecting that to fix this error, and got the error below:
pyo3_runtime.PanicException: Could not create scope query for "File"
Below is the code that hangs
Do you have any suggestions or workarounds to fix both errors? Or am I doing something unexpected?