Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a Global Value Numbering Pass #51120

Open
wants to merge 49 commits into
base: master
Choose a base branch
from
Open

Add a Global Value Numbering Pass #51120

wants to merge 49 commits into from

Conversation

Zentrik
Copy link
Member

@Zentrik Zentrik commented Aug 30, 2023

This enables common subexpression elimination, fixes #50887. This is based on the NewGVN pass in LLVM.

A possible avenue for increasing the number of eliminations would be to implement code hoisting (eg. an expression that is also present in all predecessors to a block could be moved to the immediate dominator and then replaced; alternatively we could replace it with a phi node) or Partial Redundancy Elimination.

I'm not sure when it would be optimal for this pass to run, the effects of a function may allow a non inlined call to be eliminated completely whilst when inlined only parts may be deleted.

For small functions most of the time is spent on allocating two arrays (ssa_to_ssa and the array backing val_to_ssa). For larger functions the get! dominates, #51091 speeds up get! 2x. I don't think there any easy optimizations left, one possible avenue would be computing use lists (a list of every instruction in which a SSAValue is used) and using that to not have to iterate over every instruction.

base/compiler/ssair/passes.jl Outdated Show resolved Hide resolved
base/compiler/ssair/passes.jl Outdated Show resolved Hide resolved
base/compiler/ssair/passes.jl Outdated Show resolved Hide resolved
base/compiler/ssair/passes.jl Outdated Show resolved Hide resolved
base/compiler/ssair/passes.jl Outdated Show resolved Hide resolved
base/compiler/ssair/passes.jl Outdated Show resolved Hide resolved
base/compiler/ssair/passes.jl Outdated Show resolved Hide resolved
base/compiler/ssair/passes.jl Outdated Show resolved Hide resolved
Keno
Keno previously requested changes Aug 30, 2023
Copy link
Member

@Keno Keno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice work, but doesn't work as is, because it doesn't consider the state of the heap.
For example, consider the following function:

function foo()
    arr = Int[0]
    a = arr[]
    arr[] = 1
    b = arr[]
    (a, b)
end

As is, a, and b will be considered congruent and replaced with each other. However, this is incorrect.

@Keno
Copy link
Member

Keno commented Aug 30, 2023

That said, if we promote inaccessiblememonly to an IR flag, we could probably do this at least for things that don't touch memory, which might still be useful. Will have to wait until after #50805 though, since the flag space is currently full.

@giordano giordano added compiler:codegen Generation of LLVM IR and native code compiler:optimizer Optimization passes (mostly in base/compiler/ssair/) compiler:llvm For issues that relate to LLVM labels Aug 31, 2023
@Zentrik
Copy link
Member Author

Zentrik commented Aug 31, 2023

This is nice work, but doesn't work as is, because it doesn't consider the state of the heap. For example, consider the following function:

function foo()
    arr = Int[0]
    a = arr[]
    arr[] = 1
    b = arr[]
    (a, b)
end

As is, a, and b will be considered congruent and replaced with each other. However, this is incorrect.

I'm a bit confused, this seems to work fine for me. I see the same IR before the GVN pass, after the GVN pass and after "compact 2" on master (#95af5a08926).

julia> foo()
(0, 1)

julia> ir = Base.code_ircode(foo, (); optimize_until="GVN") |> only |> first
2 1 ── %1  = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Vector{Int64}, svec(Any, Int64), 0, :(:ccall), Vector{Int64}, 1, 1))::Vector{Int64}
  │          Base.arrayset(false, %1, 0, 1)::Vector{Int64}unds_setindex!
3 │    %3  = $(Expr(:boundscheck, true))::Bool     getindex
  └───       goto #16 if not %3         ││┃         _getindex
  2 ── %5  = Base.arraysize(%1, 1)::Int64││╻╷╷       checkbounds
  └───       goto #3                    ││││╻╷╷╷      checkbounds
  3 ──       goto #4                    │││││┃││││     checkbounds_indices
  4 ──       goto #5                    ││││││┃││       all
  5 ── %9  = (%5 === 1)::Bool           │││││││╻╷        _all
  └───       goto #7 if not %9          │││││││╻         _all
  6 ──       goto #8                    ││││││││  
  7 ──       goto #9                    ││││││││  
  8 ──       goto #9                    ││││││││  
  9 ┄─ %14 = φ (#7 => false, #8 => true)::Bool│   
  └───       goto #10                   │││││││   
  10 ─       goto #11                   ││││││    
  11 ─       goto #12                   │││││     
  12 ─       goto #14 if not %14        ││││      
  13 ─       goto #15                   ││││      
  14 ─       invoke Base.throw_boundserror(%1::Vector{Int64}, ()::Tuple{})::Union{}
  └───       unreachable                ││││      
  15 ─       nothing::Nothing           │         
  16 ┄ %23 = $(Expr(:boundscheck, false))::Bool      getindex
  │    %24 = Base.arrayref(%23, %1, 1)::Int64     
  └───       goto #17                   ││╻         _getindex
  17 ─       goto #18                   ││        
4 18 ─ %27 = $(Expr(:boundscheck, true))::Bool     setindex!
  └───       goto #33 if not %27        ││┃         _setindex!
  19 ─ %29 = Base.arraysize(%1, 1)::Int64││╻╷╷       checkbounds
  └───       goto #20                   ││││╻╷╷╷      checkbounds
  20 ─       goto #21                   │││││┃││││     checkbounds_indices
  21 ─       goto #22                   ││││││┃││       all
  22 ─ %33 = (%29 === 1)::Bool          │││││││╻╷        _all
  └───       goto #24 if not %33        │││││││╻         _all
  23 ─       goto #25                   ││││││││  
  24 ─       goto #26                   ││││││││  
  25 ─       goto #26                   ││││││││  
  26 ┄ %38 = φ (#24 => false, #25 => true)::Bool  
  └───       goto #27                   │││││││   
  27 ─       goto #28                   ││││││    
  28 ─       goto #29                   │││││     
  29 ─       goto #31 if not %38        ││││      
  30 ─       goto #32                   ││││      
  31 ─       invoke Base.throw_boundserror(%1::Vector{Int64}, ()::Tuple{})::Union{}
  └───       unreachable                ││││      
  32 ─       nothing::Nothing           │         
  33 ┄ %47 = $(Expr(:boundscheck, false))::Bool      setindex!
  │          Base.arrayset(%47, %1, 1, 1)::Vector{Int64}
  └───       goto #34                   ││╻         _setindex!
  34 ─       goto #35                   ││        
5 35 ─ %51 = $(Expr(:boundscheck, true))::Bool     getindex
  └───       goto #50 if not %51        ││┃         _getindex
  36 ─ %53 = Base.arraysize(%1, 1)::Int64││╻╷╷       checkbounds
  └───       goto #37                   ││││╻╷╷╷      checkbounds
  37 ─       goto #38                   │││││┃││││     checkbounds_indices
  38 ─       goto #39                   ││││││┃││       all
  39 ─ %57 = (%53 === 1)::Bool          │││││││╻╷        _all
  └───       goto #41 if not %57        │││││││╻         _all
  40 ─       goto #42                   ││││││││  
  41 ─       goto #43                   ││││││││  
  42 ─       goto #43                   ││││││││  
  43 ┄ %62 = φ (#41 => false, #42 => true)::Bool  
  └───       goto #44                   │││││││   
  44 ─       goto #45                   ││││││    
  45 ─       goto #46                   │││││     
  46 ─       goto #48 if not %62        ││││      
  47 ─       goto #49                   ││││      
  48 ─       invoke Base.throw_boundserror(%1::Vector{Int64}, ()::Tuple{})::Union{}
  └───       unreachable                ││││      
  49 ─       nothing::Nothing           │         
  50 ┄ %71 = $(Expr(:boundscheck, false))::Bool      getindex
  │    %72 = Base.arrayref(%71, %1, 1)::Int64     
  └───       goto #51                   ││╻         _getindex
  51 ─       goto #52                   ││        
6 52 ─ %75 = Core.tuple(%24, %72)::Tuple{Int64, Int64}
  └───       return %75

Looking at the flags of arrayref,arrayset and arraysize we get 0x00000110, 0x00000100 and 0x00000130 respectively so these don't get touched by this pass anyways (the consistent flag isn't set on any of them).

@Keno
Copy link
Member

Keno commented Aug 31, 2023

Hmm, I may have been wrong, and IR_FLAG_CONSISTENT might be strong enough. I'll need to think about it again to make sure.

@gbaraldi gbaraldi removed the compiler:llvm For issues that relate to LLVM label Aug 31, 2023
@Keno Keno dismissed their stale review September 4, 2023 21:37

Having thought about it some more, I think the semantics of IR_FLAG_CONSISTENT are probably sufficient to ensure that this bails on things that access non-consistent memory, so that part is probably fine.

base/compiler/ssair/passes.jl Outdated Show resolved Hide resolved
@nsajko
Copy link
Contributor

nsajko commented Sep 29, 2023

Does this fix #50887?

@Zentrik
Copy link
Member Author

Zentrik commented Sep 29, 2023

Does this fix #50887?

Yes that was the inspiration for this.

@nsajko
Copy link
Contributor

nsajko commented Sep 29, 2023

FTR, if you put a Fixes #50887 line into a commit or PR description, Github is supposed to automatically close the issue.

Also fixes bug when calling dominates, where firstedge is still set to 0 and thus out of bounds.
Also hopefully fix some indexing bugs.
It broke building Julia for some reason,
```
Segmentation fault
in expression starting at compiler/compiler.jl:181
perform_lifting! at ./compiler/ssair/passes.jl:667
```
Add fast path back in.
Prevent accidental use and make intent clearer
I think this should all be legal. Unfortunately iterating over `compact` makes this pass significantly slower, ~2x, as lots of time is spent in `renumber_ssa2!`.
This reverts commit 057d817.
This broke building somehow and was quite slow anyways.
Address review.
Fix bug in not wrapping result form `ssa_to_ssa` in a `SSAValue` before writing into `key` in `perform_symbolic_evaluation`
[no ci]
Fixes simplification of PhiNode to SSAValue by reverting indexing fix and setting `firstval` to equivalent SSAValue.
Makes pass about 5% faster on `x->@inline(exp(x))/(1+@inline(exp(x))` and halves allocations
Indices are Int's I believe.
`sort!` is not defined for memory whilst bootstrapping.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:codegen Generation of LLVM IR and native code compiler:optimizer Optimization passes (mostly in base/compiler/ssair/)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Common subexpression elimination doesn't happen even though the relevant call infers as consistent
8 participants