SBUF Data Dependency Issue #1045

Open
MBshara opened this issue Nov 29, 2024 · 1 comment

Comments

@MBshara

MBshara commented Nov 29, 2024

Hello, I was loading a tensor into SBUF tile by tile, as shown below, but ran into an issue: only the last "out_" index ended up with correctly loaded values in "weights_mat_mul_2".
I got no compiler error. Since I am loading sequentially, I expected w_temp to always load the correct data into weights_mat_mul_2, as w_temp is overwritten on every iteration.

weights_mat_mul_2 = nl.ndarray((filter_height, filter_width, n_tiles_c_out, nl.par_dim(c_in_pmax), n_tiles_c_in, c_in_pmax), dtype=W.dtype, buffer=nl.sbuf)
w_temp = nl.ndarray((nl.par_dim(c_in_pmax), n_tiles_c_in, c_in_pmax, filter_height, filter_width), dtype=W.dtype, buffer=nl.sbuf)
for out_ in nl.sequential_range(n_tiles_c_out):
    w_temp[...] = nl.load(W_reshaped[out_,:,:,:,:,:])
    for fH in nl.affine_range(filter_height):
        for fW in nl.affine_range(filter_width):
            for in_ in nl.affine_range(n_tiles_c_in):
                weights_mat_mul_2[fH,fW,out_,:,in_,:] = w_temp[:,in_,:,fH,fW]

I then tested in simulation, and every tile in weights_mat_mul_2 contained the correct values. I then tried moving the declaration of "w_temp" inside the outermost for loop, and all problems were resolved:

weights_mat_mul_2 = nl.ndarray((filter_height, filter_width, n_tiles_c_out, nl.par_dim(c_in_pmax), n_tiles_c_in, c_in_pmax), dtype=W.dtype, buffer=nl.sbuf)
for out_ in nl.sequential_range(n_tiles_c_out):
    w_temp = nl.ndarray((nl.par_dim(c_in_pmax), n_tiles_c_in, c_in_pmax, filter_height, filter_width), dtype=W.dtype, buffer=nl.sbuf)
    w_temp[...] = nl.load(W_reshaped[out_,:,:,:,:,:])
    for fH in nl.affine_range(filter_height):
        for fW in nl.affine_range(filter_width):
            for in_ in nl.affine_range(n_tiles_c_in):
                weights_mat_mul_2[fH,fW,out_,:,in_,:] = w_temp[:,in_,:,fH,fW]
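
For checking device output against the intended result, the copy loops above can be written as a plain NumPy reference. This is only a sketch under stated assumptions: the shape of W_reshaped is inferred from the shape of w_temp and the load index, and the helper names (reference_layout, reference_layout_loops, mismatched_out_tiles) are hypothetical, not part of NKI. It does not explain the hardware behavior; it only makes it easy to see which "out_" tiles differ.

```python
import numpy as np


def reference_layout(w_reshaped):
    """Expected contents of weights_mat_mul_2 for a given W_reshaped.

    Assumed input shape (inferred from w_temp, not confirmed in the thread):
        (n_tiles_c_out, c_in_pmax, n_tiles_c_in, c_in_pmax, filter_height, filter_width)
    Output shape:
        (filter_height, filter_width, n_tiles_c_out, c_in_pmax, n_tiles_c_in, c_in_pmax)
    """
    # Move the two filter axes to the front; keep the rest in order.
    return np.transpose(w_reshaped, (4, 5, 0, 1, 2, 3))


def reference_layout_loops(w_reshaped):
    """Same result, written with the loop structure used in the kernel."""
    n_out, p, n_in, c, fh, fw = w_reshaped.shape
    out = np.empty((fh, fw, n_out, p, n_in, c), dtype=w_reshaped.dtype)
    for out_ in range(n_out):
        w_temp = w_reshaped[out_]  # stands in for the nl.load into w_temp
        for fH in range(fh):
            for fW in range(fw):
                for in_ in range(n_in):
                    out[fH, fW, out_, :, in_, :] = w_temp[:, in_, :, fH, fW]
    return out


def mismatched_out_tiles(actual, expected):
    """Return the out_ indices whose tiles differ between two arrays."""
    return [
        o
        for o in range(expected.shape[2])
        if not np.array_equal(actual[:, :, o], expected[:, :, o])
    ]
```

Passing the array copied back from the device as actual and reference_layout(W_reshaped) as expected to mismatched_out_tiles lists the affected tiles; an empty list means every tile matches the reference.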
@fayyadd

fayyadd commented Nov 29, 2024

Thanks for reaching out. We are looking into the issue.
