Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise error when inputs and mask don't have same shape in Softmax #19736

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions keras/src/layers/activations/softmax.py
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,12 @@ def __init__(self, axis=-1, **kwargs):

def call(self, inputs, mask=None):
if mask is not None:
if mask.shape != inputs.shape:
raise ValueError(
"`mask` and `inputs` must have same shape. "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They just need to be broadcastable. For instance, if the mask has shape (b, t) and the output has shape (b, t, c) that's fine.

Copy link
Contributor Author

@SuryanarayanaY SuryanarayanaY May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually documentation states that mask should be of same shape. May be we need to add here that same shape or broadcastable shape.

mask: A boolean mask of the same shape as `inputs`. The mask
          specifies 1 to keep and 0 to mask. Defaults to `None`.

Also the first dimension of shape i.e batch_size should not be considered in shape. Because if we pass input_shape as (5, 4, 1) and mask as (5,4) we get error as "Incompatible shapes: [5,4,1] vs. [5,4]". Do we need to mention this explicitly in docs?

Also the compute_output_shape method simply returns input_shape. Suppose if I pass

x = np.random.rand(7, 1, 5)
mask = np.ones((5, 1))
l_1 = keras.layers.Softmax(axis=-1,trainable = True,dtype='float32',autocast = True)
print(l_1(x).shape) #(7, 1, 5)
print(l_1(x,mask).shape) #(7, 5,1) 
print(l_1.compute_output_shape(x.shape)) #(7, 1, 5)

Noticed output shape is different from compute_output_shape method as the mask is broadcastable here.
If I change mask shape to (1,5) instead of (5,1) then all shapes are same.

Attached gist for reference in case.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @fchollet Any update on this PR? Please. Thank you!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @fchollet Any update on this PR? Please. Thank you!

f"Got inputs shape: {inputs.shape} and "
f"mask shape: {mask.shape}"
)
adder = (
1.0 - backend.cast(mask, inputs.dtype)
) * _large_negative_number(inputs.dtype)
Expand Down
Loading