I noticed that the output of ctcdecode includes timesteps, which the description says can be used as an alignment.
But I just get a tensor of shape (Batchsize, N_beams, N_timesteps), which does not match the documented shape below, and I don't know how to use it.
timesteps - Shape: BATCHSIZE x N_BEAMS
The timestep at which the nth output character has peak probability. Can be used as alignment between the audio and the transcript.
Thanks in advance.
@blankspark did you ever figure out how to use them? I am looking to get word-level time alignments, but I don't know how to calculate them from the timesteps returned by ctcdecode.
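For what it's worth, here is a minimal sketch of how word-level times could be derived, going only by the description quoted above (one frame index per output character). It assumes you have already cut one beam down to its valid length, so that `chars` and `frames` line up one-to-one. The helper name `word_alignments`, the `seconds_per_frame` parameter, and the space character as word separator are all my own assumptions, not part of ctcdecode. The frame duration depends on your acoustic model's stride, so the value used here is only a placeholder.

```python
from typing import List, Sequence, Tuple


def word_alignments(
    chars: Sequence[str],
    frames: Sequence[int],
    seconds_per_frame: float,
    space: str = " ",
) -> List[Tuple[str, float, float]]:
    """Group per-character frame indices into (word, start_s, end_s) tuples.

    chars:  decoded characters of one beam (hypothetical input, already
            truncated to the beam's valid length).
    frames: frame index reported for each character, same length as chars.
    seconds_per_frame: duration of one output frame (an assumption that
            depends on the model's stride).
    """
    words: List[Tuple[str, float, float]] = []
    current: List[str] = []
    start = last = 0
    for ch, frame in zip(chars, frames):
        if ch == space:
            # A separator closes the word collected so far, if any.
            if current:
                words.append(("".join(current), start * seconds_per_frame,
                              last * seconds_per_frame))
                current = []
            continue
        if not current:
            start = int(frame)  # frame of the word's first character
        current.append(ch)
        last = int(frame)  # frame of the word's latest character
    if current:
        # Flush the final word when the text does not end with a separator.
        words.append(("".join(current), start * seconds_per_frame,
                      last * seconds_per_frame))
    return words


example = word_alignments("hi yo", [2, 4, 7, 10, 12], seconds_per_frame=0.5)
print(example)
```

With ctcdecode itself, I would expect `chars` to come from mapping the label indices of the top beam to characters and `frames` from the matching slice of `timesteps`, both truncated with the returned output length, but please treat that as an assumption to check against your own tensors. Also note that the "end" time here is only the peak frame of the word's last character, not the point where the word actually stops in the audio.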