0.4.1
Changelog
Bug Fixes
- Bug: Size mismatch in tensor operations in the forward method of the DilatedAttentionLLAMA class.
  - Root Cause: The tensors being operated on had mismatched dimensions due to incorrect striding operations.
  - Resolution: Modified the dilation process to introduce an inner loop over split tensors, handling each part separately, which resolved the dimension mismatch.
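The fix above can be sketched as follows. This is an illustrative reconstruction, not the project's actual code: `dilated_segments`, `segment_size`, and `dilation_rate` are hypothetical names, and the idea shown is splitting the sequence into segments and striding within each segment so every part is processed with matching shapes.

```python
import torch

# Hypothetical sketch: process the sequence in fixed-size segments so that
# strided (dilated) indexing inside each segment keeps dimensions aligned.
def dilated_segments(x, segment_size=4, dilation_rate=2):
    # x: (batch, seq_len, dim); seq_len assumed divisible by segment_size
    outputs = []
    for segment in torch.split(x, segment_size, dim=1):
        # inner loop over split tensors: take every dilation_rate-th
        # position within this segment, handling each part separately
        outputs.append(segment[:, ::dilation_rate, :])
    return torch.cat(outputs, dim=1)

x = torch.randn(2, 8, 16)
out = dilated_segments(x)
print(out.shape)  # torch.Size([2, 4, 16])
```

Because each segment is strided independently, every intermediate tensor in a segment has a predictable shape, avoiding the mismatches seen when striding the full sequence at once.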
- Bug: Index out of range error when transposing tensors.
  - Root Cause: The index passed to the transpose operation exceeded the number of dimensions in the tensor.
  - Resolution: Corrected the index passed to the transpose operation so it falls within the tensor's number of dimensions.
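A minimal reproduction of this class of error (the shapes here are illustrative, not taken from the project):

```python
import torch

# A 3-D tensor only has dimensions 0, 1, and 2, so asking transpose
# for dimension 3 raises an IndexError in PyTorch.
x = torch.randn(2, 4, 8)   # (batch, seq, dim) -> x.ndim == 3
try:
    x.transpose(1, 3)      # dim 3 does not exist
except IndexError as e:
    print("out of range:", e)

y = x.transpose(1, 2)      # valid: swaps the last two axes
print(y.shape)             # torch.Size([2, 8, 4])
```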
Improvements
- Optimized Tensor Operations: Tensor operations in the forward method were optimized so they all operate on tensors with matching dimensions, improving the model's efficiency.
- Added Error Handling: Added checks for dimension mismatches in tensor operations so that useful error messages are raised when the input data does not match the expected shape.
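The kind of check described above might look like the following sketch. The helper name `check_shape` and its message format are assumptions for illustration; the point is failing fast with a message naming the actual versus expected shape, instead of letting a downstream operation raise a cryptic size-mismatch error.

```python
import torch

# Hypothetical shape check: raise a descriptive ValueError up front
# rather than letting a later matmul fail with an opaque message.
def check_shape(x, expected_last_dim, name="input"):
    if x.shape[-1] != expected_last_dim:
        raise ValueError(
            f"{name} has last dimension {x.shape[-1]}, "
            f"expected {expected_last_dim}"
        )

x = torch.randn(2, 8, 16)
check_shape(x, 16)  # passes silently
try:
    check_shape(x, 32)
except ValueError as e:
    print(e)
```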
Features
- DilatedAttentionLLAMA Class: Introduced a new DilatedAttentionLLAMA class whose forward method uses a dilated attention mechanism. This new implementation is designed to be more efficient for larger sequence lengths.
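A minimal sketch of the dilated attention idea follows. This is not the project's DilatedAttentionLLAMA implementation; the class name, `segment_size`, and `dilation_rate` are illustrative, and the scheme shown (attention restricted to dilated positions within each segment) is one common reading of dilated attention.

```python
import torch
import torch.nn as nn

# Illustrative dilated attention: each segment attends only over its own
# dilated positions, shrinking the attention matrix for long sequences.
class DilatedAttentionSketch(nn.Module):
    def __init__(self, dim, segment_size=4, dilation_rate=2):
        super().__init__()
        self.segment_size = segment_size
        self.dilation_rate = dilation_rate
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x):
        # x: (batch, seq_len, dim); seq_len assumed divisible by segment_size
        outputs = []
        for segment in torch.split(x, self.segment_size, dim=1):
            sparse = segment[:, ::self.dilation_rate, :]  # dilated positions
            q, k, v = self.qkv(sparse).chunk(3, dim=-1)
            scale = sparse.shape[-1] ** 0.5
            attn = torch.softmax(q @ k.transpose(1, 2) / scale, dim=-1)
            ctx = attn @ v                    # (batch, sparse_len, dim)
            # scatter attended values back into the dilated slots
            dense = torch.zeros_like(segment)
            dense[:, ::self.dilation_rate, :] = ctx
            outputs.append(dense)
        return self.out(torch.cat(outputs, dim=1))

model = DilatedAttentionSketch(dim=16)
y = model(torch.randn(2, 8, 16))
print(y.shape)  # torch.Size([2, 8, 16])
```

The per-segment attention matrix here is (segment_size / dilation_rate)² rather than seq_len², which is where the efficiency gain for long sequences comes from.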
- Performance Testing: Added a simple performance test to benchmark the speed of the forward method in the DilatedAttentionLLAMA class.
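A simple benchmark of this kind might look like the sketch below. The harness name `benchmark`, the warmup/iteration counts, and the stand-in `nn.Linear` model are all assumptions; in the actual test the forward method of the real model would be timed instead.

```python
import time
import torch

# Hypothetical timing harness: run the forward pass several times
# after a short warmup and report the mean latency per call.
def benchmark(fn, x, warmup=2, iters=10):
    for _ in range(warmup):
        fn(x)
    start = time.perf_counter()
    for _ in range(iters):
        fn(x)
    return (time.perf_counter() - start) / iters

x = torch.randn(2, 64, 32)
model = torch.nn.Linear(32, 32)   # stand-in model for illustration
mean_s = benchmark(model, x)
print(f"mean forward latency: {mean_s * 1e3:.3f} ms")
```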