allocate_matrix
was actually one of the last functions we wrote in this part. We thought the implementation would be slightly harder as opposed to other functions we needed to fill, so we left it to the end. Turned out it wasn't as hard as we thought after all.
allocate_matrix_ref
was the last function we worte in this part 1. Also, it was the primary cause of pain and agony in the debugging. Initially, we thought that the differences between allocate_matrix
and this function were the offsets and the exisiting matrix. We did not realize that the underlying structure need be preserved in both matricies. we fixed this in the debugging process of the Python-C interface, and although in the end it seemed quite straightforward, the pointer arithmetic was quite difficult at first.
We implemented deallocate_matrix
function right after we wrote allocate_matrix
. It made sense to us to do it that way, cause once we understood the general structure of how the matrix is assigned and saved on the heap, we could easily deallocate it.
Writing the get function took a relatively short time, it was basically returning a specific indexed value.
Writing the set function also took relatively hsort time, since much like the get function, it was essentially referencing a nested array and setting a value.
The code of the fill_matrix
was a mixture of the add_matrix and set functions. We implemented this function by running 2 for loops, and essentially calling the set function with each index values.
Adding two matricies was the first function we implemented, we thought it was conceptionally easier to get a grasp on, and in the process we could understand how the approach the problem. It turned out to be an easy function to implement once the structure of the matrix was clear.
This was the second function we implemented. Much of the code assembles that of the add_matrix
, with the only difference in the subtraction part.
mul_matrix
turned out to take the most of the time in this part. Even with that much spent, we ended up getting it wrong once we completed Task 1. Albeit passing the test, we did not check whether mul_matrix
worked on large matricies, and this was cause of mcuh pain later on. We ended up fixing mul_matrix
in the debugging process of Python-C interface.
Writing the pow_matrix
function also took some time, although it wasn't so hard conceptually as the mul_matrix
, as we were calling the mul_matrix to handle the heavy lifting. One thing we missed out was the case when the power is taken of 0 or 1. These two cases were handled later on in the debugging process of the Python-C interface part.
Negating a matrix seemed to be not too difficult of a task, so we did that third, once we was done with the add_matrix
and sub_matrix
.
Taking the absolute of the function was quite similar to the neg_matrix
function, the only difference was to include the conditional to see if a value at matrix[i][j] was negative.
The number methods turned out to be the easiest of the all function to write, and also took the least time. Here is detailed description for each function.
It took us some time to understand how the Python-C interface would work and the correct way to implement this function. Essentially, it was calling the function we already wrote in part 1, with the caveat that we needed to unpack arguments and do error checking. After reading the Python docs on the C interface, we implemented this function with ease.
Matrix61c_sub was much like Matrix61c_add
, we still needed to do type casting of python objects and also error checking, after which we just called the matrix_sub function we already implemented in part 1.
While initially it seemed that Matrix61c_multiply
would be very different from Matrix61c_add and others, it turned out to be very similar. We still had to do type checks, castings and error handling.
This function was the first one we implemented. Although it turned out not to be too different from Matrix61c_sub
or Matrix61c_add
, we thought it was a good starting point since we needed to do less error checking and object casting.
The implementation for Matrix61c_abs
was very similar to that of Matrix61c_neg
, with one difference in calling the matrix_abs funcion instead of neg_matrix
.
Much like the rest of the functions, we type checked python object before we casted and then did error handling both before and after calling pow_matrix
function in matrix.c.
At first, we were confused on how to deal with this functions, but after reading the docs we found the right way and got rid of compiler warnings.
Instance methods took collectively less time to implement than the Number methods, but per function, we spent more time on the Instance methods. Below is a detailed description of each function.
For Matrix61c_set_value, we first needed to check the number of argument supplied to the PyObject* args
, and make sure that it was 3. After, we basically did bunch of type checks and castings, eventually calling the set()
function in the matrix.c.
Matrix61c_get_value was very similar to Matrix61c_set_value with the caveat that we had less values in the PyObject* args
. So, we made sure that the arguments supplied were 2, and did type checks, casting and finally called get()
in matrix.cc
Frankly, this was the most tedious part in the whole project. First because we had no idea what was needed to be done, and the docs were not really helpful. After finally finding a workaround for the Matrix61c_get_value function, the Matrix61c_set_value started failing and then we were very perplexed. Turns out in the number of arguments, we are not actually supposed to hardcode those values. Just solving this tiny issue took a lot of time.
This was our sneak peek to hell. Writing the each of these functions and having them work properly took days.
Writing Matrix61c_subscript
took around 250 lines of code. Since there were a large amount of possible combinations passed to the PyObject* key
, each had to be checked seperately, both for 1D and 2D matricies. First, we implemented the subscript for 1D matricies, which took collectively about a day and a half. After thaving the 1D case work, we expanded to the general 2D case. The main difference between the 1D and 2D matricies for this marticular function weas having an extra type o variables for key
, in particular, the tuple of integer/slice.
Writing Matrix61c_set_subscript
took considerably less time now that we had finished writing the Matrix61c_subscript
function. Actually, writing Matrix61c_subscript
first made it easier for us, since we can call this function in the implementation of Matrix61c_set_subscript
, and so that saved a lot of code lines. We still had the problem of configuring whether the passed arguments were for a 1D matrix. We resorted to making a helper function named isASingleNumberIndex
which did just that. It took two argument Matrix61c *self
and PyObject *key
and returned 1 if the matrix was 1D and 0 if not. After checking if a matrix is 1D we simply implemented the set()
function in applicable places.
As a side note, in both functions that fell under indexing we did an incredible amount of error and type checking. This made sure when the unreasonable arguments were passed, a proper error was spitted out.
Making contiguous memory. This was done in the allocate_matrix
and allocate_matrix_ref
. Then we used openmp on any matrices with more than 100k elements, this made sure that the smaller matricies were not affected by the openmp yet the larger matricies still yieled results pretty fast. Next step was to use repeated squaring for log(n) runtime in pow_matrix
. After, we unrolled for loops when appropriate, usually 4x, in function specified above. Finally, we used SIMD in mul_matrix
, and also loop unrolling and openmp, and transposed the mat2 for better cache hits.