Visual Odometry - Camera motion estimation

Visual Odometry is a crucial concept in Robotics Perception for estimating the trajectory of a robot (more precisely, of the camera mounted on the robot).

This project aims at implementing the steps needed to estimate the 3D motion of the camera, and provides as output a plot of the camera's trajectory.

Frames of a driving sequence taken by a camera mounted on a car, along with the scripts to extract the camera's intrinsic parameters, are given here.

Approach and implementation:

  • To estimate the 3D motion (translation and rotation) between successive frames in the sequence (minimal sketches of each step follow this list):
    • Point correspondences between successive frames were found using the SIFT (Scale-Invariant Feature Transform) algorithm. (refer to the extract_features() function in commonutils.py)
    • The fundamental matrix (F) was estimated using the 8-point algorithm inside a RANSAC loop. (refer to the fundmntl_mat_from_8_point_ransac() function in commonutils.py)
    • The essential matrix (E) was estimated from the fundamental matrix using the given camera calibration parameters. (refer to the calc_essential_matrix() function in commonutils.py)
    • The E matrix was decomposed into translation (T) and rotation (R) components, yielding four possible (R, T) combinations.
    • The correct R and T were found by testing depth positivity: for each of the four solutions, the depths of all points were linearly estimated using the cheirality equations, and the (R, T) pair that gave the maximum number of positive depth values was chosen.
    • For each frame, the position of the camera center was plotted based on the rotation and translation parameters between successive frames.
  • The calculated rotation and translation parameters, and the resulting plot, were compared against those obtained using OpenCV's cv2.findEssentialMat() and cv2.recoverPose() functions (see the sketch after this list).
  • To obtain a better estimate of the R and T parameters, the code was enhanced to solve for depth and the 3D motion non-linearly, using non-linear triangulation (for estimating depth) and non-linear PnP (for estimating R and T); a sketch of the refinement step also follows this list.
  • Refer to this page (sections 3.1-3.5.1) for more information about the steps involved.
  • An elaborate explanation of the approach, the concepts, and the pipeline can be found in the report.
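
The repository's actual feature-matching code is in extract_features() in commonutils.py. The snippet below is only a minimal sketch of that step, assuming OpenCV's SIFT interface and Lowe's ratio test (the function name match_features and the 0.75 ratio are illustrative, not taken from the repo):

```python
import cv2
import numpy as np

def match_features(img1, img2, ratio=0.75):
    """Detect SIFT keypoints in two consecutive frames and return matched point pairs."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    # Lowe's ratio test on the two nearest neighbours rejects ambiguous matches
    matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]

    pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good])
    return pts1, pts2
```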
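
Likewise, a sketch of the 8-point algorithm inside a RANSAC loop (cf. fundmntl_mat_from_8_point_ransac()). Hartley coordinate normalisation is omitted for brevity, and the iteration count and inlier threshold are illustrative values that would need tuning for raw pixel coordinates:

```python
import numpy as np

def estimate_F_8point(pts1, pts2):
    """Least-squares fundamental matrix from 8 correspondences (x2^T F x1 = 0)."""
    A = np.array([[x2*x1, x2*y1, x2, y2*x1, y2*y1, y2, x1, y1, 1.0]
                  for (x1, y1), (x2, y2) in zip(pts1, pts2)])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)      # enforce the rank-2 constraint
    S[2] = 0.0
    return U @ np.diag(S) @ Vt

def fundamental_ransac(pts1, pts2, iters=2000, thresh=0.05):
    """RANSAC: sample 8 pairs, keep the F that agrees with the most correspondences."""
    n = len(pts1)
    h1 = np.column_stack([pts1, np.ones(n)])   # homogeneous coordinates
    h2 = np.column_stack([pts2, np.ones(n)])
    best_F, best_count = None, -1
    for _ in range(iters):
        idx = np.random.choice(n, 8, replace=False)
        F = estimate_F_8point(pts1[idx], pts2[idx])
        err = np.abs(np.sum(h2 * (h1 @ F.T), axis=1))   # |x2^T F x1| per pair
        count = np.sum(err < thresh)
        if count > best_count:
            best_F, best_count = F, count
    return best_F
```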
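
The remaining geometric steps (cf. calc_essential_matrix() and the pose-disambiguation code) can be sketched as follows, assuming K is the intrinsic matrix: compute E = K^T F K, extract the four (R, T) candidates from its SVD, and keep the candidate that places the most triangulated points in front of both cameras:

```python
import numpy as np

def essential_from_fundamental(F, K):
    """E = K^T F K, re-projected so its singular values are (1, 1, 0)."""
    E = K.T @ F @ K
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt

def decompose_E(E):
    """Return the four candidate (R, T) pairs from the SVD of E."""
    U, _, Vt = np.linalg.svd(E)
    W = np.array([[0.0, -1, 0], [1, 0, 0], [0, 0, 1]])
    t = U[:, 2]
    poses = []
    for R in (U @ W @ Vt, U @ W.T @ Vt):
        if np.linalg.det(R) < 0:     # keep the rotation proper (det = +1)
            R = -R
        poses += [(R, t), (R, -t)]
    return poses

def triangulate_linear(P1, P2, x1, x2):
    """DLT triangulation of a single correspondence."""
    A = np.vstack([x1[0]*P1[2] - P1[0], x1[1]*P1[2] - P1[1],
                   x2[0]*P2[2] - P2[0], x2[1]*P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def disambiguate_pose(poses, K, pts1, pts2):
    """Cheirality check: choose the (R, T) with the most positive-depth points."""
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    best, best_count = None, -1
    for R, t in poses:
        P2 = K @ np.hstack([R, t.reshape(3, 1)])
        count = 0
        for x1, x2 in zip(pts1, pts2):
            X = triangulate_linear(P1, P2, x1, x2)
            # the point must lie in front of both cameras
            if X[2] > 0 and (R[2] @ X + t[2]) > 0:
                count += 1
        if count > best_count:
            best, best_count = (R, t), count
    return best
```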
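
For the comparison baseline, the built-in OpenCV pipeline (as used by motionestimator_inbuilt.py) reduces to two calls. The trajectory-accumulation lines shown are one common convention, not necessarily the repo's exact code; the sign of t depends on how the relative pose is interpreted:

```python
import cv2
import numpy as np

# pts1, pts2: matched pixel coordinates; K: intrinsic matrix from the given scripts
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                               prob=0.999, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

# Chain the per-frame motion to track the camera centre for plotting
# (pos and R_acc start at np.zeros(3) and np.eye(3) for the first frame)
pos = pos + R_acc @ t.ravel()
R_acc = R_acc @ R
```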
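
Finally, a sketch of the non-linear triangulation refinement: starting from the linear (DLT) estimate, the geometric reprojection error is minimised with scipy.optimize.least_squares. The repo's actual refinement, and its non-linear PnP counterpart, may be structured differently:

```python
import numpy as np
from scipy.optimize import least_squares

def reprojection_residuals(X, P1, P2, x1, x2):
    """Pixel reprojection errors of 3D point X in both views."""
    Xh = np.append(X, 1.0)
    p1, p2 = P1 @ Xh, P2 @ Xh
    return np.concatenate([p1[:2] / p1[2] - x1,
                           p2[:2] / p2[2] - x2])

def triangulate_nonlinear(X0, P1, P2, x1, x2):
    """Refine a linear triangulation by minimising reprojection error."""
    return least_squares(reprojection_residuals, X0, args=(P1, P2, x1, x2)).x
```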

Output:

Comparison of the trajectory plots calculated using the inbuilt OpenCV functions (in blue) and by estimating the F and E matrices without the inbuilt functions (in red):

(Trajectory comparison plot)

Output video can be found here

Instructions to run the code:

Input dataset

  • Go to directory: cd Code/

  • To pre-process the images:

    • $ python imagepreprocessor.py
  • To run the camera motion estimation task (implementation without OpenCV functions):

    • $ python motionestimation.py
  • You need to add the processed frames to the 'processed_data/frames' directory. Alternatively, you can add the raw frames to the './Oxford_dataset/stereo/centre/' directory

  • To estimate the camera motion using inbuilt functions:

    • $ python motionestimator_inbuilt.py
  • The accuracy of the estimated motion depended on the number of iterations of the RANSAC algorithm and on the approximations made during pose recovery.

  • The algorithm sometimes plotted different tracks with the same tuning parameters, since RANSAC's random sampling makes the output non-deterministic.

  • We experimented with different combinations of parameters and implementations.

  • To check whether our implementation calculated an appropriate essential matrix, we fed it to OpenCV's pose recovery method; this gave fair results.

  • The non-linear methods took noticeably more time to run.

References: