Behavior cloning for Error Discovery

This repository contains the code for the paper "Self Supervised Detection of Incorrect Human Demonstrations: A Path Toward Safe Imitation Learning by Robots in the Wild" by Noushad Sojib & Momotaz Begum.

BED (Behavior cloning for Error Discovery) is a behavior cloning (BC) model with an additional parameter vector $w$ of length $|D|$, one weight per demonstration. It uses a loss function that penalizes different kinds of inconsistency and drives $w_i\approx1$ for good demos and $w_i\approx0$ for bad demos. Because bad demos contribute more to the total loss, discarding them (by assigning $w_i=0$) reduces the total loss, which is how the bad demos are detected.
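The intuition can be illustrated with a toy calculation. This is only a sketch, not the loss used in the paper: it assumes each demo has a single scalar loss, that the total is the $w$-weighted sum, and that a fixed fraction `m` of the demos must be kept. The names `best_binary_weights` and `weighted_total`, and the loss values, are made up for illustration.

```python
def best_binary_weights(per_demo_loss, m):
    """Keep the round(m * |D|) lowest-loss demos (w_i = 1) and drop the rest (w_i = 0)."""
    n_keep = round(m * len(per_demo_loss))
    # Demo indices ordered from lowest to highest loss.
    order = sorted(range(len(per_demo_loss)), key=lambda i: per_demo_loss[i])
    keep = set(order[:n_keep])
    return [1.0 if i in keep else 0.0 for i in range(len(per_demo_loss))]


def weighted_total(per_demo_loss, w):
    """Total loss as the w-weighted sum of per-demo losses (an assumed form)."""
    return sum(wi * li for wi, li in zip(w, per_demo_loss))


# Made-up per-demo losses: demos 2 and 4 are the inconsistent ("bad") ones.
per_demo_loss = [0.10, 0.12, 0.95, 0.09, 1.30]
w = best_binary_weights(per_demo_loss, m=0.6)
print(w)
print(weighted_total(per_demo_loss, w), weighted_total(per_demo_loss, [1.0] * 5))
```

Under these assumptions, the weight assignment that minimizes the total while keeping three of five demos is the one that zeroes out the two highest-loss demos.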

Installation

    def forward(self, **inputs):
        enc_outputs = self.nets["encoder"](**inputs)
        self.last_latent = enc_outputs[-1, :]  # add this line: cache the last row of the encoder output
        self.latent = enc_outputs              # add this line: cache the full encoder output
        mlp_out = self.nets["mlp"](enc_outputs)
        return self.nets["decoder"](mlp_out)

Download the data

  • Layman V1.0 data for the can task: download
  • Full dataset download link will be available soon.

Training BED

Run the following command to train the BED model:

python bed_training_path.py --config path/config.json --m 0.8 --accelerate 40 --gscale 5

Example: train BED on the can data, keeping 80% of the demos. You can press Ctrl+C to stop training early.

python bed_training_path.py --config path/configs/can/bed_can_510_p20b.json --m 0.8 --accelerate 40 --gscale 5

Explanation of the arguments:

  • --config: path to the configuration file
  • --m: fraction of demos to keep (0.8 keeps 80%)
  • --accelerate: epoch after which a higher learning rate is used for faster convergence
  • --gscale: weight given to the path loss
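For reference, the four flags could be parsed along the following lines. This is a hypothetical sketch, not the parser in `bed_training_path.py`: the argument types, the `required=True` settings, and the range check on `--m` are assumptions inferred from the example command.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags documented above."""
    parser = argparse.ArgumentParser(description="Train BED (illustrative sketch)")
    parser.add_argument("--config", type=str, required=True,
                        help="path to the configuration file")
    parser.add_argument("--m", type=float, required=True,
                        help="fraction of demos to keep, e.g. 0.8")
    parser.add_argument("--accelerate", type=int, required=True,
                        help="epoch after which a higher learning rate is used")
    parser.add_argument("--gscale", type=float, required=True,
                        help="weight given to the path loss")
    return parser


def parse(argv):
    args = build_parser().parse_args(argv)
    # Assumed sanity check: --m is a fraction, not a percentage.
    if not 0.0 < args.m <= 1.0:
        raise ValueError("--m must be in (0, 1]")
    return args


args = parse(["--config", "path/config.json", "--m", "0.8",
              "--accelerate", "40", "--gscale", "5"])
print(args.config, args.m, args.accelerate, args.gscale)
```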

Expected w: With 150 demos in total and --m 0.8, we expect about 30 of them to get $w\approx0$ and about 120 of them to get $w\approx1$. Rounding makes the weights binary. Here is the expected w vector before rounding:

w:  [ 1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.
  1.    1.    1.    1.    1.    1.    1.    1.    0.49  1.    1.    1.
  1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.
  1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.
  1.    1.    1.    1.    1.    1.    1.    1.    1.    0.96  1.    1.
  1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.
  0.73  1.    1.    1.    1.    0.99  1.    1.    1.    1.    1.    1.
  1.    1.    0.99  1.    1.    0.99  0.91  1.    1.    1.    1.    1.
  1.    0.62  0.99  0.99  1.    1.    1.    0.48  0.08  0.72  1.    1.
  0.84  1.    1.    1.    1.    1.    0.99  0.99  1.    1.    1.    1.
 -0.   -0.   -0.    0.51 -0.   -0.   -0.   -0.   -0.   -0.   -0.   -0.
 -0.   -0.   -0.   -0.   -0.   -0.   -0.    0.   -0.    0.59 -0.    0.
  0.37 -0.   -0.   -0.    0.    1.  ]
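
The rounding step can be sketched as follows. This is a minimal illustration rather than code from the repository: `flag_bad_demos` and the 0.5 threshold are assumptions, and `w_excerpt` is one row copied from the vector above, so the reported indices are relative to that row.

```python
def flag_bad_demos(w, threshold=0.5):
    """Binarize a learned weight vector and list the demos flagged as bad (w_i -> 0)."""
    binary = [1 if wi >= threshold else 0 for wi in w]
    bad = [i for i, b in enumerate(binary) if b == 0]
    return binary, bad


# One row of the expected w vector shown above.
w_excerpt = [1.0, 0.62, 0.99, 0.99, 1.0, 1.0, 1.0, 0.48, 0.08, 0.72, 1.0, 1.0]
binary, bad = flag_bad_demos(w_excerpt)
print(binary)  # [1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1]
print(bad)     # [7, 8]
```

An explicit threshold is used instead of Python's built-in `round`, which rounds exact halves to the nearest even integer.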

See the training log in the logs directory for what a training run looks like. Training took 47 minutes on a single NVIDIA A40 GPU.

Acknowledgement