
Fall-Detection using LSTM Autoencoder v1.2.1



Abstract

Although it is not widely known, falls are among the leading causes of accidental injury for elderly people over 65. If a fall is not detected in a timely manner, it can lead to serious injury. Moreover, such incidents can happen in uncontrolled outdoor environments, such as parking lots and limited-access roads, where the probability of being noticed and receiving prompt help is even smaller. A system that detects abnormal events (such as falls) using only camera streams, without requiring extra sensors, could provide timely aid by triggering an alarm. In this work, we provide a solution based on an LSTM Autoencoder, one of the most widely used machine learning models for anomaly detection. After applying a real-time pose estimation framework, OpenPose, to the live video source, the generated poses undergo several preprocessing steps (filtering, normalization, quantization, windowing, etc.) before being fed to the LSTM Autoencoder. The model is trained to learn the normal behaviour of a walking person from a large and heterogeneous dataset of human actions, each frame represented by 19 joint points of the human body. At test time, if the autoencoder produces an output that is too different from the corresponding input, the time window given to the model is an anomaly, meaning the person is falling.
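As a concrete illustration of the windowing step mentioned above, here is a minimal sketch. The window length and stride are assumptions for illustration only; the actual preprocessing lives in preprocessing.py.

import numpy as np

def make_windows(pose_seq, window_len=75, stride=75):
    """pose_seq: (T, F) array of per-frame pose vectors after filtering,
    normalization and quantization. Returns (N, window_len, F) windows."""
    starts = range(0, len(pose_seq) - window_len + 1, stride)
    windows = [pose_seq[s:s + window_len] for s in starts]
    if not windows:
        return np.empty((0, window_len, pose_seq.shape[1]))
    return np.stack(windows)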

Inference test on a 6-second scene (2 windows):


Table of Contents

  1. About The Project
  2. Getting Started
  3. Contacts
  4. Acknowledgements

About The Project

In this work we propose a system that monitors the pose of people in a scene through a traditional camera alone and learns how the user behaves. The system learns patterns over time and thereby builds a solid model of what normal pose behaviour is and what it is not (in particular, falling events). Both the training set and the test set were made ad hoc for this work, in order to build a dataset that represents a specific task: video surveillance in places such as pedestrian areas, parking lots and parks of various kinds. The video footage is therefore set up to capture a large area, with the camera installed at a height of about 3 meters.

Proposed models

Research shows that LSTM Autoencoders and similar models lead to promising results in detecting anomalous temporal events in a semi-supervised manner. Training a model on "normal events" only is important because, in nature, abnormal instances occur very rarely, so acquiring such data is expensive. Hence the base model used in this study is an Autoencoder with LSTM layers as its elementary units, as shown in the figure below. Throughout the model selection phase, various architectures were tried and tested in order to find the best one.


In our study we proposed several shallow and deep models. The best performance was achieved with shallow models with few units per layer, e.g. an autoencoder with a [63, 32] - [32, 64] architecture, as in the figure; a sketch of such a model follows.
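As a sketch only, an autoencoder with the layer sizes quoted above could be written in Keras as follows. The window length and feature count are assumptions (19 joints with 2 coordinates each), not values taken from model_and_training.py.

from tensorflow.keras import layers, models

WINDOW_LEN = 75   # assumed frames per window
N_FEATURES = 38   # assumed: 19 joint points x 2 coordinates

model = models.Sequential([
    layers.Input(shape=(WINDOW_LEN, N_FEATURES)),
    # Encoder: compress the whole window into one latent vector
    layers.LSTM(63, return_sequences=True),
    layers.LSTM(32, return_sequences=False),
    # Repeat the latent vector so the decoder can unroll it over time
    layers.RepeatVector(WINDOW_LEN),
    # Decoder: mirror of the encoder
    layers.LSTM(32, return_sequences=True),
    layers.LSTM(64, return_sequences=True),
    # Reconstruct the input features at every time step
    layers.TimeDistributed(layers.Dense(N_FEATURES)),
])
model.compile(optimizer="adam", loss="mse")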

Reconstruction error

Illustration of the different learning behaviours on the movements with MSE and MAE losses.
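A hedged sketch of how the reconstruction error can be computed and thresholded at test time. The names model, windows and the threshold choice are assumptions; the repository's actual metrics live in evaluation.py.

import numpy as np

def reconstruction_errors(model, windows):
    """Per-window MSE and MAE between each input window and its
    reconstruction produced by the autoencoder."""
    recon = model.predict(windows)                       # (N, T, F)
    mse = np.mean((windows - recon) ** 2, axis=(1, 2))   # one score per window
    mae = np.mean(np.abs(windows - recon), axis=(1, 2))
    return mse, mae

# A window whose error exceeds a threshold (e.g. a high percentile of the
# errors measured on the "normal" training data) is flagged as an anomaly,
# i.e. a possible fall.
def is_fall(scores, threshold):
    return scores > threshold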


Frameworks

Getting Started

Instructions for the code and data are given below.

Code structure

src
├── draw_skeletons.py
│   ──> contains procedures for generating images/videos from skeleton data
├── preprocessing.py
│   ──> contains all the preprocessing functions
├── model_and_training.py
│   ──> contains the model architectures and the training procedure
└── evaluation.py
    ──> contains all the metrics and functions used for evaluating the model

Dataset structure

Before training, you have to set up the dataset directory in a precise manner. The preprocessing stage takes two datasets, the train set and the test set. Each one is a directory of directories, and the preprocessing procedure scans every subdirectory in alphabetical order, or in the order specified in the code (the order of the list of strings holding the directory names), collecting all the JSON files. A minimal loading sketch follows the tree below.

Dataset
├── Train-Set
│   ├── Dir_1
│   │   └── json data
│   ├── ...
│   │   └── ...
│   └── Dir_N
│       └── json data
└── Test-Set
    ├── Dir_1
    │   └── json data
    ├── ...
    │   └── ...
    └── Dir_N
        └── json data
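A minimal loading sketch matching the layout above, assuming the default alphabetical order. The helper name and paths are illustrative, not the repository's actual API.

import json
from pathlib import Path

def collect_json(split_dir):
    """Scan each subdirectory of a split in alphabetical order and
    collect every JSON file it contains."""
    data = []
    for sub in sorted(Path(split_dir).iterdir()):   # Dir_1 ... Dir_N
        if not sub.is_dir():
            continue
        for jf in sorted(sub.glob("*.json")):       # all JSON in this dir
            with open(jf) as f:
                data.append(json.load(f))
    return data

train_data = collect_json("Dataset/Train-Set")
test_data = collect_json("Dataset/Test-Set")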

Contacts

Acknowledgements