Teammates: Theodore Hadges, Caroline Barker, and Aravind Vaddi
Inspired by the Ring Home Security system, this implementation has the added feature of automatically granting access to previously authenticated users (by analyzing visitors' faces and storing classifications).
This is a purely cloud-based (AWS) system, save for the simple front-end pages and the hardware required to trigger the video stream.
Below is a demo which simulates a visitor walking up to the house and triggering the motion sensor. (This is only
the beginning of the process.)
I trigger the passive infrared (PIR) motion sensor with my hand, but in real life it would face the same direction as the camera.
Once motion is detected, the camera module begins streaming to our stream processor in the cloud.
When we view the Kinesis Video Stream (the first stage of the processor) monitor below, you'll notice my face is displayed
a few seconds after the stream is triggered.
What happens next is conditional and depends on whether the visitor (me in this case) is a new or returning (approved) visitor.
If they are a new visitor, a notification will be sent to the homeowner via SNS in the form of a text message. The owner, upon clicking the link in the text message, will be led to a page like this:
Note that the owner must log in before being able to access this page and view visitor images.
Next, if the owner approves the visitor, the visitor's face will be indexed using Rekognition so that the system will be able to detect that particular visitor and automatically grant them access in the future, without the need for the owner's approval.
The visitor will then receive a text with a sign-in link and a one-time password (OTP) which expires in 5 minutes.
The sign-in form looks like this:
If the OTP is correct then the user will be granted access. At this point the door would be physically unlocked, but that is not the focus of this project, so instead the user is redirected to the welcome page:
However, if the user is denied, they'll be redirected to the denied page. (In this case, the door would remain locked.)
Since the owner already approved this visitor, the visitor's face details are stored in our Rekognition collection and database. If that visitor returns, we perform facial analysis again, capture a frame of the video, and train the model by indexing the new face and adding it to the same group as the previously indexed faces for this user.
The returning visitor gets a text message from the system with an OTP, and can log in via the visitor page, as above.
We used CodePipeline to auto-deploy code from our master branch on GitHub to S3. We also used the
CloudFront content delivery network to distribute our website and replicate the distribution
configuration across multiple edge locations around the world for caching content and providing
fast delivery of the website.
Here is the flowchart for this process:
Here is our full architecture diagram.
There are a few services we haven't yet added to this diagram:
- We used some S3 buckets for image storage and the front-end
- We used Cognito so that the owner needs to log in to the homeowner page (wp1).
- We used CloudFront and CodePipeline as mentioned above
- We used some hardware to sense motion and trigger a video stream to Kinesis:
  - Raspberry Pi 4
  - Passive Infrared (PIR) sensor module with Fresnel lens
  - Raspberry Pi Camera Video module
  - Jumper wires
The program if_motion_then_stream.py takes care of the logic involved with detecting motion and
initiating the video stream.
Once the PIR sensor detects a sufficient change in infrared radiation, it sends a HIGH signal (a few volts) to the corresponding GPIO pin on the Raspberry Pi.
The sensor is a little sensitive and we don't want any false positives, so the system does not begin streaming video until there are 10 consecutive time steps in which a HIGH signal is sent to the pin. Each time step is 0.1 seconds.
Once this condition is met, the red LED will turn on, and the video will begin streaming to AWS Kinesis Video Stream, the first step in our stream processor.
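The debounce rule described above can be sketched roughly as follows. This is a minimal illustration, not the contents of if_motion_then_stream.py: the function name `wait_for_motion` and the `read_pin` callable are hypothetical stand-ins for whatever GPIO read the real script performs. Only the rule itself (10 consecutive HIGH readings, 0.1 seconds apart) comes from the description.

```python
import time
from typing import Callable

REQUIRED_STEPS = 10   # consecutive HIGH readings needed (from the text)
STEP_SECONDS = 0.1    # length of one time step (from the text)


def wait_for_motion(
    read_pin: Callable[[], bool],
    required_steps: int = REQUIRED_STEPS,
    step_seconds: float = STEP_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until `read_pin` returns True `required_steps` times in a row.

    `read_pin` is a hypothetical callable; on a Raspberry Pi it would wrap
    the GPIO read for the PIR sensor pin. Returns the total number of
    samples taken, which can be useful for logging.
    """
    consecutive = 0
    samples = 0
    while consecutive < required_steps:
        samples += 1
        if read_pin():
            consecutive += 1
        else:
            consecutive = 0  # any LOW reading resets the streak
        sleep(step_seconds)
    return samples
```

Once the function returns, the script would turn on the LED and start the stream. Passing `sleep` as a parameter keeps the logic testable away from the hardware.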
The stream processor is the pipeline through which our video data flows. There are three major components:
- Kinesis Video Stream (KVS): This is the service which ingests the video stream which is sent from our camera module.
- Rekognition: This is the service that performs video facial recognition analysis on all visitors.
- Kinesis Data Stream (KDS): This is the service that captures and manages Rekognition output logs. This is the last step in the pipeline, and we have set up a trigger from KDS to the Lambda LF1 function. LF1 is the consumer of this data.
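To make the wiring between the three components concrete, here is a sketch of the request that connects a Kinesis Video Stream to a Kinesis Data Stream through a Rekognition face-search stream processor. It only builds the parameter dictionary; the helper name, the processor name, and every ARN passed to it are hypothetical placeholders. Using 85 as the match threshold is an assumption borrowed from the confidence figure quoted in the Rekognition section, not a confirmed setting.

```python
from typing import Any, Dict


def build_stream_processor_request(
    name: str,
    kvs_arn: str,
    kds_arn: str,
    collection_id: str,
    role_arn: str,
    face_match_threshold: float = 85.0,  # assumption, see lead-in
) -> Dict[str, Any]:
    """Build keyword arguments for Rekognition's CreateStreamProcessor call.

    Input is the Kinesis Video Stream, output is the Kinesis Data Stream,
    and the face-search settings name the collection of indexed faces.
    """
    return {
        "Name": name,
        "Input": {"KinesisVideoStream": {"Arn": kvs_arn}},
        "Output": {"KinesisDataStream": {"Arn": kds_arn}},
        "Settings": {
            "FaceSearch": {
                "CollectionId": collection_id,
                "FaceMatchThreshold": face_match_threshold,
            }
        },
        "RoleArn": role_arn,
    }
```

With boto3, the result could be passed as `boto3.client("rekognition").create_stream_processor(**params)`, followed by `start_stream_processor(Name=...)` to begin processing.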
- Created 2 main buckets: smart-door-image-store and smart-door-b1.
- We put all frontend files in smart-door-b1 and all images in smart-door-image-store.
- Added automatic deployment with CodePipeline as discussed in the next section.
The front-end is hosted in an S3 bucket. Since S3 objects cannot be edited in place, and downloading, editing,
then re-uploading is clumsy and inefficient, we decided to use CodePipeline.
Whenever we push to the master branch of our GitHub repository,
CodePipeline pulls those changes and copies them to our S3 bucket.
This makes for easy integration between GitHub and S3.
The website source files are hosted on S3, but we also use the CloudFront CDN, which replicates the distribution's configuration to edge locations around the world so that content is cached there and the website is delivered quickly.
We built a very simple static website with approval/login forms. Website design and UX were not a priority for us. (We can always come back to this later.)
- We built an API using API Gateway which allows us to make requests and get resources from our backend.
- This is the Lambda function in the bottom of the architecture diagram.
- Receives an event after the owner submits an approval form.
- If the new visitor is approved, do the following:
  - Upload the image to S3
  - Index the face in Rekognition
  - Add the S3 object key to the visitors table
  - Generate an OTP for the new visitor and add it to the passcodes table
  - Send a text to the new visitor containing a sign-in link and the OTP
- Otherwise, if the visitor is denied access, do nothing.
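The "generate an OTP" step in the list above might look something like this sketch. The five-minute lifetime and the expTime attribute name come from this write-up; the six-digit length, the helper name, and the phoneNumber and otp attribute names are assumptions made for illustration.

```python
import secrets
import time
from typing import Dict, Optional, Union

OTP_TTL_SECONDS = 5 * 60  # five-minute lifetime (from the text)
OTP_DIGITS = 6            # assumption: the OTP length is not stated


def new_passcode_item(
    phone_number: str, now: Optional[float] = None
) -> Dict[str, Union[str, int]]:
    """Create a record for the passcodes table with a fresh OTP.

    `expTime` is stored as whole seconds since the Unix epoch. The
    `phoneNumber` and `otp` keys are hypothetical attribute names.
    """
    now = time.time() if now is None else now
    # `secrets` draws from a cryptographically strong random source.
    otp = "".join(secrets.choice("0123456789") for _ in range(OTP_DIGITS))
    return {
        "phoneNumber": phone_number,
        "otp": otp,
        "expTime": int(now) + OTP_TTL_SECONDS,
    }
```

The `secrets` module is used rather than `random` because the OTP acts as a short-lived credential.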
Rekognition is the second step in the stream processor pipeline (Kinesis Video Stream -> Rekognition -> Kinesis Data Stream).
When a visitor's face matches a previously indexed face with at least 85% confidence, the current image of the
visitor is indexed to train the model on this visitor. Here is what the response looks like when there is
a person with ExternalImageID ted-hadges who has already visited twice:
LF1 receives the video data stream event and does some checks. If the Rekognition response shows that there is a
matching face, then that user has already been approved in the past. Thus, LF1 at this point indexes the new face,
uploads the face to S3, adds the S3 object key to that user's photos list in the visitors table, then
sends a text message to the visitor with an OTP so that they can sign in.
Otherwise, if there is not a matching face, a text message will be sent to the owner. The owner, upon clicking the link in the text and logging in via Cognito, will be presented with the image of the visitor and an accept/deny form for the new user.
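The matching check that LF1 performs can be sketched as below. This is an illustration rather than the actual LF1 code: the function name is hypothetical, and it assumes the standard shapes of a Lambda Kinesis event (base64 payload under `Records[].kinesis.data`) and of Rekognition's face-search output (`FaceSearchResponse[].MatchedFaces[]`). The 85 cutoff mirrors the confidence figure mentioned earlier.

```python
import base64
import json
from typing import Any, Dict, List


def matched_external_ids(
    event: Dict[str, Any], min_similarity: float = 85.0
) -> List[str]:
    """Return the ExternalImageId of every face matched in a Kinesis event.

    An empty list means no known visitor was recognized, in which case
    the owner would be notified instead.
    """
    ids: List[str] = []
    for record in event.get("Records", []):
        # Kinesis delivers each record's payload base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        for face_result in payload.get("FaceSearchResponse", []):
            for match in face_result.get("MatchedFaces", []):
                external_id = match.get("Face", {}).get("ExternalImageId")
                similar_enough = match.get("Similarity", 0.0) >= min_similarity
                if external_id and similar_enough and external_id not in ids:
                    ids.append(external_id)
    return ids
```

A non-empty result would lead to the returning-visitor branch (index, upload, text the OTP); an empty one would lead to the owner-notification branch.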
We use AWS Simple Notification Service (SNS) for publishing text messages to users.
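Sending the text could be sketched as follows, assuming a boto3 SNS client is passed in. The helper name, message wording, and sign-in URL are hypothetical; `publish` with `PhoneNumber` and `Message` is the SNS call for sending an SMS directly to a phone number.

```python
from typing import Any


def send_otp_text(
    sns_client: Any, phone_number: str, otp: str, sign_in_url: str
) -> str:
    """Publish an SMS containing the OTP and sign-in link.

    `sns_client` is expected to behave like `boto3.client("sns")`.
    Returns the MessageId reported by SNS.
    """
    message = (
        f"Your one-time passcode is {otp}. "
        f"It expires in 5 minutes. Sign in at {sign_in_url}"
    )
    response = sns_client.publish(PhoneNumber=phone_number, Message=message)
    return response["MessageId"]
```

Taking the client as an argument means the same function can be exercised with a stub during local testing.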
LF2 is the Lambda function at the top of the architecture diagram. All LF2 does is check whether the OTP
of the returning visitor is valid and grant/deny them access. It also flags the OTP as having been used
and sets the expiration time for that OTP to the current time in the passcodes table.
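A rough sketch of that check is shown below. It is not the actual LF2 code: the function name and the otp and used attribute names are hypothetical, while expTime is the attribute named in this write-up. The explicit expiry comparison is deliberate; if the table is DynamoDB (an assumption here), TTL-based deletion is not instantaneous, so an expired record may still be readable for a while.

```python
import hmac
import time
from typing import Any, Dict, Optional, Tuple


def check_otp(
    item: Optional[Dict[str, Any]],
    submitted_otp: str,
    now: Optional[float] = None,
) -> Tuple[bool, Optional[Dict[str, Any]]]:
    """Validate a submitted OTP against a stored passcodes record.

    Returns (granted, updated_item). On success, `updated_item` is a copy
    of the record flagged as used with `expTime` set to the current time,
    ready to be written back. On failure, it is None.
    """
    now = time.time() if now is None else now
    if item is None:  # no record found for this visitor
        return False, None
    not_expired = now < item["expTime"]
    # Constant-time comparison avoids leaking how many characters matched.
    matches = hmac.compare_digest(
        str(item["otp"]).encode(), submitted_otp.encode()
    )
    if not (not_expired and matches):
        return False, None
    updated = dict(item, used=True, expTime=int(now))
    return True, updated
```

The caller would write `updated_item` back to the passcodes table and redirect the visitor to the welcome or denied page depending on the boolean.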
There are two tables: visitors and passcodes. The visitors table has all information about visitors, while
the passcodes table holds only the visitor's phone number, their OTP, and the expiration time.
We set a TTL on the expTime attribute of the passcodes table so that each record expires at that time.
As a fun add-on, we decided to add authentication. We created a user pool and a new app client within that pool. Since the callback URL (where to redirect to once the user is signed in) must use https (not http), we used our CloudFront URL. We used an authorization code grant flow. Upon registration, the user receives a verification code via email and must enter the code on the screen to activate their account.
The only user who would need to make an account is the homeowner. Visitors would not be able to make an account; therefore the visitor login page is public.
Cognito has some nice built-in forms to handle registration and login. This is convenient in that we did not have to create a separate web page just for the form.