SignLanguageRecognition

No further work will be done on this project.

This repository contains a variety of tools that build up an experimental ecosystem for recognizing signs of German Sign Language (DGS). Our goal is an experimental attempt at live subtitling of signed gestures. To this end, we train a deep learning model (an RNN) to predict the signs made by the person being filmed. To extract face and hand positions, including multiple coordinates for each finger, we use MediaPipe, a framework for building ML pipelines.

Table of contents

  1. Demo Video
  2. Supported Words
  3. Requirements
  4. Installation
  5. Build & Run Demo
  6. Workflow

Demo Video

[demo video]

Supported Words

The following table lists all supported words with a link to SignDict and the English translation.

German | English
Bier | Beer
Computer | Computer
Deutschland | Germany
du | you
essen | (to) eat
Foto | Photo
Fußball | Soccer
gestern | yesterday
haben | (to) have
Hallo | Hello
Haus | House
heute | today
Hose | Pants
Hunger | Hunger
ich | I
Land | Country
lernen | (to) learn
lieben | (to) love
Mainz | Mainz (city in Germany)
morgen | tomorrow
rennen | (to) run
Software | Software
Sonne | Sun
Tag | Day
trinken | (to) drink
Universität | University
unser | our(s)
Welt | World
Wetter | Weather
zeigen | (to) show

Requirements

  1. Ubuntu 18.04 (other versions may not work)
  2. GPU with OpenGL ES 3.1+ support
  3. Camera with a minimum resolution of 640x480 @ 30fps

Installation

For Demo

  1. Follow the instructions to install MediaPipe. Note that Bazel version 3.5.0 is not supported; we recommend using 3.4.1. Make sure you can build and run the MediaPipe GPU examples; see https://google.github.io/mediapipe/getting_started/building_examples.html#option-2-running-on-gpu
  2. Clone the repository

Optional: To work with the code

  1. To work with the Jupyter notebooks, we recommend installing Anaconda.
  2. Install TensorFlow 2.2.0 with conda; see https://anaconda.org/anaconda/tensorflow-gpu

Demo: Build and Run

  1. Open a terminal within the repository and navigate to the src folder: cd src/
  2. Build the demo with ./build_prediction.sh
  3. Run the demo with ./run_prediction.sh
  4. Exit the application with Ctrl + C

Workflow

1. Gathering video examples

For training we need many videos of each sign we want to predict. These examples are generated by users of our platform Gebärdenfutter.

2. Extracting face and hand positions

To extract multi-hand and face detections for each frame of the videos and save them, we built a pipeline with MediaPipe; for example, have a look at the DetectionsToCSVCalculator we implemented. It simply writes the detections made by MediaPipe to CSV files.
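As a rough illustration of what this step produces, the following Python sketch extracts hand landmarks per frame and writes them to a CSV file. It is only a sketch: the repository implements this step as a C++ MediaPipe calculator (DetectionsToCSVCalculator), and the file names, column layout, and use of the Python Hands solution here are assumptions for demonstration.

```python
# Illustrative sketch only; file names and CSV layout are hypothetical.
import csv
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=2)

cap = cv2.VideoCapture("sign_example.mp4")  # hypothetical input video
with open("detections.csv", "w", newline="") as f:
    writer = csv.writer(f)
    frame_idx = 0
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB images
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            for hand_idx, hand in enumerate(results.multi_hand_landmarks):
                for lm_idx, lm in enumerate(hand.landmark):
                    # one row per landmark: frame, hand, landmark id, x, y, z
                    writer.writerow([frame_idx, hand_idx, lm_idx, lm.x, lm.y, lm.z])
        frame_idx += 1
cap.release()
hands.close()
```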

3. Training deep learning model

The CSV files are used to train a deep learning model with Keras, a high-level API for TensorFlow. To find the best hyperparameter sets, we use Weights & Biases Sweeps. Check out the lab folder.
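The following is a minimal Keras sketch of a recurrent sequence classifier over landmark sequences. It assumes fixed-length examples of flattened coordinates and is not the exact architecture from the lab notebooks; sequence length, feature count, and layer sizes are placeholder assumptions.

```python
# Minimal sketch; shapes and layer sizes are assumptions, not the project's exact model.
import tensorflow as tf

NUM_FRAMES = 60      # assumed frames per training example
NUM_FEATURES = 126   # assumed: 2 hands x 21 landmarks x 3 coordinates
NUM_CLASSES = 30     # number of supported words

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_FRAMES, NUM_FEATURES)),
    tf.keras.layers.LSTM(128, return_sequences=True),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(x_train, y_train, validation_split=0.2, epochs=50)  # data loading omitted
```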

4. Live prediction (Subtitling) Work in progress

[Visualization of the MediaPipe sign language prediction graph]

The trained model is used to predict signs on a live video stream. See the SignLangRecognitionCalculator for further details on how we try to use the model for live predictions. Currently it does not work as well as we had hoped, but it gives us an infrastructure for experiments and testing. Got ideas for improvements? Let us know!
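To make the live-prediction idea concrete, here is a Python sketch of a sliding-window predictor over incoming per-frame landmarks. The repository implements this inside a C++ MediaPipe calculator (SignLangRecognitionCalculator); the model file, window length, and label list below are hypothetical.

```python
# Sketch of sliding-window live prediction; model path, window size, and labels are hypothetical.
from collections import deque
import numpy as np
import tensorflow as tf

WINDOW = 60                                            # assumed frames fed to the model
model = tf.keras.models.load_model("sign_model.h5")    # hypothetical trained model file
labels = ["Bier", "Computer", "Deutschland"]           # ... remaining supported words omitted

buffer = deque(maxlen=WINDOW)

def on_new_frame(landmarks):
    """landmarks: flattened hand/face coordinates for one frame."""
    buffer.append(landmarks)
    if len(buffer) < WINDOW:
        return None                                    # not enough context yet
    x = np.expand_dims(np.array(buffer), axis=0)       # shape (1, WINDOW, features)
    probs = model.predict(x, verbose=0)[0]
    return labels[int(np.argmax(probs))], float(np.max(probs))
```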