Add configurability via Yottato #119

Open · wants to merge 9 commits into base: master
43 changes: 34 additions & 9 deletions README.md
@@ -12,11 +12,40 @@ See the accompanying blog post for full details: https://medium.com/@harvitronix

## Requirements

This code requires you have Keras 2 and TensorFlow 1 or greater installed. Please see the `requirements.txt` file. To ensure you're up to date, run:
1. This code requires that you have Keras 2 and TensorFlow 1 or greater installed. Please see the `requirements.txt` file. To ensure you're up to date, run:

`pip install -r requirements.txt`

You must also have `ffmpeg` installed in order to extract the video files. If `ffmpeg` isn't in your system path (ie. `which ffmpeg` doesn't return its path, or you're on an OS other than *nix), you'll need to update the path to `ffmpeg` in `data/2_extract_files.py`.
2. You must also have `ffmpeg` installed in order to extract the video files. If `ffmpeg` isn't in your system path (i.e. `which ffmpeg` doesn't return its path, or you're on an OS other than *nix), you'll need to update the path to `ffmpeg` in `data/2_extract_files.py`.

3. Configuration of runs is performed using yottato, from

https://github.com/prabindh/yottato

After cloning or downloading it, run the steps below to add it to your local Python package list.

```
cd yottato
python setup.py install
```

## Configuration

Important configuration parameters (location of data, hyperparameters) are set via the JSON file at:

`config/config.json`

Typically, the parameters below need to be configured. Note: the instructions in the following sections (getting the data, running the various commands) assume the default values already set. A short read-back sketch follows this list.

- globalDataRepo : a central location where media files are stored and where analysis results are kept

  DEFAULT repo location: `./data`

- training/algorithm : the model/algorithm to use (e.g. lrcn, lstm, cnn)

  DEFAULT algorithm: `lrcn`
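
The configuration is read through yottato's `yottato` class. A minimal read-back sketch, mirroring how `demo.py` in this PR uses it (all attribute names below appear in that file):

```python
# Minimal sketch: load config.json via yottato, as demo.py in this PR does.
from yottato.yottato import yottato as yto

config = yto("config/config.json")
print(config.videoAlgorithm)   # training/algorithm, e.g. "lrcn"
print(config.videoSeqLength)   # training/sequencelength
print(config.repoDir)          # globalDataRepo, e.g. "data"
print(config.workDir)          # working directory for generated artifacts
print(config.classes)          # configured class names
```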

## Getting the data

@@ -42,26 +71,22 @@ Before you can run the `lstm` and `mlp`, you need to extract features from the i

The CNN-only method (method #1 in the blog post) is run from `train_cnn.py`.

The rest of the models are run from `train.py`. There are configuration options you can set in that file to choose which model you want to run.
The rest of the models are run from `train.py`. Configuration is performed via `config/config.json`.

The models are all defined in `models.py`. Reference that file to see which models you are able to run in `train.py`.

Training logs are saved to CSV and also to TensorBoard files. To see progress while training, run `tensorboard --logdir=data/logs` from the project root folder.
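
This PR extends the `DataSet` constructor in `data.py` to take the data repository, work directory, feature CSV, and class list explicitly (see the diff below). A minimal sketch of driving it directly, with illustrative values taken from the sample config:

```python
# Illustrative construction of the extended DataSet (a sketch; assumes frames
# have already been extracted and data/data_file.csv exists). repo_dir,
# work_dir, feature_file_path and classlist are the parameters this PR adds.
from data import DataSet

data = DataSet(
    seq_length=40,
    class_limit=None,
    image_shape=(224, 224, 3),
    repo_dir='data',                    # globalDataRepo from config.json
    work_dir='data',                    # where the 'sequences' folder lives
    feature_file_path='data_file.csv',  # featurefile from config.json
    classlist=['PlayingDhol', 'BenchPress'],
)
X, y = data.get_all_sequences_in_memory('train', 'images')
```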

## Demo/Using models

I have not yet implemented a demo where you can pass a video file to a model and get a prediction. Pull requests are welcome if you'd like to help out!
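
That said, `demo.py` is wired to the new configuration in this PR: per its usage string, it is invoked as `python demo.py <fullpath to config.json> <fullpath to HDF5 stored model>`; the demo video name itself is still hard-coded in `main()`.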

## TODO

- [ ] Add data augmentation to fight overfitting
- [x] Support multiple workers in the data generator for faster training
- [ ] Add a demo script
- [x] Add a demo script
- [ ] Support other datasets
- [ ] Implement optical flow
- [ ] Implement more complex network architectures, like optical flow/CNN fusion

## UCF101 Citation

Khurram Soomro, Amir Roshan Zamir and Mubarak Shah, UCF101: A Dataset of 101 Human Action Classes From Videos in The Wild., CRCV-TR-12-01, November, 2012.

22 changes: 22 additions & 0 deletions config/config.json
@@ -0,0 +1,22 @@
{
"classes" : ["PlayingDhol", "BenchPress"],
"globaldataRepo" : "data",
"sessionname" : "testsession",
"featurefile" : "data_file.csv",
"traintestsplit" : 70,
"training" :
[
{
"modality" : "video",
"algorithm" : "lrcn",
"sequencelength" : 40,
"maxframes": 300,
"loadtomemory" : 0,
"batchsize" : 32,
"epochs" : 1000,
"learningrate" : 0.00001,
"decay" : 0.000001,
"trainingfilelist" : "videofilelist-UCF.json"
}
]
}
28 changes: 28 additions & 0 deletions config/videofilelist-UCF.json
@@ -0,0 +1,28 @@
{
"files": [
{
"name": "UCF-101\\ApplyEyeMakeup\\v_ApplyEyeMakeup_g01_c01.avi",
"slices": [
{
"classes": [
"ApplyEyeMakeup"
],
"duration": 6,
"start": "00:00:00"
}
]
},
{
"name": "UCF-101\\ApplyLipstick\\v_ApplyLipstick_g01_c01.avi",
"slices": [
{
"classes": [
"ApplyLipstick"
],
"duration": 5,
"start": "00:00:00"
}
]
}
]
}
61 changes: 41 additions & 20 deletions data.py
@@ -32,54 +32,69 @@ def gen(*a, **kw):

class DataSet():

def __init__(self, seq_length=40, class_limit=None, image_shape=(224, 224, 3)):
def __init__(self, seq_length=40, class_limit=None, image_shape=(224, 224, 3), repo_dir='data', feature_file_path='data_file.csv', work_dir='data', classlist=[]):
"""Constructor.
seq_length = (int) the number of frames to consider
class_limit = (int) number of classes to limit the data to.
None = no limit.
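repo_dir = (str) root folder of the media/data repository (globalDataRepo)
feature_file_path = (str) CSV of samples, resolved relative to repo_dir
work_dir = (str) folder under which 'sequences' are stored
classlist = (list) class names supplied from the configuration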
"""
self.seq_length = seq_length
self.class_limit = class_limit
self.sequence_path = os.path.join('data', 'sequences')
self.sequence_path = os.path.join(work_dir, 'sequences')
self.repo_dir = repo_dir
self.work_dir = work_dir
self.max_frames = 300 # max number of frames a video can have for us to use it

# Get the data.
self.data = self.get_data()

self.data = self.get_data(os.path.join(repo_dir, feature_file_path))
print(len(self.data), "data samples in", feature_file_path)
# Get the classes.
self.classes = self.get_classes()
self.classes = self.get_classes(classlist)

# Now do some minor data cleaning.
self.data = self.clean_data()

print(len(self.data), "cleaned data samples in list")
self.image_shape = image_shape

@staticmethod
def get_data():
def get_data(feature_file_path):
"""Load our data from file."""
with open(os.path.join('data', 'data_file.csv'), 'r') as fin:
with open(feature_file_path, 'r') as fin:
reader = csv.reader(fin)
data = list(reader)

return data

def check_data(self, batch_size):
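"""Verify that the train and test splits each contain at least batch_size samples."""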
ret = True
train, test = self.split_train_test()
if len(train) < batch_size:
print ("ERROR: [%s] samples [%d] less than batch size of [%d]" % ('train', len(train), batch_size))
ret = False
if len(test) < batch_size:
print ("ERROR: [%s] samples [%d] less than batch size of [%d]" % ('test', len(test), batch_size))
ret = False
return ret


def clean_data(self):
"""Limit samples to greater than the sequence length and fewer
than N frames. Also limit it to classes we want to use."""
data_clean = []
for item in self.data:
if len(item) < 2: continue # Empty line ?
if int(item[3]) >= self.seq_length and int(item[3]) <= self.max_frames \
and item[1] in self.classes:
data_clean.append(item)

return data_clean

def get_classes(self):
def get_classes(self, classlist):
"""Extract the classes from our data. If we want to limit them,
only return the classes we need."""
classes = []
for item in self.data:
if item[1] not in classes:
if len(item) < 2: continue #Empty line ?
if item[1] not in classes: # If configured, use "item[1] in classlist:"
classes.append(item[1])

# Sort them.
@@ -130,7 +145,7 @@ def get_all_sequences_in_memory(self, train_test, data_type):
for row in data:

if data_type == 'images':
frames = self.get_frames_for_sample(row)
frames = self.get_frames_for_sample(self.repo_dir, row)
frames = self.rescale_list(frames, self.seq_length)

# Build the image sequence
@@ -148,6 +163,8 @@ def get_all_sequences_in_memory(self, train_test, data_type):

return np.array(X), np.array(y)



@threadsafe_generator
def frame_generator(self, batch_size, train_test, data_type):
"""Return a generator that we can use to train on. There are
@@ -159,7 +176,7 @@ def frame_generator(self, batch_size, train_test, data_type):
train, test = self.split_train_test()
data = train if train_test == 'train' else test

print("Creating %s generator with %d samples." % (train_test, len(data)))
print("Creating [%s] generator with [%d] samples." % (train_test, len(data)))

while 1:
X, y = [], []
@@ -175,7 +192,7 @@ def frame_generator(self, batch_size, train_test, data_type):
# Check to see if we've already saved this sequence.
if data_type == "images":
# Get and resample frames.
frames = self.get_frames_for_sample(sample)
frames = self.get_frames_for_sample(self.repo_dir, sample)
frames = self.rescale_list(frames, self.seq_length)

# Build the image sequence
@@ -212,15 +229,16 @@ def get_frames_by_filename(self, filename, data_type):
# First, find the sample row.
sample = None
for row in self.data:
if row[2] == filename:
normfilename = os.path.normpath(row[2])
if normfilename == os.path.normpath(filename):
sample = row
break
if sample is None:
raise ValueError("Couldn't find sample: %s" % filename)
raise ValueError("Couldn't find sample: %s" % os.path.normpath(filename))

if data_type == "images":
# Get and resample frames.
frames = self.get_frames_for_sample(sample)
frames = self.get_frames_for_sample(self.repo_dir, sample)
frames = self.rescale_list(frames, self.seq_length)
# Build the image sequence
sequence = self.build_image_sequence(frames)
@@ -234,12 +252,15 @@ def get_frames_by_filename(self, filename, data_type):
return sequence

@staticmethod
def get_frames_for_sample(sample):
def get_frames_for_sample(repo_dir, sample):
"""Given a sample row from the data file, get all the corresponding frame
filenames."""
path = os.path.join('data', sample[0], sample[1])
filename = sample[2]
images = sorted(glob.glob(os.path.join(path, filename + '*jpg')))
train_test = sample[0]
folder = sample[1]
pre_path = os.path.join(repo_dir, train_test, folder)
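# Search pre_path recursively ('**') so frame images in nested subfolders are matched.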
images = sorted(glob.glob(os.path.join(pre_path, '**', filename+'*.jpg'), recursive=True))
print(filename, "[", len(images), "]")
return images

@staticmethod
34 changes: 30 additions & 4 deletions demo.py
@@ -11,16 +11,29 @@
from keras.models import load_model
from data import DataSet
import numpy as np
import sys, os, json
from yottato.yottato import yottato as yto

def predict(data_type, seq_length, saved_model, image_shape, video_name, class_limit, config):

def predict(data_type, seq_length, saved_model, image_shape, video_name, class_limit):
model = load_model(saved_model)

feature_file_path = config.featureFileName
work_dir = config.workDir
classlist = config.classes

# Get the data and process it.
if image_shape is None:
data = DataSet(seq_length=seq_length, class_limit=class_limit)
data = DataSet(seq_length=seq_length, class_limit=class_limit,
feature_file_path=feature_file_path,
repo_dir=config.repoDir,
work_dir=work_dir, classlist=classlist)
else:
data = DataSet(seq_length=seq_length, image_shape=image_shape,
class_limit=class_limit)
class_limit=class_limit,
feature_file_path=feature_file_path,
repo_dir=config.repoDir,
work_dir=work_dir, classlist=classlist)

# Extract the sample from the data.
sample = data.get_frames_by_filename(video_name, data_type)
@@ -40,6 +53,18 @@ def main():
# Limit must match that used during training.
class_limit = 4

# Read the config file and the saved model path from the command line.
if len(sys.argv) > 2:
configfile = sys.argv[1]
saved_model = sys.argv[2]
else:
print("Usage: script <fullpath to config.json> <fullpath to HDF5 stored model>")
sys.exit(1)

yto_config = yto(configfile)
model = yto_config.videoAlgorithm
seq_length = yto_config.videoSeqLength

# Demo file. Must already be extracted & features generated (if model requires)
# Do not include the extension.
# Assumes it's in data/[train|test]/
@@ -49,6 +74,7 @@ def main():
#video_name = 'v_Archery_g04_c02'
video_name = 'v_ApplyLipstick_g01_c01'

video_name = os.path.normpath(video_name)
# Choose images or features and image shape based on network.
if model in ['conv_3d', 'c3d', 'lrcn']:
data_type = 'images'
@@ -59,7 +85,7 @@
else:
raise ValueError("Invalid model. See train.py for options.")

predict(data_type, seq_length, saved_model, image_shape, video_name, class_limit)
predict(data_type, seq_length, saved_model, image_shape, video_name, class_limit, yto_config)

if __name__ == '__main__':
main()