This repository has been archived by the owner on Nov 8, 2018. It is now read-only.

How to train keras features on non-redundant/infinite set of labels #68

Open
anishsharma opened this issue May 31, 2018 · 0 comments

I am developing a neural network to classify time-series data. I know an LSTM would be the right approach for time series, but in dist-keras the data has to be in Spark DataFrame format before it is passed to a trainer.

I am following this LSTM example, and the task is to port it to dist-keras. The timestep is 50, which means the model takes values 0-49 and predicts value 50, and so on. As you can see in the example, the data is pre-processed with NumPy before being fed to Keras. Since dist-keras requires the data to be in a Spark DataFrame, I have to take a different approach, which is as follows:

I created the DataFrame directly:

    X_train = train[:, :]
    y_train = train[:, -1]
    raw_dataset_train = sc.createDataFrame(X_train.tolist())

The code above creates a DataFrame with 50 columns (the timestep is 50), named _1 through _50.
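To make the row layout concrete, here is a minimal sketch in plain Python of how a series can be cut into overlapping windows, one per row, with the last value of each row serving as the label. The helper name `make_windows` and the toy series are hypothetical and not taken from the linked example; a window of 4 stands in for the real 50.

```python
def make_windows(series, window_size):
    """Slice a 1-D sequence into overlapping rows of length window_size."""
    rows = []
    for start in range(len(series) - window_size + 1):
        rows.append(list(series[start:start + window_size]))
    return rows


# Toy series standing in for the real data (hypothetical values).
series = [float(i) for i in range(8)]
rows = make_windows(series, window_size=4)

# In each row, everything but the last value is a feature; the last is the label.
features = [row[:-1] for row in rows]
labels = [row[-1] for row in rows]
```

Each element of `rows` corresponds to one DataFrame row, so the label column simply holds the next value of the series.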

I then remove the _50 column, which is the label in our case, from the list of feature columns and apply the VectorAssembler to the remaining features:

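    from pyspark.ml.feature import VectorAssembler

    # Use every column except the label (_50) as an input feature.
    features = raw_dataset_train.columns
    features.remove('_50')
    vector_assembler = VectorAssembler(inputCols=features, outputCol="features")
    dataset_train = vector_assembler.transform(raw_dataset_train)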

Now each row of the DataFrame has the two columns I care about: features, which holds the assembled feature vector, and _50, which holds the label I want to train on and later predict. As I see it, this becomes a classification problem. My questions are:

If my approach is right, how should I define the output labels for my data, given that the output column has no finite set of values? The number of distinct labels could be as large as the number of rows in the DataFrame.

Do I still need LSTM layers in my model? I ask because I have processed the data in a non-LSTM way (at least that is what I think; I might be wrong).
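One point relevant to this second question: Keras LSTM layers expect input shaped (samples, timesteps, features), so a flat feature vector per row would need to be reshaped to (timesteps, 1) before an LSTM could consume it. A minimal sketch of that reshaping in plain Python follows. The helper name `to_lstm_shape` and the sample values are hypothetical, and three timesteps stand in for the real feature count.

```python
def to_lstm_shape(flat_rows):
    """Turn rows of shape (timesteps,) into nested lists of shape (timesteps, 1)."""
    return [[[value] for value in row] for row in flat_rows]


# Two toy rows with three timesteps each (hypothetical values).
flat_rows = [[0.1, 0.2, 0.3], [0.2, 0.3, 0.4]]
shaped = to_lstm_shape(flat_rows)

# (samples, timesteps, features per timestep)
shape = (len(shaped), len(shaped[0]), len(shaped[0][0]))
```

With NumPy the same step is a single `reshape` call on the feature array; how the reshaping would be done inside a Spark DataFrame pipeline is not shown here.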

Please advise, and let me know if you need any clarification or further information.
