Usage to build CNN Network #14

Open
WuZhuoran opened this issue Jul 9, 2019 · 4 comments
Labels
question Further information is requested

Comments

@WuZhuoran
Contributor

Is there any documentation on how to build a network?

I want to try implementing a simple network on, for example, the MNIST dataset.

If there is no documentation, I think we can write one. For example, in Keras, a model can be built like this:

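import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()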
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
@WuZhuoran WuZhuoran changed the title Usage for Network Usage to build Network Jul 9, 2019
@WuZhuoran WuZhuoran changed the title Usage to build Network Usage to build CNN Network Jul 9, 2019
@ddbourgin
Owner

Unfortunately there really is no good high-level documentation at this point. This is on my TODO list, but is likely to take some time as there's a lot to document ;)

For your particular case, there are two examples of how you might go about building a full network in the models section.

In general, models using this code are going to be quite slow in comparison to any keras/tf/torch/theano implementations - the code here is optimized for readability over speed / efficiency. That said, I think it's a great idea to have some simple examples to show how the NN code corresponds to other packages.

@ddbourgin
Owner

In general, if you want to implement a model, you'll probably want the following methods as a bare minimum:

_build_network(self, ...):
    # initialize the network layers and store them within an 
    # OrderedDict so you can reliably iterate over them during the 
    # forward / backward passes

forward(self, X):
    # perform a forward pass. this is where the specific model architecture comes
    # into play, since you'll need to define how outputs from early layers flow to 
    # inputs of subsequent layers

backward(self, dLdy):
    # perform a backward pass. again, the route the gradients take through the network
    # will be specific to the particular model architecture
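
As a rough illustration of how these three methods could fit together, here is a minimal NumPy sketch. The `Dense`, `ReLU`, and `TinyMLP` classes are hypothetical and written only for this example; they are not numpy-ml's layer API, and a real model would likely also need a loss and a parameter update step.

```python
from collections import OrderedDict

import numpy as np


class Dense:
    """Hypothetical fully connected layer (not the numpy-ml API)."""

    def __init__(self, n_in, n_out, rng):
        self.W = rng.normal(scale=0.1, size=(n_in, n_out))
        self.b = np.zeros(n_out)

    def forward(self, X):
        self.X = X  # cache the input for the backward pass
        return X @ self.W + self.b

    def backward(self, dLdy):
        self.dW = self.X.T @ dLdy   # gradient w.r.t. the weights
        self.db = dLdy.sum(axis=0)  # gradient w.r.t. the bias
        return dLdy @ self.W.T      # gradient w.r.t. the layer input


class ReLU:
    """Hypothetical ReLU activation layer."""

    def forward(self, X):
        self.mask = X > 0
        return X * self.mask

    def backward(self, dLdy):
        return dLdy * self.mask


class TinyMLP:
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        self._build_network(n_in, n_hidden, n_out, np.random.default_rng(seed))

    def _build_network(self, n_in, n_hidden, n_out, rng):
        # an OrderedDict keeps the layers in a fixed, iterable order
        self.layers = OrderedDict()
        self.layers["fc1"] = Dense(n_in, n_hidden, rng)
        self.layers["relu1"] = ReLU()
        self.layers["fc2"] = Dense(n_hidden, n_out, rng)

    def forward(self, X):
        # purely sequential architecture: each output feeds the next layer
        out = X
        for layer in self.layers.values():
            out = layer.forward(out)
        return out

    def backward(self, dLdy):
        # gradients take the same route through the layers, in reverse
        grad = dLdy
        for layer in reversed(list(self.layers.values())):
            grad = layer.backward(grad)
        return grad


model = TinyMLP(n_in=4, n_hidden=8, n_out=3)
X = np.ones((5, 4))
y_pred = model.forward(X)
dLdX = model.backward(np.ones_like(y_pred))
print(y_pred.shape, dLdX.shape)  # (5, 3) (5, 4)
```

For an architecture with branches or skip connections, `forward` and `backward` would have to route activations and gradients explicitly instead of looping over the layers in order.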

@WuZhuoran
Contributor Author

So basically numpy-ml follows some kind of PyTorch way of building a model, right?

@ddbourgin
Owner

Yeah, more or less. The major difference is that this code won't have a built-in backward method - you have to implement it yourself for each model.
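
Since the backward pass is written by hand, one common way to sanity-check it is to compare it against finite differences. The sketch below assumes a toy linear layer with a sum-of-squares loss, chosen only for illustration; the `forward` and `backward` functions here are hypothetical and not part of numpy-ml.

```python
import numpy as np


def forward(W, X):
    # linear map followed by a sum-of-squares loss
    return 0.5 * np.sum((X @ W) ** 2)


def backward(W, X):
    # hand-derived gradient of the loss above with respect to W
    return X.T @ (X @ W)


rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
W = rng.normal(size=(3, 2))

analytic = backward(W, X)

# central finite differences give an independent estimate of dL/dW
eps = 1e-6
numeric = np.zeros_like(W)
for idx in np.ndindex(*W.shape):
    W_plus, W_minus = W.copy(), W.copy()
    W_plus[idx] += eps
    W_minus[idx] -= eps
    numeric[idx] = (forward(W_plus, X) - forward(W_minus, X)) / (2 * eps)

max_err = np.max(np.abs(analytic - numeric))
print("max abs difference:", max_err)
```

If the two estimates disagree by much more than rounding error, the hand-written gradient is the first place to look.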

@ddbourgin ddbourgin added the question Further information is requested label Jul 9, 2019