Implement More Activation Functions #107

Draft · wants to merge 16 commits into base: main

Conversation

@FrostyTheSouthernSnowman commented May 16, 2024

The README mentioned that more activation functions are on the roadmap, so I have gotten started. As of publishing, I have only added Threshold, but I plan to go down the list from the [PyTorch docs](https://pytorch.org/docs/stable/nn.functional.html#non-linear-activation-functions). Since I'm just going down the list, I'll do HardTanh next, unless someone has one they want written first. I am willing to write documentation if that is needed. Otherwise, I am trying to keep everything as close to the PyTorch docs as possible. If anybody thinks I should follow the TensorFlow, JAX, or some other library's docs instead, I am totally up for that as well. Here's the current TODO list for what I plan to implement first:

  • threshold
  • hardtanh
  • hardswish
  • selu
  • celu
  • leaky_relu
  • gelu
  • softsign
  • softplus
  • softmin
  • hardsigmoid
  • mish
  • normalize

Once I get all of those done, I'll do the rest of the ones listed on the PyTorch docs, assuming my schedule allows for it. I am more than happy to switch out anything on the list.
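
For reference, the first two entries on the list are simple piecewise functions. A minimal scalar sketch of what they compute, following the PyTorch definitions (illustrative Python only, not the project's Mojo code):

```python
def threshold(x: float, thresh: float, value: float) -> float:
    # Pass x through unchanged when it is above the threshold, otherwise replace it with `value`.
    return x if x > thresh else value


def hardtanh(x: float, min_val: float = -1.0, max_val: float = 1.0) -> float:
    # Clamp x into the range [min_val, max_val].
    return min(max(x, min_val), max_val)


print(threshold(0.3, 0.5, 0.0))  # 0.0, since 0.3 <= 0.5
print(hardtanh(2.5))             # 1.0, clamped to the default upper bound
```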

Also, I've noticed that the current AttributeVector system leads to a lot of type conversions, which I feel could hinder performance. I don't plan to look into it right now, although I could if it becomes necessary.

There also seem to be a lot of overloads in the test file. Once I implement a few more activation functions, I'd like to see if I can group them into just a few basic categories that can be reused across all the different activation functions.

@andresnowak (Collaborator) commented May 19, 2024

For the activation functions, the tests for the backward pass and also the forward pass (apart from test_activations) would go in tests/mojo/test_mlops and tests/python/test_mlops_torch. I wouldn't say there are activation functions to prioritize, but the only necessary ones for now would be leaky_relu, gelu, and selu (of the ones mentioned, I think those are the most used). That said, I think (not sure) the non-approximated (non-tanh) version of gelu is complicated to implement, because we may have to think about how to divide some parts of the code base, so some cleanups will be necessary. So I think just those functions would be enough.
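
For context on the two GELU variants mentioned above: the exact form uses the Gaussian error function, while the common approximation replaces it with a tanh expression. A minimal scalar sketch of the two formulas (plain Python, just for illustration; not tied to this codebase):

```python
import math


def gelu_exact(x: float) -> float:
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF (via erf).
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    # Tanh approximation (what PyTorch computes with approximate="tanh").
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x**3)))
```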

@FrostyTheSouthernSnowman (Author)

Ok. Will work on leaky_relu and selu, and I'll do gelu last.
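
For reference, both of those are simple elementwise formulas. A rough scalar sketch following the PyTorch definitions (illustrative Python, not the project's Mojo implementation):

```python
import math


def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    # Identity for non-negative inputs, a small linear slope for negative ones.
    return x if x >= 0.0 else negative_slope * x


# Fixed SELU constants from Klambauer et al. (2017).
_SELU_ALPHA = 1.6732632423543772
_SELU_SCALE = 1.0507009873554805


def selu(x: float) -> float:
    # scale * x for x > 0, scale * alpha * (exp(x) - 1) otherwise.
    return _SELU_SCALE * (x if x > 0.0 else _SELU_ALPHA * (math.exp(x) - 1.0))
```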

@FrostyTheSouthernSnowman (Author)

Also, I was looking through the code and realized that I forgot to write the tests in test_mlops.mojo for the activation functions. I will write those for threshold, hard_tanh, and leaky_relu, as well as the tests in test_activations.

@andresnowak (Collaborator) commented May 22, 2024

There are also tests in tests/python/test_mlops_torch.mojo. Those are the three places: test_activations, test_mlops, and test_mlops_torch.
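
For context, the idea behind the torch comparison tests is to run the same input through the project's op and through torch.nn.functional and assert that the outputs (and gradients) match. A rough Python analogue of that check, using a hypothetical my_hardtanh stand-in rather than the project's actual test code:

```python
import torch
import torch.nn.functional as F


def my_hardtanh(x: torch.Tensor, min_val: float = -1.0, max_val: float = 1.0) -> torch.Tensor:
    # Hypothetical stand-in for the implementation under test.
    return torch.clamp(x, min_val, max_val)


def test_hardtanh_matches_torch() -> None:
    x = torch.randn(64, requires_grad=True)

    expected = F.hardtanh(x, min_val=-1.0, max_val=1.0)
    actual = my_hardtanh(x)
    torch.testing.assert_close(actual, expected)

    # The backward pass should match too: compare gradients of a simple scalar loss.
    expected.sum().backward()
    grad_expected = x.grad.clone()
    x.grad = None
    actual.sum().backward()
    torch.testing.assert_close(x.grad, grad_expected)
```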

@FrostyTheSouthernSnowman (Author)

The torch compatibility tests for threshold and hardtanh show there are some bugs. Are they important enough to be worth debugging, or should I just move on to gelu and selu and delete all the threshold and hardtanh code?

@andresnowak (Collaborator)

If there are errors when comparing with the torch version, then yeah, they should be fixed. But if you want, you can delete them and work on gelu and selu, or you can fix hardtanh and threshold.

@FrostyTheSouthernSnowman (Author)

Ok. I haven't ever heard of them being used anyway, so I'll just get rid of that code and work on gelu and selu.

@FrostyTheSouthernSnowman (Author)

Is there an op for elementwise min or max, like what is denoted mathematically by, say, min(0, x)? It would be useful in a few places and could simplify some implementations.

@andresnowak (Collaborator) commented Jun 7, 2024

Yeah, in autograd/ops/basics there is the max op, or in utils/tensorutils there is the reduce op (it only reduces over the whole tensor or over one dimension).

@FrostyTheSouthernSnowman (Author)

Doesn't the current max op get the max value in the tensor? I need something that gets the max or min between two values.

@andresnowak (Collaborator) commented Jun 7, 2024

That is part of Mojo: there is already a max op in Mojo's math module (it gets the max value, or min gets the min value, between two SIMD values).
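
To illustrate why a two-argument elementwise max/min helps here: several of the activations on the list are written directly in terms of max(0, x) and min(0, x) in the PyTorch docs. A scalar Python sketch of that style (equivalent to the branching versions above):

```python
def relu(x: float) -> float:
    return max(0.0, x)


def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    # max(0, x) keeps the positive part; min(0, x) isolates the negative part.
    return max(0.0, x) + negative_slope * min(0.0, x)


def hardswish(x: float) -> float:
    # hardswish from the TODO list, built from the same clamp pattern.
    return x * min(max(0.0, x + 3.0), 6.0) / 6.0
```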

@FrostyTheSouthernSnowman (Author)

Thanks! I'll use that then.
