
Create an Op for NumPy's generalized ufuncs #695

Open · brandonwillard opened this issue Dec 11, 2021 · 6 comments · May be fixed by #1215
Labels: help wanted (Extra attention is needed) · important · Op implementation (Involves the implementation of an Op) · refactor (This issue involves refactoring)

Comments

brandonwillard (Member) commented Dec 11, 2021

We need to extend our existing ufunc support—currently provided by the Elemwise Op—with a new Op that covers NumPy's gufuncs. Let's call this proposed Op Blockwise.

This new Blockwise Op would allow us to generalize at least a few existing Ops, and it would provide a very direct bridge to Numba and JAX's equivalent functionality (e.g. Numba's direct support for gufuncs and JAX's vmap). It would also allow us to implement a very convenient np.vectorize-like helper function.

The implementation details behind a Blockwise Op will likely involve a lot of the same logic as Elemwise and RandomVariable. At a high level, the gradient methods in the former could be extended to account for non-scalar elements, while the latter demonstrates some of the relevant shape logic.

The RandomVariable Op already works much like a gufunc: it wraps a "base" random variable operation that maps distinctly shaped inputs to potentially non-scalar outputs. A good example is the MultinomialRV Op; its gufunc-like signature would be (),(n)->(n).

Here's an illustration:

import numba
import numpy as np

import aesara.tensor as at


#
# Existing `gufunc`-like functionality provided by `RandomVariable`
#
X_rv = at.random.multinomial(np.r_[1, 2], np.array([[0.5, 0.5], [0.4, 0.6]]))
X_rv.eval()
# array([[0, 1],
#        [2, 0]])

#
# A NumPy `vectorize` equivalent (this doesn't create a `gufunc`)
#
multinomial_vect = np.vectorize(np.random.multinomial, signature="(),(n)->(n)")
multinomial_vect(np.r_[1, 2], np.array([[0.5, 0.5], [0.4, 0.6]]))
# array([[1, 0],
#        [2, 0]])


#
# A Numba example that creates a NumPy `gufunc`
#
@numba.guvectorize([(numba.int64, numba.float64[:], numba.int64[:])], "(),(n)->(n)")
def multinomial_numba(a, b, out):
    out[:] = np.random.multinomial(a, b)


multinomial_numba(np.r_[1, 2], np.array([[0.5, 0.5], [0.4, 0.6]]))
# array([[1, 0],
#        [1, 1]])
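
For comparison, JAX's vmap (mentioned above) provides the same kind of batching by mapping a "core" function over a leading loop dimension. Here's a minimal sketch using a simple core function (chosen because JAX's random API differs from NumPy's):

import jax
import jax.numpy as jnp


# A "core" function with the gufunc-like signature "(n),(n)->()"
def core_dot(a, b):
    return jnp.dot(a, b)


# `vmap` maps the core function over a leading batch dimension,
# much like a gufunc's loop dimensions
batched_dot = jax.vmap(core_dot)

batched_dot(jnp.arange(6.0).reshape(2, 3), jnp.ones((2, 3)))
# Array([ 3., 12.], dtype=float32)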

See the originating discussion here.

brandonwillard added the help wanted, important, new operator, and refactor labels on Dec 11, 2021
brandonwillard pinned this issue on Dec 11, 2021
kc611 (Member) commented Jan 15, 2022

One way I see this Op being implemented is by having it store a Function as a property (similar to how Elemwise does with its atomic functions) and then calling it on the NumPy arrays while looping over the inputs in Op.perform. (The signature and relevant shape information can be supplied and processed during __init__.) A rough sketch of that looping follows.
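
Outside of any Op machinery, a minimal sketch of that loop, assuming the stored function is a plain Python callable (np.vectorize already implements the gufunc broadcasting rules, so perform could simply defer to it; blockwise_perform is a hypothetical name):

import numpy as np


def blockwise_perform(core_fn, signature, *inputs):
    # Hypothetical stand-in for `Op.perform`: apply the "core" function
    # across the broadcasted loop dimensions implied by `signature`
    vec_fn = np.vectorize(core_fn, signature=signature)
    return vec_fn(*inputs)


# The multinomial example from the issue description
blockwise_perform(
    np.random.multinomial,
    "(),(n)->(n)",
    np.r_[1, 2],
    np.array([[0.5, 0.5], [0.4, 0.6]]),
)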

Otherwise, if we want the inner FunctionGraph to be a part of the outer FunctionGraph, that would require an interface similar to Scan's.

brandonwillard (Member, Author) commented Jan 15, 2022

One way I see this Op being implemented is by having it store a Function as a property (similar to how Elemwise does with its atomic functions) and then calling it on the NumPy arrays while looping over the inputs in Op.perform. (The signature and relevant shape information can be supplied and processed during __init__.)

This approach would preclude us from providing a np.vectorize-like interface for Ops. If I understand what you're saying correctly, that's effectively the same as the approaches mentioned below.

Otherwise, if we want the inner FunctionGraph to be a part of the outer FunctionGraph, that would require an interface similar to Scan's.

Yes, it would be similar to a narrower, streamlined Scan, or an expanded Elemwise, or an OpFromGraph with looping.

kc611 (Member) commented Jan 16, 2022

Built a rudimentary implementation: https://gist.github.com/kc611/72732f4305fe273c2da1620d6fb4e90c

We should discuss how, and at which level, we are going to check input shapes. Without the actual shape information as integer tuples, it will be very hard to determine whether the inputs are valid until we hit Op.perform, where we do have that information.

Is there any way in which #711 can help us out here?

brandonwillard (Member, Author) commented Jan 16, 2022

Built a rudimentary implementation: https://gist.github.com/kc611/72732f4305fe273c2da1620d6fb4e90c

Yes, that's a good start, especially the use of the HasInnerGraph interface.

To parse the NumPy signatures, we can use the same code NumPy uses (e.g. numpy.lib.function_base._parse_gufunc_signature and the like). My guess is that we could adapt most, if not all, of the basic logic used by np.vectorize; indeed, I believe we already have, in numerous places and in different ways.
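
For example, although _parse_gufunc_signature is a private NumPy function (its location may change between NumPy versions), it returns the core dimensions of each input and output directly:

from numpy.lib.function_base import _parse_gufunc_signature

# Each entry is a tuple of core-dimension names for one input/output
in_dims, out_dims = _parse_gufunc_signature("(),(n)->(n)")
in_dims   # [(), ('n',)]
out_dims  # [('n',)]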

Now, I think it's important that we leave the ufunc signatures out of the Op-level logic and only use them as a convenient/familiar interface, because we'll need to focus on implementing a new Op interface to replace RandomVariable.[ndim_supp, ndims_params] and Elemwise.nfunc_spec (not the NumPy function name part, though).

This interface is really the first step. From there, the work is in Op.infer_shape and Op.grad/Op.L_op/Op.R_op.
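
To sketch the infer_shape side: for a signature like "(),(n)->(n)", the output shape is the broadcast of the inputs' loop (batch) dimensions followed by the output's core dimensions. A rough illustration, with blockwise_output_shape as a hypothetical helper:

import numpy as np


def blockwise_output_shape(input_shapes, in_core_ndims, out_core_shape):
    # Hypothetical helper: broadcast the loop (batch) dims of all inputs,
    # then append the resolved core dims of the output
    loop_shapes = [
        s[: len(s) - core_nd]
        for s, core_nd in zip(input_shapes, in_core_ndims)
    ]
    return np.broadcast_shapes(*loop_shapes) + tuple(out_core_shape)


# "(),(n)->(n)" applied to inputs of shapes (2,) and (2, 3)
blockwise_output_shape([(2,), (2, 3)], [0, 1], (3,))
# (2, 3)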

We should discuss how, and at which level, we are going to check input shapes. Without the actual shape information as integer tuples, it will be very hard to determine whether the inputs are valid until we hit Op.perform, where we do have that information.

There's nothing we can, or really need to, do about that. That's the nature of this type of work: we simply do what we can with the available static information and leave run-time checks to run time.
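
To make that concrete, here is a small illustration of what is (and isn't) statically available at graph-construction time (a sketch assuming a recent Aesara):

import numpy as np

import aesara.tensor as at

x = at.matrix("x")
# At graph-construction time, only type-level information is available:
x.ndim           # 2
x.broadcastable  # (False, False) -- concrete lengths are unknown
# Concrete shapes only exist at run time:
x.shape.eval({x: np.ones((2, 3))})  # array([2, 3])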

Is there any way in which #711 can help us out here?

Yes, but now we're talking about static analysis-based optimizations. Those are great, but they come later down the line.

brandonwillard linked a pull request (#1215) on Jan 17, 2022 that will close this issue
brandonwillard (Member, Author) commented

@kc611, I just created an outline for the kind of Op that might work. You're welcome to take over that PR if you're interested in implementing this; otherwise, we can use it as a reference until either I have the time or someone else wants to take this on.

kc611 (Member) commented Jan 17, 2022

I'll give it a try.
