Allow Parallel(+, f)(x, y, z) to work like broadcasting, and enable Chain(identity, Parallel(+, f))(x, y, z) #2393

Open · mcabbott wants to merge 7 commits into master

Conversation

mcabbott (Member)

At present Parallel allows multiple layers and one input, but not the reverse. This PR extends it to allow both ways... much like broadcasting in connection((inputs .|> layers)...).

julia> Parallel(+, inv)(1, 2, 3)  # was an error
1.8333333333333333

julia> (1,2,3) .|> (inv,)
(1.0, 0.5, 0.3333333333333333)
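
For comparison, here is the direction that already works, one input broadcast to every layer (a small illustrative example, not taken from the PR itself):

julia> Parallel(+, inv, sqrt)(4.0)  # each layer sees the same input: inv(4.0) + sqrt(4.0)
2.25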

Does this have any unintended side-effects?

PR Checklist

  • Tests are added
  • Entry in NEWS.md
  • Documentation, if applicable

codecov bot commented Mar 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.03%. Comparing base (eb6492c) to head (0544711).

❗ Current head 0544711 differs from pull request most recent head 9ee2c69. Consider uploading reports for the commit 9ee2c69 to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #2393       +/-   ##
===========================================
+ Coverage   43.04%   74.03%   +30.98%     
===========================================
  Files          32       32               
  Lines        1856     1918       +62     
===========================================
+ Hits          799     1420      +621     
+ Misses       1057      498      -559     


src/layers/basic.jl — review comment (outdated, resolved)

mcabbott commented Mar 13, 2024

Here's the complete run-down on where Flux does & doesn't splat at present:

julia> using Flux

julia> pr(x) = begin println("arg: ", x); x end;

julia> pr(x...) = begin println(length(x), " args: ", join(x, " & "), " -> tuple"); x end;

julia> c1 = Chain(pr, pr); ########## simple chain

julia> c1(1)
arg: 1
arg: 1
1

julia> c1((1, 2))
arg: (1, 2)
arg: (1, 2)
(1, 2)

julia> c1(1, 2)
ERROR: MethodError:
Closest candidates are:
  (::Chain)(::Any)

julia> p1 = Parallel(pr, a=pr);  ########## combiner + one layer

julia> p1(1)
arg: 1
arg: 1
1

julia> p1((1, 2))  # one 2-Tuple is NOT accepted, always splatted  --> changed by PR
ERROR: ArgumentError: Parallel with 1 sub-layers can take one input or 1 inputs, but got 2 inputs

julia> p1(1, 2)  # more obvious error  --> changed by PR
ERROR: ArgumentError: Parallel with 1 sub-layers can take one input or 1 inputs, but got 2 inputs

julia> p1((a=1, b=2))  # one NamedTuple is ok
arg: (a = 1, b = 2)
arg: (a = 1, b = 2)
(a = 1, b = 2)

julia> p1((((1,),),))  # splatted many times
arg: 1
arg: 1
1

julia> p2 = Parallel(pr, a=pr, b=pr);  ########## combiner + two layers

julia> p2(1)  # one non-tuple arg is broadcasted
arg: 1
arg: 1
2 args: 1 & 1 -> tuple
(1, 1)

julia> p2(1, 2)  # 2 args sent to 2 layers
arg: 1
arg: 2
2 args: 1 & 2 -> tuple
(1, 2)

julia> p2((1, 2))  # one tuple splatted
arg: 1
arg: 2
2 args: 1 & 2 -> tuple
(1, 2)

julia> p2((a=1, b=2))  # one NamedTuple sent to both
arg: (a = 1, b = 2)
arg: (a = 1, b = 2)
2 args: (a = 1, b = 2) & (a = 1, b = 2) -> tuple
((a = 1, b = 2), (a = 1, b = 2))

julia> p2(((1,2), ((3,4),)))  # only splatted once
arg: (1, 2)
arg: ((3, 4),)
2 args: (1, 2) & ((3, 4),) -> tuple
((1, 2), ((3, 4),))

julia> Chain(pr, p2, pr)((1, 2))  # here earlier layers cannot pass p2 two arguments
arg: (1, 2)
arg: 1
arg: 2
2 args: 1 & 2 -> tuple
arg: (1, 2)
(1, 2)

This PR changes the two error cases above:

julia> p1((1, 2))  # changed by PR
arg: 1
arg: 2
2 args: 1 & 2 -> tuple
(1, 2)

julia> p1(1, 2)  # changed by PR
arg: 1
arg: 2
2 args: 1 & 2 -> tuple
(1, 2)

You could argue that p1((1, 2)) already has a plausible meaning: apply the one layer to the one input Tuple. But that use of Parallel is really just Chain, and it's an error at present.

I think p1(1, 2) has no other plausible meaning.

The rule after this PR is:

  1. (p::Parallel)(input::Tuple) always splats to p(input...)
  2. otherwise, return connection((inputs .|> layers)...)

Step 1 is unchanged, but step 2 previously allowed only broadcasting of the input. And today, I have a use where I want to broadcast the layer instead (easier than sharing it). That's in fact the 3rd case mentioned here: #1685 (comment) but I think it never worked.
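
In code, a minimal sketch of that rule might look like the following. This is not the actual Flux source; it assumes Parallel's connection and layers fields, and converts layers to a Tuple so that NamedTuple-stored layers broadcast too:

# hypothetical sketch, not the real implementation
(m::Parallel)(x::Tuple) = m(x...)        # rule 1: a lone Tuple is always splatted
function (m::Parallel)(xs...)            # rule 2: broadcast layers over inputs
    layers = Tuple(m.layers)             # NamedTuple-stored layers become a Tuple for broadcasting
    m.connection((xs .|> layers)...)     # like (inputs .|> layers), then splat into the connection
end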

@mcabbott (Member, Author)

Reading old threads... around here #2101 (comment) it was agreed that adding (c::Chain)(xs...) = c(xs) would make sense, but there was never a PR.

That's the first MethodError in my list above. I would like this too, and perhaps I should just add it here.
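
If that method were added on top of this PR, the example in the title would work end to end. A hypothetical session (the value matches the PR description above):

julia> Chain(identity, Parallel(+, inv))(1, 2, 3)  # (1, 2, 3) becomes a Tuple, flows through identity, then Parallel splats it
1.8333333333333333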

@mcabbott mcabbott changed the title Allow Parallel(+, f)(x, y, z) to work like broadcasting Allow Parallel(+, f)(x, y, z) to work like broadcasting, and enable Chain(identity, Parallel(+, f))(x, y, z) Mar 13, 2024

mcabbott commented Mar 13, 2024

Anyone remember why we allow Parallel(hcat)? You can write Returns(hcat()) if you really want that...

julia> Parallel(hcat)()
Any[]

julia> Parallel(hcat)(NaN)  # ignores input, but this case is tested
Any[]

julia> Parallel(hcat)(1,2,3)
ERROR: ArgumentError: Parallel with 0 sub-layers can take one input or 0 inputs, but got 3 inputs
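
For reference, the Returns alternative mentioned above ignores its inputs in the same way (Returns is available since Julia 1.7; the output shown is what I'd expect, not taken from the thread):

julia> Returns(hcat())(1, 2, 3)  # always returns the captured empty Any[]
Any[]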

Can we just make this an error on construction? I think that's basically what was agreed in #1685.
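
As a free-standing sketch of that idea (checked_parallel is a hypothetical helper, not a Flux API, and the error text is made up):

using Flux

function checked_parallel(connection, layers...)
    isempty(layers) && throw(ArgumentError("Parallel expects at least one sub-layer"))
    Parallel(connection, layers...)   # construct as usual once the check passes
end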

@ToucheSir (Member) left a comment

Since it was linked here, can you quickly comment on the relationship between this and FluxML/Functors.jl#80?
