Autodiff across FE result #188

Open
cpfiffer opened this issue Jan 25, 2022 · 1 comment

Comments

@cpfiffer

Someone noted to me that FixedEffectModels.jl is tricky to use with automatic differentiation (AD) because it has so many explicit Float64 type constraints -- does anyone have a good sense of how much effort it would take to remove, parameterize, or reduce them?

As an example, the FixedEffectModel struct has many fields explicitly typed as Float64 that could be parametric instead.

struct FixedEffectModel <: RegressionModel
    coef::Vector{Float64}                   # Vector of coefficients
    vcov::Matrix{Float64}                   # Covariance matrix
    vcov_type::CovarianceEstimator
    nclusters::Union{NamedTuple, Nothing}
    esample::BitVector                      # Is the row of the original dataframe part of the estimation sample?
    residuals::Union{AbstractVector, Nothing}
    fe::DataFrame
    fekeys::Vector{Symbol}
    coefnames::Vector                       # Names of coefficients
    yname::Union{String, Symbol}            # Name of dependent variable
    formula::FormulaTerm                    # Original formula
    formula_predict::FormulaTerm
    contrasts::Dict
    nobs::Int64                             # Number of observations
    dof_residual::Int64                     # nobs - degrees of freedom
    rss::Float64                            # Sum of squared residuals
    tss::Float64                            # Total sum of squares
    r2::Float64                             # R squared
    adjr2::Float64                          # R squared adjusted
    F::Float64                              # F statistic
    p::Float64                              # p value for the F statistic
    # for FE
    iterations::Union{Int, Nothing}         # Number of iterations
    converged::Union{Bool, Nothing}         # Has the demeaning algorithm converged?
    r2_within::Union{Float64, Nothing}      # within r2 (with fixed effects)
    # for IV
    F_kp::Union{Float64, Nothing}           # First-stage F statistic (KP)
    p_kp::Union{Float64, Nothing}           # First-stage p value (KP)
end
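
To make the suggestion concrete, here is a minimal sketch of what a parametric version could look like. `ParametricFEModel` is a hypothetical, heavily trimmed stand-in written for this comment, not the package's definition: it keeps only a few numeric fields and replaces each `Float64` with a type parameter `T<:Real`, which is the kind of bound an AD number type such as a dual number would need to satisfy.

```julia
# Hypothetical, trimmed-down analogue of FixedEffectModel (illustration only).
# Every field that was Float64 now uses the type parameter T.
struct ParametricFEModel{T<:Real}
    coef::Vector{T}   # Vector of coefficients
    vcov::Matrix{T}   # Covariance matrix
    nobs::Int         # Number of observations
    rss::T            # Sum of squared residuals
    tss::T            # Total sum of squares
    r2::T             # R squared
end

# Convenience outer constructor: T is inferred from the arguments and
# r2 is derived from rss and tss in the same element type.
function ParametricFEModel(coef::Vector{T}, vcov::Matrix{T}, nobs::Integer,
                           rss::T, tss::T) where {T<:Real}
    return ParametricFEModel{T}(coef, vcov, nobs, rss, tss, one(T) - rss / tss)
end

m64 = ParametricFEModel([1.0, 2.0], [1.0 0.0; 0.0 1.0], 10, 2.0, 8.0)
m32 = ParametricFEModel(Float32[1, 2], Float32[1 0; 0 1], 10, 2f0, 8f0)
```

With this layout the existing behaviour would correspond to `ParametricFEModel{Float64}`, so code that only ever sees Float64 data would not need to change.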

Is there an appetite for this? I think it'd be lovely to be able to AD through high-dimensional fixed effect estimates.

@schrimpf

What do you have in mind? coef inherits its element type from y and X. These are constructed from the dataframe and formula, and converted to Float64. I could imagine getting the type of y and X from the types of the columns of the dataframe to allow automatic differentiation with respect to the data, but it seems somewhat unusual to try to autodiff through a dataframe.
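
To illustrate the element-type point: a plain least-squares solve returns coefficients whose element type follows its inputs, so a Float64 result comes from the conversion step rather than from the solve itself. A minimal sketch, where `ols` is a hypothetical helper written for this comment and not part of FixedEffectModels.jl:

```julia
using LinearAlgebra

# Hypothetical helper: OLS via the normal equations, generic in the element type.
ols(y, X) = (X'X) \ (X'y)

# Small exact example: y = 1 + 2 * x
X32 = Float32[1 0; 1 1; 1 2; 1 3]
y32 = Float32[1, 3, 5, 7]

b32 = ols(y32, X32)                      # element type follows the inputs
b64 = ols(Float64.(y32), Float64.(X32))  # explicit conversion to Float64 first

@show eltype(b32) eltype(b64)
```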

If you use the lower level code in FixedEffects.jl, there is some compatibility with AD.

using ForwardDiff, FiniteDiff, FixedEffects, LinearAlgebra

N = 10
K = 2
d1 = repeat(1:(N÷2), inner = 2)   # first grouping: pairs of adjacent observations
d2 = repeat(1:2, inner = N÷2)     # second grouping: two halves of the sample
p1 = FixedEffect(d1)
p2 = FixedEffect(d2)
x = rand(N, K)
y = rand(N)

# Coefficients from regressing y on x after partialling the fixed effects out of x.
# Only x is demeaned: demeaning is an orthogonal projection, so x'y is unchanged
# if y is demeaned as well.
function regfe(y, x, fes)
  x = copy(x)   # solve_residuals! works in place, so keep the caller's x intact
  y = copy(y)
  w = FixedEffects.uweights(eltype(x), size(x, 1))
  feM = AbstractFixedEffectSolver{eltype(x)}(fes, w, Val{:cpu}, Threads.nthreads())
  solve_residuals!(x, feM)
  return (x'*x) \ (x'*y)
end
regfe(y, x, [p1, p2])

# derivative wrt x: compare ForwardDiff against finite differences
J1 = ForwardDiff.jacobian(x -> regfe(y, reshape(x, N, K), [p1, p2]), vec(x))
J2 = FiniteDiff.finite_difference_jacobian(x -> regfe(y, reshape(x, N, K), [p1, p2]), vec(x))
norm(J1 - J2)

# derivative wrt y: same comparison
J1 = ForwardDiff.jacobian(y -> regfe(y, x, [p1, p2]), y)
J2 = FiniteDiff.finite_difference_jacobian(y -> regfe(y, x, [p1, p2]), y)
norm(J1 - J2)

It's dreadfully slow, but appears to give correct results. Reverse-mode packages are probably harder to support: the in-place mutation in solve_residuals! and the iteration in lsmr probably mean it would require defining custom differentiation methods. Doing so could also greatly improve the ForwardDiff performance.
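
One reason custom methods look tractable: demeaning is a linear map, so for fixed x the coefficients are linear in y and the Jacobian with respect to y has a closed form. The sketch below is only an illustration of that identity. It swaps the iterative solver for a dense residual-maker matrix built with `pinv`, which does not scale to high-dimensional fixed effects, and the helper `dummies` is a name made up for this comment. It is not how FixedEffects.jl works internally.

```julia
using LinearAlgebra, Random

Random.seed!(1)
N, K = 10, 2
d1 = repeat(1:(N ÷ 2), inner = 2)
d2 = repeat(1:2, inner = N ÷ 2)
x = rand(N, K)
y = rand(N)

# Hypothetical helper: dense dummy matrix for one categorical variable.
dummies(ids) = Float64.(ids .== permutedims(unique(ids)))

D = hcat(dummies(d1), dummies(d2))   # rank deficient, hence pinv below
M = I - D * pinv(D)                  # residual maker: projects off the span of D
xt = M * x                           # demeaned regressors

beta = (xt'xt) \ (xt'y)              # same estimator as in regfe above
J_y = (xt'xt) \ xt'                  # closed-form Jacobian of beta wrt y (K × N)

# Because beta is linear in y, a perturbation of y moves beta by exactly J_y * dy
# (up to floating-point error).
dy = zeros(N); dy[3] = 0.1
beta_perturbed = (xt'xt) \ (xt' * (y + dy))
@show norm(beta_perturbed - beta - J_y * dy)
```

A custom rule could return this kind of expression directly instead of differentiating through every solver iteration, with the dense `M` replaced by calls to the existing solver.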
