Primary outputs of model steps vs. derived variables and other secondary outputs #53

Open
smmaurer opened this issue Nov 6, 2018 · 0 comments

smmaurer commented Nov 6, 2018

This issue lays out a strategy for handling three kinds of model step output: primary outputs, derived variables, and potential secondary outputs that are not derived variables. Tagging @mxndrwgrdnr and @janowicz in case you're interested.

Background

Current templates are designed around the idea that when a model step runs, it produces a single Orca column (pd.Series) of primary output: predicted prices, predicted choices, etc.

Sometimes there are additional relevant outputs. For example, when we allocate households or employers to buildings that have capacity constraints, the primary output is the agents' choice of buildings. The available capacity in the buildings also changes, but the template does not currently update that column.

This works out because the capacity is a derived variable that can be calculated from other data. A common pattern is to define the capacity column as a callable -- with the correct Orca cache settings, the capacities will be recalculated as needed.
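As a rough illustration, the derived-variable pattern can be sketched in plain pandas without Orca itself. The table layouts and the names `residential_units`, `building_id`, and `vacant_units` are hypothetical, as is the use of -1 to mark an unplaced household. In Orca, a function like this would be registered as a computed column, with cache settings chosen so that it is recalculated after each model step.

```python
import pandas as pd

# Hypothetical tables: buildings with a fixed number of units, and households
# whose building_id column is the primary output of an allocation step.
# A building_id of -1 marks a household that has not been placed yet.
buildings = pd.DataFrame({"residential_units": [2, 1, 3]}, index=[10, 11, 12])
households = pd.DataFrame({"building_id": [10, 10, 12, -1]}, index=[1, 2, 3, 4])


def vacant_units(buildings, households):
    """Derive remaining capacity from the current household placements."""
    occupied = households["building_id"].value_counts()
    # Align counts to the buildings index; empty buildings get a count of 0
    # and the -1 placeholder is dropped.
    occupied = occupied.reindex(buildings.index, fill_value=0)
    return buildings["residential_units"] - occupied


before = vacant_units(buildings, households)

# The model step writes only its primary output: household 4 chooses building 11.
households.loc[4, "building_id"] = 11

# Capacity is never stored or updated by the step; it is simply recomputed.
after = vacant_units(buildings, households)
```

Because capacity is always computed from the placements, the model step never needs to know that the column exists.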

Advantages of the status quo

  • it supports existing use cases well
  • it's nice to have a distinction between the primary output of a model step and derived variables
  • it's nice to keep the output of model steps as simple and consistent as possible

Create standards for secondary outputs?

Users who aren't familiar with the idioms of Orca derived variables would probably expect the model step to update capacities automatically. This would be feasible, though somewhat complicated, because it would have to support all the potential combinations of constraints and Orca column types.

If secondary outputs will be common -- particularly secondary outputs that aren't derived variables -- we could put together some general functionality for this.

But so far, additional outputs tend to fall into the category of status reporting (sampled alternatives, choice probabilities, model fits) rather than info that needs to go into the core Orca representation of model state.
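To make the distinction concrete, here is a minimal sketch of keeping status-reporting outputs apart from the single primary output. Everything in it is an assumption for illustration: `StepResult`, `run_choice_step`, and the `model_state` dictionary are hypothetical and not part of any existing template API, and picking the highest-probability alternative stands in for a real simulation draw.

```python
from dataclasses import dataclass, field

import pandas as pd


@dataclass
class StepResult:
    """Hypothetical container separating model state from reporting info."""
    primary: pd.Series                                # written to model state
    diagnostics: dict = field(default_factory=dict)   # kept for reporting only


def run_choice_step(probabilities: pd.DataFrame) -> StepResult:
    # Stand-in for a real choice simulation: each chooser (row) takes the
    # alternative (column) with the highest probability.
    choices = probabilities.idxmax(axis=1).rename("building_id")
    return StepResult(primary=choices,
                      diagnostics={"probabilities": probabilities})


# Rows are choosers (households 1 and 2), columns are alternatives (buildings).
probs = pd.DataFrame({10: [0.7, 0.2], 11: [0.3, 0.8]}, index=[1, 2])
result = run_choice_step(probs)

# Only the primary output enters the (hypothetical) model state; the
# probabilities stay on the result object for inspection.
model_state = {"households.building_id": result.primary}
```

With this split, the representation of model state stays as simple as it is today, and diagnostics remain available without becoming columns.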

Conclusions

The current usage pattern works, but we need to document it clearly.

We may want to support secondary outputs, but that would make the API more complicated, so my inclination is to wait.
