Primary outputs of model steps vs. derived variables and other secondary outputs #53

smmaurer · 2018-11-06T21:41:28Z

This issue lays out a strategy for handling three kinds of model step output: primary outputs, derived variables, and potential secondary outputs that are not derived variables. Tagging @mxndrwgrdnr and @janowicz in case you're interested.

Background

Current templates are designed around the idea that when a model step runs, it produces a single Orca column (pd.Series) of primary output: predicted prices, predicted choices, etc.

Sometimes there are additional relevant outputs. For example, when we allocate households or employers to buildings that have capacity constraints, the primary output is the agents' choice of buildings. The available capacity in the buildings also changes, but the template does not currently update that column.

This works out because the capacity is a derived variable that can be calculated from other data. A common pattern is to define the capacity column as a callable. With the correct Orca cache settings, the capacities are recalculated whenever they are needed.

Advantages of the status quo

- it supports existing use cases well
- it's nice to have a distinction between the primary output of a model step and derived variables
- it's nice to keep the output of model steps as simple and consistent as possible

Create standards for secondary outputs?

Users who aren't familiar with the idioms of Orca derived variables would probably expect the model step to update capacities automatically. This would be feasible, although supporting all the potential combinations of constraints and Orca column types would be somewhat complicated.

If secondary outputs become common, particularly secondary outputs that aren't derived variables, we could put together some general functionality for this.

But so far, additional outputs tend to fall into the category of status reporting (sampled alternatives, choice probabilities, model fits) rather than info that needs to go into the core Orca representation of model state.

Conclusions

The current use pattern works, but we need to document it clearly.

We may want to support secondary outputs, but doing so would make the API more complicated, so my inclination is to wait.
