Multi-Targets for the ANN-Regression Model #6725

Open
WolframRinke opened this issue Feb 6, 2024 · 2 comments

Comments

@WolframRinke
Contributor

What's your use case?

More than thirty years ago I built an ANN-based modelling toolkit from scratch in ANSI C. I have used it to model industrial processes for many different real-time applications in the oil and gas industry. I have started porting some of these tools to Python and use them in the Python widget, for example, which I plan to convert into a custom widget or an explain widget for ANNs (https://www.researchgate.net/publication/277006650_CALCULATING_THE_DEPENDENCY_OF_COMPONENTS_OF_OBSERVABLE_NONLINEAR_SYSTEMS_USING_ARTIFICIAL_NEURAL_NETWORKS). My explain-ANN widget code supports MISO and MIMO ANN models.

I want to build multi-input multi-output (MIMO) ANN models. The underlying scikit-learn library allows them, but Orange3 itself does not. This makes many situations difficult to model, and they cannot be modelled properly when I emulate a MIMO model with several MISO models.

Because internal documentation is lacking, I reverse engineered Orange3 and managed to make some adjustments and additions to the Orange3 source that allow ANN regression models to support multiple targets. This works so far, but the Test and Score widget has an issue with multi-dimensional results. I got stuck there and need some help or internal documentation.

What's your proposed solution?

I have already made most of the changes needed to support multiple targets for ANNs. Training a multi-target model with scikit-learn seems to work so far. The only change was to tell scikit-learn that the ANN model will have more than one target (normally done through the n_outputs parameter). Test and Score only handles single-target regression using a result vector, so the two-dimensional array can be flattened in Fortran style (column-major order) into a vector. The training or test data array also needs to be converted to a single vector in the same way.
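The Fortran-style flattening described above can be sketched with NumPy. This is only an illustration on made-up numbers, not code from Orange3 or from the modified source; the array names are assumptions.

```python
import numpy as np

# Made-up predictions of a MIMO model: 3 instances (rows), 2 targets (columns).
y_pred = np.array([[1.0, 10.0],
                   [2.0, 20.0],
                   [3.0, 30.0]])
# Made-up true values with the same layout.
y_true = np.array([[1.5, 11.0],
                   [2.5, 19.0],
                   [2.0, 33.0]])

# Column-major ("Fortran style") flattening keeps each target contiguous:
# all values of target 0 first, then all values of target 1.
flat_pred = y_pred.ravel(order="F")
flat_true = y_true.ravel(order="F")

# The flattening is reversible as long as the original shape is known.
restored = flat_pred.reshape(y_pred.shape, order="F")

# A single-target metric on the flattened vectors pools the errors of all targets.
pooled_mse = float(np.mean((flat_true - flat_pred) ** 2))
```

Note that a score computed on the flattened vectors mixes all targets, so targets with larger numeric ranges dominate it unless the data are scaled first.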

Here I need help understanding the internal data structure and the idea behind the results data class.

Are there any alternative solutions?

@janezd janezd self-assigned this Feb 16, 2024
@janezd
Contributor

janezd commented Feb 16, 2024

As you have probably seen, the Domain and Table classes allow for multiple targets, but most widgets can't handle them. For this reason, even if the user constructs such data, e.g. in Select Columns, Orange shows a warning that this data won't be very useful.

The object output by Test and Score, Orange.evaluation.Results, has a 1d array for predictions (actually 2d, because it holds the predictions of all models being tested) and a 2d array for probabilities (actually 3d, for all models). Supporting multiple targets would require adding another dimension, which would break all downstream widgets.
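To make the dimension problem concrete, here is a small NumPy sketch of the array shapes described above. It does not use Orange itself; the variable names, the extra-axis layout and the `rmse_per_model` helper are assumptions chosen for illustration.

```python
import numpy as np

n_models, n_instances, n_classes, n_targets = 2, 5, 3, 4

# Single-target layout, as described above.
predicted = np.zeros((n_models, n_instances))
probabilities = np.zeros((n_models, n_instances, n_classes))
actual = np.zeros(n_instances)

# Hypothetical multi-target layout with one additional axis.
predicted_multi = np.zeros((n_models, n_instances, n_targets))


def rmse_per_model(pred, act):
    """Downstream-style code that assumes pred has shape (n_models, n_instances)."""
    return np.sqrt(((pred - act) ** 2).mean(axis=1))


ok_shape = rmse_per_model(predicted, actual).shape  # one score per model

# The same code fails on the 3-D array because the shapes no longer broadcast.
try:
    rmse_per_model(predicted_multi, actual)
    broke = False
except ValueError:
    broke = True
```

In this toy case the failure is loud, but code that happens to broadcast without error could also produce silently wrong scores.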

But the question is - what would you do with this? Even Test and Score would probably only show scores (ca, auc, prec, recall, ...) for a single output at a time. If you then want to analyze this in, say, the ROC widget, you would also analyze one output at a time.

The most doable solution, I think, is to modify the Test and Score widget as follows. When training, it would train multi-output models. But then it would construct multiple instances of Orange.evaluation.Results, one per output. The widget would have an additional combo (which would be hidden for "normal" data, and shown only for multi-output models) in which the user would select a single output. The widget would show the scores for this output, and the widget would also only output the corresponding Results object.
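The per-output idea above could look roughly like the following sketch. `SingleTargetResults` is a hypothetical stand-in, not Orange.evaluation.Results, and the function names, target names and array layout are assumptions.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SingleTargetResults:
    """Hypothetical stand-in for a per-target results object (not Orange's class)."""
    target_name: str
    actual: np.ndarray     # shape (n_instances,)
    predicted: np.ndarray  # shape (n_models, n_instances)


def split_by_target(target_names, actual, predicted):
    """Split multi-target arrays into one single-target object per output.

    actual:    shape (n_instances, n_targets)
    predicted: shape (n_models, n_instances, n_targets)
    """
    return [
        SingleTargetResults(name, actual[:, k], predicted[:, :, k])
        for k, name in enumerate(target_names)
    ]


def rmse(results):
    """Root mean squared error per model for one target."""
    return np.sqrt(((results.predicted - results.actual) ** 2).mean(axis=1))


# Toy data: 2 models, 3 instances, 2 targets (target names are made up).
names = ["pressure", "temperature"]
actual = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
predicted = np.stack([actual, actual + 1.0])  # model 0 is exact, model 1 is off by 1

per_target = split_by_target(names, actual, predicted)
selected = per_target[names.index("temperature")]  # the output chosen in the combo
scores = rmse(selected)
```

Each object in `per_target` has the same shapes as single-target results, so anything downstream that expects one target would not need to change.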

Alternatively, the widget could show scores for all target variables (e.g. multiple ca's at once), but the user would still have to choose which results to output, so that the widget would output the same object as it does now.

Would this make sense?

@janezd janezd removed their assignment Feb 16, 2024
@WolframRinke
Contributor Author

Sorry for my late response. I understand that not all widgets support multiple targets, but this does not matter. I have already modified the code and introduced a new modelling type, "multi_regression", in addition to "classification" and "regression". I use this indicator to apply special treatment when building the proper ANN model. My biggest problem at the moment is the handling of the results in the Test and Score widget, because I cannot fix a dimension issue. Maybe I can pass my code over so that you can give me some hints. I also found a bug in the way results are treated.

I will try to follow your hints in the Test&Score widget.

The main reason for this extension is that MIMO is my preferred architecture for process modelling in industrial applications. I also plan to add a new explain-model widget, based on calculating the derivative of an ANN, which supports MISO and MIMO models.
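As a rough illustration of what "the derivative of an ANN" means for a MIMO model, the sketch below computes the Jacobian of the outputs with respect to the inputs. The network (one tanh hidden layer, linear outputs, random weights) is an assumption made for the example and is not the method from the linked paper.

```python
import numpy as np


def forward(x, W1, b1, W2, b2):
    """Toy MIMO network: one tanh hidden layer followed by linear outputs."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2


def jacobian(x, W1, b1, W2, b2):
    """Derivative of every output with respect to every input.

    Returns an array of shape (n_outputs, n_inputs).
    """
    h = np.tanh(W1 @ x + b1)
    # tanh'(z) = 1 - tanh(z)^2, so each row of W1 is scaled by its unit's slope.
    return W2 @ ((1.0 - h ** 2)[:, None] * W1)


# Random weights stand in for a trained model (assumption for this sketch).
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 4, 2
W1 = rng.normal(size=(n_hidden, n_in))
b1 = rng.normal(size=n_hidden)
W2 = rng.normal(size=(n_out, n_hidden))
b2 = rng.normal(size=n_out)

x = rng.normal(size=n_in)
J = jacobian(x, W1, b1, W2, b2)  # J[i, j] = d output_i / d input_j at x
```

A MISO model is the special case with a single row in `J`.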

I am looking forward to your thoughts.

:-)
