Input matrix not invertible error in Levenberg Marquardt algorithm using neupy.algorithms #258

Open
mahatibharadwaj opened this issue Sep 18, 2019 · 15 comments

@mahatibharadwaj

mahatibharadwaj commented Sep 18, 2019

Previously, the code ran without errors and I could see the results. Now that everything is implemented, I want to save my results, so I ran the code again and hit this new issue. Please help me resolve it.

Please find the error below.

C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:2: RuntimeWarning: invalid value encountered in less
  

---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
   1333     try:
-> 1334       return fn(*args)
   1335     except errors.OpError as e:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
   1318       return self._call_tf_sessionrun(
-> 1319           options, feed_dict, fetch_list, target_list, run_metadata)
   1320 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\client\session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
   1406         self._session, options, feed_dict, fetch_list, target_list,
-> 1407         run_metadata)
   1408 

InvalidArgumentError: Input matrix is not invertible.
	 [[{{node training-updates/MatrixSolve}}]]

During handling of the above exception, another exception occurred:

InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-14-4e83562a364c> in <module>
----> 1 optimizer.train(xTrain, yTrain, xTest, yTest)

~\AppData\Roaming\Python\Python37\site-packages\neupy\algorithms\gd\base.py in train(self, X_train, y_train, X_test, y_test, *args, **kwargs)
    299             X_train=X_train, y_train=y_train,
    300             X_test=X_test, y_test=y_test,
--> 301             *args, **kwargs)
    302 
    303     def one_training_update(self, X_train, y_train):

~\AppData\Roaming\Python\Python37\site-packages\neupy\algorithms\base.py in train(self, X_train, y_train, X_test, y_test, epochs, batch_size)
    268                     update_start_time = time.time()
    269 
--> 270                     train_error = self.one_training_update(X_batch, y_batch)
    271                     self.n_updates_made += 1
    272 

~\AppData\Roaming\Python\Python37\site-packages\neupy\algorithms\gd\lev_marq.py in one_training_update(self, X_train, y_train)
    173 
    174         return super(LevenbergMarquardt, self).one_training_update(
--> 175             X_train, y_train)

~\AppData\Roaming\Python\Python37\site-packages\neupy\algorithms\gd\base.py in one_training_update(self, X_train, y_train)
    303     def one_training_update(self, X_train, y_train):
    304         return self.functions.one_training_update(
--> 305             *as_tuple(X_train, y_train))
    306 
    307     def get_params(self, deep=False, with_network=True):

~\AppData\Roaming\Python\Python37\site-packages\neupy\utils\tf_utils.py in wrapper(*input_values)
     72         result, _ = session.run(
     73             [outputs, tensorflow_updates],
---> 74             feed_dict=feed_dict,
     75         )
     76         return result

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
    927     try:
    928       result = self._run(None, fetches, feed_dict, options_ptr,
--> 929                          run_metadata_ptr)
    930       if run_metadata:
    931         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1150     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1151       results = self._do_run(handle, final_targets, final_fetches,
-> 1152                              feed_dict_tensor, options, run_metadata)
   1153     else:
   1154       results = []

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1326     if handle is None:
   1327       return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1328                            run_metadata)
   1329     else:
   1330       return self._do_call(_prun_fn, handle, feeds, fetches)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
   1346           pass
   1347       message = error_interpolation.interpolate(message, self._graph)
-> 1348       raise type(e)(node_def, op, message)
   1349 
   1350   def _extend_graph(self):

InvalidArgumentError: Input matrix is not invertible.
	 [[node training-updates/MatrixSolve (defined at C:\Users\mahati.bharadwaj\AppData\Roaming\Python\Python37\site-packages\neupy\algorithms\gd\lev_marq.py:159) ]]

Caused by op 'training-updates/MatrixSolve', defined at:
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\traitlets\config\application.py", line 658, in launch_instance
    app.start()
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel\kernelapp.py", line 505, in start
    self.io_loop.start()
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\platform\asyncio.py", line 148, in start
    self.asyncio_loop.run_forever()
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\asyncio\base_events.py", line 539, in run_forever
    self._run_once()
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\asyncio\base_events.py", line 1775, in _run_once
    handle._run()
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\asyncio\events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\ioloop.py", line 690, in <lambda>
    lambda f: self._run_callback(functools.partial(callback, future))
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\ioloop.py", line 743, in _run_callback
    ret = callback()
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 787, in inner
    self.run()
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 748, in run
    yielded = self.gen.send(value)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 378, in dispatch_queue
    yield self.process_one()
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 225, in wrapper
    runner = Runner(result, future, yielded)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 714, in __init__
    self.run()
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 748, in run
    yielded = self.gen.send(value)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 365, in process_one
    yield gen.maybe_future(dispatch(*args))
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 209, in wrapper
    yielded = next(result)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 272, in dispatch_shell
    yield gen.maybe_future(handler(stream, idents, msg))
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 209, in wrapper
    yielded = next(result)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 542, in execute_request
    user_expressions, allow_stdin,
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 209, in wrapper
    yielded = next(result)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel\ipkernel.py", line 294, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel\zmqshell.py", line 536, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2854, in run_cell
    raw_cell, store_history, silent, shell_futures)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2880, in _run_cell
    return runner(coro)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\async_helpers.py", line 68, in _pseudo_sync_runner
    coro.send(None)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3057, in run_cell_async
    interactivity=interactivity, compiler=compiler, result=result)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3248, in run_ast_nodes
    if (await self.run_code(code, result,  async_=asy)):
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3325, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-13-94821c1be936>", line 1, in <module>
    optimizer = algorithms.LevenbergMarquardt(network, signals=on_epoch_end,)
  File "C:\Users\mahati.bharadwaj\AppData\Roaming\Python\Python37\site-packages\neupy\algorithms\gd\base.py", line 149, in __init__
    self.init_functions()
  File "C:\Users\mahati.bharadwaj\AppData\Roaming\Python\Python37\site-packages\neupy\algorithms\gd\lev_marq.py", line 133, in init_functions
    super(LevenbergMarquardt, self).init_functions()
  File "C:\Users\mahati.bharadwaj\AppData\Roaming\Python\Python37\site-packages\neupy\algorithms\gd\base.py", line 176, in init_functions
    training_updates = self.init_train_updates()
  File "C:\Users\mahati.bharadwaj\AppData\Roaming\Python\Python37\site-packages\neupy\algorithms\gd\lev_marq.py", line 159, in init_train_updates
    tf.matmul(J_T, tf.expand_dims(err_for_each_sample, 1))
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\ops\gen_linalg_ops.py", line 1422, in matrix_solve
    "MatrixSolve", matrix=matrix, rhs=rhs, adjoint=adjoint, name=name)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
    op_def=op_def)
  File "C:\Users\mahati.bharadwaj\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Input matrix is not invertible.
	 [[node training-updates/MatrixSolve (defined at C:\Users\mahati.bharadwaj\AppData\Roaming\Python\Python37\site-packages\neupy\algorithms\gd\lev_marq.py:159) ]]
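
The `RuntimeWarning: invalid value encountered in less` on the first line of this output is the warning NumPy typically emits when a NaN is compared with `<`, which suggests that some value had already become NaN before the solve failed. One thing worth ruling out is non-finite or badly scaled input data. A minimal sketch, assuming the inputs can be converted to a float array; `summarize_inputs` is a hypothetical helper and not part of neupy:

```python
import numpy as np


def summarize_inputs(values):
    """Count non-finite entries and report the range of the finite ones."""
    arr = np.asarray(values, dtype=float)
    finite = np.isfinite(arr)
    n_bad = int(arr.size - np.count_nonzero(finite))
    finite_values = arr[finite]
    return {
        "non_finite": n_bad,
        "min": float(finite_values.min()) if finite_values.size else None,
        "max": float(finite_values.max()) if finite_values.size else None,
    }


# Made-up example data: one missing value and one column on a much larger scale.
example = np.array([[0.5, 1200.0],
                    [np.nan, 3400.0],
                    [0.7, 900.0]])
summary = summarize_inputs(example)
print(summary)
```

A non-zero count, or columns whose ranges differ by orders of magnitude, would be a reason to clean or rescale the data before training.
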
@itdxer
Owner

itdxer commented Sep 18, 2019

Hi,
Do you have your code available somewhere? Also, which versions of neupy and TensorFlow are you using?

@itdxer
Owner

itdxer commented Sep 18, 2019

Also, what is your mu_update_factor value?

@mahatibharadwaj
Author

mahatibharadwaj commented Sep 18, 2019

tensorflow version = 1.13.2
neupy version = 0.8.2 (I am not sure where to check this in an Anaconda environment)
mu_update_factor is left at its default value

My code:

import numpy as np
np.random.seed(20)

from neupy import algorithms, layers
from neupy.exceptions import StopTraining
from neupy.layers import *
import pandas as pd
import time
from sklearn import preprocessing

# Load the data and split it into training and test rows
data = pd.read_csv("data6k.csv")
train = data.iloc[1:600, 1:20]
test = data.iloc[601:671, 1:20]

# Columns from index 3 onward are the inputs; column 0 is the target
xTrain = train.iloc[:, 3:20]
yTrain = train.iloc[:, 0]
xTest = test.iloc[:, 3:20]
yTest = test.iloc[:, 0]

network = join(Input(16), Relu(8), Linear(1))

# Stop training once the validation error drops below the threshold
def on_epoch_end(optimizer):
    if optimizer.errors.valid[-1] < 0.001:
        raise StopTraining("Training has been interrupted")

start_time = time.time()

optimizer = algorithms.LevenbergMarquardt(network, signals=on_epoch_end)
optimizer.train(xTrain, yTrain, xTest, yTest)

yPred = optimizer.predict(xTest)

The error is raised at optimizer.train.

Please help me resolve this.
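
Since the RuntimeWarning in the first comment points at a `<` comparison, the threshold check in `on_epoch_end` may have been comparing a NaN validation error against 0.001. A hedged variant of the callback that also stops as soon as the error is no longer finite is sketched below. `StopTraining` and the optimizer object are local stand-ins so that the snippet runs without neupy; in the real script the exception would come from `neupy.exceptions`, and `stop_reason` is a hypothetical helper used only to exercise the callback.

```python
import math
from types import SimpleNamespace


class StopTraining(Exception):
    """Stand-in for neupy.exceptions.StopTraining (assumption for this sketch)."""


def on_epoch_end(optimizer, threshold=0.001):
    valid_error = float(optimizer.errors.valid[-1])
    # A NaN or infinite error means training has diverged; stop right away.
    if not math.isfinite(valid_error):
        raise StopTraining("Validation error is not finite; training has diverged")
    if valid_error < threshold:
        raise StopTraining("Validation error is below the threshold")


def stop_reason(valid_error):
    """Run the callback on a fake optimizer and return the stop message, if any."""
    fake = SimpleNamespace(errors=SimpleNamespace(valid=[valid_error]))
    try:
        on_epoch_end(fake)
    except StopTraining as exc:
        return str(exc)
    return None


print(stop_reason(0.5))
print(stop_reason(0.0005))
print(stop_reason(float("nan")))
```

Stopping on a non-finite error does not fix the divergence, but it reports the problem at the epoch where it first appears instead of at a later failed solve.
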

@mahatibharadwaj
Author

Please let me know.

@itdxer
Owner

itdxer commented Sep 18, 2019

I think this might require a fix. In the meantime, can you try reducing mu_update_factor from 1.2 to 1.1 or 1.05, and/or increasing mu from 0.1 to 0.2 or 0.5 (or even all the way to 1)?

@itdxer itdxer self-assigned this Sep 18, 2019
@itdxer itdxer added the bug label Sep 18, 2019
@mahatibharadwaj
Author

It is strange that the same code worked two days ago and fails now. It would be helpful if you could fix it as soon as possible and also explain the issue. Thanks.

@mahatibharadwaj mahatibharadwaj changed the title Input matrix invertible error in Levenberg Marquardt algorithm using neupy.algorithms Input matrix not invertible error in Levenberg Marquardt algorithm using neupy.algorithms Sep 18, 2019
@itdxer
Owner

itdxer commented Sep 18, 2019

Did you try modifying the mu and mu_update_factor values? Did it help solve your problem?

@mahatibharadwaj
Author

mahatibharadwaj commented Sep 18, 2019

I am again getting the same error at mu_update_factor=1.1, mu=0.2
This is working for mu_update_factor=1.1, mu=0.1 but predicted values are deviated a lot.
Can you please tell me the ideal values for mu_update_factor and mu to avoid this error?
I need a short training time and predictions that deviate as little as possible from the actual values. I cannot settle on values for mu_update_factor, mu and the error threshold (currently 0.001), because many combinations fail with this error. Please suggest values that fit these requirements.

@itdxer
Owner

itdxer commented Sep 18, 2019

> Can you please tell me the ideal values for mu_update_factor and mu to avoid this error?

The matrix that gets inverted is built from the Jacobian (J^T J), and the mu parameter is added to each of its diagonal elements. This trick breaks linear dependence between rows/columns of the square matrix. When mu is too large, training can become less effective, because mu perturbs the matrix being solved. mu_update_factor increases or decreases mu depending on training performance: mu_update_factor=1 means that mu is never adjusted, while a large value means that a small change in the error can drastically increase or decrease mu. After many updates mu can approach zero, which is why I thought that changing these parameters could help resolve your problem.
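
To make the role of mu concrete: in the textbook form of the algorithm, each update solves (J^T J + mu * I) * delta = J^T e, where J is the Jacobian of the per-sample errors e with respect to the parameters. The NumPy sketch below is not neupy's implementation, and the function and variable names are illustrative. It shows that J^T J alone can be singular, that adding mu makes the system solvable, and that the conditioning degrades again as mu shrinks.

```python
import numpy as np


def lm_step(J, e, mu):
    """Solve (J^T J + mu * I) delta = J^T e for the update delta."""
    n_params = J.shape[1]
    A = J.T @ J + mu * np.eye(n_params)
    return np.linalg.solve(A, J.T @ e)


# Two identical columns make J^T J exactly singular.
J = np.array([[1.0, 1.0],
              [2.0, 2.0],
              [3.0, 3.0]])
e = np.array([1.0, 2.0, 3.0])

try:
    lm_step(J, e, mu=0.0)
    undamped_solvable = True
except np.linalg.LinAlgError:
    undamped_solvable = False

delta = lm_step(J, e, mu=0.1)

# The condition number of the damped matrix grows as mu approaches zero.
conds = [np.linalg.cond(J.T @ J + mu * np.eye(2)) for mu in (1.0, 0.1, 1e-6)]

print("solvable without damping:", undamped_solvable)
print("damped update:", delta)
print("condition numbers:", conds)
```

In single precision, a very small mu added to much larger diagonal entries can be lost to rounding, which is one way a damped system can still end up numerically singular.
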

@mahatibharadwaj
Author

Thanks. But different combinations of mu_update_factor, mu and error threshold value are giving this same error. How are these three related and how do we decide how to tune them? Is it still a bug or the user has to decide. Trial and error is a tedious method.
Does this also depend on the size of the data set?

@itdxer
Owner

itdxer commented Sep 18, 2019

> But different combinations of mu_update_factor, mu and error threshold value are giving this same error.

Sorry, maybe I misunderstood you. Did you say that it worked for mu_update_factor=1.1, mu=0.1?

> This is working for mu_update_factor=1.1, mu=0.1 but predicted values are deviated a lot.


> How are these three related and how do we decide how to tune them?

It's important to understand the algorithm before using it. Please refer to this book to learn more about it: https://hagan.okstate.edu/NNDesign.pdf (see Section 12).


> Is it still a bug or the user has to decide.

The mu parameter is supposed to deal with this problem, but for some reason it doesn't. I might need to put a lower bound on the mu value to ensure that the matrix remains invertible (but I'm not 100% sure that's the problem you're experiencing).
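
A lower bound on mu, as considered above, could look roughly like the sketch below. The shrink/grow rule is the common Levenberg-Marquardt heuristic (divide mu after an improving step, multiply it otherwise); `min_mu` and `max_mu` are hypothetical parameters for illustration and are not options of neupy's `LevenbergMarquardt`.

```python
def update_mu(mu, last_error, new_error, factor=1.1,
              min_mu=1e-6, max_mu=1e6):
    """Shrink mu after an improving step, grow it otherwise, then clamp."""
    if new_error < last_error:
        mu /= factor
    else:
        mu *= factor
    return min(max(mu, min_mu), max_mu)


def run_improving_steps(n_steps, **kwargs):
    """Apply n_steps updates in which the error keeps decreasing."""
    mu, error = 0.1, 1.0
    for _ in range(n_steps):
        new_error = error * 0.99
        mu = update_mu(mu, error, new_error, **kwargs)
        error = new_error
    return mu


# Without bounds mu decays geometrically; with bounds it stops at min_mu.
unclamped = run_improving_steps(500, min_mu=0.0, max_mu=float("inf"))
clamped = run_improving_steps(500)
print(unclamped, clamped)
```

Note that a NaN error compares as "not improving" here, so mu grows in that case rather than shrinking further.
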

@itdxer
Owner

itdxer commented Sep 18, 2019

Would it be possible for you to set verbose=True and share the output that you see in the terminal?

optimizer = algorithms.LevenbergMarquardt(network, signals=on_epoch_end, verbose=True)

@mahatibharadwaj
Author

mahatibharadwaj commented Sep 19, 2019

Thanks for the information. I observed another strange thing with the parameters: the same combination of mu, mu_update_factor and error threshold does not always give a result. Sometimes it raises this error and sometimes it works. I think this needs to be fixed.
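
Whether a given run fails could depend on the initial weights and on how far mu has shrunk by the time the solve becomes ill-conditioned. One structural factor, stated here as a general observation about ReLU networks rather than a diagnosis of this report: scaling a hidden unit's incoming weights up and its outgoing weight down leaves the output unchanged, so the Jacobian always has a null direction per hidden unit and J^T J is singular without the mu term. The NumPy sketch below builds the Jacobian of a tiny Input(2) > Relu(3) > Linear(1) network by hand; it does not use neupy, and all names are illustrative.

```python
import numpy as np


def jacobian(X, W1, b1, W2):
    """Jacobian of the outputs w.r.t. (W1, b1, W2, b2), one row per sample."""
    z = X @ W1 + b1                    # hidden pre-activations, shape (n, h)
    active = (z > 0).astype(float)     # ReLU derivative
    h = z * active                     # hidden activations
    # d y / d W1[i, j] = x_i * active_j * W2[j], flattened row-major
    dW1 = (X[:, :, None] * (active * W2)[:, None, :]).reshape(len(X), -1)
    db1 = active * W2                  # d y / d b1[j]
    dW2 = h                            # d y / d W2[j]
    db2 = np.ones((len(X), 1))         # d y / d b2
    return np.hstack([dW1, db1, dW2, db2])


def scaling_direction(W1, b1, W2, j):
    """Parameter direction that rescales hidden unit j without changing the output."""
    dW1 = np.zeros_like(W1)
    dW1[:, j] = W1[:, j]
    db1 = np.zeros_like(b1)
    db1[j] = b1[j]
    dW2 = np.zeros_like(W2)
    dW2[j] = -W2[j]
    return np.concatenate([dW1.ravel(), db1, dW2, [0.0]])


rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 2))
W1 = rng.normal(size=(2, 3))
b1 = rng.normal(size=3)
W2 = rng.normal(size=3)

J = jacobian(X, W1, b1, W2)
rank = np.linalg.matrix_rank(J)
null_residual = max(
    np.abs(J @ scaling_direction(W1, b1, W2, j)).max() for j in range(3)
)

print("rank", rank, "of", J.shape[1], "parameters")
print("largest |J v| over the scaling directions:", null_residual)
```

If this applies, invertibility rests entirely on mu, which would fit the maintainer's idea of keeping mu above a minimum value.
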

@rdx10001

Same error, even after changing mu and mu_update_factor.
Main information

[ALGORITHM] LevenbergMarquardt

[OPTION] loss = mse
[OPTION] mu = 0.1
[OPTION] mu_update_factor = 1.1
[OPTION] show_epoch = 1
[OPTION] shuffle_data = False
[OPTION] signals = None
[OPTION] target = Tensor("placeholder/target/linear-1:0", shape=(?, 1), dtype=float32)
[OPTION] verbose = True

[TENSORFLOW] Initializing Tensorflow variables and functions.
WARNING:tensorflow:From c:\python37\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
[TENSORFLOW] Initialization finished successfully. It took 0.24 seconds

@itdxer
Owner

itdxer commented Oct 12, 2020

@rdx10001 do you get the same error during the first training iteration or after some number of epochs?


3 participants