Ipddp #249

wang-chen · 2023-06-04T00:48:29Z

New features:

Two base classes, module.Cost and module.Constraint, and their subclasses module.QuadCost and module.LinCon. (Alongside module.System, stagewise costs and constraints are frequently used in optimal control problems.)
An interior-point differential dynamic programming (IPDDP) solver and its differentiable version. (A variant of the traditional DDP/iLQR solver that can handle stagewise constraints.)
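
For readers unfamiliar with stagewise terms, here is a minimal sketch of what a quadratic cost and a linear constraint might compute. The class names `QuadCostSketch` and `LinConSketch`, their constructor arguments, and the `c <= 0` feasibility convention are assumptions for illustration only; they are not the interfaces of module.QuadCost or module.LinCon.

```python
import torch


class QuadCostSketch(torch.nn.Module):
    """Hypothetical stagewise quadratic cost: 0.5 * (x'Qx + u'Ru)."""

    def __init__(self, Q, R):
        super().__init__()
        self.Q, self.R = Q, R

    def forward(self, x, u):
        # einsum evaluates both quadratic forms over any leading batch dims
        xQx = torch.einsum('...i,ij,...j->...', x, self.Q, x)
        uRu = torch.einsum('...i,ij,...j->...', u, self.R, u)
        return 0.5 * (xQx + uRu)


class LinConSketch(torch.nn.Module):
    """Hypothetical stagewise linear constraint value: gx @ x + gu @ u + g."""

    def __init__(self, gx, gu, g):
        super().__init__()
        self.gx, self.gu, self.g = gx, gu, g

    def forward(self, x, u):
        return x @ self.gx.T + u @ self.gu.T + self.g


x = torch.tensor([1.0, 2.0])
u = torch.tensor([0.5])
cost = QuadCostSketch(torch.eye(2), torch.eye(1))
# Assumed convention: c <= 0 is feasible, so these rows encode -1 <= u <= 1.
con = LinConSketch(torch.zeros(2, 2), torch.tensor([[1.0], [-1.0]]),
                   torch.tensor([-1.0, -1.0]))
stage_cost = cost(x, u)   # 0.5 * (1 + 4 + 0.25)
c_value = con(x, u)       # [0.5 - 1, -0.5 - 1]
```

Both values are negative here, so under the assumed convention the pair (x, u) satisfies the input bound.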

1. Remove file docs/requirements.txt
2. Change requirements/dev.txt back to the latest one
3. Remove pypose/module/minimal_test.py
4. Remove all code below if __name__ == "__main__": in all pypose/module/ files, and move it to the test folder.
5. Use "pyramid style" for all imports.

ntu-caokun · 2023-07-03T14:06:31Z

The auto-tests have passed; please review, @aerogjy.

aerogjy

I can successfully run all the examples and tests. Several major issues:
1. The API design should be consistent with lqr or mpc.
2. The infeasible start trajectory is missing from the docs.
3. The constraint type definition is a bit unclear.
4. Not sure the current organization of cost and constraints in the modules is proper @wang-chen
5. Some of the code in the core ipddp class is overwhelming; it needs to be cleaned up, and we should try to avoid long variable discussion. (See the detailed inline comments in the code.)

aerogjy · 2023-07-12T19:01:28Z

pypose/module/ipddp.py

+ >>> traj_opt[batch_id] = solver.optimizer()
+
+ '''
+ def __init__(self, sys=None, stage_cost=None, terminal_cost=None, cons=None, n_cons=0, init_traj=None):


The API design should be similar to lqr and mpc, which consist of the system, the time horizon, and the cost-related terms. Here, I think cons and n_cons could somehow be merged? Also, init_traj shouldn't be necessary.

Deleted n_cons.
time_horizon may not be necessary; if time-varying LQR is considered, then the Q matrix should be of shape [ns,ns,T], so T is already included in Q.
In my case, init_traj seems to be necessary, as I need it to extract ns and nc and to initialize many matrices; lqr initializes these matrices in lqr_backward, which is not suitable for our case, where backward will be called many times.

aerogjy · 2023-07-12T19:03:05Z

pypose/module/ipddp.py

+ #-----------------------------------------------
+ return self
+
+ def forward(self, fp_list):


I'm kind of confused about solver and forward. It seems forward is used in the example, while solver is used in the test case. I suggest we stay consistent with lqr and mpc by using a forward. Also, try to keep a similar argument list for the forward function.

The test only cares about solving the trajectory and does not create a computational graph for learning; the training example first calls solver() to solve the trajectory, then uses forward to create the computational graph for autodiff.

I think we can find a way to unify them? We shouldn't have to write a separate function just because we are testing. How about we keep one of them and add a flag/switch so that you can turn the graph computation part on or off? Then we always use forward, and there is no solver() anymore.
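
To make the flag/switch idea concrete, here is a minimal sketch on a toy problem. `ToySolver`, its `step` method, and the `build_graph` argument are hypothetical names, not part of the IPDDP class. The toy only illustrates how one forward can serve both uses. It does not show how IPDDP obtains its gradients.

```python
import torch


class ToySolver(torch.nn.Module):
    """Hypothetical stand-in for an iterative solver with a learnable parameter."""

    def __init__(self, theta):
        super().__init__()
        self.theta = torch.nn.Parameter(torch.tensor(theta))

    def step(self, u):
        # One fixed-point iteration; it converges to u = theta.
        return u - 0.5 * (u - self.theta)

    def forward(self, u0, build_graph=True, iters=50):
        u = u0
        with torch.no_grad():        # the iterations only locate the solution
            for _ in range(iters):
                u = self.step(u)
        if build_graph:              # one extra tracked pass at the solution
            u = self.step(u.detach())
        return u


solver = ToySolver(2.0)
u_plain = solver(torch.tensor(0.0), build_graph=False)  # test-style call
u_graph = solver(torch.tensor(0.0), build_graph=True)   # training-style call
u_graph.backward()
```

With the switch off, the result carries no graph, which matches what the test needs. With it on, only the final pass is tracked, in line with the commit notes about using no_grad() for earlier iterations.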

aerogjy · 2023-07-12T19:06:12Z

pypose/module/ipddp.py

+
+ The IPDDP process can be summarised as iterative backward and forward recursions.
+
+ - The backward recursion.


It seems two scenarios are implemented: one starts from a feasible trajectory and the other from an infeasible one. But only one scenario is documented? That's why there is no update related to the slack variable y, although it's defined.

fixed this.

aerogjy · 2023-07-12T19:09:55Z

pypose/module/ipddp.py

+ ns, nc, ncons, self.T = self.x.size(-1), self.u.size(-1), n_cons, self.u.size(-2)
+
+ # algorithm parameter
+ self.mu, self.maxiter, self.tol, self.infeas = 1.0, 50, torch.tensor([1.0e-7], dtype=self.x.dtype, device=self.x.device), False


For self.infeas, I didn't see anywhere that checks the start trajectory and sets it to true or false.
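
One way such a check could look is sketched below. The helper `is_infeasible` is hypothetical, and the convention that a strictly feasible point satisfies c(x, u) < 0 is an assumption, since the thread does not state which sign the constraint uses.

```python
import torch


def is_infeasible(c_values):
    """Return True if any constraint value fails the assumed condition c < 0."""
    return bool((c_values >= 0).any())


# Hypothetical constraint values along a 3-step trajectory with 2 constraints.
c_feasible = torch.tensor([[-0.5, -1.5], [-0.2, -1.8], [-0.9, -1.1]])
c_violated = torch.tensor([[-0.5, -1.5], [0.3, -2.3], [-0.9, -1.1]])

infeas_a = is_infeasible(c_feasible)   # every entry is negative
infeas_b = is_infeasible(c_violated)   # one entry is positive
```

A solver could evaluate the constraints on the initial trajectory once and set its infeasible-start flag from the result instead of hard-coding it.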

aerogjy · 2023-07-12T19:12:02Z

pypose/module/constraint.py

+ # Potential performance loss here - self.A and self.B involves jacobian eval
+ return self._ref_c - pp.bmv(self.gx, self._ref_state) - pp.bmv(self.gu, self._ref_input)
+
+class LinCon(Constraint):


Is this an inequality constraint? I didn't see anywhere in the doc that defines this, or whether it's >0, <0, >=0, or <=0.

Added to the docs in constraint.py

aerogjy · 2023-07-12T19:13:47Z

examples/module/ipddp/train_ipddp_invpend.py

+ [-1., 0.],
+ [-2.5, 1.]],
+ device=device)
+ state = torch.tensor([[-2.,0.],


Why define state twice?

aerogjy · 2023-07-12T19:14:49Z

tests/module/test_ipddp.py

+ init_traj_sample = {'state': init_traj['state'][batch_id:batch_id+1],
+ 'input': init_traj['input'][batch_id:batch_id+1]}
+ ipddp = IPDDP(sys, stage_cost, terminal_cost, lincon, gx.shape[-2], init_traj_sample)
+ traj_opt[batch_id] = ipddp.solver()


Here solver is used, but in the training example forward is used.

The test only cares about solving the trajectory and does not create a computational graph for learning; the training example first calls solver() to solve the trajectory, then uses forward to create the computational graph for autodiff.

aerogjy · 2023-07-12T19:17:06Z

pypose/module/ipddp.py

+
+ # -------- derivatives --------------------------
+ # terms related with system dynamics
+ self.fx = torch.zeros(B + (self.T, ns, ns), dtype=self.x.dtype, device=self.x.device)


Lines 208 to 237 are a bit overwhelming. I think there should be a better way that avoids defining all of the variables at the beginning. Maybe @wang-chen has some suggestions.

aerogjy · 2023-07-12T19:18:16Z

pypose/module/ipddp.py

+ self.opterr, self.reg, self.bp_failed, self.recovery = 0., 0., False, 0
+
+ def getDerivatives(self):
+ self.p_fn.set_refpoint(self.x[...,-1,:], self.u[...,-1,:])


Same thing here. Can we just call the derivatives directly where they are used, since you've already defined separate classes for the cost and constraints?

That would make backwardpasscompact() overwhelming. Note that in LQR, the cost-related terms are passed in instead of being computed inside the loop, and there are no second-order dynamics terms or constraint terms, which makes the LQR code much more elegant.

aerogjy · 2023-07-12T19:20:31Z

pypose/module/ipddp.py

+ c_err, r_err, qu_err = torch.zeros(B, dtype=self.x.dtype, device=self.x.device), torch.zeros(B, dtype=self.x.dtype, device=self.x.device), torch.zeros(B, dtype=self.x.dtype, device=self.x.device)
+
+ # set regularization parameter
+ if (self.fp_failed or self.bp_failed):


Not recommended. @wang-chen, how can we avoid such long parameter discussions?
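
One possible way to shorten a line like the quoted one is `Tensor.new_zeros`, which copies dtype and device from an existing tensor. In this sketch, `x` and `B` are hypothetical stand-ins for `self.x` and the batch shape in the quoted code.

```python
import torch

x = torch.zeros(4, 10, 2, dtype=torch.float64)  # hypothetical batched state trajectory
B = x.shape[:-2]                                 # leading batch dimensions

# new_zeros inherits dtype and device from x, so neither has to be repeated
c_err, r_err, qu_err = (x.new_zeros(B) for _ in range(3))
```

Each name is bound to its own tensor, so updating one error term in place does not affect the others.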

ntu-caokun · 2023-07-31T11:41:12Z

Fixed issues 2 and 3;
issues 1, 4, and 5 need discussion.

ntu-caokun · 2023-08-12T09:29:53Z

Please use da9aaf6 to review, as the latest merge cannot pass the tests.
In the new commits: as per discussion, I have unified the init function interface with MPC, while still adding the B argument for batch size;
tidied up the code.

aerogjy

Thanks for addressing most of my comments last time. I made a few minor comments on the doc and some other things.
1. I guess we could try to unify forward and solver().
2. LTI and LTV can use the system matrices to find the system dimensions.
3. Put InvPend in the examples as a general example for the broader user base.
I will hand it over to @wang-chen after this.

aerogjy · 2023-09-11T13:42:49Z

pypose/module/ipddp.py

+ stage_cost (:obj:`instance`): Stage cost of the optimal control problem.
+ terminal_cost (:obj:`instance`): Terminal cost of the optimal control problem.
+ cons (:obj:`instance`): Constraints of the optimal control problem.
+ n_cons (:obj:`int`): Dimension of constraints.


Needs to be updated.

aerogjy · 2023-09-11T13:45:17Z

pypose/module/ipddp.py

+ filter, line search in the forwardpass.
+
+ From the learning perspective, this can be interpreted as a module with unknown parameters in
+ :math:`\begin{Bmatrix} \mathbf{f}, \mathbf{c}, q, p \end{Bmatrix}`,


Do we still have q, p in this general constraint?

q, p here denote the stage and terminal costs, respectively.

aerogjy · 2023-09-11T13:45:58Z

pypose/module/ipddp.py

+ which can be integrated into a larger end-to-end learning system.
+
+ Note:
+ The implementation is based on paper `Interior Point Differential Dynamic Programming


Let's try to add the full citation, with authors, journal name, etc. :)

aerogjy · 2023-09-11T13:48:37Z

pypose/module/ipddp.py

+ #-----------------------------------------------
+ return self
+
+ def forward(self, fp_list):


I think we can find a way to unify them? We shouldn't have to write a separate function just because we are testing. How about we keep one of them and add a flag/switch so that you can turn the graph computation part on or off? Then we always use forward, and there is no solver() anymore.

aerogjy · 2023-09-11T13:50:18Z

pypose/module/constraint.py

+ Linear/linearized constraint term on state
+
+ .. math::
+ \mathbf{g}_{\mathbf_{x}} = \left. \frac{\partial \mathbf{g}}{\partial \mathbf{x}} \right|_{\chi^*}


gx and gu do not render correctly in the compiled document.
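
A plausible cause is the subscript `\mathbf_{x}` in the quoted line, which gives `\mathbf` no braced argument. A corrected form is sketched below. It assumes the intended notation is the Jacobian of g with respect to x, and analogously u, evaluated at the reference point χ*.

```latex
\mathbf{g}_{\mathbf{x}} = \left. \frac{\partial \mathbf{g}}{\partial \mathbf{x}} \right|_{\chi^*},
\qquad
\mathbf{g}_{\mathbf{u}} = \left. \frac{\partial \mathbf{g}}{\partial \mathbf{u}} \right|_{\chi^*}
```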

Merged solver and forward.
Previously, solver was run for each sample, and the results were then collected into a batch and passed through forward to get the computational graph.

Now each sample uses its own instance, i.e., solving and building the computational graph are done separately per sample.

aerogjy · 2023-09-11T13:56:14Z

examples/module/ipddp/train_ipddp_invpend.py

+ i, traj_loss.item(), model_loss.item()))
+
+ print( args.save)
+ os.system('python .\plot.py "{}" &'.format(args.save))


This line always raises an error because plot.py cannot be found. I suggest fixing the file path.
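
A minimal sketch of one way to make the path independent of the working directory and the operating system is shown below. It assumes plot.py sits next to the training script. The helper `plot_command` and the example paths are hypothetical.

```python
import sys
from pathlib import Path


def plot_command(save_dir, script_dir):
    """Build an argument list that runs plot.py located in script_dir."""
    return [sys.executable, str(Path(script_dir) / "plot.py"), str(save_dir)]


# In the training script, script_dir would typically be Path(__file__).resolve().parent.
cmd = plot_command("results/run0", Path("examples") / "module" / "ipddp")
# subprocess.Popen(cmd) would then launch the plot without blocking the caller.
```

Passing a list to subprocess avoids shell quoting of the save path, and `sys.executable` keeps the plot in the same Python environment as the training run.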

aerogjy · 2023-09-11T13:58:50Z

pypose/module/dynamics.py

- def __init__(self, A, B, C, D, c1=None, c2=None):
- super().__init__()
+
+ def __init__(self, A, B, C, D, c1=None, c2=None, xdim = None, udim = None, ydim = None):


I thought in LTI, LTV, we can find the system dimension from the ABCD matrices?
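
As a sketch of what the reviewer suggests, the dimensions can be read from the trailing shapes of the matrices. This assumes the usual state-space form x_next = A x + B u, y = C x + D u. The helper `lti_dims` is hypothetical and not part of pypose.

```python
import torch


def lti_dims(A, B, C):
    """Read (state, input, output) dimensions from the trailing matrix shapes."""
    n_state, n_input, n_output = A.shape[-1], B.shape[-1], C.shape[-2]
    # Sanity checks: A is square, and B and C agree with the state dimension.
    assert A.shape[-2] == n_state and B.shape[-2] == n_state and C.shape[-1] == n_state
    return n_state, n_input, n_output


A = torch.eye(3)
B = torch.zeros(3, 2)
C = torch.zeros(1, 3)
dims = lti_dims(A, B, C)
```

Because only the last two axes are inspected, the same helper works for batched or time-varying matrices with extra leading dimensions.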

aerogjy · 2023-09-11T14:01:00Z

examples/module/ipddp/train_ipddp_invpend.py

+import pickle as pkl
+import torch.optim as optim
+from pypose.module.ipddp import IPDDP
+from tests.module.test_ipddp import InvPend


Could we make InvPend a dynamics example in pypose/examples rather than in test_ipddp? That would make our code more general and modular for other users.

Kun and others added 30 commits November 12, 2022 16:21

create cost base

1f807c6

quadcost completed, next to debug the quadraticize with autograd

39fad18

implement the second order derivative

0d42dac

cleanup, todo documentation

01011e5

add ipddp py 1st version

94c6475

complete constraint,ipddp forwardpass grammar check, todo: vector dim…

74b20cf

…ension unification

executable version, but incorrect, todo: check the internal process

f815394

recheck code

bbd9103

correct literally to the ipddp matlab code; can reach optimality, but still need to verify its correctness using more systems

add invpend, but the input dim is 1 cause a lot of issues, todo: fix …

3a7ef4d

…this

spot a bug in QuadCost when doing line-by-line check with matlab

b6382a9

still inconsistent with matlab

8804fbe

spot the bug for inconsistency with matlab

e1b3d4c

pytorch use float32 by default, so given 0.02, it will return something like 0.199999953

cleanup ipddp code

da5d587

strange bug in fp.filter, line 418, numerical stability if relax a bit

documentation for cost class

265bc67

documentation for constraint class

d52c5f6

add comments to ipddp

1e2dffa

add train file

12c28f2

autograd fails

ed22a0f

Encounter the RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation difficult to debug

add a scheme to construct the computational graph

260eac5

should be constructed only on the last iteration.

save a version

78858a5

use no_grad() for previous iterations; todo: vmap bug

workable version

09f66a4

two strange bug fix: 1. disable 'vectorize', otherwise, vmap error; did not use batch, still get this error; 2. disable sin function; sinbackward error

continue last commit

551618b

find the explanation of sin function here, https://blog.csdn.net/weixin_39679367/article/details/122754199 fixed it.

train works for single parameter

c9521e0

torch.set_default_dtype(torch.float64) important for solving, seems related to the tol set in the inequality. add plot file

update train_ipddp_cartpole

aebe0cf

some tuning

detach in the backwardsimplified

26ad0a1

seems that the coefficient of last LQR iteration should be detached, as it is fixed values computed from the solved trajectory

clean up code

88b4dbb

Merge branch 'ipddp' of https://github.com/ntu-caokun/pypose into ipddp

978511e

Merge branch 'main' into ipddp

13426ab

Update ipddp.py

df9ff04

add tutorial slide: https://docs.google.com/presentation/d/1i_76qH7wq1eZrPIlPe64r2MBI2HlQyWnMLj1Tb4OSi0/edit?usp=sharing

ntu-caokun added 10 commits June 5, 2023 22:54

Merge branch 'main' of https://github.com/pypose/pypose into ipddp

43d2be2

address some conflicts in init file, dynamics; also upgrade env to torch2.0

should bind with the last commit

e35deb1

minor changes: move excludeBatch to function/dim temporarily; delete …

a2d5def

…train_ipddp_cartpole; import InvPend from tests

add comments; todo: main doc and gpu debug

df8cd4f

fixed gpu errors

ace584f

update main docs

7d58bd7

delete test_vmap (not in use)

9e5884f

Merge branch 'ipddp' of https://github.com/ntu-caokun/pypose_ck into …

dcd4a9a

…ipddp

replace bdot with torch.einsum

1899db1

Merge branch 'main' of https://github.com/pypose/pypose into ipddp

dca133c

aerogjy self-requested a review July 12, 2023 18:59

aerogjy requested changes Jul 12, 2023

ntu-caokun added 5 commits July 31, 2023 15:28

minor corrections

6041f57

fix infeas related

4cd0842

Merge branch 'main' of https://github.com/pypose/pypose into ipddp

936f245

delete n_cons pass-in

7542778

update doc

61b185a

ntu-caokun added 3 commits August 12, 2023 16:56

unify the interface with mpc

72043ad

take getDerivatives out of backwardpasscompact

da9aaf6

Merge branch 'main' of https://github.com/pypose/pypose into ipddp

161c93c

aerogjy self-requested a review September 11, 2023 02:58

aerogjy reviewed Sep 11, 2023

wang-chen and others added 4 commits September 24, 2023 15:05

Merge branch 'main' into ipddp

4520635

minor updates except the forward, solver comment

b83651a

merge solver and forward, while note that the computational graph par…

aaa96e2

…t can be implemented in a batched version

fixed server error, not allow import examples folder in tests folder

62a89e5

ntu-caokun requested a review from aerogjy September 25, 2023 12:44
