Features/436 svd #1009
base: features/436-SVD
Conversation
…m matrices it is working fine.
…re more efficient and better now.
…put matrix into 3 parts/tensors using prep_halo
…ss does bi_diagonalize for its own part and sends the changes to neighbouring ranks. Currently, halfway through it.
…pdate halo git s properly to get the correct bidiagonal matrix finally.
… Implemented as given in 2-3 research papers.
…re. Slicing properly for each matrix is becoming a little difficult.
…to not properly applying one of the transformations on Ej.
…e final resulting matrix is also coming out perfectly bidiagonal, but some checking should be done for larger matrices with mostly zeros.
…ay to the bi_diagonalize() function. Now it is working perfectly fine in multiple processes.
…rrors occurred as expected and resolved almost all of them. Now the algorithm is even better; as of now there are no errors and the resulting matrix is coming out exactly bidiagonal for m>=n and diagonal for m<n.
…. Should check the command with which the code should be run...
…s and now should keep comm.send at the correct location.
… one process until the next one gives it something. This should be managed accordingly.
…unning, the program is not completing.
… inputs. Implementation uses 2 processes.
for more information, see https://pre-commit.ci
…into features/436-SVD
looks great so far!
I left a few comments on how to make the code a bit cleaner. The next step is to break the algorithm down into smaller parts and explain in comments what is going on in each of them. This will greatly help the next person who reads the code. For example, where does tag 3 come from?
I would rather see comments that explain too much than comments that explain too little; we can always shorten them later.
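(Illustration, not from the PR: one way to answer the "where does tag 3 come from?" question in code is to name the tags as constants. The constant names and meanings below are guesses, assuming an mpi4py-style communicator as used in the diff.)

from mpi4py import MPI

comm = MPI.COMM_WORLD

# Hypothetical tag names; the actual role of each tag in bcg.py may differ.
TAG_SUBMATRIX = 1    # initial distribution of matrix blocks
TAG_U_FACTOR = 2     # left transformation factors sent from rank 0
TAG_HALO_UPDATE = 3  # halo exchange with a neighbouring rank

# e.g. instead of comm.recv(source=0, tag=2):
# Uj = comm.recv(source=0, tag=TAG_U_FACTOR)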
heat/core/linalg/bcg.py
Outdated
if j == 2:
    if rank == 1:
these can be combined into a single statement
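For reference, the combined form of the two conditions in the diff above:

if j == 2 and rank == 1:
    ...  # body unchanged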
Done sir, 👍
heat/core/linalg/bcg.py
Outdated
if Ej.size(0) > 0 and Ej.size(1) > 0:
    Uj = comm.recv(source=0, tag=2)
    Ej = torch.matmul(Uj.float(), Ej.float())
do you need to call .float() here? They should already be the same dtype that they were when they were sent.
"RuntimeError: expected scalar type Float but found Double" is coming if I remove it at some places. So, I kept them where ever required.
heat/core/linalg/bcg.py
Outdated
# print("b: ", b)
U1, vt1 = ht.eye(m, dtype=ht.float64), ht.eye(n, dtype=ht.float64)
since we don't know that the input will be a float64 matrix, you should use the dtype of the input DNDarray (arr.dtype)
Changed it to arr.dtype
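The resulting line would look roughly like this (m, n, and arr as in the surrounding function):

import heat as ht

# Identity factors inherit the dtype of the input DNDarray
# instead of hard-coding float64.
U1, vt1 = ht.eye(m, dtype=arr.dtype), ht.eye(n, dtype=arr.dtype)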
heat/core/linalg/bcg.py
Outdated
# print("This one:", arr) | ||
# req4 = comm.irecv(source=0,tag=1) | ||
# arr = req4.wait() | ||
Ej = arr[p_left:p_right, i + (j - 2) * b + 1 : i + 1 + (j - 1) * b] |
since you use this multiple times, it would probably help readability if you saved i + (j - 2) * b + 1 as a variable. The same goes for the other slice start/stop indices.
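Sketched with illustrative names for the bounds:

# Column window of the block handled in this step; names are illustrative.
col_start = i + (j - 2) * b + 1
col_stop = i + 1 + (j - 1) * b
Ej = arr[p_left:p_right, col_start:col_stop]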
…research paper mentioned here will be the best way to get a good idea.
@Hmm-its-me have you merged the latest version of your base branch? That will include the latest version of the CI. You should have a PR from @bhagemeier waiting to be merged. Thanks!
Thanks, @ClaudiaComito ma'am, I have merged it, and now all the checks pass 👍
@Sai-Suraj-27 can you resolve conflicts? I merged …
@ClaudiaComito Done... 👍
@Sai-Suraj-27 you haven't pulled in the latest changes. Check out the …
@ClaudiaComito Sorry, my bad, I have pulled the latest changes 👍
I will try to complete the review as soon as possible
Review of "Features/436-SVD #1009"
Since I have not been involved in mentoring this GSoC'22 project, the following comments are rather general, may reflect gaps in my own knowledge, and sometimes do not address specific parts of the code.
The code is well-structured and formatted, thus readable, and appropriate documentation is available for almost all functions, in particular for all functions that are exposed to the user. However, I have the following comments:
- As far as I understand from the title of the PR, the original purpose of this branch was to add an SVD. The two main contributions I can find, namely bi_diagonalize and block_diagonalize, are important steps towards this and --from a mathematical point of view-- certainly also the main steps. Nevertheless, I believe that it would make sense to shift block_diagonalize from the file svd.py to utils.py, since this routine is an intermediate step that will usually not be exposed to the user. This is roughly the same for the unit tests, where bi_diagonalize is tested under the name svd, which is a bit misleading. (Such refactoring, however, could also be done at a later stage of the project…)
- The function block_diagonalize still prints some output for debugging. This additional output can be removed.
- The function block_diagonalize does not seem to scale well (see the plot attached). I tested this on up to 24 MPI processes (2 threads each) on up to 2 nodes of the HDFML cluster. I would have expected a decrease in computing time when increasing the number of processes. Nevertheless, it may be the case that my test matrices (size 7500x7500) were just too small to see scaling effects…
- The function bi_diagonalize is not found, neither by heat.linalg.bi_diagonalize nor heat.linalg.svd.bi_diagonalize. Is this intended or did I do something wrong? (See the sketch after this list.)
- Anyway, after having a look at the code of bi_diagonalize (without testing it practically), I have the impression that the code looks very "serial" (loops over the matrix sizes). For large matrices this may result in an infeasibly large computing time.
- As far as I see, the unit tests for bi_diagonalize are still missing.
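On the fourth point, a sketch of how bi_diagonalize could be exposed, assuming Heat follows its usual pattern of an __all__ list in the module plus a star-import in the subpackage __init__; the file layout follows this PR and may differ in reality:

# heat/core/linalg/svd.py
__all__ = ["bi_diagonalize", "block_diagonalize"]

# heat/core/linalg/__init__.py
from .svd import *

# after which this should resolve:
#   import heat as ht
#   B = ht.linalg.bi_diagonalize(arr)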
Since this PR does not contain high-level routines ready to be used by non-experts and it does not seem to be active, I am closing it (as discussed in the second-to-last PR talk) in order to clean up our PR list. Re-opening is possible, of course.
Reopening this PR after internal discussion. More work is needed, so setting it to Draft.
This pull request is stale because it has been open for 60 days with no activity. |
Description
Bulge chasing algorithm to transform the given matrix into a bidiagonal matrix (upper bidiagonal when m >= n, lower bidiagonal when m < n).
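For illustration only (not part of this PR), a small torch check that an output really is upper bidiagonal, i.e. zero outside the main diagonal and the first superdiagonal:

import torch

def is_upper_bidiagonal(A: torch.Tensor, atol: float = 1e-8) -> bool:
    # Mark the main diagonal and the first superdiagonal, then check
    # that every entry outside that mask is (numerically) zero.
    m, n = A.shape
    mask = torch.zeros(m, n, dtype=torch.bool)
    for k in range(min(m, n)):
        mask[k, k] = True
        if k + 1 < n:
            mask[k, k + 1] = True
    return bool(torch.all(A[~mask].abs() < atol))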
Changes proposed:
Type of change
Changes Remaining
skip ci