unable to delete all the accounts in a loop, got internal error while deleting few of them #7920
Comments
@rkomandu Please run the test after updating the log level, and share the actual error log. |
Yes, that is all it reports. I deleted 2K accounts in a loop with a 10-second sleep between deletions and saw "Failed to execute command" multiple times.
Now let us look at noobaa.log for the deletion of 12397 and 12399; for 12398, it was logged as "NSFS Manage command".
If we check noobaa.log below at the above timestamp for the deletion of the s3user-12397, s3user-12398 and s3user-12399 accounts:
whereas the account is still there
noobaa.log is about 700MB (omandu-ip-cls-x-worker1 log]# ls -lh /var/log/noobaa.log). For now, you can check from the logs above, @naveenpaul1. Uploading to GitHub is not possible, and Box has the same restriction. I need to delete the older noobaa.log content and keep only the current day to see if that reduces the size. However, I think you can continue from the log snippets above. |
Hi @rkomandu,
➜ grep -i AccountCreated run_test_2048.txt | wc -l
2048
➜ grep -i AccountDeleted run_test_2048.txt | wc -l
2048 |
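The comparison above can be scripted so it is easy to repeat after each run. This is a minimal sketch, not part of the original test: the function names `count_event` and `compare_created_deleted` are made up for illustration, and the only assumption is that the log contains the `AccountCreated` and `AccountDeleted` strings used in the grep commands above.

```shell
# count_event PATTERN FILE: print the number of lines in FILE that match
# PATTERN, ignoring case. "|| true" keeps a zero count from being treated
# as a failure (grep -c exits 1 when nothing matches but still prints 0).
count_event() {
    grep -ic -- "$1" "$2" || true
}

# compare_created_deleted FILE: print both counts and succeed only when
# the number of AccountCreated lines equals the number of AccountDeleted lines.
compare_created_deleted() {
    local file="$1" created deleted
    created=$(count_event AccountCreated "$file")
    deleted=$(count_event AccountDeleted "$file")
    echo "created=$created deleted=$deleted"
    [ "$created" -eq "$deleted" ]
}

# Example (file name taken from the comment above):
#   compare_created_deleted run_test_2048.txt || echo "counts differ"
```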
It is a recent noobaa 5.15.3 d/s build, 0514.
|
I am deleting all the accounts and buckets via the mms3 CLI and came across a problem similar to the one shown above: bucket delete passed, but account delete failed with an internal error.
The noobaa cli status and list output for the account is shown below
|
Thank you @rkomandu, so we can understand from the error attached that the reason for the internal error is related to encryption (
This internal error message came from the account list. I would verify that manually deleting the account shows the same details in the error (please run account delete on that account). cc: @romayalon |
|
Every account delete has an issue now
|
@shirady, this "internal server" error is the latest problem, related to master_key as posted in the comments above. However, when the defect was opened in the third week of March, encryption was not enabled in d/s ODF 4.15.0. The error at that time was the same as now, but I don't have the noobaa cli command output from then. |
This encryption issue is a new problem with ODF 4.15.3; however, the main problem could still be there, since deleting in a loop is what I described in all of my first few updates. |
For the latest problem, @romayalon, it is the master_key problem in the CCR. I have updated the RTC defect https://jazz07.rchland.ibm.com:21443/jazz/web/projects/GPFS#action=com.ibm.team.workitem.viewWorkItem&id=330894 @shirady
We need to get this fixed ASAP; otherwise the 5.15.3-0415 build with encryption is not going to work for the basic functionality of account delete. |
@rkomandu |
@romayalon , |
On a physical machine (BM), with the RPM you provided on 30th May, we have recreated the delete account problem as 47: when executing concurrently from 3 nodes with 1K each, we hit the error. @Ramya-c has taken it up from Friday, discussed it with Guy, and is running experiments. Unless that is sorted out first, this error, even though related, would be masked IMO. If you understand and can address it from the code flow, then that is the next move; otherwise we need to wait and try when we have the extra cycles. The priority is to sort out that problem. |
thanks @rkomandu for the update, waiting for the logs of the concurrency tests, please keep us updated with the information you capture about this issue. |
Hi @romayalon @rkomandu The steps are described for a few accounts so you can test in, and then change the numbers of the iterations - for example, I changed it from
Requirements:
sudo mkdir -p /tmp/my-config
sudo chmod 777 /tmp/my-config
for i in {1501..1503}
do
mkdir -p /tmp/nsfs_root_s3user/s3user-$i; chmod 777 /tmp/nsfs_root_s3user/s3user-$i;
done
Steps:
for i in {1501..1503}
do
sudo node src/cmd/manage_nsfs account add --name s3user-$i --uid $i --gid $i --new_buckets_path /tmp/nsfs_root_s3user/s3user-$i --config_root /tmp/my-config
done
You can check you can see the accounts config:
for i in {1501..1503}
do
sudo node src/cmd/manage_nsfs account delete --name s3user-$i --config_root /tmp/my-config
done
You can check if you can not see any account config: |
@romayalon, as per Ramya's post in the channel here https://ibmandredhatguests.slack.com/archives/C015Z7SDWQ0/p1717495547911289?thread_ts=1716826449.194409&cid=C015Z7SDWQ0, that post and the next 4 comments, which show "noobaa-cli with status and list", didn't show any master_key problem as was the case previously (i.e. before the 30th May build). So this delete error is still occurring, and can now be related to this defect to find out why it is failing. |
@rkomandu I agree, we just need to validate it using a print of the full error object/ stderr, thank you |
@rkomandu we need the details of the error to investigate. Did you try to delete one of the accounts (not in the loop) and see the error? |
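One way to collect the details requested above is to run a single delete outside the loop and keep everything it prints. The following is only a sketch under assumptions: the function name `delete_one_verbose` and the log file are hypothetical, the only mms3 invocation used is the `mms3 account delete <name>` form shown in this issue, and nothing is assumed about what exit status mms3 returns on failure (the status is simply recorded as-is).

```shell
# delete_one_verbose NAME LOGFILE
# Run one "mms3 account delete NAME" and append a UTC timestamp, the
# combined stdout/stderr, and the exit status to LOGFILE. The timestamp is
# there so the entry can be matched against noobaa.log afterwards.
delete_one_verbose() {
    local name="$1" logfile="$2" status=0
    {
        echo "=== $(date -u +%Y-%m-%dT%H:%M:%SZ) delete $name ==="
        mms3 account delete "$name" 2>&1 || status=$?
        echo "exit status: $status"
    } >> "$logfile"
    return "$status"
}

# Example (account name taken from the listing later in this thread):
#   delete_one_verbose s3user-31407 /tmp/delete_debug.log
```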
@shirady that is what is mentioned in the slack thread with the noobaa-cli status and list, as shown above |
Name New Buckets Path Uid Gid User
s3user-31407 /gpfs/remote_fvt_fs/s3user-31407-dir 31407 31407 None
Please migrate your code to use AWS SDK for JavaScript (v3). |
@ramya-c3 would you please add an explanation? Edit:
@ramya-c3 @rkomandu @romayalon |
@shirady , what you mentioned in your above comment as step 1, 2 is correct. |
@rkomandu as I understand in the loop you are using the I would also suggest to run the same loop with |
@shirady, the Dev team will have to change that in mms3, and only then can this be tried. I don't think we will use noobaa-cli here. |
Environment info
Standalone Noobaa with ODF 4.15.0 (d/s build ) 0313 --> noobaa-core-5.15.0-20240313.el9.x86_64
Actual behavior
2
for i in $(seq 8100 9000); do mms3 account delete s3user-$i; done
Account s3user-8369 deleted successfully
Account s3user-8370 deleted successfully
Account s3user-8371 deleted successfully
Account s3user-8372 deleted successfully
Account s3user-8373 deleted successfully
Account s3user-8374 deleted successfully
Account s3user-8375 deleted successfully
Account s3user-8376 deleted successfully
Failed to execute command for Account s3user-8377: The server encountered an internal error. Please retry the request --> here
Account s3user-8378 deleted successfully
Account s3user-8379 deleted successfully
Account s3user-8380 deleted successfully
Account s3user-8381 deleted successfully
Account s3user-8382 deleted successfully
[root@rkomandu-ip-cls-x-worker2 ~]# mms3 account list | grep 8377
s3user-8377 /mnt/fs1/s3user-8377-dir 8377 8377
[root@rkomandu-ip-cls-x-worker2 ~]# mms3 account list | grep 8377
s3user-8377 /mnt/fs1/s3user-8377-dir 8377 8377
[root@rkomandu-ip-cls-x-worker2 ~]# mms3 account list | grep 8376
[root@rkomandu-ip-cls-x-worker2 ~]# mms3 account list | grep 8378
[root@rkomandu-ip-cls-x-worker2 ~]# ls -ld /mnt/fs1/s3user-8377-dir
drwxrwx--- 2 8377 8377 4096 Mar 14 10:27 /mnt/fs1/s3user-8377-dir
[root@rkomandu-ip-cls-x-worker2 ~]# ls -ltr /mnt/fs1/s3user-8377-dir
total 0
Account s3user-8522 deleted successfully
Account s3user-8523 deleted successfully
Account s3user-8524 deleted successfully
Account s3user-8525 deleted successfully
Account s3user-8526 deleted successfully
Account s3user-8527 deleted successfully
Account s3user-8528 deleted successfully
Failed to execute command for Account s3user-8529: The server encountered an internal error. Please retry the request --> here
Account s3user-8530 deleted successfully
Account s3user-8531 deleted successfully
Account s3user-8532 deleted successfully
..
Account s3user-8966 deleted successfully
Account s3user-8967 deleted successfully
Account s3user-8968 deleted successfully
Account s3user-8969 deleted successfully
Account s3user-8970 deleted successfully
Account s3user-8971 deleted successfully
Failed to execute command for Account s3user-8972: The server encountered an internal error. Please retry the request --> here
Account s3user-8973 deleted successfully
Account s3user-8974 deleted successfully
Account s3user-8975 deleted successfully
Account s3user-8976 deleted successfully
Failed to execute command for Account s3user-8977: The server encountered an internal error. Please retry the request --> here
Account s3user-8978 deleted successfully
[root@rkomandu-ip-cls-x-worker2 ~]# mms3 account list | grep 8972
s3user-8972 /mnt/fs1/s3user-8972-dir 8972 8972
[root@rkomandu-ip-cls-x-worker2 ~]# mms3 account list | grep 8977
s3user-8977 /mnt/fs1/s3user-8977-dir 8977 8977
We opened a defect internally (RTC 327700), and Ramya figured out that it is due to a Noobaa problem.
As you can observe, there is no entry for the s3user-8972 delete request.
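The manual checks above (delete in a loop, then grep the account list for each failed name) can be combined into two small helpers. This is a sketch only: `delete_range` and `still_listed` are hypothetical names, success is detected purely from the "deleted successfully" text seen in the output above, and the account name is assumed to be the first column of `mms3 account list`, as in the listings shown here.

```shell
# delete_range FIRST LAST
# Delete s3user-FIRST .. s3user-LAST one at a time and print the name of
# every account whose delete output did not contain "deleted successfully".
delete_range() {
    local i="$1" last="$2" out
    while [ "$i" -le "$last" ]; do
        out=$(mms3 account delete "s3user-$i" 2>&1) || true
        case "$out" in
            *"deleted successfully"*) ;;
            *) echo "s3user-$i" ;;
        esac
        i=$((i + 1))
    done
}

# still_listed NAME: succeed if NAME is the first column of any line
# printed by "mms3 account list".
still_listed() {
    mms3 account list 2>/dev/null |
        awk -v n="$1" '$1 == n { found = 1 } END { exit !found }'
}

# Example:
#   for name in $(delete_range 8100 9000); do
#       still_listed "$name" && echo "$name failed and is still listed"
#   done
```

Matching on the first column avoids false hits from the directory path, which also contains the account name.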
Expected behavior
It shouldn't show
Steps to reproduce
Create accounts in a loop, say 2K, and then try to delete them, maybe along with buckets as well. Try deleting them from the CES nodes.
More information - Screenshots / Logs / Other output