
"500 Internal Server Error" because of "Duplicate entry" after Upgrade to PI-3.9.1 #3810

Open
beckerr-rzht opened this issue Nov 11, 2023 · 4 comments
Assignees
Labels
Type: Possible bug (Suspected bug by user)

Comments

@beckerr-rzht

We operate PrivacyIdea in production with a MariaDB Galera cluster consisting of 3 nodes.
Each database node is assigned a PI node, which communicates with the local MariaDB node (on localhost).
One of the nodes is selected as the master via keepalived.

This worked without any problems up to PI 3.8.1, especially when the master was changed via keepalived.

However, after upgrading from PI-3.8.1 to PI-3.9.1, I now see many "duplicate key" errors.
Example:

pymysql.err.IntegrityError: (1062, "Duplicate entry '34685581' for key 'PRIMARY'")

The above exception was the direct cause of the following exception:
--
sqlalchemy.exc.IntegrityError: (pymysql.err.IntegrityError) (1062, "Duplicate entry '34685581' for key 'PRIMARY'")
[SQL: INSERT INTO usercache (id, username, used_login, resolver, user_id, timestamp) VALUES (%(id)s, %(username)s, %(used_login)s, %(resolver)s, %(user_id)s, %(timestamp)s)]
[parameters: {'id': 34685581, 'username': 'XXXX', 'used_login': 'XXXX', 'resolver': 'LDAP', 'user_id': 'X-Y-Y-Y-X', 'timestamp': datetime.datetime(2023, 11, 10, 22, 47, 46, 752497)}]

This should actually be prevented by the Galera cluster, as its auto-increment configuration is correct:
https://mariadb.com/kb/en/auto_increment/#replication

SHOW VARIABLES LIKE 'auto_inc%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| auto_increment_increment | 3     |
| auto_increment_offset    | 1 (for node 1), 2 (for node 2), 3 (for node 3) |
+--------------------------+-------+
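
With this configuration the AUTO_INCREMENT columns on the three nodes hand out disjoint id ranges, for example:

node 1: 1, 4, 7, 10, ...
node 2: 2, 5, 8, 11, ...
node 3: 3, 6, 9, 12, ...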

I have compared dumps of the PI database, see that many SEQUENCEs have been added, and suspect that this is the cause of the problem.

Example:

...
CREATE SEQUENCE `usercache_seq` start with 34685579 minvalue 1 maxvalue 9223372036854775806 increment by 0 cache 1000 nocycle ENGINE=InnoDB;
SELECT SETVAL(`usercache_seq`, 34688579, 0);
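
For reference, a quick check to see whether such a sequence has fallen behind the ids already present in the table (table and sequence names taken from the dump above; note that NEXTVAL consumes a value):

SELECT MAX(id) FROM usercache;
SELECT NEXTVAL(usercache_seq);
-- if the sequence is behind, it can be pushed past the highest id (the value below is only illustrative)
SELECT SETVAL(usercache_seq, 40000000, 0);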

Since we first encountered the problem in the production environment, I have this urgent question:

Is there a quick fix or workaround that gets around this problem?

@beckerr-rzht added the Type: Possible bug (Suspected bug by user) label on Nov 11, 2023
@beckerr-rzht
Author

It seems to be related to this commit:
e5e0027

And as I understand it, this commit should actually fix exactly such problems. Strange.

@plettich
Member

Hi @beckerr-rzht
Since we also support PostgreSQL (which has always used sequences), we changed all the primary key columns to sequences in our database abstraction layer (SQLAlchemy). For databases which don't support sequences (MySQL), this is translated to an AUTO_INCREMENT column.
Since MariaDB 10.6.1, sequences are supported there as well, so SQLAlchemy uses them too (that is why we had to add all the missing sequences).
According to https://mariadb.com/kb/en/create-sequence/#arguments-to-create, a sequence should use the auto_increment_increment value if it is created with an INCREMENT of 0 (which we do).
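
For what it's worth, the effective step of such a sequence can be observed directly on a MariaDB node with a throwaway sequence (the name test_seq is just an example):

CREATE SEQUENCE test_seq START WITH 1 INCREMENT BY 0 CACHE 1000 NOCYCLE;
SELECT NEXTVAL(test_seq);
SELECT NEXTVAL(test_seq);
-- the difference between the two values should match auto_increment_increment
DROP SEQUENCE test_seq;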

Do you always have only one PI node writing to the DB? So this error happens during a failover?

@beckerr-rzht
Author

Good questions.
I would like to go into a little more detail and explain the cluster and the complete update process:

Cluster

There are three VMs, each with an IDP (Shibboleth), PI, and a MariaDB Galera node (1:10.3.38-0ubuntu0.20.04.1).
Each IDP instance (Shibboleth) talks to the local PI instance, which in turn talks to the local Galera cluster node.
Normally VM1 is designated as the master by keepalived.

Before the update to PI 3.9.1:

VM1 (Master) IDP1 -> PI1 (3.8.1) -> MARIADB-CLUSTER-NODE1
VM2          IDP2 -> PI2 (3.8.1) -> MARIADB-CLUSTER-NODE2
VM3          IDP3 -> PI3 (3.8.1) -> MARIADB-CLUSTER-NODE3

Updating PI

Our seamless and quite well-tested update procedure currently takes about 5 minutes and looks like this:

  1. Execute privacyidea-pi-update on VM3 and let it alter the database

  2. Execute privacyidea-pi-update on VM2 and let it alter the database (*)

  3. Switch the keepalived master over from VM1 to VM2:

    VM1          IDP1 -> PI1 (3.8.1) -> MARIADB-CLUSTER-NODE1
    VM2 (Master) IDP2 -> PI2 (3.9.1) -> MARIADB-CLUSTER-NODE2
    VM3          IDP3 -> PI3 (3.9.1) -> MARIADB-CLUSTER-NODE3
    
  4. Execute privacyidea-pi-update on VM1 and let it alter the database (*)

  5. Switch the keepalived master back from VM2 to VM1

    (*) In steps 2 and 4, no changes should be made to the database, as the schema was already altered in step 1 (but in fact this is only my assumption; see the check sketched below).
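
A possible way to verify this assumption is to compare the schema migration revision before and after steps 2 and 4 (assuming the default Alembic version table that privacyIDEA's migrations use):

SELECT version_num FROM alembic_version;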

Result

VM1 (Master) IDP1 -> PI1 (3.9.1) -> MARIADB-CLUSTER-NODE1
VM2          IDP2 -> PI2 (3.9.1) -> MARIADB-CLUSTER-NODE2
VM3          IDP3 -> PI3 (3.9.1) -> MARIADB-CLUSTER-NODE3

Errors

The first errors occurred in this state immediately after the update to PI 3.9.1.

usercache

sqlalchemy.exc.IntegrityError: (pymysql.err.IntegrityError) (1062, "Duplicate entry '34685581' for key 'PRIMARY'")
[SQL: INSERT INTO usercache (id, username, used_login, resolver, user_id, timestamp) VALUES (%(id)s, %(username)s, %(used_login)s, %(resolver)s, %(user_id)s, %(timestamp)s)]
[parameters: {'id': 34685581, 'username': 'XXXX', 'used_login': 'XXXX', 'resolver': 'XXXX', 'user_id': 'XXXX', 'timestamp': datetime.datetime(2023, 11, 10, 22, 47, 46, 752497)}]
(Background on this error at: https://sqlalche.me/e/14/gkpj)

The PI-API reports this error as HTTP 500:

2023-11-10 22:47:55,123 - 10.249.131.222 - ERROR [de.zedat.fudis.shibboleth.idp.plugin.authn.fudiscr.impl.predicate.impl.UserHasAnyTokenPredicate:69] - Predicate UserHasAnyTokenPredicate: Backend error for user 'XXXX', predicate returns true.
de.zedat.fudis.shibboleth.idp.plugin.authn.fudiscr.impl.client.BackendException: org.springframework.web.client.HttpServerErrorException$InternalServerError: 500 INTERNAL SERVER ERROR: "{"detail": null, "id": 1, "jsonrpc": "2.0", "result": {"error": {"code": -500, "message": "500 Internal Server Error: The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application."}, "status": false}, "time": 1699652875.044161, "version": "privacyIDEA 3.9", "signature": "XXXX"}"

To get rid of these errors, I switched off the user cache and emptied the usercache table.
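
For reference, emptying the table amounts to a plain DELETE (or TRUNCATE); the cache itself is switched off via the user cache expiration setting in the privacyIDEA system configuration (as far as I know, setting it to 0 disables the cache):

DELETE FROM usercache;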

privacyidea.lib.auditmodules.sqlaudit:328

The next morning I continued testing and connected my browser (via an entry in the /etc/hosts file) to VM3.
From this point on, data from both PI1 and PI3 was written to the database.
I have used this setup many times before and never ran into errors, because both PI and MariaDB Galera allow concurrent writes.

This morning, however, I received errors of this type:

[2023-11-11 09:48:26,875][1436434][140022750586624][ERROR][privacyidea.lib.auditmodules.sqlaudit:328] exception IntegrityError('(pymysql.err.IntegrityError) (1062, "Duplicate entry \'36912748\' for key \'PRIMARY\'")')
[2023-11-11 09:48:26,876][1436434][140022750586624][ERROR][privacyidea.lib.auditmodules.sqlaudit:329] DATA: {'startdate': datetime.datetime(2023, 11, 11, 9, 48, 26, 148965), 'success': True, 'serial': None, 'user': 'XXXX', 'realm': 'XXXX', 'resolver': 'XXXX', 'token_type': None, 'client': '127.0.0.1', 'client_user_agent': None, 'privacyidea_server': 'idp-1', 'action': 'GET /token/', 'action_detail': '', 'thread_id': '140022750586624', 'info': 'realm: None', 'administrator': 'XXXX', 'policies': 'XXXX'}

A little later I created this issue.
Since then I have not done any failover via keepalived or connected my browser to VM3.

@nilsbehlen
Member

Hey @plettich, did we fix this already? Maybe in 3.9.3 or 3.9.2?
