Sqlite errors #4709

Open
2 tasks done
fabry006 opened this issue Apr 25, 2024 · 8 comments
Labels
area:deployment related to how uptime kuma can be deployed help

Comments

@fabry006

⚠️ Please verify that this question has NOT been raised before.

  • I checked and didn't find similar issue

🛡️ Security Policy

📝 Describe your problem

I am having trouble with SQLite. Sometimes I see timeout errors like this one:

Trace: [Error: insert into `heartbeat` (`down_count`, `duration`, `important`, `monitor_id`, `msg`, `ping`, `status`, `time`) values (0, 147870, true, 23, '200 - OK', 31, 1, '2024-04-25 13:35:05.866') - SQLITE_IOERR: disk I/O error] {
  errno: 10,
  code: 'SQLITE_IOERR'
}
    at consoleCall (<anonymous>)
    at Timeout.safeBeat [as _onTimeout] (/app/server/model/monitor.js:1028:25)
2024-04-25T15:35:05+02:00 [MONITOR] ERROR: Please report to https://github.com/louislam/uptime-kuma/issues
2024-04-25T15:35:05+02:00 [MONITOR] INFO: Try to restart the monitor
2024-04-25T15:35:05+02:00 [] INFO: Cannot write to error.log

I've deployed the software as a Docker container with podman in a virtual machine based on RHEL8 and mounted a volume to a local directory.
The only way to resolve these errors is to restart the container.
I have about 30 to 40 monitors, and the maximum nesting depth (they are organized in Groups) is 4.

Can you please help me stabilize the environment?

📝 Error Message(s) or Log

Trace: [Error: insert into `heartbeat` (`down_count`, `duration`, `important`, `monitor_id`, `msg`, `ping`, `status`, `time`) values (0, 147870, true, 23, '200 - OK', 31, 1, '2024-04-25 13:35:05.866') - SQLITE_IOERR: disk I/O error] {
  errno: 10,
  code: 'SQLITE_IOERR'
}
    at consoleCall (<anonymous>)
    at Timeout.safeBeat [as _onTimeout] (/app/server/model/monitor.js:1028:25)
2024-04-25T15:35:05+02:00 [MONITOR] ERROR: Please report to https://github.com/louislam/uptime-kuma/issues
2024-04-25T15:35:05+02:00 [MONITOR] INFO: Try to restart the monitor
2024-04-25T15:35:05+02:00 [] INFO: Cannot write to error.log

🐻 Uptime-Kuma Version

1.23.11-alpine

💻 Operating System and Arch

RHEL8 with podman

🌐 Browser

Edge

🖥️ Deployment Environment

  • Runtime: podman version 4.6.1
  • Database:
  • Filesystem used to store the database on: /opt type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
  • number of monitors: 40
@fabry006 fabry006 added the help label Apr 25, 2024
@CommanderStorm
Collaborator

CommanderStorm commented Apr 25, 2024

As an explanation of what this error means, from the docs:

The SQLITE_IOERR result code says that the operation could not finish because the operating system reported an I/O error.
A full disk drive will normally give an SQLITE_FULL error rather than an SQLITE_IOERR error.
There are many different extended result codes for I/O errors that identify the specific I/O operation that failed.
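
To illustrate how the extended result codes mentioned in that quote relate to the primary code: SQLite keeps the primary code in the low 8 bits and the sub-code in the bits above. The `errno: 10` in the trace is therefore plain SQLITE_IOERR with no sub-code, which would explain why the log does not say which I/O operation failed. A minimal Python sketch, where `split_result_code`, `describe`, and the small name table are illustrative helpers and not part of SQLite or Uptime Kuma:

```python
# Illustrative helpers (hypothetical names, not part of SQLite or Uptime Kuma):
# split a SQLite result code into its primary code and extended sub-code.
SQLITE_IOERR = 10

# A few I/O sub-codes from the SQLite result-code list (not exhaustive).
IOERR_SUBCODES = {
    1: "SQLITE_IOERR_READ",
    2: "SQLITE_IOERR_SHORT_READ",
    3: "SQLITE_IOERR_WRITE",
    4: "SQLITE_IOERR_FSYNC",
}


def split_result_code(code: int) -> tuple[int, int]:
    """Return (primary, sub): primary is the low 8 bits, sub is the rest."""
    return code & 0xFF, code >> 8


def describe(code: int) -> str:
    """Give a short human-readable description of an I/O result code."""
    primary, sub = split_result_code(code)
    if primary != SQLITE_IOERR:
        return f"not an I/O error (primary code {primary})"
    if sub == 0:
        return "SQLITE_IOERR (no extended code reported)"
    return IOERR_SUBCODES.get(sub, f"SQLITE_IOERR sub-code {sub}")


print(describe(10))   # the errno from the trace in this issue
print(describe(778))  # 10 | (3 << 8)
```

If the SQLite binding in use can be told to report extended codes, the sub-code would narrow down whether a read, write, or fsync failed.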

A person on Reddit suggested the following. Could you try it and report the results?

  • If it shows up immediately when connecting to the database, it might be an issue with [...] file permissions, the file path or path style (remember different slashes for windows/linux etc, and check that node isn't altering them), or some obscure virtual machine bug related to those.
    Before you dig too deep into those, read up on your specific sqlite package on npm and make sure everything is set up correctly.
    You may want to install/update sqlite3 or an equivalent through your package manager separately!
  • If it shows up after other successful queries, it could be an indication of data or media corruption [...]
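
One way to test the data corruption hypothesis from the second bullet is SQLite's built-in `PRAGMA integrity_check`. The sketch below is only an illustration: `check_database` is a hypothetical helper, the demo runs against a throwaway database, and for a real check it would be safer to point it at a copy of `kuma.db` taken while the container is stopped.

```python
import os
import sqlite3
import tempfile


def check_database(path: str) -> list[str]:
    """Run PRAGMA integrity_check and return the messages SQLite reports.

    A healthy database yields the single message "ok".
    """
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]


# Demo on a throwaway database; the table layout here is made up.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.db")
    conn = sqlite3.connect(demo_path)
    conn.execute("CREATE TABLE heartbeat (monitor_id INTEGER, status INTEGER)")
    conn.execute("INSERT INTO heartbeat VALUES (23, 1)")
    conn.commit()
    conn.close()
    demo_result = check_database(demo_path)
    print(demo_result)
```

Anything other than a single `ok` row points at damage inside the database file. A clean result does not rule out a failing disk, since the I/O error can also be transient.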

@CommanderStorm CommanderStorm added the area:deployment related to how uptime kuma can be deployed label Apr 25, 2024
@CommanderStorm
Collaborator

Please also see #4110 if you have the same problem

@fabry006
Author

Thank you @CommanderStorm. I read your links, but I am not sure they apply to my case.
Yesterday I restarted the container and, again, this morning it is stuck on the same error. Just for testing, I started the container as root in order to avoid any kind of permission issue, but this didn't solve the problem.
I tried to create a directory inside the same directory where the db is stored, and there was no issue.

This is the current content of the folder:

drwxrwxrwx. 2 root root         6 Apr 23 13:59 docker-tls
-rwxrwxrwx. 1 root root      7490 Apr 23 23:46 error.log
-rwxrwxrwx. 1 root root 444760064 Apr 26 08:15 kuma.db
-rwxrwxrwx. 1 root root     98304 Apr 26 08:15 kuma.db-shm
-rwxrwxrwx. 1 root root  33685152 Apr 26 08:19 kuma.db-wal
drwxrwxrwx. 2 root root         6 Apr 23 13:59 screenshots
drwxrwxrwx. 2 root root        28 Apr 23 13:59 truststore
drwxrwxrwx. 2 root root       108 Apr 23 13:59 upload

The VM is deployed in the corporate private cloud (based on VMware).
Previously I tried mounting the volume on a mounted filesystem that is supposed to be used for the app data, but I was not sure that it is a local disk, so I changed the folder to /opt/uptime/volumes. I still see these errors.

@chakflying
Collaborator

Can you try going into Settings -> Monitor History -> press "Shrink Database"?

@fabry006
Author

@chakflying So far the monitoring history is set to 0 (I need at least 1 year of data, hoping that in the future the Kuma UI will have a feature to select the time history for the whole year).
So what is the effect if I shrink the db? I think no data will be deleted. Am I wrong?

@chakflying
Collaborator

Correct, no data would be deleted; it only runs a command to compact and organize the database file.
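
For illustration, here is roughly what such a compaction looks like, assuming "Shrink Database" issues something equivalent to SQLite's `VACUUM` (an assumption, not confirmed in this thread). The `shrink` helper and the demo table are hypothetical. The demo shows that rows deleted earlier leave free pages behind, and that `VACUUM` reclaims them without touching the remaining rows.

```python
import os
import sqlite3
import tempfile


def shrink(path: str) -> tuple[int, int]:
    """VACUUM the database at `path`; return (size_before, size_after) in bytes."""
    before = os.path.getsize(path)
    # isolation_level=None puts the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()
    return before, os.path.getsize(path)


# Demo on a throwaway database; the table layout here is made up.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.db")
    conn = sqlite3.connect(demo_path)
    conn.execute("CREATE TABLE heartbeat (id INTEGER PRIMARY KEY, msg TEXT)")
    conn.executemany(
        "INSERT INTO heartbeat (msg) VALUES (?)",
        [("200 - OK " * 50,)] * 5000,
    )
    conn.commit()
    # Deleting rows frees pages inside the file but does not shrink the file.
    conn.execute("DELETE FROM heartbeat WHERE id > 1000")
    conn.commit()
    conn.close()

    size_before, size_after = shrink(demo_path)

    conn = sqlite3.connect(demo_path)
    rows_left = conn.execute("SELECT COUNT(*) FROM heartbeat").fetchone()[0]
    conn.close()
    print(size_before, size_after, rows_left)
```

`VACUUM` rebuilds the database into a temporary copy, so it can need free disk space on the order of the database size while it runs. That is worth checking first for a file of about 444 MB like the one listed above.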

In general, most people who have reported database errors have had to reduce the retention time to eliminate the errors. I think we currently don't have the expertise to optimize our current usage of SQLite further, so there is no solution for now.

If running "Shrink Database" doesn't solve your issue, I think you can wait for external database support in 2.0, and in the meantime consider reducing the retention time after doing whatever backups necessary.

@fabry006
Author

@chakflying I will try and let you know

@CommanderStorm
Collaborator

(We track the steps necessary before a v2.0 release in #4500.)

@chakflying I don't know if this is related, as none of the other cases with this error boiled down to this. At the moment, I suspect a problem similar to #4110: a broken disk.
