Server freezes after restarting monitorix #462

Open
DSmidge opened this issue Feb 1, 2024 · 35 comments

Comments

@DSmidge

DSmidge commented Feb 1, 2024

Hi. I have some problems with Monitorix. Server freezes when I do "service monitorix restart". Reboot works OK.
I removed /dev/sdf from the md1 RAID array and from monitorix.conf. Reload didn't help. Reboot crashed the server.
This already happened previously, and I recreated the /var/lib/monitorix directory.

Do you have any idea whether this is a Monitorix problem or whether I have a faulty configuration? Maybe I shouldn't run Monitorix under Apache?

Config file is /etc/monitorix/conf.d/my.conf which is a link to other file. Permissions are OK.

Apache config:
# Monitorix
Alias /monitorix/ /var/lib/monitorix/www/
<Location /monitorix/>
Require all granted
</Location>

ScriptAlias /monitorix-cgi/ /var/lib/monitorix/www/cgi/
<Location /monitorix-cgi/>
DirectoryIndex monitorix.cgi
Options ExecCGI
Require all granted
</Location>

Related logs:
Feb 1 20:09:10 server systemd[1]: Stopping LSB: Start Monitorix daemon...
Feb 1 20:09:10 server monitorix[1200623]: ...done.
Feb 1 20:09:10 server systemd[1]: monitorix.service: Deactivated successfully.
Feb 1 20:09:10 server systemd[1]: monitorix.service: Unit process 1035 (/usr/bin/monito) remains running after unit stopped.
Feb 1 20:09:10 server systemd[1]: Stopped LSB: Start Monitorix daemon.
Feb 1 20:09:10 server systemd[1]: monitorix.service: Consumed 1h 15min 28.200s CPU time.
Feb 1 20:09:10 server systemd[1]: monitorix.service: Found left-over process 1035 (/usr/bin/monito) in control group while starting unit. Ignoring.
Feb 1 20:09:10 server systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Feb 1 20:09:10 server systemd[1]: monitorix.service: Found left-over process 1200630 (/usr/bin/monito) in control group while starting unit. Ignoring.
Feb 1 20:09:10 server systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Feb 1 20:09:10 server systemd[1]: Starting LSB: Start Monitorix daemon...
Feb 1 20:09:10 server monitorix[1200631]: ...done.
Feb 1 20:09:10 server systemd[1]: Started LSB: Start Monitorix daemon.
Feb 1 20:09:46 server mariadbd[1462]: 2024-02-01 20:09:46 1255 [Warning] Aborted connection 1255 to db: ...
Feb 1 20:09:46 server mariadbd[1462]: 2024-02-01 20:09:46 36949 [Warning] Aborted connection 36949 to db: ...
Feb 1 20:09:46 server mariadbd[1462]: 2024-02-01 20:09:46 1257 [Warning] Aborted connection 1257 to db: ...
Feb 1 20:09:46 server mariadbd[1462]: 2024-02-01 20:09:46 36948 [Warning] Aborted connection 36948 to db: ...
Feb 1 20:09:58 server mariadbd[1462]: 2024-02-01 20:09:58 1251 [Warning] Aborted connection 1251 to db: ...
Feb 1 20:10:04 server mariadbd[1462]: 2024-02-01 20:10:04 1268 [Warning] Aborted connection 1268 to db: ...
Feb 1 20:11:22 server systemd[1]: Received SIGINT.
Feb 1 20:11:22 server systemd[1]: Stopping Session 1030 of User sa...
Feb 1 20:11:22 server systemd[1]: Removed slice Slice /system/modprobe.
Feb 1 20:11:22 server systemd[1]: Stopped target Graphical Interface.
... more Stopped commands below

And monitorix screenshots:
image
image

@mikaku
Owner

mikaku commented Feb 1, 2024

Hi. I have some problems with Monitorix. Server freezes when I do "service monitorix restart". Reboot works OK.
I removed /dev/sdf from the md1 RAID array and from monitorix.conf. Reload didn't help. Reboot crashed the server.
This already happened previously, and I recreated the /var/lib/monitorix directory.

I'm sorry, I don't understand you. You say "Reboot works OK" but then you say "Reboot crashed the server.".

Do you have any idea whether this is a Monitorix problem or whether I have a faulty configuration? Maybe I shouldn't run Monitorix under Apache?

Running Monitorix under Apache is well tested and I've never heard of any problems with it.

Feb 1 20:09:10 server systemd[1]: monitorix.service: Found left-over process 1035 (/usr/bin/monito) in control group while starting unit. Ignoring.
Feb 1 20:09:10 server systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.

I've never seen those messages either.

What's your Monitorix version?
What OS?
How did you install Monitorix in your OS?
What does the Monitorix log file say?
...

@DSmidge
Author

DSmidge commented Feb 1, 2024

I'm sorry for the confusion. Here is the corrected text:
Server freezes when I do "service monitorix restart". "service monitorix reload" works OK.
I removed /dev/sdf from the md1 RAID array and from monitorix.conf. Service reload didn't remove /dev/sdf from the list in the Monitorix app. Service restart crashed the server. After a PC restart everything worked OK.
Thank you for responding to my issue, Domen

@DSmidge
Author

DSmidge commented Feb 1, 2024

Monitorix version: 3.15.0-izzy1
OS is Ubuntu 22.04.3 LTS (upgraded from 20.04 LTS)

Installation:
wget -O - http://apt.izzysoft.de/izzysoft.asc | sudo apt-key add -
apt-get install monitorix

@DSmidge
Author

DSmidge commented Feb 1, 2024

Monitorix logs:
Thu Feb 1 20:09:10 2024 - SIGTERM caught.
Thu Feb 1 20:09:10 2024 - Starting Monitorix version 3.15.0 (pid 1200640).
Thu Feb 1 20:09:10 2024 - Loaded main configuration file '/etc/monitorix/monitorix.conf'.
Thu Feb 1 20:09:10 2024 - Loading extra configuration file '/etc/monitorix/conf.d/00-debian.conf'.
Thu Feb 1 20:09:10 2024 - Loading extra configuration file '/etc/monitorix/conf.d/99-server.conf'.
iptables: Index of deletion too big.
iptables: Index of deletion too big.
iptables: Index of deletion too big.
iptables: Index of deletion too big.
iptables: Index of deletion too big.
iptables: Index of deletion too big.
iptables: Index of deletion too big.
iptables: Index of deletion too big.
iptables: Index of deletion too big.
iptables: Index of deletion too big.
iptables: Index of deletion too big.
iptables: Index of deletion too big.
iptables: Index of deletion too big.
iptables: Index of deletion too big.
iptables: Index of deletion too big.
iptables: No chain/target/match by that name.
iptables: No chain/target/match by that name.
iptables: No chain/target/match by that name.
iptables: No chain/target/match by that name.
iptables: No chain/target/match by that name.
iptables: No chain/target/match by that name.
iptables: No chain/target/match by that name.
iptables: No chain/target/match by that name.
iptables: No chain/target/match by that name.
iptables: No chain/target/match by that name.
iptables: No chain/target/match by that name.
iptables: No chain/target/match by that name.
Thu Feb 1 20:09:11 2024 - Initializing graphs.
Thu Feb 1 20:09:12 2024 - Exiting.
Thu Feb 1 20:09:12 2024 - Generating the 'index.html' file.
Thu Feb 1 20:09:12 2024 - Ok, ready.
Thu Feb 1 20:10:21 2024 - phpfpm::phpfpm_update: ERROR: in pool 'www', unable to connect to ...
Thu Feb 1 20:10:21 2024 - phpfpm::phpfpm_update: 500 Can't connect to ...
Thu Feb 1 20:11:21 2024 - phpfpm::phpfpm_update: ERROR: in pool 'www', unable to connect to ...
Thu Feb 1 20:11:21 2024 - phpfpm::phpfpm_update: 500 Can't connect to ...
Thu Feb 1 20:11:22 2024 - SIGTERM caught.
Thu Feb 1 20:11:23 2024 - Exiting.

@mikaku
Owner

mikaku commented Feb 2, 2024

Besides some odd iptables messages and the error in the phpfpm.pm module (which seems not to be configured), I don't see anything special in your log file. Have you modified the default configuration files /etc/monitorix/monitorix.conf and /etc/monitorix/conf.d/00-debian.conf?

Can you paste your 99-server.conf configuration file here?
Please hide any sensitive information before pasting!

@DSmidge
Author

DSmidge commented Feb 2, 2024

This php-fpm error is strange as the statistics for it are present in Monitorix:
image
It could be that the php-fpm service was already shut down at that time, the same as MariaDB in the syslog in my first post.

The default config files were not changed. All my changes are in 99-server.conf:
99-server.conf.txt

@mikaku
Owner

mikaku commented Feb 2, 2024

I don't see anything wrong in your configuration file.
Make sure you have only one instance of Monitorix running in your server.

@DSmidge
Author

DSmidge commented Feb 2, 2024

I always thought that the greylist option was optional, as I don't use greylisting. I will make a diff with /etc/monitorix/monitorix.conf to see if anything else is missing or is off.
Thank you very much for looking into my issue. I made a small donation as an appreciation for your help.

@mikaku
Owner

mikaku commented Feb 2, 2024

I always thought that the greylist option was optional, as I don't use greylisting. I will make a diff with /etc/monitorix/monitorix.conf to see if anything else is missing or is off.

It's optional indeed. That message was posted here by mistake; it should have gone to another issue. My fault. Sorry for the noise.

Thank you very much for looking into my issue. I made a small donation as an appreciation for your help.

I've just received the PayPal notification.
Thank you very much, it's really appreciated.

@mikaku
Owner

mikaku commented Feb 2, 2024

I'm not sure, but it looks like the Monitorix service unit on your Ubuntu is not being executed correctly by your systemd, or it somehow causes a malfunction in your system.

I'm using different versions of systemd on different servers (none of them are Debian or Ubuntu) and all my Monitorix instances are running fine. The newest systemd version I have is 254.

What systemd version do you have there?

@IzzySoft, have you seen a similar problem in your Debian/Ubuntu systems?

@DSmidge
Author

DSmidge commented Feb 2, 2024

"apt list --installed" returns "systemd/now 249.11-0ubuntu3.11"

@mikaku
Owner

mikaku commented Feb 2, 2024

This is the unit file of Monitorix 3.15 running on my Fedora 39:

# cat /usr/lib/systemd/system/monitorix.service
[Unit]
Description=Monitorix
Documentation=man:monitorix(8)
After=network-online.target
Requires=network-online.target

[Service]
Type=forking
EnvironmentFile=-/etc/sysconfig/monitorix
ExecStart=/usr/bin/monitorix -c /etc/monitorix/monitorix.conf -p /run/monitorix.pid $OPTIONS
PIDFile=/run/monitorix.pid

[Install]
WantedBy=multi-user.target

Compare it with yours, it should be the same.

@DSmidge
Author

DSmidge commented Feb 2, 2024

This exists: /etc/init.d/monitorix
List of package files "apt-file list monitorix" returns:
/lib/systemd/system/monitorix.service
But the file does not exist on my server. Will try to reinstall it. Maybe it was using init.d script when running "service monitorix restart".

@DSmidge
Author

DSmidge commented Feb 2, 2024

"apt install monitorix --reinstall" didn't help. File monitorix.service is still missing.

@mikaku
Owner

mikaku commented Feb 2, 2024

This exists: /etc/init.d/monitorix

Oh, then your Monitorix installation is using a System V init script on a systemd system. Most systemd installations remain backwards compatible with System V init scripts, but perhaps you need to install a specific systemd component for this on your Ubuntu.

I'll wait to see what @IzzySoft says about this, because he is the person who packages and maintains the Izzy repository, from which you downloaded your Monitorix version.
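
One rough way to tell whether systemd is running a native unit or a wrapper generated from /etc/init.d is to look at where the loaded unit file lives. The sketch below is only an illustration: the helper name unit_origin is made up here, and it assumes that units produced by systemd's SysV generator are placed under /run/systemd/generator.late/, which is worth verifying on the system in question. The "LSB: Start Monitorix daemon" description in the syslog above is another hint that a generated wrapper is in use.

```shell
#!/bin/sh
# unit_origin PATH  (hypothetical helper, not part of Monitorix or systemd)
# Classifies a unit by the path of its loaded unit file.
# Assumption: SysV-generated wrappers live under /run/systemd/generator.late/.
unit_origin() {
    case "$1" in
        /run/systemd/generator.late/*) echo sysv-generated ;;
        "")                            echo unknown ;;
        *)                             echo native ;;
    esac
}

# Possible usage on a systemd host (not executed here):
#   unit_origin "$(systemctl show -p FragmentPath --value monitorix)"
```

If the result is sysv-generated, "service monitorix restart" is going through the init script rather than a packaged monitorix.service.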

@IzzySoft

IzzySoft commented Feb 2, 2024

have you seen a similar problem in your Debian/Ubuntu systems?

No, I haven't. But neither have I ever removed disks while the system was running – apart from "real removables", which in turn I've never added to my Monitorix configs.

@DSmidge I might have missed it, but did you state where you installed Monitorix from? Official repos, my repo …?

@mikaku
Owner

mikaku commented Feb 2, 2024

Ah!, @IzzySoft to the rescue! 👍

@DSmidge
Author

DSmidge commented Feb 2, 2024

I used these commands a while (= years) ago:
wget -O - http://apt.izzysoft.de/izzysoft.asc | sudo apt-key add -
apt-get install monitorix

@DSmidge
Author

DSmidge commented Feb 2, 2024

Maybe it was installed on Ubuntu 18.04. Not sure.

@IzzySoft

IzzySoft commented Feb 2, 2024

I've just checked my SPEC file (maybe for the next release, I should align that a little closer to yours, @mikaku) – the entire docs/ is included with %doc, so the unit file should have come along the same way as with the "official package".

List of package files "apt-file list monitorix" returns:
/lib/systemd/system/monitorix.service
But the file does not exist on my server.

If I'm not mistaken, the SPEC file would have placed it below /usr/share/docs/monitorix. I must hand back to @mikaku the question how it gets to /lib/systemd/system then. I find no reference to *lib/sys* in either SPEC file. It's quite a while ago that I performed a "fresh install"…

@DSmidge
Author

DSmidge commented Feb 2, 2024

Yes, this is the location on Ubuntu: /usr/share/doc/monitorix/monitorix.service
And its contents are the same as what @mikaku posted a few comments above.

@mikaku
Owner

mikaku commented Feb 2, 2024

I must hand back to @mikaku the question how it gets to /lib/systemd/system then.

@IzzySoft, keep in mind that the .spec file of Monitorix for Fedora is very different to the .spec file that comes with the standard tar.gz of Monitorix. Fedora has very strong rules on how to use RPM macros, etc.

Check the following two links:

https://bodhi.fedoraproject.org/updates/?search=monitorix
https://koji.fedoraproject.org/koji/buildinfo?buildID=2097584

@IzzySoft

IzzySoft commented Feb 2, 2024

keep in mind that the .spec file of Monitorix for Fedora is very different to the .spec file that comes with the standard tar.gz of Monitorix.

I referred to the one shipping in the tarball: src/monitorix/docs/monitorix.spec. It slightly differs from mine – e.g. yours uses

%doc Changes COPYING README README.nginx README.BSD docs/monitorix-alert.sh docs/monitorix-apache.conf docs/monitorix-lighttpd.conf docs/monitorix.service docs/htpasswd.pl

while mine takes everything from docs/ as %doc docs/*. Results should be the same, though (OK, mine might have some more files in %doc then but that doesn't hurt), but it makes such evaluations as here now a bit more complex. So what I meant by "align" was to make such a comparison easier on my end 😉

@mikaku
Owner

mikaku commented Feb 2, 2024

@DSmidge,

You might want to try to copy the file docs/monitorix.service into the /usr/lib/systemd/system/ directory.
Then just enable and start Monitorix like any other service:

systemctl enable monitorix
systemctl start monitorix

Let us know if that worked.

@DSmidge
Author

DSmidge commented Feb 2, 2024

It's working. I used these commands:
cp -p /usr/share/doc/monitorix/monitorix.service /usr/lib/systemd/system/
/etc/init.d/monitorix stop
systemctl enable monitorix
service monitorix status # says it's inactive (dead)
systemctl start monitorix
service monitorix status # says it's active (running)

So was my installation broken, or were the service files not properly installed?

@mikaku
Owner

mikaku commented Feb 2, 2024

So was my installation broken, or were the service files not properly installed?

Well, perhaps systemd no longer supports System V init scripts, or at least, it no longer includes the plugin to support them by default.

@IzzySoft, that would mean that the .spec in docs/ that forms the base for your .deb packages could be no longer workable for modern Ubuntu versions.

@DSmidge
Author

DSmidge commented Feb 2, 2024

Thank you again, @mikaku, for your help. And thank you, @IzzySoft - I made a small donation to your DE bank account (hopefully it will be processed on Monday, because I put Germany for your address and town).

@IzzySoft

IzzySoft commented Feb 2, 2024

@mikaku then maybe we have 3 options here. In order of preference:

  1. make it clear the package is intended for only Debian < x / Ubuntu < y and when used on "higher versions" folks need to manually move those files around
  2. adjust the spec to place the monitorix.service into /usr/lib/systemd/system (and keep the init.d file out instead) – would that break on some older distributions?
  3. drop the package from my repo (very last resort)

Going with 1. maybe there could be a proper message shown at install. I've seen that with other packages, but have no idea how to accomplish that.

Going with 2. might break it on some "legacy devices" still running rather old releases of Debian/Ubuntu.

Going with 3. would only be the very last resort, as it would rob the people covered by 1. of the chance to at least keep their Monitorix up to date.

I don't know (not yet investigated that), but there might be a 2a) where %post checks what version of Debian/Ubuntu is running and moves the files around when it detects one it knows to be working with systemd. That won't be bullet-proof, though, as the two of us surely don't know all the variations (e.g. not doing it on Devuan, but doing it on Debian). Setting up a proper detection could be a separate full project…

Here's what my current %post has:

%post
####
# Insert start script into autostart
####
set -e
if [ "$1" = "configure" ]; then
  if [ -e /etc/init.d/monitorix ]; then
    /usr/sbin/update-rc.d monitorix defaults 99
    /etc/init.d/monitorix start
  fi
  #if [ ! -e /var/www/monitorix ]; then
  #  if [ -d /var/www ]; then ln -s /usr/share/monitorix /var/www/monitorix; fi
  #fi
fi
mkdir -p /var/lib/monitorix/www/imgs
chown -R www-data:nogroup /var/lib/monitorix/www/imgs
chmod 0775 /var/lib/monitorix/www/imgs

Detection could go by /etc/os-release evaluating ID (eg. "linuxmint"), ID_LIKE ("ubuntu"), and VERSION_ID ("21.2") – plus optionally UBUNTU_CODENAME if ID_LIKE=ubuntu. The two of us could probably cover Debian, Ubuntu and Mint here. Skeleton:

if [ "$ID" = "ubuntu" -o "$ID_LIKE" = "ubuntu" ]; then
  if [ "$UBUNTU_CODENAME" = "focal" -o "$UBUNTU_CODENAME" = … ]; then
    # supports systemd, so use that
  else
    # still supports SYSTEMV, keep things as they are
  fi
fi

(or better do it the other way around: explicitly naming the older ones for SYSTEMV as that list should be "static" – though then systemd would be used for all unknowns as well). Where ID can be used, a numeric comparison would go easier: $VERSION_ID -lt 19 would avoid having all the codenames listed.
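
A minimal sketch of the /etc/os-release idea, with two caveats. First, VERSION_ID is usually not an integer (for example "22.04"), so a plain [ "$VERSION_ID" -lt 19 ] would fail with an "integer expression expected" style error; the sketch compares only the part before the first dot. Second, the function names os_major and older_than and the file parameter are made up for illustration, and the sketch assumes the file follows the usual os-release format that can be sourced by a POSIX shell.

```shell
#!/bin/sh
# os_major [FILE]  (hypothetical helper)
# Prints "<ID> <major version>" from an os-release style file.
os_major() {
    file="${1:-/etc/os-release}"
    (
        # Source in a subshell so the variables do not leak into the caller.
        ID="" VERSION_ID=""
        . "$file"
        # Strip everything from the first dot: "22.04" becomes "22".
        echo "${ID} ${VERSION_ID%%.*}"
    )
}

# older_than FILE N  (hypothetical helper)
# Succeeds if the major version in FILE is numerically below N.
# Fails when VERSION_ID is missing (as on some rolling or testing releases).
older_than() {
    major=$(os_major "$1" | cut -d' ' -f2)
    [ -n "$major" ] && [ "$major" -lt "$2" ]
}
```

With an os-release containing ID=ubuntu and VERSION_ID="22.04", os_major prints "ubuntu 22" and "older_than FILE 19" fails, so such a system would land in the systemd branch of the skeleton above.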

@mikaku
Owner

mikaku commented Feb 2, 2024

A 2b option would be to keep the current version of the .spec as it is, and create a new one specific to systemd systems. But that means that you'll have to build two .deb packages as well.

@IzzySoft

IzzySoft commented Feb 2, 2024

Which I'd rather avoid. Not everyone who's interested in Monitorix might be able to tell what it means (though one would think someone who configures it should know, some might be happy with the defaults already, or still be at "novice level"). And even if they can, some might accidentally pick the wrong one.

Addendum for 2a: as *.deb is quite specific to Debian and its derivatives, maybe it would be safe to switch to the systemd setup for (ID=linuxmint && VERSION_ID>19) || (ID=ubuntu && VERSION_ID>18) || (ID_LIKE=ubuntu && UBUNTU_CODENAME !regex(list_of_codenames)) || (ID=debian && VERSION_ID>9) || (ID_LIKE=debian && DEBIAN_CODENAME !regex(list_of_codenames))? Guess that should cover > 95%. ID=debian & ID_LIKE=debian do exist, so here we could exclude the codenames again. I'm just not sure what regex checking is allowed in the SPEC, though this might be moot…

Again, as that tends to become error-prone, I'd prefer to go with variant 1. For all others, Monitorix is available in the "official repos" IIRC? So I could maybe name a dependency that's in base but not matched in newer distribution releases, like glibc < x, to avoid "accidental installs" of my packages? Still, they wouldn't break things, people would just have to manually move 2 files to get it working. So maybe a final message in %post would suffice? echo is allowed in the SPEC.

@DSmidge
Author

DSmidge commented Feb 2, 2024

May I suggest not checking the version but just check if systemd is installed - for example if dir /var/run/systemd exists or "which systemctl" returns a result?
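
A small sketch of that check. It tests for /run/systemd/system, which is the directory documented in sd_booted(3) as the marker for a system booted with systemd; /var/run is normally a symlink to /run, so this is close to the /var/run/systemd suggestion above but slightly stricter. The function names and the optional root parameter are made up for illustration (the parameter only exists so the check can be tried against a scratch directory).

```shell
#!/bin/sh
# has_systemd [ROOT]  (hypothetical helper)
# Succeeds if ROOT (default: the real root) looks like a system booted
# with systemd, i.e. the directory /run/systemd/system exists.
has_systemd() {
    root="${1:-}"
    [ -d "${root}/run/systemd/system" ]
}

# init_flavour [ROOT]  (hypothetical helper)
# Prints "systemd" or "sysv" depending on the check above.
init_flavour() {
    if has_systemd "${1:-}"; then
        echo systemd
    else
        echo sysv
    fi
}
```

Checking the directory rather than "which systemctl" avoids a false positive on hosts where the systemd package is installed but another init system was used to boot.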

@IzzySoft

IzzySoft commented Feb 2, 2024

Heh, why so easy when there are many funny complicated approaches 🙈 Right you are!

if [ -d /var/run/systemd ]; then echo "SystemDemolized"; else echo "SystemPfeif"; fi
  • Mint 20/21: SystemDemolized
  • Buster/Bullseye: SystemDemolized
  • Good old Wheezy: SystemPfeif

@mikaku Is that fine with you? If that directory is found, use the unit file – else use the init file? So:

  • install both files to %doc
  • in %post copy the matching one to its place using that to start the service
  • in %preun use the proper one to stop the service
  • in %postun make sure to rm the copy

Should be clean enough, would you agree?
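
The four steps above could look roughly like the following, written as plain shell functions rather than actual %post/%postun scriptlets. This is only a sketch under assumptions: the file name monitorix.init for the init script in the doc directory is hypothetical, the root parameter exists only so the logic can be tried on a scratch tree, and starting or stopping the service (systemctl or the init script) is left out.

```shell
#!/bin/sh
# install_service DOCDIR [ROOT]  (hypothetical helper)
# Copies the matching service file from DOCDIR into place and prints
# which flavour was chosen. "monitorix.init" is an assumed file name.
install_service() {
    docdir="$1"
    root="${2:-}"
    if [ -d "${root}/run/systemd/system" ]; then
        mkdir -p "${root}/usr/lib/systemd/system"
        cp -p "${docdir}/monitorix.service" \
              "${root}/usr/lib/systemd/system/monitorix.service"
        echo systemd
    else
        mkdir -p "${root}/etc/init.d"
        cp -p "${docdir}/monitorix.init" "${root}/etc/init.d/monitorix"
        echo sysv
    fi
}

# remove_service [ROOT]  (hypothetical helper)
# Removes whichever copy install_service created (the %postun step).
remove_service() {
    root="${1:-}"
    rm -f "${root}/usr/lib/systemd/system/monitorix.service" \
          "${root}/etc/init.d/monitorix"
}
```

Removing both possible copies on uninstall keeps the cleanup independent of which init system was detected at install time.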

@mikaku
Owner

mikaku commented Feb 2, 2024

Should be clean enough, would you agree?

Sounds good. @DSmidge came up with a simple idea indeed. 👍
I'll take a look at it.

@IzzySoft

IzzySoft commented Feb 3, 2024

Just let me know when you've decided, and I'll try that with the next release (before pushing it out; maybe even giving you a copy of the DEB before deploying it, unless it's not needed because you implement the same 😉).

mikaku added a commit that referenced this issue Feb 23, 2024
@mikaku
Owner

mikaku commented Feb 23, 2024

@IzzySoft, I've made the agreed changes in docs/monitorix.spec, then built an RPM and installed it on a systemd-based Linux distro and then on a SysV init distro. Both installations were successful.

Feel free to re-generate your .deb file and let me know if you see any problem.
