tJobInstanceLiveCheck problems to get PIDs on Debian 11 #33

Open
Jens2205 opened this issue Nov 15, 2023 · 20 comments
@Jens2205

Hi Jan,

This issue pertains to calling PIDs via "ps -eo pid" in the tJobInstanceLiveCheck component. We've discovered that the live-check isn't functioning properly after upgrading Debian from 9 to 11 and Java from 8 to 11.

It seems to us that Debian 11 doesn't always respond with a complete set of PIDs using this command.

We would suggest that the component use "pgrep ." to retrieve this list. We hope this will resolve this behavior.

Thanks in advance!
Jens

@jlolling
Owner

jlolling commented Nov 15, 2023

Thanks a lot for your suggestion. I guess it would be helpful to make the commands for Windows and Linux configurable, so that such changes are possible without a new component version.
I have actually also seen some strange behaviour and was wondering what the cause is. Now I have an idea.

@jlolling
Owner

pgrep does not seem to be a solution, because it never shows the process list of the whole system.

@Jens2205
Author

Hi Jan,
we may have found another solution: "ls -d /proc/[0-9]* | sed 's/\/proc\///g' | sort -n"

Making this command configurable is a good idea. But it would be fine to add one or two default values.

Thanks for your help!
Susanne and Jens
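
For readers who want the same information without an external command, the following is a minimal sketch of what the proposed `ls`/`sed`/`sort` pipeline computes: the numerically named entries of a directory, sorted as numbers. The class `ProcDirSketch` and the method `listNumericEntries` are hypothetical names chosen for illustration; nothing in this thread says the component works this way.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the component.
final class ProcDirSketch {

    // Returns the entries of dir whose names consist only of digits, sorted numerically.
    // On Linux, calling this with /proc yields the PIDs visible to the current user.
    static List<Long> listNumericEntries(Path dir) throws IOException {
        List<Long> pids = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                // Limit to 18 digits so that Long.parseLong cannot overflow.
                if (name.matches("[0-9]{1,18}")) {
                    pids.add(Long.parseLong(name));
                }
            }
        }
        Collections.sort(pids);
        return pids;
    }

    public static void main(String[] args) throws IOException {
        // Assumes a Linux system with a mounted /proc file system.
        System.out.println(listNumericEntries(Paths.get("/proc")));
    }
}
```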

@jlolling
Owner

jlolling commented Nov 16, 2023

The live check component can now be configured with alternative commands to get the PID list and with alternative regexes to extract the PIDs from the command output. Please try it out.
The component now also ships libs that are compatible with Talend 8, which means less work to get the component running.
Please test release 8.8.
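
The combination of a configurable command and a configurable regex can be pictured roughly as follows. This is a sketch under assumptions, not the component's actual code: `PidListSketch`, `run` and `extractPids` are hypothetical names. The command `ps -eo pid` comes from the first comment and the pattern `[0-9]{1,8}` from the error log further down in this thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the component's implementation.
final class PidListSketch {

    // Collects every match of pidRegex in the raw command output.
    // The regex is expected to match digits only; otherwise Long.parseLong throws.
    static List<Long> extractPids(String commandOutput, String pidRegex) {
        List<Long> pids = new ArrayList<>();
        Matcher matcher = Pattern.compile(pidRegex).matcher(commandOutput);
        while (matcher.find()) {
            pids.add(Long.parseLong(matcher.group()));
        }
        return pids;
    }

    // Starts the command directly (no shell involved) and returns stdout and stderr combined.
    static String run(String... command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).redirectErrorStream(true).start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        process.waitFor();
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        String output = run("ps", "-eo", "pid");
        System.out.println(extractPids(output, "[0-9]{1,8}").size() + " PIDs found");
    }
}
```

An empty result from `extractPids` points either at a command that printed nothing or at a regex that does not fit the output, so it helps to log the raw output alongside the pattern.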

@Jens2205
Author

We will test it asap :-)
Thanks!

@Jens2205
Author

Hi,
it looks like something is missing:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/logging/log4j/LogManager
at de.cimt.talendcomp.jobinstance.manage.JobInstanceHelper.(JobInstanceHelper.java:50)

@jlolling
Owner

This means the expected Log4j v2 libs are not in place in the job. Log4j v2 should actually be the default for projects.
That's why the component does not ship with the Log4j lib.
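
One way to confirm whether the Log4j v2 API is visible to a job is a small classpath probe like the sketch below. The class `Log4jClasspathCheck` and the method `isOnClasspath` are hypothetical names for illustration. The class name `org.apache.logging.log4j.LogManager` is taken from the stack trace above.

```java
// Hypothetical diagnostic, not part of the component.
final class Log4jClasspathCheck {

    // Returns true if the named class can be located by this class's class loader.
    // The class is not initialized, so the probe has no side effects.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, Log4jClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String logManager = "org.apache.logging.log4j.LogManager";
        System.out.println(logManager + " available: " + isOnClasspath(logManager));
    }
}
```

Running it with the same classpath as the job shows whether the Log4j v2 API jar still has to be added.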

@Jens2205
Author

This was a good point, but now we get a new error:
Exception in thread "main" java.lang.IllegalAccessError: class de.cimt.talendcomp.jobinstance.process.ProcessHelper (in unnamed module @0x302c971f) cannot access class jdk.internal.org.jline.utils.Log (in module jdk.internal.le) because module jdk.internal.le does not export jdk.internal.org.jline.utils to unnamed module @0x302c971f
at de.cimt.talendcomp.jobinstance.process.ProcessHelper.retrieveProcessListForUnix(ProcessHelper.java:78)
at de.cimt.talendcomp.jobinstance.process.ProcessHelper.retrieveProcessList(ProcessHelper.java:61)
at de.cimt.talendcomp.jobinstance.manage.JobInstanceHelper.cleanupBrokenJobInstances(JobInstanceHelper.java:1591)

@jlolling
Owner

jlolling commented Nov 16, 2023

OK, I have tested it on my Mac. I will try to run it under Linux.
Which Java version are you running?
This error is actually not related to my changes. I guess it is caused by a change in the environment.

@jlolling
Owner

jlolling commented Nov 16, 2023

The good news is, I can reproduce the error now. A very stupid error! I will fix that immediately!

@jlolling jlolling added the bug label Nov 16, 2023
@jlolling
Owner

Sorry for the confusion. Version 8.9 works now. Please check again.

@Jens2205
Author

Sorry Jan,
we have two further things:

  1. 8.9 is not shown as a new component version in TOS (but we found it correctly set in the XML of the component).
  2. We receive the following error when trying each of the following commands (we also checked the user permissions):
  • "ls -d /proc/[0-9]* | sed 's/\/proc\///g' | sort -n"
  • "ls -d /proc/[0-9]*"
  • "ls -d /proc/*"

The command response always seems to be empty, and then the matching does not find anything.

`[ERROR] 10:12:22 de.cimt.talendcomp.jobinstance.process.ProcessHelper- No pids could be extracted by unix command: 'ls -d /proc/*' using pattern: '[0-9]{1,8}' response:

[FATAL] 10:12:22 beat17.management_check_broken_instances_3_6.management_check_broken_instances- tJobInstanceLiveCheck_1 No running OS processes detected, this is not a valid state, abort check! Detected OS: Unix
java.lang.Exception: No running OS processes detected, this is not a valid state, abort check! Detected OS: Unix
at de.cimt.talendcomp.jobinstance.manage.JobInstanceHelper.cleanupBrokenJobInstances(JobInstanceHelper.java:1594) ~[cimt-talendcomp-jobinstance-8.9.jar:?]
at beat17.management_check_broken_instances_3_6.management_check_broken_instances.tInfiniteLoop_1Process(management_check_broken_instances.java:2670) [management_check_broken_instances_3_6.jar:?]
at beat17.management_check_broken_instances_3_6.management_check_broken_instances.tPostgresqlConnectionPool_1Process(management_check_broken_instances.java:1768) [management_check_broken_instances_3_6.jar:?]
at beat17.management_check_broken_instances_3_6.management_check_broken_instances.runJobInTOS(management_check_broken_instances.java:3992) [management_check_broken_instances_3_6.jar:?]
at beat17.management_check_broken_instances_3_6.management_check_broken_instances.main(management_check_broken_instances.java:3685) [management_check_broken_instances_3_6.jar:?]
Exception in component tJobInstanceLiveCheck_1 (management_check_broken_instances)
java.lang.Exception: No running OS processes detected, this is not a valid state, abort check! Detected OS: Unix
at de.cimt.talendcomp.jobinstance.manage.JobInstanceHelper.cleanupBrokenJobInstances(JobInstanceHelper.java:1594)
at beat17.management_check_broken_instances_3_6.management_check_broken_instances.tInfiniteLoop_1Process(management_check_broken_instances.java:2670)
at beat17.management_check_broken_instances_3_6.management_check_broken_instances.tPostgresqlConnectionPool_1Process(management_check_broken_instances.java:1768)
at beat17.management_check_broken_instances_3_6.management_check_broken_instances.runJobInTOS(management_check_broken_instances.java:3992)
at beat17.management_check_broken_instances_3_6.management_check_broken_instances.main(management_check_broken_instances.java:3685)`

Thanks for your help!
Best regards
Susanne and Jens

@jlolling
Owner

jlolling commented Nov 18, 2023

I will install Debian 11 and check it. On my Mac it works. Because the response part is empty, I can see that the command itself does not return a list.

@Jens2205
Author

Hi Jan, is there something new?
Thanks in advance!

@jlolling
Owner

I am currently installing Ubuntu 22 and will check today what went wrong.

@jlolling
Owner

jlolling commented Dec 4, 2023

The problem with the ls command is that it is not a real program like ps but a command of the shell. That's why the process builder of the JVM cannot run it.

@jlolling
Owner

jlolling commented Dec 4, 2023

The component cannot handle piped commands yet.
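
As background: `java.lang.ProcessBuilder` starts a program directly and does not involve a shell, so wildcards such as `/proc/[0-9]*` are not expanded and `|` is passed on as a literal argument. A common workaround is to hand the whole command line to `/bin/sh -c`. The sketch below shows that idea; `ShellCommandSketch`, `wrapInShell` and `run` are hypothetical names, and nothing here claims that later releases of the component do it this way. The `sed` expression uses `|` as its delimiter, which is a variant of the command from this thread that needs no backslashes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not the component's implementation.
final class ShellCommandSketch {

    // Wraps a command line so that a POSIX shell interprets globs and pipes.
    static List<String> wrapInShell(String commandLine) {
        return Arrays.asList("/bin/sh", "-c", commandLine);
    }

    // Runs the command and returns stdout and stderr combined, one line per '\n'.
    static String run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).redirectErrorStream(true).start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        process.waitFor();
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumes a Linux system with /bin/sh and a mounted /proc file system.
        String pipeline = "ls -d /proc/[0-9]* | sed 's|/proc/||' | sort -n";
        System.out.print(run(wrapInShell(pipeline)));
    }
}
```

Passing the command line as one string to a shell means it must come from trusted configuration only, since the shell will execute whatever it contains.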

@jlolling
Owner

I cannot reproduce this issue. It is actually also a problem that fixes itself: only jobs which have not ended are treated by this component. If a job is not actually dead and the component closes its entry, the job will perform an update at its end anyway and overwrite the false "died" update.
I suggest we set up a Teams meeting to discuss this topic.
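
The check described here can be pictured with the following sketch. It is an illustration under assumptions, not the component's code: `LiveCheckSketch`, `findPresumedDead` and the map from job instance id to PID are hypothetical. The refusal to continue on an empty PID list mirrors the "No running OS processes detected" message in the log above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the live check, for illustration only.
final class LiveCheckSketch {

    // openInstancePids maps the id of a job instance that has not ended to its OS PID.
    // Returns the ids of those instances whose PID is missing from runningPids.
    static List<Long> findPresumedDead(Map<Long, Long> openInstancePids, Set<Long> runningPids) {
        if (runningPids.isEmpty()) {
            // An empty process list cannot be right, so no instance is marked as dead.
            throw new IllegalStateException("No running OS processes detected, abort check");
        }
        List<Long> presumedDead = new ArrayList<>();
        for (Map.Entry<Long, Long> entry : openInstancePids.entrySet()) {
            if (!runningPids.contains(entry.getValue())) {
                presumedDead.add(entry.getKey());
            }
        }
        return presumedDead;
    }
}
```

With an incomplete PID list, a live job would land in the returned list by mistake; according to the comment above, its own final update would later overwrite that false entry.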

@jlolling
Owner

Check out the latest release, 8.10.

@jlolling
Owner

I have found another bug. Please use version 8.11
