compreface-api service in constant "crash loop". #1230

Open
lschapker opened this issue Feb 25, 2024 · 2 comments

lschapker commented Feb 25, 2024

Describe the bug

The "compreface-api" container crashes constantly, and "compreface-core" is in the "unhealthy" state. Attempting to log in at "http://:8000" shows a check mark for "admin", but "loading ..." for "API Node" and "Core Node".

To Reproduce

Steps to reproduce the behavior:

  1. Install the "stack" using "Portainer" (using the provided "docker-compose.yml" and ".env" files from version "v1.2.0").

Expected behavior
Not exactly sure, since I have never had it working. I presume that "compreface-api" should run without crashing and "compreface-core" should report being "healthy".

Screenshots
Not Applicable

Desktop (please complete the following information):

  • OS: proxmox-ve: 8.1.0 (running kernel: 6.5.11-8-pve) (in a Debian 12 LXC)
  • Browser: Not applicable
  • Version: Not Applicable

Logs

Run those commands and attach result to the ticket:

docker ps

root@face-recognition:/# docker ps
CONTAINER ID   IMAGE                                 COMMAND                  CREATED          STATUS                        PORTS                    NAMES
3833c38b8133   exadel/compreface-fe:1.2.0            "/docker-entrypoint.…"   32 minutes ago   Up 32 minutes                 0.0.0.0:8000->80/tcp     compreface-ui
738759dba2b7   exadel/compreface-admin:1.2.0         "sh -c 'java $ADMIN_…"   32 minutes ago   Up 32 minutes                                          compreface-admin
cc5e0099266b   exadel/compreface-api:1.2.0           "sh -c 'java $API_JA…"   48 minutes ago   Exited (137) 29 minutes ago                            compreface-api
f5e1ad12b222   exadel/compreface-core:1.2.0          "uwsgi --ini uwsgi.i…"   32 minutes ago   Up 32 minutes (unhealthy)     3000/tcp                 compreface-core
f15ddae29544   exadel/compreface-postgres-db:1.2.0   "docker-entrypoint.s…"   32 minutes ago   Up 32 minutes                 5432/tcp                 compreface-postgres-db
c92eb2dcfcda   jakowenko/double-take:latest          "/bin/bash ./entrypo…"   32 minutes ago   Up 32 minutes                 0.0.0.0:3000->3000/tcp   double-take
d25bb11b02cf   portainer/agent:2.19.4                "./agent"                3 hours ago      Up 3 hours                                             portainer_edge_agent
root@face-recognition:/#
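
For readers interpreting the "Exited (137)" status above: by the usual shell convention, an exit status above 128 means the process was terminated by signal number (status - 128), so 137 corresponds to signal 9 (SIGKILL). That is consistent with the "Killed" lines in the compreface-api log. A minimal sketch of that arithmetic follows; the helper name `describe_exit_status` is hypothetical and not part of Docker or CompreFace.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Translate a container exit status into a readable description.

    Assumes the common shell convention that a status above 128 means
    "terminated by signal (status - 128)".
    """
    if status > 128:
        signum = status - 128
        try:
            # Look up the symbolic name for this signal number.
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with code {status}"


print(describe_exit_status(137))
```

SIGKILL cannot be caught by the process, so the application itself logs nothing about it. Who sent the signal (for example the kernel's out-of-memory killer) is not visible in these logs and would need to be checked separately, for instance in the host's kernel log.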

docker-compose logs
(Log from compreface-api container)

...
Killed
Listening for transport dt_socket at address: 5005
  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::                (v2.5.13)
2024-02-25 07:49:12.839 INFO 6 --- [kground-preinit] o.h.validator.internal.util.Version : HV000001: Hibernate Validator 6.2.3.Final
2024-02-25 07:49:12.931 INFO 6 --- [ main] com.exadel.frs.TrainServiceApplication : Starting TrainServiceApplication v0.0.1-SNAPSHOT using Java 17.0.8 on cc5e0099266b with PID 6 (/home/app.jar started by root in /)
2024-02-25 07:49:12.933 INFO 6 --- [ main] com.exadel.frs.TrainServiceApplication : The following 1 profile is active: "dev"
2024-02-25 07:49:13.239 WARN 6 --- [ main] o.s.b.c.config.ConfigDataEnvironment : Property 'spring.profiles' imported from location 'class path resource [application.yml]' is invalid and should be replaced with 'spring.config.activate.on-profile' [origin: class path resource [application.yml] from app.jar - 97:13]
2024-02-25 07:50:05.302 INFO 6 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2024-02-25 07:50:56.251 INFO 6 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 45920 ms. Found 8 JPA repository interfaces.
2024-02-25 07:50:57.458 INFO 6 --- [ main] o.s.cloud.context.scope.GenericScope : BeanFactory id=355f53b0-025e-31e7-98df-696283bbc190
2024-02-25 07:50:58.362 INFO 6 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'cacheConfig' of type [com.exadel.frs.core.trainservice.config.CacheConfig$$EnhancerBySpringCGLIB$$5989377d] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2024-02-25 07:50:59.307 INFO 6 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat initialized with port(s): 8080 (http)
2024-02-25 07:50:59.350 INFO 6 --- [ main] o.a.coyote.http11.Http11NioProtocol : Initializing ProtocolHandler ["http-nio-8080"]
2024-02-25 07:50:59.350 INFO 6 --- [ main] o.apache.catalina.core.StandardService : Starting service [Tomcat]
2024-02-25 07:50:59.351 INFO 6 --- [ main] org.apache.catalina.core.StandardEngine : Starting Servlet engine: [Apache Tomcat/9.0.62]
2024-02-25 07:50:59.579 INFO 6 --- [ main] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring embedded WebApplicationContext
2024-02-25 07:50:59.580 INFO 6 --- [ main] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 106339 ms
2024-02-25 07:51:02.949 INFO 6 --- [ main] o.hibernate.jpa.internal.util.LogHelper : HHH000204: Processing PersistenceUnitInfo [name: default]
2024-02-25 07:51:03.090 INFO 6 --- [ main] org.hibernate.Version : HHH000412: Hibernate ORM core version 5.4.33
2024-02-25 07:51:03.092 INFO 6 --- [ main] org.hibernate.cfg.Environment : HHH000205: Loaded properties from resource hibernate.properties: {hibernate.bytecode.use_reflection_optimizer=false, hibernate.types.print.banner=false}
2024-02-25 07:51:03.290 INFO 6 --- [ main] o.hibernate.annotations.common.Version : HCANN000001: Hibernate Commons Annotations {5.1.2.Final}
2024-02-25 07:51:04.744 INFO 6 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Starting...
2024-02-25 07:51:05.008 INFO 6 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Start completed.
2024-02-25 07:51:05.064 INFO 6 --- [ main] org.hibernate.dialect.Dialect : HHH000400: Using dialect: org.hibernate.dialect.PostgreSQL10Dialect
2024-02-25 07:51:19.771 INFO 6 --- [ main] o.h.e.t.j.p.i.JtaPlatformInitiator : HHH000490: Using JtaPlatform implementation: [org.hibernate.engine.transaction.jta.platform.internal.NoJtaPlatform]
2024-02-25 07:51:19.816 INFO 6 --- [ main] j.LocalContainerEntityManagerFactoryBean : Initialized JPA EntityManagerFactory for persistence unit 'default'
Killed

(Log from compreface-core)

...
Found: BoundingBoxDTO(x_min=49, y_min=47, x_max=199, y_max=224, probability=0.9400066137313843, _np_landmarks=array([[104, 124],
[153, 124],
[131, 157],
[106, 177],
[146, 177]])) | severity=DEBUG request={"method":"GET","path":"/healthcheck","filename":"","api_key":"","remote_addr":"127.0.0.1"} logger=src.services.facescan.plugins.facenet.facenet module=facenet traceback=null build_version=dev
subprocess 1862 exited with code 52
DAMN ! worker 1 (pid: 3855) died, killed by signal 9 :( trying respawn ...
Respawned uWSGI worker 1 (new pid: 3925)
2024-02-25 08:08:24.553180: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
Found: BoundingBoxDTO(x_min=49, y_min=47, x_max=199, y_max=224, probability=0.9400066137313843, _np_landmarks=array([[104, 124],
[153, 124],
[131, 157],
[106, 177],
[146, 177]])) | severity=DEBUG request={"method":"GET","path":"/healthcheck","filename":"","api_key":"","remote_addr":"127.0.0.1"} logger=src.services.facescan.plugins.facenet.facenet module=facenet traceback=null build_version=dev
subprocess 1872 exited with code 52
DAMN ! worker 2 (pid: 3905) died, killed by signal 9 :( trying respawn ...
Respawned uWSGI worker 2 (new pid: 3972)
...
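
Both logs end with processes being killed by signal 9. One common cause is the environment running short of memory, but that is an assumption here and not something these logs confirm. A rough sketch for checking how much memory the LXC reports, by reading the standard Linux `/proc/meminfo` file; the function name `parse_meminfo` is illustrative only.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only well-formed "Name: <number> [unit]" lines.
        if sep and parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


if __name__ == "__main__":
    path = Path("/proc/meminfo")
    if path.exists():
        info = parse_meminfo(path.read_text())
        for field in ("MemTotal", "MemAvailable", "SwapTotal"):
            if field in info:
                print(f"{field}: {info[field] / 1024:.0f} MiB")
```

Running this inside the LXC while the stack starts gives a first impression of whether memory is tight; it does not prove that memory pressure caused the kills.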

Additional context

I have fully uninstalled (including deleting the volumes) and reinstalled multiple times without success. I don't know what the issue is.
I have verified that my CPU has "AVX" capabilities (AMD EPYC Rome), and I do not have any GPU to use or configure.
The "troubleshooting" section of "https://github.com/exadel-inc/CompreFace/blob/master/docs/Installation-options.md" isn't much help.
Please advise.
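
The AVX check mentioned above can be repeated from inside the LXC by reading `/proc/cpuinfo`. A minimal sketch, assuming an x86 Linux system where the feature list appears on a `flags` line; the function name `cpu_flags` is illustrative only.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the CPU feature flags from /proc/cpuinfo-style text.

    Reads the first "flags" line, which on x86 Linux lists the
    features of one logical CPU.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key.strip() == "flags":
            return set(rest.split())
    return set()


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        flags = cpu_flags(path.read_text())
        for feature in ("avx", "avx2"):
            status = "present" if feature in flags else "missing"
            print(f"{feature}: {status}")
```

Running the same check inside a virtual machine shows the flags exposed by the guest CPU model, which may differ from those of the physical CPU.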

@lschapker
Author

Please note that I have changed the "restart policy" from "always" to "none" so that I could determine whether the container was actually exiting.

theuzzy commented May 3, 2024

Hi,

Have you fixed this problem? I had the same problem and fixed it by changing the CPU type from "kvm" to "host" in Proxmox. I don't know whether this works with LXC containers.

Regards
