Pixeltable Corrupts the data when the Server restarts or there is a power outage. #51

Open
mohsin-ashraf opened this issue Dec 13, 2023 · 0 comments

mohsin-ashraf commented Dec 13, 2023

Hi, team. I have a Linux server with the following specs.

System:
  Kernel: 6.2.0-37-generic x86_64 bits: 64 compiler: N/A Console: pty pts/5
    Distro: Ubuntu 22.04.3 LTS (Jammy Jellyfish)
Machine:
  Type: Desktop System: ASUS product: N/A v: N/A serial: <superuser required>
  Mobo: ASUSTeK model: PRIME Z690-P WIFI v: Rev 1.xx serial: <superuser required>
    UEFI: American Megatrends v: 1620 date: 08/12/2022
CPU:
  Info: 12-core (8-mt/4-st) model: 12th Gen Intel Core i7-12700K bits: 64 type: MST AMCP
    arch: Alder Lake rev: 2 cache: L1: 1024 KiB L2: 12 MiB L3: 25 MiB
  Speed (MHz): avg: 2232 high: 4891 min/max: 800/4900:5000:3800 cores: 1: 800 2: 1584 3: 1712
    4: 2064 5: 801 6: 3600 7: 4024 8: 1463 9: 800 10: 3600 11: 4881 12: 2671 13: 1045 14: 1434
    15: 4891 16: 3501 17: 804 18: 1423 19: 833 20: 2713 bogomips: 144383
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Graphics:
  Device-1: NVIDIA vendor: ZOTAC driver: nvidia v: 535.129.03 bus-ID: 01:00.0
  Display: server: X.org v: 1.21.1.4 with: Xwayland v: 22.1.1 driver: X: loaded: nvidia
    unloaded: fbdev,modesetting,nouveau,vesa gpu: Nvidia tty: 245x74
  Message: GL data unavailable in console. Try -G --display
Drives:
  Local Storage: total: 3.87 TiB used: 2.11 TiB (54.5%)
  ID-1: /dev/nvme0n1 vendor: Western Digital model: WD PC SN740 SDDPNQD-256G-1006
    size: 238.47 GiB temp: 42.9 C
  ID-2: /dev/sda vendor: Seagate model: ST4000NM0053 size: 3.64 TiB
Partition:
  ID-1: / size: 165.2 GiB used: 83.82 GiB (50.7%) fs: ext4 dev: /dev/nvme0n1p4
  ID-2: /boot/efi size: 511 MiB used: 6.1 MiB (1.2%) fs: vfat dev: /dev/nvme0n1p1
Swap:
  ID-1: swap-1 type: file size: 2 GiB used: 78.5 MiB (3.8%) file: /swapfile
Info:
  Processes: 473 Uptime: 15h 54m Memory: 31.08 GiB used: 10.21 GiB (32.9%) Init: systemd
  runlevel: 5 Compilers: gcc: 11.4.0 Packages: 2288 Shell: Bash v: 5.1.16 inxi: 3.3.13

I am running Pixeltable, installed with pip install git+https://github.com/mkornacker/pixeltable. It runs fine for some time, but after a server restart or power outage the SQL data it stores internally ends up corrupted, and it starts giving the following error.

2023-11-20 12:05:22,936 INFO env env.py:172: found database postgresql://postgres:*****@localhost:6543/pixeltable
Traceback (most recent call last):
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1969, in _exec_single_context
    self.dialect.do_execute(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 922, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UndefinedTable: relation "systeminfo" does not exist
LINE 2: FROM systeminfo
             ^


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/media/user/data/pixel_datastore/ObvioEngine/pixel_table/pt_sync_db_down.py", line 147, in <module>
    cl = pt.Client()
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/pixeltable/client.py", line 34, in __init__
    Env.get().set_up()
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/pixeltable/env.py", line 175, in set_up
    metadata.upgrade_md(self._sa_engine)
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/pixeltable/metadata/__init__.py", line 40, in upgrade_md
    system_info = session.query(SystemInfo).one().md
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/orm/query.py", line 2798, in one
    return self._iter().one()  # type: ignore
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/orm/query.py", line 2847, in _iter
    result: Union[ScalarResult[_T], Result[_T]] = self.session.execute(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 2306, in execute
    return self._execute_internal(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 2188, in _execute_internal
    result: Result[Any] = compile_state_cls.orm_execute_statement(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/orm/context.py", line 293, in orm_execute_statement
    result = conn.execute(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1416, in execute
    return meth(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 516, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1639, in _execute_clauseelement
    ret = self._execute_context(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1848, in _execute_context
    return self._exec_single_context(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1988, in _exec_single_context
    self._handle_dbapi_exception(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2343, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1969, in _exec_single_context
    self.dialect.do_execute(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 922, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) relation "systeminfo" does not exist
LINE 2: FROM systeminfo

This issue has been observed many times, and every time it occurred there had been a power outage or server restart. However, it does not always happen: sometimes there is a power outage or server restart and Pixeltable keeps working fine.

The logs that were collected for this error are given below.

Inserting rows into table: 29rows [00:00, 4745.06rows/s]
inserted 29 rows with 0 errors 
Inserting rows into table: 30rows [00:00, 9858.13rows/s]
inserted 30 rows with 0 errors 
Annotations are pushed to db
Activated conda environment: obvio_engine
Current working directory /media/user/data/pixel_datastore/ObvioEngine/pixel_table
Running pt_sync_db_down.py
/media/user/data/repos/logs/ls_utils.py_20231120_083001.log
/media/user/data/repos/logs/pt_utils_20231120_083002.log
/media/user/data/repos/logs/pt_sync_db_down_20231120_083002.log
2023-11-20 08:30:02,759 INFO env env.py:184: found store container
2023-11-20 08:30:02,759 INFO env env.py:206: connecting to NOS
2023-11-20 08:30:02.780 | INFO     | nos.server:init:131 - Inference server already running (name=nos-inference-service-gpu, image=<Image: 'autonomi/nos:0.0.9-gpu'>, id=5c4fcc4d0f35).
2023-11-20 08:30:02,780 INFO env env.py:209: waiting for NOS
2023-11-20 08:30:02,794 INFO env env.py:172: found database postgresql://postgres:*****@localhost:6543/pixeltable
Inserting rows into table: 4rows [00:00, 1039.22rows/s]
inserted 4 rows with 0 errors 
Inserting rows into table: 51rows [00:00, 4665.22rows/s]
inserted 51 rows with 0 errors 
Inserting rows into table: 55rows [00:00, 9299.26rows/s]
inserted 55 rows with 0 errors 
Annotations are pushed to db
Activated conda environment: obvio_engine
Current working directory /media/user/data/pixel_datastore/ObvioEngine/pixel_table
Running pt_sync_db_down.py
/media/user/data/repos/logs/ls_utils.py_20231120_085002.log
/media/user/data/repos/logs/pt_utils_20231120_085002.log
/media/user/data/repos/logs/pt_sync_db_down_20231120_085002.log
2023-11-20 08:50:02,927 INFO env env.py:184: found store container
2023-11-20 08:50:02,927 INFO env env.py:206: connecting to NOS
2023-11-20 08:50:02.952 | INFO     | nos.server:init:131 - Inference server already running (name=nos-inference-service-gpu, image=<Image: 'autonomi/nos:0.0.9-gpu'>, id=5c4fcc4d0f35).
2023-11-20 08:50:02,952 INFO env env.py:209: waiting for NOS
2023-11-20 08:50:02,967 INFO env env.py:172: found database postgresql://postgres:*****@localhost:6543/pixeltable
Inserting rows into table: 45rows [00:00, 4731.96rows/s]
inserted 45 rows with 0 errors 
Inserting rows into table: 45rows [00:00, 11473.78rows/s]
inserted 45 rows with 0 errors 
Annotations are pushed to db
Activated conda environment: obvio_engine
Current working directory /media/user/data/pixel_datastore/ObvioEngine/pixel_table
Running pt_sync_db_down.py
/media/user/data/repos/logs/ls_utils.py_20231120_120501.log
/media/user/data/repos/logs/pt_utils_20231120_120509.log
/media/user/data/repos/logs/pt_sync_db_down_20231120_120509.log
2023-11-20 12:05:09,616 INFO env env.py:186: starting store container
2023-11-20 12:05:10,068 INFO env env.py:236: waiting for store container to start...
2023-11-20 12:05:11,120 INFO env env.py:225: connected to postgresql://postgres:pgpassword@localhost:6543
2023-11-20 12:05:11,133 INFO env env.py:225: connected to postgresql://postgres:pgpassword@localhost:6543
2023-11-20 12:05:11,133 INFO env env.py:206: connecting to NOS
2023-11-20 12:05:12.140 | INFO     | nos.server:_pull_image:219 - Found up-to-date server image: autonomi/nos:0.0.9-gpu
2023-11-20 12:05:12.140 | INFO     | nos.server:init:153 - Starting inference service: [name=nos-inference-service-gpu, runtime=InferenceServiceRuntime(image=autonomi/nos:0.0.9-gpu, name=nos-inference-service-gpu, gpu=True)]
2023-11-20 12:05:12.177 | WARNING  | nos.server._docker:start:99 - Container with same name already exists, removing it (name=nos-inference-service-gpu).
2023-11-20 12:05:12.673 | INFO     | nos.server:init:180 - Inference service started: [name=nos-inference-service-gpu, runtime=InferenceServiceRuntime(image=autonomi/nos:0.0.9-gpu, name=nos-inference-service-gpu, gpu=True), image=<Image: 'autonomi/nos:0.0.9-gpu'>, id=0d3230d78a88]
2023-11-20 12:05:12,673 INFO env env.py:209: waiting for NOS
2023-11-20 12:05:12.886 | WARNING  | nos.client.grpc:WaitForServer:150 - Waiting for server to start... (elapsed=0s)
2023-11-20 12:05:17.892 | WARNING  | nos.client.grpc:WaitForServer:150 - Waiting for server to start... (elapsed=5s)
2023-11-20 12:05:22,936 INFO env env.py:172: found database postgresql://postgres:*****@localhost:6543/pixeltable
Traceback (most recent call last):
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1969, in _exec_single_context
    self.dialect.do_execute(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 922, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UndefinedTable: relation "systeminfo" does not exist
LINE 2: FROM systeminfo
             ^


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/media/user/data/pixel_datastore/ObvioEngine/pixel_table/pt_sync_db_down.py", line 147, in <module>
    cl = pt.Client()
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/pixeltable/client.py", line 34, in __init__
    Env.get().set_up()
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/pixeltable/env.py", line 175, in set_up
    metadata.upgrade_md(self._sa_engine)
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/pixeltable/metadata/__init__.py", line 40, in upgrade_md
    system_info = session.query(SystemInfo).one().md
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/orm/query.py", line 2798, in one
    return self._iter().one()  # type: ignore
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/orm/query.py", line 2847, in _iter
    result: Union[ScalarResult[_T], Result[_T]] = self.session.execute(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 2306, in execute
    return self._execute_internal(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 2188, in _execute_internal
    result: Result[Any] = compile_state_cls.orm_execute_statement(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/orm/context.py", line 293, in orm_execute_statement
    result = conn.execute(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1416, in execute
    return meth(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 516, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1639, in _execute_clauseelement
    ret = self._execute_context(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1848, in _execute_context
    return self._exec_single_context(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1988, in _exec_single_context
    self._handle_dbapi_exception(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2343, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1969, in _exec_single_context
    self.dialect.do_execute(
  File "/media/user/data/environments/obvio_engine/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 922, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) relation "systeminfo" does not exist
LINE 2: FROM systeminfo
             ^

I think removing the existing running container might be causing it:

2023-11-20 12:05:12.177 | WARNING | nos.server._docker:start:99 - Container with same name already exists, removing it (name=nos-inference-service-gpu).

This log line does not appear in any of the runs that completed successfully.
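To check this hypothesis against more than the three runs pasted above, the combined log can be split into runs and each run flagged for the messages of interest. The following is only a rough sketch: the marker strings are copied from the logs in this issue, while the function names (split_runs, classify_run, summarize) are made up for illustration and are not part of Pixeltable or NOS. It also flags "starting store container", since in the failing run the store container was started fresh rather than found, which may matter just as much as the NOS container removal.

```python
# Sketch only: helper names are hypothetical; marker strings are taken
# verbatim from the log output pasted in this issue.

RUN_START = "Running pt_sync_db_down.py"

MARKERS = {
    # store (Postgres) container had to be started instead of being found
    "store_restarted": "starting store container",
    # NOS removed an existing container with the same name
    "nos_container_removed": "Container with same name already exists, removing it",
    # the failure reported in this issue
    "systeminfo_missing": 'relation "systeminfo" does not exist',
}


def split_runs(log_text):
    """Split a combined log into runs; each run begins at a RUN_START line."""
    runs = []
    current = None
    for line in log_text.splitlines():
        if RUN_START in line:
            current = []
            runs.append(current)
        elif current is not None:
            current.append(line)
    return runs


def classify_run(lines):
    """Return a dict saying which marker messages occur in one run."""
    text = "\n".join(lines)
    return {name: needle in text for name, needle in MARKERS.items()}


def summarize(log_text):
    """Classify every run found in the combined log text."""
    return [classify_run(run) for run in split_runs(log_text)]
```

Calling summarize() on the text of the combined log returns one dict per run. If every run with systeminfo_missing set also has nos_container_removed or store_restarted set, and no successful run does, that would support the suspicion, though it would not by itself show which of the two container events is responsible.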

I tried looking for the logs that Pixeltable itself generates but could not find them.
