Rework compression activity wal markers #6920
efd6c57
to
3f7717a
Compare
src/guc.c
Outdated
@@ -477,10 +477,9 @@ _guc_init(void)
NULL);

DefineCustomBoolVariable(MAKE_EXTOPTION("enable_decompression_logrep_markers"),
At the moment, decompression markers sound specific to decompression activity. With this PR, the markers will be available for recompression as well. So, I think we should generalize the GUC to be compression markers (representing both decompression and recompression).
- DefineCustomBoolVariable(MAKE_EXTOPTION("enable_decompression_logrep_markers"),
+ DefineCustomBoolVariable(MAKE_EXTOPTION("enable_compression_logrep_markers"),
I had considered that, but it would break compatibility with applications which are already using this GUC. The way I justified retaining the name is to imagine that there are brackets around the "de" in decompression, like this: enable_(de)compression_logrep_markers.
I renamed this to enable_compression_wal_markers.
If I understand the PR correctly, your patch wouldn't send the catalog updates for the chunk tables themselves, would it? That would actually break functionality in my tool, since I completely replicate the catalog in memory and need that information. I'd probably define a second set of markers with slightly different prefix names. That way, everybody would be able to decide whether to ignore them or act on them. Alternatively, we could use the message body, for example by adding the table names, but I think the second prefix is easier and more obvious. Changing the GUC name is ok, I can let people know in the docs :)
This PR doesn't change which catalog updates are emitted. One thing that has changed is where exactly the "end decompression marker" is in the WAL. It now surrounds both inserts into the uncompressed chunk, and the associated catalog changes. To visualise this: Previously the WAL would look like this for a "transparent decompression" event (e.g. UPDATE to compressed data):
Now, it looks like this:
Note that the IMHO this is "more correct", as the decompression markers now surround all decompression activities. Your tool can decide which kinds of activities it does or does not ignore in the context of those markers.
3f7717a
to
c972677
Compare
Ah I see. Yeah that is totally fine, since I can still act on all catalog entries, while ignoring anything outside the catalog 👍
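To make the consumer-side behaviour discussed above concrete, here is a minimal sketch in plain C of a filter that applies everything outside the markers, and only catalog changes inside them. It is an illustration under stated assumptions, not code from this PR: the marker prefixes, the `WalEvent` struct and `consumer_should_apply` are all hypothetical names, and the real prefixes are not quoted in this thread.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical marker prefixes (assumption): placeholders for illustration only. */
#define MARKER_START "example-compression-start"
#define MARKER_END "example-compression-end"

typedef enum
{
	EV_MESSAGE, /* a logical replication message (e.g. a marker) */
	EV_CHANGE	/* a row change decoded from the WAL */
} EventKind;

/* Hypothetical, simplified view of one decoded WAL event. */
typedef struct
{
	EventKind kind;
	const char *prefix; /* only used for EV_MESSAGE */
	bool is_catalog;	/* only used for EV_CHANGE: change targets a catalog table */
} WalEvent;

typedef struct
{
	bool in_marker; /* true between a start marker and its end marker */
} ConsumerState;

/* Returns true if the consumer should apply the event. */
static bool
consumer_should_apply(ConsumerState *state, const WalEvent *ev)
{
	if (ev->kind == EV_MESSAGE)
	{
		if (strcmp(ev->prefix, MARKER_START) == 0)
			state->in_marker = true;
		else if (strcmp(ev->prefix, MARKER_END) == 0)
			state->in_marker = false;
		return false; /* messages only update state, they carry no row data */
	}

	/* Outside the markers: user-driven DML, always apply. */
	if (!state->in_marker)
		return true;

	/* Inside the markers: keep catalog changes, skip compression-driven rows. */
	return ev->is_catalog;
}
```

A tool that wants different behaviour (for example, ignoring everything inside the markers) would only need to change the last return statement.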
Great stuff, makes the replication markers more consistent across the whole codebase.
9e2dd49
to
d5c0eed
Compare
tsl/src/compression/api.c
Outdated
@@ -728,6 +734,7 @@ tsl_compress_chunk_wrapper(Chunk *chunk, bool if_not_compressed, bool recompress
ereport((if_not_compressed ? NOTICE : ERROR),
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("chunk \"%s\" is already compressed", get_rel_name(chunk->table_id))));
write_logical_replication_msg_compression_end();
This will not be triggered in the error path. I guess it doesn't matter, but if you want it in the error path, you should place it above the ereport.
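For readers less familiar with why the call is skipped: ereport at ERROR level does not return to its caller, so any statement placed after it only runs on the NOTICE path. The following standalone sketch simulates that control flow with setjmp/longjmp. It is a simplified stand-in, not PostgreSQL code: `report`, `write_end_marker`, `already_compressed` and `try_already_compressed` are hypothetical names used only for this illustration.

```c
#include <setjmp.h>
#include <stdbool.h>

static jmp_buf error_context;	/* stand-in for the error-recovery context */
static int end_markers_written; /* counts calls to the stand-in marker writer */

/* Stand-in for write_logical_replication_msg_compression_end(). */
static void
write_end_marker(void)
{
	end_markers_written++;
}

/* Stand-in for ereport(): at error level it jumps away and never returns. */
static void
report(bool is_error)
{
	if (is_error)
		longjmp(error_context, 1);
	/* notice level: simply return to the caller */
}

/* Mirrors the shape of the hunk above: report first, then write the marker. */
static void
already_compressed(bool if_not_compressed)
{
	report(!if_not_compressed);
	write_end_marker(); /* reached only on the notice path */
}

/* Returns true if the call completed, false if it raised an error. */
static bool
try_already_compressed(bool if_not_compressed)
{
	if (setjmp(error_context) != 0)
		return false; /* we got here via longjmp from report() */
	already_compressed(if_not_compressed);
	return true;
}
```

Calling `try_already_compressed(true)` increments `end_markers_written`, while `try_already_compressed(false)` leaves it unchanged, which is the behaviour the comment above points out.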
LGTM
65aa711
to
51edab2
Compare
tsl/src/compression/wal_utils.h
Outdated
#include "guc.h"
#include <replication/message.h>
- #include "guc.h"
- #include <replication/message.h>
+ #pragma once
+ #include <postgres.h>
+ #include <replication/message.h>
+ #include "guc.h"
All headers should have #pragma once to make sure they are included only once. Also, all headers should always include "postgres.h" as the first header.
done
- Adds WAL markers around all compression and decompression activities.
- Renames the GUC controlling this behaviour.
- Enables the WAL marker GUC by default.

This makes it possible to distinguish between "user-driven" and "compression-driven" DML on uncompressed chunks. This is a requirement for supporting DML on compressed chunks in live migration.

Note: A previous commit [1] added WAL markers before and after inserts which were part of "transparent decompression". Transparent decompression is triggered when an UPDATE or DELETE statement affects compressed data, or an INSERT statement inserts into a range of compressed data which has a unique or primary key constraint. In these cases, the data is first moved from the compressed chunk to the uncompressed chunk, and then the DML is applied.

This change extends the existing behaviour on two fronts:
1. It adds WAL markers for both chunk compression and decompression events.
2. It extends the WAL markers for transparent decompression to cover not only INSERTs into the uncompressed chunk, but also the TimescaleDB catalog operations which were part of the decompression.

[1]: b5b46a3
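The bracketing described in this commit message can be pictured with a toy model. The sketch below is not the TimescaleDB implementation: it uses an in-memory array in place of the WAL, a plain boolean in place of the GUC, and hypothetical marker prefixes and function names (`wal_append`, `write_marker`, `transparent_decompression`), all of which are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_RECORDS 16

static bool enable_markers = true;	 /* stand-in for the GUC, on by default */
static const char *wal[MAX_RECORDS]; /* toy in-memory "WAL" */
static size_t wal_len;

/* Appends one record to the toy WAL, silently dropping it when full. */
static void
wal_append(const char *record)
{
	if (wal_len < MAX_RECORDS)
		wal[wal_len++] = record;
}

/* Writes a marker only when the stand-in GUC is enabled. */
static void
write_marker(const char *prefix)
{
	if (enable_markers)
		wal_append(prefix);
}

/* The markers surround both the row inserts and the catalog operations. */
static void
transparent_decompression(void)
{
	write_marker("example-decompression-start"); /* hypothetical prefix */
	wal_append("INSERT into uncompressed chunk");
	wal_append("catalog update");
	write_marker("example-decompression-end"); /* hypothetical prefix */
}
```

With the flag enabled, one call to `transparent_decompression` appends four records, with the markers first and last; with it disabled, only the two inner records are appended.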
51edab2
to
cbad494
Compare
timescale#6920 introduced a dependency on the Postgres contrib extension `test_decoding` for TAP tests, but we forgot to include it in the sanitizer tests.