Add GPDB 6 support #1833

Open
wants to merge 1 commit into
base: integration

andr-sokolov
Contributor

Add a fork version property to the src/build/postgres/postgres.yaml file format. The default value of the property is PostgreSQL. Add a fork field to the PgInterface structure. Search the pgInterface array not only by version but also by fork; the value to compare against is taken from the --fork command-line option, from the fork property of the stanza, or from the global section. For backward compatibility, the PostgreSQL interface is selected when no fork option is given. Add tablespaceId, walSegmentSizeDefault and pageSizeDefault functions to the interface, because the tablespace ID format, default WAL segment size and default page size differ between PostgreSQL and GPDB. Using different page sizes does not degrade performance, because vectorization is preserved.
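
For readers unfamiliar with the lookup described above, here is a minimal sketch of searching an interface table by both version and fork, with PostgreSQL as the fallback when no fork is given. It is not the code from this PR: the struct, array and function names (PgInterfaceSketch, pgInterfaceList, pgInterfaceFind) are hypothetical, the real PgInterface has more members, and the version numbers and page sizes in the table are illustrative values only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an interface entry. Only the fields
   needed to illustrate the lookup are shown. */
typedef struct PgInterfaceSketch
{
    unsigned int version;           /* Numeric version, e.g. 90400 for 9.4 (assumed encoding) */
    const char *fork;               /* "PostgreSQL" or "GPDB" */
    unsigned int pageSizeDefault;   /* Default page size in bytes (illustrative) */
} PgInterfaceSketch;

/* Illustrative table: the same version may appear once per fork */
static const PgInterfaceSketch pgInterfaceList[] =
{
    {90400, "PostgreSQL", 8192},
    {90400, "GPDB", 32768},
    {110000, "PostgreSQL", 8192},
};

/* Find the entry matching both version and fork. A NULL fork falls back to
   "PostgreSQL", which keeps the old version-only behavior working. */
const PgInterfaceSketch *
pgInterfaceFind(unsigned int version, const char *fork)
{
    if (fork == NULL)
        fork = "PostgreSQL";

    for (size_t idx = 0; idx < sizeof(pgInterfaceList) / sizeof(pgInterfaceList[0]); idx++)
    {
        if (pgInterfaceList[idx].version == version && strcmp(pgInterfaceList[idx].fork, fork) == 0)
            return &pgInterfaceList[idx];
    }

    /* No interface for this version/fork combination */
    return NULL;
}
```

With this table, pgInterfaceFind(90400, NULL) selects the PostgreSQL entry, while pgInterfaceFind(90400, "GPDB") selects the GPDB entry for the same version.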

#1801

@andr-sokolov andr-sokolov force-pushed the ADBDEV-2911 branch 2 times, most recently from ea19290 to 63ec064 Compare August 9, 2022 09:27
@dwsteele
Member

Thank you for submitting this. It is interesting to see what it takes to get pgBackRest working with Greenplum.

The bulk of the patch seems to deal with custom page sizes and calculating checksums for them. We are open to that part because Postgres can be configured at compile time with different page sizes and we've had a few requests for this support in the past.

We're less interested in supporting the custom tablespace naming or pg_control for Greenplum. The main reason for this is that we expect this code to be covered by integration tests (missing from this patch), which would need to be maintained going forward. Currently, adding these tests (same version, different vendor) would require changes to the Perl test code, which we would rather not make while this code is being migrated to C. Also, we are not sure that we want to branch out from our core focus on standard Postgres.

We think the best way forward would be to break out the variable page size code into a separate commit and base that on the value read from pg_control. Testing will still be an issue but we could lock the core code down to 8k pages until we are ready to build custom versions of Postgres for integration testing. That would be easy for you to disable when building for Greenplum.

@andr-sokolov
Contributor Author

@dwsteele, did I understand correctly that if I prepare a PR implementing support for all page sizes used in PostgreSQL (1, 2, 4, 8, 16 and 32 kB), you would be ready to merge it?
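
As a rough illustration of the two ideas discussed so far (accepting the page sizes 1, 2, 4, 8, 16 and 32 kB, and locking the core code to 8k pages until other sizes can be tested), a check could look like the sketch below. The function and macro names are hypothetical and do not come from pgBackRest or this PR; the allowCustom flag is an assumed stand-in for whatever build-time or run-time switch a fork would use.

```c
#include <stdbool.h>

/* Bounds taken from the page sizes listed in the discussion (1 kB to 32 kB) */
#define PAGE_SIZE_MIN_SKETCH                                        1024u
#define PAGE_SIZE_MAX_SKETCH                                        32768u
#define PAGE_SIZE_STANDARD_SKETCH                                   8192u

/* True when pageSize is a power of two within the supported range */
bool
pageSizeValidSketch(unsigned int pageSize)
{
    return
        pageSize >= PAGE_SIZE_MIN_SKETCH && pageSize <= PAGE_SIZE_MAX_SKETCH &&
        (pageSize & (pageSize - 1)) == 0;
}

/* Accept only 8 kB pages unless custom sizes have been explicitly allowed
   (hypothetical switch, e.g. enabled when building for a fork) */
bool
pageSizeAcceptedSketch(unsigned int pageSize, bool allowCustom)
{
    if (!allowCustom)
        return pageSize == PAGE_SIZE_STANDARD_SKETCH;

    return pageSizeValidSketch(pageSize);
}
```

Under these assumptions, a 32 kB page size read from pg_control would be rejected by default and accepted once custom sizes are allowed.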

@x4m

x4m commented Aug 17, 2022

Hey @andr-sokolov! I see you have put a lot of effort into this completely new area of backups. Yes, GP is different from PG. Let's work together on GP support in WAL-G. We totally understand and share the needs of ADB, so we can cooperate.

@dwsteele
Member

> Did I understand correctly that if I prepare a PR implementing support for all page sizes used in PostgreSQL (1, 2, 4, 8, 16 and 32 kB), you would be ready to merge it?

Certainly ready to review it. But yes, this is a feature we'd be happy to have in pgBackRest.

@andr-sokolov
Contributor Author

> Certainly ready to review it. But yes, this is a feature we'd be happy to have in pgBackRest.

Thanks. I'll do it later.

@dwsteele
Member

I've been thinking about this some more, and it may be possible to add Greenplum support (as well as support for other forks) by adding options that give pgBackRest the information it requires from pg_control. For example, --pg-version to set the version, --pg-page-size to set the page size, etc. This would be relatively easy for us to test while allowing forks to do their own compliance testing.

But it still makes sense to split out the page checksum patch first and then see what is left.

@inviscid

I can't really help with the coding part but I'd be happy to help test once there is a solution available for GPDB. Any idea of a timeline?

@andr-sokolov
Contributor Author

@inviscid, you can get the source code to compile and test with
git clone https://github.com/arenadata/pgbackrest -b integration
It would be great if you could report any bugs you find.

@Alexklkv123

@andr-sokolov, hello! Sorry to bother you. I decided to see how pgBackRest works with GP. Almost everything seems to work, but there is one problem: these subqueries return empty values:
(select setting from pg_catalog.pg_settings where name = 'data_directory')::text
(select setting from pg_catalog.pg_settings where name = 'archive_command')::text

They are called in the db.c file. For now I am hardcoding these values. The SHOW command in psql, however, displays valid values. Have you run into this problem? If so, could you describe how you solved it?
Thank you!

@andr-sokolov
Contributor Author

> @andr-sokolov, hello! Sorry to bother you. I decided to see how pgBackRest works with GP. Almost everything seems to work, but there is one problem: these subqueries return empty values: (select setting from pg_catalog.pg_settings where name = 'data_directory')::text (select setting from pg_catalog.pg_settings where name = 'archive_command')::text
>
> They are called in the db.c file. For now I am hardcoding these values. The SHOW command in psql, however, displays valid values. Have you run into this problem? If so, could you describe how you solved it? Thank you!

@Alexklkv123 hello! You should update GP. This commit fixes the problem https://github.com/greenplum-db/gpdb/commit/4221bffbc34c97e965e4fe3952292383c84687b7

@Alexklkv123

> > @andr-sokolov, hello! Sorry to bother you. I decided to see how pgBackRest works with GP. Almost everything seems to work, but there is one problem: these subqueries return empty values: (select setting from pg_catalog.pg_settings where name = 'data_directory')::text (select setting from pg_catalog.pg_settings where name = 'archive_command')::text
> > They are called in the db.c file. For now I am hardcoding these values. The SHOW command in psql, however, displays valid values. Have you run into this problem? If so, could you describe how you solved it? Thank you!
>
> @Alexklkv123 hello! You should update GP. This commit fixes the problem greenplum-db/gpdb@4221bff

@andr-sokolov hello! Thank you very much, after the update it worked!

@QAQQL

QAQQL commented Mar 24, 2023

How should I configure pgbackrest.conf?
I am trying to back up:
greenplum-db (6.10.1)
whose kernel is PostgreSQL (9.4.24)

When I run the following command:
pgbackrest stanza-create --pg1-host=192.168.1.183 --pg1-database=db4 --pg1-host-user=gpadmin --pg1-user=postgres --pg1-path=/data/greenplum/gpdata/gpmaster/gpseg-1 --stanza=db4

it fails with the following error:
[image]

Do I need to upgrade to GPDB 7 (PostgreSQL 11.9) to use pgBackRest, or is there a problem with my configuration file?

@QAQQL

QAQQL commented Mar 24, 2023

#cat /var/log/pgbackrest/pt-stanza-create.log

-------------------PROCESS START-------------------
2023-03-24 14:47:32.696 P00 INFO: stanza-create command begin 2.46dev: --exec-id=24968-0e21973b --log-level-console=info --log-level-file=debug --pg1-database=db4 --pg1-host=192.168.1.183 --pg1-host-user=gpadmin --pg1-path=/data/greenplum/gpdata/gpmaster/gpseg-1 --pg1-port=5432 --pg1-user=postgres --repo1-path=/data/pgbackrest/backup --stanza=pt
2023-03-24 14:47:32.696 P00 DEBUG: common/lock::lockAcquire: (lockPath: {"/tmp/pgbackrest"}, stanza: {"pt"}, execId: {"24968-0e21973b"}, lockType: 2, param.timeout: 0, param.returnOnNoLock: false)
2023-03-24 14:47:32.697 P00 DEBUG: common/lock::lockAcquire: => true
2023-03-24 14:47:32.697 P00 DEBUG: config/load::cfgLoad: => void
2023-03-24 14:47:32.697 P00 DEBUG: command/stanza/create::cmdStanzaCreate: (void)
2023-03-24 14:47:32.697 P00 DEBUG: command/control/common::lockStopTest: (void)
2023-03-24 14:47:32.697 P00 DEBUG: storage/storage::storageExists: (this: {type: posix, path: /, write: false}, pathExp: {"/tmp/pgbackrest/pt.stop"}, param.timeout: 0)
2023-03-24 14:47:32.697 P00 DEBUG: storage/storage::storageInfo: (this: {type: posix, path: /, write: false}, fileExp: {"/tmp/pgbackrest/pt.stop"}, param.level: 3, param.ignoreMissing: true, param.followLink: true, param.noPathEnforce: false)
2023-03-24 14:47:32.697 P00 DEBUG: storage/storage::storageInfo: => {StorageInfo}
2023-03-24 14:47:32.697 P00 DEBUG: storage/storage::storageExists: => false
2023-03-24 14:47:32.697 P00 DEBUG: storage/storage::storageExists: (this: {type: posix, path: /, write: false}, pathExp: {"/tmp/pgbackrest/all.stop"}, param.timeout: 0)
2023-03-24 14:47:32.697 P00 DEBUG: storage/storage::storageInfo: (this: {type: posix, path: /, write: false}, fileExp: {"/tmp/pgbackrest/all.stop"}, param.level: 3, param.ignoreMissing: true, param.followLink: true, param.noPathEnforce: false)
2023-03-24 14:47:32.697 P00 DEBUG: storage/storage::storageInfo: => {StorageInfo}
2023-03-24 14:47:32.697 P00 DEBUG: storage/storage::storageExists: => false
2023-03-24 14:47:32.697 P00 DEBUG: command/control/common::lockStopTest: => void
2023-03-24 14:47:32.697 P00 DEBUG: db/helper::dbGet: (primaryOnly: false, primaryRequired: true, standbyRequired: false)
2023-03-24 14:47:32.697 P00 DEBUG: db/helper::dbGetIdx: (pgIdx: 0)
2023-03-24 14:47:32.697 P00 DEBUG: protocol/helper::pgIsLocal: (pgIdx: 0)
2023-03-24 14:47:32.697 P00 DEBUG: protocol/helper::pgIsLocal: => false
2023-03-24 14:47:32.697 P00 DEBUG: protocol/helper::pgIsLocal: (pgIdx: 0)
2023-03-24 14:47:32.697 P00 DEBUG: protocol/helper::pgIsLocal: => false
2023-03-24 14:47:32.697 P00 DEBUG: protocol/helper::protocolRemoteGet: (protocolStorageType: pg, hostIdx: 0)
2023-03-24 14:47:32.697 P00 DEBUG: protocol/helper::protocolRemoteParamSsh: (protocolStorageType: pg, hostIdx: 0)
2023-03-24 14:47:32.697 P00 DEBUG: protocol/helper::protocolRemoteParam: (protocolStorageType: pg, hostIdx: 0)
2023-03-24 14:47:32.697 P00 DEBUG: protocol/helper::protocolRemoteParam: => {["--exec-id=24968-0e21973b", "--log-level-console=off", "--log-level-file=off", "--log-level-stderr=error", "--pg1-database=db4", "--pg1-path=/data/greenplum/gpdata/gpmaster/gpseg-1", "--pg1-port=5432", "--pg1-user=postgres", "--process=0", "--remote-type=pg", "--stanza=pt", "stanza-create:remote"]}
2023-03-24 14:47:32.697 P00 DEBUG: protocol/helper::protocolRemoteParamSsh: => {["-o", "LogLevel=error", "-o", "Compression=no", "-o", "PasswordAuthentication=no", "gpadmin@192.168.1.183", "pgbackrest --exec-id=24968-0e21973b --log-level-console=off --log-level-file=off --log-level-stderr=error --pg1-database=db4 --pg1-path=/data/greenplum/gpdata/gpmaster/gpseg-1 --pg1-port=5432 --pg1-user=postgres --process=0 --remote-type=pg --stanza=pt stanza-create:remote"]}
2023-03-24 14:47:32.697 P00 DEBUG: common/exec::execNew: (command: {"ssh"}, param: {["-o", "LogLevel=error", "-o", "Compression=no", "-o", "PasswordAuthentication=no", "gpadmin@192.168.1.183", "pgbackrest --exec-id=24968-0e21973b --log-level-console=off --log-level-file=off --log-level-stderr=error --pg1-database=db4 --pg1-path=/data/greenplum/gpdata/gpmaster/gpseg-1 --pg1-port=5432 --pg1-user=postgres --process=0 --remote-type=pg --stanza=pt stanza-create:remote"]}, name: {"remote-0 process on '192.168.1.183'"}, timeout: 1830000)
2023-03-24 14:47:32.697 P00 DEBUG: common/exec::execNew: => {Exec}
2023-03-24 14:47:32.697 P00 DEBUG: common/exec::execOpen: (this: {Exec})
2023-03-24 14:47:32.697 P00 DEBUG: common/exec::execOpen: => void
2023-03-24 14:47:33.021 P00 DEBUG: protocol/helper::protocolRemoteGet: => {name: remote-0 ssh protocol on '192.168.1.183', state: idle}
2023-03-24 14:47:33.021 P00 DEBUG: storage/remote/storage::storageRemoteNew: (modeFile: 0640, modePath: 0750, write: false, pathExpressionFunction: null, client: {name: remote-0 ssh protocol on '192.168.1.183', state: idle}, compressLevel: 3)
2023-03-24 14:47:33.021 P00 DEBUG: storage/remote/storage::storageRemoteNew: => {type: remote, path: /data/greenplum/gpdata/gpmaster/gpseg-1, write: false}
2023-03-24 14:47:33.021 P00 DEBUG: protocol/helper::protocolRemoteGet: (protocolStorageType: pg, hostIdx: 0)
2023-03-24 14:47:33.021 P00 DEBUG: protocol/helper::protocolRemoteGet: => {name: remote-0 ssh protocol on '192.168.1.183', state: idle}
2023-03-24 14:47:33.021 P00 DEBUG: db/db::dbNew: (client: null, remoteClient: {name: remote-0 ssh protocol on '192.168.1.183', state: idle}, storage: {type: remote, path: /data/greenplum/gpdata/gpmaster/gpseg-1, write: false}, applicationName: {"pgBackRest [stanza-create]"})
2023-03-24 14:47:33.021 P00 DEBUG: db/db::dbNew: => {client: null, remoteClient: {name: remote-0 ssh protocol on '192.168.1.183', state: idle}}
2023-03-24 14:47:33.021 P00 DEBUG: db/helper::dbGetIdx: => {client: null, remoteClient: {name: remote-0 ssh protocol on '192.168.1.183', state: idle}}
2023-03-24 14:47:33.021 P00 DEBUG: db/db::dbOpen: (this: {client: null, remoteClient: {name: remote-0 ssh protocol on '192.168.1.183', state: idle}})
2023-03-24 14:47:33.026 P00 DEBUG: db/db::dbExec: (this: {client: null, remoteClient: {name: remote-0 ssh protocol on '192.168.1.183', state: idle}}, command: {"set search_path = 'pg_catalog'"})
2023-03-24 14:47:33.026 P00 DEBUG: db/db::dbQuery: (this: {client: null, remoteClient: {name: remote-0 ssh protocol on '192.168.1.183', state: idle}}, resultType: none, query: {"set search_path = 'pg_catalog'"})
2023-03-24 14:47:33.127 P00 DEBUG: db/db::dbQuery: => null
2023-03-24 14:47:33.127 P00 DEBUG: db/db::dbExec: => void
2023-03-24 14:47:33.127 P00 DEBUG: db/db::dbExec: (this: {client: null, remoteClient: {name: remote-0 ssh protocol on '192.168.1.183', state: idle}}, command: {"set client_encoding = 'UTF8'"})
2023-03-24 14:47:33.127 P00 DEBUG: db/db::dbQuery: (this: {client: null, remoteClient: {name: remote-0 ssh protocol on '192.168.1.183', state: idle}}, resultType: none, query: {"set client_encoding = 'UTF8'"})
2023-03-24 14:47:33.227 P00 DEBUG: db/db::dbQuery: => null
2023-03-24 14:47:33.227 P00 DEBUG: db/db::dbExec: => void
2023-03-24 14:47:33.227 P00 DEBUG: db/db::dbQuery: (this: {client: null, remoteClient: {name: remote-0 ssh protocol on '192.168.1.183', state: idle}}, resultType: row, query: {"select (select setting from pg_catalog.pg_settings where name = 'server_version_num')::int4, (select setting from pg_catalog.pg_settings where name = 'data_directory')::text, (select setting from pg_catalog.pg_settings where name = 'archive_mode')::text, (select setting from pg_catalog.pg_settings where name = 'archive_command')::text, (select setting from pg_catalog.pg_settings where name = 'checkpoint_timeout')::int4"})
2023-03-24 14:47:33.328 P00 DEBUG: db/db::dbQuery: => {Pack}
2023-03-24 14:47:33.328 P00 WARN: unable to check pg1: [DbQueryError] unable to select some rows from pg_settings
HINT: is the backup running as the postgres user?
HINT: is the pg_read_all_settings role assigned for PostgreSQL >= 10?
2023-03-24 14:47:33.328 P00 DEBUG: command/exit::exitSafe: (result: 0, error: true, signalType: 0)
2023-03-24 14:47:33.328 P00 ERROR: [056]: unable to find primary cluster - cannot proceed
HINT: are all available clusters in recovery?
--------------------------------------------------------------------
If SUBMITTING AN ISSUE please provide the following information:

                                version: 2.46dev
                                command: stanza-create
                                options: --exec-id=24968-0e21973b --log-level-console=info --log-level-file=debug --pg1-database=db4 --pg1-host=192.168.1.183 --pg1-host-user=gpadmin --pg1-path=/data/greenplum/gpdata/gpmaster/gpseg-1 --pg1-port=5432 --pg1-user=postgres --repo1-path=/data/pgbackrest/backup --stanza=pt
                                
                                stack trace:
                                db/helper.c:dbGet:132:(primaryOnly: false, primaryRequired: true, standbyRequired: false)
                                command/stanza/create.c:cmdStanzaCreate:(void)
                                main.c:main:(debug log level required for parameters)
                                --------------------------------------------------------------------

2023-03-24 14:47:33.328 P00 INFO: stanza-create command end: aborted with exception [056]
2023-03-24 14:47:33.328 P00 DEBUG: common/lock::lockRelease: (failOnNoLock: false)
2023-03-24 14:47:33.329 P00 DEBUG: common/lock::lockRelease: => true
2023-03-24 14:47:33.329 P00 DEBUG: command/exit::exitSafe: => 56
2023-03-24 14:47:33.429 P00 DEBUG: main::main: => 56

@QAQQL

QAQQL commented Mar 24, 2023

[image]

select (select setting from pg_catalog.pg_settings where name = 'server_version_num')::int4, (select setting from pg_catalog.pg_settings where name = 'data_directory')::text, (select setting from pg_catalog.pg_settings where name = 'archive_mode')::text, (select setting from pg_catalog.pg_settings where name = 'archive_command')::text, (select setting from pg_catalog.pg_settings where name = 'checkpoint_timeout')::int4

My GPDB 6 is a cluster environment built on three machines.

I got no error when I ran this SQL in Navicat.

@Alexklkv123

Alexklkv123 commented Mar 27, 2023 via email

@QAQQL

QAQQL commented Mar 28, 2023

> Hello! Sorry for not replying sooner, I took Friday off. I see an issue with pg1-user: yours is postgres, but it should be gpadmin. There may also be errors because not all of the required options are set. I have attached my pgbackrest.conf. Example commands:
>
> pgbackrest --stanza=master --log-level-console=info stanza-create
> pgbackrest --stanza=master --log-level-console=info check
> pgbackrest --stanza=master --log-level-console=info backup
>
> There is also a wiki page (in Russian only): https://wiki.glowbyteconsulting.com/pages/viewpage.action?pageId=169182069
> Write if you have any more questions or problems!

Thank you for your reply. I still have some issues to resolve. Could you share your email address?
When I open the wiki, it asks me to log in, but I can't find a registration button.

@gesundes

Hi there. I'm trying to test pgBackRest with GPDB. I was able to make a backup and run archiving on the active master.

But it's unclear to me how to restore a database to a GPDB cluster that has multiple segments. Could someone describe how to restore backups made by pgBackRest to GPDB clusters? Is there any related documentation?

@andr-sokolov
Contributor Author

> But it's unclear to me how to restore a database to a GPDB cluster that has multiple segments. Could someone describe how to restore backups made by pgBackRest to GPDB clusters? Is there any related documentation?

@gesundes sorry for the late response. In English - https://github.com/arenadata/pgbackrest/blob/2.50-ci/README_GPDB_en.md and the same text in Russian - https://github.com/arenadata/pgbackrest/blob/2.50-ci/README_GPDB_ru.md
