Undo changes in Postgres core for building GIST/GIN indexes #414

Closed
wants to merge 77 commits into from


knizhnik
Contributor

Some Postgres index access methods (GIN, GiST, SP-GiST, ...) use a two-phase build: in the first phase the relation pages are constructed, and in the second phase the whole relation is WAL-logged. This does not work with Neon: if a dirty page is evicted from shared buffers before it has been WAL-logged, its content is lost.

We have added support for unlogged builds to the SMGR API, but it requires changes in Postgres core. Even worse, some extensions (e.g. pgvector) use the same approach and have to be patched as well.

This PR tries to avoid changes in Postgres core and handle the problem at the Neon extension level instead.
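
The failure mode described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `ToyWal`, `ToyBuffer`, and `toy_two_phase_build` are hypothetical names, not Postgres or Neon APIs, and "eviction" simply drops the page because the model has no local file to write it to.

```c
#include <stdbool.h>
#include <string.h>

/* Toy model only: every name here is hypothetical. */
#define TOY_NPAGES 4
#define TOY_PAGE_SIZE 8

typedef struct {
    char image[TOY_NPAGES][TOY_PAGE_SIZE]; /* page images recorded in the "WAL" */
    bool logged[TOY_NPAGES];
} ToyWal;

typedef struct {
    char data[TOY_PAGE_SIZE];
    bool resident;                         /* page is still in the buffer cache */
} ToyBuffer;

/*
 * Runs a two-phase build over TOY_NPAGES pages and returns the number of
 * pages whose content was lost before it could be WAL-logged.
 */
static int toy_two_phase_build(bool evict_page_1)
{
    ToyWal wal;
    ToyBuffer bufs[TOY_NPAGES];
    int lost = 0;

    memset(&wal, 0, sizeof wal);
    memset(bufs, 0, sizeof bufs);

    /* Phase 1: construct pages in the buffer cache only, no WAL yet. */
    for (int blk = 0; blk < TOY_NPAGES; blk++) {
        memset(bufs[blk].data, 'A' + blk, TOY_PAGE_SIZE);
        bufs[blk].resident = true;
    }

    /* Simulated memory pressure between the two phases. */
    if (evict_page_1)
        bufs[1].resident = false;

    /* Phase 2: WAL-log every page of the relation. */
    for (int blk = 0; blk < TOY_NPAGES; blk++) {
        if (!bufs[blk].resident) {
            lost++;                        /* nothing left to log */
            continue;
        }
        memcpy(wal.image[blk], bufs[blk].data, TOY_PAGE_SIZE);
        wal.logged[blk] = true;
    }
    return lost;
}
```

In this model `toy_two_phase_build(false)` loses nothing, while `toy_two_phase_build(true)` loses the one page that was evicted between the phases.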

lubennikovaav and others added 30 commits February 6, 2024 13:05
Most significant changes are:
- `xlog.c` refactoring - some code was moved to `xlogreader.c` and `xlogprefetcher.c`.
- `ThisTimeLineID` refactoring (4a92a1c and e997a0c), which affects walproposer code
- `XLogFileInit` refactoring: multiple commits changed the function signature.
- Resolve initdb and pg_waldump Neon-specific options that conflict with the ones from PostgreSQL.
* Move backpressure throttling implementation to neon extension and add a function for monitoring throttling time

* Update src/include/miscadmin.h

Co-authored-by: Heikki Linnakangas <[email protected]>
Disabled by default. The plan is to merge this now, so that we can do
performance testing quickly, and if it helps, rewrite and review it
properly.

Author: Konstantin Knizhnik
Commit a703269 replaced $(INSTALL) with plain "cp" for installing the
server header files. It sped up "make install" significantly, because
the old logic called $(INSTALL) separately for every header file,
whereas plain "cp" could copy all the files in one command. However, we
have long since made it a requirement that $(INSTALL) can also install
multiple files in one command, see commit f1c5247. Switch back to
$(INSTALL).

Discussion: https://www.postgresql.org/message-id/200503252305.j2PN52m23610%40candle.pha.pa.us
Discussion: https://www.postgresql.org/message-id/2415283.1641852217%40sss.pgh.pa.us
to support only extensions that were built against Neon PostgreSQL
Neon generates PG_VERSION files in one format: just the major version number, without a newline. Be consistent with it.
No need to perform WAL recovery in Neon

Co-authored-by: Konstantin Knizhnik <[email protected]>
…ion because spec_token is not wal logged (#223)

* Pin pages with speculative insert tuples to prevent their reconstruction because spec_token is not wal logged

refer ##2587

* Update src/backend/access/heap/heapam.c

Co-authored-by: Heikki Linnakangas <[email protected]>
* Fix shared memory initialization for last written LSN cache

Replace (from,till) with (from,n_blocks) for SetLastWrittenLSNForBlockRange function

* Fast exit from SetLastWrittenLSNForBlockRange for n_blocks == 0
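
A minimal sketch of what a `(from, n_blocks)` range with a fast exit can look like. This is not the actual `SetLastWrittenLSNForBlockRange` signature or its cache; `ToyLsn`, `toy_last_written`, and `toy_set_last_written_lsn_for_block_range` are hypothetical stand-ins that keep one LSN per block in a fixed array.

```c
#include <stdint.h>

/* Toy model only: a fixed array stands in for the real LSN cache. */
#define TOY_NBLOCKS 8
typedef uint64_t ToyLsn;

static ToyLsn toy_last_written[TOY_NBLOCKS];

/*
 * Advances the remembered LSN for blocks [from, from + n_blocks).
 * Returns the number of blocks whose entry was actually advanced.
 */
static int toy_set_last_written_lsn_for_block_range(ToyLsn lsn,
                                                    uint32_t from,
                                                    uint32_t n_blocks)
{
    int updated = 0;

    /* Fast exit: an empty range touches nothing. */
    if (n_blocks == 0 || from >= TOY_NBLOCKS)
        return 0;

    for (uint32_t i = 0; i < n_blocks && i < TOY_NBLOCKS - from; i++) {
        if (toy_last_written[from + i] < lsn) {
            toy_last_written[from + i] = lsn;
            updated++;
        }
    }
    return updated;
}
```

With a start and a count, an empty range is unambiguous (`n_blocks == 0`), whereas a `(from, till)` pair leaves open whether `till` is inclusive.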
Without this patch, on bootstrap XLP_FIRST_IS_CONTRECORD was always set in the
header of the page where WAL writing continues. This confuses WAL decoding on
safekeepers, making them think decoding starts in the middle of a record, leading
to

 2022-08-12T17:48:13.816665Z ERROR {tid=37}: query handler for 'START_WAL_PUSH postgresql://no_user:@localhost:15050' failed: failed to run ReceiveWalConn

 Caused by:
    0: failed to process ProposerAcceptorMessage
    1: invalid xlog page header: unexpected XLP_FIRST_IS_CONTRECORD at 0/2CF8000

Rebase of a1af529 for v14.
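
The kind of consistency check that produces the error above can be sketched as follows. The flag value is the one defined in PostgreSQL's `xlog_internal.h`; the helper `toy_page_header_ok` and its `expecting_continuation` parameter are hypothetical and only model the decision, not the safekeeper's actual decoder.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag value as defined in PostgreSQL's xlog_internal.h. */
#define XLP_FIRST_IS_CONTRECORD 0x0001

/*
 * Toy check (hypothetical helper). expecting_continuation is true when the
 * decoder stopped in the middle of a record on the previous page, and false
 * when it is positioned at a record boundary.
 * The page header is accepted only if its flag agrees with that state.
 */
static bool toy_page_header_ok(uint16_t xlp_info, bool expecting_continuation)
{
    bool first_is_contrecord = (xlp_info & XLP_FIRST_IS_CONTRECORD) != 0;

    return first_is_contrecord == expecting_continuation;
}
```

A decoder that starts at a record boundary calls this with `expecting_continuation == false`, so a page header that always carries the flag is rejected, which matches the "unexpected XLP_FIRST_IS_CONTRECORD" message in the log.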
- Refactor the way the WalProposerMain function is called when started
  with --sync-safekeepers. The postgres binary now explicitly loads
  the 'neon.so' library and calls the WalProposerMain in it. This is
  simpler than the global function callback "hook" we previously used.

- Move the WAL redo process code to a new library, neon_walredo.so,
  and use the same mechanism as for --sync-safekeepers to call the
  WalRedoMain function, when launched with --walredo argument.

- Also move the seccomp code to neon_walredo.so library. I kept the
  configure check in the postgres side for now, though.
Fix indentation, remove unused definitions, resolve some FIXMEs.
Previously, we called PrefetchBuffer [NBlkScanned * seqscan_prefetch_buffers]
times in each of those situations, but now only NBlkScanned.

In addition, the prefetch mechanism for the vacuum scans is now based on
blocks instead of tuples - improving the efficiency.
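
The difference in call counts can be sketched with two toy counters. Both functions are hypothetical and only count how many prefetch requests a scan of `nblocks` blocks would issue with a look-ahead of `distance` blocks; they do not call `PrefetchBuffer` and do not reproduce the patch itself.

```c
/*
 * Naive scheme: at every scanned block, request the whole look-ahead window
 * again, so most blocks are requested `distance` times.
 */
static int toy_prefetch_calls_naive(int nblocks, int distance)
{
    int calls = 0;

    for (int cur = 0; cur < nblocks; cur++)
        for (int blk = cur + 1; blk <= cur + distance && blk < nblocks; blk++)
            calls++;
    return calls;
}

/*
 * Sliding-window scheme: remember the first block not yet requested and only
 * request blocks that newly enter the window, so each block is requested at
 * most once.
 */
static int toy_prefetch_calls_sliding(int nblocks, int distance)
{
    int calls = 0;
    int next = 1;               /* first block that has not been requested */

    for (int cur = 0; cur < nblocks; cur++) {
        if (next <= cur)
            next = cur + 1;
        while (next <= cur + distance && next < nblocks) {
            calls++;
            next++;
        }
    }
    return calls;
}
```

For a 10-block scan with a look-ahead of 4, the naive counter reports 30 requests and the sliding-window counter reports 9, which is the "roughly one request per scanned block" behaviour the commit message describes.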
Parallel seqscans didn't take their parallelism into account when determining
which block to prefetch, and vacuum's cleanup scan didn't correctly determine
which blocks would need to be prefetched, and could get into an infinite loop.
* Use prefetch in pg_prewarm extension

* Change prefetch order as suggested in review
* Update prefetch mechanisms:

- **Enable enable_seqscan_prefetch by default**
- Store prefetch distance in the relevant scan structs
- Slow start sequential scan, to accommodate LIMIT clauses.
- Replace seqscan_prefetch_buffer with the relations' tablespaces'
  *_io_concurrency; and drop seqscan_prefetch_buffer as a result.
- Clarify enable_seqscan_prefetch GUC description
- Fix prefetch in pg_prewarm
- Add prefetching to autoprewarm worker
- Fix an issue where we'd incorrectly not prefetch data when hitting a table wraparound. The same issue also resulted in assertion failures in debug builds.
- Fix parallel scan prefetching - we didn't take into account that parallel scans have scan synchronization, too.
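
One way to picture the "slow start" item in the list above is a prefetch distance that ramps up as the scan proceeds. The doubling policy below is an assumption made for illustration, not necessarily what the patch does, and `toy_next_prefetch_distance` is a hypothetical helper.

```c
/*
 * Hypothetical ramp-up: start at 1 and double the prefetch distance after
 * each step until it reaches `maximum` (for example an io_concurrency
 * setting). A scan that stops early, such as one under a LIMIT clause,
 * therefore issues only a few speculative prefetches.
 */
static int toy_next_prefetch_distance(int current, int maximum)
{
    if (maximum <= 0)
        return 0;               /* prefetching disabled */
    if (current <= 0)
        return 1;               /* first step of the ramp */
    if (current > maximum / 2)
        return maximum;         /* doubling would overshoot the cap */
    return current * 2;
}
```

Starting from 0 with a cap of 10, successive calls yield 1, 2, 4, 8, 10 and then stay at 10.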
knizhnik and others added 26 commits February 6, 2024 13:05
…extended Neon SMGR API (#300)

Co-authored-by: Konstantin Knizhnik <[email protected]>
* [refer #111] Persist logical replication files in WAL and include them in basebackup at PS

* Fix warnings

* Write origin logical record snapshot in WAL only if there are valid origins

* Store only logical replication slots

* Fix dropping replication slots

* Replace sprintf with snprintf to make Arnica happy

* Do not checkpoint replication origin at shutdown

* Add PreCheckPointGuts function to sync replication state before start of shutdown checkpoint

* Log heap rewrite file after creation.

---------

Co-authored-by: Konstantin Knizhnik <[email protected]>
Co-authored-by: Arseny Sher <[email protected]>
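
The `sprintf` to `snprintf` replacement mentioned in the list above follows a common pattern, sketched here. The helper `toy_format_slot_path` and the path layout it formats are illustrative assumptions, not code from this PR.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats a per-slot state file path into a fixed-size
 * buffer. Unlike sprintf, snprintf never writes more than dstlen bytes, and
 * its return value tells us whether the result was truncated.
 */
static bool toy_format_slot_path(char *dst, size_t dstlen, const char *slot_name)
{
    int n = snprintf(dst, dstlen, "pg_replslot/%s/state", slot_name);

    /* n is the length the full string needs, excluding the terminator. */
    return n >= 0 && (size_t) n < dstlen;
}
```

Callers can treat a `false` return as an error instead of silently using a truncated path.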
* Update WAL buffers when restoring WAL at compute needed for LR

* Fix copying data in WAL buffers

---------

Co-authored-by: Konstantin Knizhnik <[email protected]>
* Prevent output callbacks from hearing about neon-file messages
* On demand downloading of SLRU segments

* Fix smgr_read_slru_segment

* Determine SLRU kind in extension

* Use ctl->PagePrecedes for SLRU page comparison in SimpleLruDownloadSegment to address wraparound

---------

Co-authored-by: Konstantin Knizhnik <[email protected]>
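
Why a plain `<` is not enough once page numbers wrap around can be shown with a toy comparison. The functions below are hypothetical and use 32-bit counters with the modular-difference idiom; the real `PagePrecedes` callbacks are SLRU-specific and are not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Naive ordering: breaks as soon as the counter wraps past UINT32_MAX. */
static bool toy_naive_precedes(uint32_t a, uint32_t b)
{
    return a < b;
}

/*
 * Wraparound-aware ordering: a precedes b if the modular distance from b to
 * a, read as a signed 32-bit value, is negative. This is meaningful only
 * while the two values are less than 2^31 apart.
 */
static bool toy_page_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}
```

With these definitions `toy_page_precedes(UINT32_MAX, 0)` is true because the counter has just wrapped, while `toy_naive_precedes(UINT32_MAX, 0)` is false.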
…mary is not alive (#364)

* Set wasShutdown=true during hot-standby replica startup only when primary is not alive
* Report fatal error if hot standby replica is started with oldestActiveXid=0

Postgres part of neondatabase/neon#6705
---------

Co-authored-by: Konstantin Knizhnik <[email protected]>
…d for oldestActiveXid while replica startup (#388)

Co-authored-by: Konstantin Knizhnik <[email protected]>
This keeps the walproposer processes alive at shutdown, until after
the shutdown checkpoint has been written. That gives the walproposers
a chance to stream it to the safekeepers.
* Revert "Add comment explaining why it is safe to use FirstNormalTransactionXid for oldestActiveXid while replica startup (#388)"

This reverts commit 79b6351.

* Revert "Set wasShutdown=true during hot-standby replica startup only when primary is not alive (#364)"

This reverts commit be91d91.
* fix: XLogFlush replication slot drop

Signed-off-by: Alex Chi Z <[email protected]>

* fix all occurrences

Signed-off-by: Alex Chi Z <[email protected]>

---------

Signed-off-by: Alex Chi Z <[email protected]>
* Remember last written LSN when it is first requested

* Use rnode instead of rlocator

* Return updated LSN in SetLastWrittenLSN

* Remove wrong new line

---------

Co-authored-by: Konstantin Knizhnik <[email protected]>
@tristan957 force-pushed the REL_15_STABLE_neon branch 2 times, most recently from 3be8940 to e2dbd63 (May 20, 2024 14:48)
@knizhnik
Contributor Author

knizhnik commented Jun 3, 2024

Replaced by #435

@knizhnik knizhnik closed this Jun 3, 2024

10 participants