diff --git a/docs/cbdb-op-deploy-guide.md b/docs/cbdb-op-deploy-guide.md index 555658ded..88d8ddf58 100644 --- a/docs/cbdb-op-deploy-guide.md +++ b/docs/cbdb-op-deploy-guide.md @@ -1,10 +1,14 @@ --- -title: Deploy Manually Using RPM Package +title: Deploy with Multiple Computing Nodes --- -# Deploy Cloudberry Database Manually Using RPM Package +# Deploy Cloudberry Database Manually Using RPM Package with Multiple Computing Nodes -This document introduces how to manually deploy Cloudberry Database on physical machines using RPM package. Before reading this document, it is recommended to first read the [Software and Hardware Configuration Requirements](/docs/cbdb-op-software-hardware.md) and [Prepare to Deploy Cloudberry Database on Physical Machine](/docs/cbdb-op-prepare-to-deploy.md). +This document introduces how to manually deploy Cloudberry Database on physical machines using RPM package with multiple computing nodes. Before reading this document, it is recommended to first read the [Software and Hardware Configuration Requirements](/docs/cbdb-op-software-hardware.md) and [Prepare to Deploy Cloudberry Database on Physical Machine](/docs/cbdb-op-prepare-to-deploy.md). + +:::warning +The deployment method described in this document is suitable only for deploying Cloudberry Database v1.0.0, not for deploying later versions. +::: The deployment method in this document is for production environments. diff --git a/docs/deploy-cbdb-with-single-node.md b/docs/deploy-cbdb-with-single-node.md index 1f4bf6aa0..512e06078 100644 --- a/docs/deploy-cbdb-with-single-node.md +++ b/docs/deploy-cbdb-with-single-node.md @@ -8,432 +8,139 @@ Cloudberry Database is not fully compatible with PostgreSQL, and some features a Starting from v1.5.0, Cloudberry Database provides the single-computing-node deployment mode. This mode runs under the `utility` gp_role, with only one coordinator (QD) node and one coordinator standby node, without a segment node or data distribution. 
You can directly connect to the coordinator and run queries as if you were connecting to a regular multi-node cluster. Note that some SQL statements are not effective in this mode because data distribution does not exist, and some SQL statements are not supported. See [user behavior changes](#user-behavior-changes) for details. -## How to deploy - -### Step 1. Prepare to deploy - -Log into each host as the root user, and modify the settings of each node server in the order of the following sections. - -#### Add `gpadmin` admin user - -Follow the example below to create a user group and username `gpadmin`. Set the user group and username identifier to `520`. Create and specify the `gpadmin` home directory `/home/gpadmin`. - -```bash -groupadd -g 520 gpadmin # Adds user group gpadmin. -useradd -g 520 -u 520 -m -d /home/gpadmin/ -s /bin/bash gpadmin # Adds username gpadmin and creates the home directory of gpadmin. -passwd gpadmin # Sets a password for gpadmin; after executing, follow the prompts to input the password. -``` - -#### Disable SELinux and firewall software - -Run `systemctl status firewalld` to view the firewall status. If the firewall is on, you need to turn it off by setting the `SELINUX` parameter to `disabled` in the `/etc/selinux/config` file. - -```bash -SELINUX=disabled -``` - -You can also disable the firewall using the following commands: - -```bash -systemctl stop firewalld.service -systemctl disable firewalld.service -``` - -#### Set system parameters +Before reading this document, it is recommended to first read the [Software and Hardware Configuration Requirements](/docs/cbdb-op-software-hardware.md) and [Prepare to Deploy Cloudberry Database on Physical Machine](/docs/cbdb-op-prepare-to-deploy.md). -Add relevant system parameters in the `/etc/sysctl.conf` configuration file, and run the `sysctl -p` command to make the configuration file effective. 
+:::warning +The deployment method described in this document is suitable only for deploying Cloudberry Database v1.5.4, not for deploying earlier versions. +::: -When setting the configuration parameters, you can take the following example as a reference and set them according to your needs. Details of some of these parameters and recommended settings are provided below. - -```bash -kernel.shmall = _PHYS_PAGES / 2 -kernel.shmall = 197951838 -kernel.shmmax = kernel.shmall * PAGE_SIZE -kernel.shmmax = 810810728448 -kernel.shmmni = 4096 -vm.overcommit_memory = 2 -vm.overcommit_ratio = 95 -net.ipv4.ip_local_port_range = 10000 65535 -kernel.sem = 250 2048000 200 8192 -kernel.sysrq = 1 -kernel.core_uses_pid = 1 -kernel.msgmnb = 65536 -kernel.msgmax = 65536 -kernel.msgmni = 2048 -net.ipv4.tcp_syncookies = 1 -net.ipv4.conf.default.accept_source_route = 0 -net.ipv4.tcp_max_syn_backlog = 4096 -net.ipv4.conf.all.arp_filter = 1 -net.ipv4.ipfrag_high_thresh = 41943040 -net.ipv4.ipfrag_low_thresh = 31457280 -net.ipv4.ipfrag_time = 60 -net.core.netdev_max_backlog = 10000 -net.core.rmem_max = 2097152 -net.core.wmem_max = 2097152 -vm.swappiness = 10 -vm.zone_reclaim_mode = 0 -vm.dirty_expire_centisecs = 500 -vm.dirty_writeback_centisecs = 100 -vm.dirty_background_ratio = 0 -vm.dirty_ratio = 0 -vm.dirty_background_bytes = 1610612736 -vm.dirty_bytes = 4294967296 -``` - -##### Shared memory settings +## How to deploy -In the `/etc/sysctl.conf` configuration file, `kernel.shmall` represents the total amount of available shared memory, in pages. `kernel.shmmax` represents the maximum size of a single shared memory segment, in bytes. +### Step 1. Prepare to deploy -You can define these 2 values using the operating system's `_PHYS_PAGES` and `PAGE_SIZE` parameters: +1. Run the following commands in sequence to set up the environment. -``` -kernel.shmall = ( _PHYS_PAGES / 2) -kernel.shmmax = ( _PHYS_PAGES / 2) * PAGE_SIZE -``` + ```shell + # Installs the EPEL repository. 
+ yum install -y epel-release -To get the values of these 2 operating system parameters, you can use `getconf`, for example: + # Adds the /usr/local/lib and /usr/local/lib64 directories to the ld.so.conf file so that the system can find the library files in these directories. + echo -e "/usr/local/lib \n/usr/local/lib64" >> /etc/ld.so.conf -```bash -$echo $(expr $(getconf _PHYS_PAGES)/2) -$echo $(expr $(getconf _PHYS_PAGES)/2 \*$(getconf PAGE_SIZE)) -``` + # Adds the /usr/lib and /usr/lib64 directories to the ld.so.conf file so that the system can find the library files in these directories. + echo -e "/usr/lib \n/usr/lib64" >> /etc/ld.so.conf -- `vm.overcommit_memory` is a Linux kernel parameter that indicates the amount of memory that the system can allocate to a process. Setting `vm.overcommit_memory` to `2` means that when the system allocates more than 2 GB of memory, the operation will be rejected. -- `vm.overcommit_ratio` is a kernel parameter and is the percentage of RAM occupied by the application process. The default value on CentOS is `50`. `vm.overcommit_ratio` is calculated as follows: - - ``` - vm.overcommit_ratio = (RAM - 0.026 * gp_vmem) / RAM + # Reloads the dynamic library cache to make the system recognize the new library directories. + ldconfig ``` -- The calculation method of `gp_vmem` is as follows: +2. Run the following commands in sequence to configure password-free authentication for `gpadmin`. - ``` - # If the system memory is less than 256 GB, use the following formula to calculate: - gp_vmem = ((SWAP + RAM) – (7.5GB + 0.05 * RAM)) / 1.7 + ```bash + #!/bin/bash - # If the system memory is greater than or equal to 256 GB, use the following formula to calculate: - gp_vmem = ((SWAP + RAM) – (7.5GB + 0.05 * RAM)) / 1.17 + # Creates a group named gpadmin + /usr/sbin/groupadd gpadmin - # In the above formulas, SWAP is the swap space on the host, in GB. - # RAM is the size of the memory installed on the host, in GB. 
- ``` + # Creates a user named gpadmin, adds it to the gpadmin group and the wheel group. + /usr/sbin/useradd gpadmin -g gpadmin -G wheel -##### IP segmentation settings + # Sets the password for the gpadmin user to "cbdb@123". + echo "cbdb@123"|passwd --stdin gpadmin -When the Cloudberry Database uses the UDP protocol for internal connection, the network card controls the fragmentation and reassembly of IP packets. If the size of a UDP message is larger than the maximum size of network transmission unit (MTU), the IP layer fragments the message. + # Adds the gpadmin user to the /etc/sudoers file, granting permission to run all commands without a password. + echo "gpadmin ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers -- `net.ipv4.ipfrag_high_thresh`: When the total size of IP fragments exceeds this threshold, the kernel will attempt to reorganize IP fragments. If the fragments exceed this threshold but all fragments have not arrived within the specified time, the kernel will not reorganize the fragments. This threshold is typically used to control whether larger shards are reorganized. The default value is `4194304` bytes (4 MB). -- `net.ipv4.ipfrag_low_thresh`: Indicates that when the total size of IP fragments is below this threshold, the kernel will wait as long as possible for more fragments to arrive, to allow for larger reorganizations. This threshold is used to minimize unfinished reorganization operations and improve system performance. The default value is `3145728` bytes (3 MB). -- `net.ipv4.ipfrag_time` is a kernel parameter that controls the IP fragment reassembly timeout. The default value is `30`. + # Adds the root user to the /etc/sudoers file, granting permission to run all commands without a password. + echo "root ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers -It is recommended to set the above parameters to the following values: + # Generates an RSA private key for the root user without a passphrase, and redirects the output to a null device. 
+ ssh-keygen -t rsa -N '' -f /root/.ssh/id_rsa <<< $'\n' >/dev/null 2>&1 -``` -net.ipv4.ipfrag_high_thresh = 41943040 -net.ipv4.ipfrag_low_thresh = 31457280 -net.ipv4.ipfrag_time = 60 -``` + # Adds "PasswordAuthentication yes" to the /etc/ssh/sshd_config file to allow password login. + echo "PasswordAuthentication yes" >> /etc/ssh/sshd_config -##### System memory + # Modifies the /etc/ssh/sshd_config file, replacing the commented-out "#UseDNS yes" with "UseDNS no" to disable DNS lookups. The match is case-sensitive. + sed -i "s/#UseDNS yes/UseDNS no/g" /etc/ssh/sshd_config -- If the server memory exceeds 64 GB, the following parameters are recommended in the `/etc/sysctl.conf ` configuration file: + # Generates an RSA private key as the gpadmin user without a passphrase, and redirects the output to a null device. + sudo -u gpadmin ssh-keygen -t rsa -N '' -f /home/gpadmin/.ssh/id_rsa <<< $'\n' >/dev/null 2>&1 - ``` - vm.dirty_background_ratio = 0 - vm.dirty_ratio = 0 - vm.dirty_background_bytes = 1610612736 # 1.5GB - vm.dirty_bytes = 4294967296 # 4GB - ``` + # Appends the public key of the gpadmin user to its authorized_keys file. The redirection is performed by the calling shell, so the chown command below corrects the file ownership. + sudo -u gpadmin cat /home/gpadmin/.ssh/id_rsa.pub >> /home/gpadmin/.ssh/authorized_keys -- If the server memory is less than 64 GB, you do not need to set `vm.dirty_background_bytes ` or `vm.dirty_bytes`. It is recommended to set the following parameters in the `/etc/sysctl.conf ` configuration file: + # Changes the ownership of the /home/gpadmin directory to the gpadmin user. + sudo chown -R gpadmin:gpadmin /home/gpadmin/ - ``` - vm.dirty_background_ratio = 3 - vm.dirty_ratio = 10 - ``` - -- To deal with emergency situations when the system is under memory pressure, it is recommended to add the `vm.min_free_kbytes` parameter to the `/etc/sysctl.conf` configuration file to control the amount of available memory reserved by the system. 
It is recommended to set `vm.min_free_kbytes` to 3% of the system's physical memory, with the following command: - - ```bash - awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print "vm.min_free_kbytes =", $2 * .03;}' /proc/meminfo /etc/sysctl.conf + # Uses the ssh-keyscan command to add the host's public key to the current user's known_hosts file. + ssh-keyscan $(hostname) >> ~/.ssh/known_hosts ``` -- The setting of `vm.min_free_kbytes` is not recommended to exceed 5% of the system's physical memory. - -##### Resource limit - -Edit the `/etc/security/limits.conf` file and add the following content, which will limit the amount of hardware and software resources. - -``` -*soft nofile 524288 -*hard nofile 524288 -*soft nproc 131072 -*hard nproc 131072 -``` +### Step 2. Install Cloudberry Database RPM package -##### CORE DUMP - -1. Add the following parameter to the `/etc/sysctl.conf` configuration file: - - ``` - kernel.core_pattern=/var/core/core.%h.%t - ``` - -2. Run the following command to make the configuration effective: +1. Download the Cloudberry Database RPM package to the current directory, for example, `https://github.com/cloudberrydb/cloudberrydb/releases/download/1.5.4/cloudberrydb-1.5.4-1.el7.x86_64.rpm`. Replace the download address in the command with the address of the package you actually need. ```bash - sysctl -p + wget https://github.com/cloudberrydb/cloudberrydb/releases/download/1.5.4/cloudberrydb-1.5.4-1.el7.x86_64.rpm ``` -3. Add the following parameter to `/etc/security/limits.conf`: +2. Install the RPM package. Replace the package name in the command with the name of the package you downloaded. + ```bash + yum install -y cloudberrydb-1.5.4-1.el7.x86_64.rpm ``` - * soft core unlimited - ``` - -##### Set mount options for the XFS file system - -XFS is the file system for the data directory of Cloudberry Database. XFS has the following mount options: -``` -rw,nodev,noatime,inode64 -``` - -You can set up XFS file mounting in the `/etc/fstab` file. 
See the following commands. You need to choose the file path according to the actual situation: - -```bash -mkdir -p /data0/ -mkfs.xfs -f /dev/vdc -echo "/dev/vdc /data0 xfs rw,nodev,noatime,nobarrier,inode64 0 0" /etc/fstab -mount /data0 -chown -R gpadmin:gpadmin /data0/ -``` - -Run the following command to check whether the mounting is successful: - -```bash -df-h -``` - -##### Blockdev value - -The blockdev value for each disk file should be `16384`. To verify the blockdev value of a disk device, use the following command: - -```bash -sudo/sbin/blockdev --getra -``` - -For example, to verify the blockdev value of the example server disk: - -```bash -sudo/sbin/blockdev --getra /dev/vdc -``` - -To modify the blockdev value of a device file, use the following command: - -```bash -sudo/sbin/blockdev --setra -``` - -For example, to modify the file blockdev value of the hard disk of the example server: - -```bash -sudo/sbin/blockdev --setra16384/dev/vdc -``` - -##### I/O scheduling policy settings for disks - -The disk type, operating system, and scheduling policies of Cloudberry Database are as follows: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
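| Storage device type | OS | Recommended scheduling policy |
| --- | --- | --- |
| NVMe | RHEL 7 | none |
| NVMe | RHEL 8 | none |
| NVMe | Ubuntu | none |
| SSD | RHEL 7 | noop |
| SSD | RHEL 8 | none |
| SSD | Ubuntu | none |
| Other | RHEL 7 | deadline |
| Other | RHEL 8 | mq-deadline |
| Other | Ubuntu | mq-deadline |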
- -Refer to the following command to modify the scheduling policy. Note that this command is only a temporary modification, and the modification becomes invalid after the server is restarted. - -```bash -echo schedulername>/sys/block//queue/scheduler -``` - -For example, temporarily modify the disk I/O scheduling policy of the example server: - -```bash -echo deadline>/sys/block/vdc/queue/scheduler -``` - -To permanently modify the scheduling policy, use the system utility `grubby`. After using `grubby`, the modification takes effect immediately after you restart the server. The sample command is as follows: - -```bash -grubby --update-kernel=ALL --args="elevator=deadline" -``` - -To view the kernel parameter settings, use the following command: - -```bash -grubby --info=ALL -``` - -##### Disable Transparent Huge Pages (THP) - -You need to disable Transparent Huge Pages (THP), because it reduces database performance. The command is as follows: - -```bash -grubby --update-kernel=ALL --args="transparent_hugepage=never" -``` - -Check the status of THP: - -```bash -cat /sys/kernel/mm/*transparent_hugepage/enabled -``` - -##### Disable IPC object deletion - -Disable IPC object deletion by setting the value of `RemoveIPC` to `no`. You can set this parameter in the `/etc/systemd/logind.conf` file of Cloudberry Database. +### Step 3. Deploy Cloudberry Database with a single computing node -``` -RemoveIPC=no -``` +Use the scripting tool [`gpdemo`](/docs/sys-utilities/db-util-gpdemo.md) to quickly deploy Cloudberry Database. `gpdemo` is included in the RPM package and will be installed in the `GPHOME/bin` directory along with the configuration scripts (gpinitsystem, gpstart, and gpstop). `gpdemo` supports quickly deploying Cloudberry Database with a single computing node. 
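As a hedged illustration of the paragraph above: once the RPM package is installed, the utilities it names (`gpdemo`, `gpinitsystem`, `gpstart`, `gpstop`) should be reachable on `PATH` after `greenplum_path.sh` is sourced. The sketch below is not part of the Cloudberry Database tooling; `check_tools` is a hypothetical helper name introduced here, and it relies only on the POSIX `command -v` builtin to report any tool that cannot be found.

```shell
# check_tools is a hypothetical helper, not a Cloudberry Database utility.
# It prints "missing: <name>" for every command that is not on PATH
# and returns non-zero if at least one command is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Assuming the install path used elsewhere in this guide, it could be run as `source /usr/local/cloudberrydb/greenplum_path.sh && check_tools gpdemo gpinitsystem gpstart gpstop` before invoking `gpdemo`.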
-After disabling it, run the following command to restart the server to make the disabling setting effective: +The following commands create a new directory and run `gpdemo` to deploy a single-computing-node Cloudberry Database cluster. ```bash -service systemd-logind restart -``` +#!/bin/bash -##### SSH connection threshold +# Changes the owner and group of the /usr/local/cloudberrydb directory to gpadmin. +chown gpadmin:gpadmin /usr/local/cloudberrydb -To set the SSH connection threshold, you need to modify the `MaxStartups` and `MaxSessions` parameters in the `/etc/ssh/sshd_config` configuration file. Both of the following writing methods are acceptable. +# Installs specific versions of Python libraries using pip3. +pip3 install psutil==5.7.0 pygresql==5.2 pyyaml==5.3.1 -``` -MaxStartups 200 -MaxSessions 200 -``` +# Switches to the gpadmin user. +su - gpadmin -``` -MaxStartups 10:30:200 -MaxSessions 200 -``` +# Creates a directory named test_gpadmin. +mkdir test_gpadmin -Run the following command to restart the server to make the setting take effect: +# Enters the test_gpadmin directory. +cd test_gpadmin -```bash -service sshd restart -``` +# Sources the Greenplum environment variables. +source /usr/local/cloudberrydb/greenplum_path.sh -##### Clock synchronization +# Runs the gpdemo command to create the Cloudberry Database cluster. +gpdemo -Cloudberry Database requires the clock synchronization to be configured for all hosts, and the clock synchronization service should be started when the host starts. You can choose one of the following synchronization methods: +# Sources the gpdemo environment variables. +source gpdemo-env.sh -- Use the coordinator node's time as the source, and other hosts synchronize the clock of the coordinator node host. -- Synchronize clocks using an external clock source. 
- -The example in this document uses an external clock source for synchronization, that is, adding the following configuration to the `/etc/chrony.conf` configuration file: - -``` -# Use public servers from the pool.ntp.org project. -# Please consider joining the pool (http://www.pool.ntp.org/join.html). -server 0.centos.pool.ntp.org iburst +# Checks the status of Cloudberry Database. +gpstate -s ``` -After setting, you can run the following command to check the clock synchronization status: +### Step 4. Connect to Cloudberry Database -```bash -systemctl status chronyd -``` +1. Connect to Cloudberry Database. -### Step 2. Install Cloudberry Database - -1. Download the RPM package to the home directory of `gpadmin`. - - ```bash - wget -P /home/gpadmin + ```sql + psql -p 7000 postgres ``` -2. Install the RPM package in the `/home/gpadmin` directory. - - When running the following command, you need to replace `` with the actual RPM package path, as the `root` user. During the installation, the directory `/usr/local/cloudberry-db/` is automatically created. +2. View the information of active segments. - ```bash - cd /home/gpadmin - yum install - ``` - -3. Grant the `gpadmin` user the permission to access the `/usr/local/cloudberry-db/` directory. - - ```bash - chown -R gpadmin:gpadmin /usr/local - chown -R gpadmin:gpadmin /usr/local/cloudberry* - ``` - -4. Configure local SSH connection for the node. As the `gpadmin ` user, perform the following operations: - - ```bash - ssh-keygen - ssh-copy-id localhost - ssh `hostname` # Makes sure that the local SSH connection works well. + ```sql + postgres=# select * from gp_segment_configuration; ``` -### Step 3. Deploy Cloudberry Database with a single computing node - -Use the scripting tool [`gpdemo`](/docs/sys-utilities/db-util-gpdemo.md) to quickly deploy Cloudberry Database. 
`gpdemo` is included in the RPM package and will be installed in the `GPHOME/bin` directory along with the configuration scripts (gpinitsystem, gpstart, and gpstop). `gpdemo` supports quickly deploying Cloudberry Database with a single computing node. - -In the above [setting mount options for the XFS file system](#set-mount-options-for-the-xfs-file-system), the XFS file system's data directory is mounted on `/data0`. The following commands deploy a single-computing-node cluster in this data directory: - -```bash -cd /data0 -NUM_PRIMARY_MIRROR_PAIRS=0 gpdemo # Uses gpdemo -``` - -When `gpdemo` is running, a warning will be output `[WARNING]: -SinglenodeMode has been enabled, no segment will be created.`, which indicates that Cloudberry Database is currently being deployed in the single-computing-node mode. - ## Common issues ### How to check the deployment mode of a cluster diff --git a/docs/releases/release-1.5.4.md b/docs/releases/release-1.5.4.md index 79c0a4ba9..68d79e3bb 100644 --- a/docs/releases/release-1.5.4.md +++ b/docs/releases/release-1.5.4.md @@ -12,6 +12,10 @@ Quick try: [v1.5.4](https://github.com/cloudberrydb/cloudberrydb/releases/tag/1. Full Changelog: [https://github.com/cloudberrydb/cloudberrydb/compare/1.5.3...1.5.4](https://github.com/cloudberrydb/cloudberrydb/compare/1.5.3...1.5.4) +:::caution +The v1.5.4 installation package can be used to deploy Cloudberry Database only with a single computing node. See [Deploy with a Single Computing Node](/docs/deploy-cbdb-with-single-node.md). It cannot be used for multi-node deployment. 
+::: + ## Improvements - Add the `cbdb_relation_size` function by [@fanfuxiaoran](https://github.com/fanfuxiaoran) in [#428](https://github.com/cloudberrydb/cloudberrydb/pull/428) diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/cbdb-op-deploy-guide.md b/i18n/zh/docusaurus-plugin-content-docs/current/cbdb-op-deploy-guide.md index a9ffe503a..c128d9b3d 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/cbdb-op-deploy-guide.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/cbdb-op-deploy-guide.md @@ -1,10 +1,14 @@ --- -title: 通过 RPM 包手动部署 +title: 多计算节点部署 --- -# 通过 RPM 包在物理机上手动部署 Cloudberry Database +# 在多计算节点上部署 Cloudberry Database -本文档介绍如何通过 RPM 包在物理机上安装与部署 Cloudberry Database。在阅读本文前,建议先阅读[软硬件配置需求](/i18n/zh/docusaurus-plugin-content-docs/current/cbdb-op-software-hardware.md)和[物理机部署前准备工作](/i18n/zh/docusaurus-plugin-content-docs/current/cbdb-op-prepare-to-deploy.md)。 +本文档介绍如何通过 RPM 包在多计算节点的物理机上安装与部署 Cloudberry Database。在阅读本文前,建议先阅读[软硬件配置需求](/i18n/zh/docusaurus-plugin-content-docs/current/cbdb-op-software-hardware.md)和[物理机部署前准备工作](/i18n/zh/docusaurus-plugin-content-docs/current/cbdb-op-prepare-to-deploy.md)。 + +:::warning 警告 +本文档介绍的方法仅适用于通过 RPM 包部署 Cloudberry Database v1.0.0,不适用于部署更高版本的 Cloudberry Database。 +::: 本文所介绍的部署方法可用于生产环境。 diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/deploy-cbdb-with-single-node.md b/i18n/zh/docusaurus-plugin-content-docs/current/deploy-cbdb-with-single-node.md index c0a5e3e21..6f9c12bfe 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/deploy-cbdb-with-single-node.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/deploy-cbdb-with-single-node.md @@ -1,441 +1,146 @@ --- -title: 单计算节点模式部署 +title: 单计算节点部署 --- -# 单计算节点模式部署 Cloudberry Database(引入自 v1.5.0 版本) +# 单计算节点模式部署 Cloudberry Database Cloudberry Database 与 PostgreSQL 并不完全兼容,部分功能和语法都是专有的。如果用户业务已经依赖 Cloudberry Database,想在单节点上使用 Cloudberry Database 特有的语法和功能,规避与 PostgreSQL 的兼容性问题,那么可以使用这种单计算节点的部署方式。 自 v1.5.0 起,Cloudberry Database 
提供这一单计算节点的部署模式。该模式在 `utility` gp_role 下运行,仅有一个 coordinator (QD) 和一个 coordinator standby 节点,没有 segment 节点和数据分布。用户可以直接连接到 coordinator 并执行查询,就像连接的是一个正常的多节点集群一样。注意,由于没有数据分布,一些 SQL 语句在单计算节点部署下没有效果,还有一些 SQL 语句不受支持。具体可见最后一节[用户行为变更](#用户行为变更)。 -## 部署方法 - -### 第 1 步:部署前准备 - -使用 root 用户登录每台主机,按照以下章节的顺序,依次修改各节点主机的设置。 - -#### 新增 `gpadmin` 管理用户 - -参考以下示例,创建用户组和用户名 `gpadmin`,将用户组和用户名的标识号设为 `520`,创建并指定主目录 `/home/gpadmin/`。 - -```bash -groupadd -g 520 gpadmin # _添加用户组 gpadmin_ -useradd -g 520 -u 520 -m -d /home/gpadmin/ -s /bin/bash gpadmin # _添加用户名 gpadmin 并创建主目录。_ -passwd gpadmin # _为 gpadmin 设置密码,执行后,按照提示输出密码。_ -``` - -#### 禁用 SELinux 和防火墙软件 - -执行 `systemctl status firewalld` 查看防火墙状态。如果防火墙处于开启状态,你需要关闭防火墙,即在 `/etc/selinux/config` 文件中将 `SELINUX` 参数设为 `disabled`。 - -```bash -SELINUX=disabled -``` - -你还可以使用以下命令禁用防火墙: - -```bash -systemctl stop firewalld.service -systemctl disable firewalld.service -``` +在阅读本文前,建议先阅读[软硬件配置需求](/i18n/zh/docusaurus-plugin-content-docs/current/cbdb-op-software-hardware.md)和[物理机部署前准备工作](/i18n/zh/docusaurus-plugin-content-docs/current/cbdb-op-prepare-to-deploy.md)。 -#### 设置系统参数 +:::warning 警告 +本文档介绍的方法仅适用于通过 RPM 包部署 Cloudberry Database v1.5.4,不适用于部署此前版本的 Cloudberry Database。 +::: -编辑 `/etc/sysctl.conf` 配置文件,在配置文件中添加相关系统参数,并执行 `sysctl -p` 命令让配置文件生效。 - -以下配置参数仅供参考,请按实际需要进行设置。下文介绍了其中一些配置参数的详细信息以及推荐设置。 - -```bash -kernel.shmall = _PHYS_PAGES / 2 -kernel.shmall = 197951838 -kernel.shmmax = kernel.shmall * PAGE_SIZE -kernel.shmmax = 810810728448 -kernel.shmmni = 4096 -vm.overcommit_memory = 2 -vm.overcommit_ratio = 95 -net.ipv4.ip_local_port_range = 10000 65535 -kernel.sem = 250 2048000 200 8192 -kernel.sysrq = 1 -kernel.core_uses_pid = 1 -kernel.msgmnb = 65536 -kernel.msgmax = 65536 -kernel.msgmni = 2048 -net.ipv4.tcp_syncookies = 1 -net.ipv4.conf.default.accept_source_route = 0 -net.ipv4.tcp_max_syn_backlog = 4096 -net.ipv4.conf.all.arp_filter = 1 -net.ipv4.ipfrag_high_thresh = 41943040 -net.ipv4.ipfrag_low_thresh = 31457280 -net.ipv4.ipfrag_time = 60 
-net.core.netdev_max_backlog = 10000 -net.core.rmem_max = 2097152 -net.core.wmem_max = 2097152 -vm.swappiness = 10 -vm.zone_reclaim_mode = 0 -vm.dirty_expire_centisecs = 500 -vm.dirty_writeback_centisecs = 100 -vm.dirty_background_ratio = 0 -vm.dirty_ratio = 0 -vm.dirty_background_bytes = 1610612736 -vm.dirty_bytes = 4294967296 -``` - -##### 共享内存设置 +## 部署方法 -在 `/etc/sysctl.conf` 配置文件中: +### 第 1 步:部署前准备 -- `kernel.shmall` 表示可用共享内存的总量,单位是页。`kernel.shmmax` 表示单个共享内存段的最大值,以字节为单位。 +1. 按顺序依次执行以下命令,以配置运行环境: - 你可以使用操作系统的 `_PHYS_PAGES` 和 `PAGE_SIZE` 两个参数来定义这两个值: + ```shell + # 安装 EPEL 仓库 + yum install -y epel-release - ```bash - kernel.shmall = ( _PHYS_PAGES / 2) - kernel.shmmax = ( _PHYS_PAGES / 2) * PAGE_SIZE - ``` + # 将 /usr/local/lib 和 /usr/local/lib64 目录添加到 ld.so.conf 文件中,以便系统能够找到这些目录下的库文件 + echo -e "/usr/local/lib \n/usr/local/lib64" >> /etc/ld.so.conf - 要获取这两个操作系统参数的值,你可以使用 `getconf` ,示例如下: + # 将 /usr/lib 和 /usr/lib64 目录添加到 ld.so.conf 文件中,以便系统能够找到这些目录下的库文件 + echo -e "/usr/lib \n/usr/lib64" >> /etc/ld.so.conf - ```bash - $ echo $(expr $(getconf _PHYS_PAGES) / 2) - $ echo $(expr $(getconf _PHYS_PAGES) / 2 \$(getconf PAGE_SIZE)) + # 重新加载动态库缓存,使系统能够识别新增的库目录 + ldconfig ``` -- `vm.overcommit_memory` 是一个 Linux 内核参数,表示系统可分配给某进程的内存大小。将 `vm.overcommit_memory` 设置为 `2`,表示当系统分配的内存超过 2 GB 时,系统会拒绝该操作。 -- `vm.overcommit_ratio` 是一个内核参数,是应用进程占用 RAM 的百分比。在 CentOS 上默认值为 `50`。`vm.overcommit_ratio` 的计算公式如下: +2. 
按顺序依次执行以下命令,为 `gpadmin` 用户配置免密认证。 ```bash - vm.overcommit_ratio = (RAM - 0.026 * gp_vmem) / RAM - ``` + #!/bin/bash -其中 `gp_vmem` 的计算方法如下: + # 创建一个名为 gpadmin 的组 + /usr/sbin/groupadd gpadmin - ```bash - # 如果系统内存低于 256 GB, 使用如下公式计算: - gp_vmem = ((SWAP + RAM) – (7.5GB + 0.05 * RAM)) / 1.7 + # 创建一个名为 gpadmin 的用户,将其添加到 gpadmin 组和 wheel 组 + /usr/sbin/useradd gpadmin -g gpadmin -G wheel - # 如果系统内存大于等于 256 GB, 使用如下公式计算: - gp_vmem = ((SWAP + RAM) – (7.5GB + 0.05 * RAM)) / 1.17 + # 设置 gpadmin 用户的密码为 "cbdb@123" + echo "cbdb@123"|passwd --stdin gpadmin - # 以上公式中,SWAP 是主机上的交换空间,以 GB 为单位。 - # RAM 是主机上安装的内存大小,以 GB 为单位。 - ``` + # 添加 gpadmin 用户到 /etc/sudoers 文件,赋予其无密码执行所有命令的权限 + echo "gpadmin ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers -##### IP 分段设置 + # 添加 root 用户到 /etc/sudoers 文件,赋予其无密码执行所有命令的权限 + echo "root ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers -当 Cloudberry Database 内部连接使用 UDP 协议,网卡会控制 IP 数据包的分段和重组。如果 UDP 消息的大小大于网络最大传输单元 (MTU) 的大小,IP 层会对消息进行分段。 + # 生成 root 用户的 RSA 私钥,不设置密码,并将输出重定向到空设备 + ssh-keygen -t rsa -N '' -f /root/.ssh/id_rsa <<< $'\n' >/dev/null 2>&1 -- `net.ipv4.ipfrag_high_thresh`:当 IP 分片的总大小超过该阈值时,内核将尝试对 IP 分片进行重组。如果分片超过了这个阈值,但全部片段在规定的时间内仍未到达,内核则不会重组这些分片。该阈值通常用于控制是否对较大的分片进行重组。默认值为 `4194304` 字节(即 4 MB)。 -- `net.ipv4.ipfrag_low_thresh`:表示当 IP 分片的总大小低于该阈值时,内核将尽可能地等待更多分片到达,以便进行更大的重组。这个阈值的目的是尽量减少未完成的重组操作,以提高系统性能。默认值为 `3145728` 字节(3 MB)。 -- `net.ipv4.ipfrag_time` 是一个控制 IP 分片重组超时时间的内核参数,默认值是 `30`。 + # 在 /etc/ssh/sshd_config 文件中添加 "PasswordAuthentication yes",允许密码登录 + echo "PasswordAuthentication yes" >> /etc/ssh/sshd_config -推荐将以上参数设为如下值: + # 修改 /etc/ssh/sshd_config 文件,将 "UseDNS YES" 替换为 "UseDNS no",禁用 DNS 查询 + sed -i "s/#UseDNS YES/UseDNS no/g" /etc/ssh/sshd_config -``` -net.ipv4.ipfrag_high_thresh = 41943040 -net.ipv4.ipfrag_low_thresh = 31457280 -net.ipv4.ipfrag_time = 60 -``` + # 以 gpadmin 用户身份生成 RSA 私钥,不设置密码,并将输出重定向到空设备 + sudo -u gpadmin ssh-keygen -t rsa -N '' -f /home/gpadmin/.ssh/id_rsa <<< $'\n' >/dev/null 2>&1 -##### 系统内存 + # 以 gpadmin 用户身份将 gpadmin 用户的公钥添加到其 
authorized_keys 文件 + sudo -u gpadmin cat /home/gpadmin/.ssh/id_rsa.pub >> /home/gpadmin/.ssh/authorized_keys -- 如果服务器内存超过 64 GB,建议在 `/etc/sysctl.conf` 配置文件中进行如下参数设置: + # 将 /home/gpadmin 目录的所有权修改为 gpadmin 用户 + sudo chown -R gpadmin:gpadmin /home/gpadmin/ - ``` - vm.dirty_background_ratio = 0 - vm.dirty_ratio = 0 - vm.dirty_background_bytes = 1610612736 # 1.5GB - vm.dirty_bytes = 4294967296 # 4GB + # 使用 ssh-keyscan 命令将主机的公钥添加到当前用户的 known_hosts 文件 + ssh-keyscan $(hostname) >> ~/.ssh/known_hosts ``` -- 如果服务器内存低于 64 GB,则不需要设置 `vm.dirty_background_bytes` 和 `vm.dirty_bytes`,建议在 `/etc/sysctl.conf` 配置文件中进行如下参数设置: - - ``` - vm.dirty_background_ratio = 3 - vm.dirty_ratio = 10 - ``` +### 第 2 步:通过 RPM 包安装 Cloudberry Database -- 为了应对系统出现内存压力时的紧急情况,建议在 `/etc/sysctl.conf` 配置文件中新增 `vm.min_free_kbytes` 参数,用于控制系统保留的可用内存量。建议将 `vm.min_free_kbytes` 设置为系统物理内存的 3%,命令如下: +1. 下载 Cloudberry Database 的 RPM 安装包至当前目录。例如下载 `https://github.com/cloudberrydb/cloudberrydb/releases/download/1.5.4/cloudberrydb-1.5.4-1.el7.x86_64.rpm`。你需要将命令中的下载地址替换为实际的安装包地址。 ```bash - awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print "vm.min_free_kbytes =", $2 * .03;}' /proc/meminfo /etc/sysctl.conf - ``` - -- `vm.min_free_kbytes` 的设置不建议超过系统物理内存的 5%。 - -##### 资源限制设置 - -编辑 `/etc/security/limits.conf` 文件并添加如下内容,这将对软硬件资源用量进行限制。 - -``` -*soft nofile 524288 -*hard nofile 524288 -*soft nproc 131072 -*hard nproc 131072 -``` - -##### 核心转储(CORE DUMP)设置 - -1. 添加以下参数至 `/etc/sysctl.conf` 配置文件: - - ``` - kernel.core_pattern=/var/core/core.%h.%t + wget https://github.com/cloudberrydb/cloudberrydb/releases/download/1.5.4/cloudberrydb-1.5.4-1.el7.x86_64.rpm ``` -2. 执行以下命令使配置生效: +2. 安装 RPM 包。你需要将命令中的安装包名替换为实际的安装包名。 ```bash - sysctl -p + yum install -y cloudberrydb-1.5.4-1.el7.x86_64.rpm ``` -3. 
添加以下参数至 `/etc/security/limits.conf`: - - ``` - soft core unlimited - ``` - -##### 为 XFS 文件系统设置挂载选项 - -XFS 是 Cloudberry Database 数据目录的文件系统,XFS 使用以下选项进行挂载: - -``` -rw,nodev,noatime,inode64 -``` - -你可以在 `/etc/fstab` 文件中设置 XFS 文件挂载,参考如下命令。你需要根据实际情况选择文件路径: - -```bash -mkdir -p /data0/ -mkfs.xfs -f /dev/vdc -echo "/dev/vdc /data0 xfs rw,nodev,noatime,nobarrier,inode64 0 0" /etc/fstab -mount /data0 -chown -R gpadmin:gpadmin /data0/ -``` - -执行以下命令查看挂载是否成功: - -```bash -df -h -``` - -##### 预读值设置 - -每个磁盘设备文件的预读 (blockdev) 值应该是 `16384`。要验证磁盘设备的预读取值,你可以使用以下命令: - -```bash -sudo /sbin/blockdev --getra -``` - -例如,验证本文示例服务器硬盘的文件预读值: - -```bash -sudo /sbin/blockdev --getra /dev/vdc -``` - -要修改设备文件的预读值,你可以使用以下命令: - -```bash -sudo /sbin/blockdev --setra -``` - -例如,修改本文档服务器硬盘的文件预读值: - -```bash -sudo /sbin/blockdev --setra 16384 /dev/vdc -``` - -##### 磁盘的 I/O 调度策略设置 - -Cloudberry Database 的磁盘类型、操作系统以及调度策略如下: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
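| 存储设备类型 | OS | 推荐的调度策略 |
| --- | --- | --- |
| NVMe | RHEL 7 | none |
| NVMe | RHEL 8 | none |
| NVMe | Ubuntu | none |
| SSD | RHEL 7 | noop |
| SSD | RHEL 8 | none |
| SSD | Ubuntu | none |
| 其他 | RHEL 7 | deadline |
| 其他 | RHEL 8 | mq-deadline |
| 其他 | Ubuntu | mq-deadline |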
- -参考以下命令修改调度策略。注意,该命令仅为临时修改,服务器重启后,修改将失效。 - -```bash -echo schedulername /sys/block/ - ``` - -2. 在 `/home/gpadmin` 目录下安装 RPM 包。 +1. 连接到 Cloudberry Database。 - 执行以下命令时,你需要将 `` 替换为实际的安装包路径,并使用 `root` 用户执行。安装时,会自动创建默认安装目录 `/usr/local/cloudberry-db/`。 - - ```bash - cd /home/gpadmin - yum install - ``` - -3. 为 `gpadmin` 用户授予安装目录的权限: - - ```bash - chown -R gpadmin:gpadmin /usr/local - chown -R gpadmin:gpadmin /usr/local/cloudberry* + ```sql + psql -p 7000 postgres ``` -4. 配置节点的本地 SSH 登录。在 `gpadmin` 用户下: +2. 查看活动 Segment 的信息。 - ```bash - ssh-keygen - ssh-copy-id localhost - ssh `hostname` # 确认本地 SSH 登录能正常工作 + ```sql + postgres=# select * from gp_segment_configuration; ``` -## 第 3 步:部署单计算节点的 Cloudberry Database - -使用脚本工具 [`gpdemo`](/i18n/zh/docusaurus-plugin-content-docs/current/sys-utilities/db-util-gpdemo.md) 快速部署 Cloudberry Database。`gpdemo` 包含在 RPM 包中,将随配置脚本(gpinitsystem、gpstart、gpstop 等)一并安装到 `GPHOME/bin` 目录下,支持快捷部署无 Segment 节点的 Cloudberry Database。 - -在上面[为 XFS 文件系统设置挂载选项](#为-xfs-文件系统设置挂载选项)中,XFS 文件系统的数据目录挂载在了 `/data0` 上。以下指令在该数据目录下部署一个单计算节点集群: - -```bash -cd /data0 -NUM_PRIMARY_MIRROR_PAIRS=0 gpdemo # 使用 gpdemo 工具 -``` - -在 `gpdemo` 的执行过程中,会输出一条新的警告 `[WARNING]:-SinglenodeMode has been enabled, no segment will be created.`,这表示当前正以单计算节点模式部署 Cloudberry Database。 - ## 常见问题 ### 如何确认集群的部署模式 diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/releases/release-1.5.4.md b/i18n/zh/docusaurus-plugin-content-docs/current/releases/release-1.5.4.md index 5e80a1327..7388ad99b 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/releases/release-1.5.4.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/releases/release-1.5.4.md @@ -12,6 +12,10 @@ Cloudberry Database v1.5.4 是一个小版本,包含了一些提升改进、 完整的变更日志:[https://github.com/cloudberrydb/cloudberrydb/compare/1.5.3...1.5.4](https://github.com/cloudberrydb/cloudberrydb/compare/1.5.3...1.5.4) +:::caution 告示 +v1.5.4 提供的 RPM 
包仅适用于在单计算节点上部署,参见[在单计算节点上部署集群](/i18n/zh/docusaurus-plugin-content-docs/current/deploy-cbdb-with-single-node.md)。该安装包不能用于多节点部署。 +::: + ## 提升改进 - 添加 `cbdb_relation_size` 函数 [#428](https://github.com/cloudberrydb/cloudberrydb/pull/428) by [@fanfuxiaoran](https://github.com/fanfuxiaoran)