Add postgres (#15)
* add postgres
* update ci
* update gpdb code
haobibo authored Mar 14, 2024
1 parent 429735b commit e937761
Showing 11 changed files with 233 additions and 97 deletions.
87 changes: 87 additions & 0 deletions .github/workflows/build-docker.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,87 @@
name: build-docker-images

on:
  push:
    branches: [ main ]
    paths-ignore:
      - "*.md"

  pull_request:
    branches: [ main ]
    paths-ignore:
      - "*.md"

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

env:
  REGISTRY_URL: "docker.io"   # docker.io or other registry URL, DOCKER_REGISTRY_USERNAME/DOCKER_REGISTRY_PASSWORD to be set in CI env.
  BUILDKIT_PROGRESS: "plain"  # Full logs for CI build.

  # DOCKER_REGISTRY_USERNAME and DOCKER_REGISTRY_PASSWORD are required for docker image push; they should be set in CI secrets.
  DOCKER_REGISTRY_USERNAME: ${{ secrets.DOCKER_REGISTRY_USERNAME }}
  DOCKER_REGISTRY_PASSWORD: ${{ secrets.DOCKER_REGISTRY_PASSWORD }}

  # used to sync image to mirror registry
  DOCKER_MIRROR_REGISTRY_USERNAME: ${{ secrets.DOCKER_MIRROR_REGISTRY_USERNAME }}
  DOCKER_MIRROR_REGISTRY_PASSWORD: ${{ secrets.DOCKER_MIRROR_REGISTRY_PASSWORD }}

jobs:
  qpod_bigdata:
    name: "bigdata"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: |
          source ./tool.sh
          build_image bigdata latest docker_bigdata/Dockerfile && push_image
  qpod_elasticsearch:
    name: "elasticsearch"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: |
          source ./tool.sh
          build_image elasticsearch latest docker_elasticsearch/Dockerfile && push_image
  qpod_kafka_confluent:
    name: "kafka"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: |
          source ./tool.sh
          build_image kafka latest docker_kafka_confluent/Dockerfile && push_image
  qpod_greenplum:
    name: "greenplum"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: |
          source ./tool.sh
          build_image greenplum latest docker_greenplum/Dockerfile && push_image
  qpod_postgres:
    name: "postgres"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: |
          source ./tool.sh
          build_image postgres latest docker_postgres/postgres-ext.Dockerfile && push_image
  ## Sync all images in this build (listed by "names") to mirror registry.
  sync_images:
    needs: ["qpod_bigdata", "qpod_elasticsearch", "qpod_kafka_confluent", "qpod_postgres", "qpod_greenplum"]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: |
          source ./tool.sh
          printenv > /tmp/docker.env
          docker run --rm \
            --env-file /tmp/docker.env \
            -v $(pwd):/tmp \
            -w /tmp \
            qpod/docker-kit python /opt/utils/image-syncer/run_jobs.py
56 changes: 0 additions & 56 deletions .github/workflows/docker.yml

This file was deleted.

13 changes: 6 additions & 7 deletions README.md
@@ -1,18 +1,17 @@
# QPod Data Lab - Docker Image Stack
# QPod Data Lab - Docker Image Stack for BigData

[![License](https://img.shields.io/badge/License-BSD%203--Clause-green.svg)](https://opensource.org/licenses/BSD-3-Clause)
[![GitHub Workflow Status](https://img.shields.io/github/actions/workflow/status/QPod/data-lab/docker.yml?branch=main)](https://github.com/QPod/data-lab/actions/workflows/docker.yml)
[![Join the Gitter Chat](https://img.shields.io/gitter/room/nwjs/nw.js.svg)](https://gitter.im/QPod/)
[![Docker Pulls](https://img.shields.io/docker/pulls/qpod/qpod.svg)](https://hub.docker.com/r/qpod/qpod)
[![Docker Starts](https://img.shields.io/docker/stars/qpod/qpod.svg)](https://hub.docker.com/r/qpod/qpod)
[![Recent Code Update](https://img.shields.io/github/last-commit/QPod/data-lab.svg)](https://github.com/QPod/data-lab/stargazers)
[![GitHub Workflow Status](https://img.shields.io/github/actions/workflow/status/QPod/lab-data/docker.yml?branch=main)](https://github.com/QPod/lab-data/actions/workflows/docker.yml)
[![Recent Code Update](https://img.shields.io/github/last-commit/QPod/lab-data.svg)](https://github.com/QPod/lab-data/stargazers)

Please generously STAR★ our project or donate to us! [![GitHub Starts](https://img.shields.io/github/stars/QPod/data-lab.svg?label=Stars&style=social)](https://github.com/QPod/data-lab/stargazers)
[![Donate-PayPal](https://img.shields.io/badge/Donate-PayPal-blue.svg)](https://paypal.me/haobibo)
[![Donate-AliPay](https://img.shields.io/badge/Donate-Alipay-blue.svg)](https://raw.githubusercontent.com/wiki/haobibo/resources/img/Donate-AliPay.png)
[![Donate-WeChat](https://img.shields.io/badge/Donate-WeChat-green.svg)](https://raw.githubusercontent.com/wiki/haobibo/resources/img/Donate-WeChat.png)

[Wiki & Document](https://github.com/QPod/docker-images/wiki) | [中文使用指引(含中国地区镜像)](https://github.com/QPod/docker-images/wiki/QPod%E4%B8%AD%E6%96%87%E6%8C%87%E5%BC%95)
Discussion and contributions are welcome:
[![Join the Discord Chat](https://img.shields.io/badge/Discuss_on-Discord-green)](https://discord.gg/kHUzgQxgbJ)
[![Open an Issue on GitHub](https://img.shields.io/github/issues/QPod/lab-data)](https://github.com/QPod/lab-data/issues)

## Building blocks for data lake and pipelines projects

3 changes: 2 additions & 1 deletion docker_bigdata/Dockerfile
@@ -20,6 +20,7 @@ RUN source /opt/utils/script-setup.sh \
&& echo "Install mysql client:" && setup_mysql_client \
&& echo "Install mongosh:" && setup_mongosh_client \
&& echo "Install redis-cli:" && setup_redis_client \
&& echo "Install pyflink:" && install_pip /opt/utils/list_install_pip_pyflink.txt \
&& echo "Install pyspark:" && install_pip /opt/utils/list_install_pip_pyspark.txt \
&& echo "Install pyflink:" && install_pip /opt/utils/list_install_pip_pyflink.txt \
&& pip install --no-deps apache-flink \
&& echo "Clean up" && list_installed_packages && install__clean
13 changes: 5 additions & 8 deletions docker_bigdata/work/list_install_pip_pyflink.txt
@@ -1,12 +1,9 @@
% from: https://github.com/apache/flink/blob/master/flink-python/setup.py
apache-flink
pemja
pandas
pyarrow
apache-beam
cloudpickle
avro-python3
requests
%py4j==0.10.9.3
%apache-beam==2.38.0
%cloudpickle==2.1.0
%avro-python3>=1.8.1,!=1.9.2,<1.10.0
%fastavro>=1.1.0,<1.4.8
%protobuf<3.18
%pemja==0.2.4
% apache-flink % temp fix: installed separately without deps
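
The list file above uses `%` to mark comments and disabled pins. The real `install_pip` helper is not shown in this commit, so the following is only a sketch of how such a list could be reduced to package names. `list_packages` is a hypothetical function, and treating everything from the first `%` to the end of the line as a comment is an assumption.

```bash
#!/usr/bin/env bash
# Sketch only: list_packages is a hypothetical helper, not the real install_pip.
# Assumption: everything from the first "%" to the end of the line is a comment.

list_packages() {
  # Drop comments, trim surrounding whitespace, then skip empty lines.
  sed -e 's/%.*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$1" | grep -v '^$'
}

# Made-up sample list mirroring the style of list_install_pip_pyflink.txt.
demo_list="$(mktemp)"
printf '%s\n' '% from: setup.py' 'pandas' 'pyarrow' '%pemja==0.2.4' > "$demo_list"
list_packages "$demo_list"
rm -f "$demo_list"
```

Under that assumption, commented-out pins such as `%pemja==0.2.4` never reach `pip`, which is why `apache-flink` has to be installed by the separate `pip install --no-deps` step.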
55 changes: 55 additions & 0 deletions docker_elasticsearch/solr.Dockerfile
@@ -0,0 +1,55 @@
FROM ubuntu:latest

LABEL maintainer="[email protected]"

USER root

ENV APACHE_DIST="http://archive.apache.org/dist" \
MAVEN_VERSION="3.6.3" \
SOLR_VERSION="8.3.1" \
SOLR_HOME="/data/solr" \
SOLR_LIB_DIR="/data/solr/.lib" \
SOLR_SERVER_LIB="/opt/solr/server/solr-webapp/webapp/WEB-INF/lib" \
PATH="/opt/solr/bin:/opt/maven/bin:$PATH"

RUN mkdir -p $SOLR_HOME $SOLR_LIB_DIR \
&& apt-get -y update --fix-missing && apt-get -y upgrade \
&& apt-get -qq install -y --no-install-recommends wget unzip lsof openjdk-11-jdk-headless \
&& apt-get autoremove && apt-get clean && rm -rf /var/lib/apt/lists/* \
&& install_zip() { wget -nv $1 -O /tmp/TMP.zip && unzip -q /tmp/TMP.zip -d /opt/ && rm /tmp/TMP.zip ; } \
&& install_zip "${APACHE_DIST}/maven/maven-3/${MAVEN_VERSION}/binaries/apache-maven-${MAVEN_VERSION}-bin.zip" && mv /opt/apache-maven-${MAVEN_VERSION} /opt/maven \
&& install_zip "${APACHE_DIST}/lucene/solr/${SOLR_VERSION}/solr-${SOLR_VERSION}.zip" && mv /opt/solr-${SOLR_VERSION} /opt/solr \
&& sed -i -e '/-Dsolr.clustering.enabled=true/ a SOLR_OPTS="$SOLR_OPTS -Denable.runtime.lib=true -Dsun.net.inetaddr.ttl=60 -Dsun.net.inetaddr.negative.ttl=60"' /opt/solr/bin/solr.in.sh \
&& echo 'SOLR_HOME=${SOLR_HOME}' >> /opt/solr/bin/solr.in.sh \
&& echo 'SOLR_PID_DIR=${SOLR_HOME}' >> /opt/solr/bin/solr.in.sh \
&& echo 'SOLR_LOGS_DIR=${SOLR_HOME}/logs' >> /opt/solr/bin/solr.in.sh \
&& echo 'SOLR_LOG_LEVEL=WARN' >> /opt/solr/bin/solr.in.sh \
&& echo '#!/bin/bash' >> /opt/solr/bin/start-solr.sh \
&& echo '[ -f "${SOLR_HOME}/solr.xml" ] || cp -R /opt/solr/server/solr/* ${SOLR_HOME}/' >> /opt/solr/bin/start-solr.sh \
&& echo 'cp -R ${SOLR_LIB_DIR}/*.jar ${SOLR_SERVER_LIB}/' >> /opt/solr/bin/start-solr.sh \
&& echo '/opt/solr/bin/solr start -force -f -c' >> /opt/solr/bin/start-solr.sh \
&& chmod +x /opt/solr/bin/start-solr.sh

RUN mvn_get() { mvn dependency:copy -DlocalRepositoryDirectory="/tmp/m2repo" -DoutputDirectory="${SOLR_SERVER_LIB}" -Djavax.net.ssl.trustStorePassword=changeit -Dartifact="$1"; } \
&& mvn_get "com.janeluo:ikanalyzer:2012_u6" \
&& mvn_get "com.hankcs:hanlp:portable-1.6.3" \
&& mvn_get "com.huaban:jieba-analysis:1.0.2" \
&& rm -Rf /tmp/* /opt/solr/docs/ \
&& ls -alh ${SOLR_SERVER_LIB}

RUN source /opt/utils/script-utils.sh \
&& VERSION_GRADLE="6.5.1" \
&& URL_GRADLE="https://downloads.gradle-dn.com/distributions/gradle-${VERSION_GRADLE}-bin.zip" \
&& install_zip ${URL_GRADLE} && mv /opt/gradle-* /opt/gradle \
&& ln -s /opt/gradle/bin/gradle /usr/bin/ \
&& echo "@ Version of Gradle:" && gradle --version


EXPOSE 8983 9983

WORKDIR /opt/solr

VOLUME /data/solr

ENTRYPOINT ["start-solr.sh"]
CMD ["start-solr.sh"]
13 changes: 8 additions & 5 deletions docker_greenplum/Dockerfile
@@ -5,11 +5,13 @@ FROM ${BASE_NAMESPACE:+$BASE_NAMESPACE/}${BASE_IMG} AS builder

COPY rootfs /

RUN VERSION_GPDB_RELEASE="7.0.0-beta.3" \
&& source /opt/utils/script-utils.sh \
RUN set -x && source /opt/utils/script-utils.sh \
&& install_apt /opt/utils/install_list_greenplum.apt \
&& apt-get -qq install -yq --no-install-recommends gcc g++ bison flex cmake pkg-config ccache ninja-build \
&& install_tar_gz https://github.com/greenplum-db/gpdb/releases/download/${VERSION_GPDB_RELEASE}/${VERSION_GPDB_RELEASE}-src-full.tar.gz \
&& VERSION_GPDB_RELEASE=$(curl -sL https://github.com/greenplum-db/gpdb/releases.atom | grep 'releases/tag' | grep "7." | head -1 | grep -Po '\d[\d.]+' ) \
&& URL_GPDB="https://github.com/greenplum-db/gpdb/releases/download/${VERSION_GPDB_RELEASE}/${VERSION_GPDB_RELEASE}-src-full.tar.gz" \
&& echo "Downloading GPDB src release ${VERSION_GPDB_RELEASE} from: ${URL_GPDB}" \
&& install_tar_gz $URL_GPDB \
&& cd /opt/gpdb_src \
&& PYTHON=/opt/conda/bin/python3 ./configure --prefix=/opt/gpdb --with-perl --with-python --with-libxml --with-gssapi --with-openssl \
&& sudo make -j16 && sudo make install -j16
@@ -25,7 +27,7 @@ ENV GPHOME="/opt/gpdb" \
GPDATA="/data/gpdb" \
GPUSER="gpadmin"

RUN source /opt/utils/script-utils.sh \
RUN set -x && source /opt/utils/script-utils.sh \
&& echo "source ${GPHOME}/greenplum_path.sh" >> /etc/profile \
&& useradd -u 1000 ${GPUSER} -s /bin/bash -d /home/${GPUSER} \
&& usermod -aG root ${GPUSER} \
@@ -48,7 +50,8 @@ RUN source /opt/utils/script-utils.sh \
&& echo "Clean up" && list_installed_packages && install__clean

USER ${GPUSER}
RUN [ -e ~/.ssh/id_rsa.pub ] || ssh-keygen -t rsa -b 4096 -N "" -C GreenplumDB -f ~/.ssh/id_rsa \
RUN set -x && whoami \
&& [ -e ~/.ssh/id_rsa.pub ] || ssh-keygen -t rsa -b 4096 -N "" -C GreenplumDB -f ~/.ssh/id_rsa \
&& cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys \
&& ssh-keygen -A -v \
&& chmod 600 ~/.ssh/authorized_keys \
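
The builder stage above resolves `VERSION_GPDB_RELEASE` at build time by filtering the releases feed. A minimal sketch of that lookup follows, run against made-up sample feed lines instead of the network. `latest_7x_version` and `sample_feed` are hypothetical names, and the sketch swaps `grep -P` for `grep -E` with an escaped dot so that only tags starting with `7.` match.

```bash
#!/usr/bin/env bash
# Sketch: pick the first 7.x tag from a releases feed.
# The feed lines below are made-up sample data, not real GPDB release entries.

latest_7x_version() {
  # Reads feed text on stdin and prints the first 7.x version found.
  grep 'releases/tag' \
    | grep -E '/tag/7\.' \
    | head -n 1 \
    | grep -Eo '[0-9][0-9.]+' \
    | head -n 1
}

sample_feed='<link href="https://github.com/greenplum-db/gpdb/releases/tag/7.1.0"/>
<link href="https://github.com/greenplum-db/gpdb/releases/tag/6.26.4"/>'

printf '%s\n' "$sample_feed" | latest_7x_version
```

With this pattern a pre-release tag such as `7.0.0-beta.3` would be reduced to `7.0.0`, so a download URL built from it may not exist; pinning the version remains the safer fallback when reproducible builds matter.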
7 changes: 4 additions & 3 deletions docker_greenplum/README.md
@@ -1,4 +1,4 @@
# GreenplumDB
# GreenplumDB

This is the Docker container for starting a GreenplumDB 7 cluster.
https://docs.vmware.com/en/VMware-Greenplum/7/greenplum-database/landing-index.html
@@ -25,12 +25,13 @@ psql -d postgres -c "ALTER ROLE gpadmin WITH PASSWORD 'gpadmin';"

Please refer to the file `example/gpdb-single-vm/docker-compose.yml`.
Note: you need to create the folders `primary1` and `primary2` for the segment nodes in `/data/database/greenplum`:
```

```bash
mkdir -pv /data/database/greenplum/primary1
mkdir -pv /data/database/greenplum/primary2
```

# Debug
## Debug

```bash
# to build the docker image
22 changes: 22 additions & 0 deletions docker_postgres/postgres-ext.Dockerfile
@@ -0,0 +1,22 @@
# Distributed under the terms of the Modified BSD License.

ARG BASE_NAMESPACE
ARG BASE_IMG="base"
FROM ${BASE_NAMESPACE:+$BASE_NAMESPACE/}${BASE_IMG} as builder

ARG PG_MAJOR=15
FROM library/postgres:${PG_MAJOR:-latest}

LABEL maintainer="[email protected]"

COPY work /opt/utils/
COPY --from=builder /opt /opt

RUN set -x \
&& apt-get update && apt-get install -y gettext \
apt-utils apt-transport-https ca-certificates gnupg2 dirmngr locales sudo lsb-release curl \
&& envsubst < /opt/utils/install_list_pgext.apt > /tmp/install_list_pgext.apt \
&& mv /tmp/install_list_pgext.apt /opt/utils/install_list_pgext.apt \
&& . /opt/utils/script-utils.sh \
&& install_apt /opt/utils/install_list_base.apt \
&& install_apt /opt/utils/install_list_pgext.apt \
&& echo "Clean up" && list_installed_packages && install__clean
11 changes: 11 additions & 0 deletions docker_postgres/work/install_list_pgext.apt
@@ -0,0 +1,11 @@
postgresql-contrib
postgresql-${PG_MAJOR}-postgis*
postgresql-${PG_MAJOR}-pgvector
postgresql-${PG_MAJOR}-cron
postgresql-${PG_MAJOR}-wal2json

% https://packagecloud.io/citusdata/community/${distro_name}/ ${distro_codename} main
% https://packagecloud.io/timescale/timescaledb/${distro_name}/ ${distro_codename} main
% https://apt.postgresml.org ${distro_codename} maintainer
% timescaledb-2-postgresql-${PG_MAJOR}
% postgresql-${PG_MAJOR}-citus-12.1
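
The `${PG_MAJOR}` placeholders in this list are meant to be expanded before `install_apt` reads the file. A minimal sketch of rendering such a template safely is shown below: the output goes to a temporary file that is then moved into place, because redirecting a command onto its own input file truncates that file before it is read. `render_list` is a hypothetical helper, and `sed` stands in for `envsubst` here so the sketch runs without gettext installed.

```bash
#!/usr/bin/env bash
# Sketch: expand ${PG_MAJOR} in a package-list template via a temp file.
# render_list is a hypothetical helper; sed is a stand-in for envsubst.

render_list() {
  local template="$1" major="$2" tmp
  tmp="$(mktemp)"
  # Replace the literal text ${PG_MAJOR} with the chosen major version.
  sed "s/[\$]{PG_MAJOR}/${major}/g" "$template" > "$tmp"
  # Only overwrite the template once the rendered copy is complete.
  mv "$tmp" "$template"
}

# Made-up two-line template in the style of install_list_pgext.apt.
demo="$(mktemp)"
printf '%s\n' 'postgresql-contrib' 'postgresql-${PG_MAJOR}-pgvector' > "$demo"
render_list "$demo" 15
cat "$demo"
```

The same two-step pattern applies with `envsubst` itself: write to a scratch path, then `mv` the result over the original.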