This repository was archived by the owner on Oct 9, 2019. It is now read-only.
25 changes: 15 additions & 10 deletions Dockerfile
@@ -1,8 +1,8 @@
FROM neo4j:3.4.6-enterprise
FROM neo4j:3.5.4-enterprise

RUN apk update && apk add --no-cache --quiet \
e2fsprogs \
curl \
RUN apk add --no-cache \
e2fsprogs \
curl \
zip \
unzip \
python py-pip && \
@@ -11,17 +11,22 @@ RUN apk update && apk add --no-cache --quiet \

# Install plugins
RUN mkdir -p /var/lib/neo4j/plugins
RUN curl -L -s https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/3.4.0.1/apoc-3.4.0.1-all.jar > /var/lib/neo4j/plugins/apoc-3.4.0.1-all.jar
RUN curl -L -s http://central.maven.org/maven2/mysql/mysql-connector-java/6.0.6/mysql-connector-java-6.0.6.jar > /var/lib/neo4j/plugins/mysql-connector-java-6.0.6.jar

COPY docker-entrypoint.sh /docker-entrypoint.sh
ENV NEO4J_APOC_VERSION=3.5.0.2

ADD https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/$NEO4J_APOC_VERSION/apoc-$NEO4J_APOC_VERSION-all.jar /var/lib/neo4j/plugins/apoc-$NEO4J_APOC_VERSION-all.jar
ADD http://central.maven.org/maven2/mysql/mysql-connector-java/6.0.6/mysql-connector-java-6.0.6.jar /var/lib/neo4j/plugins/mysql-connector-java-6.0.6.jar


ENV EXTENSION_SCRIPT=/ecs-extension.sh

COPY ecs-extension.sh ${EXTENSION_SCRIPT}
COPY init_db.sh /init_db.sh

# These were created earlier by the base image, but we don't need them since
# the entrypoint will configure Neo4j to use them if they exist.
RUN rm -rf /var/lib/neo4j/data /var/lib/neo4j/logs
RUN rm -rf ${NEO4J_HOME}/data/ ${NEO4J_HOME}/logs/ ${NEO4J_HOME}/metrics/

EXPOSE 5000 5001 6000 6001 7000

ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["neo4j"]
CMD ["start"]
19 changes: 15 additions & 4 deletions Makefile
@@ -1,16 +1,27 @@
COMMIT=$(shell git rev-parse HEAD)
COMMIT=taras-$(shell git rev-parse HEAD)
DATE=$(shell date +%Y-%m-%d-%H-%M)

.PHONY:
build:
@ echo "Building image..."
@ docker build -t neo .

# Use param REPO, e.g. REPO=xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/neo to specify ECR repository
# Use param REGION, e.g. REGION=us-east-1
# Use param NEO_ECR_REPO, e.g. NEO_ECR_REPO=xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/neo to specify ECR repository
# Use param NEO_AWS_REGION, e.g. NEO_AWS_REGION=us-east-1
.PHONY:
push_image: build
@ echo "Pushing image based on last commit $(COMMIT)"
@ $(shell aws ecr get-login --region $(NEO_AWS_REGION))
@ $(shell aws ecr get-login --region $(NEO_AWS_REGION) --no-include-email)
@ docker tag neo:latest $(NEO_ECR_REPO):$(COMMIT)
@ docker push $(NEO_ECR_REPO):$(COMMIT)
@ echo "Pushed image $(NEO_ECR_REPO):$(COMMIT)"


.PHONY: create_stack
create_stack:
awless --no-sync --color always \
create stack \
name=taras-neo-test1-$(DATE) \
capabilities=CAPABILITY_IAM \
template-file=./cloudformation.yml \
stack-file=./config.yml
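
For reference, a typical invocation of these targets might look like the sketch below (the registry URI, region, and resulting tag are placeholders, not values from this repo):

```sh
# Placeholders: substitute your own ECR repository URI and AWS region.
export NEO_ECR_REPO=123456789012.dkr.ecr.us-east-1.amazonaws.com/neo
export NEO_AWS_REGION=us-east-1

make push_image     # builds the image, logs in to ECR, then tags and pushes $NEO_ECR_REPO:<commit>
make create_stack   # creates a CloudFormation stack via awless from cloudformation.yml and config.yml
```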
99 changes: 69 additions & 30 deletions README.md
@@ -4,23 +4,24 @@ A setup for HA (High-Availability) deployment of a [Neo4j Enterprise](https://ne

You can obtain Neo4j from the [official website](https://neo4j.com/). Please contact [email protected] for Enterprise licensing.

### Includes:
## Includes

- Customizable CloudFormation template.
- Custom docker image on top [official Neo4j image](https://hub.docker.com/_/neo4j/). Current version - *Neo4j 3.4.6*
- Custom Docker image on top of the [official Neo4j image](https://hub.docker.com/_/neo4j/). Current version: *Neo4j 3.5.4*

## Features

* Automatic daily backups to S3 using a slave-only instance.
* Bootstrap a cluster from a backup snapshot.
* Autoscaling (based on Memory Utilization).
* CloudWatch alerts setup.
* Bootstrap a node with an existing data volume for quick startup.
* Automatically create users+credentials for read-only and read/write access.
- Automatic daily backups to S3 using a slave-only instance.
- Bootstrap a cluster from a backup snapshot.
- Autoscaling (based on Memory Utilization).
- CloudWatch alerts setup.
- Bootstrap a node with an existing data volume for quick startup.
- Automatically create users+credentials for read-only and read/write access.

## Prerequisites:
## Prerequisites

* Install [Docker](https://docs.docker.com/engine/installation/) to build the image.
* [AWS CLI](https://aws.amazon.com/cli) for uploading images to ECR.
- Install [Docker](https://docs.docker.com/engine/installation/) to build the image.
- [AWS CLI](https://aws.amazon.com/cli) for ECR authentication.

## How does it work?

@@ -32,12 +33,16 @@ It uses [Bolt](https://boltprotocol.org/) – a highly efficient, lightweight

Essentially it's a Neo4j cluster with a minimum of 2 nodes (use at least 3 for HA), which is split logically into 2 ECS clusters (yet it remains a single Neo4j cluster):

### A Read-Write cluster with one master node and multiple slaves:
### A Read-Write cluster with one master node and multiple slaves

- Fast synchronisation between the master and the other nodes.
- The Load Balancer keeps only the current master node in service, so slaves act as hot standbys in case of a failover.
- All nodes are eligible to become master. A re-election is quickly detected by the ELB.

### A Read-only cluster with one slave node:
### A Read-only cluster with one slave node (optional)

_This will generate additional costs, since a separate ELB is created for this node._

- Slower synchronisation.
- Can not become master.
- Can not accept write queries.
@@ -50,7 +55,7 @@ Essentially it's a Neo4j cluster with a minimum of 2 nodes (use at least 3 for H

Ports open:

```
```yaml
- HTTP(s): 7473, 7474
- Bolt: 7687
```
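
For illustration, once the stack is up a client can reach the cluster over Bolt through the ELB endpoint. A minimal sketch, assuming the `Domain` parameter resolves to the master ELB and the default credentials have already been changed (host and password are placeholders):

```sh
# Placeholder host and password; 7687 is the Bolt port listed above.
cypher-shell -a bolt://neo4j.example.com:7687 -u neo4j -p '<password>' \
  "MATCH (n) RETURN count(n);"
```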
@@ -62,38 +67,72 @@ Ports open:

2. Save environment variables for use in the Makefile (customize them first):

$ export NEO_ECR_REPO=<paste here ARN of your ECR repo>
$ export NEO_AWS_REGION=<your AWS region>
```sh
export NEO_ECR_REPO=<paste here ARN of your ECR repo>
export NEO_AWS_REGION=<your AWS region>
```

3. Build Docker image and push it to your ECR:

$ make build
$ make push_image
``` sh
make push_image
```

4. Feel free to modify `cloudformation.yml` in any way you like before spinning up infrastructure; however, most things are customizable via parameters.

5. [Create a Cloud Formation stack](https://console.aws.amazon.com/cloudformation/home#/stacks/new) using `cloudformation.yml`.
5. [Create a Cloud Formation stack](https://console.aws.amazon.com/cloudformation/home#/stacks/new) using `cloudformation.yml` with your parameters.

_If you want to set up a simpler (and cheaper) environment without the Slave-Only node (and all related resources), you can set `SlaveMode=ABSENT` and ignore the rest of the `Slave`-related parameters (except `SlaveSubnetID`: you still have to pick a subnet there, but it will be ignored as long as `SlaveMode=ABSENT`)._

**Parameters guide**

Parameter | Description
----------|----------
AcceptLicense | Must be set to `true` in order to use Neo4j
AdminUser | The default admin user; must be `neo4j` and can't be changed
ClusterInstanceType | EC2 instance type
DesiredCapacity | Number of desired Neo4j nodes (excluding the SlaveOnly one)
DockerECRARN | ARN of your private ECR repo
DockerImage | URL of your custom-built Neo4j image
Domain | The domain for your Neo4j cluster endpoint (http://<domain>:7474)
DomainHostedZone | Route53 hosted zone in which to register your DNS record
EBSSize | Size of the EBS volume for Neo4j data, in GB
EBSType | Type of EBS volume
GuestPassword | Password for the Neo4j read-only user
GuestUser | Name for the Neo4j read-only user
KeyName | SSH key to use for EC2 instance access
MaxSize | Max number of instances in the cluster
SubnetID | List of subnets in which to place Neo4j cluster nodes. *Only one instance per subnet is supported*
Mode | [Neo4j DB mode](https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/#config_dbms.mode)
NodeSecurityGroups | List of additional Security Groups to apply to your EC2 instances
SlaveMode | [Neo4j DB mode](https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/#config_dbms.mode) for the SlaveOnly instance, with the additional value `ABSENT` that can be used to create a Neo4j cluster without the SlaveOnly instance
SlaveOnlyDomain | The domain for your Neo4j SlaveOnly endpoint (http://<domain>:7474)
SlaveOnlyInstanceType | EC2 instance type for the slave-only node
SlaveSubnetID | Subnet ID for the slave-only instance. *Should be different from the main cluster subnets, but must be able to reach the other instances*. Even if `SlaveMode` is set to `ABSENT`, some value must be set here (it will be ignored in that case)
VpcId | AWS VPC ID to place your cluster in
SNSTopicArn | SNS topic ARN to send alerts to. If none is specified, a new one will be created
SnapshotPath | Path to the DB snapshot on S3 to restore data from on start (_<bucket_name>/hourly/neo4j-backup-<timestamp>.zip_)

During this step you will define all the resources you need and configure the Neo4j Docker image for ECS.
Please make sure to set 2 tags for your stack (on "Options" page):
Please consider tagging your stack (on "Options" page):

Name: <how you name your stack>
Environment: <your env name, e.g production>
```yaml
Name: <how you name your stack>
Environment: <your env name, e.g. production>
```
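
If you prefer the CLI over the console for this step, a minimal sketch (stack name, parameter and tag values are placeholders; add the remaining parameters from the table above as your environment requires):

```sh
# Placeholders throughout; the full parameter list depends on your setup.
aws cloudformation create-stack \
  --stack-name neo4j-production \
  --template-body file://cloudformation.yml \
  --capabilities CAPABILITY_IAM \
  --parameters \
      ParameterKey=AcceptLicense,ParameterValue=true \
      ParameterKey=DockerImage,ParameterValue="$NEO_ECR_REPO:<commit>" \
      ParameterKey=DesiredCapacity,ParameterValue=3 \
      ParameterKey=SlaveMode,ParameterValue=ABSENT \
  --tags Key=Name,Value=neo4j-production Key=Environment,Value=production
```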

## Upgrade version

Please see [detailed instructions](./UPGRADE_README.md) to upgrade using this CF template.


## Known Problems

* You can't restore server from a backup without a downtime. See [further instructions](https://neo4j.com/docs/operations-manual/current/backup/restore-backup/#backup-restore-ha-cluster).
* Autoscaling is hardcoded via RAM utilization (>70%). Feel free to modify for your own needs.
* Sometimes, rolling updates, that require nodes reboot, render them stuck for some time before rejoining cluster. Probaly a slower rolling update can help so that at each moment at least one node is already registered in main ELB as master.
- You can't restore a server from a backup without downtime. See [further instructions](https://neo4j.com/docs/operations-manual/current/backup/restore-backup/#backup-restore-ha-cluster).
- Autoscaling is hardcoded to RAM utilization (>70%). Feel free to modify it for your own needs.
- Sometimes rolling updates that require node reboots leave nodes stuck for some time before rejoining the cluster. A slower rolling update can probably help, so that at each moment at least one node is already registered in the main ELB as master.

## TODO

```
* Parametrize autoscaling.
* Allow disabling slave-only more for simplest 1-node setups.
```
- Parametrize autoscaling.
36 changes: 12 additions & 24 deletions UPGRADE_README.md
@@ -1,43 +1,39 @@
## Upgrade guide
# Upgrade guide

Information below is up-to-date with Neo4j 3.4.6.

### Patch version upgrades
## Patch version upgrades

If you upgrade between patch versions, you might use
[rolling upgrade](https://neo4j.com/docs/operations-manual/current/upgrade/causal-cluster/#cc-upgrade-rolling)
just by updating each instance separately. However, this is possible *only when a store format upgrade is not needed (see the release notes for the particular change)*.

In one step: build a new version of the Docker image (based on the newest official Neo4j Docker image) and use that image in CloudFormation (see step #1 in the Generic Upgrades section below).
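
A hedged sketch of that update with the AWS CLI (stack name and image tag are placeholders; only two carried-over parameters are shown, and every other existing parameter must also be listed with `UsePreviousValue=true`):

```sh
# Placeholders: stack name and image tag. Repeat UsePreviousValue=true for all
# remaining template parameters so they keep their current values.
aws cloudformation update-stack \
  --stack-name neo4j-production \
  --use-previous-template \
  --capabilities CAPABILITY_IAM \
  --parameters \
      ParameterKey=DockerImage,ParameterValue="$NEO_ECR_REPO:<new-commit>" \
      ParameterKey=AcceptLicense,UsePreviousValue=true \
      ParameterKey=Mode,UsePreviousValue=true
```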


### Generic upgrades
## Generic upgrades

(Based on [official Neo4j guide](https://neo4j.com/docs/operations-manual/current/upgrade/))

Moves between minor/major versions do not allow a zero-downtime database upgrade (as of 3.4.6).

You will want to use CloudFormation (CF) parameters to tweak the upgrade steps, as described below.

### Preconditions
## Preconditions

- Time of day with lowest graph load
- Client-side retry system or logging of all queries, in order to not lose write queries.
- Build the Neo4j Docker image with the new version and push it to ECR so ECS can use it.
- AWS console open.


### Migration
## Migration

1. Update CF stack with parameters:

DockerImage = <docker image address name:tag> 

This will roll updates through the cluster. If the master fails over to another node during this, clients might notice a window of a couple of seconds.



2. Update CF stack with parameters:
1. Update CF stack with parameters:

Mode = SINGLE
SlaveMode = SINGLE
@@ -53,48 +49,40 @@ You want to make use of CloudFormation (CF) parameters to tweak upgrade steps, a

Verify it's working properly. It's a good idea to run client tests, since this is the final upgraded DB.


3. Manual migrations (if needed).
1. Manual migrations (if needed).

If the upgrade process involves index recreation or other data migrations that need to be done manually, this is the right time to do it.


4. Update CF stack with parameters:
1. Update CF stack with parameters:

Mode = HA (but keep SlaveMode = SINGLE)
AllowUpgrade = False

Verify your single master is fine as an HA node. Again, expect downtime since the master reboots.


5. Change "Name" tags for all slave Neo4j data volumes.
1. Change "Name" tags for all slave Neo4j data volumes.

For example "neo4j-production-data" → "neo4j-production-data-old". We don't need the data volumes with the old DB on the slaves; with the name changed, nodes won't use the old volumes after reboot.


6. (Optional) Create slave copies of master's migrated volume.
1. (Optional) Create slave copies of master's migrated volume.

If the DB is big, this step is useful. Without it, slave nodes will start with a fresh DB and will need to catch up with the master online.

So you create a snapshot from the migrated Neo4j data volume and create volumes with the needed tags (Environment tag, Name tag) in all regions (see the CLI sketch at the end of this guide).

Still, during node boot, Neo4j might refuse to use your volume as too old; in that case it will simply discard the data and catch up with the master online.



7. Update CF stack with parameters: (N = 2 for example)
1. Update CF stack with parameters: (N = 2 for example)

MaxSize = N
DesiredCapacity = N

This will scale up the cluster and create new slaves, which will then catch up.


8. Update CF stack with parameters:
1. Update CF stack with parameters:

SlaveMode = HA

This will reboot the read-only slave as an HA member, hooking it up to the new volume.


Verify both the main cluster and the slave-only node are working properly, and run the tests again. Upgrade complete :)
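
For the optional volume-copy step above, a hedged sketch using the AWS CLI (volume ID, availability zone, and tag values are placeholders):

```sh
# Snapshot the migrated master data volume, then create a tagged copy for a slave.
# The volume ID, AZ, and tag values below are placeholders.
SNAP_ID=$(aws ec2 create-snapshot \
  --volume-id vol-0123456789abcdef0 \
  --description "neo4j migrated data" \
  --query SnapshotId --output text)
aws ec2 wait snapshot-completed --snapshot-ids "$SNAP_ID"
aws ec2 create-volume \
  --snapshot-id "$SNAP_ID" \
  --availability-zone us-east-1b \
  --tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=neo4j-production-data},{Key=Environment,Value=production}]'
```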