Increase max_shards_per_node #3

Open · wants to merge 4 commits into base: master
76 changes: 57 additions & 19 deletions README.md
@@ -1,35 +1,73 @@
Installation instructions
=========================

- The code should be pulled from GitHub using the following command:

```
git clone https://github.com/IDR/idr-log-analysis.git
```
- The following commands should be run to create working folders for both Elasticsearch and Fluentd:

```
cd idr-log-analysis
mkdir -p volumes/{es,fluentd}
chown 1000 volumes/es
chown 100 volumes/fluentd
```
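
These ownership changes matter because the containers write to the folders as non-root users; the UIDs above (1000 for Elasticsearch, 100 for Fluentd) are presumably the users the respective images run as. A quick check that the ownership took effect:

```
ls -ln volumes/
```
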
- `Nginx` is configured to use [HTTP Basic Authentication](https://docs.nginx.com/nginx/admin-guide/security-controls/configuring-http-basic-authentication/), so you need to create a username and password and save them in a file named `httppasswd`, e.g.

```
echo 'username:password' > nginx/httppasswd
```
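
If you want to store a hashed password rather than the literal string (the format described in the nginx docs linked above), one possible sketch using `openssl`:

```
# prompts for the password and writes an apr1-hashed entry
echo "username:$(openssl passwd -apr1)" > nginx/httppasswd
```
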
- To ensure that Docker is installed on your machine, you can run the following command:

```
sudo docker --version
```

- If Docker is not installed, it needs to be installed before continuing; one common approach is sketched below.
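
One common way to install it is Docker's convenience script (a sketch only; any supported installation method for your distribution works just as well):

```
curl -fsSL https://get.docker.com | sudo sh
sudo docker --version
```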

- The following command is used to run the applications:

```
docker compose up -d
```

- To follow the log output, the user can run this command:

```
docker compose logs -f
```
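
To check that all four services defined in `docker-compose.yml` (elasticsearch, fluentd, kibana and nginx) came up, list the running containers:

```
docker compose ps
```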

Kibana setup
============

1. Connect to http://localhost:12381 and log in with the Nginx basic auth credentials.
2. In the Kibana Management, go to [Saved Objects](http://localhost:12381/app/management/kibana/objects) and import the following file (an API-based alternative is sketched after this list):

```
kibana-export.ndjson
```

This should import saved index patterns, visualisations and dashboards.
If the visualisations are disconnected from the indices, associate them as follows:

- IDR visualisations should be associated with index pattern `fluentd.nginx.access.*`.
- IDR-analysis visualisations should be associated with index pattern `fluentd.haproxy.http.*`.

3. If you don't have a default index pattern, under `Index Patterns` make `fluentd.nginx.access.*` the default.
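
As an alternative to importing through the UI, the export file can be pushed through Kibana's saved objects import API. A minimal sketch, assuming the basic-auth credentials created earlier and that the nginx proxy on port 12381 passes API requests straight through to Kibana:

```
curl -u username:password -X POST \
  'http://localhost:12381/api/saved_objects/_import?overwrite=true' \
  -H 'kbn-xsrf: true' \
  --form file=@kibana-export.ndjson
```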


If you need to create the index patterns yourself:

1. Create an index pattern with the pattern `fluentd.haproxy.http.*`.
2. Select `@timestamp` as the `Time Filter field name`.
3. Check that the field `host` has type `ip` and `geoip` has type `geo_point` (a command-line check is sketched after this list).
If these types are incorrect it means the Elasticsearch index mapping wasn't created.
4. Repeat for index pattern `fluentd.nginx.access.*`.
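
For step 3, the field types can also be checked directly against the Elasticsearch mapping. A minimal sketch, assuming `curl` is available inside the Elasticsearch container:

```
# "host" should be mapped as type "ip" and "geoip" as "geo_point"
docker compose exec elasticsearch curl -s 'localhost:9200/fluentd.haproxy.http.*/_mapping?pretty' \
  | grep -A1 '"host"\|"geoip"'
```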


Updating logs
=============

1. Create a directory `/uod/idr/versions/nginx-logs-combined/prodNN` where `prodNN` is the release that is being archived
2. Untar the `nginx` archive copied to `/uod/idr/versions/prodNN` under this directory
@@ -39,8 +77,8 @@
6. Move `access.log-prodNN` to `/uod/idr/versions/nginx-logs-combined/prod-merged-agg/`
7. Fluentd should automatically start to process this file
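
A hedged command-line sketch of this procedure; steps 3-5 are collapsed in the diff above and are only indicated by a comment, and the archive name is a placeholder:

```
mkdir -p /uod/idr/versions/nginx-logs-combined/prodNN
cd /uod/idr/versions/nginx-logs-combined/prodNN
tar xf /uod/idr/versions/prodNN/<nginx-archive>    # placeholder name for the copied nginx archive
# ... steps 3-5 (not shown above) produce a single aggregated access.log-prodNN ...
mv access.log-prodNN /uod/idr/versions/nginx-logs-combined/prod-merged-agg/
```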


Notes
=====

Although the log ingest process can handle individual `access.log*` files, it will continually tail each file, so aggregating them massively reduces the number of open file handles.

22 changes: 20 additions & 2 deletions docker-compose.yml
@@ -3,24 +3,36 @@
version: "3"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.8.1
    logging:
      options:
        max-size: 60m
        max-file: 4
    environment:
      - "discovery.type=single-node"
      - "ES_JAVA_OPTS=-Xmx4096m"
      - "cluster.max_shards_per_node=8000"
      - "xpack.security.http.ssl.enabled=false"
      - "xpack.security.enabled=false"
      - "xpack.security.transport.ssl.enabled=false"
      - "xpack.security.enrollment.enabled=false"
      - "path.repo=/usr/share/backup"
    restart: unless-stopped
    networks:
      - efk-idranalysis
    volumes:
      # Elasticsearch database
      - ./volumes/es:/usr/share/elasticsearch/data
      - ./volumes/es_backup:/usr/share/backup

  fluentd:
    build:
      context: ./fluentd
      network: host
    restart: unless-stopped
    logging:
      options:
        max-size: 60m
        max-file: 4
    networks:
      - efk-idranalysis
    links:
@@ -30,15 +42,21 @@
      - /uod/idr/versions/idr-analysis-logs-combined/:/idranalysis-haproxy-logs/:ro
      - /uod/idr/versions/nginx-logs-combined/prod-merged-agg/:/idr-nginx-logs:ro
      - ./volumes/fluentd/:/fluentd/pos/
      # - ./fluentd/conf:/fluentd/etc
      # - /scratch/folder1/:/idranalysis-haproxy-logs/:ro
      # - /scratch/folder2/:/idr-nginx-logs:ro
      #- ./volumes/fluentd/:/fluentd/pos/

  kibana:
    image: docker.elastic.co/kibana/kibana:8.8.1
    restart: unless-stopped
    networks:
      efk-idranalysis:
        ipv4_address: "10.11.0.10"

  nginx:
    image: library/nginx:1.24.0
    restart: unless-stopped
    links:
      - "elasticsearch"
    networks:
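
Since this pull request raises `cluster.max_shards_per_node` to 8000, it can be useful to see how close the single-node cluster is to that limit. A sketch, assuming `curl` is available inside the Elasticsearch container:

```
# count open shards on the node
docker compose exec elasticsearch curl -s 'localhost:9200/_cat/shards?h=index' | wc -l

# overall cluster health, including active shard count
docker compose exec elasticsearch curl -s 'localhost:9200/_cluster/health?pretty'
```
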
21 changes: 12 additions & 9 deletions fluentd/conf/fluent.conf
@@ -171,17 +171,20 @@
  time_format %d/%b/%Y:%H:%M:%S.%L
  template_overwrite true
  type_name fluentd
  #<buffer tag, time>
  <buffer tag>
    flush_thread_count 2
    chunk_limit_size 48MB
    queue_limit_length 4
    flush_interval 10s
    retry_max_interval 30s
    retry_forever true
    retry_type exponential_backoff
    #retry_timeout 3h
    retry_wait 20s
    overflow_action block
    retry_max_times 50
    flush_at_shutdown true
  </buffer>

  #@log_level debug
  log_es_400_reason true
  request_timeout 60s
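
With `retry_forever true` and `overflow_action block`, a backlogged Elasticsearch shows up as repeated retry and flush messages in the Fluentd logs rather than as dropped records. A rough way to watch for that (the grep pattern is only an approximation of the actual log lines):

```
docker compose logs --tail=200 fluentd | grep -iE 'retry|flush'
```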