Welcome to the world of the Elasticsearch Collection®. A collection of awesome software, libraries, documents, books, resources and cool stuff about the ELK Stack. Thanks to our daily readers and contributors. The goal is to build a categorized, community-driven collection of well-known resources. Sharing, suggestions and contributions are always welcome!
When people ask “what is Elasticsearch?”, some may answer that:
- it’s “an index”,
- a “search engine”,
- an “analytics database”,
- a “big data solution”,
- that “it’s fast and scalable”,
- or that “it’s kind of like Google”.
Depending on your level of familiarity with this technology, these answers may either bring you closer to an ah-ha moment or further confuse you. But the truth is, all of these answers are correct and that’s part of the appeal of Elasticsearch.
Elasticsearch is simple to configure, has incredible flexibility, and is an excellent tool for complex searches. Let's take a closer look.
- Elasticsearch is a distributed, open-source search and analytics engine built on Apache Lucene and developed in Java. It is designed to operate in near real time and can index and search documents in diverse formats. It was designed for distributed environments, providing flexibility and scalability, and is now a widely used enterprise search engine. Elasticsearch lets you store, search, and analyze huge volumes of data quickly, giving back answers in milliseconds.
To help understand how Elasticsearch handles data, we can make an analogy to a database.
- Elasticsearch stores data using a "schema-less" approach. This means that it is not necessary to define the structure of the data in advance, as it is with relational databases such as Oracle, MySQL, and SQL Server.
In our analogy of traditional relational databases, the structure of the data used by Elasticsearch would be:
- Index: - Indices, the largest unit of data in Elasticsearch, are logical partitions of documents and can be compared to a database in the world of relational databases.
- Type: - A type in Elasticsearch represents a class of similar documents. A type consists of a name (such as user or blog post) and a mapping. Note that mapping types were deprecated in Elasticsearch 7 and removed in Elasticsearch 8.
- Documents: - A document in Lucene consists of a simple list of field-value pairs. A field must have at least one value, but any field can contain multiple values.
- Fields: - Fields are the named values inside a document and can be compared to columns in a relational table.
- Cluster: - A cluster is a collection of one or more servers (nodes) that together hold the entire data set and provide federated indexing and search capabilities across all servers. In the relational database analogy, a node corresponds to a database instance. Any number of nodes can share the same cluster name.
- Node: - A node is a single server that holds some data and participates in the cluster’s indexing and querying. A node can be configured to join a specific cluster by that cluster's name. A single cluster can have as many nodes as we want. A node is simply one Elasticsearch instance.
- Shard: - A shard is a subset of the documents of an index. An index can be divided into many shards.
- Replica Shard: - A replica shard is a copy of a primary shard. Its main purpose is failover: if the node holding a primary shard dies, a replica is promoted to the role of primary, which prevents data loss in case of hardware failure.
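To make the index, document, shard and replica terms above more concrete, here is a minimal sketch in Python that only builds the JSON bodies involved and does not contact a cluster. The index name `products` and the field names are hypothetical, chosen purely for illustration. `number_of_shards` and `number_of_replicas` are the standard index settings, where the replica count applies to each primary shard.

```python
import json

# Hypothetical index name, used only for illustration.
INDEX_NAME = "products"

# Body for creating an index with 3 primary shards,
# each of which gets 1 replica copy.
index_settings = {
    "settings": {
        "number_of_shards": 3,
        "number_of_replicas": 1,
    }
}

# A document is a set of field-value pairs; a field may hold several values.
# The field names here are assumptions, not part of any real schema.
document = {
    "name": "Trail running shoe",
    "price": 89.99,
    "tags": ["shoes", "running"],
}


def total_shard_copies(body: dict) -> int:
    """Return the number of primary shards plus all of their replicas."""
    settings = body["settings"]
    primaries = settings["number_of_shards"]
    return primaries * (1 + settings["number_of_replicas"])


if __name__ == "__main__":
    # These bodies could be sent to the REST API, for example with
    # PUT /products (create the index) and POST /products/_doc (add a document).
    print(f"PUT /{INDEX_NAME}")
    print(json.dumps(index_settings, indent=2))
    print(f"POST /{INDEX_NAME}/_doc")
    print(json.dumps(document, indent=2))
    print("total shard copies:", total_shard_copies(index_settings))
```

With these settings the cluster has to place six shard copies in total (three primaries and three replicas), which is why replicas are only useful for failover when there is more than one node to hold them.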
- Elasticsearch Introduction
- Elasticsearch Architecture
- Indices
- Types
- Documents
- Fields
- Cluster
- Shard
- Replica Shards
- Elasticsearch Queries
- APIs
- Elastic Stack
- Kibana
- Beats
- Logstash
- Books
- Certifications
- Elasticsearch developer tools and utilities
- Elasticsearch Use cases
Indices, the largest unit of data in Elasticsearch, are logical partitions of documents and can be compared to a database in the world of relational databases.
Taking an e-commerce app as an example, you could have one index containing all of the data related to the products and another with all of the data related to the customers. You can have as many indices defined in Elasticsearch as you want. These in turn will hold documents that are unique to each index. Indices are identified by lowercase names, which are used when performing actions (such as searching and deleting) on the documents inside each index. For a list of best practices in handling indices, check out the blog Managing an Elasticsearch Index. Another key element to getting how Elasticsearch’s indices work is to get a handle on shards.
- Best Practices for Managing Elasticsearch Indices - Understanding indices
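Elasticsearch provides a full Query DSL (Domain Specific Language) based on JSON to define queries. Think of the Query DSL as an AST (Abstract Syntax Tree) of queries, consisting of two types of clauses: leaf query clauses (such as match, term or range), which look for a particular value in a particular field, and compound query clauses (such as bool), which wrap other leaf or compound queries and combine them logically.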
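Because a Query DSL request is plain JSON, it can be sketched as a nested Python dictionary. The example below builds a compound `bool` clause that wraps a leaf `match` clause and a leaf `range` clause. The field names `name` and `price` and the helper `build_search_body` are hypothetical, and the code only prints the body rather than sending it to a cluster.

```python
import json


def build_search_body(text: str, max_price: float) -> dict:
    """Build a bool query: full-text match on `name`, filtered by `price`.

    `name` and `price` are assumed field names used for illustration.
    """
    return {
        "query": {
            "bool": {
                # Leaf clause: contributes to the relevance score.
                "must": [{"match": {"name": text}}],
                # Leaf clause in filter context: a yes/no test, not scored.
                "filter": [{"range": {"price": {"lte": max_price}}}],
            }
        }
    }


if __name__ == "__main__":
    body = build_search_body("running shoe", 100)
    # This body could be sent with POST /<index>/_search.
    print(json.dumps(body, indent=2))
```

Nesting the leaf clauses under `bool` is what gives the query its tree shape: each additional condition is another node under `must`, `filter`, `should` or `must_not`.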
- Elasticsearch official website
- Logstash is a data pipeline that helps you process logs and other event data from a variety of systems
- Kibana is a data analysis tool that helps to visualize your data; Kibana Manual docs
- Beats is the platform for building lightweight, open source data shippers for many types of data you want to enrich with Logstash, search and analyze in Elasticsearch, and visualize in Kibana.
- Deep Learning for Search - teaches you how to leverage neural networks, NLP, and deep learning techniques to improve search performance. (2019)
- Relevant Search: with applications for Solr and Elasticsearch - demystifies relevance work. Using Elasticsearch, it teaches you how to return engaging search results to your users, helping you understand and leverage the internals of Lucene-based search engines. (2016)
- Elasticsearch in Action - teaches you how to build scalable search applications using Elasticsearch (2015)
- Elasticsearch in Action, Second edition - hands-on guide to developing fully functional search engines with Elasticsearch and Kibana. (2021)
- Elastic Certified Engineer notes - notes and exercises to prepare the certification exam
- frutik/awesome-search I am building e-commerce search now. Below are listed some of my building blocks
- Fess is an open source full featured Enterprise Search, with a web-crawler
- Yelp/elastalert is a modular flexible rules based alerting system written in Python
- etsy/411 - an Alert Management Web Application https://demo.fouroneone.io (credentials: user/user)
- appbaseio/mirage is a 🔎 GUI for composing Elasticsearch queries
- exceptionless/Exceptionless is an error (exceptions) collecting and reporting server with client bindings for various programming languages
- searchkit/searchkit is a UI framework based on React to build awesome search experiences with Elasticsearch
- appbaseio/reactivemaps is a React based UI components library for building Airbnb / Foursquare like Maps
- appbaseio/reactivesearch is a library of beautiful React UI components for Elasticsearch
- appbaseio/dejavu The missing UI for Elasticsearch; landing page
- Simple File Server is an Openstack Swift compatible distributed object store that can serve and securely store billions of large and small files using minimal resources.
- logagent a log shipper to parse and ship logs to Elasticsearch including bulk indexing, disk buffers and log format detection.
- ItemsAPI simplified search API for web and mobile (based on Elasticsearch and Express.js)
- Kuzzle - An open-source backend with advanced real-time features for Web, Mobile and IoT that uses ElasticSearch as a database. (Website)
- SIAC - SIAC is an enterprise SIEM built on the ELK stack and other open-source components.
- Sentinl - Sentinl is a Kibana alerting and reporting app.
- Praeco - Elasticsearch alerting made simple
- DataStation - Easily query, script, and visualize data from every database, file, and API.
- Sense (from Elastic) A JSON aware developer console to Elasticsearch; official and very powerful
- ES-mode An Emacs major mode for interacting with Elasticsearch (similar to Sense)
- Elasticsearch Cheatsheet Examples for the most used queries, API and settings for all major versions of Elasticsearch
- Elasticstat CLI tool displaying monitoring information like htop
- Elastic for Visual Studio Code An extension for developing Elasticsearch queries like Kibana and Sense extension in Visual Studio Code
- Elastic Builder A Node.js implementation of the Elasticsearch DSL
- Bodybuilder A Node.js elasticsearch query body builder
- enju A Node.js elasticsearch ORM
- Peek An interactive CLI in Python that works like Kibana Console with additional features
- Knapsack plugin is a "Swiss Army knife" export/import plugin for Elasticsearch
- Elasticsearch-Exporter is a command line script to import/export data from Elasticsearch to various other storage systems
- esbulk Parallel elasticsearch bulk indexing utility for the command line.
- elasticdump - tools for moving and saving indices
- elasticsearch-loader - Tool for loading common file types to elasticsearch including csv, json, and parquet
- Esctl - High-level command line interface to manage Elasticsearch clusters.
- Vulcanizer - GitHub's open sourced cluster management library based on Elasticsearch's REST API. Comes with a high level CLI tool
- sscarduzio/elasticsearch-readonlyrest-plugin Safely expose Elasticsearch REST API directly to the public
- mobz/elasticsearch-head is a powerful and essential plugin for managing your cluster, indices and mapping
- Bigdesk - Live charts and statistics for elasticsearch cluster
- Elastic HQ - Elasticsearch cluster management console with live monitoring and beautiful UI
- Cerebro is an open source (MIT License) Elasticsearch web admin tool. Supports ES 5.x
- Kopf - Another management plugin that has a REST console and manual shard allocation
- Search Guard - Elasticsearch and elastic stack security and alerting for free
- ee-outliers - ee-outliers is a framework to detect outliers in events stored in an Elasticsearch cluster.
- Elasticsearch Comrade - Elasticsearch admin panel built for ops and monitoring
- elasticsearch-admin - Web administration for Elasticsearch
- SIREn Join Plugin for Elasticsearch This plugin extends Elasticsearch with new search actions and a filter query parser that make it possible to perform a "Filter Join" between two sets of documents (in the same index or in different indexes).
- NLPchina/elasticsearch-sql - Query elasticsearch using familiar SQL syntax. You can also use ES functions in SQL.
- elastic/elasticsearch-hadoop - Elasticsearch real-time search and analytics natively integrated with Hadoop (and Hive)
- jprante/elasticsearch-jdbc - JDBC importer for Elasticsearch
- pandasticsearch - An Elasticsearch client exposing DataFrame API
- monstache - Go daemon that syncs MongoDB to Elasticsearch in near realtime
- jprante/elasticsearch-plugin-bundle A plugin that consists of a compilation of useful Elasticsearch plugins related to indexing and searching documents
- elastic/timelion time-series analysis application. Overview and installation guide: Timelion: The time series composer for Kibana
- Kibana Alert App for Elasticsearch - Kibana plugin with monitoring, alerting and reporting capabilities
- VulnWhisperer - VulnWhisperer is a vulnerability data and report aggregator.
- Wazuh Kibana App - A Kibana app for working with data generated by Wazuh.
- Datasweet Formula - A real time calculated metric plugin Datasweet Formula.
- nbs-system/mapster - a visualization for creating live 3D event maps in Kibana
- Kibana Tag Cloud Plugin - tag cloud visualization plugin based on d3-cloud plugin
- LogTrail - a plugin for Kibana to view, analyze, search and tail log events from multiple hosts in realtime with devops friendly interface inspired by Papertrail
- Analyze API - Kibana 6 application to manipulate the _analyze API graphically
- kbn_network - This is a plugin developed for Kibana that displays a network node that links two fields that have been previously selected.
- /r/elasticsearch
- Elasticsearch forum
- Stackoverflow
- Books on Amazon - does not fit well into this category, but is worth checking out!
- TODO: Put some good twitter accounts
- Centralized Logging with Logstash and Kibana On Ubuntu 14.04 everything you need to know when you are creating your first Elasticsearch+Logstash+Kibana instance
- dwyl/learn-elasticsearch a getting started tutorial with a pack of valuable references
- Make Sense of your Logs: From Zero to Hero in less than an Hour! by Britta Weber demonstrates how you can build an Elasticsearch + Logstash + Kibana stack to collect and discover your data
- $$ Elasticsearch 7 and Elastic Stack - liveVideo course that teaches you to search, analyze, and visualize big data on a cluster with Elasticsearch, Logstash, Beats, Kibana, and more.
- Elasticsearch Intro - Elasticsearch: What it is, How it works, and what it’s used for.
- A Useful Elasticsearch Cheat Sheet in Times of Trouble
- The definitive guide for Elasticsearch on Windows Azure
- Elasticsearch pre-flight checklist
- 9 Tips on Elasticsearch Configuration for High Performance
- Best Practices in AWS
- How to Secure Elasticsearch and Kibana with NGINX, LDAP and SSL 🔒
- Elasticsearch server on Webfaction using NGINX with basic authorization and HTTPS protocol
- Elasticsearch Guides Useful Elasticsearch guides with best practices, troubleshooting instructions for errors, tips, examples of code snippets and more.
- Elasticsearch Java Virtual Machine settings explained
- Tuning Garbage Collection for Mission-Critical Java Applications
- G1: One Garbage Collector To Rule Them All
- Use Lucene’s MMapDirectory on 64bit platforms, please!
- Black Magic cookbook
- G1GC Fundamentals: Lessons from Taming Garbage Collection
- JVM Garbage Collector settings investigation PDF Comparison of JVM GC
- Garbage Collection Settings for Elasticsearch Master Nodes Fine-tuning your garbage collector
- Understanding G1 GC Log Format To tune and troubleshoot G1 GC enabled JVMs, one must have a proper understanding of G1 GC log format. This article walks through key things that one should know about the G1 GC log format.
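How to start using G1:
# Default (empty) value, left commented out for reference.
#ES_JAVA_OPTS=""
# Turn off ParNew and CMS, then turn on G1. The two "disable" flags assume an older JVM (such as Java 8) that still recognizes them.
ES_JAVA_OPTS="-XX:-UseParNewGC -XX:-UseConcMarkSweepGC -XX:+UseG1GC"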
- The Authoritative Guide to Elasticsearch Performance Tuning (Part 1) Part 2 Part 3
- Tuning data ingestion performance for Elasticsearch on Azure - and not only for Azure. That's a great article about Elasticsearch Performance testing by example
- Elasticsearch Indexing Performance Cheatsheet - when you plan to index large amounts of data in Elasticsearch (by Patrick Peschlow)
- Elasticsearch for Logging Elasticsearch configuration tips and tricks from Sanity
- Scaling Elasticsearch to Hundreds of Developers by Joseph Lynch @yelp
- 10 Elasticsearch metrics to watch
- Understanding Elasticsearch Performance
- Our Experience of Creating Large Scale Log Search System Using Elasticsearch - topology, separate master, data and search balancer nodes
- 📂 Elasticsearch on Azure Guidance it is 10% about Azure and 90% very valuable general information, tips and tricks about Elasticsearch
- How to avoid the split-brain problem in Elasticsearch
- Datadog's series about monitoring Elasticsearch performance:
- Performance Monitoring Essentials - Elasticsearch Edition
- Operator for running Elasticsearch in Kubernetes
- Apache Hive integration
- Connecting Tableau to Elasticsearch (READ: How to query Elasticsearch with Hive SQL and Hadoop)
- mradamlacey/elasticsearch-tableau-connector
- 5 Logstash Alternatives and typical use cases
- ElastAlert: Alerting At Scale With Elasticsearch, Part 1 by engineeringblog.yelp.com
- ElastAlert: Alerting At Scale With Elasticsearch, Part 2 by engineeringblog.yelp.com
- Elastalert: implementing rich monitoring with Elasticsearch
- Elasticsearch as a Time Series Data Store by Felix Barnsteiner
- Running derivatives on Voyager velocity data By Colin Goodheart-Smithe
- Shewhart Control Charts via Moving Averages: Part 1 - Part 2 by Zachary Tong
- Implementing a Statistical Anomaly Detector: Part 1 - Part 2 - Part 3 by Zachary Tong
- Classifying images into Elasticsearch with DeepDetect (forum thread with discussion) by Emmanuel Benazera
- Elasticsearch with Machine Learning (English translation) by Kunihiko Kido
- Recommender System with Mahout and Elasticsearch
- Data Infrastructure at IFTTT Elasticsearch, Kafka, Apache Spark, Redshift, other AWS services
- OFAC compliance with Elasticsearch using AWS
- Building a Streaming Search Platform - Streaming Search on Tweets: Storm, Elasticsearch, and Redis
- LogZoom, a fast and lightweight substitute for Logstash
- Graylog2/graylog2-server - Free and open source log management (based on ES)
- Fluentd vs. Logstash for OpenStack Log Management
- Building a Directory Map With ELK
- Structured logging with ELK - part 1
- Search for 😋 Emoji with Elasticsearch 🔎
- Complete Guide to the ELK Stack
- Elasticsearch Engineer Interview Questions
- logiq - Simple WebUI Monitoring Tool for Logstash ver. 5.0 and up
- ElasticSearch Report Engine - An ElasticSearch plugin to return query results as either PDF,HTML or CSV.
- Elasticsearch Glossary - explanations of Elasticsearch terminology, including examples, common best practices and troubleshooting guides for various issues.
- Elasticsearch for logs and metrics: A deep dive – Velocity 2016 by Sematext Developers
- Elasticsearch in action by Thijs Feryn - a beginner overview
- Getting Down and Dirty with ElasticSearch by Clinton Gormley
- How we scaled Raygun
- Getting started with Elasticsearch
- Speed is a Key: Elasticsearch under the Hood introduction + basic performance optimization
- $$ Pluralsight: Getting Started With Elasticsearch for .NET Developers this course will introduce users to Elasticsearch, how it works, and how to use it with .NET projects.
- $$ Complete Guide to Elasticsearch Comprehensive guide to Elasticsearch, the popular search engine built on Apache Lucene
- How Elasticsearch powers the Guardian's newsroom
- Elasticsearch Query Editor in Grafana
- Scale Your Metrics with Elasticsearch 2019 by Philipp Krenn (Elastic) optimization tips and tricks
- #bbuzz 2015: Adrien Grand – Algorithms and data-structures that power Lucene and Elasticsearch
- Rafał Kuć - Running High Performance Fault-tolerant Elasticsearch Clusters on Docker and slides
- Working with Elasticsearch - Search, Aggregate, Analyze, and Scale Large Volume Datastores - O'Reilly Media
- End-to-end Recommender System with Spark and Elasticsearch by Nick Pentreath & Jean-François Puget. Slide deck
- Elasticsearch config for a write-heavy cluster - reyjrar/elasticsearch.yml
- chenryn/ESPL - Elastic Search Processing Language PEG parser sample for SPL to Elasticsearch DSL
- thomaspatzke/EQUEL an Elasticsearch QUEry Language, based on G4 grammar parser
Yelp, IFTTT, StackExchange, Raygun, Mozilla, Spotify, CERN, NASA, Zalando
MIT License & cc license
This work is licensed under a Creative Commons Attribution 4.0 International License.