forked from amundsen-io/amundsensearchlibrary
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request amundsen-io#5 in BI/amundsensearchlibrary from add_metrics_dashboards to test

* commit '6ffc57032ef83197b3d642ccad02b31a633c0d18':
  - Add metrics/dashboards
  - Fix #24, correct initialisation of elastic search (amundsen-io#27)
  - [DPTOOLS-2252] Publish Docker image in CI (amundsen-io#26)
  - Integrates Atlas DSL Search (amundsen-io#17)
  - Update PULL_REQUEST_TEMPLATE.md (amundsen-io#23)
  - Update README.md (amundsen-io#22)
  - Add codecov based for search repo (amundsen-io#20)
  - Update README.md (amundsen-io#19)
  - Set the elasticsearch base (endpoint) from env variable (amundsen-io#16)
  - Adds the PR template for amundsen search service (amundsen-io#15)
  - Doc fix: Docker pull the official image (amundsen-io#14)
  - Changed the name of this file for consistency (amundsen-io#13)
  - gitignore dist/ as in metadataservice PR #28 (amundsen-io#12)
Showing 21 changed files with 937 additions and 105 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
### Summary of Changes | ||
|
||
_Include a summary of changes then remove this line_ | ||
|
||
### Tests | ||
|
||
_What tests did you add or modify and why? If no tests were added or modified, explain why. Remove this line_ | ||
|
||
### Documentation | ||
|
||
_What documentation did you add or modify and why? Add any relevant links then remove this line_ | ||
|
||
### CheckList | ||
Make sure you have checked **all** steps below to ensure a timely review. | ||
- [ ] PR title addresses the issue accurately and concisely. Example: "Updates the version of Flask to v1.0.2" | ||
  - In case you are adding a dependency, check if the license complies with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). | ||
- [ ] PR includes a summary of changes. | ||
- [ ] PR adds unit tests, updates existing unit tests, __OR__ documents why no test additions or modifications are needed. | ||
- [ ] In case of new functionality, my PR adds documentation that describes how to use it. | ||
  - All public functions and classes in the PR contain docstrings that explain what they do. | ||
- [ ] PR passes `make test` | ||
- [ ] I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,10 +6,11 @@ | |
*.egg-info | ||
.*.swp | ||
.DS_Store | ||
build/ | ||
dist/ | ||
venv/ | ||
venv3/ | ||
.cache/ | ||
build/ | ||
.idea/ | ||
.coverage | ||
*coverage.xml | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
# Atlas search investigation | ||
There are several ways to integrate search with [Apache Atlas](https://atlas.apache.org/ "Apache Atlas"). The options are described below: | ||
|
||
- Use REST APIs | ||
|
||
Using the Atlas APIs directly is quick to implement and easy for administrators to set up. Atlas uses a search engine | ||
under the hood (embedded Solr) to perform search queries, so in theory this method should scale. The disadvantage is that | ||
we are limited to the REST API that Atlas offers, although we could extend the search capabilities by contributing pull | ||
requests to Atlas. The [advanced search](https://atlas.apache.org/Search-Advanced.html "Apache Atlas Advanced Search") | ||
provides a DSL which contains basic forms of aggregation and arithmetic. | ||
|
||
- Use Data Builder to fill Elasticsearch from Atlas | ||
|
||
Adopting Atlas within the Data Builder to fill Elasticsearch is a relatively straightforward way of staying | ||
compatible with the Neo4j database. The data could either be pulled from Atlas or pushed via Kafka. This method | ||
requires setting up Elasticsearch and Airflow, which increases the amount of infrastructure and maintenance. | ||
Another disadvantage is that with a large inflow of metadata this method might not scale as well as the other methods. | ||
|
||
- Use underlying Solr or Elasticsearch from Apache Atlas | ||
|
||
Atlas makes it possible to expose either Solr or the experimental Elasticsearch backend. The search engine is populated by | ||
JanusGraph (the graph database behind the scenes), so its indexes would not be compatible with | ||
the Data Builder setup. Adopting such a search engine would require either new queries, some kind of transformer | ||
within the search engine, or changes within Atlas itself. | ||
|
||
## Discussion | ||
Both the REST API approach and the Data Builder approach can be implemented and made configurable. Each approach has | ||
its own benefits: the Data Builder, together with Elasticsearch, provides a more fine-tuned search, whereas the Atlas REST API comes out | ||
of the box with Atlas. The last approach, using the underlying search engine from Atlas, provides direct access | ||
to all the metadata with a decent search API. However, integration would be less straightforward as the indexes would | ||
differ from those of the Data Builder's search engine loader. | ||
|
||
|
||
The initial focus is to implement the REST API approach and afterwards potentially implement an Atlas data extractor | ||
and importer within the Amundsen Data Builder, so that administrators have more flexibility in combining data sources. |
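
To make the REST API approach more concrete, here is a minimal sketch of how a DSL search request could be addressed. It only builds the URL and does not contact a server. The helper name `build_dsl_search_url`, the host and port, and the example query are assumptions for illustration; the endpoint path is assumed to be the Atlas v2 DSL search path and should be checked against the Atlas version in use.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: the Atlas v2 REST API serves DSL search at this path.
ATLAS_DSL_SEARCH_PATH = 'api/atlas/v2/search/dsl'


def build_dsl_search_url(base_url: str, query: str, limit: int = 10, offset: int = 0) -> str:
    """Build the GET URL for a DSL search (hypothetical helper, not part of this commit)."""
    params = urlencode({'query': query, 'limit': limit, 'offset': offset})
    return '{}/{}?{}'.format(base_url.rstrip('/'), ATLAS_DSL_SEARCH_PATH, params)


# Illustrative host and query; 'Table' mirrors ATLAS_TABLE_ENTITY in the config.
url = build_dsl_search_url('http://localhost:21000/', 'Table where name = "sales"')

# Decode the query string again to confirm the DSL text survives URL encoding.
decoded = parse_qs(urlsplit(url).query)
```

Keeping URL construction separate from the HTTP call makes the paging parameters easy to unit test without a running Atlas instance.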
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,28 +1,53 @@ | ||
import os | ||
|
||
ELASTICSEARCH_ENDPOINT_KEY = 'ELASTICSEARCH_ENDPOINT' | ||
ELASTICSEARCH_INDEX_KEY = 'ELASTICSEARCH_INDEX' | ||
ELASTICSEARCH_AUTH_USER_KEY = 'ELASTICSEARCH_AUTH_USER' | ||
ELASTICSEARCH_AUTH_PW_KEY = 'ELASTICSEARCH_AUTH_PW' | ||
ELASTICSEARCH_CLIENT_KEY = 'ELASTICSEARCH_CLIENT' | ||
SEARCH_PAGE_SIZE_KEY = 'SEARCH_PAGE_SIZE' | ||
STATS_FEATURE_KEY = 'STATS' | ||
|
||
PROXY_ENDPOINT = 'PROXY_ENDPOINT' | ||
PROXY_USER = 'PROXY_USER' | ||
PROXY_PASSWORD = 'PROXY_PASSWORD' | ||
PROXY_CLIENT = 'PROXY_CLIENT' | ||
PROXY_CLIENT_KEY = 'PROXY_CLIENT_KEY' | ||
PROXY_CLIENTS = { | ||
    'ELASTICSEARCH': 'search_service.proxy.elasticsearch.ElasticsearchProxy', | ||
    'ATLAS': 'search_service.proxy.atlas.AtlasProxy' | ||
} | ||
|
||
|
||
class Config: | ||
    LOG_FORMAT = '%(asctime)s.%(msecs)03d [%(levelname)s] %(module)s.%(funcName)s:%(lineno)d (%(process)d:'\ | ||
                 '%(threadName)s) - %(message)s' | ||
    LOG_DATE_FORMAT = '%Y-%m-%dT%H:%M:%S%z' | ||
    LOG_LEVEL = 'INFO' | ||
|
||
    # Used to differentiate tables from other entities in Atlas. For more details: | ||
    # https://github.com/lyft/amundsenmetadatalibrary/blob/master/docs/proxy/atlas_proxy.md | ||
    ATLAS_TABLE_ENTITY = 'Table' | ||
|
||
    # The relationalAttribute name of Atlas Entity that identifies the database entity. | ||
    ATLAS_DB_ATTRIBUTE = 'db' | ||
|
||
    # Display name of the Atlas entities that we use for the Amundsen project. | ||
    # Atlas uses qualifiedName as the indexed attribute, but also supports the 'name' attribute. | ||
    ATLAS_NAME_ATTRIBUTE = 'qualifiedName' | ||
|
||
    # Config used by Elasticsearch | ||
    ELASTICSEARCH_INDEX = '_all' | ||
|
||
|
||
class LocalConfig(Config): | ||
    DEBUG = False | ||
    TESTING = False | ||
    STATS = True | ||
    STATS = False | ||
    LOCAL_HOST = '0.0.0.0' | ||
    ELASTICSEARCH_ENDPOINT = os.environ.get('ELASTICSEARCHSERVICE', | ||
                                            'http://{LOCAL_HOST}:9200'.format(LOCAL_HOST=LOCAL_HOST)) | ||
    ELASTICSEARCH_INDEX = 'tables_alias' | ||
    ELASTICSEARCH_AUTH_USER = 'elastic' | ||
    ELASTICSEARCH_AUTH_PW = 'elastic' | ||
    PROXY_PORT = '9200' | ||
    PROXY_ENDPOINT = os.environ.get('PROXY_ENDPOINT', | ||
                                    'http://{LOCAL_HOST}:{PORT}'.format( | ||
                                        LOCAL_HOST=LOCAL_HOST, | ||
                                        PORT=PROXY_PORT) | ||
                                    ) | ||
    PROXY_CLIENT = PROXY_CLIENTS[os.environ.get('PROXY_CLIENT', 'ELASTICSEARCH')] | ||
    PROXY_CLIENT_KEY = os.environ.get('PROXY_CLIENT_KEY') | ||
    PROXY_USER = os.environ.get('CREDENTIALS_PROXY_USER', 'elastic') | ||
    PROXY_PASSWORD = os.environ.get('CREDENTIALS_PROXY_PASSWORD', 'elastic') |
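
`PROXY_CLIENT` above holds a dotted path string rather than a class, so something has to turn that string into a class before a proxy can be instantiated. The loading code is not shown in this diff, so the following is only a sketch of one common way to do it with `importlib`. The helper `import_from_path`, the `DEMO_CLIENTS` mapping and the `SKETCH_PROXY_CLIENT` environment variable are hypothetical, and standard-library classes stand in for the proxy classes so the sketch runs anywhere.

```python
import importlib
import os
from typing import Any


def import_from_path(dotted_path: str) -> Any:
    """Resolve 'package.module.ClassName' to the named attribute (hypothetical helper)."""
    module_path, _, attr_name = dotted_path.rpartition('.')
    if not module_path:
        raise ValueError('Expected a dotted path, got {!r}'.format(dotted_path))
    module = importlib.import_module(module_path)
    return getattr(module, attr_name)


# Stand-in for PROXY_CLIENTS: stdlib classes instead of the real proxy classes.
DEMO_CLIENTS = {
    'ORDERED': 'collections.OrderedDict',
    'COUNTER': 'collections.Counter',
}

# Same selection pattern as LocalConfig: an environment variable with a default key.
selected_path = DEMO_CLIENTS[os.environ.get('SKETCH_PROXY_CLIENT', 'ORDERED')]
client_class = import_from_path(selected_path)
client = client_class()
```

Storing the path as a string keeps the config module free of imports of proxy implementations that may have optional dependencies.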
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
from typing import Iterable | ||
|
||
|
||
class Dashboard: | ||
    def __init__(self, *, | ||
                 dashboard_group: str, | ||
                 dashboard_name: str, | ||
                 description: str, | ||
                 last_reload_time: list, | ||
                 user_id: str, | ||
                 user_name: str, | ||
                 tags: str) -> None: | ||
        self.dashboard_group = dashboard_group | ||
        self.dashboard_name = dashboard_name | ||
        self.description = description | ||
        self.last_reload_time = last_reload_time | ||
        self.user_id = user_id | ||
        self.user_name = user_name | ||
        self.tags = tags | ||
|
||
    def __repr__(self) -> str: | ||
        return 'Dashboard(dashboard_group={!r}, dashboard_name={!r}, ' \ | ||
               'description={!r}, last_reload_time={!r}, user_id={!r}, ' \ | ||
               'user_name={!r}, tags={!r})' \ | ||
            .format(self.dashboard_group, | ||
                    self.dashboard_name, | ||
                    self.description, | ||
                    self.last_reload_time, | ||
                    self.user_id, | ||
                    self.user_name, | ||
                    self.tags) |
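
The bare `*` in the `Dashboard` constructor makes every argument keyword-only. The following sketch shows what that means for callers, using a trimmed stand-in class `MiniDashboard` (hypothetical, with only two of the fields) so that the example is self-contained.

```python
class MiniDashboard:
    """Trimmed, hypothetical stand-in for Dashboard with the same constructor style."""

    def __init__(self, *, dashboard_group: str, dashboard_name: str) -> None:
        self.dashboard_group = dashboard_group
        self.dashboard_name = dashboard_name

    def __repr__(self) -> str:
        return 'MiniDashboard(dashboard_group={!r}, dashboard_name={!r})' \
            .format(self.dashboard_group, self.dashboard_name)


# Keyword arguments are accepted.
ok = MiniDashboard(dashboard_group='finance', dashboard_name='revenue')

# Positional arguments are rejected with a TypeError because of the bare `*`.
try:
    MiniDashboard('finance', 'revenue')
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

With several string fields of the same type, keyword-only arguments prevent values from being silently swapped at the call site.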
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
from typing import Iterable | ||
|
||
|
||
class Metric: | ||
    def __init__(self, *, | ||
                 dashboard_group: str, | ||
                 dashboard_name: str, | ||
                 metric_name: str, | ||
                 metric_function: list, | ||
                 metric_description: str, | ||
                 metric_type: str, | ||
                 metric_group: str) -> None: | ||
        self.dashboard_group = dashboard_group | ||
        self.dashboard_name = dashboard_name | ||
        self.metric_name = metric_name | ||
        self.metric_function = metric_function | ||
        self.metric_description = metric_description | ||
        self.metric_type = metric_type | ||
        self.metric_group = metric_group | ||
|
||
    def __repr__(self) -> str: | ||
        return 'Metric(dashboard_group={!r}, dashboard_name={!r}, ' \ | ||
               'metric_name={!r}, metric_function={!r}, metric_description={!r}, ' \ | ||
               'metric_type={!r}, metric_group={!r})' \ | ||
            .format(self.dashboard_group, | ||
                    self.dashboard_name, | ||
                    self.metric_name, | ||
                    self.metric_function, | ||
                    self.metric_description, | ||
                    self.metric_type, | ||
                    self.metric_group) |
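
`Metric` and `Dashboard` both hand-write `__init__` and `__repr__` for a flat list of fields. As a possible alternative, and not what this commit does, the same shape could be expressed with a dataclass, which generates both methods plus `__eq__`. The class name `MetricSketch` and the sample values are hypothetical, and the sketch assumes Python 3.7 or later.

```python
from dataclasses import dataclass


@dataclass
class MetricSketch:
    """Hypothetical dataclass variant with the same fields as Metric."""
    dashboard_group: str
    dashboard_name: str
    metric_name: str
    metric_function: list
    metric_description: str
    metric_type: str
    metric_group: str


first = MetricSketch(dashboard_group='finance',
                     dashboard_name='revenue',
                     metric_name='total_sales',
                     metric_function=['sum'],
                     metric_description='Sum of all sales',
                     metric_type='currency',
                     metric_group='sales')

# A second instance with identical field values compares equal to the first.
second = MetricSketch(**vars(first))
text = repr(first)
```

One difference to keep in mind is that a plain dataclass also accepts positional arguments, whereas the hand-written constructor is keyword-only.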