Commit 1a61ed4

Add documentation + session update sub-command
1 parent 706d277 commit 1a61ed4

3 files changed

+214
-27
lines changed

docs/cli.md

Lines changed: 92 additions & 14 deletions
@@ -45,16 +45,14 @@ Like most command line tools, `--help` is your best friend. Use it to discover t
4545

4646
```console
4747
datahub --help
4849
Usage: datahub [OPTIONS] COMMAND [ARGS]...
4950

5051
Options:
51-
--debug / --no-debug Enable debug logging.
52-
--log-file FILE Enable debug logging.
53-
--debug-vars / --no-debug-vars Show variable values in stack traces. Implies --debug. While we try to avoid
54-
printing sensitive information like passwords, this may still happen.
55-
--version Show the version and exit.
56-
-dl, --detect-memory-leaks Run memory leak detection.
57-
--help Show this message and exit.
52+
--debug / --no-debug Enable debug logging.
53+
--log-file FILE Write debug-level logs to a file.
54+
--version Show the version and exit.
55+
--help Show this message and exit.
5856

5957
Commands:
6058
actions <disabled due to missing dependencies>
@@ -76,6 +74,7 @@ Commands:
7674
migrate Helper commands for migrating metadata within DataHub.
7775
properties A group of commands to interact with structured properties in DataHub.
7876
put A group of commands to put metadata in DataHub.
77+
session Manage DataHub session profiles
7978
state Managed state stored in DataHub by stateful ingestion.
8079
telemetry Toggle telemetry.
8180
timeline Get timeline for an entity based on certain categories
@@ -280,16 +279,83 @@ DATAHUB_TELEMETRY_TIMEOUT=10
280279
DATAHUB_DEBUG=false
281280
```
282281

283-
### container
282+
### session
284283

285-
A group of commands to interact with containers in DataHub.
284+
**🤝 Version Compatibility:** `acryl-datahub>=0.15.0.5`
285+
286+
The `session` group of commands lets you manage DataHub sessions for multiple instances from the same CLI. The session information is stored in a local file, `~/.datahub/sessions.json`,
287+
which can be inspected at any time.
286288

287-
e.g. You can use this to apply a tag to all datasets recursively in this container.
288289
```shell
289-
datahub container tag --container-urn "urn:li:container:0e9e46bd6d5cf645f33d5a8f0254bc2d" --tag-urn "urn:li:tag:tag1"
290-
datahub container domain --container-urn "urn:li:container:3f2effd1fbe154a4d60b597263a41e41" --domain-urn "urn:li:domain:ajsajo-b832-4ab3-8881-7ed5e991a44c"
291-
datahub container owner --container-urn "urn:li:container:3f2effd1fbe154a4d60b597263a41e41" --owner-urn "urn:li:corpGroup:[email protected]"
292-
datahub container term --container-urn "urn:li:container:3f2effd1fbe154a4d60b597263a41e41" --term-urn "urn:li:term:PII"
290+
datahub session --help
291+
Usage: datahub session [OPTIONS] COMMAND [ARGS]...
292+
293+
Manage DataHub session profiles
294+
295+
Options:
296+
--help Show this message and exit.
297+
298+
Commands:
299+
create Create profile with which to connect to a DataHub instance
300+
delete Delete a session profile
301+
list List all session profiles
302+
save Save the current active datahubenv config as a session
303+
use Set the active session
304+
```
305+
306+
Here we detail the sub-commands available under the session group of commands:
307+
308+
#### create
309+
310+
Use this command to create a new DataHub session. This is similar to [datahub init](#init), with an extra step to name the *profile* under which the session is saved. The profile name matters because it uniquely identifies the session.
311+
312+
```shell
313+
datahub session create
314+
```
315+
316+
:::note
317+
This command has a `--use-password` flag that can be used to authenticate to the instance with a username and password and dynamically generate an access token, valid for 1 hour, for that username.
318+
:::
319+
320+
#### update
321+
322+
Use this command to update an existing session identified by the profile name.
323+
You are then prompted for each field of the profile. Each prompt is prefilled with the existing value if one is set, or with the default if one exists.
324+
325+
```shell
326+
datahub session update -p <profile name>
327+
```
328+
329+
#### delete
330+
331+
Use this command to delete an existing session identified by the profile name.
332+
333+
```shell
334+
datahub session delete -p <profile name>
335+
```
336+
337+
#### list
338+
339+
Use this command to list all existing sessions available in `~/.datahub/sessions.json`.
340+
341+
```shell
342+
datahub session list
343+
```
344+
345+
#### save
346+
347+
Use this command to save the existing configuration stored in `~/.datahubenv` as a session with the specified name.
348+
349+
```shell
350+
datahub session save -p <profile name>
351+
```
352+
353+
#### use
354+
355+
Command used to specify which profile to use. This overwrites whatever is in `~/.datahubenv`.
356+
357+
```shell
358+
datahub session use -p <profile name>
293359
```
294360

295361
### check
@@ -626,6 +692,18 @@ Use this to delete a Data Product from DataHub. Default to `--soft` which preser
626692
# > datahub dataproduct delete --urn "urn:li:dataProduct:pet_of_the_week" --hard
627693
```
628694

695+
### container
696+
697+
A group of commands to interact with containers in DataHub.
698+
699+
e.g. You can use this to apply a tag to all datasets recursively in this container.
700+
```shell
701+
datahub container tag --container-urn "urn:li:container:0e9e46bd6d5cf645f33d5a8f0254bc2d" --tag-urn "urn:li:tag:tag1"
702+
datahub container domain --container-urn "urn:li:container:3f2effd1fbe154a4d60b597263a41e41" --domain-urn "urn:li:domain:ajsajo-b832-4ab3-8881-7ed5e991a44c"
703+
datahub container owner --container-urn "urn:li:container:3f2effd1fbe154a4d60b597263a41e41" --owner-urn "urn:li:corpGroup:[email protected]"
704+
datahub container term --container-urn "urn:li:container:3f2effd1fbe154a4d60b597263a41e41" --term-urn "urn:li:term:PII"
705+
```
706+
629707
## Miscellaneous Admin Commands
630708

631709
### lite (experimental)
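
To make the `session` workflow documented in this file concrete, here is a minimal sketch of a profile store, assuming a JSON file that maps profile names to connection settings. The helper names (`load_profiles`, `save_profiles`, `create_profile`), the `server`/`token` keys, and the temporary path are illustrative assumptions, not the actual `datahub` implementation. The sketch also refuses to overwrite an existing profile, which the commit leaves as a TODO in `create`.

```python
import json
import os
import tempfile

# Hypothetical location for this sketch; the real CLI uses ~/.datahub/sessions.json.
SESSIONS_PATH = os.path.join(tempfile.mkdtemp(), "sessions.json")


def load_profiles(path: str = SESSIONS_PATH) -> dict:
    """Return the saved profiles, or an empty mapping if the file is missing."""
    if not os.path.exists(path):
        return {}
    with open(path, "r") as f:
        return json.load(f)


def save_profiles(profiles: dict, path: str = SESSIONS_PATH) -> None:
    """Write all profiles back to disk, creating the directory if needed."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        json.dump(profiles, f, indent=2)


def create_profile(name: str, server: str, token=None, path: str = SESSIONS_PATH) -> bool:
    """Add a profile; return False instead of overwriting an existing one."""
    profiles = load_profiles(path)
    if name in profiles:
        return False
    profiles[name] = {"server": server, "token": token}
    save_profiles(profiles, path)
    return True
```

Returning `False` on a name clash keeps the decision (delete first, or update instead) with the caller, which is one possible way to resolve the TODO.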

metadata-ingestion/src/datahub/cli/session_cli.py

Lines changed: 114 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -1,18 +1,76 @@
11
import json
2+
import copy
24
import os
3-
from typing import Dict
4-
55
import click
6+
import logging
68
from tabulate import tabulate
79

810
from datahub.cli.cli_utils import fixup_gms_url, generate_access_token
911
from datahub.cli.config_utils import load_client_config, persist_raw_datahub_config
1012
from datahub.ingestion.graph.config import DatahubClientConfig
1113

14+
logger = logging.getLogger(__name__)
1215
DATAHUB_SESSIONS_PATH = os.path.expanduser("~/.datahub/sessions.json")
1316

17+
from typing import Optional, Type, Any
18+
import click
1419

15-
def load_sessions() -> Dict[str, DatahubClientConfig]:
20+
def parse_value(value: str, expected_type: Optional[Type] = None) -> tuple[Any, Optional[str]]:
21+
22+
if not value:
23+
return None, None
24+
if expected_type is None:
25+
return value, None
26+
27+
try:
28+
if expected_type == str:
29+
return value, None
30+
elif expected_type == float:
31+
return float(value), None
32+
elif expected_type == bool:
33+
return str(value).strip().lower() in ("true", "1", "yes", "y", "on"), None
34+
elif expected_type == list:
35+
return [item.strip() for item in value.split(',') if item.strip()], None
36+
elif expected_type == dict:
37+
pairs = [pair.strip().split(':', 1) for pair in value.split(',') if pair.strip()]
38+
return {k.strip(): v.strip() for k, v in pairs}, None
39+
else:
40+
return None, f"Unsupported type: {expected_type}"
41+
except Exception as e:
42+
return None, str(e)
43+
44+
45+
def dynamic_prompt(prompt_text: str, expected_type: Optional[Type] = None, default: Optional[object] = None) -> Any:
46+
type_hint = {
47+
str: "text",
48+
float: "number",
49+
bool: "boolean",
50+
list: "comma-separated values",
51+
dict: "key1:value1,key2:value2",
52+
None: "any value"
53+
}.get(expected_type, "value")
54+
55+
if default is not None:
56+
prompt_text += f" [{str(default)}]"
57+
58+
while True:
59+
if default is not None:
60+
value = click.prompt(prompt_text, default=default, show_default=False)
61+
else:
62+
value = click.prompt(prompt_text, default='', show_default=False)
63+
result, error = parse_value(value, expected_type)
64+
65+
if error and value: # Only show error and retry if there was input
66+
click.echo(f"Error: {error}")
67+
retry = click.confirm("Would you like to try again?", default=True)
68+
if not retry:
69+
return None
70+
else:
71+
return result
72+
73+
def load_sessions() -> dict[str, DatahubClientConfig]:
1674
if not os.path.exists(DATAHUB_SESSIONS_PATH):
1775
return {}
1876
with open(DATAHUB_SESSIONS_PATH, "r") as f:
@@ -23,7 +81,7 @@ def load_sessions() -> Dict[str, DatahubClientConfig]:
2381
}
2482

2583

26-
def save_sessions(sessions: Dict[str, DatahubClientConfig]) -> None:
84+
def save_sessions(sessions: dict[str, DatahubClientConfig]) -> None:
2785
os.makedirs(os.path.dirname(DATAHUB_SESSIONS_PATH), exist_ok=True)
2886
with open(DATAHUB_SESSIONS_PATH, "w") as f:
2987
json.dump(
@@ -37,6 +95,19 @@ def session() -> None:
3795
pass
3896

3997

98+
@session.command()
99+
def list() -> None:
100+
"""List all session profiles"""
101+
sessions = load_sessions()
102+
if not sessions:
103+
click.echo("No profiles found")
104+
return
105+
106+
headers = ["Profile", "URL"]
107+
table_data = [[name, config.server] for name, config in sessions.items()]
108+
click.echo(tabulate(table_data, headers=headers))
109+
110+
40111
@session.command()
41112
@click.option(
42113
"--use-password",
@@ -75,27 +146,50 @@ def create(use_password: bool) -> None:
75146

76147
profile_name = click.prompt("Enter name for profile", type=str)
77148

149+
# TODO if profile_name already exists, cancel the create, ask the customer to delete the old one or update instead
78150
config = DatahubClientConfig(server=host, token=token)
79151
sessions[profile_name] = config
80152
save_sessions(sessions)
81153
click.echo(f"Created profile: {profile_name}")
82154

83155

84156
@session.command()
85-
def list() -> None:
86-
"""List all session profiles"""
157+
@click.option(
158+
"-p",
159+
"--profile",
160+
type=str,
161+
required=True,
162+
help="Name of profile to update",
163+
)
164+
def update(profile: str) -> None:
165+
"""Update a session profile"""
87166
sessions = load_sessions()
88-
if not sessions:
89-
click.echo("No profiles found")
167+
if profile not in sessions:
168+
click.echo(f"Profile {profile} not found")
90169
return
91170

92-
headers = ["Profile", "URL"]
93-
table_data = [[name, config.server] for name, config in sessions.items()]
94-
click.echo(tabulate(table_data, headers=headers))
171+
config: DatahubClientConfig = sessions[profile]
172+
config_copy: DatahubClientConfig = copy.deepcopy(config)
173+
for key, value in config_copy.dict().items():
174+
# Get type information for each field in DatahubClientConfig so we can generate
175+
# a dynamic, type-safe click prompt for it.
176+
field_type = config_copy.__fields__.get(key).type_
177+
new_value = dynamic_prompt(f"Enter new value for {key}", field_type, value)
178+
config_copy.__setattr__(key, new_value)
95179

180+
sessions[profile] = config_copy
181+
save_sessions(sessions)
182+
click.echo(f"Updated profile: {profile}")
183+
96184

97185
@session.command()
98-
@click.argument("profile", type=str)
186+
@click.option(
187+
"-p",
188+
"--profile",
189+
type=str,
190+
required=True,
191+
help="Name of profile to delete",
192+
)
99193
def delete(profile: str) -> None:
100194
"""Delete a session profile"""
101195
sessions = load_sessions()
@@ -109,7 +203,13 @@ def delete(profile: str) -> None:
109203

110204

111205
@session.command()
112-
@click.argument("profile", type=str)
206+
@click.option(
207+
"-p",
208+
"--profile",
209+
type=str,
210+
required=True,
211+
help="Name of profile to use",
212+
)
113213
def use(profile: str) -> None:
114214
"""Set the active session"""
115215
sessions = load_sessions()
@@ -124,6 +224,7 @@ def use(profile: str) -> None:
124224

125225
@session.command()
126226
@click.option(
227+
"-p",
127228
"--profile",
128229
type=str,
129230
required=True,
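
Two parsing details in `parse_value` are easy to get wrong, so here is a small standalone sketch of them. It assumes the prompt input arrives as plain text; the helper names `parse_bool` and `parse_mapping` and the accepted word lists are hypothetical and not part of this commit. The first point is that `bool()` on any non-empty string is `True`, so `"false"` needs explicit handling. The second is that splitting a `key:value` pair on every colon breaks values such as URLs.

```python
TRUE_WORDS = {"true", "1", "yes", "y", "on"}
FALSE_WORDS = {"false", "0", "no", "n", "off"}


def parse_bool(text: str) -> bool:
    """Map common yes/no spellings to a bool; reject anything else."""
    word = text.strip().lower()
    if word in TRUE_WORDS:
        return True
    if word in FALSE_WORDS:
        return False
    raise ValueError(f"Not a boolean: {text!r}")


def parse_mapping(text: str) -> dict:
    """Parse 'k1:v1,k2:v2', splitting each pair on its first colon only."""
    result = {}
    for pair in text.split(","):
        if not pair.strip():
            continue  # tolerate empty segments such as a trailing comma
        key, sep, value = pair.partition(":")
        if not sep:
            raise ValueError(f"Expected key:value, got {pair.strip()!r}")
        result[key.strip()] = value.strip()
    return result
```

Raising `ValueError` on unrecognised input fits the existing `try`/`except` in `parse_value`, which turns exceptions into an error message and offers a retry.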

metadata-ingestion/src/datahub/configuration/common.py

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,7 @@
1515
)
1616

1717
import pydantic
18+
from pydantic.fields import ModelField
1819
from cached_property import cached_property
1920
from pydantic import BaseModel, Extra, ValidationError
2021
from pydantic.fields import Field
@@ -122,6 +123,13 @@ def parse_obj_allow_extras(cls, obj: Any) -> Self:
122123
with unittest.mock.patch.object(cls.Config, "extra", pydantic.Extra.allow):
123124
return cls.parse_obj(obj)
124125

126+
@classmethod
127+
def is_field_required(cls, field_name: str) -> bool:
128+
field: ModelField | None = cls.__fields__.get(field_name, None)
129+
if field:
130+
return field.required
131+
return False
132+
125133

126134
class PermissiveConfigModel(ConfigModel):
127135
# A permissive config model that allows extra fields.
