[FEATURE REQUEST] Develop Apache Ranger Plugin for Polaris to Enhance Access Control for Apache Iceberg #274

Open
dbosco opened this issue Sep 8, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@dbosco

dbosco commented Sep 8, 2024

Is your feature request related to a problem? Please describe.

No response

Describe the solution you'd like

Apache Polaris provides metadata management for Apache Iceberg. From the authorization point of view, key features of Polaris include:

  • RBAC (Role-Based Access Control): Polaris supports RBAC for table and view-level operations. See Documentation
  • Role Management: Polaris allows the creation of Principals with roles like Data Engineer, Data Scientist, etc.
  • Catalog Roles: Specialized roles like Catalog Administrators, Catalog Readers, and Catalog Contributors can be defined to manage access to different parts of the data catalog.
  • Granular Privileges: Polaris provides fine-grained privileges for operations on Tables, Views, Namespaces, and Catalogs. Examples include TABLE_CREATE, TABLE_READ_DATA, TABLE_WRITE_DATA, VIEW_CREATE, NAMESPACE_CREATE, CATALOG_MANAGE_CONTENT, and more.
  • Credential Vending: Polaris vends credentials based on the specific table the user is trying to access.
  • API for Role Management: Polaris offers an API to manage grants for roles, allowing fine-tuned control over data access.
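
To make the role and privilege model described above concrete, here is a minimal sketch in Python. It is not Polaris code: the class names (`Principal`, `PrincipalRole`, `CatalogRole`), the `is_authorized` helper, and the `sales.orders` table are hypothetical, and the sketch assumes a simplified chain in which a principal holds roles, those roles map to catalog roles, and catalog roles carry privileges on named securables. It ignores privilege inheritance (for example from a namespace down to its tables).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class CatalogRole:
    """Hypothetical catalog role holding privileges per securable."""
    name: str
    # securable name (e.g. "sales.orders") -> set of granted privilege names
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, securable: str, privilege: str) -> None:
        self.grants.setdefault(securable, set()).add(privilege)


@dataclass
class PrincipalRole:
    """Hypothetical role assigned to principals (e.g. a data scientist role)."""
    name: str
    catalog_roles: List[CatalogRole] = field(default_factory=list)


@dataclass
class Principal:
    name: str
    roles: List[PrincipalRole] = field(default_factory=list)


def is_authorized(principal: Principal, securable: str, privilege: str) -> bool:
    """Return True if any catalog role reachable from the principal
    grants the privilege on the exact securable (no inheritance)."""
    return any(
        privilege in catalog_role.grants.get(securable, set())
        for role in principal.roles
        for catalog_role in role.catalog_roles
    )


# Assumed example data: a read-only catalog role for one table.
reader = CatalogRole("catalog_reader")
reader.grant("sales.orders", "TABLE_READ_DATA")
alice = Principal("alice", [PrincipalRole("data_scientist", [reader])])

can_read = is_authorized(alice, "sales.orders", "TABLE_READ_DATA")
can_write = is_authorized(alice, "sales.orders", "TABLE_WRITE_DATA")
print(can_read, can_write)
```

In this sketch `alice` can read `sales.orders` but cannot write to it, because only `TABLE_READ_DATA` was granted to the catalog role she reaches through her role.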

Objective:

To enhance the usability and security of Polaris for Apache Iceberg users, the request is to develop an Apache Ranger plugin that integrates Polaris' access control features with Apache Ranger. This integration will allow for centralized and consistent management of access policies, audit logging, and fine-grained access control across different tools used with Apache Iceberg.

Use Cases:

  1. Centralized Access Policy Management:
  • Implement centralized and consistent management of access policies for data stored using Apache Iceberg across multiple tools and environments.
  2. Access Control for Data Engineering Workloads:
  • Manage and control access to datasets used by Data Engineering workloads (e.g., Apache Spark) with a coarser-grained approach at the table level.
  3. Fine-Grained Access Control for Data Analysts:
  • Provide fine-grained access control for Data Analysts using compute engines like Trino. This control can be enforced by leveraging the native Ranger Plugin in Trino, allowing for more granular control over data access at the table, view, or even column level.
  4. Centralized Access Auditing:
  • Enable centralized collection and analysis of access audit logs across all tools used to access datasets in Iceberg, ensuring comprehensive auditing and compliance.
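
The use cases above could be sketched roughly as follows. This is an illustrative Python model only, not the Apache Ranger API or a proposed plugin design: `Policy`, `AccessRequest`, `PolicyEngine`, the access type names (`select`, `update`), and all user, catalog, and table names are assumptions. It shows one policy store answering both table-level requests (as a Spark job might issue) and column-level requests (as a Trino query might issue), with every decision written to a single audit log. Real Ranger policies also support groups, deny rules, masking, and more, none of which is modeled here.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical allow-policy; resource fields accept shell-style wildcards."""
    name: str
    catalog: str
    namespace: str
    table: str
    column: str  # "*" means the policy covers the whole table
    users: FrozenSet[str]
    access_types: FrozenSet[str]


@dataclass(frozen=True)
class AccessRequest:
    user: str
    tool: str  # engine issuing the request, recorded for auditing only
    catalog: str
    namespace: str
    table: str
    access_type: str
    column: Optional[str] = None  # None means a table-level request


def _matches(policy: Policy, req: AccessRequest) -> bool:
    if req.user not in policy.users or req.access_type not in policy.access_types:
        return False
    if not (fnmatchcase(req.catalog, policy.catalog)
            and fnmatchcase(req.namespace, policy.namespace)
            and fnmatchcase(req.table, policy.table)):
        return False
    if req.column is None:
        # A table-level request needs a policy that covers every column.
        return policy.column == "*"
    return fnmatchcase(req.column, policy.column)


class PolicyEngine:
    """Deny-by-default evaluator that records every decision."""

    def __init__(self, policies: List[Policy]) -> None:
        self.policies = list(policies)
        self.audit_log: List[dict] = []

    def check(self, req: AccessRequest) -> bool:
        matched = next((p for p in self.policies if _matches(p, req)), None)
        allowed = matched is not None
        self.audit_log.append({
            "user": req.user,
            "tool": req.tool,
            "resource": f"{req.catalog}.{req.namespace}.{req.table}",
            "column": req.column,
            "access_type": req.access_type,
            "allowed": allowed,
            "policy": matched.name if matched else None,
        })
        return allowed


engine = PolicyEngine([
    Policy("etl-orders", "prod", "sales", "orders", "*",
           frozenset({"etl_svc"}), frozenset({"select", "update"})),
    Policy("analyst-order-columns", "prod", "sales", "orders", "order_*",
           frozenset({"bob"}), frozenset({"select"})),
])

spark_ok = engine.check(
    AccessRequest("etl_svc", "spark", "prod", "sales", "orders", "update"))
trino_ok = engine.check(
    AccessRequest("bob", "trino", "prod", "sales", "orders", "select", "order_total"))
trino_denied = engine.check(
    AccessRequest("bob", "trino", "prod", "sales", "orders", "select", "customer_email"))
print(spark_ok, trino_ok, trino_denied, len(engine.audit_log))
```

The point of the sketch is the shape of the integration rather than the details: policies are defined once, each engine asks the same evaluator, and the audit trail is collected in one place regardless of which tool made the request.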

Expected Deliverables:

  • A fully functional Apache Ranger plugin for Polaris that supports the outlined use cases.
  • Documentation on how to configure and deploy the plugin.
  • Integration tests to ensure the plugin works as expected with Apache Iceberg and other tools like Apache Spark and Trino.
  • A detailed user guide explaining how to use the plugin for managing access control in various scenarios.

Describe alternatives you've considered

No response

Additional context

References

PolarisAuthorizer Class on GitHub: The PolarisAuthorizer class provides the core authorization logic in Polaris, which can be leveraged by the Apache Ranger plugin.

Many Apache and other open source projects, such as Presto (https://prestodb.io/docs/current/connector/hive-security.html#ranger-based-authorization), Trino (trinodb/trino#22674), Apache Hive (https://github.com/apache/ranger/tree/master/hive-agent), and Apache Kafka (https://cwiki.apache.org/confluence/display/RANGER/Kafka+Plugin), have native integration with Apache Ranger. Some of these might also benefit from this integration.

A corresponding tracking JIRA has also been created in the Apache Ranger project: https://issues.apache.org/jira/browse/RANGER-4910

@dbosco dbosco added the enhancement New feature or request label Sep 8, 2024
@csi-mboero

Hello all,
I share a strong interest in this issue and I'm looking forward to any updates or insights. Thanks for addressing it!

@sankalp-vairat

Hello,
This feature would be extremely beneficial for implementing fine-grained access control for Apache Iceberg. Looking forward to updates!

@jbonofre
Member

Clearly a great proposal (and already planned, to be honest). We love contributions ;)

@dbosco
Author

dbosco commented Oct 24, 2024

@jbonofre I am happy to start working on some design considerations. Let me know whether Polaris has a design template I should follow, or whether I should start with an initial document that we can then iterate on. Thanks

5 participants
@jbonofre @dbosco @sankalp-vairat @csi-mboero and others