A Terraform module to deploy and manage datasets in Google BigQuery, a serverless, highly scalable, and cost-effective multicloud data warehouse designed for business agility in Google Cloud. https://cloud.google.com/bigquery

terraform-google-bigquery-dataset

A Terraform module to create a Google BigQuery dataset on Google Cloud Platform (GCP).

This module supports Terraform version 1 and is compatible with the Terraform Google Provider versions 4 and 5.

This module is part of our Infrastructure as Code (IaC) framework that enables our users and customers to easily deploy and manage reusable, secure, and production-grade cloud infrastructure.

Module Features

A Terraform base module for creating google_bigquery_dataset resources. Datasets are top-level containers that are used to organize and control access to your tables and views.

Getting Started

The most basic usage sets only the required arguments:

module "terraform-google-bigquery-dataset" {
  source = "github.com/mineiros-io/terraform-google-bigquery-dataset.git?ref=v0.1.1"

  dataset_id = "example_dataset"
}

Module Argument Reference

See variables.tf and examples/ for details and use-cases.

Top-level Arguments

Main Resource Configuration

  • dataset_id: (Required string)

    A unique ID for this dataset, without the project name.

  • friendly_name: (Optional string)

    A descriptive name for the dataset.

  • description: (Optional string)

    A user-friendly description of the dataset.

  • project: (Optional string)

    The ID of the project in which the resource belongs. Default is the project that is configured in the provider.

  • location: (Optional string)

    The geographic location where the dataset should reside.

  • default_table_expiration_ms: (Optional number)

    The default lifetime of all tables in the dataset, in milliseconds. The minimum value is 3600000 milliseconds (one hour). Once this property is set, all newly-created tables in the dataset will have an expirationTime property set to the creation time plus this value, and changing the value will only affect new tables, not existing ones. When the expirationTime for a given table is reached, that table is deleted automatically. If you provide an explicit expirationTime when creating a table, that value takes precedence over the default expiration time indicated by this property.

    Default is null.

  • default_partition_expiration_ms: (Optional number)

    The default partition expiration for all partitioned tables in the dataset, in milliseconds. Once this property is set, all newly-created partitioned tables in the dataset will have an expirationMs property in the timePartitioning settings set to this value, and changing the value will only affect new tables, not existing ones. The storage in a partition will have an expiration time of its partition time plus this value. Setting this property overrides the use of defaultTableExpirationMs for partitioned tables: only one of defaultTableExpirationMs and defaultPartitionExpirationMs will be used for any new partitioned table. If you provide an explicit timePartitioning.expirationMs when creating or updating a partitioned table, that value takes precedence over the default partition expiration time indicated by this property.
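
    Example (a minimal sketch; the durations are illustrative assumptions, not recommended values). Writing the values as arithmetic keeps the conversion to milliseconds readable:

    ```hcl
    module "terraform-google-bigquery-dataset" {
      source = "github.com/mineiros-io/terraform-google-bigquery-dataset.git?ref=v0.1.1"

      dataset_id = "example_dataset"

      # 30 days * 24 h * 60 min * 60 s * 1000 ms = 2592000000 ms
      default_table_expiration_ms = 30 * 24 * 60 * 60 * 1000

      # 7 days * 24 h * 60 min * 60 s * 1000 ms = 604800000 ms
      default_partition_expiration_ms = 7 * 24 * 60 * 60 * 1000
    }
    ```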

  • labels: (Optional map(string))

    Key-value pairs in a map for dataset labels.

    Default is {}.

  • resource_tags: (Optional any)

    The tags attached to this dataset. Tag keys are globally unique. Tag key is expected to be in the namespaced format, for example "123456789012/environment" where 123456789012 is the ID of the parent organization or project resource for this tag key. Tag value is expected to be the short name, for example "Production".

    Default is null.
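
    Example (a sketch that assumes the module accepts a map of namespaced tag key to short tag value, as the description above suggests; the key and value are the illustrative ones from that description):

    ```hcl
    module "terraform-google-bigquery-dataset" {
      source = "github.com/mineiros-io/terraform-google-bigquery-dataset.git?ref=v0.1.1"

      dataset_id = "example_dataset"

      # Key: "<parent organization or project ID>/<tag key short name>"
      # Value: tag value short name
      resource_tags = {
        "123456789012/environment" = "Production"
      }
    }
    ```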

  • max_time_travel_hours: (Optional number)

    Defines the time travel window in hours. The value can be from 48 to 168 hours (2 to 7 days).

    Default is null.

  • external_dataset_reference: (Optional any)

    Information about the external metadata storage where the dataset is defined. Supported attributes:

    • external_source - (Required) External source that backs this dataset.
    • connection - (Required) The connection id that is used to access the externalSource. Format: projects/{projectId}/locations/{locationId}/connections/{connectionId}

    Default is null.
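
    Example (a sketch only: it assumes the module accepts an object with the two attributes listed above; the project, connection, location, and AWS Glue ARN are hypothetical placeholders):

    ```hcl
    module "terraform-google-bigquery-dataset" {
      source = "github.com/mineiros-io/terraform-google-bigquery-dataset.git?ref=v0.1.1"

      dataset_id = "example_external_dataset"

      # Assumption: the dataset location matches the connection's location.
      location = "aws-us-east-1"

      external_dataset_reference = {
        # Hypothetical external source backing this dataset.
        external_source = "aws-glue://arn:aws:glue:us-east-1:999999999999:database/example_database"

        # Format: projects/{projectId}/locations/{locationId}/connections/{connectionId}
        connection = "projects/my-project/locations/aws-us-east-1/connections/my-connection"
      }
    }
    ```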

  • is_case_insensitive: (Optional bool)

    TRUE if the dataset and its table names are case-insensitive, otherwise FALSE. By default, this is FALSE, which means the dataset and its table names are case-sensitive. This field does not affect routine references.

    Default is false.

  • access: (Optional list(access))

    An array of objects that define dataset access for one or more entities.

    Default is [].

    Each access object in the list accepts the following attributes:

    • domain: (Optional string)

      A domain to grant access to. Any users signed in with the domain specified will be granted the specified access.

    • role: (Optional string)

      Describes the rights granted to the user specified by the other member of the access object. Basic, predefined, and custom roles are supported. Predefined roles that have equivalent basic roles are swapped by the API to their basic counterparts.

    • group_by_email: (Optional string)

      An email address of a Google Group to grant access to.

    • user_by_email: (Optional string)

      An email address of a Google User to grant access to.

    • special_group: (Optional string)

      A special group to grant access to. Possible values include:

      • projectOwners: Owners of the enclosing project.
      • projectReaders: Readers of the enclosing project.
      • projectWriters: Writers of the enclosing project.
      • allAuthenticatedUsers: All authenticated BigQuery users.
  • view: (Optional object(view))

    A view from a different dataset to grant access to.

    Default is [].

    The view object accepts the following attributes:

    • project_id: (Required string)

      The ID of the project containing this table.

    • table_id: (Required string)

      The ID of the table.

    • dataset_id: (Required string)

      The ID of the dataset containing this table.

  • role: (Optional map(role))

    A map of dataset-level roles including the role, special_group, group_by_email, and user_by_email.

    Default is [].
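
    Example for the access argument described above (a sketch under assumptions: each entry pairs a role with exactly one entity attribute, the basic roles OWNER and READER are used, and the group address and domain are hypothetical):

    ```hcl
    module "terraform-google-bigquery-dataset" {
      source = "github.com/mineiros-io/terraform-google-bigquery-dataset.git?ref=v0.1.1"

      dataset_id = "example_dataset"

      access = [
        {
          # Owners of the enclosing project keep full control.
          role          = "OWNER"
          special_group = "projectOwners"
        },
        {
          # Hypothetical Google Group with read access.
          role           = "READER"
          group_by_email = "analysts@example.com"
        },
        {
          # Any user signed in with this domain gets read access.
          role   = "READER"
          domain = "example.com"
        }
      ]
    }
    ```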

  • default_encryption_configuration: (Optional object(default_encryption_configuration))

    The default encryption key for all tables in the dataset. Once this property is set, all newly-created partitioned tables in the dataset will have their encryption key set to this value, unless the table creation request (or query) overrides the key.

    The default_encryption_configuration object accepts the following attributes:

    • kms_key_name: (Required string)

      Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key.
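
    Example (a minimal sketch; the project, key ring, and key names are hypothetical, and the key is assumed to already exist with access granted to the BigQuery service account):

    ```hcl
    module "terraform-google-bigquery-dataset" {
      source = "github.com/mineiros-io/terraform-google-bigquery-dataset.git?ref=v0.1.1"

      dataset_id = "example_dataset"
      location   = "US"

      default_encryption_configuration = {
        # Cloud KMS key resource name:
        # projects/{project}/locations/{location}/keyRings/{keyRing}/cryptoKeys/{key}
        kms_key_name = "projects/my-project/locations/us/keyRings/my-keyring/cryptoKeys/my-key"
      }
    }
    ```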

  • delete_contents_on_destroy: (Optional bool)

    If set to true, delete all the tables in the dataset when destroying the resource; otherwise, destroying the resource will fail if tables are present.

    Default is false.

  • authoritative: (Optional bool)

    Whether to exclusively set (authoritative mode) or add (non-authoritative/additive mode) members to the role.

    Default is true.

  • iam: (Optional list(iam))

    A list of IAM access settings to apply to the created BigQuery dataset.

    Default is [].

    Each iam object in the list accepts the following attributes:

    • role: (Required string)

      The role that should be applied. Note that custom roles must be of the format [projects|organizations]/{parent-name}/roles/{role-name}.

    • members: (Optional set(string))

      Identities that will be granted the privilege in role. Each entry can have one of the following values:

      • allUsers: A special identifier that represents anyone who is on the internet; with or without a Google account.
      • allAuthenticatedUsers: A special identifier that represents anyone who is authenticated with a Google account or a service account.
      • user:{emailid}: An email address that represents a specific Google account. For example, alice@gmail.com or joe@example.com.
      • serviceAccount:{emailid}: An email address that represents a service account. For example, my-other-app@appspot.gserviceaccount.com.
      • group:{emailid}: An email address that represents a Google group. For example, admins@example.com.
      • domain:{domain}: A G Suite domain (primary, instead of alias) name that represents all the users of that domain. For example, google.com or example.com.

      Default is [].

    • authoritative: (Optional bool)

      Whether to exclusively set (authoritative mode) or add (non-authoritative/additive mode) members to the role.

      Default is true.
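
    Example for the iam argument (a sketch; the group and service account addresses are hypothetical, and the two predefined BigQuery roles are chosen only for illustration):

    ```hcl
    module "terraform-google-bigquery-dataset" {
      source = "github.com/mineiros-io/terraform-google-bigquery-dataset.git?ref=v0.1.1"

      dataset_id = "example_dataset"

      iam = [
        {
          # Authoritative (the default): this list becomes the complete
          # set of members for the role on this dataset.
          role    = "roles/bigquery.dataViewer"
          members = ["group:analysts@example.com"]
        },
        {
          # Additive: members are added without removing existing ones.
          role          = "roles/bigquery.dataEditor"
          members       = ["serviceAccount:etl@my-project.iam.gserviceaccount.com"]
          authoritative = false
        }
      ]
    }
    ```

    Authoritative mode replaces any members granted the role outside of Terraform, so the additive mode is the more cautious choice when a role is also managed elsewhere.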

Module Configuration

  • module_enabled: (Optional bool)

    Specifies whether resources in the module will be created.

    Default is true.

  • module_depends_on: (Optional list(dependency))

    A list of dependencies. Any object can be assigned to this list to define a hidden external dependency.

    Example:

    module_depends_on = [
      google_compute_network.network
    ]

Module Outputs

The following attributes are exported in the outputs of the module:

  • google_bigquery_dataset: (object(google_bigquery_dataset))

    The google_bigquery_dataset resource object created by this module.

  • iam: (list(iam))

    The resources created by the mineiros-io/bigquery-dataset-iam/google module.
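
The outputs above can be referenced from the calling configuration. The following is a minimal sketch that assumes the module block is named as in Getting Started and that the dataset was actually created (module_enabled left at true); the output names on the left are hypothetical:

```hcl
# Re-export the ID of the dataset created by the module.
output "dataset_id" {
  value = module.terraform-google-bigquery-dataset.google_bigquery_dataset.dataset_id
}

# Re-export the URI of the created dataset resource.
output "dataset_self_link" {
  value = module.terraform-google-bigquery-dataset.google_bigquery_dataset.self_link
}
```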

External Documentation

Google Documentation

Terraform Google Provider Documentation

Module Versioning

This Module follows the principles of Semantic Versioning (SemVer).

Given a version number MAJOR.MINOR.PATCH, we increment the:

  1. MAJOR version when we make incompatible changes,
  2. MINOR version when we add functionality in a backwards compatible manner, and
  3. PATCH version when we make backwards compatible bug fixes.

Backwards compatibility in 0.0.z and 0.y.z versions

  • Backwards compatibility in versions 0.0.z is not guaranteed when z is increased. (Initial development)
  • Backwards compatibility in versions 0.y.z is not guaranteed when y is increased. (Pre-release)
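
Because compatibility is not guaranteed across 0.y.z releases, it can help to pin the module to an exact tag rather than tracking a branch. A sketch, reusing the v0.1.1 tag from Getting Started:

```hcl
module "terraform-google-bigquery-dataset" {
  # The ?ref= query pins the module to a specific Git tag, so an
  # incompatible 0.y.z release is only picked up when the tag is changed.
  source = "github.com/mineiros-io/terraform-google-bigquery-dataset.git?ref=v0.1.1"

  dataset_id = "example_dataset"
}
```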

About Mineiros

Mineiros is a remote-first company headquartered in Berlin, Germany that solves development, automation and security challenges in cloud infrastructure.

Our vision is to massively reduce time and overhead for teams to manage and deploy production-grade and secure cloud infrastructure.

We offer commercial support for all of our modules and encourage you to reach out if you have any questions or need help. Feel free to email us at hello@mineiros.io or join our Community Slack channel.

Reporting Issues

We use GitHub Issues to track community reported issues and missing features.

Contributing

Contributions are always encouraged and welcome! For the process of accepting changes, we use Pull Requests. If you'd like more information, please see our Contribution Guidelines.

Makefile Targets

This repository comes with a handy Makefile. Run make help to see details on each available target.

License

This module is licensed under the Apache License Version 2.0, January 2004. Please see LICENSE for full details.

Copyright © 2020-2022 Mineiros GmbH
