Push ad-hoc queries down to the storage layer
For large datasets, this is considerably more efficient than fetching the entire dataset.
For small datasets, fetching is still cheaper, so ideally a heuristic should decide between the two approaches.
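One possible shape for such a heuristic is sketched below. It is only an illustration: the `DatasetInfo` type, the `choose_strategy` function and the 64 MiB threshold are all assumptions, not part of any existing API, and a real cut-off would need to be measured per deployment.

```python
from dataclasses import dataclass

# Hypothetical cut-off: datasets smaller than this are fetched whole.
# The value is an assumption and would need tuning in practice.
FETCH_THRESHOLD_BYTES = 64 * 1024 * 1024


@dataclass(frozen=True)
class DatasetInfo:
    """Hypothetical summary of a stored dataset; only the size is used here."""
    size_bytes: int


def choose_strategy(dataset: DatasetInfo, threshold: int = FETCH_THRESHOLD_BYTES) -> str:
    """Return "fetch" for small datasets and "push_down" for large ones."""
    return "fetch" if dataset.size_bytes < threshold else "push_down"
```

A size-only rule is the simplest option. Query selectivity or storage format could also feed into the decision, but those inputs are not described in this issue.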
The base implementation uses a long-lived instance of the runtime running as a service, which executes queries as SQL models. This has the advantage of being portable to deployment situations where native query capabilities are not easily available. Push-down is achieved using Spark for large datasets and Arrow for small datasets.
Optimised implementations can be added for the cloud providers and Hadoop, which all have technologies for creating query interfaces over files held in storage. Push-down is achieved by converting standard SQL into a query on the underlying data technology. The data service creates and destroys queryable tables in the infrastructure on demand and tracks them using updates to the storage definition.
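The on-demand table lifecycle described above might look roughly like the following sketch. Every name in it (`QueryBackend`, `StorageDefinition`, `InMemoryBackend`, `ensure_table`, `release_table`) is hypothetical and chosen for illustration. The in-memory backend is a stand-in for a platform's native query technology, not a real integration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


class QueryBackend(Protocol):
    """Hypothetical interface over a platform's native query technology."""

    def create_table(self, table_name: str, storage_path: str) -> None: ...

    def drop_table(self, table_name: str) -> None: ...


@dataclass
class StorageDefinition:
    """Hypothetical storage definition: where the data lives, plus live tables."""
    storage_path: str
    query_tables: set[str] = field(default_factory=set)


class InMemoryBackend:
    """Stand-in backend, present only so the sketch can run."""

    def __init__(self) -> None:
        self.tables: dict[str, str] = {}

    def create_table(self, table_name: str, storage_path: str) -> None:
        self.tables[table_name] = storage_path

    def drop_table(self, table_name: str) -> None:
        self.tables.pop(table_name, None)


def ensure_table(backend: QueryBackend, definition: StorageDefinition, table_name: str) -> None:
    """Create the queryable table if needed and record it in the definition."""
    if table_name not in definition.query_tables:
        backend.create_table(table_name, definition.storage_path)
        definition.query_tables.add(table_name)


def release_table(backend: QueryBackend, definition: StorageDefinition, table_name: str) -> None:
    """Drop the queryable table and remove it from the definition."""
    if table_name in definition.query_tables:
        backend.drop_table(table_name)
        definition.query_tables.discard(table_name)
```

Keeping the list of live tables on the storage definition means the definition stays the single record of what exists in the infrastructure, so a restarted service could find and clean up tables it created earlier.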