Allow return query response in columnar format #14677

Open
xiangfu0 opened this issue Dec 17, 2024 · 4 comments
@xiangfu0
Contributor

Returning results in a columnar format would be simpler for users who process data column by column, since they would not need to reassemble the data structure.

This could be a query option that turns it on or off.

@gortiz
Contributor

gortiz commented Dec 24, 2024

Do you mean something like returning data in arrow format? That would be awesome.

@xiangfu0
Contributor Author

@gortiz That would be different. I would expect it to be a separate parameter, such as encoding.

@gortiz
Contributor

gortiz commented Dec 30, 2024

I don't get it. Could you describe the task further?

Does it affect the broker and controller REST API? The direct gRPC access? The current REST format is something like:

{
  "resultTable": {
    "dataSchema": {
      "columnNames": [
        "c1",
        "c2"
      ],
      "columnDataTypes": [
        "INT",
        "STRING"
      ]
    },
    "rows": [
      [
        208,
        "a"
      ],
      [
        243,
        "b"
      ],
      [
        279,
        "c"
      ]
    ]
  }
}

Are you suggesting changing it to something like:

{
  "resultTable": {
    "dataSchema": {
      "columnNames": [
        "c1",
        "c2"
      ],
      "columnDataTypes": [
        "INT",
        "STRING"
      ]
    },
    "columns": [
      [
        208,
        243,
        279
      ],
      [
        "a",
        "b",
        "c"
      ]
    ]
  }
}

?

That could be done with a query param, but we could also return the same data in Arrow format by changing the Accept header from application/json to application/vnd.apache.arrow.stream or application/vnd.apache.arrow.file
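To make the difference between the two shapes concrete, here is a minimal client-side sketch that transposes the current row-oriented response into the columnar shape shown above. The `columns` field name and the `rows_to_columns` helper are assumptions taken from this comment for illustration only; nothing like this exists in the API today.

```python
import json


def rows_to_columns(result_table):
    """Return a copy of a row-oriented resultTable with 'columns' instead of 'rows'.

    'columns' is the hypothetical field name proposed in this thread,
    not an existing response field.
    """
    names = result_table["dataSchema"]["columnNames"]
    rows = result_table["rows"]
    # Build one list per column; iterating over the schema keeps the
    # column count correct even when there are no rows.
    columns = [[row[i] for row in rows] for i in range(len(names))]
    return {"dataSchema": result_table["dataSchema"], "columns": columns}


row_response = {
    "resultTable": {
        "dataSchema": {
            "columnNames": ["c1", "c2"],
            "columnDataTypes": ["INT", "STRING"],
        },
        "rows": [[208, "a"], [243, "b"], [279, "c"]],
    }
}

columnar = {"resultTable": rows_to_columns(row_response["resultTable"])}
print(json.dumps(columnar, indent=2))
```

If the server returned this shape directly, clients that work column by column could skip the transposition step.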

@xiangfu0
Contributor Author

Exactly as you mentioned.
The goal is to change the response format to columnar, so that a Java or JDBC client can use a column API to iterate through the data.

Arrow format is mostly about encoding support, which could be a separate goal, but it would definitely help column API performance.
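As a rough sketch of what column-wise access on the client could look like, assuming the response carries a `columns` array parallel to `columnNames` as in the example earlier in this thread (the `get_column` helper is hypothetical, not an existing client API):

```python
def get_column(result_table, name):
    """Return all values of one column from a columnar resultTable.

    Assumes the hypothetical 'columns' field, holding one list per
    entry in dataSchema.columnNames, in the same order.
    """
    index = result_table["dataSchema"]["columnNames"].index(name)
    return result_table["columns"][index]


result_table = {
    "dataSchema": {
        "columnNames": ["c1", "c2"],
        "columnDataTypes": ["INT", "STRING"],
    },
    "columns": [[208, 243, 279], ["a", "b", "c"]],
}

# A whole column is available as one contiguous list, with no per-row unpacking.
c1_values = get_column(result_table, "c1")
c1_total = sum(c1_values)
c2_values = get_column(result_table, "c2")
print(c1_total, c2_values)
```

With the row-oriented format, the same lookup would have to visit every row and pick out one element from each.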
