-
Notifications
You must be signed in to change notification settings - Fork 227
[JDBC] Support multiple Quarkus datasources #1482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
...n/java/org/apache/polaris/extension/persistence/relational/jdbc/JdbcBasePersistenceImpl.java
Outdated
Show resolved
Hide resolved
...c/src/main/java/org/apache/polaris/extension/persistence/relational/jdbc/JdbcDatasource.java
Outdated
Show resolved
Hide resolved
quarkus/admin/src/main/java/org/apache/polaris/admintool/config/QuarkusJdbcDataSource.java
Outdated
Show resolved
Hide resolved
#quarkus.datasource.jdbc.url=polaris | ||
#quarkus.datasource.username=polaris | ||
#quarkus.datasource.password=polaris | ||
quarkus.datasource.\"realm1\".db-kind=pgsql |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this for testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, to be precise, This is for this behaviour
I observed some unexplainable quarkus behaviour and spent days on debugging it, but now i have some way to work around, with custom DS of quarkus if i don't specify them in application prop and even though i specify it my PostgresLifeCycleManagement test container it doesn't create these beans, so declearing even the garbage in the application prop does the trick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
having these obviously test-like entries in the production config is not nice. Could you make another attempt at removing them? Maybe you meant to put them under the test
or it
Quarkus profiles?
@@ -76,4 +81,9 @@ public PolarisConfigurationStore configurationStore() { | |||
// A configuration store is not required when running the admin tool. | |||
return new PolarisConfigurationStore() {}; | |||
} | |||
|
|||
@Produces | |||
public JdbcDatasource jdbcDatasource(@ConfigProperty(name = "polaris.persistence.realm.isolation.type") String realmIsolationType, @All List<InstanceHandle<DataSource>> dataSources) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of the polaris.persistence.realm.isolation.type
config? Considering that there could be many Polaris servers, each running with different config, this settings is rather confusion IMHO.
A given Polaris server merely needs a way to find a DataSource for each Realm ID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting if specific custom DS is there then use realm specific DS or otherwise use the default one if not there, would it not create a confusion if there are both ?
For example :
consider realm1, realm2 ....
and the admin configure
Map.entry("quarkus.datasource.realm1.db-kind", "pgsql"),
Map.entry("quarkus.datasource.realm1.active", "true"),
Map.entry("quarkus.datasource.realm1.jdbc.url", postgres.getJdbcUrl()),
Map.entry("quarkus.datasource.realm1.username", postgres.getUsername()),
Map.entry("quarkus.datasource.realm1.password", postgres.getPassword()),
Map.entry("quarkus.datasource.db-kind", "pgsql"),
Map.entry("quarkus.datasource.jdbc.url", postgres.getJdbcUrl()),
Map.entry("quarkus.datasource.username", postgres.getUsername()),
Map.entry("quarkus.datasource.password", postgres.getPassword()),
do we silently make the realm1 go to realm1 DS and realm2 to default DS ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that the list of realms is restricted by RealmContextConfiguration
right now.
Since we have a fixed list of realm IDs, we can have a fixed map realm ID -> DS name, e.g.
quarkus.datasource.dsA.jdbc.url=...
quarkus.datasource.dsB.jdbc.url=...
then
polaris.realm-contex.data-source.realm1=dsA
polaris.realm-contex.data-source.realm2=dsA
polaris.realm-contex.data-source.realm3=dsB
It's up to the Polaris Admin to provide this mapping. The default would be to map all realms to the "default" DS.
In java:
public interface RealmContextConfiguration {
Map<String, String> dataSource();
[...]
}
No other config is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you prefer we could have a different config class for that Map
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see so essentially having this mapping without an explicit config to support 3 use case :
for ex consider realm1, realm2, realm3
- realm1, realm2, realm3 are 1 DB each
- realm1, realm2, realm3 are 1 DB and each table has 3 realms (as PK)
- realm 1 can be DB realm2 and realm3 can be just 1 DB and each table as 2 PK
where as having a config which decides the isolation level just allows :
case 1, 2
Thoughts ? i think having config like that reduces un-intentional error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my POV this is the Polaris Admin kind of decision. As long as Polaris provides sufficient mechanisms for admins to implement all supported strategies, it's up to the admins if they want to mix and match or isolate everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, custom donwstream builds will probably want to do the mapping in custom code and apply whatever logic fits their use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's important that the configuration capture the notion of a realm strategy -- i.e. do I route requests to a metastore only based on an explicit configuration for that realm, or do I send all requests to some metastore by default?
Whether that's done explicitly by configuring a strategy or implicitly through the presence (or absence) of a default configuration, I think it's fine. Consider a setup like the following:
The admin has configured DSs for some individual realms:
polaris.realm-contex.data-source.realm.realm-1=dsA
polaris.realm-contex.data-source.realm.realm-2=dsA
polaris.realm-contex.data-source.realm.realm-3=dsB
A new realm realm-4
shows up -- I expect things to not work as the system does not know where to route the request.
Later, the admin configures polaris with a new default:
polaris.realm-contex.data-source.default=dsC
Now, I expect new realms realm-4
and realm-5
to get routed to dsC
. If the admin later "promotes" realm-4
to have its own DS, I would expect the service to respect that.
The important thing is that the system does not route requests to some invisible default backend; the admin explicitly must configure the system to route unknown (unconfigured) realms to some specific DS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use explicit default DS configuration makes sense to me.
I'd also suggest setting built-in configuration such that out-of-the-box all realms map to the "default" datasource (for simplicity), but I do not feel too strongly about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds fair, i think being explicit would be really helpful, so i am also +1
...src/main/java/org/apache/polaris/test/commons/PostgresRelationalJdbcLifeCycleManagement.java
Outdated
Show resolved
Hide resolved
b15b964
to
ed9e3a9
Compare
quarkus/admin/src/main/java/org/apache/polaris/admintool/config/QuarkusDatasourceSupplier.java
Outdated
Show resolved
Hide resolved
b1abc4b
to
8e587a0
Compare
...mons/src/main/java/org/apache/polaris/commons/PostgresRelationalJdbcLifeCycleManagement.java
Outdated
Show resolved
Hide resolved
quarkus/commons/src/main/java/org/apache/polaris/commons/QuarkusDatasourceSupplier.java
Outdated
Show resolved
Hide resolved
...c/main/java/org/apache/polaris/extension/persistence/relational/jdbc/DatasourceSupplier.java
Outdated
Show resolved
Hide resolved
@@ -315,6 +315,12 @@ private PolarisBaseEntity getPolarisBaseEntity(String query) { | |||
return results.getFirst(); | |||
} | |||
} catch (SQLException e) { | |||
// This look-up is used for checking if the realm is bootstrap or not. | |||
// If we have 1 DB per realm it might happen that the realm is not boostrap | |||
// at all. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So why not throw an exception in that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its like we checked and table didn't exists, so this looks up should not fail but rather say that he nothing is bootstrappped and hence trigger actual bootstrap or purge or something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would returning null
help with bootstrapping?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes entity null would mean entity not found and this would prompt bootstrap which inturn would run the bootstrap script. Otherwise RTE here is everything broken, i agree this is not an ideal but this is something we have from persistence we expect EntityResult
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic is too obscure IMHO. Would it be preferable to have a dedicated isRealmBootstrapped(realmId)
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about if take this is in my MetaStoreManager refactor ? #1462
I have assigned it to me
...va/org/apache/polaris/extension/persistence/relational/jdbc/RelationalJdbcConfiguration.java
Outdated
Show resolved
Hide resolved
quarkus/admin/src/main/java/org/apache/polaris/admintool/config/QuarkusProducers.java
Outdated
Show resolved
Hide resolved
#quarkus.datasource.jdbc.url=polaris | ||
#quarkus.datasource.username=polaris | ||
#quarkus.datasource.password=polaris | ||
quarkus.datasource.\"realm1_ds\".db-kind=pgsql |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not sure why we need the multitude of test-like datasource configs in the production sections of the admin tool properties 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure whats happening but here is the reason behind : #1482 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem here is that for a Datasource to be created by Quarkus, there has to be at least one build-time property defined for that datasource: db-kind
.
And you cannot pass build-time properties through QuarkusTestResourceLifecycleManager#start
.
So PostgresRelationalJdbcLifeCycleManagement
is trying to create datasources dynamically, but since this is happening at runtime, this is not working.
Thus for tests you have to declare the datasources you intend to use here, along with their db-kind:
quarkus.datasource.\"realm1_ds\".db-kind=pgsql
quarkus.datasource.\"realm2_ds\".db-kind=pgsql
quarkus.datasource.\"realm3_ds\".db-kind=pgsql
# etc
Other runtime properties, like jdbc.url
, username
and password
can (and should) be defined in PostgresRelationalJdbcLifeCycleManagement
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If quarkus.datasource.\"realm1_ds\".db-kind=pgsql
has to be defined at build time, that will be a problem for users of the binary distribution, I guess.
I'm surprised this is required, because the default datasource does not have to be defined at build time, if I'm not mistaken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If quarkus.datasource."realm1_ds".db-kind=pgsql has to be defined at build time, that will be a problem for users of the binary distribution, I guess.
See https://quarkus.io/guides/datasource#configure-multiple-datasources:
Even when only one database extension is installed, named databases need to specify at least one build-time property so that Quarkus can detect them. Generally, this is the db-kind property [...]
This is why I asked this question, I think there is some misunderstanding going on: #1482 (comment)
I'm surprised this is required, because the default datasource does not have to be defined at build time, if I'm not mistaken.
The default datasource is treated slightly different: if db-kind
is missing, the JDBC driver will be guessed from the available JDBC drivers.
But that doesn't help here, since the intent is to use many named datasources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That datasource is created manually and is not managed by Quarkus. We can do that of course, but my impression was that we want to keep using Quarkus-managed datasources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be ideal to let Quarkus manage the Data Sources, but having to enumerate them at build time kills the user-level benefits, IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could preconfigure (quarkus.datasource."pg1".db-kind=pgsql
, etc) a few (say 5) Data Sources and allow users to provide connection strings and map them to realms in runtime... but that still looks awkward to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from my POV, it might be fine to support one Data Source (co-located realms) out-of-the-box and instruct users to use custom builds if they want to segregate realms by Data Source (experience similar to EclipseLink).
quarkus/commons/src/main/java/org/apache/polaris/commons/QuarkusDatasourceSupplier.java
Outdated
Show resolved
Hide resolved
quarkus/commons/src/main/java/org/apache/polaris/commons/QuarkusDatasourceSupplier.java
Outdated
Show resolved
Hide resolved
8e587a0
to
14ceabd
Compare
c29d549
to
80304c3
Compare
polaris-core/src/main/java/org/apache/polaris/core/config/RelationalJdbcConfiguration.java
Outdated
Show resolved
Hide resolved
polaris-core/src/main/java/org/apache/polaris/core/config/RelationalJdbcConfiguration.java
Outdated
Show resolved
Hide resolved
quarkus/admin/src/main/java/org/apache/polaris/admintool/config/QuarkusProducers.java
Outdated
Show resolved
Hide resolved
quarkus/commons/build.gradle.kts
Outdated
id("java-test-fixtures") | ||
} | ||
|
||
configurations.all { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really surprising, but indeed we need this. There is some serious Gradle snafu going on.
quarkus/commons/build.gradle.kts
Outdated
*/ | ||
|
||
plugins { | ||
alias(libs.plugins.jandex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel we are missing
alias(libs.plugins.quarkus)
id("polaris-quarkus")
But indeed adding those doesn't work. Let's investigate later.
quarkus/service/src/main/java/org/apache/polaris/service/quarkus/config/QuarkusProducers.java
Outdated
Show resolved
Hide resolved
@singhpk234 just to confirm, is the intent here:
? Your current work hints at 1 since the datasources are currently managed by Quarkus, but the tests seem to expect 2, hence my confusion. |
@adutra here is what i want to achieve : #1482 (comment) polaris.relational.jdbc.datasource.realm-1=dsA polaris.relational.jdbc.datasource.default-datasource=dsC for realm-1, realm-2 we go to dsA and dsB both need to predefined essentially this would expecte all dsA, dsB, dsC configured otherwise we fail. |
d121ff2
to
dc0fbbb
Compare
OK then, that is possible 👍 I was worried you would be wanting that realm-3 should trigger the creation of an extra datasource. Thanks for confirming. |
I suggest though that you adopt SmallRye's convention for named things: polaris.relational.jdbc.datasource=dsC # default (unnamed) datasource
polaris.relational.jdbc.datasource.realm-1=dsA # named datasource "realm-1"
polaris.relational.jdbc.datasource.realm-2=dsB # named datasource "realm-2" You will find an example of that here (from #1397): Line 36 in d51cb1a
|
IIUC you mean removing default-datasource ? i think we need default being explicit called out from the configuration. I am not entirely sure what do you imply by |
dc0fbbb
to
32bb4b9
Compare
@singhpk234 in SmallRye when you are mapping a @ConfigMapping(prefix = "polaris.relation.jdbc.datasource")
public interface RelationalJdbcConfiguration {
String DEFAULT_REALM_KEY = "<default>";
/** realmId to configured Datasource name mapping. */
@WithParentName
@WithUnnamedKey(DEFAULT_REALM_KEY)
Map<String, String> realms();
} An example configuration would be: # Default config:
polaris.relation.jdbc.datasource=realm1_ds
# Alternative configs per realm:
polaris.relation.jdbc.datasource.realm2=realm2_ds
polaris.relation.jdbc.datasource.realm3=realm3_ds A typical usage would be: String dataSourceName;
if (relationalJdbcConfiguration.realms().containsKey(realmId)) {
dataSourceName = relationalJdbcConfiguration.realms().get(realmId);
} else {
dataSourceName =
relationalJdbcConfiguration.realms().get(RelationalJdbcConfiguration.DEFAULT_REALM_KEY);
} This convention is used in many other places, so from a user XP perspective it would be nice to have the same here. |
This is a nice idea ! |
About the change
Presently this just makes the test pass of Admin and i am putting this out to get some early feedback.
This also add a check for table not exists when doing a check if the realm is bootstrapped or not so that purge can proceed, as if the realm is not bootstraped tables itself will not be there.
I observed some unexplainable quarkus behaviour and spent days on debugging it, but now i have some way to work around, with custom DS of quarkus if i don't specify them in application prop and even though i specify it my PostgresLifeCycleManagement test container it doesn't create these beans, so declearing even the garbage in the application prop does the trick.