Skip to content

[JDBC] Support multiple Quarkus datasources #1482

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

singhpk234
Copy link
Contributor

@singhpk234 singhpk234 commented Apr 28, 2025

About the change

Presently this just makes the test pass of Admin and i am putting this out to get some early feedback.

This also add a check for table not exists when doing a check if the realm is bootstrapped or not so that purge can proceed, as if the realm is not bootstraped tables itself will not be there.

      if (e.getSQLState().equals("42P01")) {
        return null;
      }

I observed some unexplainable quarkus behaviour and spent days on debugging it, but now i have some way to work around, with custom DS of quarkus if i don't specify them in application prop and even though i specify it my PostgresLifeCycleManagement test container it doesn't create these beans, so declearing even the garbage in the application prop does the trick.

#quarkus.datasource.jdbc.url=polaris
#quarkus.datasource.username=polaris
#quarkus.datasource.password=polaris
quarkus.datasource.\"realm1\".db-kind=pgsql
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this for testing?

Copy link
Contributor Author

@singhpk234 singhpk234 Apr 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, to be precise, This is for this behaviour

I observed some unexplainable quarkus behaviour and spent days on debugging it, but now i have some way to work around, with custom DS of quarkus if i don't specify them in application prop and even though i specify it my PostgresLifeCycleManagement test container it doesn't create these beans, so declearing even the garbage in the application prop does the trick.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having these obviously test-like entries in the production config is not nice. Could you make another attempt at removing them? Maybe you meant to put them under the test or it Quarkus profiles?

@@ -76,4 +81,9 @@ public PolarisConfigurationStore configurationStore() {
// A configuration store is not required when running the admin tool.
return new PolarisConfigurationStore() {};
}

@Produces
public JdbcDatasource jdbcDatasource(@ConfigProperty(name = "polaris.persistence.realm.isolation.type") String realmIsolationType, @All List<InstanceHandle<DataSource>> dataSources) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of the polaris.persistence.realm.isolation.type config? Considering that there could be many Polaris servers, each running with different config, this settings is rather confusion IMHO.

A given Polaris server merely needs a way to find a DataSource for each Realm ID.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting if specific custom DS is there then use realm specific DS or otherwise use the default one if not there, would it not create a confusion if there are both ?

For example :
consider realm1, realm2 ....
and the admin configure

              Map.entry("quarkus.datasource.realm1.db-kind", "pgsql"),
              Map.entry("quarkus.datasource.realm1.active", "true"),
              Map.entry("quarkus.datasource.realm1.jdbc.url", postgres.getJdbcUrl()),
              Map.entry("quarkus.datasource.realm1.username", postgres.getUsername()),
              Map.entry("quarkus.datasource.realm1.password", postgres.getPassword()),
              Map.entry("quarkus.datasource.db-kind", "pgsql"),
              Map.entry("quarkus.datasource.jdbc.url", postgres.getJdbcUrl()),
              Map.entry("quarkus.datasource.username", postgres.getUsername()),
              Map.entry("quarkus.datasource.password", postgres.getPassword()),

do we silently make the realm1 go to realm1 DS and realm2 to default DS ?

Copy link
Contributor

@dimas-b dimas-b Apr 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that the list of realms is restricted by RealmContextConfiguration right now.

Since we have a fixed list of realm IDs, we can have a fixed map realm ID -> DS name, e.g.

quarkus.datasource.dsA.jdbc.url=...
quarkus.datasource.dsB.jdbc.url=...

then

polaris.realm-contex.data-source.realm1=dsA
polaris.realm-contex.data-source.realm2=dsA
polaris.realm-contex.data-source.realm3=dsB

It's up to the Polaris Admin to provide this mapping. The default would be to map all realms to the "default" DS.

In java:

public interface RealmContextConfiguration {
  Map<String, String> dataSource();
[...]
}

No other config is necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you prefer we could have a different config class for that Map

Copy link
Contributor Author

@singhpk234 singhpk234 Apr 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see so essentially having this mapping without an explicit config to support 3 use case :
for ex consider realm1, realm2, realm3

  1. realm1, realm2, realm3 are 1 DB each
  2. realm1, realm2, realm3 are 1 DB and each table has 3 realms (as PK)
  3. realm 1 can be DB realm2 and realm3 can be just 1 DB and each table as 2 PK

where as having a config which decides the isolation level just allows :
case 1, 2

Thoughts ? i think having config like that reduces un-intentional error

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my POV this is the Polaris Admin kind of decision. As long as Polaris provides sufficient mechanisms for admins to implement all supported strategies, it's up to the admins if they want to mix and match or isolate everything.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, custom donwstream builds will probably want to do the mapping in custom code and apply whatever logic fits their use cases.

Copy link
Contributor

@eric-maynard eric-maynard Apr 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's important that the configuration capture the notion of a realm strategy -- i.e. do I route requests to a metastore only based on an explicit configuration for that realm, or do I send all requests to some metastore by default?

Whether that's done explicitly by configuring a strategy or implicitly through the presence (or absence) of a default configuration, I think it's fine. Consider a setup like the following:

The admin has configured DSs for some individual realms:

polaris.realm-contex.data-source.realm.realm-1=dsA
polaris.realm-contex.data-source.realm.realm-2=dsA
polaris.realm-contex.data-source.realm.realm-3=dsB

A new realm realm-4 shows up -- I expect things to not work as the system does not know where to route the request.

Later, the admin configures polaris with a new default:

polaris.realm-contex.data-source.default=dsC

Now, I expect new realms realm-4 and realm-5 to get routed to dsC. If the admin later "promotes" realm-4 to have its own DS, I would expect the service to respect that.

The important thing is that the system does not route requests to some invisible default backend; the admin explicitly must configure the system to route unknown (unconfigured) realms to some specific DS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use explicit default DS configuration makes sense to me.

I'd also suggest setting built-in configuration such that out-of-the-box all realms map to the "default" datasource (for simplicity), but I do not feel too strongly about this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds fair, i think being explicit would be really helpful, so i am also +1

@singhpk234 singhpk234 force-pushed the feature/multiple-ds branch 2 times, most recently from b15b964 to ed9e3a9 Compare April 29, 2025 03:09
@singhpk234 singhpk234 changed the title [WIP] Support 1 DS per realm [JDBC] Support multiple Quarkus datasources Apr 29, 2025
@singhpk234 singhpk234 marked this pull request as ready for review April 29, 2025 03:57
@singhpk234 singhpk234 force-pushed the feature/multiple-ds branch 3 times, most recently from b1abc4b to 8e587a0 Compare April 29, 2025 06:39
@@ -315,6 +315,12 @@ private PolarisBaseEntity getPolarisBaseEntity(String query) {
return results.getFirst();
}
} catch (SQLException e) {
// This look-up is used for checking if the realm is bootstrap or not.
// If we have 1 DB per realm it might happen that the realm is not boostrap
// at all.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So why not throw an exception in that case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its like we checked and table didn't exists, so this looks up should not fail but rather say that he nothing is bootstrappped and hence trigger actual bootstrap or purge or something

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would returning null help with bootstrapping?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes entity null would mean entity not found and this would prompt bootstrap which inturn would run the bootstrap script. Otherwise RTE here is everything broken, i agree this is not an ideal but this is something we have from persistence we expect EntityResult

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this logic is too obscure IMHO. Would it be preferable to have a dedicated isRealmBootstrapped(realmId) method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about if take this is in my MetaStoreManager refactor ? #1462
I have assigned it to me

#quarkus.datasource.jdbc.url=polaris
#quarkus.datasource.username=polaris
#quarkus.datasource.password=polaris
quarkus.datasource.\"realm1_ds\".db-kind=pgsql
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not sure why we need the multitude of test-like datasource configs in the production sections of the admin tool properties 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure whats happening but here is the reason behind : #1482 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is that for a Datasource to be created by Quarkus, there has to be at least one build-time property defined for that datasource: db-kind.

And you cannot pass build-time properties through QuarkusTestResourceLifecycleManager#start.

So PostgresRelationalJdbcLifeCycleManagement is trying to create datasources dynamically, but since this is happening at runtime, this is not working.

Thus for tests you have to declare the datasources you intend to use here, along with their db-kind:

quarkus.datasource.\"realm1_ds\".db-kind=pgsql
quarkus.datasource.\"realm2_ds\".db-kind=pgsql
quarkus.datasource.\"realm3_ds\".db-kind=pgsql
# etc

Other runtime properties, like jdbc.url, username and password can (and should) be defined in PostgresRelationalJdbcLifeCycleManagement.

Copy link
Contributor

@dimas-b dimas-b Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If quarkus.datasource.\"realm1_ds\".db-kind=pgsql has to be defined at build time, that will be a problem for users of the binary distribution, I guess.

I'm surprised this is required, because the default datasource does not have to be defined at build time, if I'm not mistaken.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If quarkus.datasource."realm1_ds".db-kind=pgsql has to be defined at build time, that will be a problem for users of the binary distribution, I guess.

See https://quarkus.io/guides/datasource#configure-multiple-datasources:

Even when only one database extension is installed, named databases need to specify at least one build-time property so that Quarkus can detect them. Generally, this is the db-kind property [...]

This is why I asked this question, I think there is some misunderstanding going on: #1482 (comment)

I'm surprised this is required, because the default datasource does not have to be defined at build time, if I'm not mistaken.

The default datasource is treated slightly different: if db-kind is missing, the JDBC driver will be guessed from the available JDBC drivers.

But that doesn't help here, since the intent is to use many named datasources.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we could follow an approach similar to this, plus add some CDI lifecycle to shut those Data Sources down when the application shuts down.

@adutra : WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That datasource is created manually and is not managed by Quarkus. We can do that of course, but my impression was that we want to keep using Quarkus-managed datasources.

Copy link
Contributor

@dimas-b dimas-b Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be ideal to let Quarkus manage the Data Sources, but having to enumerate them at build time kills the user-level benefits, IMHO.

Copy link
Contributor

@dimas-b dimas-b Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could preconfigure (quarkus.datasource."pg1".db-kind=pgsql, etc) a few (say 5) Data Sources and allow users to provide connection strings and map them to realms in runtime... but that still looks awkward to me.

Copy link
Contributor

@dimas-b dimas-b Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from my POV, it might be fine to support one Data Source (co-located realms) out-of-the-box and instruct users to use custom builds if they want to segregate realms by Data Source (experience similar to EclipseLink).

@singhpk234 singhpk234 force-pushed the feature/multiple-ds branch from 8e587a0 to 14ceabd Compare April 30, 2025 02:08
@singhpk234 singhpk234 force-pushed the feature/multiple-ds branch from c29d549 to 80304c3 Compare April 30, 2025 16:27
id("java-test-fixtures")
}

configurations.all {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really surprising, but indeed we need this. There is some serious Gradle snafu going on.

*/

plugins {
alias(libs.plugins.jandex)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we are missing

alias(libs.plugins.quarkus)
id("polaris-quarkus")

But indeed adding those doesn't work. Let's investigate later.

@adutra
Copy link
Contributor

adutra commented Apr 30, 2025

@singhpk234 just to confirm, is the intent here:

  1. To support a finite, pre-defined number of datasources that are known at build time – or
  2. To support an unbounded, undefined number of datasources that are not known at build time

?

Your current work hints at 1 since the datasources are currently managed by Quarkus, but the tests seem to expect 2, hence my confusion.

@singhpk234
Copy link
Contributor Author

@adutra here is what i want to achieve : #1482 (comment)

polaris.relational.jdbc.datasource.realm-1=dsA
polaris.relational.jdbc.datasource.realm-2=dsB

polaris.relational.jdbc.datasource.default-datasource=dsC

for realm-1, realm-2 we go to dsA and dsB both need to predefined
realm-3 shows up since the mapping is not there it goes to default-datasource i.e dsC

essentially this would expecte all dsA, dsB, dsC configured otherwise we fail.

@singhpk234 singhpk234 force-pushed the feature/multiple-ds branch from d121ff2 to dc0fbbb Compare April 30, 2025 22:17
@adutra
Copy link
Contributor

adutra commented Apr 30, 2025

realm-1, realm-2 we go to dsA and dsB both need to predefined

OK then, that is possible 👍 I was worried you would be wanting that realm-3 should trigger the creation of an extra datasource.

Thanks for confirming.

@adutra
Copy link
Contributor

adutra commented Apr 30, 2025

polaris.relational.jdbc.datasource.realm-1=dsA
polaris.relational.jdbc.datasource.realm-2=dsB
polaris.relational.jdbc.datasource.default-datasource=dsC

I suggest though that you adopt SmallRye's convention for named things:

polaris.relational.jdbc.datasource=dsC # default (unnamed) datasource
polaris.relational.jdbc.datasource.realm-1=dsA # named datasource "realm-1"
polaris.relational.jdbc.datasource.realm-2=dsB # named datasource "realm-2"

You will find an example of that here (from #1397):

@singhpk234
Copy link
Contributor Author

IIUC you mean removing default-datasource ? i think we need default being explicit called out from the configuration. I am not entirely sure what do you imply by SmallRye's convention ? can you please elaborate

@singhpk234 singhpk234 force-pushed the feature/multiple-ds branch from dc0fbbb to 32bb4b9 Compare May 1, 2025 02:28
@adutra
Copy link
Contributor

adutra commented May 1, 2025

IIUC you mean removing default-datasource ? i think we need default being explicit called out from the configuration. I am not entirely sure what do you imply by SmallRye's convention ? can you please elaborate

@singhpk234 in SmallRye when you are mapping a Map you have the possibility to define a special key called the "unnamed" key. That key in the map will hold the default value for the map. You generally don't need to define an extra property to hold the default value, the default value will be read from the "root":

@ConfigMapping(prefix = "polaris.relation.jdbc.datasource")
public interface RelationalJdbcConfiguration {

  String DEFAULT_REALM_KEY = "<default>";

  /** realmId to configured Datasource name mapping. */
  @WithParentName
  @WithUnnamedKey(DEFAULT_REALM_KEY)
  Map<String, String> realms();

}

An example configuration would be:

# Default config:
polaris.relation.jdbc.datasource=realm1_ds
# Alternative configs per realm:
polaris.relation.jdbc.datasource.realm2=realm2_ds
polaris.relation.jdbc.datasource.realm3=realm3_ds

A typical usage would be:

String dataSourceName;
if (relationalJdbcConfiguration.realms().containsKey(realmId)) {
  dataSourceName = relationalJdbcConfiguration.realms().get(realmId);
} else {
  dataSourceName =
      relationalJdbcConfiguration.realms().get(RelationalJdbcConfiguration.DEFAULT_REALM_KEY);
}

This convention is used in many other places, so from a user XP perspective it would be nice to have the same here.

@singhpk234
Copy link
Contributor Author

This is a nice idea !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants