Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Npgsql: Add software tests validating all data types of CrateDB #772

Merged
merged 6 commits into from
Dec 18, 2024

Conversation

amotl
Copy link
Member

@amotl amotl commented Dec 12, 2024

About

The extended test case is here to both demonstrate, and make ourselves aware, about what capabilities the vanilla Npgsql driver has, in terms of data type support.

Status

All data types can be exchanged, but for data types where there is no native type support, they need to be marshalled manually, as they are communicated as STRING types. Currently, this is ~GEO_POINT, GEO_SHAPE, and OBJECT.

  • GEO_POINT uses WKT or GeoJSON (in), and a tuple representation like (85.42999997735023,66.22999997343868) (out)
  • GEO_SHAPE uses WKT or GeoJSON (in), and GeoJSON (out)
  • OBJECT uses JSON (in and out)

References

@amotl amotl requested review from kneth and surister December 12, 2024 00:04
@amotl amotl marked this pull request as ready for review December 12, 2024 00:05
@amotl amotl force-pushed the npgsql-maintenance-2014-12 branch from bd58391 to b9b9a8d Compare December 12, 2024 00:07
@amotl amotl force-pushed the npgsql-all-types branch 2 times, most recently from c5e5ea8 to c02028f Compare December 12, 2024 00:14
Copy link
Member Author

@amotl amotl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi. I am adding some insights about this patch at the spots where they are relevant.

Comment on lines 144 to 164
cmd.Parameters.AddWithValue("null_integer", DBNull.Value);
cmd.Parameters.AddWithValue("integer", 42);
cmd.Parameters.AddWithValue("bigint", 42);
cmd.Parameters.AddWithValue("float", 42.42);
cmd.Parameters.AddWithValue("double", 42.42);
cmd.Parameters.AddWithValue("decimal", 42.42);
cmd.Parameters.AddWithValue("bit", "01010101");
cmd.Parameters.AddWithValue("bool", true);
cmd.Parameters.AddWithValue("text", "foobar");
cmd.Parameters.AddWithValue("char", "foo");
cmd.Parameters.AddWithValue("timestamp_tz", "1970-01-02T00:00:00+01:00");
cmd.Parameters.AddWithValue("timestamp_notz", "1970-01-02T00:00:00");
cmd.Parameters.AddWithValue("ip", "127.0.0.1");
cmd.Parameters.AddWithValue("array", new List<string>{"foo", "bar"});
// FIXME: System.NotSupportedException: Cannot resolve 'hstore' to a fully qualified datatype name. The datatype was not found in the current database info.
// cmd.Parameters.AddWithValue("object", new Dictionary<string, string>(){{"foo", "bar"}});
cmd.Parameters.AddWithValue("geopoint", new List<double>{85.43, 66.23});
// FIXME: Npgsql.PostgresException : XX000: line 38:9: no viable alternative at input 'VALUES
// cmd.Parameters.AddWithValue("geoshape", "POLYGON ((5 5, 10 5, 10 10, 5 10, 5 5))");
cmd.Parameters.AddWithValue("float_vector", new List<double> {1.1, 2.2, 3.3});
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the spot that best conveys the capabilities of the vanilla Npgsql driver together with CrateDB, where we currently can't send values of type OBJECT or GEOSHAPE using parameterized INSERT operations, at least not in the way presented here.

Comment on lines 158 to 160
// FIXME: System.NotSupportedException: Cannot resolve 'hstore' to a fully qualified datatype name. The datatype was not found in the current database info.
// cmd.Parameters.AddWithValue("object", new Dictionary<string, string>(){{"foo", "bar"}});
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For making CrateDB's OBJECT data types work with vanilla adapters/drivers that have been designed for talking to PostgreSQL, there is the opportunity to converge early PostgreSQL's HSTORE feature, or recent PostgreSQL's JSON(B) feature into relevant support for CrateDB's OBJECT, in fact reasonably transparently, by having extension code in the right place.

A recent adapter (not released yet) for Python/SQLAlchemy is inside here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adapters for HSTORE are part of our custom drivers for Java, .NET, and others, from the very beginning of CrateDB becoming PostgreSQL compatible, because there have been enough insights that it needs special extension code to fundamental drivers always and anyway, in order to support CrateDB's special features.

However, it looks like attention in this area has decreased after the focus shifted to increase PostgreSQL compatibility of CrateDB itself. In these times, it seems to have been forgotten that there will always be special features that CrateDB needs support for, and therefore, driver support will always be needed.

This idea went so far that we event flagged the JDBC driver as "legacy", which led to a lot of downstream confusion. We are gradually reverting that idea, and made a start over here.

Sure enough, in a perfect world, we would upstream our special needs into the canonical drivers, but currently, we are not even close to that.

Copy link
Member Author

@amotl amotl Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the past, the company was pretty active in this regard,

I guess the outcome of all those endeavors at this time around 2017/2018, when the company has been more active in this regard, were those improvements to Npgsql, then starting to grow a plugin system to extend its type mapping subsystem, in order to provide proper support for "databases like PostgreSQL", and to conceive a corresponding plugin for CrateDB that leverages this plugin system, along the lines of apparently upstreaming a few other changes to Npgsql Core itself, while others possibly haven't made it upstream yet, or just can't.

This plugin exactly adds support for CrateDB's special data types, at least for OBJECTs and ARRAYs of OBJECTs, which are the most important container types, and one of the killer features of CrateDB anyway.

However, maintenance on this plugin dried out, possibly due to misunderstandings, but certainly mostly due to resource drain.

The reality in 2024 is, that this plugin, crate-npgsql, depends on a version of canonical Npgsql from six years ago, exactly like the CrateDB JDBC driver is doing it. I dearly think we need to resurrect this, in order to provide premium support for CrateDB's users, including support for special data types, instead of referring them to the standard PostgreSQL driver, which currently just does not provide those features that are actually important for users using CrateDB.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to improve the situation, I started to track this internally.

I think a reasonable plan, after resurrecting maintenance of the CrateDB Npgsql Plugin, is to work through all data types already provided by Npgsql, and make them work reasonably well together with CrateDB.

/cc @kneth, @simonprickett

Copy link
Member Author

@amotl amotl Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basic support for OBJECT types (marshalled using STRING) is now validated per 23911c5.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it is captured in https://github.com/crate/zk/issues/26, I think we are safe for now. Leaving a link to the issue will help us to remember.

Comment on lines 161 to 162
// FIXME: Npgsql.PostgresException : XX000: line 38:9: no viable alternative at input 'VALUES
// cmd.Parameters.AddWithValue("geoshape", "POLYGON ((5 5, 10 5, 10 10, 5 10, 5 5))");
Copy link
Member Author

@amotl amotl Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe GEOSHAPE can be made work easily. I haven't yet investigated what is behind the hiccup in this case. The primary intent of this patch is to evaluate and educate, and to set the stage for relevant insights and to spark corresponding discussions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basic support for GEO_SHAPE types (marshalled using STRING) is now validated per 131972a.

Comment on lines +154 to +155
cmd.Parameters.AddWithValue("timestamp_tz", "1970-01-02T00:00:00+01:00");
cmd.Parameters.AddWithValue("timestamp_notz", "1970-01-02T00:00:00");
Copy link
Member Author

@amotl amotl Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about those, because apparently, they work in this context, but it is also just a very basic test anyway. Let me reference this report, to have a complete list of evalations here, and to reduce the surface of unknown unknowns.

It might indicate a flaw we may investigate closer and possibly resolve while modernizing the driver.

@amotl amotl force-pushed the npgsql-all-types branch 2 times, most recently from 01823ea to 131972a Compare December 13, 2024 00:09
@amotl amotl changed the title Npgsql: Add test validating (almost) all data types of CrateDB Npgsql: Add software tests validating all data types of CrateDB Dec 13, 2024
Copy link

@kneth kneth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Assert.Equal(DBNull.Value, row["null_integer"]);
Assert.Equal(42, row["integer"]);
Assert.Equal((Int64) 42, row["bigint"]);
//Assert.Equal(42.42, (float) row["float"], 0.01);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these asserts disabled?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was probably just an oversight while erecting this a bit in haste. Thanks for spotting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed per cd5c97c.

Base automatically changed from npgsql-maintenance-2014-12 to main December 13, 2024 15:53
@amotl amotl requested a review from kneth December 13, 2024 15:58
Copy link

@kneth kneth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I would like to have links to existing issues for all FIXMEs and TODOs so we don't forget what it is about.

Comment on lines 158 to 160
// FIXME: System.NotSupportedException: Cannot resolve 'hstore' to a fully qualified datatype name. The datatype was not found in the current database info.
// cmd.Parameters.AddWithValue("object", new Dictionary<string, string>(){{"foo", "bar"}});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it is captured in https://github.com/crate/zk/issues/26, I think we are safe for now. Leaving a link to the issue will help us to remember.

@amotl amotl merged commit 4c7b18b into main Dec 18, 2024
9 checks passed
@amotl amotl deleted the npgsql-all-types branch December 18, 2024 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants