Batch update with some invalid values fails for all rows #89

vadz · 2025-01-28T17:03:53Z

Suppose there is the following table in the database:

create table t(x integer check (x<100));

and consider a program which does something like the following (simplified):

constexpr int ARRAY_SIZE = 2;
SQLSetStmtAttr(h, SQL_ATTR_PARAMSET_SIZE, (void *)ARRAY_SIZE, 0);

SQLINTEGER values[ARRAY_SIZE] = { 1, 101 }; // First value is valid, second one is not.
SQLLEN inds[ARRAY_SIZE];
SQLBindParameter(h, 1, SQL_PARAM_INPUT, SQL_C_INTEGER, SQL_INTEGER, 0, 0, values, 0, inds);

SQLUSMALLINT status[ARRAY_SIZE];
SQLSetStmtAttr(h, SQL_ATTR_PARAM_STATUS_PTR, status, 0);

SQLPrepare(h, "insert into t(x) values(?)", SQL_NTS);
SQLRETURN rc = SQLExecute(h);

When using SQL Server with either Microsoft ODBC driver or FreeTDS, rc is SQL_SUCCESS_WITH_INFO and status array is filled with { SQL_PARAM_SUCCESS, SQL_PARAM_ERROR } which is nice because it allows the application to determine which row(s) contained values that resulted in an error.

When using PostgreSQL ODBC driver (v13.02, but from examining Git history it doesn't look like there have been any changes here even in the latest version), rc is SQL_ERROR and status contains SQL_PARAM_ERROR for both elements, which doesn't provide any useful information.

Is this behaviour intentional and, if not, would it be possible to change it to be more consistent with other ODBC drivers and, also, more useful?

The text was updated successfully, but these errors were encountered:

davecramer · 2025-02-01T17:16:19Z

So if I understand correctly, you would like the insert for the first to succeed and the second to fail.
Since the insert for both is a single transaction when the first fails the second fails as postgres does not commit the transaction.
This is by design.

see https://www.postgresql.org/docs/current/tutorial-transactions.html for more details

Dave

vadz · 2025-02-01T17:23:12Z

Not quite, I'm perfectly fine with the insert failing and actually think this is more correct than doing a partial insert as some other databases/ODBC drivers do. But it would be great to have some information about which rows resulted in an error and this is something that SQL_ATTR_PARAM_STATUS_PTR is supposed to provide, but this driver returns SQL_PARAM_ERROR in it even for the first row, which would have succeeded.

I.e. the problem is not (so much) returning SQL_ERROR instead of SQL_SUCCESS_WITH_INFO (although this is inconsistent with the other ODBC drivers I've tried), but not providing any useful information in the status array.

davecramer · 2025-02-02T05:08:51Z

Unfortunately the server does not provide that information. It doesn't know that the first one succeeded since it was all done in one insert.

insert into testtab1 values (1,'blah');
ERROR:  duplicate key value violates unique constraint "testtab1_pkey"
DETAIL:  Key (id)=(1) already exists.

While we do get the error message, we don't know if it is the 1st or 2nd value that failed

vadz · 2025-02-02T16:47:20Z

Just trying to understand: how does the driver manage to send multiple parameters to the server? From looking at the code it seems like it does PQsendQuery() for each row in a loop, but I must be missing something...

davecramer · 2025-02-02T20:40:05Z

Can you show me the code you are thinking it is using.
Without looking at the code I would think it does

insert into testtab1 (id) values (1),(2);

vadz · 2025-02-02T20:49:22Z

Can you show me the code you are thinking it is using.

I've been looking at the part of PGAPI_Execute() between

psqlodbc/execute.c

Line 1140 in 1fbc3b2

next_param_row:

and

psqlodbc/execute.c

Lines 1241 to 1244 in 1fbc3b2

    
           if (!exec_end) 
        
           { 
        
           	goto next_param_row; 
        
           }

but I could easily be wrong as I mostly just looked at the label name. I probably should understand how to enable tracing in this code to see what it's really doing during tun-time...

Without looking at the code I would think it does

insert into testtab1 (id) values (1),(2);

Interesting, wouldn't this run into some kind of limitation (number of parameters or total length of the command) for even relatively small amounts of data?

davecramer · 2025-02-04T18:37:08Z

it does, but not as much as one would think. It will handle quite a few values.

davecramer · 2025-02-05T10:07:19Z

I would just enable logging on the server to see how it does this.

vadz · 2025-02-05T13:32:10Z

I would just enable logging on the server to see how it does this.

Good point, thanks, this is indeed even simpler. And what I see is that the driver (at least 13.02 version from Debian Bookworm) just sends multiple semicolon separated statements to the server in a single request, e.g. for my test I see the following

LOG:  00000: statement: insert into soci_test(val) values('1');insert into soci_test(val) values('a')
ERROR:  22P02: invalid input syntax for integer: "a" at character 74

If the "at character N" part is always present (but I'm not at all sure about this), it looks like it would be possible to find the row which resulted in the error by remembering the positions of the statement boundaries for each row.

But now I also wonder if this is really the most efficient way to do bulk inserts/updates with Postgres, at the very least repeating the entire query many times like this should consume much more bandwidth than preparing it once/reusing it it for all rows?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Batch update with some invalid values fails for all rows #89

Batch update with some invalid values fails for all rows #89

vadz commented Jan 28, 2025 •

edited

Loading

davecramer commented Feb 1, 2025

Uh oh!

vadz commented Feb 1, 2025

Uh oh!

davecramer commented Feb 2, 2025

Uh oh!

vadz commented Feb 2, 2025

Uh oh!

davecramer commented Feb 2, 2025

Uh oh!

vadz commented Feb 2, 2025

Uh oh!

davecramer commented Feb 4, 2025

Uh oh!

davecramer commented Feb 5, 2025

Uh oh!

vadz commented Feb 5, 2025

Uh oh!

Batch update with some invalid values fails for all rows #89

Batch update with some invalid values fails for all rows #89

Comments

vadz commented Jan 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

davecramer commented Feb 1, 2025

Uh oh!

vadz commented Feb 1, 2025

Uh oh!

davecramer commented Feb 2, 2025

Uh oh!

vadz commented Feb 2, 2025

Uh oh!

davecramer commented Feb 2, 2025

Uh oh!

vadz commented Feb 2, 2025

Uh oh!

davecramer commented Feb 4, 2025

Uh oh!

davecramer commented Feb 5, 2025

Uh oh!

vadz commented Feb 5, 2025

Uh oh!

vadz commented Jan 28, 2025 •

edited

Loading