ERROR: group by sequence doesn't match values sequence #34
Comments
Weird: on another server with the same data, all of the SQL above runs OK.
Some more cases:
select cs_hash_dup_count(cs_filter(x.revision = cs_const('493383',32),x.uin) ,cs_filter(x.revision = cs_const('493383',32),x.class1)) from crashlog_get(620757031,now()::timestamp - '10 hour'::interval,now()::timestamp) as x where x is not null;
select cs_hash_dup_count(cs_filter(x.revision = cs_const('493383',32),x.uin) ,cs_filter(x.revision = cs_const('493383',32),x.class1)) from crashlog_get(620757031) as x where x is not null;
select cs_hash_dup_count(cs_filter(x.revision = cs_const('493383',32),x.uin) ,cs_filter(x.revision = cs_const('493383',32),x.class1)) from crashlog_get(620757031,now()::timestamp - '5 hour'::interval,now()::timestamp) as x where x is not null;
select cs_hash_dup_count(cs_filter(x.revision = cs_const('493383',32),x.uin) ,cs_filter(x.revision = cs_const('493383',32),x.class1)) from crashlog_get(620757031,now()::timestamp - '4 hour'::interval,now()::timestamp) as x where x is not null;
It looks like memory is corrupted on this machine:
select cs_count(x.class2) c0 ,cs_count(x.uin) as c1 ,cs_count(x.class1) as c2 ,cs_count(cs_filter(x.revision = cs_const('493383',32),x.uin)) as c3 ,cs_count(cs_filter(x.revision = cs_const('493383',32),x.class1)) as c4 from crashlog_get(620757031,'2014-04-14 12:00:00','2014-04-14 14:00:00') as x where x is not null;
On another machine, the result is:
Time: 2.476 ms
But on the machine in the bad state, some queries run without problems, such as:
Is the problem reproduced if you switch off multithreaded query execution (set imcs.n_threads=1 in postgresql.conf)?
No. Restarting PostgreSQL and reloading the data into cs makes all of the SQL run OK.
How can I help to debug this problem? It happened again.
My usage: I have a cron job that deletes and loads data every hour:
step 2: clean the data of the last hour, in case this command is run multiple times
step 3: load the data of the last hour
At the same time, users may run SQL on the server. Is it caused by some concurrent delete/select/load? If no users are active (so there are no selects or reads), the data is OK. I noticed this because after a whole night of loading data there was no such error in the morning, but in the afternoon the error happened.
I found one possible synchronization issue in the delete function. Also, do you have imcs_autoload enabled? If yes, please try to disable it (I do not want to say that it is impossible to use autoload with concurrent queries, but I want to understand whether the problem is related to it or not).
I can reproduce the bug with 2 scripts running concurrently in 3 sessions, while data is constantly being inserted into the crashlog table:
open first session:
open second session:
open third session:
open fourth session:
After a while, you will see ".....n data error found!" where n >= 1.
Note: you can create wx_version as:
I see the code:
but in imcs_delete_page, I can't find imcs->lock around imcs_free_page. Is that related?
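The locking question raised above can be illustrated with a toy model. This is a minimal Python sketch, not IMCS code: `ColumnStore`, `load`, `delete_prefix`, `lengths`, and `run_workload` are hypothetical names, and the model only assumes that two columns of one table must always be observed with equal lengths. Because every mutation and every read takes the same lock, a concurrent reader should never see one column shortened before the other.

```python
import threading


class ColumnStore:
    """Toy stand-in for two parallel columns of one table (hypothetical)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.uin = []
        self.revision = []

    def load(self, rows):
        # Append to both columns inside one critical section.
        with self.lock:
            for uin, revision in rows:
                self.uin.append(uin)
                self.revision.append(revision)

    def delete_prefix(self, n):
        # Free the oldest n entries of both columns under the same lock,
        # so no reader can observe only one column shortened.
        with self.lock:
            del self.uin[:n]
            del self.revision[:n]

    def lengths(self):
        # Readers take the lock too, which gives a consistent snapshot.
        with self.lock:
            return len(self.uin), len(self.revision)


def run_workload(iterations=2000, readers=2):
    """Run one load/delete writer against several readers.

    Returns the list of mismatched (len_uin, len_revision) pairs the
    readers observed, plus the final lengths.
    """
    store = ColumnStore()
    mismatches = []
    done = threading.Event()

    def writer():
        for i in range(iterations):
            store.load([(i, i % 7)] * 4)  # load 4 rows
            store.delete_prefix(2)        # then delete 2 rows
        done.set()

    def reader():
        while not done.is_set():
            a, b = store.lengths()
            if a != b:
                mismatches.append((a, b))

    threads = [threading.Thread(target=writer)]
    threads += [threading.Thread(target=reader) for _ in range(readers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return mismatches, store.lengths()
```

Calling `run_workload()` should return an empty mismatch list. If the lock were dropped from `delete_prefix`, a reader could run between the two `del` statements and see unequal lengths, which is the kind of window the missing `imcs->lock` might open. Whether that is what actually happens in `imcs_delete_page` is not established by this sketch.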
Did you try the new version (61)?
I just tried the new version. Unfortunately, it can still be reproduced with the script above.
Sorry, but what was the actual error message? Did you inspect the logs?
On the psql console, I get the error some of the time (it may happen, then disappear, then happen again...):
on the console:
And sure, the load session gets an error about values being out of order, but this should not make the cs lengths mismatch.
Actually, I do not understand what makes psql return an error code. If your query returns nothing (an empty result set), psql still returns a normal exit code. It returns a non-zero exit code only if some error happens during query execution. And I suspect that this error will be "value out of timeseries order"! But please notice that this error is detected only for timestamp timeseries, and values are now appended in the order of the columns in the table. In your case: uin | clientversion | revision | crashtime ...
Now I can reproduce this bug on this branch: 64fb76f. These are the steps:
crashlog_truncate
(1 row)
crash=> select crashlog_load(filter:= 'logtime >=
crashlog_load
(1 row)
--------------- fresh data, no mismatched cs ---------------
count
(1 row)
--------------- delete has the bug ---------------
crashlog_delete
......
(240 rows)
--------------- cs data length mismatch ---------------
count
(1 row)
For the same data, if the filter is countryid = 1, the SQL reports an error:
ERROR: group by sequence doesn't match values sequence
while if the filter is countryid = 2 or countryid = 3, the SQL runs OK.
select clientversion,t.agg_value cnt_distinct,encode(trim(E'\000' from t.group_by) ,'escape') revision from (select wx_version.clientversion,x.uin,x.revision,countryid from wx_version,crashlog_get(clientversion,now()::timestamp - '2 hour'::interval,now()::timestamp) as x where x is not null) y,cs_project_agg(cs_hash_dup_count(cs_filter(y.countryid = cs_const(2::int),y.uin),cs_filter(y.countryid = cs_const(2::int),y.revision))) as t(agg_value,group_by) order by 2 desc nulls last limit 10;
ERROR: group by sequence doesn't match values sequence
Time: 139.287 ms
select clientversion,t.agg_value cnt_distinct,encode(trim(E'\000' from t.group_by) ,'escape') revision from (select wx_version.clientversion,x.uin,x.revision,countryid from wx_version,crashlog_get(clientversion,now()::timestamp - '2 hour'::interval,now()::timestamp) as x where x is not null) y,cs_project_agg(cs_hash_dup_count(cs_filter(y.countryid = cs_const(1::int),y.uin),cs_filter(y.countryid = cs_const(1::int),y.revision))) as t(agg_value,group_by) order by 2 desc nulls last limit 10;
clientversion | cnt_distinct | revision
---------------+--------------+----------
(0 rows)
Time: 67.556 ms
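One possible reading of the behaviour above is that the columns behind the query no longer have the same length, so the two filtered sequences passed to the aggregate differ in size for some filter values but not for others. The following is a minimal Python model of that idea, not the IMCS implementation: the helper names (`filter_seq`, `hash_dup_count`, `distinct_uin_per_revision`) are hypothetical, and silently truncating to the shorter input in `filter_seq` is an assumption made only for illustration.

```python
def filter_seq(mask, values):
    """Keep values[i] where mask[i] is true.

    Assumption: if the inputs differ in length, iteration stops at the
    shorter one (zip semantics); real IMCS behaviour may differ.
    """
    return [v for m, v in zip(mask, values) if m]


def hash_dup_count(values, group_by):
    """Count distinct values per group; both inputs must align 1:1."""
    if len(values) != len(group_by):
        raise ValueError("group by sequence doesn't match values sequence")
    seen = {}
    for v, g in zip(values, group_by):
        seen.setdefault(g, set()).add(v)
    return {g: len(s) for g, s in seen.items()}


def distinct_uin_per_revision(cols, country):
    """Toy analogue of the query: distinct uin per revision for one country."""
    mask = [c == country for c in cols["countryid"]]
    return hash_dup_count(filter_seq(mask, cols["uin"]),
                          filter_seq(mask, cols["revision"]))


# Consistent columns: every column has five entries.
consistent = {
    "countryid": [1, 2, 2, 1, 2],
    "uin": [10, 11, 11, 12, 13],
    "revision": ["a", "a", "b", "b", "b"],
}

# Damaged columns: revision has lost its last entry.
damaged = dict(consistent, revision=["a", "a", "b", "b"])
```

In this model, `distinct_uin_per_revision(damaged, 1)` still succeeds because no row for country 1 falls in the missing tail, while `distinct_uin_per_revision(damaged, 2)` raises the mismatch error. That would be consistent with one countryid value failing while others run on the same data, but it is only a hypothesis about the cause.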