fix: set finish before notifying load in LanceArrowWriter setFinished method #92

ColdL · 2025-09-24T03:34:04Z

Recently I was testing the integration between Lance and Spark.

I found that the Lance writer occasionally hangs. After some troubleshooting, I traced this to the LanceArrowWriter.setFinished method.

The original code appears to have a bug where it sets the finished status after notifying loadNextBatch, which could cause loadNextBatch to hang.

Root Cause

The ideal flow should be:

(thread 1) loadToken.release
(thread 1) finished = true
(thread 2) loadNextBatch
(thread 2) finished is true and count is 0 so return false

However, there's a chance it becomes:

(thread 1) loadToken.release
(thread 2) loadNextBatch
(thread 2) finished is false so return true, then wait for the next notification
(thread 1) finished = true

If the second scenario occurs, thread 2 hangs indefinitely, because no further notification will ever arrive. jstack will show the thread stuck in LanceDataWriter.commit.
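To make the ordering concrete, here is a minimal sketch of the handshake with the flag published before the permit is released. It is not the actual LanceArrowWriter code: the class name FinishSignalSketch, the write() helper, and the re-release of the permit after the final flush are assumptions made for illustration only. Only the names loadToken, finished, count, setFinished and loadNextBatch come from the description above.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical, simplified stand-in for the writer/reader handshake. */
class FinishSignalSketch {
    private final Semaphore loadToken = new Semaphore(0);
    private final AtomicLong count = new AtomicLong(0);
    private volatile boolean finished = false;

    /** Producer side: record one buffered row (batching is omitted here). */
    void write() {
        count.incrementAndGet();
    }

    /** Producer side: publish the flag first, then wake the consumer. */
    void setFinished() {
        finished = true;      // set before the permit becomes available
        loadToken.release();  // a consumer woken by this permit sees finished == true
    }

    /** Consumer side: true if a batch is ready, false at end of stream. */
    boolean loadNextBatch() {
        try {
            loadToken.acquire();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a batch", e);
        }
        if (finished) {
            if (count.getAndSet(0) > 0) {
                // Assumption of this sketch: hand the permit back so the
                // next call can observe the end of the stream.
                loadToken.release();
                return true;
            }
            return false;
        }
        count.set(0);
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        FinishSignalSketch s = new FinishSignalSketch();
        Thread consumer = new Thread(
            () -> System.out.println("more batches: " + s.loadNextBatch()));
        consumer.start();
        s.setFinished();
        consumer.join();
    }
}
```

With this order, any consumer that acquires the permit released by setFinished is guaranteed to see finished == true, since Semaphore.release happens-before the matching successful acquire. Swapping the two statements in setFinished reopens the window described in the second scenario.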

Reproduction

This issue is hard to reproduce. I encountered it in a very low-resource environment (a Spark executor with only 1 core and 4g of memory) when creating a new table and writing 600 rows at once, where one column is a 1024-dimensional vector column.

It also occurs intermittently.

Further Confirmation

Although the current fix seems reasonable, I hope to get confirmation from maintainers to avoid introducing new unknown issues.

Any comments on this? @jackye1995

jackye1995 · 2025-09-24T06:30:29Z

Nice, I think that explains https://github.com/lancedb/lance-spark/blob/main/lance-spark-base_2.12/src/test/java/com/lancedb/lance/spark/write/LanceArrowWriterTest.java#L38. Could you re-enable that test?

ColdL · 2025-09-24T08:30:39Z

DONE

jackye1995 · 2025-09-24T17:40:30Z

looks like there are still some code style issues.


ColdL · 2025-09-25T02:06:05Z

DONE

ColdL · 2025-10-09T04:05:48Z

Also, this commit might be worth another look; it hasn't been merged yet 😉

@jackye1995

github-actions bot added the bug Something isn't working label Sep 24, 2025

ColdL force-pushed the fix-lance-arrow-writer branch from 0f0dfea to 232f394 Compare September 24, 2025 08:27

ColdL force-pushed the fix-lance-arrow-writer branch from 232f394 to 871556b Compare September 24, 2025 08:35

fix: set finish before notifying load in LanceArrowWriter setFinished…

2477cdc

… method

ColdL force-pushed the fix-lance-arrow-writer branch from 871556b to 2477cdc Compare September 25, 2025 02:05
