cxzl25 commented Jul 1, 2025

Problem

Spark uses a Kryo pool, which reuses the same Kryo object. If IdentityObjectIntMap.resize throws a NegativeArraySizeException, every later use of that object throws an ArrayIndexOutOfBoundsException.

This happens because resize updates the mask variable first and then fails to update keyTable. The old keyTable no longer matches the new mask, so the next lookup computes a wrong object index.
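
The ordering problem can be illustrated outside Kryo. The sketch below is a minimal, hypothetical model: the `TinyMap` class, its fields, and both resize methods are illustrative assumptions, not Kryo's code. `unsafeResize` updates `mask` before allocating, while `checkedResize` validates the size first so that a failure leaves the object unchanged.

```java
/** Hypothetical, heavily simplified stand-in for a masked hash table; not Kryo's code. */
final class TinyMap {
    private int[] slots;
    private int mask;

    TinyMap(int capacity) {          // capacity is assumed to be a power of two
        slots = new int[capacity];
        mask = capacity - 1;
    }

    /** Mirrors the failure mode described above: mask changes before the allocation. */
    void unsafeResize(int newSize) {
        mask = newSize - 1;          // state is already modified here...
        slots = new int[newSize];    // ...when this throws for a negative size
    }

    /** Validates first, so a bad size leaves mask and slots consistent. */
    void checkedResize(int newSize) {
        if (newSize <= 0) {
            throw new IllegalArgumentException("newSize must be positive: " + newSize);
        }
        slots = new int[newSize];
        mask = newSize - 1;
    }

    /** Looks up a slot the way a masked table does: index = hash & mask. */
    int read(int hash) {
        return slots[hash & mask];
    }
}

final class ResizeOrderDemo {
    /** Attempts an overflowing resize, then keeps using the map as a pooled object would. */
    static String probe(boolean checked) {
        TinyMap map = new TinyMap(4);
        int overflowed = (1 << 30) << 1;   // wraps around to Integer.MIN_VALUE
        try {
            if (checked) {
                map.checkedResize(overflowed);
            } else {
                map.unsafeResize(overflowed);
            }
        } catch (NegativeArraySizeException | IllegalArgumentException e) {
            // The resize failed; the map is reused anyway.
        }
        try {
            map.read(688291177);           // hash value borrowed from the stack trace
            return "ok";
        } catch (ArrayIndexOutOfBoundsException e) {
            return "corrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println("unsafe:  " + probe(false));   // prints "unsafe:  corrupted"
        System.out.println("checked: " + probe(true));    // prints "checked: ok"
    }
}
```

In this model the checked variant still fails the resize, but the failure is raised before any field is touched, so the object remains usable afterwards.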

Stack trace

java.lang.NegativeArraySizeException: -2147483645
        at com.esotericsoftware.kryo.util.IdentityObjectIntMap.resize(IdentityObjectIntMap.java:542)
        at com.esotericsoftware.kryo.util.IdentityObjectIntMap.putStash(IdentityObjectIntMap.java:306)
        at com.esotericsoftware.kryo.util.IdentityObjectIntMap.push(IdentityObjectIntMap.java:300)
        at com.esotericsoftware.kryo.util.IdentityObjectIntMap.put(IdentityObjectIntMap.java:162)
        at com.esotericsoftware.kryo.util.IdentityObjectIntMap.putStash(IdentityObjectIntMap.java:307)
        at com.esotericsoftware.kryo.util.IdentityObjectIntMap.push(IdentityObjectIntMap.java:300)
        at com.esotericsoftware.kryo.util.IdentityObjectIntMap.put(IdentityObjectIntMap.java:162)
        at com.esotericsoftware.kryo.util.MapReferenceResolver.addWrittenObject(MapReferenceResolver.java:41)
        at com.esotericsoftware.kryo.Kryo.writeReferenceOrNull(Kryo.java:681)
        at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:646)
        at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.write(DefaultArraySerializers.java:361)
        at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.write(DefaultArraySerializers.java:302)
        at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:651)
        at org.apache.spark.serializer.KryoSerializationStream.writeObject(KryoSerializer.scala:269)
        at org.apache.spark.broadcast.TorrentBroadcast$.$anonfun$blockifyObject$4(TorrentBroadcast.scala:358)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1510)
        at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:360)
java.lang.ArrayIndexOutOfBoundsException: Index 688291177 out of bounds for length 3
        at com.esotericsoftware.kryo.util.IdentityObjectIntMap.get(IdentityObjectIntMap.java:322)
        at com.esotericsoftware.kryo.util.MapReferenceResolver.getWrittenId(MapReferenceResolver.java:46)
        at com.esotericsoftware.kryo.Kryo.writeReferenceOrNull(Kryo.java:671)
        at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:646)
        at org.apache.spark.serializer.KryoSerializationStream.writeObject(KryoSerializer.scala:269)
        at org.apache.spark.broadcast.TorrentBroadcast$.$anonfun$blockifyObject$4(TorrentBroadcast.scala:358)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1510)
        at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:360)

Simple test

    // Start with a capacity of 2^30.
    IdentityObjectIntMap<String> identityObjectIntMap = new IdentityObjectIntMap<>(1073741824, 0.8f);
    try {
      identityObjectIntMap.put("k1", 1);
      // 1073741824 << 1 overflows to a negative int, which simulates the failing resize.
      identityObjectIntMap.clear(1073741824 << 1);
    } catch (NegativeArraySizeException e) {
      e.printStackTrace(); // Expected
    }
    // Reuse the same map, as a pooled Kryo object would.
    identityObjectIntMap.clear(2048);
    identityObjectIntMap.put("k1", 1); // ArrayIndexOutOfBoundsException
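
The negative size in this test comes from plain int overflow. Below is a small sketch of the arithmetic. The class and method names are hypothetical. Reading the extra 3 as a small stash region appended to the table, and the mask as `newSize - 1`, are assumptions about Kryo 4 internals inferred from the numbers in the stack trace.

```java
/** Hypothetical helper illustrating the int overflow behind the failed resize. */
final class ResizeOverflowArithmetic {
    /** Doubles a capacity with a shift and no overflow check. */
    static int doubled(int capacity) {
        return capacity << 1;
    }

    public static void main(String[] args) {
        int capacity = 1073741824;         // 2^30
        int newSize = doubled(capacity);   // wraps around to Integer.MIN_VALUE
        System.out.println(newSize);       // prints -2147483648
        System.out.println(newSize + 3);   // prints -2147483645, the size in the stack trace
        System.out.println(newSize - 1);   // prints 2147483647, far larger than any table
    }
}
```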

cxzl25 changed the title from "Check new size before resize" to "IdentityObjectIntMap check new size before resize" on Jul 2, 2025
cxzl25 (Author) commented Jul 4, 2025

Could you help review this PR? @theigl

theigl (Collaborator) commented Jul 7, 2025

This additional check is ok, but I don't fully understand how throwing an IAE helps in your case. Is it just to fail as fast as possible?

I see you are targeting Kryo 4.x which is not maintained anymore. Even if I merge this PR, I cannot guarantee that there will be a release any time soon.

cxzl25 (Author) commented Jul 7, 2025

> but I don't fully understand how throwing an IAE helps in your case

To reuse the same Kryo object, we use KryoPoolQueueImpl. In this case, resize may fail with a NegativeArraySizeException, which leaves the Kryo object unusable. However, we still put it back into the pool, so the next time we get the same Kryo object from the pool, it may fail.

https://github.com/apache/spark/blob/46b6ccbd93c4fe5c2b72f730a776a2739bdbc7b4/core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala#L407-L410

https://github.com/apache/spark/blob/46b6ccbd93c4fe5c2b72f730a776a2739bdbc7b4/core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala#L432-L433
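
The reuse pattern described above can be sketched with a hypothetical pool. Nothing here is Kryo's KryoPool API or Spark's code: `FragileSerializer`, `PoolReuseDemo`, and the `corrupted` flag are illustrative stand-ins for an object whose internal state is left inconsistent by a failed call and which is returned to the pool anyway.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a pooled serializer; not Kryo. */
final class FragileSerializer {
    private boolean corrupted;

    void write(int payloadSize) {
        if (corrupted) {
            throw new IllegalStateException("instance was corrupted by an earlier failure");
        }
        if (payloadSize < 0) {
            corrupted = true;   // stands in for internal state left inconsistent
            throw new NegativeArraySizeException(String.valueOf(payloadSize));
        }
    }
}

final class PoolReuseDemo {
    private final Deque<FragileSerializer> pool = new ArrayDeque<>();

    FragileSerializer borrow() {
        FragileSerializer serializer = pool.pollFirst();
        return serializer != null ? serializer : new FragileSerializer();
    }

    void release(FragileSerializer serializer) {
        pool.addFirst(serializer);
    }

    /** Borrow, use, and always release, even when the write failed. */
    String writeOnce(int payloadSize) {
        FragileSerializer serializer = borrow();
        try {
            serializer.write(payloadSize);
            return "ok";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        } finally {
            release(serializer);   // the broken instance goes back into the pool
        }
    }

    public static void main(String[] args) {
        PoolReuseDemo demo = new PoolReuseDemo();
        System.out.println(demo.writeOnce(-1));   // prints "NegativeArraySizeException"
        System.out.println(demo.writeOnce(10));   // prints "IllegalStateException"
    }
}
```

The second call is a valid request, yet it fails because it receives the instance damaged by the first call.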

> Kryo 4.x which is not maintained anymore

I saw your fix in Kryo 5.x. ObjectMap performs some size checks there, so this problem should not appear in 5.x.
