[Python] Optimize performance of pyfury #1993

Open
@chaokunyang

Description

Feature Request

pyfury is 3x faster than pickle at serialization and 2x faster at deserialization. Here is the benchmark code:

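import pickle
import time
from dataclasses import dataclass
from typing import Any, Dict, List

import pyfury


@dataclass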
class ComplexObject1:
    f1: Any = None
    f2: str = None
    f3: List[str] = None
    f4: Dict[pyfury.Int8Type, pyfury.Int32Type] = None
    f5: pyfury.Int8Type = None
    f6: pyfury.Int16Type = None
    f7: pyfury.Int32Type = None
    f8: pyfury.Int64Type = None
    f9: pyfury.Float32Type = None
    f10: pyfury.Float64Type = None
    f12: List[pyfury.Int16Type] = None


@dataclass
class ComplexObject2:
    f1: Any
    f2: Dict[pyfury.Int8Type, pyfury.Int32Type]


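fury = pyfury.Fury(language=pyfury.Language.PYTHON)
fury.register_type(ComplexObject1)
fury.register_type(ComplexObject2)
# COMPLEX_OBJECT is the object under test; its definition is not shown in this issue.
o = COMPLEX_OBJECT
# Use one iteration count for both loops so the two timings are comparable.
iterations = 500000

# Time pyfury deserialization only; serialize once outside the timed region.
binary = fury.serialize(o)
start = time.time()
for i in range(iterations):
    # binary = fury.serialize(o)
    fury.deserialize(binary)
print(time.time() - start)

# Time pickle deserialization the same way.
binary = pickle.dumps(o)
start = time.time()
for i in range(iterations):
    # binary = pickle.dumps(o)
    pickle.loads(binary)
print(time.time() - start)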
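
The snippet above only times deserialization, since the serialize calls are commented out. As a rough sketch, a small harness like the following could time both directions separately. It uses only the standard library, with pickle as the baseline. `bench` and `sample` are hypothetical names that do not come from pyfury or this issue. Presumably `fury.serialize` and `fury.deserialize` could be passed in the same way.

```python
import pickle
import time
from typing import Any, Callable, Tuple


def bench(dumps: Callable[[Any], bytes],
          loads: Callable[[bytes], Any],
          obj: Any,
          iterations: int = 10_000) -> Tuple[float, float]:
    """Return (serialize_seconds, deserialize_seconds) for `iterations` calls each."""
    # One untimed call warms up caches and provides the payload to decode.
    binary = dumps(obj)

    start = time.perf_counter()
    for _ in range(iterations):
        dumps(obj)
    serialize_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(iterations):
        loads(binary)
    deserialize_seconds = time.perf_counter() - start
    return serialize_seconds, deserialize_seconds


# Hypothetical sample payload; replace it with the object under test.
sample = {"f2": "abc", "f3": ["a", "b"], "f4": {1: 2}, "f7": 123}
ser_s, deser_s = bench(pickle.dumps, pickle.loads, sample)
print(f"pickle: serialize {ser_s:.4f}s, deserialize {deser_s:.4f}s")
```

Running the same `bench` call for each library with an identical `iterations` value keeps the ratios comparable.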

But the performance is still not fast enough. The flame graph shows that there is still room for improvement:

[Image: flame graph ("out")]
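
One way to get a first look at hotspots without an external tool is the standard-library `cProfile`. The sketch below is an illustration only and is not necessarily how the flame graph above was produced. `workload` is a hypothetical stand-in that uses pickle so the example runs anywhere. Note that `cProfile` attributes time to Python-level calls, so time spent inside compiled extension code shows up as a single entry.

```python
import cProfile
import io
import pickle
import pstats


def workload(iterations: int = 20_000) -> None:
    """Hypothetical stand-in for the deserialization loop being profiled."""
    binary = pickle.dumps({"f2": "abc", "f3": ["a", "b"], "f4": {1: 2}})
    for _ in range(iterations):
        pickle.loads(binary)


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Collect the ten most expensive entries, sorted by cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```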
