
create_chat_completion is stuck in versions 0.2.84 and 0.2.85 for Mac Silicon #1648

@mobeetle

Description


Prerequisites

Running llama-cpp-python 0.2.84 or 0.2.85 and calling the create_chat_completion method.
Tried several different GGUF models.

Please answer the following questions for yourself before submitting an issue.

  • [ X ] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [ X ] I carefully followed the README.md.
  • [ X ] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [ X ] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

create_chat_completion should return a chat completion result as described in the documentation.

Current Behavior

Inference hangs (I let it run for 5 minutes with no output).
After downgrading to version 0.2.83, everything runs fine without a single change to the code.
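
For reference, a minimal sketch of the kind of call that hangs for me (the model path, prompt, and parameters below are placeholders, not my exact code):

```python
# Minimal reproduction sketch -- model path and messages are placeholders.
from llama_cpp import Llama

# Any GGUF chat model I tried shows the same behavior; Metal offload on Apple Silicon.
llm = Llama(
    model_path="./models/some-model.gguf",  # placeholder path
    n_gpu_layers=-1,
)

# On 0.2.84/0.2.85 this call never returns; on 0.2.83 it completes as expected.
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    max_tokens=64,
)
print(response["choices"][0]["message"]["content"])
```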

Environment and Context

Mac M1 Max, 32 GB RAM, macOS 14.5, Python 3.12, llama-cpp-python 0.2.84/0.2.85.
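
To confirm which version is actually loaded at runtime (a quick sanity-check sketch, nothing specific to this bug):

```python
# Print the installed llama-cpp-python version at runtime.
import llama_cpp
print(llama_cpp.__version__)
```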
