
How to use coreML models in Mac M2? #12

Open
RageshAntony opened this issue Jun 8, 2023 · 3 comments

Comments

@RageshAntony

I was able to use CoreML models on my Mac M2 using the base whisper.cpp:

https://github.com/ggerganov/whisper.cpp#core-ml-support

How can I use CoreML with pywhispercpp?

Also, one suggestion:

add your library to their bindings list: https://github.com/ggerganov/whisper.cpp#bindings

@abdeladim-s
Owner

@RageshAntony,

  • I didn't try the CoreML feature because I don't have a Mac, so I won't be able to test it.
    But I think you can just update the whisper.cpp submodule as well as the CMakeLists.txt file and build the project from source.

  • Thank you for the suggestion. I tried to create a discussion in the whisper.cpp repo to showcase the library's features, but I think the developers didn't want to add it to the bindings list in the README page!! Maybe it is not as good as the ones already there!

@w0372299

@abdeladim-s or @RageshAntony ,

I also have an M2 Mac and have been working with whisper.cpp utilizing the GPU. However, I have not been able to do so with pywhispercpp. Is there a more in-depth guide or explanation available to use as a reference?

I also have made some modifications to your /examples/main.py to allow output to json:

if args.output_json:
    logging.info("Saving results as a JSON file ...")
    json_file = utils.output_json(segs, file)
    logging.info(f"JSON file saved to {json_file}")

and

parser.add_argument('-ojson', '--output-json', action='store_true', help="output result in a json file")

I also made changes to the utils.py:

import json
from pathlib import Path


def output_json(segments: list, output_file_path: str) -> str:
    """
    Creates a JSON file from a list of segments.

    :param segments: list of segments
    :param output_file_path: path of the output file

    :return: absolute path of the file
    """
    if not output_file_path.endswith('.json'):
        output_file_path = output_file_path + '.json'

    absolute_path = Path(output_file_path).absolute()

    # Convert segments to a list of dictionaries
    segments_json = []
    for seg in segments:
        segment_dict = {
            "start_time": seg.t0,
            "end_time": seg.t1,
            "text": seg.text
        }
        segments_json.append(segment_dict)

    # Write the list of segment dictionaries to the JSON file
    with open(absolute_path, 'w', encoding='utf-8') as file:
        json.dump(segments_json, file, ensure_ascii=False, indent=4)

    return str(absolute_path)
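For reference, here is a condensed, self-contained version of the helper that can be run as-is (the `Segment` namedtuple and the output file name are hypothetical stand-ins for pywhispercpp's real segment objects, which expose the same `t0`/`t1`/`text` fields):

```python
import json
import os
import tempfile
from collections import namedtuple
from pathlib import Path

# Hypothetical stand-in for pywhispercpp's segment objects
# (t0/t1 are whisper.cpp timestamps, text is the transcript).
Segment = namedtuple("Segment", ["t0", "t1", "text"])


def output_json(segments: list, output_file_path: str) -> str:
    """Write segments to a JSON file and return its absolute path."""
    if not output_file_path.endswith('.json'):
        output_file_path = output_file_path + '.json'
    absolute_path = Path(output_file_path).absolute()
    segments_json = [
        {"start_time": seg.t0, "end_time": seg.t1, "text": seg.text}
        for seg in segments
    ]
    with open(absolute_path, 'w', encoding='utf-8') as f:
        json.dump(segments_json, f, ensure_ascii=False, indent=4)
    return str(absolute_path)


segs = [Segment(0, 1100, "ask not what your country can do for you")]
out = output_json(segs, os.path.join(tempfile.gettempdir(), "jfk"))
with open(out, encoding='utf-8') as f:
    data = json.load(f)
```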

Here is the JSON output for /samples/jfk.wav as an example. Thanks again for your work.

[
    {
        "start_time": 0,
        "end_time": 1100,
        "text": "And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country."
    }
]
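(Editor's note, not part of the patch above: the start_time/end_time values are whisper.cpp timestamps in units of 10 ms, so end_time 1100 corresponds to the 11-second jfk.wav clip. A quick sketch of converting such a value to an HH:MM:SS.mmm string, should anyone want human-readable timestamps in the JSON:)

```python
def timestamp_to_str(t: int) -> str:
    """Convert a whisper.cpp timestamp (units of 10 ms) to HH:MM:SS.mmm."""
    ms = t * 10
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    seconds, ms = divmod(ms, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


print(timestamp_to_str(1100))  # 00:00:11.000
```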

@abdeladim-s
Owner

Thanks @w0372299 for the JSON idea, it looks great. Please submit a PR and I will merge it into the codebase.

Regarding your question, as I said, I really wish I could help, but I don't have access to a Mac.
I think the best use case for whisper.cpp is running it on the CPU. If you want to use the GPU, just use the original Whisper with PyTorch (it is already optimized for GPU), or even better use Faster-Whisper, which supports the GPU and provides better performance.
