UTF8 issue with command line parameters in Windows version #554

@bilo1967

If I pass the file "Chinese file (中文).mp3" to the Windows command-line version, it exits with an error:

rem Here main.exe has been renamed to whisper.exe
C:\...\whisp>whisper.exe --model models\ggml-tiny.bin --language chinese "Chinese file (中文).mp3"
whisper_init_from_file: loading model from 'models\ggml-tiny.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 384
whisper_model_load: n_audio_head  = 6
whisper_model_load: n_audio_layer = 4
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 384
whisper_model_load: n_text_head   = 6
whisper_model_load: n_text_layer  = 4
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 1
whisper_model_load: mem required  =  127.00 MB (+    3.00 MB per decoder)
whisper_model_load: kv self size  =    2.62 MB
whisper_model_load: kv cross size =    8.79 MB
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx     =   73.58 MB
whisper_model_load: model size    =   73.54 MB
error: failed to open 'Chinese file (??).mp3' as WAV file
error: failed to read WAV file 'Chinese file (??).mp3'

whisper_print_timings:     fallbacks =   0 p /   0 h
whisper_print_timings:     load time =   398.52 ms
whisper_print_timings:      mel time =     0.00 ms
whisper_print_timings:   sample time =     0.00 ms /     1 runs (    0.00 ms per run)
whisper_print_timings:   encode time =     0.00 ms /     1 runs (    0.00 ms per run)
whisper_print_timings:   decode time =     0.00 ms /     1 runs (    0.00 ms per run)
whisper_print_timings:    total time =   399.54 ms

It runs fine when I rename the file to omit the Chinese characters.
I've also tried setting the code page to UTF-8 with chcp 65001, with no luck.

(The macOS version works fine.)
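
The `??` in the error message suggests the non-ASCII characters are already lost by the time the program sees them in `argv`, which would also explain why `chcp 65001` makes no difference. One possible workaround, sketched below in C++, is to ignore the narrow `argv` on Windows and rebuild the arguments from the wide command line with the Win32 calls `GetCommandLineW` and `CommandLineToArgvW`, then encode them as UTF-8. The helper names `utf16_to_utf8` and `utf8_args` are hypothetical and not part of whisper.cpp; this is a minimal sketch under those assumptions, not the project's actual fix.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#include <shellapi.h>  // CommandLineToArgvW (link with shell32)
#include <cwchar>      // wcslen
#endif

// Hypothetical helper: encode a UTF-16 string as UTF-8.
// Surrogate pairs are combined; unpaired surrogates become U+FFFD.
std::string utf16_to_utf8(const std::u16string & in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: needs a following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // Low surrogate without a preceding high surrogate.
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: return the program arguments as UTF-8 strings.
// On Windows the narrow argv is ignored in favour of the wide command line;
// on other platforms argv is passed through unchanged.
std::vector<std::string> utf8_args(int argc, char ** argv) {
#ifdef _WIN32
    int wargc = 0;
    LPWSTR * wargv = CommandLineToArgvW(GetCommandLineW(), &wargc);
    if (wargv != nullptr) {
        std::vector<std::string> args;
        for (int i = 0; i < wargc; ++i) {
            // wchar_t is 16 bits wide on Windows, so an element-wise copy is enough.
            std::u16string w(wargv[i], wargv[i] + std::wcslen(wargv[i]));
            args.push_back(utf16_to_utf8(w));
        }
        LocalFree(wargv);
        return args;
    }
#endif
    assert(argc == 0 || argv != nullptr);
    return std::vector<std::string>(argv, argv + argc);
}
```

Calling `utf8_args(argc, argv)` at the top of `main` would give the rest of the program UTF-8 arguments. That alone is probably not sufficient on Windows: the narrow file-opening functions do not treat their path as UTF-8 by default, so the code that opens the audio file would likely also need to convert the path back to UTF-16 and use a wide-character call such as `_wfopen`.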
