-
I saw we can use multiple threads to invoke the APIs, ref. But that's not quite my question. I also see that PyTorch inference is thread-safe, ref. However, I'd like to double-confirm with you.
Replies: 2 comments
-
A short answer is no. The KV caching mechanism uses forward hooks which are installed in the module objects and will cause race issues when used by multiple threads. The `--threads` option provides more low-level control over how CPU operations are parallelized, but it's less relevant if you're using a GPU. Even if you disabled KV caching, multi-threaded usage would generally be inefficient because of the GIL. Multiprocessing will buy you more, as with Multi-Instance GPU, and multithreading may be more viable if you use the PyTorch C++ API as in your second reference. If you're integrating Whisper with a serving layer, it may support automatic batching, e.g. in TensorFlow Serving, which usually makes more efficient use of GPU resources.
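To illustrate the multiprocessing approach: a minimal sketch, one process per audio file, where each worker would load its own model copy so no forward-hook state is shared across processes. The model loading and transcription here are stubbed (the real calls, shown in comments, assume the `openai-whisper` package); the parallelization pattern is the point.

```python
from multiprocessing import Pool


def transcribe(path):
    # Real use would be (assumes openai-whisper is installed):
    #   import whisper
    #   model = whisper.load_model("base")  # loaded per process, not shared
    #   return model.transcribe(path)["text"]
    # Stubbed here so the sketch is self-contained and runnable anywhere:
    return f"transcript of {path}"


if __name__ == "__main__":
    files = ["a.wav", "b.wav", "c.wav"]
    # Each worker process gets its own interpreter (no GIL contention)
    # and its own model instance (no shared KV-cache hooks).
    with Pool(processes=3) as pool:
        results = pool.map(transcribe, files)
    print(results)
```

Loading the model once per process (e.g. via a Pool initializer) rather than once per call would be the practical choice, since `load_model` is expensive; the sketch keeps it inline for brevity.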
-
What are the command-line switches to enable Whisper to take advantage of multiple GPUs?