Commit 262e865

ruby : Sync whisper.cpp and model download feature (ggml-org#2617)
* Use C++17
* Add test for Pathname of model
* Make Whisper::Context#initialize accept Pathname
* Add shorthand for pre-converted models
* Update documents
* Add headings to API section in README [skip ci]
* Remove unused function
* Don't care about no longer included file
* Cosmetic fix
* Use conditional get when get model files
1 parent ed733e8 commit 262e865

10 files changed: +252 -68 lines changed

bindings/ruby/.gitignore

Lines changed: 3 additions & 1 deletion
@@ -1,3 +1,5 @@
 LICENSE
 pkg/
-lib/whisper.*
+lib/whisper.so
+lib/whisper.bundle
+lib/whisper.dll

bindings/ruby/README.md

Lines changed: 55 additions & 10 deletions
@@ -22,7 +22,7 @@ Usage
 ```ruby
 require "whisper"
 
-whisper = Whisper::Context.new("path/to/model.bin")
+whisper = Whisper::Context.new(Whisper::Model["base"])
 
 params = Whisper::Params.new
 params.language = "en"
@@ -41,21 +41,60 @@ end
 
 ### Preparing model ###
 
-Use script to download model file(s):
+Some models are prepared up-front:
 
-```bash
-git clone https://github.com/ggerganov/whisper.cpp.git
-cd whisper.cpp
-sh ./models/download-ggml-model.sh base.en
+```ruby
+base_en = Whisper::Model["base.en"]
+whisper = Whisper::Context.new(base_en)
+```
+
+At first time you use a model, it is downloaded automatically. After that, downloaded cached file is used. To clear cache, call `#clear_cache`:
+
+```ruby
+Whisper::Model["base"].clear_cache
 ```
 
-There are some types of models. See [models][] page for details.
+You can see the list of prepared model names by `Whisper::Model.preconverted_model_names`:
+
+```ruby
+puts Whisper::Model.preconverted_model_names
+# tiny
+# tiny.en
+# tiny-q5_1
+# tiny.en-q5_1
+# tiny-q8_0
+# base
+# base.en
+# base-q5_1
+# base.en-q5_1
+# base-q8_0
+# :
+# :
+```
+
+You can also use local model files you prepared:
+
+```ruby
+whisper = Whisper::Context.new("path/to/your/model.bin")
+```
+
+Or, you can download model files:
+
+```ruby
+model_uri = Whisper::Model::URI.new("http://example.net/uri/of/your/model.bin")
+whisper = Whisper::Context.new(model_uri)
+```
+
+See [models][] page for details.
 
 ### Preparing audio file ###
 
 Currently, whisper.cpp accepts only 16-bit WAV files.
 
-### API ###
+API
+---
+
+### Segments ###
 
 Once `Whisper::Context#transcribe` called, you can retrieve segments by `#each_segment`:
 
@@ -107,10 +146,12 @@ whisper.transcribe("path/to/audio.wav", params)
 
 ```
 
+### Models ###
+
 You can see model information:
 
 ```ruby
-whisper = Whisper::Context.new("path/to/model.bin")
+whisper = Whisper::Context.new(Whisper::Model["base"])
 model = whisper.model
 
 model.n_vocab # => 51864
@@ -128,6 +169,8 @@ model.type # => "base"
 
 ```
 
+### Logging ###
+
 You can set log callback:
 
 ```ruby
@@ -160,6 +203,8 @@ Whisper.log_set ->(level, buffer, user_data) {
 Whisper::Context.new(MODEL)
 ```
 
+### Low-level API to transcribe ###
+
 You can also call `Whisper::Context#full` and `#full_parallel` with a Ruby array as samples. Although `#transcribe` with audio file path is recommended because it extracts PCM samples in C++ and is fast, `#full` and `#full_parallel` give you flexibility.
 
 ```ruby
@@ -169,7 +214,7 @@ require "wavefile"
 reader = WaveFile::Reader.new("path/to/audio.wav", WaveFile::Format.new(:mono, :float, 16000))
 samples = reader.enum_for(:each_buffer).map(&:samples).flatten
 
-whisper = Whisper::Context.new("path/to/model.bin")
+whisper = Whisper::Context.new(Whisper::Model["base"])
 whisper.full(Whisper::Params.new, samples)
 whisper.each_segment do |segment|
   puts segment.text
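
The README changes above replace hard-coded file paths with lookups such as `Whisper::Model["base"]`. As a rough, self-contained illustration of that kind of lookup, the sketch below maps a model name to a download URL. It is not the gem's code: `ModelRegistry`, `url_for`, and `names` are hypothetical, the list of names is shortened, and only the URL pattern is taken from `lib/whisper/model.rb` in this commit.

```ruby
require "uri"

# Hypothetical stand-in for Whisper::Model; for illustration only.
module ModelRegistry
  # URL pattern as it appears in lib/whisper/model.rb in this commit.
  URL_TEMPLATE = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-%s.bin"

  # Shortened list; the commit registers many more names.
  NAMES = %w[tiny tiny.en base base.en].freeze

  # Returns a URI for a known model name, or nil for an unknown one.
  def self.url_for(name)
    return nil unless NAMES.include?(name)

    URI(format(URL_TEMPLATE, name))
  end

  # Returns the known model names.
  def self.names
    NAMES
  end
end

puts ModelRegistry.url_for("base.en")
```

Returning `nil` for an unknown name matches what a plain hash lookup does; a caller that prefers a loud failure could raise instead.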

bindings/ruby/Rakefile

Lines changed: 2 additions & 12 deletions
@@ -18,19 +18,9 @@ EXTSOURCES.each do |src|
 end
 
 CLEAN.include SOURCES
-CLEAN.include FileList[
-  "ext/*.o",
-  "ext/*.metal",
-  "ext/whisper.{so,bundle,dll}",
-  "ext/depend"
-]
+CLEAN.include FileList["ext/*.o", "ext/*.metal", "ext/whisper.{so,bundle,dll}"]
 
-task build: FileList[
-  "ext/Makefile",
-  "ext/ruby_whisper.h",
-  "ext/ruby_whisper.cpp",
-  "whispercpp.gemspec",
-]
+task build: ["ext/Makefile", "ext/ruby_whisper.h", "ext/ruby_whisper.cpp", "whispercpp.gemspec"]
 
 directory "pkg"
 CLOBBER.include "pkg"

bindings/ruby/ext/.gitignore

Lines changed: 0 additions & 1 deletion
@@ -2,7 +2,6 @@ Makefile
 whisper.so
 whisper.bundle
 whisper.dll
-depend
 scripts/get-flags.mk
 *.o
 *.c

bindings/ruby/ext/extconf.rb

Lines changed: 4 additions & 4 deletions
@@ -1,7 +1,7 @@
 require 'mkmf'
 
 # need to use c++ compiler flags
-$CXXFLAGS << ' -std=c++11'
+$CXXFLAGS << ' -std=c++17'
 
 $LDFLAGS << ' -lstdc++'
 
@@ -35,10 +35,10 @@
   $GGML_METAL_EMBED_LIBRARY = true
 end
 
-$MK_CPPFLAGS = '-Iggml/include -Iggml/src -Iinclude -Isrc -Iexamples'
+$MK_CPPFLAGS = '-Iggml/include -Iggml/src -Iggml/src/ggml-cpu -Iinclude -Isrc -Iexamples'
 $MK_CFLAGS = '-std=c11 -fPIC'
-$MK_CXXFLAGS = '-std=c++11 -fPIC'
-$MK_NVCCFLAGS = '-std=c++11'
+$MK_CXXFLAGS = '-std=c++17 -fPIC'
+$MK_NVCCFLAGS = '-std=c++17'
 $MK_LDFLAGS = ''
 
 $OBJ_GGML = []

bindings/ruby/ext/ruby_whisper.cpp

Lines changed: 7 additions & 0 deletions
@@ -45,6 +45,7 @@ static ID id_to_enum;
 static ID id_length;
 static ID id_next;
 static ID id_new;
+static ID id_to_path;
 
 static bool is_log_callback_finalized = false;
 
@@ -194,7 +195,9 @@ static VALUE ruby_whisper_params_allocate(VALUE klass) {
 
 /*
  * call-seq:
+ *   new(Whisper::Model["base.en"]) -> Whisper::Context
  *   new("path/to/model.bin") -> Whisper::Context
+ *   new(Whisper::Model::URI.new("https://example.net/uri/of/model.bin")) -> Whisper::Context
  */
 static VALUE ruby_whisper_initialize(int argc, VALUE *argv, VALUE self) {
   ruby_whisper *rw;
@@ -204,6 +207,9 @@ static VALUE ruby_whisper_initialize(int argc, VALUE *argv, VALUE self) {
   rb_scan_args(argc, argv, "01", &whisper_model_file_path);
   Data_Get_Struct(self, ruby_whisper, rw);
 
+  if (rb_respond_to(whisper_model_file_path, id_to_path)) {
+    whisper_model_file_path = rb_funcall(whisper_model_file_path, id_to_path, 0);
+  }
   if (!rb_respond_to(whisper_model_file_path, id_to_s)) {
     rb_raise(rb_eRuntimeError, "Expected file path to model to initialize Whisper::Context");
   }
@@ -1733,6 +1739,7 @@ void Init_whisper() {
   id_length = rb_intern("length");
   id_next = rb_intern("next");
   id_new = rb_intern("new");
+  id_to_path = rb_intern("to_path");
 
   mWhisper = rb_define_module("Whisper");
   cContext = rb_define_class_under(mWhisper, "Context", rb_cObject);
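
The change to `ruby_whisper_initialize` above converts any argument that responds to `to_path` before the existing `to_s` check runs, which is what lets `Pathname` and `Whisper::Model::URI` objects be passed in. The same order of checks can be sketched in plain Ruby. The `resolve_model_path` helper and the `FakeModel` class are hypothetical and not part of the gem, and the final `to_s` call is an assumption about what the unshown C code does next.

```ruby
require "pathname"

# Hypothetical helper mirroring the order of checks in the C code above.
def resolve_model_path(model)
  # Pathname, File and similar objects expose #to_path.
  model = model.to_path if model.respond_to?(:to_path)
  unless model.respond_to?(:to_s)
    raise "Expected file path to model to initialize Whisper::Context"
  end
  # Assumption: the path is finally used as a string.
  model.to_s
end

# Hypothetical object that resolves itself to a file path on demand.
class FakeModel
  def to_path
    "/tmp/cache/ggml-base.bin"
  end
end

puts resolve_model_path("path/to/model.bin")
puts resolve_model_path(Pathname("path/to/model.bin"))
puts resolve_model_path(FakeModel.new)
```

Relying on `to_path` rather than checking for specific classes means any object that can locate its own file, including one that downloads it first, can be used as a model argument.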

bindings/ruby/lib/whisper.rb

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+require "whisper.so"
+require "whisper/model"

bindings/ruby/lib/whisper/model.rb

Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
+require "whisper.so"
+require "uri"
+require "net/http"
+require "pathname"
+require "io/console/size"
+
+class Whisper::Model
+  class URI
+    def initialize(uri)
+      @uri = URI(uri)
+    end
+
+    def to_path
+      cache
+      cache_path.to_path
+    end
+
+    def clear_cache
+      path = cache_path
+      path.delete if path.exist?
+    end
+
+    private
+
+    def cache_path
+      base_cache_dir/@uri.host/@uri.path[1..]
+    end
+
+    def base_cache_dir
+      base = case RUBY_PLATFORM
+             when /mswin|mingw/
+               ENV.key?("LOCALAPPDATA") ? Pathname(ENV["LOCALAPPDATA"]) : Pathname(Dir.home)/"AppData/Local"
+             when /darwin/
+               Pathname(Dir.home)/"Library/Caches"
+             else
+               ENV.key?("XDG_CACHE_HOME") ? ENV["XDG_CACHE_HOME"] : Pathname(Dir.home)/".cache"
+             end
+      base/"whisper.cpp"
+    end
+
+    def cache
+      path = cache_path
+      headers = {}
+      headers["if-modified-since"] = path.mtime.httpdate if path.exist?
+      request @uri, headers
+      path
+    end
+
+    def request(uri, headers)
+      Net::HTTP.start uri.host, uri.port, use_ssl: uri.scheme == "https" do |http|
+        request = Net::HTTP::Get.new(uri, headers)
+        http.request request do |response|
+          case response
+          when Net::HTTPNotModified
+            # noop
+          when Net::HTTPOK
+            download response
+          when Net::HTTPRedirection
+            request URI(response["location"])
+          else
+            raise response
+          end
+        end
+      end
+    end
+
+    def download(response)
+      path = cache_path
+      path.dirname.mkpath unless path.dirname.exist?
+      downloading_path = Pathname("#{path}.downloading")
+      size = response.content_length
+      downloading_path.open "wb" do |file|
+        downloaded = 0
+        response.read_body do |chunk|
+          file << chunk
+          downloaded += chunk.bytesize
+          show_progress downloaded, size
+        end
+      end
+      downloading_path.rename path
+    end
+
+    def show_progress(current, size)
+      return unless size
+
+      unless @prev
+        @prev = Time.now
+        $stderr.puts "Downloading #{@uri}"
+      end
+
+      now = Time.now
+      return if now - @prev < 1 && current < size
+
+      progress_width = 20
+      progress = current.to_f / size
+      arrow_length = progress * progress_width
+      arrow = "=" * (arrow_length - 1) + ">" + " " * (progress_width - arrow_length)
+      line = "[#{arrow}] (#{format_bytesize(current)} / #{format_bytesize(size)})"
+      padding = ' ' * ($stderr.winsize[1] - line.size)
+      $stderr.print "\r#{line}#{padding}"
+      $stderr.puts if current >= size
+      @prev = now
+    end
+
+    def format_bytesize(bytesize)
+      return "0.0 B" if bytesize.zero?
+
+      units = %w[B KiB MiB GiB TiB]
+      exp = (Math.log(bytesize) / Math.log(1024)).to_i
+      format("%.1f %s", bytesize.to_f / 1024 ** exp, units[exp])
+    end
+  end
+
+  @names = {}
+  %w[
+    tiny
+    tiny.en
+    tiny-q5_1
+    tiny.en-q5_1
+    tiny-q8_0
+    base
+    base.en
+    base-q5_1
+    base.en-q5_1
+    base-q8_0
+    small
+    small.en
+    small.en-tdrz
+    small-q5_1
+    small.en-q5_1
+    small-q8_0
+    medium
+    medium.en
+    medium-q5_0
+    medium.en-q5_0
+    medium-q8_0
+    large-v1
+    large-v2
+    large-v2-q5_0
+    large-v2-8_0
+    large-v3
+    large-v3-q5_0
+    large-v3-turbo
+    large-v3-turbo-q5_0
+    large-v3-turbo-q8_0
+  ].each do |name|
+    @names[name] = URI.new("https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-#{name}.bin")
+  end
+
+  class << self
+    def [](name)
+      @names[name]
+    end
+
+    def preconverted_model_names
+      @names.keys
+    end
+  end
+end
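
Two ideas in `model.rb` may be easier to follow in isolation. The cache file location is derived from the URI's host and path, and a cached file's modification time is sent as `If-Modified-Since` so that the server can answer `304 Not Modified` instead of resending the model. The sketch below is a minimal illustration under stated assumptions: the `cache_path_for` and `conditional_get` helpers are hypothetical, the cache directory is a temporary one supplied by the caller rather than the platform-specific directory used in the commit, and the request is only built, never sent.

```ruby
require "uri"
require "net/http"
require "pathname"
require "time"
require "tmpdir"

# Hypothetical helper: map a URI to a file under base_dir using host + path.
def cache_path_for(uri, base_dir)
  Pathname(base_dir) / uri.host / uri.path[1..]
end

# Hypothetical helper: build a GET request that is conditional on the cached
# file's mtime. The request is only constructed here, not sent.
def conditional_get(uri, cached_file)
  headers = {}
  headers["if-modified-since"] = cached_file.mtime.httpdate if cached_file.exist?
  Net::HTTP::Get.new(uri, headers)
end

uri = URI("https://example.net/models/ggml-base.bin")

Dir.mktmpdir do |dir|
  path = cache_path_for(uri, dir)
  puts path.relative_path_from(Pathname(dir))

  # No cached file yet, so the request carries no conditional header.
  p conditional_get(uri, path)["if-modified-since"]

  # Simulate a finished download, then build the request again.
  path.dirname.mkpath
  path.write("dummy")
  p conditional_get(uri, path)["if-modified-since"]
end
```

Keying the cache on host and path keeps files from different servers apart, and the conditional header lets an unchanged model be confirmed with a small response rather than a full download.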
