Commit Graph

58 Commits

Author SHA1 Message Date
Andreas Lubbe
a0ca50f3b9
cli: Fix assignment for vad_min_silence_duration_ms (#3467)
* cli: Fix assignment for vad_min_silence_duration_ms

Found and fixed this simple copy/paste error

* server : fix vad_min_silence_duration_ms assignment

---------

Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2025-10-10 15:21:03 +02:00
Georgi Gerganov
0b3587acdd
whisper : enable flash attention by default (#3441) 2025-09-30 15:47:20 +03:00
Sacha Arbonel
1f5cf0b288
server : hide language probabilities option behind flag (#3328)
* examples/server: hide language probabilities option behind flag

* code review

* fix
2025-07-21 13:03:54 +02:00
accessiblepixel
869335f2d5
server : add dtw.params for v3-large-turbo (#3307)
* Add DTW model large-v3-turbo parameters to server.cpp example

DTW support is available in whispercpp and the large-v3-turbo model has already been added to the sources, but the large-v3-turbo model hasn't been added to the server.cpp file to make use of it. This commit hopefully corrects that issue.

* match original linebreak of original server.cpp file after adding large.v3.turbo dtw
2025-07-07 12:51:15 +03:00
Daniel Bevenius
f3ff80ea8d
examples : set the C++ standard to C++17 for server (#3261)
This commit updates the server example to use C++17 as the standard.

The motivation for this change is that currently the ci-run
`ggml-100-mac-m4` is failing when compiling the server example on
macOS. The `talk-llama` example also has this setting so it looks like
an alright change to make.

ggml-ci

Refs: https://github.com/ggml-org/ci/tree/results/whisper.cpp/2a/4d6db7d90899aff3d58d70996916968e4e0d27/ggml-100-mac-m4
2025-06-17 11:29:48 +02:00
Sacha Arbonel
107c303e69
server : graceful shutdown, atomic server state, and health endpoint Improvements (#3243)
* feat(server): implement graceful shutdown and server state management

* refactor(server): use lambda capture by reference in server.cpp
2025-06-16 10:14:26 +02:00
Daniel Bevenius
0a4d85cf8a
server : add Voice Activity Detection (VAD) support (#3246)
* server : add Voice Activity Detection (VAD) support

This commit adds support for Voice Activity Detection (VAD) in the
server example.

The motivation for this is to enable VAD processing when using
whisper-server.

Resolves: https://github.com/ggml-org/whisper.cpp/issues/3089

* server : add VAD parameters to usage in README.md [no ci]

This commit also adds a few missing parameters.

* server : fix conflicting short options [no ci]
2025-06-13 13:24:03 +02:00
Daniel Bevenius
73a8c5fb94
whisper : remove whisper_load_backends function (#3196)
* whisper : remove whisper_load_backends function

This commit removes the `whisper_load_backends` function, which was used
to load all GGML backends.

The motivation for this change push the responsibility of loading
backends to user applications to give them more control over which
backends to load and when. See the references below for more context.

Resolves: https://github.com/ggml-org/whisper.cpp/issues/3182
Refs: https://github.com/ggml-org/whisper.cpp/pull/3042#issuecomment-2801778733
Refs: https://github.com/ggml-org/whisper.cpp/pull/3042#issuecomment-2801928990

* ruby : add check for rwc is NULL

This commit adds a check to ensure that the `rwc` pointer is not NULL
before attempting to mark its members in the garbage collector.

The motivation for this is an attempt to see if this fixed the CI build
as I'm not able to reproduce the issue locally.

Refs: https://github.com/ggml-org/whisper.cpp/actions/runs/15299612277/job/43036694928?pr=3196
2025-05-29 08:03:17 +02:00
Sacha Arbonel
78b31ca782
server : Add k6 Load Testing Script (#3175)
* add load testing script and update README for k6 integration
2025-05-22 10:03:04 +02:00
Daniel Bevenius
3882a099e1
server : add --flash-attn usage output (#3152)
This commit adds the `--flash-attn` option to the usage output of the
server example.

The motivation for this change is that while it is possible to set this
option it is not printed in the usage output.
2025-05-14 15:22:05 +02:00
Daniel Bevenius
09846f4e12
whisper: remove MSVC warnings pragmas (#3090)
* ggml : remove MSVC warnings pragmas

This commit removes the MSVC-specific pragmas as these are now handled
in CMakeLists.txt.

* whisper : remove MSVC warning pragmas

This commit removes the MSVC-specific pragmas. These are now handled in
the CMakeLists.txt file.
2025-05-05 13:09:35 +02:00
Sacha Arbonel
bcf1ed0163
server: update abort mechanism to handle HTTP connection closure (#3112) 2025-05-05 07:16:54 +02:00
Sacha Arbonel
1fa17bc752
server : update httplib.h to version 0.20.0 (#3101) 2025-05-02 06:09:41 +02:00
Daniel Bevenius
25efcfe3ed
server : add --no-gpu option to print usage output (#3098)
This commit adds the the command line option `--no-gpu` to the server
examples print usage function.

The motivation for this is that this options is available and can be set
but it is not displayed in the usage message.

Refs: https://github.com/ggml-org/whisper.cpp/issues/3095
2025-05-01 09:15:12 +03:00
Sacha Arbonel
f0171f0616
examples : expose language detection probabilities to server example (#3044)
* feat: expose language detection probabilities to server.cpp

* feat: enhance language detection output in server.cpp

* Remove empty spaces.
2025-04-28 18:25:45 +02:00
Sacha Arbonel
170b2faf75
whisper : add no_context parameter to whisper_params (#3045) 2025-04-16 06:24:38 +02:00
Sacha Arbonel
88d13a17a7
feat: add health check endpoint to server (#2968) 2025-03-31 11:03:41 +03:00
Georgi Gerganov
c64f3e8ada
common : separate whisper sources (#2846)
* common : separate whisper sources

* examples : add chrono

* examples : add more headers
2025-02-27 12:50:32 +02:00
Dmitry Atamanov
7d3da68f79
examples : use miniaudio for direct decoding flac, mp3, ogg and wav (#2759) 2025-02-27 09:06:54 +02:00
Georgi Gerganov
e940fbf283
server : fix build (#2718) 2025-01-13 08:57:33 +02:00
NETZkultur GmbH
45d3faf961
server : generate unique tmp filenames (#2718)
#Summary

This Merge Request adds a mechanism to generate unique filenames for FFmpeg conversions in whisper_server.cpp. Previously, a single fixed filename was used (e.g., whisper-server-tmp.wav), which could result in unexpected file overwrites under certain circumstances. By generating a unique filename per request, any risk of overwriting temporary files is eliminated.

#Background / Motivation
	•	Problem: Relying on a static filename for temporary audio files may lead to overwrites if multiple operations occur simultaneously or if the same file name is reused.
	•	Goal: Dynamically generate unique filenames, ensuring each request or operation uses an isolated temporary file.
2025-01-13 08:55:21 +02:00
Georgi Gerganov
ed09075ca0
server : fix help print 2024-12-22 15:32:05 +02:00
Sacha Arbonel
4183517076
server : add no-speech threshold parameter and functionality (#2654) 2024-12-21 17:00:08 +02:00
Georgi Gerganov
f4668169a0
whisper : rename suppress_non_speech_tokens to suppress_nst (#2653) 2024-12-21 12:54:35 +02:00
Sacha Arbonel
944ce49439
server : add option to suppress non-speech tokens (#2649)
* The parameter will suppress non-speech tokens like [LAUGH], [SIGH], etc. from the output when enabled.

* add to whisper_params_parse

* add missing param
2024-12-21 12:05:05 +02:00
Georgi Gerganov
2e59dced12
whisper : rename binaries + fix install (#2648)
* whisper : rename binaries + fix install

* cont : try to fix ci

* cont : fix emscripten builds
2024-12-21 09:43:49 +02:00
gilbertgong
ede1718f6d
server : ffmpeg overwrite leftover temp file (#2431)
* Remove possible leftover ffmpeg temp file from a previous failed conversion

* Revert "Remove possible leftover ffmpeg temp file from a previous failed conversion"

This reverts commit 00797403bd.

* Flag to force ffmpeg to overwrite output file if it exists
2024-10-02 15:06:40 +03:00
Toliver
5b1ce40fa8
server : use OS-generated temp file name for converted files (#2419) 2024-09-17 15:56:32 +03:00
Emmanuel Schmidbauer
bec9836849
server : add inference path to make OAI API compatible (#2270) 2024-07-08 14:24:58 +03:00
Borislav Stanimirov
af5833e298
whisper : remove speed_up and phase_vocoder* functions (#2198)
* whisper : fix cast warning

* whisper : remove phase_vocoder functions, ref #2195

* whisper : remove speed_up from whisper_full_params, closes #2195
2024-05-31 11:37:29 +03:00
Daniel Valdivia
a7dc2aab16
server : fix typo (#2181)
A simple comment typo, PR can be dismissed
2024-05-25 10:46:22 +03:00
Georgi Gerganov
7094ea5e75
whisper : use flash attention (#2152)
* whisper : use flash attention in the encoder

* whisper : add kv_pad

* whisper : remove extra backend instance (huh?)

* whisper : use FA for cross-attention

* whisper : use FA for self-attention

* whisper : simplify encoder FA

* whisper : add flash_attn runtime parameter

* scripts : add bench log

* scripts : add M1 Pro bench log
2024-05-15 09:38:19 +03:00
Georgi Gerganov
4ef8d9f44e
server : return utf-8 (#2138) 2024-05-13 15:33:46 +03:00
Emmanuel Schmidbauer
9fab28135c
server : add dtw (#2044)
* server.cpp: add dtw

* Update examples/server/server.cpp

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-04-15 22:16:58 +03:00
Jo Liss
e7794a868f
examples : rename --audio-context to --audio-ctx per help text (#1953) 2024-03-18 17:53:33 +02:00
Felix
07d04280be
examples : clean up common code (#1871)
move some utility functions into common.h
2024-02-19 10:50:15 +02:00
dscripka
a6fb6ab597
examples : added audio_ctx argument to main and server (#1857)
* added audio_ctx argument to main and server examples

* Better default value

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* better default value (again)

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-02-12 09:19:07 +02:00
Georgi Gerganov
f273e66dc6
examples : initialize context params properly (#1852) 2024-02-11 16:39:12 +02:00
Valentin Gosu
80e8a2ea39
server : allow CORS request with authorization headers (#1850)
Whisper plugin in Obsidian requires an API key which is
then sent as an authorization header.
However, the presence of an authorization header requires
a CORS Preflight, so both the OPTIONS method and
the Access-Control-Allow-Headers: authorization must be
handled.
2024-02-09 17:42:41 +02:00
JacobLinCool
baa30bacdb
server : add fields to verbose_json response (#1802)
* server: include additional fields in the verbose_json response as OpenAI does

* server: show request examples on home page

* server: todo note for compression_ratio and no_speech_prob

* server: add simple demo form to the homepage
2024-01-30 14:15:55 +02:00
Ryan Hitchman
c0329acde8
server : implement "verbose_json" format with token details (#1781)
* examples/server: implement "verbose_json" format with token details.

This is intended to mirror the format of openai's Python
whisper.transcribe() return values.

* server: don't write WAV to a temporary file if not converting

* server: use std::lock_guard instead of manual lock/unlock
2024-01-18 22:58:42 +02:00
Przemysław Pawełczyk
f5f159c320
server : fix building and simplify lib deps on Windows (#1772)
* make : fix server example building on MSYS2 environments (Windows)

It was not working since commit eff3570f78
when server was introduced.

* cmake : simplify server example lib deps on Windows

server uses httplib::Server, not httplib::SSLServer, so there is no need
to mention cryptographic libraries in target_link_libraries.
Winsock (ws2_32) suffices here.

Also use plain library names like we use in other places.
2024-01-15 15:48:13 +02:00
George Hindle
fbcb52d3cd
server : add more parameters to server api (#1754)
* feat(server): add more parameters to server api

* fix(server): reset params to original parsed values for each request
2024-01-12 13:42:52 +02:00
George Hindle
f7908f9bb8
params : don't compute timestamps when not printing them (#1755) 2024-01-12 13:24:38 +02:00
Emmanuel Schmidbauer
c46886f599
server : add request path option(#1741) 2024-01-08 22:39:51 +00:00
Georgi Gerganov
022756a872
server : fix server temperature + add temperature_inc (#1729)
* server : fix server temperature + add temperature_inc

* server : change dashes to underscores in parameter names
2024-01-07 13:35:14 +02:00
Kreijstal
6335933a5b
cmake : Fix bug in httplib.h for mingw (#1615)
Fix bug in httlib.h for mingw, please see https://github.com/yhirose/cpp-httplib/issues/1669
2023-12-10 17:47:52 +00:00
Oleg Sidorov
3163090d89
server : pass max-len argument to the server (#1574)
This commit fixes the missing parameter binding for max-len between the input
arguments and wparams.
2023-12-05 23:01:45 +02:00
Aleksander Andrzejewski
a0ec3fac54
Server : Add support for .vtt format to Whisper server (#1578)
- The code comes from examples/main
- The output mimetype is set to text/vtt

Example usage:
```shell
curl 127.0.0.1:8080/inference \
-H "Content-Type: multipart/form-data" \
-F file="@samples/jfk.wav" \
-F temperature="0.2" \
-F response-format="vtt"
```
2023-11-30 23:44:26 +00:00
Oleg Sidorov
6559b538e5
server : backport .srt output format (#1565)
This commit adds a support of .srt format to Whisper server. The code is
effectively backported from examples/main. The output mimetype is set to
application/x-subrip as per https://en.wikipedia.org/wiki/SubRip.

Example usage:

  curl 127.0.0.1:8080/inference \
    -H "Content-Type: multipart/form-data" \
    -F file="@<file-path>" \
    -F temperature="0.2" \
    -F response-format="srt"
2023-11-28 15:42:58 +02:00