Name and Version
$ ./build/bin/llama-cli --version
register_backend: registered backend Metal (1 devices)
register_device: registered device Metal (Apple M3)
register_backend: registered backend BLAS (1 devices)
register_device: registered device BLAS (Accelerate)
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (Apple M3)
version: 4830 (d78b4df0)
built with Apple clang version 16.0.0 (clang-1600.0.26.6) for arm64-apple-darwin23.6.0
Operating systems
Mac
Which llama.cpp modules do you know to be affected?
Other (Please specify in the next section)
Command line
Problem description & steps to reproduce
When running the LlamaSwiftUI example in the iOS Simulator and trying to load a model, the following error occurs:
ggml_metal_init: loaded kernel_ssm_conv_f32 0x0 | th_max = 0 | th_width = 0
ggml_metal_init: error: load pipeline error: Error Domain=CompilerError Code=2 "only 14 constant buffers binding are supported in the simulator but 16 were used" UserInfo={NSLocalizedDescription=only 14 constant buffers binding are supported in the simulator but 16 were used}
ggml_backend_metal_device_init: error: failed to allocate context
llama_init_from_model: failed to initialize Metal backend
Could not load context!
Error: The operation couldn’t be completed. (llama_swiftui.LlamaError error 0.)
This seems to be coming from the kernel_ssm_conv_f32 kernel function:
// ref: ggml.c:ggml_compute_forward_ssm_conv_f32
// TODO: optimize
kernel void kernel_ssm_conv_f32(
device const void * src0,
device const void * src1,
device float * dst,
constant int64_t & ne00,
constant int64_t & ne01,
constant int64_t & ne02,
constant uint64_t & nb00,
constant uint64_t & nb01,
constant uint64_t & nb02,
constant int64_t & ne10,
constant int64_t & ne11,
constant uint64_t & nb10,
constant uint64_t & nb11,
constant int64_t & ne0,
constant int64_t & ne1,
constant int64_t & ne2,
constant uint64_t & nb0,
constant uint64_t & nb1,
constant uint64_t & nb2,
uint3 tgpig[[threadgroup_position_in_grid]],
uint3 tpitg[[thread_position_in_threadgroup]],
uint3 ntg[[threads_per_threadgroup]]) {
Of this kernel's parameters, 16 are constant buffer bindings, but the simulator only supports 14. I tried simply adding structs for the kernel functions the compiler complained about, and this allowed the example to run (load a model and perform inference); a rough sketch is included below.
There is already an open issue and ongoing work to create these structs. I'm just opening this issue for awareness/tracking.
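For illustration, a minimal sketch of the struct approach, assuming the scalar arguments are packed into one struct (the struct name, field layout, and buffer index here are my own placeholders, not necessarily what the ongoing work uses):

// hypothetical argument struct shared between the host code and the .metal source
typedef struct {
    int64_t  ne00;
    int64_t  ne01;
    int64_t  ne02;
    uint64_t nb00;
    uint64_t nb01;
    uint64_t nb02;
    int64_t  ne10;
    int64_t  ne11;
    uint64_t nb10;
    uint64_t nb11;
    int64_t  ne0;
    int64_t  ne1;
    int64_t  ne2;
    uint64_t nb0;
    uint64_t nb1;
    uint64_t nb2;
} ssm_conv_args_t;

kernel void kernel_ssm_conv_f32(
        device const void * src0,
        device const void * src1,
        device       float * dst,
        constant ssm_conv_args_t & args, // one constant buffer binding instead of 16
        uint3 tgpig[[threadgroup_position_in_grid]],
        uint3 tpitg[[thread_position_in_threadgroup]],
        uint3   ntg[[threads_per_threadgroup]]) {
    // body unchanged, except each scalar is now read as args.ne00, args.nb01, etc.
}

On the host side the struct could be filled in ggml-metal.m and passed with a single [encoder setBytes:&args length:sizeof(args) atIndex:...] call, which keeps the binding count well under the simulator's limit of 14.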
First Bad Commit
No response