Flash attention no longer working in most recent build? #15650
Master-Pr0grammer started this conversation in General
I keep getting a "FlashAttention without tensor cores only supports head sizes 64 and 128." error, followed by a segfault, whenever I try to run any gemma3 model on the most recent build.
I have a GTX 1080 Ti, which I know is old and has no tensor cores, but I was able to run these models perfectly before updating. Has anyone had a similar experience and/or found a fix that doesn't involve downgrading? Or is this a bug? I wanted to ask before filing a bug report.
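The only workaround I can think of is forcing flash attention off entirely. A minimal sketch of what I mean, using the llama-cpp-python bindings just to illustrate the toggle (the model path is a placeholder, and I'm assuming the `flash_attn` constructor flag maps onto the same code path as the native `-fa` / `--flash-attn` option):

```python
# Sketch only: assumes a CUDA build of llama-cpp-python and a local gemma3 GGUF.
from llama_cpp import Llama

llm = Llama(
    model_path="models/gemma-3-4b-it-Q4_K_M.gguf",  # placeholder path, not my actual file
    n_gpu_layers=-1,   # offload all layers to the 1080 Ti
    flash_attn=False,  # keep flash attention disabled to avoid the head-size check
)

print(llm("Hello", max_tokens=16)["choices"][0]["text"])
```

That loses whatever speedup flash attention was giving me, though, so I'd still like to know whether the new behaviour is intentional or a regression.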