Skip to content

Conversation

@Gsus42
Copy link
Contributor

@Gsus42 Gsus42 commented Sep 3, 2025

Hi there,

we are using clang with -Wimplicit-fallthrough and -Werror enabled which helped us identify a few odd places in our code base. In the PCRE2 code base those places have already been commented and I propose to use the corresponding compiler attribute to make the compiler aware the fact that fallthrough is allowed at these places.

@NWilson
Copy link
Member

NWilson commented Sep 3, 2025

We'll need to do something for portability here (since many major compilers don't have __attribute__((fallthrough)), eg MSVC).

A bit nasty, hmm.

But, I welcome the suggestion to allow this warning to be enabled on Clang. According to this LLVM issue and this Flex issue (which Clang devs commented on), Clang simply isn't going to allow suppression of fallthrough warnings via a comment, which (in my opinion) is rather uncompromising and unhelpful, given decades of industry-wide use of /* fallthrough */ comments.

Our CI builds do enable -Wimplicit-fallthough on GCC, and the build does succeed due to GCC's support for suppression comments.

I think what we want is:

  • In pcre2_util.h, please add:
/* We define this fallthrough macro for suppressing -Wimplicit-fallthrough warnings.
Clang only allows this via an attribute, whereas other compilers (eg. GCC) match attributes
and also specially-formatted comments.

For maximum portability, please use this macro as:

  PCRE2_FALLTHROUGH /* Fall through */
#if defined(__has_attribute)
#if __has_attribute(fallthrough)
#define PCRE2_FALLTHROUGH __attribute__((fallthrough));
#endif
#endif
#ifndef PCRE2_FALLTHROUGH
#define PCRE2_FALLTHROUGH
#endif
  • Use that syntax throughout, to work with GCC and Clang and any other compilers (there are dozens that people use with PCRE2...)

  • Two of the YAML files in .github/ have a definition for CFLAGS_GCC_STYLE. We should update it to add -Wimplicit-fallthrough. It's included in -Wextra on GCC, but not on Clang.

Thank you!

…to define PCRE2_FALLTHROUGH similar to PCRE2_UNDEFINED.
@Gsus42
Copy link
Contributor Author

Gsus42 commented Sep 3, 2025

Ah sorry, I was too quick changing stuff. I've introduced a similar check as for HAVE_ATTRIBUTE_UNINITIALIZED. Would that be okay as well?

@NWilson
Copy link
Member

NWilson commented Sep 3, 2025

Well done! Sorry we were working simultaneously there.

Users often pull files from PCRE2 into their own build system, so I try and keep use of Autoconf and CMake minimal. We probably have too many configure-time checks already.

The changes you made were not wrong... But I think I slightly prefer the approach I suggested!

Using __has_attribute will work with more build systems. I think every compiler with __attribute__((fallthrough)) will also have __has_attribute (it's only GCC and LLVM), so coverage should equally good.

(Interestingly, our GCC build on CI shows checking for __attribute__((fallthrough))... no and Performing Test HAVE_ATTRIBUTE_FALLTHROUGH - Failed with your change.)

@Gsus42
Copy link
Contributor Author

Gsus42 commented Sep 3, 2025

Okay, then I'd revert that last change and go with your approach.

…rough)) to define PCRE2_FALLTHROUGH similar to PCRE2_UNDEFINED."

This reverts commit deb1078.
@NWilson
Copy link
Member

NWilson commented Sep 3, 2025

Thank you for your understanding!

In my comment, I wrote -Wno-implicit-fallthrough in my suggested code snipped, but I've realised it's actually -Wimplicit-fallthrough. Apologies

Gernot Gebhard added 2 commits September 3, 2025 12:28
@Gsus42
Copy link
Contributor Author

Gsus42 commented Sep 3, 2025

Thank you for your understanding!

In my comment, I wrote -Wno-implicit-fallthrough in my suggested code snipped, but I've realised it's actually -Wimplicit-fallthrough. Apologies

I also had to change the /* Fall through */ to // Fall through // to avoid:
src/pcre2_util.h:136:21: error: '/*' within block comment [-Werror,-Wcomment]

case OP_TYPEMINUPTO:
case OP_TYPEPOSUPTO:
tcode += IMM2_SIZE; /* Fall through */
tcode += IMM2_SIZE; PCRE2_FALLTHROUGH;
Copy link
Contributor

@carenas carenas Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move the PCRE2_FALLTHROUGH down to the next line and don't forget to include a comment next to it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also be fixed by last commit.

src/pcre2_util.h Outdated
Clang only allows this via an attribute, whereas other compilers (eg. GCC) match attributes
and also specially-formatted comments.
For maximum portability, please use this macro as:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"use this macro" TOGETHER with a "fall through" comment.

and make sure to follow your own advice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fixed by last commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally prefer /* */ comments, although I think a few of the "new" (haha!) // ones have crept into the codebase.

It's a shame that it's awkward to put that in the doc string with that -Wcomment warning. We'll just have to format it like this or something:

/* ... Use this macro as (deleting the underscore characters):

  PCRE2_FALLTHROUGH /__* Fall through *__/
*/

Copy link
Contributor

@carenas carenas Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until recently, this codebase was C89 compatible, and // is not allowed in that standard.

C99 allows //, but we are meant to only use them as temporary placeholders, like "// TODO: fix this before releasing", which then are easy to spot.

…hope single-line comment is acceptable for this) and apply that guideline to all affected locations. if not supported by the compiler let the fallback be the same as for other macro defines.
@Gsus42 Gsus42 requested a review from carenas September 3, 2025 11:00
case OP_NOT_DIGIT:
invert_bits = TRUE;
/* Fall through */
PCRE2_FALLTHROUGH; // Fall through
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use C comments in this codebase, just like the one that was replaced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. I've changed that accordingly. The only thing that is a bit inconsistent is how it is written in the comment (as hinted in my last commit).

… to use the macro. note that some compilers might complain about /* ... /* .. */, such as clang which would yield error: '/*' within block comment [-Werror,-Wcomment]. I hope that is acceptable.
Copy link
Contributor

@carenas carenas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, don't forget the CI changes so that the warning is actually enabled

src/pcre2_util.h Outdated
For maximum portability, please use this macro together with a fall through comment:
PCRE2_FALLTHROUGH; // Fall through */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe something like:

... comment, for example: */

//  PCRE2_FALLTHROUGH; /* Fall through */

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good hint. Will adopt this change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind - my formatting (in other comment) or Carlo's is OK. You get to choose!

Copy link
Member

@NWilson NWilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, actually the JIT build needs fixing :(

@Gsus42
Copy link
Contributor Author

Gsus42 commented Sep 3, 2025

Oh wait, actually the JIT build needs fixing :(

Mh, I wonder why we get:

src/pcre2_util.h:140:27: error: ISO C does not support '[[]]' attributes before C23 [-Werror=pedantic]
  140 | #define PCRE2_FALLTHROUGH [[fallthrough]]

as I thought __has_c_attribute would check for that (see https://gcc.gnu.org/onlinedocs/cpp/_005f_005fhas_005fc_005fattribute.html). Mh?

@NWilson
Copy link
Member

NWilson commented Sep 3, 2025

Oh wait, actually the JIT build needs fixing :(

Mh, I wonder why we get:

src/pcre2_util.h:140:27: error: ISO C does not support '[[]]' attributes before C23 [-Werror=pedantic]
  140 | #define PCRE2_FALLTHROUGH [[fallthrough]]

as I thought __has_c_attribute would check for that (see https://gcc.gnu.org/onlinedocs/cpp/_005f_005fhas_005fc_005fattribute.html). Mh?

Apparently not. Perhaps:

#if defined(__has_c_attribute)
#if __has_c_attribute(fallthrough) && __STDC_VERSION__ >= 202311L
#define PCRE2_FALLTHROUGH [[fallthrough]]
#endif
#endif

It's apparently an error to use the [[fallthrough]] syntax if the compiler isn't using --std=c23. Fussy, but fair enough.

@Gsus42
Copy link
Contributor Author

Gsus42 commented Sep 3, 2025

Ah, for the clang compile we get:

src/pcre2grep.c:2058:3: error: [[]] attributes are a C23 extension [-Werror,-Wc23-extensions]
 2058 |   PCRE2_FALLTHROUGH; /* Fall through */

I guess we cannot have -Wc23-extensions as [[fallthrough]] is a C23 extension.
Can this be modified somehow, or should I rather stick with the original approach?

@carenas
Copy link
Contributor

carenas commented Sep 3, 2025

Ah, for the clang compile we get:

src/pcre2grep.c:2058:3: error: [[]] attributes are a C23 extension [-Werror,-Wc23-extensions]
 2058 |   PCRE2_FALLTHROUGH; /* Fall through */

I guess we cannot have -Wc23-extensions as [[fallthrough]] is a C23 extension. Can this be modified somehow, or should I rather stick with the original approach?

yes you can, but need to check for the C version that is being used first, see my comment on STDC_VERSION

…fallthrough__]] so there is no possibility of an accidental '#define fallthrough'.
@Gsus42
Copy link
Contributor Author

Gsus42 commented Sep 3, 2025

Okay, I've extended the check accordingly. I also replaced [[fallthrough]] by [[__fallthrough__]] as hinted above.

@Gsus42
Copy link
Contributor Author

Gsus42 commented Sep 3, 2025

Ah right, I forgot about the JIT support... this requires a few additional changes. Working on it...

@NWilson
Copy link
Member

NWilson commented Sep 3, 2025

But it won't be as easy as you hoped. Because one of the source files is in another repo! 😢

I guess we're going to have to make a (very similar) PR into zherczeg/sljit to do the same whole business over there.

Copy link
Contributor

@carenas carenas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only a few nitpicks, assume this is being tested other than the CI?

src/pcre2_util.h Outdated
#endif
#if defined(__has_attribute) && !defined(PCRE2_FALLTHROUGH)
#if __has_attribute(fallthrough)
#define PCRE2_FALLTHROUGH __attribute__((fallthrough))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can also use the underscored version of fallthrough here AFAIK

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not-underscored is canonical. But the standard pretty-much guarantees that __fallback__ must work too for back-compat. Don't care, they're effectively the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, it is just to protect against a #define fallthrough somewhere

src/pcre2_util.h Outdated

// PCRE2_FALLTHROUGH; /* Fall through */

#if defined(__has_c_attribute)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the order here seems wrong, IMHO we should check for the C standard first, and while we required an ANSI C compiler with C99 support, some really old ones didn't define this macro, so it is usually better to check with a substraction (an undefined macro == 0)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order of checks doesn't matter.

And, if __STDC_VERSION__ is not defined, the check evaluates to false, so this is a perfectly good way of doing the C23 test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, hereby you mean by checking whether __STDC_VERSION__ is defined at all?
Would it be okay to implement this as follows:

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
#if defined(__has_c_attribute)
#if __has_c_attribute(__fallthrough__)
#define PCRE2_FALLTHROUGH [[__fallthrough__]]
#endif
#endif
#endif

I guess this might work for future C-standards following C23, hence the >= check, wouldn't it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

practically, it doesn't make a difference as Nicholas pointed out, so I guess it is OK if the original is simpler to read, even if it feels backwards, but at least it is clear on the intent: "only use [[__fallthrough__]] if supported AND using C23"

@Gsus42
Copy link
Contributor Author

Gsus42 commented Sep 3, 2025

But it won't be as easy as you hoped. Because one of the source files is in another repo! 😢

I guess we're going to have to make a (very similar) PR into zherczeg/sljit to do the same whole business over there.

Indeed... I hope this finds similar acceptance.

@NWilson
Copy link
Member

NWilson commented Sep 3, 2025

Zoltan will take it (I think). It's a pat of PCRE2 that's been split out into a re-usable project. You can define SLJIT_FALLTHROUGH next to SLJIT_ASSERT, same as you defined PCRE2_FALLTHROUGH in the same file as PCRE2_ASSERT.

Gernot Gebhard added 2 commits September 3, 2025 14:17
…s inconsistent to, e.g., __attribute__((uninitialized)) from pcre2_internal.h (yet, making the usage of attributes consistent is IMHO outside the scope of this pull request).
@Gsus42
Copy link
Contributor Author

Gsus42 commented Sep 3, 2025

Are the recent changes okay for you? If so I'd then do a similar pull request for the SLJIT repository.

@NWilson
Copy link
Member

NWilson commented Sep 3, 2025

Are the recent changes okay for you? If so I'd then do a similar pull request for the SLJIT repository.

Yes. I won't randomise you on any cosmetic changes - it's perfectly fine as-is.

I'd be grateful if you did do the change in sljit.

@Gsus42
Copy link
Contributor Author

Gsus42 commented Sep 3, 2025

Are the recent changes okay for you? If so I'd then do a similar pull request for the SLJIT repository.

Yes. I won't randomise you on any cosmetic changes - it's perfectly fine as-is.

I'd be grateful if you did do the change in sljit.

The pull request has been placed:

zherczeg/sljit#329

--

One thing I still need to investigate is the failure for the manyconfig run:

In file included from src/pcre2_internal.h:2393,
                 from src/pcre2_study.c:46:
src/pcre2_study.c: In function 'set_start_bits':
src/pcre2_util.h:147:27: warning: a label can only be part of a statement and a declaration is not a statement [-Wpedantic]
  147 | #define PCRE2_FALLTHROUGH __attribute__((__fallthrough__))
       |                           ^~~~~~~~~~~~~
src/pcre2_study.c:1805:7: note: in expansion of macro 'PCRE2_FALLTHROUGH'
 1805 |       PCRE2_FALLTHROUGH; /* Fall through */
      |       ^~~~~~~~~~~~~~~~~

@NWilson
Copy link
Member

NWilson commented Sep 3, 2025

One thing I still need to investigate is the failure for the manyconfig run:

In file included from src/pcre2_internal.h:2393,
                 from src/pcre2_study.c:46:
src/pcre2_study.c: In function 'set_start_bits':
src/pcre2_util.h:147:27: warning: a label can only be part of a statement and a declaration is not a statement [-Wpedantic]
  147 | #define PCRE2_FALLTHROUGH __attribute__((__fallthrough__))
       |                           ^~~~~~~~~~~~~
src/pcre2_study.c:1805:7: note: in expansion of macro 'PCRE2_FALLTHROUGH'
 1805 |       PCRE2_FALLTHROUGH; /* Fall through */
      |       ^~~~~~~~~~~~~~~~~

That's nasty. That's a very vexing parse!

The compiler sees:

case 1:
__attribute__((fallthrough));
case 2:

The problem is that the attribute attached to the "null statement" is not actually parsed as a statement unless there's a statement before it. (Yeah, C...) It's ambiguous - because it could be a declaration of a variable (which isn't a legal target for a case label). To disambiguate, it's treated as a null statement only if it follows another statement ("statement context").

Here's the solution...

      case OP_NCLASS:
#if defined SUPPORT_UNICODE && PCRE2_CODE_UNIT_WIDTH == 8
      if (utf)
        {
        re->start_bitmap[24] |= 0xf0;            /* Bits for 0xc4 - 0xc8 */
        memset(re->start_bitmap+25, 0xff, 7);    /* Bits for 0xc9 - 0xff */
        }
#elif PCRE2_CODE_UNIT_WIDTH != 8
      SET_BIT(0xFF);                             /* For characters >= 255 */
#else
      ; /* Ensure case label interacts correctly with PCRE2_FALLTHROUGH
#endif
      PCRE2_FALLTHROUGH; /* Fall through */

@carenas
Copy link
Contributor

carenas commented Sep 3, 2025

Here's the solution...

#else
      ; /* Ensure case label interacts correctly with PCRE2_FALLTHROUGH */
#endif
      PCRE2_FALLTHROUGH; /* Fall through */

note, that it was missing a closing */ in the added "null statement", so it should look instead like the one above.

@PhilipHazel
Copy link
Collaborator

I have been entertained by watching this saga roll by ... it just goes to show how complicated the world has become, with multiple compilers and multiple standards. I could never have imagined this when I first wrote PCRE. One comment for the record: the // comment is not new, not even "new". BCPL has had it from its birth in the 1960s, pre-dating C. I always wondered why // got dropped when the BCPL => B => C evolution happened. (Another thing they dropped was "endcase", overloading "break" instead, which is sometimes a right nuisance.)

@NWilson
Copy link
Member

NWilson commented Sep 3, 2025

You can partly blame C++. There's a lot of "enterprise" tooling in that area, which gradually trickles down to C as well, although the C folks try very hard to keep alive the bygone era of simple stuff.

@Gsus42
Copy link
Contributor Author

Gsus42 commented Sep 4, 2025

One thing I still need to investigate is the failure for the manyconfig run:

In file included from src/pcre2_internal.h:2393,
                 from src/pcre2_study.c:46:
src/pcre2_study.c: In function 'set_start_bits':
src/pcre2_util.h:147:27: warning: a label can only be part of a statement and a declaration is not a statement [-Wpedantic]
  147 | #define PCRE2_FALLTHROUGH __attribute__((__fallthrough__))
       |                           ^~~~~~~~~~~~~
src/pcre2_study.c:1805:7: note: in expansion of macro 'PCRE2_FALLTHROUGH'
 1805 |       PCRE2_FALLTHROUGH; /* Fall through */
      |       ^~~~~~~~~~~~~~~~~

That's nasty. That's a very vexing parse!

The compiler sees:

case 1:
__attribute__((fallthrough));
case 2:

The problem is that the attribute attached to the "null statement" is not actually parsed as a statement unless there's a statement before it. (Yeah, C...) It's ambiguous - because it could be a declaration of a variable (which isn't a legal target for a case label). To disambiguate, it's treated as a null statement only if it follows another statement ("statement context").

Here's the solution...

      case OP_NCLASS:
#if defined SUPPORT_UNICODE && PCRE2_CODE_UNIT_WIDTH == 8
      if (utf)
        {
        re->start_bitmap[24] |= 0xf0;            /* Bits for 0xc4 - 0xc8 */
        memset(re->start_bitmap+25, 0xff, 7);    /* Bits for 0xc9 - 0xff */
        }
#elif PCRE2_CODE_UNIT_WIDTH != 8
      SET_BIT(0xFF);                             /* For characters >= 255 */
#else
      ; /* Ensure case label interacts correctly with PCRE2_FALLTHROUGH
#endif
      PCRE2_FALLTHROUGH; /* Fall through */

Mh, I see. Yet, here I'd rather move the fallthrough statement into the #if/#elif parts. Just adding an empty statement looks odd to me.

@Gsus42
Copy link
Contributor Author

Gsus42 commented Sep 5, 2025

I've closed my original pull request zherczeg/sljit#329. @carenas has refined that stuff in zherczeg/sljit#330 which is required before this one can finally be merged (unless further changes are required).

@Gsus42
Copy link
Contributor Author

Gsus42 commented Sep 8, 2025

Given that zherczeg/sljit#330 has been merged can you give the recent changes another try? Or do you expect further changes to be done?

@NWilson
Copy link
Member

NWilson commented Sep 8, 2025

Thank you for the reminder. I was away for a few days and didn't check in on this.

I will do an update of the sljit component (#791) and then we can merge this.

If there any tweaks I wish to make, I'll just push them to this branch and not trouble you with them.

@NWilson NWilson merged commit 7525365 into PCRE2Project:master Sep 12, 2025
35 checks passed
@NWilson
Copy link
Member

NWilson commented Sep 12, 2025

Many thanks again @Gsus42. I don't know how this turned into such an epic thread. That's not the normal process for contributions in PCRE2! Thank you for your patience.

@Gsus42
Copy link
Contributor Author

Gsus42 commented Sep 12, 2025

Many thanks for having this change! Indeed this was much larger than I imagined :).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants