@czosel
Collaborator

czosel commented May 26, 2022

This is a first naive attempt to fix #896. There are probably several cases that I missed, but I'd like to get some feedback - is this something that can be solved in the lexer instead?

czosel requested a review from MaartenStaa May 26, 2022 15:29
@MaartenStaa
Collaborator

MaartenStaa commented May 27, 2022

@czosel I think this is something we should try to solve in the lexer instead. As an example, given the following PHP code:

enum foo {}
class enum {}

this is the output from token_get_all:

T_ENUM#336 ('enum')
T_WHITESPACE#392 (' ')
T_STRING#262 ('foo')
T_WHITESPACE#392 (' ')
string(1) "{"
string(1) "}"
T_WHITESPACE#392 ('
')
T_CLASS#333 ('class')
T_WHITESPACE#392 (' ')
T_STRING#262 ('enum')
T_WHITESPACE#392 (' ')
string(1) "{"
string(1) "}"

In other words, whether the string "enum" is interpreted as T_STRING or T_ENUM seems to be contextual. It looks like PHP handles this in the scanner by looking ahead at the characters that follow the keyword:

// php/php-src/Zend/zend_language_scanner.l
/*
 * The enum keyword must be followed by whitespace and another identifier.
 * This avoids the BC break of using enum in classes, namespaces, functions and constants.
 */
<ST_IN_SCRIPTING>"enum"{WHITESPACE}("extends"|"implements") {
	yyless(4);
	RETURN_TOKEN_WITH_STR(T_STRING, 0);
}
<ST_IN_SCRIPTING>"enum"{WHITESPACE}[a-zA-Z_\x80-\xff] {
	yyless(4);
	RETURN_TOKEN_WITH_IDENT(T_ENUM);
}
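
To make the lookahead concrete, here is a rough Python sketch of those two rules. It is only an illustration, not the php-src or php-parser implementation: the helper name classify_enum and the regex names are made up for this example, the whitespace class is assumed to be spaces, tabs and newlines, and any non-ASCII character is treated as an identifier start to approximate the \x80-\xff byte range.

```python
import re

# Hypothetical sketch: none of these names exist in php-src or php-parser.
# Assumption: WHITESPACE means one or more of space, tab, CR, LF.
_WS = r"[ \n\r\t]+"

# Rule 1: "enum" followed by "extends"/"implements" is an ordinary name
# (e.g. `class enum extends Foo {}`), so it stays a T_STRING.
_ENUM_AS_NAME = re.compile(r"enum" + _WS + r"(?:extends|implements)", re.IGNORECASE)

# Rule 2: "enum" followed by whitespace and the start of an identifier
# is the keyword. Non-ASCII characters stand in for bytes \x80-\xff.
_ENUM_AS_KEYWORD = re.compile(r"enum" + _WS + r"(?:[a-zA-Z_]|[^\x00-\x7f])", re.IGNORECASE)


def classify_enum(source: str, pos: int) -> str:
    """Classify the word "enum" that starts at source[pos].

    Only the four characters of "enum" belong to the token; everything
    else matched here is lookahead, like yyless(4) in the scanner.
    """
    if _ENUM_AS_NAME.match(source, pos):
        return "T_STRING"
    if _ENUM_AS_KEYWORD.match(source, pos):
        return "T_ENUM"
    # Anything else (e.g. "enum {" or "enum(") is a plain identifier.
    return "T_STRING"
```

With the example from above, classify_enum("enum foo {}", 0) gives T_ENUM, while the "enum" in "class enum {}" (offset 6) stays a T_STRING because it is followed by "{" rather than an identifier.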

@czosel
Collaborator Author

czosel commented May 27, 2022

@MaartenStaa I had a feeling this would be the way to go, but I didn't check how PHP itself handles this, thanks! 👍 I'm not sure I'll find the time to implement this in the coming days, so if you (or anyone else) want to take over, that's totally fine by me 😉

@czosel
Collaborator Author

czosel commented May 28, 2022

Superseded by #941

czosel closed this May 28, 2022

Development

Successfully merging this pull request may close these issues.

Enum is treated as a reserved word, even before PHP 8.1