RegExp memory efficiency: built-in negated character classes could use NegatedSet

For built-in negated character classes, we currently negate the input set manually and encode the full negated set, rather than checking that the set does NOT include a character. We already have logic for that check, except for the built-in character sets.

This has a minor-to-moderate unnecessary memory cost. We should use whichever set (negated or natural) contains the fewest characters, since that minimizes memory cost. Where possible, that also means selecting an ASCII-only set, which saves a lot of memory by avoiding the creation of the non-ASCII external data structure for a character set.

This means pushing the set negation check into the bytecode (for which we already have opcodes).
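
To make the proposal concrete, here is a rough sketch of the selection logic in Python. It is not ChakraCore code: `SetInstruction`, `choose_set`, and the cost ordering are hypothetical names and choices made for illustration, and the universe is assumed to be the UTF-16 code units `0x0000`-`0xFFFF`. The idea is that the stored set and a "negated" flag together describe the match, so the compiler is free to store whichever set is cheaper and let the instruction do the negation.

```python
from dataclasses import dataclass

MAX_CODE_UNIT = 0xFFFF  # assumption: sets range over UTF-16 code units
ASCII_LIMIT = 0x80      # code units below this are ASCII


@dataclass(frozen=True)
class SetInstruction:
    """Hypothetical stand-in for a set / negated-set bytecode instruction."""
    chars: frozenset   # the set actually stored
    negated: bool      # True: match when the character is NOT in `chars`

    def matches(self, ch: str) -> bool:
        return (ord(ch) in self.chars) != self.negated


def is_ascii_only(chars) -> bool:
    return all(c < ASCII_LIMIT for c in chars)


def choose_set(natural, negate: bool) -> SetInstruction:
    """Pick the cheaper of two equivalent encodings of a character class.

    `natural` is the set the class names (e.g. the digits for \\d and \\D);
    `negate` says whether the pattern asks for its complement.
    """
    natural = frozenset(natural)
    complement = frozenset(range(MAX_CODE_UNIT + 1)) - natural
    # Both candidates match exactly the same characters.
    candidates = [(natural, negate), (complement, not negate)]
    # Assumed cost order: prefer an ASCII-only set, then the smaller set.
    chars, negated = min(
        candidates, key=lambda c: (not is_ascii_only(c[0]), len(c[0]))
    )
    return SetInstruction(chars=chars, negated=negated)


DIGITS = {ord(c) for c in "0123456789"}
not_digit = choose_set(DIGITS, negate=True)  # models \D
```

Under these assumptions, `\D` is stored as the ten ASCII digits plus a negated flag, rather than as the 65,526 other code units, so no non-ASCII structure would be needed for it.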

See https://github.com/Microsoft/ChakraCore/pull/5592#discussion_r211013233

RegExp memory efficiency: built-in negated character classes could use NegatedSet #5633