Open
Description
The Entity Ruler fails to match a token when the REGEX pattern is a literal string. The sample value is an Italian fiscal code. Changing its last letter from K to another one, e.g. V, makes it recognizable as an entity. Expected behavior: the sample value should also be recognized as is.
How to reproduce the behaviour
import spacy

patterns = [
    {
        "label": "CF",
        "pattern": [
            {
                "TEXT": {
                    "REGEX": "BRRDTR65D02H224K",
                }
            }
        ],
    },
]
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler", name="ner_rules")
ruler.add_patterns(patterns)
text = "BRRDTR65D02H224K"
doc = nlp(text)
doc.ents  # empty, but should contain BRRDTR65D02H224K
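A possible way to narrow this down, offered as a diagnostic sketch and not a confirmed explanation: a token pattern with a single dict is tested against one token at a time, so it may help to check how the tokenizer segments the string. The snippet below assumes the same blank English pipeline as above; the variable names (regex, tokens, matches_any_token) are illustrative only. It first checks the regex against the whole string with Python's re module, then against each token separately.

```python
import re

import spacy

text = "BRRDTR65D02H224K"
regex = "BRRDTR65D02H224K"

# Sanity check: does the regex match the full string outside of spaCy?
print(re.search(regex, text) is not None)

# Inspect how the blank English tokenizer segments the string.
# Assumption: the ruler only sees these tokens, one per pattern dict.
nlp = spacy.blank("en")
tokens = [token.text for token in nlp(text)]
print(tokens)

# Check whether any single token, on its own, matches the regex.
matches_any_token = any(re.search(regex, t) is not None for t in tokens)
print(matches_any_token)
```

If more than one token is printed, no single token contains the full code, which would be consistent with the empty doc.ents. In that case nlp.tokenizer.explain(text) lists which tokenizer rule produced each piece.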
Your Environment
- spaCy version: 3.8.7
- Platform: Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.39
- Python version: 3.10.17
- Pipelines: it_core_news_lg (3.8.0)