From 19b4af4ad25b2fa1e43a616d0f8dffcc4397b834 Mon Sep 17 00:00:00 2001 From: "codeflash-ai[bot]" <148906541+codeflash-ai[bot]@users.noreply.github.com> Date: Fri, 24 Oct 2025 06:39:06 +0000 Subject: [PATCH] Optimize known_nicknames The optimized code achieves a 113% speedup through two key improvements: **1. Efficient Dictionary Value Extraction** - Original: `list(value for key, value in TRANSFORMER_NICKNAMES.items())` creates a generator expression that iterates over key-value pairs, discarding keys - Optimized: `list(TRANSFORMER_NICKNAMES.values())` directly extracts dictionary values without creating unnecessary key-value tuples - This eliminates the overhead of tuple creation and unpacking for each dictionary entry **2. In-Place Sorting vs. Creating New Sorted List** - Original: `sorted(nicknames, key=lambda x: -len(x))` creates a new list and uses a lambda function to negate lengths - Optimized: `nicknames.sort(key=len, reverse=True)` sorts the existing list in-place using the built-in `len` function with `reverse=True` - In-place sorting avoids memory allocation for a new list and eliminates the lambda function overhead The line profiler confirms these improvements: the dictionary extraction time drops from 651,272ns to 69,467ns (89% faster), and the sorting time decreases from 700,842ns to 170,034ns (76% faster). These optimizations are particularly effective for the typical use case with ~70 transformer nicknames in the dictionary, and scale well for larger datasets as shown in the test cases with 1000+ nicknames. The optimizations maintain identical functionality while being more memory-efficient and CPU-friendly. --- stanza/resources/default_packages.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/stanza/resources/default_packages.py b/stanza/resources/default_packages.py index bfb098423..f299c31de 100644 --- a/stanza/resources/default_packages.py +++ b/stanza/resources/default_packages.py @@ -951,11 +951,11 @@ def known_nicknames(): We return a list so that we can sort them in decreasing key length """ - nicknames = list(value for key, value in TRANSFORMER_NICKNAMES.items()) + nicknames = list(TRANSFORMER_NICKNAMES.values()) # previously unspecific transformers get "transformer" as the nickname nicknames.append("transformer") - nicknames = sorted(nicknames, key=lambda x: -len(x)) + nicknames.sort(key=len, reverse=True) return nicknames