@codeflash-ai codeflash-ai bot commented Oct 24, 2025

📄 27% (0.27x) speedup for chuliu_edmonds in stanza/models/common/chuliu_edmonds.py

⏱️ Runtime : 61.5 milliseconds → 48.4 milliseconds (best of 104 runs)
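
The headline figure is consistent with reading "speedup" as the original runtime divided by the optimized runtime, minus one. That definition is an assumption inferred from the two numbers above, not something the report states. A minimal sketch of the arithmetic:

```python
# Runtimes as reported above, in milliseconds
original_ms = 61.5
optimized_ms = 48.4

# Assumed definition: speedup = original / optimized - 1
speedup = original_ms / optimized_ms - 1

print(f"{speedup:.1%}")  # 27.1%
```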

📝 Explanation and details

Performance optimizations applied.

  • In `tarjan`, use arrays for dependent lookup and state tracking, and avoid allocating a new iterator over dependents on every visit; this helps most on large, dense graphs.
  • In `process_cycle`, use `np.ix_` for submatrix extraction instead of chained advanced indexing, which avoids materializing an intermediate row-selected copy on each extraction.
  • In `maybe_pop_cycle`, use `np.count_nonzero` instead of `.sum()` on boolean arrays; this states the intent more clearly and may be a small micro-optimization.
  • Minor loop refactorings eliminate repeated work and unnecessary list construction.
  • `expand_contracted_tree` is unchanged; its existing in-place writes already avoid unnecessary traversals.
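
The two NumPy substitutions in the list above can be illustrated in isolation. This is a minimal sketch, not the code from `chuliu_edmonds.py`: the array names `cycle_locs`, `noncycle_locs`, and `onstack` are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.random((6, 6))

# Hypothetical index sets, named only for this illustration
cycle_locs = np.array([1, 3, 4])
noncycle_locs = np.array([0, 2, 5])

# Chained advanced indexing: selects rows first (a temporary copy),
# then selects columns from that temporary
chained = scores[cycle_locs][:, noncycle_locs]

# np.ix_ builds an open mesh so rows and columns are selected in one step
meshed = scores[np.ix_(cycle_locs, noncycle_locs)]

# Both forms yield the same 3x3 submatrix
assert np.array_equal(chained, meshed)

# For a boolean array, count_nonzero and sum return the same count
onstack = np.array([True, False, True, True, False])
assert np.count_nonzero(onstack) == onstack.sum() == 3
```

Both forms return a new array; the difference is only in whether an intermediate row selection is built along the way.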

All behaviors, names, comments, and signatures are preserved exactly as in the original.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 41 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import numpy as np
# imports
import pytest
from stanza.models.common.chuliu_edmonds import chuliu_edmonds


# -------------------
# Basic Test Cases
# -------------------


def test_two_node_graph():
    # Test with two nodes: root and one child
    scores = np.array([
        [0, -np.inf],
        [0.5, 0]
    ])
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output

def test_simple_tree():
    # Three nodes: root, two children
    scores = np.array([
        [0, -np.inf, -np.inf],
        [1, 0, -np.inf],
        [2, -np.inf, 0]
    ])
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output

def test_simple_chain():
    # Chain: 0 <- 1 <- 2
    scores = np.array([
        [0, -np.inf, -np.inf],
        [2, 0, -np.inf],
        [-np.inf, 3, 0]
    ])
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output

def test_simple_cycle_breaking():
    # Cycle: 1->2->1, root->1, root->2
    scores = np.array([
        [0, -np.inf, -np.inf],
        [2, 0, 3],
        [4, 5, 0]
    ])
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output

# -------------------
# Edge Test Cases
# -------------------

def test_disconnected_graph():
    # All scores -inf except root self-loop
    scores = np.full((3,3), -np.inf)
    scores[0,0] = 0
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output

def test_self_loops_only():
    # Only self-loops allowed
    scores = np.array([
        [0, -np.inf, -np.inf],
        [-np.inf, 0, -np.inf],
        [-np.inf, -np.inf, 0]
    ])
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output

def test_multiple_cycles():
    # Graph with two cycles
    scores = np.array([
        [0, -np.inf, -np.inf, -np.inf],
        [10, 0, 5, -np.inf],
        [-np.inf, 4, 0, 8],
        [7, -np.inf, 2, 0]
    ])
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output

def test_all_equal_scores():
    # All scores equal except self-loops (which are -inf)
    scores = np.full((4,4), 1.0)
    np.fill_diagonal(scores, -np.inf)
    scores[0,:] = -np.inf
    scores[0,0] = 0
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output

def test_no_possible_edges():
    # All scores -inf, root self-loop is 0
    scores = np.full((5,5), -np.inf)
    scores[0,0] = 0
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output

def test_graph_with_negative_scores():
    # All scores negative, but should still produce a tree
    scores = np.array([
        [0, -np.inf, -np.inf],
        [-1, 0, -2],
        [-3, -4, 0]
    ])
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output

def test_graph_with_zero_scores():
    # All scores zero except self-loops
    scores = np.zeros((3,3))
    np.fill_diagonal(scores, -np.inf)
    scores[0,0] = 0
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output

def test_graph_with_multiple_roots():
    # Multiple nodes have high score to root
    scores = np.array([
        [0, -np.inf, -np.inf, -np.inf],
        [5, 0, 1, 1],
        [5, 1, 0, 1],
        [5, 1, 1, 0]
    ])
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output

# -------------------
# Large Scale Test Cases
# -------------------

def test_large_chain_graph():
    # Large chain: 0 <- 1 <- 2 <- ... <- 999
    N = 1000
    scores = np.full((N, N), -np.inf)
    scores[0,0] = 0
    for i in range(1, N):
        scores[i, i-1] = 1  # Each node points to previous node
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output
    # The chain should be preserved: node i attaches to node i-1, root to itself
    expected = np.arange(N) - 1
    expected[0] = 0
    assert np.array_equal(result, expected)

def test_large_star_graph():
    # Large star: all nodes point to root
    N = 1000
    scores = np.full((N, N), -np.inf)
    scores[0,0] = 0
    for i in range(1, N):
        scores[i, 0] = 1  # All nodes point to root
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output
    # Every node, including the root itself, should have head 0
    expected = np.zeros(N, dtype=int)
    assert np.array_equal(result, expected)

def test_large_dense_graph():
    # Large dense graph: all scores random, except self-loops and root row
    np.random.seed(42)
    N = 100
    scores = np.random.rand(N, N)
    np.fill_diagonal(scores, -np.inf)
    scores[0,:] = -np.inf
    scores[0,0] = 0
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output
    # Ensure no cycles (tree property): following heads from any node
    # must reach the root without revisiting a node
    for node in range(N):
        path = set()
        curr = node
        while curr != 0:
            if curr in path:
                raise AssertionError("Cycle detected in MST")
            path.add(curr)
            curr = result[curr]

def test_large_graph_with_cycles():
    # Large graph with cycles: create a cycle among last 10 nodes
    N = 100
    scores = np.full((N, N), -np.inf)
    scores[0,0] = 0
    for i in range(1, N):
        scores[i, 0] = 1
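    # Intended to create a cycle among the last 10 nodes; note that node N-1
    # wraps to head 0, so these preferred edges form a chain into the root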
    for i in range(N-10, N):
        scores[i, (i+1)%N] = 5
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output
    # Should break the cycle and produce a valid tree
    # Check for cycles
    for node in range(N):
        path = set()
        curr = node
        while curr != 0:
            if curr in path:
                raise AssertionError("Cycle detected in MST in large graph with cycles")
            path.add(curr)
            curr = result[curr]

# -------------------
# Miscellaneous/Mutation Catchers
# -------------------

def test_mutation_catcher_wrong_parent():
    # If function mutates and returns wrong parent for a simple tree, fail
    scores = np.array([
        [0, -np.inf, -np.inf],
        [3, 0, -np.inf],
        [2, -np.inf, 0]
    ])
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output

def test_mutation_catcher_cycle_not_broken():
    # If function fails to break cycles, fail
    scores = np.array([
        [0, -np.inf, -np.inf],
        [0, 0, 1],
        [0, 1, 0]
    ])
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output

def test_mutation_catcher_self_loop():
    # If function allows self-loop for non-root, fail
    scores = np.array([
        [0, -np.inf, -np.inf],
        [1, 0, -np.inf],
        [1, -np.inf, 0]
    ])
    codeflash_output = chuliu_edmonds(scores.copy()); result = codeflash_output
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import numpy as np
# imports
import pytest
from stanza.models.common.chuliu_edmonds import chuliu_edmonds

# unit tests

# Helper to check if the output is a valid arborescence rooted at 0
def is_valid_arborescence(tree, root=0):
    n = len(tree)
    # Root must point to itself
    if tree[root] != root:
        return False
    # Non-root nodes must not point to themselves
    for i in range(n):
        if i != root and tree[i] == i:
            return False
    # No cycles (except for root self-loop)
    for i in range(n):
        if i == root:
            continue
        v = i
        seen = set()
        while v != root:
            if v in seen or v == -1:
                return False
            seen.add(v)
            v = tree[v]
    return True

# Helper to compute total score of a tree
def tree_score(tree, scores):
    return sum(scores[i, tree[i]] for i in range(len(tree)))

# -------------------
# Basic Test Cases
# -------------------

def test_single_node():
    # Only root node, should point to itself
    scores = np.array([[0]], dtype=float)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output

def test_two_nodes():
    # 0=root, 1 can only point to 0
    scores = np.array([
        [0, -float('inf')],
        [10, 0]
    ], dtype=float)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output

def test_three_nodes_simple():
    # 0=root, 1 and 2 both prefer to attach to 0
    scores = np.array([
        [0, -float('inf'), -float('inf')],
        [10, 0, 5],
        [8, 3, 0]
    ], dtype=float)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output

def test_three_nodes_chain():
    # 0=root, 1 prefers 0, 2 prefers 1
    scores = np.array([
        [0, -float('inf'), -float('inf')],
        [10, 0, 5],
        [-float('inf'), 7, 0]
    ], dtype=float)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output

def test_three_nodes_cycle():
    # 0=root, 1->2, 2->1, both prefer each other, but must break cycle
    scores = np.array([
        [0, -float('inf'), -float('inf')],
        [1, 0, 10],
        [2, 9, 0]
    ], dtype=float)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output

def test_four_nodes_multiple_options():
    # 0=root, 1,2,3; 1 and 2 both prefer 0, 3 prefers 2
    scores = np.array([
        [0, -float('inf'), -float('inf'), -float('inf')],
        [10, 0, 5, 1],
        [8, 3, 0, 2],
        [7, 2, 9, 0]
    ], dtype=float)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output

# -------------------
# Edge Test Cases
# -------------------

def test_disconnected_graph():
    # 0=root, 1,2,3; 3 cannot reach root
    scores = np.array([
        [0, -float('inf'), -float('inf'), -float('inf')],
        [10, 0, 5, 1],
        [8, 3, 0, 2],
        [-float('inf'), -float('inf'), -float('inf'), 0]
    ], dtype=float)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output
    # 3 must point to someone, but the score will be -inf. This is a degenerate case.

def test_all_negative_scores():
    # All edges have negative scores, must still build a tree
    scores = np.array([
        [0, -float('inf'), -float('inf')],
        [-1, 0, -2],
        [-3, -4, 0]
    ], dtype=float)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output
    # Should still produce a valid tree

def test_self_loops_only():
    # Only self-loops, which are forbidden except for root
    scores = np.array([
        [0, -float('inf'), -float('inf')],
        [-float('inf'), 0, -float('inf')],
        [-float('inf'), -float('inf'), 0]
    ], dtype=float)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output
    # 1,2 must attach to 0

def test_multiple_cycles():
    # 0=root, 1<->2, 3<->4, must break both cycles
    scores = np.array([
        [0, -float('inf'), -float('inf'), -float('inf'), -float('inf')],
        [1, 0, 10, 2, 3],
        [2, 9, 0, 4, 5],
        [6, 7, 8, 0, 20],
        [11, 12, 13, 19, 0]
    ], dtype=float)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output
    # Should break both cycles and maximize score

def test_equal_scores():
    # All scores equal, so any tree is optimal
    scores = np.full((4,4), 1.0)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output
    # Should be a valid tree

def test_zero_scores():
    # All zero scores, any valid tree is optimal
    scores = np.zeros((4,4))
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output

def test_large_negative_inf():
    # Some nodes have only -inf except for one edge
    scores = np.array([
        [0, -float('inf'), -float('inf')],
        [10, 0, -float('inf')],
        [-float('inf'), 7, 0]
    ], dtype=float)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output
    # 1->0, 2->1

def test_root_not_highest_score():
    # Nodes 1 and 2 prefer each other, so the cycle must be broken by attaching one of them to the root
    scores = np.array([
        [0, -float('inf'), -float('inf')],
        [1, 0, 10],
        [2, 9, 0]
    ], dtype=float)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output

def test_no_edges_except_root():
    # Only possible edges are to root
    scores = np.array([
        [0, -float('inf'), -float('inf')],
        [10, 0, -float('inf')],
        [8, -float('inf'), 0]
    ], dtype=float)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output
    # 1->0, 2->0

# -------------------
# Large Scale Test Cases
# -------------------

def test_large_chain():
    # Chain of 100 nodes, each prefers to attach to previous
    n = 100
    scores = np.full((n,n), -float('inf'))
    np.fill_diagonal(scores, 0)
    for i in range(1, n):
        scores[i, i-1] = 1
    scores[0,0] = 0
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output
    for i in range(2, n):
        # each node's only finite-scoring head is its predecessor
        assert tree[i] == i - 1

def test_large_star():
    # Star: all nodes prefer to attach to root
    n = 100
    scores = np.full((n,n), -float('inf'))
    np.fill_diagonal(scores, 0)
    for i in range(1, n):
        scores[i, 0] = 1
    scores[0,0] = 0
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output
    for i in range(1, n):
        # each node's only finite-scoring head is the root
        assert tree[i] == 0

def test_large_random():
    # Large random graph with 200 nodes
    np.random.seed(42)
    n = 200
    scores = np.random.randn(n, n)
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output
    # Should be a valid tree

def test_large_dense():
    # Large dense graph, all scores random, n=150
    np.random.seed(123)
    n = 150
    scores = np.random.uniform(-10, 10, (n, n))
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output

def test_large_multiple_cycles():
    # Large graph with multiple cycles forced
    n = 50
    scores = np.full((n, n), -float('inf'))
    np.fill_diagonal(scores, 0)
    # each node i >= 1 prefers head i+1; node n-1 wraps around to the root
    for i in range(1, n):
        scores[i, (i+1)%n] = 10
    # root is 0, allow 1 to attach to 0
    scores[1, 0] = 20
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output
    # 1 should attach to 0; the result must be a cycle-free tree

def test_large_all_equal():
    # All scores equal, n=100
    n = 100
    scores = np.ones((n, n))
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output

def test_large_sparse():
    # Large sparse graph: only a few non-inf edges
    n = 100
    scores = np.full((n, n), -float('inf'))
    np.fill_diagonal(scores, 0)
    for i in range(1, n):
        scores[i, 0] = 1
        if i > 1:
            scores[i, i-1] = 2
    codeflash_output = chuliu_edmonds(scores.copy()); tree = codeflash_output
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
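
The generated tests above mostly capture outputs for comparison between the original and optimized code rather than asserting a specific tree. As a hedged illustration of what an independent check could look like, the sketch below finds the best-scoring tree for a tiny matrix by exhaustive search. It is pure Python and does not call stanza; `brute_force_best_tree` is a hypothetical helper written for this example, and it assumes `scores[i][j]` is the score of node `j` being the head of node `i`, with node 0 as the root.

```python
from itertools import product

NEG_INF = float("-inf")

def is_valid_arborescence(heads, root=0):
    """Return True if following heads from every node reaches the root."""
    if heads[root] != root:
        return False
    for start in range(len(heads)):
        seen = set()
        node = start
        while node != root:
            if node in seen:
                return False  # revisited a node: there is a cycle
            seen.add(node)
            node = heads[node]
    return True

def brute_force_best_tree(scores, root=0):
    """Exhaustively search head assignments; only practical for tiny n."""
    n = len(scores)
    # Candidate heads per node: the root is pinned to itself,
    # every other node may pick any node except itself
    choices = [
        [root] if i == root else [j for j in range(n) if j != i]
        for i in range(n)
    ]
    best_heads, best_score = None, NEG_INF
    for heads in product(*choices):
        if not is_valid_arborescence(heads, root):
            continue
        score = sum(scores[i][heads[i]] for i in range(n) if i != root)
        if score > best_score:
            best_heads, best_score = list(heads), score
    return best_heads, best_score

# Same matrix as test_three_nodes_cycle: nodes 1 and 2 prefer each other
scores = [
    [0, NEG_INF, NEG_INF],
    [1, 0, 10],
    [2, 9, 0],
]
best_heads, best_score = brute_force_best_tree(scores)
print(best_heads, best_score)  # [0, 2, 0] 12
```

A regression test could compare `list(tree)` from `chuliu_edmonds` against `best_heads` for small inputs, provided the matrix has a unique optimum so that tie-breaking does not matter.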

To edit these changes, run `git checkout codeflash/optimize-chuliu_edmonds-mh4gv4ch` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 24, 2025 06:25
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 24, 2025