LLMs for Graph Repair

Code for the research paper "Graph Repairs with Large Language Models: An Empirical Study" published in GRADES-NDA '25: Proceedings of the 8th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA) co-located with SIGMOD 2025.

DOI: https://dl.acm.org/doi/10.1145/3735546.3735859
arXiv: https://arxiv.org/abs/2507.03410
GRADES-NDA Proceedings: https://dl.acm.org/doi/proceedings/10.1145/3735546

Components

connect.py: Defines the Graph class, which manages the connection to a Neo4j graph database
dataset.py: Defines the GraphDataset class, which builds on Graph to provide functions for loading data
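
As a rough illustration of what a connection wrapper of this kind can look like, here is a minimal sketch built on the official neo4j Python driver. The names Neo4jConfig and GraphConnection, the default URI, and the credentials are all assumptions made for this example; they are not taken from connect.py.

```python
from dataclasses import dataclass


@dataclass
class Neo4jConfig:
    """Hypothetical connection settings (placeholders, not from connect.py)."""
    uri: str = "bolt://localhost:7687"
    user: str = "neo4j"
    password: str = "password"


class GraphConnection:
    """Hypothetical stand-in for a class that manages a Neo4j connection."""

    def __init__(self, config: Neo4jConfig):
        self.config = config
        self._driver = None  # created lazily on the first query

    def _get_driver(self):
        if self._driver is None:
            # Imported here so this module loads even without the driver installed.
            from neo4j import GraphDatabase
            self._driver = GraphDatabase.driver(
                self.config.uri,
                auth=(self.config.user, self.config.password),
            )
        return self._driver

    def run(self, cypher: str, **params):
        """Run one Cypher query and return the result records as plain dicts."""
        with self._get_driver().session() as session:
            return [record.data() for record in session.run(cypher, **params)]

    def close(self):
        """Release the driver if one was created."""
        if self._driver is not None:
            self._driver.close()
            self._driver = None
```

With a running database, a call such as GraphConnection(Neo4jConfig()).run("MATCH (n) RETURN count(n) AS n") would return a single-row list. A dataset class could then build on this wrapper in the way the entry for dataset.py describes.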

add_inconsistency_synthea.py: Adds controlled inconsistencies to the Synthea dataset
dataset_synthea.py: Queries for the Synthea dataset
load_synthea.py: Loads the Synthea dataset

graph.py: Extends networkx.DiGraph to define the PropertyGraph class
inconsistency.py: Finds inconsistencies and stores them in a pickle file in PropertyGraph format
encoding.py: Provides functions for computing text representations of a property graph (PG)
llm.py: Provides functions for connecting to LLMs, sending questions, and receiving answers
machine_repair.py: Asks an LLM to repair the graph
response_statistics.py: Prepares response statistics
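
To make the idea of a text representation of a property graph concrete, the sketch below renders nodes and edges as Cypher-style pattern strings. It is only an illustration: it uses plain dictionaries instead of the PropertyGraph class to stay dependency-free, and the function names, node labels, and property keys are hypothetical rather than taken from encoding.py.

```python
import json


def format_properties(properties):
    """Render a property dict as a Cypher-style map, or '' if it is empty."""
    if not properties:
        return ""
    items = ", ".join(
        f"{key}: {json.dumps(value)}" for key, value in sorted(properties.items())
    )
    return " {" + items + "}"


def encode_node(node_id, label, properties=None):
    """Encode one node as (id:Label {props})."""
    return f"({node_id}:{label}{format_properties(properties)})"


def encode_edge(source, rel_type, target, properties=None):
    """Encode one directed edge as (source)-[:TYPE {props}]->(target)."""
    return f"({source})-[:{rel_type}{format_properties(properties)}]->({target})"


def encode_graph(nodes, edges):
    """Encode all nodes, then all edges, one pattern per line.

    nodes: dict mapping node id -> {"label": str, "properties": dict}
    edges: list of (source, rel_type, target, properties) tuples
    """
    lines = [
        encode_node(node_id, data["label"], data.get("properties"))
        for node_id, data in nodes.items()
    ]
    lines += [encode_edge(s, rel, t, props) for s, rel, t, props in edges]
    return "\n".join(lines)
```

A string produced by encode_graph can be embedded directly in a prompt. Other encodings, such as a natural-language template, would swap out encode_node and encode_edge while keeping the same traversal.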

Pipeline

Load the dataset using python3 load_synthea.py
Find inconsistencies using python3 inconsistency.py
Set the repair parameters in machine_repair.py
Query the LLMs for graph repairs using python3 machine_repair.py
Prepare response statistics (tables and plots) using python3 response_statistics.py
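
The script-based steps above can also be chained from a small driver. The following is a minimal sketch, assuming all four scripts sit in the current working directory and take no command-line arguments; build_commands and run_pipeline are hypothetical helpers, not part of the repository. Setting the repair parameters in machine_repair.py remains a manual edit before the run.

```python
import subprocess
import sys

# Scripts in the order given by the pipeline description.
PIPELINE = [
    "load_synthea.py",
    "inconsistency.py",
    "machine_repair.py",
    "response_statistics.py",
]


def build_commands(python=sys.executable, scripts=PIPELINE):
    """Return one argument list per pipeline step."""
    return [[python, script] for script in scripts]


def run_pipeline(commands):
    """Run each step in order; stop at the first step that fails."""
    for command in commands:
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

Calling run_pipeline(build_commands()) from the repository root would execute the steps in sequence. Because check=True aborts on the first failure, a later step never runs against missing or partial output from an earlier one.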

License

This project is licensed under the terms of the GNU General Public License v3.0. See the LICENSE file for details.

How to Cite?

@inproceedings{10.1145/3735546.3735859,
author = {Terdalkar, Hrishikesh and Bonifati, Angela and Mauri, Andrea},
title = {Graph Repairs with Large Language Models: An Empirical Study},
year = {2025},
isbn = {9798400719233},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3735546.3735859},
doi = {10.1145/3735546.3735859},
abstract = {Property graphs are widely used in domains such as healthcare, finance, and social networks, but they often contain errors due to inconsistencies, missing data, or schema violations. Traditional rule-based and heuristic-driven graph repair methods are limited in their adaptability as they need to be tailored for each dataset. On the other hand, interactive human-in-the-loop approaches may become infeasible when dealing with large graphs, as the cost, both in terms of time and effort, of involving users becomes too high. Recent advancements in Large Language Models (LLMs) present new opportunities for automated graph repair by leveraging contextual reasoning and their access to real-world knowledge. We evaluate the effectiveness of six open-source LLMs in repairing property graphs. We assess repair quality, computational cost, and model-specific performance. Our experiments show that LLMs have the potential to detect and correct errors, with varying degrees of accuracy and efficiency. We discuss the strengths, limitations, and challenges of LLM-driven graph repair and outline future research directions for improving scalability and interpretability.},
booktitle = {Proceedings of the 8th Joint Workshop on Graph Data Management Experiences \& Systems (GRADES) and Network Data Analytics (NDA)},
articleno = {9},
numpages = {10},
keywords = {Graph Repair, Large Language Models, Property Graphs},
location = {Berlin, Germany},
series = {GRADES-NDA '25}
}


hrishikeshrt/LLM-Graph-Repair
