@wbsg-uni-mannheim

Web-based Systems Group @ University of Mannheim

We explore technical and empirical questions concerning the development of global, decentralized information environments.

Pinned

  1. TabAnnGPT Public

    This repository contains code and data for reproducing the experiments of three papers that focus on two subtasks of table annotation: column type annotation (CTA) the task of annotating table colu…

    Python 12 2

  2. MatchGPT Public

    This repository contains code and extensive prompt examples to reproduce and extend the experiments in our papers "Using ChatGPT for Entity Matching" and "Entity Matching using Large Language Models".

    Jupyter Notebook 62 12

  3. ExtractGPT Public

    Attribute Value Extraction using Large Language Models

    Python 26 11

  4. wdcproducts Public

    This repository contains the code and data download links to reproduce building the WDC Products Benchmark.

    Python 13 4

  5. WebMall Public

    This repository contains the code and data of the WebMall benchmark for evaluating the capability of Web agents to find and compare product offers from multiple e-shops.

    HTML 4 4

  6. PyDI Public

    The PyDI framework provides methods for end-to-end data integration. The framework covers all steps of the integration process, including schema matching, data translation, entity matching, and dat…

    HTML 5

Repositories

Showing 10 of 33 repositories
  • PyDI Public

    The PyDI framework provides methods for end-to-end data integration. The framework covers all steps of the integration process, including schema matching, data translation, entity matching, and data fusion. The framework offers traditional string-based methods as well as modern LLM- and embedding-based techniques for these tasks.

    HTML 5 Apache-2.0 0 3 0 Updated Oct 29, 2025
  • WebMall Public

    This repository contains the code and data of the WebMall benchmark for evaluating the capability of Web agents to find and compare product offers from multiple e-shops.

    HTML 4 4 0 1 Updated Oct 17, 2025
  • AgentLab Public Forked from ServiceNow/AgentLab

    AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and reproducibility.

    Python 0 91 0 0 Updated Oct 17, 2025
  • SubsetCreatorJupyterNBs Public

    Jupyter notebooks used to create the schema.org subsets from the MD and JSON-LD corpus for the WDC 2020 structured data extraction.

    Python 3 1 0 0 Updated Oct 16, 2025
  • wdc-page Public

    This repository contains the source files of the Web Data Commons website and is used to maintain the site. The Web Data Commons project extracts structured data from the Common Crawl.

    HTML 1 1 0 0 Updated Oct 16, 2025
  • WebMall-Interfaces Public

    Modern LLM agents interact with the web through various architectures, from traditional browser automation to API-based approaches. This project provides implementation and evaluation code to systematically compare their effectiveness across 91 realistic e-commerce scenarios.

    Python 0 1 0 0 Updated Aug 15, 2025
  • BrowserGym Public Forked from ServiceNow/BrowserGym

    🌎💪 BrowserGym, a Gym environment for web task automation

    Python 0 136 0 0 Updated Jul 29, 2025
  • winter Public Forked from olehmberg/winter

    WInte.r is a Java framework for end-to-end data integration. The WInte.r framework implements well-known methods for data pre-processing, schema matching, identity resolution, data fusion, and result evaluation.

    Java 8 Apache-2.0 33 0 6 Updated Jul 12, 2025
  • TailorMatch Public

    This repository contains code and comprehensive examples to replicate and build upon the experiments presented in our paper “Fine-tuning Large Language Models for Entity Matching”. The repository provides resources for implementing fine-tuning techniques on large language models specifically for entity matching tasks.

    Jupyter Notebook 10 4 0 0 Updated Jun 17, 2025
  • TabAnnGPT Public

    This repository contains code and data for reproducing the experiments of three papers that focus on two subtasks of table annotation: column type annotation (CTA), the task of annotating table columns with semantic types from pre-defined vocabularies, and column property annotation (CPA), the task of annotating the relationships between two columns.

    Python 12 2 0 0 Updated Mar 5, 2025