This repository hosts the prototype code and supplementary materials for the paper published at ICSOC 2025 (the 23rd International Conference on Service-Oriented Computing). The paper is titled:
P-MDP: A Framework to Optimize NFPs of Business Processes in Uncertain Environments.
The supplementary materials include in-depth technical specifications and formal mathematical definitions omitted from the main paper, providing essential details for implementing and validating the P-MDP (Process-aware Markov Decision Process) framework.
To support experimentation and validation of the P-MDP framework, three datasets are included or referenced in this repository:
- WS-DREAM QoS Dataset 2: originates from https://wsdream.github.io and includes real-world QoS evaluation results from 339 users on 5,825 Web services.
- QWS Datasets: originate from https://qwsdata.github.io and include a set of 2,507 web services and their Quality of Web Service (QWS) measurements.
- TravelAgencyNFPs DataSet: A synthetic dataset created by the authors (described in the paper) for testing NFP modeling with temporal (e.g., weekend premiums), commercial (e.g., bundled discounts), and star-price correlated features. It covers 15 days of data for 20 flights, 40 trains, and 60 hotels (100 records each for uncertainty simulation).
A set of BPMN business processes is included for testing, with details below:
The running example process features gateways (parallel/exclusive) and user-defined functional/non-functional properties (FPs/NFPs). It models travel arrangement processes (e.g., booking flights, hotels) with runtime uncertainties (e.g., price fluctuations, transit delays).
Processes prefixed with CSSC* (stored in `./BPMN_Models/Benchmark_experiment_models`) are serial processes designed for benchmark comparisons with CSSC-MDP. They consist of sequential abstract services, aligning with the setup of prior MDP-based service composition frameworks.
Processes in `./BPMN_Models/QWS/` are designed to evaluate the complex user-defined NFPs supported by P-MDP, using the QWS Datasets (2,507 web services with 8 core Quality of Service (QoS) measurements). These sequential workflows focus specifically on advancing beyond traditional MDP limitations in modeling complex metrics:
- Standard QoS Metrics (with basic aggregations):
  - $RT$ (Response Time), $LA$ (Latency), $CO$ (Compliance), $BP$ (Best Practices) – aggregated via summation ($\sum$)
  - $AV$ (Availability), $SU$ (Success Rate), $RE$ (Reliability) – aggregated via product ($\prod$)
  - $TP$ (Throughput) – aggregated via minimum ($\min$)
- Custom User-Defined KPIs (with complex operators, unattainable via traditional MDP methods; see the Python sketch after this list):
  - $LTS$ (Latency-Throughput Score): $LTS = \frac{TP}{\max(1-RT,\ 1-LA)+\epsilon}$
  - $SHI$ (Service Health Index): $SHI = \sqrt{Stability \times Compliance}$, where $Stability = \sqrt[10]{AV} \times \sqrt[10]{SU} \times \sqrt[10]{RE}$ and $Compliance = \frac{CO+BP}{2}$
- Constraints:
  - Hard constraint: $SHI > 0.6$
  - Soft constraints: $LTS > 2.0$ and maximization of $SHI$
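To make these two KPIs concrete, the following minimal Python sketch evaluates them for a single composition. All input values are hypothetical, and the assumption that each QoS value is already aggregated and scaled to [0, 1] is ours, purely for illustration:

```python
# Minimal sketch (hypothetical values): evaluating the two user-defined KPIs.
import math

EPS = 1e-6  # epsilon guard against division by zero

def lts(rt: float, la: float, tp: float) -> float:
    """Latency-Throughput Score: TP / (max(1-RT, 1-LA) + eps)."""
    return tp / (max(1 - rt, 1 - la) + EPS)

def shi(av: float, su: float, re: float, co: float, bp: float) -> float:
    """Service Health Index: sqrt(Stability x Compliance)."""
    stability = av ** 0.1 * su ** 0.1 * re ** 0.1  # product of 10th roots
    compliance = (co + bp) / 2
    return math.sqrt(stability * compliance)

print(lts(rt=0.3, la=0.2, tp=2.5))                    # ~3.12 > 2.0 (soft constraint)
print(shi(av=0.95, su=0.9, re=0.92, co=0.8, bp=0.7))  # ~0.86 > 0.6 (hard constraint)
```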
These processes demonstrate P-MDP’s unique capability to handle sophisticated, user-defined NFPs within business processes, surpassing traditional frameworks limited to basic metric aggregations.
We provide two methods to set up the execution environment: using Docker (recommended for ease of use and guaranteed reproducibility) or setting up a local Conda environment manually.
The best way to ensure a fully reproducible environment is to use our pre-built Docker image. This method encapsulates all dependencies and provides an interactive, web-based VS Code IDE.
Prerequisite: Docker Desktop must be installed and running.
This is the highly recommended approach for artifact evaluation.
- Pull the Image: Open your terminal and download the pre-built image from Docker Hub.

  ```bash
  docker pull togodang/pmdp-artifact-web:1.0
  ```

- Run the Container: Once the download is complete, run the following command to start the interactive environment.

  ```bash
  docker run --rm -it -p 8080:8080 --gpus all -v "${PWD}/src/training_records:/app/src/training_records" togodang/pmdp-artifact-web:1.0
  ```

  - Note: Remove the `--gpus all` flag if you do not have a compatible NVIDIA GPU.
  - The `-v` flag synchronizes the `src/training_records` folder with your local machine, allowing you to easily access generated plots and results.

- Access the Environment: Open your web browser and navigate to http://localhost:8080. You will find a complete, pre-configured VS Code environment.
If you prefer to build the Docker image from the source code yourself:
- Clone the Repository:

  ```bash
  git clone https://github.com/SOARingLab/PMDP-ICSOC2025.git
  cd PMDP-ICSOC2025
  ```
- Build the Image: Run the following command in the project's root directory. This may take a considerable amount of time, especially on the first run.

  ```bash
  docker build -t pmdp-artifact-web .
  ```
- Run the Container: After the build is complete, run the same command as in Option A, Step 2, but use the local image name:

  ```bash
  docker run --rm -it -p 8080:8080 --gpus all -v "${PWD}/src/training_records:/app/src/training_records" pmdp-artifact-web
  ```
This method is for users who prefer to set up a local Python environment directly on their machine.
- Choose your YAML file:
  - For cross-platform reproducibility (recommended for Linux/macOS/WSL), use `pmdp_conda.yaml`. This file contains a general list of dependencies.
  - For exactly replicating the original Windows development environment, use `pmdp_conda_windows_full.yaml`. This file contains platform-specific build versions and may only work on Windows.
- Create the environment: Run the command below, replacing [your_chosen_yaml_file.yaml] with your choice from the previous step.

  ```bash
  conda env create -f [your_chosen_yaml_file.yaml]
  ```
- Activate the environment: Once created, activate the environment. The environment name `Py311Env4PMDP` is defined in the YAML file.

  ```bash
  conda activate Py311Env4PMDP
  ```
After activation, you are ready to run experiments as described in the next section.
The main script supports flexible execution via command-line arguments, with configurations stored in `src/training_configs/` (including pre-defined configs for the paper's experiments):
| Execution Mode | Command | Description |
|---|---|---|
| Direct run | `python src/pmdp/Optimize_NFPs_in_PMDP.py` | Skips config reading and runs `main()` directly |
| Default config | `python src/pmdp/Optimize_NFPs_in_PMDP.py pmdp` or `python src/pmdp/Optimize_NFPs_in_PMDP.py cssc` | Default config examples in `training_configs/`: <br> - PMDP running example: `pmdp_default_config.json` <br> - CSSC-MDP: `wsdream_cssc_mdp_10AS_config.json` |
| Custom config | `python src/pmdp/Optimize_NFPs_in_PMDP.py pmdp config_name.json` or `python src/pmdp/Optimize_NFPs_in_PMDP.py cssc config_name.json` | Runs with a specified config file (from `training_configs/` or custom) |
We compare P-MDP with CSSC-MDP (a state-of-the-art constraint-satisfied service composition framework) using identical functional/non-functional properties (FPs/NFPs) for fairness. Both frameworks use the WS-DREAM QoS Dataset 2 (339 users, 5,825 web services grouped into 50 abstract services with 116 candidates each), where QoS records (response time, throughput) simulate runtime uncertainty.
NFPs are defined as:
- Response time: $rt = \sum_{i=1}^{m} rt_i < 22\,\text{s}$
- Throughput: $tp = \min_i(tp_i) > 13\,\text{Kbps}$
(Note: Prior work, including CSSC-MDP, focuses on serial processes; flow diagrams are omitted here.)
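As a quick numeric illustration of these two aggregations (a sketch with hypothetical values, not the framework's internal code):

```python
# Sketch: checking the two NFP constraints for a hypothetical serial composition.
rt_values = [2.1, 3.4, 1.8, 4.0]      # per-service response times (s), hypothetical
tp_values = [18.0, 15.5, 22.3, 14.1]  # per-service throughputs (Kbps), hypothetical

rt = sum(rt_values)  # response time aggregates by summation
tp = min(tp_values)  # throughput is limited by the bottleneck (minimum)

print(rt < 22.0, tp > 13.0)  # True True -> both constraints satisfied
```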
Note: Run experiments individually (not in bulk) due to resource requirements:
```bash
# CSSC-MDP (10/30/50 abstract services)
python src/pmdp/Optimize_NFPs_in_PMDP.py cssc wsdream_cssc_mdp_10AS_config.json
python src/pmdp/Optimize_NFPs_in_PMDP.py cssc wsdream_cssc_mdp_30AS_config.json
python src/pmdp/Optimize_NFPs_in_PMDP.py cssc wsdream_cssc_mdp_50AS_config.json

# P-MDP (matching abstract services)
python src/pmdp/Optimize_NFPs_in_PMDP.py pmdp wsdream_pmdp_10AS_config.json
python src/pmdp/Optimize_NFPs_in_PMDP.py pmdp wsdream_pmdp_30AS_config.json
python src/pmdp/Optimize_NFPs_in_PMDP.py pmdp wsdream_pmdp_50AS_config.json
```
Experimental results (success rates, rewards, training times, etc.) from the commands above are saved in the `src/training_records/` folder as `.txt` files (structured as 2D lists for easy loading).
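For instance, one way to load such a file, assuming it stores a Python-literal 2D list as described above (the file name here is hypothetical):

```python
# Sketch: loading one result file as a 2D list. The file name is hypothetical.
import ast

with open("src/training_records/example_results.txt") as f:
    records = ast.literal_eval(f.read())  # -> list of lists

print(len(records), "rows; first row:", records[0])
```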
Core visualization script: `src/experiments_visualizer/Experiments.py` generates all paper figures (Fig. 5a-b, Fig. 6a-d, Fig. 7a-b) from the experiment data.
- **Check Data**: Precomputed results are already in `Experiment_results/`, including:
  - Benchmark comparisons (P-MDP vs CSSC-MDP on WS-DREAM)
  - Parameter sensitivity data (learning rate α, PER ω)
  - Training time vs service counts
  - Rewards for the QWS and TravelAgency datasets
- **Run the Script**: Generate all plots automatically:

  ```bash
  python src/experiments_visualizer/Experiments.py
  ```

- **Customization (Optional)**: Modify `Experiments.py` to adjust:
  - File paths in `path_params` (point to your results)
  - Labels, plot modes, or zoom ranges in `plot_Experiment_pictures()`
  - Saving plots, by adding `plt.savefig("filename.png")` (see the sketch below)
Plots will display automatically; adjust parameters to refine visualizations.
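For example, persisting a figure instead of only displaying it might look like the sketch below; the exact call site inside `plot_Experiment_pictures()` may differ, and the data and file name here are placeholders:

```python
# Sketch of the optional plt.savefig customization; placeholder data.
import matplotlib.pyplot as plt

plt.plot([0, 1, 2], [0.2, 0.6, 0.9], label="reward")  # placeholder curve
plt.legend()
plt.savefig("fig_example.png", dpi=300, bbox_inches="tight")  # hypothetical file name
plt.show()
```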
This section focuses on experiments using the business processes introduced in Samples of Business Processes, testing P-MDP’s unique capability to model complex user-defined NFPs and runtime uncertainties.
Builds on the QWS Dataset Processes (detailed in Samples), evaluating P-MDP’s support for complex NFPs beyond traditional MDP frameworks.
Run the experiment with:

```bash
python src/pmdp/Optimize_NFPs_in_PMDP.py pmdp QWS_pmdp_config.json
```
Leverages the Travel Agency process (visualized in Samples) to test handling of gateway-induced dependencies and dynamic uncertainties (e.g., price fluctuations, temporal constraints).
For full specifications of FPs, hard/soft constraints, and uncertainty models, refer to the paper (Sect. 4.2).
Run the experiment with:

```bash
python src/pmdp/Optimize_NFPs_in_PMDP.py pmdp TravelAgency_pmdp_config.json
```
Both experiments validate P-MDP’s flexibility in modeling user-defined NFPs and adapting to runtime uncertainties, as detailed in the paper.
Use BPMN annotations to define NFPs (metrics, constraints, variables) for your process. Below is the core syntax, including supported operators and examples:
Define custom key performance indicators with mathematical expressions.
- Annotation Format: `KPIs::kpi1=expression|kpi2=expression|...`
- Rules:
  - Prefix with `KPIs::`
  - Separate metrics with `|`
  - Reference variables (e.g., `R`, `D`) or other KPIs (e.g., `TransportCost`).
- Supported Operators:
  - Arithmetic: `+` (addition), `-` (subtraction), `\times` (multiplication), `\div` (division)
  - Vector operations: `[a,b,c]` (vector), `⋅` (dot product, e.g., `[FP,TP]⋅TV`)
  - Functions:
    - `\mm(x)`: Dynamic max-min normalization (e.g., `\mm(FC)`)
    - `\RB[set]`: Select maximum from a set (e.g., `\RB[1,3,5]`)
    - `\LB[set]`: Select minimum from a set (e.g., `\LB[2,4,6]`)
    - `\log(x)`, `\abs(x)`: Logarithm and absolute value
    - `\sqrt(x)`: Square root (e.g., `SHI = \sqrt{Stability \times Compliance}`), or `\sqrt[n]{x}` for the n-th root
  - Grouping: `(` and `)` for order of operations
- Example (see the Python rendering below): `KPIs::TripDays=R-D|TransportCost=[\mm(FP),\mm(TP)]⋅TV|TotalCost=\sqrt(TransportCost+HC+SF)\times EP`
- Attachment: Link to a `DataObjectReference` named `KPIs`, then connect it to the `StartEvent`.
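To illustrate what the operators in the example compute, here is a hedged Python rendering. All variable values and normalization bounds are hypothetical, and `\mm` is shown as a static max-min normalization even though the framework applies it dynamically:

```python
# Sketch (hypothetical values): what the example KPI annotation computes.
import math

def mm(x: float, lo: float, hi: float) -> float:
    """Static stand-in for \\mm: max-min normalization over [lo, hi]."""
    return (x - lo) / (hi - lo)

R, D = 12, 3                  # return / departure day (hypothetical)
FP, TP = 450.0, 120.0         # flight / train price (hypothetical)
TV = [1, 0]                   # one-hot path variable: flight chosen
HC, SF, EP = 80.0, 20.0, 1.1  # hypothetical hotel cost, service fee, scalar factor

TripDays = R - D
TransportCost = mm(FP, 100, 800) * TV[0] + mm(TP, 50, 300) * TV[1]  # dot product
TotalCost = math.sqrt(TransportCost + HC + SF) * EP
print(TripDays, TransportCost, TotalCost)
```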
- Annotation Format: `HC::hc1#weight:condition|hc2#weight:condition|...`
- Rules:
  - Prefix with `HC::`
  - Each constraint has a name (e.g., `hc1`), a weight (e.g., `#0.2` for importance), and a condition.
- Supported Operators:
  - Comparisons: `>`, `≥` (`\ge`), `<`, `≤` (`\le`), `=`, `≠` (`\ne`)
  - Logic: `∧` (`\wedge` for AND), `∨` (`\vee` for OR), `!` (NOT)
- Example (evaluated in the sketch below): `HC::hc1#0.2:7≤TripDays∧TripDays≤10|hc2#0.2:TotalCostRatio≤0.2|hc3#0.2:D≤5`
- Attachment: Link to a `DataObjectReference` named `HardConstraints`.
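As a hedged illustration (not framework code), the example above boils down to weighted boolean checks like these, with hypothetical metric values:

```python
# Sketch (hypothetical values): evaluating the example hard constraints.
TripDays, TotalCostRatio, D = 8, 0.15, 4

constraints = {
    "hc1": (0.2, 7 <= TripDays <= 10),
    "hc2": (0.2, TotalCostRatio <= 0.2),
    "hc3": (0.2, D <= 5),
}
for name, (weight, satisfied) in constraints.items():
    print(name, weight, satisfied)  # all three satisfied here
```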
- Annotation Format: `SC::sc1#weight:condition|sc2#weight:optimization|...`
- Rules:
  - Prefix with `SC::`
  - Supports conditions (same syntax as hard constraints) and optimization targets.
- Supported Operators:
  - Same comparisons/logic as hard constraints
  - Optimization: `\max metric` (maximize), `\min metric` (minimize)
- Example: `SC::sc1#0.33:TotalCost≤UB|sc2#0.33:\max(TravelQoS)|sc3#0.34:\min(TP)`
- Attachment: Link to a `DataObjectReference` named `SoftConstraints`.
Define variables influencing metrics/constraints, including controllable choices and uncertain factors.
- Annotation Format: `C={var1:{[val1],[val2],...}|var2:{...}}||U={var#type:{(val,prob)|(val,unknown)}|...}`
- Rules:
  - `C=` declares controllable variables (values you can select):
    - List individual values as `[val1],[val2],...` (not vectors).
  - `U=` declares uncontrollable variables (uncertainty):
    - Define probability distributions as `(value, probability)` pairs (e.g., `([2000],0.2)`).
    - Use `(value, unknown)` for undefined distributions.
    - Add type tags (e.g., `#positive` for non-negative values).
- Example (see the sampling sketch below): `C={D:{[0],[1],[2],...,[15]}|CI:{[0],[1],...,[15]}}||U={UB#positive:{([2000],0.2),([4000],0.35)}|FP#negative:{([100],unknown),([200],unknown)}}`
- Attachment: Link the annotation directly to the activity (task) where the variables apply.
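As one hedged illustration, the declared distribution for `UB` might be sampled as below; treating the pairs as categorical weights is our assumption for illustration only, and the framework's internal handling (including of `unknown` entries) may differ:

```python
# Sketch: sampling UB from its declared (value, probability) pairs.
# Renormalizing over the declared pairs is an illustrative choice only,
# since the probabilities in the example (0.2 + 0.35) do not sum to 1.
import random

ub_dist = [(2000, 0.2), (4000, 0.35)]  # pairs from the example annotation
values, probs = zip(*ub_dist)
sample = random.choices(values, weights=probs, k=1)[0]
print("sampled UB:", sample)
```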
- Add annotations using the syntax above.
- Attach metrics/constraints to `DataObjectReference` objects (named `KPIs`, `HardConstraints`, `SoftConstraints`).
- Connect these objects to the process (e.g., the `StartEvent` for metrics).
- Link variable annotations to the relevant activities.
For a complete list of supported operators, operator behavior, and guidance on extending the operator set, refer to:
- The Python file containing the core operator definitions
- The step-by-step extension guide for adding new operators to P-MDP
- The paper for the underlying theoretical details
Please note that the current implementation is a prototype system. It does not yet include a user-friendly graphical interface for defining NFPs, and thus relies on a strict syntax for parsing annotations. Here are some best practices to ensure correctness and avoid common errors:
- Decompose Complex Metrics: The syntax parser may have limitations with deeply nested expressions. If you encounter errors when defining a complex metric, try decomposing it. For example, instead of defining a single metric as `KPIs::MyKPI=\sqrt(A+B*C)`, it is more robust to define it in steps: `KPIs::InnerCalc=A+B*C|MyKPI=\sqrt(InnerCalc)`. This approach simplifies parsing and improves readability.
- Separate Operator Types: Do not mix metric operators (e.g., `+`, `\times`, `\sqrt`) inside a constraint definition. Perform all calculations within the `KPIs` annotation first, then reference the calculated metric in the `HardConstraints` or `SoftConstraints` annotations using only relational or logical operators.
  - Incorrect: `HC::MyConstraint#1.0:MetricA+MetricB > 10`
  - Correct: `KPIs::TotalMetric=MetricA+MetricB` together with `HC::MyConstraint#1.0:TotalMetric > 10`
- Handling Path Variables with Exclusive Gateways: Path selection at an exclusive (XOR) gateway is modeled using a vector dot product. The path variable acts as a one-hot encoded vector that selects the metric of the chosen path (see the numeric sketch after this list).
  - Concept: For a metric that depends on a choice between two paths, define it as a vector of metrics multiplied by a path variable, like `[Metric_path1, Metric_path2] ⋅ path_var_1`.
  - Implementation Steps:
    - In the `KPIs` annotation, define the composite metric: `KPIs::TransportQoS=[FlightQoS, TrainQoS] ⋅ TV`. Here, `TV` is the path variable.
    - On each outgoing sequence flow from the exclusive split gateway, add a text annotation with the corresponding one-hot vector (e.g., `TV=[1,0]` for the flight path, `TV=[0,1]` for the train path).
  - For a gateway with more branches, simply extend the vector size (e.g., `[1,0,0]`, `[0,1,0]`, `[0,0,1]` for three paths). Refer to the exclusive gateway modeling in the `TravelAgency` example for a practical implementation.
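A minimal numeric sketch of this one-hot selection mechanism (values hypothetical):

```python
# Sketch (hypothetical values): one-hot path selection via dot product.
import numpy as np

flight_qos, train_qos = 0.82, 0.74
TV = np.array([1, 0])  # flight path chosen at the XOR gateway

TransportQoS = float(np.dot([flight_qos, train_qos], TV))
print(TransportQoS)  # -> 0.82, the flight path's metric
```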