This repository hosts the prototype code and supplementary materials for the paper published at ICSOC 2025 (the 23rd International Conference on Service-Oriented Computing). The paper is titled:
P-MDP: A Framework to Optimize NFPs of Business Processes in Uncertain Environments.
The supplementary materials include in-depth technical specifications and formal mathematical definitions omitted from the main paper, providing essential details for implementing and validating the P-MDP (Process-aware Markov Decision Process) framework.
To support experimentation and validation of the P-MDP framework, three datasets are included or referenced in this repository:
- WS-DREAM QoS Dataset 2: originates from https://wsdream.github.io and includes real-world QoS evaluation results from 339 users on 5,825 Web services.
- QWS Datasets: originate from https://qwsdata.github.io and include a set of 2,507 web services and their Quality of Web Service (QWS) measurements.
- TravelAgencyNFPs DataSet: A synthetic dataset created by the authors (described in the paper) for testing NFP modeling with temporal (e.g., weekend premiums), commercial (e.g., bundled discounts), and star-price correlated features. It covers 15 days of data for 20 flights, 40 trains, and 60 hotels (100 records each for uncertainty simulation).
A set of BPMN business processes is included for testing, with details below:
The running example process features gateways (parallel/exclusive) and user-defined functional/non-functional properties (FPs/NFPs). It models travel arrangement processes (e.g., booking flights, hotels) with runtime uncertainties (e.g., price fluctuations, transit delays).
Processes prefixed with CSSC* (stored in `./BPMN_Models/Benchmark_experiment_models`) are serial processes designed for benchmark comparisons with CSSC-MDP. They consist of sequential abstract services, aligning with the setup of prior MDP-based service composition frameworks.
Processes in `./BPMN_Models/QWS/` are designed to evaluate the complex user-defined NFPs supported by P-MDP, using the QWS Datasets (2,507 web services with 8 core Quality of Service (QoS) measurements). These sequential workflows focus specifically on advancing beyond traditional MDP limitations in modeling complex metrics:
- Standard QoS Metrics (with basic aggregations):
  - $RT$ (Response Time), $LA$ (Latency), $CO$ (Compliance), $BP$ (Best Practices) – aggregated via summation ($\sum$)
  - $AV$ (Availability), $SU$ (Success Rate), $RE$ (Reliability) – aggregated via product ($\prod$)
  - $TP$ (Throughput) – aggregated via minimum ($\min$)
- Custom User-Defined KPIs (with complex operators, unattainable via traditional MDP methods; see the Python sketch after this list):
  - $LTS$ (Latency-Throughput Score): $LTS = \frac{TP}{\max(1-RT,\ 1-LA)+\epsilon}$
  - $SHI$ (Service Health Index): $SHI = \sqrt{Stability \times Compliance}$, where $Stability = \sqrt[10]{AV} \times \sqrt[10]{SU} \times \sqrt[10]{RE}$ and $Compliance = \frac{CO+BP}{2}$
- Constraints:
  - Hard constraint: $SHI > 0.6$
  - Soft constraints: $LTS > 2.0$ and maximization of $SHI$
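To make these two KPIs concrete, the following minimal Python sketch evaluates them for a single composition. All input values are hypothetical, and the assumption that each QoS value is already aggregated and scaled to [0, 1] is ours, purely for illustration:

```python
# Minimal sketch (hypothetical values): evaluating the two user-defined KPIs.
import math

EPS = 1e-6  # epsilon guard against division by zero

def lts(rt: float, la: float, tp: float) -> float:
    """Latency-Throughput Score: TP / (max(1-RT, 1-LA) + eps)."""
    return tp / (max(1 - rt, 1 - la) + EPS)

def shi(av: float, su: float, re: float, co: float, bp: float) -> float:
    """Service Health Index: sqrt(Stability x Compliance)."""
    stability = av ** 0.1 * su ** 0.1 * re ** 0.1  # product of 10th roots
    compliance = (co + bp) / 2
    return math.sqrt(stability * compliance)

print(lts(rt=0.3, la=0.2, tp=2.5))                    # ~3.12 > 2.0 (soft constraint)
print(shi(av=0.95, su=0.9, re=0.92, co=0.8, bp=0.7))  # ~0.86 > 0.6 (hard constraint)
```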
These processes demonstrate P-MDP’s unique capability to handle sophisticated, user-defined NFPs within business processes, surpassing traditional frameworks limited to basic metric aggregations.
We provide two methods to set up the execution environment: using Docker (recommended for ease of use and guaranteed reproducibility) or setting up a local Conda environment manually.
The best way to ensure a fully reproducible environment is to use our pre-built Docker image. This method encapsulates all dependencies and provides an interactive, web-based VS Code IDE.
Prerequisite: Docker Desktop must be installed and running.
This is the highly recommended approach for artifact evaluation.
- Pull the Image: Open your terminal and download the pre-built image from Docker Hub.

  ```bash
  docker pull togodang/pmdp-artifact-web:1.0
  ```

- Run the Container: Once the download is complete, run the following command to start the interactive environment.

  ```bash
  docker run --rm -it -p 8080:8080 --gpus all -v "${PWD}/src/training_records:/app/src/training_records" togodang/pmdp-artifact-web:1.0
  ```

  - Note: Remove the `--gpus all` flag if you do not have a compatible NVIDIA GPU.
  - The `-v` flag synchronizes the `src/training_records` folder with your local machine, allowing you to easily access generated plots and results.

- Access the Environment: Open your web browser and navigate to http://localhost:8080. You will find a complete, pre-configured VS Code environment.
If you prefer to build the Docker image from the source code yourself:
- Clone the Repository:

  ```bash
  git clone https://github.com/SOARingLab/PMDP-ICSOC2025.git
  cd PMDP-ICSOC2025
  ```
- Build the Image: Run the following command in the project's root directory. This may take a considerable amount of time, especially on the first run.

  ```bash
  docker build -t pmdp-artifact-web .
  ```
- Run the Container: After the build is complete, run the same command as in Option A, Step 2, but use the local image name:

  ```bash
  docker run --rm -it -p 8080:8080 --gpus all -v "${PWD}/src/training_records:/app/src/training_records" pmdp-artifact-web
  ```
This method is for users who prefer to set up a local Python environment directly on their machine.
- Choose your YAML file:
  - For cross-platform reproducibility (recommended for Linux/macOS/WSL), use `pmdp_conda.yaml`. This file contains a general list of dependencies.
  - For exactly replicating the original Windows development environment, use `pmdp_conda_windows_full.yaml`. This file contains platform-specific build versions and may only work on Windows.
- Create the environment: Run the command below, replacing [your_chosen_yaml_file.yaml] with your choice from the previous step.

  ```bash
  conda env create -f [your_chosen_yaml_file.yaml]
  ```
- Activate the environment: Once created, activate the environment. The environment name `Py311Env4PMDP` is defined in the YAML file.

  ```bash
  conda activate Py311Env4PMDP
  ```
After activation, you are ready to run experiments as described in the next section.
The main script supports flexible execution via command-line arguments, with configurations stored in `src/training_configs/` (including pre-defined configs for the paper's experiments):
| Execution Mode | Command | Description |
|---|---|---|
| Direct run | `python src/pmdp/Optimize_NFPs_in_PMDP.py` | Skips config reading and runs `main()` directly |
| Default config | `python src/pmdp/Optimize_NFPs_in_PMDP.py pmdp` or `python src/pmdp/Optimize_NFPs_in_PMDP.py cssc` | Default config examples in `training_configs/`: <br> - PMDP running example: `pmdp_default_config.json` <br> - CSSC-MDP: `wsdream_cssc_mdp_10AS_config.json` |
| Custom config | `python src/pmdp/Optimize_NFPs_in_PMDP.py pmdp config_name.json` or `python src/pmdp/Optimize_NFPs_in_PMDP.py cssc config_name.json` | Runs with a specified config file (from `training_configs/` or custom) |
We compare P-MDP with CSSC-MDP (a state-of-the-art constraint-satisfied service composition framework) using identical functional/non-functional properties (FPs/NFPs) for fairness. Both frameworks use the WS-DREAM QoS Dataset 2 (339 users, 5,825 web services grouped into 50 abstract services with 116 candidates each), where QoS records (response time, throughput) simulate runtime uncertainty.
NFPs are defined as:
- Response time: $rt = \sum_{i=1}^{m} rt_i < 22\,\text{s}$
- Throughput: $tp = \min_i(tp_i) > 13\,\text{Kbps}$
(Note: Prior work, including CSSC-MDP, focuses on serial processes; flow diagrams are omitted here.)
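As a quick numeric illustration of these two aggregations (a sketch with hypothetical values, not the framework's internal code):

```python
# Sketch: checking the two NFP constraints for a hypothetical serial composition.
rt_values = [2.1, 3.4, 1.8, 4.0]      # per-service response times (s), hypothetical
tp_values = [18.0, 15.5, 22.3, 14.1]  # per-service throughputs (Kbps), hypothetical

rt = sum(rt_values)  # response time aggregates by summation
tp = min(tp_values)  # throughput is limited by the bottleneck (minimum)

print(rt < 22.0, tp > 13.0)  # True True -> both constraints satisfied
```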
Note: Run experiments individually (not in bulk) due to resource requirements:
```bash
# CSSC-MDP (10/30/50 abstract services)
python src/pmdp/Optimize_NFPs_in_PMDP.py cssc wsdream_cssc_mdp_10AS_config.json
python src/pmdp/Optimize_NFPs_in_PMDP.py cssc wsdream_cssc_mdp_30AS_config.json
python src/pmdp/Optimize_NFPs_in_PMDP.py cssc wsdream_cssc_mdp_50AS_config.json

# P-MDP (matching abstract services)
python src/pmdp/Optimize_NFPs_in_PMDP.py pmdp wsdream_pmdp_10AS_config.json
python src/pmdp/Optimize_NFPs_in_PMDP.py pmdp wsdream_pmdp_30AS_config.json
python src/pmdp/Optimize_NFPs_in_PMDP.py pmdp wsdream_pmdp_50AS_config.json
```
Experimental results (success rates, rewards, training times, etc.) from the commands above are saved in the `src/training_records/` folder as `.txt` files (structured as 2D lists for easy loading).
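For instance, one way to load such a file, assuming it stores a Python-literal 2D list as described above (the file name here is hypothetical):

```python
# Sketch: loading one result file as a 2D list. The file name is hypothetical.
import ast

with open("src/training_records/example_results.txt") as f:
    records = ast.literal_eval(f.read())  # -> list of lists

print(len(records), "rows; first row:", records[0])
```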
Core visualization script: `src/experiments_visualizer/Experiments.py` generates all paper figures (Fig. 5a-b, Fig. 6a-d, Fig. 7a-b) from the experiment data.
- **Check Data**: Precomputed results are already in `Experiment_results/`, including:
  - Benchmark comparisons (P-MDP vs CSSC-MDP on WS-DREAM)
  - Parameter sensitivity data (learning rate α, PER ω)
  - Training time vs service counts
  - Rewards for the QWS and TravelAgency datasets
- **Run the Script**: Generate all plots automatically:

  ```bash
  python src/experiments_visualizer/Experiments.py
  ```

- **Customization (Optional)**: Modify `Experiments.py` to adjust:
  - File paths in `path_params` (point to your results)
  - Labels, plot modes, or zoom ranges in `plot_Experiment_pictures()`
  - Saving plots, by adding `plt.savefig("filename.png")` (see the sketch below)
Plots will display automatically; adjust parameters to refine visualizations.
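For example, persisting a figure instead of only displaying it might look like the sketch below; the exact call site inside `plot_Experiment_pictures()` may differ, and the data and file name here are placeholders:

```python
# Sketch of the optional plt.savefig customization; placeholder data.
import matplotlib.pyplot as plt

plt.plot([0, 1, 2], [0.2, 0.6, 0.9], label="reward")  # placeholder curve
plt.legend()
plt.savefig("fig_example.png", dpi=300, bbox_inches="tight")  # hypothetical file name
plt.show()
```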
This section focuses on experiments using the business processes introduced in Samples of Business Processes, testing P-MDP’s unique capability to model complex user-defined NFPs and runtime uncertainties.
Builds on the QWS Dataset Processes (detailed in Samples), evaluating P-MDP’s support for complex NFPs beyond traditional MDP frameworks.
Run the experiment with:

```bash
python src/pmdp/Optimize_NFPs_in_PMDP.py pmdp QWS_pmdp_config.json
```
Leverages the Travel Agency process (visualized in Samples) to test handling of gateway-induced dependencies and dynamic uncertainties (e.g., price fluctuations, temporal constraints).
For full specifications of FPs, hard/soft constraints, and uncertainty models, refer to the paper (Sect. 4.2).
Run the experiment with:

```bash
python src/pmdp/Optimize_NFPs_in_PMDP.py pmdp TravelAgency_pmdp_config.json
```
Both experiments validate P-MDP’s flexibility in modeling user-defined NFPs and adapting to runtime uncertainties, as detailed in the paper.
Use BPMN annotations to define NFPs (metrics, constraints, variables) for your process. Below is the core syntax, including supported operators and examples:
Define custom key performance indicators with mathematical expressions.
- Annotation Format: `KPIs::kpi1=expression|kpi2=expression|...`
- Rules:
  - Prefix with `KPIs::`
  - Separate metrics with `|`
  - Reference variables (e.g., `R`, `D`) or other KPIs (e.g., `TransportCost`).
- Supported Operators:
  - Arithmetic: `+` (addition), `-` (subtraction), `\times` (multiplication), `\div` (division)
  - Vector operations: `[a,b,c]` (vector), `⋅` (dot product, e.g., `[FP,TP]⋅TV`)
  - Functions:
    - `\mm(x)`: Dynamic max-min normalization (e.g., `\mm(FC)`)
    - `\RB[set]`: Select maximum from a set (e.g., `\RB[1,3,5]`)
    - `\LB[set]`: Select minimum from a set (e.g., `\LB[2,4,6]`)
    - `\log(x)`, `\abs(x)`: Logarithm and absolute value
    - `\sqrt(x)`: Square root (e.g., `SHI = \sqrt{Stability \times Compliance}`), or `\sqrt[n]{x}` for the n-th root
  - Grouping: `(` and `)` for order of operations
- Example (see the Python rendering below): `KPIs::TripDays=R-D|TransportCost=[\mm(FP),\mm(TP)]⋅TV|TotalCost=\sqrt(TransportCost+HC+SF)\times EP`
- Attachment: Link to a `DataObjectReference` named `KPIs`, then connect it to the `StartEvent`.
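To illustrate what the operators in the example compute, here is a hedged Python rendering. All variable values and normalization bounds are hypothetical, and `\mm` is shown as a static max-min normalization even though the framework applies it dynamically:

```python
# Sketch (hypothetical values): what the example KPI annotation computes.
import math

def mm(x: float, lo: float, hi: float) -> float:
    """Static stand-in for \\mm: max-min normalization over [lo, hi]."""
    return (x - lo) / (hi - lo)

R, D = 12, 3                  # return / departure day (hypothetical)
FP, TP = 450.0, 120.0         # flight / train price (hypothetical)
TV = [1, 0]                   # one-hot path variable: flight chosen
HC, SF, EP = 80.0, 20.0, 1.1  # hypothetical hotel cost, service fee, scalar factor

TripDays = R - D
TransportCost = mm(FP, 100, 800) * TV[0] + mm(TP, 50, 300) * TV[1]  # dot product
TotalCost = math.sqrt(TransportCost + HC + SF) * EP
print(TripDays, TransportCost, TotalCost)
```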
- Annotation Format: `HC::hc1#weight:condition|hc2#weight:condition|...`
- Rules:
  - Prefix with `HC::`
  - Each constraint has a name (e.g., `hc1`), a weight (e.g., `#0.2` for importance), and a condition.
- Supported Operators:
  - Comparisons: `>`, `≥` (`\ge`), `<`, `≤` (`\le`), `=`, `≠` (`\ne`)
  - Logic: `∧` (`\wedge` for AND), `∨` (`\vee` for OR), `!` (NOT)
- Example (evaluated in the sketch below): `HC::hc1#0.2:7≤TripDays∧TripDays≤10|hc2#0.2:TotalCostRatio≤0.2|hc3#0.2:D≤5`
- Attachment: Link to a `DataObjectReference` named `HardConstraints`.
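As a hedged illustration (not framework code), the example above boils down to weighted boolean checks like these, with hypothetical metric values:

```python
# Sketch (hypothetical values): evaluating the example hard constraints.
TripDays, TotalCostRatio, D = 8, 0.15, 4

constraints = {
    "hc1": (0.2, 7 <= TripDays <= 10),
    "hc2": (0.2, TotalCostRatio <= 0.2),
    "hc3": (0.2, D <= 5),
}
for name, (weight, satisfied) in constraints.items():
    print(name, weight, satisfied)  # all three satisfied here
```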
- Annotation Format: `SC::sc1#weight:condition|sc2#weight:optimization|...`
- Rules:
  - Prefix with `SC::`
  - Supports conditions (same syntax as hard constraints) and optimization targets.
- Supported Operators:
  - Same comparisons/logic as hard constraints
  - Optimization: `\max metric` (maximize), `\min metric` (minimize)
- Example: `SC::sc1#0.33:TotalCost≤UB|sc2#0.33:\max(TravelQoS)|sc3#0.34:\min(TP)`
- Attachment: Link to a `DataObjectReference` named `SoftConstraints`.
Define variables influencing metrics/constraints, including controllable choices and uncertain factors.
- Annotation Format: `C={var1:{[val1],[val2],...}|var2:{...}}||U={var#type:{(val,prob)|(val,unknown)}|...}`
- Rules:
  - `C=` declares controllable variables (values you can select):
    - List individual values as `[val1],[val2],...` (not vectors).
  - `U=` declares uncontrollable variables (uncertainty):
    - Define probability distributions as `(value, probability)` pairs (e.g., `([2000],0.2)`).
    - Use `(value, unknown)` for undefined distributions.
    - Add type tags (e.g., `#positive` for non-negative values).
- Example (see the sampling sketch below): `C={D:{[0],[1],[2],...,[15]}|CI:{[0],[1],...,[15]}}||U={UB#positive:{([2000],0.2),([4000],0.35)}|FP#negative:{([100],unknown),([200],unknown)}}`
- Attachment: Link the annotation directly to the activity (task) where the variables apply.
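As one hedged illustration, the declared distribution for `UB` might be sampled as below; treating the pairs as categorical weights is our assumption for illustration only, and the framework's internal handling (including of `unknown` entries) may differ:

```python
# Sketch: sampling UB from its declared (value, probability) pairs.
# Renormalizing over the declared pairs is an illustrative choice only,
# since the probabilities in the example (0.2 + 0.35) do not sum to 1.
import random

ub_dist = [(2000, 0.2), (4000, 0.35)]  # pairs from the example annotation
values, probs = zip(*ub_dist)
sample = random.choices(values, weights=probs, k=1)[0]
print("sampled UB:", sample)
```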
- Add annotations using the syntax above.
- Attach metrics/constraints to `DataObjectReference` objects (named `KPIs`, `HardConstraints`, `SoftConstraints`).
- Connect these objects to the process (e.g., the `StartEvent` for metrics).
- Link variable annotations to the relevant activities.
For a complete list of supported operators, operator behavior, and guidance on extending the operator set, refer to:
- The Python file containing the core operator definitions
- The step-by-step extension guide for adding new operators to P-MDP
- The paper for the underlying theoretical details
Please note that the current implementation is a prototype system. It does not yet include a user-friendly graphical interface for defining NFPs, and thus relies on a strict syntax for parsing annotations. Here are some best practices to ensure correctness and avoid common errors:
- Decompose Complex Metrics: The syntax parser may have limitations with deeply nested expressions. If you encounter errors when defining a complex metric, try decomposing it. For example, instead of defining a single metric as `KPIs::MyKPI=\sqrt(A+B*C)`, it is more robust to define it in steps: `KPIs::InnerCalc=A+B*C|MyKPI=\sqrt(InnerCalc)`. This approach simplifies parsing and improves readability.
- Separate Operator Types: Do not mix metric operators (e.g., `+`, `\times`, `\sqrt`) inside a constraint definition. Perform all calculations within the `KPIs` annotation first, then reference the calculated metric in the `HardConstraints` or `SoftConstraints` annotations using only relational or logical operators.
  - Incorrect: `HC::MyConstraint#1.0:MetricA+MetricB > 10`
  - Correct: `KPIs::TotalMetric=MetricA+MetricB` together with `HC::MyConstraint#1.0:TotalMetric > 10`
- Handling Path Variables with Exclusive Gateways: Path selection at an exclusive (XOR) gateway is modeled using a vector dot product. The path variable acts as a one-hot encoded vector that selects the metric of the chosen path (see the numeric sketch after this list).
  - Concept: For a metric that depends on a choice between two paths, define it as a vector of metrics multiplied by a path variable, like `[Metric_path1, Metric_path2] ⋅ path_var_1`.
  - Implementation Steps:
    - In the `KPIs` annotation, define the composite metric: `KPIs::TransportQoS=[FlightQoS, TrainQoS] ⋅ TV`. Here, `TV` is the path variable.
    - On each outgoing sequence flow from the exclusive split gateway, add a text annotation with the corresponding one-hot vector (e.g., `TV=[1,0]` for the flight path, `TV=[0,1]` for the train path).
  - For a gateway with more branches, simply extend the vector size (e.g., `[1,0,0]`, `[0,1,0]`, `[0,0,1]` for three paths). Refer to the exclusive gateway modeling in the `TravelAgency` example for a practical implementation.
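A minimal numeric sketch of this one-hot selection mechanism (values hypothetical):

```python
# Sketch (hypothetical values): one-hot path selection via dot product.
import numpy as np

flight_qos, train_qos = 0.82, 0.74
TV = np.array([1, 0])  # flight path chosen at the XOR gateway

TransportQoS = float(np.dot([flight_qos, train_qos], TV))
print(TransportQoS)  # -> 0.82, the flight path's metric
```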