This repository has a code sample accompanying our post on the AWS Database Blog: Implementing OpenEMPI patient matching in Amazon Neptune.
In the post, we discuss how to adopt the Open Enterprise Master Patient Index (OpenEMPI) architecture in Amazon Neptune, a managed graph database service. OpenEMPI is an architecture for a central repository of patients across facilities. It is designed to accommodate incomplete or inaccurate patient data and avoid duplicate records, a common challenge with patient data. As we show, OpenEMPI’s patient model is a good fit for representation in a graph database such as Neptune. Representing that data in a graph enables us to better match patients and detect patient relationships.
See https://www.openhealthnews.com/content/openempi for more on OpenEMPI.
Our overall architecture is the following:

In this demo, we show a subset of the overall architecture. Our focus is to demonstrate how to export OpenEMPI patient data from OrientDB, load that data into an Amazon Neptune database, and then query that data using the Gremlin query language.
OrientDB manages three sets of data: Patient, Person, and Provider. We consider only Patient in this demo. OrientDB can export its data to JSON files, which can be very large. We use a Converter tool, written in Node.js, to convert the JSON to CSV. We then bulk-load that data into Neptune.
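To make the conversion step concrete, here is a minimal Python sketch of the idea. It is not the repository's Node.js Converter. It assumes the OrientDB export holds a top-level `records` array whose entries carry `@rid` and `@class` metadata, and the property names (`firstName`, `lastName`, `dateOfBirth`) and the `patient-` ID prefix are hypothetical. The output uses the Neptune Gremlin load format, in which `~id` and `~label` are system columns and property columns are written as `name:Type`.

```python
import csv
import io
import json

# Assumed shape of an OrientDB JSON export: a top-level "records" array whose
# entries carry "@rid" and "@class" metadata. The property names used here
# (firstName, lastName, dateOfBirth) are hypothetical.
SAMPLE_EXPORT = """
{"records": [
  {"@rid": "#12:0", "@class": "Patient", "firstName": "Ana", "lastName": "Diaz", "dateOfBirth": "1980-02-14"},
  {"@rid": "#12:1", "@class": "Patient", "firstName": "Anna", "lastName": "Diaz"},
  {"@rid": "#13:0", "@class": "Provider", "name": "Clinic A"}
]}
"""

PROPERTY_FIELDS = ["firstName", "lastName", "dateOfBirth"]


def patients_to_neptune_csv(export_text, fields=PROPERTY_FIELDS):
    """Return Neptune Gremlin vertex CSV text for the Patient records only."""
    records = json.loads(export_text).get("records", [])
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # ~id and ~label are system columns; properties are typed as name:Type.
    writer.writerow(["~id", "~label"] + [f"{name}:String" for name in fields])
    for rec in records:
        if rec.get("@class") != "Patient":
            continue  # this demo considers only Patient data
        # Derive a vertex ID from the record ID (hypothetical naming scheme).
        vertex_id = "patient-" + rec["@rid"].lstrip("#").replace(":", "-")
        writer.writerow([vertex_id, "Patient"] + [rec.get(name, "") for name in fields])
    return out.getvalue()


csv_text = patients_to_neptune_csv(SAMPLE_EXPORT)
print(csv_text)
```

This sketch reads the whole export into memory, which is fine for a small sample. For the very large files mentioned above, a streaming JSON parser would be a better fit.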

Once the data is in Neptune, we can query it. Here is the graph data model we use.

We drive the end-to-end flow using a notebook.
Refer to the blog post for a more detailed discussion.
To set up this demo in your own AWS account, first clone this repo locally. Alternatively, download a copy of <cfn/PatientGraphStack.yaml>. Then follow these steps.
- On the AWS CloudFormation console, choose Create stack.
- Choose With new resources (standard).
- Select Upload a template file.
- Choose Choose file to upload the local copy of the template that you downloaded. The name of the file is PatientGraphStack.yml.
- Choose Next.
- Enter a stack name of your choosing.
- You may keep default values in the Parameters section.
- Choose Next.
- Continue through the remaining sections.
- Read and select the check boxes in the Capabilities section.
- Choose Create stack.
- When the stack is complete, navigate to the Outputs section and follow the link for the output NeptuneSagemakerNotebook.
This opens a notebook that you use in the remaining steps.
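If you prefer to script the console steps above, the following is a rough Python sketch using boto3. It is an alternative under stated assumptions, not part of this repo: the capability list is a guess at what the template requires, and it assumes the template fits within the size limit for an inline `TemplateBody` (larger templates must be uploaded to Amazon S3 and passed by URL). The output name `NeptuneSagemakerNotebook` comes from the steps above.

```python
from pathlib import Path


def build_create_stack_args(stack_name, template_body):
    """Assemble keyword arguments for CloudFormation's CreateStack call."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        # Equivalent of selecting the Capabilities check boxes in the console.
        # Assumption: the template needs IAM capabilities; adjust this list to
        # match what the console asks you to acknowledge.
        "Capabilities": ["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
    }


def find_output(outputs, key):
    """Return the value of one stack output, or None if it is absent."""
    for item in outputs:
        if item["OutputKey"] == key:
            return item["OutputValue"]
    return None


def deploy(stack_name, template_path):
    """Create the stack, wait for completion, and return the notebook link."""
    import boto3  # imported here so the helpers above work without boto3

    cfn = boto3.client("cloudformation")
    template_body = Path(template_path).read_text()
    cfn.create_stack(**build_create_stack_args(stack_name, template_body))
    # Block until CloudFormation reports that stack creation has completed.
    cfn.get_waiter("stack_create_complete").wait(StackName=stack_name)
    stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
    return find_output(stack.get("Outputs", []), "NeptuneSagemakerNotebook")
```

With AWS credentials and a default region configured, calling `deploy("patient-graph-demo", "cfn/PatientGraphStack.yaml")` would create the stack with default parameter values and return the notebook link. The stack name here is only an example.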
It sets up an Amazon Neptune cluster,