Merged
49 commits
0bcda7a
add better implementation for sqlite sequence
rpiazza Jan 5, 2025
5220792
neo4j datastore
rpiazza Jan 13, 2025
ca03792
Merge branch 'relational-data-sink' into neo4j-data-store
rpiazza Jan 13, 2025
4f934ee
fix is_node_available
rpiazza Jan 14, 2025
1cbfc95
remove camel case from relationship names
rpiazza Jan 14, 2025
6cd1fab
create min/max constraint in db - add min/max to confidence common pr…
rpiazza Mar 9, 2025
19fb782
fix typo
rpiazza Mar 9, 2025
de84ebb
add core_sdo table has confidence constraint
rpiazza Mar 9, 2025
98d5fc4
Merge branch 'relational-data-sink' into fix-integer-constraint
rpiazza Mar 12, 2025
f6173c0
fix rdb tests, handle timestamps as text
rpiazza Mar 13, 2025
bbad3b3
Update github workflow and related to fix a db drop error when
chisholm Mar 15, 2025
3ab4d9f
Merge pull request #623 from chisholm/update-workflow
rpiazza Mar 15, 2025
f4e93aa
Change relational-data-sink class scanning to scan the registry
chisholm Mar 25, 2025
40cab27
Fixes to get querying working with sqlite. Also updates to
chisholm Mar 26, 2025
03fd9de
Merge pull request #625 from chisholm/test_workflow
rpiazza Apr 2, 2025
6506d22
Merge pull request #626 from chisholm/sqlite-unit-tests
rpiazza Apr 2, 2025
128b55d
add shorten_extension_definition_id
rpiazza Apr 4, 2025
af7cf97
shorten table names based in extension definitions
rpiazza Apr 5, 2025
692ea6b
merge conflicts
rpiazza Apr 5, 2025
afb737f
Merge branch 'sqlite-sequence' into fix-integer-constraint
rpiazza Apr 7, 2025
8b9c10d
fix-sqlite-core-properties
rpiazza Apr 7, 2025
3832cf0
Add try/finallys around table creation/dropping in relational
chisholm Apr 7, 2025
4a1cc6e
Merge pull request #629 from chisholm/try-finallys-unit-tests
rpiazza Apr 8, 2025
4d42bc7
Merge branch 'fix-integer-constraint' of https://github.com/oasis-ope…
rpiazza Apr 8, 2025
580771e
fix sqlite timestamps, dictionaries, meta_objects
rpiazza Apr 9, 2025
5d4c0a4
Create USING_NEO4J.md
rpiazza Apr 9, 2025
5666ac8
added images
rpiazza Apr 9, 2025
36f76a8
diagrams 2
rpiazza Apr 9, 2025
4e07682
diagrams-for-neo4j-doc
rpiazza Apr 9, 2025
d4c3970
recovered editor changers
rpiazza Apr 9, 2025
7622650
resize last to pngs
rpiazza Apr 9, 2025
53364d4
change try-except to try-finally in some contextmanagers in
chisholm Apr 10, 2025
8b5efa1
respond to review suggestions
rpiazza Apr 10, 2025
0b8069c
Merge pull request #630 from chisholm/fix-register-cleanup
rpiazza Apr 10, 2025
2a77258
flaky
rpiazza Apr 10, 2025
7a50451
flaky again
rpiazza Apr 10, 2025
4fd1eba
flaky, flaky
rpiazza Apr 10, 2025
73621e4
Fix test_datastore_add_raises unit tests
chisholm Apr 11, 2025
a869cbd
Fix test_environment_no_datastore() unit tests
chisholm Apr 11, 2025
088d05f
Merge pull request #631 from chisholm/fix-test-datastore-add-raises
rpiazza Apr 11, 2025
3716b05
Merge pull request #632 from chisholm/fix-test-environment-no-datastore
rpiazza Apr 11, 2025
1ac7f93
Fix unit tests in v21/test_custom.py to use fixtures and
chisholm Apr 23, 2025
ce54692
Fix some v21/test_observed_data.py tests:
chisholm Apr 24, 2025
3ee089f
Merge pull request #633 from chisholm/fix-custom-unit-tests
rpiazza Apr 24, 2025
4567919
Merge pull request #634 from chisholm/fix-test-observed-data
rpiazza Apr 24, 2025
faface0
NotImplementedError for List of Dictionaries
rpiazza Apr 24, 2025
f9093db
fix email message test
rpiazza Apr 24, 2025
854e0c4
flaky
rpiazza Apr 24, 2025
830f5af
Merge pull request #635 from oasis-open/fix-email-message-test
rpiazza Apr 24, 2025
10 changes: 5 additions & 5 deletions .github/workflows/python-ci-tests.yml
@@ -6,15 +6,15 @@ on: [push, pull_request]
 env:
   POSTGRES_USER: postgres
   POSTGRES_PASSWORD: postgres
-  POSTGRES_DB: postgres
+  POSTGRES_DB: stix

 jobs:
   test-job:
     runs-on: ubuntu-latest

     services:
       postgres:
-        image: postgres:11
+        image: postgres
         # Provide the password for postgres
         env:
           POSTGRES_USER: postgres
@@ -34,9 +34,9 @@ jobs:

     name: Python ${{ matrix.python-version }} Build
     steps:
-    - uses: actions/checkout@v2
+    - uses: actions/checkout@v4.2.2
     - name: Set up Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v2
+      uses: actions/setup-python@v5.4.0
       with:
         python-version: ${{ matrix.python-version }}
     - name: Install and update essential dependencies
@@ -48,7 +48,7 @@ jobs:
       run: |
         tox
     - name: Upload coverage information to Codecov
-      uses: codecov/codecov-action@v4.2.0
+      uses: codecov/codecov-action@v5.4.0
       with:
         token: ${{ secrets.CODECOV_TOKEN }}
         fail_ci_if_error: false # optional (default = false)
76 changes: 76 additions & 0 deletions USING_NEO4J.md
@@ -0,0 +1,76 @@
# Experimenting with the Neo4j graph database Python STIX DataStore

The Neo4j graph database Python STIX DataStore is a proof-of-concept implementation to show how to store STIX content in a graph database.

## Limitations:

As a proof-of-concept it has minimal functionality.

## Installing Neo4j

See https://neo4j.com/docs/desktop-manual/current/installation

This will install the neo4j desktop application, which contains the neo4j browser to view the database.

## Installing Neo4j python library

The Python neo4j library used is py2neo, available on PyPI at https://pypi.org/project/py2neo/. Note that this library is no longer supported and has reached end-of-life. A different implementation of the DataStore could be written using the official driver, documented at https://neo4j.com/docs/api/python-driver/current/.

## Implementation Details

We would like to thank the folks at JHU/APL for their implementation of [STIX2NEO4J.py](https://github.com/opencybersecurityalliance/oca-iob/tree/main/STIX2NEO4J%20Converter), which this code is based on.

Only the DataSink (for storing STIX data) part of the DataStore object has been implemented. The DataSource part is implemented as a stub. However, the graph database can be queried using the neo4j cypher language within the neo4j browser.

The main concepts behind any graph are nodes and edges. STIX data is similar, as it contains relationship objects (SROs) and node objects (SDOs, SCOs and SMOs). Additional edges are provided by STIX embedded relationships, which are expressed as properties in STIX node objects. This organization of data in STIX is a natural fit for graph models, such as neo4j.
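As a rough illustration of this mapping, the sketch below splits a list of STIX objects into nodes and edges. It works on plain dictionaries rather than stix2 objects, and it is not the DataStore's code: the helper name `split_bundle` and the rule of spotting embedded relationships by a `_ref` or `_refs` property suffix are assumptions made for this example. Sightings are treated as ordinary nodes here for simplicity.

```python
def split_bundle(objects):
    """Split STIX objects (as dicts) into graph nodes and edges.

    Hypothetical helper for illustration only; not part of the DataStore.
    """
    nodes, edges = [], []
    for obj in objects:
        if obj["type"] == "relationship":
            # An SRO becomes an edge between its source and target nodes.
            edges.append(
                (obj["source_ref"], obj["relationship_type"], obj["target_ref"])
            )
            continue
        nodes.append(obj["id"])
        for key, value in obj.items():
            # Embedded relationships are properties ending in _ref or _refs.
            if key.endswith("_ref"):
                edges.append((obj["id"], key[:-4], value))
            elif key.endswith("_refs"):
                edges.extend((obj["id"], key[:-5], ref) for ref in value)
    return nodes, edges
```

Each edge is a `(source id, label, target id)` tuple. Properties such as `external_references` are left alone because they do not end in `_ref` or `_refs`.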

The order in which STIX objects are added to the graph database is arbitrary. Therefore, when an SRO or embedded relationship is added via the DataStore, the nodes that it connects may not yet be present in the database. In that case the relationship is not added to the database; instead, the DataStore code remembers it as an unconnected relationship. Whenever a new node is added to the database, the unconnected relationships must be reviewed to determine whether both nodes of a relationship are now present, so that it can be represented as an edge in the graph database.

Note that unless both the source and target nodes are eventually added,
the relationship will not be added either.
How to address this issue in the implementation has not been determined.
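The bookkeeping described above can be sketched as follows. This is a minimal in-memory model, not the DataStore's actual code: the class name `PendingEdges` and its attributes are hypothetical, and the `created` list merely stands in for edges that would be written to the graph database.

```python
class PendingEdges:
    """Hold edges until both endpoint nodes exist (illustrative only)."""

    def __init__(self):
        self.known_nodes = set()
        self.pending = []   # edges still waiting for at least one endpoint
        self.created = []   # stands in for edges written to the database

    def add_node(self, node_id):
        self.known_nodes.add(node_id)
        # Re-check every waiting edge now that a new node exists.
        still_pending = []
        for edge in self.pending:
            if self._connectable(edge):
                self.created.append(edge)
            else:
                still_pending.append(edge)
        self.pending = still_pending

    def add_edge(self, source, label, target):
        edge = (source, label, target)
        if self._connectable(edge):
            self.created.append(edge)
        else:
            self.pending.append(edge)

    def _connectable(self, edge):
        source, _, target = edge
        return source in self.known_nodes and target in self.known_nodes
```

Anything still in `pending` once all objects have been added corresponds to a relationship whose source or target node never arrived, which is the unresolved case noted above.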

## Demonstrating a neo4j database for STIX

Open the neo4j desktop app and create a new project named STIX.

Select local DBMS on your local machine.

<img src="docs/diagrams/select-dbms.png" width="500" height="250">

Create the database.

<img src="docs/diagrams/create-dbms.png" width="500" height="300">

Start the database.

<img src="docs/diagrams/start-dbms.png" width="500" height="120">

Run `python demo.py <STIX bundle file>` to populate a local neo4j database, which can then be viewed using the neo4j browser.
A sample bundle file bundle--21531315-283d-4604-8501-4b7166e58c84.json is provided in the docs directory.

Open the neo4j browser to view the database.

<img src="docs/diagrams/open-browser.png" width="500" height="250">

Query using the cypher language.

<img src="docs/diagrams/query-for-incident.png" width="750" height="450">

Left-clicking on a node gives you a choice of adding all related nodes and edges, removing the node and its edges from the display, or locking the node position.

<img src="docs/diagrams/node-actions.png" width="500" height="320">

Remove the report object node for a better view of the graph.

<img src="docs/diagrams/dont-show-node-and-edges.png" width="750" height="450">

Explore the graph.

<img src="docs/diagrams/exploring-the-graph.png" width="750" height="400">

View the node properties by hovering over any node.

<img src="docs/diagrams/node-properties.png" width="750" height="400">
Binary file added docs/diagrams/create-dbms.png
Binary file added docs/diagrams/dont-show-node-and-edges.png
Binary file added docs/diagrams/exploring-the-graph.png
Binary file added docs/diagrams/node-actions.png
Binary file added docs/diagrams/node-properties.png
Binary file added docs/diagrams/open-browser.png
Binary file added docs/diagrams/query-for-incident.png
Binary file added docs/diagrams/select-dbms.png
Binary file added docs/diagrams/start-dbms.png
6 changes: 1 addition & 5 deletions stix2/datastore/__init__.py
@@ -210,11 +210,7 @@ def add(self, *args, **kwargs):
             stix_objs (list): a list of STIX objects

         """
-        try:
-            return self.sink.add(*args, **kwargs)
-        except AttributeError:
-            msg = "%s has no data sink to put objects in"
-            raise AttributeError(msg % self.__class__.__name__)
+        return self.sink.add(*args, **kwargs)


 class DataSink(metaclass=ABCMeta):
153 changes: 153 additions & 0 deletions stix2/datastore/neo4j/STIX2NEO4J.py.doc
@@ -0,0 +1,153 @@
# Reference implementation python script to load STIX 2.1 bundles into
# Neo4J graph database
# Code developed by JHU/APL - First Draft December 2021

# DISCLAIMER
# The script developed by JHU/APL for the demonstration are not “turn key” and are
# not safe for deployment without being tailored to production infrastructure. These
# files are not being delivered as software and are not appropriate for direct use on any
# production networks. JHU/APL assumes no liability for the direct use of these files and
# they are provided strictly as a reference implementation.
#
# NO WARRANTY, NO LIABILITY. THIS MATERIAL IS PROVIDED “AS IS.” JHU/APL MAKES NO
# REPRESENTATION OR WARRANTY WITH RESPECT TO THE PERFORMANCE OF THE MATERIALS, INCLUDING
# THEIR SAFETY, EFFECTIVENESS, OR COMMERCIAL VIABILITY, AND DISCLAIMS ALL WARRANTIES IN
# THE MATERIAL, WHETHER EXPRESS OR IMPLIED, INCLUDING (BUT NOT LIMITED TO) ANY AND ALL
# IMPLIED WARRANTIES OF PERFORMANCE, MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE,
# AND NON-INFRINGEMENT OF INTELLECTUAL PROPERTY OR OTHER THIRD PARTY RIGHTS. ANY USER OF
# THE MATERIAL ASSUMES THE ENTIRE RISK AND LIABILITY FOR USING THE MATERIAL. IN NO EVENT
# SHALL JHU/APL BE LIABLE TO ANY USER OF THE MATERIAL FOR ANY ACTUAL, INDIRECT,
# CONSEQUENTIAL, SPECIAL OR OTHER DAMAGES ARISING FROM THE USE OF, OR INABILITY TO USE,
# THE MATERIAL, INCLUDING, BUT NOT LIMITED TO, ANY DAMAGES FOR LOST PROFITS.

from getpass import getpass
## Import python modules for this script
import json
from typing import List

from py2neo import Graph, Node
from tqdm import tqdm

#Import variables
BundleName = input("Enter the name you want for your bundle: ")
NeoHost = input("Enter the hostname for Neo4j server: ")
NeoUser = input("Neo4j User: ")
NeoPass = getpass("Neo4j Password: ")
JSONFILE = input("Path to STIX JSON: ")

class NeoUploader(object):

    def __init__(self):
        # Connect to neo4j
        self.sgraph = Graph(host=NeoHost, auth=(NeoUser, NeoPass))
        self.relations = list()
        self.relationship_ids = set()
        self.nodes_with_object_ref = list()
        self.nodes = list()
        self.bundlename = BundleName
        self.infer_relation = {
            "parent_ref": "parent_of",
            "created_by_ref": "created_by",
            "src_ref": "source_of",
            "dst_ref": "destination_of",
        }
        self.__load_json(JSONFILE)

    def __load_json(self, fd):
        data = None
        with open(fd) as json_file:
            data = json.load(json_file)
        for entry in data["objects"]:
            if entry["type"] == "relationship":
                self.relations.append(entry)
            else:
                self.nodes.append(entry)

    # Make Nodes
    def make_nodes(self):
        total_nodes=len(self.nodes)
        for idx, apobj in tqdm(enumerate(self.nodes), total=total_nodes, desc="Making Nodes", unit="node"):
            keys = apobj.keys()
            node_contents = dict()
            #If the SCO does not have a name field, use the type as name
            if 'name' not in keys:
                node_name = apobj["type"]
            else:
                node_name = apobj["name"]
            # add id and type to node contents
            node_contents["ap_id"] = apobj["id"]
            node_contents["type"] = apobj["type"]
            # store rest of object contents in node contents
            for key in keys:
                if key not in ["type", "name", "id"]:
                    # collections not allowed as neo4j property value
                    # convert nested collections to string
                    if isinstance(apobj[key], list) or isinstance(apobj[key], dict):
                        node_contents[key] = json.dumps(apobj[key])
                    else:
                        node_contents[key] = apobj[key]
            # Make the Bundle ID a property
            # use dictionary expansion as keyword for optional node properties
            node = Node(
                apobj["type"],
                name=node_name,
                bundlesource=self.bundlename,
                **node_contents,
            )
            # if node needs new created_by relation, create the node and then the relationship
            self.sgraph.create(node)
            # save off these nodes for additional relationship creating
            if 'object_refs' in keys:
                self.nodes_with_object_ref.append(apobj)

    # create relationships that exist outside of relationship objects
    # such as Created_by and Parent_Of
    def __make_inferred_relations(self):
        total_nodes=len(self.nodes)
        for idx, apobj in tqdm(enumerate(self.nodes), total=total_nodes, desc="Checking Inferred Relationships", unit="node"):
            for k in apobj.keys():
                k_tokens = k.split("_")
                # find refs, but ignore external_references since they aren't objects
                if "ref" in k_tokens[len(k_tokens) - 1] and k_tokens[len(k_tokens) - 1] != "references":
                    rel_type = "_".join(k_tokens[: -1])
                    ref_list = []
                    # refs are lists, push singular ref into list to make it iterable for loop
                    if not type(apobj[k]).__name__ == "list":
                        ref_list.append(apobj[k])
                    else:
                        ref_list = apobj[k]
                    for ref in ref_list:
                        # The "b to a" relationship is reversed in this cypher query to ensure the correct relationship direction in the graph
                        cypher_string = f'MATCH (a),(b) WHERE a.bundlesource="{self.bundlename}" AND b.bundlesource="{self.bundlename}" AND a.ap_id="{str(ref)}" AND b.ap_id="{str(apobj["id"])}" CREATE (b)-[r:{rel_type}]->(a) RETURN a,b'
                        try:
                            self.sgraph.run(cypher_string)
                        except Exception as err:
                            print(err)
                            continue

    # Make Relationships
    def make_relationships(self):
        total_rels=len(self.relations)
        for idx, apobj in tqdm(enumerate(self.relations), total=total_rels, desc="Making Relationships", unit="rel"):
            # Define Relationship Type
            reltype = str(apobj['relationship_type'])
            # Fix Relationships with hyphens, neo4j will throw syntax error as
            # the hyphen is interpreted as an operation in the query string
            reltype = reltype.replace('-', '_')
            # create the relationship
            cypher_string = f'MATCH (a),(b) WHERE a.bundlesource="{self.bundlename}" AND b.bundlesource="{self.bundlename}" AND a.ap_id="{str(apobj["source_ref"])}" AND b.ap_id="{str(apobj["target_ref"])}" CREATE (a)-[r:{reltype}]->(b) RETURN a,b'
            self.sgraph.run(cypher_string)
            # maintain set of object ids that are in relationship objects
            self.relationship_ids.add(str(apobj['source_ref']))
            self.relationship_ids.add(str(apobj['target_ref']))
        self.__make_inferred_relations()

    # run the helper methods to upload bundle to neo4j database
    def upload(self):
        self.make_nodes()
        self.make_relationships()


if __name__ == '__main__':
    uploader = NeoUploader()
    uploader.upload()
Empty file.
26 changes: 26 additions & 0 deletions stix2/datastore/neo4j/demo.py
@@ -0,0 +1,26 @@

import json
import sys

from identity_contact_information import \
    identity_contact_information  # noqa F401
# needed so the relational db code knows to create tables for this
from incident import event, impact, incident, task  # noqa F401
from observed_string import observed_string  # noqa F401

import stix2
from stix2.datastore.neo4j.neo4j import Neo4jStore
import stix2.properties


def main():
    with open(sys.argv[1], "r") as f:
        bundle = stix2.parse(json.load(f), allow_custom=True)
    store = Neo4jStore(clear_database=True)

    for obj in bundle.objects:
        store.add(obj)


if __name__ == '__main__':
    main()