Annotations
Many components of a node are treated as annotations within PhyloBox. The annotation model is as follows:
from google.appengine.ext import db

class Annotation(db.Model):
    tree = db.ReferenceProperty(Tree)
    branch = db.BooleanProperty(default=False)
    description = db.TextProperty()
    category = db.CategoryProperty()
    name = db.StringProperty()
    value = db.StringProperty()
    full = db.StringProperty()
    triplet = db.StringProperty()  # "{category}:{name}:{value}"
    latest = db.BooleanProperty(default=True)
    user = db.UserProperty()
    addtime = db.DateTimeProperty(auto_now_add=True)
    temporary = db.BooleanProperty(default=False)
The category, name, and value properties are combined to create the triplet property. Because a triplet needs to be searchable, the combination of the three properties must be less than 500 bytes. If a value is too large for the value property (e.g. sequence data), it can be stored in the full property, but it won't be searchable directly by its full value.
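As a minimal sketch of how the triplet string could be assembled and checked against the size limit described above, consider the following plain Python helper. The function name `make_triplet`, the constant `MAX_TRIPLET_BYTES`, and the choice to write a missing part as an empty string are assumptions made for illustration; they are not part of the PhyloBox code.

```python
MAX_TRIPLET_BYTES = 500  # size limit described above for the searchable triplet


def make_triplet(category, name, value):
    """Join the three parts as "{category}:{name}:{value}"."""
    # Assumption of this sketch: a missing part (e.g. name=None)
    # is written as an empty string.
    parts = ["" if part is None else str(part) for part in (category, name, value)]
    triplet = ":".join(parts)
    # Measure the encoded size, since the limit is stated in bytes.
    if len(triplet.encode("utf-8")) >= MAX_TRIPLET_BYTES:
        raise ValueError("triplet must be under %d bytes" % MAX_TRIPLET_BYTES)
    return triplet


print(make_triplet("taxonomy", "scientific name", "puma concolor"))
```

A value that pushes the triplet over the limit raises `ValueError`, which is the point at which the data would go into the full property instead.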
The parent of an annotation is defined as db.Key({Kind annotated key}), which allows annotations to describe nodes, trees, or other annotations. The latest property marks whether an annotation holds the current value; older values are kept, which allows for rollbacks and provenance.
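The role of the latest flag can be illustrated with an in-memory sketch. The `Note` class and the `set_value` and `roll_back` helpers are hypothetical stand-ins, not the datastore model or any PhyloBox API. The sketch assumes that a rollback is recorded as a new entry, so the full history remains available for provenance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Note:
    """In-memory stand-in for an Annotation (hypothetical, not the db.Model)."""
    category: str
    name: Optional[str]
    value: str
    latest: bool = True


def set_value(history, category, name, value):
    """Append a new value and clear the latest flag on the previous one."""
    for note in history:
        if note.category == category and note.name == name and note.latest:
            note.latest = False
    new_note = Note(category, name, value)
    history.append(new_note)
    return new_note


def roll_back(history, category, name):
    """Restore the previous value by recording it again as the latest."""
    versions = [n for n in history if n.category == category and n.name == name]
    if len(versions) < 2:
        return None  # nothing to roll back to
    return set_value(history, category, name, versions[-2].value)


history = []
set_value(history, "color", None, "blue")
set_value(history, "color", None, "red")
roll_back(history, "color", None)
print([(n.value, n.latest) for n in history])
```

Only one entry per category and name carries `latest=True` at any time, so a query for current values filters on that flag while the older entries document how the value changed.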
The category property is designed to be a loose vocabulary, expandable by users, but with suggested standards such as ‘taxonomy’, ‘geography’, and ‘uri’. Common category properties can be written directly into trees returned by the /api/lookup methods.
**The Annotation model supports RDF triplets as follows**
Consider a node that has the color blue. The annotation would be created directly on the node, so it would simply be stored as:
Annotation(  # parent: Node 23
    category='color',
    name=None,
    value='blue',
)
Because an RDF triplet only needs {node}:{predicate}:{value}, the output would be {node id stored as the Annotation parent}:{category}:{value}, or "Node 23 has color blue".
In general, though, the more valuable piece of information will be the name. Consider the following:
Annotation(  # parent: Node 23
    category='taxonomy',
    name='scientific name',
    value='puma concolor',
)
Annotation(  # parent: Node 23
    category='taxonomy',
    name='col id',
    value='6862841',
)
From this, we can model the information about a node more accurately through RDF. We can create two RDF statements:
Node 23 has COL taxonomy id 6862841
and
COL taxonomy id 6862841 has scientific name Puma concolor
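One way to derive statements like these from stored annotations is sketched below in plain Python. The function `to_statements` and its `id_name` parameter are hypothetical. The rule that an identifier annotation becomes the subject for other annotations in the same category is an assumption of this sketch rather than documented PhyloBox behavior, and the generated wording differs slightly from the sentences above.

```python
def to_statements(node, annotations, id_name=None):
    """Turn (category, name, value) tuples into (subject, predicate, object)."""
    # Optional identifier annotation, e.g. the one named 'col id'.
    ident = next((a for a in annotations if id_name and a[1] == id_name), None)
    statements = []
    for category, name, value in annotations:
        if name is None:
            # Simple case: the category itself is the predicate.
            statements.append((node, category, value))
        elif ident and (category, name, value) != ident and category == ident[0]:
            # Named annotation sharing a category with the identifier:
            # the identifier, not the node, becomes the subject.
            statements.append(("%s %s %s" % ident, name, value))
        else:
            statements.append((node, "%s %s" % (category, name), value))
    return statements


annotations = [
    ("taxonomy", "scientific name", "puma concolor"),
    ("taxonomy", "col id", "6862841"),
]
for subject, predicate, obj in to_statements("Node 23", annotations, id_name="col id"):
    print("%s has %s %s" % (subject, predicate, obj))
```

Without `id_name`, every named annotation is attached to the node itself, which matches the simpler color example.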