GNIP 89: Architecture Design - Resource and Storage Manager Modules #7664

@afabiani

Overview

Currently the architectural model of GeoNode is flat.

We have a ResourceBase class, which stores almost all the metadata fields for every generic GeoNode resource, and several concrete subclasses, such as Layer, Map, Document and GeoApp, that add a few more attributes and methods specific to each type.

Whenever we need to create a new resource in GeoNode, we currently:

  1. Create the ResourceBase instance and set or update some of its metadata fields
  2. Set the default permissions
  3. Check for the available GIS backend and invoke a number of custom methods and signals
  4. Rely on several scattered signals, some from the geonode.base module, some from the geonode.<type> one and some from the geonode.<backend_gis> one, to finalize the configuration
  5. Perform a number of hardcoded checks, which depend mostly on the installed modules

This "functional" approach is confusing and error prone. Moreover, it is quite difficult to update or change the backends, and developers often have to deal with tangled if-else checks in every view and template.

With this GNIP we would like to:

  1. Make the GeoNode core more modular, clean and truly pluggable.
  2. Get rid of hardcoded if-else checks.
  3. Get rid of most of the pre/post save and delete signals.
  4. Make the backend gis pluggable and centralize any further checks needed when updating a resource's metadata or security permissions.

Last, but not least, we would also like to make the uploader module more stable and efficient by avoiding redundant calls and signal fallbacks.

Proposed By

Alessio Fabiani <@afabiani>
Giovanni Allegri <@giohappy>
Ricardo Garcia Silva <@ricardogsilva>
Mattia Giupponi <@mattiagiupponi>

Assigned to Release

This proposal is for GeoNode 4.0.

State

  • Under Discussion
  • In Progress
  • Completed
  • Rejected
  • Deferred

Proposal

In order to achieve our goals, we need to partially revise the current GeoNode core architecture design.

Fig. 1 (image)

We envisage four main components of the new GeoNode resource management architecture (see Fig.1).

Storage Manager
The Storage Manager will organize and allocate the raw data of GeoNode resources, whatever kind of resource that may be. Through the Storage Manager it will be possible to read and write raw data from a pluggable (aka concrete) storage, which could be a FileSystemStorage as well as a DropboxStorage or an AmazonS3Storage. The concrete storage will be selected in the settings through a factory mechanism.

In addition, we will benefit from a generic storage interface exposing generic methods for read/write operations on the raw data.

from abc import ABCMeta, abstractmethod


class StorageManagerInterface(metaclass=ABCMeta):

    @abstractmethod
    def delete(self, name):
        pass

    @abstractmethod
    def exists(self, name):
        pass

    @abstractmethod
    def listdir(self, path):
        pass

    @abstractmethod
    def open(self, name, mode='rb'):
        pass

    @abstractmethod
    def path(self, name):
        pass

    @abstractmethod
    def save(self, name, content, max_length=None):
        pass

    @abstractmethod
    def url(self, name):
        pass

    @abstractmethod
    def size(self, name):
        pass

    @abstractmethod
    def generate_filename(self, filename):
        pass
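The factory mechanism and the pluggable storage described above can be illustrated with a minimal sketch. It is not the GNIP's implementation: `StorageInterface` is a trimmed-down stand-in for the interface above, and `LocalFileStorage`, `StorageManager` and `load_class` are hypothetical names chosen for the example. In a real deployment the dotted path passed to `load_class` would come from a settings value.

```python
import importlib
import os
import tempfile
from abc import ABCMeta, abstractmethod


class StorageInterface(metaclass=ABCMeta):
    """Trimmed-down stand-in for the interface above (three methods only)."""

    @abstractmethod
    def exists(self, name): ...

    @abstractmethod
    def open(self, name, mode="rb"): ...

    @abstractmethod
    def save(self, name, content): ...


class LocalFileStorage(StorageInterface):
    """Hypothetical concrete storage that keeps raw data under a root folder."""

    def __init__(self, root):
        self.root = root

    def _full_path(self, name):
        return os.path.join(self.root, name)

    def exists(self, name):
        return os.path.exists(self._full_path(name))

    def open(self, name, mode="rb"):
        # Inside a method, `open` resolves to the builtin, not to this method.
        return open(self._full_path(name), mode)

    def save(self, name, content):
        with open(self._full_path(name), "wb") as fh:
            fh.write(content)
        return name


def load_class(dotted_path):
    """Resolve 'package.module.ClassName' to the class object."""
    module_name, class_name = dotted_path.rsplit(".", 1)
    return getattr(importlib.import_module(module_name), class_name)


class StorageManager(StorageInterface):
    """Facade that forwards every call to the configured concrete storage."""

    def __init__(self, concrete):
        self._concrete = concrete

    def exists(self, name):
        return self._concrete.exists(name)

    def open(self, name, mode="rb"):
        return self._concrete.open(name, mode)

    def save(self, name, content):
        return self._concrete.save(name, content)


# Usage: callers only ever talk to the manager, never to the concrete class.
with tempfile.TemporaryDirectory() as tmp:
    manager = StorageManager(LocalFileStorage(tmp))
    manager.save("roads.geojson", b"{}")
    saved_ok = manager.exists("roads.geojson")
    with manager.open("roads.geojson") as fh:
        data = fh.read()
```

Swapping the filesystem for another backend would then only require a different concrete class implementing the same interface; the calling code stays unchanged.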

Resource Manager
The Resource Manager exposes some atomic operations for managing the GeoNode ResourceBase models. This is an internal component meant to be used primarily by the Resource Service to manage the publication of ResourceBases into GeoNode.

Like the Storage Manager, the Resource Manager will also benefit from an abstract Resource Manager Interface, a default implementation and a concrete resource manager, which will be pluggable, defined in the settings through a factory mechanism, and able to deal with the backend gis.

According to this architectural design, almost all the logic specifically bound to the backend gis will be moved to the concrete resource manager, resulting in a real separation of concerns.

The proposed Resource Manager Interface will expose some generic methods for performing CRUD operations on a ResourceBase. Under this paradigm, a GeoNode developer should never break the contract by manually instantiating a ResourceBase, and should instead go through the resource manager methods. This is the only way to guarantee that the ResourceBase state is always coherent and aligned with the backend gis.

The proposed Resource Manager Interface would look something like the following:

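from abc import ABCMeta, abstractmethod

from django.db.models.query import QuerySet
from geonode.base.models import ResourceBase


class ResourceManagerInterface(metaclass=ABCMeta):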

    @abstractmethod
    def search(self, filter: dict, /, type: object = None) -> QuerySet:
        pass

    @abstractmethod
    def exists(self, uuid: str, /, instance: ResourceBase = None) -> bool:
        pass

    @abstractmethod
    def delete(self, uuid: str, /, instance: ResourceBase = None) -> int:
        pass

    @abstractmethod
    def create(self, uuid: str, /, resource_type: object = None, defaults: dict = {}) -> ResourceBase:
        pass

    @abstractmethod
    def update(self, uuid: str, /, instance: ResourceBase = None, xml_file: str = None, metadata_uploaded: bool = False,
               vals: dict = {}, regions: dict = {}, keywords: dict = {}, custom: dict = {}, notify: bool = True) -> ResourceBase:
        pass

    @abstractmethod
    def exec(self, method: str, uuid: str, /, instance: ResourceBase = None, **kwargs) -> ResourceBase:
        pass

    @abstractmethod
    def remove_permissions(self, uuid: str, /, instance: ResourceBase = None) -> bool:
        pass

    @abstractmethod
    def set_permissions(self, uuid: str, /, instance: ResourceBase = None, owner=None, permissions: dict = {}, created: bool = False) -> bool:
        pass

    @abstractmethod
    def set_workflow_permissions(self, uuid: str, /, instance: ResourceBase = None, approved: bool = False, published: bool = False) -> bool:
        pass

    @abstractmethod
    def set_thumbnail(self, uuid: str, /, instance: ResourceBase = None, overwrite: bool = True, check_bbox: bool = True) -> bool:
        pass

The resource manager will deal with the backend gis through the concrete manager created by the factory: it performs the generic logic against the ResourceBase and then delegates to the backend gis to finalize the operation.

For instance, at __init__ time the resource manager will instantiate the pluggable, backend-specific concrete manager through the factory pattern, as shown below:

import importlib


class ResourceManager(ResourceManagerInterface):

    def __init__(self):
        self._concrete_resource_manager = self._get_concrete_manager()

    def _get_concrete_manager(self):
        # rm_settings: the resource manager settings module holding the dotted class path
        module_name, class_name = rm_settings.RESOURCE_MANAGER_CONCRETE_CLASS.rsplit(".", 1)
        module = importlib.import_module(module_name)
        class_ = getattr(module, class_name)
        return class_()

    ...

The implementation of an operation from the interface will take care of:

  1. Performing the generic logic against the ResourceBase
  2. Performing the specific logic against the real instance

For instance, a delete operation could be implemented as shown below:

    @transaction.atomic
    def delete(self, uuid: str, /, instance: ResourceBase = None) -> int:
        _resource = instance or ResourceManager._get_instance(uuid)
        if _resource:
            try:
                self._concrete_resource_manager.delete(uuid, instance=_resource)  # backend gis delete
                _resource.get_real_instance().delete()  # ResourceBase delete
                return 1
            except Exception as e:
                logger.exception(e)
        return 0
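The delegation pattern described above can be tried out without Django or a real GIS backend. The following is a minimal sketch under stated assumptions: `ManagerInterface` is a two-method stand-in for the proposed interface, `FakeGisBackendManager` and `InMemoryResourceManager` are hypothetical names, and a plain dictionary stands in for the ResourceBase table.

```python
from abc import ABCMeta, abstractmethod


class ManagerInterface(metaclass=ABCMeta):
    """Two-method stand-in for the proposed Resource Manager Interface."""

    @abstractmethod
    def create(self, uuid, /, defaults=None): ...

    @abstractmethod
    def delete(self, uuid, /, instance=None): ...


class FakeGisBackendManager(ManagerInterface):
    """Hypothetical concrete manager: only tracks what the backend publishes."""

    def __init__(self):
        self.published = set()

    def create(self, uuid, /, defaults=None):
        self.published.add(uuid)
        return uuid

    def delete(self, uuid, /, instance=None):
        self.published.discard(uuid)
        return 1


class InMemoryResourceManager(ManagerInterface):
    """Generic manager: handles the resource record, then delegates."""

    def __init__(self, concrete, resources=None):
        self._concrete = concrete
        # uuid -> dict; a stand-in for the ResourceBase table
        self._resources = {} if resources is None else resources

    def create(self, uuid, /, defaults=None):
        resource = {"uuid": uuid, **(defaults or {})}
        self._resources[uuid] = resource                 # generic logic
        self._concrete.create(uuid, defaults=defaults)   # backend-specific logic
        return resource

    def delete(self, uuid, /, instance=None):
        resource = instance or self._resources.get(uuid)
        if not resource:
            return 0
        self._concrete.delete(uuid, instance=resource)   # backend-specific logic
        self._resources.pop(uuid, None)                  # generic logic
        return 1


backend = FakeGisBackendManager()
manager = InMemoryResourceManager(backend)
manager.create("abc", defaults={"title": "Roads"})
deleted = manager.delete("abc")   # resource existed: both sides are cleaned up
missing = manager.delete("abc")   # nothing left to delete
```

Because callers only use `manager`, the record and the backend cannot drift apart as long as nobody bypasses it, which is the contract the proposal asks developers to respect.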

Uploader
The Uploader module will be fully pluggable and aware of the backend gis.

It will rely on the Resource Manager and Storage Manager in order to create/update the ResourceBases and raw data in GeoNode.

Harvester
Similarly to the Uploader, the Harvester will rely on the Resource Manager and Storage Manager in order to create/update the ResourceBases and raw data in GeoNode.

We envisage a complete refactoring of the Harvester, improving the way it synchronizes and fetches the resources from the remote service. This will be detailed in a separate GNIP.

Async States
The Resource Manager will also manage the different states of the ResourceBase.

This is an important concept. As part of this work we also plan to move the workflow/processing state into the ResourceBase. This will ensure that we never expose a ResourceBase that is not fully configured to the users.

As long as every component respects the architectural contract, we can also be sure that the ResourceBases are always in a consistent state.
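One way to picture the state handling is a small sketch in which a resource is flagged while it is being configured and hidden from listings until it is ready. Everything here is an assumption made for illustration: the state names, the `Resource` dataclass, and the `processing` and `visible` helpers are hypothetical and are not taken from the proposal.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from enum import Enum


class ProcessingState(Enum):
    # Hypothetical state names, chosen for the example only
    PROCESSING = "PROCESSING"
    READY = "READY"
    INVALID = "INVALID"


@dataclass
class Resource:
    """Stand-in for a ResourceBase carrying its own processing state."""
    uuid: str
    state: ProcessingState = ProcessingState.READY


@contextmanager
def processing(resource):
    """Mark the resource as in progress for the duration of the block."""
    resource.state = ProcessingState.PROCESSING
    try:
        yield resource
    except Exception:
        # Configuration failed: flag the resource instead of exposing it
        resource.state = ProcessingState.INVALID
        raise
    else:
        resource.state = ProcessingState.READY


def visible(resources):
    """Only fully configured resources are exposed to users."""
    return [r for r in resources if r.state is ProcessingState.READY]


roads = Resource("abc")
with processing(roads):
    hidden_while_processing = visible([roads])   # empty while work is ongoing
shown_when_ready = visible([roads])
```

If the block raises, the resource ends up in the `INVALID` state and stays out of `visible`, so a half-configured resource is never listed.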

Backwards Compatibility

This work won't be backward compatible with the 3.2.x and 3.3.x versions.

However, since we won't break the ResourceBase model, it will be possible to easily upgrade older GeoNode versions to the new one.

Future evolution

This is preliminary work that prepares the core for a fully pluggable and headless GeoNode middleware.

Through a real separation of concerns, the code will be much more maintainable and extensible, making it easy for developers to create their own backend gis and storage manager modules.

Feedback

Update this section with relevant feedback, if any.

Voting

Project Steering Committee:

  • Alessio Fabiani: 👍
  • Francesco Bartoli:
  • Giovanni Allegri:
  • Simone Dalmasso:
  • Toni Schoenbuchner: 👍
  • Florian Hoedt: 👍

Links

Remove unused links below.
