Authorization Service

The BioConnect Authorization Service uses Apache Ranger for authorization. Apache Ranger is a framework to enable, monitor, and manage comprehensive data security.

Ranger Admin Portal

http://35.190.178.188:6080

The Ranger admin portal is a centralized place to manage services, policies, users, groups, and roles.

Apache Ranger provides dozens of built-in services for securing data in many Hadoop applications. In addition to the built-in services, we built two custom services for BioConnect.

  • BioConnect-Service is a custom service that protects API endpoints. Users can be granted different permissions to call get, post, and delete on each endpoint.
  • BioConnect-GCS is a custom service that protects data in Google Cloud Storage (GCS). Users can be given read and write permissions on different projects, buckets, subfolders, and files within a bucket.

Ranger Service

Apache Ranger can provide authorization for many services. The screenshots show the built-in services and the two BioConnect custom services.

All services are managed from the Service Manager page, where a service can be added, deleted, or edited. Exporting and importing a service is an easy way to back up and restore its policies. The host name is defined in the service, and the URL relative paths are set up in the policies.

Add a New Service

On the Ranger "Service Manager" page, click the plus sign next to the service name to open a new window for creating a service. For example, to create a new "BIOCONNECT-SERVICE", click the plus sign marked by the red arrow in the screenshot.

Enter Service Detail

On the "Create Service" window, enter the service name in the service name text field. Enter the host URLs to protect in the "A list of service urls" text field; multiple URLs can be entered, separated by commas. These URLs identify the hosts only. Within a service, multiple policies can then be set up to protect individual API paths.

Ranger Policy

Each service can define many policies. Each policy defines the resources to be protected and the user actions that can be performed on those resources. Users can be specified individually, by group, or by role. Each policy has an allow section and a deny section, and each section has its own exclude conditions. Each section can hold multiple policy items that define user actions.
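
The interplay of the allow, deny, and exclude sections is easier to see in code. The following is a minimal sketch of that evaluation order, not Ranger's actual implementation: the `PolicyItem`, `Policy`, and `evaluate` names are hypothetical, roles are left out for brevity, and the rule assumed here is that a deny item wins unless a deny-exclude item exempts the user, and that anything not explicitly allowed is denied.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class PolicyItem:
    """One row in a policy section: which users/groups, and which actions."""
    users: Set[str] = field(default_factory=set)
    groups: Set[str] = field(default_factory=set)
    actions: Set[str] = field(default_factory=set)

    def matches(self, user: str, user_groups: Set[str], action: str) -> bool:
        is_subject = user in self.users or bool(self.groups & user_groups)
        return is_subject and action in self.actions


@dataclass
class Policy:
    """Hypothetical in-memory model of the four item lists of a policy."""
    allow: List[PolicyItem] = field(default_factory=list)
    allow_exclude: List[PolicyItem] = field(default_factory=list)
    deny: List[PolicyItem] = field(default_factory=list)
    deny_exclude: List[PolicyItem] = field(default_factory=list)


def evaluate(policy: Policy, user: str, user_groups: Set[str], action: str) -> bool:
    """Return True if the request is allowed under the assumed rules."""
    def hit(items: List[PolicyItem]) -> bool:
        return any(item.matches(user, user_groups, action) for item in items)

    # A deny item wins unless a deny-exclude item exempts the user.
    if hit(policy.deny) and not hit(policy.deny_exclude):
        return False
    # An allow item grants access unless an allow-exclude item removes it.
    if hit(policy.allow) and not hit(policy.allow_exclude):
        return True
    # Nothing matched: deny by default.
    return False


policy = Policy(
    allow=[PolicyItem(groups={"analysts"}, actions={"get", "post"})],
    allow_exclude=[PolicyItem(users={"intern@example.org"}, actions={"post"})],
    deny=[PolicyItem(groups={"contractors"}, actions={"delete"})],
)
print(evaluate(policy, "alice@example.org", {"analysts"}, "post"))   # True
print(evaluate(policy, "intern@example.org", {"analysts"}, "post"))  # False
```

The second call shows the exclude condition at work: the intern belongs to an allowed group, but the allow-exclude item removes the post action for that one user.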

Add a New Policy

On the Ranger "Service Manager" page, click on the service name, not on the edit icon.

List of the Policies

Click on the "Add New Policy" button.

Enter Policy Detail

On the "Create Policy" window, fill in the required policy name and HTTP Path fields. HTTP Path is the path relative to the host defined in the service. Multiple paths can be entered.
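
To illustrate how a request path could be checked against the HTTP Path values of a policy, here is a small sketch. It assumes that policy paths may contain `*` wildcards and uses Python's `fnmatch` for the comparison; the actual matching rules of the BioConnect custom service are not described here and may differ. The `path_matches` helper and the example paths are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def path_matches(request_path: str, policy_paths: Iterable[str]) -> bool:
    """Return True if request_path matches any policy path.

    '*' matches any run of characters, including '/'. Note that fnmatch
    also gives '?' and '[...]' a special meaning.
    """
    return any(fnmatchcase(request_path, pattern) for pattern in policy_paths)


# Hypothetical HTTP Path entries of one policy.
policy_paths = ["/data_package/package/*", "/health/"]

print(path_matches("/data_package/package/519/generate/", policy_paths))  # True
print(path_matches("/snp/", policy_paths))  # False
```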

New Policy in the List

The new policy now appears in the list alongside the existing policies.

Users/Groups/Roles

To access the "Users/Groups/Roles" page, click the "Settings" menu at the top and select "Users/Groups/Roles" in the drop-down. Users, groups, and roles, together with the policy allow/deny sections, provide granular data authorization.

Audit Trail

When a user accesses an application protected by Ranger, the application creates an audit record and sends it to Ranger. Ranger records the complete audit trail for data access. The audit data is stored in Elasticsearch for easy search and visualization.
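
Because the audit trail lives in Elasticsearch, it can be searched with the standard query DSL. The sketch below only builds a query body for recent access by one user. The field names `reqUser` and `evtTime` are assumptions based on Ranger's usual audit schema; the audit documents written by the BioConnect services may use different names, so check the index mapping first.

```python
import json
from datetime import datetime, timedelta, timezone


def build_audit_query(user: str, hours: int = 24, size: int = 50) -> dict:
    """Build an Elasticsearch query body for one user's recent audit records."""
    since = datetime.now(timezone.utc) - timedelta(hours=hours)
    return {
        "size": size,
        # Newest events first; 'evtTime' is an assumed field name.
        "sort": [{"evtTime": {"order": "desc"}}],
        "query": {
            "bool": {
                "filter": [
                    # 'reqUser' is an assumed field name.
                    {"term": {"reqUser": user}},
                    {"range": {"evtTime": {"gte": since.isoformat()}}},
                ]
            }
        },
    }


query = build_audit_query("someone@example.org", hours=1, size=5)
print(json.dumps(query, indent=2))
```

The resulting body would be sent to the `_search` endpoint of the audit index (`ranger_audits` in the environment settings at the end of this page) on the Elasticsearch host.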

BioConnect-Service

BioConnect-Service is a custom service for API endpoint protection. It controls whether a user can access an API endpoint, and with which HTTP method.

BioConnect-GCS

BioConnect-GCS is a custom service for Google Cloud Storage (GCS) data authorization. It defines user access to buckets, folders, and files.
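
A permission on a folder is most useful when it also covers the objects beneath it. How the BioConnect-GCS service actually matches object paths is not described here; the sketch below shows one plausible rule, in which a policy path covers itself and everything under it. The `covers` helper is hypothetical. Comparing whole path segments avoids the pitfall of plain string prefixes, where `test-data` would wrongly cover `test-database`.

```python
from pathlib import PurePosixPath


def covers(policy_path: str, object_path: str) -> bool:
    """Return True if object_path is policy_path itself or lies beneath it."""
    # Anchor both paths at a common root so they compare segment by segment.
    policy = PurePosixPath("/", policy_path)
    target = PurePosixPath("/", object_path)
    return policy == target or policy in target.parents


print(covers("test-data", "test-data/run1/sample.csv"))  # True
print(covers("test-data", "test-database/sample.csv"))   # False
```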

Authorization in your code

Library installation

The BioConnect authorization library is available from TestPyPI, the test instance of the Python Package Index.

https://test.pypi.org/project/bioconnect-lib

Installation can be done in the usual ways: add the repository to your Poetry pyproject.toml file, or to requirements.txt for pip. To install with pip from the command line:

pip install -i https://test.pypi.org/simple/ bioconnect-lib
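
After installing, it can help to confirm which version ended up in the active environment. This is a small sketch using only the standard library (Python 3.8+); the `installed_version` helper is hypothetical, and the distribution name is taken from the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string, or None if the package is not installed.
print(installed_version("bioconnect-lib"))
```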

There are two BioConnect custom services, as described in the previous section:

  • BioConnect-Service
  • BioConnect-GCS

BioConnect-Service

The BioConnect-Service service can be used in any Python application or application framework, such as Django or Flask.

Set up Ranger Policy

In the Ranger Admin Portal, create a service and a policy with the host name and HTTP path. The service name must match the service_name used in the Python code.

Python application

from bioconnect_lib.ranger.authorizer_api import ranger_is_access_allowed_api
from bioconnect_lib.ranger.request_api import AuthorizerRequestAPI

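import logging
# configure logging so the INFO-level response below is actually emitted
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)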

def main():
    auth_request = AuthorizerRequestAPI(
        user_id = 'Hongping.Liang@jax.org',
        service_name = 'bioconnect-api-dev',
        action = 'get',
        host = 'http://bioconnect-api-sqa.azurewebsites.net',
        path = '/data_package/package/519/generate/',
    )

    try:
        response = ranger_is_access_allowed_api(auth_request)
        logger.info(f'response: {response}')
    except Exception as e:
        logger.error("Failed to authorize", exc_info=True)

if __name__ == "__main__":    
    main()

# sample response:
{
   "request":<bioconnect_lib.ranger.request_api.AuthorizerRequestAPI object at 0x0000020A9BC9A370>,
   "policies":[
    ...
   ],
   "matched_policy":{
     ...
   },
   "matched_policy_item":"None",
   "is_allowed":true,
   "audit_doc":{
     ...
   },
   "message":"path: \"/data_package/package/519/generate/\" is not configured in policy: \"data-package-auth \"",
   "agent":"bioconnect-test"
}
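
The sample response prints like a dictionary, while the framework examples on this page read `response.is_allowed` as an attribute. The exact return type of the library is not stated here, so the following sketch reads fields defensively from either shape. The `read_field` and `summarize` helpers are hypothetical. Logging the message next to the verdict is worthwhile: in the sample above, `is_allowed` is true even though the message reports that the path is not configured in the policy.

```python
from types import SimpleNamespace
from typing import Any


def read_field(response: Any, name: str, default: Any = None) -> Any:
    """Read a field from a dict-like or attribute-style response."""
    if response is None:
        return default
    if isinstance(response, dict):
        return response.get(name, default)
    return getattr(response, name, default)


def summarize(response: Any) -> str:
    """Return a one-line verdict; a missing response counts as denied."""
    allowed = bool(read_field(response, "is_allowed", False))
    message = read_field(response, "message", "")
    verdict = "ALLOWED" if allowed else "DENIED"
    return f"{verdict}: {message}" if message else verdict


as_dict = {"is_allowed": True, "message": "ok"}
as_object = SimpleNamespace(is_allowed=False, message="")
print(summarize(as_dict))    # ALLOWED: ok
print(summarize(as_object))  # DENIED
print(summarize(None))       # DENIED
```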

Django application

Create a permission class that extends rest_framework.permissions.BasePermission:

from rest_framework import permissions
from bioconnect_lib.ranger.auth import auth_get_user_email
from bioconnect_lib.ranger.authorizer_api import ranger_is_access_allowed_api
from bioconnect_lib.ranger.request_api import AuthorizerRequestAPI
import logging
logger = logging.getLogger(__name__)

class BioConnectPermisison(permissions.BasePermission):
    RANGER_SERVICE_NAME = 'bioconnect-api-dev'

    def has_permission(self, request, view):
        if not request.user.is_authenticated:
            return False

        # get user email from access token
        user_email = auth_get_user_email(request)
        if user_email is None:
            return False

        auth_request = AuthorizerRequestAPI(
            user_id = user_email,
            service_name = BioConnectPermisison.RANGER_SERVICE_NAME,
            action = request.method,
            # build "scheme://host" from public request attributes instead of a private one
            host = f'{request.scheme}://{request.get_host()}',
            path = request.path,
        )

        try:
            response = ranger_is_access_allowed_api(auth_request)
            logger.debug(f'response: {response}')
            return response.is_allowed if response is not None else False
        except Exception as e:
            logger.error("Failed to authorize", exc_info=True)
            return False

    def has_object_permission(self, request, view, obj):
        return True

In the view that needs to be protected, add the import and change permission_classes to use "BioConnectPermisison":

from rest_framework import viewsets
from bioconnect.bioconnect_permission import BioConnectPermisison

class PackageViewSet(viewsets.ModelViewSet):
    # permission_classes = [permissions.IsAuthenticated]
    permission_classes = [BioConnectPermisison]
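
One detail to watch: the plain Python example passes `action = 'get'` in lowercase, whereas `request.method` in Django is uppercase (`'GET'`). Whether the library normalizes case is not stated here, so a defensive option is to normalize before building the request. This is a sketch only: `normalize_action` is a hypothetical helper, and the set of known actions is an assumption extended from the get, post, and delete permissions mentioned earlier.

```python
# Assumed set of actions; only get, post, and delete are named in this document.
KNOWN_ACTIONS = {"get", "post", "put", "patch", "delete"}


def normalize_action(http_method: str) -> str:
    """Lowercase an HTTP method and reject anything outside the known set."""
    action = http_method.strip().lower()
    if action not in KNOWN_ACTIONS:
        raise ValueError(f"unsupported action: {http_method!r}")
    return action


print(normalize_action("GET"))  # get
```

In the permission class this would be used as `action = normalize_action(request.method)`.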

Flask application

Create a decorator function that applies BioConnect authorization:

from functools import wraps
import logging

from flask import request
from flask_restplus import abort
from bioconnect_lib.ranger.auth import auth_get_user_email
from bioconnect_lib.ranger.authorizer_api import ranger_is_access_allowed_api
from bioconnect_lib.ranger.request_api import AuthorizerRequestAPI

logger = logging.getLogger(__name__)

def bioconnect_auth_decorator(f):
    @wraps(f)
    def decorator(*args, **kwargs):
        # get user email from access token
        user_email = auth_get_user_email(request)
        logger.info(f'user_email: {user_email}')
        if user_email is None:
            # no identifiable caller: reject instead of running the view unauthorized
            abort(401, 'unable to determine the calling user')

        auth_request = AuthorizerRequestAPI(
            user_id = user_email,
            service_name = 'snp-grid-api-dev',
            action = request.method,
            host = request.host_url,
            path = request.path,
        )
        response = ranger_is_access_allowed_api(auth_request)
        logger.debug(f'response: {response}')
        if response is None or not response.is_allowed:
            message = f'user: "{user_email}" does not have permission to access "{auth_request.host}{auth_request.path}"'
            # 403 Forbidden: the caller is known but lacks permission
            abort(403, message)
        return f(*args, **kwargs)

    return decorator

Add the decorator to the methods that need to be protected:

from flask_restplus import reqparse
from src.auth import bioconnect_auth_decorator

@NS.route('/')
class SNPList(SNPBase):
    parser = reqparse.RequestParser()
    parser.add_argument('chromosome', type=str, help='eg: 1,2,3,4,....19,X,Y')

    @NS.expect(parser)
    @bioconnect_auth_decorator 
    def get(self):
        parameter = self.parse_out_parameter()
        result = self.call_service(parameter)
        return result
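
The decorator above hard-codes the Ranger service name. If several APIs share the same code, a decorator factory that takes the service name as a parameter may be easier to reuse. The sketch below is framework-free so it runs without Flask or a Ranger server: `require_access`, `get_user`, and `is_allowed` are hypothetical names, and the two callables stand in for `auth_get_user_email` and a wrapper around `ranger_is_access_allowed_api`. In a Flask application the `PermissionError` would be replaced by `abort`.

```python
from functools import wraps
from typing import Callable, Optional


def require_access(service_name: str,
                   get_user: Callable[[], Optional[str]],
                   is_allowed: Callable[[str, str], bool]):
    """Build a decorator that checks access for the given service name."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            user = get_user()
            # Reject unknown callers and callers without permission.
            if user is None or not is_allowed(service_name, user):
                raise PermissionError(f'user "{user}" may not call {func.__name__}')
            return func(*args, **kwargs)
        return wrapper
    return decorator


# Stub callables standing in for the token lookup and the Ranger check.
def current_user() -> Optional[str]:
    return "alice@example.org"


def stub_check(service_name: str, user: str) -> bool:
    return service_name == "snp-grid-api-dev" and user == "alice@example.org"


@require_access("snp-grid-api-dev", current_user, stub_check)
def list_snps():
    return ["rs1", "rs2"]


print(list_snps())  # ['rs1', 'rs2']
```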

BioConnect-GCS

The BioConnect-GCS service can be used in any Python application or application framework, such as Django.

Set up Ranger Policy

In the Ranger Admin Portal, create a service and a policy with the GCP project name and bucket information. The service name must match the service_name used in the Python code.

Python application

from bioconnect_lib.ranger.authorizer_gcs import ranger_is_access_allowed_gcs
from bioconnect_lib.ranger.request_gcs import AuthorizerRequestGCS

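import logging
# configure logging so the INFO-level response below is actually emitted
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)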

def main():
    auth_request = AuthorizerRequestGCS(
        user_id = 'hongping.liang@jax.org',
        service_name = 'gcs',
        action = 'write',
        project = 'jax-cube-prd-ctrl-01',
        bucket = 'jax-cube-prd-ctrl-01-project-test',
        object_path = 'test-data'
    )
    try:
        response = ranger_is_access_allowed_gcs(auth_request)

        logger.info(f'response: {response}')
    except Exception as e:
        logger.error("Failed to authorize", exc_info=True)

if __name__ == "__main__":    
    main()

Django application

A Django application can be configured to use the BioConnect-GCS plugin for bucket data authorization. The configuration lives in the Django view class.

  • Step 1: define a generic mixin base class.
  • Step 2: list the mixin as the first base class of any view that needs Ranger authorization.

# imports
from rest_framework import viewsets
from django.contrib.auth.decorators import user_passes_test
from django.utils.decorators import method_decorator
from bioconnect_lib.ranger.authorizer_gcs import ranger_is_access_allowed_gcs
from bioconnect_lib.ranger.request_gcs import AuthorizerRequestGCS

# Step 1:
# create a base class that applies Ranger authorization.
# Note: user_passes_test calls its test function with request.user, so this
# assumes ranger_is_access_allowed_gcs can accept the Django user object.
class RangerGCSMixin:
    @method_decorator(user_passes_test(ranger_is_access_allowed_gcs))
    def dispatch(self, *args, **kwargs):
        return super().dispatch(*args, **kwargs)

# Step 2:
# for any view class, list "RangerGCSMixin" as the first base class
# to enable Ranger authorization
# for example:

class PackageViewSet(RangerGCSMixin, viewsets.ModelViewSet):
   """
   the rest will be the same as usual
   """

Environment Variables

The following environment variables must be set:

# ranger
RANGER_URL=http://35.190.178.188:6080/
RANGER_USER=xxxxxx
RANGER_PASSWORD=xxxxxx
ELASTIC_AUDIT_INDEX_NAME=ranger_audits
RANGER_APPLICATION=bioconnect-test

# elasticsearch
ELASTIC_SEARCH_URL=http://35.231.253.113:9200
ELASTIC_SEARCH_USER=xxxxxx
ELASTIC_SEARCH_PASSWORD=xxxxxx
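
A missing variable is easier to diagnose at startup than in the middle of an authorization call. The following sketch checks the variables listed above before the application starts serving requests. The `load_settings` helper is hypothetical and not part of the BioConnect library; how the library itself reads these variables is not described here.

```python
import os
from typing import Dict, Mapping

# Names taken from the listing above.
REQUIRED_VARS = [
    "RANGER_URL",
    "RANGER_USER",
    "RANGER_PASSWORD",
    "ELASTIC_AUDIT_INDEX_NAME",
    "RANGER_APPLICATION",
    "ELASTIC_SEARCH_URL",
    "ELASTIC_SEARCH_USER",
    "ELASTIC_SEARCH_PASSWORD",
]


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any are missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Example with an explicit mapping instead of the real environment.
example_env = {name: "placeholder" for name in REQUIRED_VARS}
settings = load_settings(example_env)
print(sorted(settings))
```

Calling `load_settings()` with no argument checks the real process environment.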