Authorization Service
The BioConnect Authorization Service uses Apache Ranger for authorization. Apache Ranger is a framework to enable, monitor, and manage comprehensive data security.
Ranger Admin Portal
The Ranger admin portal is a centralized place to manage services, policies, users, groups, and roles.
Apache Ranger provides dozens of built-in services for data security for many Hadoop applications. In addition to the built-in services, we built two custom services for BioConnect:
- The BioConnect-Service is a custom service configured to protect API endpoints. Users can be given different permissions to call get, post, and delete on those endpoints.
- The BioConnect-GCS is a custom service configured to protect data in Google Cloud Storage (GCS). Users can be given read and write permissions for different projects, buckets, subfolders, and files within a bucket.
Ranger Service
Apache Ranger can provide authorization for many services. The screenshots show the built-in services and the two BioConnect custom services.
All the services are managed using the Service Manager page, and a service can be added, deleted or edited. Export and import of the service is an easy way to back up and restore the service policy.
The host name is defined in the service and the URL relative path is set up in the policies.
Add a New Service
On the Ranger "Service Manager" page, click on the plus sign next to the service name to open a new window to create a new service. For example, to create new "BIOCONNECT-SERVICE", click on the plus sign pointed to by the red arrow.
Enter Service Detail
On the "Create Service" window, enter the service name in the service name text field. In the "A list of service urls" text field, enter the host URLs to protect, separated by commas. These URLs identify hosts only; within a service, multiple policies can be set up to protect individual API paths.
Ranger Policy
Each service can define many policies. Each policy defines the resources to be protected and the user actions that can be performed on those resources. Users can be specified individually, by group, or by role. Each policy has an allow section and a deny section, and each section has exclude conditions. Each section can have multiple policy items to define user actions.
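The policy structure described above can be pictured with a small, self-contained sketch. It is not Ranger's implementation: the class names, the exact-match resource check, and the return values are all hypothetical, and it assumes that deny items are checked before allow items and that an exclude item cancels a match in its own section. Ranger's real evaluation involves further details that this sketch leaves out.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyItem:
    """One policy item: the users, groups, or roles it names and the actions it covers."""
    actions: set
    users: set = field(default_factory=set)
    groups: set = field(default_factory=set)
    roles: set = field(default_factory=set)

    def matches(self, user, groups, roles, action):
        named = user in self.users or bool(self.groups & groups) or bool(self.roles & roles)
        return named and action in self.actions


@dataclass
class Policy:
    """A policy with allow and deny sections, each with its own exclude items."""
    resources: set
    allow: list = field(default_factory=list)
    allow_exclude: list = field(default_factory=list)
    deny: list = field(default_factory=list)
    deny_exclude: list = field(default_factory=list)


def evaluate(policy, user, groups, roles, action, resource):
    """Return 'allowed', 'denied', or 'not_determined' for one request."""
    if resource not in policy.resources:
        return "not_determined"

    def hit(items):
        return any(item.matches(user, groups, roles, action) for item in items)

    # Assumption: the deny section is consulted first, then the allow section.
    if hit(policy.deny) and not hit(policy.deny_exclude):
        return "denied"
    if hit(policy.allow) and not hit(policy.allow_exclude):
        return "allowed"
    return "not_determined"


# Hypothetical policy: curators may get and post, one user is excluded from post,
# and nobody in the group may delete.
policy = Policy(
    resources={"/data_package/"},
    allow=[PolicyItem(actions={"get", "post"}, groups={"curators"})],
    allow_exclude=[PolicyItem(actions={"post"}, users={"intern@example.org"})],
    deny=[PolicyItem(actions={"delete"}, groups={"curators"})],
)
print(evaluate(policy, "a@example.org", {"curators"}, set(), "get", "/data_package/"))
```

Keeping the exclude items separate from the allow and deny items mirrors the way the policy form is laid out, so one exception can be added without rewriting the broader rule.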
Add a New Policy
On the Ranger "Service Manager" page, click on the service name, not on the edit icon.
List of the Policies
Click on the "Add New Policy" button.
Enter Policy Detail
On the "Create Policy" window, fill in the required fields: the policy name and the HTTP Path. The HTTP Path is relative to the host defined in the service. Multiple paths can be entered.
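The split between host URLs on the service and relative paths on the policy can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the example hosts and paths are made up, and the use of shell-style wildcards for path matching is an assumption rather than a description of how Ranger matches paths.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def parse_host_list(text):
    """Split a comma-separated list of host URLs into clean entries."""
    return [host.strip().rstrip("/") for host in text.split(",") if host.strip()]


def is_covered(url, service_hosts, policy_paths):
    """True when the URL's host belongs to the service and its path matches a policy path."""
    parts = urlsplit(url)
    host = f"{parts.scheme}://{parts.netloc}"
    if host not in service_hosts:
        return False
    # Assumption: policy paths may use shell-style wildcards such as "*".
    return any(fnmatchcase(parts.path, pattern) for pattern in policy_paths)


# Hypothetical service hosts (as typed into the service form) and policy paths.
hosts = parse_host_list("http://localhost:8000, https://api.example.org/")
paths = ["/data_package/*", "/health/"]

print(is_covered("https://api.example.org/data_package/package/519/generate/", hosts, paths))
```

Because the host lives on the service and the path on the policy, the same set of paths can be reused when the service is deployed under several host names.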
New Policy in the List
Now the new policy is in the list along with other already existing policies.
Users/Groups/Roles
To access the "Users/Groups/Roles" page, click the "Settings" menu at the top and select "Users/Groups/Roles" in the drop-down. Users, groups, and roles, together with the policy allow/deny sections, provide granular data authorization.
Audit Trail
When a user accesses an application protected by Ranger, an audit record is created in the application and sent to Ranger. Ranger records the complete audit trail for data access. The audit data is stored in Elasticsearch for easy search and visualization.
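As a rough picture of what such an audit record might carry, the sketch below assembles one as a plain dictionary and serializes it to JSON. Every field name here is hypothetical and chosen only for illustration; the actual record produced by the BioConnect library and stored by Ranger may look different, and the sketch does not send anything to Ranger or Elasticsearch.

```python
import json
from datetime import datetime, timezone


def build_audit_record(user, service, action, resource, allowed, agent):
    """Assemble one audit entry as a plain dict; all field names are illustrative."""
    return {
        "event_time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "service": service,
        "action": action,
        "resource": resource,
        "result": "allowed" if allowed else "denied",
        "agent": agent,
    }


record = build_audit_record(
    user="user@example.org",
    service="bioconnect-api-dev",
    action="get",
    resource="/data_package/package/519/generate/",
    allowed=True,
    agent="bioconnect-test",
)
payload = json.dumps(record)  # JSON text that could be indexed for search
```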
BioConnect-Service
The BioConnect-Service is a custom service for API endpoint protection. It controls whether a user can access an API endpoint and with which HTTP method.
BioConnect-GCS
BioConnect-GCS is a custom service for Google Cloud Storage (GCS) data authorization. It defines user access to GCS buckets, folders, and files.
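The project, bucket, folder, and file hierarchy can be sketched as prefix matching on the object path. This is a minimal illustration, not the BioConnect-GCS implementation: the `GcsGrant` class, both helper functions, and the example project and bucket names are hypothetical, and matching folders by path prefix is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GcsGrant:
    """A hypothetical grant: actions permitted under one project/bucket/prefix."""
    project: str
    bucket: str
    prefix: str          # "" covers the whole bucket
    actions: frozenset


def covers(grant, project, bucket, object_path):
    """True when the object lies in the grant's bucket and under its prefix."""
    if (grant.project, grant.bucket) != (project, bucket):
        return False
    if grant.prefix == "":
        return True
    prefix = grant.prefix.rstrip("/")
    # Match the folder itself or anything below it, but not "test-data-old".
    return object_path == prefix or object_path.startswith(prefix + "/")


def is_allowed(grants, action, project, bucket, object_path):
    """True when any grant permits the action on the object."""
    return any(
        action in grant.actions and covers(grant, project, bucket, object_path)
        for grant in grants
    )


grants = [
    GcsGrant("demo-project", "demo-bucket", "test-data", frozenset({"read", "write"})),
    GcsGrant("demo-project", "demo-bucket", "", frozenset({"read"})),
]
print(is_allowed(grants, "write", "demo-project", "demo-bucket", "test-data/run1/file.csv"))
```

Comparing against the prefix plus a trailing slash keeps a grant on one folder from leaking to a sibling folder whose name merely starts with the same characters.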
Authorization in your code
Library installation
The BioConnect authorization library is available from TestPyPI, the test instance of the Python Package Index:
https://test.pypi.org/project/bioconnect-lib
Installation can be done in the usual ways: add the repository and package to your Poetry pyproject.toml file, or to requirements.txt for pip. To install with pip from the command line:
pip install -i https://test.pypi.org/simple/ bioconnect-lib
There are two BioConnect custom services, as described in the previous section:
- BioConnect-Service
- BioConnect-GCS
BioConnect-Service
The BioConnect-Service can be used in any Python application or application framework, such as Django or Flask.
Set up Ranger Policy
In the Ranger Admin Portal, create a service and policy with the host name and HTTP path. The service name must match the service_name used in the Python code.
Python application
from bioconnect_lib.ranger.authorizer_api import ranger_is_access_allowed_api
from bioconnect_lib.ranger.request_api import AuthorizerRequestAPI
import logging

logger = logging.getLogger(__name__)

def main():
    # Describe the request to authorize: user, Ranger service, action, host and path
    auth_request = AuthorizerRequestAPI(
        user_id = 'Hongping.Liang@jax.org',
        service_name = 'bioconnect-api-dev',
        action = 'get',
        host = 'http://bioconnect-api-sqa.azurewebsites.net',
        path = '/data_package/package/519/generate/',
    )
    try:
        response = ranger_is_access_allowed_api(auth_request)
        logger.info(f'response: {response}')
    except Exception:
        logger.error("Failed to authorize", exc_info=True)

if __name__ == "__main__":
    main()
# sample response:
{
    "request":<bioconnect_lib.ranger.request_api.AuthorizerRequestAPI object at 0x0000020A9BC9A370>,
    "policies":[
        ...
    ],
    "matched_policy":{
        ...
    },
    "matched_policy_item":"None",
    "is_allowed":true,
    "audit_doc":{
        ...
    },
    "message":"path: \"/data_package/package/519/generate/\" is not configured in policy: \"data-package-auth \"",
    "agent":"bioconnect-test"
}
Django application
Create a permission class which extends rest_framework.permissions.BasePermission:
from rest_framework import permissions
from bioconnect_lib.ranger.auth import auth_get_user_email
from bioconnect_lib.ranger.authorizer_api import ranger_is_access_allowed_api
from bioconnect_lib.ranger.request_api import AuthorizerRequestAPI
import logging

logger = logging.getLogger(__name__)

class BioConnectPermisison(permissions.BasePermission):
    RANGER_SERVICE_NAME = 'bioconnect-api-dev'

    def has_permission(self, request, view):
        if not request.user.is_authenticated:
            return False
        # get user email from access token
        user_email = auth_get_user_email(request)
        if user_email is None:
            return False
        auth_request = AuthorizerRequestAPI(
            user_id = user_email,
            service_name = BioConnectPermisison.RANGER_SERVICE_NAME,
            action = request.method,
            host = request._request._current_scheme_host,
            path = request.path,
        )
        try:
            response = ranger_is_access_allowed_api(auth_request)
            logger.debug(f'response: {response}')
            # a missing response is treated as a denial
            return response.is_allowed if response is not None else False
        except Exception:
            logger.error("Failed to authorize", exc_info=True)
            return False

    def has_object_permission(self, request, view, obj):
        return True
Add the permission class to the views which need to be protected:
from rest_framework import viewsets
from bioconnect.bioconnect_permission import BioConnectPermisison

class PackageViewSet(viewsets.ModelViewSet):
    # permission_classes = [permissions.IsAuthenticated]
    permission_classes = [BioConnectPermisison]
Flask application
Create a decorator to use BioConnect authorization:
import logging
from functools import wraps
from flask_restplus import abort
from flask import request
from bioconnect_lib.ranger.auth import auth_get_user_email
from bioconnect_lib.ranger.authorizer_api import ranger_is_access_allowed_api
from bioconnect_lib.ranger.request_api import AuthorizerRequestAPI

logger = logging.getLogger(__name__)

def bioconnect_auth_decorator(f):
    @wraps(f)
    def decorator(*args, **kwargs):
        user_email = auth_get_user_email(request)
        logger.info(f'user_email: {user_email}')
        if user_email is None:
            # no user email on the request: the call is passed through without a Ranger check
            return f(*args, **kwargs)
        auth_request = AuthorizerRequestAPI(
            user_id = user_email,
            service_name = 'snp-grid-api-dev',
            action = request.method,
            host = request.host_url,
            path = request.path,
        )
        response = ranger_is_access_allowed_api(auth_request)
        logger.debug(f'response: {response}')
        if response is None or not response.is_allowed:
            message = f'user: "{user_email}" does not have permission to access "{auth_request.host}{auth_request.path}"'
            # 403 Forbidden: the user is known but lacks permission
            abort(403, message)
        return f(*args, **kwargs)
    return decorator
Add the decorator to the methods which need to be protected:
from src.auth import bioconnect_auth_decorator

# NS, SNPBase, and reqparse are defined or imported elsewhere in the application
@NS.route('/')
class SNPList(SNPBase):
    parser = reqparse.RequestParser()
    parser.add_argument('chromosome', type=str, help='eg: 1,2,3,4,....19,X,Y')

    @NS.expect(parser)
    @bioconnect_auth_decorator
    def get(self):
        parameter = self.parse_out_parameter()
        result = self.call_service(parameter)
        return result
BioConnect-GCS
The BioConnect-GCS service can be used in any Python application or application framework, such as Django.
Set up Ranger Policy
In the Ranger Admin Portal, create a service and policy with the GCP project name and bucket information. The service name must match the service_name used in the Python code.
Python application
from bioconnect_lib.ranger.authorizer_gcs import ranger_is_access_allowed_gcs
from bioconnect_lib.ranger.request_gcs import AuthorizerRequestGCS
import logging

logger = logging.getLogger(__name__)

def main():
    # Describe the request to authorize: user, Ranger service, action, and GCS location
    auth_request = AuthorizerRequestGCS(
        user_id = 'hongping.liang@jax.org',
        service_name = 'gcs',
        action = 'write',
        project = 'jax-cube-prd-ctrl-01',
        bucket = 'jax-cube-prd-ctrl-01-project-test',
        object_path = 'test-data'
    )
    try:
        response = ranger_is_access_allowed_gcs(auth_request)
        logger.info(f'response: {response}')
    except Exception:
        logger.error("Failed to authorize", exc_info=True)

if __name__ == "__main__":
    main()
Django application
A Django application can be configured to use the BioConnect-GCS plugin for bucket data authorization. The configuration is done in the Django view class.
- Step 1: define a generic mixin base class
- Step 2: add the mixin as the first base class of any view which needs to use the Ranger authorization
# import
from django.contrib.auth.decorators import user_passes_test
from django.utils.decorators import method_decorator
from rest_framework import viewsets
from bioconnect_lib.ranger.authorizer_gcs import ranger_is_access_allowed_gcs
from bioconnect_lib.ranger.request_gcs import AuthorizerRequestGCS

# Step 1:
# create a base class to use Ranger authorization
class RangerGCSMixin(object):
    # note: user_passes_test calls the given function with request.user
    @method_decorator(user_passes_test(ranger_is_access_allowed_gcs))
    def dispatch(self, *args, **kwargs):
        return super().dispatch(*args, **kwargs)

# Step 2:
# for any view class, list "RangerGCSMixin" as the first base class
# to enable the Ranger authorization
# for example:
class PackageViewSet(RangerGCSMixin, viewsets.ModelViewSet):
    """
    the rest will be the same as usual
    """
Environment Variables
The following environment variables need to be set:
# ranger
RANGER_URL=http://35.190.178.188:6080/
RANGER_USER=xxxxxx
RANGER_PASSWORD=xxxxxx
ELASTIC_AUDIT_INDEX_NAME=ranger_audits
RANGER_APPLICATION=bioconnect-test
# elasticsearch
ELASTIC_SEARCH_URL=http://35.231.253.113:9200
ELASTIC_SEARCH_USER=xxxxxx
ELASTIC_SEARCH_PASSWORD=xxxxxx
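An application can check at startup that all of these variables are present before it makes any authorization call. The sketch below is one way to do that with the standard library only; `REQUIRED_VARS` and `load_settings` are hypothetical names, not part of the BioConnect library, and the library may read these variables in its own way.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = (
    "RANGER_URL",
    "RANGER_USER",
    "RANGER_PASSWORD",
    "ELASTIC_AUDIT_INDEX_NAME",
    "RANGER_APPLICATION",
    "ELASTIC_SEARCH_URL",
    "ELASTIC_SEARCH_USER",
    "ELASTIC_SEARCH_PASSWORD",
)


def load_settings(environ=None):
    """Return the required settings as a dict, or raise naming every missing variable."""
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` once when the application starts reports every missing variable in a single error, rather than failing later on the first authorization request.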