Table of contents

Glossary / abbreviations / acronyms

Introduction

Deploying enterprise-level analytics

Common barriers

Best practices

Applying DevOps principles

Analytics, apps and the cloud

Lessons learned in the U.S. federal government

Use case: Major banking institution supporting the U.S. Treasury's business operations

Use case: Intelligence agency

Use case: U.S. Army TAMIS

Conclusion

Glossary / abbreviations / acronyms

CI / CD: Continuous integration / continuous deployment

CT: Continuous training

DevOps: Development operations

AI / ML: Artificial intelligence / machine learning

RPA: Robotic process automation

C2S: Commercial cloud services

ICGC: Intelligence Community government cloud

CSP: Cloud service provider

HPC: High-performance computing

HTC: High-throughput computing

Introduction

High-performance computing and advanced analytics are integral to public sector organizations' missions, particularly for key initiatives that will enable a more streamlined and effective government. As public sector organizations modernize to meet these initiatives, they must find new frameworks that harness innovative technologies in support of their objectives. Realizing the full potential of any technology used in advanced analytics requires an operational framework that supports the analytics life cycle, from data pipeline construction and model development through broad-scale adoption. This operational pipeline is uniquely complicated compared to the traditional DevOps pipeline used for software development because of important differences in artificial intelligence (AI) / machine learning (ML)-based projects. This white paper explores how to apply DevOps practices tailored to the unique complexities of ML projects; we refer to this framework as MLOps and discuss how it tackles the challenge on cloud computing platforms. The paper is specifically aligned to the unique problems of deploying government AI / ML initiatives.

Deploying enterprise-level analytics

Common barriers

Developments in AI and other advanced forms of analytics are increasing the value that data provides to the mission. To fully realize that value, models and insights must be shared in a timely manner across the organization's operational environment where they are integrated with security controls as well as other enterprise applications and services. However, organizational dynamics can create structural silos that are reflected in the architecture of analytic environments. The resulting disparate environments make it difficult to deploy consistent analytic models quickly from environment to environment. For these organizations, some of the unique barriers to unlocking the value of their data and analytics include:

Disparate environments

  • Engineers may be required to recode algorithms or reconfigure the target environment with each deployment. This increases the chance of errors and delays due to issues such as configuration drift and change management process overhead

  • Multiple environments increase the complexity of defining and maintaining role-based security for analytics solutions
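One common mitigation for configuration drift is to fingerprint each environment's configuration and compare it against an approved baseline before deploying. The sketch below is illustrative only and is not from the white paper; the configuration keys and values are hypothetical, and it assumes environment settings can be captured as a flat dictionary:

```python
import hashlib
import json

def config_fingerprint(config: dict) -> str:
    """Hash a canonicalized configuration so two environments can be compared."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def detect_drift(expected: dict, actual: dict) -> list:
    """Return the configuration keys whose values differ between environments."""
    keys = set(expected) | set(actual)
    return sorted(k for k in keys if expected.get(k) != actual.get(k))

# Hypothetical baseline vs. target-environment settings
baseline = {"python": "3.10", "numpy": "1.26", "model_runtime": "onnx"}
target = {"python": "3.10", "numpy": "1.24", "model_runtime": "onnx"}

if config_fingerprint(baseline) != config_fingerprint(target):
    print("configuration drift detected in:", detect_drift(baseline, target))
```

Running such a check in the deployment pipeline turns silent drift into an explicit, reviewable failure rather than a post-deployment surprise.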

Models and analytics take too long to deploy

  • Ad-hoc deployments are labor-intensive and prone to complications

  • Infrastructure is not equipped to incorporate, or efficiently scale to process, the data sources analysts and data scientists require to improve their products

Incomplete analytics governance

  • Analytic data is not consistently managed, tested or mapped to standardized business processes, leading to non-repeatable or confusing results

  • Model management lacks version control, recovery and performance monitoring for consistency and continuous improvement

  • There are no standardized processes to address security and regulatory requirements
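A minimal sketch of what model version control can look like: the hypothetical `ModelRegistry` below is not a real library and is far simpler than production registries, but it shows the three governance ingredients named above on a small scale: version history, a content hash for integrity checks and recovery, and recorded metrics for performance monitoring.

```python
import hashlib
import pickle
from datetime import datetime, timezone

class ModelRegistry:
    """Toy in-memory registry: versioned models with hashes and metrics."""

    def __init__(self):
        self._store = {}  # model name -> ordered list of version entries

    def register(self, name, model, metrics=None):
        """Store a new version; the hash supports integrity checks and recovery."""
        payload = pickle.dumps(model)
        versions = self._store.setdefault(name, [])
        versions.append({
            "version": len(versions) + 1,
            "sha256": hashlib.sha256(payload).hexdigest(),
            "payload": payload,
            "metrics": metrics or {},
            "registered_at": datetime.now(timezone.utc).isoformat(),
        })
        return versions[-1]["version"]

    def load(self, name, version=None):
        """Load the latest version by default, or roll back to a given one."""
        versions = self._store[name]
        entry = versions[-1] if version is None else versions[version - 1]
        return pickle.loads(entry["payload"])

registry = ModelRegistry()
registry.register("demand_model", {"weights": [0.1, 0.4]}, metrics={"rmse": 2.1})
registry.register("demand_model", {"weights": [0.2, 0.3]}, metrics={"rmse": 1.8})
print(registry.load("demand_model"))             # latest version
print(registry.load("demand_model", version=1))  # rollback to version 1
```

Keeping every version addressable by number makes rollback a first-class operation instead of an emergency archaeology exercise.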

Lack of software development expertise in analytic development

  • Many AI / ML development teams are staffed by research-oriented individuals with very little production software development experience. As a result, models are optimized for performance against summary statistics rather than for production use

  • Very little structure may exist in a production AI / ML project. AI projects are often treated as research initiatives, since much about the best delivery method remains unknown until the hypothesis is tested

Data changes over time

  • Deployed models are built and trained on a particular data set. As the data changes, the results and effectiveness of that model change with it: as rows are added or columns edited, the output of the deployed model can differ substantially from what was originally expected

  • As new data becomes available, the model is not always retrained. Public sector organizations must be alert to this problem: when new data arrives from an unprecedented event, the model must take it into account. For example, if no training data is associated with the type of weather event being predicted, the results will most likely be wrong and should not be relied upon
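As an illustrative sketch, a pipeline can guard against this by comparing incoming batches against statistics recorded from the training set and flagging the model for retraining when they diverge. The z-score check below is a deliberately simple stand-in for production drift detection, and the feature values and threshold are hypothetical:

```python
from statistics import mean, stdev

def needs_retraining(baseline, incoming, z_threshold=3.0):
    """Flag drift when the incoming batch mean falls far outside the
    baseline distribution (a simple z-score test on one feature)."""
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return mean(incoming) != mu
    z = abs(mean(incoming) - mu) / sigma
    return z > z_threshold

training_feature = [10.0, 11.0, 9.5, 10.5, 10.2]  # values seen at training time
print(needs_retraining(training_feature, [10.1, 9.9, 10.4]))   # in-distribution batch -> False
print(needs_retraining(training_feature, [25.0, 26.0, 24.5]))  # shifted batch -> True
```

Real deployments would monitor many features and use more robust distributional tests, but even a check this simple turns the "unprecedented event" problem from an invisible failure into an explicit retraining trigger.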

Disclaimer

Perspecta Inc. published this content on 25 February 2021 and is solely responsible for the information contained therein. Distributed by Public, unedited and unaltered, on 25 February 2021 23:04:28 UTC.