October 3, 2018
Databricks,
the leader in unified analytics founded by the original creators of
Apache Spark™, and RStudio today announced a new release of MLflow, an
open source multi-cloud framework for the machine learning lifecycle,
now with R integration. RStudio has partnered with Databricks to develop
an R API for MLflow v0.7.0, which was showcased today at Spark + AI
Summit Europe. This new integration adds to features that have
already been released, making MLflow the most comprehensive open source
machine learning platform, with support for multiple programming
languages, integrations with popular machine learning libraries, and
support for multiple clouds.
Before MLflow, the industry did not have a standard process or
end-to-end infrastructure to develop and productionize machine learning
applications in a simple and consistent way. With MLflow, organizations
can package their code as reproducible runs, execute and compare
hundreds of parallel experiments, and leverage any hardware or software
platform for training, tuning, hyperparameter search, and more.
Additionally, organizations can deploy and manage models in production
on a variety of clouds and serving platforms. As a testament to MLflow’s
design as an open platform, RStudio’s contribution extends the MLflow
platform to the large community of data scientists who use RStudio and the R
programming language.
“In many organizations machine learning workflows are far too ad-hoc,
with no systematic tracking of experiments, inadequate protocols around
reproducibility, and no consistent way to package and deploy models.
MLflow helps address these issues in a uniform fashion across languages
and frameworks,” said JJ Allaire, chief executive officer at RStudio.
“Integration of R with MLflow will significantly broaden the reach of
the project by allowing a broader community to use and contribute to
MLflow.”
Since MLflow launched only four months ago, community engagement and
contributions have led to an impressive array of new features and
integrations, including:
Support for Multiple Programming Languages: To give developers
a choice, in addition to R, MLflow supports Python, Java, and Scala, as
well as a REST server interface that can be used from any language.
Integration with Popular Machine Learning Libraries and Frameworks: MLflow
has built-in integrations with the most popular machine learning
libraries such as scikit-learn, TensorFlow, Keras, PyTorch, H2O, and
Apache Spark MLlib to help teams build, test, and deploy machine
learning applications.
Cross-cloud Support: Organizations can use MLflow to quickly
deploy machine learning models to multiple cloud services, including
Databricks, Azure Machine Learning, and Amazon SageMaker, based on
their needs. MLflow leverages AWS S3, Google Cloud Storage, and Azure
Data Lake Storage, allowing teams to easily track and share artifacts
from their code.
“With MLflow, data science teams can systematically package and reuse
models across frameworks, track and share experiments locally or in the
cloud, and deploy models virtually anywhere,” according to Matei
Zaharia, original creator of Apache Spark, chief technologist at
Databricks, and tech lead of MLflow. “The flurry of interest and
contributions we’ve seen from the data science community validates the
need for an open source framework to streamline the machine learning
lifecycle.”
MLflow on Databricks’ Unified Analytics Platform
Databricks provides MLflow as a managed service, and early adopters are
experiencing increased efficiency across the machine learning lifecycle.
By leveraging MLflow within Databricks’ Unified Analytics Platform,
users can easily initiate runs from their on-premises environment or
from Databricks notebooks. MLflow’s tight integration with Databricks
Delta enables data science teams to track the large-scale data that fed
the models along with all the other model parameters, and then reliably
reproduce training runs. By integrating MLflow as part of its Unified
Analytics Platform, Databricks is bringing the overall benefits of one
common security model to the entire machine learning lifecycle.
About Databricks
Databricks’ mission is to accelerate innovation for its customers by
unifying Data Science, Engineering and Business. Founded by the original
creators of Apache Spark, Databricks provides a Unified Analytics
Platform for data science teams to collaborate with data engineering and
lines of business to build data products. Users achieve faster
time-to-value with Databricks by creating analytic workflows that go
from ETL and interactive exploration to production. The company also
makes it easier for its users to focus on their data by providing a
fully managed, scalable, and secure cloud infrastructure that reduces
operational complexity and total cost of ownership. Databricks,
venture-backed by Andreessen Horowitz, NEA and Battery Ventures, among
others, has a global customer base that includes Viacom, Shell and HP.
For more information, visit www.databricks.com.
