April 2019: Data Lakes & AutoML

Click here to download Jason’s slides

Presented by Jason Robey

Cloud Solution Architect, Data & AI, Microsoft

This talk is a two-parter. Microsoft Cloud Solution Architect Jason Robey joins us on April 18th to cover two areas of great interest to our members.

The first part covers establishing a data lake: governance, best practices, and how to keep your lake ecosystem healthy and vibrant. Have you swum in a data lake before? Has your company? No matter your experience level, this portion of the talk should offer some great tips.

The second half of Jason’s talk covers automated machine learning (AutoML). Automating the construction and tuning of machine learning models has long been a goal for many analytics teams. Existing AutoML techniques have been remarkably successful in identifying good parameters for a given model, sometimes even outperforming humans. AutoML iterates over many combinations of machine learning algorithms and parameters, then selects the best-fit model based on your chosen accuracy metric. Jason will demo some of these experiments during the second half of his presentation.
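To make the AutoML idea concrete, here is a minimal sketch of the search loop described above: try each combination of algorithm and parameter, score it on held-out data, and keep the best. It is not the tooling from Jason's demo. The toy dataset, the two tiny models, and every name in it (fit_threshold, fit_knn, SEARCH_SPACE, auto_select) are hypothetical and chosen only for illustration.

```python
# Toy 1-D dataset (hypothetical): the label is 1 when x > 5.
train = [(x, int(x > 5)) for x in (0, 1, 2, 3, 4, 6, 7, 8, 9, 10)]
valid = [(x, int(x > 5)) for x in (0.5, 2.5, 4.5, 5.5, 7.5, 9.5)]


def fit_threshold(data, threshold):
    """Rule-based model: predict 1 when x exceeds a fixed threshold."""
    return lambda x: int(x > threshold)


def fit_knn(data, k):
    """k-nearest-neighbours model: majority vote among the k closest points."""
    def predict(x):
        nearest = sorted(data, key=lambda point: abs(point[0] - x))[:k]
        votes = sum(label for _, label in nearest)
        return int(votes * 2 > k)
    return predict


def accuracy(model, data):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)


# Search space: each algorithm paired with the parameter values to try.
SEARCH_SPACE = [
    ("threshold", fit_threshold, "threshold", [2, 5, 8]),
    ("knn", fit_knn, "k", [1, 3, 5]),
]


def auto_select(train_data, valid_data, space=SEARCH_SPACE, metric=accuracy):
    """Try every (algorithm, parameter) pair; keep the best validation score."""
    best = None
    for name, fit, param, values in space:
        for value in values:
            model = fit(train_data, **{param: value})
            score = metric(model, valid_data)
            # Strict comparison: on ties, the earlier candidate is kept.
            if best is None or score > best["score"]:
                best = {"algorithm": name, "params": {param: value}, "score": score}
    return best


result = auto_select(train, valid)
print(result)
```

Real AutoML systems search far larger spaces, use cross-validation, and apply smarter strategies than exhaustive enumeration, but the select-by-metric loop is the same basic shape.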


February 2019: Data Set and API Development At Scale with R and TensorFlow

Click here to download Tim’s presentation slides

Presented by Tim O’Brien

T-Mobile AI

When tasked with creating the first customer-facing machine learning model at T-Mobile, we were faced with a conundrum. We had been told time and time again that to deploy machine learning models in production you had to use Python, but our very best data scientists were fluent in building neural networks in R with Keras and TensorFlow. Determined to avoid double work, we decided to use R in production for our machine learning models. After months of work wrangling our containers to meet cloud security compliance and conforming to DevOps standards, we succeeded in creating a containerized API solution using the keras and plumber R packages and Docker. Today R is actively powering tools that our customers directly interact with, and we have open sourced our methods.

In this talk, we’ll walk through how to deploy R models as container-based APIs, the struggles and triumphs we’ve had using R in production, and how you can design your teams to optimize for this sort of innovation. We’ll also cover using Amazon SageMaker Ground Truth for labeling data sets at scale and demonstrate how we’ve created a data labeling workflow for T-Mobile AI.

 


January 2019: Benefits and Challenges of Migrating to the Cloud

Click here to view & download the presentation slides

Presented by Joshua Skarphol

Consulting Manager, Snowflake Computing

Moving is difficult, so why bother? In this talk Joshua Skarphol will discuss the benefits of moving your data to the cloud, looking beyond the hype to examine the security, scalability, and performance gains that cloud adoption offers. He’ll also dig into some of the common challenges and pitfalls you can expect when embarking on this journey.

In this interactive session, Joshua will discuss:

  • The challenges of migrating to a cloud solution, including overcoming the inertia of the way things are
  • Data processing impacts of moving to the cloud, including coexistence with on-premises tools and the effects on development
  • The benefits of seeing it through, including enabling business transformation and better data access


November 2018: Data Management Maturity

Presented by Melanie Mecca

Director, Data Management Product and Services, CMMI Institute

Data Management Maturity – Why We Need It and How It Can Propel You to DM Leadership

Our industry is continually building capabilities based on its considerable accomplishments over the past decades. Some of the (roughly) sequential milestone markers that most organizations share include: data design, data administration, data architecture / warehousing, data quality and governance, MDM, and predictive analytics using both structured and unstructured data.

So why haven’t organizations attained DM perfection? As we know, the data layer in the vast majority of organizations grew project by project, typically to meet specific needs of a line of business. Best practices were not usually shared, useful work products languished in project repositories, etc. – and above all, there was no universal mandate to manage data as a critical corporate asset.

The Data Management Maturity (DMM) Model’s primary goals are to accelerate organization-wide DM programs by: providing a sound reference model to quickly evaluate capabilities, strengths and gaps; accelerating business engagement; launching a collaborative vision / strategy; and identifying key initiatives to extend existing capabilities while building new ones – leading to efficiency, cost savings, creativity, and improved data quality.

In this seminar, we’ll address:

  • Data Management Capabilities and Maturity Evaluation
  • The DMM in action – interactive exercise with the Business Glossary – rate your organization!
  • Case study examples – how organizations have accelerated their progress
  • How to leverage Data Management Maturity to empower your career