Solve the Last Mile of Analytics with Playbooks for Databricks

Technology   |   Alex Gnibus   |   Jun 10, 2024 TIME TO READ: 5 MINS
Data analytics has a “last mile” problem.

In shipping and transportation, the last mile is the final leg of the journey, whether for a package delivery or a passenger getting to their destination. It can also be the most time-consuming and costly part of the process.

Similarly, the most time-consuming part of data analytics is often the last part: getting the data to business use cases.

Once technical data teams build the pipelines and get the data ready, what happens next? The business needs to use it. The data needs to get from the warehouse to the final destination.

The problem is that data housed in the data warehouse or lakehouse is inaccessible to non-technical end users. So, the data never reaches the finish line of business value. According to an Alteryx survey of IT leaders, 24% of collected data is never used, and only 32% of companies report getting accessible value from data.

Playbooks, powered by Alteryx AiDIN, is a new capability in Alteryx Auto Insights that automatically generates analytics use cases based on your unique business scenario from your existing datasets.

With Playbooks for Cloud Data, Auto Insights Playbooks now allows users to:

  • Connect seamlessly with a cloud data source in Alteryx Analytics Cloud, such as Databricks or Snowflake, and securely search for datasets.
  • Identify unused, missing, or “dormant” data that could be put to better use in new analyses.
  • Input your own dataset to generate use case ideas mapped to departmental and vertical-specific business goals, such as supply chain optimization.
  • Take action with AI-powered insights and generated presentations that make it easy to deliver insights to stakeholders.

How to use Playbooks for Databricks:

In the spirit of solving last-mile problems, let’s look at supply chain challenges! In this scenario, your organization has adopted the Databricks Data Intelligence Platform, and the supply chain team wants to use Databricks more for business insights.

The data engineer has already done the hard work of loading supply chain data and getting it ready to go in Databricks Unity Catalog. But how will the supply chain team find it and use it?

 Databricks Catalog Explorer interface displaying a table with sample data from the "shipments" table within the "supply_chain_demo" catalog. The table columns include Ship Mode, SLA, Days Late, Shipping Cost, Unit Price, Quantity, and Hoard.

1. Discover relevant data via Alteryx Analytics Cloud:

First, a supply chain analyst can securely connect to Databricks from Alteryx Analytics Cloud. From there, they can search for and access relevant datasets in Unity Catalog without going into the Databricks UI.

 The "Import Data" interface in Analytics Cloud showing options to upload data or choose a table from AG_Databricks. The available tables are "main," "system," "unity_cloud," "unity_cloud2," and "hive_metastore." Other data sources listed are S3 Private Data Storage, Partner Enablement sources (AG_Databricks, Alteryx Salesforce, DatabricksCS_AWS, DBXtest), and SE Enablement MGMT (Management Role).

Screenshot of Analytics Cloud's "Import Data" interface. It shows a table named "shipments" from "AG_Databricks/unity_cloud" being selected, with columns displaying Order ID, Warehouse ID, Order Date, Arrival Date, and Order Priority. Other data sources are listed on the left, including S3 and various partner connections.

Our supply chain analyst has found a useful shipment dataset they didn’t even know existed! We can preview the data to ensure it’s what we want. Let’s add it to our workspace.

2. Create a data layer in Auto Insights:

Now, we’ll add our shipments dataset to Alteryx Auto Insights to make it available for analysis.

A screenshot of Alteryx Auto Insights "Data Layers" interface. A modal window titled "Create Data Layer" is open. It shows a list of available datasets including "shipments," "Chicago Stores 20240521.233743.parquet," and "Chicago Customers 20240521.233700.csv." Options to upload files or create a connection are also available.

3. Generate use cases with Playbooks:

Here, we can select the shipments dataset from a dropdown menu and load use cases. Playbooks will scan your data and create best-fit use cases relevant to supply chain challenges.

(Note: Your admin will need to adjust the dataset permissions to enable Playbooks access.)

The Alteryx Auto Insights Playbooks interface with two options: "Enter a Scenario" and "Scan your Data." The "Scan your Data" option is highlighted, showing a dataset called "shipments" ready to be analyzed. A note indicates that Playbooks is powered by AI and that data is shared with Microsoft Azure Cognitive Services.

Alteryx Auto Insights Playbooks interface displaying four use cases for analyzing a "shipments" dataset:

  • Optimize Order Processing Efficiency: Analyze sales, transactions, and quantities per order to identify inefficiencies.
  • Improve Shipping Cost Management: Identify cost reduction opportunities by analyzing the relationship between shipping cost and other variables.
  • Identify Late Delivery Causes: Pinpoint potential reasons for late deliveries by examining days late per order in relation to various factors.
  • Warehouse Performance Evaluation: Evaluate each warehouse's performance by comparing average transactions, sales, and quantity per order.

Each use case includes potential benefits and a button to generate a report or view insights.

Select a use case to automatically generate a report with actionable insights, building a shareable proof of concept for business value in minutes. Let’s choose “Identify Late Delivery Causes.”

Auto Insights will generate a Mission (a collection of insights) based on your chosen use case. Think of it as a more dynamic dashboard. Here, we can see the Mission for “Identify Late Delivery Causes” generated from the shipments dataset we found in Databricks.

Auto Insights has identified key trends and data patterns related to late deliveries. Right away, we see that the “Express Air” shipping mode contributed to late deliveries and, surprisingly, that critical-priority orders accounted for the most days late.
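
To make that kind of breakdown concrete, here is a minimal Python sketch that averages days late by a single dimension. It is not how Auto Insights computes its insights. The rows are invented for illustration, the field names `ship_mode`, `order_priority`, and `days_late` are assumed stand-ins for the Ship Mode, Order Priority, and Days Late columns, and every ship mode other than “Express Air” is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; the values are invented for illustration only
# and do not come from the shipments dataset shown in this post.
shipments = [
    {"ship_mode": "Express Air", "order_priority": "Critical", "days_late": 4},
    {"ship_mode": "Express Air", "order_priority": "High", "days_late": 2},
    {"ship_mode": "Regular Air", "order_priority": "Critical", "days_late": 3},
    {"ship_mode": "Regular Air", "order_priority": "Low", "days_late": 0},
    {"ship_mode": "Delivery Truck", "order_priority": "Medium", "days_late": 1},
    {"ship_mode": "Delivery Truck", "order_priority": "Low", "days_late": 0},
]


def average_days_late_by(rows, dimension):
    """Return {dimension value: mean days late}, sorted from worst to best."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row["days_late"])
    averages = {key: mean(values) for key, values in groups.items()}
    return dict(sorted(averages.items(), key=lambda item: item[1], reverse=True))


by_mode = average_days_late_by(shipments, "ship_mode")
by_priority = average_days_late_by(shipments, "order_priority")
print(by_mode)
print(by_priority)
```

Sorting from worst to best puts the biggest contributor to lateness first, which is roughly the question a supply chain analyst would ask of each dimension.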

Alteryx Auto Insights Playbook for "Identify Late Delivery Causes." The mission summary highlights an increase in average days late per order from 19-25 Dec 2021 to 26-31 Dec 2021. The main insights are broken down by warehouse, shipping mode, order priority, city, and hoarding, with a line graph illustrating the trend of average days late over time.
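
The period-over-period comparison in the Mission summary boils down to averaging days late within two date ranges and taking the difference. Below is a minimal Python sketch of that idea, not the product's actual method. The dates and values are invented, and it assumes each order is assigned to a period by a single date; the post does not say which date column the Mission uses.

```python
from datetime import date
from statistics import mean

# Hypothetical (order date, days late) pairs; values are invented for illustration.
orders = [
    (date(2021, 12, 20), 1),
    (date(2021, 12, 22), 2),
    (date(2021, 12, 24), 0),
    (date(2021, 12, 27), 3),
    (date(2021, 12, 29), 2),
    (date(2021, 12, 31), 4),
]


def average_days_late(rows, start, end):
    """Mean days late for orders dated within [start, end], inclusive."""
    values = [late for day, late in rows if start <= day <= end]
    return mean(values)


# The two periods named in the Mission summary.
previous = average_days_late(orders, date(2021, 12, 19), date(2021, 12, 25))
current = average_days_late(orders, date(2021, 12, 26), date(2021, 12, 31))
change = current - previous
print(f"Average days late: {previous:.2f} -> {current:.2f} (change {change:+.2f})")
```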

Using Magic Documents in Auto Insights, generate an email summary or presentation to share your findings!

Alteryx Auto Insights report on identifying late delivery causes, summarizing key findings related to shipping modes and order priority, with options to edit the presentation or adjust content for different audiences and objectives.

Playbooks for Cloud Data makes it seamless and secure for business users to discover valuable use cases for the data in your data warehouse (so you can optimize your real warehouse, in the case of a supply chain analyst)!

Interested in learning more about Playbooks for Cloud Data and how it works with your preferred data platform like Databricks? Read the Databricks and Alteryx best practices guide.
