Using GenAI Across the Analytics Lifecycle

Technology   |   Taylor Porter   |   November 6, 2024   |   Reading time: 7 min

A 2016 Forbes article titled “Cleaning Big Data: Most Time-Consuming, Least Enjoyable Data Science Task” revealed that data professionals spend a whopping 60-80% of their time on data preparation.

Over seven years later, Anaconda released a report confirming that not much has changed: data preparation and data cleaning still dominate data professionals’ time. Another survey by Microsoft found that 64% of employees don’t have enough time and energy to get their work done.

But as genAI becomes more mainstream, the tables are finally turning on manual, time-consuming work, especially in data analytics.

In this blog post, we share some of the best ways data professionals can leverage genAI throughout the analytics lifecycle. We include real-world insights from Luke Cornetta, Senior Director at Alvarez & Marsal, who recently joined the Alter Everything podcast to explain how he uses generative AI in his tax practice to save time on everything from ETL to drafting PowerPoints.

What is genAI for analytics?

Generative AI is a type of artificial intelligence that generates new content (text, video, or other media) based on input data. It commonly relies on large language models (LLMs) built on transformer architectures, but it can also use other types of models depending on the content it’s creating.

In practice, it can transform virtually every step of the data analytics lifecycle, starting with essential information gathering.

Quickly contextualize information

Every industry is rife with its own terminology and acronyms. For example, you may have heard sentences in your organization like, “EBITDA adjustments were impacted by accrual-basis amortization entries” or “We analyzed high-demand SKUs to adjust safety stock levels and improve lead time accuracy for just-in-time fulfillment.”

Your head may spin when you hear some of these sentences for the first time. GenAI solutions like ChatGPT are great tools for quickly contextualizing info. “If I’m on the phone with someone and they’re using acronyms I don’t know, or they’re using industry language, I can quickly ask, ‘Hey, what is this?’ and get an answer so at least I have some context,” Cornetta said.

An additional perk of tools like ChatGPT is that they can go beyond providing simple definitions. You can ask targeted follow-up questions, such as how one term relates to another or how one term might impact a given scenario.

Easily process unstructured text data

GenAI is a powerhouse for synthesizing text-based data, and a game-changer for projects dealing with large amounts of unstructured data. In Cornetta’s tax practice, his team pulls a lot of data from ERP systems, including PDF and unstructured Excel files. But for one project in particular, they had to extract text-based comments from an Excel file, each comment ranging from 10 to 5,000 characters.

“In the past, there’s been teams of people reading those comments, or maybe we do some sort of keyword check or logic in Alteryx to try looking at them. Regex and text parsing can get you so far, but it turns into a brute force exercise,” he said.
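To make the limits of pattern matching concrete, here is a minimal sketch of a keyword-style check. It is not the team’s actual logic: the pattern and the sample comments are invented for illustration, and a real project would need many more patterns than this.

```python
import re

# Hypothetical pattern: a dollar sign followed by an amount such as 12 or 12.50.
PRICE_PATTERN = re.compile(r"\$\s?(\d+(?:\.\d{2})?)")

# Invented sample comments that all state a price, each in a different style.
comments = [
    "Rate is $12.50 per hour",
    "charge twelve dollars an hour",
    "12 USD/hr after 5pm",
]

# Only the first comment matches; every new phrasing would need another pattern.
matches = [PRICE_PATTERN.findall(comment) for comment in comments]
for comment, found in zip(comments, matches):
    print(f"{comment!r}: {found}")
```

Each format that slips through needs its own rule, which is how the approach turns into the brute-force exercise Cornetta describes.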

When the team had a chance to work on a similar project this past year, they already had a secure private LLM setup, meaning they could leverage genAI in a way not possible before.

“The project we were helping on had comments, fields, and notes that contained a lot of critical business information — things like pricing and hours of operation in all sorts of formats from dozens of people over the years typing information in no standardized manner.” The team started pasting the comments into an internal LLM and found it was surprisingly good at making sense of them. The next step was to leverage Alteryx.

“We were able to leverage the traditional Download Tool in Alteryx to make those API calls and essentially pass each comment field through that API, applying more or less the same prompt. Then, we were able to use Alteryx to parse those results into a more structured way to load into the target system.”
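For readers who want to see the shape of that loop outside Alteryx, here is a rough Python sketch of the same idea: apply one fixed prompt to every comment, then parse each reply into a structured record. Everything named here is an assumption for illustration. The prompt wording, the `price` and `hours` fields, and the `fake_llm` stand-in are hypothetical, and the real API call to a private LLM would replace the stub.

```python
import json
from typing import Callable

# Hypothetical prompt; the same text is applied to every comment.
PROMPT_TEMPLATE = (
    "Extract the price and hours of operation from the comment below. "
    'Reply with JSON only, using the keys "price" and "hours"; '
    "use null for anything not stated.\n\nComment: {comment}"
)

def extract_fields(comment: str, call_llm: Callable[[str], str]) -> dict:
    """Send one comment through the prompt and flatten the reply into a record."""
    reply = call_llm(PROMPT_TEMPLATE.format(comment=comment))
    try:
        parsed = json.loads(reply)
    except json.JSONDecodeError:
        parsed = {}
    if not isinstance(parsed, dict):
        parsed = {}
    return {
        "comment": comment,
        "price": parsed.get("price"),
        "hours": parsed.get("hours"),
        # False when the reply was not a usable JSON object.
        "parse_ok": bool(parsed),
    }

def fake_llm(prompt: str) -> str:
    """Stand-in for the real API call; returns canned replies for the demo."""
    if "$12" in prompt:
        return '{"price": 12.0, "hours": "9am-5pm"}'
    return "Sorry, I could not find that."

# Invented sample comments.
comments = ["Rate is $12/hr, open 9am-5pm weekdays", "call back next week"]
records = [extract_fields(c, fake_llm) for c in comments]
for record in records:
    print(record)
```

Passing the LLM call in as a function keeps the parsing logic testable without network access, and the `parse_ok` flag marks rows that need a second look before loading.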

Cornetta’s team implemented a similar use case for another client that wanted to classify its IT support tickets more accurately. Using much the same process, the team categorized the tickets so the IT team could see where it was spending most of its time.
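A ticket-classification pass along these lines might look like the sketch below. The category list, the prompt, the sample tickets, and the `fake_llm` stub are all hypothetical. The idea it illustrates is constraining the model to a closed set of labels and then tallying the results.

```python
from collections import Counter
from typing import Callable

# Hypothetical closed set of labels; anything else falls back to "other".
CATEGORIES = ["hardware", "software", "access", "network", "other"]

def classify_ticket(text: str, call_llm: Callable[[str], str]) -> str:
    """Ask for exactly one category and normalize whatever comes back."""
    prompt = (
        "Classify this IT support ticket into exactly one of: "
        f"{', '.join(CATEGORIES)}. Reply with the category only.\n\n"
        f"Ticket: {text}"
    )
    label = call_llm(prompt).strip().lower()
    return label if label in CATEGORIES else "other"

def fake_llm(prompt: str) -> str:
    """Stand-in for the real API call; keys off words in the ticket text."""
    ticket = prompt.rsplit("Ticket: ", 1)[1].lower()
    if "password" in ticket:
        return "Access"
    if "wifi" in ticket:
        return "network"
    return "unsure"

# Invented sample tickets.
tickets = [
    "Password reset needed",
    "WiFi drops in room 4",
    "Forgot my password again",
    "Printer is haunted",
]
counts = Counter(classify_ticket(t, fake_llm) for t in tickets)
print(counts.most_common())
```

Mapping unexpected replies to "other" keeps the tally clean and gives reviewers one bucket to inspect by hand.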

Use it as an end-to-end “copilot”

Cornetta’s team at Alvarez & Marsal has done the groundwork of creating a private and secure LLM, an essential prerequisite given the sensitive nature of the data they work with.

Many data workers find themselves in a similar situation, but once the upfront work of finding and installing a working LLM and setting it up locally is done, the sky is the limit.

“Go and just see how it can shave 10-15 minutes off tasks. … It’s great at even giving you Alteryx formulas. I ask it all the time to help draft an Alteryx formula … not because I don’t know how to do it, it’s just I know that it’ll take me more time to do it myself than it would take the AI to write it.”

Slashing 15 minutes off an hour-long task may not seem like much, but it adds up over weeks and years. According to PwC, genAI can make knowledge workers 30-40% more productive.

Analytics copilots are powerhouse assistants that can help with everything from high-level questions about your data set to more advanced analytics use cases, like helping you choose the right features and model for your latest machine learning project.

Throughout every step in your analytics, genAI solutions like our Workflow Summary Tool can completely automate documentation, creating concise summaries of your workflow’s purpose, inputs, outputs, and key logic steps.

Finally, genAI solutions are great at reporting your insights to stakeholders. From creating PowerPoints to drafting emails, these solutions can save you hours.

Despite genAI’s myriad use cases, Cornetta says it’s important not to blindly accept its outputs. “There are risks with AI around hallucinations and misinterpretations of a prompt.” For this reason, the first step to success is validation: he and his team methodically check the AI’s outputs.

“We do a lot of validation … it’s much easier to build validations on structured data. We were expecting the values we were extracting to be within a certain range or a certain set of values, so we were able to pluck out the outliers.”
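The kind of check Cornetta describes can be sketched in a few lines once the extracted values sit in structured records. The expected range, the allowed status values, and the sample records below are assumptions made up for this example, not figures from the project.

```python
# Hypothetical expectations for the extracted values.
PRICE_RANGE = (1.0, 500.0)
ALLOWED_STATUS = {"open", "closed", "seasonal"}

def find_issues(record: dict) -> list:
    """Return a list of reasons this record looks suspicious (empty if none)."""
    issues = []
    price = record.get("price")
    if isinstance(price, bool) or not isinstance(price, (int, float)):
        issues.append("price missing or not numeric")
    elif not PRICE_RANGE[0] <= price <= PRICE_RANGE[1]:
        issues.append(f"price {price} outside expected range")
    if record.get("status") not in ALLOWED_STATUS:
        issues.append(f"unexpected status {record.get('status')!r}")
    return issues

# Invented sample records, as if parsed from LLM output.
records = [
    {"id": 1, "price": 12.0, "status": "open"},
    {"id": 2, "price": 12000.0, "status": "open"},
    {"id": 3, "price": None, "status": "opne"},
]

# Keep only the records that fail at least one check, for human review.
flagged = {r["id"]: find_issues(r) for r in records if find_issues(r)}
for record_id, issues in flagged.items():
    print(record_id, issues)
```

Records that pass go straight through; the flagged ones are the outliers a person should review.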

The second step to success is ensuring stakeholders are aligned on what AI can and can’t do. “AI’s not going to magically get a hundred percent accuracy, probably not for a long time, if ever, but getting on the same page that AI will help jumpstart and get us 80-90% of the way there. And then there’s still going to be some that probably need to be looked at after that. So the other hallmark for successful projects of this nature is just making sure expectations are all aligned, and everyone’s comfortable with how it’s working.”

Finally, for anyone wondering how to start using genAI, Alteryx, or any technology, Cornetta’s best advice is to “just get started.”

“Go and find a use case and see how it can work and don’t give up when it doesn’t necessarily give you the right output on the first try,” he said. “I know plenty of people that are afraid of it or intimidated by using new technology, and I’d say just give it a shot. Give it an earnest, real chance, and I think it might surprise you.”

Learn more about using genAI in your analytics.

Try our interactive, browser-based demo of Alteryx Auto Insights
