Six nouvelles façons de voir au-delà des lignes et des colonnes

La déconstruction des données : un processus repensé

Technology   |   Shane Remer   |   May 20, 2020 TIME TO READ: 12 MINS
TIME TO READ: 12 MINS

From pain points to possibilities, moving from worksheets to workflows not only raises the quality of your day, but the trajectory of your career, and the way you orient yourself to the data movement. Will it seem daunting with every advancement or will you see your unique position in this digital transformation?

We’ve listed some of the most common data-related tasks performed in spreadsheets that can be transformed with modern analytics. Let’s explore six new ways to think beyond rows and columns and improve spreadsheet myopia.

#1 INPUT

Can’t all data sources just get along?

Life as a data analyst ought to be pretty sweet. After all, you’re paid to find the golden nuggets of insight buried deep in a mountain of data — and you love to roll up your sleeves and start digging. You start by opening data sets. And more data sets. Each set might be from a different source or program, especially if the data comes from different departments (if you can even get past the gatekeepers to get hold of it.)

This is where the mess starts piling up. Accepting data in multiple formats isn’t particularly easy in spreadsheets, to say the least. To build a data set you can work with, you’ve got some wrangling to do. Maybe this situation isn’t so sweet after all.

Also, as you work to incorporate data, you’ve got nagging worries in the back of your mind: Am I adding duplicate or unnecessary information? Will I lose something across files in different formats? Will I have to leave out important info altogether because it isn’t compatible with my data set?

Analysts everywhere must contend with data silos, where information is trapped in unusable formats and insulated departments. And they dream of data nirvana: smooth, seamless data normalization, where all data is organized logically and consistently.

At one of America’s leading healthcare providers, Thomas Hall works closely with several teams to analyze trends, performance, and create automated reports. Read their worksheets to workflows story.

Data sets really can play nicely together.

There is an easier way to build and normalize a data set, even if you’re working with incompatible file formats, database connections, or cloud data stores. In fact, the possibilities for data types you can include in your work are nearly endless. You simply need a starting point where all formats are welcome, and no data is left behind.

In Alteryx, that starting point is called your canvas. It’s visual, it’s simple, and it can change your life. Just drag and drop an Input Data Tool onto your canvas, locate the data set you’re trying to import, and select.

If you’ve ever wasted most of a sunny day trying to get your spreadsheets solution to accept a data source, you’re going to love the Alteryx way of doing things. Different file formats or structures? Alteryx won’t even blink. You can access data locally from Excel, Access, XML, SAS, SPSS, or MapInfo, as well as data stored in databases or HDFS.

Alteryx also has direct connectors to cloud systems such as Amazon S3, Twitter, Foursquare, Marketo, Salesforce, and Microsoft SharePoint, as well as other Big Data environments such as Amazon Redshift, Impala, and Spark.

#2 CLEANSE

Data cleaning: where joy goes to die.

The secret is out: The mundane job of data cleaning is where you spend most of your time as an analyst. By the time you get to the good part — you know, the “analyzing” — you’re out of gas and out of time. You still need to deliver the analysis, of course. (Hope you didn’t make any mistakes.) Is this lack of balance really the nature of cleaning data? Or is it simply the nature of spreadsheets?

In spreadsheets, cleaning is a highly manual process. Parsing, creating columns, deleting rows, removing white space, and so on can take eons and suck the life right out of you. Believe us, we know.

But there’s a bigger issue to consider. All that cutting and pasting and renaming doesn’t just take a crazy amount of time — it’s also an invitation to make mistakes. Miscalculations, mental errors, and duplicate records in these early stages can send your analysis careening in the wrong direction or even force you to start over.

Data integrity should be your biggest concern in the cleaning stage. You want to be confident in the accuracy and consistency of data, no matter where you move it or how you change its format, and make sure its meaning isn’t unintentionally altered as you tidy it up.

Exercise your own workbook to workflow skills. Take Challenge #19: Excel Record Locator.

It’s time to rethink the manual approach.

Instead of a mind-numbing, soul-crushing series of clicks, what if data cleansing was one broad function accomplished by higher-order tools? What if instead of a thousand actions, you took just one or two?

Switching to Alteryx will dramatically change how much time you spend cleaning data. We won’t lie — it’s a big deal.

You can create new columns, remove rows and columns, and change data types with a single step in Alteryx — a step that applies instantly across your entire data set. You can also let Alteryx take control and automatically interpret your data, assigning types and sizes appropriate for the content. Either way, no more remembering and repeating changes manually in multiple files. And the history of what you did is always there, so you never again have to start over if you mess up. (Seriously.)

This sophisticated approach to data cleaning virtually eliminates manual processes and human error, freeing up your time for more important things.

#3 JOIN

What the analyst has joined, let no one separate.

Here’s where things start getting interesting — or scary, depending on how confident you feel in blending and appending data from separate worksheets. Joining data sets always means altering your source material. If you’re lucky, you get through this stage without accidentally damaging your data set. If not, it’s back to square one.

As with most of the steps for processing data in spreadsheets, the required actions for joining data sets are incredibly tedious. Is it just us, or is repeating “VLOOKUP” a recipe for madness?

The trouble with blending data the old-fashioned way is that spreadsheets aren’t agnostic. Spreadsheet programs recognize only their preferred format, and they can’t step outside that format without direct input from you via manual tools like VLOOKUP or INDEX MATCH. Once you start layering in multiple fields and multiple sheets, the odds for error skyrocket.

More than 1.9 billion Coca-Cola beverages are enjoyed daily around the globe. Coca-Cola uses Alteryx to unlock the value of the deluge of corresponding data. Learn more.

There’s a better way to blend.

What if your program could simply blend the data for you so you didn’t have to waste time worrying about format? And what if it could keep track of everything it did so you could always go back and undo?

One profoundly simple set of tools in Alteryx (Union, Find and Replace, and Join) gives you all the blending functionality of your old spreadsheet program but with a shiny new set of creative capabilities you didn’t even know you needed. You can trace and retrace your steps to any point in your workflow at any time — and follow them right back to your starting point if you need to.

Combine multiple worksheets based on field name while maintaining the position of each column. Or change the column order if you like. Join two data streams with a common field by performing an inner or an outer Join.

In fact, bring data together any way you like. By letting Alteryx do most of the work, you can get right to the business of exploring your data set more deeply.

#4 TRANSFORM

Ever forgotten a filter? We get it.

To discover data’s deeper meaning, you need to view it through your own carefully chosen parameters. To do this in spreadsheets, you filter, sort, and pivot to transpose and rearrange the data exactly the way you want it. Those aren’t necessarily difficult tasks, but they must still be done manually.

You can use your clipboard and undo your way back to a previous point in your data transformation process, but are you really going to rely on this error-prone and manual process?

The data transformation process in spreadsheets can be frustrating and time-consuming for you personally, but the larger concern is that hidden data fields and forgotten filters can wind up costing your company a lot of money. When you finish an analysis, how confident are you that nothing got lost along the way? Would you stake your job on it?

The 21st century is calling.

By performing common data transformation functions with highly intelligent tools, you can reduce errors — and risk — as you move through the most exciting part of your work.

The Sort, Transpose, and Cross Tab Tools in Alteryx allow you to organize and pivot your data in many different directions automatically — allowing you to see the big picture quickly. And by using these tools in workflows, you can always get back to your starting point and account for your steps. Being able to explain your methodology and change views on the fly is priceless.

How would we compare transforming data in spreadsheets to transforming data in workflows? We wouldn’t. It’s like comparing a 1950s rotary phone with one function to a brand-new smartphone that can do a thousand different things. When it comes to establishing a forward-thinking strategy for data methodology, workflows are the difference between a siloed organization and one with a true culture of analytics.
A major fast food chain is entering the next chapter of data-driven success. Read more about their analytics maturity journey.

#5 CALCULATE

Frustrated with formulas? It’s only logical.

Let’s crunch some numbers, shall we? Spreadsheets are built to calculate logical formulas using IF statements, which are painless to set up. But applying those formulas everywhere you want them is something else entirely.

When you apply formulas, you hold a lot of information in your memory about what you’re doing and how you got there. Cutting and pasting formulas starts to feel a little sketchy. Where did you put that set of rows you thought you didn’t want, again? Did you apply that formula everywhere it was supposed to go? Wait — did you hide some cells? What happened to all the stuff on your clipboard? Was it important?

Yikes.

Better to set it and forget it.

Setting a formula once, with a single tool, and applying it exactly where you want it, is a far superior solution to manually applying formulas all over the place and trying to remember what you did.

The Formula Tool in Alteryx is a powerful processor — with a single action, you can add a field to an input table or create or update data fields based on an expression or a data relationship. If anything is added, subtracted, or altered, your workflow shows what, when, and where so there’s always an option to rewind and rethink.

In Alteryx, data, logic, and execution exist in separate layers. That means you can take time to plan out your logic before you execute it, and no data will be harmed once you do. If something changes or new records are added, your logic will automatically apply, so you can be confident in your results.

The Magic of Self-Service Analytics

  • Manual cross tabs? No way.
  • Manual formulas? Gone.
  • Manual summarization? Get out of here.

Get to the good part of analyzing right away.

Another worry crossed off the list. It’s incredibly freeing.

#6 – AGGREGATE

Do you type as fast as you think?

The descriptive and predictive power of data lies in aggregation — it’s where the secrets are revealed. But summarizing data in spreadsheets requires the use of pivot tables, so you’re still in single-focus mode as you discover key insights and deliver your results. Viewing data through a single lens compromises your agility and accuracy in the final phase of analysis and prevents you from being able to answer every tough question your boss dreams up.

And then there’s the technical difficulty of working with extremely large data sets. Sometimes your system can’t handle the load when you need to change your parameters, shift positions, or rethink your assumptions — and that’s when the spinning wheel of spreadsheet death hits the screen.

Manual data summarization will always be limited by the speed of your fingers, the capabilities of your spreadsheet solution, and the power of your computer system. And your output lets you see only a single slice of results at once — not the whole pie.

If you’re impatient with these limitations once you get to the pinnacle of your analysis, who can blame you? When the manual steps of data processing move slower than your brain waves, you can lose a lot of valuable ideas.

Unleash your big, beautiful mind.

What you need is a powerful summary tool that can deliver multiple results and views automatically — allowing you to explore outliers, find patterns, and ask deeper questions as fast as you can think of them.

The Summarize Tool in Alteryx processes data instantaneously at every step along your data journey so that you can see many views at once, speeding your time to impact. No more building pivot tables! Group your data and then perform any number of calculations on any fields you like, including more advanced functions not found in spreadsheets such as financial, numeric, spatial, and behavioral analysis.

Finally, you’re able to deliver deep, nuanced insights — insights you trust, on time and on target. Now do you see the big picture?

WATCH

Hear from Aimpoint Digital on getting started with self-service analytics in this video. It’s perfect for those new to Alteryx Designer looking for a brief overview and reasons to get started.

TRAIN

Test out on-demand training designed for spreadsheet users.

Tags