Working with data in Excel can be simple, but it can also be challenging and time-consuming, especially when working with multiple sources and formats of data. Data has to fit into rows and columns, has size limitations, and joining data from multiple spreadsheets requires complex formulas that leave your results vulnerable to errors. What’s more, whenever your data changes, you have to repeat each step in the data prep and blending process all over again.
Alteryx takes a different approach with a workflow-based environment that allows you to prep, blend, and analyze data from multiple data sources and types, including unstructured data. And because you spend your time constructing a repeatable workflow that documents exactly how you work with data, you simply have to re-run the workflow any time your data changes — resulting in less time spent analyzing data, fewer errors, and greater consistency.
We’ve listed some of the most common data-related tasks that many analysts attempt to do in Excel. Alongside, we show how you would do it using Alteryx. These examples are meant to help Excel users understand how to translate their Excel knowledge into an Alteryx tool and workflow environment.
To learn more about how Alteryx can help empower Excel users, download the e-book A Radical Guide to Data Analytics Mastery for Spreadsheet Users. Then, follow along by downloading a free trial of Alteryx.
To open the Excel program, click File and then click Open.
To access data in Alteryx, you drag-and-drop an Input Data Tool onto the canvas, locate the database, and simply select. Then click on the Connect a File or Database drop down in the configuration section.
Locate the folder in which the file is located, then just click and open.
Next, find your file wherever it might be located.
Excel can open a limited amount of structured data from local files.
Alteryx offers users a wide range of data access to multiple sources and structures of data. Using the Input Tool, users can access data locally from Excel, Access, XML, SAS, SPSS, MapInfo, as well as data stored in databases or HDFS.
In addition, Alteryx has direct connectors to access data stored in cloud systems such as Amazon S3, Twitter, Foursquare, Marketo, Salesforce, and Microsoft Sharepoint, or other Big Data environments such as Amazon Redshift, Impala, Spark, and others.
The file types, sources, and structure can be expanded by accessing data from the web, text, and a wider range of other data sources.
In Excel, if one folder has multiple files of the same type (e.g. .CSV), you need to select one at a time, and each file then opens in a separate worksheet.
To do this in Alteryx, drag an Input Tool onto the canvas, locate the folder and insert a wildcard (*). In this case we are bringing in two .CSV files.
Using the Input Tool locate the directory and file you are looking for. In this case it is POS_Data.
Then once you bring in the POS_Data file, you go to the configuration section and edit the directory path by adding a (_*.csv). This will bring in all of the .CSV files in the same directory and join together as long as they are the same table structure and format.
Renaming columns is quick and easy in Excel. You simply click on the column cell and change the name.
It’s also easy in Alteryx. Using the Select Tool, you simply rename the field to whatever you want.
Renaming columns is quick and easy in Excel. You simply click on the column cell and change the name.
In Alteryx, the Select Tool allows you to change the data type to whatever format you want.
In addition to the Select Tool, you can use the Auto Field Tool. As its name suggests this tool can interpret the data and assign the type and size appropriate for the content. This is especially useful for fields that contain text strings.
Some conversions are a little more complex. Dates can be awkward, because the Select Tool may not be able to understand your preferred date format. To satisfy this, there is a specific DateTime Tool.
To remove a column used in a calculated field, you must first copy and paste values before deleting the original, otherwise your formula will break.
In Alteryx this is not a problem. Just use the Select Tool and uncheck the box next to anything you don’t need or want to hide, and it will be removed from your workflow without affecting formulas.
This is either a manual selection in Excel, or you need to use a quick filter to find what you don’t need.
If you can list the criteria, you can use the Filter Tool. If it’s complicated, you can use a formula to build IF expressions or use the switch () function.
The most common way to create new fields in Excel is to insert a new column, name the column, and write in a formula if necessary.
In Alteryx, rather than inserting a column into the table, you just add a new output field to your workflow using the Formula Tool. Simply type in the new field (Value Targets in this case) and your new field is calculated on every row.
Tip 1: If you need to concatenate with first name and last name together, use a formula with the + symbol. Most other text‐based formulas use the same name and syntax, for instance Left () or Right (). The Alteryx manual (press F1) will help you identify those which don’t.
Tip 2: If you’re splitting a field with a regular format (for instance, maybe splitting a comma-separated address field), use the Text To Columns Tool.
Everybody uses IF statements in Excel. It is one the most popular decision-making functions in the platform. To create a new Sales Potential column that buckets scores as High, Medium, or Low, you utilize this IF statement in Excel:
=IF(V2>2500, “High”, IF(V2>1499, “Medium”, “Low”))
The Formula Tool is a powerful processor of data and formulas. Use it to add a field to an input table, to create new data fields based on an expression or data relationship, or to update an existing field based on these same premises. To create the same IF statement in Alteryx, simply create a new output field called Sales Potential, and using the Total Sales field, input the following expression:
IF [Total Sales] > 2500 THEN “High” ELSEIF [Total Sales] > 1499 && [Total Sales] < 2501 THEN “Medium” ELSE “Low” ENDIF
Creating an Auto Sum in Excel requires that you select the related fields and rows and then click the Auto Sum symbol. Or you can write a =SUMformula.
The Summarize Tool can do a host of summary processes, including grouping, summing, count, spatial object processing, string concatenation, and much more. In this example the Summarize Tool is used to group transactions by customer and to total up the transactions to see the customer’s total spend.
To create the cumulative sum of rows in Excel, you must create a new column, then set the cumulative range. In this case the starting point is =S2, and then you add =S2 + R3 for the second row. Once this is established, you can copy the formula for the rest of the related information.
The Multi-Row Formula Tool takes the concept of the Formula Tool a step further, allowing you to utilize row data as part of the formula creation. This tool is useful for parsing complex data and creating running totals, averages, percentages, and other mathematical calculations. In this case, you create a new field, and then create an expression.
Calculating percentages on each field based on a total in Excel may require you to first build a table of your data, then build a Pivot Table based on that data. The other option is to write an SUMIF() statement to calculate the percentages for each month and amount. Below is a Pivot Table example.
The Multi Field Formula Tool makes it easy to execute a single function on multiple fields. This example builds new columns/fields in the data that display a percentage of each month.
Filtering in Excel is easy. You simply select the worksheet or columns you want to filter, and then select the filter drop-down.
The Filter Tool in Alteryx lets you do a basic filter to split data into two streams: True (data that satisfies the expression) or False (data that does not).
Sorting by multiple columns in Excel is easy. Just highlight the worksheet and do a sort or custom sort.
The Sort Tool in Alteryx makes it just as easy, allowing you to organize your data exactly the way you need it. Just select the name of the column and choose between ascending or descending.
Pivoting data in Excel (such as transposing data or creating a cross tab) will typically require that you build a Pivot Table to get the data in the shape you want.
The Transpose Tool allows you to pivot the orientation of the data table. It transforms the data so you may view horizontal data fields on a vertical axis. This tool is useful for extracting non-conforming data. There is no limit to the amount of records or fields that can be transposed.
The Cross Tab Tool pivots the orientation of the data table. It transforms the data so vertical data fields can be viewed on a horizontal axis summarizing data where specified.
To reorder columns or fields in Excel, you have to highlight the columns, then use the Shift key to drag and drop the columns to the space you want.
To reorder columns in Alteryx, you use the Select Tool or you can do the same in the Formula Tool.
With Excel, you must ensure that the columns of the two tables are exactly the same and in the same order. Then you copy the entire contents of the second table, and paste it into the first empty row of the first table.
Alteryx makes this easier. The Union Tool allows you to combine multiple worksheets based on the field names while maintaining the position of each column. You can easily change the order of the column headers in each worksheet so they match up.
To join multiple worksheets in Excel, you can use the VLOOKUP formula or INDEX MATCH formula to bring in information to the data sets from a different source(s) or table(s). INDEX MATCH is used instead of a VLOOKUP formula if the lookup value is not in the left-most column, which means you can’t use the VLOOKUP formula.
If you need multiple matches to have their own rows, a join does a better job. You can input two data sets and join them using either the Join Tool. If you’re connecting to a compatible data source, you can join them in the Input tool.
With the Join Tool you join two tables with a common field (primary key) to create the following:
You can also utilize the In-DB Join Tool if you are working with data in a database. The Alteryx In-DB Join Tool enables blending and analysis against large sets of data without moving the data out of a database, and can provide significant performance improvements over traditional analysis methods.
The In-DB Join Tool lets you do inner, left, right, and even full joins.
Learn more about Alteryx in-database tools here.
If you want to summarize data in Excel you build a Pivot Table.
The Summarize Tool in Alteryx works just like a Pivot Table in Excel. You can group data, then sum, count, or perform other calculations on other fields.
If you haven’t already downloaded Alteryx, get a free trial.
To learn more or get help with other features in Alteryx from Alteryx users, visit the Alteryx Community.
We would like to extend a special thanks to our partner the Information Lab for their contribution. Learn more about the Information Lab.