Microsoft Excel should be considered as part of your analysis toolkit. Here are some of the tools I use for ad hoc analysis and data cleansing.

### Data cleansing for text

#### PROPER, LOWER, UPPER

To change the case of a cell using one function, insert a new column to the right of your data and use the functions below. Don’t forget to copy and ‘Paste Values’ to make the values remain.

1 2 3 |
=PROPER(A1) =LOWER(A1) =UPPER(A1) |

#### TRIM

Sometimes when importing data, extra spaces can be added during the process before or after the data. To remove these, we use the TRIM function.

1 |
=TRIM(A1) -- removes any leading or trailing spaces from the text in A1 |

#### CONCATENATE

This function is useful for putting a First Name and Last Name field back together with one step. This pastes B2 and A2 together in one cell with one space in between.

1 |
=CONCATENATE(B2," ",A2) |

#### RIGHT, LEFT, MID

These three functions are used to pull out certain parts of the cell without having to do this one by one. LEFT finds the cell entered and returns the number of characters to the left, in the example this would be four characters.

1 2 3 4 5 |
=LEFT(I2,4) =RIGHT(I3,4) =MID(I4,5,5) /*MID finds the cell entered and returns the number of characters requested from the start point given, in this case, five characters from point five */ |

Remember if you haven’t used the TRIM first, any trailing spaces will be included.

### VLOOKUP

VLOOKUP lets Excel do the hard work of matching up a value from one spreadsheet to another. It saves time and is an essential Excel tool.

Insert a column to the right of the column you are using as the reference. This is where your formula and results will go.

Type into your formula bar =VLOOKUP( and complete the formula using the instructions below.

The formula means:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
=VLOOKUP ( lookup_value /* You define a value for the formula to look for by entering the cell here */ table_array /* the range of columns to look in, highlight the whole column where the reference is and drag over to where the value you want to be returned is.*/ index_num /* number of columns from the look_up value to the column where the result is. */ range_lookup /* 0 for an exact match, 1 for the closest match */ ) |

All done! Remember to copy and ‘paste values’ to ensure you the results remain and not the formula. If for any reason your lookup table moves or is deleted, so will your results.

BONUS TIP:

To hide the #N/A error that VLOOKUP throws when it can’t find a value, nest the IFERROR function to catch the error and return any value you like. In this case, it will return ‘Not found’:

1 |
= IFERROR(VLOOKUP(B2,SHEET2!B:C,2,0),"NOT FOUND") |

### Logical Functions for ad-hoc analysis

Here are six more useful functions to use in Excel. This time, logical functions that return a result based on the conditions.

#### AND

Returns TRUE if ALL of the arguments evaluate to TRUE. In this example, the result is TRUE if the value in cell E2 is greater than 10, AND the value in F2 is less than 200

1 |
=AND(E2>10, F2<200) |

#### OR

Returns TRUE if ANY of the arguments evaluate to TRUE. In this example, the result is TRUE if the value in cell E2 is greater than 10, OR the value in F2 is less than 200

1 |
=OR(E2>10, F2<200) |

#### NOT

Returns the reversed logical value of its argument. ie. If the argument is FALSE, then TRUE is returned and vice versa. In this example, the result is TRUE if a value in E2 is NOT greater than 500.

1 |
=NOT(E2>500) |

#### IF

To return a value other than True or False, based on whether a condition has been satisfied, use the IF function.

IF statements are not only useful, they teach junior analysts the basics of Else/ElseIf statements.

The IF function needs 3 parameters:

1 – The test

2 – The result if the test is TRUE

3 – The result if the test is FALSE

1 2 |
=IF(test,"true","false") =IF(B2>C2,"Win","Lose") -- text needs to be in double quotes |

Nesting simply means to combine formulas, one inside the other, so that one formula handles the result of another.

1 2 3 4 5 |
=IF(B5>10,"Platinum", -- if >10 then Platinum IF(B5>3,"Gold", -- if >3 then Gold IF(B5>7.5,"Silver", -- if >7.5 then Silver IF(B5>2.5,"Bronze", -- if >2.5 then Bronze "Starter")))) -- else Starter |

1 2 3 |
SUMIF =SUMIF(range,"criteria") =SUMIF(B1:B4,">= 40") -- to add up only those values in a range |

1 2 3 |
COUNTIF =COUNTIF(range,"criteria") =COUNTIF(B1:B4,"South Africa") -- to count up only those values in a range |

Big and small businesses use Excel because it’s easy to learn and allow analysts and stakeholders to speak the same language. If more sophisticated add-ons are required (Power BI, Power Pivot, Power Maps) these are open source and easy to use.

Excel offers functionality and plenty of bang for your buck with a usable interface and plenty of add-ons for scalability.

Photo by Kaique Rocha from Pexels

This post first appeared on dev.to

Comments are closed.