# How to rename and format time variables 00:00:01,507 --> 00:00:22,063 We're going to do some useful data cleaning by combining two data fields to make them seem  like a number to Excel and then we can do some calculations using that number. In this example we're going to use a month and year variables and they're in columns O and P, and R and S.   00:00:22,063 --> 00:00:49,500 We're going to use those to make date variables in columns V and W and we're going to use a function called CONCATENATE to do this. Now we highlighted the columns with the source data because there's quite a lot going on in this dataset. So first of all label four columns. We want one called 'start_date', one called 'end_date', one called 'loan_duration' and one called 'loan_duration_recoded'. 00:00:49,500 --> 00:01:16,231 It will become clear what these columns are for when we're finished. Now first we need to make sure that the data is correctly coded, then we will make sure that blank cells do not create errors. CONCATENATE glues together text from more than one cell. We're going to do this four columns O and P first of all by gluing column O to column P and putting a hyphen between them. 00:01:16,231 --> 00:02:03,180 Carefully type this into the first cell in the start_date column. So that's =IF and O2<>"", comma, P2<>"", cover, CONCATENATE, O2, comma, then a hyphen in quotes, then a comma, then P2, then you close that bracket, comma, and then finally the null and close the whole bracket, hit Enter. Hang on a minuteL what does this mean? Well the IF function sets a test, and the other arguments are what happens if the test is passed, or what happens if the test is failed. 00:02:03,180 --> 00:02:52,045 So in this case the test is: Is there data in both the cells O2 and P2, which is what we mean by O2 not equalling the null? If there is data in both, we take O2, we add a hyphen, and then we add P2 to the end of it. If there's no data in it, then the empty quotes at the end of this formula mean we don't put anything in the cell. So fill down and we see the concatenation has created a date format. Excel recognizes this as a date. You need to do the same thing using columns R and S and use it to create a date in the next column. Pause the video while we do it - we're just going to fill this in. 00:03:02,140 --> 00:04:28,860 Now we have two sets of dates. We need to find out how long these loans are and we can do this because the dates have numerical values, so finding the difference between them we'll find out how many days have elapsed. So that's what we are going to do in the column called loan_duration. So we type another formula into this using if again so it's the IF and V2 does not equal the null, and W2 does not equal the null. Then W2 minus V2 and the null quotes. Again let's take a minute to look at this formula. We are saying that if both V2 and W2 are not empty, that is, there's a date in both of those columns, count the number of days between those two dates, but if either of them are empty or both of them are empty, return the null. Now fill down but look some of the numbers we've come up with are positive and some are negative. This makes no sense in the real world because you can't have a loan that lasts for a negative number of days. This sort of error often happens in raw data when the people who have entered the data in the first place have made a simple mistake and in this case it looks like they've put the end date in the start date column and vice-versa. 00:04:28,860 --> 00:05:05,203 So our final step will be to clean this up. So we need to take the absolute value of the difference between the days which just means the number without the negative sign in front of it. So copy the formula exactly as before but this time replace the second argument with ABS(X2) which returns the absolute value of X2. Now we have accurate numerical data for the length of the loan that we can just fill down and we know how many days have elapsed for each loan.