Simple date-string conversion

Answering a question an dates and strings.
Python
Data Processing
Author

Thadryan

Published

February 6, 2018

Simple Date Strings Conversion: split-process-join¶

Someone recently asked me about converting dates in number form to dates their names to make approximate ranges that are easier to read. For example, “01/01/2017 to 04/01/2017” would be “January to February” (in the context of what I was being asked, the years were the columns in a dataframe, so I wasn’t concerned with the year.

Firstly, when confronted with this type of challenge, it is best to do a little background reading. If the times are in a certain format, for instance, there might be packages that allow you to manipulate and convert them already written, like Python’s datetime module. For this post, we will just assume we have raw strings and have to deal with them.

That said, let’s see what we are dealing with.

date = "01/01/2017 to 04/01/2017"

Note that the information is all in one string, as opposed to a list with the dates in their own entries.

The first thing that might come to mind is a regular expression. They’re a powerful tool, and they might be useful in a problem like this. I ended finding a way to it without them. In general, I try to avoid them for simple tasks because trying to add a regex to a program is like busting out your pepper spray in a mosh pit. Quick, easy, life saver? Spark that turns a dicey situation into a full-on incident? Well-intended plan that backfires causing hours of pain? You just don’t know.

I used a “split-process-join” approach.

In order to simplify the process, I split the string on ” to ” so we can deal with each date on its own.

# use the split function 
split_date = date.split(" to ")

print(split_date)
['01/01/2017', '04/01/2017']

Now we an iterate over the list and use the .startswith() with method as this is a list of strings. I use the the range() function instead of “for x in y” because we want to refer to list elements by their number.

# for each index in length of list...
for i in range(len(split_date)):
    
    # give it a name to for readability
    entry = split_date[i]
    
    # if the entry there starts with "01/" -> "January"
    if entry.startswith("01/"):
        
        # replace the item in the array with the month
        split_date[i] = "January"
        
    # as above with "February"
    if entry.startswith("04/"):
       
        # put "February" there
        split_date[i] = "February"
        
# see the output 
print(split_date)
['January', 'February']

To finish the process, we simply need to join the entries back on the ” to “. Because .split() and .join() and string methods in Python, the join is admittedly awkward looking.

# on " to " join split_date
final = " to ".join(split_date)

# show results 
print(final)
January to February

And we’re there:) All in once place, and from the top, that gives us:

date = "01/01/2017 to 04/01/2017"

split_date = date.split(" to ")

for i in range(len(split_date)):
    entry = split_date[i]
    if entry.startswith("01/"):
        split_date[i] = "January"
    if entry.startswith("04/"):
        split_date[i] = "February"

final = " to ".join(split_date)

print(final)
January to February

Obviously, this is just a proof of concept that only works for two months, but it is easy to see how it could made into a function. What would your next step be? I’d go with a function using a dict.

def number_to_name(date):
    # create a dic that stores the pairs
    d = {"01/":"January",  "02/":"February", "03/":"March", 
         "04/":"April"  ,  "05/":"May",      "06/":"June",
         "07/":"July",     "08/":"August",   "09/":"September",
         "10/":"October",  "11/":"November", "12/":"December" }
    
    split_date = date.split(" to ")
    for i in range(len(split_date)):
        entry = split_date[i]
        # the first 3 chars denote the month
        month = entry[:3]
        # sub them from dict
        split_date[i] = d[month]
    # return after joining 
    return " to ".join(split_date)               # ugly, but saves a line
                                         
print(number_to_name("01/01/2017 to 04/01/2017"))
January to April

It might make sense to define the dict outside the function so it doesn’t recreate it every time in practice. Feel free to take this and do whatever to it, and I am open to hearing how people would improve it.