date = "01/01/2017 to 04/01/2017"
Simple Date Strings Conversion: split-process-join
Someone recently asked me about converting dates in numeric form to month names to make approximate ranges that are easier to read. For example, “01/01/2017 to 04/01/2017” would become “January to April” (in the context of what I was being asked, the years were the columns in a dataframe, so I wasn’t concerned with the year).
Firstly, when confronted with this type of challenge, it is best to do a little background reading. If the times are in a certain format, for instance, there might already be packages that let you manipulate and convert them, like Python’s datetime module. For this post, we will just assume we have raw strings and have to deal with them ourselves.
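For reference, the datetime route could look something like the minimal sketch below. It assumes the strings are in MM/DD/YYYY format, and the helper name month_name_from_string is made up for illustration. Note that %B returns the month name in the current locale, so the result is only guaranteed to be English under an English or default locale.

```python
from datetime import datetime

def month_name_from_string(date_string):
    # Hypothetical helper: parse a string assumed to be MM/DD/YYYY
    parsed = datetime.strptime(date_string, "%m/%d/%Y")
    # %B is the full month name in the current locale
    return parsed.strftime("%B")

print(month_name_from_string("04/01/2017"))
```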
That said, let’s see what we are dealing with.
Note that the information is all in one string, as opposed to a list with the dates in their own entries.
The first thing that might come to mind is a regular expression. They’re a powerful tool, and they might be useful in a problem like this. I ended up finding a way to do it without them. In general, I try to avoid them for simple tasks because trying to add a regex to a program is like busting out your pepper spray in a mosh pit. Quick, easy, life saver? Spark that turns a dicey situation into a full-on incident? Well-intended plan that backfires, causing hours of pain? You just don’t know.
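For comparison only, a regex version might look roughly like the sketch below. The pattern, the name DATE_PATTERN, and the function to_month_names are all assumptions for illustration, and it assumes every date is MM/DD/YYYY. It also leans on calendar.month_name, which is locale-dependent, and a month number above 12 would raise an IndexError.

```python
import calendar
import re

# Assumed pattern: two-digit month, two-digit day, four-digit year
DATE_PATTERN = re.compile(r"(\d{2})/\d{2}/\d{4}")

def to_month_names(text):
    # Replace each matched date with the name of its month
    return DATE_PATTERN.sub(
        lambda match: calendar.month_name[int(match.group(1))], text
    )

print(to_month_names("01/01/2017 to 04/01/2017"))
```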
I used a “split-process-join” approach.
In order to simplify the process, I split the string on “ to ” so we can deal with each date on its own.
# use the split function
split_date = date.split(" to ")
print(split_date)
['01/01/2017', '04/01/2017']
Now we can iterate over the list and use the .startswith() method, as this is a list of strings. I use the range() function instead of “for x in y” because we want to refer to list elements by their index so that we can replace them in place.
# for each index in the length of the list...
for i in range(len(split_date)):
    # give it a name for readability
    entry = split_date[i]
    # if the entry there starts with "01/" -> "January"
    if entry.startswith("01/"):
        # replace the item in the list with the month
        split_date[i] = "January"
    # as above with "April"
    if entry.startswith("04/"):
        # put "April" there
        split_date[i] = "April"
# see the output
print(split_date)
['January', 'April']
To finish the process, we simply need to join the entries back on the “ to ”. Because .split() and .join() are string methods in Python, the join is admittedly awkward looking.
# on " to " join split_date
final = " to ".join(split_date)
# show results
print(final)
January to April
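If the join syntax looks backwards, it may help to see that the separator string is the one doing the joining, and that split and join undo each other. A small sketch with made-up values:

```python
separator = " to "

# Splitting on the separator gives a list of the pieces
parts = "May to June".split(separator)
print(parts)

# The separator is the string; the list is the argument
rejoined = separator.join(parts)
print(rejoined)
```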
And we’re there :) All in one place, and from the top, that gives us:
date = "01/01/2017 to 04/01/2017"
split_date = date.split(" to ")
for i in range(len(split_date)):
    entry = split_date[i]
    if entry.startswith("01/"):
        split_date[i] = "January"
    if entry.startswith("04/"):
        split_date[i] = "April"
final = " to ".join(split_date)
print(final)
January to April
Obviously, this is just a proof of concept that only works for two months, but it is easy to see how it could be made into a function. What would your next step be? I’d go with a function using a dict.
def number_to_name(date):
    # create a dict that stores the pairs
    d = {"01/":"January", "02/":"February", "03/":"March",
         "04/":"April" , "05/":"May", "06/":"June",
         "07/":"July", "08/":"August", "09/":"September",
         "10/":"October", "11/":"November", "12/":"December" }
    split_date = date.split(" to ")
    for i in range(len(split_date)):
        entry = split_date[i]
        # the first 3 chars denote the month
        month = entry[:3]
        # sub them in from the dict
        split_date[i] = d[month]
    # return after joining
    return " to ".join(split_date) # ugly, but saves a line
print(number_to_name("01/01/2017 to 04/01/2017"))
January to April
In practice, it might make sense to define the dict outside the function so it isn’t recreated on every call. Feel free to take this and do whatever you like with it; I am open to hearing how people would improve it.
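One way that could look is sketched below: the dict lives at module level and the function only looks things up. The name MONTH_NAMES and the use of .get() to leave unrecognized entries untouched are assumptions added for illustration, not part of the function above.

```python
# Built once at import time instead of on every call
MONTH_NAMES = {
    "01/": "January", "02/": "February", "03/": "March",
    "04/": "April", "05/": "May", "06/": "June",
    "07/": "July", "08/": "August", "09/": "September",
    "10/": "October", "11/": "November", "12/": "December",
}

def number_to_name(date):
    split_date = date.split(" to ")
    # Look up the first three characters; keep the entry as-is if unknown
    names = [MONTH_NAMES.get(entry[:3], entry) for entry in split_date]
    return " to ".join(names)

print(number_to_name("01/01/2017 to 04/01/2017"))
```

Falling back to the original entry is a design choice; raising an error on an unknown prefix would be just as reasonable if bad input should not pass silently.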