Saturday, February 19, 2011

How do I strptime from a pattern like this?

Hi, I need to use datetime.strptime on text that looks like the following:

"Some Random text of undetermined length Jan 28, 1986"

How do I do this?

From Stack Overflow
  • Take the last three words; no regexps are needed. Using the time module:

    >>> import time
    >>> a="Some Random text of undetermined length Jan 28, 1986"
    >>> datetuple = a.rsplit(" ",3)[-3:]
    >>> datetuple
    ['Jan', '28,', '1986']
    >>> time.strptime(' '.join(datetuple),"%b %d, %Y")
    time.struct_time(tm_year=1986, tm_mon=1, tm_mday=28, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=1, tm_yday=28, tm_isdst=-1)
    >>>
    

    Using the datetime module:

    >>> from datetime import datetime
    >>> datetime.strptime(" ".join(datetuple), "%b %d, %Y")
    datetime.datetime(1986, 1, 28, 0, 0)
    >>>
    
  • You may find this question useful. I'll give the answer I gave there, which is to use the dateutil module. Its parser accepts a fuzzy parameter, which makes it ignore any text that doesn't look like a date, i.e.:

    >>> from dateutil.parser import parse
    >>> parse("Some Random text of undetermined length Jan 28, 1986", fuzzy=True)
    datetime.datetime(1986, 1, 28, 0, 0)
    
  • Don't try to use strptime to capture the non-date text. For good fuzzy matching, dateutil.parser is great, but if you know the format of the date, you could use a regular expression to find the date within the string, then use strptime to turn it into a datetime object, like this:

    import datetime
    import re
    
    s = "Some Random text of undetermined length Jan 28, 1986"  # the text to search
    pattern = "((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [0-9]+, [0-9]+)"
    datestr = re.search(pattern, s).group(0)
    d = datetime.datetime.strptime(datestr, "%b %d, %Y")
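
The regex-then-strptime approach can be wrapped in a small function that also copes with text containing no date. This is only a sketch: the names extract_date and DATE_RE are made up for illustration, the pattern is a slightly tightened variant of the one above (1-2 digit day, 4-digit year), and it assumes an English locale, since %b matches locale-dependent month abbreviations.

```python
import re
from datetime import datetime

# Hypothetical names for illustration; the pattern follows the one in the answer,
# tightened to a 1-2 digit day and a 4-digit year.
DATE_RE = re.compile(
    r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{1,2}, \d{4}\b"
)

def extract_date(text):
    """Return the first 'Mon DD, YYYY' date found in text as a datetime, else None."""
    match = DATE_RE.search(text)
    if match is None:
        # re.search returns None when nothing matches, so calling .group(0)
        # on it directly would raise AttributeError.
        return None
    try:
        return datetime.strptime(match.group(0), "%b %d, %Y")
    except ValueError:
        # The regex can match strings that are not real dates, e.g. "Feb 31, 1986".
        return None

print(extract_date("Some Random text of undetermined length Jan 28, 1986"))
print(extract_date("no date here"))
```

The first call prints 1986-01-28 00:00:00 and the second prints None. Returning None instead of raising is a design choice for this sketch; raising an exception may suit callers that treat a missing date as an error.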
    
