hdx.utilities.dateparse
Date parsing utilities.
get_tzinfos
def get_tzinfos(timezone_info: str) -> Dict[str, int]
Get tzinfos dictionary used by dateutil from timezone information string.
Arguments:
timezone_info
str - Timezone information string
Returns:
Dict[str, int]: tzinfos dictionary
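As a rough illustration of what such a conversion involves, the sketch below turns a timezone information string (an hour offset followed by the abbreviations that share it) into a mapping of abbreviation to offset in seconds, which is one of the forms dateutil accepts for tzinfos. The helper name sketch_tzinfos and the token-based parsing are assumptions made for illustration, not the library's implementation.

```python
import re
from typing import Dict

# Matches a signed hour offset such as "-11", "+5.5" or "0".
_OFFSET = re.compile(r"[+-]?\d+(?:\.\d+)?")


def sketch_tzinfos(timezone_info: str) -> Dict[str, int]:
    """Hypothetical helper: map timezone abbreviations to UTC offsets in seconds."""
    tzinfos: Dict[str, int] = {}
    offset_seconds = None
    for token in timezone_info.split():
        if _OFFSET.fullmatch(token):
            # A numeric token starts a new group of abbreviations.
            offset_seconds = int(float(token) * 3600)
        elif offset_seconds is not None:
            # Any other token is an abbreviation belonging to the current offset.
            tzinfos[token] = offset_seconds
    return tzinfos


tzinfos = sketch_tzinfos("-11 X NUT SST -10 W CKT HAST HST TAHT TKT")
print(tzinfos["SST"], tzinfos["HST"])  # -39600 -36000
```

Because the sketch splits on any whitespace, it treats a one-line string and a string with one offset group per line the same way.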
parse
def parse(timestr, default=None, ignoretz=False, tzinfos=None, **kwargs)
Parse the date/time string into a datetime.datetime object.
Arguments:
timestr
: Any date/time string using the supported formats.
default
: The default datetime object. If this is a datetime object and not None, elements specified in timestr replace elements in the default object.
ignoretz
: If set to True, time zones in parsed strings are ignored and a naive datetime.datetime object is returned.
tzinfos
: Additional time zone names / aliases which may be present in the string. This argument maps time zone names (and optionally offsets from those time zones) to time zones. This parameter can be a dictionary with timezone aliases mapping time zone names to time zones or a function taking two parameters (tzname and tzoffset) and returning a time zone.
The timezones to which the names are mapped can be an integer offset from UTC in seconds or a tzinfo object.
>>> from dateutil.parser import parse
>>> from dateutil.tz import gettz
>>> tzinfos = {"BRST": -7200, "CST": gettz("America/Chicago")}
>>> parse("2012-01-19 17:21:00 BRST", tzinfos=tzinfos)
datetime.datetime(2012, 1, 19, 17, 21, tzinfo=tzoffset('BRST', -7200))
>>> parse("2012-01-19 17:21:00 CST", tzinfos=tzinfos)
datetime.datetime(2012, 1, 19, 17, 21, tzinfo=tzfile('/usr/share/zoneinfo/America/Chicago'))
This parameter is ignored if ignoretz is set.
**kwargs
: Keyword arguments as passed to _parse().
Raises:
ParserError
: Raised for invalid or unknown string format, if the provided tzinfo is not in a valid format, or if an invalid date would be created.
TypeError
: Raised for non-string or character stream input.
OverflowError
: Raised if the parsed date exceeds the largest valid C integer on your system.
Returns:
Returns a datetime.datetime object or, if the fuzzy_with_tokens option is True, returns a tuple, the first element being a datetime.datetime object, the second a tuple containing the fuzzy tokens.
now_utc
def now_utc() -> datetime
Return the current date and time with the UTC timezone.
Returns:
datetime
- Now with UTC timezone
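For orientation, a timezone-aware "now" in UTC can be obtained with the standard library alone. The function name sketch_now_utc below is hypothetical and only illustrates the kind of value described here.

```python
from datetime import datetime, timezone


def sketch_now_utc() -> datetime:
    """Hypothetical stand-in: current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


now = sketch_now_utc()
print(now.tzinfo)  # UTC
```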
parse_date_range
def parse_date_range(
    string: str,
    date_format: Optional[str] = None,
    timezone_handling: int = 0,
    fuzzy: Optional[Dict] = None,
    include_microseconds: bool = False,
    zero_time: bool = False,
    max_starttime: bool = False,
    max_endtime: bool = False,
    default_timezones: Optional[str] = None
) -> Tuple[datetime, datetime]
Parse date from string using the specified date_format, if given, and return the date range as a tuple of start and end datetimes. If no date_format is supplied, the function will guess, which should work fine for unambiguous formats.
By default, no timezone information will be parsed and the returned datetime will have timezone UTC. To change this behaviour, timezone_handling should be changed from its default of 0. If it is 1, then no timezone information will be parsed and a naive datetime will be returned. If it is 2 or more, then timezone information will be parsed. For 2, failure to parse timezone will result in a naive datetime. For 3, failure to parse timezone will result in the timezone being set to UTC. For 4 and 5, the time will be converted from whatever timezone is identified to UTC. For 4, failure to parse timezone will result in a naive (local) datetime converted to UTC. For 5, failure to parse timezone will result in the timezone being set to UTC.
To parse a date within a string containing other text, you can supply a dictionary in the fuzzy parameter. In this case, dateutil's fuzzy parsing is used and the results are returned in the dictionary under the keys startdate, enddate, date (the string elements used to make the date) and nondate (the non-date part of the string).
By default, microseconds are ignored (set to 0), but can be included by setting include_microseconds to True. Any time elements are set to 0 if zero_time is True. If max_starttime is True, then the start date's time is set to 23:59:59. If max_endtime is True, then the end date's time is set to 23:59:59.
When inferring time zones, a default set of time zones will be used unless overridden by passing in default_timezones which is a string of the form:
-11 X NUT SST
-10 W CKT HAST HST TAHT TKT
Arguments:
string
str - Dataset date string
date_format
Optional[str] - Date format. If None is given, will attempt to guess. Defaults to None.
timezone_handling
int - Timezone handling. See description. Defaults to 0 (ignore timezone, return UTC).
fuzzy
Optional[Dict] - If a dict is supplied, fuzzy matching will be used and the results returned in the dict.
include_microseconds
bool - Includes microseconds if True. Defaults to False.
zero_time
bool - Zero time elements of datetime if True. Defaults to False.
max_starttime
bool - Make start date time component 23:59:59.999999. Defaults to False.
max_endtime
bool - Make end date time component 23:59:59.999999. Defaults to False.
default_timezones
Optional[str] - Timezone information. Defaults to None (internal default).
Returns:
Tuple[datetime,datetime]
- Tuple containing start date and end date
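The time-related flags described above can be pictured as post-processing applied to an already parsed start and end date. The following is a minimal sketch using only the standard library. The helper name sketch_adjust_range, the order in which the flags are applied, and the choice to use 999999 microseconds only when include_microseconds is True are all assumptions, not confirmed behaviour of parse_date_range.

```python
from datetime import datetime, timezone
from typing import Tuple


def sketch_adjust_range(
    startdate: datetime,
    enddate: datetime,
    include_microseconds: bool = False,
    zero_time: bool = False,
    max_starttime: bool = False,
    max_endtime: bool = False,
) -> Tuple[datetime, datetime]:
    """Hypothetical helper applying the time flags to a (start, end) pair."""
    if not include_microseconds:
        # Microseconds are ignored (set to 0) by default.
        startdate = startdate.replace(microsecond=0)
        enddate = enddate.replace(microsecond=0)
    if zero_time:
        startdate = startdate.replace(hour=0, minute=0, second=0, microsecond=0)
        enddate = enddate.replace(hour=0, minute=0, second=0, microsecond=0)
    # Assumption: 999999 microseconds are only used when microseconds are kept.
    micro = 999999 if include_microseconds else 0
    if max_starttime:
        startdate = startdate.replace(hour=23, minute=59, second=59, microsecond=micro)
    if max_endtime:
        enddate = enddate.replace(hour=23, minute=59, second=59, microsecond=micro)
    return startdate, enddate


start = datetime(2020, 2, 1, 10, 30, 15, 123456, tzinfo=timezone.utc)
end = datetime(2020, 2, 29, 18, 45, 5, 654321, tzinfo=timezone.utc)
new_start, new_end = sketch_adjust_range(start, end, zero_time=True, max_endtime=True)
print(new_start.isoformat())  # 2020-02-01T00:00:00+00:00
print(new_end.isoformat())    # 2020-02-29T23:59:59+00:00
```

Combining zero_time with max_endtime, as in the usage above, gives a range that covers whole days from the start of the first to the end of the last.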
parse_date
def parse_date(string: str,
               date_format: Optional[str] = None,
               timezone_handling: int = 0,
               fuzzy: Optional[Dict] = None,
               include_microseconds: bool = False,
               zero_time: bool = False,
               max_time: bool = False,
               default_timezones: Optional[str] = None) -> datetime
Parse date from string using the specified date_format, if given, and return a datetime object. Raises an exception for dates that are missing year, month or day. If no date_format is supplied, the function will guess, which should work fine for unambiguous formats.
By default, no timezone information will be parsed and the returned datetime will have timezone UTC. To change this behaviour, timezone_handling should be changed from its default of 0. If it is 1, then no timezone information will be parsed and a naive datetime will be returned. If it is 2 or more, then timezone information will be parsed. For 2, failure to parse timezone will result in a naive datetime. For 3, failure to parse timezone will result in the timezone being set to UTC. For 4 and 5, the time will be converted from whatever timezone is identified to UTC. For 4, failure to parse timezone will result in a naive (local) datetime converted to UTC. For 5, failure to parse timezone will result in the timezone being set to UTC.
To parse a date within a string containing other text, you can supply a dictionary in the fuzzy parameter. In this case, dateutil's fuzzy parsing is used and the results are returned in the dictionary under the keys startdate, enddate, date (the string elements used to make the date) and nondate (the non-date part of the string).
By default, microseconds are ignored (set to 0), but can be included by setting include_microseconds to True. Any time elements are set to 0 if zero_time is True. If max_time is True, then the date's time is set to 23:59:59.
When inferring time zones, a default set of time zones will be used unless overridden by passing in default_timezones which is a string of the form:
-11 X NUT SST
-10 W CKT HAST HST TAHT TKT
Arguments:
string
str - Dataset date string
date_format
Optional[str] - Date format. If None is given, will attempt to guess. Defaults to None.
timezone_handling
int - Timezone handling. See description. Defaults to 0 (ignore timezone, return UTC).
fuzzy
Optional[Dict] - If a dict is supplied, fuzzy matching will be used and the results returned in the dict.
include_microseconds
bool - Includes microseconds if True. Defaults to False.
zero_time
bool - Zero time elements of datetime if True. Defaults to False.
max_time
bool - Make date time component 23:59:59.999999. Defaults to False.
default_timezones
Optional[str] - Timezone information. Defaults to None (internal default).
Returns:
datetime
- The parsed date
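One way to read the timezone_handling values 0 to 5 described above is as a rule applied to the datetime that comes out of parsing, which is timezone-aware if a timezone was found in the string and naive otherwise. The sketch below encodes that reading with the standard library only. The helper name sketch_apply_timezone_handling is hypothetical, and the code is an interpretation of the description rather than the library's implementation.

```python
from datetime import datetime, timedelta, timezone


def sketch_apply_timezone_handling(parsed: datetime, timezone_handling: int = 0) -> datetime:
    """Hypothetical helper: apply modes 0-5 to an aware or naive parse result."""
    if timezone_handling in (0, 1):
        # Timezone information is not parsed: discard whatever was found.
        naive = parsed.replace(tzinfo=None)
        return naive.replace(tzinfo=timezone.utc) if timezone_handling == 0 else naive
    if parsed.tzinfo is None:
        # The timezone could not be parsed from the string.
        if timezone_handling == 2:
            return parsed  # stays naive
        if timezone_handling == 4:
            # astimezone() interprets a naive datetime as system local time.
            return parsed.astimezone(timezone.utc)
        return parsed.replace(tzinfo=timezone.utc)  # modes 3 and 5
    if timezone_handling in (4, 5):
        return parsed.astimezone(timezone.utc)  # convert to UTC
    return parsed  # modes 2 and 3 keep the parsed timezone


brst = timezone(timedelta(hours=-2), "BRST")
aware = datetime(2012, 1, 19, 17, 21, tzinfo=brst)
print(sketch_apply_timezone_handling(aware, 0).isoformat())  # 2012-01-19T17:21:00+00:00
print(sketch_apply_timezone_handling(aware, 2).isoformat())  # 2012-01-19T17:21:00-02:00
print(sketch_apply_timezone_handling(aware, 4).isoformat())  # 2012-01-19T19:21:00+00:00
```

Modes 0 and 4 both return UTC, but only mode 4 shifts the clock time: mode 0 relabels 17:21 as UTC, while mode 4 converts 17:21 at UTC-2 to 19:21 UTC.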
get_timestamp_from_datetime
def get_timestamp_from_datetime(date: datetime) -> float
Convert datetime to timestamp.
Arguments:
date
datetime - Date to convert
Returns:
float
- Timestamp
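A datetime can be converted to a timestamp by measuring its distance from the Unix epoch. The sketch below shows that approach with the standard library. The name sketch_timestamp_from_datetime and the treatment of naive datetimes as UTC are assumptions for illustration only.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def sketch_timestamp_from_datetime(date: datetime) -> float:
    """Hypothetical equivalent: seconds since the Unix epoch as a float."""
    if date.tzinfo is None:
        # Assumption: naive datetimes are treated as UTC in this sketch.
        date = date.replace(tzinfo=timezone.utc)
    return (date - EPOCH).total_seconds()


print(sketch_timestamp_from_datetime(datetime(2020, 1, 1, tzinfo=timezone.utc)))  # 1577836800.0
```

Subtracting the epoch also yields negative values for dates before 1970 without relying on platform-specific local time handling.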
get_datetime_from_timestamp
def get_datetime_from_timestamp(
    timestamp: float,
    timezone: datetime.tzinfo = timezone.utc,
    today: datetime = now_utc()
) -> datetime
Convert timestamp to datetime.
Arguments:
timestamp
float - Timestamp to convert
timezone
datetime.tzinfo - Timezone to use
today
datetime - Today's date. Defaults to now_utc.
Returns:
datetime
- Date of timestamp
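The reverse conversion can be sketched with datetime.fromtimestamp and an explicit timezone. The name sketch_datetime_from_timestamp is hypothetical, and the sketch leaves out the today argument because its role is not described here.

```python
from datetime import datetime, timezone, tzinfo


def sketch_datetime_from_timestamp(timestamp: float, tz: tzinfo = timezone.utc) -> datetime:
    """Hypothetical equivalent: timezone-aware datetime for a Unix timestamp."""
    return datetime.fromtimestamp(timestamp, tz=tz)


print(sketch_datetime_from_timestamp(1577836800.0).isoformat())  # 2020-01-01T00:00:00+00:00
```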
iso_string_from_datetime
def iso_string_from_datetime(date: datetime) -> str
Convert datetime to an ISO formatted date without any time elements.
Arguments:
date
datetime - Date to convert to string
Returns:
str
- ISO formatted date without any time elements
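An ISO formatted date without time elements is what the standard library produces for the date part of a datetime. The name sketch_iso_string_from_datetime below is hypothetical and shows only the expected shape of the output.

```python
from datetime import datetime, timezone


def sketch_iso_string_from_datetime(date: datetime) -> str:
    """Hypothetical equivalent: YYYY-MM-DD string for the date part only."""
    return date.date().isoformat()


print(sketch_iso_string_from_datetime(datetime(2020, 2, 29, 13, 45, tzinfo=timezone.utc)))  # 2020-02-29
```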