SoFunction
Updated on 2025-04-26

Summary of date and time processing in Pandas

Pandas provides powerful date and time processing capabilities, which are essential for time series analysis. This tutorial introduces the main ways of handling dates and times in Pandas, including:

  • Creation and conversion of date and time data
  • Extraction of date and time attributes
  • Time difference calculation and date calculation
  • Resampling and frequency conversion
  • Time zone processing
  • Date-time-based indexing operations

Date and time types in Pandas

  • Timestamp: a specific moment in time, such as 8:00 pm on August 8, 2008.
  • Period: a span of time between a specific start and end point, such as the year 2015. Periods are a special case of time intervals in which every interval has the same length and intervals do not overlap (for example, the 24-hour periods that make up days).
  • Timedelta (time increment or duration): an exact length of time, such as 22.56 seconds.
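
The three types can be sketched with the examples from the list above. This is only an illustration; the variable names are placeholders, not part of any API.

```python
import pandas as pd

# Timestamp: a single moment in time
moment = pd.Timestamp("2008-08-08 20:00")

# Period: a fixed span, here the whole year 2015
year = pd.Period("2015", freq="Y")

# Timedelta: an exact duration
delta = pd.Timedelta("22.56s")

print(moment)
print(year.start_time, year.end_time)  # first and last instant of the period
print(moment + delta)                  # Timestamp + Timedelta gives a new Timestamp
```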

1. Create date and time data

1.1 Use the to_datetime() function

import pandas as pd

# Convert strings to datetime
date_str = ['2023-01-01', '2023-01-02', '2023-01-03']
dates = pd.to_datetime(date_str)
print(dates)
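
Real-world strings are often not in ISO format and may contain invalid entries. The following sketch, using made-up input values, shows how the format and errors parameters of to_datetime() can help in that case.

```python
import pandas as pd

# Hypothetical raw input: day/month/year strings plus one invalid entry
raw = ["01/02/2023", "15/02/2023", "not a date"]

# format= states the layout explicitly (day/month/year here);
# errors="coerce" turns unparseable entries into NaT instead of raising
parsed = pd.to_datetime(raw, format="%d/%m/%Y", errors="coerce")

print(parsed)
print(parsed.isna())
```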

1.2 Create a date range

# Create a date range
date_range = pd.date_range('2023-01-01', periods=5, freq='D')
print(date_range)

# Date range with a time zone
date_range_tz = pd.date_range('2023-01-01', periods=5, freq='D', tz='Asia/Shanghai')
print(date_range_tz)
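
Besides daily steps, date_range() accepts other frequency aliases. As a small sketch, "B" generates business days (Monday to Friday) and "MS" generates month starts; the dates chosen here are arbitrary.

```python
import pandas as pd

# Business days between two dates (2023-01-01 is a Sunday, so it is skipped)
business_days = pd.date_range("2023-01-01", "2023-01-10", freq="B")

# The first day of three consecutive months
month_starts = pd.date_range("2023-01-01", periods=3, freq="MS")

print(business_days)
print(month_starts)
```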

2. Access date and time attributes

# Create a sample DataFrame
df = pd.DataFrame({
    'date': pd.date_range('2023-01-01', periods=5, freq='D'),
    'value': [10, 20, 30, 40, 50]
})

# Extract attributes such as year, month, and day through the .dt accessor
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month
df['day'] = df['date'].dt.day
df['day_of_week'] = df['date'].dt.dayofweek  # Monday=0, Sunday=6
df['day_name'] = df['date'].dt.day_name()
df['is_weekend'] = df['date'].dt.dayofweek >= 5

print(df)
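
The .dt accessor is needed only on a Series. On a DatetimeIndex the same attributes are available directly, as this short sketch with arbitrary dates suggests.

```python
import pandas as pd

idx = pd.date_range("2023-01-30", periods=4, freq="D")  # Jan 30 through Feb 2
s = pd.Series(idx)

# On a Series, datetime attributes live under the .dt accessor
quarters = s.dt.quarter.tolist()
month_end_flags = s.dt.is_month_end.tolist()

# On a DatetimeIndex, the attributes are accessed without .dt
months = idx.month.tolist()

print(quarters, month_end_flags, months)
```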

3. Date and time operations

3.1 Time difference calculation

# Calculate the time difference between consecutive rows
df['date_diff'] = df['date'] - df['date'].shift(1)
print(df[['date', 'date_diff']])

# Use Timedelta for date arithmetic
df['date_plus_2days'] = df['date'] + pd.Timedelta(days=2)
df['date_plus_3hours'] = df['date'] + pd.Timedelta(hours=3)
print(df)
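
A time difference is often needed as a plain number. One way to get it, sketched here with invented start and end times, is through the .dt accessor of the timedelta Series.

```python
import pandas as pd

# Hypothetical start and end times
start = pd.Series(pd.to_datetime(["2023-01-01 08:00", "2023-01-02 09:30"]))
end = pd.Series(pd.to_datetime(["2023-01-03 20:00", "2023-01-02 12:00"]))

elapsed = end - start  # a Series of timedeltas

whole_days = elapsed.dt.days.tolist()                    # whole days only
hours = (elapsed.dt.total_seconds() / 3600).tolist()     # fractional hours

print(whole_days)
print(hours)
```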

3.2 Date comparison

# Date comparison
start_date = pd.to_datetime('2023-01-02')
df['after_start_date'] = df['date'] > start_date
print(df[['date', 'after_start_date']])
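
The boolean result of a comparison can also be used to filter rows. The sketch below rebuilds the sample DataFrame so that it runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.date_range("2023-01-01", periods=5, freq="D"),
    "value": [10, 20, 30, 40, 50],
})

# A comparison yields a boolean mask, which selects the matching rows
after = df[df["date"] > pd.Timestamp("2023-01-02")]

# between() includes both endpoints by default
window = df[df["date"].between("2023-01-02", "2023-01-04")]

print(after["value"].tolist())
print(window["value"].tolist())
```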

4. Resampling and frequency conversion

# Create sample time series data
ts = pd.Series(
    [1, 2, 3, 4, 5],
    index=pd.date_range('2023-01-01', periods=5, freq='D')
)

# Downsampling (to a lower frequency): calculate the weekly average
weekly = ts.resample('W').mean()
print("Weekly resample:\n", weekly)

# Upsampling (to a higher frequency): forward-fill the new, empty slots
# ('h' replaces the 'H' alias, which is deprecated since pandas 2.2)
hourly = ts.resample('h').ffill()
print("Hourly resample (forward fill):\n", hourly.head(10))  # Show only the first 10 rows
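
To make the resampling behavior more concrete, the following sketch repeats the series and inspects the results. It assumes the default "W" alias, which means weeks ending on Sunday, and uses a 12-hour step instead of hourly to keep the output short.

```python
import pandas as pd

ts = pd.Series(
    [1, 2, 3, 4, 5],
    index=pd.date_range("2023-01-01", periods=5, freq="D"),
)

# 2023-01-01 is a Sunday, so it closes the first weekly bin on its own;
# the remaining four days fall into the bin labeled 2023-01-08
weekly = ts.resample("W").mean()

# Upsampling creates new slots: asfreq() leaves them as NaN,
# ffill() carries the last known value forward
half_daily_raw = ts.resample("12h").asfreq()
half_daily_filled = ts.resample("12h").ffill()

print(weekly)
print(half_daily_raw.head(4))
print(half_daily_filled.head(4))
```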

5. Time zone processing

# Localize to a time zone
ts = pd.Series(
    [1, 2, 3],
    index=pd.date_range('2023-01-01', periods=3, freq='D')
)
ts = ts.tz_localize('UTC')
print("UTC timezone:\n", ts)

# Convert to another time zone
ts_shanghai = ts.tz_convert('Asia/Shanghai')
print("Shanghai timezone:\n", ts_shanghai)
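
The difference between tz_localize and tz_convert is easy to mix up. As a minimal sketch on a single timestamp: localizing attaches a zone without changing the wall-clock reading, while converting keeps the instant and changes the reading.

```python
import pandas as pd

naive = pd.Timestamp("2023-01-01 12:00")  # no time zone attached

# tz_localize: declare which zone the naive reading belongs to
utc = naive.tz_localize("UTC")

# tz_convert: express the same instant in another zone
shanghai = utc.tz_convert("Asia/Shanghai")

print(utc)              # 2023-01-01 12:00:00+00:00
print(shanghai)         # 2023-01-01 20:00:00+08:00
print(utc == shanghai)  # True: both describe the same instant
```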

6. Date and time index operations

# Set the date column as the index
df.set_index('date', inplace=True)

# Slice by year
print(df.loc['2023'])

# Slice by month
print(df.loc['2023-01'])

# Slice by date range (both endpoints are included)
print(df.loc['2023-01-02':'2023-01-04'])
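
Selecting by a partial date string is more visible on data that spans more than one month. A small sketch with an invented 60-day series:

```python
import pandas as pd

# Hypothetical series covering 2023-01-01 through 2023-03-01
s = pd.Series(range(60), index=pd.date_range("2023-01-01", periods=60, freq="D"))

jan = s.loc["2023-01"]                    # every row in January 2023
span = s.loc["2023-01-30":"2023-02-02"]   # label slices include both endpoints

print(len(jan), len(span))
```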

7. Practical application examples

# Read data containing dates and times
# Suppose there is a CSV file that contains a date column
# df = pd.read_csv('', parse_dates=['date_column'])

# Handle missing dates: reindex onto a complete daily range
full_date_range = pd.date_range(start=df.index.min(), end=df.index.max(), freq='D')
df = df.reindex(full_date_range)

# Fill in missing values by forward filling
df['value'] = df['value'].ffill()

# Calculate the rolling average over a 7-day window
df['7_day_avg'] = df['value'].rolling(window='7D').mean()

print(df.head(10))
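
The same steps can be followed on a tiny data set where the gap is visible. The values below are invented, and the window is shortened to two days so the averages can be checked by hand.

```python
import pandas as pd

# Hypothetical data with 2023-01-03 missing
df = pd.DataFrame(
    {"value": [10.0, 20.0, 40.0]},
    index=pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-04"]),
)

full_range = pd.date_range(df.index.min(), df.index.max(), freq="D")
df = df.reindex(full_range)          # 2023-01-03 now exists, with NaN
df["value"] = df["value"].ffill()    # carry the last known value forward

# Time-based window: each row averages the values from the last two days
df["2_day_avg"] = df["value"].rolling(window="2D").mean()

print(df)
```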

8. Advanced Tips

8.1 Custom work calendar

from pandas.tseries.offsets import CustomBusinessDay
from pandas.tseries.holiday import USFederalHolidayCalendar

# Use the United States federal holiday calendar
us_bd = CustomBusinessDay(calendar=USFederalHolidayCalendar())
date_range = pd.date_range('2023-01-01', periods=10, freq=us_bd)
print("US business days only:\n", date_range)
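
CustomBusinessDay also accepts an explicit list of holidays, which may be handy when no ready-made calendar fits. In this sketch the holiday date is made up for illustration.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

# Hypothetical company calendar: 2023-01-03 is a day off
company_bd = CustomBusinessDay(holidays=["2023-01-03"])

# 2023-01-02 is a Monday; the holiday on Tuesday is skipped
days = pd.date_range("2023-01-02", periods=4, freq=company_bd)

print(days)
```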

8.2 Quarterly data processing

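# Create quarterly data ('QE' is the quarter-end alias in pandas 2.2+; use 'Q' in older versions)
quarterly = pd.Series(
    [100, 200, 300, 400],
    index=pd.date_range('2023-01-01', periods=4, freq='QE')
)

# Quarterly start and end dates
# The index already holds quarter-end dates, so the starts are derived through periods
print("Quarter start:\n", quarterly.index.to_period('Q').start_time)
print("Quarter end:\n", quarterly.index + pd.offsets.QuarterEnd(0))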
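
An alternative sketch indexes the quarterly data by periods rather than timestamps, so that each label stands for the whole quarter. The series name and values are placeholders.

```python
import pandas as pd

# Four quarterly periods starting at 2023Q1
quarters = pd.period_range("2023Q1", periods=4, freq="Q")
sales = pd.Series([100, 200, 300, 400], index=quarters)

# Each period knows its own boundaries
starts = quarters.start_time
ends = quarters.end_time.normalize()  # drop the 23:59:59.999999999 time part

print(sales)
print(starts)
print(ends)
```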
