Sometimes we want to extract the average value, or some other statistic, of raster data over a particular region. For example, we may want to extract many years of rainfall data for a province and compute statistics by region, or pull the values for one region (say, the area covered by a city) out of a set of rasters to form a series that is easy to plot. The rasterstats library handles both tasks: zonal statistics and batch value extraction.
The data used in this experiment are rasters (*.tif) and vectors (*.shp); both the zonal statistics and the raster value extraction below are derived from these two types of data. To use rasterstats I ran the script on Google Colab, because installing libraries there is very convenient: the library would never install on my Windows machine, but in the Google notebook it worked right away. You can also store the data on Google Drive and access it directly from the notebook.
Now for the test. The data used are the raster and vector datasets shown on the left.
Import the related modules:
import geopandas as gpd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import rasterio
import rasterstats
from rasterio.plot import show       # show() displays a raster
from rasterio.plot import show_hist  # show_hist() displays a histogram
import cartopy.crs as ccrs
import cartopy.feature as cfeature
from cartopy.mpl.ticker import LongitudeFormatter, LatitudeFormatter
Read the vector and raster data with geopandas and rasterio respectively:
# Read the vector data with geopandas
districts = gpd.read_file('/content/drive/MyDrive/Datashpraster/Data/Districts/')
# Read the raster data with rasterio; the raster and vector data should use the same coordinate reference system
raster = rasterio.open('/content/drive/MyDrive/Datashpraster/Data/Rainfall Data Rasters/')
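The requirement that both layers share a coordinate reference system can be handled in code. The following is only a minimal sketch, not the code used in this post: the one-polygon GeoDataFrame and the target CRS EPSG:3857 are stand-ins assumed for illustration. With the real data, the target would be raster.crs from the opened rasterio dataset.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-in for the districts layer, in geographic coordinates
districts_demo = gpd.GeoDataFrame(
    {'name': ['demo']},
    geometry=[box(-9.0, 41.0, -8.0, 42.0)],
    crs='EPSG:4326',
)

# Assumed raster CRS; with real data this would be raster.crs
target_crs = 'EPSG:3857'

# Reproject the vector layer so that both layers share one CRS
districts_demo = districts_demo.to_crs(target_crs)
print(districts_demo.crs)
```

Reprojecting the vector layer rather than the raster is usually the cheaper choice, since it avoids resampling the pixel values.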
# Plot the vector and raster data on one Axes (a matplotlib subplot, not a coordinate axis)
plt.rcParams['font.family'] = 'Times New Roman'
plt.rcParams['font.size'] = 20
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
show(raster, ax=ax1, title='Rainfall')
# The vector data can be drawn directly with the GeoDataFrame's plot() method
districts.plot(ax=ax1, facecolor='None', edgecolor='red')
show_hist(raster, ax=ax2, title='hist')
plt.show()
This plots the data so we can take a first look.
Read the raster data:
# Extract the rainfall raster values into a numpy array
# Band indexes start at 1, following GDAL conventions
rainfall_data = raster.read(1)
rainfall_data
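An array read this way may contain a nodata value that would distort any statistic computed from it. Here is a small numpy-only sketch of masking such cells first. The nodata value of -9999 and the tiny array are assumptions for illustration, since the real nodata value of the rainfall rasters is not stated here.

```python
import numpy as np

# Hypothetical nodata value and a tiny stand-in for a raster band
nodata = -9999.0
band = np.array([[1.0, 2.0],
                 [nodata, 3.0]])

# Mask the nodata cells so they are ignored by the statistics
masked = np.ma.masked_equal(band, nodata)

print(masked.count())  # number of valid cells
print(masked.mean())   # mean over valid cells only
```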
Start the zonal statistics:
# Get the affine transform of the raster
affine = raster.transform
# Run the zonal statistics
# Arguments: the vector zones, the raster array, the affine transform, and the statistics to compute (here the mean)
avg_rallrain = rasterstats.zonal_stats(districts, rainfall_data, affine=affine, stats=['mean'], geojson_out=True)
# avg_rallrain
# Besides the mean, other statistics such as the maximum and minimum can also be computed
Then plot the result; it is just a simple map.
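The plotting code is not shown here. One possible way to do it, sketched under assumptions, is to turn the GeoJSON-like features into a GeoDataFrame and color each zone by its mean. The two features below are hand-written stand-ins with the same shape as the output of zonal_stats with geojson_out=True; the names and values are invented.

```python
import geopandas as gpd
from shapely.geometry import box, mapping

# Hypothetical features mimicking zonal_stats(..., geojson_out=True) output
features = [
    {'type': 'Feature',
     'geometry': mapping(box(0, 2, 2, 4)),
     'properties': {'name': 'zone_a', 'mean': 2.5}},
    {'type': 'Feature',
     'geometry': mapping(box(2, 0, 4, 2)),
     'properties': {'name': 'zone_b', 'mean': 12.5}},
]

# Build a GeoDataFrame so the statistics can be mapped directly
gdf = gpd.GeoDataFrame.from_features(features)

# Color each zone by its mean value
ax = gdf.plot(column='mean', cmap='Blues', edgecolor='black', legend=True)
ax.set_title('Mean value by zone')
```

With the real data, the list returned by the zonal statistics call would take the place of the hand-written features.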
The second part is more interesting: extracting values from multiple separate rasters to form a series.
All of the data are tif files.
Loop over these raster datasets:
The extracted result is a series of values like this, which can then be plotted.
Convert the data format:
# Convert the Date column to datetime
data['Date'] = pd.to_datetime(data['Date'], infer_datetime_format=True)
# print(data)
# Keep only the date part
data['Date'] = data['Date'].dt.date
print(data)
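In pandas 2.x the infer_datetime_format argument is deprecated, so passing an explicit format is a reasonable alternative. A minimal sketch follows, assuming the dates are ISO-style strings; the actual date format of the extracted data is not shown here, and the values below are invented.

```python
import pandas as pd

# Hypothetical extracted series; the date strings and values are made up
data_demo = pd.DataFrame({
    'Date': ['2020-01-01', '2020-02-01', '2020-03-01'],
    'Average_RF_Porto': [10.0, 20.0, 2.5],
})

# An explicit format avoids guessing and fails loudly on malformed strings
data_demo['Date'] = pd.to_datetime(data_demo['Date'], format='%Y-%m-%d')

# Keep only the calendar date so bar-chart labels omit the time of day
data_demo['Date'] = data_demo['Date'].dt.date
print(data_demo)
```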
The result is a simple chart.
# Prepare the figure
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(18, 6))
plt.rcParams['font.size'] = 15
data.plot(x='Date', y='Average_RF_Porto', ax=ax1, kind='bar', title='Avg_Rail_Porto')
data.plot(x='Date', y='Average_RF_Faro', ax=ax2, kind='bar', title='Avg_Rail_Faro', color='red')
# Automatically adjust the layout of the subplots
plt.tight_layout()
plt.show()
The result is a series plot like this. The goal is to select the study area from the rasters, extract the raster values for it, and plot them.
The figures are not fancy, but the approach is practical, especially when extracting raster values in bulk. Running this in Google Colab involves quite a few steps, so some may have been left out here, but the important ones are covered in the text, and the code can be moved to other environments. The documentation of these third-party libraries is also worth checking; for example, the meaning of read(1) is explained in the official docs.
That covers zonal statistics and batch extraction of raster data with Python. For more on the topic, see my other related articles.