Windowed reading and writing

Windowed reading and writing

来源:https://rasterio.readthedocs.io/en/latest/topics/windowed-rw.html

Beginning in rasterio 0.3, you can read and write “windows” of raster files. This feature allows you to work on rasters that are larger than your computers RAM or process chunks of large rasters in parallel.
rasterio 0.3开始,你可以读写光栅文件的“窗口”(read and write “windows” of raster files)。此功能允许您处理大于计算机内存的栅格,或者并行处理大块的栅格。(说得这么绕,其实就是可以把尺寸很大的栅格分成小块分别处理)

Windows

A window is a view onto a rectangular subset of a raster dataset and is described in rasterio by column and row offsets and width and height in pixels. These may be ints or floats.
窗口(window)是栅格数据集的一个矩形子集的视图,并且通过列和行偏移(column and row offsets)以及像素数量表示的宽度和高度(width and height in pixels)来描述。这些可能是整数或浮点数。

from rasterio.windows import Window

Window(col_off, row_off, width, height)

Windows may also be constructed from numpy array index tuples or slice objects. Only int values are permitted in these cases.
windows也可以由numpy array索引元组或切片对象构成。在这些情况下,只允许使用int值。

Window.from_slices((row_start, row_stop), (col_start, col_stop))
Window.from_slices(slice(row_start, row_stop), slice(col_start, col_stop))

If height and width keyword arguments are passed to from_slices, relative and open-ended slices may be used.
如果将高度和宽度关键字参数传递给from_slices,则可以使用相对切片和开放式切片。

Window.from_slices(slice(None), slice(None), height=100, width=100)
# Window(col_off=0.0, row_off=0.0, width=100.0, height=100.0)

Window.from_slices(slice(10, -10), slice(10, -10), height=100, width=100)
# Window(col_off=10, row_off=10, width=80, height=80)

Reading

Here is an example of reading a 256 row x 512 column subset of the rasterio test file.
下面是一个读取rasterio测试文件的256行x 512列子集的例子。

import rasterio

with rasterio.open('E:/MyStudy/segmentation_models.pytorch-master/data_li/L2A20190813N0213R113.tif') as src:
    w = src.read(1,window=Window(0, 0, 512, 256))

print(w.shape)
# out:(256, 512)

Writing

Writing works similarly. The following creates a blank 500 column x 300 row GeoTIFF and plops 37,500 pixels with value 127 into a window 30 pixels down from and 50 pixels to the right of the upper left corner of the GeoTIFF.
也是如此。下面创建一个空白的500列×300行的GeoTIFF。然后将值为127的37500个像素放入一个window ,该window从GeoTIFF的左上角向下30个像素,向右50个像素。

import numpy as np
image = np.ones((150, 250), dtype=rasterio.ubyte) * 127

with rasterio.open(
        'E:/Resource_Learning/Rasterio_MY/example.tif', 
        'w',
        driver='GTiff', 
        width=500, 
        height=300, 
        count=1,
        dtype=image.dtype) as dst:
    dst.write(image, window=Window(50, 30, 250, 150), indexes=1)

Decimation抽取

If the write window is smaller than the data, the data will be decimated. Below, the window is scaled to one third of the source image.
如果写入窗口小于数据,数据将被抽取。下面,窗口被缩放到源图像的三分之一。

path = 'E:/MyStudy/segmentation_models.pytorch-master/data_li/L2A20190813N0213R113.tif'

with rasterio.open(path) as src:
    b, g, r = (src.read(k) for k in (1, 2, 3))
# src.height = 718, src.width = 791

write_window = Window.from_slices((30, 269), (50, 313))
# write_window.height = 239, write_window.width = 263

with rasterio.open(
        './example_1.tif', 'w',
        driver='GTiff', width=500, height=300, count=3,
        dtype=r.dtype) as dst:
    for k, arr in [(1, b), (2, g), (3, r)]:
        dst.write(arr, indexes=k, window=write_window)

Data windows

Sometimes it is desirable to crop off an outer boundary of NODATA values around a dataset:
有时需要在数据集周围裁剪出NODATA值的外部边界:

import rasterio
from rasterio.windows import get_data_window

with rasterio.open('tests/data/RGB.byte.tif') as src:
    window = get_data_window(src.read(1, masked=True))
    # window = Window(col_off=13, row_off=3, width=757, height=711)

    kwargs = src.meta.copy()
    kwargs.update({
        'height': window.height,
        'width': window.width,
        'transform': rasterio.windows.transform(window, src.transform)})

    with rasterio.open('/tmp/cropped.tif', 'w', **kwargs) as dst:
        dst.write(src.read(window=window))

Window transforms

The affine transform of a window can be accessed using a dataset’s window_transform method:
可以使用数据集的window_transform 方法来访问窗口的仿射变换:

import rasterio
from rasterio.windows import Window
win = Window(256, 256, 128, 128)
with rasterio.open(path) as src:
    src_transform = src.transform
    win_transform = src.window_transform(win)

print(src_transform)
print("---------")
print(win_transform)

Window utilities

Basic union and intersection operations are available for windows, to streamline operations across dynamically created windows for a series of bands or datasets with the same full extent.
基本的并集和交集操作可用于窗口,以简化具有相同完整范围的一系列波段或数据集的动态创建窗口的操作。

>>> from rasterio import windows
>>> # Full window is ((0, 1000), (0, 500))
>>> window1 = Window(10, 100, 490, 400)
>>> window2 = Window(50, 10, 200, 140)
>>> windows.union(window1, window2)
Window(col_off=10, row_off=10, width=490, height=490)
>>> windows.intersection(window1, window2)
Window(col_off=50, row_off=100, width=200, height=50)

Blocks

Raster datasets are generally composed of multiple blocks of data and windowed reads and writes are most efficient when the windows match the dataset’s own block structure. When a file is opened to read, the shape of blocks for any band can be had from the block_shapes property.
栅格数据集通常由多个数据块组成,当窗口与数据集自身的块结构匹配时,窗口读写效率最高。当打开一个文件进行读取时,可以从block_shapes属性获得任何波段的块的形状。

>>> with rasterio.open('tests/data/RGB.byte.tif') as src:
...     for i, shape in enumerate(src.block_shapes, 1):
...         print((i, shape))
...
(1, (3, 791))
(2, (3, 791))
(3, (3, 791))

The block windows themselves can be had from the block_windows function.
可以从block_windows函数获得block窗口本身。

>>> with rasterio.open('tests/data/RGB.byte.tif') as src:
...     for ji, window in src.block_windows(1):
...         print((ji, window))
...
((0, 0), ((0, 3), (0, 791)))
((1, 0), ((3, 6), (0, 791)))
...

This function returns an iterator that yields a pair of values. The second is a window tuple that can be used in calls to read or write. The first is the pair of row and column indexes of this block within all blocks of the dataset.
这个函数返回一个迭代器,产生一对值。第二个是一个窗口元组,可以在读写调用中使用。第一个是数据集所有块中该块的一对行和列索引。
ou may read windows of data from a file block-by-block like this.
可以像这样逐块读取文件中的数据窗口。

>>> with rasterio.open('tests/data/RGB.byte.tif') as src:
...     for ji, window in src.block_windows(1):
...         r = src.read(1, window=window)
...         print(r.shape)
...         break
...
(3, 791)

Well-bred files have identically blocked bands, but GDAL allows otherwise and it’s a good idea to test this assumption in your code.
良好的文件具有相同的 blocked bands,,但是GDAL允许不同的情况,在您的代码中测试这个假设是一个好主意。

>>> with rasterio.open('tests/data/RGB.byte.tif') as src:
...     assert len(set(src.block_shapes)) == 1
...     for ji, window in src.block_windows(1):
...         b, g, r = (src.read(k, window=window) for k in (1, 2, 3))
...         print((ji, r.shape, g.shape, b.shape))
...         break
...
((0, 0), (3, 791), (3, 791), (3, 791))

The block_shapes property is a band-ordered list of block shapes and set(src.block_shapes) gives you the set of unique shapes. Asserting that there is only one item in the set is effectively the same as asserting that all bands have the same block structure. If they do, you can use the same windows for each.
block_shapes属性是块形状和集合(src.block_shapes)的带序列表,为您提供了一组独特的形状。断言集合中只有一个项目实际上等同于断言所有带都具有相同的块结构。如果他们这样做,你可以使用相同的窗口。

上一篇:73. 矩阵置零


下一篇:【SCI论文写作】笔记三:Writing in the Sciences -Unit3