
Numpy Slicing With Bound Checks

Does NumPy offer a way to do bounds checking when you slice an array? For example, if I do:

arr = np.ones([2,2])
sliced_arr = arr[0:5,:]

This slice will be okay and it will just ret

Solution 1:

This ended up a bit longer than expected, but you can write your own wrapper that checks get and set operations to make sure that slices do not go beyond the array limits (indexing arguments that are not slices are already checked by NumPy). I think I covered all cases here (ellipsis, np.newaxis, negative steps...), although some corner cases may still fail.

import numpy as np

# Wrapping function
def bounds_checked_slice(arr):
    return SliceBoundsChecker(arr)

# Wrapper that checks that indexing slices are within bounds of the array
class SliceBoundsChecker:

    def __init__(self, arr):
        self._arr = np.asarray(arr)

    def __getitem__(self, args):
        # Slice bounds checking
        self._check_slice_bounds(args)
        return self._arr.__getitem__(args)

    def __setitem__(self, args, value):
        # Slice bounds checking
        self._check_slice_bounds(args)
        return self._arr.__setitem__(args, value)

    # Check slices in the arguments are within bounds
    def _check_slice_bounds(self, args):
        if not isinstance(args, tuple):
            args = (args,)
        # Iterate through indexing arguments
        arr_dim = 0
        i_arg = 0
        for i_arg, arg in enumerate(args):
            if isinstance(arg, slice):
                self._check_slice(arg, arr_dim)
                arr_dim += 1
            elif arg is Ellipsis:
                break
            elif arg is np.newaxis:
                pass
            else:
                arr_dim += 1
        # Go backwards from end after ellipsis if necessary
        arr_dim = -1
        for arg in args[:i_arg:-1]:
            if isinstance(arg, slice):
                self._check_slice(arg, arr_dim)
                arr_dim -= 1
            elif arg is Ellipsis:
                raise IndexError("an index can only have a single ellipsis ('...')")
            elif arg is np.newaxis:
                pass
            else:
                arr_dim -= 1

    # Check a single slice
    def _check_slice(self, slice, axis):
        size = self._arr.shape[axis]
        start = slice.start
        stop = slice.stop
        step = slice.step if slice.step is not None else 1
        if step == 0:
            raise ValueError("slice step cannot be zero")
        bad_slice = False
        if start is not None:
            start = start if start >= 0 else start + size
            bad_slice |= start < 0 or start >= size
        else:
            start = 0 if step > 0 else size - 1
        if stop is not None:
            stop = stop if stop >= 0 else stop + size
            bad_slice |= (stop < 0 or stop > size) if step > 0 else (stop < 0 or stop >= size)
        else:
            stop = size if step > 0 else -1
        if bad_slice:
            raise IndexError("slice {}:{}:{} is out of bounds for axis {} with size {}".format(
                slice.start if slice.start is not None else '',
                slice.stop if slice.stop is not None else '',
                slice.step if slice.step is not None else '',
                axis % self._arr.ndim, size))

A small demo:

import numpy as np

a = np.arange(24).reshape(4, 6)
print(bounds_checked_slice(a)[:2, 1:5])
# [[ 1  2  3  4]
#  [ 7  8  9 10]]
bounds_checked_slice(a)[:2, 4:10]
# IndexError: slice 4:10: is out of bounds for axis 1 with size 6

If you wanted, you could even make this a subclass of ndarray, so you get this behavior by default, instead of having to wrap the array every time.
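A rough sketch of what the subclass route could look like. The names BoundsCheckedArray and check_basic_slices are made up for this illustration, and the check is deliberately simplified: it only inspects indices made purely of integers and slices, and hands anything else (ellipsis, np.newaxis, index arrays) to NumPy unchecked. It also accepts any bound between -size and size, so it is not a drop-in replacement for the wrapper above.

```python
import numpy as np


def check_basic_slices(shape, index):
    """Raise IndexError if a slice bound falls outside [-size, size]."""
    if not isinstance(index, tuple):
        index = (index,)

    def is_basic(item):
        # Plain integers (but not booleans) and slices only
        if isinstance(item, slice):
            return True
        return isinstance(item, (int, np.integer)) and not isinstance(item, bool)

    # Simplification (assumption of this sketch): anything involving Ellipsis,
    # np.newaxis or array indices is left to NumPy without extra checks.
    if not all(is_basic(item) for item in index):
        return
    for axis, item in enumerate(index):
        if not isinstance(item, slice) or axis >= len(shape):
            continue
        size = shape[axis]
        for bound in (item.start, item.stop):
            if bound is not None and not -size <= bound <= size:
                raise IndexError(
                    "slice bound {} is out of bounds for axis {} with size {}".format(
                        bound, axis, size))


class BoundsCheckedArray(np.ndarray):
    """Hypothetical ndarray subclass that checks slice bounds on indexing."""

    def __new__(cls, input_array):
        # View the existing data as this subclass (no copy is made)
        return np.asarray(input_array).view(cls)

    def __getitem__(self, index):
        check_basic_slices(self.shape, index)
        return super().__getitem__(index)

    def __setitem__(self, index, value):
        check_basic_slices(self.shape, index)
        super().__setitem__(index, value)


b = BoundsCheckedArray(np.arange(24).reshape(4, 6))
inside = b[:2, 1:5]      # fine, within bounds
try:
    b[:2, 4:10]          # stop=10 exceeds axis 1 (size 6)
    raised = False
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```

Since slicing a subclass instance returns another instance of the subclass, results such as inside above keep the checking behavior without being wrapped again.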

Also, note that there may be some variations as to what you may consider to be "out of bounds". The code above considers that going even one index beyond the size is out of bounds, meaning that you cannot take an empty slice with something like arr[len(arr):]. You could in principle edit the code if you were thinking of a slightly different behavior.
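To make the difference concrete, here is a small standalone sketch of the two interpretations for a single axis. The function slice_in_bounds and its allow_empty_at_end flag are hypothetical names, not part of the code above, and only positive steps are handled to keep it short.

```python
def slice_in_bounds(slc, size, allow_empty_at_end=True):
    """Check explicit slice bounds against an axis of length `size`.

    Sketch only: positive steps, and bounds left as None are not checked.
    """
    step = 1 if slc.step is None else slc.step
    if step <= 0:
        raise ValueError("this sketch only handles positive steps")
    ok = True
    if slc.start is not None:
        # Resolve negative indices relative to the end of the axis
        start = slc.start if slc.start >= 0 else slc.start + size
        # Strict: start must point at an existing element.
        # Lenient: start == size is allowed and yields an empty slice.
        max_start = size if allow_empty_at_end else size - 1
        ok = ok and 0 <= start <= max_start
    if slc.stop is not None:
        stop = slc.stop if slc.stop >= 0 else slc.stop + size
        ok = ok and 0 <= stop <= size
    return ok


print(slice_in_bounds(slice(4, None), 4, allow_empty_at_end=False))  # False
print(slice_in_bounds(slice(4, None), 4, allow_empty_at_end=True))   # True
print(slice_in_bounds(slice(0, 5), 4))                               # False
```

With allow_empty_at_end=False the check matches the strict behavior described above for positive steps, while the default lets arr[len(arr):] through as an empty slice.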

Solution 2:

If you used range instead of the common slicing notation you could get the expected behaviour. For example for a valid slicing:

arr[range(2),:]

array([[1., 1.],
       [1., 1.]])

And if we tried to slice with for instance:

arr[range(5),:]

It would throw the following error:

IndexError: index 2 is out of bounds for size 2

My guess as to why this throws an error: slicing with the usual slice notation is basic indexing, which NumPy arrays share with Python lists, so instead of raising an out-of-range error for indices past the end it clips them to the nearest valid ones. A range, on the other hand, is not a slice: NumPy converts it to an array of integer indices (advanced indexing), and each of those integers is bounds-checked individually. Note that, unlike a slice, indexing this way returns a copy rather than a view.
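A quick way to see the two behaviors side by side, using the 2x2 array from the question (the variable names are just for this illustration). The exact wording of the error message may differ between NumPy versions, so it is printed rather than hard-coded here.

```python
import numpy as np

arr = np.ones([2, 2])

# A slice past the end is silently clipped to the array size
clipped = arr[0:5, :]
print(clipped.shape)  # (2, 2)

# A range is treated as a sequence of integer indices, each one bounds-checked
try:
    arr[range(5), :]
    raised = False
except IndexError as exc:
    raised = True
    print("IndexError:", exc)

# Slicing gives a view of the original data; indexing with a range gives a copy
view = arr[0:2, :]
copy = arr[range(2), :]
print(np.shares_memory(arr, view))  # True
print(np.shares_memory(arr, copy))  # False
```

The copy matters if you planned to modify the result in place: writing to copy does not change arr, whereas writing to view does.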
