
Tensorflow - Is There A Way To Implement Tensor-wise Image Shear/rotation/translation?

I am trying to do different kinds of (image) data augmentation for training my neural network. I know that tf.image offers some augmentation functions, but they are too simple - fo

Solution 1:

Have a look at tf.contrib.image.transform. It enables applying general projective transforms to an image.

You will also need tf.contrib.image.matrices_to_flat_transforms to convert your 3x3 affine matrices into the flat projective format that tf.contrib.image.transform accepts.
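As a rough illustration of what those matrices look like, here is a pure-Python sketch that builds 3x3 affine matrices for rotation, shear and translation and flattens one into an 8-value vector. The helper names (rotation, shear_x, translation, compose, flatten_transform) are made up for this example and are not TensorFlow APIs. flatten_transform only mimics what matrices_to_flat_transforms is expected to produce, assuming the layout [a0, a1, a2, b0, b1, b2, c0, c1]. As far as I know the transform maps output coordinates back to input coordinates, so you may need to pass the inverse of the motion you have in mind.

```python
import math


def rotation(theta):
    """3x3 matrix for a rotation by theta radians about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def shear_x(k):
    """3x3 matrix for a horizontal shear: x' = x + k * y."""
    return [[1.0, k, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def translation(tx, ty):
    """3x3 matrix for a translation by (tx, ty)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def compose(a, b):
    """Matrix product a @ b, i.e. apply b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def flatten_transform(m):
    """Normalise so the bottom-right entry is 1, then keep the first 8 values."""
    scale = m[2][2]
    flat = [v / scale for row in m for v in row]
    return flat[:8]


# Example: shear, then rotate by 10 degrees, then shift by (5, -2).
m = compose(translation(5.0, -2.0),
            compose(rotation(math.radians(10)), shear_x(0.2)))
print(flatten_transform(m))
```

Composing the matrices first and applying a single transform means the image is only resampled once, however many individual operations you chain.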

Solution 2:

I usually use tf.data.Dataset with Dataset.map and tf.py_func. Dataset.prefetch lets preprocessing overlap with training, so there's usually no time cost (so long as preprocessing on CPU takes less time than running your network on GPU). If you're operating across multiple GPUs you may want to reconsider, but the following works well for me on single GPU systems.

For simplicity I'll assume you have all your images on disk in separate files, though it can easily be adapted for zip archives or other formats like hdf5 (won't work for .tar files - not sure why, but I doubt it would be a good idea anyway.)

import numpy as np
import tensorflow as tf
from PIL import Image


def map_tf(path_tensor, label_tensor):
    # path_tensor and label_tensor correspond to a single example

    def map_np(path):
        # path is a plain Python value here (bytes under Python 3), not a tensor
        image = np.array(Image.open(path), dtype=np.uint8)
        # placeholder for whatever cv2/numpy augmentation function you use
        image = any_cv2_or_numpy_augmentations(image)
        return image,

    image, = tf.py_func(
        map_np, (path_tensor,), Tout=(tf.uint8,), stateful=False)
    # any tensorflow operations here.
    image = tf.cast(image, tf.float32) / 255

    # py_func drops static shape info; this assumes 224x224 RGB output
    image.set_shape((224, 224, 3))
    return image, label_tensor


paths, labels = load_image_paths_and_labels()
dataset = tf.data.Dataset.from_tensor_slices((paths, labels))
if is_training:
    shuffle_buffer = len(paths)  # full shuffling - can be shorter
    dataset = dataset.shuffle(shuffle_buffer).repeat()
dataset = dataset.map(map_tf, num_parallel_calls=8)
dataset = dataset.batch(batch_size)

dataset = dataset.prefetch(1)
# play with the following if you want - not finalized API, and only in
# more recent version of tensorflow
# dataset = dataset.apply(tf.contrib.data.prefetch_to_device('/gpu:0'))

image_batch, label_batch = dataset.make_one_shot_iterator().get_next()

You could also do the decoding in tensorflow and use any_cv2_or_numpy_augmentations directly in py_func (though you don't avoid the tensor -> numpy -> tensor dance you mention in your question). I doubt you'll notice a performance difference either way.
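The any_cv2_or_numpy_augmentations function above is only a placeholder. A minimal NumPy sketch of what it could look like is below, assuming an HxWxC uint8 image and a shift smaller than the image size. The rng and max_shift parameters are my own additions for illustration, and the random flip plus zero-padded translation is just one possible choice of augmentation.

```python
import numpy as np


def any_cv2_or_numpy_augmentations(image, rng=None, max_shift=20):
    """Hypothetical stand-in: random horizontal flip plus random translation.

    Assumes image has shape (H, W, C) and max_shift < min(H, W).
    """
    rng = np.random.default_rng() if rng is None else rng

    # Flip left-right with probability 0.5.
    if rng.random() < 0.5:
        image = image[:, ::-1]

    # Draw an integer shift in [-max_shift, max_shift] for each axis.
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    h, w = image.shape[:2]

    # Copy the overlapping window; pixels shifted in from outside stay zero.
    shifted = np.zeros_like(image)
    dst_y = slice(max(dy, 0), min(h + dy, h))
    src_y = slice(max(-dy, 0), min(h - dy, h))
    dst_x = slice(max(dx, 0), min(w + dx, w))
    src_x = slice(max(-dx, 0), min(w - dx, w))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    return shifted
```

The output keeps the input's shape and dtype, which is what the Tout=(tf.uint8,) and set_shape((224, 224, 3)) lines in the pipeline rely on.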

Check this answer for more options.
