Tensorflow - Is There A Way To Implement Tensor-wise Image Shear/rotation/translation?
Solution 1:
Have a look at tf.contrib.image.transform. It enables applying general projective transforms to an image.

You will also need to have a look at tf.contrib.image.matrices_to_flat_transforms to convert your affine matrices into the projective format accepted by tf.contrib.image.transform.
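As I understand it, matrices_to_flat_transforms normalizes a 3x3 projective matrix so its bottom-right entry is 1 and drops that entry, producing the 8-parameter [a0, a1, a2, b0, b1, b2, c0, c1] vector that tf.contrib.image.transform expects (mapping output coordinates back to input coordinates). A minimal numpy sketch of that conversion, using a shear matrix as the example (the helper name is mine, not part of the TF API):

```python
import numpy as np

def matrix_to_flat_transform(matrix):
    # Sketch of what tf.contrib.image.matrices_to_flat_transforms does:
    # divide by the bottom-right entry so it becomes 1, flatten, and
    # keep the first 8 parameters [a0, a1, a2, b0, b1, b2, c0, c1].
    matrix = np.asarray(matrix, dtype=np.float64)
    return (matrix / matrix[2, 2]).ravel()[:8]

# A horizontal shear expressed as a 3x3 affine matrix:
shear = 0.5
affine = [[1.0, shear, 0.0],
          [0.0, 1.0,  0.0],
          [0.0, 0.0,  1.0]]
flat = matrix_to_flat_transform(affine)
# flat is [1.0, 0.5, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
```

You would then pass such a flat transform (or a batch of them) to tf.contrib.image.transform(images, transforms).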
Solution 2:
I usually use tf.data.Datasets with Dataset.map and tf.py_func. Dataset.prefetch means there's usually no time cost (so long as preprocessing on the CPU takes less time than running your network on the GPU). If you're operating across multiple GPUs you may want to reconsider, but the following works well for me on single-GPU systems.
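The reason prefetch hides the preprocessing cost is pipelining: while the consumer (the GPU step) works on batch i, a background thread is already preparing batch i+1. A pure-Python sketch of that producer/consumer pattern (names and the toy preprocess function are mine, just to illustrate the idea):

```python
import threading
import queue
import time

def preprocess(item):
    time.sleep(0.01)  # stand-in for CPU-side augmentation work
    return item * 2

def prefetching_pipeline(items, buffer_size=1):
    # A bounded queue plays the role of the prefetch buffer: the
    # producer thread fills it while the consumer drains it, so
    # preprocessing of item i+1 overlaps with consuming item i.
    q = queue.Queue(maxsize=buffer_size)
    sentinel = object()

    def producer():
        for item in items:
            q.put(preprocess(item))
        q.put(sentinel)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        batch = q.get()
        if batch is sentinel:
            return
        yield batch

results = list(prefetching_pipeline(range(5)))
# results is [0, 2, 4, 6, 8]
```

With overlap like this, the per-step cost is roughly max(preprocess time, network time) rather than their sum, which is why preprocessing is effectively free when it's faster than the GPU step.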
For simplicity I'll assume you have all your images on disk in separate files, though this can easily be adapted for zip archives or other formats like hdf5 (it won't work for .tar files - not sure why, but I doubt it would be a good idea anyway).
import numpy as np
import tensorflow as tf
from PIL import Image

def map_tf(path_tensor, label_tensor):
    # path_tensor and label_tensor correspond to a single example
    def map_np(path_str):
        # path_str is just a normal string here
        image = np.array(Image.open(path_str), dtype=np.uint8)
        image = any_cv2_or_numpy_augmentations(image)
        return image,

    image, = tf.py_func(
        map_np, (path_tensor,), Tout=(tf.uint8,), stateful=False)
    # any tensorflow operations here.
    image = tf.cast(image, tf.float32) / 255
    image.set_shape((224, 224, 3))
    return image, label_tensor
paths, labels = load_image_paths_and_labels()
dataset = tf.data.Dataset.from_tensor_slices((paths, labels))
if is_training:
shuffle_buffer = len(paths) # full shuffling - can be shorter
dataset = dataset.shuffle(shuffle_buffer).repeat()
dataset = dataset.map(map_tf, num_parallel_calls=8)
dataset = dataset.batch(batch_size)
dataset = dataset.prefetch(1)
# play with the following if you want - not finalized API, and only in
# more recent versions of tensorflow
# dataset = dataset.apply(tf.contrib.data.prefetch_to_device('/gpu:0'))
image_batch, label_batch = dataset.make_one_shot_iterator().get_next()
You could also do the decoding in tensorflow and use any_cv2_or_numpy_augmentations directly in the py_func (though you don't avoid the tensor -> numpy -> tensor dance you mention in your question). I doubt you'll notice a performance difference either way.
Check this answer for more options.