
DeepLab 257x257 is great but 2049x2049 is even better #8

Open · samhodge opened this issue Apr 29, 2019 · 3 comments

@samhodge

Do you know how to produce a TFLite file of any arbitrary dimension from the deeplab models here:

https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/model_zoo.md

I got pretty close.

I have some test code:

import numpy as np
import tensorflow as tf

# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="deeplabv3_257_mv_gpu.tflite")
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)

interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)

This executes flawlessly.
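
Incidentally, the interpreter also exposes resize_tensor_input, which might be one route to arbitrary dimensions; a minimal sketch, assuming every op in the graph can actually adapt to the new shape (not guaranteed for these models):

import numpy as np
import tensorflow as tf

# Resize the 257x257 model's input at runtime; this only works if every
# op in the graph can adapt to the new shape.
interpreter = tf.lite.Interpreter(model_path="deeplabv3_257_mv_gpu.tflite")
input_details = interpreter.get_input_details()
interpreter.resize_tensor_input(input_details[0]['index'], [1, 2049, 2049, 3])
interpreter.allocate_tensors()

input_data = np.random.random_sample((1, 2049, 2049, 3)).astype(np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output_details = interpreter.get_output_details()
print(interpreter.get_tensor(output_details[0]['index']).shape)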

But for my own model, which I converted myself, it fails:

(tensorflow-v1.13.1) [samh@apollo-centos6 tmp]$ more speedy.py 
import numpy as np
import tensorflow as tf

# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="speedy.tflite")
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.uint8)
interpreter.set_tensor(input_details[0]['index'], input_data)

interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)

(tensorflow-v1.13.1) [samh@apollo-centos6 tmp]$ python speedy.py 
Traceback (most recent call last):
  File "speedy.py", line 17, in <module>
    interpreter.invoke()
  File "/home/samh/anaconda3/envs/tensorflow-v1.13.1/lib/python3.6/site-packages/tensorflow/lite/python/interpreter.py", line 277, in invoke
    self._interpreter.Invoke()
  File "/home/samh/anaconda3/envs/tensorflow-v1.13.1/lib/python3.6/site-packages/tensorflow/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py", line 109, in Invoke
    return _tensorflow_wrap_interpreter_wrapper.InterpreterWrapper_Invoke(self)
RuntimeError: tensorflow/lite/kernels/depthwise_conv.cc:99 params->depth_multiplier * SizeOfDimension(input, 3) != SizeOfDimension(filter, 3) (0 != 64)Node number 33 (DEPTHWISE_CONV_2D) failed to prepare.

It errors on the .invoke() call.
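
A quick way to check which shapes the converter left unresolved (the 0 in "0 != 64" looks like an uninferred dimension) is to dump the tensor details; a minimal sketch using the same interpreter API:

import tensorflow as tf

# Dump every tensor in the converted model; a shape entry of 0 usually
# means the converter could not infer that dimension.
interpreter = tf.lite.Interpreter(model_path="speedy.tflite")
interpreter.allocate_tensors()
for t in interpreter.get_tensor_details():
    print(t['index'], t['name'], t['shape'])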

The .pb file was created using the export_model.py script here:

https://github.com/tensorflow/models/blob/master/research/deeplab/export_model.py

following the docs here:
https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/export_model.md

It is an xception_65 model.

I quantised it as follows:

tflite_convert --output_file=speedy.tflite --graph_def_file=frozen_graph.pb --inference_type=FLOAT --inference_input_type=QUANTIZED_UINT8 --input_arrays=ImageTensor --input_shapes=1,2049,2049,3 --output_arrays='SemanticPredictions' --std_dev_values=128 --mean_values=127

This completes cleanly.
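
For reference, my reading of the TF 1.13 Python API suggests the equivalent conversion looks roughly like this (the flag-to-attribute mapping is my assumption, not verified output):

import tensorflow as tf

# Same conversion as the tflite_convert command above, via the Python API.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    "frozen_graph.pb",
    input_arrays=["ImageTensor"],
    output_arrays=["SemanticPredictions"],
    input_shapes={"ImageTensor": [1, 2049, 2049, 3]})
converter.inference_input_type = tf.lite.constants.QUANTIZED_UINT8
# quantized_input_stats maps input name -> (mean, std_dev)
converter.quantized_input_stats = {"ImageTensor": (127.0, 128.0)}
with open("speedy.tflite", "wb") as f:
    f.write(converter.convert())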

Now, I know this will take a while to run on a mobile phone, but the end goal is to run it on a GPU via OpenGL ES on Linux and Metal on Apple desktops.

Do you have any hints?

@samhodge (Author)

Found an answer on a plate: https://github.com/intel/webml-polyfill/tree/master/examples/semantic_segmentation/model

@normandra

@samhodge mind sharing what inference speed you get when using xception at that kind of resolution?

@samhodge (Author)

About 0.2 fps on an NVIDIA GTX 1060; some of that time is post-processing the semantic mask from INT8 to an antialiased float.
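
For anyone curious, that post-processing amounts to upsampling the INT8 class map into a soft float matte; a hypothetical sketch with NumPy and PIL, where the class index and target size are placeholders rather than my actual pipeline:

import numpy as np
from PIL import Image

def mask_to_float_matte(mask, class_id, size):
    # mask: 2-D array of class IDs (the SemanticPredictions output),
    # class_id: hypothetical class of interest (e.g. 15 for "person"),
    # size: (width, height) to upsample to.
    binary = (mask == class_id).astype(np.uint8) * 255
    # Bilinear resize softens the hard INT8 edges into float coverage.
    soft = Image.fromarray(binary).resize(size, Image.BILINEAR)
    return np.asarray(soft, dtype=np.float32) / 255.0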
