
DeepLab 257x257 is great but 2049x2049 is even better #8

Open · samhodge opened this issue Apr 29, 2019 · 3 comments

@samhodge

Do you know how to produce a TFLite file of any arbitrary dimension from the deeplab models here:

https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/model_zoo.md

I got pretty close.

I have some test code:

import numpy as np
import tensorflow as tf

# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="deeplabv3_257_mv_gpu.tflite")
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)

interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)

This executes flawlessly.
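
Incidentally, the interpreter also exposes resize_tensor_input, which might be one route to arbitrary dimensions; a minimal sketch, assuming every op in the graph can actually adapt to the new shape (not guaranteed for these models):

import numpy as np
import tensorflow as tf

# Resize the 257x257 model's input at runtime; this only works if every
# op in the graph can adapt to the new shape.
interpreter = tf.lite.Interpreter(model_path="deeplabv3_257_mv_gpu.tflite")
input_details = interpreter.get_input_details()
interpreter.resize_tensor_input(input_details[0]['index'], [1, 2049, 2049, 3])
interpreter.allocate_tensors()

input_data = np.random.random_sample((1, 2049, 2049, 3)).astype(np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output_details = interpreter.get_output_details()
print(interpreter.get_tensor(output_details[0]['index']).shape)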

But for my own model, which I converted myself, it fails:

(tensorflow-v1.13.1) [samh@apollo-centos6 tmp]$ more speedy.py 
import numpy as np
import tensorflow as tf

# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="speedy.tflite")
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.uint8)
interpreter.set_tensor(input_details[0]['index'], input_data)

interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)

(tensorflow-v1.13.1) [samh@apollo-centos6 tmp]$ python speedy.py 
Traceback (most recent call last):
  File "speedy.py", line 17, in <module>
    interpreter.invoke()
  File "/home/samh/anaconda3/envs/tensorflow-v1.13.1/lib/python3.6/site-packages/tensorflow/lite/python/interpreter.py", line 277, in invoke
    self._interpreter.Invoke()
  File "/home/samh/anaconda3/envs/tensorflow-v1.13.1/lib/python3.6/site-packages/tensorflow/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py", line 109, in Invoke
    return _tensorflow_wrap_interpreter_wrapper.InterpreterWrapper_Invoke(self)
RuntimeError: tensorflow/lite/kernels/depthwise_conv.cc:99 params->depth_multiplier * SizeOfDimension(input, 3) != SizeOfDimension(filter, 3) (0 != 64)Node number 33 (DEPTHWISE_CONV_2D) failed to prepare.

It errors on the .invoke() call.
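
A quick way to check which shapes the converter left unresolved (the 0 in "0 != 64" looks like an uninferred dimension) is to dump the tensor details; a minimal sketch using the same interpreter API:

import tensorflow as tf

# Dump every tensor in the converted model; a shape entry of 0 usually
# means the converter could not infer that dimension.
interpreter = tf.lite.Interpreter(model_path="speedy.tflite")
interpreter.allocate_tensors()
for t in interpreter.get_tensor_details():
    print(t['index'], t['name'], t['shape'])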

The .pb file was created using the export_model.py script here:

https://github.com/tensorflow/models/blob/master/research/deeplab/export_model.py

following the docs here:
https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/export_model.md

It is an xception_65 model.

I quantised it as follows:

tflite_convert --output_file=speedy.tflite --graph_def_file=frozen_graph.pb --inference_type=FLOAT --inference_input_type=QUANTIZED_UINT8 --input_arrays=ImageTensor --input_shapes=1,2049,2049,3 --output_arrays='SemanticPredictions' --std_dev_values=128 --mean_values=127

This completes cleanly.
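
For reference, my reading of the TF 1.13 Python API suggests the equivalent conversion looks roughly like this (the flag-to-attribute mapping is my assumption, not verified output):

import tensorflow as tf

# Same conversion as the tflite_convert command above, via the Python API.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    "frozen_graph.pb",
    input_arrays=["ImageTensor"],
    output_arrays=["SemanticPredictions"],
    input_shapes={"ImageTensor": [1, 2049, 2049, 3]})
converter.inference_input_type = tf.lite.constants.QUANTIZED_UINT8
# quantized_input_stats maps input name -> (mean, std_dev)
converter.quantized_input_stats = {"ImageTensor": (127.0, 128.0)}
with open("speedy.tflite", "wb") as f:
    f.write(converter.convert())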

Now, I know this will take a while to run on a mobile phone, but the end goal is to run it on a GPU via OpenGL ES on Linux and Metal on Apple desktops.

Do you have any hints?

@samhodge (Author)

Found an answer on a plate: https://github.com/intel/webml-polyfill/tree/master/examples/semantic_segmentation/model

@normandra

@samhodge mind sharing what inference speed you get when using xception at that kind of resolution?

@samhodge (Author)

About 0.2 fps on an NVIDIA GTX 1060; some of that time is post-processing the semantic mask from INT8 to an antialiased float.
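
For anyone curious, that post-processing amounts to upsampling the INT8 class map into a soft float matte; a hypothetical sketch with NumPy and PIL, where the class index and target size are placeholders rather than my actual pipeline:

import numpy as np
from PIL import Image

def mask_to_float_matte(mask, class_id, size):
    # mask: 2-D array of class IDs (the SemanticPredictions output),
    # class_id: hypothetical class of interest (e.g. 15 for "person"),
    # size: (width, height) to upsample to.
    binary = (mask == class_id).astype(np.uint8) * 255
    # Bilinear resize softens the hard INT8 edges into float coverage.
    soft = Image.fromarray(binary).resize(size, Image.BILINEAR)
    return np.asarray(soft, dtype=np.float32) / 255.0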
