Metrics does not work with tf.keras.estimator.model_to_estimator #39

Open
NKUCodingCat opened this issue May 1, 2019 · 8 comments

@NKUCodingCat

I am trying to use tf.keras.estimator.model_to_estimator to convert a tf.keras model so that it can be trained in a distributed setup. However, I found that keras-metrics does not work as expected. Is there any idea or workaround for me? Thanks.

Traceback:

Traceback (most recent call last):
  File "1.py", line 204, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "1.py", line 190, in main
    config=Rcfg
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/estimator/__init__.py", line 73, in model_to_estimator
    config=config)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_estimator/python/estimator/keras.py", line 486, in model_to_estimator
    config)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_estimator/python/estimator/keras.py", line 354, in _save_first_checkpoint
    custom_objects)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_estimator/python/estimator/keras.py", line 201, in _clone_and_build_model
    optimizer_iterations=global_step)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/models.py", line 511, in clone_and_build_model
    target_tensors=target_tensors)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/checkpointable/base.py", line 442, in _method_wrapper
    method(self, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/engine/training.py", line 499, in compile
    sample_weights=self.sample_weights)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/engine/training.py", line 1844, in _handle_metrics
    return_stateful_result=return_stateful_result))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/engine/training.py", line 1800, in _handle_per_output_metrics
    stateful_metric_result = _call_stateful_fn(stateful_fn)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/engine/training.py", line 1773, in _call_stateful_fn
    fn, y_true, y_pred, weights=weights, mask=mask)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/engine/training_utils.py", line 852, in call_metric_function
    return metric_fn(y_true, y_pred, sample_weight=weights)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/metrics.py", line 438, in __call__
    update_op = self.update_state(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/metrics.py", line 160, in inner
    return func.__get__(instance_ref(), cls)(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/metrics.py", line 98, in decorated
    update_op = update_state_fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/metrics.py", line 649, in update_state
    matches = self._fn(y_true, y_pred, **self._fn_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/keras_metrics/metrics.py", line 192, in __call__
    tp = self.tp(y_true, y_pred)
  File "/usr/local/lib/python2.7/dist-packages/keras_metrics/metrics.py", line 50, in __call__
    tp_update = K.update_add(self.tp, tp)
  File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 986, in update_add
    return tf.assign_add(x, increment)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/state_ops.py", line 190, in assign_add
    ref, value, use_locking=use_locking, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 107, in assign_add
    "AssignAdd", ref=ref, value=value, use_locking=use_locking, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 350, in _apply_op_helper
    g = ops._get_graph_from_inputs(_Flatten(keywords.values()))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 5713, in _get_graph_from_inputs
    _assert_same_graph(original_graph_element, graph_element)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 5649, in _assert_same_graph
    original_item))
ValueError: Tensor("metrics/precision/Sum:0", shape=(), dtype=int32) must be from the same graph as Tensor("Variable:0", shape=(), dtype=int32_ref).

Buggy code (a little bit messy):

model.py.zip

If I don't add keras_metrics.sparse_categorical_precision() to the accuracy (metrics) part, it DOES work, but it fails as soon as I add sparse_categorical_precision.

Tested on Python 2.7/3.7 with TF 1.13.1.

@NKUCodingCat
Author

NKUCodingCat commented Jun 10, 2019

I think I found the reason, but I have no idea how to fix it.

estimator.model_to_estimator replaces the graph built by Keras with a graph of its own, and the metric class is stateful. An object built by, for example, true_positive keeps a variable as self.tp, which is created in the Keras graph when I call it, but the graph that the y_true/y_pred tensors belong to has been replaced.

I am not sure why a stateful layer is used: I checked all the losses and metrics in Keras and did not find any that relies on the state of an object (i.e. they are stateless). But I am not sure how keras_metrics works. Could you please give some advice, @ybubnov?
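
To make the suspected mechanism concrete, here is a toy model in plain Python. It is only a sketch: `Graph`, `Tensor`, `assign_add` and `StatefulTruePositive` are hypothetical stand-ins written for this illustration, not TensorFlow or keras_metrics classes. It assumes, as described above, that the metric object creates its counter once and is then reused after the model has been rebuilt in a different graph.

```python
# Toy model of graph ownership; none of these classes are TensorFlow APIs.

class Graph(object):
    """Stands in for a computation graph; only its identity matters here."""
    def __init__(self, name):
        self.name = name


class Tensor(object):
    """A value that remembers which graph created it."""
    def __init__(self, graph, name):
        self.graph = graph
        self.name = name


def assign_add(variable, increment):
    """Refuse to combine values that live in different graphs."""
    if variable.graph is not increment.graph:
        raise ValueError(
            "%s must be from the same graph as %s"
            % (increment.name, variable.name)
        )
    return Tensor(variable.graph, "AssignAdd")


class StatefulTruePositive(object):
    """Creates its counter once, in the graph that is current at construction."""
    def __init__(self, graph):
        self.tp = Tensor(graph, "Variable:0")

    def __call__(self, y_true, y_pred):
        # The per-batch sum is built in the graph of the incoming tensors.
        batch_tp = Tensor(y_pred.graph, "metrics/precision/Sum:0")
        return assign_add(self.tp, batch_tp)


keras_graph = Graph("keras")
metric = StatefulTruePositive(keras_graph)  # counter lives in keras_graph

# Same graph: the update op can be built.
metric(Tensor(keras_graph, "y_true"), Tensor(keras_graph, "y_pred"))

# The model is rebuilt in a new graph, but the same metric object is reused.
estimator_graph = Graph("estimator")
try:
    metric(Tensor(estimator_graph, "y_true"), Tensor(estimator_graph, "y_pred"))
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

In this toy version the second call fails for the same kind of reason as the traceback: the counter still belongs to the first graph while the inputs belong to the second.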

@NKUCodingCat
Author

It seems to be caused by how model_to_estimator processes layers and metrics: it might replace the graph of any layer (I guess), but it does not replace the graph in metrics, since it assumes that metrics are STATELESS.

As far as I can tell, true_positive is implemented as a layer to reduce the cost of calculation when the user requests multiple metrics provided by keras_metrics (I am not sure; the call chain confused me). If I am right, I think it is a meaningless optimization, because it creates potential incompatibilities.

Sorry for my broken English; I hope it helps.
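
For readers weighing the stateless/stateful distinction discussed here, a small plain-Python sketch may help. It does not reproduce the keras_metrics implementation; `batch_precision` and `RunningPrecision` are hypothetical names. It only shows that a stateless per-batch precision and a precision accumulated across batches can give different numbers, which is one possible motivation (an assumption, not something confirmed in this thread) for keeping state in a metric.

```python
def batch_precision(y_true, y_pred):
    """Stateless: the result depends only on the arguments."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return float(tp) / (tp + fp) if (tp + fp) else 0.0


class RunningPrecision(object):
    """Stateful: true/false positive counts accumulate across calls."""
    def __init__(self):
        self.tp = 0
        self.fp = 0

    def update(self, y_true, y_pred):
        self.tp += sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
        self.fp += sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    def result(self):
        total = self.tp + self.fp
        return float(self.tp) / total if total else 0.0


# Two small batches of (labels, binary predictions).
batches = [([1, 0, 1], [1, 1, 1]), ([0, 0, 1], [0, 1, 0])]

per_batch = [batch_precision(t, p) for t, p in batches]
mean_of_batches = sum(per_batch) / len(per_batch)

running = RunningPrecision()
for t, p in batches:
    running.update(t, p)
overall = running.result()

print(per_batch, mean_of_batches, overall)
```

Averaging the per-batch values here gives a different number than the precision computed from the accumulated counts, so the two designs are not interchangeable even though only the stateless one is indifferent to which graph it is called in.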

@sangyongjia

I also faced a similar issue: metrics do not work with tf.keras.estimator.model_to_estimator.
If I remove tf.keras.metrics.AUC the code works; otherwise it fails with an error like this:

    return fn(*args, **kwargs)
  File "/Users/sangyongjia/anaconda3/envs/tf1.x/lib/python3.7/site-packages/tensorflow/python/keras/engine/training_utils.py", line 873, in call_metric_function
    return metric_fn(y_true, y_pred, sample_weight=weights)
  File "/Users/sangyongjia/anaconda3/envs/tf1.x/lib/python3.7/site-packages/tensorflow/python/keras/metrics.py", line 170, in __call__
    update_op = self.update_state(*args, **kwargs) # pylint: disable=not-callable
  File "/Users/sangyongjia/anaconda3/envs/tf1.x/lib/python3.7/site-packages/tensorflow/python/keras/utils/metrics_utils.py", line 73, in decorated
    update_op = update_state_fn(*args, **kwargs)
  File "/Users/sangyongjia/anaconda3/envs/tf1.x/lib/python3.7/site-packages/tensorflow/python/keras/metrics.py", line 1715, in update_state
    }, y_true, y_pred, self.thresholds, sample_weight=sample_weight)
  File "/Users/sangyongjia/anaconda3/envs/tf1.x/lib/python3.7/site-packages/tensorflow/python/keras/utils/metrics_utils.py", line 268, in update_confusion_matrix_variables
    y_pred.shape.assert_is_compatible_with(y_true.shape)
  File "/Users/sangyongjia/anaconda3/envs/tf1.x/lib/python3.7/site-packages/tensorflow/python/framework/tensor_shape.py", line 1103, in assert_is_compatible_with
    raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (?, 1) and (?,) are incompatible

The corresponding code is:
model.compile("adam", "binary_crossentropy", metrics=[tf.keras.metrics.BinaryCrossentropy(),tf.keras.metrics.AUC()])

If I remove tf.keras.metrics.AUC():
model.compile("adam", "binary_crossentropy", metrics=[tf.keras.metrics.BinaryCrossentropy()])
it works.

Does anyone have an idea? Thanks in advance.
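
The error message says the prediction has shape (?, 1) while the label has shape (?,). The sketch below is a plain-Python toy, not TensorFlow code: `is_compatible` and `add_trailing_axis` are hypothetical helpers, and `None` plays the role of the unknown `?` dimension. It illustrates why two shapes of different rank are rejected, and what giving the labels a trailing axis of size 1 would change. Whether that resolves this particular case is an assumption; in a real input function the analogous step would presumably be something like `tf.expand_dims(labels, -1)`.

```python
def is_compatible(shape_a, shape_b):
    """Toy static-shape check: same rank, and each pair of dims equal or unknown."""
    if len(shape_a) != len(shape_b):
        return False
    return all(a is None or b is None or a == b
               for a, b in zip(shape_a, shape_b))


def add_trailing_axis(labels):
    """Turn a flat list of labels into a column: [0, 1] -> [[0], [1]]."""
    return [[label] for label in labels]


pred_shape = (None, 1)      # model output: (batch, 1)
flat_label_shape = (None,)  # labels as delivered: (batch,)

print(is_compatible(pred_shape, flat_label_shape))  # ranks differ

labels = [0, 1, 1]
column_labels = add_trailing_axis(labels)
column_label_shape = (None, len(column_labels[0]))

print(column_labels)
print(is_compatible(pred_shape, column_label_shape))  # ranks now match
```

The point of the sketch is only that the two shapes differ in rank, so the mismatch is between how the labels are fed and what the model outputs rather than in the metric arithmetic itself.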

@ybubnov
Member

ybubnov commented Jun 9, 2020

@sangyongjia, it looks like your problem relates to the TensorFlow project, not to keras-metrics.

@sangyongjia

@sangyongjia, it looks like your problem relates to the TensorFlow project, not to keras-metrics.

Do you mean I should remove TensorFlow and install it again, or update to a higher version of TF?
The version of TF I am using now is 1.14.

@sangyongjia

@ybubnov

@ybubnov
Member

ybubnov commented Jun 11, 2020

@sangyongjia, it looks like the problem is in TensorFlow itself, particularly in the model_to_estimator call. JosPolfliet created an issue in that project a long time ago: tensorflow/tensorflow#34040, and it is still unresolved.

I think one possible way to deal with the problem is to push for a resolution of the issue I've linked above. Alternatively, you could avoid model_to_estimator if that is possible (simply use the model directly).

From my side, I'll try to relax the TensorFlow version restriction so that newer versions of TensorFlow can be used with the keras-metrics library.

@sangyongjia

Thanks so much, @ybubnov.
