Support parsing tensors and text without tensorboard by adding minimal stubs

Fixes: #17

The stubs could be simplified after resolving: tensorflow/tensorboard#6899
j3soon committed Aug 15, 2024
1 parent 83ce757 commit e948cf5
Showing 9 changed files with 171 additions and 28 deletions.
9 changes: 3 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
@@ -26,13 +26,10 @@ A simple yet powerful tensorboard event log parser/reader.
Installation:

```sh
pip install tensorflow # or tensorflow-cpu
pip install -U tbparse # requires Python >= 3.7
```

**Note**: If you don't want to install TensorFlow, see [Installing without TensorFlow](https://tbparse.readthedocs.io/en/latest/pages/installation.html#installing-without-tensorflow).

We suggest using an additional virtual environment for parsing and plotting the tensorboard events. So no worries if your training code uses Python 3.6 or older versions.

Reading one or more event files with tbparse only requires 5 lines of code:

@@ -77,11 +74,11 @@ All events above are generated and plotted in [gallery-pytorch.ipynb](https://gi
## Installation

```sh
pip install tensorflow # or tensorflow-cpu
pip install tensorflow # optional, only required if you want to parse images and audio
pip install -U tbparse # requires Python >= 3.7
```

**Note**: If you don't want to install TensorFlow, see [Installing without TensorFlow](https://tbparse.readthedocs.io/en/latest/pages/installation.html#installing-without-tensorflow).
**Note**: For details on when TensorFlow is required, see [Installing without TensorFlow](https://tbparse.readthedocs.io/en/latest/pages/installation.html#installing-without-tensorflow).
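
Because TensorFlow is optional after this change, a script may want to check for it before requesting image or audio events. The following is a minimal sketch using only the standard library; the helper name `has_module` is hypothetical and is not part of tbparse.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be located."""
    # find_spec returns None when no top-level module with that name is found.
    return importlib.util.find_spec(name) is not None


# Hypothetical usage: decide up front which event types to request.
if has_module("tensorflow"):
    print("TensorFlow found: images and audio can be parsed as well.")
else:
    print("TensorFlow not found: images and audio are unavailable.")
```

The check only locates the module and does not import it, so it stays cheap even when TensorFlow is installed.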

## Testing the Source Code

3 changes: 0 additions & 3 deletions docs/index.rst
@@ -54,11 +54,8 @@ Installation:

.. code-block:: bash
pip install tensorflow # or tensorflow-cpu
pip install -U tbparse # requires Python >= 3.7
**Note**: If you don't want to install TensorFlow, see :ref:`Installing without TensorFlow <tbparse_installing-without-tensorflow>`.

We suggest using an additional virtual environment for parsing and plotting
the tensorboard events. So no worries if your training code uses Python 3.6
or older versions.
16 changes: 8 additions & 8 deletions docs/pages/installation.rst
@@ -10,18 +10,18 @@ Install from PyPI:

.. code-block:: bash
pip install tensorflow # or tensorflow-cpu
pip install tensorflow # optional, only required if you want to parse images and audio
pip install -U tbparse # requires Python >= 3.7
**Note**: If you don't want to install TensorFlow, see :ref:`Installing without TensorFlow <tbparse_installing-without-tensorflow>`.
**Note**: For details on when TensorFlow is required, see :ref:`Installing without TensorFlow <tbparse_installing-without-tensorflow>`.

Install from Source:

.. code-block:: bash
git clone https://github.com/j3soon/tbparse
cd tbparse
pip install tensorflow # or tensorflow-cpu
pip install tensorflow # optional, only required if you want to parse images and audio
pip install -e . # requires Python >= 3.7
.. _tbparse_installing-without-tensorflow:
@@ -38,13 +38,13 @@ You can install tbparse with reduced feature set if you don't want to install Te
Without TensorFlow, tbparse supports parsing
:ref:`scalars <tbparse_parsing-scalars>`,
:ref:`histograms <tbparse_parsing-histograms>`, and
:ref:`hparams <tbparse_parsing-hparams>`,
but doesn't support parsing
:ref:`tensors <tbparse_parsing-tensors>`,
:ref:`images <tbparse_parsing-images>`,
:ref:`audio <tbparse_parsing-audio>`, and
:ref:`histograms <tbparse_parsing-histograms>`,
:ref:`hparams <tbparse_parsing-hparams>`, and
:ref:`text <tbparse_parsing-text>`.
but doesn't support parsing
:ref:`images <tbparse_parsing-images>` and
:ref:`audio <tbparse_parsing-audio>`.
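
The split between supported and unsupported event types can be expressed as set arithmetic plus a guard that raises when an unsupported type is requested. The sketch below is an illustration under assumptions, not tbparse's actual code: the function `check_event_types` and its `has_tensorflow` flag are hypothetical, and the event types are plain strings here.

```python
# Event types as plain strings (an assumption; this sketch does not use tbparse's constants).
ALL_EVENT_TYPES = {"scalars", "tensors", "histograms", "images", "audio", "hparams", "text"}
# Everything except images and audio can be parsed without TensorFlow.
REDUCED_EVENT_TYPES = ALL_EVENT_TYPES.difference({"images", "audio"})


def check_event_types(requested, has_tensorflow):
    """Return the requested types as a set, or raise if TensorFlow is needed but missing."""
    requested = set(requested)
    unknown = requested - ALL_EVENT_TYPES
    if unknown:
        raise ValueError(f"Unknown event types: {sorted(unknown)}")
    # Types that fall outside the reduced set need TensorFlow.
    needs_tf = requested - REDUCED_EVENT_TYPES
    if needs_tf and not has_tensorflow:
        raise ModuleNotFoundError(
            f"Parsing {sorted(needs_tf)} requires TensorFlow; "
            "install it with `pip install tensorflow`."
        )
    return requested
```

With this sketch, `check_event_types({"scalars", "text"}, has_tensorflow=False)` passes, while requesting `{"images"}` without TensorFlow raises `ModuleNotFoundError`.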

tbparse will instruct you to install TensorFlow by raising an error if you try to parse the unsupported event types, such as:

17 changes: 8 additions & 9 deletions tbparse/summary_reader.py
@@ -17,6 +17,9 @@
STORE_EVERYTHING_SIZE_GUIDANCE, TENSORS, AudioEvent, EventAccumulator,
HistogramEvent, ImageEvent, ScalarEvent, TensorEvent)
from tensorboard.plugins.hparams.plugin_data_pb2 import HParamsPluginData

from .tensorflow_stub import make_ndarray

try:
import tensorflow
except ImportError:
@@ -51,7 +54,7 @@
}

ALL_EVENT_TYPES = {SCALARS, TENSORS, HISTOGRAMS, IMAGES, AUDIO, HPARAMS, TEXT}
REDUCED_EVENT_TYPES = {SCALARS, HISTOGRAMS, HPARAMS}
REDUCED_EVENT_TYPES = ALL_EVENT_TYPES.difference({IMAGES, AUDIO})
ALL_EXTRA_COLUMNS = {'dir_name', 'file_name', 'wall_time', 'min', 'max', 'num',
'sum', 'sum_squares', 'width', 'height', 'content_type',
'length_frames', 'sample_rate'}
@@ -577,7 +580,6 @@ def histogram_to_cdf(counts: np.ndarray, limits: np.ndarray,
i += 1
return np.array(y) / n

# pylint: disable=R0914
@staticmethod
def histogram_to_bins(counts: np.ndarray, limits: np.ndarray,
lower_bound: Optional[float] = None,
@@ -603,8 +605,9 @@ def histogram_to_bins(counts: np.ndarray, limits: np.ndarray,
each bucket.
:rtype: Tuple[np.ndarray, np.ndarray]
"""
# pylint: disable=R0914
# pylint: disable=C0301
# Ref: https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/histogram/tf_histogram_dashboard/histogramCore.ts#L83 # noqa: E501
assert len(counts) == len(limits)
assert counts[0] == 0
if lower_bound is None or upper_bound is None:
@@ -676,12 +679,10 @@ def _get_tensor_cols(self, tag_to_events: Dict[str, TensorEvent]) -> \
cols = self._get_default_cols(tag_to_events)
if len(tag_to_events) == 0:
return cols
# pylint: disable=C0103
tf = SummaryReader._get_tensorflow()
idx = 0
for tag, events in tag_to_events.items():
for e in events:
value = tf.make_ndarray(e.tensor_proto)
value = make_ndarray(e.tensor_proto)
if value.shape == ():
# Tensorflow histogram may have more than one items
value = value.item()
@@ -807,12 +808,10 @@ def _get_text_cols(self, tag_to_events: Dict[str, TensorEvent]) -> \
cols = self._get_default_cols(tag_to_events)
if len(tag_to_events) == 0:
return cols
# pylint: disable=C0103
tf = SummaryReader._get_tensorflow()
idx = 0
for tag, events in tag_to_events.items():
for e in events:
value = tf.make_ndarray(e.tensor_proto).item()
value = make_ndarray(e.tensor_proto).item()
assert isinstance(value, bytes)
value = value.decode('utf-8')
cols['step'][idx] = e.step
7 changes: 7 additions & 0 deletions tbparse/tensorflow_stub/__init__.py
@@ -0,0 +1,7 @@
"""
Provides a stub for the TensorFlow module.
"""

from .tensor_util import make_ndarray

__all__ = ['make_ndarray', ]
109 changes: 109 additions & 0 deletions tbparse/tensorflow_stub/tensor_util.py
@@ -0,0 +1,109 @@
import numpy as np
from tensorboard.compat.tensorflow_stub import dtypes


# flake8: noqa
# pylint: skip-file
# Ref: https://github.com/tensorflow/tensorflow/blob/ad6d8cc177d0c868982e39e0823d0efbfb95f04c/tensorflow/python/framework/tensor_util.py#L633
def make_ndarray(tensor):
    """Create a numpy ndarray from a tensor.

    Create a numpy ndarray with the same shape and data as the tensor.

    For example:

    ```python
    # Tensor a has shape (2,3)
    a = tf.constant([[1,2,3],[4,5,6]])
    proto_tensor = tf.make_tensor_proto(a) # convert `tensor a` to a proto tensor
    tf.make_ndarray(proto_tensor) # output: array([[1, 2, 3],
    #                                              [4, 5, 6]], dtype=int32)
    # output has shape (2,3)
    ```

    Args:
      tensor: A TensorProto.

    Returns:
      A numpy array with the tensor contents.

    Raises:
      TypeError: if tensor has unsupported type.
    """
    shape = [d.size for d in tensor.tensor_shape.dim]
    num_elements = np.prod(shape, dtype=np.int64)
    tensor_dtype = dtypes.as_dtype(tensor.dtype)
    dtype = tensor_dtype.as_numpy_dtype

    if tensor.tensor_content:
        return (np.frombuffer(tensor.tensor_content,
                              dtype=dtype).copy().reshape(shape))

    if tensor_dtype == dtypes.string:
        # np.pad throws on these arrays of type np.object_.
        values = list(tensor.string_val)
        padding = num_elements - len(values)
        if padding > 0:
            last = values[-1] if values else ""
            values.extend([last] * padding)
        return np.array(values, dtype=dtype).reshape(shape)

    if tensor_dtype == dtypes.float16 or tensor_dtype == dtypes.bfloat16:
        # the half_val field of the TensorProto stores the binary representation
        # of the fp16: we need to reinterpret this as a proper float16
        values = np.fromiter(tensor.half_val, dtype=np.uint16)
        values.dtype = dtype
    # TODO: The following is a temporary fix for float8_e5m2 and float8_e4m3fn
    # Ref: https://github.com/tensorflow/tensorboard/issues/6899
    elif tensor_dtype in [
        dtypes.DType(dtypes.types_pb2.DT_FLOAT8_E5M2),
        dtypes.DType(dtypes.types_pb2.DT_FLOAT8_E4M3FN),
    ]:
        values = np.fromiter(tensor.float8_val, dtype=np.uint8)
        values.dtype = dtype
    elif tensor_dtype == dtypes.float32:
        values = np.fromiter(tensor.float_val, dtype=dtype)
    elif tensor_dtype == dtypes.float64:
        values = np.fromiter(tensor.double_val, dtype=dtype)
    elif tensor_dtype in [
        dtypes.int32,
        dtypes.uint8,
        dtypes.uint16,
        dtypes.int16,
        dtypes.int8,
        dtypes.qint32,
        dtypes.quint8,
        dtypes.qint8,
        dtypes.qint16,
        dtypes.quint16,
        dtypes.int4,
        dtypes.uint4,
    ]:
        values = np.fromiter(tensor.int_val, dtype=dtype)
    elif tensor_dtype == dtypes.int64:
        values = np.fromiter(tensor.int64_val, dtype=dtype)
    elif tensor_dtype == dtypes.uint32:
        values = np.fromiter(tensor.uint32_val, dtype=dtype)
    elif tensor_dtype == dtypes.uint64:
        values = np.fromiter(tensor.uint64_val, dtype=dtype)
    elif tensor_dtype == dtypes.complex64:
        it = iter(tensor.scomplex_val)
        values = np.array([complex(x[0], x[1]) for x in zip(it, it)], dtype=dtype)
    elif tensor_dtype == dtypes.complex128:
        it = iter(tensor.dcomplex_val)
        values = np.array([complex(x[0], x[1]) for x in zip(it, it)], dtype=dtype)
    elif tensor_dtype == dtypes.bool:
        values = np.fromiter(tensor.bool_val, dtype=dtype)
    else:
        raise TypeError(f"Unsupported tensor type: {tensor.dtype}. See "
                        "https://www.tensorflow.org/api_docs/python/tf/dtypes "
                        "for supported TF dtypes.")

    if values.size == 0:
        return np.zeros(shape, dtype)

    if values.size != num_elements:
        values = np.pad(values, (0, num_elements - values.size), "edge")

    return values.reshape(shape)
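
Three steps in `make_ndarray` are easy to misread: reinterpreting the `half_val` integers as float16 bit patterns, pairing the flat `scomplex_val`/`dcomplex_val` sequence into complex numbers with `zip(it, it)`, and repeating the last value when fewer values than elements are stored. The sketch below mirrors those ideas in pure Python, without NumPy or TensorBoard. The function names are hypothetical, and the float16 helper covers IEEE half precision only, not bfloat16.

```python
import struct


def half_bits_to_floats(bit_patterns):
    """Reinterpret 16-bit integers as IEEE 754 half-precision floats."""
    # Pack each integer as an unsigned 16-bit value, then unpack the same
    # two bytes as a half-precision float (struct format character 'e').
    return [struct.unpack("<e", struct.pack("<H", bits))[0] for bits in bit_patterns]


def pair_complex(flat):
    """Turn [re0, im0, re1, im1, ...] into a list of complex numbers."""
    it = iter(flat)
    # zip(it, it) pulls two consecutive items from the same iterator per step.
    return [complex(re, im) for re, im in zip(it, it)]


def pad_edge(values, num_elements):
    """Repeat the last value until the list holds num_elements items."""
    values = list(values)
    # An empty input is returned unchanged here; make_ndarray itself
    # returns zeros for numeric dtypes in that case.
    if values and len(values) < num_elements:
        values.extend([values[-1]] * (num_elements - len(values)))
    return values


print(half_bits_to_floats([0x3C00, 0xC000, 0x3800]))
print(pair_complex([1.0, 2.0, 3.0, -4.0]))
print(pad_edge([7], 4))
```

The padding step matters because a tensor proto may store fewer values than the shape implies when trailing values repeat, so a single stored value can stand for a whole constant tensor.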
12 changes: 10 additions & 2 deletions tests/test_summary_reader/test_no_tensorflow.py
@@ -15,6 +15,11 @@ def prepare(testdir):
    for i in x:
        writer.add_scalar('y=2x', i * 2, i)
    writer.add_text('text', 'lorem ipsum', 0)
    img_batch = np.zeros((16, 3, 100, 100))
    for i in range(16):
        img_batch[i, 0] = np.arange(0, 10000).reshape(100, 100) / 10000 / 16 * i
        img_batch[i, 1] = (1 - np.arange(0, 10000).reshape(100, 100) / 10000) / 16 * i
    writer.add_images('my_image_batch', img_batch, 0)
    writer.close()

def test_log_dir(prepare, testdir):
@@ -24,7 +29,10 @@ def test_log_dir(prepare, testdir):
assert df.columns.tolist() == ['step', 'y=2x']
assert df['step'].to_list() == [i for i in range(100)]
assert df['y=2x'].to_list() == [i*2 for i in range(100)]
df = reader.text
assert df['step'].to_list() == [0]
assert df['text'].to_list() == ["lorem ipsum"]
with pytest.raises(ModuleNotFoundError):
df = reader.text
df = reader.images
with pytest.raises(ModuleNotFoundError):
reader = SummaryReader(log_dir, pivot=True, event_types={'scalars', 'text'})
reader = SummaryReader(log_dir, pivot=True, event_types={'images'})
24 changes: 24 additions & 0 deletions tests/test_summary_reader/test_scalar_new_style_torch_sample.py
@@ -0,0 +1,24 @@
import os

import pytest
from tbparse import SummaryReader
from torch.utils.tensorboard import SummaryWriter


@pytest.fixture
def prepare(testdir):
    # Ref: https://pytorch.org/docs/stable/tensorboard.html
    log_dir = os.path.join(testdir.tmpdir, 'run')
    writer = SummaryWriter(log_dir)
    x = range(100)
    for i in x:
        writer.add_scalar('y=2x', i * 2, i, new_style=True)
    writer.close()

def test_log_dir(prepare, testdir):
    log_dir = os.path.join(testdir.tmpdir, 'run')
    reader = SummaryReader(log_dir, pivot=True)
    df = reader.tensors
    assert df.columns.tolist() == ['step', 'y=2x']
    assert df['step'].to_list() == [i for i in range(100)]
    assert df['y=2x'].to_list() == [i*2 for i in range(100)]
2 changes: 2 additions & 0 deletions tox.ini
@@ -24,6 +24,8 @@ commands =
"{toxinidir}/tests/test_summary_reader/test_histogram_torch_sample.py" \
"{toxinidir}/tests/test_summary_reader/test_hparams_torch_sample.py" \
"{toxinidir}/tests/test_summary_reader/test_scalar_torch_sample.py" \
"{toxinidir}/tests/test_summary_reader/test_scalar_new_style_torch_sample.py" \
"{toxinidir}/tests/test_summary_reader/test_text_torch_sample.py" \
"{toxinidir}/tests/test_summary_reader/test_no_tensorflow.py"
# Test tbparse with full feature set (with TensorFlow)
pip install tensorflow