Audio PR - Augmentation support [ Downmix and ToDecibels ] (#125)

* Fix formatting issues - Minor changes
* Clean up C++ audio unit test
* Remove max frames and channels from decoder
* Minor changes
* Add Output comparison for python audio unittests
* Modify rocal audio unit test Update README
* Minor change
* Remove NSR
* Resolve PR comments
* Minor changes - Modifying the names of the arguments
* Initial commit for removing file list reader
* Added a brief desc for the rocAL enum for border type
* Add a WRN statement in PreEmphasis Filter to only use FP32 dtype
* minor changes
* Change the borderType enum to int32 from uint32 dtype
* Minor changes
* Minor change
* Update README for audio unit test
* Parameters for rocALAudioIterator
* Removing file list reader and metadata reader
* Minor change
* Remove the reset_tensor_roi() from the PreEmphasis augmentation making sure the src and dst roi points to the same location
* Del the Unit Test Files introduced earlier
* Changing python unittests for QA mode
* Resolving review comments
* Adding comment for file list case in file source reader
* Add pre_emphasis function and golden output comparison in audio unit tests
* minor change - add update val in create array
* Minor variable name change
* Minor additions in the .h file
* minor change
* Formatting Changes
* Update unit test
* Minor change
* minor change
* Resolving review comments
* Minor changes Add wav extension in file reader Add reader in unit test
* Minor change
* Add file reader to python audio unit test
* Add to_decibels augmentations to rocAL
* Formatting, review comments resolution and change enum dtype to int32 instead of uint32
* Update C++ unit test
* Update python audio unit test
* Remove the unused variable output - Resolve warnings in cpp unit test
* Remove the dst_roi arg passed to rpp
* Minor changes Change borderType enum prefix
* Adding file list reader to C++ unit tests
* Fixing issues with C++ audio unit tests
* Adding test case for to_decibels and downmix
* Modifying python unittests
* Fixing spectrogram test case
* Remove the reset_tensor_roi calls
* Resolving review comments
* Resolve some PR comments
* Minor changes
* Resolving review comments
* Change the dims[0] and dims[1] positioning for Spectrogram
* Resolving review comments
* Resolving review comments
* Resolving PR comments
* Updating audio unit tests for default file list path
* Minor changes
* Minor Change
* Minor change
* Name change from sample to data
* Change from decoded_data_info to DecodedDataInfo
* Revert "Change the dims[0] and dims[1] positioning for Spectrogram" This reverts commit d791b9a.
* Remove audio_decoder_factory.cpp file
* Minor change
* Change variable name
* Add Spectrogram Case in unit tests
* Add spectrogram case in python unit tests
* Update the struct variable name in audio files
* Fixing issues with downmix node output
* Adding ROI updation in downmix node
* Adding downmix test case for python unit tests
* Adding downmix and to_decibels test case in C++ tests
* Minor changes
* Change ROCAL_DATA_PATH to exclude rocal_data
* Update ROCAL_DATA_PATH to exclude rocal_data
* Use Pascal case for function names in audio decoder
* Add audio path for downmix test case
* Fix review comments
* Modify cmake to have SNDFILE in all capital
* Minor changes
* Add struct for audio info in AudioReadAndDecode
* Fix merge conflict
* Renaming crop_image_info to CropImageInfo
* Remove - actual_host_buffers - Unused
* Rename TimingDBG to TimingDbg
* Move the instances of DecodedDataInfo to its base class LoaderModule
* Fix a WRN msg in master_graph.cpp
* Remove a dangling comment
* Rename _circ_data_info to _circ_buff_data_info
* Add Glob to CMakeLists.txt
* Rename SndFileDecoder to GenericAudioDecoder
* Fix build issues
* Minor change
* Update python API README.md for audio unit test
* Update audio unit test README
* Adding missed param in python unit tests
* Revert "Add Glob to CMakeLists.txt" This reverts commit 47263d9.
* Fix include headers for Audio files
* Fix copy data 2D
* Minor changes
* Pass decoded data info to load routine instead of separate vectors
* Update CHANGELOG.md
* Update CHANGELOG.md
* Change swap_handle_time variable name in loader
* Update the changelog.md
* Update ChangeLog.md
* Update CHANGELOG.md
* Formatting changes Add comments
* Update doxygen comments
* Move file source reader from readers/image to readers folder
* Update README and add doxygen description
* Update CMakeLists and README for audio test
* Update README for audio test
* Minor fix
* Fix merge from PR 2
* Minor changes shard_count argument name
* Rename set and get functions of data_info to decoded_data_info
* Revert empty line removed in CMakeLists.txt
* Removed prefix original for audio vectors
* Resolve PR comments
* Add @params to all args in pytorch.py
* Fix build issue
* Minor changes in unit test
* Minor changes
* Change ROCAL instances to rocAL in pytorch.py
* Resolve the PR comments
* Minor changes in decoders.py - Modify the comment for shard_size
* Minor changes
* Address the PR comments
* Address Review comments
* Introduce Audio layouts
* Add layout changes for spectrogram
* Fix the unit tests - c++ & python
* Minor fix
* Adding changes for spec layout changes
* Fix merge conflicts
* Resolving review comments

---------

Co-authored-by: swetha097 <[email protected]>
Co-authored-by: fiona-gladwin <[email protected]>
Co-authored-by: Swetha B S <[email protected]>
Co-authored-by: Fiona-MCW <[email protected]>
Co-authored-by: SundarRajan28 <[email protected]>
Co-authored-by: Swetha B S <>
ROCm · Jun 19, 2024 · 48d0617
1 parent 112c50f
Showing 18 changed files with 344 additions and 19 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -20,6 +20,8 @@
 * Support for Audio augmentation - PreEmphasis filter
 * Support for reading from file lists in file reader
 * Support for Audio augmentation - Spectrogram
+* Support for Audio augmentation - ToDecibels
+* Support for downmixing audio channels during decoding
 
 ### Optimizations
 

diff --git a/rocAL/include/api/rocal_api_augmentation.h b/rocAL/include/api/rocal_api_augmentation.h
@@ -1144,4 +1144,23 @@ extern "C" RocalTensor ROCAL_API_CALL rocalSpectrogram(RocalContext context,
                                                        RocalTensorLayout output_layout = ROCAL_NFT,
                                                        RocalTensorOutputType output_datatype = ROCAL_FP32);
 
+/*! \brief Converts the magnitude values of the input tensor to the decibel scale.
+ * \ingroup group_rocal_augmentations
+ * \param [in] p_context Rocal context
+ * \param [in] p_input Input Rocal tensor
+ * \param [in] is_output Whether the output tensor is part of the graph output
+ * \param [in] cutoff_db Minimum or cut-off ratio in dB
+ * \param [in] multiplier Factor by which the logarithm is multiplied
+ * \param [in] reference_magnitude Reference magnitude; if not provided, the maximum value of the input is used as the reference
+ * \param [in] rocal_tensor_output_type The data type of the output tensor
+ * \return RocalTensor
+ */
+extern "C" RocalTensor ROCAL_API_CALL rocalToDecibels(RocalContext p_context,
+                                                      RocalTensor p_input,
+                                                      bool is_output,
+                                                      float cutoff_db,
+                                                      float multiplier,
+                                                      float reference_magnitude,
+                                                      RocalTensorOutputType rocal_tensor_output_type);
+
 #endif  // MIVISIONX_ROCAL_API_AUGMENTATION_H
diff --git a/rocAL/include/api/rocal_api_types.h b/rocAL/include/api/rocal_api_types.h
@@ -249,7 +249,13 @@ enum RocalTensorOutputType {
     ROCAL_UINT8 = 2,
     /*! \brief AMD ROCAL_INT8
      */
-    ROCAL_INT8 = 3
+    ROCAL_INT8 = 3,
+    /*! \brief AMD ROCAL_UINT32
+     */
+    ROCAL_UINT32 = 4,
+    /*! \brief AMD ROCAL_INT32
+     */
+    ROCAL_INT32 = 5
 };
 
 /*! \brief rocAL Decoder Type enum

diff --git a/rocAL/include/augmentations/audio_augmentations/node_downmix.h b/rocAL/include/augmentations/audio_augmentations/node_downmix.h
@@ -0,0 +1,34 @@
+/*
+Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
+*/
+
+#pragma once
+#include "pipeline/graph.h"
+#include "pipeline/node.h"
+class DownmixNode : public Node {
+   public:
+    DownmixNode(const std::vector<Tensor *> &inputs, const std::vector<Tensor *> &outputs);
+    DownmixNode() = delete;
+
+   protected:
+    void create_node() override;
+    void update_node() override;
+};
diff --git a/rocAL/include/augmentations/audio_augmentations/node_to_decibels.h b/rocAL/include/augmentations/audio_augmentations/node_to_decibels.h
@@ -0,0 +1,41 @@
+/*
+Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
+*/
+
+#pragma once
+#include "pipeline/graph.h"
+#include "pipeline/node.h"
+
+class ToDecibelsNode : public Node {
+   public:
+    ToDecibelsNode(const std::vector<Tensor *> &inputs, const std::vector<Tensor *> &outputs);
+    ToDecibelsNode() = delete;
+    void init(float cutoff_db, float multiplier, float reference_magnitude);
+
+   protected:
+    void create_node() override;
+    void update_node() override;
+
+   private:
+    float _cutoff_db = -200.0;
+    float _multiplier = 10.0;
+    float _reference_magnitude = 0.0;
+};
diff --git a/rocAL/include/augmentations/augmentations_nodes.h b/rocAL/include/augmentations/augmentations_nodes.h
@@ -57,3 +57,4 @@ THE SOFTWARE.
 #include "augmentations/node_sequence_rearrange.h"
 #include "augmentations/audio_augmentations/node_preemphasis_filter.h"
 #include "augmentations/audio_augmentations/node_spectrogram.h"
+#include "augmentations/audio_augmentations/node_to_decibels.h"
diff --git a/rocAL/include/pipeline/tensor.h b/rocAL/include/pipeline/tensor.h
@@ -191,7 +191,7 @@ class TensorInfo {
     }
     void set_tensor_layout(RocalTensorlayout layout) {
         if (layout == RocalTensorlayout::NONE) return;
-        if (_layout != layout && _layout != RocalTensorlayout::NONE) {  // If layout input and current layout's are different modify dims accordingly
+        if (_layout != layout && _layout != RocalTensorlayout::NONE && (_num_of_dims > 3)) {  // If layout input and current layout's are different modify dims accordingly
             std::vector<size_t> new_dims(_num_of_dims, 0);
             get_modified_dims_from_layout(_layout, layout, new_dims);
             _dims = new_dims;

diff --git a/rocAL/source/api/rocal_api_augmentation.cpp b/rocAL/source/api/rocal_api_augmentation.cpp
@@ -2250,3 +2250,37 @@ rocalSpectrogram(
     }
     return output;
 }
+
+RocalTensor ROCAL_API_CALL
+rocalToDecibels(
+    RocalContext p_context,
+    RocalTensor p_input,
+    bool is_output,
+    float cutoff_db,
+    float multiplier,
+    float reference_magnitude,
+    RocalTensorOutputType output_datatype) {
+    Tensor* output = nullptr;
+    if ((p_context == nullptr) || (p_input == nullptr)) {
+        ERR("Invalid ROCAL context or invalid input tensor")
+        return output;
+    }
+    auto context = static_cast<Context*>(p_context);
+    auto input = static_cast<Tensor*>(p_input);
+    try {
+        RocalTensorDataType op_tensor_data_type = static_cast<RocalTensorDataType>(output_datatype);
+        TensorInfo output_info = input->info();
+        if (op_tensor_data_type != RocalTensorDataType::FP32) {
+            THROW("Only FP32 dtype is supported for To decibels augmentation.")
+        }
+        output_info.set_data_type(op_tensor_data_type);
+        if (input->info().layout() == RocalTensorlayout::NFT || input->info().layout() == RocalTensorlayout::NTF) // Layout is changed when input is from spectrogram/mel filter bank
+            output_info.set_tensor_layout(RocalTensorlayout::NHW);
+        output = context->master_graph->create_tensor(output_info, is_output);
+        context->master_graph->add_node<ToDecibelsNode>({input}, {output})->init(cutoff_db, multiplier, reference_magnitude);
+    } catch (const std::exception& e) {
+        context->capture_error(e.what());
+        ERR(e.what())
+    }
+    return output;
+}
diff --git a/rocAL/source/api/rocal_api_data_loaders.cpp b/rocAL/source/api/rocal_api_data_loaders.cpp
@@ -38,6 +38,7 @@ THE SOFTWARE.
 #include "loaders/audio/audio_source_evaluator.h"
 #include "loaders/audio/node_audio_loader.h"
 #include "loaders/audio/node_audio_loader_single_shard.h"
+#include "augmentations/audio_augmentations/node_downmix.h"
 #endif
 #include "augmentations/geometry_augmentations/node_resize.h"
 #include "rocal_api.h"
@@ -2219,7 +2220,19 @@ rocalAudioFileSourceSingleShard(
         auto cpu_num_threads = context->master_graph->calculate_cpu_num_threads(shard_count);
         context->master_graph->add_node<AudioLoaderSingleShardNode>({}, {output})->Init(shard_id, shard_count, cpu_num_threads, source_path, "", StorageType::FILE_SYSTEM, DecoderType::AUDIO_SOFTWARE_DECODE, shuffle, loop, context->user_batch_size(), context->master_graph->mem_type(), context->master_graph->meta_data_reader());
         context->master_graph->set_loop(loop);
-        if (is_output) {
+        if (downmix && (max_channels > 1)) {
+            TensorInfo output_info = info;
+            std::vector<size_t> output_dims = {context->user_batch_size(), info.dims()[1], 1};
+            output_info.set_dims(output_dims);
+            auto downmixed_output = context->master_graph->create_tensor(output_info, false);
+            std::shared_ptr<DownmixNode> downmix_node = context->master_graph->add_node<DownmixNode>({output}, {downmixed_output});
+
+            if (is_output) {
+                auto actual_output = context->master_graph->create_tensor(output_info, is_output);
+                context->master_graph->add_node<CopyNode>({downmixed_output}, {actual_output});
+            }
+            return downmixed_output;
+        } else if (is_output) {
             auto actual_output = context->master_graph->create_tensor(info, is_output);
             context->master_graph->add_node<CopyNode>({output}, {actual_output});
         }
@@ -2264,7 +2277,18 @@ rocalAudioFileSource(
         auto cpu_num_threads = context->master_graph->calculate_cpu_num_threads(shard_count);
         context->master_graph->add_node<AudioLoaderNode>({}, {output})->Init(shard_count, cpu_num_threads, source_path, "", StorageType::FILE_SYSTEM, DecoderType::AUDIO_SOFTWARE_DECODE, shuffle, loop, context->user_batch_size(), context->master_graph->mem_type(), context->master_graph->meta_data_reader());
         context->master_graph->set_loop(loop);
-        if (is_output) {
+        if (downmix && (max_channels > 1)) {
+            TensorInfo output_info = info;
+            std::vector<size_t> output_dims = {context->user_batch_size(), info.dims()[1], 1};
+            output_info.set_dims(output_dims);
+            auto downmixed_output = context->master_graph->create_tensor(output_info, false);
+            std::shared_ptr<DownmixNode> downmix_node = context->master_graph->add_node<DownmixNode>({output}, {downmixed_output});
+            if (is_output) {
+                auto actual_output = context->master_graph->create_tensor(output_info, is_output);
+                context->master_graph->add_node<CopyNode>({downmixed_output}, {actual_output});
+            }
+            return downmixed_output;
+        } else if (is_output) {
             auto actual_output = context->master_graph->create_tensor(info, is_output);
             context->master_graph->add_node<CopyNode>({output}, {actual_output});
         }
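
The branch added to both loader functions above only takes the downmix path when `downmix` is set and the decoder reports more than one channel. A minimal Python sketch of the resulting maximum tensor dims follows. It assumes the loader's dims are ordered `[batch, max_samples, max_channels]`, which is inferred from `info.dims()[1]` being kept and the last dim being set to 1. The helper name `loader_output_dims` is hypothetical and is not part of rocAL.

```python
from typing import List


def loader_output_dims(batch_size: int, max_samples: int,
                       max_channels: int, downmix: bool) -> List[int]:
    """Return the max dims of the tensor handed to downstream nodes.

    Hypothetical helper mirroring the condition in the diff:
    the channel dim collapses to 1 only when downmix is requested
    and there is more than one channel to mix.
    """
    if downmix and max_channels > 1:
        return [batch_size, max_samples, 1]
    # Mono input, or downmix disabled: the dims are left unchanged.
    return [batch_size, max_samples, max_channels]


print(loader_output_dims(8, 16000, 2, downmix=True))
print(loader_output_dims(8, 16000, 2, downmix=False))
```

Note that with `downmix` enabled on mono input (`max_channels == 1`) no `DownmixNode` is added, so the loader output is returned as-is.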

diff --git a/rocAL/source/augmentations/audio_augmentations/node_downmix.cpp b/rocAL/source/augmentations/audio_augmentations/node_downmix.cpp
@@ -0,0 +1,49 @@
+/*
+Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
+*/
+
+#include "augmentations/audio_augmentations/node_downmix.h"
+
+#include <vx_ext_rpp.h>
+
+#include "pipeline/exception.h"
+
+DownmixNode::DownmixNode(const std::vector<Tensor *> &inputs, const std::vector<Tensor *> &outputs) : Node(inputs, outputs) {}
+
+void DownmixNode::create_node() {
+    if (_node)
+        return;
+
+    vx_status status = VX_SUCCESS;
+    _node = vxExtRppDownmix(_graph->get(), _inputs[0]->handle(), _outputs[0]->handle(), _inputs[0]->get_roi_tensor());
+
+    if ((status = vxGetStatus((vx_reference)_node)) != VX_SUCCESS)
+        THROW("Adding the downmix (vxExtRppDownmix) node failed: " + TOSTR(status))
+}
+
+void DownmixNode::update_node() {
+    for (unsigned i = 0; i < _batch_size; i++) {
+        unsigned *tensor_shape = _inputs[0]->info().roi()[i].end;
+        unsigned *output_tensor_shape = _outputs[0]->info().roi()[i].end;
+        output_tensor_shape[0] = tensor_shape[0];
+        output_tensor_shape[1] = 1;  // Setting channels to 1 for downmix output
+    }
+}
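
For reference, the per-sample effect of the node can be sketched in plain Python. This assumes that downmixing averages the channels of each frame with equal weights. The diff itself only shows the ROI update, so the averaging is an assumption about the underlying `vxExtRppDownmix` kernel. The function name `downmix_sample` is hypothetical.

```python
from typing import List, Tuple


def downmix_sample(audio: List[List[float]], samples: int,
                   channels: int) -> Tuple[List[float], Tuple[int, int]]:
    """Downmix one padded audio buffer to mono.

    audio    : frames of the padded buffer, each a list of channel values
    samples  : valid frame count (ROI end[0] in the node)
    channels : valid channel count (ROI end[1] in the node)
    Returns the mono signal and the updated ROI (samples, 1),
    matching what update_node() writes for the output tensor.
    """
    # Assumption: equal-weight average over the valid channels.
    mono = [sum(frame[:channels]) / channels for frame in audio[:samples]]
    return mono, (samples, 1)


# Two valid stereo frames followed by one frame of padding.
mono, roi = downmix_sample([[1.0, 3.0], [2.0, 4.0], [0.0, 0.0]],
                           samples=2, channels=2)
print(mono, roi)
```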
diff --git a/rocAL/source/augmentations/audio_augmentations/node_to_decibels.cpp b/rocAL/source/augmentations/audio_augmentations/node_to_decibels.cpp
@@ -0,0 +1,56 @@
+/*
+Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
+*/
+
+#include "augmentations/audio_augmentations/node_to_decibels.h"
+
+#include <vx_ext_rpp.h>
+
+#include "pipeline/exception.h"
+
+ToDecibelsNode::ToDecibelsNode(const std::vector<Tensor *> &inputs, const std::vector<Tensor *> &outputs) : Node(inputs, outputs) {}
+
+void ToDecibelsNode::create_node() {
+    if (_node)
+        return;
+
+    vx_status status = VX_SUCCESS;
+    vx_scalar cutoff_db_vx = vxCreateScalar(vxGetContext((vx_reference)_graph->get()), VX_TYPE_FLOAT32, &_cutoff_db);
+    vx_scalar multiplier_vx = vxCreateScalar(vxGetContext((vx_reference)_graph->get()), VX_TYPE_FLOAT32, &_multiplier);
+    vx_scalar reference_magnitude_vx = vxCreateScalar(vxGetContext((vx_reference)_graph->get()), VX_TYPE_FLOAT32, &_reference_magnitude);
+    int input_layout = static_cast<int>(_inputs[0]->info().layout());
+    int output_layout = static_cast<int>(_outputs[0]->info().layout());
+    vx_scalar input_layout_vx = vxCreateScalar(vxGetContext((vx_reference)_graph->get()), VX_TYPE_INT32, &input_layout);
+    vx_scalar output_layout_vx = vxCreateScalar(vxGetContext((vx_reference)_graph->get()), VX_TYPE_INT32, &output_layout);
+    _node = vxExtRppToDecibels(_graph->get(), _inputs[0]->handle(), _inputs[0]->get_roi_tensor(), _outputs[0]->handle(), cutoff_db_vx,
+                               multiplier_vx, reference_magnitude_vx, input_layout_vx, output_layout_vx);
+
+    if ((status = vxGetStatus((vx_reference)_node)) != VX_SUCCESS)
+        THROW("Adding the to_decibels (vxExtRppToDecibels) node failed: " + TOSTR(status))
+}
+
+void ToDecibelsNode::update_node() {}
+
+void ToDecibelsNode::init(float cutoff_db, float multiplier, float reference_magnitude) {
+    _cutoff_db = cutoff_db;
+    _multiplier = multiplier;
+    _reference_magnitude = reference_magnitude;
+}
diff --git a/rocAL_pybind/amd/rocal/fn.py b/rocAL_pybind/amd/rocal/fn.py
@@ -1112,3 +1112,16 @@ def spectrogram(*inputs, bytes_per_sample_hint = [0], center_windows = True, lay
                      "power": power, "nfft": nfft, "window_length": window_length, "window_step": window_step, "output_layout": layout, "output_dtype": output_dtype}
     spectrogram_output = b.spectrogram(Pipeline._current_pipeline._handle, *(kwargs_pybind.values()))
     return (spectrogram_output)
+
+def to_decibels(*inputs, bytes_per_sample_hint = [0], cutoff_db = -200.0, multiplier = 10.0, reference = 0.0, seed = -1, output_dtype = types.FLOAT):
+    '''
+    Converts a magnitude (real, positive) to the decibel scale.
+
+    Conversion is done according to the following formula:
+
+    min_ratio = pow(10, cutoff_db / multiplier)
+    out[i] = multiplier * log10( max(min_ratio, input[i] / reference) )
+    '''
+    kwargs_pybind = {"input_audio": inputs[0], "is_output": False, "cutoff_db": cutoff_db, "multiplier": multiplier, "reference_magnitude": reference, "rocal_tensor_output_type": output_dtype}
+    decibel_scale = b.toDecibels(Pipeline._current_pipeline._handle, *(kwargs_pybind.values()))
+    return decibel_scale
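
The formula in the docstring can be checked against a small pure-Python reference. This is a sketch and not the rocAL or RPP implementation. It assumes that `reference == 0.0` means "use the maximum input value", as the header comment for `reference_magnitude` suggests. The name `to_decibels_reference` is hypothetical.

```python
import math
from typing import List


def to_decibels_reference(values: List[float], cutoff_db: float = -200.0,
                          multiplier: float = 10.0,
                          reference: float = 0.0) -> List[float]:
    """Reference version of the docstring formula for one sample."""
    if reference == 0.0:
        # Assumption: a zero reference falls back to the input maximum.
        reference = max(values)
    if reference == 0.0:
        raise ValueError("reference magnitude is zero for an all-zero input")
    # Ratios below min_ratio are clamped, so the output never drops
    # below cutoff_db.
    min_ratio = 10.0 ** (cutoff_db / multiplier)
    return [multiplier * math.log10(max(min_ratio, v / reference))
            for v in values]


# Power-like magnitudes relative to a reference of 1.0.
print(to_decibels_reference([1.0, 10.0, 100.0], reference=1.0))
# A zero input is clamped to the cutoff instead of producing -inf.
print(to_decibels_reference([0.0, 1.0], cutoff_db=-80.0, reference=1.0))
```

With the default `multiplier = 10.0` the input is treated as a power quantity; a multiplier of 20.0 would be the usual choice for amplitude values.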
diff --git a/rocAL_pybind/rocal_pybind.cpp b/rocAL_pybind/rocal_pybind.cpp
@@ -731,5 +731,7 @@ PYBIND11_MODULE(rocal_pybind, m) {
             py::return_value_policy::reference);
     m.def("spectrogram", &rocalSpectrogram,
           py::return_value_policy::reference);
+    m.def("toDecibels", &rocalToDecibels,
+          py::return_value_policy::reference);
 }
 }  // namespace rocal
diff --git a/tests/cpp_api/audio_tests/README.md b/tests/cpp_api/audio_tests/README.md
@@ -40,3 +40,5 @@ python3 audio_tests.py --gpu <0/1> --downmix <True/False> --test_case <case_numb
 * Case 0 - Audio Decoder
 * Case 1 - PreEmphasis Filter
 * Case 2 - Spectrogram
+* Case 3 - Downmix
+* Case 4 - ToDecibels