Audio PR - Augmentation support [ Downmix and ToDecibels ] (#125)
* Fix formatting issues - Minor changes

* Clean up C++ audio unit test

* Remove max frames and channels from decoder

* Minor changes

* Add Output comparison for python audio unittests

* Modify rocal audio unit test

Update README

* Minor change

* Remove NSR

* Resolve PR comments

* Minor changes - Modifying the names of the arguments

* Initial commit for removing file list reader

* Added a brief desc for the rocAL enum for border type

* Add a WRN statement in PreEmphasis Filter to only use FP32 dtype

* minor changes

* Change the borderType enum to int32 from uint32 dtype

* Minor changes

* Minor change

* Update README for audio unit test

* Parameters for rocALAudioIterator

* Removing file list reader and metadata reader

* Minor change

* Remove the reset_tensor_roi() from the PreEmphasis augmentation making sure the src and dst roi points to the same location

* Del the Unit Test Files introduced earlier

* Changing python unittests for QA mode

* Resolving review comments

* Adding comment for file list case in file source reader

* Add pre_emphasis function and golden output comparison in audio unit tests

* minor change - add update val in create array

* Minor variable name change

* Minor additions in the .h file

* minor change

* Formatting Changes

* Update unit test

* Minor change

* minor change

* Resolving review comments

* Minor changes

Add wav extension in file reader
Add reader in unit test

* Minor change

* Add file reader to python audio unit test

* Add to_decibels augmentations to rocAL

* Formatting , review comments resolution and change enum dtype to int32 instead of uint32

* Update C++ unit test

* Update python audio unit test

* Remove the unused variable output - Resolve warnings in cpp unit test

* Remove the dst_roi arg passed to rpp

* Minor changes

Change borderType enum prefix

* Adding file list reader to C++ unit tests

* Fixing issues with C++ audio unit tests

* Adding test case for to_decibels and downmix

* Modifying python unittests

* Fixing spectrogram test case

* Remove the reset_tensor_roi calls

* Resolving review comments

* Resolve some PR comments

* Minor changes

* Resolving review comments

* Change the dims[0] and dims[1] positioning for Spectrogram

* Resolving review comments

* Resolving review comments

* Resolving PR comments

* Updating audio unit tests for default file list path

* Minor changes

* Minor Change

* Minor change

* Name change from sample to data

* Change from decoded_data_info to DecodedDataInfo

* Revert "Change the dims[0] and dims[1] positioning for Spectrogram"

This reverts commit d791b9a.

* Remove audio_decoder_factory.cpp file

* Minor change

* Change variable name

* Add Spectrogram Case in unit tests

* Add spectrogram case in python unit tests

* Update the struct variable name in audio files

* Fixing issues with downmix node output

* Adding ROI update in downmix node

* Adding downmix test case for python unit tests

* Adding downmix and to_decibels test case in C++ tests

* Minor changes

* Change ROCAL_DATA_PATH to exclude rocal_data

* Update ROCAL_DATA_PATH to exclude rocal_data

* Use Pascal case for function names in audio decoder

* Add audio path for downmix test case

* Fix review comments

* Modify cmake to have SNDFILE in all capital

* Minor changes

* Add struct for audio info in AudioReadAndDecode

* Fix merge conflict

* Renaming crop_image_info to CropImageInfo

* Remove - actual_host_buffers - Unused

* Rename TimingDBG to TimingDbg

* Move the instances of DecodedDataInfo to its base class LoaderModule

* Fix a WRN msg in master_graph.cpp

* Remove a dangling comment

* Rename _circ_data_info to _circ_buff_data_info

* Add Glob to CMakeLists.txt

* Rename SndFileDecoder to GenericAudioDecoder

* Fix build issues

* Minor change

* Update python API README.md for audio unit test

* Update audio unit test README

* Adding missed param in python unit tests

* Revert "Add Glob to CMakeLists.txt"

This reverts commit 47263d9.

* Fix include headers for Audio files

* Fix copy data 2D

* Minor changes

* Pass decoded data info to load routine instead of separate vectors

* Update CHANGELOG.md

* Update CHANGELOG.md

* Change swap_handle_time variable name in loader

* Update the changelog.md

* Update ChangeLog.md

* Update CHANGELOG.md

* Formatting changes

Add comments

* Update doxygen comments

* Move file source reader from readers/image to readers folder

* Update README and add doxygen description

* Update CMakeLists and README for audio test

* Update README for audio test

* Minor fix

* Fix merge from PR 2

* Minor changes shard_count argument name

* Rename set and get functions of data_info to decoded_data_info

* Revert empty line removed in CMakeLists.txt

* Removed prefix original for audio vectors

* Resolve PR comments

* Add @params to all args in pytorch.py

* Fix build issue

* Minor changes in unit test

* Minor changes

* Change ROCAL instances to rocAL in pytorch.py

* Resolve the PR comments

* Minor changes in decoders.py - Modify the comment for shard_size

* Minor changes

* Address the PR comments

* Address Review comments

* Introduce Audio layouts

* Add layout changes for spectrogram

* Fix the unit tests - c++ & python

* Minor fix

* Adding changes for spec layout changes

* Fix merge conflicts

* Resolving review comments

---------

Co-authored-by: swetha097 <[email protected]>
Co-authored-by: fiona-gladwin <[email protected]>
Co-authored-by: Swetha B S <[email protected]>
Co-authored-by: Fiona-MCW <[email protected]>
Co-authored-by: SundarRajan28 <[email protected]>
Co-authored-by: Swetha B S <>
6 people authored Jun 19, 2024
1 parent 112c50f commit 48d0617
Showing 18 changed files with 344 additions and 19 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -20,6 +20,8 @@
* Support for Audio augmentation - PreEmphasis filter
* Support for reading from file lists in file reader
* Support for Audio augmentation - Spectrogram
* Support for Audio augmentation - ToDecibels
* Support for downmixing audio channels during decoding

### Optimizations

19 changes: 19 additions & 0 deletions rocAL/include/api/rocal_api_augmentation.h
@@ -1144,4 +1144,23 @@ extern "C" RocalTensor ROCAL_API_CALL rocalSpectrogram(RocalContext context,
RocalTensorLayout output_layout = ROCAL_NFT,
RocalTensorOutputType output_datatype = ROCAL_FP32);

/*! \brief Converts a magnitude tensor to the decibel scale
* \ingroup group_rocal_augmentations
* \param [in] p_context Rocal context
* \param [in] p_input Input Rocal tensor
* \param [in] is_output is the output tensor part of the graph output
* \param [in] cutoff_db minimum or cut-off ratio in dB
* \param [in] multiplier factor by which the logarithm is multiplied
* \param [in] reference_magnitude Reference magnitude; if not provided, the maximum value of the input is used as the reference
* \param [in] rocal_tensor_output_type the data type of the output tensor
* \return RocalTensor
*/
extern "C" RocalTensor ROCAL_API_CALL rocalToDecibels(RocalContext p_context,
RocalTensor p_input,
bool is_output,
float cutoff_db,
float multiplier,
float reference_magnitude,
RocalTensorOutputType rocal_tensor_output_type);

#endif // MIVISIONX_ROCAL_API_AUGMENTATION_H
8 changes: 7 additions & 1 deletion rocAL/include/api/rocal_api_types.h
@@ -249,7 +249,13 @@ enum RocalTensorOutputType {
ROCAL_UINT8 = 2,
/*! \brief AMD ROCAL_INT8
*/
-ROCAL_INT8 = 3
+ROCAL_INT8 = 3,
+/*! \brief AMD ROCAL_UINT32
+*/
+ROCAL_UINT32 = 4,
+/*! \brief AMD ROCAL_INT32
+*/
+ROCAL_INT32 = 5
};

/*! \brief rocAL Decoder Type enum
34 changes: 34 additions & 0 deletions rocAL/include/augmentations/audio_augmentations/node_downmix.h
@@ -0,0 +1,34 @@
/*
Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/

#pragma once
#include "pipeline/graph.h"
#include "pipeline/node.h"
class DownmixNode : public Node {
public:
DownmixNode(const std::vector<Tensor *> &inputs, const std::vector<Tensor *> &outputs);
DownmixNode() = delete;

protected:
void create_node() override;
void update_node() override;
};
41 changes: 41 additions & 0 deletions rocAL/include/augmentations/audio_augmentations/node_to_decibels.h
@@ -0,0 +1,41 @@
/*
Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/

#pragma once
#include "pipeline/graph.h"
#include "pipeline/node.h"

class ToDecibelsNode : public Node {
public:
ToDecibelsNode(const std::vector<Tensor *> &inputs, const std::vector<Tensor *> &outputs);
ToDecibelsNode() = delete;
void init(float cutoff_db, float multiplier, float reference_magnitude);

protected:
void create_node() override;
void update_node() override;

private:
float _cutoff_db = -200.0;
float _multiplier = 10.0;
float _reference_magnitude = 0.0;
};
1 change: 1 addition & 0 deletions rocAL/include/augmentations/augmentations_nodes.h
@@ -57,3 +57,4 @@ THE SOFTWARE.
#include "augmentations/node_sequence_rearrange.h"
#include "augmentations/audio_augmentations/node_preemphasis_filter.h"
#include "augmentations/audio_augmentations/node_spectrogram.h"
#include "augmentations/audio_augmentations/node_to_decibels.h"
2 changes: 1 addition & 1 deletion rocAL/include/pipeline/tensor.h
@@ -191,7 +191,7 @@ class TensorInfo {
}
void set_tensor_layout(RocalTensorlayout layout) {
if (layout == RocalTensorlayout::NONE) return;
-if (_layout != layout && _layout != RocalTensorlayout::NONE) { // If layout input and current layout's are different modify dims accordingly
+if (_layout != layout && _layout != RocalTensorlayout::NONE && (_num_of_dims > 3)) { // If layout input and current layout's are different modify dims accordingly
std::vector<size_t> new_dims(_num_of_dims, 0);
get_modified_dims_from_layout(_layout, layout, new_dims);
_dims = new_dims;
34 changes: 34 additions & 0 deletions rocAL/source/api/rocal_api_augmentation.cpp
@@ -2250,3 +2250,37 @@ rocalSpectrogram(
}
return output;
}

RocalTensor ROCAL_API_CALL
rocalToDecibels(
RocalContext p_context,
RocalTensor p_input,
bool is_output,
float cutoff_db,
float multiplier,
float reference_magnitude,
RocalTensorOutputType output_datatype) {
Tensor* output = nullptr;
if ((p_context == nullptr) || (p_input == nullptr)) {
ERR("Invalid ROCAL context or invalid input tensor")
return output;
}
auto context = static_cast<Context*>(p_context);
auto input = static_cast<Tensor*>(p_input);
try {
RocalTensorDataType op_tensor_data_type = static_cast<RocalTensorDataType>(output_datatype);
TensorInfo output_info = input->info();
if (op_tensor_data_type != RocalTensorDataType::FP32) {
THROW("Only FP32 dtype is supported for To decibels augmentation.")
}
output_info.set_data_type(op_tensor_data_type);
if (input->info().layout() == RocalTensorlayout::NFT || input->info().layout() == RocalTensorlayout::NTF) // Layout is changed when input is from spectrogram/mel filter bank
output_info.set_tensor_layout(RocalTensorlayout::NHW);
output = context->master_graph->create_tensor(output_info, is_output);
context->master_graph->add_node<ToDecibelsNode>({input}, {output})->init(cutoff_db, multiplier, reference_magnitude);
} catch (const std::exception& e) {
context->capture_error(e.what());
ERR(e.what())
}
return output;
}
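The way `rocalToDecibels` derives its output tensor info from the input can be summarised with a small Python sketch. This is an illustration only, not rocAL code: `TensorInfoSketch` and `to_decibels_output_info` are hypothetical names, layouts and data types are modelled as plain strings, and the dims are assumed to carry over unchanged for a 3-D spectrogram tensor (consistent with the `_num_of_dims > 3` guard added to `set_tensor_layout` in this commit).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TensorInfoSketch:
    """Hypothetical stand-in for rocAL's TensorInfo (illustration only)."""
    dims: tuple
    layout: str
    dtype: str


def to_decibels_output_info(input_info, output_dtype="FP32"):
    """Mirror the checks in rocalToDecibels on a simplified tensor description."""
    # The API rejects every output type other than FP32.
    if output_dtype != "FP32":
        raise ValueError("Only FP32 dtype is supported for To decibels augmentation.")
    # Spectrogram / mel filter bank layouts are relabelled as NHW.
    layout = "NHW" if input_info.layout in ("NFT", "NTF") else input_info.layout
    # Assumption: dims are copied as-is for a 3-D input.
    return replace(input_info, dtype=output_dtype, layout=layout)
```

For example, a `(8, 257, 100)` tensor with layout `NFT` would come out with the same dims and layout `NHW`, while requesting any dtype other than `FP32` raises an error.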
28 changes: 26 additions & 2 deletions rocAL/source/api/rocal_api_data_loaders.cpp
@@ -38,6 +38,7 @@ THE SOFTWARE.
#include "loaders/audio/audio_source_evaluator.h"
#include "loaders/audio/node_audio_loader.h"
#include "loaders/audio/node_audio_loader_single_shard.h"
#include "augmentations/audio_augmentations/node_downmix.h"
#endif
#include "augmentations/geometry_augmentations/node_resize.h"
#include "rocal_api.h"
@@ -2219,7 +2220,19 @@ rocalAudioFileSourceSingleShard(
auto cpu_num_threads = context->master_graph->calculate_cpu_num_threads(shard_count);
context->master_graph->add_node<AudioLoaderSingleShardNode>({}, {output})->Init(shard_id, shard_count, cpu_num_threads, source_path, "", StorageType::FILE_SYSTEM, DecoderType::AUDIO_SOFTWARE_DECODE, shuffle, loop, context->user_batch_size(), context->master_graph->mem_type(), context->master_graph->meta_data_reader());
context->master_graph->set_loop(loop);
-if (is_output) {
+if (downmix && (max_channels > 1)) {
+TensorInfo output_info = info;
+std::vector<size_t> output_dims = {context->user_batch_size(), info.dims()[1], 1};
+output_info.set_dims(output_dims);
+auto downmixed_output = context->master_graph->create_tensor(output_info, false);
+std::shared_ptr<DownmixNode> downmix_node = context->master_graph->add_node<DownmixNode>({output}, {downmixed_output});
+
+if (is_output) {
+auto actual_output = context->master_graph->create_tensor(output_info, is_output);
+context->master_graph->add_node<CopyNode>({downmixed_output}, {actual_output});
+}
+return downmixed_output;
+} else if (is_output) {
auto actual_output = context->master_graph->create_tensor(info, is_output);
context->master_graph->add_node<CopyNode>({output}, {actual_output});
}
@@ -2264,7 +2277,18 @@ rocalAudioFileSource(
auto cpu_num_threads = context->master_graph->calculate_cpu_num_threads(shard_count);
context->master_graph->add_node<AudioLoaderNode>({}, {output})->Init(shard_count, cpu_num_threads, source_path, "", StorageType::FILE_SYSTEM, DecoderType::AUDIO_SOFTWARE_DECODE, shuffle, loop, context->user_batch_size(), context->master_graph->mem_type(), context->master_graph->meta_data_reader());
context->master_graph->set_loop(loop);
-if (is_output) {
+if (downmix && (max_channels > 1)) {
+TensorInfo output_info = info;
+std::vector<size_t> output_dims = {context->user_batch_size(), info.dims()[1], 1};
+output_info.set_dims(output_dims);
+auto downmixed_output = context->master_graph->create_tensor(output_info, false);
+std::shared_ptr<DownmixNode> downmix_node = context->master_graph->add_node<DownmixNode>({output}, {downmixed_output});
+if (is_output) {
+auto actual_output = context->master_graph->create_tensor(output_info, is_output);
+context->master_graph->add_node<CopyNode>({downmixed_output}, {actual_output});
+}
+return downmixed_output;
+} else if (is_output) {
auto actual_output = context->master_graph->create_tensor(info, is_output);
context->master_graph->add_node<CopyNode>({output}, {actual_output});
}
49 changes: 49 additions & 0 deletions rocAL/source/augmentations/audio_augmentations/node_downmix.cpp
@@ -0,0 +1,49 @@
/*
Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/

#include "augmentations/audio_augmentations/node_downmix.h"

#include <vx_ext_rpp.h>

#include "pipeline/exception.h"

DownmixNode::DownmixNode(const std::vector<Tensor *> &inputs, const std::vector<Tensor *> &outputs) : Node(inputs, outputs) {}

void DownmixNode::create_node() {
if (_node)
return;

vx_status status = VX_SUCCESS;
_node = vxExtRppDownmix(_graph->get(), _inputs[0]->handle(), _outputs[0]->handle(), _inputs[0]->get_roi_tensor());

if ((status = vxGetStatus((vx_reference)_node)) != VX_SUCCESS)
THROW("Adding the downmix (vxExtRppDownmix) node failed: " + TOSTR(status))
}

void DownmixNode::update_node() {
for (unsigned i = 0; i < _batch_size; i++) {
unsigned *tensor_shape = _inputs[0]->info().roi()[i].end;
unsigned *output_tensor_shape = _outputs[0]->info().roi()[i].end;
output_tensor_shape[0] = tensor_shape[0];
output_tensor_shape[1] = 1; // Setting channels to 1 for downmix output
}
}
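To make the effect of the downmix node concrete, here is a minimal Python sketch. It assumes downmixing is an equal-weight average over channels; the commit itself does not show the kernel, which lives behind `vxExtRppDownmix`, so the averaging rule is an assumption. `downmix_frames` and `downmix_roi_end` are hypothetical helper names; the second one mirrors what `DownmixNode::update_node` does to each sample's ROI.

```python
def downmix_frames(frames):
    """Collapse multi-channel audio to one channel per frame.

    `frames` is a list of frames, each frame a list of per-channel values.
    Assumption: channels are averaged with equal weights.
    """
    return [[sum(frame) / len(frame)] for frame in frames]


def downmix_roi_end(input_roi_end):
    """Mirror update_node: keep the sample count, force channels to 1."""
    return [input_roi_end[0], 1]


stereo = [[1.0, 3.0], [2.0, 4.0]]
mono = downmix_frames(stereo)        # one value per frame
roi = downmix_roi_end([16000, 2])    # [samples, channels] -> [samples, 1]
```

The output tensor created in `rocalAudioFileSource` follows the same idea: its dims are `{batch_size, info.dims()[1], 1}`, so only the channel dimension shrinks.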
@@ -0,0 +1,56 @@
/*
Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/

#include "augmentations/audio_augmentations/node_to_decibels.h"

#include <vx_ext_rpp.h>

#include "pipeline/exception.h"

ToDecibelsNode::ToDecibelsNode(const std::vector<Tensor *> &inputs, const std::vector<Tensor *> &outputs) : Node(inputs, outputs) {}

void ToDecibelsNode::create_node() {
if (_node)
return;

vx_status status = VX_SUCCESS;
vx_scalar cutoff_db_vx = vxCreateScalar(vxGetContext((vx_reference)_graph->get()), VX_TYPE_FLOAT32, &_cutoff_db);
vx_scalar multiplier_vx = vxCreateScalar(vxGetContext((vx_reference)_graph->get()), VX_TYPE_FLOAT32, &_multiplier);
vx_scalar reference_magnitude_vx = vxCreateScalar(vxGetContext((vx_reference)_graph->get()), VX_TYPE_FLOAT32, &_reference_magnitude);
int input_layout = static_cast<int>(_inputs[0]->info().layout());
int output_layout = static_cast<int>(_outputs[0]->info().layout());
vx_scalar input_layout_vx = vxCreateScalar(vxGetContext((vx_reference)_graph->get()), VX_TYPE_INT32, &input_layout);
vx_scalar output_layout_vx = vxCreateScalar(vxGetContext((vx_reference)_graph->get()), VX_TYPE_INT32, &output_layout);
_node = vxExtRppToDecibels(_graph->get(), _inputs[0]->handle(), _inputs[0]->get_roi_tensor(), _outputs[0]->handle(), cutoff_db_vx,
multiplier_vx, reference_magnitude_vx, input_layout_vx, output_layout_vx);

if ((status = vxGetStatus((vx_reference)_node)) != VX_SUCCESS)
THROW("Adding the to_decibels (vxExtRppToDecibels) node failed: " + TOSTR(status))
}

void ToDecibelsNode::update_node() {}

void ToDecibelsNode::init(float cutoff_db, float multiplier, float reference_magnitude) {
_cutoff_db = cutoff_db;
_multiplier = multiplier;
_reference_magnitude = reference_magnitude;
}
13 changes: 13 additions & 0 deletions rocAL_pybind/amd/rocal/fn.py
@@ -1112,3 +1112,16 @@ def spectrogram(*inputs, bytes_per_sample_hint = [0], center_windows = True, lay
"power": power, "nfft": nfft, "window_length": window_length, "window_step": window_step, "output_layout": layout, "output_dtype": output_dtype}
spectrogram_output = b.spectrogram(Pipeline._current_pipeline._handle, *(kwargs_pybind.values()))
return (spectrogram_output)

def to_decibels(*inputs, bytes_per_sample_hint = [0], cutoff_db = -200.0, multiplier = 10.0, reference = 0.0, seed = -1, output_dtype = types.FLOAT):
'''
Converts a magnitude (real, positive) to the decibel scale.
Conversion is done according to the following formula:
min_ratio = pow(10, cutoff_db / multiplier)
out[i] = multiplier * log10( max(min_ratio, input[i] / reference) )
'''
kwargs_pybind = {"input_audio": inputs[0], "is_output": False, "cutoff_db": cutoff_db, "multiplier": multiplier, "reference_magnitude": reference, "rocal_tensor_output_type": output_dtype}
decibel_scale = b.toDecibels(Pipeline._current_pipeline._handle, *(kwargs_pybind.values()))
return decibel_scale
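The formula in the docstring can be checked with a small pure-Python reference. This is a sketch under stated assumptions, not the RPP kernel: `to_decibels_reference` is a hypothetical name, it works on a flat list of magnitudes, and it treats `reference == 0.0` as "use the maximum input value", following the doxygen note on `reference_magnitude` and the `0.0` default. The all-zero input case is handled here by returning `cutoff_db`, which is a choice made for this sketch.

```python
import math


def to_decibels_reference(values, cutoff_db=-200.0, multiplier=10.0, reference=0.0):
    """Apply out[i] = multiplier * log10(max(min_ratio, values[i] / reference))."""
    if not values:
        return []
    # Assumption: a reference of 0.0 means "use the maximum input value".
    ref = reference if reference != 0.0 else max(values)
    min_ratio = math.pow(10.0, cutoff_db / multiplier)
    if ref == 0.0:
        # All inputs are zero: every ratio clamps to min_ratio, i.e. cutoff_db.
        return [cutoff_db] * len(values)
    return [multiplier * math.log10(max(min_ratio, v / ref)) for v in values]


# Power-like magnitudes measured against a fixed reference of 1.0.
print(to_decibels_reference([1.0, 10.0, 100.0], reference=1.0))
```

With `multiplier = 10.0` the result is a power ratio in dB; passing `multiplier = 20.0` would be the usual choice for amplitude ratios. Values whose ratio falls below `min_ratio` are clamped, so the output never drops below `cutoff_db`.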
2 changes: 2 additions & 0 deletions rocAL_pybind/rocal_pybind.cpp
@@ -731,5 +731,7 @@ PYBIND11_MODULE(rocal_pybind, m) {
py::return_value_policy::reference);
m.def("spectrogram", &rocalSpectrogram,
py::return_value_policy::reference);
m.def("toDecibels", &rocalToDecibels,
py::return_value_policy::reference);
}
} // namespace rocal
2 changes: 2 additions & 0 deletions tests/cpp_api/audio_tests/README.md
@@ -40,3 +40,5 @@ python3 audio_tests.py --gpu <0/1> --downmix <True/False> --test_case <case_numb
* Case 0 - Audio Decoder
* Case 1 - PreEmphasis Filter
* Case 2 - Spectrogram
* Case 3 - Downmix
* Case 4 - ToDecibels