RPP Sobel Filter Tensor HIP #345

Open: wants to merge 33 commits into base: master

Commits (33):
22a39a4
initial commit
sampath1117 Jul 26, 2024
568a71a
added initial support for 3x3 kernel gradient x variant
sampath1117 Jul 26, 2024
50bd38b
added support for gradient y and gradient xy variants
sampath1117 Jul 26, 2024
5794a05
fixed the issue with border pixel processing in raw c code
sampath1117 Jul 27, 2024
e71e319
reverted back to previous version
sampath1117 Jul 29, 2024
cde5e22
fixed the xy gradient output issues
sampath1117 Jul 29, 2024
1e006e5
decoupled xy gradient from x, y gradient variants
sampath1117 Jul 29, 2024
2f281c5
modified raw c process function names for better clarity
sampath1117 Jul 29, 2024
2a1e892
modified to process pixels in vectorized manner per iteration
sampath1117 Jul 29, 2024
792afb8
added support for other data types
sampath1117 Jul 29, 2024
02b9d4f
added support to convert RGB images to grey scale images before passi…
sampath1117 Jul 30, 2024
1b59d6c
added support for 5x5 kernel size
sampath1117 Jul 30, 2024
49bb291
added support for 7x7 kernel size
sampath1117 Jul 30, 2024
ac1ed74
added sobel filter case number in runTests.py
sampath1117 Jul 30, 2024
eebd625
made changes in test suite to test all gradient types for all layout …
sampath1117 Jul 31, 2024
256622d
added golden output for 3x3 kernel size
sampath1117 Jul 31, 2024
5668ebd
fixed pointer assignment w.r.t ifdef for AVX2 flag inside kernel
sampath1117 Jul 31, 2024
0d2d6d6
added golden output for kernelsize 5
sampath1117 Jul 31, 2024
1e66162
fixed the output issues with 7x7 kernel and added QA support
sampath1117 Jul 31, 2024
fb085a5
Merge branch 'develop' into sr/sobel_filter_host
sampath1117 Jul 31, 2024
78604b4
modified the docs as per the latest changes in kernel
sampath1117 Aug 5, 2024
a679b56
fixed variable names in helper functions added in test suite
sampath1117 Aug 5, 2024
0cac8e6
added validation checks for sobelType and kernelSize
sampath1117 Aug 5, 2024
4543311
reverted unwanted changes added in rpp_cpu_simd.hpp
sampath1117 Aug 5, 2024
0320eae
added blank line at EOF for sobel_filter.hpp
sampath1117 Aug 5, 2024
7bf0a22
Merge branch 'develop' into sr/sobel_filter_host
sampath1117 Aug 30, 2024
b0101cd
added the required version changes
sampath1117 Aug 30, 2024
7749560
added validation for dst channels 3
sampath1117 Aug 30, 2024
ef9bb75
Add 3x3 initial HIP implementation
HazarathKumarM Sep 20, 2024
9a63e78
Adds 5x5 HIP implementation for sobel filter
HazarathKumarM Sep 20, 2024
d98da4c
Adds 7x7 HIP implementation for sobel filter
HazarathKumarM Sep 20, 2024
036317c
Add version changes, update maps in common.py and code cleanup
HazarathKumarM Sep 24, 2024
3cba12f
Minor code cleanup
HazarathKumarM Sep 24, 2024
9 changes: 8 additions & 1 deletion CHANGELOG.md
@@ -1,7 +1,14 @@
# Changelog for RPP

Full documentation for RPP is available at [https://rocm.docs.amd.com/projects/rpp/en/latest](https://rocm.docs.amd.com/projects/rpp/en/latest)


## RPP 1.10.1 (unreleased)

### Changes

* RPP Tensor Sobel Filter support on HIP


## RPP 1.9.1 for ROCm 6.3.0

### Changes
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -29,7 +29,7 @@ endif()
set(CMAKE_CXX_STANDARD 17)

# RPP Version
-set(VERSION "1.9.1")
+set(VERSION "1.10.1")

# Set Project Version and Language
project(rpp VERSION ${VERSION} LANGUAGES CXX)
2 changes: 1 addition & 1 deletion include/rpp_version.h
@@ -39,7 +39,7 @@ extern "C" {
#endif
// NOTE: IMPORTANT: Match the version with CMakelists.txt version
#define RPP_VERSION_MAJOR 1
-#define RPP_VERSION_MINOR 9
+#define RPP_VERSION_MINOR 10
#define RPP_VERSION_PATCH 1
#ifdef __cplusplus
}
44 changes: 44 additions & 0 deletions include/rppt_tensor_filter_augmentations.h
@@ -93,6 +93,50 @@ RppStatus rppt_box_filter_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t
RppStatus rppt_gaussian_filter_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32f *stdDevTensor, Rpp32u kernelSize, RpptROIPtr roiTensorPtrSrc, RpptRoiType roiType, rppHandle_t rppHandle);
#endif // GPU_SUPPORT

/*! \brief Sobel Filter augmentation on HOST backend for a NHWC/NCHW layout tensor
* \details The sobel filter augmentation runs for a batch of RGB(3 channel) / greyscale(1 channel) images with NHWC/NCHW tensor layout.<br>
* - srcPtr depth ranges - Rpp8u (0 to 255), Rpp16f (0 to 1), Rpp32f (0 to 1), Rpp8s (-128 to 127).
* - dstPtr depth ranges - Will be same depth as srcPtr.
* \image html img150x150.png Sample Input
* \image html filter_augmentations_sobel_filter_kSize3_img150x150.png Sample 3x3 Output
* \param [in] srcPtr source tensor in HOST memory
* \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3)
* \param [out] dstPtr destination tensor in HOST memory
* \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1)
* \param [in] sobelType sobel type for sobel filter (a single Rpp32u number with sobelType = 0 (X Gradient) / 1 (Y Gradient) / 2 (XY Gradient) that applies to all images in the batch)
* \param [in] kernelSize kernel size for sobel filter (a single Rpp32u odd number with kernelSize = 3/5/7 that applies to all images in the batch)
* \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y))
* \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB)
* \param [in] rppHandle RPP HOST handle created with <tt>\ref rppCreateWithBatchSize()</tt>
* \return A <tt> \ref RppStatus</tt> enumeration.
* \retval RPP_SUCCESS Successful completion.
* \retval RPP_ERROR* Unsuccessful completion.
*/
RppStatus rppt_sobel_filter_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32u sobelType, Rpp32u kernelSize, RpptROIPtr roiTensorPtrSrc, RpptRoiType roiType, rppHandle_t rppHandle);

#ifdef GPU_SUPPORT
/*! \brief Sobel Filter augmentation on HIP backend for a NHWC/NCHW layout tensor
* \details The sobel filter augmentation runs for a batch of RGB(3 channel) / greyscale(1 channel) images with NHWC/NCHW tensor layout.<br>
* - srcPtr depth ranges - Rpp8u (0 to 255), Rpp16f (0 to 1), Rpp32f (0 to 1), Rpp8s (-128 to 127).
* - dstPtr depth ranges - Will be same depth as srcPtr.
* \image html img150x150.png Sample Input
* \image html filter_augmentations_sobel_filter_kSize3_img150x150.png Sample 3x3 Output
* \param [in] srcPtr source tensor in HIP memory
* \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW, c = 1/3)
* \param [out] dstPtr destination tensor in HIP memory
* \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1)
* \param [in] sobelType sobel type for sobel filter (a single Rpp32u number with sobelType = 0 (X Gradient) / 1 (Y Gradient) / 2 (XY Gradient) that applies to all images in the batch)
* \param [in] kernelSize kernel size for sobel filter (a single Rpp32u odd number with kernelSize = 3/5/7 that applies to all images in the batch)
* \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y))
* \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB)
* \param [in] rppHandle RPP HIP handle created with <tt>\ref rppCreateWithStreamAndBatchSize()</tt>
* \return A <tt> \ref RppStatus</tt> enumeration.
* \retval RPP_SUCCESS Successful completion.
* \retval RPP_ERROR* Unsuccessful completion.
*/
RppStatus rppt_sobel_filter_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32u sobelType, Rpp32u kernelSize, RpptROIPtr roiTensorPtrSrc, RpptRoiType roiType, rppHandle_t rppHandle);
#endif // GPU_SUPPORT

/*! @}
*/
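
For readers unfamiliar with the three `sobelType` values documented above, the following is a minimal reference sketch of a 3x3 Sobel filter on a single-channel U8 image. It is not the RPP implementation: the function name `sobel3x3`, the `SobelType` enum, the clamp-to-edge border handling, and the use of gradient magnitude for the XY variant are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical enum mirroring the documented sobelType values (0 / 1 / 2).
enum SobelType : uint32_t { kGradientX = 0, kGradientY = 1, kGradientXY = 2 };

// Applies a 3x3 Sobel filter to a row-major, single-channel U8 image.
// Assumptions of this sketch (not taken from RPP): borders are handled by
// clamping to the nearest edge pixel, and the XY variant is the gradient
// magnitude sqrt(gx^2 + gy^2). Results are rounded and saturated to 0..255.
std::vector<uint8_t> sobel3x3(const std::vector<uint8_t>& src,
                              int width, int height, SobelType type)
{
    static const int kx[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};
    static const int ky[3][3] = {{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}};

    std::vector<uint8_t> dst(src.size());
    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            int gx = 0, gy = 0;
            for (int j = -1; j <= 1; ++j)
            {
                for (int i = -1; i <= 1; ++i)
                {
                    // Clamp neighbour coordinates so border pixels reuse edge values.
                    int sy = std::clamp(y + j, 0, height - 1);
                    int sx = std::clamp(x + i, 0, width - 1);
                    int pixel = src[sy * width + sx];
                    gx += kx[j + 1][i + 1] * pixel;
                    gy += ky[j + 1][i + 1] * pixel;
                }
            }

            float value;
            if (type == kGradientX)
                value = static_cast<float>(gx);
            else if (type == kGradientY)
                value = static_cast<float>(gy);
            else
                value = std::sqrt(static_cast<float>(gx * gx + gy * gy));

            // Round to nearest, then saturate into the U8 range.
            float rounded = std::nearbyint(value);
            dst[y * width + x] = static_cast<uint8_t>(std::clamp(rounded, 0.0f, 255.0f));
        }
    }
    return dst;
}
```

On an image whose rows are all `{0, 0, 10, 10}`, the X gradient at an interior pixel next to the step is 40, the Y gradient is 0, and the XY result under the magnitude assumption is 40. A sketch like this can serve as a small cross-check when comparing HOST and HIP outputs, provided the border and XY conventions are first aligned with the actual kernels.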

4 changes: 2 additions & 2 deletions src/include/cpu/rpp_cpu_common.hpp
@@ -512,12 +512,12 @@ inline int power_function(int a, int b)

inline void saturate_pixel(Rpp32f pixel, Rpp8u* dst)
{
-    *dst = RPPPIXELCHECK(pixel);
+    *dst = RPPPIXELCHECK(std::nearbyintf(pixel));
}

inline void saturate_pixel(Rpp32f pixel, Rpp8s* dst)
{
-    *dst = (Rpp8s)RPPPIXELCHECKI8(pixel - 128);
+    *dst = (Rpp8s)RPPPIXELCHECKI8(std::nearbyintf(pixel) - 128);
}

inline void saturate_pixel(Rpp32f pixel, Rpp32f* dst)
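
The `saturate_pixel` change in `rpp_cpu_common.hpp` rounds the float value with `std::nearbyintf` before clamping, where the previous code let the float-to-integer conversion truncate the fraction. The sketch below illustrates the difference for the U8 case. `saturateTruncate` and `saturateRound` are hypothetical stand-ins written for this note, not RPP functions, and the clamp is written out by hand rather than using the `RPPPIXELCHECK` macro. It assumes the default floating-point rounding mode (round to nearest, ties to even).

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: clamp to [0, 255], then truncate on conversion.
// Models the behaviour before the change.
inline uint8_t saturateTruncate(float pixel)
{
    float clamped = pixel < 0.0f ? 0.0f : (pixel > 255.0f ? 255.0f : pixel);
    return static_cast<uint8_t>(clamped);  // drops the fractional part
}

// Hypothetical helper: round with std::nearbyintf, then clamp to [0, 255].
// Models the behaviour after the change.
inline uint8_t saturateRound(float pixel)
{
    float rounded = std::nearbyintf(pixel);  // uses the current rounding mode
    float clamped = rounded < 0.0f ? 0.0f : (rounded > 255.0f ? 255.0f : rounded);
    return static_cast<uint8_t>(clamped);
}
```

For example, `saturateTruncate(127.6f)` yields 127 while `saturateRound(127.6f)` yields 128. This kind of one-off difference is what typically shows up when comparing a truncating path against a rounding path in golden-output tests.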