Add Noise Functionality to DataFrame Columns in Utils Package #72

Open
wants to merge 4 commits into
base: master

ThanosTsiamis
Contributor

Description:

This PR introduces a new function, add_noise_to_df_column, designed to add noise to a specified column in a DataFrame. The function addresses issue #63, where there was a request to incorporate noise addition functionality into a specific DataFrame column.

Changes Made:

  • Added a new function named add_noise_to_df_column in the utils package.
  • The function uses the numpy and pandas libraries for numerical operations and DataFrame manipulation.
  • Noise is added according to the type of data in the column:
      • For numerical columns (int or float), Gaussian noise with a mean of 0 and a standard deviation equal to the specified noise level is added.
      • For string columns (object), the characters of some strings are randomly permuted, with a probability determined by the noise level.

This addition gives the utils package a flexible way to introduce noise into DataFrame columns, which supports various data processing and analysis tasks.
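
The diff itself is not reproduced in this conversation, so the following is only a minimal sketch of the behavior described above, not the code in this PR. The signature (df, column, noise_level, seed), returning a copy rather than modifying in place, and applying the probability per string are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def add_noise_to_df_column(df, column, noise_level, seed=None):
    """Illustrative sketch: return a copy of df with noise added to one column.

    The parameter names and the copy semantics are assumptions, not taken
    from the PR.
    """
    rng = np.random.default_rng(seed)
    noisy = df.copy()
    series = noisy[column]

    if pd.api.types.is_numeric_dtype(series):
        # Numerical column: add Gaussian noise with mean 0 and
        # standard deviation equal to noise_level.
        noise = rng.normal(loc=0.0, scale=noise_level, size=len(series))
        noisy[column] = series.astype(float) + noise
    elif series.dtype == object or pd.api.types.is_string_dtype(series):
        def permute(value):
            # With probability noise_level, shuffle the characters of a string.
            # Non-string values (e.g. missing values) are left untouched.
            if isinstance(value, str) and rng.random() < noise_level:
                chars = list(value)
                rng.shuffle(chars)
                return "".join(chars)
            return value

        noisy[column] = series.map(permute)
    else:
        raise TypeError(f"Unsupported dtype for column {column!r}: {series.dtype}")

    return noisy


# Example usage with made-up data.
example = pd.DataFrame({"price": [10, 20, 30], "name": ["alpha", "beta", "gamma"]})
noisy_prices = add_noise_to_df_column(example, "price", noise_level=0.5, seed=0)
noisy_names = add_noise_to_df_column(example, "name", noise_level=1.0, seed=0)
```

In this sketch a noise level of 0 leaves either kind of column unchanged, and a noise level of 1 shuffles every string. Passing a seed makes the result reproducible, which tends to simplify unit tests.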

The commented code at the bottom of the file is left on purpose to allow testing of the function's behavior.

Please review at your earliest convenience.


codecov bot commented May 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.23%. Comparing base (705339c) to head (8d209f1).
Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #72      +/-   ##
==========================================
+ Coverage   87.10%   87.23%   +0.12%     
==========================================
  Files          40       40              
  Lines        1760     1770      +10     
==========================================
+ Hits         1533     1544      +11     
+ Misses        227      226       -1     


@kPsarakis kPsarakis requested a review from chrisk21 May 10, 2024 15:14
@kPsarakis kPsarakis added the enhancement New feature or request label May 10, 2024

2 participants