A comprehensive Python toolkit for exploratory data analysis with advanced visualization and statistical analysis capabilities.
InsightfulPy simplifies the process of exploring and understanding your data through intuitive functions for statistical analysis, data quality assessment, and professional visualization. Whether you're a data scientist, analyst, or researcher, this package provides the tools you need for thorough data exploration.
Document | Description |
---|---|
Overview | Package introduction, features, and architecture |
Installation Guide | Installation instructions and setup verification |
Quick Start | Basic workflow and essential functions tutorial |
User Guide | Complete workflow tutorial with step-by-step examples |
API Reference | Detailed function documentation and parameters |
Contributing | Guidelines for contributing to the project |
- Quick Examples - Additional usage scenarios
- Output Results Gallery - Sample output results
- Diagram Gallery - Collection of diagrams
- Statistical Analysis: Comprehensive statistics, distribution analysis, and normality testing
- Data Quality Assessment: Missing value detection, outlier identification, and data type validation
- Professional Visualization: Box plots, distribution plots, correlation analysis, and categorical charts
- Dataset Comparison: Multi-dataset analysis and column linking capabilities
- Batch Processing: Handle large datasets with intelligent batching for visualizations
- Easy Integration: Works seamlessly with pandas DataFrames
```bash
pip install insightfulpy
```
```python
import pandas as pd
import insightfulpy as ipy

# Load your data
df = pd.read_csv('your_data.csv')

# Basic data exploration
ipy.columns_info('My Dataset', df)
ipy.num_summary(df)
ipy.cat_summary(df)

# Data quality checks
ipy.missing_inf_values(df)
ipy.detect_outliers(df)

# Visualization
ipy.show_missing(df)
ipy.plot_boxplots(df)
ipy.kde_batches(df, batch_num=1)
```
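To make the data quality step in the workflow above more concrete, here is a minimal sketch of counting missing and infinite values per column with plain pandas and NumPy. It is not InsightfulPy's implementation, and the output of `missing_inf_values` may be organized differently; the helper name `count_missing_and_inf` and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd


def count_missing_and_inf(df: pd.DataFrame) -> pd.DataFrame:
    """Count missing and +/-inf values per column (hypothetical helper)."""
    # isna() flags NaN and None in any column type.
    missing = df.isna().sum()
    # isin() compares element-wise, so non-numeric columns simply yield 0.
    infinite = df.isin([np.inf, -np.inf]).sum()
    return pd.DataFrame({"missing": missing, "infinite": infinite})


# Illustrative data only: one numeric and one text column.
sample = pd.DataFrame(
    {"a": [1.0, np.nan, np.inf, 4.0], "b": ["x", None, "y", "z"]}
)
report = count_missing_and_inf(sample)
print(report)
```

Reporting the two counts side by side is a design choice made for this sketch: infinite values are not treated as missing by pandas, so they are easy to overlook when only `isna()` is checked.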
- `num_summary(df)` - Statistical summary of numerical columns
- `cat_summary(df)` - Analysis of categorical columns
- `columns_info(title, df)` - Dataset structure overview
- `missing_inf_values(df)` - Missing and infinite value detection
- `detect_outliers(df)` - Outlier identification using the IQR method
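The IQR method mentioned in the list above can be sketched in a few lines of pandas. This is a minimal illustration of the conventional rule, assuming the common multiplier of 1.5; the thresholds that `detect_outliers` actually applies are not stated here, and `iqr_outlier_mask` is a hypothetical name.

```python
import pandas as pd


def iqr_outlier_mask(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return True where a value lies outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper; k=1.5 is the conventional choice, assumed here.
    """
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (series < lower) | (series > upper)


# Illustrative data: one value is far from the rest.
values = pd.Series([10, 12, 11, 13, 12, 11, 95])
mask = iqr_outlier_mask(values)
print(values[mask])
```

Because the fences are built from quartiles rather than the mean and standard deviation, a single extreme value does not widen them much, which is why the rule is a common default for skewed data.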
- `show_missing(df)` - Missing data pattern visualization
- `plot_boxplots(df)` - Box plots for all numerical columns
- `kde_batches(df)` - Distribution plots organized in batches
- `cat_bar_batches(df)` - Bar charts for categorical data
- `cat_pie_chart_batches(df)` - Pie charts for categorical analysis
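The batch-oriented plotting functions above suggest that columns are split into groups so that each figure stays readable. How InsightfulPy sizes and numbers its batches is not documented in this README, so the following is only a sketch of the general idea; `make_batches`, the batch size of 3, and the 1-based numbering are assumptions.

```python
from typing import List, Sequence


def make_batches(columns: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split column names into consecutive groups (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        list(columns[start:start + batch_size])
        for start in range(0, len(columns), batch_size)
    ]


# Illustrative column names only.
columns = [f"feature_{i}" for i in range(1, 8)]
batches = make_batches(columns, batch_size=3)

# Number batches from 1 (an assumption made for this sketch).
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

Plotting one batch at a time keeps each figure to a fixed number of subplots, which matters for wide datasets where a single grid of every column would be unreadable.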
- `grouped_summary(df, groupby)` - Statistical analysis by groups
- `compare_df_columns()` - Multi-dataset comparison
- `interconnected_outliers()` - Cross-column outlier analysis
- `num_vs_num_scatterplot_pair_batch()` - Numerical correlation plots
- `cat_vs_cat_pair_batch()` - Categorical relationship heatmaps
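As a point of reference for the group-wise analysis listed above, this is what a per-group summary looks like in plain pandas. It is a sketch of the underlying idea only; the statistics that `grouped_summary` reports are not listed in this README, and the column names and values below are made up.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame(
    {
        "region": ["north", "north", "south", "south", "south"],
        "sales": [100.0, 140.0, 90.0, 110.0, 130.0],
    }
)

# One row per group, one column per statistic.
summary = df.groupby("region")["sales"].agg(
    ["count", "mean", "median", "min", "max"]
)
print(summary)
```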
- `calc_stats(series)` - Comprehensive statistical calculations
- `calculate_skewness_kurtosis(df)` - Distribution shape analysis
- `iqr_trimmed_mean(data)` - Robust mean calculation
- `mad(data)` - Mean absolute deviation
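The two robust statistics in the list above can be illustrated with NumPy. The definitions used here are assumptions: the trimmed mean keeps only values between the first and third quartiles, and the mean absolute deviation is the average distance from the mean. InsightfulPy's own `iqr_trimmed_mean` and `mad` may define them differently, so the function names below carry a `_sketch` suffix to mark them as hypothetical.

```python
import numpy as np


def iqr_trimmed_mean_sketch(data) -> float:
    """Mean of the values lying between Q1 and Q3, inclusive (assumed definition)."""
    values = np.asarray(data, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    kept = values[(values >= q1) & (values <= q3)]
    return float(kept.mean())


def mean_abs_deviation_sketch(data) -> float:
    """Average absolute distance from the arithmetic mean (assumed definition)."""
    values = np.asarray(data, dtype=float)
    return float(np.mean(np.abs(values - values.mean())))


# Illustrative data with one extreme value.
data = [2, 4, 4, 5, 6, 7, 100]
print("plain mean:  ", float(np.mean(data)))
print("trimmed mean:", iqr_trimmed_mean_sketch(data))
print("mean abs dev:", mean_abs_deviation_sketch(data))
```

The plain mean is pulled toward the extreme value, while the trimmed mean ignores everything outside the middle half of the data.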
InsightfulPy includes a built-in help system for easy reference:
```python
import insightfulpy as ipy

# Get help overview
ipy.help()

# List all functions
ipy.list_all()

# Quick start guide
ipy.quick_start()

# Usage examples
ipy.examples()
```
Python 3.8+ with pandas (≥1.3), numpy (≥1.20), matplotlib (≥3.3), seaborn (≥0.11), scipy (≥1.7), plus researchpy, tableone, missingno, and tabulate.
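To see which of these dependencies are present in an environment, the standard library's `importlib.metadata` can be queried. This is a small sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the script reports versions without comparing them against the minimums listed above.

```python
import sys
from importlib import metadata

# Distribution names taken from the requirements line above.
REQUIRED = [
    "pandas", "numpy", "matplotlib", "seaborn", "scipy",
    "researchpy", "tableone", "missingno", "tabulate",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print("Python", sys.version.split()[0])
for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'not installed'}")
```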
- For detailed documentation and examples, visit the GitHub repository.
- Contributions are welcome! Please read the contributing guidelines and submit pull requests to the GitHub repository.
- If you encounter any issues or have questions, please open an issue on the GitHub Issues page.
- This project is licensed under the MIT License - see the LICENSE file for details.
Package Information:
Version: 0.1.8 | Author: Dhanesh B. B. | License: MIT | Python: 3.8+
InsightfulPy makes data exploration intuitive and comprehensive.