Discover Your Next Favorite Audiobook with AI-Powered Recommendations
An intelligent book recommendation system that leverages advanced machine learning algorithms to help users discover their next favorite audiobook from a curated collection of 4,000+ titles.
- Features
- Demo Screenshots
- Technology Stack
- Dataset Overview
- Quick Start
- Usage Guide
- Model Performance
- Project Architecture
- Contributing
- License
- Acknowledgments
- Content-Based Filtering: Uses TF-IDF vectorization and cosine similarity
- Cluster-Based Recommendations: Groups similar books using K-means clustering
- Hybrid Approach: Combines multiple algorithms for optimal results
- 95%+ Success Rate in generating relevant recommendations
- Intelligent Search: Find books by title, author, or keywords
- Genre Explorer: Browse 100+ genres and categories
- Personalized Recommendations: Custom suggestions based on user preferences
- Hidden Gems Discovery: Uncover highly-rated books with fewer reviews
- Data Insights Dashboard: Explore trends and patterns
- Visual Analytics: Interactive charts and graphs
- Performance Metrics: Real-time recommendation quality tracking
- Export Functionality: Download recommendations as CSV
- Responsive Design: Works seamlessly on desktop and mobile
- Gradient UI: Modern, professional interface with smooth animations
- Horizontal Layouts: Optimized book card displays
- Real-time Updates: Instant recommendation generation
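The Hidden Gems Discovery feature listed above can be pictured as a simple filter: keep books with a high rating but a low review count. The snippet below is a minimal sketch, not the app's actual code. The column names (`title`, `rating`, `review_count`), the `hidden_gems` helper, and the thresholds are all assumptions made for illustration.

```python
import pandas as pd

def hidden_gems(df, min_rating=4.5, max_review_quantile=0.25):
    """Return highly rated books whose review count falls at or below a quantile cutoff."""
    # Assumed columns: "rating" and "review_count" (hypothetical names).
    review_cutoff = df["review_count"].quantile(max_review_quantile)
    mask = (df["rating"] >= min_rating) & (df["review_count"] <= review_cutoff)
    return df[mask].sort_values("rating", ascending=False)

# Tiny made-up catalog, used only to exercise the function.
books = pd.DataFrame({
    "title": ["Book A", "Book B", "Book C", "Book D"],
    "rating": [4.8, 4.6, 4.9, 3.5],
    "review_count": [12000, 40, 85, 20],
})
gems = hidden_gems(books, min_rating=4.5, max_review_quantile=0.5)
print(gems["title"].tolist())
```

A quantile cutoff adapts to the catalog's own review distribution, whereas a fixed review count would need retuning as the dataset grows.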
- scikit-learn - Clustering, similarity calculations, and ML algorithms
- NumPy - Numerical computing and array operations
- pandas - Data manipulation and analysis
- NLTK - Natural language processing and text analysis
- Streamlit - Interactive web application framework
- Plotly - Interactive charts and data visualizations
- Seaborn & Matplotlib - Statistical visualizations
- TF-IDF Vectorization - Text feature extraction
- Cosine Similarity - Content similarity calculations
- K-means Clustering - Book grouping and categorization
- Streamlit Cloud - Application hosting and deployment
- 4,002 Unique Books - Diverse collection across all genres
- 2,694 Authors - From bestselling to emerging writers
- 100+ Genres - Fiction, non-fiction, self-help, business, and more
- 4.46 Average Rating - High-quality, well-reviewed content
- Ratings & Reviews - User ratings and review counts
- Pricing Information - Current pricing data
- Genre Classifications - Multiple genre tags per book
- Detailed Descriptions - Full book summaries and synopses
- Author Information - Complete author details
- Listening Time - Duration for audiobooks
- Cleaned & Processed - Removed duplicates and handled missing values
- Standardized Formats - Consistent data types and structures
- Validated Entries - Quality checks for accuracy
- Regular Updates - Maintained and refreshed dataset
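As a rough illustration of the cleaning steps above (removing duplicates, handling missing values, standardizing types), here is a small pandas sketch. It is not the project's notebook code: the column names (`title`, `author`, `rating`), the `clean_books` helper, and the choice to fill missing ratings with the median are assumptions.

```python
import pandas as pd

def clean_books(df: pd.DataFrame) -> pd.DataFrame:
    """Deduplicate, coerce types, and handle missing values (illustrative only)."""
    out = df.copy()
    # Strip stray whitespace so "Dune " and "Dune" count as the same title.
    out["title"] = out["title"].str.strip()
    out["author"] = out["author"].str.strip()
    out = out.drop_duplicates(subset=["title", "author"], keep="first")
    # Coerce ratings to numbers; unparseable or missing entries become NaN.
    out["rating"] = pd.to_numeric(out["rating"], errors="coerce")
    # A row without a title cannot be recommended, so drop it.
    out = out.dropna(subset=["title"])
    # Assumption: fill the remaining missing ratings with the median rating.
    out["rating"] = out["rating"].fillna(out["rating"].median())
    return out.reset_index(drop=True)

# Made-up rows containing a duplicate, a missing rating, and a missing title.
raw = pd.DataFrame({
    "title": ["Dune", "Dune ", "Atomic Habits", None],
    "author": ["Frank Herbert", "Frank Herbert", "James Clear", "Unknown"],
    "rating": ["4.7", "4.7", None, "3.9"],
})
cleaned = clean_books(raw)
print(cleaned)
```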
# 1. Clone the repository
git clone https://github.com/your-username/audible-insights.git
cd audible-insights
# 2. Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Run the application
streamlit run streamlit_app.py
# 5. Open browser to http://localhost:8501
- **Quick Statistics**: Overview of total books, ratings, and genres
- **Featured Recommendations**: Curated selection of top-rated books
- **Instant Search**: Fast search across titles and authors
---
- Select a Book: Choose from 4,000+ titles using the dropdown
- Choose Method: Pick from Hybrid, Content-Based, or Cluster-Based algorithms
- Get Recommendations: Receive 1 to 20 personalized suggestions
- Explore Results: View detailed information and similarity scores
- Select Genres: Choose from 100+ available genres
- Set Filters: Adjust minimum rating and result count
- Discover Books: Find books matching your genre preferences
- View Statistics: Explore genre popularity charts
- Set Preferences: Select favorite genres, rating threshold, and price range
- Choose Length: Specify preferred audiobook duration
- Get Personalized Results: Receive custom recommendations with match scores
- Refine Choices: Adjust preferences for better matches
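One way to picture the preference flow above: filter the catalog by rating threshold and price range, then rank what remains by a match score. The sketch below is hypothetical. The `Book` fields, the `match_score` formula, and its 70/30 weighting of genre overlap against duration fit are assumptions, not the scoring the app actually uses.

```python
from dataclasses import dataclass

@dataclass
class Book:
    title: str
    genres: set
    rating: float
    price: float
    hours: float

def match_score(book, fav_genres, max_hours):
    """Score in [0, 1]: genre overlap (weight 0.7) plus duration fit (weight 0.3)."""
    genre_overlap = len(book.genres & fav_genres) / len(fav_genres) if fav_genres else 0.0
    duration_fit = 1.0 if book.hours <= max_hours else 0.0
    return 0.7 * genre_overlap + 0.3 * duration_fit

def recommend(books, fav_genres, min_rating, price_range, max_hours, top_n=5):
    """Apply the hard filters first, then sort the survivors by match score."""
    low, high = price_range
    candidates = [b for b in books if b.rating >= min_rating and low <= b.price <= high]
    scored = [(b.title, round(match_score(b, fav_genres, max_hours), 2)) for b in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

# Made-up catalog entries for demonstration.
catalog = [
    Book("Book A", {"Sci-Fi", "Adventure"}, 4.6, 14.99, 21.0),
    Book("Book B", {"Self-Help"}, 4.8, 9.99, 5.5),
    Book("Book C", {"Sci-Fi"}, 3.9, 12.00, 8.0),
]
results = recommend(catalog, {"Sci-Fi", "Adventure"}, 4.0, (5.0, 20.0), 10.0)
print(results)  # [('Book A', 0.7), ('Book B', 0.3)]
```

Treating rating and price as hard filters and genre and duration as soft scores means a book is never shown outside the user's stated budget, while a long book in a favorite genre can still surface.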
- Rating Distribution: Analyze book rating patterns
- Price vs Rating: Explore pricing and quality relationships
- Author Analytics: Discover most prolific authors
- Cluster Analysis: Understand book groupings and themes
- Success Rate: 95%+ - Reliable recommendation generation
- Average Rating: 4.2+ - High-quality suggested books
- Diversity Score: 0.75+ - Balanced genre and author representation
- Response Time: Sub-second - Fast recommendation processing
Method | Success Rate | Diversity | Novelty | Quality | Use Case |
---|---|---|---|---|---|
Hybrid | 95% | 0.75 | 0.65 | 4.3 | Best Overall |
Content-Based | 92% | 0.68 | 0.58 | 4.2 | Similar Books |
Cluster-Based | 88% | 0.82 | 0.72 | 4.1 | Genre Discovery |
- Precision@10: 0.78 - Relevant recommendations in top 10
- Recall@10: 0.65 - Coverage of relevant items
- Diversity Score: 0.75 - Variety across genres and authors
- Novelty Score: 0.65 - Discovery of lesser-known gems
- Serendipity Score: 0.42 - Unexpected but relevant finds
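For readers unfamiliar with the first two metrics, the sketch below shows the standard definitions of Precision@k and Recall@k. It is a generic illustration with made-up book IDs, not the evaluation code from the notebooks. How "relevant" items were determined for this project is not stated here, so the `relevant` set is an assumption.

```python
def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommendations that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k

def recall_at_k(recommended, relevant, k=10):
    """Fraction of all relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)

# Hypothetical IDs: 5 recommendations, 4 truly relevant books.
recommended = ["b1", "b2", "b3", "b4", "b5"]
relevant = {"b2", "b5", "b7", "b9"}

print(precision_at_k(recommended, relevant, k=5))  # 0.4
print(recall_at_k(recommended, relevant, k=5))     # 0.5
```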
audible-insights/
├── streamlit_app.py              # Main Streamlit application
├── recommendation_classes.py     # ML recommendation algorithms
├── app_utils.py                  # Utility functions
├── requirements.txt              # Python dependencies
├── README.md                     # Project documentation
├── .gitignore                    # Git ignore rules
├── data/                         # Dataset files
│   └── streamlit_dataset.csv
├── models/                       # Trained models
│   ├── combined_tfidf_matrix.npy
│   ├── content_recommender.pkl
│   ├── cluster_recommender.pkl
│   └── hybrid_recommender.pkl
└── notebooks/                    # Development notebooks
    ├── 01_data_preparation.ipynb
    ├── 02_data_cleaning.ipynb
    ├── 03_exploratory_data_analysis.ipynb
    ├── 04_nlp_and_clustering.ipynb
    ├── 05_recommendation_systems.ipynb
    ├── 06_model_evaluation.ipynb
    ├── 07_streamlit_app.ipynb
    └── 08_summary_and_conclusion.ipynb
- Data Collection: Gathered Audible audiobook metadata
- Data Cleaning: Processed and standardized 4,000+ book records
- Exploratory Analysis: Analyzed patterns, ratings, and genres
- Model Development: Built and trained recommendation algorithms
- Evaluation: Tested models with multiple performance metrics
- Deployment: Created production-ready web application
- Feature Engineering: TF-IDF vectorization of book descriptions
- Similarity Calculation: Cosine similarity for content-based filtering
- Clustering: K-Means algorithm for grouping similar books
- Hybrid Method: Weighted combination of multiple approaches
- Evaluation: Comprehensive testing with precision, recall, and diversity metrics
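The pipeline steps above can be sketched end to end with scikit-learn. This is a minimal illustration on four made-up descriptions, not the code in `recommendation_classes.py`. The `hybrid_recommend` helper, the 0.7 content weight, and the use of a same-cluster indicator as the cluster signal are assumptions about how a weighted combination might look.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Made-up descriptions standing in for the real dataset.
descriptions = [
    "A space opera about interstellar war and desert planets",
    "Galactic empire politics and a war across the stars",
    "Build better habits with small daily changes",
    "A practical guide to productivity and daily routines",
]

# Feature engineering: TF-IDF vectors of the descriptions.
tfidf = TfidfVectorizer(stop_words="english")
matrix = tfidf.fit_transform(descriptions)

# Similarity calculation: pairwise cosine similarity between books.
content_sim = cosine_similarity(matrix)

# Clustering: group books with K-Means on the same TF-IDF features.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = kmeans.fit_predict(matrix)
same_cluster = (labels[:, None] == labels[None, :]).astype(float)

def hybrid_recommend(index, top_n=2, content_weight=0.7):
    """Rank other books by a weighted mix of content similarity and cluster membership."""
    scores = content_weight * content_sim[index] + (1 - content_weight) * same_cluster[index]
    scores[index] = -np.inf  # never recommend the query book itself
    return np.argsort(scores)[::-1][:top_n].tolist()

recs = hybrid_recommend(0)
print(recs)
```

Setting `content_weight=1.0` reduces this to pure content-based ranking, which is a convenient way to compare the two approaches on the same query.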
This project is licensed under the MIT License; see the LICENSE file for details.
- Audible: Audiobook metadata and catalog information
- Open Source Community: Libraries and frameworks used
- Netflix: Recommendation system architecture
- Amazon: E-commerce recommendation patterns
- Spotify: Music discovery algorithms
- Streamlit: Web application framework
- scikit-learn: Machine learning and clustering
- Plotly: Interactive data visualization
- Data Science Community: For tutorials and best practices
- Open Source Contributors: For maintaining excellent libraries
- Beta Testers: For feedback and improvement suggestions