Skip to content

priyanka7411/audible-insights-recommender

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

8 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ“š Audible Insights: Intelligent Book Recommendations

Streamlit App Python License: MIT Contributions Welcome

Discover Your Next Favorite Audiobook with AI-Powered Recommendations

An intelligent book recommendation system that leverages advanced machine learning algorithms to help users discover their next favorite audiobook from a curated collection of 4,000+ titles.



πŸ“‹ Table of Contents


✨ Features

πŸ€– Advanced AI Recommendation Engine

  • Content-Based Filtering: Uses TF-IDF vectorization and cosine similarity
  • Cluster-Based Recommendations: Groups similar books using K-means clustering
  • Hybrid Approach: Combines multiple algorithms for optimal results
  • 95%+ Success Rate in generating relevant recommendations

🎯 Smart Discovery Tools

  • πŸ” Intelligent Search: Find books by title, author, or keywords
  • 🏷️ Genre Explorer: Browse 100+ genres and categories
  • πŸ‘€ Personalized Recommendations: Custom suggestions based on user preferences
  • πŸ’Ž Hidden Gems Discovery: Uncover highly-rated books with fewer reviews

πŸ“Š Interactive Analytics

  • Data Insights Dashboard: Explore trends and patterns
  • Visual Analytics: Interactive charts and graphs
  • Performance Metrics: Real-time recommendation quality tracking
  • Export Functionality: Download recommendations as CSV

🎨 Modern User Experience

  • Responsive Design: Works seamlessly on desktop and mobile
  • Gradient UI: Modern, professional interface with smooth animations
  • Horizontal Layouts: Optimized book card displays
  • Real-time Updates: Instant recommendation generation

πŸ–ΌοΈ Demo Screenshots

Homepage - Discover Featured Books

Homepage

Book Search - Find Similar Recommendations

Book Search

Personal Recommendations - Customized for You

Personal Recommendations

Data Insights - Analytics Dashboard

Data Insights


πŸ› οΈ Technology Stack

Machine Learning & Data Science

  • scikit-learn - Clustering, similarity calculations, and ML algorithms
  • NumPy - Numerical computing and array operations
  • pandas - Data manipulation and analysis
  • NLTK - Natural language processing and text analysis

Web Framework & Visualization

  • Streamlit - Interactive web application framework
  • Plotly - Interactive charts and data visualizations
  • Seaborn & Matplotlib - Statistical visualizations

Text Processing & NLP

  • TF-IDF Vectorization - Text feature extraction
  • Cosine Similarity - Content similarity calculations
  • K-means Clustering - Book grouping and categorization

**Deployment **

  • Streamlit Cloud - Application hosting and deployment

πŸ“Š Dataset Overview

Comprehensive Audiobook Collection

  • πŸ“š 4,002 Unique Books - Diverse collection across all genres
  • ✍️ 2,694 Authors - From bestselling to emerging writers
  • 🏷️ 100+ Genres - Fiction, non-fiction, self-help, business, and more
  • ⭐ 4.46 Average Rating - High-quality, well-reviewed content

Rich Metadata

  • Ratings & Reviews - User ratings and review counts
  • Pricing Information - Current pricing data
  • Genre Classifications - Multiple genre tags per book
  • Detailed Descriptions - Full book summaries and synopses
  • Author Information - Complete author details
  • Listening Time - Duration for audiobooks

Data Quality Assurance

  • βœ… Cleaned & Processed - Removed duplicates and handled missing values
  • βœ… Standardized Formats - Consistent data types and structures
  • βœ… Validated Entries - Quality checks for accuracy
  • βœ… Regular Updates - Maintained and refreshed dataset

πŸƒβ€β™‚οΈ Quick Start

Option 1: Local Development

# 1. Clone the repository
git clone https://github.com/your-username/audible-insights.git
cd audible-insights

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Run the application
streamlit run streamlit_app.py

# 5. Open browser to http://localhost:8501

πŸ“– Usage Guide

🏠 Homepage Features

- **Quick Statistics**: Overview of total books, ratings, and genres  
- **Featured Recommendations**: Curated selection of top-rated books  
- **Instant Search**: Fast search across titles and authors  

---

πŸ” Book Search & Recommendations

  • Select a Book: Choose from 4,000+ titles using the dropdown
  • Choose Method: Pick from Hybrid, Content-Based, or Cluster-Based algorithms
  • Get Recommendations: Receive 1–20 personalized suggestions
  • Explore Results: View detailed information and similarity scores

🏷️ Genre Explorer

  • Select Genres: Choose from 100+ available genres
  • Set Filters: Adjust minimum rating and result count
  • Discover Books: Find books matching your genre preferences
  • View Statistics: Explore genre popularity charts

πŸ‘€ Personal Recommendations

  • Set Preferences: Select favorite genres, rating threshold, and price range
  • Choose Length: Specify preferred audiobook duration
  • Get Personalized Results: Receive custom recommendations with match scores
  • Refine Choices: Adjust preferences for better matches

πŸ“Š Data Insights

  • Rating Distribution: Analyze book rating patterns
  • Price vs Rating: Explore pricing and quality relationships
  • Author Analytics: Discover most prolific authors
  • Cluster Analysis: Understand book groupings and themes

πŸ“ˆ Model Performance

Recommendation Quality

  • Success Rate: 95%+ β€” Reliable recommendation generation
  • Average Rating: 4.2+ β€” High-quality suggested books
  • Diversity Score: 0.75+ β€” Balanced genre and author representation
  • Response Time: Sub-second β€” Fast recommendation processing

Algorithm Comparison

Method Success Rate Diversity Novelty Quality Use Case
Hybrid 95% 0.75 0.65 4.3 Best Overall
Content-Based 92% 0.68 0.58 4.2 Similar Books
Cluster-Based 88% 0.82 0.72 4.1 Genre Discovery

Evaluation Metrics

  • Precision@10: 0.78 β€” Relevant recommendations in top 10
  • Recall@10: 0.65 β€” Coverage of relevant items
  • Diversity Score: 0.75 β€” Variety across genres and authors
  • Novelty Score: 0.65 β€” Discovery of lesser-known gems
  • Serendipity Score: 0.42 β€” Unexpected but relevant finds

πŸ—οΈ Project Architecture

File Structure

audible-insights/ β”œβ”€β”€ πŸ“± streamlit_app.py # Main Streamlit application β”œβ”€β”€ πŸ€– recommendation_classes.py # ML recommendation algorithms β”œβ”€β”€ πŸ› οΈ app_utils.py # Utility functions β”œβ”€β”€ πŸ“‹ requirements.txt # Python dependencies β”œβ”€β”€ πŸ“– README.md # Project documentation β”œβ”€β”€ 🚫 .gitignore # Git ignore rules β”œβ”€β”€ πŸ“Š data/ # Dataset files β”‚ └── streamlit_dataset.csv β”œβ”€β”€ 🧠 models/ # Trained models β”‚ β”œβ”€β”€ combined_tfidf_matrix.npy β”‚ β”œβ”€β”€ content_recommender.pkl β”‚ β”œβ”€β”€ cluster_recommender.pkl β”‚ └── hybrid_recommender.pkl └── πŸ““ notebooks/ # Development notebooks β”œβ”€β”€ 01_data_preparation.ipynb β”œβ”€β”€ 02_data_cleaning.ipynb β”œβ”€β”€ 03_exploratory_data_analysis.ipynb β”œβ”€β”€ 04_nlp_and_clustering.ipynb β”œβ”€β”€ 05_recommendation_systems.ipynb β”œβ”€β”€ 06_model_evaluation.ipynb β”œβ”€β”€ 07_streamlit_app.ipynb └── 08_summary_and_conclusion.ipynb

Development Process

Data Science Pipeline

  • Data Collection: Gathered Audible audiobook metadata
  • Data Cleaning: Processed and standardized 4,000+ book records
  • Exploratory Analysis: Analyzed patterns, ratings, and genres
  • Model Development: Built and trained recommendation algorithms
  • Evaluation: Tested models with multiple performance metrics
  • Deployment: Created production-ready web application

Machine Learning Approach

  • Feature Engineering: TF-IDF vectorization of book descriptions
  • Similarity Calculation: Cosine similarity for content-based filtering
  • Clustering: K-Means algorithm for grouping similar books
  • Hybrid Method: Weighted combination of multiple approaches
  • Evaluation: Comprehensive testing with precision, recall, and diversity metrics

License

This project is licensed under the MIT License – see the LICENSE file for details.


Acknowledgments

Data Sources

  • Audible: Audiobook metadata and catalog information
  • Open Source Community: Libraries and frameworks used

Inspiration

  • Netflix: Recommendation system architecture
  • Amazon: E-commerce recommendation patterns
  • Spotify: Music discovery algorithms

Technologies

  • Streamlit: Web application framework
  • scikit-learn: Machine learning and clustering
  • Plotly: Interactive data visualization

Special Thanks

  • Data Science Community: For tutorials and best practices
  • Open Source Contributors: For maintaining excellent libraries
  • Beta Testers: For feedback and improvement suggestions

About

An intelligent book recommendation system using NLP, clustering, and hybrid recommender techniques on Audible audiobook data.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published