🎬 Cinema Scraper

🚀 Lightning-fast movie data scraper for Thailand's major cinema chains

Extract real-time movie listings from Major Cineplex and SF CinemaCity with ease

✨ Features

🎯 Multi-Cinema Support: Scrapes from Major Cineplex and SF CinemaCity
⚡ High Performance: Built with Bun.js for blazing-fast execution
🤖 Smart Scraping: Uses Puppeteer with randomized user agents
📊 Structured Data: Outputs clean, standardized JSON format
🔄 Real-time Updates: Gets current and upcoming movie listings
🐳 Docker Ready: Containerized for easy deployment
📤 API Integration: Built-in support for data uploading to external APIs
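
The randomized user agent idea from the feature list can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `USER_AGENTS` pool and the `pickUserAgent` helper are hypothetical names, and the strings are only examples.

```javascript
// Hypothetical pool of user-agent strings; the project's real list may differ.
const USER_AGENTS = [
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15',
  'Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0',
]

// Pick one entry uniformly at random.
// `random` is injectable so the selection can be made deterministic in tests.
function pickUserAgent(agents = USER_AGENTS, random = Math.random) {
  if (agents.length === 0) {
    throw new Error('user-agent pool is empty')
  }
  return agents[Math.floor(random() * agents.length)]
}
```

With Puppeteer, the chosen string would typically be applied before navigation, for example `await page.setUserAgent(pickUserAgent())`.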

🎬 Supported Cinemas

Cinema Chain	Status	Movie Count
🏢 Major Cineplex	✅ Active	2,000+
🎪 SF CinemaCity	✅ Active	1,500+

🚀 Quick Start

Prerequisites

Bun.js runtime
Node.js 18+ (if using npm/yarn)

Installation

# Clone the repository
git clone https://github.com/dvgamerr/cinema-scraper.git
cd cinema-scraper

# Install dependencies
bun install

# Run the scraper
bun dev

📁 Output Structure

The scraper generates JSON files in the ./output directory:

output/
├── results.json          # 📋 Combined standardized data
├── major-cineplex.json   # 🏢 Raw Major Cineplex data
└── sf-cinemacity.json    # 🎪 Raw SF CinemaCity data

📄 Sample Output Format

{
  "name": "movie-slug",
  "name_en": "Movie Title in English",
  "name_th": "ชื่อหนังภาษาไทย",
  "display": "Display Name",
  "release": "2025-06-06T17:00:00.000Z",
  "genre": "Action",
  "time": 120,
  "theater": {
    "major": {
      "cover": "https://cdn.majorcineplex.com/...",
      "url": "https://www.majorcineplex.com/..."
    }
  }
}
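
Because each record keys its per-chain details under `theater`, one plausible way to build the combined file is to merge records that share a `name` slug. The sketch below assumes that approach; `mergeByName` is a hypothetical helper, and the `sf` key is an assumed counterpart to the `major` key shown in the sample.

```javascript
// Merge records that share the same `name` slug, combining their `theater` maps.
// The first record seen for a slug supplies the shared fields (name_en, genre, ...).
function mergeByName(records) {
  const merged = new Map()
  for (const record of records) {
    const existing = merged.get(record.name)
    if (!existing) {
      // Copy the record and its theater map so the input is not mutated.
      merged.set(record.name, { ...record, theater: { ...record.theater } })
    } else {
      // Same movie from another chain: add that chain's entry.
      Object.assign(existing.theater, record.theater)
    }
  }
  return [...merged.values()]
}

// Usage with made-up data ("sf" is an assumed key, not taken from the sample).
const combined = mergeByName([
  { name: 'movie-slug', time: 120, theater: { major: { url: 'https://www.majorcineplex.com/...' } } },
  { name: 'movie-slug', time: 120, theater: { sf: { url: 'https://example.invalid/sf' } } },
])
console.log(Object.keys(combined[0].theater))
```

Merging on the slug keeps one entry per movie while preserving each chain's own cover and URL.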

🐳 Docker Deployment

# Build the image
docker build -t cinema-scraper .

# Run the container
docker run -v "$(pwd)/output:/app/output" cinema-scraper

📊 Performance

⚡ Speed: Processes 3000+ movies in ~2-3 minutes
🧠 Memory: Optimized memory usage with chunked processing
🔄 Reliability: Built-in error handling and retry mechanisms
📱 Anti-Detection: Randomized user agents and request patterns
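
Chunked processing and retries, as mentioned above, could be combined along these lines. This is only a sketch under assumptions: `chunk`, `withRetry`, and `processInChunks` are hypothetical helpers, and the chunk size, attempt count, and delay are illustrative defaults rather than the project's settings.

```javascript
// Split an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer')
  }
  const chunks = []
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size))
  }
  return chunks
}

// Call `task` up to `attempts` times, waiting `delayMs` after each failure.
// Rethrows the last error if every attempt fails.
async function withRetry(task, attempts = 3, delayMs = 500) {
  let lastError
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await task(attempt)
    } catch (error) {
      lastError = error
      if (attempt < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs))
      }
    }
  }
  throw lastError
}

// Process items one chunk at a time; items within a chunk run concurrently,
// so at most `size` results are in flight at once.
async function processInChunks(items, worker, size = 50) {
  const results = []
  for (const group of chunk(items, size)) {
    const done = await Promise.all(group.map((item) => withRetry(() => worker(item))))
    results.push(...done)
  }
  return results
}
```

Bounding the number of concurrent tasks per chunk is what keeps memory use predictable; the retry wrapper only handles transient failures and still surfaces the final error.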

📄 License

See the LICENSE file in the repository root for license terms.

Made with ❤️ in Thailand

If this project helps you, please consider giving it a ⭐

