Resources about Machine Learning, mostly oriented to newcomers to ML, with online courses and introductory books.

dfbarrero/MLresources

Machine Learning resources

Courses

Books

Reddit

Introduction

ML Twitter accounts

Datasets

CBM datasets

Learning resources

Tools

ML Python tools

MLOps tools

Big Data tools

Distributed file systems:

Data ingestion tools:

  • Apache Kafka - Event-based data streaming tool.

  • Apache NiFi - Visual data routing, transformation and system mediation logic platform.

  • Apache Flume - Distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.

Scripting and query languages for Big Data:

Query engines for Big Data:

  • Apache Impala - Interactive, low-latency SQL queries over data stored in HDFS and HBase on top of Hadoop. Suited for real-time queries.

  • Apache Drill - Schema-free SQL query engine for Hadoop, NoSQL and cloud storage.

  • Apache HBase - NoSQL database offering random, real-time read/write access to Big Data. Suited for real-time queries.

  • Apache Hive - Reading, writing, and managing petabytes of data residing in distributed storage using SQL. Batch-oriented processing on Hadoop; it is not designed for interactive or real-time queries and is best used for analytical querying of data.

  • Apache Sqoop - Importing and exporting data between relational databases and Hadoop via JDBC. The project is retired.

  • Apache Spark SQL - Mixing SQL queries with Spark programs. Provides DataFrames.

Big Data processing:

Monitoring tools:

  • Apache Ambari - Provisioning, managing, and monitoring Apache Hadoop clusters.

  • Apache Ranger - Enable, monitor and manage comprehensive data security across the Hadoop ecosystem.

Orchestration:

  • Apache Oozie - Workflow scheduler system to manage Apache Hadoop jobs.

Data visualization tools

AI/ML in Space conferences

AI/ML in Space papers

Space data sources

ML interesting readings

Genetic Programming tools

ML in Robotics

ML in Education

Visualizations

Cool examples

Generative applications

2024

2023

2022

Before

Applications

Competitions
