If you only have limited time to learn Artificial Intelligence, here’s what I recommend:
- 📘 Read this book: AI Engineering: Building Applications with Foundation Models
- 🎥 Watch this video: Deep Dive into LLMs like ChatGPT
- 🧠 Follow this course: 🤗 Agents Course
If you want more (and there’s a lot more), keep reading.
Learning often feels like walking down a road that forks every few meters; you’re always exploring, never really arriving. And that’s the beauty of it.
When I was working in games, people would ask me: “How do I learn to make games?” My answer was always: “Pick a game, and build it, learn the tools and concepts along the way.” I’ve taken the same approach with AI.
This repository is a collection of the material I’ve used (and continue to use) to learn AI: books, courses, papers, tools, models, datasets, and notes. It’s not a curriculum, it’s more like a journal. One that’s helped me build, get stuck, and keep going.
Do I know AI? Not really. But I’m learning, building, and having a great time doing it.
I hope something in here is useful to you too. And if you have suggestions or feedback, I’d love to hear it.
Here are the books I've read to make sense of AI/ML/DL. Some are more technical, some are more theoretical, and some are more about big-picture ideas.
They are not in any particular order, although I tried to group them based on what I think makes sense.
- Hands-On Large Language Models: Language Understanding and Generation Building an LLM from scratch is difficult; even understanding existing open-source options can be challenging. This book does a great job of explaining how LLMs work and introduces common architectures at a deep enough level to be practical without being overwhelming.
- AI Engineering: Building Applications with Foundation Models If you feel lost and don’t know where to start, this book can serve as a great map. Chip explains the most common concepts behind AI in a clear and approachable way.
- Deep Learning - A Visual Approach Probably the best resource out there for building solid intuition about the many concepts surrounding deep learning. Andrew, the author, did a wonderful job illustrating these concepts, making it much easier to develop a real understanding of them.
- Practical Deep Learning: A Python-Based Introduction Probably the best resource for balancing deep learning concepts with a hands-on, Python-based approach. It’s much easier to follow if you actively implement the code, even if that just means typing out the examples from the book.
- Hands-On Generative AI with Transformers and Diffusion Models This book provides a panoramic view of different model types to generate images, audio, and text. It shows a lot of code, and comes with challenges that are interesting to tackle if you want to get hands-on experience. Omar (one of the authors) formerly worked at Hugging Face and is now at Google DeepMind.
- Dive Into Data Science: Use Python To Tackle Your Toughest Business Challenges Not strictly about AI, although if you think about it, AI is deeply tied to data science. This book is great for understanding real-world scenarios and how to approach them using AI tools and Python. For me, the data preparation part was especially helpful.
- Dive into Deep Learning An amazing free book that covers the fundamental concepts of deep learning, including how to build your own models from scratch. It leans toward the heavier side, but it comes with plenty of Jupyter notebooks you can run to test your understanding. I still can’t believe this resource is free.
- Grokking Deep Learning When I read this book, I had a lot of “aha” moments. I’m not sure if it was the clarity of the explanations or the fact that I already had some background from other books, but either way, the examples and explanations are excellent. It strikes a good balance between theory, algorithms, and practical implementation.
- Prompt Engineering for LLMs This book provides useful tips and tricks to program using LLMs, essentially via prompts. As of 2025, I think the techniques discussed here can be very useful for people using LLMs as part of their programming stack.
- Math for Deep Learning: A Practitioner's Guide to Mastering Neural Networks If you’re interested in how AI works under the hood, it ultimately comes down to a lot of math. This book does a great job explaining the essential mathematics behind implementing neural networks. It can feel overwhelming at times. The book is incredibly helpful for understanding how popular frameworks work, and how to better evaluate them for your use cases.
- Why Machines Learn: The Elegant Math Behind Modern AI A fascinating mix of math and history that explores how we arrived at large language models (LLMs) today (2024). I recommend reading it after you’ve built some basic math foundations — you’ll find it much more helpful that way.
- Understanding Deep Learning Another amazing book, and it’s completely free! (But buy it if you can). This book puts special focus on the pedagogy of how to learn (and teach) deep learning. It also comes with interactive notebooks so you can try out the explanations yourself.
- Practical Deep Learning for Coders with Fastai and PyTorch: AI Applications Without a PhD This book comes with video lectures by Jeremy Howard, the author and creator of FastAI. The goal is to give coders a head start in deep learning without diving too deep into the weeds. It still covers how to achieve a lot using the FastAI framework and PyTorch. Having the videos included made it especially cool, since I could both listen to and watch the author explain the concepts.
- The Shape of Data: Geometry-Based Machine Learning and Data Analysis in R When I got to this book, I already had a decent understanding of machine learning. Reading its geometric approach (spatial, geometrical, and visual rather than only statistical) gave me many moments where I smiled; things made sense even when framed through a different paradigm. I do wish the code examples weren’t in R, but I still managed to follow the concepts easily.
- The Art of Machine Learning: A Hands-On Guide to Machine Learning with R I would recommend this book to anyone who is already familiar with R or wants to learn both R and machine learning together. It’s a quick read.
- How AI Works: From Sorcery to Science I recommend this book if you don’t want to code or dive into the technical details of AI, deep learning, or machine learning, but still want a high-level understanding of what it’s all about. It’s great for people in non-engineering or non-technical roles. The book gives clear explanations of how things work without getting lost in the weeds.
- Superintelligence: Paths, Dangers, Strategies This was the first non-technical book I read on AI. It was fascinating because it offered a glimpse into the broader implications of what we’re building. I don’t think everything predicted in the book will happen, but it presents great perspectives on how to mitigate potential risks.
- The Myth of Artificial Intelligence This book offers the perspective that artificial intelligence won’t become the powerful, dangerous new form of life many imagine. It provides solid arguments for why that is, presenting a view that is much less apocalyptic than what you usually hear.
- The Art of Doing Science and Engineering: Learning to Learn While this book is not about AI specifically, Hamming covers what he thought AI would evolve into (which it kind of did) in chapters 6, 7, and 8. Even with what we are now capable of doing, he doesn’t think we need fewer programmers — just smarter ones. And that we should focus on what problems to solve, and why. Also, his in-person classes are available on Archive.org.
- The Coming Wave: Technology, Power, and the Twenty-first Century's Greatest Dilemma This book explores more catastrophic scenarios. Scenarios that, in my opinion, are entirely feasible. The book warns how the AI race could go wrong. It highlights not just the risks from machines themselves, but also the dangers of powerful tools falling into the wrong hands.
- Everything Is Predictable: How Bayesian Statistics Explain Our World This book provides a compelling view of how everything in life can be predicted, which is essentially what we are making machines do: predict data.
- Your Brain Is a Time Machine: The Neuroscience and Physics of Time AI-adjacent, but this book provides an interesting view of how our brains handle time, and how we can bring in ideas to the world of neural networks, as I believe there are untapped opportunities to allow AIs to make better use of the concept and perception of time, particularly when dealing with memory and planning.
- Ways of Being: Animals, Plants, Machines: The Search for a Planetary Intelligence This book explores not just artificial intelligence, but the many forms of intelligence that already exist alongside us on Earth. It’s a great source of inspiration and a reminder that intelligence comes in many shapes, most of which we don't understand.
Not everyone learns the same way. Sometimes I get too tired of just reading, and tutorials or courses in video form make me feel like I’m talking to someone. If you prefer learning through videos or more interactive formats, I recommend taking a look at the following materials:
- Getting Started with Deep Learning A short introduction to deep learning, delivered by NVIDIA. If you just want a quick glimpse of the very basics before jumping into higher-level implementation, this is a solid place to start.
- MIT Intro to Deep Learning A free, intensive bootcamp taught by MIT researchers. Getting direct access to this content (updated every time they teach it) is amazing. (~10 hours of deep learning concepts, plus interesting guest lectures.)
- Practical Deep Learning for Coders This is the accompanying course version of the FastAI book by Jeremy Howard.
- C++ Neural Network in a Weekend This might be too much if you’re just starting out, or if you’re not interested in low-level C++ implementations of neural networks. But I found it fascinating, and it made me appreciate how far modern frameworks and APIs have come.
- 🤗 Agents Course Now that agentic AI is trending, Hugging Face launched this free course showcasing their smolagents framework. It also covers LlamaIndex and LangGraph.
I find it increasingly difficult these days to stay focused on videos. Maybe it’s because watching usually means being on a computer or phone, and distractions are always just a click away.
However, the videos listed here are so well-made, well-researched, and genuinely interesting that I believe they’re not just useful, but sticky.
One more thing I appreciate: in the world of AI, many of the best video talks and tutorials often come directly from the people actually building the models, frameworks, and tools. What a great time to be learning.
- Andrej Karpathy: Software Is Changing (Again) This is a great session outlining how the way we talk to computers is changing. We're still just transforming data, but how we do so is changing. I think this is particularly useful for software developers, computer scientists, and students who wonder whether a CS career is worth it in 2025.
- Transformers (how LLMs work) explained visually 3Blue1Brown’s beautifully and clearly explained video on how large language models (LLMs) work, including the transformer architecture and the concept of embeddings.
- Visualizing transformers and attention | Talk for TNG Big Tech Day '24 Grant Sanderson's (3Blue1Brown) live explanation on how to think visually about transformers. This session is at the intersection of art and science.
- Deep Dive into LLMs like ChatGPT Andrej Karpathy's (OpenAI, Tesla, Stanford) masterclass on how LLMs work. In-depth, but super accessible.
- Let's build GPT: from scratch, in code, spelled out. Andrej Karpathy's step-by-step guide on building GPT.
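The core operation these videos keep returning to is scaled dot-product attention. As a minimal sketch (toy shapes and random values, not a real model), it can be written in a few lines of NumPy:

```python
import numpy as np

# Scaled dot-product attention: each output row is a weighted mix of the
# value vectors, with weights given by query/key similarity.
def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-to-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))  # 3 tokens, head dimension 4
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = attention(Q, K, V)
print(out.shape)  # → (3, 4)
```

Real transformers add learned projections, multiple heads, and masking on top of this, but the weighted-average idea is the same one the visualizations above illustrate.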
Understanding all the tools, frameworks, architectures, and ecosystems around AI can sometimes feel harder than understanding AI itself. Below are the ones I’ve explored and used enough to feel confident recommending.
Of course, these won’t solve every use case, and I’m not listing every supporting technology you might need to build real-world AI systems, but it’s a start.
Category | Tools |
---|---|
Core Frameworks | - Hugging Face: A hub that hosts models, datasets, apps, and communities around AI. - Ollama: Run LLMs locally on your computer (CLI). - LM Studio: Discover and run LLMs on your computer using a UI. |
Developer Tools | - Gradio: Create ML-powered apps for the web, with an easy-to-use UI API. - Google Colab: You have probably seen many resources use Jupyter Notebooks; this platform allows you to run them. - MongoDB: A database that allows you to perform vector search. |
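As a concrete taste of the Ollama entry above, here is a hedged sketch of how one might talk to its local REST API. The model name `llama3` is just an example (use whatever you've pulled), and this snippet only constructs the request; actually sending it assumes Ollama is running on its default port:

```python
import json

# Build a request body for Ollama's /api/generate endpoint.
# "llama3" is a placeholder model name; substitute one you have pulled.
def build_generate_request(model: str, prompt: str) -> dict:
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # request a single JSON response instead of a stream
    }

payload = build_generate_request("llama3", "Explain backpropagation in one sentence.")
body = json.dumps(payload)

# To actually send it (requires a running local Ollama instance):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:11434/api/generate",
#     data=body.encode(), headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
```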
AI goes far beyond any single language, operating system, hardware, or framework. There are countless implementations across different programming languages, runtimes, and platforms. From my experience, though, Python is what most people use and teach.
Following that path, I’ve focused most of my learning around Python as well. That said, similar libraries (and often the same ones) likely exist for your favorite environment too.
Category | Libraries |
---|---|
Data Science & Computation | Pandas, NumPy, SciPy, scikit-learn |
Plotting & Visualization | Matplotlib, Seaborn |
Machine Learning / Deep Learning | TensorFlow, PyTorch |
Image Processing | Pillow |
Web Scraping | Beautiful Soup, Selenium |
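To show how the first two rows of the table combine in practice, here is a tiny sketch using Pandas and NumPy together. The parameter counts are made-up illustrative numbers, not real figures:

```python
import numpy as np
import pandas as pd

# Build a small table, then add a derived column computed vectorized
# with NumPy. The "params_b" values are invented for illustration.
df = pd.DataFrame({
    "model": ["gpt", "claude", "gemini"],
    "params_b": [175.0, 52.0, 137.0],  # hypothetical sizes, in billions
})
df["log_params"] = np.log10(df["params_b"])

# Label-based lookup of the row with the largest parameter count.
largest = df.loc[df["params_b"].idxmax(), "model"]
print(largest)  # → gpt
```

This filter-and-derive pattern is most of day-to-day data preparation, which is why these two libraries show up in nearly every resource listed here.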
At the core of deep learning are the models. Some are general-purpose large language models (LLMs), while others are specialized for specific tasks like text generation, image creation, or coding.
These are the models I’ve used or explored:
- GPT (OpenAI)
- Claude (Anthropic)
- Gemini (Google DeepMind)
- LLaMA (Meta)
- DeepSeek
- Meta Segment Anything Model 2
If you're looking to explore beyond these, I recommend checking out the following model hubs. They host a wide variety of models with different licenses and for many use cases:
- AI Is Nothing Like a Brain, and That’s OK - 04/30/25
- John Carmack: Different path to AGI - 02/02/23
- Deep Learning in a Nutshell Part 1: Core Concepts | Part 2: History and Training | Part 3: Sequence Learning | Part 4: Reinforcement Learning - 11/03/2015
- LLM Visualization This interactive article and visualization of how an LLM works (specifically the inference process) is a wonderful resource for reducing the abstraction around LLMs and seeing them in action. It focuses on the nanoGPT model.
At the core of most AI advances is research — deep, complex work published by people pushing the boundaries of what’s possible. These breakthroughs often appear in the form of academic papers. Reading papers can be overwhelming at first. It’s not always easy to unpack their meaning or follow the math. I suggest using tools like NotebookLM or joining a local AI paper-reading club.
- Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning
- On the Biology of a Large Language Model
- PrimitiveAnything
- DreamFusion: Text-to-3D using 2D Diffusion
- GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
- Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models
- Imagen: Text-to-Image Diffusion Models
- Gen-1
- Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification
- Uncovering and Mitigating Algorithmic Bias through Learned Latent Structure
The current approach to teaching machines relies heavily on data — often, massive amounts of it. In some cases, we use datasets that were created and labeled by humans. In others, we rely on synthetic data generated by machines, or a combination of both. This section includes some well-known datasets you can explore and use to train your models. Platforms like Hugging Face also host a wide range of datasets for different tasks and domains.
Name | Domain |
---|---|
Kaggle Datasets | Various / General ML |
CelebA | Computer Vision / Facial Attributes |
COCO | Computer Vision / Object Detection |
ImageNet | Computer Vision / Classification |
Cityscapes Dataset | Computer Vision / Segmentation |
ObjectNet | Computer Vision / Robustness Testing |
LAION 5B | Multimodal / Vision-Language |
NAIRR Datasets | Various / Research Datasets |
UCI Machine Learning Datasets | Traditional ML / Tabular |
Common Crawl | NLP / Web-Scale Corpus |
The Pile | NLP / Language Modeling |
C4 (Colossal Clean Crawled Corpus) | NLP / Pretraining Corpus |
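Whichever of these datasets you pick up, a common first step before training is a reproducible train/test split. A minimal stdlib-only sketch, with toy rows standing in for real examples:

```python
import random

# Toy dataset: 100 rows with an id and a fake label.
rows = [{"id": i, "label": i % 2} for i in range(100)]

rng = random.Random(42)      # fixed seed so the split is reproducible
rng.shuffle(rows)
cut = int(len(rows) * 0.8)   # 80/20 train/test split
train, test = rows[:cut], rows[cut:]
print(len(train), len(test))  # → 80 20
```

In real projects you'd typically reach for `sklearn.model_selection.train_test_split` or a dataset library's built-in splits, but the idea (shuffle with a fixed seed, then slice) is the same.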