<aside>
💡
Curriculum
The overall curriculum is based on: https://github.com/mlabonne/llm-course
Some other inspirations:
- https://github.com/mlabonne/llm-course?search=1
- https://github.com/louisfb01/start-llms?tab=readme-ov-file (good paper lists)
Deploy Models & Paper Lists:
Great Thoughts on LLM:
https://github.com/Mooler0410/LLMsPracticalGuide?tab=readme-ov-file#practical-guide-for-models
Monthly Paper List:
https://github.com/Mooler0410/LLMsPracticalGuide?tab=readme-ov-file#practical-guide-for-models
LLM Fundamentals
Module 1. Mathematics for Machine Learning
Calculus
- Overview: Many machine learning algorithms optimize continuous functions, which requires an understanding of derivatives, integrals, limits, and series. Multivariable calculus and gradients matter in particular, since most models are trained by following the gradient of a loss function.
- Resources
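To make the role of gradients in optimization concrete, here is a minimal sketch of gradient descent on a two-variable function. The function `loss`, the learning rate, and the step count are illustrative assumptions, not part of the curriculum itself.

```python
def loss(x, y):
    """A simple convex bowl with its minimum at (3, -1)."""
    return (x - 3.0) ** 2 + (y + 1.0) ** 2


def grad(x, y):
    """Analytic gradient: the vector of partial derivatives of loss."""
    return 2.0 * (x - 3.0), 2.0 * (y + 1.0)


def numerical_grad(f, x, y, h=1e-6):
    """Central-difference approximation, handy for checking grad()."""
    dfdx = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    return dfdx, dfdy


def gradient_descent(x, y, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient to reduce the loss."""
    for _ in range(steps):
        gx, gy = grad(x, y)
        x -= learning_rate * gx
        y -= learning_rate * gy
    return x, y


x_min, y_min = gradient_descent(0.0, 0.0)
print(x_min, y_min)  # close to the true minimum at (3, -1)
```

Comparing `grad` against `numerical_grad` at a few points is a common sanity check before trusting a hand-derived gradient.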
Probability and Statistics
- Overview: These are crucial for understanding how models learn from data and make predictions. Key concepts include probability theory, random variables, probability distributions, expectations, variance, covariance, correlation, hypothesis testing, confidence intervals, maximum likelihood estimation, and Bayesian inference.
- Resources
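As one illustration of maximum likelihood estimation, the sketch below fits a Gaussian to a small sample using only the standard library. The data values are made up for the example; the closed-form estimates (sample mean and population variance) are the standard MLE results for a normal distribution.

```python
import math
import statistics

# Hypothetical sample, chosen only for illustration.
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# For a Gaussian, the MLE of the mean is the sample mean and the MLE of the
# variance is the population variance (divide by n, not n - 1).
mu_hat = statistics.fmean(samples)
var_hat = statistics.pvariance(samples, mu=mu_hat)


def gaussian_log_likelihood(data, mu, var):
    """Log-likelihood of i.i.d. data under a normal distribution N(mu, var)."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2.0 * math.pi * var) - squared_error / (2.0 * var)


best = gaussian_log_likelihood(samples, mu_hat, var_hat)
shifted = gaussian_log_likelihood(samples, mu_hat + 0.5, var_hat)
print(mu_hat, var_hat, best > shifted)
```

Moving either parameter away from its estimate lowers the log-likelihood, which is what "maximum likelihood" means in practice.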
Linear Algebra
- Overview: This is crucial for understanding many algorithms, especially those used in deep learning. Key concepts include vectors, matrices, determinants, eigenvalues and eigenvectors, vector spaces, and linear transformations.
- Resources:
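The eigenvalue and eigenvector concepts above can be checked numerically. This is a small sketch assuming NumPy is installed; the matrix `A` is an arbitrary symmetric example.

```python
import numpy as np

# An arbitrary symmetric 2x2 matrix, used only as an example.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is intended for symmetric (Hermitian) matrices and returns
# eigenvalues in ascending order, with eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(A)

# Each pair satisfies the defining equation A v = lambda v.
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    assert np.allclose(A @ v, eigenvalues[i] * v)

# The determinant equals the product of the eigenvalues.
determinant = np.linalg.det(A)
print(eigenvalues, determinant)
```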
Python Basics
- Python Basics: Writing Python well requires a good understanding of basic syntax, data types, error handling, and object-oriented programming.
- Resources:
- Real Python: A comprehensive resource with articles and tutorials for both beginner and advanced Python concepts.
- freeCodeCamp - Learn Python: Long video that provides a full introduction to all of the core concepts in Python.
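A short sketch touching the basics listed above: data types, a small class, and error handling. The `Measurement` class and `parse_measurement` helper are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """A named numeric reading (hypothetical example class)."""
    name: str
    value: float

    def scaled(self, factor: float) -> "Measurement":
        """Return a new Measurement with the value multiplied by factor."""
        return Measurement(self.name, self.value * factor)


def parse_measurement(text: str) -> Measurement:
    """Parse 'name=value', raising ValueError on malformed input."""
    name, sep, raw_value = text.partition("=")
    if not sep or not name.strip():
        raise ValueError(f"expected 'name=value', got {text!r}")
    # float() itself raises ValueError if raw_value is not numeric.
    return Measurement(name.strip(), float(raw_value))


readings = []
for line in ["loss=0.25", "accuracy=0.9", "broken line"]:
    try:
        readings.append(parse_measurement(line))
    except ValueError as err:
        # Handle the bad input and keep going instead of crashing.
        print(f"skipping input: {err}")
```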
Data Science Libraries & Data PreProcessing
- Data Science Libraries: This means becoming familiar with NumPy for numerical operations, Pandas for data manipulation and analysis, and Matplotlib and Seaborn for data visualization.
- Data Preprocessing: This involves feature scaling and normalization, handling missing data, outlier detection, categorical data encoding, and splitting data into training, validation, and test sets.
- Resources:
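The preprocessing steps above (splitting, handling missing data, feature scaling) can be sketched with NumPy alone. The synthetic data, the 70/15/15 split, and mean imputation are assumptions made for illustration; other choices are equally valid.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic feature matrix with one missing value (illustrative data only).
X = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
X[5, 1] = np.nan

# Shuffle, then split 70/15/15 into training, validation, and test sets.
indices = rng.permutation(len(X))
train_idx, val_idx, test_idx = indices[:70], indices[70:85], indices[85:]
X_train, X_val, X_test = X[train_idx], X[val_idx], X[test_idx]

# Fit imputation statistics on the training set only, to avoid leaking
# information from validation or test data.
col_mean = np.nanmean(X_train, axis=0)


def impute(data):
    """Replace NaNs with the training-set column means."""
    return np.where(np.isnan(data), col_mean, data)


X_train, X_val, X_test = impute(X_train), impute(X_val), impute(X_test)

# Standardize with training-set mean and standard deviation.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)
X_train_scaled = (X_train - mean) / std
X_val_scaled = (X_val - mean) / std
X_test_scaled = (X_test - mean) / std
```

Computing the statistics on the training split and reusing them for the other splits is the design choice that keeps the evaluation honest.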
Machine Learning Libraries
- Machine Learning Libraries: Proficiency with Scikit-learn, a library providing a wide selection of supervised and unsupervised learning algorithms, is vital. Understanding how to implement algorithms like linear regression, logistic regression, decision trees, random forests, k-nearest neighbors (K-NN), and K-means clustering is important. Dimensionality reduction techniques like PCA and t-SNE are also helpful for visualizing high-dimensional data.
- Resources:
- freeCodeCamp - Machine Learning for Everybody: Practical introduction to different machine learning algorithms for beginners.
- Udacity - Intro to Machine Learning: Free course that covers PCA and several other machine learning concepts.
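As a rough sketch of the Scikit-learn workflow described above, the example below trains a k-nearest neighbors classifier and projects the data to two dimensions with PCA. It assumes scikit-learn is installed; the Iris dataset, `n_neighbors=5`, and the 75/25 split are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Small built-in dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a k-NN classifier and measure accuracy on the held-out data.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
accuracy = knn.score(X_test, y_test)

# Reduce the 4 features to 2 principal components, e.g. for plotting.
X_2d = PCA(n_components=2).fit_transform(X)

print(f"test accuracy: {accuracy:.2f}, projected shape: {X_2d.shape}")
```

Most Scikit-learn estimators share the same `fit` / `predict` / `score` pattern, so swapping in logistic regression or a random forest usually changes only the constructor line.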
</aside>
LLM Fundamentals Recordings & Materials