This project implements and analyzes two fundamental machine learning algorithms:
- Perceptron algorithm for binary classification
- AdaBoost algorithm using line classifiers as weak learners
- `problem_2.py`: Perceptron implementation and experiments with true margin calculation
- `perceptron.py`: Core perceptron algorithm and data preparation functions
- `true_margin.py`: True margin calculation utilities
- `problem_3.py`: AdaBoost implementation and experiments
- `iris.txt`: Iris dataset
- Binary classification on Iris dataset pairs:
  - Setosa vs Versicolor
  - Setosa vs Virginica
- Features used: sepal width and petal length
- Calculates true margin between classes
- Runs until perfect separation is found (no iteration limit)
python problem_2.py
For each pair of classes:
- Final weight vector
- Number of mistakes
- True margin
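Since scikit-learn is listed as the loader for the Iris data, preparing one class pair with the two features above might look like the following sketch (`make_binary_pair` is an illustrative name, not necessarily the repo's data-preparation function):

```python
import numpy as np
from sklearn.datasets import load_iris

def make_binary_pair(class_a, class_b):
    """Select two Iris classes and the two features used in this project:
    sepal width (column 1) and petal length (column 2), labeled +/-1."""
    data = load_iris()
    mask = np.isin(data.target, [class_a, class_b])
    X = data.data[mask][:, [1, 2]]          # sepal width, petal length
    y = np.where(data.target[mask] == class_a, 1.0, -1.0)
    return X, y

X, y = make_binary_pair(0, 1)               # Setosa (0) vs Versicolor (1)
```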
- Uses line classifiers as weak learners
- Runs for T=8 iterations
- Performs 100 experimental runs
- 50-50 train-test split on Versicolor vs Virginica
python problem_3.py
For each classifier H₁ through H₈:
- Average True Error (test error)
- Average Empirical Error (training error)
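The averaged errors could be collected with a loop of this shape; `run_experiment` and its `train` callback are illustrative names (in the actual experiments, errors would be recorded per boosting round H₁…H₈ rather than for a single classifier):

```python
import numpy as np

def run_experiment(train, X, y, n_runs=100, seed=0):
    """Average empirical (train) and true (test) error over n_runs
    random 50-50 splits. `train(X_tr, y_tr)` must return a predict
    function mapping an array of samples to +/-1 labels."""
    rng = np.random.default_rng(seed)
    emp_sum = true_sum = 0.0
    for _ in range(n_runs):
        idx = rng.permutation(len(y))
        half = len(y) // 2
        tr, te = idx[:half], idx[half:]
        predict = train(X[tr], y[tr])
        emp_sum += np.mean(predict(X[tr]) != y[tr])   # empirical error
        true_sum += np.mean(predict(X[te]) != y[te])  # true (test) error
    return emp_sum / n_runs, true_sum / n_runs
```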
- Python 3.10.12
- NumPy
- scikit-learn (for loading Iris dataset)
- Continues until perfect separation is found
- No maximum iteration limit
- Updates weights when classification mistakes are made
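A minimal sketch of such a perceptron loop (bias folded in as an extra constant feature; as noted above, it terminates only when the data are linearly separable):

```python
import numpy as np

def perceptron(X, y):
    """Perceptron in homogeneous coordinates. Repeats full passes over
    the data until no mistakes remain -- no maximum iteration limit."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append bias feature
    w = np.zeros(Xb.shape[1])
    mistakes = 0
    while True:
        updated = False
        for x_i, y_i in zip(Xb, y):
            if y_i * (w @ x_i) <= 0:            # mistake (or on boundary)
                w += y_i * x_i                  # perceptron update
                mistakes += 1
                updated = True
        if not updated:                          # perfect separation found
            return w, mistakes
```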
- Creates weak classifiers using lines between point pairs
- Updates distribution weights according to classification errors
- Combines weak classifiers with learned weights (α values)
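A hedged sketch of this boosting loop, with the pool of line classifiers abstracted as a list of candidate functions (building the lines from point pairs is left out):

```python
import numpy as np

def adaboost(X, y, candidates, T=8):
    """AdaBoost over a finite pool of weak classifiers. Each element of
    `candidates` is a function h: X -> {-1,+1} (here: line classifiers
    built from pairs of points). Returns chosen rules and alpha weights."""
    n = len(y)
    D = np.full(n, 1.0 / n)                     # distribution over examples
    rules, alphas = [], []
    for _ in range(T):
        # pick the weak classifier with the smallest weighted error
        errs = [np.sum(D[h(X) != y]) for h in candidates]
        h = candidates[int(np.argmin(errs))]
        eps = min(errs)
        alpha = 0.5 * np.log((1 - eps) / max(eps, 1e-12))
        D *= np.exp(-alpha * y * h(X))          # up-weight mistakes
        D /= D.sum()
        rules.append(h)
        alphas.append(alpha)
    return rules, np.array(alphas)

def H(X, rules, alphas, t=None):
    """Combined classifier H_t = sign(sum of alpha_s * h_s for s <= t)."""
    t = len(rules) if t is None else t
    scores = sum(a * h(X) for h, a in zip(rules[:t], alphas[:t]))
    return np.sign(scores)
```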
- Calculates maximum possible margin between linearly separable classes
- Uses geometric approach with point-to-line distances
- Considers all valid three-point combinations
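One way to realize this brute-force geometric search, assuming separable 2-D data (names are illustrative, not necessarily the repo's `true_margin.py`): candidate separators are lines parallel to a segment through two same-class points, placed midway out to an opposite-class point, plus perpendicular bisectors of opposite-class pairs.

```python
import numpy as np
from itertools import combinations

def _candidate_margin(n, offset, X_pos, X_neg):
    """Margin of the line n.x = offset if it separates the classes, else None."""
    sp = X_pos @ n - offset
    sn = X_neg @ n - offset
    if np.all(sp > 0) and np.all(sn < 0):
        return min(sp.min(), (-sn).min())
    if np.all(sp < 0) and np.all(sn > 0):
        return min((-sp).min(), sn.min())
    return None

def true_margin(X_pos, X_neg):
    """Maximum possible margin for linearly separable 2-D data
    (raises ValueError if no candidate separates the classes)."""
    candidates = []
    for A, B in ((X_pos, X_neg), (X_neg, X_pos)):
        for a, b in combinations(A, 2):
            d = b - a
            if np.allclose(d, 0):
                continue
            n = np.array([-d[1], d[0]])
            n /= np.linalg.norm(n)              # unit normal to line(a, b)
            for c in B:
                # line parallel to (a, b), halfway out to opposite point c
                candidates.append((n, (n @ a + n @ c) / 2.0))
    for p in X_pos:
        for q in X_neg:
            d = q - p
            nrm = np.linalg.norm(d)
            if nrm == 0:
                continue
            # perpendicular bisector of an opposite-class pair
            candidates.append((d / nrm, (d / nrm) @ (p + q) / 2.0))
    margins = [m for n, off in candidates
               if (m := _candidate_margin(n, off, X_pos, X_neg)) is not None]
    return max(margins)
```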
- Ensure all dependencies are installed
- Place iris.txt in the same directory
- Run either problem_2.py or problem_3.py
- The perceptron will run indefinitely if the chosen classes are not linearly separable
- AdaBoost shows characteristic resistance to overfitting despite increasing model complexity