Sakeeb91

Summary

  • add random forest training and evaluation helpers (train_random_forest and evaluate_random_forest in models.random_forest)
  • extend dataset and plotting utilities for classification workflows
  • build a Streamlit page to explore metrics, confusion matrix, ROC curve, and feature importances
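
For reviewers who want a feel for the helpers without opening the diff, here is a minimal sketch of what they could look like. It is not the code in this PR: the function bodies, the metric names in the returned dict, and the weighted averaging are assumptions. They are inferred only from the call signatures used in the Testing command, with scikit-learn's make_classification standing in for generate_sample_classification.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split


def train_random_forest(X, y, n_estimators=100, max_depth=None,
                        criterion="gini", random_state=None):
    """Hypothetical reconstruction: fit and return a RandomForestClassifier."""
    model = RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        criterion=criterion,
        random_state=random_state,
    )
    model.fit(X, y)
    return model


def evaluate_random_forest(model, X_test, y_test):
    """Hypothetical reconstruction: return (y_pred, y_proba, metrics)."""
    y_pred = model.predict(X_test)
    # RandomForestClassifier always has predict_proba; the guard only keeps
    # the helper usable with estimators that do not expose probabilities.
    y_proba = model.predict_proba(X_test) if hasattr(model, "predict_proba") else None
    # Metric names and weighted averaging are assumptions, not the PR's choices.
    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, average="weighted", zero_division=0),
        "recall": recall_score(y_test, y_pred, average="weighted", zero_division=0),
        "f1": f1_score(y_test, y_pred, average="weighted", zero_division=0),
    }
    return y_pred, y_proba, metrics


# make_classification is a stand-in for utils.data_helpers.generate_sample_classification.
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_redundant=1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

model = train_random_forest(X_train, y_train, n_estimators=100, random_state=42)
y_pred, y_proba, metrics = evaluate_random_forest(model, X_test, y_test)
print("Metrics:", metrics)
```

Returning the predictions and probabilities alongside the metrics dict lets a caller reuse them for plots without a second pass over the test set.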

Testing

  • python3 - <<'PY'
    from models.random_forest import train_random_forest, evaluate_random_forest
    from utils.data_helpers import generate_sample_classification
    from sklearn.model_selection import train_test_split

    X, y = generate_sample_classification(n_samples=300, n_features=6, n_informative=4, n_redundant=1, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, stratify=y, random_state=42)
    model = train_random_forest(X_train, y_train, n_estimators=100, max_depth=None, criterion='gini', random_state=42)
    y_pred, y_proba, metrics = evaluate_random_forest(model, X_test, y_test)
    print('Metrics:', metrics)
    print('Proba available:', y_proba is not None)
    PY
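
The predictions and probabilities printed by the smoke test are the same inputs the Streamlit page needs for its confusion matrix, ROC curve, and feature importances. The sketch below shows one way those quantities could be derived with plain scikit-learn calls. It is an illustration under assumptions, not the page's actual code: it uses RandomForestClassifier and make_classification directly instead of the project helpers, and it assumes a binary target.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import auc, confusion_matrix, roc_curve
from sklearn.model_selection import train_test_split

# Same synthetic setup as the smoke test, built with scikit-learn directly.
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_redundant=1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)

# Confusion matrix: rows are true labels, columns are predicted labels.
cm = confusion_matrix(y_test, y_pred)

# ROC curve from the probability of the positive class (column 1 for a binary target).
fpr, tpr, thresholds = roc_curve(y_test, y_proba[:, 1])
roc_auc = auc(fpr, tpr)

# Impurity-based importances, ranked from most to least important feature index.
importances = model.feature_importances_
ranking = np.argsort(importances)[::-1]

print("Confusion matrix:\n", cm)
print("ROC AUC:", round(roc_auc, 3))
print("Feature ranking:", ranking.tolist())
```

For a multiclass target, roc_curve would need a one-vs-rest treatment per class, which this sketch does not cover.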

Closes #58.

Development

Successfully merging this pull request may close these issues.

Add Random Forest simulator
