In this capstone project, the goal is to build a classifier that classifies tickets by analysing their text. Details about the data and the dataset files are available at the following link: https://drive.google.com/file/d/1OZNJm81JXucV3HmZroMq6qCT2m7ez7IJ
Overview
- Exploring the given data files
- Understanding the structure of the data
- Identifying missing data points
- Finding inconsistencies in the data
- Visualizing different patterns
- Visualizing different text features
- Dealing with missing values
- Text pre-processing
- Creating word vocabulary from the corpus of report text data
- Creating tokens as required
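
The pre-processing, vocabulary and token steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual pipeline: the sample tickets, the function names (`clean_text`, `build_vocab`, `encode`), the `<pad>`/`<unk>` tokens and the choice to treat a missing description as an empty string are all assumptions made for the example.

```python
import re
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens

def clean_text(text):
    """Lowercase, replace non-alphanumerics with spaces, collapse whitespace.
    A missing value (None) is treated as an empty string (an assumption)."""
    if text is None:
        return ""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def build_vocab(texts, min_freq=1):
    """Map each token to an integer id, most frequent tokens first."""
    counts = Counter(tok for t in texts for tok in clean_text(t).split())
    vocab = {PAD: 0, UNK: 1}
    # Sort by descending frequency, then alphabetically, so ids are deterministic.
    for tok, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= min_freq:
            vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab, max_len):
    """Turn a text into a fixed-length list of token ids (truncate, then pad)."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in clean_text(text).split()][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))

# Made-up sample tickets, including one missing description.
tickets = ["Unable to login to VPN!", None, "VPN login fails after password reset"]
vocab = build_vocab(tickets)
encoded = [encode(t, vocab, max_len=6) for t in tickets]
```

Words not seen when the vocabulary was built map to the `<unk>` id, so the same `encode` function can be reused on validation and test tickets.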
Overview
- Building a model architecture that can classify the tickets.
- Trying different model architectures by researching the state of the art for similar tasks.
- Training the model. Because training can take a long time, save the weights so that a second training run can resume from them instead of starting from scratch.
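
The idea of saving weights and resuming from them can be sketched independently of any framework. The example below is an assumption-laden toy, not the project's model: it trains a tiny perceptron on made-up data, and the helper names (`save_checkpoint`, `load_checkpoint`, `train`) and the JSON file layout are hypothetical. With a deep learning framework, the framework's own checkpointing facilities would normally take the place of these helpers.

```python
import json
import os
import tempfile

def save_checkpoint(path, epoch, weights):
    """Write to a temporary file first, then rename, so a crash mid-write
    never leaves a half-written checkpoint behind."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"epoch": epoch, "weights": weights}, f)
    os.replace(tmp, path)

def load_checkpoint(path, n_features):
    """Return (completed_epochs, weights); start from zeros if no file exists."""
    if not os.path.exists(path):
        return 0, [0.0] * n_features
    with open(path) as f:
        state = json.load(f)
    return state["epoch"], state["weights"]

def train(data, path, total_epochs, lr=0.1):
    """Perceptron training that resumes from the checkpoint at `path`."""
    start, w = load_checkpoint(path, len(data[0][0]))
    for epoch in range(start, total_epochs):
        for x, y in data:  # labels y are -1 or +1
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:  # misclassified: move weights towards y
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
        save_checkpoint(path, epoch + 1, w)  # checkpoint after every epoch
    return w

# Made-up, linearly separable toy data.
data = [([1.0, 0.0, 1.0], 1), ([0.0, 1.0, 1.0], -1)]
ckpt = os.path.join(tempfile.mkdtemp(), "model.json")

w_interrupted = train(data, ckpt, total_epochs=2)         # first session: 2 epochs
w_resumed = train(data, ckpt, total_epochs=5)             # second session: epochs 3 to 5
w_scratch = train(data, ckpt + ".fresh", total_epochs=5)  # uninterrupted reference run
```

Storing the epoch counter alongside the weights is what lets the second call continue where the first one stopped. For real models, optimizer state usually needs to be saved as well for the resumed run to match an uninterrupted one.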
Overview
- Test the model and report the results using the chosen evaluation metrics
- Try different models
- Try different evaluation metrics
- Fine-tune these models by varying the hyperparameters: try different optimizers, loss functions, numbers of epochs, learning rates and batch sizes, together with checkpointing and early stopping.
- Report the evaluation metrics for these models, along with your observations on how changing each hyperparameter affects the final evaluation metric.
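
As a rough sketch of the reporting step above, the following computes accuracy and macro-averaged F1 in plain Python and enumerates a small hyperparameter grid. The class labels, predictions and grid values are invented for illustration, and macro-F1 is only one possible metric choice; it is used here because ticket classes are often imbalanced, which accuracy alone can hide.

```python
from itertools import product

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_f1(y_true, y_pred):
    """F1 score for each class, treating that class as the positive one."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[c] = 2 * precision * recall / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)

# Invented labels and predictions for six tickets.
y_true = ["access", "access", "access", "network", "network", "hardware"]
y_pred = ["access", "access", "network", "network", "network", "access"]
acc = accuracy(y_true, y_pred)
f1_by_class = per_class_f1(y_true, y_pred)
macro = macro_f1(y_true, y_pred)

# Hypothetical hyperparameter grid: every combination is one experiment to
# train, evaluate and record next to its metrics.
grid = {
    "optimizer": ["adam", "sgd"],
    "learning_rate": [1e-3, 1e-4],
    "batch_size": [32, 64],
}
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
```

Recording one row per entry of `configs`, together with its metrics, makes it easier to see how a single hyperparameter change moves the final score.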
``` ----- to be updated further as and when we move along ---- ```