PREDICT CLOSED QUESTIONS ON STACK OVERFLOW

Project dependencies

This project develops a machine learning model based on Bidirectional Encoder Representations from Transformers (BERT) to predict whether a question posted on Stack Overflow will be closed. Such a model can help moderators identify potentially low-quality questions early and improve the platform's overall efficiency.

Requirements

Python 3.6+
Necessary libraries (install each with pip install <library_name>)
- transformers
- torch (GPU-accelerated training highly recommended)
- pandas
- numpy
- scikit-learn (imported as sklearn; used for preprocessing)

DATA

The code assumes you have access to a Stack Overflow dataset containing the questions (textual content) and a label giving each question's status: open, or closed as not a real question, not constructive, too localized, or off topic. The dataset may also include additional features such as tags, timestamps, or user information.
Download or prepare your dataset accordingly.
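
As a minimal sketch of how the status labels described above could be collapsed into a binary "closed or not" target, the snippet below maps each status string to 0 (open) or 1 (closed). The exact status strings and the "Title" and "OpenStatus" field names are assumptions for illustration; check them against your own dataset.

```python
# Status values taken from the dataset description above. The exact strings
# and the "Title"/"OpenStatus" keys are assumptions, not confirmed names.
CLOSED_STATUSES = {
    "not a real question",
    "not constructive",
    "too localized",
    "off topic",
}


def to_binary_label(status: str) -> int:
    """Return 1 if the question was closed, 0 if it is still open."""
    normalized = status.strip().lower()
    if normalized == "open":
        return 0
    if normalized in CLOSED_STATUSES:
        return 1
    # Fail loudly on unexpected values instead of silently mislabeling them.
    raise ValueError(f"Unknown status: {status!r}")


# Hypothetical sample records standing in for rows of the real dataset.
rows = [
    {"Title": "How do I reverse a list in Python?", "OpenStatus": "open"},
    {"Title": "Best programming language?", "OpenStatus": "not constructive"},
    {"Title": "Fix my code plz", "OpenStatus": "Not A Real Question"},
]
labels = [to_binary_label(row["OpenStatus"]) for row in rows]
print(labels)
```

If the data is loaded with pandas, the same function could be applied to the status column with Series.map.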

Model Architecture

The code fine-tunes a BERT model for text classification. BERT captures contextual relationships within text, which makes it well suited to this task.
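
To illustrate what "BERT for text classification" usually means, the sketch below shows only the final classification step in plain Python: a linear layer applied to the pooled [CLS] vector, followed by a softmax over the two classes. It assumes the standard sequence-classification head; the repository's actual head may differ. All numbers are toy values, and the 3-dimensional vector stands in for BERT-base's 768-dimensional output.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(pooled, weights, bias):
    """Apply a linear layer to the pooled vector, then a softmax."""
    logits = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Toy inputs (assumed values, not taken from a trained model).
pooled = [0.5, -1.0, 2.0]
weights = [
    [0.1, 0.2, -0.3],   # weights for class 0 (open)
    [-0.2, 0.4, 0.5],   # weights for class 1 (closed)
]
bias = [0.0, 0.1]

probs = classify(pooled, weights, bias)
# Index of the most probable class: 0 = open, 1 = closed.
predicted = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted)
```

During fine-tuning, the weights of this head and of the BERT encoder underneath it are updated together, typically with a cross-entropy loss on the class probabilities.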

This is the architecture on which the model is trained

Code Structure

tokenizer.py: Implements a tokenizer function that breaks text strings into smaller units called tokens. Depending on the tokenization method used, these tokens can be words, punctuation marks, characters, or n-grams (sequences of n words).
preprocess.py: Extracts the important columns from the dataset and concatenates them into a single column containing all the relevant information. Tokenization is then applied to this column, and the result is sent to the input layer of BERT.
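
The two files described above can be sketched as one small pipeline: combine the chosen columns into a single string, split it into tokens, and shape the tokens into a fixed-length BERT-style input. This is a simplified illustration, not the repository's code. The function names and the "Title", "BodyMarkdown", and "Tags" field names are hypothetical, and real BERT tokenizers use WordPiece subwords rather than the word-level split shown here.

```python
import re


def combine_columns(row, columns=("Title", "BodyMarkdown", "Tags")):
    """Join the selected, non-empty fields of one record into one string."""
    parts = [str(row[c]).strip() for c in columns if row.get(c)]
    return " ".join(parts)


def word_tokenize(text):
    """Split lowercased text into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def to_model_input(tokens, max_len):
    """Add [CLS]/[SEP] markers, then truncate or pad to max_len tokens."""
    body = tokens[: max_len - 2]  # leave room for the two special tokens
    padded = ["[CLS]"] + body + ["[SEP]"]
    # The attention mask is 1 for real tokens and 0 for padding.
    mask = [1] * len(padded) + [0] * (max_len - len(padded))
    padded = padded + ["[PAD]"] * (max_len - len(padded))
    return padded, mask


# Hypothetical record standing in for one row of the dataset.
row = {
    "Title": "Why is my loop slow?",
    "BodyMarkdown": "It runs forever.",
    "Tags": "python",
}
text = combine_columns(row)
tokens = word_tokenize(text)
bigrams = ngrams(tokens, 2)
input_tokens, attention_mask = to_model_input(tokens, max_len=16)
print(input_tokens)
print(attention_mask)
```

In practice the tokenizer shipped with the chosen BERT checkpoint would replace word_tokenize and to_model_input, since it also converts tokens to the vocabulary ids the model expects.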


Saptarshi-iitbhu/Predict-Closed-Questions-on-StackOverflow
