Kakao Arena - Product Classification

Team: baseIine

Public Leaderboard(2019/01/07)

Features

Fully dockerized environment
Input pipeline
- Tokenize product metadata with the Okt POS tagger
- Store inputs as TFRecord
5 classifiers with a 2-layer MLP
- 1 classifier for the concatenated label of b, m, s, d
- 4 classifiers, one for each category (b, m, s, d)
Adversarial Training
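
The five classification targets listed above can be illustrated with a small sketch. It assumes each product carries four integer category ids (b, m, s, d) and that the concatenated label joins them with `>`. The separator, the function name `build_labels`, and the sample ids are illustrative assumptions and are not taken from this repository.

```python
from typing import Dict, List, Tuple

Category = Tuple[int, int, int, int]  # (b, m, s, d) category ids


def build_labels(products: List[Category]) -> Dict[str, List[int]]:
    """Build one concatenated target plus four per-level targets.

    Returns class indices for five heads: "bmsd", "b", "m", "s", "d".
    """
    # Head 1: every distinct (b, m, s, d) combination becomes one class.
    joined = [">".join(str(c) for c in p) for p in products]
    joined_vocab = {key: i for i, key in enumerate(sorted(set(joined)))}
    targets = {"bmsd": [joined_vocab[key] for key in joined]}

    # Heads 2-5: each category level gets its own class index space.
    for pos, name in enumerate(("b", "m", "s", "d")):
        values = [p[pos] for p in products]
        vocab = {v: i for i, v in enumerate(sorted(set(values)))}
        targets[name] = [vocab[v] for v in values]
    return targets


# Made-up category ids, for illustration only.
sample = [(1, 10, 100, 1000), (1, 10, 101, 1001), (2, 20, 100, 1000)]
labels = build_labels(sample)
print(labels["bmsd"])  # [0, 1, 2]
print(labels["b"])     # [0, 0, 1]
```

Each key of the returned dictionary would feed one classifier head. How the actual models share or separate their layers is not shown here.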

Results

The metric 'score' is calculated as follows:
- score=(1.0 * b_acc + 1.2 * m_acc + 1.3 * s_acc + 1.4 * d_acc)
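
As a minimal sketch, the weighted score can be computed as below. The function name `weighted_score` and the `divisor` parameter are assumptions of this example. With the default `divisor=1.0` the function returns the plain weighted sum as written. The dev scores in the results table are around 1.08, a scale that is more consistent with averaging over the four levels (`divisor=4.0`), but the source does not state this, so treat that reading as an assumption.

```python
def weighted_score(b_acc, m_acc, s_acc, d_acc, divisor=1.0):
    """Weighted sum of the four per-level accuracies, divided by `divisor`.

    Accuracies are expected as fractions in [0, 1].
    `divisor` is a hypothetical knob, not part of the stated formula.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    total = 1.0 * b_acc + 1.2 * m_acc + 1.3 * s_acc + 1.4 * d_acc
    return total / divisor


perfect = weighted_score(1.0, 1.0, 1.0, 1.0)
print(round(perfect, 4))  # 4.9
```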
The Final model (marked with * below) was used to report our final results on the dev and test sets
Download trained weights here

Model	Dev score	Test score (TBD)	File size
Intermediate	1.07799	-	966MB
Ensemble	1.080755	-	5*966MB
*Final	1.077696	-	966MB

Requirements

Docker
Python >= 2.7
- TensorFlow >= 1.12
- Keras
- Others: h5py, tqdm, easydict
At least 400GB of free storage space
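
The storage requirement can be checked before starting with a short script. This is only a sketch: the function name `has_enough_space` is hypothetical, and it assumes 1GB means 10^9 bytes.

```python
import shutil

REQUIRED_GB = 400  # storage requirement stated above


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """Return True if the filesystem holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    # Assumption: 1GB is interpreted as 10**9 bytes.
    return free_bytes >= required_gb * 1000 ** 3


if __name__ == "__main__":
    ok = has_enough_space(".")
    print("enough space" if ok else "less than 400GB free")
```

Pass the directory that will hold the datasets and outputs as `path` so the check runs against the right filesystem.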

Reproduce results

Setup

Download the datasets from Kakao Arena

Run a Docker container

$ bash build.sh
$ bash run.sh

[Note] Edit DATA_PATH in run.sh so that it points to the directory containing the downloaded dataset chunks.

For example,

ls $DATA_PATH
|- dev.chunk.01
|- test.chunk.01
|- test.chunk.02
|- train.chunk.01
|- train.chunk.02
|- train.chunk.03
|- train.chunk.04
|- train.chunk.05
|- train.chunk.06
|- train.chunk.07
|- train.chunk.08
`- train.chunk.09
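
Before launching the container, it may help to confirm that every chunk from the listing above is present. The sketch below assumes `DATA_PATH` is exported as an environment variable (otherwise it falls back to the current directory). The helper name `missing_chunks` is hypothetical.

```python
import os

# Chunk names taken from the example listing above.
EXPECTED_CHUNKS = (
    ["dev.chunk.01"]
    + ["test.chunk.%02d" % i for i in range(1, 3)]
    + ["train.chunk.%02d" % i for i in range(1, 10)]
)


def missing_chunks(data_path):
    """Return the expected chunk files that are absent from `data_path`."""
    present = set(os.listdir(data_path))
    return [name for name in EXPECTED_CHUNKS if name not in present]


if __name__ == "__main__":
    # Assumption: DATA_PATH is set in the environment.
    data_path = os.environ.get("DATA_PATH", ".")
    missing = missing_chunks(data_path)
    if missing:
        print("missing: " + ", ".join(missing))
    else:
        print("all chunks found")
```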

Option 1: Use pretrained weights

Download the weights (Dropbox link)
Copy the weights to /data/output/interim and /data/output/final

$ bash scripts/eval.sh 0 interim 70 # for validation
$ bash scripts/inference.sh 0 interim 70 dev # for submission
$ bash scripts/inference.sh 0 interim 70 test # for submission

$ bash scripts/inference.sh 0 final 12 dev # for submission
$ bash scripts/inference.sh 0 final 12 test # for submission

Option 2: Train a model from scratch

$ bash reproduce.sh
