Technology forecasting using GNN

Research for a master's degree in data science

Autonomous technology forecasting with GNN

About this project

In this project, I used a link prediction algorithm based on graph neural networks (GNNs) to predict promising technologies in the self-driving vehicle field. I compared two GNN models, a graph convolutional network (GCN) and a variational graph auto-encoder (VGAE). The VGAE performed better than the GCN, so the link prediction task in this project is conducted with the VGAE.
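As a rough illustration of how an auto-encoder style model turns node embeddings into link predictions, here is a minimal sketch of the inner-product decoder commonly used with graph auto-encoders: the score of a candidate link is the sigmoid of the dot product of the two node embeddings. The embeddings, repository names, and function names below are made up for illustration; they are not outputs or code of this project.

```python
import math
from itertools import combinations


def link_score(z_u, z_v):
    """Inner-product decoder: sigmoid of the dot product of two embeddings."""
    dot = sum(a * b for a, b in zip(z_u, z_v))
    return 1.0 / (1.0 + math.exp(-dot))


def rank_candidate_links(embeddings, existing_edges):
    """Score every node pair that is not already linked, highest score first."""
    existing = {frozenset(edge) for edge in existing_edges}
    scored = []
    for u, v in combinations(sorted(embeddings), 2):
        if frozenset((u, v)) not in existing:
            scored.append((u, v, link_score(embeddings[u], embeddings[v])))
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Hypothetical 2-dimensional embeddings for four hypothetical repositories.
embeddings = {
    "repo_a": [1.2, 0.1],
    "repo_b": [1.0, 0.3],
    "repo_c": [-0.9, 0.8],
    "repo_d": [-1.1, 0.6],
}
existing_edges = [("repo_a", "repo_b")]
ranked = rank_candidate_links(embeddings, existing_edges)
```

In a trained model the embeddings would come from the encoder; pairs near the top of `ranked` are the links the model considers most likely to appear.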

I will upload my paper ASAP.

Please check this if you want to know how to build the co-contribution network and how to extract promising technologies from the network.

I will upload a presentation file about link prediction.

Experiment details and results

The framework of this project is as follows.

A network was built based on the 'co-contribution relationship' between repositories. This is like projecting a heterogeneous developer-repository network onto the repositories, so that each node in the resulting network is a repository.
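The projection idea can be sketched in a few lines. This is a minimal illustration, assuming that two repositories are linked when they share at least one contributor and that the edge weight is the number of shared contributors; the repository and developer names are hypothetical, and the project's own construction may differ in detail.

```python
from collections import defaultdict
from itertools import combinations


def project_to_repositories(contributors_by_repo):
    """Return {(repo_u, repo_v): number of shared contributors} for linked pairs."""
    # Invert the mapping: developer -> set of repositories they contributed to.
    repos_by_dev = defaultdict(set)
    for repo, devs in contributors_by_repo.items():
        for dev in devs:
            repos_by_dev[dev].add(repo)

    # Every pair of repositories sharing a developer gets one unit of weight.
    weights = defaultdict(int)
    for repos in repos_by_dev.values():
        for u, v in combinations(sorted(repos), 2):
            weights[(u, v)] += 1
    return dict(weights)


# Hypothetical contributor lists.
contributors_by_repo = {
    "org1/planner": ["alice", "bob"],
    "org2/perception": ["bob", "carol"],
    "org3/simulator": ["carol", "alice", "bob"],
    "org4/maps": ["dave"],
}
edges = project_to_repositories(contributors_by_repo)
```

Repositories with no shared contributors, such as `org4/maps` here, end up without any edge.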

Refer to the figure below for how to build the network.

Community detection (the Louvain method) was used to find communities in the network, each of which represents an independent research area within autonomous driving open source. In this study, six current major technical fields were derived.
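The Louvain method searches for a partition that maximizes modularity. The sketch below does not implement the Louvain search itself; it only computes the modularity score of a given partition for an undirected, unweighted graph without self-loops or duplicate edges, to show what "a good community structure" means numerically. The toy graph and names are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal_edges = defaultdict(int)  # edges with both ends in the community
    total_degree = defaultdict(int)    # sum of node degrees per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        total_degree[cu] += 1
        total_degree[cv] += 1
        if cu == cv:
            internal_edges[cu] += 1
    return sum(
        internal_edges[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )


# Two triangles joined by a single bridge edge (c, d).
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_split = modularity(toy_edges, two_groups)
q_single = modularity(toy_edges, one_group)
```

Splitting the toy graph into its two triangles gives a higher score than putting every node in one community, which is the kind of improvement a modularity-based method looks for.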

The figure below shows the six major autonomous driving open source technologies at present. Each node represents a repository. The figure shows the main technologies at the moment and the main repositories for each technology.

The figure below shows the result of running community detection again after link prediction. From this, promising future technologies for autonomous driving open source can be derived.

Dataset

Past studies on predicting promising technologies have mostly used paper data. However, paper data make it difficult to discover the latest research trends because of the time that passes between the research and its registration. I therefore use open source data to address this shortcoming and to predict promising technologies in a way that reflects the latest research trends.

The data used in this project consist of 385 repositories that include keywords related to 'autonomous driving'.

Each repository has basic information such as 'repository name', 'owner', and 'star counts' as well as data such as 'contributor list'.

Statistics

23,017 repositories contain related keywords such as 'self-driving car' or 'autonomous driving'.
3.2% of these repositories are owned by an 'organization' rather than a 'user'. This study deals only with organization-owned repositories.
385 repositories remained after filtering by 'contributor counts', 'stargazer counts' and 'forker counts'. These are the repositories used in the experiments.
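The filtering steps above can be sketched as follows. The threshold values, the record keys, and the owner type string `"Organization"` are assumptions made for illustration; the actual thresholds used in this study are not stated here.

```python
def select_repositories(records, min_contributors, min_stars, min_forks):
    """Keep organization-owned repositories that meet all three count thresholds."""
    selected = []
    for record in records:
        if record["owner_type"] != "Organization":
            continue  # drop user-owned repositories
        if (
            record["contributor_counts"] >= min_contributors
            and record["stargazer_counts"] >= min_stars
            and record["forker_counts"] >= min_forks
        ):
            selected.append(record)
    return selected


# Hypothetical records; the numbers are invented.
records = [
    {"name": "planner", "owner_type": "Organization",
     "contributor_counts": 40, "stargazer_counts": 900, "forker_counts": 120},
    {"name": "toy-car", "owner_type": "User",
     "contributor_counts": 55, "stargazer_counts": 2000, "forker_counts": 300},
    {"name": "tiny-lib", "owner_type": "Organization",
     "contributor_counts": 1, "stargazer_counts": 3, "forker_counts": 0},
]
# Hypothetical thresholds.
kept = select_repositories(records, min_contributors=5, min_stars=50, min_forks=10)
```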

Features

data	data type
repository name	str
repository ID	int
owner ID	int
owner type	str
repository full name	str
topics	list
contributors	list
contributor counts	int
stargazer counts	int
forker counts	int
created date	date
last updated date	date
readme	str
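One way to hold a row of this table in code is a small record type. This is only a sketch: the class name and attribute names are hypothetical choices that mirror the table, `dataclasses` requires Python 3.7 or later, and the project may store the data differently.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class RepositoryRecord:
    """One repository, with fields mirroring the feature table."""
    name: str
    repo_id: int
    owner_id: int
    owner_type: str
    full_name: str
    created_date: date
    last_updated_date: date
    readme: str = ""
    topics: List[str] = field(default_factory=list)
    contributors: List[str] = field(default_factory=list)
    contributor_counts: int = 0
    stargazer_counts: int = 0
    forker_counts: int = 0


# Hypothetical example values.
record = RepositoryRecord(
    name="planner",
    repo_id=1,
    owner_id=10,
    owner_type="Organization",
    full_name="org1/planner",
    created_date=date(2019, 1, 15),
    last_updated_date=date(2021, 6, 30),
    contributors=["alice", "bob"],
    contributor_counts=2,
)
```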

Software Requirements

python >= 3.5
pytorch >= 1.9
pytorch geometric >= 2.02 : Some methods are not supported in lower versions, so be sure to install this version or higher. In particular, the 'Train test edge split' method is not supported in earlier versions.
scikit-learn
numpy
pandas
scipy
gephi : A tool for network visualization. It is optional, but network visualization in this project was performed with gephi. See here for more details.
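For readers unfamiliar with the train/test edge split mentioned under the PyTorch Geometric requirement, the idea is to hide a fraction of the existing edges for evaluation and to pair them with an equal number of sampled non-edges as negative examples. The plain-Python sketch below only illustrates that idea; it is not PyTorch Geometric's implementation, and the function name, split ratio, and toy graph are assumptions.

```python
import random
from itertools import combinations


def split_edges(nodes, edges, test_ratio=0.2, seed=0):
    """Split edges into train/test positives and sample test negatives."""
    rng = random.Random(seed)
    # Normalize each undirected edge to a sorted tuple and drop duplicates.
    positives = sorted({tuple(sorted(e)) for e in edges})
    rng.shuffle(positives)
    n_test = max(1, int(len(positives) * test_ratio))
    test_pos, train_pos = positives[:n_test], positives[n_test:]

    # Negative examples are node pairs with no edge in the full graph.
    edge_set = set(positives)
    non_edges = [p for p in combinations(sorted(nodes), 2) if p not in edge_set]
    test_neg = rng.sample(non_edges, min(n_test, len(non_edges)))
    return train_pos, test_pos, test_neg


# Hypothetical toy graph.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
train_pos, test_pos, test_neg = split_edges(nodes, edges)
```

A model is then trained on `train_pos` only and evaluated on how well it separates `test_pos` from `test_neg`.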

Key files

link_prediction_GCN.py : Conducts link prediction using a graph convolutional network model. This model was not used for the final results because it performed worse than the other models.
link_prediction_GAE.py : The model used for the actual link prediction. It performed better than the GCN.
utils.py : Contains functions to build the network and visualize the results. Run this file if you want to check the degree or centrality of the network.
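As an illustration of the kind of degree and centrality check mentioned for utils.py, here is a minimal sketch of degree centrality on an undirected edge list without self-loops. It is not the code in utils.py; the function name and toy graph are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Degree of each node divided by (n - 1), for nodes that appear in edges."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


# Hypothetical star-shaped network: one hub repository linked to three others.
star_edges = [("hub", "r1"), ("hub", "r2"), ("hub", "r3")]
centrality = degree_centrality(star_edges)
```

Nodes with no edges do not appear in the edge list and are therefore absent from the result.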

