jordanell edited this page Jul 3, 2012 · 14 revisions

Welcome to the scm2pgsql wiki!

scm2pgsql is a tool developed by the Eggnet team in the S.E.G.A.L lab at the University of Victoria. It takes a Git repository (tools are available to convert other major SCMs to Git) and dumps all of its commit, revision, file, and related data into a PostgreSQL database for further analysis.


Quick User Guide

To use scm2pgsql, you must have a local Git repository. If your project is kept in another major SCM rather than Git, please follow one of the conversion guides below first.

Convert SVN to Git
Convert CVS to Git
Convert Mercurial to Git

Once you have a local Git repository, you need a PostgreSQL database to dump the information into. You can install PostgreSQL on your local machine or set up a server for remote access. PostgreSQL can be downloaded from the official PostgreSQL website.

scm2pgsql assumes that https://github.com/eggnet/libs is checked out in a sibling directory of the scm2pgsql project.

To compile from the top folder: javac -cp .:libs/*:libs/db:../libs/differ/*.jar:../libs/database/*.jar -sourcepath src src/scm2pgsql/Main.java

Running from the top folder: java -cp .:libs/*:libs/db/*:../libs/database/*:../libs/differ/*:src/ scm2pgsql/Main [path to git repo]

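If the compiler or the JVM cannot find classes from these JARs, you might need to list the JAR files explicitly. Java expands a classpath entry that ends in a bare * to every JAR in that directory, but it does not expand an entry written as *.jar.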
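
If the wildcard entries in the commands above are not picked up, the classpath can be written out by hand, one JAR at a time. The following is a minimal sketch of a helper that produces such a string. ClasspathBuilder and its methods are hypothetical names and are not part of scm2pgsql. The directory list in main mirrors the run command above and assumes the helper is started from the top folder of the project.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of scm2pgsql: builds an explicit classpath
// string by listing every .jar file found directly inside the given directories.
final class ClasspathBuilder {

    // Returns the .jar files directly inside dir, sorted for a stable order.
    // A directory that does not exist contributes no entries.
    static List<String> jarsIn(Path dir) throws IOException {
        List<String> jars = new ArrayList<>();
        if (!Files.isDirectory(dir)) {
            return jars;
        }
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.jar")) {
            for (Path jar : stream) {
                jars.add(jar.toString());
            }
        }
        Collections.sort(jars);
        return jars;
    }

    // Joins the fixed entries (for example "." and "src") with the JARs
    // found in each directory, in the order the directories are given.
    static String buildClasspath(String separator,
                                 List<String> fixedEntries,
                                 List<Path> jarDirs) throws IOException {
        List<String> entries = new ArrayList<>(fixedEntries);
        for (Path dir : jarDirs) {
            entries.addAll(jarsIn(dir));
        }
        return String.join(separator, entries);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: these directories match the layout used in the run command.
        List<Path> jarDirs = Arrays.asList(
                Paths.get("libs"),
                Paths.get("libs", "db"),
                Paths.get("..", "libs", "database"),
                Paths.get("..", "libs", "differ"));
        String classpath = buildClasspath(
                File.pathSeparator, Arrays.asList(".", "src"), jarDirs);
        System.out.println(classpath);
    }
}
```

The printed string can then be passed to javac or java after -cp in place of the wildcard entries.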

How It Works

scm2pgsql accepts a single argument: the path to a local Git repository. Once started, it traverses the entire repository and dumps all relevant information into the database schema. It uses JGit to traverse the tree and obtain commit data, and eggnet's differ project to obtain the diffs between two commits.
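
Because the tool expects exactly one argument, an entry point could validate that argument before any traversal starts. The sketch below is not the project's actual Main class. RepoArgument is a hypothetical name, and the repository check is a simple heuristic (a .git entry for a normal repository, or a HEAD file plus an objects directory for a bare one) rather than the check JGit performs.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch, not taken from scm2pgsql: validates the single
// command-line argument before any repository traversal begins.
final class RepoArgument {

    // Heuristic only: a non-bare repository has a ".git" entry, while a
    // bare repository keeps "HEAD" and "objects" at its top level.
    static boolean looksLikeGitRepository(Path dir) {
        boolean nonBare = Files.exists(dir.resolve(".git"));
        boolean bare = Files.isRegularFile(dir.resolve("HEAD"))
                && Files.isDirectory(dir.resolve("objects"));
        return nonBare || bare;
    }

    // Returns the normalized absolute repository path, or throws
    // IllegalArgumentException with a message describing the problem.
    static Path parse(String[] args) {
        if (args.length != 1) {
            throw new IllegalArgumentException(
                    "Expected exactly one argument: [path to git repo]");
        }
        Path repo = Paths.get(args[0]).toAbsolutePath().normalize();
        if (!Files.isDirectory(repo)) {
            throw new IllegalArgumentException("Not a directory: " + repo);
        }
        if (!looksLikeGitRepository(repo)) {
            throw new IllegalArgumentException("Not a Git repository: " + repo);
        }
        return repo;
    }

    public static void main(String[] args) {
        try {
            Path repo = parse(args);
            System.out.println("Repository path accepted: " + repo);
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
}
```

Failing early on a bad path avoids opening a database connection for a run that cannot succeed.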

Result

The result of scm2pgsql is, in effect, a database version of a Git repository. The goal of this project is to supply a database to the Call Graph Analyzer program so that technical networks can be generated. This intermediate conversion speeds up the creation of technical networks by allowing data mining that is not easily accomplished against a standard Git repository.

Because the technical networks are generated only by the Call Graph Analyzer program, the networks, edges, and nodes tables are left empty after scm2pgsql completes a run.

The database created by scm2pgsql is also modified when Fix Inducing Changes is run, which adds an additional table.
