SOFTware-Viz


Developed with the software and tools below.

HTML5, Python

Screenshot from 2024-06-03 16-39-41

Presentation of the project

🛑 This application is currently designed to interact with and harvest metadata from HAL linked to the database.

🛑 A lighter version of the application is under development, allowing anyone to create their own application without requiring a connection to HAL.

DB of PDF

The process begins with a database of scholarly PDF files whose content needs to be extracted and processed.

GROBID

The PDFs are sent to GROBID, a tool used to extract structured data (like bibliographic information) from scholarly PDFs. GROBID processes the PDFs and outputs XML files. This is a crucial step in extracting machine-readable information from the documents.
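
To illustrate this step, here is a minimal sketch of how a single PDF could be posted to a GROBID server using only the Python standard library. It assumes a GROBID instance running locally on its default port (8070) and uses GROBID's `processFulltextDocument` endpoint with the `input` form field. The helper names (`build_grobid_request`, `pdf_to_xml`) are hypothetical and are not taken from this project's code.

```python
import uuid
import urllib.request

# Assumption: GROBID runs locally on its default port.
GROBID_URL = "http://localhost:8070/api/processFulltextDocument"


def build_grobid_request(pdf_bytes, filename="paper.pdf"):
    """Build a multipart/form-data POST carrying the PDF in the 'input' field."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="input"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        GROBID_URL,
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def pdf_to_xml(pdf_path):
    """Send one PDF to GROBID and return the XML it produces as text."""
    with open(pdf_path, "rb") as handle:
        request = build_grobid_request(handle.read())
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `pdf_to_xml("some_paper.pdf")` requires a reachable GROBID server; building the request is kept separate so it can be inspected without any network access.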

SOFTCITE

After GROBID, the documents are passed to SOFTCITE, which detects software mentions and related information (such as references) in the PDF files and generates JSON outputs.
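
As a rough illustration of working with this JSON, the sketch below counts how often each software name is mentioned in one document. The field names (`mentions`, `software-name`, `rawForm`, `normalizedForm`) are an assumption about the layout of the SOFTCITE output and should be checked against real files; the function name is hypothetical.

```python
import json
from collections import Counter


def count_software_mentions(softcite_json):
    """Count mentions per software name in one SOFTCITE JSON document."""
    data = json.loads(softcite_json)
    counts = Counter()
    for mention in data.get("mentions", []):
        name = mention.get("software-name", {})
        # Prefer the normalized spelling, fall back to the raw text.
        label = name.get("normalizedForm") or name.get("rawForm")
        if label:
            counts[label] += 1
    return counts


# Made-up sample data, for illustration only.
sample = json.dumps({"mentions": [
    {"software-name": {"rawForm": "SPSS", "normalizedForm": "SPSS"}},
    {"software-name": {"rawForm": "R", "normalizedForm": "R"}},
    {"software-name": {"rawForm": "SPSS"}},
]})
print(count_software_mentions(sample).most_common())
# prints [('SPSS', 2), ('R', 1)]
```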

SOFTware-Sync

The extracted data (XML and JSON) is then passed to SOFTware-Sync, a tool that merges both outputs into a single XML file.
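
The idea of merging both outputs can be sketched as follows. This is not the actual SOFTware-Sync logic, only a simplified illustration: it appends the software names found in the JSON to the XML tree. The element names `softwareMentions` and `software`, the JSON field names, and the function name are all hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def merge_into_single_xml(grobid_xml, softcite_json):
    """Append software names from the JSON to the XML and return one XML string."""
    root = ET.fromstring(grobid_xml)
    mentions = json.loads(softcite_json).get("mentions", [])
    # Hypothetical container element holding every detected mention.
    container = ET.SubElement(root, "softwareMentions")
    for mention in mentions:
        raw = mention.get("software-name", {}).get("rawForm")
        if raw:
            ET.SubElement(container, "software").text = raw
    return ET.tostring(root, encoding="unicode")


# Made-up inputs, for illustration only.
xml_in = "<TEI><text>We used SPSS.</text></TEI>"
json_in = '{"mentions": [{"software-name": {"rawForm": "SPSS"}}]}'
print(merge_into_single_xml(xml_in, json_in))
```

A real merge would more likely annotate each mention at its position in the text rather than list the names at the end of the document.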

SOFTware-Viz

SOFTware-Viz is responsible for visualizing the processed data. It takes the synchronized data from SOFTware-Sync and turns it into visual outputs or dashboards.

ArangoDB

The processed data is stored in ArangoDB, a multi-model NoSQL database. This database serves as the main storage for the extracted information and mentions.

Flask

Flask is a web framework used for developing web applications. Flask interacts with both SOFTware-Viz (for visualizations) and ArangoDB (for retrieving data).


Installation

From source

  1. Clone the repository:
     git clone ../
  2. Change to the project directory:
     cd ./SOFTware-viz
  3. Create a virtualenv:
     python -m venv env
  4. Pull the ArangoDB Docker image:
     docker pull arangodb/arangodb:3.11.6
  5. Launch the Docker container:
     docker run -p 8529:8529 -e ARANGO_NO_AUTH=1 arangodb/arangodb:3.11.6
  6. Create the database "SOF-viz":
     open http://localhost:8529/ in a browser and manually create a database named "SOF-viz"
  7. Activate the virtualenv:
     source env/bin/activate
  8. Install the dependencies:
     pip install -r requirement.txt
  9. Launch the app:
     python run.py
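
As an alternative to creating the "SOF-viz" database by hand in the web interface, the sketch below shows how the same result could be obtained through ArangoDB's HTTP API (`POST /_api/database`). It assumes the container from the `docker run` command above is running with authentication disabled on port 8529. The function names are hypothetical and this script is not part of the repository.

```python
import json
import urllib.request

ARANGO_URL = "http://localhost:8529"  # port published by the docker run command
DB_NAME = "SOF-viz"


def build_create_db_request(name=DB_NAME, base_url=ARANGO_URL):
    """Build the POST request that asks ArangoDB to create a database."""
    payload = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/_api/database",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_database(name=DB_NAME):
    """Send the request; needs a running ArangoDB container without auth."""
    with urllib.request.urlopen(build_create_db_request(name)) as response:
        return json.loads(response.read())

# Usage, once the container is up:
#   create_database()
```

If the database already exists, ArangoDB rejects the request and `urlopen` raises an `HTTPError`, so the call is best run once, before the first launch of the app.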

Usage

From source

Run the app with the command below (the database is created automatically, on the first launch only):

python run.py

License

This project is protected under the SELECT-A-LICENSE License. For more details, refer to the LICENSE file.