This is a PHP API client/connector for the Software Heritage (SWH) web API, currently in beta. The client wraps the Illuminate Http package and the GuzzleHTTP library.
Note
Detailed documentation can be found in the wiki pages of this repository.
A demo version (with a subset of features) is available here: Demo Version
Contributions of new features and fixes are welcome. Please feel free to report any issues.
1) Clone this project.
2) Open a console session and navigate to the cloned directory:
Run "composer install"
This also installs the PHP REPL, PsySH.
3) (Optional) Acquire SWH tokens for higher SWH API rate limits.
4) Prepare .env file and add tokens:
4.1) Rename/Copy the cloned ".env.example" file to .env
cp .env.example .env
4.2) (Optional) Edit these two token keys:
SWH_TOKEN_PROD=Your_TOKEN_FROM_SWH_ACCOUNT # step 3)
SWH_TOKEN_STAGING=Your_STAGING_TOKEN_FROM_SWH_ACCOUNT # step 3)
5) (Optional) Add psysh to PATH.
In a console session inside the cloned directory, start the PHP REPL:
$ psysh # if not added to PATH, replace with: vendor/bin/psysh
Psy Shell v0.12.0 (PHP 8.2.0 — cli) by Justin Hileman
This opens a console-based REPL session in which you can try out the API classes and their methods before building your own workflows and use cases.
As a one-time configuration, you can set the data type in which SWH responses are returned (default: JSON):
> namespace Module\HTTPConnector;
> use Module\HTTPConnector;
> HTTPClient::setOptions(responseType:'object') // json/collect/object available
- More details on the default configs: Default Configurations
- More details on further preset options: Preset Configurations.
Retrieve the latest full visit from the SWH archive:
> namespace Module\OriginVisits;
> use Module\OriginVisits;
> $visitObject = new SwhVisits('https://github.com/torvalds/linux/');
> $visitObject->getVisit('latest', requireSnapshot: true)
More details on further SWH visit methods: SwhVisits.
Treat SWH objects as graph nodes: retrieve a node's contents or edges, or find a path to other nodes (top-down):
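For orientation, the request behind a "latest visit" lookup can be sketched without the client. This is a minimal sketch, not the SwhVisits implementation: `latestVisitUrl` is a hypothetical helper, and it assumes the public SWH endpoint `/api/1/origin/<origin_url>/visit/latest/` with its `require_snapshot` query parameter, with the origin URL placed unencoded in the path.

```php
<?php
declare(strict_types=1);

/**
 * Build the SWH web API URL for the latest visit of an origin.
 * Hypothetical helper for illustration; not part of this client.
 */
function latestVisitUrl(string $originUrl, bool $requireSnapshot = false): string
{
    // Assumption: production API base URL of the SWH archive.
    $base = 'https://archive.softwareheritage.org/api/1';
    $url = $base . '/origin/' . $originUrl . '/visit/latest/';
    if ($requireSnapshot) {
        // Only ask for visits that produced a snapshot.
        $url .= '?' . http_build_query(['require_snapshot' => 'true']);
    }
    return $url;
}

echo latestVisitUrl('https://github.com/hylang/hy', true), PHP_EOL;
```

The resulting URL could then be fetched with any HTTP client; the sketch only shows how the `requireSnapshot` flag plausibly maps to a query parameter.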
> namespace Module\DAGModel;
> use Module\DAGModel;
> $snpNode = new GraphNode('swh:1:snp:bcfd516ef0e188d20056c77b8577577ac3ca6e58')
> $snpNode->nodeHopp() // node contents
> $snpNode->nodeEdges() // node edges keyed by the respective name
> $revNode = new GraphNode('swh:1:rev:9cf5bf02b583b93aa0d149cac1aa06ee4a4f655c')
> $revNode->nodeTraversal('deps/nghttp2/lib/includes/nghttp2/nghttp2ver.h.in') // traverse to a deeply nested file
More details on:
- General Node Methods.
- The Graph methods.
You can specify a repository URL, with or without a path, and archive it to SWH using one of two variants (static or non-static method):
> namespace Module\Archival;
> use Module\Archival;
> $saveRequest = new Archive('https://github.com/torvalds/linux/') // Example 1
> $saveRequest->save2Swh()
> $newSaveRequest = Archive::repository('https://github.com/hylang/hy/tree/stable/hy/core') // Example 2
// in both cases: the returned POST response contains the save request id and date
Query the archival status using the id or date of the archival request (both are available in the initial POST response):
> $saveRequest->getArchivalStatus($saveRequestDateOrID) // current status is returned
> $saveRequest->trackArchivalStatus($saveRequestDateOrID) // tracks until archival has succeeded
More details on further archive methods: Archive.
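Tracking a save request until it completes is essentially a polling loop. The following is a rough sketch of that idea, not the code behind trackArchivalStatus: `pollUntilDone` and its callback are hypothetical, and the status strings `succeeded` and `failed` are assumed to be the terminal states reported by SWH.

```php
<?php
declare(strict_types=1);

/**
 * Call $fetchStatus repeatedly until it reports a terminal status
 * or the attempt budget is exhausted. Hypothetical helper.
 */
function pollUntilDone(callable $fetchStatus, int $maxAttempts = 10, int $delaySeconds = 0): string
{
    $status = 'unknown';
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $status = $fetchStatus();
        // Assumed terminal states of a save request.
        if (in_array($status, ['succeeded', 'failed'], true)) {
            return $status;
        }
        // Wait between attempts, but not after the last one.
        if ($delaySeconds > 0 && $attempt < $maxAttempts) {
            sleep($delaySeconds);
        }
    }
    return $status; // last observed, non-terminal status
}

// Stand-in for real status lookups: replays a fixed sequence.
$fakeStatuses = ['pending', 'running', 'succeeded'];
$fetch = function () use (&$fakeStatuses): string {
    return array_shift($fakeStatuses) ?? 'running';
};
echo pollUntilDone($fetch), PHP_EOL;
```

In real use the callback would wrap a status lookup such as getArchivalStatus, with a delay of several seconds between attempts to stay within the API rate limits.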
Validate a given SWHID. A TypeError is thrown for invalid SWHIDs.
> namespace Module\DataType;
> use Module\DataType;
> $snpID = new SwhcoreId('swh:1:snp:bcfd516ef0e188d20056c77b8577577ac3ca6e5Z') // throws TypeError
Full details on SWHID persistent identifiers: Syntax
Note
Todo: Core identifiers with qualifiers.
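To illustrate what such a check involves, here is a minimal sketch of core SWHID validation with a regular expression. It is not the SwhcoreId implementation: `isCoreSwhid` is a hypothetical helper that returns a boolean instead of throwing, and it covers core identifiers only (scheme version 1, one of the five object types, a 40-character lowercase hex hash), without qualifiers.

```php
<?php
declare(strict_types=1);

/**
 * Check whether a string is a syntactically valid core SWHID.
 * Hypothetical helper for illustration; qualifiers are not handled.
 */
function isCoreSwhid(string $id): bool
{
    // swh:1:<type>:<40 lowercase hex chars>; the D modifier rejects a trailing newline.
    return preg_match('/^swh:1:(cnt|dir|rev|rel|snp):[0-9a-f]{40}$/D', $id) === 1;
}

var_dump(isCoreSwhid('swh:1:snp:bcfd516ef0e188d20056c77b8577577ac3ca6e58')); // bool(true)
var_dump(isCoreSwhid('swh:1:snp:bcfd516ef0e188d20056c77b8577577ac3ca6e5Z')); // bool(false)
```

The second identifier fails because its final character, `Z`, is not a hexadecimal digit.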
Retrieve the list of metadata authorities that provided metadata on the given target:
> namespace Module\MetaData;
> use Module\MetaData;
> SwhMetaData::getOriginMetaData('https://github.com/torvalds/linux/')
More details on further metadata methods: Metadata.