readme and start debuging cron jobs
pjc09h committed Sep 25, 2023
1 parent 826cf46 commit 48237ae
Showing 5 changed files with 81 additions and 130 deletions.
160 changes: 46 additions & 114 deletions README.md
@@ -1,15 +1,9 @@
# 🧪 BioGazelle

This software is twice removed from the original
[What.cd Gazelle](https://github.com/WhatCD/Gazelle).
It's based on the security hardened PHP7 fork
[Oppaitime Gazelle](https://github.com/biotorrents/oppaiMirror).
It shares several features with
[Orpheus Gazelle](https://github.com/OPSnet/Gazelle)
and incorporates certain innovations by
[AnimeBytes](https://github.com/anniemaybytes).
The goal is to organize a functional database with pleasant interfaces,
and render insightful views using data from robust external sources.
This software is twice removed from the original [What.cd Gazelle](https://github.com/WhatCD/Gazelle).
It's based on the security hardened PHP7 fork [Oppaitime Gazelle](https://github.com/biotorrents/oppaiMirror).
It shares several features with [Orpheus Gazelle](https://github.com/OPSnet/Gazelle) and incorporates certain innovations by [AnimeBytes](https://github.com/anniemaybytes).
The goal is to organize a functional database with pleasant interfaces, and render insightful views using data from robust external sources.

# Changelog: Bio ← OT

@@ -20,150 +14,99 @@ The points are presented in no particular order.
## Built to scale, micro or macro

BioGazelle is pretty fast out of the box, on a single budget VPS.
If you want to scale horizontally, the software supports both
[Redis clusters](app/Cache.php) and
[database server replication](app/Database.php).
If you want to scale horizontally, the software supports both [Redis clusters](app/Cache.php) and [database server replication](app/Database.php).
Please note that Redis clusters expect at least three nodes.
This lower limit is inherent to Redis'
[cluster implementation](https://redis.io/docs/management/scaling/).
This lower limit is inherent to Redis' [cluster implementation](https://redis.io/docs/management/scaling/).

### Universal database IDs

BioGazelle is in the process of migrating to
[UUID v7 primary keys](https://uuid.ramsey.dev/en/stable/rfc4122/version7.html)
to enable useful content-agnostic operations such as tagging and AI integration.
BioGazelle is in the process of migrating to [UUID v7 primary keys](https://uuid.ramsey.dev/en/stable/rfc4122/version7.html) to enable useful content-agnostic operations such as tagging and AI integration.
This will consolidate the database and allow for powerful cross-object association.
The UUIDs are stored as binary strings for index speed and to minimize disk usage.
By the way, *all* binary data is transparently converted by the
[database wrapper](app/Database.php).
By the way, *all* binary data is transparently converted by the [database wrapper](app/Database.php).
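
To make the binary storage idea concrete, here is a minimal sketch of packing a UUID string into 16 raw bytes and back. The function names `uuidToBinary` and `binaryToUuid` are hypothetical and the sample UUID is made up; this is not the actual code in the database wrapper, just one way such a conversion could look.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not BioGazelle's wrapper code):
 * pack a canonical UUID string into 16 raw bytes for a BINARY(16) column.
 */
function uuidToBinary(string $uuid): string
{
    $hex = str_replace("-", "", strtolower($uuid));

    if (strlen($hex) !== 32 || !ctype_xdigit($hex)) {
        throw new InvalidArgumentException("not a valid uuid: {$uuid}");
    }

    return hex2bin($hex);
}

/**
 * Hypothetical helper: turn 16 raw bytes back into the 8-4-4-4-12 string form.
 */
function binaryToUuid(string $binary): string
{
    if (strlen($binary) !== 16) {
        throw new InvalidArgumentException("expected exactly 16 bytes");
    }

    $hex = bin2hex($binary);

    return sprintf(
        "%s-%s-%s-%s-%s",
        substr($hex, 0, 8),
        substr($hex, 8, 4),
        substr($hex, 12, 4),
        substr($hex, 16, 4),
        substr($hex, 20, 12)
    );
}

# made-up example value in UUID v7 layout
$uuid = "018b2f19-e79e-7d6a-a56d-29feb6211b04";
$binary = uuidToBinary($uuid);
```

The binary form is 16 bytes versus 36 for the string, which is where the index speed and disk savings come from.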

## Full stack search engine rewrite

Data indexing is important, so BioGazelle has upgraded to
[Manticore Search](https://manticoresearch.com),
the successor to Sphinx.
This upgrade also involved a
[rewrite of the search configuration](utilities/config/manticore.conf)
from scratch, based on AnimeBytes' example.
The Gazelle frontend itself uses a
[rewritten browse.php controller](sections/torrents/browse.php) and a
[brand new Twig template](templates/torrents/search.twig).
Oh yeah, the
[PHP backend class](app/Manticore.php)
is also completely rewritten, replacing at least four legacy classes.
Data indexing is important, so BioGazelle has upgraded to [Manticore Search](https://manticoresearch.com), the successor to Sphinx.
This upgrade also involved a [rewrite of the search configuration](utilities/config/manticore.conf) from scratch, based on AnimeBytes' example.
The Gazelle frontend itself uses a [rewritten browse.php controller](sections/torrents/browse.php) and a [brand new Twig template](templates/torrents/search.twig).
Oh yeah, the [PHP backend class](app/Manticore.php) is also completely rewritten, replacing at least four legacy classes.

## Secure authentication system

The user handling, including registration, logins, etc.,
has been rewritten into a unified system in the
[Auth class](app/Auth.php).
The user handling, including registration, logins, etc., has been rewritten into a unified system in the [Auth class](app/Auth.php).
The system acts as an oracle that takes inputs and returns messages.
Passphrase hashing is all done with `PASSWORD_DEFAULT`, ready for Argon2id.

I tested this extensively and determined that prehashing passphrases was no good.
Not only is it impossible to upgrade the algorithm, e.g., from `sha256` to `sha3-512`,
but prehashing also lowers the total entropy of long strings even if binary is used throughout.
Not only is it impossible to upgrade the algorithm, e.g., from `sha256` to `sha3-512`, but prehashing also lowers the total entropy of long strings even if binary is used throughout.
Test it yourself with 72 bytes of random binary data (the `bcrypt` max) and an entropy calculator.

BioGazelle enforces a 15-character minimum passphrase length and imposes no other limitations.
This is consistent with the list of
[OWASP best practices](https://cheatsheetseries.owasp.org/cheatsheets/Authentication_Cheat_Sheet.html).
This is consistent with the list of [OWASP best practices](https://cheatsheetseries.owasp.org/cheatsheets/Authentication_Cheat_Sheet.html).
In fact, the whole class is informed by this document.
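
The hashing approach described in this section can be sketched with PHP's built-in password functions. This is a rough illustration under stated assumptions, not the Auth class itself: the function names `hashPassphrase` and `verifyPassphrase` and the constant name are hypothetical, and only the 15-character minimum and `PASSWORD_DEFAULT` come from the text above.

```php
<?php

declare(strict_types=1);

# the 15-character minimum comes from the README; the constant name is made up
const MINIMUM_PASSPHRASE_LENGTH = 15;

/**
 * Hypothetical helper: hash a passphrase directly, with no prehashing.
 */
function hashPassphrase(string $passphrase): string
{
    # count UTF-8 characters, not bytes, using the always-available PCRE
    $length = preg_match_all("/./su", $passphrase);

    if ($length === false || $length < MINIMUM_PASSPHRASE_LENGTH) {
        throw new InvalidArgumentException("passphrase too short or not valid UTF-8");
    }

    return password_hash($passphrase, PASSWORD_DEFAULT);
}

/**
 * Hypothetical helper: verify a passphrase and report whether the stored
 * hash should be replaced because PASSWORD_DEFAULT has moved on.
 */
function verifyPassphrase(string $passphrase, string $storedHash): array
{
    if (!password_verify($passphrase, $storedHash)) {
        return ["ok" => false, "newHash" => null];
    }

    $newHash = password_needs_rehash($storedHash, PASSWORD_DEFAULT)
        ? password_hash($passphrase, PASSWORD_DEFAULT)
        : null;

    return ["ok" => true, "newHash" => $newHash];
}

$hash = hashPassphrase("correct horse battery staple");
```

Because the raw passphrase goes straight into `password_hash`, the rehash step on login is what keeps an algorithm upgrade possible later.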

### Bearer token authorization

[Read the API documentation.](https://docs.torrents.bio)
API tokens can be generated in the
[user security settings](templates/user/settings/settings.twig)
and used with the JSON API.
[Internal API calls](app/API/Internal.php)
for Ajax and such use a special token that can safely be exposed to the frontend.
It's based on hashing a
[rotating server secret](utilities/crontab/siteApiSecret.php)
concatenated with a secure session cookie.
API tokens can be generated in the [user security settings](templates/user/settings/settings.twig) and used with the JSON API.
[Internal API calls](app/Api/Internal.php) for Ajax and such use a special token that can safely be exposed to the frontend.
It's based on hashing a [rotating server secret](utilities/crontab/hourly/siteApiSecret.php) concatenated with a secure session cookie.
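
As a rough sketch of the idea, assuming `sha3-512` as the hash and plain concatenation as described, a frontend-safe token could be derived and checked like this. The function names are hypothetical and the real implementation may differ in the algorithm and inputs.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: derive a token from the rotating server secret
 * and the user's session identifier. The sha3-512 choice is an assumption.
 */
function internalApiToken(string $serverSecret, string $sessionId): string
{
    return hash("sha3-512", $serverSecret . $sessionId);
}

/**
 * Hypothetical helper: recompute the token server-side and compare
 * in constant time.
 */
function verifyInternalApiToken(string $token, string $serverSecret, string $sessionId): bool
{
    return hash_equals(internalApiToken($serverSecret, $sessionId), $token);
}

# stand-ins for the cron-rotated secret and the session cookie value
$serverSecret = random_bytes(32);
$sessionId = bin2hex(random_bytes(32));
$token = internalApiToken($serverSecret, $sessionId);
```

Once the secret rotates, every previously issued token stops verifying, so a leaked token has a short shelf life.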

The session cookies themselves are tight, btw.
No JavaScript access, scoped to the same site, long length, etc.
This kind of stuff is in the
[low level Http class](app/Http.php).
This kind of stuff is in the [low level Http class](app/Http.php).
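
For illustration, cookie attributes matching that description might be assembled as below. This is only a sketch: the helper names, the 64-byte identifier size, and `SameSite=Strict` are assumptions, not values taken from the Http class.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: a long random session identifier.
 * 64 random bytes is an assumed size, hex-encoded to 128 characters.
 */
function newSessionId(int $bytes = 64): string
{
    return bin2hex(random_bytes($bytes));
}

/**
 * Hypothetical helper: options for setcookie()'s array signature (PHP 7.3+).
 */
function sessionCookieOptions(int $lifetimeSeconds): array
{
    return [
        "expires" => time() + $lifetimeSeconds,
        "path" => "/",
        "secure" => true,       # HTTPS only
        "httponly" => true,     # no JavaScript access
        "samesite" => "Strict", # assumed; scoped to the same site
    ];
}

$sessionId = newSessionId();
$options = sessionCookieOptions(86400);

# in a web request, before any output is sent:
# setcookie("session", $sessionId, $options);
```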

### WebAuthn security tokens

BioGazelle has always supported hardware keys thanks to Oppaitime.
But we took it up a notch by upgrading this system to use the
[modern WebAuthn standard](app/WebAuthn/Base.php)
instead of the deprecated FIDO U2F standard.
[This specification](https://webauthn.guide)
is well supported in all major browsers, and it doesn't require a $50 dongle:
But we took it up a notch by upgrading this system to use the [modern WebAuthn standard](app/WebAuthn/Base.php) instead of the deprecated FIDO U2F standard.
[This specification](https://webauthn.guide) is well supported in all major browsers, and it doesn't require a $50 dongle:
use a hardware key, a smartphone fingerprint or QR code reader, or just generate a key in the browser.
The underlying library is the canonical
[web-auth/webauthn-lib](https://github.com/web-auth/webauthn-lib).
The underlying library is the canonical [web-auth/webauthn-lib](https://github.com/web-auth/webauthn-lib).

## OpenAI integration

One of BioGazelle's goals is to place data in context using
[OpenAI's completions API](app/OpenAI.php)
to generate tl;dr summaries and tags from content descriptions.
Just paste your abstract into the torrent group description
and get a succinct natural language summary with tags.
One of BioGazelle's goals is to place data in context using [OpenAI's completions API](app/OpenAI.php) to generate tl;dr summaries and tags from content descriptions.
Just paste your abstract into the torrent group description and get a succinct natural language summary with tags.
It's possible to disable AI content display in the user settings.

## Twig template system

[BioGazelle's Twig interface](app/Twig.php)
takes cues from OPS's extended filters and functions.
Twig provides a security benefit by escaping rendered output,
and a secondary benefit of clarifying the PHP running the site sections.
[BioGazelle's Twig interface](app/Twig.php) takes cues from OPS's extended filters and functions.
Twig provides a security benefit by escaping rendered output, and a secondary benefit of clarifying the PHP running the site sections.
Everything you could need is a globally available template variable.

A quick note about template inheritance.
Everything extends a clean HTML5 base template.
Torrent, collections, requests, etc., and their respective sidebars
are implemented as semantic HTML5 in easily digestible chunks of content.
No more mixed PHP code and HTML markup!
A quick note about template inheritance: everything extends a clean HTML5 base template.
Torrent, collections, requests, etc., and their respective sidebars are implemented as semantic HTML5 in easily digestible chunks of content.
No more mixed PHP code and HTML markup, at least in new development!

### Markdown and BBcode support

BioGazelle uses the
[SimpleMDE markdown editor](https://simplemde.com)
with a reasonably extended
[custom editor interface](templates/_base/textarea.twig).
All the Markdown Extra features supported by
[Parsedown Extra](https://github.com/erusev/parsedown-extra)
are documented and the useful ones are exposed in the editor.
The default recursive regex BBcode parser (yuck) is replaced by
[Vanilla NBBC](https://github.com/vanilla/nbbc).
BioGazelle uses the [SimpleMDE markdown editor](https://simplemde.com) with a reasonably extended [custom editor interface](templates/_base/textarea.twig).
All the Markdown Extra features supported by [Parsedown Extra](https://github.com/erusev/parsedown-extra) are documented and the useful ones are exposed in the editor.
The default recursive regex BBcode parser (yuck) is replaced by [Vanilla NBBC](https://github.com/vanilla/nbbc).
Parsed texts are cached for speed, using both Redis and the Twig disk cache.

### Good typography

BioGazelle supports an array of
[unobtrusive fonts](resources/scss/assets/fonts.scss)
with the appropriate glyphs for bold, italic, and monospace.
BioGazelle supports an array of [unobtrusive fonts](resources/scss/assets/fonts.scss) with the appropriate glyphs for bold, italic, and monospace.
These options are available to every theme.
Font Awesome 5 is also universally available, as is the
[entire Material Design color palette](resources/scss/assets/colors.scss).
Font Awesome 5 is also universally available, as is the [entire Material Design color palette](resources/scss/assets/colors.scss).
[Download the fonts to get started.](https://torrents.bio/fonts.tgz)
Also, there are two simple color modes,
[calm mode and dark mode](resources/scss/global/colors.scss),
that I like to think are pleasing to the eye.
Also, there are two simple color modes, [calm mode and dark mode](resources/scss/global/colors.scss), that I like to think are pleasing to the eye.

## Active data minimization

BioGazelle has
[real lawyer-vetted policies](templates/siteText/legal).
In the process of matching the tech to the legal word,
I dropped support for a number of compromising features:
BioGazelle has [real lawyer-vetted policies](templates/siteText/legal).
In the process of matching the tech to the legal word, I dropped support for a number of compromising features:

- Bitcoin, PayPal, and currency exchange API and system calls;
- Bitcoin addresses, user donation history, and similar metadata; and
- IP address and geolocation, email address, passphrase, and passkey history.

The software license is also updated to use
[OpenBSD's license template](https://www.openbsd.org/policy.html)
instead of the questionable Unlicense that may be illegal in Germany and elsewhere.
The software license is also updated to [OpenBSD's license template](https://www.openbsd.org/policy.html) instead of the potentially unenforceable Unlicense.
We seek to make our original code available to all with as few restrictions as possible.
No fuss, no muss.
Besides that, BioGazelle has several passive developments in progress:

- prepare all queries with parameterized statements;
@@ -174,37 +117,26 @@ Besides that, BioGazelle has several passive developments in progress:

## Proper application layout

BioGazelle takes cues from the best-of-breed PHP framework Laravel.
BioGazelle takes cues from the best-of-breed PHP framework Laravel, to a carefully measured extent.
The source code is reorganized along Laravel's lines while maintaining the comfy familiarity of OT/WCD Gazelle.
The app logic, config, and Git repo lies outside the web root for enhanced security.
The app logic, config, cron jobs, Git repo, etc., lie outside the web root for better security.

BioGazelle uses the Flight router to define app routes.
Features include clean URIs and centralized middleware.
An ongoing project involves modernizing the app based on Laravel's excellent tools,
with help from other personally-vetted libraries that may be lighter.
BioGazelle uses the Flight router to define app routes, implementing clean URIs and centralized middleware.
An ongoing project involves modernizing the app based on Laravel's tools, with help from lighter personally-vetted libraries.

### App singleton

[The main site configuration](config/public.php)
uses recursive
[Laravel Collections](https://laravel.com/docs/master/collections)
with the
[ENV special class](app/ENV.php).
Also, the whole app is always instantly available:
the config, database, cache, current user, Twig engine, etc.,
are accessible with a simple call to `Gazelle\App::go()`.
All such objects use the same quick and easy go → factory → thing API.
Just in case you need to extend some core object without headaches.
[The main site configuration](config/public.php) implements recursive [Laravel Collections](https://laravel.com/docs/master/collections) with the [ENV special class](app/ENV.php).
Also, the whole app is always instantly available: the config, database, cache, current user, Twig engine, etc., are accessible with a simple call to `Gazelle\App::go()`.
All such objects use the same quick and easy go → factory → thing API, just in case you need to extend some core object without headaches.
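
The go → factory → thing pattern can be sketched roughly as follows. The `go()` and `factory()` signatures mirror the ones in `app/ENV.php`, but the `Registry` class and its `$things` property are hypothetical stand-ins, not a copy of any core object.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical singleton following the go → factory → thing shape.
 */
final class Registry
{
    private static ?self $instance = null;

    # the "thing": whatever the factory builds
    public array $things = [];

    # no direct construction or cloning
    private function __construct()
    {
    }

    private function __clone()
    {
    }

    /**
     * go: return the one shared instance, building it on first use
     */
    public static function go(array $options = []): self
    {
        if (self::$instance === null) {
            self::$instance = new self();
            self::$instance->factory($options);
        }

        return self::$instance;
    }

    /**
     * factory: do the actual setup work exactly once
     */
    private function factory(array $options = []): void
    {
        $this->things = $options;
    }
}

$first = Registry::go(["site" => "example"]);
$second = Registry::go();
```

Every later call to `go()` returns the same object, so the options passed on the first call are the ones that stick.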

### Decent debugging

BioGazelle seeks to be easy and fun to develop.
I collected the old debug class monstrosity into a nice little bar.
There's also no more `DEBUG_MODE` or random permissions.
There's just a development mode that spits everything out, and a production mode that doesn't.
There's also no more `DEBUG_MODE` or random permissions, just a simple toggle on `$app->env->dev`.

The entire app is also available on the command line for cron jobs, development, and fun.
Good for BioGazelle, good for America!
Just run `php shell` from the repository root to get up and running.
This is based on Laravel Tinker and in fact uses the same REPL under the hood.

18 changes: 14 additions & 4 deletions app/Database.php
@@ -10,6 +10,8 @@
*
* @see https://phpdelusions.net/pdo/pdo_wrapper
* @see https://github.com/DoctorMcKay/php-mypdoms
*
* todo: consider updating this to use RecursiveCollection instances
*/

namespace Gazelle;
Expand Down Expand Up @@ -68,6 +70,9 @@ public function __wakeup()
}


/** */


/**
* go
*
@@ -319,8 +324,11 @@ public function slug(string $string): string
*
* Determine the identifier to use for a query.
* Used for finding stuff by id, uuid, or slug.
*
* @param int|string $id
* @return string
*/
public function determineIdentifier(int|string $id)
public function determineIdentifier(int|string $id): string
{
$app = \Gazelle\App::go();

@@ -351,7 +359,7 @@ public function determineIdentifier(int|string $id)
* @param array $row single database row
* @return array translated row
*/
private function translateBinary(array $row)
private function translateBinary(array $row): array
{
# uuid v7
$row["uuid"] ??= null;
@@ -469,8 +477,9 @@ public function do(string $query, array $arguments = [], ?string $hostname = nul
* @param string $query
* @param array $arguments
* @param ?string $hostname
* @return mixed
*/
public function single(string $query, array $arguments = [], ?string $hostname = null)
public function single(string $query, array $arguments = [], ?string $hostname = null): mixed
{
$app = \Gazelle\App::go();

@@ -502,8 +511,9 @@ public function single(string $query, array $arguments = [], ?string $hostname =
* @param string $query
* @param array $arguments
* @param ?string $hostname
* @return ?array
*/
public function row(string $query, array $arguments = [], ?string $hostname = null)
public function row(string $query, array $arguments = [], ?string $hostname = null): ?array
{
$app = \Gazelle\App::go();

10 changes: 5 additions & 5 deletions app/ENV.php
@@ -31,8 +31,8 @@ class ENV
private static $instance = null;

# config option receptacles
public RecursiveCollection $public; # site meta, options, resources, etc.
private RecursiveCollection $private; # passwords, app keys, database, etc.
public Gazelle\RecursiveCollection $public; # site meta, options, resources, etc.
private Gazelle\RecursiveCollection $private; # passwords, app keys, database, etc.


/**
@@ -164,8 +164,8 @@ public static function go(array $options = []): self
*/
private function factory(array $options = []): void
{
$this->public = new RecursiveCollection();
$this->private = new RecursiveCollection();
$this->public = new Gazelle\RecursiveCollection();
$this->private = new Gazelle\RecursiveCollection();
}


@@ -231,7 +231,7 @@ public function toArray(mixed $object): mixed
public function toObject(mixed $array): mixed
{
if (is_iterable($array)) {
$return = new RecursiveCollection($array);
$return = new Gazelle\RecursiveCollection($array);

foreach ($return as &$item) {
$item = $this->toObject($item);
17 changes: 10 additions & 7 deletions app/RecursiveCollection.php
@@ -15,7 +15,9 @@
* @see https://github.com/etconsilium/php-recursive-array-object
*/

class RecursiveCollection extends Illuminate\Support\Collection
namespace Gazelle;

class RecursiveCollection extends \Illuminate\Support\Collection
{
/**
* __construct
@@ -28,12 +30,13 @@ public function __construct(mixed $input = [])
{
/*
# https://laravel.com/docs/master/collections#extending-collections
self::make($this->macros())
->reject(fn ($class, $macro) => self::hasMacro($macro))
->each(fn ($class, $macro) => self::macro($macro, $class()));
# second try, maybe this works
foreach ($this->macros() as $macro => $class) {
self::macro($macro, function () use ($class) {
return $this->map(function ($value) use ($class) {
return new self(new $class($value));
});
});
self::macro($macro, $class());
}
*/

@@ -53,7 +56,7 @@ public function __construct(mixed $input = [])
*/
public function __get(mixed $key): mixed
{
return $this->get($key) ?? null;
return $this->get($key);
}

