SeoFriendlyURLs

bgreenwood edited this page Jul 4, 2012 · 3 revisions

In Release 8, ANDS moved from a very programmatic URL structure to more human-readable and informative URLs. This is best explained by example:

Pre-R8:   http://services.ands.org.au/home/orca/rda/view/?key=516811d7-cd55-207a-e0440003ba8c79dd
Release8: http://researchdata.ands.org.au/analysis-of-dugong-distribution-coastwatch-data

Whilst the domain name is independent of the ANDS software, the ability to use these human-readable and SEO-friendly "SLUGs" is powered by changes to the .htaccess (utilising Apache's mod_rewrite module) and a new "dispatcher" which overrides CodeIgniter's default core functionality for mapping URL requests to controller methods. The dispatcher receives all URL requests and checks whether a matching controller exists (using almost identical logic to CodeIgniter's core); if not, it attempts to resolve the URL using tbl_url_mappings in the ORCA database (see /rda/.../controllers/dispatcher.php). The dispatcher is assigned as the default CodeIgniter route in rda/.../config/routes.php using the catch-all statement $route['(:any)'] = 'dispatcher/$1'; which effectively forwards all web requests to the dispatcher controller.
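The lookup order described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the dispatcher's actual PHP code; the `CONTROLLERS` and `URL_MAPPINGS` tables, their contents, and the `dispatch` function are hypothetical stand-ins for the real controllers and tbl_url_mappings.

```python
from typing import Callable, Dict

# Hypothetical stand-ins: known controller names mapped to handlers, and a
# slug -> key_hash table playing the role of tbl_url_mappings.
CONTROLLERS: Dict[str, Callable[[str], str]] = {
    "search": lambda rest: f"search controller handling '{rest}'",
    "view": lambda rest: f"view controller handling '{rest}'",
}
URL_MAPPINGS: Dict[str, str] = {
    "analysis-of-dugong-distribution-coastwatch-data": "example-key-hash",
}


def dispatch(path: str) -> str:
    """Resolve a request path: existing controllers first, then slug mappings."""
    segments = [s for s in path.strip("/").split("/") if s]
    if not segments:
        return "home page"
    first, rest = segments[0], "/".join(segments[1:])
    # 1. An existing controller always wins over a slug of the same name.
    if first in CONTROLLERS:
        return CONTROLLERS[first](rest)
    # 2. Otherwise try to resolve the first segment as a slug.
    key_hash = URL_MAPPINGS.get(first)
    if key_hash is not None:
        return f"view record with key_hash {key_hash}"
    # 3. Nothing matched.
    return "not found"
```

Because the controller check runs first, a slug that collides with a controller name is never reached by the second step.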

Some implementation notes:

  • SLUGs are based on a scrubbed version of the record's display title. They are generated by the registry software and stored in the url_slug field in tbl_registry_objects
  • Only published records are allocated a SLUG
  • If a record changes title, all previous "histories" of the SLUG will still point to the record key (this prevents dead links)
  • If a record is deleted, the link generated from the SLUG will give a "soft 404" explaining that the record is no longer in the registry and suggesting search results with a similar title (see /rda/.../views/soft404.php)
  • As the mapping from keys to SLUGs moves from a higher-dimensional space to a lower-dimensional one (distinct keys can share a title), conflicts between SLUGs are possible. In this case:
    • Temporal precedence applies: the first record in the registry is allocated the SLUG
    • Any subsequent record is allocated the SLUG with a scrubbed version of its record key appended
    • In the highly unlikely event that this still doesn't produce a unique SLUG, dashes are appended until uniqueness is achieved
  • Records which have the same name (and therefore SLUG) as an RDA link (such as "view", "home", "search") cannot be viewed in RDA, because the dispatcher matches the existing controller first. Use of these names should be discouraged, as it serves no practical purpose.
  • SLUGs are capped at 255 characters of the scrubbed title
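The allocation rules in the notes above can be sketched as follows. This is an illustrative Python sketch, not the PHP generateSlug() implementation: the exact scrubbing rule (lower-casing and replacing runs of non-alphanumeric characters with a single hyphen) is an assumption inferred from the example URL, and `scrub`, `allocate_slug` and the `taken` dictionary are hypothetical names.

```python
import re
from typing import Dict

MAX_SLUG_LENGTH = 255  # cap on the scrubbed title, as stated in the notes


def scrub(text: str) -> str:
    """Assumed scrubbing rule: lower-case, non-alphanumeric runs become '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def allocate_slug(title: str, record_key: str, taken: Dict[str, str]) -> str:
    """Allocate a unique slug; `taken` maps slugs already in use to record keys."""
    base = scrub(title)[:MAX_SLUG_LENGTH]
    slug = base
    # Temporal precedence: whichever record claimed the slug first keeps it.
    if taken.get(slug, record_key) != record_key:
        # Later records get a scrubbed copy of their key appended.
        slug = f"{base}-{scrub(record_key)}"
    # Last resort: append dashes until the slug is unique.
    while taken.get(slug, record_key) != record_key:
        slug += "-"
    taken[slug] = record_key
    return slug
```

For example, two records titled "Analysis of Dugong Distribution: Coastwatch Data" would receive `analysis-of-dugong-distribution-coastwatch-data` and the same slug with the second record's scrubbed key appended. Whether the real code applies the 255-character cap before or after appending the key is not stated in the source; this sketch caps only the title portion.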

Implementing URL SLUGs

  • URL slugs will be automatically generated by Release 8+ when the "HOURLY_REGISTRY_MAINTENANCE" task runs for the first time
  • In order to configure Apache to rewrite all requests to the dispatcher, this .htaccess file should be included in the root directory of your web server:

.htaccess

<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteRule ^assets/(.*) rda/assets/$1 [L]
    RewriteRule ^css/(.*) rda/css/$1 [L]
    RewriteRule ^js/(.*) rda/js/$1 [L]
    RewriteRule ^img/(.*) rda/img/$1 [L]

    RewriteRule ^$ rda/index.php? [L]

    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ rda/index.php?/$1 [L]
</IfModule>
<IfModule !mod_rewrite.c>
    # If we don't have mod_rewrite installed, all 404's
    # can be sent to index.php, and everything works as normal.
    # Submitted by: ElliotHaughin

    ErrorDocument 404 /rda/index.php
</IfModule>
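To make the effect of the rewrite rules easier to follow, the mod_rewrite branch above can be approximated in a few lines of Python. This is only a rough single-pass emulation for illustration: the `rewrite` function and its `exists_on_disk` callback are hypothetical, and details of real Apache behaviour (such as re-processing of the rewritten URL or query-string handling) are not modelled.

```python
import re
from typing import Callable

STATIC_PREFIXES = ("assets", "css", "js", "img")


def rewrite(path: str,
            exists_on_disk: Callable[[str], bool] = lambda p: False) -> str:
    """Return the target that a request path would be rewritten to."""
    rel = path.lstrip("/")  # per-directory rules see the path without a leading slash
    # Static resources are served from under rda/.
    for prefix in STATIC_PREFIXES:
        match = re.match(rf"^{prefix}/(.*)", rel)
        if match:
            return f"rda/{prefix}/{match.group(1)}"
    # The bare site root goes straight to the front controller.
    if rel == "":
        return "rda/index.php"
    # Anything that is not a real file or directory is handed to CodeIgniter.
    if not exists_on_disk(rel):
        return f"rda/index.php?/{rel}"
    return rel  # real files and directories are served untouched


print(rewrite("/analysis-of-dugong-distribution-coastwatch-data"))
```

Under these assumptions a slug request ends up at `rda/index.php` with the slug passed along as the URI, which is where the catch-all route hands it to the dispatcher.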

Other Technical Notes

  • Logic for the scrubbing and creation of unique SLUGs is contained in generateSlug() in /orca/_functions/orca_presentation_functions.php
  • Logic for updating and maintaining SLUGs as a record is updated is executed in importRegistryObjects() (/orca/_functions/orca_import_functions.php) after the registry elements are included. Note that this logic is complicated by two facts: a URL SLUG should be reused if the record key and title have not changed, and the previous url_slug is no longer available once the record has been deleted and reinserted.
  • The dispatcher actually maps SLUGs from the URL to a key_hash (not the registry object key) (see view_by_hash() in /rda/.../controllers/view.php)
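The maintenance behaviour described in these notes (reuse a SLUG when the title is unchanged, keep old SLUGs resolving after a title change, and fall back to a soft 404 after deletion) can be sketched as below. This is a simplified Python illustration under stated assumptions, not the logic in importRegistryObjects(): the `SlugTable` class, its in-memory dictionaries, and the idea of turning a slug back into a search phrase are all hypothetical.

```python
from typing import Dict, Optional, Tuple


class SlugTable:
    """Hypothetical in-memory stand-in for the slug-to-key_hash mappings."""

    def __init__(self) -> None:
        self.mappings: Dict[str, str] = {}  # every slug ever issued -> key_hash
        self.current: Dict[str, str] = {}   # key_hash -> slug currently in use
        self.titles: Dict[str, str] = {}    # key_hash -> title behind that slug

    def upsert(self, key_hash: str, title: str, new_slug: str) -> str:
        """Record an import, reusing the existing slug if the title is unchanged."""
        if self.titles.get(key_hash) == title:
            return self.current[key_hash]
        # Older slugs stay in `mappings`, so historical links keep resolving.
        self.mappings[new_slug] = key_hash
        self.current[key_hash] = new_slug
        self.titles[key_hash] = title
        return new_slug

    def delete(self, key_hash: str) -> None:
        """Remove the record but keep its slugs so they can produce a soft 404."""
        self.current.pop(key_hash, None)
        self.titles.pop(key_hash, None)

    def resolve(self, slug: str) -> Tuple[str, Optional[str]]:
        """Return (action, value) for a requested slug."""
        key_hash = self.mappings.get(slug)
        if key_hash is None:
            return ("not_found", None)
        if key_hash not in self.current:
            # Assumed behaviour: derive a search phrase from the slug itself.
            return ("soft404", slug.replace("-", " "))
        return ("view", key_hash)
```

Keeping the full slug history in one table separate from the "current" slug is a design choice of this sketch; it is what allows both old and new slugs to resolve to the same key_hash.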