ckruse/microformats2-elixir

Microformats2

A Microformats2 parser for Elixir.

Installation

Install the package by adding :microformats2 to your list of dependencies in mix.exs:

def deps do
  [
    {:microformats2, "~> 1.0.0"}
  ]
end

To parse directly from URLs, also add :tesla to your list of dependencies in mix.exs:

def deps do
  [
    {:microformats2, "~> 1.0.0"},
    {:tesla, "~> 1.4.4"}
  ]
end

Usage

Give the parser an HTML string and the URL it was fetched from:

Microformats2.parse("""
<div class="h-card">
  <img class="u-photo" alt="photo of Mitchell"
        src="https://webfwd.org/content/about-experts/300.mitchellbaker/mentor_mbaker.jpg"/>
  <a class="p-name u-url"
      href="http://blog.lizardwrangler.com/">Mitchell Baker</a>
  (<a class="u-url" href="https://twitter.com/MitchellBaker">@MitchellBaker</a>)
  <span class="p-org">Mozilla Foundation</span>
  <p class="p-note">
    Mitchell is responsible for setting the direction and scope of the Mozilla Foundation and its activities.
  </p>
  <span class="p-category">Strategy</span>
  <span class="p-category">Leadership</span>
</div>
""", "http://example.org")

The parser returns a structure like this:

%{
  "items" => [
    %{
      "properties" => %{
        "category" => ["Strategy", "Leadership"],
        "name" => ["Mitchell Baker"],
        "note" => ["Mitchell is responsible for setting the direction and scope of the Mozilla Foundation and its activities."],
        "org" => ["Mozilla Foundation"],
        "photo" => [
          %{
            "alt" => "photo of Mitchell",
            "value" => "https://webfwd.org/content/about-experts/300.mitchellbaker/mentor_mbaker.jpg"
          }
        ],
        "url" => ["http://blog.lizardwrangler.com/",
         "https://twitter.com/MitchellBaker"]
      },
      "type" => ["h-card"]
    }
  ],
  "rel-urls" => %{},
  "rels" => %{}
}
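
Reading values back out of that structure is plain map and list handling. The following is a minimal sketch, assuming the result has exactly the shape shown above; `HCardExample`, `cards/1`, and `first_value/2` are hypothetical helpers for illustration and are not part of the library. The `parsed` map is a trimmed, hand-written copy of the example result, so the snippet runs without any dependencies.

```elixir
# A trimmed, hand-written copy of the result shown above. In real use this
# map would come from the parser rather than a literal.
parsed = %{
  "items" => [
    %{
      "type" => ["h-card"],
      "properties" => %{
        "name" => ["Mitchell Baker"],
        "org" => ["Mozilla Foundation"],
        "url" => [
          "http://blog.lizardwrangler.com/",
          "https://twitter.com/MitchellBaker"
        ]
      }
    }
  ],
  "rel-urls" => %{},
  "rels" => %{}
}

defmodule HCardExample do
  # Hypothetical helper module, not part of microformats2.

  # Keep only the top-level items whose "type" list contains "h-card".
  def cards(%{"items" => items}) do
    Enum.filter(items, fn item -> "h-card" in Map.get(item, "type", []) end)
  end

  # Property values are always lists; return the first one, or nil if the
  # property is absent.
  def first_value(item, property) do
    item
    |> get_in(["properties", property])
    |> List.wrap()
    |> List.first()
  end
end

[card] = HCardExample.cards(parsed)
IO.puts(HCardExample.first_value(card, "name"))
# prints: Mitchell Baker
```

Because every property maps to a list, even single-valued ones such as "name", a helper like `first_value/2` avoids repeating the list unwrapping at each call site.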

You can also provide HTML trees already parsed with Floki:

Microformats2.parse(Floki.parse("<div class=\"h-card\">...</div>"), "http://example.org")

Or pass a URL directly if you have Tesla installed:

Microformats2.parse("http://example.org")

Dependencies

This library requires Floki for HTML parsing and, optionally, Tesla for fetching URLs.

Features

Implemented:

Not implemented:

Copyright and License

Copyright (c) 2018 Christian Kruse

This work is free. You can redistribute it and/or modify it under the terms of the MIT License. See the LICENSE.md file for more details.
