Nitlink


Nitlink is a nice, nitpicky gem for parsing Link headers, which sticks as closely as possible to Mark Nottingham’s parsing algorithm (from his most recent redraft of RFC 5988). That means it’s particularly good at handling weird edge cases, UTF-8 encoded parameters, URI resolution, boolean parameters and more. It also plays nicely with a bunch of popular HTTP client libraries, has an extensive test suite, and zero external dependencies.

Tested with Ruby versions from 1.9.3 up to 2.4.1. Ruby 2.0+ is fully supported. On 1.9.3 parsing is fully functional, but support for third-party clients is somewhat limited (because, for example, Net::HTTPResponse doesn’t expose the request URI in 1.9.3).

Installation

Install the gem from RubyGems:

gem install nitlink

Or add it to your Gemfile and run bundle install:

gem 'nitlink', '~> 1.1'

And you’re ready to go!

require 'httparty'
require 'nitlink/response'

HTTParty.get('https://www.w3.org/wiki/Main_Page').links.by_rel('last').target
 => #<URI::HTTPS https://www.w3.org/wiki/index.php?title=Main_Page&oldid=100698>

Usage

The most basic way to use Nitlink is to directly pass in an HTTP response from Net::HTTP:

require 'nitlink'
require 'net/http'
require 'awesome_print' # <- not required, just for this demo

link_parser = Nitlink::Parser.new
response = Net::HTTP.get_response(URI.parse 'https://api.github.com/search/code?q=addClass+user:mozilla')

links = link_parser.parse(response)
ap links

# =>
[
    [0] #<Nitlink::Link:0x7fcd09019158
        context = #<URI::HTTPS https://api.github.com/search/code?q=addClass+user:mozilla>,
        relation_type = "next",
        target = #<URI::HTTPS https://api.github.com/search/code?q=addClass+user%3Amozilla&page=2>,
        target_attributes = {}
    >,
    [1] #<Nitlink::Link:0x7fcd09011fe8
        context = #<URI::HTTPS https://api.github.com/search/code?q=addClass+user:mozilla>,
        relation_type = "last",
        target = #<URI::HTTPS https://api.github.com/search/code?q=addClass+user%3Amozilla&page=34>,
        target_attributes = {}
    >
]

links is actually a Nitlink::LinkCollection - an enhanced array which makes it convenient to grab a link based on its relation_type:

links.by_rel('next').target.to_s
#=> 'https://api.github.com/search/code?q=addClass+user%3Amozilla&page=2'

Third-party clients

Nitlink also supports a large number of third-party HTTP clients:

You can pass an HTTP response from one of these libraries straight into the #parse method:

response = HTTParty.get('https://api.github.com/search/code?q=addClass+user:mozilla')
links = link_parser.parse(response)


For the extra lazy, you can instead require nitlink/response, which decorates the response objects of the various third-party clients with a new #links method that returns the parsed Link headers from that response. nitlink/response must be required after the third-party client. (Note: Net::HTTPResponse also gets decorated, even though it’s not technically third-party.)

require 'httparty'
require 'nitlink/response'

ap HTTParty.get('https://api.github.com/search/code?q=addClass+user:mozilla').links

# =>
[
    [0] #<Nitlink::Link:0x7fcd09019158
        context = #<URI::HTTPS https://api.github.com/search/code?q=addClass+user:mozilla>,
# ....

response.links is just syntactic sugar for calling Nitlink::Parser.new.parse(response)

Response as a hash

You can also pass the relevant response data as a hash (with keys as strings or symbols):

links = link_parser.parse({
  request_uri: 'https://api.github.com/search/code?q=addClass+user:mozilla',
  status: 200,
  headers: { 'Link' => '<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=2>; rel="next", <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=34>; rel="last"' }
})

Non-GET requests

For fully correct behavior, when making a request using an HTTP method other than GET, specify the method type as the second argument of #parse:

response = HTTParty.post('https://api.github.com/search/code?q=addClass+user:mozilla')
links = link_parser.parse(response, 'POST')

This allows Nitlink to correctly set the context of links (resources fetched by a method other than GET or HEAD generally have an anonymous context) - but otherwise everything works OK if you don’t specify this.

Here we make an initial call to the GitHub API’s search endpoint, then iterate through the pages of results using Link headers:

require 'nitlink'
require 'httparty'

link_parser = Nitlink::Parser.new
first_page = HTTParty.get('https://api.github.com/search/code?q=onwheel+user:mozilla')
links = link_parser.parse(first_page)

results = first_page.parsed_response['items']

# Keep following rel="next" until we reach a page without a "next" link
while links.by_rel('next')
  response = HTTParty.get(links.by_rel('next').target)
  # Collect the items from the page just fetched (not from first_page)
  results += response.parsed_response['items']

  links = link_parser.parse(response)
end

Feature comparison

A few different Link header parsers (in various languages) already exist. Some of them are quite lovely :relaxed: ! Nitlink does its best to be as feature complete as possible; as far as I know it’s the first library to cover all the areas the spec (RFC 5988) sets out:

| Feature | Nitlink | parse-link-header | link_header | li | weblinking | link-headers | backbone-paginator | http-link | node-http-link-header |
| --- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Encoded params (per RFC 5987) | :white_check_mark: | :x: | :x: | :x: | :white_check_mark: | :x: | :x: | :x: | :white_check_mark: |
| URI resolution | :white_check_mark: | :x: | :x: | :x: | :white_check_mark: | :x: | :x: | :x: | :x: |
| Establish link context | :white_check_mark: | :x: | :x: | :x: | :x: | :x: | :x: | :x: | :x: |
| Ignore quoted separators | :white_check_mark: | :x: | :white_check_mark: | :x: | :white_check_mark: | :white_check_mark: | :x: | :white_check_mark: | :white_check_mark: |
| Parse “weird” headers | :white_check_mark: | :x: | :x: | :x: | :white_check_mark: | :x: | :x: | :x: | :x: |
| Proper escaping | :white_check_mark: | :x: | :white_check_mark: | :x: | :white_check_mark: | :white_check_mark: | :x: | :white_check_mark: | :x: |
| Boolean attributes | :white_check_mark: | :x: | :x: | :x: | :x: | :x: | :x: | :x: | :x: |
| Ignore duplicate params | :white_check_mark: | :x: | :x: | :x: | :x: | :x: | :x: | :x: | :x: |
| Multiple relation types | :white_check_mark: | :white_check_mark: | :x: | :x: | :x: | :white_check_mark: | :x: | :x: | :x: |

Parse “weird” headers: i.e. can it parse weird-looking, but technically valid headers like <http://example.com/;;;,,,>; rel="next;;;,,, next"; a-zA-Z0-9!#$&+-.^_|~=!#$%&'()*+-./0-9:<=>?@a-zA-Z[]^_{|}~; title*=UTF-8'de'N%c3%a4chstes%20Kapitel?
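To make the "Ignore quoted separators" row concrete, here is a minimal sketch of splitting a Link header into its individual link values while ignoring commas that sit inside `<...>` or inside a quoted string. It is not Nitlink's actual implementation; `split_link_values` is a hypothetical helper name used only for illustration.

```ruby
# Hypothetical helper (not part of Nitlink): split a Link header value on
# top-level commas only, leaving commas inside <...> or "..." untouched.
def split_link_values(header)
  parts = []
  current = ''.dup
  in_quotes = false
  in_target = false
  escaped = false

  header.each_char do |char|
    if escaped
      escaped = false                      # character after a backslash is literal
    elsif in_quotes
      escaped = true if char == '\\'
      in_quotes = false if char == '"'
    elsif in_target
      in_target = false if char == '>'
    elsif char == '"'
      in_quotes = true
    elsif char == '<'
      in_target = true
    elsif char == ','
      parts << current.strip               # a real separator: close this link value
      current = ''.dup
      next
    end
    current << char
  end

  parts << current.strip unless current.strip.empty?
  parts
end

header = '<http://example.com/;;;,,,>; rel="next;;;,,, next", </b>; rel=last'
puts split_link_values(header)
```

A naive `header.split(',')` would cut this header into many fragments; tracking the quoting state keeps it at two link values.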

API

Nitlink::Parser

#parse(response, method = 'GET') => Nitlink::LinkCollection

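Accepts the following arguments: response, an HTTP response from Net::HTTP or a supported third-party client (or a hash with request_uri, status and headers keys); and method, the HTTP method used for the request, which defaults to 'GET'.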

Returns a Nitlink::LinkCollection containing Nitlink::Link objects:

If the Link header does not begin with "<", or "<" isn’t followed by ">", it’s considered malformed and unparseable, in which case a Nitlink::MalformedLinkHeaderError is raised. If response is an instance of a class which Nitlink doesn’t know how to handle (e.g. from an unsupported third-party client), a Nitlink::UnknownResponseTypeError is raised.
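The malformed-header rule above can be pictured with a small check. This is only a sketch of the stated rule, not Nitlink's code: `first_target` and `MalformedHeaderSketchError` are hypothetical names standing in for the real parser and for Nitlink::MalformedLinkHeaderError.

```ruby
# Hypothetical stand-in for Nitlink::MalformedLinkHeaderError.
class MalformedHeaderSketchError < StandardError; end

# Hypothetical helper: return the raw target of the first link, or raise
# if the header does not start with "<" or has no closing ">".
def first_target(header)
  value = header.to_s
  closing = value.index('>')
  unless value.start_with?('<') && closing
    raise MalformedHeaderSketchError, "cannot parse Link header: #{header.inspect}"
  end

  value[1...closing]
end

puts first_target('</about>; rel=about')

begin
  first_target('rel=about')
rescue MalformedHeaderSketchError => e
  puts "malformed: #{e.message}"
end
```

With the real gem, the equivalent would be rescuing Nitlink::MalformedLinkHeaderError around the call to #parse.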

Nitlink::LinkCollection

An extension of Array with additional convenience methods for handling links based on their relation type.

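#by_rel(relation_type) => Nitlink::Link or nil

Accepts the following argument: relation_type, the relation type to look up (for example 'next').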

Returns a single Nitlink::Link object whose relation_type attribute matches the relation type provided, or nil if the collection doesn’t contain a matching link. If two links exist which match the provided relation type (this should never happen in practice), the first matching link in the collection is returned.

Raises an ArgumentError if the relation_type is blank.
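The lookup behaviour described above (first match wins, nil when nothing matches, ArgumentError when the relation type is blank) can be sketched with a plain Array subclass. `SketchLink` and `SketchLinkCollection` are hypothetical names, and treating whitespace-only input as blank is an assumption of this sketch rather than documented Nitlink behaviour.

```ruby
# Hypothetical stand-ins for Nitlink::Link and Nitlink::LinkCollection.
SketchLink = Struct.new(:relation_type, :target)

class SketchLinkCollection < Array
  # Return the first link with the given relation type, or nil.
  def by_rel(relation_type)
    rel = relation_type.to_s.strip
    raise ArgumentError, 'relation_type cannot be blank' if rel.empty?

    find { |link| link.relation_type == rel }
  end
end

links = SketchLinkCollection.new([
  SketchLink.new('next', '/page/2'),
  SketchLink.new('last', '/page/9'),
  SketchLink.new('next', '/duplicate')
])

puts links.by_rel('next').target
puts links.by_rel('prev').inspect
```

Because a missing relation type yields nil, a guard such as `if (link = links.by_rel('next'))` avoids calling #target on nil.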

#to_h(options = { with_indifferent_access: true }) => Nitlink::HashWithIndifferentAccess or Hash

Also aliased as to_hash.

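Accepts the following arguments: options, a hash whose with_indifferent_access key (true by default) determines whether a Nitlink::HashWithIndifferentAccess or a plain Hash is returned.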

An empty collection will return an empty Nitlink::HashWithIndifferentAccess/Hash. If two links exist which match a given relation type, the value will be the first link in the collection.
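As a rough illustration of the "first link wins" rule, the following sketch turns a list of links into a plain Hash keyed by relation type. `PageLink` and `links_to_hash` are hypothetical names, and the sketch only mirrors the plain-Hash case, not the indifferent-access one.

```ruby
# Hypothetical stand-in for Nitlink::Link.
PageLink = Struct.new(:relation_type, :target)

# Hypothetical helper: key each link by its relation type, keeping the
# first link when two share the same relation type.
def links_to_hash(links)
  links.each_with_object({}) do |link, hash|
    hash[link.relation_type] ||= link
  end
end

pages = [
  PageLink.new('next', '/page/2'),
  PageLink.new('last', '/page/9'),
  PageLink.new('next', '/duplicate')
]

hash = links_to_hash(pages)
puts hash['next'].target
puts hash.keys.inspect
```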

Nitlink::Link

A Struct representing a single link with a specific relation type. It has four attributes:

#<Nitlink::Link:0x7fcda89489a0
    context = #<URI::HTTP http://example.com>,
    target = #<URI::HTTP http://example.com/readme>,
    relation_type = "about",
    target_attributes = {
      "title" => "About us"
    }
>

#context => URI or nil

Returns the context of the link as a URI object. Usually this will be the same as the request URI, but may be modified by the anchor parameter or Content-Location header. Additionally some HTTP request methods or status codes result in an “anonymous” link context being assigned (represented by nil).
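A simplified sketch of how a link context might be chosen is shown here. It only covers the request method and the anchor parameter; the Content-Location and status-code rules are left out, and letting anchor take precedence over the method check is an assumption of this sketch. `sketch_context` is a hypothetical helper, not a Nitlink method.

```ruby
require 'uri'

# Hypothetical helper: pick a link context from the request URI, the HTTP
# method and an optional anchor parameter. Returns nil for an "anonymous"
# context.
def sketch_context(request_uri, http_method: 'GET', anchor: nil)
  base = URI.parse(request_uri)
  # Assumption: an explicit anchor overrides the default context.
  return URI.join(base, anchor) if anchor
  # Methods other than GET or HEAD generally give an anonymous context.
  return nil unless %w[GET HEAD].include?(http_method.to_s.upcase)

  base
end

puts sketch_context('http://example.com/readme')
puts sketch_context('http://example.com/readme', http_method: 'POST').inspect
puts sketch_context('http://example.com/readme', anchor: '/docs')
```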

#target => URI

Returns the target of the link as a URI object. If the URI given in the Link header is relative, Nitlink resolves it (based on the request URI).
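Resolution of a relative target against the request URI can be reproduced with Ruby's standard library. The sketch below uses `URI.join`; whether Nitlink uses this exact call internally is not stated here, and `resolve_target` is a hypothetical helper name.

```ruby
require 'uri'

# Hypothetical helper: resolve a (possibly relative) Link target against
# the URI the request was made to.
def resolve_target(request_uri, reference)
  URI.join(request_uri, reference)
end

request_uri = 'http://example.com/articles/42'

puts resolve_target(request_uri, '/about')                     # absolute path
puts resolve_target(request_uri, 'comments')                   # relative path
puts resolve_target(request_uri, 'https://other.example/next') # already absolute
```

An absolute-path reference replaces the whole path, a relative one replaces only the last path segment, and an already absolute URI is returned unchanged.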

#relation_type => String

A single relation type, describing the kind of relationship this link represents. For example, "prev" would indicate that the target resource immediately precedes the context. It could also be an extension relation type (an absolute URI serialized as a string).

Relation types are always case-normalized to lowercase.
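Since a single rel parameter may list several space-separated relation types while each Nitlink::Link carries exactly one, a parser presumably has to expand the value into one entry per type. The sketch below shows that expansion together with the lowercase normalization; `expand_relation_types` is a hypothetical helper, not part of Nitlink's API.

```ruby
# Hypothetical helper: split a rel value into individual, lowercased
# relation types (one link would then be built per entry).
def expand_relation_types(rel_value)
  rel_value.to_s.downcase.split
end

puts expand_relation_types('Next LAST').inspect
puts expand_relation_types('http://example.com/rels/Custom').inspect
```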

#target_attributes => Hash

Captures the values of the parameters that aren’t used to construct the context or target (i.e. parameters other than rel and anchor); title, for example.

Parameters ending in * are decoded per RFC 5987, bis-03. Where decoding fails, the parameter is omitted.
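For a feel of what decoding an RFC 5987 value involves, here is a minimal sketch that handles the `charset'language'percent-encoded-value` form and returns nil when decoding fails. It is not Nitlink's decoder: `decode_ext_value` is a hypothetical helper, the language tag is ignored, and accepting any charset Ruby knows about is a simplification of this sketch.

```ruby
# Hypothetical helper: decode an RFC 5987 style value such as
# "utf-8'en'About%20%C3%BCs" into a UTF-8 string, or return nil on failure.
def decode_ext_value(raw)
  charset, _language, encoded = raw.to_s.split("'", 3)
  return nil if encoded.nil? || charset.empty?

  # Work on raw bytes so percent-decoded octets can be reinterpreted
  # in the declared charset afterwards.
  bytes = encoded.b.gsub(/%(\h{2})/) { Regexp.last_match(1).hex.chr }
  text = bytes.force_encoding(Encoding.find(charset))
  return nil unless text.valid_encoding?

  text.encode(Encoding::UTF_8)
rescue ArgumentError, EncodingError
  # Unknown charset or unconvertible bytes: treat as a decoding failure.
  nil
end

puts decode_ext_value("utf-8'en'About%20%C3%BCs")
puts decode_ext_value("utf-8'en'%FF").inspect
```

Returning nil lets the caller drop the parameter, which matches the "omitted where decoding fails" behaviour described above.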

Boolean parameters (e.g. crossorigin) have their values set to nil. Any backslash-escaped characters within quoted parameter values are unescaped. The names of attributes are case-normalized to lowercase. Only the first occurrences of media, title, title* or type parameters are parsed; subsequent occurrences are ignored.
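The normalization rules in this paragraph can be sketched as a small function over already-tokenized name/value pairs. This is an illustration under assumptions, not Nitlink's code: `build_attributes` is a hypothetical helper, values are assumed to be the inner contents of quoted strings (nil for boolean parameters), and letting later duplicates of other parameters overwrite earlier ones is a choice of this sketch.

```ruby
# Parameters for which only the first occurrence is kept.
FIRST_ONLY = %w[media title title* type].freeze

# Hypothetical helper: build a target_attributes style hash from
# [name, value] pairs, where value is nil for boolean parameters.
def build_attributes(pairs)
  pairs.each_with_object({}) do |(name, value), attrs|
    key = name.downcase
    next if FIRST_ONLY.include?(key) && attrs.key?(key)

    # Unescape backslash-escaped characters; booleans stay nil.
    attrs[key] = value.nil? ? nil : value.gsub(/\\(.)/m) { Regexp.last_match(1) }
  end
end

pairs = [
  ['Title', 'About \\"us\\"'],
  ['title', 'Ignored duplicate'],
  ['crossorigin', nil]
]

puts build_attributes(pairs).inspect
```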

If no additional parameters exist, target_attributes is an empty hash.

ap parser.parse({
  request_uri: 'http://example.com',
  status: 200,
  headers: { 'Link' => %q{</about>; rel=about; title="About us"; title*=utf-8'en'About%20%C3%BCs; crossorigin} }
})

#=>
[
    [0] #<Nitlink::Link:0x7fcda9274bc8
        context = #<URI::HTTP http://example.com>,
        relation_type = "about",
        target = #<URI::HTTP http://example.com/about>,
        target_attributes = {
                  "title" => "About us",
                 "title*" => "About üs",
            "crossorigin" => nil
        }
    >
]

Nitlink::HashWithIndifferentAccess

Implements a hash where keys :foo and “foo” are considered to be the same. It’s closely modeled on Thor’s implementation (Thor::CoreExt::HashWithIndifferentAccess), except without the magic predicates.

Instances are largely interchangeable with ActiveSupport::HashWithIndifferentAccess, but don’t have the additional methods not present on Hash, like stringify_keys, symbolize_keys, regular_writer etc.
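The idea behind indifferent access can be shown with a very small Hash subclass that stores every key as a string. This is a sketch of the general technique only; `SketchIndifferentHash` is a hypothetical class and covers far fewer methods than Nitlink::HashWithIndifferentAccess presumably does.

```ruby
# Hypothetical minimal indifferent-access hash: every key is stored and
# looked up as a String, so :foo and "foo" refer to the same entry.
class SketchIndifferentHash < Hash
  def initialize(source = {})
    super()
    source.each { |key, value| self[key] = value }
  end

  def [](key)
    super(key.to_s)
  end

  def []=(key, value)
    super(key.to_s, value)
  end

  def key?(key)
    super(key.to_s)
  end

  def fetch(key, *args, &block)
    super(key.to_s, *args, &block)
  end
end

hash = SketchIndifferentHash.new({ next: '/page/2' })
hash[:last] = '/page/9'

puts hash['next']
puts hash[:last]
puts hash.keys.inspect
```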

Changelog

Nitlink follows semantic versioning.

1.1.0 (1 September 2017)

1.0.0 (7 November 2016)

  1. Clone the git repo
  git clone git://github.com/alexpeattie/nitlink.git
  2. Install dependencies
  cd nitlink
  bundle install

You can skip installing the various third-party HTTP clients Nitlink supports, to get up and running faster (some specs will fail):

  bundle install --without clients
  3. Run the test suite
  bundle exec rspec

You can also generate a Simplecov coverage report by setting the COVERAGE environment variable:

  COVERAGE=true bundle exec rspec

Contributing

Pull requests are very welcome! Please try to follow these simple rules if applicable:

Future features

License

Nitlink is released under the MIT license. (See License.md)

Author

Alex Peattie / alexpeattie.com / @alexpeattie