The grant extension also helped to define the milestones for 2023. Here they are, in no particular order — the order they get implemented depends on many factors, including some non-obvious interplay between them. Some of these milestones are pretty simple, some will require substantial re-writes. Exciting times ahead!
LibResilient needs a “still loading” screen to be displayed when loading HTML resources over slow transports. For example, retrieving content from IPFS can sometimes take upwards of 20 seconds. Currently the user experience here is lacking: depending on browser timeout defaults, the page fails to load, showing an obscure error, and then suddenly loads and displays content.
Also, LibResilient currently kicks in on the second request to a site that uses it, as that’s when the ServiceWorker actually gets loaded. This could be improved by using Clients.claim() — to be implemented if it makes sense.
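For context, a minimal sketch (an illustration, not LibResilient’s actual code) of how a ServiceWorker can take control of already-open pages immediately upon activation:

self.addEventListener("activate", (event) => {
  // Take control of all open clients (tabs) immediately,
  // instead of waiting for the next navigation.
  event.waitUntil(self.clients.claim());
});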
When retrieving content using certain transports (for example IPFS), MIME-type information is not available. Currently, plugins that face this issue naïvely try to guess the MIME-type based on file extension. What is needed is a facility to reliably establish MIME-types of requested content, made available by the ServiceWorker itself to all plugins that need it.
The interplay between LibResilient, CORB, CORS, and CSP is not well documented. Documenting it well requires substantial testing on small but purpose-built infrastructure.
The assumption is that cookies and other credentials should not be exposed to alternative transports, but this needs strict testing and documentation. Perhaps this should also be configurable.
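To illustrate (an assumption about possible future behavior, not what LibResilient currently does): a transport plugin could avoid exposing credentials to third-party endpoints by omitting them from its requests.

// Hypothetical: request content from an alternative endpoint without
// sending cookies or other credentials along.
const response = await fetch(alternativeEndpointUrl, { credentials: "omit" });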
Currently LibResilient uses Jest and Node.js for tests of the browser-side code, and Deno for the CLI and CLI-related tests. This makes maintenance difficult. Deno is a much better choice, as it implements WebAPIs natively. So, browser-side code tests should be re-written for Deno, and the Node.js dependency completely removed from the project.
LibResilient has to handle request errors better. It needs to be smart about displaying the original 404 page from the original domain, and otherwise needs to show some form of plugin call stack or other explanation if a request failed. Perhaps a “development” mode should be implemented.
Currently, when a request fails (for example, due to a 404 error, or because an integrity check fails, or…), a browser-internal “request failed” page is displayed to the user. This is not very helpful when debugging issues, and at the same time it is confusing to users.
From the developer (and user) perspective, it’s difficult to figure out what went wrong when a resource is not successfully fetched, and if the failure is related to LibResilient or not. This is especially problematic when fetching resources that are protected by subresource integrity.
This will require substantial rewrites of crucial pieces of the ServiceWorker and plugins.
When loading config.json via alternative transports, if config.json is broken but the original website is not available, a LibResilient-enabled site might end up in a broken state. LibResilient should verify that a newly loaded config is valid and not broken (say, by deploying the new configuration and attempting to re-fetch the config.json file it just loaded), reverting to the previous, clearly working config otherwise.
There are several “papercut” issues in LibResilient and its plugins that are too small to be considered separate project plan items, but are nonetheless important to fix.
For example: how should the basic-integrity plugin treat URIs without a domain name? And DNSLink-based plugins as currently implemented can only use DNS-over-HTTPS servers that offer JSON endpoints — implementing pure DNS-over-HTTPS would greatly (by several orders of magnitude) increase the number of DoH servers DNSLink-based plugins can use.
Now, documentation (both general docs and per-plugin documentation) is available directly on the website. Still not perfect, but considerably better nonetheless. Importantly, plugin documentation is also gathered in a single place.
Documentation is divided into two parts: general documentation and per-plugin documentation.
This should make the information there more easily available and discoverable. There is no search (yet?), as the size of the project’s documentation is still small and arguably manageable with the help of decent index pages.
These are documentation resources discussing LibResilient generally, or providing step-by-step guides on deployment.
This includes a high-level overview of the philosophy guiding the project, and a (still not entirely complete) description of its architecture. There is also an extensive Frequently Asked Questions section, diving into things like interactions with web analytics systems and admin panels, how LibResilient handles interactivity on a website, and a deep-dive into Service Workers as used by LibResilient.
The technical, step-by-step guides include the Quickstart guide and the example deployment document. There are also technical write-ups focusing on specific features of LibResilient: its ability to update configuration even during disruption, and its approach to ensuring security and content integrity — a topic particularly important when using third-party-run alternative endpoints.
This section contains documentation of every plugin available in LibResilient’s main code tree.
How extensively a plugin is documented differs between plugins. Some, like dnslink-fetch, offer a reasonably good write-up. Others, like gun-ipfs, are documented much more sparsely. Some plugins are considered stable or late beta, some are broken and need to be re-written. This is clearly marked on the plugin overview page.
It’s important to recognize that good documentation is crucial to adoption. Quite a lot of work went into improving LibResilient’s documentation situation, but by no means is it done and perfect. If you’d like to get involved in helping out with this — or with any other aspect of LibResilient — check out the code in GitLab.
General documentation available on this website is built from the content of the /docs/ directory in the repository. Plugin documentation is built from the README.md files in each individual plugin’s directory. And if you’re wondering how that’s achieved, the absolutely horrendously ugly code for that is here. Actual LibResilient code is much cleaner, pinky promise!
dnslink-ipfs expects content to be published on IPFS, and the latest IPFS address to then be pushed to DNS. How to push this information and data out has so far been left to website administrators, creating a relatively large obstacle to LibResilient adoption. Now this might gradually start getting solved, thanks to lrcli, the LibResilient CLI.
The big problem with creating a consistent CLI for a tool like LibResilient is that basically all relevant functionality is related to specific LibResilient plugins. Additionally, at least in the case of some plugins, there is more than one way to push the information where it needs to be for LibResilient to be able to make use of it.
Consider the alt-fetch plugin, which allows LibResilient to fetch content from alternative HTTPS endpoints. Can lrcli make assumptions regarding how content should be pushed to them? Should FTP or SFTP be used? Or maybe some REST API needs to be used instead, and PUT HTTPS requests need to be issued? Or perhaps it’s some kind of proprietary service that requires a specific proprietary protocol?
There is a growing number of plugins that might need some CLI functionality to make LibResilient easy to deploy when using them. And in the case of many if not most of these plugins, there are simply too many possible ways of pushing the information out for there to be a general CLI that implements all of them.
The plugin-based approach seems to work well for LibResilient itself. It might, then, work well for the CLI as well. After all, the author of a plugin probably knows best what kind of tools a website administrator might need to properly push the content and any necessary additional information out for the plugin to be able to make use of it.
The LibResilient CLI is built around a simple plugin architecture. It assumes a cli.js file in the plugin’s main directory. The file should be a valid Deno module (lrcli is written for the Deno JS runtime), and export an object that defines the name, description, version, and actions implemented by the plugin. Based on that, the CLI knows how to run specific actions and interpret relevant command line arguments.
Here is a simple example of the shape such a module might take for the basic-integrity plugin (a sketch based on the CLI output shown below; the field names are assumptions, not the actual source):
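// cli.js — hypothetical sketch of a LibResilient CLI plugin module;
// the exported fields are assumptions based on lrcli's help output.
export default {
  name: "basic-integrity",
  description: "CLI used to generate subresource integrity hashes for provided files.",
  version: "0.0.1",
  actions: {
    "get-integrity": {
      description: "calculate subresource integrity hashes for provided files",
      options: {
        algorithm: { default: "SHA-256" }, // SubtleCrypto.digest-compatible name
        output: { default: "json" },       // 'json' or 'text'
      },
      // run() receives parsed options and positional file paths
      run: async (options, files) => { /* compute and print digests */ },
    },
  },
};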
With time, more plugins will gain a CLI component. For some of them — like basic-integrity or signed-integrity, which already have it — the CLI’s role is going to be limited to generating data locally, for use with other tools in the publishing pipeline. For other plugins — for example, IPFS-based transport plugins — it makes sense to implement actions that push content out, actually publishing it.
And in some cases, this will remain somewhat complicated. There are simply too many ways of pushing content out to a simple HTTPS endpoint — often also very specific to a given website — for all of them to be implementable in a single LibResilient CLI plugin. The same is probably true for pushing out the DNS updates required by DNSLink-based plugins. In such cases, the most broadly used mechanisms will probably be implemented (FTP/SFTP for fetch-based plugins? DNS UPDATE for DNSLink-based plugins?), but anything fancier than that will have to be left to the website admin, who knows their infrastructure and how to distribute content on it.
When run, lrcli expects the name of the plugin to load, and tries to be helpful in guiding the user on its usage:
$ cli/lrcli.js
Command-line interface for LibResilient.
This script creates a common interface to CLI actions implemented by LibResilient plugins.
Usage:
lrcli.js [options] [plugin-name [plugin-options]]
Options:
-h, --help [plugin-name]
Print this message, if no plugin-name is given.
If plugin-name is provided, print usage information of that plugin.
Plugin names are assumed to be sub-directories under the plugins/ directory in LibResilient’s code directory:
$ cli/lrcli.js no-such-plugin
*** TypeError: Module not found "file:///home/user/Projects/libresilient/plugins/no-such-plugin/cli.js". ***
If the plugin exists, usage information can be printed, based on the data exported by the plugin’s cli.js module:
$ cli/lrcli.js basic-integrity
*** No action specified for plugin ***
CLI plugin:
basic-integrity
Plugin Description:
Verifying subresource integrity for resources fetched by other plugins.
CLI used to generate subresource integrity hashes for provided files.
Usage:
lrcli.js [general-options] basic-integrity [plugin-action [action-options]]
General Options:
-h, --help [plugin-name]
Print this message, if no plugin-name is given.
If plugin-name is provided, print usage information of that plugin.
Actions and Action Options:
get-integrity [options...] <file...>
calculate subresource integrity hashes for provided files
<file...>
paths of files to be processed
--algorithm (default: SHA-256)
SubtleCrypto.digest-compatible algorithm names to use when calculating digests (default: "SHA-256")
--output (default: json)
a string, defining output mode ('json' or 'text'; 'json' is default)
The plugin controls its output, but a good practice is to provide support for at least json and text when useful data is returned, to simplify integration with any other tools the website admin chooses to use in their deployment pipeline:
$ cli/lrcli.js basic-integrity get-integrity libresilient.js
{"libresilient.js":["sha256-UrkUn2KwKBQ93jS/pSd3Kt0/+9XkDT6Rj93jec/lOZY="]}
$ cli/lrcli.js basic-integrity get-integrity libresilient.js --output text
libresilient.js: sha256-UrkUn2KwKBQ93jS/pSd3Kt0/+9XkDT6Rj93jec/lOZY=
The two new DNSLink-enabled plugins fetch content using means that have been employed by LibResilient plugins before, but use DNSLink to figure out where to fetch content from.
DNSLink is a standard for storing information on where content related to a given domain can be found, directly in DNS, using TXT records.
Let’s say you are running a website at https://example.org and want to provide information on where relevant content can be found in case, for example, the main site goes down. You could put this in some place on the site itself, but this creates a chicken-and-egg problem: information on where to get content if the site is not available is only accessible as long as the site is available.
Instead, you could use DNSLink and put that information directly in DNS. To do that you would create TXT records for the _dnslink.example.org label, like so:
_dnslink.example.org. 60 IN TXT "dnslink=/ipfs/Qm..."
_dnslink.example.org. 60 IN TXT "dnslink=/https/gateway.ipfs.io/ipfs/Qm..."
_dnslink.example.org. 60 IN TXT "dnslink=/https/example.com/"
Software that understands DNSLink would take this to mean that content for example.org is also available directly on IPFS, or via HTTPS on specific IPFS gateways, or via HTTPS on an alternative endpoint (in this case, example.com). As long as DNS remains available, if client software understands DNSLink and supports the relevant transport protocols, the content can be retrieved even if the https://example.org site itself is down.
This is where the new LibResilient plugins come in.
With the new plugins, LibResilient turns any modern browser into client software that understands DNSLink and can retrieve content related to a website that happens to be down (as long as that particular visitor had visited that site once before, so that the Service Worker got loaded).
The first of the two, dnslink-fetch, is very similar to the alt-fetch plugin: given alternative endpoints, it performs HTTP fetch() requests to them to pull relevant content. But instead of requiring the endpoints to be configured explicitly in the config.json configuration file (and thus being somewhat inflexible), it pulls the endpoints from DNS, in accordance with the DNSLink standard.
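Resolving DNSLink records from inside a browser can be done over DNS-over-HTTPS. As a rough sketch (using Cloudflare’s public DoH JSON endpoint as an example; this is not the actual plugin code):

// Query a DoH JSON endpoint for _dnslink TXT records and extract
// the "dnslink=..." entries pointing at alternative content locations.
async function resolveDnslink(domain) {
  const url = `https://cloudflare-dns.com/dns-query?name=_dnslink.${domain}&type=TXT`;
  const response = await fetch(url, { headers: { accept: "application/dns-json" } });
  const data = await response.json();
  return (data.Answer || [])
    .map((record) => record.data.replace(/^"|"$/g, "")) // strip surrounding quotes
    .filter((txt) => txt.startsWith("dnslink="))
    .map((txt) => txt.slice("dnslink=".length)); // e.g. "/ipfs/Qm..." or "/https/example.com/"
}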
The second plugin, dnslink-ipfs, uses DNSLink to figure out which IPFS CID to use when fetching the content from IPFS. This is necessary because IPFS uses content-addressing: when the content itself changes, the address will change too. When using IPFS for content retrieval, the address of the current, up-to-date version of the content must first be known. DNSLink provides a good way of making that information available, and the plugin leans on that.
Keeping those DNSLink records up to date is not, however, LibResilient’s responsibility. So, if you want to use these DNSLink-based plugins, you will need to separately implement some way of updating the relevant TXT records when content gets modified or new content gets published. This can be done via your DNS hosting provider’s APIs, using the DNS UPDATE query if your DNS nameservers support that, or by some other means.
At least for now.
Some work has started on implementing a command-line tool that would simplify deployment of websites that use LibResilient, so perhaps one day this will be handled by the LibResilient CLI. Stay tuned!
Your config sets up the fetch plugin, then local cache, and then alt-fetch with some nice independent endpoints (say, an IPFS gateway here, a Tor Onion gateway there). Perhaps with some content integrity checking deployed too, for peace of mind (no need to completely trust those third-party gateway operators, after all).
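Such a config might look roughly like this (an illustrative sketch: the endpoint URLs are made up, and the exact schema may differ from the real config format):

{
  "plugins": [
    { "name": "fetch" },
    { "name": "cache" },
    {
      "name": "alt-fetch",
      "endpoints": [
        "https://ipfs-gateway.example.net/",
        "https://onion-gateway.example.net/"
      ]
    }
  ]
}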
Obviously you run your own IPFS node and a Tor Hidden Service for these to work correctly, but your website visitors do not need any special software, extensions, or configuration — they just visit your site in their regular browser; LibResilient handles everything else behind the scenes.
Maybe your server keeled over, or maybe it’s a DDoS.
Good news is: for all visitors who had visited your site before, everything seems to work just fine (if perhaps a tiny bit slower than normal). Your website content is cached on IPFS nodes, and IPFS gateways are happily serving the requests LibResilient is sending their way. Local cache makes the experience quite seamless for content those visitors had viewed before, and the IPFS-related slowdown for content they have not is still small.
For whatever reason, however, you figure out the outage will last a bit longer, and you’d like to swap out the fetch plugin completely (no reason for visitors to wait for something that isn’t going to work anyway for the time being). You’d perhaps also want to remove the Tor Onion gateways from the alt-fetch endpoints — after all, your Tor Hidden Service is down as well.
Bad news is: LibResilient’s config is a JavaScript file imported directly in the Service Worker; you have no way to update it until your site comes back up.
This is what this milestone (third one supported by a small grant from NGI Assure) was all about.
To make config updates possible during disruption and outage, the config format needed to be changed (JSON was the obvious choice), and then the whole machinery of verifying, loading, and caching it needed to be implemented.
And so now, the config file (config.json) is just a regular file that can be retrieved via any configured plugins. You don’t have to do anything special for this to work.
Let’s dive deeper into what exactly has been done this month.
This was done because the ServiceWorkers API does not provide a way to update JavaScript scripts that were imported into the Service Worker via an importScripts() call, in any way other than via a direct HTTPS fetch() to the original website. For obvious reasons, that’s unworkable for updating the config during disruption or outage.
A bunch of research was required, as expected. In the end, LibResilient needed a roughly full implementation of what the browser does with scripts imported via importScripts(): fetching config.json, caching it, and establishing when it is stale so that it can be re-fetched.
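In rough pseudocode, the flow looks something like this (a sketch, not the actual implementation; the cache name is made up, while the 24-hour staleness criterion based on the Date: header is the one described below):

// Fetch config.json, cache it, and treat the cached copy as stale
// after 24 hours, based on the Date: header of the cached response.
const CONFIG_URL = "/config.json";
const MAX_AGE_MS = 24 * 60 * 60 * 1000;

async function getConfig() {
  const cache = await caches.open("libresilient-config"); // hypothetical cache name
  const cached = await cache.match(CONFIG_URL);
  if (cached) {
    const fetchedAt = new Date(cached.headers.get("Date")).getTime();
    if (Date.now() - fetchedAt < MAX_AGE_MS) {
      return cached.json(); // still fresh, use it
    }
  }
  // Stale or missing: re-fetch (via any configured plugin, in the real thing).
  const fresh = await fetch(CONFIG_URL);
  await cache.put(CONFIG_URL, fresh.clone());
  return fresh.json();
}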
An additional benefit of this is that the config file is no longer code; it’s an “inert” format (JSON). It is no longer possible to include running code directly in the configuration file. This is important for various reasons that the LANGSEC community explores at length.
This work also included implementing validity checks on the config file — something that was not really possible when config was written in directly-loaded JavaScript.
Implementing this change required me to finally dive deep into the ServiceWorker lifecycle, especially the parts of it that are mostly glossed over or not mentioned at all in most documentation: what exactly happens when a ServiceWorker has been registered and installed, but is now stopped, and is being restarted?
This research was crucial to implementing the JSON config change correctly, and provided important insight that will potentially be very useful for implementing future improvements.
Once the JSON config change was implemented, it was possible to implement background fetching of the updated config.json file.
This required cleaning up and refactoring the code implementing JSON config support, and deciding what criteria to use when establishing if a cached config.json is “stale” (currently: over 24 hours old, based on the Date: header of the cached response).
The biggest issue was figuring out what should happen if the freshly retrieved config file configures plugins that have not been loaded upon Service Worker installation. Because an updated config.json file is processed after a Service Worker restart (so, not during installation), importScripts() is not available.
A decision was made to test for such config changes and reject such a config file outright, falling back to the already cached (if stale) config.json, if the updated file was not retrieved using a regular fetch.
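In rough pseudocode (a sketch of the decision described above, not the actual implementation):

// Reject an updated config that requires plugins we have not loaded,
// unless it was retrieved via a regular fetch from the original site.
function acceptUpdatedConfig(newConfig, loadedPlugins, viaRegularFetch) {
  const missing = newConfig.plugins
    .map((plugin) => plugin.name)
    .filter((name) => !loadedPlugins.includes(name));
  if (missing.length > 0 && !viaRegularFetch) {
    return false; // keep using the cached, possibly stale, config.json
  }
  return true;
}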
The rationale for this is that in such circumstances the updated config.json file was retrieved using an alternative transport, and the currently used config.json is clearly functional, as we were, in fact, able to retrieve the updated config.json with it. Ideas for potential further improvements to this are listed here.
Documentation was written on how JSON config loading and updating works, and how the config can be updated during disruption or outage. It explains the rationale behind implementation decisions and their technological context.
There is obviously more work needed to make this documentation more useful and readable. But it’s a start.
Code written for this milestone is of course covered by tests; overall test coverage went up to ~62%.
As before, I have avoided any external dependencies whatsoever. LibResilient remains easily deployable by simply copying a few JS files (and now, a single JSON file) and adding a single line to your HTML.
There are four milestones on the todo list. Unclear which one I will focus on next, but that should be resolved soon. Keep an eye on the issues assigned to those milestones if you want to be the first to know!
Today another milestone was completed, focusing on the integrity of content fetched via LibResilient, and thus on the security of websites deploying it.
On a very basic level, LibResilient’s job is fetching website content from places other than the original domain of that website.
This can mean alternative endpoints controlled by the website owners (say, on secondary domains, or just IP addresses), or it can mean third-party endpoints like IPFS gateways, Tor2web proxies, or any location where the website’s operator can upload website content, and from which that content can then be fetched.
This, however, creates a problem — operators of such third party services effectively get the ability to modify the content (accidentally, or… maliciously).
The solution is to verify content integrity. We can leverage the Subresource Integrity (SRI) feature of modern browsers — but that has several downsides; most notably, the integrity attribute is only defined for <script> and <link> elements.
Thankfully, it turns out integrity data can be provided for fetch requests in JavaScript for any content type, and once provided, the browser will do the heavy lifting of verifying it!
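In other words, something like this is enough to have the browser enforce integrity on an arbitrary fetched resource (the URL and hash below are purely illustrative):

// The browser itself verifies the response body against the provided
// SRI digest, and rejects the response if it does not match.
const response = await fetch("/some/resource.html", {
  integrity: "sha256-UrkUn2KwKBQ93jS/pSd3Kt0/+9XkDT6Rj93jec/lOZY=",
});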
That also means we can have plugins that provide that integrity data: for example, directly from the config, or through a somewhat elaborate but considerably more flexible process of fetching a signed file with the relevant integrity data for each request separately.
Finally, for transport plugins that do not rely on the Fetch API, and thus do not benefit from the browser checking if integrity data matches fetched content, we now have a wrapper plugin that explicitly checks integrity of any resource, using the SubtleCrypto API.
Let’s break down specific work done in this latest milestone.
This meant, first and foremost, identifying how SRI should be supported in LibResilient. Options included supporting it directly in the service worker code, or bubbling it down to the plugins. In the end, the latter approach was chosen as the more flexible one.
Then, identifying places in service-worker.js and plugin code where SRI was not being correctly handled, and fixing that.
Some research was also necessary to establish if SRI can be set (and if it is enforced by the JS Fetch API) for resources other than scripts and CSS.
Once basic SRI compatibility was ensured, it was possible to write SRI-related wrapper plugins.
The first, basic-integrity, makes it possible to statically configure integrity data for specific URLs.
It doesn’t check the integrity itself, just makes sure that integrity data configured for a given URL is added to the request data when the URL is being fetched by LibResilient. Actual verification is assumed to be done by any plugin wrapped by it.
Secondly, the integrity-check wrapper plugin uses the SubtleCrypto API to implement integrity checking directly in JS.
This makes it possible to check integrity (if present in the request being handled) of content fetched by transport plugins that do not guarantee integrity will be checked by the browser — such as any plugin not using the Fetch API.
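The core of such a check can be sketched as follows (an illustration using the SubtleCrypto API, not the actual plugin code):

// Verify a response body against an SRI string such as
// "sha256-<base64-digest>", using the SubtleCrypto API.
const SRI_ALGORITHMS = { sha256: "SHA-256", sha384: "SHA-384", sha512: "SHA-512" };

async function verifyIntegrity(response, integrity) {
  const separator = integrity.indexOf("-");
  const algorithm = SRI_ALGORITHMS[integrity.slice(0, separator)];
  const expected = integrity.slice(separator + 1);
  const digest = await crypto.subtle.digest(algorithm, await response.clone().arrayBuffer());
  const actual = btoa(String.fromCharCode(...new Uint8Array(digest)));
  return actual === expected;
}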
Finally, the signed-integrity plugin is a proof-of-concept demonstrating how SRI could be used in LibResilient for sites that are not completely static. For each content URL being fetched, it first fetches integrity data from a URL built by appending .integrity to the content URL, expecting a JSON Web Token. That JWT’s signature is verified using a pre-configured public key (the assumption being that it was signed with the related private key on the server). The JWT’s payload should contain an “integrity” field, which is then used to set the SRI data on the request being handled.
The plugin itself does not check integrity; it is assumed that the wrapped plugin will do that check.
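The overall flow is roughly this (verifyJwt() is a hypothetical stand-in for actual JWT signature verification; this is not the plugin’s real code):

// Fetch the signed integrity data for a content URL and return the SRI
// string to set on the actual content request.
async function getIntegrityFor(url, publicKey) {
  const token = await (await fetch(url + ".integrity")).text(); // a JWT
  const payload = await verifyJwt(token, publicKey); // throws if the signature is invalid
  return payload.integrity; // e.g. "sha256-..."
}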
By combining these plugins (for example, signed-integrity to retrieve integrity data for content, wrapping the integrity-check plugin that actually verifies the integrity of content fetched by a transport plugin wrapped by it in turn), it is possible to provide SRI for transport plugins not built around the Fetch API.
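In config terms, such a chain of wrapped plugins might look roughly like this (a sketch; the nesting syntax shown here is an assumption, not the documented schema):

{
  "plugins": [{
    "name": "signed-integrity",
    "publicKey": "<base64-encoded public key>",
    "uses": [{
      "name": "integrity-check",
      "uses": [{ "name": "gun-ipfs" }]
    }]
  }]
}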
A document on content integrity in the context of LibResilient was also created. It discusses SRI, different available plugins, pros and cons of different approaches to content integrity when using LibResilient, and mentions possible future developments.
Code written for this milestone is of course covered by tests, and so overall test coverage for the project went up to ~60%.
All of the functionality in this milestone was implemented without any external dependencies (npm and package.json are only used for running the unit tests, and have nothing to do with LibResilient’s browser-side code).
The aim remains for LibResilient to be deployable by simply copying a few JS files over to the directory from which a website is served. No dependency hell, no bundling, no stress.
Work has already started on the next milestone, focusing on being able to deploy LibResilient configuration changes even when the original website is not available.
First, that means that development is actually happening again. Second, one of the goals of this milestone was to create a decent testing harness, so that any errors or breakage caused by code changes can be caught early and fixed.
Third, this is the first milestone covered by the NLnet grant for LibResilient. You read that right: LibResilient development is now supported as part of the NGI Assure project. There are six more milestones defined as part of that grant.
The biggest specific pieces of work done for this milestone focused on test coverage, security testing infrastructure, refactoring, and error handling.
Currently, test coverage is at ~53% overall. This might not sound like a lot, until one considers that for all mature plugins (fetch, cache, any-of, alt-fetch) it’s at 100%, and for service-worker.js – the most important and complicated piece of the project – it’s at ~95%.
This means development can go ahead with reasonable confidence, as most bugs get caught early.
Less stable plugins (gun-ipfs, ipns-ipfs) and the user-facing libresilient.js do not have a lot of test coverage currently, as they will require substantial rewrites in the near future.
Security testing infrastructure was also set up, which led to code quality improvements.
This was a rather hairy piece of work, as it required refactoring a lot of tightly-coupled code which was handling core functionality of LibResilient, and which used to have certain old assumptions (like: “a plugin can only be used once in the config”) baked in.
Having a testing harness helped a lot in making those changes without worrying whether LibResilient remained functional, limiting the chance of serious bugs being introduced.
This also laid down the groundwork for future work on other milestones, for example by making the config-handling code cleaner and easier to reason about.
Along the way some old code was re-written and assorted issues fixed; importantly, more error handling code was added to service-worker.js.
Improved error handling means LibResilient should work better in Safari now (Safari implementing some new web APIs in the meantime also helped, of course).
I’m going to continue to work on LibResilient in a rather organic way – without a very strict plan. There’s plenty to be done, and basically all planned improvements are intertwined with one another. Setting up a proper website for the project, with some demos and examples, is high on the list though.