How we built Known

We’ve got some exciting new things in store for 2016 that solve real problems for both higher and corporate education. We’ll discuss this in a future post on the Known blog. First, though, I wanted to take a step back and explain the technical decisions we made for Known.

What is Known?

Known is an open source web platform that lets groups and individuals publish a variety of media, on their own or together. You can choose who can see the content you publish, as well as where you reach your audience: you can syndicate your content to services like Twitter, Facebook, SoundCloud, Flickr, LinkedIn and more.

It’s also an open platform designed to be extended:

  • Every content type is provided by a plugin, so any organization can add new kinds of content. (For example, we don’t provide video out of the box — but you could.)
  • Syndication can be extended too: while we provide plugins for social media, a university could extend Known to allow students to submit their work to their Learning Management System. Inside the company, we’ll often create Known sites for particular projects and then syndicate our posts to Slack.
  • Known supports themes.
  • Plugins can also provide Single Sign On (we provide LTI and LDAP to our enterprise customers, but for example, KQED uses SSO to link Known to WordPress accounts).
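
As a rough illustration of the extension model in the list above, here is a minimal sketch in Python (Known itself is written in PHP). The `PluginRegistry` class and its method names are hypothetical and are not Known’s actual API; the sketch only shows how content types and syndication targets can both come from plugins rather than from the core.

```python
from typing import Callable, Dict, List


class PluginRegistry:
    """Hypothetical registry: the core knows no specific content type or service."""

    def __init__(self) -> None:
        self.content_types: Dict[str, Callable[[dict], str]] = {}
        self.syndicators: Dict[str, Callable[[str], str]] = {}

    def register_content_type(self, name: str, renderer: Callable[[dict], str]) -> None:
        # A plugin contributes a new kind of content (status, photo, video, ...).
        self.content_types[name] = renderer

    def register_syndicator(self, service: str, send: Callable[[str], str]) -> None:
        # A plugin contributes a new place to send published content.
        self.syndicators[service] = send

    def publish(self, content_type: str, data: dict, services: List[str]) -> List[str]:
        # Render once with the content-type plugin, then hand off to each service plugin.
        body = self.content_types[content_type](data)
        return [self.syndicators[service](body) for service in services]


registry = PluginRegistry()
registry.register_content_type("status", lambda d: d["text"])
registry.register_syndicator("slack", lambda body: f"slack <- {body}")
receipts = registry.publish("status", {"text": "Hello"}, ["slack"])
```

The point of the design is that adding video, or syndication to a Learning Management System, means registering one more entry rather than changing the core.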

Known works for a single user — my website runs on it — or five thousand. It’s up to you.

Did I mention it’s fully responsive, meaning it works just as well on your smartphone as it does on your laptop? Or that every page is an API endpoint?

Install anywhere, extend easily

A key goal for Known is the ability to install it virtually anywhere.

Installing self-hosted web software is, unfortunately, not as easy as installing an app on your iPhone or your laptop. However, it doesn’t need to be a developer-centric process.

Shared web hosts are immensely popular, and abstract away a lot of the really technical work involved in maintaining a server. You can often select an application to install from a directory of available projects, answer a few questions, and be ready to go in a couple of minutes. At worst, you upload some files via FTP. You never have to drop to a command line and run Linux commands; indeed, often you can’t.

We wanted to be compatible with these hosts (our web hosting sponsor is DreamHost), and also to serve power users who have deeper technical control over their servers. That implied a number of requirements:

  • The software language needs to be compatible with a large number of servers
  • Users need to be able to install the software without command line tools
  • Knowledge of version control systems like git, or hosting services like GitHub, shouldn’t be required
  • Use of a package manager like npm or Composer shouldn’t be required

It turns out that the most widely-supported language on shared hosts is PHP.

PHP has received no small amount of scorn in developer circles over the last decade, and a lot of it is fairly earned. But the truth is that modern versions (particularly 5.4 and above) have consistent interfaces and modern language features, like namespaces and closures, that bring PHP closer in line with more cutting-edge languages. The PHP Standards Recommendations (PSRs) produced by the Framework Interop Group, and popularized by PHP The Right Way, have done a lot to standardize PHP code.

In fact, PSR-4, which defines how class namespaces map to file paths so that classes can be autoloaded on demand, turns out to be very useful. Every plugin in Known uses this standard for autoloading.
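
To make the PSR-4 idea concrete, here is a minimal sketch of the name-to-path rule, written in Python for illustration only (a real autoloader is PHP code registered with the runtime). The `ExamplePlugins\Video` namespace and the `plugins/Video` directory are made-up values, not Known’s actual layout.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


def psr4_path(class_name: str, prefixes: Dict[str, str]) -> Optional[str]:
    """Resolve a fully qualified PHP class name to a file path, PSR-4 style."""
    name = class_name.lstrip("\\")
    # Try the longest registered namespace prefix first.
    for prefix in sorted(prefixes, key=len, reverse=True):
        ns = prefix.strip("\\") + "\\"
        if name.startswith(ns) and len(name) > len(ns):
            # Sub-namespaces become directories; the class name becomes the file name.
            relative = name[len(ns):].replace("\\", "/")
            return str(PurePosixPath(prefixes[prefix]) / f"{relative}.php")
    return None  # no registered prefix matches this class


# Hypothetical prefix-to-directory mapping, for illustration only.
prefixes = {"ExamplePlugins\\Video": "plugins/Video"}
path = psr4_path("\\ExamplePlugins\\Video\\Pages\\Edit", prefixes)
```

Because the path follows mechanically from the class name, a plugin only has to be dropped into the right directory for its classes to be found; no include list has to be maintained.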

The only question is PHP version: not every host supports these features. In fact, while it turns out that 98.8% of PHP hosts support version 5 or above, 34.3% of these are on version 5.3. We expect this number to shrink over time, and consider it acceptable to be supported by the remaining 65% of web hosts. The syntactic features you gain, like closures, are worth it.
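
The 65% figure can be reproduced from the two percentages quoted above. This is a back-of-the-envelope check, and it assumes that every PHP 5 host not on 5.3 runs 5.4 or above, which the figures themselves do not confirm.

```python
# Figures quoted in the paragraph above.
php5_share = 0.988    # share of PHP hosts on version 5 or above
on_5_3_share = 0.343  # share of those hosts that are on version 5.3

# Assumption: every PHP 5 host that is not on 5.3 runs 5.4 or above.
supported = php5_share * (1 - on_5_3_share)
print(f"{supported:.1%}")  # prints 64.9%
```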

To support virtual URLs, we initially required the Apache web server (which is still the leader overall on the web). However, a number of community members have created open source configurations for nginx.

The data model

I don’t think it’s acceptable for plugins to create and maintain their own database tables. For one thing, you may not want to grant Known permission to modify the database schema. For another, it makes every plugin a potential database security risk or performance drain.

Instead, from the beginning I wanted plugins to access the database via an abstracted interface, and never have to worry about the schema. At the same time, I wanted plugins to be able to store any data they needed to function, in a way that made sense in the context of that plugin.

The first versions of Known used a NoSQL database, MongoDB, as their sole data store. This worked well for development, but it quickly became apparent that shared hosting would not support this as a data layer. In interview after interview, users said they wanted to run Known on hosts like Reclaim Hosting and Nearly Free Speech. In fact, many shared hosts support MySQL and nothing else. This left us with a challenge: could we provide a schemaless database layer while providing full support for MySQL?

Kevin Marks provided the answer: a balanced schema developed by FriendFeed back in the days before NoSQL databases became commonplace. We created a highly-indexed metadata table, which is used purely for searching for objects, and then stored the complete objects as JSON in an object database. All of this is provided by a seamless database layer called the Data Concierge, which abstracts many of the functions provided by the MongoDB PHP extension.
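
The FriendFeed-style layout described above can be sketched in a few lines. This is an illustration in Python using SQLite, not the Data Concierge itself: the table names, column names, and the `save` and `find` helpers are all assumptions made for the example.

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (uuid TEXT PRIMARY KEY, contents TEXT NOT NULL);
    CREATE TABLE metadata (entity TEXT NOT NULL, name TEXT NOT NULL, value TEXT);
    CREATE INDEX metadata_lookup ON metadata (name, value);
""")


def save(obj: dict) -> str:
    """Store the whole object as JSON and index its scalar fields for search."""
    key = str(uuid.uuid4())
    conn.execute("INSERT INTO entities VALUES (?, ?)", (key, json.dumps(obj)))
    rows = [(key, k, str(v)) for k, v in obj.items() if isinstance(v, (str, int, float))]
    conn.executemany("INSERT INTO metadata VALUES (?, ?, ?)", rows)
    return key


def find(name: str, value: str) -> list:
    """Search only the indexed metadata table, then load the full JSON objects."""
    cur = conn.execute(
        "SELECT e.contents FROM metadata m JOIN entities e ON e.uuid = m.entity "
        "WHERE m.name = ? AND m.value = ?",
        (name, value),
    )
    return [json.loads(row[0]) for row in cur]


save({"type": "photo", "title": "Sunset", "tags": ["sky"]})
save({"type": "status", "body": "Hello"})
photos = find("type", "photo")
```

Plugins can store any shape of object they like, because the schema never changes: new fields simply become new rows in the metadata table.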

A side effect of this abstraction is that more databases could be added easily. Today, as well as MongoDB and MySQL, Known supports SQLite and Postgres.

Distributed social networking and uncool URIs

One of the core original visions for Known was that data could be distributed. A user on site A could participate in a community on site B. Imagine creating a group for a project at one company, and then allowing users from a second company to join and collaborate without re-registering! There are lots of real-world possibilities for distributed social networking.

To prepare for this, we decided that every object would have a URI as its definitive unique identifier (what we call its UUID). The idea was that you could access any resource by its UUID anywhere on the web, and as long as the request was properly signed, you’d be able to access it as if it were locally stored. In the end, I consider this a core mistake, but one that is hard to move away from.

Tim Berners-Lee famously said that “cool URIs don’t change”. Unfortunately, in the real world, URIs change all the time — and there’s no way to require that they don’t.

  • Domain names expire
  • People lose control over their domain names (e.g., a subdomain at a university)
  • People choose to move sites into subdirectories

Imploring people to strongly consider their website layouts, as the W3C does, is not helpful for individuals who just want to run a site. The web is not set in stone; websites change, and URIs should be treated as volatile in any internal data model.

As it stands, Known contains a number of protections that allow it to be moved to different domains or directory locations, so users don’t notice a difference. It’s not a technical decision I’m proud of — but it may yet come into its own. We already use the indie web technologies for some distributed social networking, and it’s an idea that I’m convinced will transform the web.
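
One way to treat URIs as volatile is to persist only a site-relative path and attach the current base URL when an object is read. The sketch below shows that general idea in Python; it is not a description of the protections Known actually uses, and the function names and example domains are hypothetical.

```python
from urllib.parse import urljoin


def to_stored(url: str, base: str) -> str:
    """Strip the site's base URL so only a site-relative path is persisted."""
    if not base.endswith("/"):
        base += "/"
    if not url.startswith(base):
        raise ValueError("URL does not belong to this site")
    return url[len(base):]


def to_public(stored: str, base: str) -> str:
    """Rebuild a full URL from the stored path and wherever the site lives now."""
    if not base.endswith("/"):
        base += "/"
    return urljoin(base, stored)


# The site starts on a university subdomain, then moves into a subdirectory elsewhere.
stored = to_stored("https://old.example.edu/2016/hello", "https://old.example.edu")
moved = to_public(stored, "https://example.org/known")
```

With this approach the stored identifier survives a change of domain or directory, at the cost of no longer being a globally resolvable URI on its own.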

The front end

Creating a native mobile app for a platform that can be infinitely extended is difficult. Instead, we created a fully responsive, touch-friendly interface.

Known separates model, view and controller, and any page can be viewed with a different template. For example, here’s my website using a JSON template, and here’s a Star Wars crawl. Any plugin or theme can override any template element, so I could write a plugin that changes out the WYSIWYG editor (we use TinyMCE), or that displays avatar images as 3D spheres (if I really wanted to). I could write a template to display Known sites using a virtual reality browser — and someone really should!
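
A minimal sketch of the idea, in Python rather than Known’s PHP: the same page data is passed to whichever template is requested, and a theme or plugin can replace a template without touching the others. The `TEMPLATES` table and `render` function are hypothetical names used only for this example.

```python
import json
from html import escape
from typing import Callable, Dict

post = {"title": "How we built Known", "body": "Every page is an API endpoint."}

# Each template turns the same page data into a different representation.
TEMPLATES: Dict[str, Callable[[dict], str]] = {
    "default": lambda p: f"<article><h1>{escape(p['title'])}</h1><p>{escape(p['body'])}</p></article>",
    "json": lambda p: json.dumps(p, sort_keys=True),
}


def render(page: dict, template: str = "default") -> str:
    """Render one page with the named template."""
    return TEMPLATES[template](page)


html_before = render(post)
as_json = render(post, "json")

# A hypothetical theme overrides one template without touching the others.
TEMPLATES["default"] = lambda p: f"<main>{escape(p['title'])}</main>"
html_after = render(post)
```

Because the JSON template is just another view of the same data, every page doubles as an API endpoint without any separate API code.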

For the default template, we chose Bootstrap and jQuery. The former provides a solid, responsive UI that can be extended easily (and which removed the need to develop it from scratch). The latter provides a powerful, performant way to query elements on the page. Not only did this combination let us get up and running quickly, but plugin authors could use them to create simple, grid-based user interfaces that would be in line with the platform as a whole.

For glyphs like social media logos, we use FontAwesome. The latest version contains 605 different user interface icons, is well tested, has a good community, and carries a compatible open source license. All of these things made it perfect for our use and, again, made its features available to plugin authors.

Every page is built with HTML5 and CSS3. Content is encoded using microformats, allowing software to read and extract meaning from our human interfaces. This forms the basis of important decentralized social web protocols like those used by the indie web community.
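
To show what “software reading our human interfaces” can look like, here is a deliberately simplified sketch in Python. It collects the text of elements carrying `p-*` or `e-*` microformats2 class names from a flat snippet of markup. The `PropertyParser` class is hypothetical, and a complete microformats parser also handles nesting, implied properties, and URL and date properties, none of which this sketch attempts.

```python
from html.parser import HTMLParser


class PropertyParser(HTMLParser):
    """Collect text of elements with p-* or e-* classes (flat, non-nested markup only)."""

    def __init__(self) -> None:
        super().__init__()
        self.properties = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls.startswith(("p-", "e-")):
                # Drop the two-character prefix: "p-name" becomes "name".
                self._current = cls[2:]
                self.properties[self._current] = ""

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data


markup = (
    '<article class="h-entry">'
    '<h1 class="p-name">Hello</h1>'
    '<div class="e-content">First post</div>'
    "</article>"
)
parser = PropertyParser()
parser.feed(markup)
entry = parser.properties
```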

Over time, we’ve learned that we do need to support a mobile app. The mobile web has evolved to be decent for consumption, but there are obvious missing pieces for producing content on a mobile device over the web.

For example: it’s difficult to upload media. Resizing camera JPEGs in front-end JavaScript on a mobile device is not a reliable process. The web audio API produces WAV files rather than MP3s; these are uncompressed and potentially large. We could re-encode them on the server side using something like ffmpeg, but it’s not reasonable to expect a shared host to support media encoding, nor is it reasonable to expect users to link up to a third-party media encoder like Zencoder. Worse: we found that the web audio API actually crashed many mobile browsers!
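
To give a sense of how large “uncompressed” is, the size of raw PCM audio follows directly from its recording parameters. The values below (44.1 kHz, 16-bit, stereo) are assumed CD-quality settings chosen for illustration; what a given browser or device actually records may differ, and the small WAV file header is ignored.

```python
SAMPLE_RATE_HZ = 44_100  # assumed CD-quality sample rate
BYTES_PER_SAMPLE = 2     # assumed 16-bit samples
CHANNELS = 2             # assumed stereo


def wav_bytes(seconds: float) -> int:
    """Approximate size of uncompressed PCM audio, ignoring the WAV header."""
    return int(seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS)


one_minute_mb = wav_bytes(60) / 1_000_000
print(f"{one_minute_mb:.1f} MB per minute")  # prints 10.6 MB per minute
```

Under those assumptions, a five-minute recording is already over 50 MB before it leaves the phone.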

This problem is compounded by video uploads. Video files are huge, and there’s no way to compress them in a browser. Background uploads are hugely tricky, and resuming failed uploads is also hard. That’s even before they’ve reached the server, and when files can be as large as 1GB per minute of footage, both storage and encoding are hard.

For the mobile web to effectively compete with apps, it needs to support the content composition experiences that native apps have been using for years. If we want people to build websites, the web needs to support building, across devices. It’s a frustration, and an ongoing problem.

Moving on

Our PHP-based infrastructure and need to support shared hosts mean that some features are much harder to produce. The truth is that technologies like websockets (useful for performant real-time user interfaces) are hard for non-developers to self-host. New web platform features like service workers show enormous promise, but require secure connections, and even with empowering projects like Let’s Encrypt, setting up secure sites is still too complicated for most people.

The good news is that some progressive enhancement is possible: companion services that provide extra capabilities to hosted software. It’s also true that hosts are evolving, and our friends at DreamHost and Reclaim Hosting are thinking hard about the future of the space.

I’m proud of the platform we’ve created — it’s one we use every day, and I’m delighted to see people posting on their own servers all over the world. We’ve got big plans for the Known open source project this year, and we’re looking forward to sharing them with you, in conjunction with something new that we’ll tell you about soon.

It’s going to be a great year.

Originally published at werd.io on January 13, 2016.