ocPortal Ramblings
Since I’ve now been working with
ocPortal long enough to get to
know its internals pretty well, and which bits are a joy to use and
which aren’t, I thought I’d write down a few thoughts about what I think
works badly, which areas would benefit from attention, what features I’d
like to see and what may be possible in the future. Due to ocPortal’s
nature, as a layer on top of stubborn databases, flaky libraries and
inconsistent languages, which needs to offer flashy, dynamic,
user-editable coolness in a reliable way, this list will inevitably
include low-level and high-level details. My bias is, of course, on the
internals, but I am also a Web user so I care about the user-facing
parts too :)
The Good (the majority of ocPortal)
Before I get into a rant about the less
brilliant parts of ocPortal, I thought I’d make it absolutely clear that
these are specific examples picked out because of their annoyingness;
they aren’t anything like a representative sample of ocPortal as a
whole. ocPortal contains loads of really cool features, which make it a
really nice platform to code on.
Some examples of this are
the “require” systems, which allow you to say “require_code(‘a’)”,
“require_lang(‘b’)”, “require_javascript(‘c’)” and “require_css(‘d’)”.
This will tell ocPortal that, before your code runs, the page should
have access to the code in ‘a.php’, the language strings in ‘b.ini’, the
Javascript in ‘c.js’ and the style information in ‘d.css’. The major
problem this solves is that it’s often a bad idea to include some
dependency “just in case” it’s not been included yet, because this can
cause a load of errors about conflicts where the dependency tries to
overwrite a previously included version of itself. With ocPortal, these
headaches never occur. A similar feature is the “extra_head” function,
which allows things (for example, raw Javascript, metadata tags, etc.)
to be written into the page’s <head></head> section at any
point during page generation, rather than having to specify them all at
the start.
A related feature is ocPortal’s fallback system.
Code requested by “require_code” will be taken from the “sources_custom”
folder if possible, and if there’s nothing which matches then ocPortal
falls back to the regular “sources” folder. The same happens for themes,
where the “templates_custom” folder of a theme will be checked first,
falling back to the “templates” folder, then if there is still no match
the “templates_custom” folder of the default theme is checked, and
finally the “templates” folder of the default theme is used. The same
applies to the “css_custom” and “images_custom” folders. Languages also
use fallbacks, going from “lang_custom” of the desired language, to
“lang” for the desired language, to “lang_custom” for the default
(currently English) and finally to the usual “lang” folder for the
default language (English). This makes it very easy to specify, for
example, tweaks to a theme (in the theme’s *_custom folders), specify a
theme’s standard behaviour (the theme’s non-custom folders), specify new
functionality available to all themes (the *_custom folders of the
default theme) and keep the original, known-to-work ocPortal defaults
intact (the non-custom folders of the default theme). The same is true
for the sources and sources_custom, although a little magic here allows
you to specify only the additions/changes you want to make, rather than
having to make a complete copy of the file first.
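The template lookup order described above can be sketched in a few lines. This is only an illustration: the function name, the "themes/" directory layout and the "default" theme name are assumptions made for the sketch, not ocPortal's real API.

```php
<?php
// Hypothetical helper (not ocPortal's actual API): walk the fallback chain
// and return the first template file that exists, or null if none does.
function sketch_find_template($root, $theme, $name, $default_theme = 'default')
{
    $candidates = array(
        "$root/themes/$theme/templates_custom/$name",         // tweaks to this theme
        "$root/themes/$theme/templates/$name",                // this theme's standard files
        "$root/themes/$default_theme/templates_custom/$name", // additions shared by all themes
        "$root/themes/$default_theme/templates/$name",        // untouched defaults
    );
    foreach ($candidates as $path) {
        if (is_file($path)) {
            return $path; // First match wins
        }
    }
    return null; // Nothing matched anywhere in the chain
}
```

The same loop would serve for CSS, images and language files by swapping the folder names.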
ocPortal’s
abstractions are also nice to work with, namely the database and forum
drivers. They are rather leaky and can easily be abused (for example
dumping raw SQL into the parameters of a nice abstract function), but as
long as this is avoided on the whole, then it can be a handy thing to
keep in place when you’re in a pickle and need a bit of a hacked
solution. It’s like the UNIX philosophy that if you take away people’s
ability to do stupid things, you also take away their ability to do
clever things (otherwise known as “giving you enough rope to hang
yourself” ;) ).
There’s a lot more I could write about cool
features of ocPortal, like the Tempcode templating language, the hooks
system, the AJAX workflows, etc. but I am writing this to talk about
what can be done, rather than what is already done.
Let’s move on.
The Bad (things which aren’t the best way to solve the problems they’re for)
I mentioned in
the above section that I could have gone on about ocPortal’s hooks being
a good thing. They are. However, they are also “bad” in that there is the
potential for them to be so much more. OK, a little background: a lot of
systems in ocPortal aren’t really suited to being written in one big
chunk, the classic example being site-wide search which has to include
bits from every other system. To handle these cases, ocPortal has a
feature called “hooks”; instead of putting the desired behaviour in a
few strategic places, we instead put a call to every one of the hooks we
have for that system. So for example, rather than having a large chunk
of search code which will look in the galleries, the downloads, the
forums, the news, the catalogues, the users, and so on, we instead say
“grab all of the search hooks and run each one”. Then, in the
“hooks/systems/search” folder of sources (or sources_custom) we can put
a new file to deal with galleries, one to deal with downloads, and so
on. Since the search says “all of the search hooks”, rather than some
specific list, we can add new hooks into the folder whenever we like and
they’ll be used everywhere that the built-in ones are.
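To make the “grab all of the search hooks and run each one” idea concrete, here is a minimal sketch. The class naming scheme (Hook_search_galleries and so on), the “run” method and both helper functions are assumptions for illustration; the two classes stand in for what would normally live in separate files in the hooks folder.

```php
<?php
// Hypothetical stand-ins for two hook files, e.g. hooks/systems/search/galleries.php.
class Hook_search_galleries
{
    public function run($query)
    {
        return array("gallery result for '$query'");
    }
}

class Hook_search_downloads
{
    public function run($query)
    {
        return array("download result for '$query'");
    }
}

// List hook names by scanning a folder: one hook per .php file.
function sketch_find_hooks($dir)
{
    $names = array();
    foreach (glob($dir . '/*.php') ?: array() as $path) {
        $names[] = basename($path, '.php');
    }
    sort($names);
    return $names;
}

// Run every named hook of one type and merge whatever they return.
function sketch_run_hooks($type, $names, $query)
{
    $results = array();
    foreach ($names as $name) {
        $class = 'Hook_' . $type . '_' . $name; // Class name assembled from strings
        if (!class_exists($class)) {
            continue; // A real system would load the hook's file at this point
        }
        $hook = new $class();
        $results = array_merge($results, $hook->run($query));
    }
    return $results;
}
```

Because the list of names comes from the folder contents rather than from the search code, dropping a new file into the folder is all it takes to extend the search.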
Hooks
rely heavily on a methodology called “metaprogramming”, where a language
manipulates some text, adding bits on, replacing bits, etc. just like a
regular program, but the text it’s manipulating is actually code written
in that language, which then gets run. Metaprogramming is really cool,
but can get quite confusing quite quickly, so it’s usually reserved for
when nothing else is available. ocPortal hooks are created through
metaprogramming as PHP objects, which are then asked to perform their
task. Now, the problem with hooks is that this metaprogramming magic
must be invoked each time we want to use a hook or hooks, and there’s a
lot of duplication in the hooks’ contents.
When we design a
system we usually try to keep related things together, but we inevitably
end up having some similar code split up into various disparate places.
With hooks, to me, the problem is that the search hooks have a gallery
search. The introspection hooks have gallery introspection. The
configuration hooks have gallery configuration. And so on. I think a
better Object Oriented way to do this would be to generate large,
amalgamated, Katamari-style objects to represent each part of ocPortal
(galleries, downloads, catalogues, etc.) which a hook may want (each
being defined via a hook, of course), and generate their methods from
the current hooks (eg. gallery->search, downloads->search,
downloads->introspection, etc.). This also makes it unnecessary to
specify which hooks you want to instantiate before using them (an
annoyance with the current model), as instead you can just ask ocPortal
for a specific hook object (eg. ‘search/galleries’), or all objects used
by a set of hooks (eg. ‘search’). This can be encapsulated behind a
single function call like “require_object”. Then the methods can be
generated and called via proxy methods on the objects. For example if we
ran “$galleries = require_object('search/galleries'); $galleries->search($params);”
then the call to “search” is intercepted, a relevant hook is looked for
and instantiated, then called with $params.
That keeps down the overhead of having to generate objects with all
hooks instantiated in them to start with. Performing a complete search
of the site would be as simple as “$site_objects =
get_objects('search'); $results = array(); foreach ($site_objects
as $obj) { $results = array_merge($results,
$obj->search($params));
}”. We could even recurse through all of the hooks, bunging the methods
into the generated objects. For example if we were writing part of the
gallery system then we might access our desired hooks via “$objects =
get_objects(); $search_results = $objects['galleries']->search($params);
$relevant_configs = $objects['galleries']->configs();”. Here I’ve
used the convention that “hooks/foo/bar.php” will be at the key “bar” in
the objects array, and will have the method “foo”.
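One way the proxy idea might look in PHP is via the __call magic method, which intercepts calls to methods that don't exist. Everything here is a sketch under assumptions: require_object and get_objects are the proposed functions, not existing ocPortal ones, and the Hook_<type>_<name> class naming, the “run” method and the hard-coded list of names are invented for the example.

```php
<?php
// Hypothetical hooks; "hooks/search/galleries.php" would define Hook_search_galleries.
class Hook_search_galleries
{
    public function run($params) { return array('gallery hit: ' . $params['q']); }
}
class Hook_search_downloads
{
    public function run($params) { return array('download hit: ' . $params['q']); }
}
class Hook_configs_galleries
{
    public function run() { return array('thumbnail_size' => 128); }
}

// Proxy for one part of the site. Calling $obj->foo(...) finds the hook for
// "hooks/foo/<name>.php" on demand and forwards the call to it.
class Sketch_site_object
{
    private $name;
    private $hooks = array(); // Hooks instantiated so far, keyed by method name

    public function __construct($name) { $this->name = $name; }

    public function __call($method, $args)
    {
        if (!isset($this->hooks[$method])) {
            $class = 'Hook_' . $method . '_' . $this->name;
            if (!class_exists($class)) {
                throw new BadMethodCallException("No '$method' hook for '{$this->name}'");
            }
            $this->hooks[$method] = new $class(); // Instantiated only when first used
        }
        return call_user_func_array(array($this->hooks[$method], 'run'), $args);
    }
}

// 'search/galleries' gives the galleries object; only the last part of the
// path is needed because methods are resolved lazily.
function require_object($hook_path)
{
    return new Sketch_site_object(basename($hook_path));
}

// All objects with a hook of the given type, keyed by name. A real version
// would scan the hooks folder; here the known names are simply passed in.
function get_objects($type, $names = array('galleries', 'downloads'))
{
    $objects = array();
    foreach ($names as $name) {
        if (class_exists('Hook_' . $type . '_' . $name)) {
            $objects[$name] = new Sketch_site_object($name);
        }
    }
    return $objects;
}

// A complete site search, as in the text above.
$params = array('q' => 'sunset');
$results = array();
foreach (get_objects('search') as $obj) {
    $results = array_merge($results, $obj->search($params));
}
print_r($results);
```

No hook is instantiated until one of its methods is actually called, which is the overhead saving mentioned above.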
The Ugly (things which are unfortunate, and waiting to be fixed)
At the moment ocPortal has a few historic
systems which aren’t actually used. For example, there is infrastructure
to create new configuration options. This is actually useless, however,
since the configuration options are designated via hooks. Such
implementation changes have left various cruft around, and this can be
confusing for programmers as they wonder why calls to these things don’t
do as they expect.
Some parts of HTML and its kin are a
little annoying, for which various workarounds can be used.
Unfortunately, ocPortal uses Flash to work around some of these, for
example the limitations of file uploading (a form must be submitted (ie.
a new page loaded), no progress information is given, etc.). This is
unfortunate because the only viable runtime for Flash is Adobe’s
(formerly Macromedia’s) and this isn’t Free Software. Another example is
video and audio playing, where the pending HTML5 standard defines how to
embed such things, but makes no requirements for the formats to use. The
current formats most used are H.264, which gives high quality video and is
supported by many proprietary vendors like Apple, but whose Free Software
implementations are illegal in many countries due to patents; Ogg
Theora, which is a medium quality codec (it used to be poor, but has
improved a lot recently) which has been the format of choice for Free
Software for years, and is thus supported by browsers like Firefox
easily; and WebM, a new format pushed by Google, which is patent-free
(as far as we know) and supported by Free Software browsers. Until this
settles down, it is difficult to make an HTML5 video site which will work
on the majority of machines, without keeping at least 3 copies of each
video (which is an awful lot of space) and potentially infringing patent
law when converting to H.264. This unfortunately makes Flash a more
widespread option than the standard, for the moment at least. It is only
a matter of time before these things get replaced, but I would like to
see it sooner rather than later, especially since I refuse to comply
with Adobe’s licensing terms.
The Future (my wishlist and thoughts)
These are
ideas for where ocPortal can go. They’re not a roadmap, they’re not a
vision, they’re just things I’ve noticed and thought about, along with
explorations of what new possibilities we would have if we implemented
any of
these.
WebDAV:
WebDAV is an extension to HTTP which lets a Web server be used like a filesystem. On UNIX systems we’re used
to accessing everything via a file descriptor, and interfacing with all
of this via some address underneath /. More recently, thanks to
FUSE,
we’ve become used to the idea of mounting filesystems which
transparently encrypt their contents, which transparently compress their
contents, those which offer access to machines over a network, those
which auto-generate contents from some source such as
WikipediaFS, and so
on. Now, spare a minute for those poor folk still stuck on legacy
operating systems, which can’t even handle ext2 by default. They can,
however, mount WebDAV. That makes WebDAV, as a standard, a much better
filesystem concept than an ocPortal-specific filesystem akin to
WikipediaFS which very few users would be able to use.
By
having ocPortal available via a filesystem like WebDAV, we can remove
the barrier which prevents UNIX awesomeness from being unleashed on it.
With an ocPortal filesystem, auto-generated and managed by PHP
server-side code rather than being a direct representation of some
disk’s contents, we can allow users new ways to interact with a site
(mount a site with your login credentials and have access to all of the
files you’re allowed to view, organised neatly into folders, which are
populated automatically, and offer several concurrent hierarchies to
choose from (eg. “images/dave/october” and “images/october/dave” could
both generate the same results)), we can allow administrators new ways
to manage their site (copying CSVs directly into and out of the site,
dragging RSS in to post news, etc.) and we allow developers new ways to
customise a site (direct manipulation of a site’s source with native
programming tools, with the changesets being available directly to the
site’s PHP).
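The “several concurrent hierarchies” idea falls out naturally if each folder name in a path is treated as a filter rather than a location. A minimal sketch, assuming a made-up file list with “type”, “owner” and “month” fields in place of the site's real database:

```php
<?php
// Treat every path segment as a filter, so the order of segments is irrelevant.
// $files is a stand-in for whatever the site's database would return.
function sketch_list_virtual_folder($path, $files)
{
    $segments = array_filter(explode('/', $path), 'strlen'); // Drop empty parts
    $matches = array();
    foreach ($files as $file) {
        // The values which a folder name is allowed to match against
        $tags = array($file['type'], $file['owner'], $file['month']);
        if (count(array_diff($segments, $tags)) === 0) {
            $matches[] = $file['name']; // Every segment matched one of the tags
        }
    }
    return $matches;
}

$files = array(
    array('name' => 'beach.jpg', 'type' => 'images', 'owner' => 'dave',  'month' => 'october'),
    array('name' => 'cat.jpg',   'type' => 'images', 'owner' => 'alice', 'month' => 'october'),
    array('name' => 'snow.jpg',  'type' => 'images', 'owner' => 'dave',  'month' => 'december'),
);
print_r(sketch_list_virtual_folder('images/dave/october', $files));
print_r(sketch_list_virtual_folder('images/october/dave', $files));
```

Both calls list the same single file, since the two paths contain the same set of filters.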
Another advantage to doing this is that we can
cover the same ground as ownCloud.
This tries to offer an online filesystem like Dropbox or UbuntuOne but
Free Software (CPAL’s good for this, although not quite AGPL). Users can
store their desktop preferences there so they can be available on
whatever machine they’re using. We’ve already got a great Web frontend,
just not the filesystem backend (so the opposite to ownCloud at the
moment).
The point about handling changesets is what
intrigues me the most. Since all writes to the site will be handled by
PHP, we can do what we like with them. A nice way to handle them would
be to have the whole site in a distributed version control system like
Git, and have the changes saved as
commits as they are made. This would let site admins roll back changes
very easily, whilst also allowing changes to ocPortal sites to
cross-pollinate as users can pull changesets from each others’ sites (if
this is specifically enabled by the admin, of course). ocPortal would change
from being broadcast out of a restricted SVN repository to those sites
which poll the central server, into being a true community of shared
code, with no barriers to entry.
There would be no need to
keep backups of things which are removed (like addons), since they can
be restored from the site’s revision history or pulled in from the
ocPortal network. Indeed, the entire concept of addons can be
restructured into Git changesets which, along with themes, can spread
through the network without the need for a central repository.
ocPortal.com would be a showcase of suggestions, rather than a key piece
of infrastructure.
There are lots of services out there which
would enhance various parts of ocPortal, even if they remain as separate
servers. The obvious one is email. Forum threads should generate mailing
lists and emailing the list should populate the correct forum (there are
phpBB mods which do this, but I’ve not tried them). All content, like
blogs, news, galleries, etc. should be emailed out to those who want it
(in the body if possible, or else as an attachment) whilst comments can
be posted by sending a reply. Interacting with an ocPortal site should
be, as far as possible, doable via email.
If a site is being
hooked into an email server, then there should be the ability to
auto-generate email accounts for users. This is less standardised, but
some reference implementation could be written, eg. against
SquirrelMail. This would only
need to tie together the account credentials, it wouldn’t need any
interface work since a direct link to the mail frontend can be
given.
The same goes for XMPP/Jabber. There is limited
support in ocPortal at the moment for using XMPP in the chat rooms. I
think this should be available as standard, such that every chat room is
discoverable via XMPP service discovery on the site’s URL. Going
further, the site can offer content via Publish/Subscribe and allow all
of the same kind of interactions that are possible via
email.
A nice feature would be for ocPortal to seed its
content via BitTorrent, offering Magnet links via the site and using the
Distributed Hash Table to manage peers (rather than a
tracker).
There is a lot of opportunity for ocPortal to
sprinkle more metadata around sites. At the moment it uses Dublin Core
plus a custom ontology to describe documents. The most obvious next step
is to generate RDF for users’ profiles, using ontologies like Friend of
a Friend.
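A profile description of this kind can be very small. The sketch below writes one in Turtle syntax using the real FOAF terms foaf:Person, foaf:name, foaf:nick and foaf:homepage; the function names and the example URIs are made up, and the URIs passed in are assumed to be valid already (they are not escaped).

```php
<?php
// Escape a PHP string for use as a quoted Turtle string literal.
function sketch_turtle_literal($text)
{
    $escapes = array("\\" => "\\\\", '"' => '\\"', "\n" => '\\n', "\r" => '\\r', "\t" => '\\t');
    return '"' . strtr($text, $escapes) . '"';
}

// Build a FOAF description of one user as a Turtle document.
function sketch_foaf_profile($uri, $name, $nick, $homepage)
{
    $lines = array(
        '@prefix foaf: <http://xmlns.com/foaf/0.1/> .',
        '',
        "<$uri> a foaf:Person ;",
        '    foaf:name ' . sketch_turtle_literal($name) . ' ;',
        '    foaf:nick ' . sketch_turtle_literal($nick) . ' ;',
        "    foaf:homepage <$homepage> .",
    );
    return implode("\n", $lines) . "\n";
}

echo sketch_foaf_profile(
    'http://example.com/members/dave#me',
    'Dave "The Rave" Smith',
    'dave',
    'http://example.com/members/dave'
);
```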
There is the choice to use RDFa, which is scattered
throughout the pages, but this would make templating much harder, so I
think having separate RDF files for each thing we care to describe is
enough. We should make sure that we’re following the Linked Data
recommendations of using HTTP redirects (303 See Other) whenever a non-digital resource
is requested (eg. the URI of a person is requested). Generic concepts
common to all ocPortal sites, like “image”, “video”, etc., could be
linked to a reference on DBpedia for that concept. Sites may have an RDF
endpoint, if we find bots are taking too much bandwidth or aren’t
finding what we’re giving out, but it’s not too important. While we’re
discussing metadata, we can scrape incoming data for metadata, for
example EXIF data in an uploaded image, and display it alongside the
regular ocPortal fields.
There should be a federation
between ocPortal sites, such that accounts made on one site will work on
another ocPortal site, if the admins turn on this option. This could be
achieved via OpenID, ie. scrap ocPortal-specific users in favour of
everyone being identified by OpenID. In which case we would like to have
a mechanism to generate these for users who don’t already have one, or
who want another. There is some nice code floating about which used to
run xmppid.net which allows an XMPP ID to be used as an OpenID, by
asking for confirmation out-of-band via XMPP. If we have an XMPP server
hooked up to the site, with accounts for each user, then this would be a
nice solution.
Along with this, users on any ocPortal site
should be able to message any user on another ocPortal site. This could
be achieved via XMPP/email integration, and/or through some form of
OStatus-style messaging.
We should allow sites to use
client-side cryptography. Thus the site is merely a host; the user’s
browser does the encrypting via Javascript before uploading, and does
the decrypting after downloading. This would be difficult if we use
WebDAV.
If we had OpenCollaborationServices support then lots
of desktop applications would be able to grab data from ocPortal sites.
This is a bit of a moving target at the moment, with the only major
implementations being on the various parts of opendesktop.org (eg.
gnome-look.org, kde-apps.org, etc.), but this would give us a hook in to
many existing desktop applications.
Would be nice to have a
few more protocols coming in and out, under some abstraction (maybe an
“activities” log). PSYC can be used for server-server communications
(eg. syncing), Wave could be used for client-server. We might even look
at the Gobby protocol if we wanted to allow concurrent
editing.
More coherent Javascript management. Currently Javascript works on DOM
elements in a procedural way. Would be nicer to have an Object Oriented
approach, where the objects are whatever the hell we want them to be, and
there may be some DOM objects associated with an object if we like; the
objects should sort out all of the logic amongst themselves.
There is a lot
of interest in distributed social networks like Diaspora, GNUSocial,
StatusNet, etc. which are trying to be Free social networks or
replacements for specific services. They have the ideas right, some have
protocols, but none have any kind of decent sites coming out of their
HTML generators. Could we usurp them by taking the existing awesomeness
of ocPortal sites and making them support these protocols? Would be
nice, in that we’d get publicity, we’d get new features, plus everyone
would get ocPortal as a kind of uber-Diaspora/GNUSocial/StatusNet. CPAL
terms are like AGPL, which helps the cause.
Not a
specific criticism, but Tempcode is a bit daunting to use at first.
Being a string-replacement system, it is all about metaprogramming. This
is powerful, but do we want it to be a text replacement system, or do we
maybe want a generative language? Do we want to work at the string
level, or at the token level? What’s wrong with XML? What’s wrong with
s-expressions? Would be nice to have the minimum amount of syntax
possible, and no special-cases (we can break this symmetry at the
function/directive level, which is interpreted by PHP anyway).
s-expressions would be good for this. Similarly, if we want to overhaul
templating, we could make quite a nice generative language for CSS with
bugger all syntax and much more flexibility and extensibility. Would be
a job to get working in a way that Web people would accept though. Could
we make an XHTML-to-Tempcode convertor, to handle all of our legacy
support? If we had a higher-level (eg. token-based) language then it
would be a lot easier to make higher-level tools, like drag ‘n’ drop of
elements/directives, live previews, etc.
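To give a feel for what generating from a tree, rather than replacing text in a string, might look like, here is a toy renderer which uses nested PHP arrays as a stand-in for s-expressions. It is not Tempcode and makes no attempt to cover what Tempcode does; the node layout (tag, attributes, then children) and the function name are assumptions for the sketch.

```php
<?php
// Render a nested array as markup. Each node is array(tag, attributes, ...children);
// a plain string is a text node. Nothing is ever spliced into a string of
// markup, so escaping happens in exactly one place.
function sketch_render($node)
{
    if (is_string($node)) {
        return htmlspecialchars($node, ENT_QUOTES, 'UTF-8');
    }
    $tag = $node[0];
    $attributes = $node[1];
    $children = array_slice($node, 2);

    $html = '<' . $tag;
    foreach ($attributes as $key => $value) {
        $html .= ' ' . $key . '="' . htmlspecialchars($value, ENT_QUOTES, 'UTF-8') . '"';
    }
    $html .= '>';
    foreach ($children as $child) {
        $html .= sketch_render($child); // Children are rendered recursively
    }
    return $html . '</' . $tag . '>';
}

$tree = array('p', array('class' => 'intro'),
    'Fish & chips, ',
    array('em', array(), 'daily'),
);
echo sketch_render($tree), "\n";
// prints: <p class="intro">Fish &amp; chips, <em>daily</em></p>
```

Since the template is ordinary data until the final step, tools like drag ‘n’ drop editors or live previews could work on the tree directly instead of parsing text.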