Migrating from Drupal 7 to Known


What's Next?

As you can see, funnymonkey.com has had quite a facelift. When we realized that FunnyMonkey would be going through a transition, Bill and I reviewed what the future of funnymonkey.com would look like. Historically, the reason to keep coming back has been Bill's blogging on education and education policy, so the focus would be on something that worked well as a blogging platform. We cast the net wide and considered many options, including staying with Drupal or migrating to WordPress, Laravel, Revel, Go, etc.

In the end we chose Known. Known was already on my radar after I met Ben Werdmüller and Erin Jo Richey at Reclaim Your Domain: The UMW Hackathon. Besides being great people to talk with and work with, Erin and Ben have a great vision for Known and a solid architecture. Known is built with the ethos of the IndieWeb movement and the POSSE publishing model. The ethos of Known and FunnyMonkey line up pretty closely.

How do we get our content into Known?

Okay, so we've chosen Known and we have 10 years of content sitting in a Drupal 7 site. Now what?

After a cursory review, the import and export routines within Known appeared to be hardcoded and, as far as I could tell, not pluggable. That's a minor disappointment (more on this later). At this point it looked like a custom plugin was the way forward. Known plugins are pretty straightforward, and looking at the default ones proved to be quite helpful. For instance, take a look at Bridgy's Main.php file (found under IdnoPlugins):


    namespace IdnoPlugins\Bridgy {
        use Idno\Common\Plugin;
        class Main extends Plugin {
            function registerPages() {
                \Idno\Core\site()->template()->extendTemplate('account/menu/items', 'bridgy/menu');
                \Idno\Core\site()->addPageHandler('account/bridgy/?','IdnoPlugins\Bridgy\Pages\Account');
            }
        }
    }

That's it for the minimal plugin: just register some pages and templates. Beyond that, there is an expected directory structure where Known will find the registered page handlers and templates. Again, reviewing Bridgy:


Bridgy/
├── Main.php
├── Pages
│  └── Account.php
├── plugin.ini
└── templates
    └── default
        └── bridgy
            ├── account.tpl.php
            ├── facebook.tpl.php
            ├── menu.tpl.php
            └── twitter.tpl.php

We see that the call to \Idno\Core\site()->addPageHandler() registers a page handler for account/bridgy, implemented by the class IdnoPlugins\Bridgy\Pages\Account in Pages/Account.php. That's the basic structure. I'm covering Bridgy for a couple of reasons:

  1. It's simple: It doesn't take much code to constitute a plugin in Known.
  2. It's included: The code I'm about to show you is my first Known code and is largely a one-off, since it is a migration and will not have an ongoing use. So using Bridgy is a bit more illuminating, as it's fair to say it is likely idiomatic Known code.

Writing a content migration plugin

Caveat: This is not exemplary code and can be improved in many ways. What it does show you is how easy it is to get content from other systems into Known. There are many points worth considering for refactoring, such as storing the new ID to old ID association as the content is imported rather than outside of the save routine(s). That said, you can find the code we used over here.

I'm going to skip the detailed points of the code, in the hope that the code is commented well enough and easy enough to read. This post will instead focus on an overview of the process.

Assumptions

  1. The Drupal DB will be available during the import routines. For this we just backed up the FunnyMonkey.com DB and restored it locally on our development stack.
  2. The Drupal files directory will be available during the import routines. These were just rsync'd from the production site into /srv/www/legacy/files.
  3. The migration will proceed in the following order, as dictated by dependencies:
    1. Files: Have no requirements
    2. User: Have user profile pictures and require Files
    3. Nodes: Have authors and files associated and thus require the Files and User imports
    4. Comments: Require nodes
  4. The source content is in MySQL
  5. URL rewrites will be created to map all content
  6. Some method to check old content and new content will be necessary for quality checking
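
The ordering in assumption 3 can be derived mechanically from the dependencies. The following is a minimal sketch in plain PHP, independent of Known and not part of our plugin; the $requires map and the migrationOrder() function are hypothetical names used only for illustration.

```php
<?php
// Hypothetical dependency map: each step lists the steps it requires.
$requires = [
    'comments' => ['nodes'],
    'nodes'    => ['files', 'users'],
    'users'    => ['files'],
    'files'    => [],
];

/**
 * Return the steps ordered so that every step comes after its requirements
 * (a depth-first topological sort).
 */
function migrationOrder(array $requires): array
{
    $ordered = [];
    $visit = function (string $step, array $path) use (&$visit, &$ordered, $requires) {
        if (in_array($step, $ordered, true)) {
            return; // already scheduled
        }
        if (in_array($step, $path, true)) {
            throw new RuntimeException("Circular dependency at '$step'");
        }
        $path[] = $step;
        foreach ($requires[$step] ?? [] as $dependency) {
            $visit($dependency, $path);
        }
        $ordered[] = $step; // all requirements are scheduled, so this step can run
    };
    foreach (array_keys($requires) as $step) {
        $visit($step, []);
    }
    return $ordered;
}

// Prints: files -> users -> nodes -> comments
echo implode(' -> ', migrationOrder($requires)), PHP_EOL;
```

For four steps this is overkill, but it makes the ordering explicit and fails loudly if a circular requirement is ever introduced.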

Writing our plugin

Registering pages


function registerPages() {
    // Administration page
    \Idno\Core\site()->addPageHandler('admin/drupalmigration','\IdnoPlugins\DrupalMigration\Pages\Admin');
    \Idno\Core\site()->addPageHandler('admin/drupalmigration/users','\IdnoPlugins\DrupalMigration\Pages\User');
    \Idno\Core\site()->addPageHandler('admin/drupalmigration/nodes','\IdnoPlugins\DrupalMigration\Pages\Node');
    \Idno\Core\site()->addPageHandler('admin/drupalmigration/files','\IdnoPlugins\DrupalMigration\Pages\File');
    \Idno\Core\site()->addPageHandler('admin/drupalmigration/comments','\IdnoPlugins\DrupalMigration\Pages\Comment');
    \Idno\Core\site()->template()->extendTemplate('admin/menu/items','admin/drupalmigration/menu');
}

In order, we register pages for the following:

  1. Admin page: this will be our overview where we set our database settings. Arguably this could be omitted and just hardcoded.
  2. User Page: This will be the overview for user import.
  3. Node Page: This will be the overview for node import.
  4. File Page: This will be the overview for file import.
  5. Comment Page: This will be the overview for the comment import.

Then we register a template extension to get our 'DrupalMigration' into the menu. This is just a snippet that extends the existing menu to include our options for the DrupalMigration. Review the contents of

DrupalMigration/templates/default/admin/drupalmigration/menu.tpl.php

to see how this injects our menu options into the default menu.

Implementing a page

I'm only going to cover the process for the File portion, as that is our first page and is representative of the process for all the other pages (excluding the overview page where the DB settings are entered). The skeleton for this file is the following:


    namespace IdnoPlugins\DrupalMigration\Pages {
        class File extends \Idno\Common\Page {
            function getContent() {
            }

            function postContent() {
            }
        }
    }

We extend \Idno\Common\Page and implement two methods, one for GET requests and one for POST requests. In the file's getContent() method we ensure that only admins can access this page via $this->adminGatekeeper(); then we build some tabular data to give an overview of the files to be imported and their status. We store ongoing migration data inside Known's site config. Arguably we should have used an external table to manage this, which would be especially necessary for larger migrations. The file map, which tracks files we have already imported, is stored in \Idno\Core\site()->config()->drupal_migration_file_map. Most of this code consists of building up a data structure which we then pass to our admin/file template.

You can review the template in DrupalMigration/templates/default/admin/file.tpl.php. Again, this could be better architected: more of the logic belongs in getContent(), so that the template only iterates and outputs without doing any calculations. That said, our template does do a bit of work to present some URL rewrites for the files that have been migrated, so that we can include those in our .htaccess after the migration.

For the postContent() method we again ensure the user is an admin, then iterate over the files and use our plugin class's methods to handle all of the heavy lifting of getting the files into Known. After we process all of the files we redirect via $this->forward(\Idno\Core\site()->config()->getDisplayURL() . 'admin/drupalmigration/files'); back to the same page so the user can see the results.
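
The core of that loop, stripped of anything Known-specific, might look like the sketch below. It is not the plugin's actual code. importPendingFiles() and the $import callback are hypothetical stand-ins for the plugin methods that store a file in Known, and the 'fid' and 'uri' keys are assumed to mirror the columns of Drupal 7's file_managed table.

```php
<?php
/**
 * Import every file that is not yet in the file map and return the updated map.
 *
 * $fileMap maps old Drupal file IDs to new IDs, so re-running the import
 * skips anything that has already been migrated.
 * $import is a hypothetical stand-in: it receives one file row and returns
 * the ID of the newly stored file.
 */
function importPendingFiles(array $files, array $fileMap, callable $import): array
{
    foreach ($files as $file) {
        $fid = $file['fid'];
        if (isset($fileMap[$fid])) {
            continue; // imported on an earlier run
        }
        $fileMap[$fid] = $import($file);
    }
    return $fileMap;
}

// Usage with made-up rows and a fake importer.
$files = [
    ['fid' => 1, 'uri' => 'public://logo.png'],
    ['fid' => 2, 'uri' => 'public://report.pdf'],
];
$alreadyImported = [1 => 'known-id-a'];
$fakeImport = function (array $file): string {
    return 'known-id-for-' . $file['fid'];
};
$fileMap = importPendingFiles($files, $alreadyImported, $fakeImport);
```

Keeping the map keyed by the old ID is what makes the POST safe to repeat: a second submission only touches files that failed or were added since the last run.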

Additional details

Hopefully everything so far has been helpful. The code could be used as a starting point for other Drupal site migrations into Known. The constants at the top of the file will need adjustment to grab your content appropriately. Assuming you use the same field names for the SQL queries, the rest of the import code should largely work. Beyond those constants, the following methods will likely need review & refactoring to meet your needs:

  • getFiles(): This currently includes a bunch of unmanaged files and dummies them up to match the managed files data structure. The list of unmanaged files that should migrate will vary from site to site.
  • addUser(): Hardcodes adding a couple of users as admins. This could be omitted. All user accounts get mangled passwords between 68 and 127 characters in length. The idea here is to require users to set a new password via Known's password reset process.
  • rewriteURL(): Can be modified to clean up any garbage content and normalize URLs into one particular format. We opted to switch to relative rather than absolute links so that testing would work fine when we were not on the funnymonkey.com domain. This could also be extended to support rewriting node references to other nodes as well, but we opted to defer to 301 (moved permanently) redirects.
  • rewriteContentLinks(): We rewrite content references using our rewriteURL() process so that we can map files to their new destination and normalize on the same process for all content.
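
To illustrate the absolute-to-relative normalization described for rewriteURL(), here is a minimal standalone sketch. It is not the plugin's implementation. toRelativeUrl() is a hypothetical name, and treating www.funnymonkey.com as an alias of funnymonkey.com is an assumption.

```php
<?php
/**
 * Turn an absolute URL on one of our own hosts into a root-relative URL.
 * External, already-relative, and unparseable URLs are returned unchanged.
 * The default host list is an assumption for illustration.
 */
function toRelativeUrl(string $url, array $hosts = ['funnymonkey.com', 'www.funnymonkey.com']): string
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) {
        return $url; // already relative, or not parseable
    }
    if (!in_array(strtolower($parts['host']), $hosts, true)) {
        return $url; // points at another site; leave it alone
    }
    $relative = $parts['path'] ?? '/';
    if (isset($parts['query'])) {
        $relative .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $relative .= '#' . $parts['fragment'];
    }
    return $relative;
}

echo toRelativeUrl('http://funnymonkey.com/node/12?page=2#comments'), PHP_EOL;
```

Because the result no longer names a host, the same imported content works on a local development stack and on the production domain.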

Taxonomy is handled by mapping to hashtags appended to the end of the content. See addNode() for more details.
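
As a rough illustration of the taxonomy-to-hashtag idea (addNode() remains the reference for what we actually did), the sketch below turns a list of term names into a hashtag string. termsToHashtags() is a hypothetical helper, and the CamelCase formatting is an assumption rather than a description of the plugin's output.

```php
<?php
/**
 * Convert taxonomy term names into a space-separated hashtag string.
 * Multi-word terms are joined in CamelCase and duplicates are dropped.
 */
function termsToHashtags(array $terms): string
{
    $tags = [];
    foreach ($terms as $term) {
        // Capitalize each word, then keep only letters and digits (Unicode-aware).
        $tag = preg_replace('/[^\p{L}\p{N}]+/u', '', ucwords($term));
        if ($tag !== null && $tag !== '') {
            $tags['#' . $tag] = true; // array keys de-duplicate the tags
        }
    }
    return implode(' ', array_keys($tags));
}

// The result would be appended to the end of the node body during import.
$body = 'Post text.' . "\n\n" . termsToHashtags(['education policy', 'Drupal', 'drupal']);
```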

URL rewrites

In addition to each step in the migration rendering a list of rewrites at the bottom of the import screen, Drupal also uses URL aliases (stored in the url_alias table) that we need to account for.

The following SQL does that for us; we omit all aliases that do not point at users or nodes.


SELECT CONCAT('RewriteRule "^', alias, '$" "', source, '" [L,R=301]')
FROM url_alias
WHERE source LIKE '%user%' OR source LIKE '%node%';
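
The same rules can be generated in PHP, which makes it easy to escape regex metacharacters in the alias (a "." in an alias would otherwise match any character). This is a sketch and not what we ran. aliasRewriteRules() is a hypothetical name, and it deliberately filters more strictly than the LIKE clauses above by keeping only sources of the form node/N or user/N.

```php
<?php
/**
 * Build one RewriteRule line per url_alias row that points at a node or user.
 * Each row is expected to have 'alias' and 'source' keys, as in the query above.
 */
function aliasRewriteRules(array $rows): array
{
    $rules = [];
    foreach ($rows as $row) {
        // Stricter than LIKE '%node%': only exact node/N or user/N sources.
        if (!preg_match('#^(node|user)/\d+$#', $row['source'])) {
            continue;
        }
        $rules[] = sprintf(
            'RewriteRule "^%s$" "%s" [L,R=301]',
            preg_quote($row['alias']), // escape ".", "+", etc. in the alias
            $row['source']
        );
    }
    return $rules;
}

// Usage with made-up rows.
$rows = [
    ['alias' => 'about', 'source' => 'node/12'],
    ['alias' => 'blog/2014.html', 'source' => 'node/40'],
    ['alias' => 'tags/drupal', 'source' => 'taxonomy/term/3'],
];
echo implode(PHP_EOL, aliasRewriteRules($rows)), PHP_EOL;
```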

Points for improvement in Known

Overall, the experience with Known was fantastic. It was very refreshing to work with a system that has such a tightly focused use case and a quality implementation. That said, I saw the following points as potential opportunities for improvement.

Modular import/export process
Arguably this can be better handled with custom code, as we did. However, a modular import/export process lowers the barrier to collaborating and getting content into Known. Perhaps the import/export functionality should itself be a Known plugin. In fairness, what is currently there handles other platforms that have a standardized export process, and that's a good first step. Besides, Drupal is far from having a standard export routine across its various implementations. For Drupal, there could be a standard Views export template that you map your content into, and then a generic Drupal-to-Known importer that imports data formatted in the particular way defined by that Views template, but that's a Drupal project.
AddAnnotation() doesn't return ID
The other methods for saving Known content all return the newly created ID when creating new objects. This is really a minor nitpick, but it made checking the import routine a bit haphazard and prevented a one-to-one mapping in the URL rewrites for comments. In our case we opted to rewrite to the source document rather than the specific comment. While this loses the direct link to the comment, it does not break the link in the event anybody had linked to the site externally.