<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[The Readium Dev Blog]]></title>
  <link href="http://readium.github.com/atom.xml" rel="self"/>
  <link href="http://readium.github.com/"/>
  <updated>2012-09-29T17:00:50-07:00</updated>
  <id>http://readium.github.com/</id>
  <author>
    <name><![CDATA[Readium]]></name>
    
  </author>
  <generator uri="http://octopress.org/">Octopress</generator>

  
  <entry>
    <title type="html"><![CDATA[Readium 0.5.3 additions and improvements]]></title>
    <link href="http://readium.github.com/blog/2012/09/28/readium-0-dot-6-additions-and-improvements/"/>
    <updated>2012-09-28T20:20:00-07:00</updated>
    <id>http://readium.github.com/blog/2012/09/28/readium-0-dot-6-additions-and-improvements</id>
    <content type="html"><![CDATA[<p>A new version of <a href="https://chrome.google.com/webstore/detail/fepbnnnkkadjhjahcafoaglimekefifl">Readium</a> is up in the Chrome store, which also means I should go over some of the things that have been added to Readium in the last little while. A list of new additions follows, and as usual, if you have any questions, feel free to come find us on Github or on IRC - #readium on freenode.net.</p>

<p>Additionally, the wiki has been updated with new <a href="https://github.com/readium/readium/wiki/How-to-contribute-to-Readium">information</a> on how you can learn more about Readium and get involved. I encourage you to go check it out!</p>

<h3>EPUB Canonical Fragment Identifiers (CFIs)</h3>

<p>A stand-alone <a href="https://github.com/readium/EPUBCFI">library</a> that provides support for the EPUB 3.0 CFI specification has been added to the Readium repository. This library has also been integrated into Readium and is used to support CFI based page-list navigation. This is demonstrated by the <a href="http://code.google.com/p/epub-samples/downloads/detail?name=georgia-cfi-20120521.epub">Georgia-CFI</a> EPUB sample.</p>

<p>The intention is also to use the CFI library to support bookmarking and annotations in Readium in the next little while.</p>
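<p>To give a feel for what a CFI looks like, here is a minimal sketch that splits one into its steps. It is not the API of the EPUBCFI library: <code>parseCfiSteps</code> is a hypothetical helper, the sample CFI is purely illustrative, and the sketch ignores ranges and most assertions.</p>

```javascript
// Hypothetical helper (NOT the EPUBCFI library's API): split a CFI into steps.
function parseCfiSteps(cfi) {
  var match = /^epubcfi\((.*)\)$/.exec(cfi);
  if (match === null) {
    throw new Error("Not an EPUB CFI: " + cfi);
  }
  // Drop a trailing character offset such as ":10"; this sketch only looks at steps.
  var path = match[1].replace(/:\d+$/, "");
  var steps = [];
  // One step is "/N" with an optional "[id]" assertion; "!" marks indirection
  // into the document that the previous step referenced.
  var stepPattern = /(!?)\/(\d+)(?:\[([^\]]*)\])?/g;
  var step;
  while ((step = stepPattern.exec(path)) !== null) {
    var index = parseInt(step[2], 10);
    steps.push({
      index: index,
      id: step[3] === undefined ? null : step[3],
      // Even indices address element children, odd indices address text.
      isElement: index % 2 === 0,
      followsIndirection: step[1] === "!"
    });
  }
  return steps;
}

var steps = parseCfiSteps("epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)");
console.log(steps.length);
```

<p>In this sample the first two steps walk the package document to a spine item, and the steps after the <code>!</code> walk the content document that the spine item points to.</p>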

<h3>Rate and volume controls for Media Overlays</h3>

<p>New UI controls have been added to the Readium toolbar to support rate-of-playback and volume adjustment for Media Overlays. The MO controls now appear only for the parts of EPUBs that have Media Overlays. This should make it more obvious when this feature is available in an EPUB, or a particular part of it.</p>

<p>A number of other improvements have been made to the way that Media Overlay playback reacts to changes in the state of the EPUB, such as for page-turns, loading new content documents and saving and reloading the position of the playback. Improvements have also been made to the way MO highlighting is shown on the screen.</p>

<h3>Drag and drop and faster unpacking in the library view</h3>

<p>A new library, along with some HTML5 web workers, is now used to unpack .epub files. Unpacking is faster as a result, so loading packed .epub files directly should be practical for all but the largest EPUBs. EPUB files can now also be dragged and dropped onto the library view to load them into Readium, as an alternative to the &#8220;add EPUB&#8221; button on the toolbar.</p>

<h3>SVG and Bitmap images in spine</h3>

<p>Both SVG and bitmap images, included in an EPUB as spine items, are now supported in Readium.</p>

<h3>Internationalization framework</h3>

<p>The Chrome i18n framework has been added to Readium so that the interface can be localized. We hope to add to the set of translations available for Readium, so if you&#8217;re able to provide translations in a language besides English, this would be a good way to help out!</p>

<h3>Internal changes</h3>

<p>Aside from the changes above, we&#8217;re continually refactoring the code base, fixing bugs and looking out for new ways to improve Readium. As usual, feedback and new issues in the issue tracker are always welcome.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Time for refactoring, and some Readium concepts...]]></title>
    <link href="http://readium.github.com/blog/2012/07/17/it-was-time-for-some-refactoring-dot-dot-dot/"/>
    <updated>2012-07-17T12:11:00-07:00</updated>
    <id>http://readium.github.com/blog/2012/07/17/it-was-time-for-some-refactoring-dot-dot-dot</id>
    <content type="html"><![CDATA[<p>This (rather long) post is all about a recent refactoring of Readium. I also thought it would be a good time to go over some of the concepts underlying the way Readium renders EPUBs, much of which hasn&#8217;t been discussed before.</p>

<p>So, I&#8217;m going to cover a few topics which include the rationale and goals for refactoring, Readium&#8217;s conceptual model for rendering EPUBs, the refactored implementation AND how we&#8217;re going to go about deploying the changes.</p>

<p>But first&#8230;</p>

<h2>Housekeeping</h2>

<p>Beyond the refactoring, we&#8217;ve moved Readium to manifest version 2 for Chrome extensions. There were some complications due to a new content security policy for Chrome extensions, which Matthew discusses <a href="http://matthewrobertson.org">here</a>. Interesting stuff if you&#8217;re using extensions elsewhere.</p>

<p>Also, we have a Readium contributors call today where we&#8217;ll establish new priorities for Readium development over the next couple months. We&#8217;re also thinking about a more formal way to keep the community up-to-date on current development priorities and/or something resembling milestones or a release calendar. Stay tuned for more on this.</p>

<p>Anyway, back to the main purpose of this post&#8230;</p>

<h2>Readium is successful, time to refactor?</h2>

<p>Readium has been going pretty well so far. We&#8217;ve got a lot of positive feedback from stakeholders and we&#8217;ve managed to continually add features and address concerns. Considering that things are trucking along just fine, why the refactoring?</p>

<p>Well, first off, Readium development has been iterating quickly from a starting point of nothing. As Readium has developed, so has our understanding of how best to implement certain features. Also, some tight deadlines have meant that we had to hack in some changes that we knew would be best implemented differently; all part of the typical development process. Since there was a lull between the end of the Tokyo Book Fair (where versions of Readium were presented) and our next priorities call, it seemed like a good time to refactor parts of the code base to address some of these issues and lay the groundwork for the next set of features.</p>

<h2>What were the goals for refactoring?</h2>

<p>There were a number of goals for refactoring. First and foremost was to increase the accessibility and understandability of the Readium codebase. We wanted to refactor to ensure Readium&#8217;s conceptual model for rendering EPUBs was represented in an obvious way in the code. It is a major goal of the Readium project to provide a reference implementation of an EPUB 3 reader. As such, we don&#8217;t just want Readium to &#8220;go&#8221;, we want developers to be able to understand (easily) how and why it &#8220;goes.&#8221;</p>

<p>The second goal was to support further development on Readium. Having a clear and understandable code base is important for our ability to continually extend the application with new functionality. Extending the application with new features was already getting tricky due to the growing complexity of the code. A few of our backbone models were getting large and dodgy dependencies were growing&#8230; It was about time for two of our best friends, abstraction and encapsulation, to come help us manage complexity.</p>

<p>However, before I get into the details of the refactoring it&#8217;ll be useful to discuss Readium&#8217;s conceptual model for rendering EPUBs, as this isn&#8217;t something we&#8217;ve provided a lot of documentation on in the past.</p>

<h2>The conceptual model</h2>

<p>There are a number of important concepts providing the foundation for rendering EPUBs in Readium; some obvious, some not. They are&#8230;</p>

<h3>An EPUB</h3>

<p>Basically everything that makes up a .epub file. Given that Readium is a viewer and not an authoring tool, the concept of an EPUB is of a read-only nature. Readium unpacks them, reads them, but doesn&#8217;t alter them.</p>

<h3>Rendering (a view of) an EPUB</h3>

<p>There can be a lot to an EPUB. How all the various parts - and which parts - of the EPUB get put up on the screen is called rendering. We have different ways (views) responsible for rendering different types of EPUB.</p>

<h3>EPUB type</h3>

<p>There are a number of ways that EPUB authors can choose to have an EPUB rendered.</p>

<p>One is &#8220;fixed layout,&#8221; where the author of the EPUB specifies the way in which the EPUB will be rendered on the screen, as well as the properties of that rendering. This includes fonts, backgrounds, margins, sizes etc.</p>

<p>Another is &#8220;reflowable,&#8221; where it is up to the reading system to decide how an epub is rendered. &#8220;Reflowable&#8221; means that content such as text can &#8220;flow&#8221; into available space on the screen and reflow, as that space changes.</p>

<p>The final option is scrolling text, which is reflowable text rendered in a scrolling window. This is not actually part of the EPUB spec but rather a convenience in Readium for solving a few different problems. However, the scrolling view may be removed at some point in the future.</p>

<h3>Pagination</h3>

<p>This is the concept of converting the content of an EPUB (text, images etc.) into a discrete number of &#8220;pages&#8221; we can put up on a screen. Pages are a useful abstraction that allow us to do things in a consistent way with renderings of the different types of EPUB.</p>

<p>Fixed layout EPUBs define their &#8220;pages&#8221; explicitly; reflowable EPUBs do not. In the case of a reflowable EPUB the number, size and display of pages is entirely dependent on the reading device and screen. If the screen is resized, the pagination of the content changes. To handle this complexity, it is important that the concept of &#8220;pages&#8221; is kept distinct from the content of the EPUB (although obviously, a conceptual link is maintained). Readium must maintain page state, including the number of pages in the current pagination, the currently displayed page etc.</p>
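<p>As a rough illustration of page state for a reflowable EPUB, the sketch below recomputes the page count when the available width changes. The function name and the simple clamping rule are assumptions for the example, not Readium&#8217;s actual implementation (a real reader would try to keep the same content in view after a resize).</p>

```javascript
// Hypothetical model of page state; all names are illustrative only.
function paginate(contentWidth, pageWidth, currentPage) {
  // Content laid out side by side needs this many screens to show all of it.
  var numberOfPages = Math.max(1, Math.ceil(contentWidth / pageWidth));
  return {
    numberOfPages: numberOfPages,
    // Clamp so the current page is still valid in the new pagination.
    currentPage: Math.min(Math.max(1, currentPage), numberOfPages)
  };
}

var beforeResize = paginate(6000, 600, 8);
var afterResize = paginate(6000, 1000, 8);
console.log(beforeResize.numberOfPages, afterResize.numberOfPages);
```

<p>The content itself never changes here; only the page state derived from it does, which is why the two are kept separate.</p>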

<h3>Modifiable EPUB rendering properties</h3>

<p>Readium allows users to set a number of properties for the rendering of an EPUB. These include the number of pages to display on the screen (one or two), font size, margin size, day or night styles etc. These preferences are persisted for each EPUB a user has loaded into Readium. As such, this state is maintained in Readium.</p>

<p>The above concepts are how we think about putting EPUBs up on the screen. The conceptual model is pretty simple but a picture is equal to ONE THOUSAND WORDS:</p>

<p><img class="center" src="https://dl.dropbox.com/u/89630457/readium_conceptual_model.png"></p>

<h2>Before and after refactoring</h2>

<p>The conceptual model I just described has existed all along. However, it was becoming increasingly muddied as to which parts of the code were responsible for which parts of that model. Essentially, as the original set of backbone models were extended, our abstractions and encapsulation were breaking down a bit. The image below shows the set of backbone models at the beginning of the refactoring process:</p>

<p><img class="center" src="https://dl.dropbox.com/u/89630457/readium_before_refactoring.png"></p>

<p>In particular, the <a href="https://github.com/readium/readium/blob/master/scripts/models/ebook.js">Ebook</a> model was becoming the &#8220;kitchen sink,&#8221; and was increasingly responsible for a great deal of the conceptual model I described. This included interaction with the currently rendered EPUB, behaviour for pages and the implementation of parts of the EPUB spec (such as Media Overlays). Additionally, the lack of clarity regarding the roles and responsibilities of the various models and views was resulting in new functionality bleeding into less-than-ideal places.</p>

<p>To meet our two goals for refactoring, outlined above, this core set of models and views was refactored into the following (A description follows of how each backbone model was refactored):</p>

<p><img class="center" src="https://dl.dropbox.com/u/89630457/readium_after_refactoring.png"></p>

<h3><a href="https://github.com/readium/readium/blob/master/scripts/models/ebook.js">Ebook</a></h3>

<p>This model was refactored into a set of models with specific responsibilities:</p>

<p>1) <strong><a href="https://github.com/readium/readium/blob/readium-refactored/scripts/models/epub.js">EPUB</a>:</strong> This is a &#8220;read-only&#8221; model that represents the content of an EPUB loaded in the Readium application. The attributes of this model are now persisted separately from those of EPUBController, in order to maintain a clear separation between the &#8220;read-only&#8221; properties of an EPUB and the rendering properties maintained by Readium to display an EPUB. The attributes for this model are only persisted a single time when an EPUB is unpacked and are not updated by Readium.</p>

<p>2) <strong><a href="https://github.com/readium/readium/blob/readium-refactored/scripts/models/epub_controller.js">EPUBController</a>:</strong> This model represents the concept of &#8220;an EPUB being represented&#8221; in the Readium system. It wraps EPUB (and others, like the package document) and exposes it to the Readium application, as well as maintaining attributes such as the current location (the current spine element) in the EPUB&#8217;s content, whether one or two pages are displayed, current font-size etc. This model is also intended to manage the interaction between the rest of the application and the content of an EPUB.</p>

<p>3) <strong><a href="https://github.com/readium/readium/blob/readium-refactored/scripts/models/readium_pagination.js">ReadiumPagination</a>:</strong>
 This model represents a set of EPUB content that has been &#8220;paginated&#8221; by one of the backbone views (discussed below). The intention is to abstract and encapsulate page management from other parts of the system. Instances of this model are managed by the currently instantiated backbone view, which itself is responsible for paginating some set of content.</p>

<p>4) <strong><a href="https://github.com/readium/readium/blob/readium-refactored/scripts/models/page_number_display_logic.js">PageNumberDisplayLogic</a>:</strong> In Readium, displaying &#8220;pages&#8221; includes toggling the number of pages displayed (one or two), navigating forward or backward through an EPUB or switching the view to a particular page. The logic required for implementing this is actually more complex than it would first appear. The correct layout depends on the type of EPUB, any preferences the author might have set for the EPUB, the page-progression-direction and the specified reading-order for the EPUB. This logic is now encapsulated in its own model, with all the attendant benefits.</p>

<p>5) <strong>Media Overlay models:</strong> Media Overlays functionality was temporarily taken out and will be refactored into its own set of models from the MO development fork.</p>
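<p>To show why the page display logic deserves its own model, here is a deliberately simplified sketch. It assumes spreads always start on an odd page and only accounts for two of the inputs mentioned above (the one/two page toggle and the page-progression-direction); <code>pagesToDisplay</code> is a hypothetical name, not a method of PageNumberDisplayLogic.</p>

```javascript
// Hypothetical helper: which page numbers should be on screen, left to right?
function pagesToDisplay(targetPage, numberOfPages, twoUp, isRightToLeft) {
  if (!twoUp) {
    return [targetPage];
  }
  // Assumption for this sketch: spreads are (1,2), (3,4), (5,6), ...
  var first = targetPage % 2 === 1 ? targetPage : targetPage - 1;
  var last = Math.min(first + 1, numberOfPages);
  var pages = first === last ? [first] : [first, last];
  // In a right-to-left book the lower page number sits on the right.
  return isRightToLeft ? pages.reverse() : pages;
}

console.log(pagesToDisplay(4, 5, true, false));
console.log(pagesToDisplay(4, 5, true, true));
```

<p>Even this toy version has three branches; adding fixed layout spreads and author preferences multiplies the cases quickly, which is the argument for encapsulating it.</p>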

<h3><a href="https://github.com/readium/readium/blob/master/scripts/models/paginator.js">Paginator</a></h3>

<p>This model was refactored into <a href="https://github.com/readium/readium/blob/readium-refactored/scripts/models/pagination_strategy_selector.js">PaginationStrategySelector</a>. Additionally, the logic of this model was simplified, with some of the code for rendering fixed layout EPUBs refactored into the fixed layout backbone view. This model was also renamed to clarify its role: It does not actually paginate EPUB content but rather selects a pagination strategy and calls the appropriate backbone view to &#8220;paginate&#8221; a set of content.</p>

<h3>The views</h3>

<p>The backbone views have the primary responsibility of paginating the content of an EPUB and rendering it on the screen. The inheritance structure of the views did not change, but some functionality was refactored from Ebook and Paginator into the appropriate view. In particular, the API for each of the views was made consistent (<a href="https://github.com/readium/readium/blob/readium-refactored/scripts/views/pagination_strategies/pagination_view_base.js">base</a>, <a href="https://github.com/readium/readium/blob/readium-refactored/scripts/views/pagination_strategies/fixed_pagination_view.js">fixed layout</a>, <a href="https://github.com/readium/readium/blob/readium-refactored/scripts/views/pagination_strategies/reflowable_pagination_view.js">reflowable</a> and <a href="https://github.com/readium/readium/blob/readium-refactored/scripts/views/pagination_strategies/scrolling_pagination_view.js">scrolling</a>). Each of these views has a render() method that, when called, paginates some of the content of an EPUB and creates an instance of the ReadiumPagination model. The instantiated ReadiumPagination object then represents the current set of paginated content.</p>
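<p>The relationship between the selector and the views can be sketched in a few lines. Everything below is a stand-in written for this post: the real views are Backbone views, and the property names (<code>isFixedLayout</code>, <code>scrolling</code>) are assumptions rather than the attributes Readium actually uses.</p>

```javascript
// Stand-in "views": each one exposes the same render() interface and returns
// an object describing the paginated content (playing the ReadiumPagination role).
function makeView(strategyName) {
  return {
    render: function (content) {
      return { strategy: strategyName, content: content };
    }
  };
}

var views = {
  fixed: makeView("fixed"),
  reflowable: makeView("reflowable"),
  scrolling: makeView("scrolling")
};

// The selector does not paginate anything itself; it only picks a view.
function selectPaginationStrategy(epub, preferences) {
  if (epub.isFixedLayout) {
    return views.fixed;
  }
  if (preferences.scrolling) {
    return views.scrolling;
  }
  return views.reflowable;
}

var view = selectPaginationStrategy({ isFixedLayout: false }, { scrolling: false });
var pagination = view.render("chapter 1");
console.log(pagination.strategy);
```

<p>Because every view shares one interface, the caller never needs to know which strategy it ended up with.</p>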

<h2>Some other notes about the code</h2>

<h3>Interfaces</h3>

<p>I find it&#8217;s very helpful to be able to look at an object or model and know right away which methods are called by other parts of the code and which are intended to be &#8220;helpers&#8221; within the object/model itself. <strong>tl;dr</strong> I want to know the interface and the abstraction it&#8217;s implementing. Javascript doesn&#8217;t provide us some of the tools for this that we might have in another language (public/private declarations, interfaces etc.). However, I still think it&#8217;s useful to apply conventions to make the interface clear.</p>

<p>In the refactored models, we&#8217;ve re-ordered the methods and created comment headers to indicate which methods are part of the interface of each model/view, and which are &#8220;internal&#8221; helpers for that object. Obviously, nothing in the code actually enforces &#8220;public&#8221; and &#8220;private&#8221; - nor do I think it&#8217;s necessary that we do so. However, the hope is that this convention will make clearer the intention and dependencies of each model/view.</p>
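<p>For illustration, the convention looks roughly like this. The object and the exact wording of the comment headers are made up for the example; check the refactored models for the headers actually used in Readium.</p>

```javascript
var pageCounter = {
  numberOfPages: 10,
  currentPage: 1,

  // ------------------------------------------------------------------
  //  "PUBLIC" METHODS (THE INTERFACE)
  //  Called by other parts of the application.
  // ------------------------------------------------------------------

  goToPage: function (page) {
    this.currentPage = this.clampToValidPage(page);
    return this.currentPage;
  },

  // ------------------------------------------------------------------
  //  "PRIVATE" HELPERS
  //  Only intended to be called from within this object.
  // ------------------------------------------------------------------

  clampToValidPage: function (page) {
    return Math.min(Math.max(1, page), this.numberOfPages);
  }
};

console.log(pageCounter.goToPage(42));
```

<p>Nothing stops a caller from using <code>clampToValidPage</code> directly, but the headers make it obvious that doing so goes around the intended interface.</p>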

<h3>Code commenting</h3>

<p>Generally, I think it&#8217;s best that code speak for itself. It should be written-to-be-read (by humans). Comments that document step-by-step functionality are a bit redundant (although not always). On the other hand, I think in-line comments are very useful for documenting both high-level intentions (for methods/blocks etc.) and for, most importantly, rationale. I find rationale - the set of reasons underlying a particular approach or implementation - to be the most difficult thing to infer from code. To this end, comments at the beginning of important methods usually include an attempt at explaining intention and rationale; in particular, things that have a non-obvious or unusual rationale. I encourage you to do the same if you&#8217;re contributing to Readium.</p>

<h2>Deployment</h2>

<p>A <a href="https://github.com/readium/readium/tree/readium-refactored">branch</a> has been published in readium/readium. It&#8217;ll be there to test for regression bugs and other omissions and while we update some of the test cases and add in some of the excluded Media Overlay functionality. When we&#8217;re comfortable with it, probably early next week, we&#8217;ll push it up as the master and continue Readium development from there. If you have any questions, please come and find us.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[An EPUB 2.x/3 validating parser, C and iOS!]]></title>
    <link href="http://readium.github.com/blog/2012/07/17/an-epub-2-dot-x-slash-3-validating-parser-in-c-dot-and-working-in-ios/"/>
    <updated>2012-07-17T09:57:00-07:00</updated>
    <id>http://readium.github.com/blog/2012/07/17/an-epub-2-dot-x-slash-3-validating-parser-in-c-dot-and-working-in-ios</id>
    <content type="html"><![CDATA[<p>We&#8217;re always happy to have people contribute to the Readium project, whether that contribution is directly to the Readium source, submitting of bug reports, feature requests, or more generally in support of the EPUB 3 standard.</p>

<p>To this end, I wanted to bring your attention to an <a href="https://github.com/MedallionMediaGroup/EPUB3Processor">epub validating parser</a> that Brian Buck has been working on. More details are available on his github page but as an overview, the project is a (fast) EPUB validating parser that has been tested in a number of environments, including iOS. We haven&#8217;t been doing much iOS work with Readium yet, but I suspect this&#8217;ll be interesting to a lot of people out there. By way of a disclaimer, this is not the &#8220;official&#8221; epub validator we&#8217;re using for Readium but it&#8217;s great to have people working on and looking at all parts of the EPUB toolchain. We hope that the interest and improvements generated by Brian&#8217;s project will help us improve Readium in the future.</p>

<p>Thanks Brian!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Readium Turns 0.4]]></title>
    <link href="http://readium.github.com/blog/2012/06/21/readium-turns-0.4/"/>
    <updated>2012-06-21T10:17:00-07:00</updated>
    <id>http://readium.github.com/blog/2012/06/21/readium-turns-0.4</id>
    <content type="html"><![CDATA[<p>We recently pushed out version 0.4.0 of Readium so it is time for an update.</p>

<h2>Contributing</h2>

<p>First up, there has been a lot of recent interest from people who want to contribute to the Readium project, which is always encouraging and very welcome. It&#8217;s especially helpful to have contributors who can read and debug the rendering of ePubs in languages other than English. Matthew did a great job (more below) getting <a href="http://code.google.com/p/epub-samples/wiki/SamplesListing#kusamakura-japanese-vertical-writing">vertical Japanese text</a> to render, which he was pretty excited about. Did it actually render something that made sense? We had no idea (it did). But it <em>was</em> vertical.</p>

<p>We also have users contributing bug reports, which is definitely helpful as these improve our understanding of what kind of ePubs are getting put together out there. So, if you&#8217;ve come across the Readium project and tried your unique ePub and it&#8217;s not working, let us know by creating an issue in the <a href="https://github.com/readium/readium/issues?direction=desc&amp;sort=created&amp;state=open">tracker</a>. We can&#8217;t promise quick fixes for everything, but it goes a long way to helping build a full-spec reader.</p>

<h2>docco</h2>

<p>We had a suggestion from <a href="https://github.com/schmidsi">Simon Schmid</a> to set up <a href="http://jashkenas.github.com/docco/">docco</a> to make the code base more accessible and understandable. docco pulls comments directly out of source files and lines them up with your code in a nicely formatted web page. It uses markdown and only pulls out the // comments; /**/ comments are left in the source. In any case, you don&#8217;t have to do anything but comment, we&#8217;ll take care of deploying the updated docco-mentation. We&#8217;ve got it set up and it can be accessed from the <a href="http://github.readium.org/docs/ebook.html">documentation</a> tab on this blog (you can navigate to other docco pages with the widget in the top right). In general contributing documentation is a great way to help out while familiarizing yourself with an open source project and Readium could use some more comments, so comment away.</p>

<h2>IRC channel</h2>

<p>We&#8217;ve set up an IRC channel for the Readium project on freenode.net. The channel is #Readium. Matthew or I will try to be on there during PST working hours and hopefully we can answer questions if they come up. I can&#8217;t imagine we&#8217;re going to have too many flame wars about Readium but since this <em>is</em> the internet, I would feel negligent not to remind everyone to keep it respectful.</p>

<h2>CSS regions no more</h2>

<p>In order to better support different forms of writing - right-to-left, vertical etc. - Matthew changed the reflowable pagination approach from the use of CSS3 regions to CSS columns. This was a big step forward for Readium as CSS columns are supported on <a href="http://caniuse.com/#feat=multicolumn">almost all modern browsers</a>. Also, because this approach is able to execute entirely within an <code>iframe</code> it solved a number of other issues we had been having (eg scoping of style rules). We are also able to allow spine level scripting for the reflowable content, which a lot of EPUB content creators were really excited to start working with.</p>
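<p>The general idea behind column-based pagination is to give the content a fixed height and a column width that matches the page, so that overflowing content spills sideways into new columns, one per page. Here is a small sketch that computes such styles; the function and its parameters are hypothetical, and the values Readium actually applies inside the <code>iframe</code> may well differ.</p>

```javascript
// Hypothetical helper: CSS declarations that turn a tall document into
// side-by-side columns, each one page wide.
function columnStyles(viewportWidth, viewportHeight, gap, twoUp) {
  var columnsOnScreen = twoUp ? 2 : 1;
  var columnWidth = Math.floor(
    (viewportWidth - gap * (columnsOnScreen - 1)) / columnsOnScreen
  );
  return {
    // A fixed height is what forces extra content into additional columns.
    "height": viewportHeight + "px",
    "-webkit-column-width": columnWidth + "px",
    "-webkit-column-gap": gap + "px"
  };
}

var onePage = columnStyles(1000, 700, 40, false);
var twoPages = columnStyles(1000, 700, 40, true);
console.log(onePage["-webkit-column-width"], twoPages["-webkit-column-width"]);
```

<p>With styles like these, turning the page amounts to shifting the content horizontally by one screen of columns.</p>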

<h2>Alternate Style Tags</h2>

<p>An implementation of the Alternate Style Tags specification has been added to Readium. It is set up to interpret arbitrary sets of tags generally, although currently only day/night tags are specifically handled. This can be demonstrated with the use of the <a href="http://code.google.com/p/epub-samples/wiki/SamplesListing#wasteland">Wasteland</a> ePub from the Sample-Project, which has an alternate style set for &#8220;night&#8221; mode. If anyone sees any problems, missing parts of the spec or misinterpretations, please let me know!</p>

<h2>Wrap-up</h2>

<p>So that&#8217;s it for now. As always if you have questions, want to contribute, or submit bug reports, we can be contacted on <a href="https://github.com/readium/readium/">github</a>, by email, or now on IRC!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Unpacking EPUBs in Readium]]></title>
    <link href="http://readium.github.com/blog/2012/05/04/unpacking-epubs-in-readium/"/>
    <updated>2012-05-04T19:29:00-07:00</updated>
    <id>http://readium.github.com/blog/2012/05/04/unpacking-epubs-in-readium</id>
    <content type="html"><![CDATA[<p>Something has been weighing heavily on my conscience for a while. Readium suffers from a serious lack of technical documentation. The closest I have come to putting anything into writing was creating <a href="https://github.com/readium/readium/wiki/Architecture">this wiki page</a> over two months ago. If we want to attract contributors to the project, we need to make diving into the source code as streamlined and painless as possible. Our architecture wiki is a pretty pathetic attempt in this regard.</p>

<p>But today I am going to take the first step towards writing this wrong. I am planning a series of (at least) 3 blog posts explaining how the internals of Readium operate. I think the end result will be interesting not only to Readium contributors, but also to ebook publishers, developers of other ebook reading systems and anyone building large applications with HTML5 / javascript.</p>

<h2>Unpacking an EPUB</h2>

<p>I felt that the most natural place to start diving into Readium is the beginning. Readium currently ships with an empty library, so the first thing that every new user of our system does is import a file into their library.</p>

<h3>How do you wanna do this?</h3>

<p>As of writing, Readium offers users three ways to import content into their library: from a remote URL, from their local filesystem in packed format and from their local filesystem as an unpacked directory. Of these three cases, importing from the local filesystem in packed form is the most complex (the other two options are really just subcases of this problem), so we will explore this use case for the rest of the post.</p>

<h3>Pick a file any file</h3>

<p>To enable a user to select a file from their local filesystem, Readium makes use of a plain html <code>&lt;input type='file'&gt;</code> element. But if you have done any web development before you will notice that what happens next is completely non-traditional. In a classic web application an HTML form is really a user friendly tool for adding parameters to an HTTP request. When you click the <strong>submit</strong> button of an HTML form, the browser serializes whatever you typed in the form&#8217;s fields into <code>application/x-www-form-urlencoded</code> data, tacks it on as the payload of an HTTP POST request (usually) and sends it to the server.</p>

<p>But Readium is 100% client side; there is no server to send the data to. We use the <code>&lt;input type='file'&gt;</code> element only to get a copy of user&#8217;s file into our javascript sandbox. Once we have that we use the HTML5 javascript method <a href="https://developer.mozilla.org/en/DOM/window.URL.createObjectURL">window.URL.createObjectURL</a> to create a URL that points to the file they selected. This is exactly where we would start if instead of selecting an EPUB from their filesystem, the user entered a URL (like I said, subcases).</p>
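<p>In code, that step is small. The helper below is a sketch with a made-up name, not the function Readium uses; the only real API in it is <code>window.URL.createObjectURL</code>, and it is meant to run in a browser.</p>

```javascript
// Hypothetical helper: turn the file chosen in a file input element into a URL
// that the rest of the application can fetch like any other resource.
function urlForSelectedFile(fileInput) {
  var file = fileInput.files[0];
  if (!file) {
    // The user dismissed the file picker without choosing anything.
    return null;
  }
  return window.URL.createObjectURL(file);
}

// Typical wiring in a page (browser only, shown as a comment):
//   input.addEventListener("change", function () {
//     var url = urlForSelectedFile(input);
//   });
```

<p>From this point on, a local file and a remote URL can follow the same code path.</p>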

<h3>XYZ PDQ</h3>

<p>The next step is to unpack the file. EPUBs use the <a href="http://en.wikipedia.org/wiki/Zip_(file_format)">ZIP file format</a> to compress and archive their contents. To decompress this content we make use of an <a href="http://cheeso.members.winisp.net/examples.aspx#jsunzip">open source library</a> for unzipping the content in javascript. Unfortunately, decompressing files in javascript is dreadfully slow. Decompression leans heavily on <a href="https://developer.mozilla.org/en/JavaScript/Reference/Operators/Bitwise_Operators">bitwise operators</a> and, although javascript has them, they are slow because javascript does not have an <code>int</code> type. When performing bitwise operations it pretends that it does: in order for a javascript engine to perform a bitwise operation, it needs to convert its floating point <code>Number</code> operands to the equivalent 32-bit signed integers, perform the operation, and then convert the result back to a <code>Number</code>.</p>
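<p>You can watch the conversion to a 32-bit signed integer happen with a one-line experiment. <code>toInt32</code> is just a name chosen for this example; the interesting part is what <code>| 0</code> does to values that are not already 32-bit integers.</p>

```javascript
// OR-ing with zero changes nothing bitwise, so whatever comes back is purely
// the effect of the Number -> 32-bit signed integer -> Number round trip.
function toInt32(n) {
  return n | 0;
}

console.log(toInt32(3.7));         // 3: the fractional part is dropped
console.log(toInt32(-3.7));        // -3: truncation is toward zero
console.log(toInt32(2147483647));  // 2147483647: the largest value that fits
console.log(toInt32(2147483648));  // -2147483648: one past the limit wraps around
console.log(toInt32(4294967295));  // -1: all 32 bits set
```

<p>An unzip routine performs conversions like these for a large share of its operations, which is where the time goes.</p>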

<p>This performance constraint is what motivated us to start allowing users to load an unpacked directory into Readium. To get this working we use the nifty <code>webkitdirectory</code> attribute on the form <code>&lt;input&gt;</code>. If you have never seen this in action before (or never knew of its existence) there is a pretty good demo available <a href="http://html5-demos.appspot.com/static/html5storage/demos/upload_directory/index.html">here</a> or better yet, check it out in the Readium <a href="https://github.com/readium/readium">source code</a>.</p>
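<p>When a directory is selected this way, each file in the resulting list carries a <code>webkitRelativePath</code> that begins with the name of the chosen directory. A sketch of how those files might be indexed follows; <code>indexByRelativePath</code> is a hypothetical helper and not necessarily how Readium organises them.</p>

```javascript
// Hypothetical helper: map each file's path inside the EPUB to its File object.
function indexByRelativePath(files) {
  var index = {};
  // A FileList is array-like but not an Array, so borrow forEach.
  Array.prototype.forEach.call(files, function (file) {
    // Drop the leading "chosen-directory/" segment so paths are EPUB-relative.
    var path = file.webkitRelativePath.replace(/^[^\/]+\//, "");
    index[path] = file;
  });
  return index;
}

// Plain objects stand in for File objects so the sketch runs anywhere.
var index = indexByRelativePath([
  { webkitRelativePath: "mybook/mimetype" },
  { webkitRelativePath: "mybook/META-INF/container.xml" }
]);
console.log(Object.keys(index));
```

<p>With an index like this, finding <code>META-INF/container.xml</code> (the entry point of every EPUB) is a simple lookup.</p>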

<h3>The Extraction</h3>

<p>All of Readium&#8217;s logic for importing an epub from both zipped and unpacked formats is contained in <a href="https://github.com/readium/readium/blob/master/scripts/extractBook.js">/scripts/extractBook.js</a>. For those of you who speak the <a href="http://en.wikipedia.org/wiki/Unified_Modeling_Language">Unified Modeling Language</a>, here is a rundown of what is in there:</p>

<p><img class="center" src="https://github.com/readium/readium/raw/master/docs/unpacking_uml.png"></p>

<p>Perched at the top of the hierarchy is <code>Backbone.Model</code>. This is not actually defined in Readium. It is an object provided by <a href="http://documentcloud.github.com/backbone/">Backbone.js</a>, a framework used widely throughout Readium. I am planning on explaining our decision to use Backbone in more depth in a future post, but for now the most important part to understand is its <a href="http://documentcloud.github.com/backbone/#Events">Events module</a>. Basically, Backbone makes triggering and subscribing to custom events super simple. Here is a quick rundown of it in code:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class='code'><pre><code class='javascript'><span class='line'><span class="c1">// create a vanilla Backbone model</span>
</span><span class='line'><span class="kd">var</span> <span class="nx">foo</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">Backbone</span><span class="p">.</span><span class="nx">Model</span><span class="p">();</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// register a handler on foo</span>
</span><span class='line'><span class="nx">foo</span><span class="p">.</span><span class="nx">on</span><span class="p">(</span><span class="s2">&quot;banana&quot;</span><span class="p">,</span> <span class="kd">function</span><span class="p">()</span> <span class="p">{</span>
</span><span class='line'>  <span class="c1">// this code will execute when the banana event occurs</span>
</span><span class='line'>  <span class="nx">alert</span><span class="p">(</span><span class="s1">&#39;Wow, that was easy&#39;</span><span class="p">);</span>
</span><span class='line'><span class="p">});</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// fire the banana event on foo</span>
</span><span class='line'><span class="nx">foo</span><span class="p">.</span><span class="nx">trigger</span><span class="p">(</span><span class="s2">&quot;banana&quot;</span><span class="p">);</span>
</span></code></pre></td></tr></table></div></figure>


<p>Instances of <code>Backbone.Model</code> also have a set of <a href="http://en.wikipedia.org/wiki/Observer_pattern">observable</a> attributes, that is to say, they <em>automagically</em> fire <code>change</code> events when they are changed. The entire control flow of the unpacking logic is driven by Backbone events. Event firing happens both explicitly (i.e. <code>this.trigger("something")</code>) and implicitly by changing an observable attribute (i.e. <code>this.set("someprop", "newVal")</code>).</p>

<p>The next level in the hierarchy is <code>BookExtractorBase</code>. This is where all of the shared logic for parsing, managing and writing the imported EPUB to disk lives. I am going to come back to this in the next section.</p>

<p>At the bottom of the hierarchy are <code>ZipBookExtractor</code> and <code>UnpackedBookExtractor</code>. Each of these contains the logic for going through either a ZIP archive or unpacked EPUB and calling the necessary extraction / importing logic (located primarily in <code>BookExtractorBase</code>). Let’s consider <code>ZipBookExtractor</code>.</p>

<h3>ZipBookExtractor</h3>

<p>Constructor logic is added to a <code>Backbone.Model</code> by overriding its <code>initialize()</code> method. The <code>Readium.ZipBookExtractor.Initialize</code> method does a few things:</p>

<ol>
<li>It makes sure that a URL has been passed in to the constructor. If not, it throws an exception.</li>
<li>If a URL has been passed in, it takes an MD5 hash of the string formed by concatenating a timestamp to the URL. This serves as a unique identifier, which Readium uses to keep track of the file and also as the name of the root filesystem directory Readium will write its contents to.</li>
<li>It captures the source filename of the user&#8217;s upload. This is captured only for the purpose of displaying it later in the UI, and is not needed for any technical reason.</li>
</ol>
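
<p>The first two steps can be sketched roughly as follows. This is not the code from <code>extractBook.js</code>: the function name <code>makeBookKey</code> and the example URL are hypothetical, and Node&#8217;s built-in <code>crypto</code> module stands in for whatever MD5 routine runs in the browser.</p>

```javascript
// Sketch only: Node's built-in crypto module stands in for the MD5 routine
// used in the browser. makeBookKey is a hypothetical name.
var crypto = require("crypto");

function makeBookKey(url, timestamp) {
  if (!url) {
    throw new Error("A URL must be passed to the extractor");
  }
  // Concatenating a timestamp keeps repeated imports of one URL distinct.
  return crypto.createHash("md5").update(url + timestamp).digest("hex");
}

// The result is a 32 character hex string, usable as a directory name.
var key = makeBookKey("http://example.com/book.epub", Date.now());
```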


<p>Once a <code>ZipBookExtractor</code> instance has been created, the importing process is initiated by calling its <code>extract()</code> method. This method sets up the order in which the rest of the extraction process should execute by hooking up each method of the process to the event that will be fired upon completion of the previous step. Finally, it calls the asynchronous constructor of <code>Readium.FileSystemApi</code>, which is our own very thin wrapper around the <a href="http://www.html5rocks.com/en/tutorials/file/filesystem/">HTML5 filesystem API</a>.</p>
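
<p>To make the chaining idea concrete, here is a minimal sketch of steps wired together through completion events. Node&#8217;s <code>EventEmitter</code> stands in for Backbone&#8217;s Events module, and the step and event names are invented for illustration. They are not the ones used in Readium.</p>

```javascript
// Sketch: Node's EventEmitter stands in for Backbone's Events here, and the
// step and event names are invented for illustration.
var EventEmitter = require("events").EventEmitter;

function runExtraction(log) {
  var extractor = new EventEmitter();

  // Wire each step to the event fired when the previous step completes.
  extractor.on("filesystem:ready", function () {
    log.push("open zip");
    extractor.emit("zip:ready");
  });
  extractor.on("zip:ready", function () {
    log.push("parse container.xml");
    extractor.emit("container:parsed");
  });
  extractor.on("container:parsed", function () {
    log.push("parse package document");
    extractor.emit("done");
  });

  // Kick things off; in the real flow this first step is asynchronous.
  log.push("open filesystem");
  extractor.emit("filesystem:ready");
  return extractor;
}

var steps = [];
runExtraction(steps);
```

<p>The nice property of this arrangement is that the order of the steps is declared in one place, and a step never needs to know which step comes after it.</p>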

<p>Once the filesystem is opened, the process continues by calling the constructor of our zip library. This constructor is also asynchronous. When initialized, it executes a callback with the <code>Zip</code> object as its <code>this</code> context. The <code>Zip</code> has not been decompressed at this stage. Only the archive data structures have been parsed, so it does contain information about all of the archive entries. We decompress individual entries as needed (the code for this is in the <code>extractEntryByName</code> method).</p>

<h3>What to unpack first</h3>

<p>After we perform some very rudimentary validations on the <code>Zip</code> object (mostly checking for the existence of a few essential items) we get into the bulk of unpacking the book. In general, when unpacking an EPUB, what you want to get your hands on is the <em>OPF Package Document</em>, but first you need to find it. In order to find it, you have to parse another XML file first: <code>META-INF/container.xml</code>. All EPUBs must have this file and it must be located at this absolute path in the archive. The <code>container.xml</code> specifies the absolute path of the <code>package document</code> within the archive on an XML <code>&lt;rootfile&gt;</code> node. This is the only information that we take from this file.</p>
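
<p>A simplified sketch of that lookup is below. It reads the <code>full-path</code> attribute of the <code>rootfile</code> node with a regular expression so that the snippet runs anywhere. In a browser, a real XML parser such as <code>DOMParser</code> would be the more robust choice. The function name <code>findPackagePath</code> is hypothetical.</p>

```javascript
// Simplified sketch: a regular expression is used so the snippet runs
// anywhere; a real XML parser would be more robust.
// findPackagePath is a hypothetical name, not a Readium function.
function findPackagePath(containerXml) {
  var pattern = /\brootfile\b[^>]*\bfull-path\s*=\s*["']([^"']+)["']/;
  var match = pattern.exec(containerXml);
  if (!match) {
    throw new Error("container.xml has no rootfile with a full-path attribute");
  }
  // The captured group is the path of the package document in the archive.
  return match[1];
}
```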

<h3>Parsing the package document</h3>

<p>Once we are able to find the <code>package document</code> within the archive the next step is to parse it. The logic for this is contained in the <a href="https://github.com/readium/readium/blob/master/scripts/models/packageDocument.js">PackageDocumentBase</a> class definition. This is separated from the unpacking code because we also parse package documents right before we display a publication.</p>

<p>Whenever possible, we try to parse XML using a library called <a href="https://github.com/dnewcome/jath">Jath</a>. Jath allows declarative, template based parsing of XML into a structure of JavaScript objects. Again, I will defer the design justification for this to a later post.</p>

<p>At this stage, when parsing the <code>package document</code>, we are really only interested in the metadata that it contains. Essentially what we are looking for is everything we need to show the publication in the library view (title, author, cover image etc). Once we have this information, we add in a bit more of our own (most importantly where we are going to write the package document on disk) and then save it to the browser’s local storage for quick access. We then iterate through the remaining archive contents, extract them one by one and write them to the HTML5 filesystem.</p>

<h3>Monkey Patching</h3>

<p>The last step of the unpacking process is something that I have termed <strong>Monkeypatching the URLs</strong>. There are a couple of reasons for this but the main motivation is that when we open an EPUB in the viewer, there are no guarantees that it will be loaded in a document with a URL pointing to where it actually exists on disk. The implication is that in order for us to ensure that URLs in individual spine items are handled correctly by the browser (for example consider the <code>src</code> attribute of an <code>&lt;img&gt;</code> tag) we need to convert them into fully qualified, absolute URLs, pointing to where we saved the file in the HTML5 filesystem. This logic is contained in <a href="https://github.com/readium/readium/blob/master/scripts/models/path_resolver.js">scripts/models/path_resolver.js</a>. This process is a little muddy right now and there are a <a href="https://github.com/readium/readium/issues/66">few bugs</a> left to iron out (so if you are looking for some way to contribute :) this might be a good spot to jump in).</p>
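
<p>The core of the idea, resolving a relative reference against the location of the spine item that contains it, can be sketched with the standard <code>URL</code> constructor. This is not what <code>path_resolver.js</code> does line for line. The function name <code>resolveHref</code> is hypothetical, and the base URL is a placeholder rather than a real HTML5 filesystem URL.</p>

```javascript
// Sketch using the standard URL constructor. The base URL below is a
// placeholder, not the HTML5 filesystem location Readium writes to.
function resolveHref(href, spineItemUrl) {
  // Relative references are resolved against the spine item's own URL;
  // references that are already absolute pass through unchanged.
  return new URL(href, spineItemUrl).href;
}

var base = "http://localhost/books/abc123/OPS/chapter1.xhtml";
var imageUrl = resolveHref("../images/cover.png", base);
// imageUrl is "http://localhost/books/abc123/images/cover.png"
```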

<h3>Time to open</h3>

<p>Once the book is fully unpacked, monkey patched and written to disk, we fire off an event to let the library code know that we are done. In response, we open the file in the viewer. How this happens, I will save for a future post.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Running Readium on Mobile]]></title>
    <link href="http://readium.github.com/blog/2012/04/16/running-readium-on-mobile/"/>
    <updated>2012-04-16T17:14:00-07:00</updated>
    <id>http://readium.github.com/blog/2012/04/16/running-readium-on-mobile</id>
    <content type="html"><![CDATA[<p>Lately we have been experimenting with running the Readium source code in new and interesting ways. The entire application is written in HTML5 (html, css and javascript) so theoretically we should be able to package it up like a web site and run it in any modern browser. Practically, we are utilizing a few cutting edge HTML5 features for which there is not yet widespread support (namely CSS3 regions and the HTML5 filesystem API). That said, we do have a working, albeit somewhat fragile, demo that you can check out <a href="http://github.readium.org">here</a>. It is still riddled with bugs so if you get into trouble, refresh liberally or try another browser (for those of you with Android devices, the demo works surprisingly well on <a href="https://play.google.com/store/apps/details?id=com.android.chrome&amp;hl=en">Chrome for Android</a>).</p>

<h2>How it works</h2>

<p>Because we cannot access the HTML5 filesystem API in most browsers, we are hosting the EPUB files on the server and fetching the content via AJAX from the viewer when they are opened. This means that you cannot actually load an EPUB file into this version of Readium but we think this will be OK for a lot of use cases (think online bookstore that wants to allow access to files for its customers) and there may be ways to implement this functionality (see below).</p>

<p>The next thing you might notice is that pagination of re-flowable content is very broken on most browsers. That is because almost nobody has support for CSS regions yet. Thankfully, one of our contributors has come up with a polyfill that should provide a fallback in these cases and <a href="https://github.com/readium/readium/issues/61">is currently working on integrating it</a>.</p>

<h2>What&#8217;s next</h2>

<p>There is obviously a lot left to do if we want to get this version to a publicly usable state. Here are a few things we have thought about taking on:</p>

<ul>
<li><strong>Fix bugs</strong> - there are a lot of issues stemming from the fact that serving things over the web is different from serving them from a .crx. We need to take care of these one by one.</li>
<li><strong>Tune up the UI / UX</strong> - the Readium UI needs to be tweaked a little to make it feel at home on a mobile touch interface.</li>
<li><strong>Package up the build as a Phonegap application</strong> - This may be a little further down the road, but once done it will give us another means to interact with the filesystem and let users import and read their own EPUB3 files on their devices.</li>
</ul>


<p>As always, if anyone thinks they can help with any of these items (or anything else for that matter), please don&#8217;t hesitate to create items in the <a href="https://github.com/readium/readium/issues?sort=created&amp;direction=desc&amp;state=open">github issue tracker</a> or better yet submit pull requests with patches.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Readium Adds MathML Support by Integrating MathJax]]></title>
    <link href="http://readium.github.com/blog/2012/03/18/readium-adds-mathml-support-by-integrating-mathjax/"/>
    <updated>2012-03-18T17:11:00-07:00</updated>
    <id>http://readium.github.com/blog/2012/03/18/readium-adds-mathml-support-by-integrating-mathjax</id>
    <content type="html"><![CDATA[<p>Readium now renders MathML by integrating MathJax. <a href="http://www.mathjax.org">MathJax</a> is an open source, cross-browser JavaScript library sponsored by the American Mathematical Society, Design Science, and the Society for Industrial and Applied Mathematics as well as <a href="http://www.mathjax.org/sponsors/">several partners</a>.</p>

<p>Work on native support of MathML in WebKit <a href="https://trac.webkit.org/wiki/MathML">has begun</a> but is a long way from being ready to use. With MathJax, Readium will be able to render MathML, an important step towards Readium&#8217;s full EPUB 3 support. Additionally, MathJax offers zoom for accessibility and cut-and-paste of mathematical content.</p>

<p>The necessary code has been pulled into the master branch and, as of version 0.1.9, Readium displays math beautifully. You can find a sample <a href="https://github.com/dpvc/readium/blob/mathjax/examples/sample-tex-mml.epub">here</a>.</p>

<p>Thanks to Davide Cervone, MathJax&#8217;s Lead Developer, for his help in making MathJax work in Readium.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Readium Running On Chrome Canary]]></title>
    <link href="http://readium.github.com/blog/2012/03/14/readium-running-on-chrome-canary/"/>
    <updated>2012-03-14T17:17:00-07:00</updated>
    <id>http://readium.github.com/blog/2012/03/14/readium-running-on-chrome-canary</id>
    <content type="html"><![CDATA[<p>Way back in version 16 the Chrome browser began shipping support for <a href="http://dev.w3.org/csswg/css3-regions/">CSS3 regions</a>. This feature defines a declarative markup to have HTML content flow into multiple regions, such that the overflow from the first region spills over into the second region, overflow from the second flows into the third, etc. Parts of the implementation were shaky and / or missing (e.g. the javascript APIs were completely stubbed out). Nevertheless, it was supported and it worked in the basic use cases, so we decided to leverage it in Readium to perform pagination of reflowable content.</p>

<p>Our regions based pagination algorithm basically looks like this:</p>

<ul>
<li><code>whenever a repagination event occurs:</code>

<ul>
<li><code>make a pessimistic guess about how many pages are needed</code></li>
<li><code>add that many pages to the dom</code></li>
<li><code>while the last page has overflow:</code>

<ul>
<li><code>add another page</code></li>
</ul>
</li>
</ul>
</li>
</ul>
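
<p>In JavaScript, the loop above might be sketched like this. The three callbacks are hypothetical hooks that a caller would back with real DOM work. They are not functions from the Readium source.</p>

```javascript
// Sketch of the loop above. The three callbacks are hypothetical hooks that
// a caller would back with real DOM work.
function paginate(guessPageCount, addPage, lastPageHasOverflow) {
  var pages = guessPageCount();

  // Add the guessed number of pages up front.
  for (var remaining = pages; remaining > 0; remaining--) {
    addPage();
  }

  // Keep adding pages until the content no longer spills off the last one.
  while (lastPageHasOverflow()) {
    addPage();
    pages += 1;
  }
  return pages;
}
```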


<p>This was all working well enough in terms of paginating the content. There were some bugs related to click-able areas not being rendered where they were expected but we assumed these were bugs in Webkit / chrome and would get ironed out in future versions. Then one day we found out that regions were going to be <a href="https://bugs.webkit.org/show_bug.cgi?id=78525#c0">shut off by default in webkit</a>. Sure enough I installed <a href="http://tools.google.com/dlpage/chromesxs">chrome canary</a>, regions were gone and Readium did not handle it well. This was not good news for us.</p>

<p><img class="center" src="http://fuuu.us/152.png"></p>

<h2>Challenge Accepted</h2>

<p>We decided that a short term solution to our problem would be to find a way to allow CSS3 regions to be turned back on. With a little support from the awesome chrome team at google, I managed to stumble through the chromium source code and submit <a href="https://chromiumcodereview.appspot.com/9523002/">a patch</a> that allows regions to be turned on via the command line. This patch has landed.</p>

<h2>So What Can You do About it?</h2>

<p>If you are using Chrome Canary, or you are from the future where Chrome has moved to a version in which CSS3 regions are off by default, here are the steps you can take to turn them back on:</p>

<ol>
<li>Open a new tab in your browser and enter <code>about:flags</code> into the url bar</li>
<li>In the page that opens find an entry called <code>Enable CSS Regions</code> and click the <strong>enable</strong> link</li>
<li>Restart chrome for the changes to take effect</li>
</ol>


<p>Once you do this, you may also notice that Readium runs smoother and has fewer bugs. This is all a product of webkit&#8217;s regions implementation becoming more mature (just as we had hoped).</p>

<h2>A Better Solution</h2>

<p>We realize that the above solution does not provide the best user experience, and we are currently seeking other solutions. One idea is to distribute Readium with a custom build of Webkit that has regions turned on. This may seem a bit heavy handed, but we were planning on creating a custom Webkit in order to support vertical typography all along. A simpler solution that is likely to show up soon will be to degrade gracefully when regions are not available or come up with a pagination strategy that works without CSS regions. This will also help us to take a big step towards creating a version of Readium that runs as a normal web page (not packaged as a chrome extension). If you have any good ideas for how we might do this please put them in the <a href="https://github.com/readium/readium/issues?sort=created&amp;direction=desc&amp;state=open">issue tracker</a>.</p>
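
<p>As a starting point for graceful degradation, here is one possible sketch of feature detection. It assumes regions support shows up as a <code>flowInto</code> or <code>webkitFlowInto</code> property on an element&#8217;s style object. The names <code>supportsRegions</code> and <code>choosePaginator</code> are hypothetical, and this is not something Readium does today.</p>

```javascript
// Sketch: detect CSS regions by looking for the flow-into property on a
// style object. supportsRegions and choosePaginator are hypothetical names.
function supportsRegions(style) {
  return "flowInto" in style || "webkitFlowInto" in style;
}

// Pick a pagination strategy based on what the browser offers.
function choosePaginator(style) {
  return supportsRegions(style) ? "regions" : "fallback";
}

// In a browser this would be called with a real style object:
//   choosePaginator(document.documentElement.style)
```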
]]></content>
  </entry>
  
</feed>
