My first piece of news today is by way of celebration that I have been getting some Alexa voice skills active on the Amazon store. These can now be enabled on any of Amazon’s Alexa-enabled devices, such as the Dot or Echo. One of these skills has to do with The Review blog, in that it will list out and read the opening lines of the last few posts there (along with a couple of other blogs I’m involved with). So if you’re interested in a new way to access blogs, and you’ve got a suitable piece of equipment, browse along to the Alexa skills page and check out “Blog Reader”. I’ll be adding other blogs as time goes by.
The second publicly available skill so far relates to my geographical love for England’s Lake District. Called “Cumbria Events”, this skill identifies upcoming events from the Visit Cumbria web site, and will read them out for the interested user. You can expect other skills to do with both writing and Cumbria to appear in time as I put them together. It’s a pity that Alexa can’t be persuaded to use a Cumbrian accent, but to date that is just not possible. Also, the skills are not yet available on the Amazon US site, so far as I know, but that should change before too long.
In the process I’ve discovered that writing skills for Alexa is a lot of fun! Like any other programming, you have to think about how people are going to use your piece of work, but unlike much of what I’ve done over the years, you can’t force the user to interact in a particular way. They can say unexpected things, phrase the same request in any of several ways, and so on. Alexa’s current limitation of about 8 seconds of comprehension favours a conversational approach in which the dialogue is kept open for additional requests. The female-gendered persona of my own science fiction writing, Slate, is totally conversational when she wants to be.
It all makes for a fascinating study of the current state of the art of AI. I feel that if we can crack unstructured, open-ended conversation from a device – with all of the subtleties and nuances that go along with speech – then it will be hard to say that a machine cannot be intelligent. Alexa is a very long way from that just now – you reach the constraints and limitations far too early. But even accepting all that, it’s exciting that an easily available consumer device has so much capability, and is so easy to extend with new capabilities.
But while all that was going on, a couple of hundred million kilometres away NASA ordered a course correction for the MAVEN Mars orbiter. This spacecraft, which has been in orbit for the last couple of years, was never designed to return splendid pictures. Instead, its focus is the Martian atmosphere, and the way this is affected by solar radiation of various kinds. As such, it has provided a great deal of insight into Martian history. So MAVEN was instructed to carry out a small engine burn to keep it well clear of the moon Phobos. Normally they are well separated, but in about a week’s time they would have been within a few seconds of one another. This was considered too risky, so the boost ensures that they won’t now be too close.
Now this attracted my attention since Phobos plays a major part in Timing – it’s right there on the cover, in fact. In the time-frame of Timing, there’s a small settlement on Phobos, which is visited by the main characters Mitnash and Slate as they unravel a financial mystery. This moon is a pretty small object, shaped like a rugby ball about 22 km long and about 17 or 18 km across its girth, so my first reaction was to think what bad luck it was that Maven should be anywhere near Phobos. But in fact MAVEN is in a very elongated orbit to give a range of science measurements, so every now and again its orbit crosses that of Phobos – hence the precautions. This manoeuvre is expected to be the last one necessary for a very long time, given the orbital movements of both objects. So we shall continue getting atmospheric observations for a long while to come.
Kindle formatting has sometimes been the subject of intense debates. At one end of the spectrum there is a belief that it should mirror print conventions as far as possible. Alternatively, there is a contrasting belief that we have a new display medium which needs to be free to develop its own ideas. I’m inclined towards the second view, though I do think that some print conventions still have value. To extract parts of a long debate on The Passive Voice blog about text alignment, which attracted quite opposing views:
“Do not force the reader to read the way you want her to… Just get out of the way… Good readable text that flows nicely… keep it left justified… Is there anything that shouts ‘amateur and non-pro’ more than a justified book?… I’ll take amateur and readable for people who read with large font sizes… There’s a lot to be said for left justification on a smaller screen with larger font”.
Obviously a lively topic. But first, there is the question of what is feasible. Some expectations may simply not be realistic, and we’ll come back to this later. Even if you restrict yourself to Kindle files, there is considerable variation in how the same file is displayed by different devices. Step away from Kindles into the diverse world of epub readers and reading apps, and the variation only increases. As usual, most of this post will talk about Kindles specifically, but an exactly parallel process applies to epub books.
Now, as well as differences between devices, you are faced with individual reader choices. A Kindle device, or a software app emulating a Kindle on a phone, tablet, or computer, places enormous flexibility of choice in the hands of the reader. The reader can change font, font size, margin width, line spacing, and background colour, as well as swap between portrait and landscape orientation. Many epub readers offer even more choices.
With traditional print layout, all of these are decided by the author or publisher, and the reader has no choice. Once the design is chosen, that’s it. This has given rise over the years to a set of conventions for printed books. Not so with ebooks. The phenomenal rise of epublishing, with free and easy to use tools enabling indie publishing, has given huge empowerment to authors. But I sometimes think that authors have not caught on to the fact that it has also given huge empowerment to readers.
As a reader, I don’t have to put up with someone else’s choices. If I want a serif font instead of sans serif – or a dyslexic-friendly font for that matter – or I like big text size, or wide margins, or two columns side by side, or a different background colour, or whatever, I can choose that. If the author or publisher tries to stop me, I’ll get frustrated. It’s not appropriate any longer for an author to try to decide how a reader ought to read his or her book. Sometimes I hear people say that what matters is the layout of the book when originally downloaded, but this shows a misunderstanding of how the devices work. My reader settings are mine, and yours are yours, and they get applied equally to newly downloaded or existing books. There is nothing magic about the settings chosen by the author. If I want to read your ebook with right-aligned text, I can do so, and I can feel frustrated if you try to stop me.
As I mentioned before, not all Kindle devices, or Kindle software apps on computers, phones and tablets, treat the content the same way. My phone Kindle app (both Android and iPhone) handles changes of font size differently from my various actual Kindles, including the way it decides to justify text. This is something built into the app itself, not a thing I have direct control over.
So what does that mean for Kindle formatting? A layout that looks good when the font is small compared to the page width may well be confusing when the font is large. A layout that is easily readable when the font is large compared to the page size may well seem non-standard when the font is small.
Now, ebook reading is a new technology, and moreover one which has grown up closely linked to the world of web page design, so there has been a great deal of systematic study of the readability of different styles. This hasn’t happened to the same degree with the world of print, except in the very specific area of font design. Some choices which are routinely made in a printed book, and which have become part of the lore of book production, were originally made for all kinds of diverse reasons including wartime economy. In particular, the normal print practice of justifying text both left and right is not based on considerations of readability, but rather on maximising the number of characters on what was a scarce resource – paper. Change the ratio of words per line significantly away from what is common for a novel, and flush-both-sides becomes rather unreadable, as reported by numerous systematic studies.
If the page is wide compared to the typical word size, then your eye loses its ability to scan lines comfortably. This is why newspapers and magazines split text into columns – and why the landscape mode in recent ereaders gives the user the option to do just this.
If the page is narrow compared to the words, there is a tendency for the layout to become erratic and irregular. Areas of open space appear, called “rivers”, creating a ragged and untidy appearance. Text which is left-aligned only, with ragged right margin, is better spaced, aesthetically more pleasing, and also more readable. In times past, the Kindle layout engine was heavily criticised for its poor showing in this area. It has improved, but is still way behind the result that can be achieved with a fixed page width. If you think about it, this is inevitable. Every time you change the font size, or the margin, or the aspect – even if you highlight a piece of text and then later jump back to it – every time something like this happens, the Kindle software has to recalculate the position of each word. You get huge advantages over a physical book with rigid layout, but you also have to recognise, and cope with, the limitations. Newer versions of the Kindle layout software deliberately switch to flush left only (ragged right) as the font size increases, to address this very issue.
Readability considerations, then, suggest that flush-both-sides works best in the middle range of Kindle fonts, degrading in different ways as you go towards the extremes of size options. And although modern Kindles have recognised this, and provided a built-in solution, older ones do not. So next time we’ll look at another option that can be used by authors which they can control. For today, it’s enough to recognise that there is something to think about here, and that trying to simply copy what is done in print does not necessarily work well.
I mentioned earlier about things which are simply infeasible in ebooks, even though normal and appropriate in print. One of these concerns hyphenation. The print version, with fixed word positioning, can be carefully laid out to get hyphens in the perfect places. Kindle books can’t – any choice which is correct for one person’s device and settings will be wrong for the next. Recent software layout engines do a reasonable job of inserting hyphens, but they struggle with some words, especially proper nouns. You can easily see this if you find a book line with several long words, then expand the font size. At some point the layout engine just gives up.
Another area is that of widows and orphans – single lines appearing at the bottom or top of a page, which are considered distracting to the reader. A publisher of a printed book tries hard to eliminate these by judicious choice of word selection and spacing. It can’t be done on a Kindle. If as author you did carefully sort all that out on your own device, it will all go wrong with a change of settings or on a new gadget.
So there are stylistic choices which are sound and reasonable in print, but which cannot be carried over to ebooks. It’s well worth thinking about this when you’re preparing a book, and only spending time on the issues that can be fixed. And do try out how the book appears when viewed with completely different user settings than the ones you personally like!
I ran out of time this week to do much by way of blogging, so here are three bits of space news which may well make their way into a story sometime.
Stop Press: just today NASA announced that a relatively close star (39 light years away) has no fewer than 7 approximately Earth-sized planets orbiting it… see the schematic picture at the end of the blog.
Firstly, the Dawn probe, still faithfully orbiting the asteroid Ceres, has detected complex organic molecules in two separate areas in the middle latitudes of the dwarf planet. The onboard instruments are not accurate enough to pin the molecules down precisely, but it seems likely that they are forms of tar. The analysis also suggests that they formed on Ceres itself, rather than being deposited there by a meteor. The most likely cause is thought to be the action of warm water circulating through chemicals under the surface. Some of the headlines suggest that this could signal the presence of life, but it’s more cautious to say that it shows that the conditions under which life could develop are present there.
The second snippet spells difficulty for my hypothetical Martian settlements. This picture was captured by the Mars Reconnaissance Orbiter and shows two larger impact craters surrounded by a whole array of smaller ones. The likely scenario is that one object split into a cluster of fragments as it passed through the Martian atmosphere. This of itself wouldn’t be too surprising, but inspection of older photos of the same area shows that this impact happened between 2008 and 2014. No time at all in cosmic terms, and not so much fun if you’d carefully built yourself a habitable dome there.
The problem is the thinness of the Martian atmosphere. It is considerably deeper than ours here on Earth, but hugely less dense. So when meteors arrive at the top of the layer of air, they don’t burn up as comprehensively as Earth-bound ones. More of them reach the surface. Even a comparatively small rock has enough kinetic energy to really spoil your day. Something that will need some planning…
Finally we zoom right out to the cold, dark reaches of the outer solar system. A long way beyond the orbit of Pluto there is a region called the Kuiper Belt, and out in the Kuiper Belt a new dwarf planet has recently been found. It goes by the catchy name of 2014 UZ224 and it took nearly two years to confirm its existence. Best estimates are that it is a little over 300 miles across – about half the size of Ceres. I’ve never sent Mitnash and Slate out anywhere like that – it’s about twice as far from Earth as Pluto, and the journey alone would take about four months one-way. I do have vague plans for a story set out in the Kuiper Belt, but appropriately enough it’s some way off yet. But even at that distance, you’re still less than half a percent of the distance to the nearest star… space is really big!
Since as far back as written records go – and probably well before that – we humans have imagined artificial life. Sometimes this has been mechanical, technological, like the Greek tales of Hephaestus’ automata, who assisted him at his metalwork. Sometimes it has been magical or spiritual, like the Hebrew golem, or the simulacra of Renaissance philosophy. But either way, we have both dreamed of and feared the presence of living things which have been made, rather than evolved or created.
Modern science fiction and fantasy have continued this habit. Fantasy has often seen these made things as intrusive and wicked. In Tolkien’s world, the manufactured orcs and trolls (made in mockery of elves and ents) hate their original counterparts, and try to spoil the natural order. Science fiction has positioned artificial life at both ends of the moral spectrum. Terminator and Alien saw robots as amoral and destructive, with their own agenda frequently hostile to humanity. Asimov’s writing presented them as a largely positive influence, governed by a moral framework that compelled them to pursue the best interests of people.
But either way, artificial life has been usually conceived as self-contained. In all of the above examples, the intelligence of the robots or manufactured beings went about with them. They might well call on outside information stores – just like a person might ask a friend or visit a library – but they were autonomous.
Yet the latest crop of virtual assistants that are emerging here and now – Alexa, Siri, Cortana and the rest – are quite the opposite. For sure, you interact with a gadget, whether a computer, phone, or dedicated device, but that is only an access point, not the real thing. Alexa does not live inside the Amazon Dot. The pattern of communication is more like when we use a phone to talk to another person – we use the device at hand, but we don’t think that our friend is inside it. At least, I hope we don’t…
So where are Alexa and her friends? When you ask for some information, buy something, book a taxi, or whatever, your request goes off across cyberspace to Amazon’s servers to interpret the request. Maybe that can be handled immediately, but more likely there will be some additional web calls necessary to track down what you want. All of that is collated and sent back down to your local device and you get to hear the answer. So the short interval between request and response has been filled with multiple web messages to find out what you wanted to know – plus a whole wrapper of security details to make sure you were entitled to find that out in the first place. The internet is a busy place…
So part of what I call Alexa is shared between every single other Alexa instance on the planet, in a sort of common pool of knowledge. This means that as language capabilities are added or upgraded, they can be rolled out to every Alexa at the same time. Right now Alexa speaks UK and US English, and German. Quite possibly when I wake up tomorrow other languages will have been added to her repertoire – Chinese, maybe, or Hindi. That would be fun.
But other parts of Alexa are specific to my particular Alexa, like the skills I have enabled, the books and music I can access, and a few personalisations I have carried out, such as improved phrase recognition. Annoyingly, there are national differences as well – an American Alexa can access the user’s Kindle library, but British Alexas can’t. And finally, the voice skills that I am currently coding are only available on my Alexa, until the time comes to release them publicly.
So Alexa is partly individual, and partly a community being. Which, when you think about it, is very like us humans. We are also partly individual and partly communal, though the individual part is a considerably higher proportion of our whole self than it is for Alexa. But the principle of blending personal and social identities into a single being is true both for humans and the current crop of virtual assistants.
So what are the drawbacks of this? The main one is simply that of connectivity. If I have no internet connection, Alexa can’t do very much at all. The speech recognition bit, the selection of skills and entitlements, the gathering of information from different places into a single answer – all of these things will only work if those remote links can be made. So if my connection is out of action, so is Alexa. Or if I’m on a train journey in one of those many places where UK mobile coverage is poor.
There’s also a longer term problem, which will need to be solved as and when we start moving away from planet Earth on a regular basis. While I’m on Earth, or on the International Space Station for that matter, I’m never more than a tiny fraction of a second away from my internet destination. Even with all the other lags in the system, that’s not a problem. But, as readers of Far from the Spaceports or Timing will know, distance away from Earth means signal lag. If I’m on Mars, Earth is anywhere from about 4 to nearly 13 minutes away. If I go out to Jupiter, that lag becomes at least half an hour. A gap in Alexa’s response time of that long is just not realistic for Slate and the other virtual personas of my fiction, whose human companions expect chit-chat on the same kind of timescale as human conversation. The code to understand language and all the rest has to be closer at hand.
So at some point down the generations between Alexa and Slate, we have to get the balance between individual and collective shifted more back towards the individual. What that means in terms of hardware and software is an open problem at the moment, but it’s one that needs to be solved sometime.
A shorter blog today focusing specifically on navigation. I mentioned before that there were two different ways of navigating through the sections of an ebook, and this little post will focus on how they work. The two methods appear differently in the book – the HTML contents page is part of the regular page flow of the book, and the NCX navigation is outside of the pages, as we’ll see later.
It’s an area where ebooks behave quite differently to print versions. If you have a table of contents (TOC) in a print book it’s essentially another piece of static text, listing page numbers to turn to. Nonfiction books frequently have other similar lists such as tables or pictures, but I’ll be focusing only on section navigation – usually chapters but potentially other significant divisions. In an ebook this changes from a simple static list into a dynamic means of navigation.
Let’s take the HTML contents first. It looks essentially the same as the old print TOC, except that the page numbers are omitted (since they have no real meaning) and each entry becomes a dynamic link. Tap the link and you go to the corresponding location. They look just like links in a web page, for the very good reason that this is exactly what they are!
So the first step is to construct your HTML contents list, for which you need to know both the visible text – “Chapter 1”, perhaps – and the target location. Authors who use Word or a similar tool can usually generate this quite quickly, while those of us who work directly with source files have the more laborious task of inserting the anchor targets by hand. It’s entirely up to you how you style and structure your contents page – maybe it makes sense to have main parts and subsections, with the latter visually indented. It’s your choice.
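As a sketch of what this looks like under the bonnet (the file names and anchor ids here are invented for illustration), a hand-built HTML contents page is just ordinary linked markup:

```html
<!-- toc.html: a minimal hand-written contents page (names are illustrative) -->
<h2>Contents</h2>
<p><a href="chapter1.html#ch1">Chapter 1</a></p>
<p><a href="chapter2.html#ch2">Chapter 2</a></p>
<!-- an indented entry is one way of visually marking a subsection -->
<p style="margin-left: 2em;"><a href="chapter2.html#ch2s1">A subsection</a></p>
```

The matching anchor target then sits at the head of each chapter file, for example `<h2 id="ch1">Chapter 1</h2>` – that is the part which has to be inserted by hand if your tools don’t do it for you.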
The NCX navigation is a bit different. It’s a separate file, and the pairing of visible text and target link is done by means of an XML file of a specific structure. Again, some commercial software will be able to generate this for you, using the HTML TOC as a starting point, but it’s as well to know what it is doing for you. Conventionally the two lists of contents mirror each other, but this doesn’t have to be the case. For example, it might suit you better to have the HTML version with an exhaustive list, and the NCX version with just a subset. It’s up to you. However, the presence of NCX navigation in some form is a requirement of Amazon’s, sufficiently so that they reserve the right to fail validation if it’s not present. And it’s a mandatory part of the epub specifications, and a package will fail epubcheck if NCX is missing. You’ll get an error message like this:
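For anyone curious what that specific XML structure actually looks like, here is a minimal sketch of an NCX file (identifiers and file names invented; a real file also carries a DOCTYPE declaration):

```xml
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <head>
    <!-- one of those cross-references: must match the book identifier in the OPF -->
    <meta name="dtb:uid" content="my-book-id"/>
  </head>
  <docTitle><text>My Book</text></docTitle>
  <navMap>
    <navPoint id="nav1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="chapter1.html#ch1"/>
    </navPoint>
    <navPoint id="nav2" playOrder="2">
      <navLabel><text>Chapter 2</text></navLabel>
      <content src="chapter2.html#ch2"/>
    </navPoint>
  </navMap>
</ncx>
```

Each navPoint pairs the visible label with its target, just as the links in the HTML contents page do.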
Validating using EPUB version 2.0.1 rules.
ERROR(RSC-005): C:/Users/Richard/Dropbox/Poems/test/Test.epub/Test_epub.opf(39,8): Error while parsing file ‘element “spine” missing required attribute “toc”‘.
ERROR(NCX-002): C:/Users/Richard/Dropbox/Poems/test/Test.epub/Test_epub.opf(-1,-1): toc attribute was not found on the spine element.
Of course, if you don’t check for validation, or if you just use KindleGen without being cautious, you will end up with an epub or Kindle mobi file that you can ship… it will just be lacking an important feature.
Interestingly, you don’t get an error if you omit the HTML TOC – so long as everything else is in order, your epub file will pass validation just fine. This is the opposite of what folk who are used to print books might guess, but it reflects the relative importance of NCX and HTML contents tables in an ebook.
So what exactly do they each do? The main purpose of the HTML version is clear – it sits at the front of the book so that people can jump directly to whatever chapter they want. It would do this even if you just included the file in the spine listing. But if you are careful to specify its role in the OPF file, it also enables a link in the overall Kindle (or epub) navigation. This way the user can jump straight to the TOC from anywhere.
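That “careful to specify its role” part boils down to a one-line declaration in the guide section of the OPF file (the file name here is illustrative):

```xml
<guide>
  <!-- tells the device which file is the contents page,
       enabling the "Table of Contents" entry in the Go To menu -->
  <reference type="toc" title="Table of Contents" href="toc.html"/>
</guide>
```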
The NCX navigation enables the rest of this “Go To” menu. If it’s missing, or incorrectly hooked up in the OPF file, the navigation will be missing, and you are leaving your readers struggling to work out how to flick easily to and fro. On older Kindles, there were little hardware buttons (either on the side of the casing or marked with little arrows on the front) which would go stepwise forwards and backwards through the NCX entries.
So that’s it for the two kinds of navigation. They’re easy to include, they add considerably to the user experience, and in one way or another are considered essential.
I recently invested in an Amazon Dot, and therefore in the AI software that makes the Dot interesting – Alexa, Amazon’s virtual assistant. But I’m not going to write about the cool stuff that this little gizmo can do, so much as what it led me to think about AI and conversation.
The ability to interact with a computer by voice consistently, effectively, and on a wide range of topics is seen by the major industry players as the next big milestone. Let’s briefly look back at the history of this.
Once upon a time all you could use was a highly artificial, structured set of commands passed in on punched cards, or (some time later) via a keyboard. If the command was wrong, the machine would not do what you expected. There was no latitude for variation, and among other things this meant that to use a computer needed special training.
The first breakthrough was to separate out the command language from the user’s options. User interfaces were born: you could instruct the machine what you wanted to do without needing to know how it did it. You could write documents or play games without knowing a word of computer language, simply by typing some letters or clicking with a mouse pointer. Somewhere around this time it became possible to communicate easily with machines in different locations, and the Internet came into being.
The next change appeared on phones first – the touch screen. At first sight there’s not a lot of change from using a mouse to click, or your finger to tap. But actually they are worlds apart. You are using your body directly to work with the content, rather than indirectly through a tool. Also, the same interface – the screen – is used to communicate both ways, rather than the machine sending output through the screen and receiving input via movements of a gadget on an entirely different surface. Touch screens have vastly widened access to technology and information: advanced computers are quite literally in anyone’s pocket. But touch interfaces have their problems. It’s not especially easy to create passages of text. It’s not always obvious how to use visual cues to achieve what you want. It doesn’t work well if you’re making a cake and need to look up the next stage with wet and floury hands!
Which brings us to the next breakthrough – speech. Human beings are wired for speech, just as we are wired for touch. The human brain can recognise and interpret speech sounds much faster than other noises. We learn the ability in the womb. We respond differently to different speakers and different languages before birth, and master the act of communicating needs and desires at a very early age. We infer, and broadcast, all kinds of social information through speech – gender, age, educational level, occupation, emotional state, prejudice and so on. Speech allows us to explain what we really wanted when we are misunderstood, and has propelled us along our historical trajectory. Long before systematic writing was invented, and through all the places and times where writing has been an unknown skill to many, talking has still enabled us to make society.
Enter Alexa, and Alexa’s companions such as Siri, Cortana, or “OK Google”. The aim of all of them is to allow people to find things out, or cause things to happen, simply by talking. They’re all at an early stage still, but their ability to comprehend is seriously impressive compared to a few short years ago. None of them are anywhere near the level I assume for Slate and the other “personas” in my science fiction books, with whom one can have an open-ended dialogue complete with emotional content, plus a long-term relationship.
What’s good about Alexa? First, the speech recognition is excellent. There are times when the interpreted version of my words is wrong, sometimes laughably so, but that often happens with another person. The system is designed to be open-ended, so additional features and bug fixes are regularly applied. It also allows capabilities (“skills”) to be developed by other people and added for others to make use of – watch this space over the next few months! So the technology has definitely reached a level where it is ready for public appraisal.
What’s not so good? Well, the conversation is highly structured. Depending on the particular skill in use, you are relying either on Amazon or on a third-party developer to anticipate and code for a good range of requests. But even the best of these skills is necessarily quite constrained, and it doesn’t take long to reach the boundaries of what can be managed. There’s also very little sense of context or memory. Talking to a person, you often say “what we were talking about yesterday…” or “I chatted to Stuart today…” and the context is clear from shared experience. Right now, Alexa has no memory of past verbal transactions, and very little sense of the context of a particular request.
But also, Alexa has no sense of importance. A human conversation has all kinds of ways to communicate “this is really important to me” or “this is just fun”. Lots of conversations go something like “you know what we were talking about yesterday…”, at which the listener pauses and then says, “oh… that”. Alexa, however, cannot distinguish at present between the relative importance of “give me a random fact about puppies”, “tell me if there are delays on the Northern Line today”, or “where is the nearest doctor’s surgery?”
These are, I believe, problems that can be solved over time. The pool of data that Alexa and other similar virtual assistants work with grows daily, and the algorithms that churn through that pool in order to extract meaning are becoming more sensitive and subtle. I suspect it’s only a matter of time until one of these software constructs is equipped with an understanding of context and transactional history, and along with that, a sense of relative importance.
Alexa is a long way removed from Slate and her associates, but the ability to use unstructured, free-form sentences to communicate is a big step forward. I like to think that subsequent generations of virtual assistants will make other strides, and that we’ll be tackling issues of AI rights and working partnerships before too long.
Last time I looked at the basic principles of a Kindle mobi or general epub file. This time I’ll be focusing a bit more on what the different ingredients do. We’ll also start to uncover a few more places where Kindle and epub handle things differently. For reference, here is a sample set of files you need for an epub book – Kindle is essentially the same, but some “administrative” bits are inserted automatically by KindleGen so you don’t need to worry about them.
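As an illustration (the names are my own choices rather than requirements, apart from mimetype and META-INF/container.xml, which are fixed by the epub specification), a simple epub package might contain:

```
mimetype                  - must contain exactly the text "application/epub+zip"
META-INF/container.xml    - points the reading system at the OPF file
content.opf               - the package file that ties everything together
toc.ncx                   - the NCX navigation
toc.html                  - the HTML contents page
chapter1.html ...         - the content itself, in one or more files
styles.css                - an optional stylesheet
cover.jpg                 - the cover image
```

The first two entries are among the “administrative” bits that KindleGen deals with for you on the Kindle side.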
Somebody who uses Microsoft Word or some similar software to construct their book may find the following paragraphs confusing, since they will probably never have needed to address this directly. But under the bonnet this is what is happening with your book preparation, and many years of technical software and QA work have convinced me it’s better to know rather than not know. At the very least this may help diagnose when something goes wrong!
So, the key ingredient is the opf file which ties everything together. It has four main sections. The first is Metadata – a general information section containing things like author name, book title, publisher, ISBN (if any), price, brief description, and so on. The kind of detail you might expect to see on a library card or catalogue entry. I’ll be giving specific examples of the different files later in this series but for now want to concentrate on principles rather than details. There are also some important places where you have to ensure that a reference in one place matches one somewhere else – again, I’ll return to this.
The second section is a list of resources – the Manifest. For epub this must be complete, and although KindleGen is clever enough to fill in some gaps, it is good practice to be thorough here as well. So this identifies all content files, any separate style sheets, all images including cover, the ncx navigation file, and anything else you intend to include. But it’s a simple list, like the ingredients for a recipe before you get to the directions, and this section doesn’t tell KindleGen or an epub reader how to assemble the items into a book.
The third section – the Spine – does this work of assembly. It lists the items that a reader will encounter in their correct order. This section turns your simple list of included items into a proper sequence, so that chapter two comes after chapter one. Here you also link in the ncx file so it can do its job.
The final section – the Guide – defines key global features of the finished book. For example, this is where you define the cover, the HTML contents page, and the start point – the place where the book opens for the very first time, and the target for the navigation command “Go to Beginning” (or equivalent). It’s worth remembering that the start point doesn’t have to be the first page – many books set this after the front matter, so that you skip over title pages and such like and begin at the beginning of the actual story. But be warned that following a scam to do with counts of pages read, Amazon does not take kindly to people putting the start point too far through the book.
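To make the four sections concrete, here is a sketch in Python (purely for illustration – the opf itself is plain XML you can write in any text editor) of a stripped-down opf and how to pick the sections apart. The title, ids and file names are invented, and notice how the spine’s idref must match an id declared in the manifest – one of those places where a reference in one spot has to match a reference somewhere else.

```python
import xml.etree.ElementTree as ET

# A minimal, hypothetical opf: real ones carry far more metadata, but the
# four-section shape (metadata, manifest, spine, guide) is the same.
SAMPLE_OPF = """<?xml version="1.0"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>My Example Book</dc:title>
    <dc:creator>A. N. Author</dc:creator>
  </metadata>
  <manifest>
    <item id="chap1" href="chapter1.html" media-type="application/xhtml+xml"/>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="chap1"/>
  </spine>
  <guide>
    <reference type="cover" title="Cover" href="cover.html"/>
  </guide>
</package>"""

OPF = "{http://www.idpf.org/2007/opf}"
DC = "{http://purl.org/dc/elements/1.1/}"

def opf_summary(opf_text):
    """Return (title, manifest hrefs, spine idrefs) from an opf document."""
    root = ET.fromstring(opf_text)
    title = root.find(f"{OPF}metadata/{DC}title").text
    manifest = [i.get("href") for i in root.findall(f"{OPF}manifest/{OPF}item")]
    spine = [i.get("idref") for i in root.findall(f"{OPF}spine/{OPF}itemref")]
    return title, manifest, spine
```

Run `opf_summary(SAMPLE_OPF)` and you get the book title, the list of ingredients, and the reading order – the recipe analogy in code form.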
Images can cause unexpected problems. The opf file expects you to supply not just a file name, but also the file type, such as jpeg, png or gif. And here we encounter one of those annoying differences between devices. Kindle accepts png files along with jpeg and gif, and many people are used to the convenient feature of the png format that it allows transparency. A png with transparency will – usually – take on the background from whatever happens to be behind it, such as the page background. An epub file will do exactly this if you use a coloured background.
But KindleGen does not. You can supply a png file successfully, but internally it will be converted to jpeg format… and jpegs do not allow transparency. The background will be converted to white, and the final effect will not be what you hoped for. The way round this is to use gif images if you want transparency, but since this is an old format many people do not suspect that this is necessary.
Now, sometimes this won’t matter – for example if you want to insert a map, and have it look as though it is on white paper. But other times it looks decidedly odd, when it is intended to be just a logo or divider symbol. It’s a thing which particularly catches out those who are used to printed books, or older Kindles which only supported black-and-white. It’s not very long since I discovered this hidden conversion png -> jpeg that KindleGen does, and as a result expanded my pre-publication testing considerably.
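If you want to check in advance whether a png is at risk from this hidden conversion, you can inspect its header for an alpha-capable colour type. Here is a rough sketch using nothing but the Python standard library (the function name is my own invention; note that colour type 3, palette, only carries transparency if a separate tRNS chunk is present, so treat it as a “maybe”):

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_may_have_transparency(data: bytes) -> bool:
    """Return True if the PNG colour type can carry an alpha channel.

    Colour types 4 (grey + alpha) and 6 (RGB + alpha) always carry alpha;
    type 3 (palette) can do so via a tRNS chunk, so it counts as a 'maybe'.
    """
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    # The IHDR chunk follows the 8-byte signature: 4-byte length, 4-byte
    # type, then width (4), height (4), bit depth (1), colour type (1).
    colour_type = data[8 + 4 + 4 + 4 + 4 + 1]
    return colour_type in (3, 4, 6)
```

Feed it the first few dozen bytes of any png file; if it returns True, consider switching that image to gif before KindleGen gets hold of it.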
A couple of closing comments about the files themselves. The contents files ought to be valid HTML – this sounds obvious, but most browsers, and KindleGen, are very forgiving about syntax errors, so people often forget to be careful. But although the output file may be generated, such errors can lead to surprising changes of appearance between paragraphs. It is good practice to use the w3.org online validator to check this – it’s completely free, and will either confirm that the file is valid or else tell you what’s wrong and how to fix it. Alternatively, once you have built an epub file, the free epubcheck utility will do a similar job as one of its several checks. (I’ll come back to epubcheck when I talk about building an epub file). Other than that, you are at liberty to split up content however you like – all in a single file, or one per chapter, or whatever. It’s up to whatever you find convenient (though a few epub apps load just one file at a time so you might notice a slight delay every now and again while reading).
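The w3.org validator and epubcheck are the proper tools for this job, but for a quick local sanity check you can catch the single most common mistake – mismatched open and close tags – with a few lines of Python. This is only a sketch, deliberately far less thorough than a real validator:

```python
from html.parser import HTMLParser

# Tags that never take a closing partner, so they shouldn't go on the stack.
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input", "area",
             "base", "col", "embed", "source", "track", "wbr"}

class TagBalanceChecker(HTMLParser):
    """Rough check that every opened tag is closed in the right order."""
    def __init__(self):
        super().__init__()
        self.stack = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")

def check_balance(html_text):
    """Return a list of problems; an empty list means the tags balance."""
    checker = TagBalanceChecker()
    checker.feed(html_text)
    checker.close()
    return checker.errors + [f"unclosed <{t}>" for t in checker.stack]
```

An empty result doesn’t mean the file is valid HTML – only the real validator can tell you that – but a non-empty one tells you exactly which tag to go hunting for.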
A quick blog today, focusing on a couple of things. First, like most of us, my annual Goodreads statistics appeared, telling me what I had read in 2016 (or at least, what GR knew about, which is a fair proportion of what really happened).
So, I read 52 books in the year, up 10 from 2015 (and conveniently one a week), but the page count was down very slightly. I guess I’m reading shorter books on average! Slightly disappointingly, there were very few books more than about 50 years old, with Kalidasa’s Recognition of Shakuntala the outstanding early text. This year, I have a target of reading more old stuff alongside the new. In 2016 there was also more of a spread of genres, with roughly equal proportions of historical fiction, science fiction, fantasy, and non-fiction (aka “geeky”), contrasting with previous years where historical fiction has dominated.
I also recently read that Amazon passed the landmark of 5 million ebooks on their site in the summer, slightly ahead of the 10th birthday of the Kindle itself. The exact number varies per country – apparently Germany has more – but currently the number is growing at about 17% per annum. That’s a lot of books… about 70,000 new ones per month, in fact. Let nobody think that reading is dead! As regards fiction, Romance and Children’s books top the counts, which I suspect will come as a surprise to nobody.
Finally, we have just had a space-related anniversary, namely that of the successful landing of the ESA Huygens probe on Saturn’s moon Titan on January 14th 2005. An extraordinary video taken as it descended has been circulating recently and I am happy to reshare it. Meanwhile the Cassini “mothership” is in the last stages of its own research mission and, with fuel almost exhausted, will be directed to burn up in Saturn’s atmosphere later this year. I vividly remember the early mission reports as Cassini went into orbit around Saturn – it’s a bit sad to think of the finale, but this small spacecraft has returned a wealth of information since being launched in 1997, and in particular since arriving at Saturn in 2004.
(Video link is https://youtu.be/msiLWxDayuA?list=PLTiv_XWHnOZpKPaDTVy36z0U8GxoiIkZa)
This is the first of an occasional series on the quirks of preparing ebooks. Almost everything applies equally to Kindle and general epub, but for the sake of quickness I shall normally just write “Kindle”.
The conversion of a manuscript written in some text editor through to a built ebook – a mobi or epub file – happens in several logical stages. A lot of authors aren’t really aware of this, and just use a package which does the conversion for them. Later in this series I’ll talk a bit about how Amazon’s software – KindleGen – does this, and what parts of your input end up doing what.
First, what is an ebook? You can see this best with a generic epub file. Find such a file on your system, then make a copy so you don’t accidentally corrupt your original. Let’s say it’s Test.epub. Rename it to Test.zip and give approval if your computer warns you about changing the file extension.
Then you can look inside the contents and see what’s there – a very specific folder structure together with a bunch of content files. This is what your epub reader device or app turns into something that looks like a book. The listing not only names the files but (presupposing you’ve given sensible names to the source files) tells you something about their purpose. The ones identified as HTML documents are basically the text of the book, including the contents listing and any front and back matter the author chooses to put in. The style sheets are there. There’s a cover image. The ncx file describes how the Kindle or epub reader will navigate through the book (of which more another time). The opf file is the fundamental piece of the jigsaw that defines the internal layout. The images folder contains, well, the images used in the book. The other files are necessary components to enable the whole lot to make sense to the reading app or device.
A Kindle mobi file is much the same except that there is usually some encryption and obfuscation to dissuade casual hacking. But actually, almost exactly the same set of files is assembled into a mobi file. What KindleGen does is rearrange your source files – whether you use Word, plain text, or some other format – into this particular arrangement. By the same token, if you are careful to get everything in exactly the right place, you can create your epub file with nothing more than a plain text editor and something that will make a zip archive out of the files.
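As an illustration of that “text editor plus zip archive” route, here is a sketch in Python of the skeleton of an epub archive. The one wrinkle a generic zip tool can miss is that the epub specification requires the mimetype entry to be the very first file in the archive, stored without compression; the other file names and the placeholder content here are mine, not a complete book.

```python
import io
import zipfile

# container.xml simply points the reading system at the opf file.
CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

def build_epub_shell(target):
    """Write the bare skeleton of an epub archive to target (path or file object)."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as z:
        # mimetype must be the first entry and stored uncompressed;
        # a plain ZipInfo defaults to ZIP_STORED, which is what we want.
        z.writestr(zipfile.ZipInfo("mimetype"), "application/epub+zip")
        z.writestr("META-INF/container.xml", CONTAINER_XML)
        z.writestr("OEBPS/content.opf", "<!-- your opf goes here -->")
```

Fill in a real opf, content files and ncx alongside, and epubcheck will tell you what is still missing.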
So now we know that a Kindle “book” is actually a very long, thin web site, divided up into convenient “pages” by the device or by an app. Kindle books never scroll like a regular web site: they show the content in pages which replace each other, rather than an endless vertical scroll (though a small number of epub apps do offer scrolling). There’s a good reason for that – readability studies have shown that presentation by means of pages is more easily read and comprehended than scrolling. The layout chosen by most word processors – a continuous scroll with some kind of visual cue about page divisions – is good for editing, since you can see the bottom of one page and the top of the next at the same time, but it’s not so good for readability. The scrolling choice made by some epub apps is due to developer laziness rather than any logical reason – and even here, some apps allow the reader to choose how they move through the book.
So the underlying structure is entirely different from the fixed layout called for by a printed book or its computer equivalent such as a pdf or Word document, even if the superficial appearance is similar. On a computer, you can resize the window containing your pdf as much as you like, and the words will stay in the same place on each line of each page. But with Kindle or epub, you can swap between portrait and landscape view, or alter font and margin size, or change line spacing, and in each case the words on the lines will reflow to fit. In the landscape aspect of some Kindles you can choose to view in two columns side by side. In most epub readers you can choose to override whatever text alignment the author or publisher has chosen, and read it however you like. After each such change the device or app recalculates how to lay out the text.
Now many of us choose to use some sort of word processor to write our story, in which none of this is very visible. You can certainly alter the page settings and experiment, but most people just set it to whatever their typical national page size is – A4, or Letter, for example – and leave it at that. That gives the illusion that the process of production is fundamentally the same as that of a printed book – but in fact it is not. If an author’s main intention is to write a paperback book, and they perceive the Kindle version is just a handy spinoff, then focusing on page layout seems to make sense. But most indie authors sell a lot more ebooks than printed ones, so it makes more sense to understand the particular needs of the electronic medium.
You actually don’t need any extravagant software to create an ebook. A plain text editor, some knowledge of simple HTML tags, and a handful of other free tools are all you need. But for those of us who don’t have that knowledge, a word processor plus some sort of format converter is handy. As we shall see later, though, there are pitfalls with such software, and the end product is not necessarily as you would hope.
One of the really exciting features of an ebook is that it bridges two worlds which in the past have been separate – the world of traditional printing, and the world of visual and web design. This fusion opens up huge opportunities for the reader, but has also led to misunderstandings and difficulties. Some of the opportunities are obvious, like the ability to search, synchronise across multiple devices, swap between text and audio versions, and so on.
But there is much more. If I don’t like the original font, or I have dyslexia and prefer a specialised font, I can change it. If I need to expand the font size so I can read the text, I can do this. If I like a coloured page instead of black and white – and I have a device with a colour screen – I need only change a setting.
In all of this, the reader is not constrained by the author’s or publisher’s choices. A great deal of display choice is where it should be – in the hands of the reader, not the writer. It seems to me that this fact has not been fully grasped by many authors, or small publishers, who sometimes treat an ebook as though it were no different from a printed book. They then expect to define every aspect of the display. But people who read ebooks have a considerable amount of choice over how they read – it’s a new world, and it needs new thinking.
That’s it for today. Next time, I will be looking at some of the additional information that ties the separate content files together.
Part 2 of this little series looks at a different phenomenon to do with the sun’s movement through the sky. Imagine yourself picking a time of day – let’s say 10:30 in the morning – and taking note of where the sun is in the sky. Do this at the same time every day of the year to build up a curve tracing the sun’s apparent movement. One way to do this would be to take a photo pointing at exactly the same angle at exactly this time, then overlay the photos on top of each other. Another way would be to put a stick in the ground as a rudimentary sundial, then mark out the end of the stick’s shadow each day. It’s an easy experiment in principle, but takes a lot of patience and accuracy to get right.
But suppose you’ve done that – what would you expect to see? We know that the sun goes up and down in the sky through the year – in winter it is lower and in summer higher. So I suspect that most people would expect to see a straight vertical line being plotted through the year as the sun cycles along its seasonal track. But actually what you get is not a straight line, but a figure eight shape. In the northern hemisphere the top loop of the 8 is smaller than the bottom, while in the southern hemisphere the loop nearer the horizon is the small one.
This curve is called the analemma, and has been known for a very long time – Greek and Latin authors wrote about it some two thousand years ago in the interest of designing a better sundial. My guess is that people observed this much longer ago, and that the creators of the great prehistoric stone observatory monuments tried to incorporate it in their designs.
We can describe this curve mathematically, and it is taught as an aid to celestial navigation for those at sea. With a good watch to keep track of time, decent knowledge of the analemma shape, and some precise observations of the sun’s position in the sky, you can pinpoint your position to within around 100 nautical miles. Not bad if you’re lost at sea with no GPS!
The root cause of this is a combination of two factors in the Earth’s movement. The first is that the polar axis, around which the Earth spins to give day and night, is not at right angles to the plane of the Earth’s orbit. This offset angle, a little over 23 degrees, is what gives us seasons. The second factor is that the Earth’s orbit around the sun is not perfectly circular, but a slightly squashed oval. Moreover the sun is not at the centre of the oval, but offset to one side at one of the two focal points – we are about 5 million km closer to the sun in early January than we are in early July. The Earth does not move at a constant speed around this oval. We speed up at closest approach to the sun, and then slow down as we move further away. Those who can remember school physics might have come across this as Kepler’s 1st and 2nd laws of planetary motion, originally formulated in the early 1600s.
Now, for convenience we split our year into equal length days, which means that for one part of the year, a day according to our clocks gets ahead of its allotted portion of the orbit, and for another part it falls behind. By the end of the year it all comes out even. Also, the offset of the polar axis changes the degree to which these shifts make a real difference against the sky. The combination of these two factors is what generates the figure 8 shape of the analemma.
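For the curious, the combined effect can be sketched numerically. A commonly quoted approximation gives the “equation of time” – how many minutes sundial time runs ahead of or behind clock time – together with the sun’s declination, the up-and-down seasonal component. Plot one against the other over a year and the figure 8 appears. The constants below are standard rough values, not precise ephemeris figures:

```python
import math

def analemma_point(day_of_year):
    """Return (equation of time in minutes, solar declination in degrees).

    Uses a common rough approximation: day 81 is close to the March
    equinox, the axial tilt is taken as 23.45 degrees, and the orbit's
    eccentricity is baked into the fitted sine/cosine coefficients.
    """
    b = math.radians(360.0 / 365.0 * (day_of_year - 81))
    # sin(2b) completes two cycles a year, cos(b)/sin(b) one cycle each;
    # their superposition is what makes the two loops unequal sizes.
    eot = 9.87 * math.sin(2 * b) - 7.53 * math.cos(b) - 1.5 * math.sin(b)
    declination = 23.45 * math.sin(b)
    return eot, declination
```

Loop over days 1 to 365 and the equation of time swings between roughly −14 and +16 minutes – which is exactly why a sundial with fixed hour markers sometimes runs fast and sometimes slow against the clock.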
Let’s think back to our ancient ancestors and the stone monuments they built. We know that the positions of the stones encode astronomical information. The monument builders were aware of not just the annual cycle of the sun, but also of more subtle patterns, such as the 18.6-year cycle that the moon makes in its own path against the sky. Since the analemma can be mapped out with nothing more complicated than a stick to make a shadow, it seems to me quite improbable that they did not know it. Having said that, I don’t know of any specific stone patterns that can be linked directly to the analemma. Once people started making sundials, they soon found that there was no single division of hour markers that works consistently. The figure 8 shape ensures that your sundial sometimes runs fast and sometimes slow.
Moving into the future, every planet has its own variation of the analemma. The exact shape depends on interplay between the angle of the polar axis and the extent to which the orbit deviates from a pure circle. Our Earth has these two factors in approximate balance. So does Pluto, which therefore has a figure 8 shape like Earth, though in this case the top and bottom loops are almost the same size. But for other planets one factor or the other dominates. As a result, Jupiter has a simple oval shape, while Mars has a tear-drop. However, actually making the observations (as opposed to calculating them) might be tricky as you move out through the solar system. On Earth, you only have to wait 365 days. But a Jupiter year is almost 12 of our years, and Pluto takes nearly 250 years to circle the sun once. You would need extreme patience to plot out a full analemma cycle in both these places!