Friday, July 13, 2012

Moved - http://j-gorman.com

I've started a new blog. I've debated for a while whether to keep this one as a more anonymous blog, to integrate it with my new one, or to just neglect it. I still haven't decided. Meanwhile, you can catch my new blog over at http://j-gorman.com

Monday, June 14, 2010

Why I returned the Nook.

So recently I decided to give an eReader a try. My motivation is that I do a lot of reading online and was hoping for a device that would make it easier for me to read in a variety of places, particularly while travelling on the bus and also while relaxing on the couch.


What I wanted to read.

Before I get into some of the reasons I ultimately decided to return the Nook, it helps to know a little about what I was hoping to read on it.
  • Books on programming and computers
    Recently I subscribed to Safari Books Online. There's a great collection of eBooks from the biggest publishers in computer science.

  • Out-of-copyright materials from Google Books and Archive.org
    I'm a fan of a lot of writing that is out of copyright, and curious about more. Lately I've downloaded a variety of authors, including Dunsany, Hearn, and Chambers. Some I don't mind reading as ePubs, but some, like the Wizard of Oz series, I'd like to read as PDFs.


    The Wonderful Wizard of Oz, for example, has some beautiful illustrations. I'd like to have them inline with the book, and I'd be willing to zoom in and out or fiddle with movement to do so.


The Two Major Issues: No Zoom in Reader, Bad Panning in Browser


The Nook, as of software version 1.3, has two key issues that made it difficult to use for the technical reading I wanted to do, as well as for graphics-heavy books. The first is that there is no zoom in reader mode. When reading a PDF, the images are scaled down to the Nook's 6-inch screen, and occasionally the entire page is scaled when artwork and text are combined. I could read the text, but only by holding the device close in bright light. I was hoping to expand the page to see the detail of the artwork, but in book mode I could not figure out a way to zoom. You can adjust the size of the font, but at least with the books I tried, that had little effect on the images. It would be nice to at least be able to center on one image and zoom in and out.


The browser, however, does have a zoom. I'm technically savvy enough that this should have been a workaround, if an annoying one, for the issue with images. But the browser has an equally problematic flaw in its panning feature. Panning, at least in Nook-speak, is moving the "screen" over the webpage using the hardware buttons on the left and right of the Nook. They are supposed to work like the Page Up and Page Down keys on a keyboard in a browser.


Or at least that is how panning is supposed to work. On most pages I tried, panning would instead jump around the page in large chunks. If there were three paragraphs on the screen and you pressed the button, it would likely move the screen down to the fifth paragraph, skipping the fourth entirely.


Now, using the touch screen at the bottom does let you "scroll" by flicking it like a touchpad. However, this still made the page jump around irregularly and caused a mental break each time as I figured out where my scrolling had landed me. It is just too slow and tedious compared to the rate at which I read. While experimenting with this, I also found I had to dim the bottom touch screen considerably, as its light was becoming too distracting.


So I was really hoping the mobile version of Safari Books Online would work well, as it's designed for minimalistic browsers. However, the problems above interfered too much with reading the site. Now, Safari Books Online does let you download five chapters a month for free as PDFs or ePubs, but I'm nervous that I would quickly consume that allowance. Additional chapters can be purchased, I believe, for $2 a chapter. That seems extremely spendy compared to the other plans.


One final note is that manga artwork appears stunning on the e-Ink screen. Indeed, I wish the panning or the zoom worked better, as the Nook would be a great device for reading manga. In fact, after looking at what manga I could in the browser, I'd say there's a market for a dedicated manga reader. Maybe not as much as a few years ago, but I could still see a subscription model for digitized manga: a cheaper reader, with people paying per manga title they want to subscribe to. With a larger, magazine-sized screen and specialized software for panel movement and zooming, it could work out really well.



Some more minor quibbles and observations


Price of books

I'm not much of a fiction reader these days, although I was when I was younger. I'm also much less of a book collector. At some point the desire for less clutter and less back strain slowed down my book purchases. On one hand, the Nook seems to be a wonderful way to reduce the clutter: I can carry one book-sized object and have access to a lot of books. Even better, with the free AT&T service, a new book is always just a download away.


However, price still ends up being a concern. Even with the cheaper prices of eBooks, the fiction works seem spendy to me. It made me think about my younger days, when I did read more fiction. On one hand, the Nook would have been great for that bookish youngster. On the other hand, I read in such great quantities that I wouldn't have been able to afford to keep up with eBook prices. While I bought books new on occasion, far more frequently I haunted the half-price store, garage sales, and other sources of cheap books. That bargain-hunting part of me still exists. At this point I'd love a subscription model for books, like Netflix, but I've yet to see many try it. How about B & N charges me $30 a month and I read as much as I can? For me, paying for a subscription would get around the disadvantage of not getting cheap second-hand prices.


The one thing that comes close to the subscription model is MyMediaMall, offered here by the Lincoln Trail system. I think it's OverDrive's third-party offering for libraries. In theory it should have worked with my Nook, but the selection at Lincoln Trail was small and the MyMediaMall website is really poorly set up. I have never wanted facets or filters so badly as when I was hunting for any books I'd like to read.


I tried downloading three books: a book on the history of cooking in America, one on UML practices, and a book on investing. The latter was when I was getting desperate for examples. After having to use "revert to factory settings" on my Nook to get it to work with the Adobe Digital Editions software, I tried loading the three books. The first book would crash my Nook whenever I tried to access the table of contents or go past page 17. The second book had diagrams that were unzoomable and unreadable; I gave up a few pages in. The third book came in fine.


I'd suggest checking out what your local library offers as far as eBooks. If there's a lot you'll want to read, that's a great bonus for the Nook. Be aware, though, that some of it might not actually work.



Navigation

I also ended up using calibre when I was downloading some of the Google Books and Archive.org books. The navigation on the Nook itself just doesn't show enough metadata to be useful. This isn't entirely the fault of B&N or the Nook; I work with book metadata and I realize how painful it can be. However, I downloaded multiple volumes of "The Writings of Lafcadio Hearn," and navigating through a list of 16 volumes that all appear exactly the same in the Nook's summary and detailed views is not easy. With some hand editing in calibre, at least, it was manageable.



Lack of Apps

One final note: I would have loved to have a feed reader on the device. There's an excellent feature called The Daily that just seems to be screaming out for the ability to add my own RSS feeds. However, I couldn't figure out how to do it, being stuck with the B & N feeds that come with it. This seems odd, as one of the most exciting parts of the Nook to me was that it is an Android-based reader with a wi-fi card. How about opening up that app store/Android system to let me download a feed reader, and then many other lightweight apps that involve a lot of reading, such as Twitter/Facebook/identi.ca apps?


The Nook does look quite hackable, though, and a group of folks calling themselves the NookDevs have managed to get access to the underlying Android system and install their own apps on it. Of course, this always risks a future update from B & N wrecking everything.



Why I'm not keeping the Nook


I came very close to keeping the Nook. I really, really like the e-Ink display. The size of the unit also feels right. The hardware is cool, and the temptation to just play around with it is strong. However, in the end it is just too much money for what it does, and there's too much frustration with some of the reading I do on a regular basis.


Expect to pay about $310 once you get a case to go with it. Right now B & N is offering quite a good deal: a $50 gift certificate when you buy the Nook. However, I'm overdue for a new phone and laptop as well, and putting that money into those will likely get me something that serves my online reading for another year or two, while I watch the development of tablets and dedicated eReaders. I may explore the Sony Touch Reader and some other e-Ink displays.


I am still highly tempted by an e-Ink display and will be keeping my eye on software updates to the Nook and on the other eReaders out there.



Who I think would like the Nook


I think if you do a lot of fiction reading and like staying on top of fairly recent releases, you'll get good prices and a really good reading experience with the Nook. Also, if you're a traveller or on the road a lot, you can't beat the price, selection, and quality of reading on the Nook. Out of all the eReaders out there, I think the Nook has the best potential and future. The speed of previous updates and the hardware are impressive. The Android base also creates potential for other useful functionality, such as reading online news and lightweight email, along with frequent updates to the software.


Also, if you're a hardware hacker, there are some really interesting components. I didn't actually take mine apart, but from reading some of the NookDevs material it looks really cool.



One final word of warning if you, like me, decide to return your Nook


When I tried "Revert to Factory Settings," it apparently did not delete the cookies in my browser, although it did clear out the My Documents folder. If you find yourself returning or giving away your Nook, I'd go into the browser settings and delete everything there.

Wednesday, July 15, 2009

Great disappointment of the last year: No Code4Lib Fix

One of my professional disappointments of this last year was my inability to attend the wonderful Code4Lib Conference. In fact, I didn't make any conference at all. I'm going to try to make up for it next year by attending Access 2009. I've got plane tickets, registration, and some travel plans in place. Next up is securing lodging.

Yet I can't help but wonder what would have happened if I'd had the chance to go to Code4Lib 2009. I'd finally have had a chance to combine two of my loves: coding and geeking out about Lovecraft. In fact, I have a feeling I might have carried it too far and caused something like this to happen....

Sunday, May 10, 2009

Let's play Hide the API!


Howdy folks, it's been a while. But I have reached the end of my patience on a particular matter: documentation in the library vendor world. I'm bothered both by how difficult it is to find and by how incomplete it usually is even when it does exist. So I'm going to quickly look over some examples and offer a plea to those creating these APIs.


Citation Management: please let me enable users of YOUR product/service to actually get value from it.



One recent pair of offenders were RefWorks and EndNote. Both offer "direct export," where an end user on your website or catalog can choose to pass citations along to their RefWorks account. I ended up asking for help in the excellent #code4lib just to find the RefWorks documentation. I had seen it at some point, but Googling and looking around their websites failed to turn it up again when I needed it. #code4lib provided the answer (RefWorks Direct Export), but the documentation was incomplete.


Yes, I got it to work, eventually. I had to sit and work through combinations of input formats (raw MARC, MARCXML) and API options just to figure out combinations that worked. I'll confess: I only got it working by studying the University of Michigan's VuFind implementation: http://mirlyn2-beta.lib.umich.edu/. Had the documentation offered samples of each of the import formats, that would have saved me hours of time and probably gotten me far enough that it would be in my catalog at this point.


There's another good reason to have good, easy-to-find documentation: even contacting their tech support, I got wrong answers about what input formats the filters could take. This does not exactly make me trust a company's abilities or longevity. RefWorks could drastically improve things by adding the examples, making sure a link to the documentation is on their front page or at least on a "technical documentation" page, and having the phrase "RefWorks API" somewhere in the document for indexers.
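For what it's worth, the kind of sample I was begging for would be tiny. Here's a rough sketch of how a catalog might build a direct-export link; the endpoint, parameter names, and filter values below are hypothetical stand-ins I made up for illustration, not RefWorks' actual documented API:

```python
from urllib.parse import urlencode

# Hypothetical direct-export endpoint and parameter names -- stand-ins
# for whatever the vendor's real documentation specifies.
EXPORT_ENDPOINT = "https://citation-manager.example.com/express/import"

def build_export_url(record_url, data_format="MARC",
                     encoding="UTF-8", vendor="My Catalog"):
    """Build a link that hands a citation off to the citation manager.
    `record_url` points at a page serving the record in `data_format`
    (e.g. raw MARC or MARCXML)."""
    params = {
        "vendor": vendor,        # who is sending the citation
        "filter": data_format,   # which import filter to apply
        "encoding": encoding,
        "url": record_url,       # where the service fetches the record
    }
    return EXPORT_ENDPOINT + "?" + urlencode(params)

link = build_export_url("https://catalog.example.edu/record/12345.marcxml",
                        data_format="MARCXML")
print(link)
```

A vendor could publish exactly this sort of snippet, with the real parameter names and a sample record in each supported format, and save integrators hours of guesswork.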



EndNote was a complete failure in terms of documentation for its service, which still doesn't seem to exist. RefWorks, see your opening?


Websites? Oh yeah, those are like brochures with even more pictures!


Or at least that's the impression I get from looking at library vendors' sites.



Now, my latest frustration: I'm trying to find out more about the Syndetics API. You know, the thing you call to display cover images, tables of contents, and other possibly cool stuff. At this point I'm likely to just poke at other sites and "crib" from their URLs, but that doesn't let me see the full range of possibilities. I keep looking at http://www.bowker.com/syndetics/ and failing to see anything useful, and Googling for "Syndetics API" just returns a lot of related blog posts that all fail to link to any documentation.


I suspect this comes partly from the fact that most of these companies seem to view the web as a place for marketing. They want people to see pretty pictures of their products. Meanwhile, their developers and documentation folks are probably used to interacting with the big players out there, like ILS vendors and online retailers, and probably have a pile of PDFs somewhere that they send off to one department when someone subscribes to their service.


I also suspect, as I've seen this behavior elsewhere, that there's a combination of paranoia (someone could just copy our API! Doom!) and the thought that somehow, somewhere, they might be able to charge money at every point in the process. In other words, they want to control who knows about the API, in the hopes of getting away with charging developers to use the API and the library to use the service. If they opened the API, maybe folks like Ex Libris would not need Syndetics' agreement. Not that I believe this, but I suspect it's in the psychology of some management types who don't want to give up any control.


So I have a humble plea to all those creating APIs and services that might be used by a variety of people, including small, loosely affiliated groups: please make your documentation findable. I could probably dedicate an entire blog post to how to do this, but here are some quick ideas: find out what other people actually call the API and make sure those phrases appear in the document, have a subsite just for documentation, and feature cool implementations on your front page with a discreet link to the API documentation.


Why document? Some reasons:




  • Unlike proprietary systems, software development in libraries tends to be decentralized and erratic. Cool stuff happens because small groups manage to get a chunk of time to work on a project. Sharing PDFs in groups like this is problematic, and having to constantly navigate your "tech support" is frustrating and counterproductive. There's lots we could be doing, and we'll just choose another project.

  • Sometimes the people who do really cool stuff aren't even on our staff: computer science undergraduates doing mashups, library grad students trying to make a reputation for themselves, "super patrons" who want to see cooler software in their libraries and volunteer their time.

  • Many libraries play "follow the leader." Every project that successfully implements your API could very well increase the number of clients you have. Trying hard to compete with another group? If a bunch of librarians see a catalog where you can export your search results to citation manager X but not Y, they're very likely to consider getting citation manager X. In other words, it's the best free advertising. I know there's a good business-school term for this, but I fail to remember it now.

  • Just having the API there increases your "buzz" factor. Software programmer types will poke around APIs, daydream about what we could do with them, and babble about them in blogs, IMs, and chat rooms. If there's some cool feature no one else has implemented and you don't have it documented, it does not exist. Period.

  • Really, do you think you're in danger of having the competition get a leg up on you because you have high-quality documentation? Think of it this way: if your competition is that aggressive, they have far more motivation to reverse-engineer your service than we have to even use it. In other words, even without adequate documentation that company may figure out what you're doing, but we almost certainly will not.



While I'm talking about documentation, I've held a related rule for a while that is also a useful guide: if you find that in Google and other search engines other versions of your own service's documentation rank higher than yours, that's an indication you need to rework your documentation.



A quick example is Microsoft's documentation. Lately I've seen some sites that have (probably illegally) copied all the content but stripped most of the annoying MSDN formatting, rendering pages that are much more readable.



If you're seeing a lot of tutorials, make sure to write a good tutorial and add it to your site.


In the end...


In the end I'll probably just go figure out what our consortium is doing in our catalog with the Syndetics API. Which also means my stuff will look just like everyone else's. Boring.

Sunday, July 27, 2008

What is a Codex Monkey?

Any joke you have to explain is not a very good joke. Now, in a desperate attempt to be cute and clever, I named this blog, and my "handle" on it, Codex Monkey. All well and fine, until a librarian who found out about it said the other day, "I don't get the name."

Easy. First, it's a spin off of the phrase "Code Monkey". Take a gander over at the entry in the jargon file:

A self-deprecating way of denying responsibility for a management decision, or of complaining about having to live with such decisions. As in “Don't ask me why we need to write a compiler in COBOL, I'm just a code monkey.”


There's even a show called Code Monkeys, whose stunning animation brings back memories of playing far too many video games in the '80s.

I'm abusing the word codex a little here. It tends to be used mostly for manuscripts, frequently religious ones. However, as a word it has the handy property of dealing with books while being very close to code. Truth be told, this blog title would be better suited to a programmer who worked in a rare book and manuscript library. If someone fits the bill and complains, I'll let them have it if they can think of a better name for me ;).

I also seem to recall having seen codex used the way corpus is used in analysing text and writing. Of course, this could just be my faulty memory. I thought about using corpus monkey, but that's a little too close to corpse monkey, don't you think?

Monday, May 05, 2008

Cataloging with blinders on.

There's a topic that's been sitting on my mind for a long time now, and it probably deserves more than a single blog post. I'll take a stab at addressing the overall issue, though. A common saying in cataloging is that you catalog only the item at hand. That is, you're not expected to go running off to various sources or do in-depth research on each item. Many reasons are given for this, but one of the main ones boils down to practicality: most libraries receive too much material to give each item that treatment.

However, I believe this rule is becoming increasingly irrelevant. Searching and research are becoming dramatically cheaper in terms of man-hours with each passing year. Given certain seed information for a newer publication, such as ISBN, author, and title, it should be relatively easy to run automated searches against publisher ONIX servers, related webpages, Wikipedia entries, LibraryThing, Shelfari, and more, gathering information for the cataloger to use to enhance the record. At the very least, our systems should be smart enough to guess appropriate authority records, offer analysis of similar records, and suggest subject headings.
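As a minimal sketch of the seed-information idea, here's how software could turn an ISBN into candidate enrichment lookups for a cataloger. The Open Library ISBN endpoint is a real public JSON API; the LibraryThing ISBN page and the Wikipedia search URL are ordinary web pages included for illustration, not formal APIs:

```python
from urllib.parse import quote

def enrichment_sources(isbn, title=None):
    """Given seed information for a publication, list the places an
    automated lookup could query on the cataloger's behalf."""
    sources = {
        # Open Library serves book metadata as JSON, keyed by ISBN.
        "openlibrary": f"https://openlibrary.org/isbn/{isbn}.json",
        # LibraryThing's ISBN pages show how readers group and tag editions.
        "librarything": f"https://www.librarything.com/isbn/{isbn}",
    }
    if title:
        # Plain Wikipedia search, URL-encoded so spaces survive.
        sources["wikipedia"] = ("https://en.wikipedia.org/w/index.php?search="
                                + quote(title))
    return sources

for name, url in enrichment_sources("9780451210920", "Black House").items():
    print(name, url)
```

A cataloging client could fire these requests off in the background while the cataloger works, presenting anything found as suggestions rather than making the cataloger go hunting.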

A good example of this potential actually exists right now in the wealth of organized information for one particular medium: CDs. CDs themselves can carry their track information as CD-Text. Barring that, there are many ways the songs or the disc itself can be used to pull up information from online resources such as freedb, MusicBrainz, and Discogs. This really deserves a post of its own, which I may get to one of these days.
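As one concrete illustration, MusicBrainz exposes a public web service whose release search can match on the barcode printed on a CD case. A query URL can be built like this (the barcode below is a made-up example, and I'm only sketching the lookup, not a full client):

```python
from urllib.parse import urlencode

def musicbrainz_release_query(barcode):
    """Build a MusicBrainz web-service URL that searches releases by
    barcode; fmt=json asks for a JSON response instead of XML."""
    params = urlencode({"query": f"barcode:{barcode}", "fmt": "json"})
    return f"https://musicbrainz.org/ws/2/release/?{params}"

print(musicbrainz_release_query("075678265129"))
```

Fetch that URL and you get back candidate releases with titles, artists, and track listings, which is exactly the kind of data a cataloging workflow could pull in automatically.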


For an example involving printed books: not long ago, someone asked how to create a record for two books (The Talisman and Black House) that came as a set. She claimed that, as far as she knew, they were unrelated. It turns out they share the same central character (Jack Sawyer), have recurring characters, and were written by the same two authors. Had she checked LibraryThing, she would have found the books classed by readers as a series. Wikipedia even notes how the second book ties into Stephen King's Dark Tower series. There was no reason for her to even go to the browser; the act of entering the ISBNs should have pulled up some of this information.

So let's abandon "item in hand." It no longer takes a walk across several floors and possibly hours of effort to find useful information on a book. It could be as easy as doing what we're already doing, with slightly smarter software and a little more flexibility in our cataloging procedures.

Wednesday, September 19, 2007

Symposium on the Future of ILSs: Let's see numbers

I had the good fortune to attend the Symposium on the Future of Integrated Library Systems recently. It was an excellent, excellent conference; the Lincoln Trail folks should be proud of themselves. I have pages and pages of notes, but I figured I would do a couple of posts on what seemed to me to be points common among several of the talks. We'll start with the one that resonates most with me: "We need to see the evidence."

This point kept coming up over and over, although no one was obnoxious about it. We need more evidence to support our actions and decisions as we move forward. One part of this is that we simply need better information on our own costs and expenditures. Chip Nilges from OCLC mentioned the value of a link in one of his talks. Do you know how many people view an individual catalog record? Can you estimate how much that space is worth? This seems vague and fuzzy in the library world, but I'm not in administration; perhaps that's just the view from below.

Perhaps even more important is a gap in our knowledge about our own users. We see organizations like Google and Amazon rise because they focus heavily on the average user. They are constantly studying logs, creating and reading usability studies, and just talking with people. That's not to say librarians haven't done this in the past; I know there are some excellent papers out there from libraries and from some of the information retrieval folks. However, it seems that in our day-to-day planning we make wild guesses, ones that are frequently wrong.

It's difficult to get funding or budgets for usability studies. Some of this seems to be changing recently, but it's hard to tell whether that's a general trend or just a local one. I'd like to think some people have at least gotten used to me trying to figure out what our users are actually doing and have started to try to find better evidence for the changes they'd like to make, but I'm really not that important. More likely it's become clear to some who resisted things like this that our current ways just aren't working.

Now, I want to clarify something. The need for evidence shouldn't be a chilling factor. I've seen some people recently become overly critical of fledgling efforts, seemingly requiring usability studies and the like before a project even starts. That is a severe burden on someone just starting a cycle of development. Usability should come early, but you need experimentation as well. It shouldn't be something every researcher and experimenter needs to be an expert on, but something built into the overall process for research and development. Ideally there's a constant cycle of experimentation, feedback, development, and more feedback.

To clarify, this is one time when not having much data shouldn't be a sin, and it shouldn't be an excuse to kill a project before it starts. Yes, existing studies suggesting that users might like recommendations are a good sign. But it's madness not to move forward with at least examining, experimenting with, and researching the idea of recommendations just because there are no documented usability studies about how people like them. The foundations for the actual usability and user studies have to be allowed to be built.

So, in an attempt to stave off the book I could probably write about this, let me just conclude: user testing and user-oriented design are great. They should be much, much more involved at all levels of the library. They should be a recurring part of the feedback loops within the library. A healthy institution has a feedback loop between itself and the real world; it feels like a living, breathing, reacting thing. An unhealthy one seems like a machine shambling along blind, deaf, and oblivious to its surroundings. Keep working at incorporating actual information about patrons and your own people, and your library might just start to feel a little more alive, maybe even a little more human.