paper-pile: Simple Command-Line Paper Management

Posted: May 12th, 2012 | Author: | Filed under: Useful Software | No Comments »

I’ve struggled with research paper management for years. I gushed at length about Mendeley last year, but lately I’ve been having some of the same problems with Mendeley that I had with Evernote.

Its metadata lookup functionality, while convenient when it works, doesn’t seem to work that often for me anymore; this might be because I’m looking up newer papers or papers in an area that they don’t have a lot of metadata coverage for, who knows. Mendeley’s note-taking feature leaves a lot to be desired; you essentially have a Notepad window that you’re writing text down in. Too much formatting is distracting, I’ll admit, but I’d like to be able to at least bold and italicize things every once in a while. I also ran into the ceiling of Mendeley’s free space and had to start paying for extra storage (if you’ve read my markupserve post this is probably sounding familiar), which is irksome considering that I’ve got so much free Dropbox space available. Exporting BibTeX entries for individual papers in Mendeley is a lot more tedious than I think it should be. The social features of Mendeley would be interesting if any of my friends actually used them, but they don’t, so I don’t.

So last month, I sat down and figured out what I wanted – what I really wanted – out of a paper management system based on what I’d done with papers for the past five years. The list basically came out like this:

  1. I want to be able to take notes on a paper, both outside and inside of the PDF. It would be nice if I could do things like add images, formatting and links to notes occasionally.
  2. I want to be able to generate BibTeX for a single paper or a group of papers quickly.
  3. I want to synchronize everything across computers without having to think about it.

Everything else, I felt, was a secondary concern.

Taking notes inside a PDF was taken care of – the PDF standard has supported inline annotations like highlights and comments for a long time. Taking rich notes outside a PDF is also not that hard – I could do what I’ve been doing with markupserve (Markdown, redcarpet and emacs). Generating BibTeX without dealing with BibTeX’s quirkiness required a metadata format that would be easy to convert into BibTeX. Thankfully, YAML’s a pretty painless way to store information like this. As long as this whole thing stayed on the file system, I could keep it synchronized with Dropbox. I figured that search wasn’t that big of a deal; as long as the PDFs were searchable and everything else was plaintext, I could just search with Finder.
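
To make the metadata-to-BibTeX step concrete, here is a minimal Python sketch. It is not paper-pile's actual code: the field names (`type`, `authors`, `title`, `booktitle`, `year`) and the key `smith2012example` are made up for illustration, and the plain dict stands in for whatever a YAML loader would return for one paper's metadata file.

```python
def to_bibtex(key, meta):
    """Render one paper's metadata (a plain dict, as a YAML loader
    would return it) as a BibTeX entry. Field names are hypothetical."""
    fields = dict(meta)                      # don't mutate the caller's dict
    entry_type = fields.pop("type", "misc")
    # BibTeX wants authors joined with the literal word "and".
    authors = fields.pop("authors", [])
    if authors:
        fields["author"] = " and ".join(authors)
    lines = ["@%s{%s," % (entry_type, key)]
    for name in sorted(fields):
        lines.append("  %s = {%s}," % (name, fields[name]))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical metadata for one paper.
paper = {
    "type": "inproceedings",
    "authors": ["Jane Smith", "John Doe"],
    "title": "An Example Paper",
    "booktitle": "Proceedings of an Example Conference",
    "year": 2012,
}
print(to_bibtex("smith2012example", paper))
```

Keeping the authors as a list in the metadata and only joining them with "and" at output time is the kind of quirk that is easier to handle in a converter than to remember while typing.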

The only thing I had left to write was the thing that kept PDF, notes and metadata in one place and allowed me to manipulate them and generate BibTeX. I wrote paper-pile to do this. Once paper-pile was relatively stable, I wrote a quick-and-dirty Python script that loaded my Mendeley library into paper-pile by parsing Mendeley’s BibTeX dump. After paper-pile’s basic functionality was done, I wrote a simple web server that would display formatted notes using the same libraries I used for markupserve. From start to finish, I figure it took me a couple weeknights to get it all the way I wanted it; most of the complexity was getting Mendeley’s library to import properly.

It’s not perfect – for example, I’m pretty sure all hell is going to break loose when I generate BibTeX for a paper whose author’s name has an accented character – but it gets the job done. When I encounter a bug or a feature that I suddenly wish paper-pile had, I just add it in. If I want to get BibTeX for a bunch of papers at once, I just list the papers’ paper-pile keys and pipe the list through xargs. No impedance mismatches, no GUIs getting in the way, and I’m in no danger of running out of room for papers on Dropbox.
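
One possible way to defuse the accented-character problem is to rewrite accented letters as LaTeX escapes before emitting BibTeX. The sketch below is an assumption about how that could be done, not something paper-pile does; the function name `latex_escape_accents` is hypothetical and the accent table covers only a handful of marks.

```python
import unicodedata

# Combining marks mapped to LaTeX accent commands.
# Only a handful are covered; this table is illustrative, not complete.
ACCENTS = {
    "\u0301": "'",   # acute
    "\u0300": "`",   # grave
    "\u0308": '"',   # diaeresis / umlaut
    "\u0302": "^",   # circumflex
    "\u0303": "~",   # tilde
}


def latex_escape_accents(text):
    """Replace accented letters with LaTeX escapes such as {\\'e}."""
    # Compose first so each accented letter is a single character.
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        decomposed = unicodedata.normalize("NFD", ch)
        if len(decomposed) == 2 and decomposed[1] in ACCENTS:
            base, mark = decomposed
            out.append("{\\%s%s}" % (ACCENTS[mark], base))
        else:
            out.append(ch)   # plain characters and unknown accents pass through
    return "".join(out)


print(latex_escape_accents("José Müller"))
```

Characters whose accent is not in the table are left untouched, so the worst case is the same as doing nothing.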

If this sounds like something you’d want to use yourself, paper-pile is available on GitHub. I make no guarantees as to its performance or correctness (and you probably shouldn’t make the web UI world-visible), but I use it myself every day, and that has to count for something.


MarkupServe – A DIY Evernote Alternative

Posted: November 28th, 2011 | Author: | Filed under: Useful Software | 1 Comment »

I’ve gushed at length in the past about how much I love Evernote. I upgraded to Evernote Pro about half a year ago so that I could attach arbitrary file types to notes. After a while, though, a couple of things about my workflow with Evernote were starting to irk me.

First, I wasn’t even coming close to using Evernote Pro’s generous data cap. Second, I wasn’t sure that paying $5 a month so that I could attach a PDF to a note was worth it given how infrequently I was attaching PDFs to notes. Third, and maybe most critically, their WYSIWYG editor has always kind of bugged me; lists never really indent the right way, sometimes formatting leaks to the next paragraph in unpredictable ways, etc. It’s just really hard generally to get precision control over how your text looks in Evernote’s editor (at least on the Mac, I have no experience with the Windows editor). Its lack of Linux compatibility isn’t a problem for me anymore, but it might become a problem if I start working on Linux boxes again in the future.

I figured the “paying for space I don’t use” problem would be solved if I could find something that would sync with Dropbox. I have a couple GB of free space with them that is really underutilized and I would prefer using that space instead of paying for more underutilized space. For the “imprecise editing” problem, I decided that what I really wanted was the ability to write notes in a markup language like Markdown (which I also love and have gushed about here), do my writing in Emacs, and be able to view the notes in HTML easily.

I couldn’t find any pre-existing solution that I really liked, so I decided to roll my own.

Dropbox support wasn’t a problem; I just exported my notes from Evernote, used html2text to convert the exported notes into Markdown, and moved the folder containing the converted notes to my Dropbox folder, where it happily synchronized. Dropbox, you are awesome.

At first, I used Marked to render and view notes individually. Marked is a great app, but I wanted to be able to interact with rendered versions of my files more quickly. I had thought about writing up a read-only filesystem with FUSE that would do the rendering of Markdown to HTML transparently and just present a filesystem of HTML files, but that sounded like overkill even for me. I figured that writing up a simple web server with Bottle to display the rendered files would get me to a working solution much more quickly. After a couple hours of coding on Thanksgiving, I had MarkupServe working.

MarkupServe is pretty basic at this point. It’s given a directory tree containing markup files and presents that directory tree as a directory listing similar to the one httpd gives you if you don’t have an index page. It has simple keyword search (which just runs grep at the root of the directory tree and HTML-ifies the results) and renders notes to HTML on-the-fly when they’re clicked on. I made MarkupServe extensible enough that it should support more than just Markdown, in case others would find something like this useful but want to use other markup languages. It’s not the fastest or prettiest thing ever, but it works.
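
The search feature described above can be sketched roughly as follows. This is a pure-Python stand-in for shelling out to grep, not MarkupServe's real code; the function names and the URL layout (`/<relative path>`) are assumptions made for the example.

```python
import html
import os
from urllib.parse import quote


def search_notes(root, keyword):
    """Yield (relative_path, line_number, line) for each line containing keyword.
    A pure-Python stand-in for running grep at the root of the tree."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()                      # walk in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as f:
                    for number, line in enumerate(f, start=1):
                        if keyword in line:
                            yield os.path.relpath(path, root), number, line.rstrip("\n")
            except (UnicodeDecodeError, OSError):
                continue                     # skip binary or unreadable files


def results_to_html(results):
    """Turn search hits into an HTML list, escaping the matched text."""
    items = []
    for path, number, line in results:
        items.append('<li><a href="/%s">%s:%d</a> %s</li>' % (
            quote(path), html.escape(path), number, html.escape(line)))
    return "<ul>\n%s\n</ul>" % "\n".join(items)
```

Escaping the matched line matters here because notes routinely contain angle brackets and ampersands that would otherwise be interpreted as HTML in the results page.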

The last hurdle was making attaching files in Emacs comfortable. Evernote exports all attachments for a note note.html in a directory named note.resources, so I figured I’d stick to that convention. I made MarkupServe ignore directories that ended in .resources so that it wouldn’t clutter up the file listing. Then I wrote a couple little elisp functions that create a .resources directory for a note and “attach” a file by copying it into the appropriate .resources directory and inserting a Markdown-style link to the new copy. I’ve posted those functions in a GitHub gist if you’re interested in looking at those.
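
For readers who don't use Emacs, the attachment convention can be sketched in Python. This is an analog of the idea, not a translation of the elisp functions in the gist; `attach` and `visible_dirs` are hypothetical names.

```python
import os
import shutil


def attach(note_path, file_path):
    """Copy file_path into <note>.resources/ and return a Markdown link to the copy."""
    note_dir, note_name = os.path.split(note_path)
    stem = os.path.splitext(note_name)[0]           # note.md -> note
    resources = os.path.join(note_dir, stem + ".resources")
    os.makedirs(resources, exist_ok=True)
    copied = shutil.copy(file_path, resources)       # returns the new file's path
    name = os.path.basename(copied)
    # Link relative to the note so the tree can move (or sync) as a unit.
    return "[%s](%s.resources/%s)" % (name, stem, name)


def visible_dirs(dirnames):
    """Filter out attachment directories so they don't clutter a listing."""
    return [d for d in dirnames if not d.endswith(".resources")]
```

Because the link is relative to the note, the note and its attachments stay valid wherever the Dropbox folder happens to live on a given machine.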

One big piece that this system is missing is a mobile app. Evernote’s iPhone app is terrific, and I’m going to miss it. At this point, my solution is to use Elements to edit notes and add photos taken with my phone’s camera to notes manually if the need arises. It’s sort of awkward, but I used the image attachment feature so infrequently in Evernote that I’m not really concerned about it. The system also lacks Evernote’s slick image OCR capability, but that was another feature I never really used (my handwriting’s pretty awful and the OCR could never really parse it well).

I’m sure I’ll tweak this setup considerably as I get more experience with it, but it was surprisingly quick to throw together and has proven really useful so far. Hopefully open-sourcing this stuff will help any other people who might have a similar itch to scratch.


tmux: the new hotness for terminal management

Posted: November 3rd, 2011 | Author: | Filed under: Useful Software | Comments Off

Some time ago I talked about using GNU Screen to effectively manage a bunch of terminal windows. Turns out screen has some serious competition: tmux.

One thing that’s pretty cool about tmux is the way it handles windows. tmux’s model is purely client-server: a single tmux server process owns every session and window, and each terminal you attach is a client of that server. Because windows belong to the server rather than to any one terminal, you can do things like move a window between sessions or link the same window into multiple sessions. I haven’t run into a situation where this was useful yet, but it’s nice to know that it’s possible.

The thing that I have found useful is the fact that tmux is reasonably scriptable. In order to get the list of windows in a screen session, I had to create a dummy window, stuff the real window list into it, dump the contents of the dummy window to a file and parse that file. With tmux, I call

tmux list-windows

from any window in the session, or

tmux list-windows -t <session name>

outside the session, and parse the output. Not only that, but tmux displays none of the weird “I can only make calls that change a session once per second” problems that I’ve been seeing in practice with screen.
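
As a rough illustration of the "call it and parse the output" step, here is a Python sketch. It assumes a tmux recent enough to support the `-F` format option on `list-windows`; the tab-separated format string and the function names are choices made for this example, not what tmux-screnum necessarily does.

```python
import subprocess

# One window per line: "<index><TAB><name>". -F is tmux's format option.
FORMAT = "#{window_index}\t#{window_name}"


def parse_windows(output):
    """Parse output produced with FORMAT into (index, name) tuples."""
    windows = []
    for line in output.splitlines():
        if not line:
            continue
        index, name = line.split("\t", 1)    # names may contain spaces
        windows.append((int(index), name))
    return windows


def list_windows(session=None):
    """Ask tmux for the window list; session=None means the current session."""
    cmd = ["tmux", "list-windows", "-F", FORMAT]
    if session is not None:
        cmd += ["-t", session]
    return parse_windows(subprocess.check_output(cmd, text=True))
```

Asking tmux for an explicit format keeps the parser from depending on the default human-readable layout, which includes extra details such as pane counts and sizes.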

Like screen, tmux won’t auto-sort windows by name or make their numbering contiguous. My py-screnum tool for sorting and compacting Screen windows takes several seconds to re-arrange windows and is 95 lines of Python (according to sloccount). The analog for tmux, tmux-screnum, is only 57 lines, and re-arranges windows instantly. I won’t claim that either of these programs is minimal, but 40% fewer lines of code and faster is a winning combination.
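
The sort-and-compact idea can be sketched as a small planning function. This is one simple way to do it, not the algorithm tmux-screnum actually uses: it assumes windows should end up numbered from 0, and it deliberately uses twice as many moves as strictly necessary so that no move ever lands on an occupied index.

```python
def plan_moves(windows):
    """Given (index, name) pairs, return (src, dst) moves that leave the
    windows sorted by name and numbered 0..n-1 with no gaps."""
    if not windows:
        return []
    ordered = sorted(windows, key=lambda w: (w[1], w[0]))
    # Phase 1: park every window above the highest index in use, so that
    # phase 2 never moves a window onto an occupied slot.
    base = max(index for index, _ in windows) + 1
    parked = [(index, base + i) for i, (index, _) in enumerate(ordered)]
    # Phase 2: bring them back down in sorted order.
    final = [(base + i, i) for i in range(len(ordered))]
    return parked + final


def as_commands(moves, session):
    """Format the moves as tmux move-window invocations."""
    return ["tmux move-window -s %s:%d -t %s:%d" % (session, s, session, d)
            for s, d in moves]


for command in as_commands(plan_moves([(0, "zsh"), (3, "emacs"), (7, "mail")]), "work"):
    print(command)
```

Separating the plan from the commands makes the renumbering logic easy to check without a running tmux server.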

There are times when I still have to use one or the other; many non-BSD machines come with screen installed by default but not tmux. I’m starting to prefer tmux whenever I have a choice, though.


Pianobar – Command-Line Pandora Client

Posted: October 31st, 2011 | Author: | Filed under: Useful Software | Comments Off

I get a lot of use out of my Pandora account, but I’ve never really liked their desktop client. Adobe AIR has always seemed too much like Flash to me – too bloated, too flaky. Pianobar is a great alternative to the native desktop client.

Pianobar is a command-line interface to Pandora; no GUI, just text. This means I can stick it in a detached screen or tmux session and it doesn’t take up space when I don’t need to look at it. This is a huge plus, especially when I’m working on my laptop and don’t have a lot of screen real estate to spare.

Even better, pianobar can be configured to call a script whenever certain events occur – when the track changes, for example. This is really cool because it opens up all kinds of possibilities for integrating Pandora playback with other applications and services.

I wrote a little script called Pianogrowl a while back (to be fair, I ported someone else’s Bash script and added a couple extra features) that displays a Growl notification containing the album art, title, artist and album for the currently-playing track whenever the track changes. It also displays a notification if pianobar ever has connection or authentication problems. Being able to do that with Pandora tracks is pretty cool; I wish more applications had these kinds of hooks exposed out of the box.
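
A skeleton for this kind of event script might look like the Python below. It is a sketch, not Pianogrowl itself, and it rests on assumptions about pianobar's event interface: that the event name arrives as the first command-line argument, that track details arrive on standard input as `key=value` lines, and that the keys include `artist`, `title` and `album`. Printing stands in for the actual Growl call.

```python
import sys


def parse_event(stream):
    """Read key=value lines into a dict; values may themselves contain '='."""
    info = {}
    for line in stream:
        key, sep, value = line.rstrip("\n").partition("=")
        if sep:
            info[key] = value
    return info


def notification_text(event, info):
    """Build a (title, body) pair for a notifier, or None to stay quiet."""
    if event == "songstart":
        return (info.get("title", "Unknown title"),
                "%s - %s" % (info.get("artist", "?"), info.get("album", "?")))
    return None


# Only act when an event name was passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    message = notification_text(sys.argv[1], parse_event(sys.stdin))
    if message:
        print("%s\n%s" % message)   # a real script would hand this to a notifier
```

Keeping the parsing and the message-building in separate functions makes it straightforward to add handlers for other events, such as connection or login failures.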

If you use Pandora a lot and spend a lot of time in the console, you might want to give pianobar a try.


The Joys of Markdown

Posted: August 19th, 2011 | Author: | Filed under: Useful Software | 3 Comments »

OK, so I’m about 7 years late to the party on this one, but man oh man do I love Markdown.

I spend a lot of time dealing with text, but most of the time it’s text designed to be consumed by compilers and interpreters rather than people. When I write people-facing text, it’s almost always in LaTeX. In the process of dealing with these kinds of writing tasks, I’ve become really intolerant of WYSIWYG text editors. They’re just not precise enough.

Evernote’s a notorious culprit here. I tell it to bold a line, it bolds the next blank line too. I change the font, it gets changed back in weird places. Bulleted lists sometimes re-bullet or re-indent themselves in weird ways. It’s irritating. I’ve had similar problems with WordPress’ visual editor.

In short, I’m one of those weird people who doesn’t care what it looks like on-screen while I’m editing it as long as the finished product looks like I want it to look.

Markdown brings the kind of precision I’m used to in LaTeX to the realm of writing HTML. One thing that it loses is all the markup (hence the name, I suppose). For example:

**bold** produces bold text, *italicized* produces italicized text. Similarly terse, readable syntax for headers, links, images, and so on. All of the common stuff that you’re used to when writing HTML, without … well, the HTML.
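
To show how little machinery the two examples above need, here is a toy Python converter that handles only `**bold**` and `*italic*`. It is a sketch for illustration, not a Markdown implementation; real Markdown has many more rules (escaping, nesting, underscores and so on), and `render_inline` is a made-up name.

```python
import re


def render_inline(text):
    """Convert **bold** and *italic* spans to HTML (toy code, two rules only)."""
    # Bold must be handled first, otherwise the italic rule eats its asterisks.
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    return text


print(render_inline("**bold** and *italicized*"))
# prints: <strong>bold</strong> and <em>italicized</em>
```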

Apparently Markdown got huge a few years ago, and obsessive programmers like me have integrated it into all sorts of things. There’s a Markdown plugin for WordPress (in which I’m editing this post). Of course, there’s a Markdown mode for Emacs. The one piece of my daily routine that lacks Markdown is Evernote, unfortunately. I think I might be able to get around that with a clever combination of evernote-mode and markdown-mode. If I can figure something out, I’ll post it here.


Software I Use Daily: Evernote

Posted: June 21st, 2011 | Author: | Filed under: Useful Software | Comments Off

Few tools have proven more useful in my day-to-day life than Evernote has. Evernote’s design is pretty simple; you can make notes that can include pictures, sound or documents as attachments, search through your notes, bundle them up into folders or tag them. Notes get synchronized between any device that runs an Evernote client. Any image that gets included in a note also gets passed through OCR so that any words that appear in the image are indexed and searchable as well.

I probably refer to or write a note in Evernote at least half a dozen times per day. Every time I feel like I’m going to need to look up a piece of information more than once, it goes into Evernote. If I’m trying to figure something out, any information I find on that topic goes into Evernote. As I refine my understanding about something, I’ll turn that raw info dump into something more compact and digestible. The ability to jump back to previous versions of notes is really helpful for this refining phase.

What I use Evernote for most is daily research logs. I really wish that I had started writing daily research logs years ago. They’ve been immensely helpful for organizing my thoughts and preventing me from doing redundant work. They also provide an easy-to-read archive of the work I’ve been doing, which makes preparing for status report meetings a lot easier.

I’ve been really pleased with Evernote overall, especially since they gave their mobile app a much-needed UI redesign. If you’re looking for a place to dump all the stuff that won’t fit in your brain, Evernote is definitely worth a look.


Airfoil, the Whole-House Music Streaming Killer App

Posted: May 31st, 2011 | Author: | Filed under: Useful Software | Comments Off

iTunes has had the ability to stream music to a remote set of speakers for a while now. At first it was just devices like the Airport Express, but now they’ve struck agreements with a bunch of different companies that make speaker docks and A/V units so that iTunes can stream to them too. Being able to play music out of a remote pair of speakers is great, but there are a few key limitations to AirPlay that have always left me less than sold on the idea.

First, AirPlay is all about iTunes streaming to something. Let’s face it, the only reason most people use iTunes is because it’s the application for synchronizing music to iPods. It works fine as long as your music library isn’t enormous, but it’s far from the most feature-rich music player out there and its support for things like Internet radio hasn’t improved much in 7 years. If you want to stream Pandora or Last.fm to remote speakers, AirPlay just doesn’t fit the bill.

Second, the device set that recognizes AirPlay is still fairly limited and decidedly non-free. If you want an Airport Express, you’ll be paying $100 for what is essentially a wireless access point with an audio out jack. This has always rubbed me the wrong way. I already have a computer with an audio out jack and a network connection, why can’t I just stream to that? For that matter, why can’t my computers stream audio to each other?

Enter AirFoil.

Attach AirFoil to an application (pretty much any application), and it captures that application’s audio and sends it to one or more sets of speakers. AirFoil can stream to any AirPlay device as well as anything running the companion AirFoil Speakers application. The AirFoil server application runs on OS X or Windows, and there are versions of AirFoil Speakers for OS X, Windows, Linux and iOS (meaning it works on iPhones, iPads and iPod Touches too). The AirFoil server keeps all the audio streams to the various speakers magically in sync. AirFoil uses Bonjour service notification messages to find and advertise speakers, so AirFoil can see any speakers on the local network without the need for configuring anything.

When I want to listen to music from my iTunes library in my living room, I just fire up AirFoil on my desktop in the bedroom and stream through my media center PC in the living room. If I want an additional set of speakers in the kitchen (because hey, why not?) I can hook a pair of speakers to my phone and run AirFoil Speakers on it. This is something that would have cost me hundreds of dollars in additional, purpose-bought hardware to do without software like this. I’m extremely impressed by AirFoil; if any of this sounds remotely intriguing to you, I’d really recommend giving it a try.


App of the Moment: Bowtie

Posted: May 4th, 2011 | Author: | Filed under: Useful Software | Comments Off

I synchronize my music to my iPhone and carry it around with me. I listen to it in the car, at work, in the grocery store, while doing laundry … it probably gets a good 6-8 hours a day of use. At work, I’m faced with what I thought was a very obscure problem. I want to be able to control music playback on my iPhone from my computer.

It’s not that my phone isn’t sitting next to me on my desk when I’m working. I could just reach over, double-tap the home button and press the screen to switch tracks. That requires my hands leaving the keyboard, though, and reaching over to switch tracks has probably cost me hours of accrued time debt over the years (sort of like how they say that you spend entire days of your life in total tying your shoes).

It’s irritating enough that I can’t share the music library on my phone with the local network; that would solve the problem right there. Unfortunately, Apple hasn’t seen a reason to implement that feature. I could also synchronize my iTunes library between my desktop and laptop, but unfortunately Apple hasn’t made that automatic and painless enough yet, and I’ve tried various other techniques (rsync, third-party apps, hosting the library on networked storage, you name it) without success.

For the longest time, I figured that I’d just have to deal with it (horrible, I know). Yesterday, the blog One Thing Well pointed me at an application called Bowtie.

The desktop version of Bowtie gives you basic control of iTunes (play/pause, next and previous tracks) with keyboard shortcuts and will show you the currently-playing track in a customizable little desktop widget. That isn’t unique by itself; there are dozens of iTunes remote control apps of various maturity and feature-richness out there, and they’ve been around for years. Where Bowtie distinguishes itself is in the $0.99 companion app for the iPhone. Pair the iPhone application and the desktop application together, and you can control the iPhone’s music playback using the same keyboard shortcuts you use to control iTunes.

Pairing requires that the phone and the computer can see each other on the network (I’m not sure of the implementation details, but it probably relies on Bonjour). I use a wired connection at my desk (because the building’s wireless network is flaky), so I set up network sharing and connected my iPhone to the shared network and that has worked flawlessly so far.

Overall I’ve been really happy with it. If you’re running into the same first-world problem that I am, it’s more than worth the $0.99.


Software I Use Daily: Mendeley

Posted: April 17th, 2011 | Author: | Filed under: Useful Software | 1 Comment »

Almost two and a half years ago, I wrote a post here about my efforts to transfer my piles of research papers into digital form. At the time, I was running a combination of Referencer and Beagle, using Referencer to keep things organized and Beagle to make it all searchable. Unfortunately, this solution didn’t work out as well as I’d hoped. The main reason for this was the problem of manual cross-platform synchronization for both the papers themselves and all the various metadata associated with them. I didn’t want to waste time figuring out how best to keep everything synchronized between my desktop and laptop (one of the reasons I use my laptop exclusively for day-to-day development now), so I lived with that solution for about a year.

At some point in late 2009, I was introduced to Mendeley by some friends of mine in the CSE department. It’s like they took my wish list for a paper management system and implemented it. It’s fantastic. Here’s why:

Written by researchers, for researchers. It’s very clear that this application was written by people who have had to deal with academic papers a great deal. It strategically attacks so many pain points associated with dealing with large paper volumes that I can’t help but think that the entire design process was guided by researchers that were fed up with the current state of affairs.

It’s cross-platform. It works for OS X, Windows and Linux. And when I say “works”, that doesn’t mean “the Linux version barely works and the OS X version has a wonky UI”, which is true of a lot of cross-platform software in my experience.

Flexible organization. Like organizing with folders? It’s got folders. Like using tags instead? It’s got tags. Want to use both? Sure, go crazy.

Free, effortless synchronization. You can synchronize up to 500 MB of papers (both metadata and data) to Mendeley’s servers for free. For $5/month, that increases to 3.5 GB. I’m currently at 365 MB and I’m storing 520 papers, so those 500 MB of space will go a long way. In my experience, synchronization between Mendeley instances “just works”, even across platforms.

Embedded notes and annotations. This is the killer feature for me. There’s nothing too complex here; just highlights, the ability to stick a note at a point in the text, and a dedicated notes region per-paper with basic formatting. The key here is that those notes synchronize across platforms and are actually readable everywhere.

It’s social (groan). It seems that in the new bubble, every piece of software you write has to have some social aspect to it. Thankfully, Mendeley manages to do this in a reasonably well-scoped and tasteful way. Your papers’ bibliographic information is sent to Mendeley, and they use that information to better recommend metadata for new papers that other users import. You can also share papers with other Mendeley users through “shared collections” (limited to 10 people in the free version), which is really useful for study groups and research teams that have to refer to the same pieces of literature. You can also track how many people are reading papers that you wrote and stroke your inner narcissist.

Mendeley is one of those applications I wish I had known about years ago. If you’re looking for a solution to your paper management problem, I encourage you to give it a shot.