Posted: October 31st, 2011 | Author: Alex | Filed under: Useful Software | Comments Off
I get a lot of use out of my Pandora account, but I’ve never really liked their desktop client. Adobe AIR has always seemed too much like Flash to me – too bloated, too flaky. Pianobar is a great alternative to the native desktop client.
Pianobar is a command-line interface to Pandora; no GUI, just text. This means I can stick it in a detached screen or tmux session and it doesn’t take up space when I don’t need to look at it. This is a huge plus, especially when I’m working on my laptop and don’t have a lot of screen real estate to spare.
Even better, pianobar can be configured to call a script whenever certain events occur – when the track changes, for example. This is really cool because it opens up all kinds of possibilities for integrating Pandora playback with other applications and services.
I wrote a little script called Pianogrowl a while back (to be fair, I ported someone else’s Bash script and added a couple of extra features) that displays a Growl notification containing the album art, title, artist and album for the currently-playing track whenever the track changes. It also displays a notification if pianobar ever has connection or authentication problems. Being able to do that with Pandora tracks is pretty cool; I wish more applications had these kinds of hooks exposed out of the box.
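To give a feel for what such a hook looks like, here is a minimal Python sketch. It is not Pianogrowl itself. It assumes pianobar invokes the configured event command with the event name as the first argument and writes `key=value` lines (such as `artist`, `title` and `album`) to the script's standard input, as I understand its documentation to describe; the function names are made up for illustration, and the sketch just prints the text rather than talking to Growl.

```python
import sys


def parse_event_payload(lines):
    """Parse key=value lines into a dict; lines without '=' are ignored."""
    info = {}
    for line in lines:
        key, sep, value = line.rstrip("\n").partition("=")
        if sep:
            info[key] = value
    return info


def describe_event(event, info):
    """Return notification text for a track change, or None for other events."""
    if event == "songstart":  # assumed event name for "a new track began"
        return "{title} by {artist} ({album})".format(
            title=info.get("title", "?"),
            artist=info.get("artist", "?"),
            album=info.get("album", "?"),
        )
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    message = describe_event(sys.argv[1], parse_event_payload(sys.stdin))
    if message:
        # A real hook would hand this text to a notifier instead of printing it.
        print(message)
```

Keeping the parsing separate from the notification step makes it easy to swap in whatever notifier you happen to use.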
If you use Pandora a lot and spend a lot of time in the console, you might want to give pianobar a try.
Posted: October 16th, 2011 | Author: Alex | Filed under: Advice (Unsolicited) | Comments Off
Until about a week ago, I used password-less SSH key pairs. I would keep a private key on a machine and stick its corresponding public key in authorized_keys files for my accounts everywhere else so that I would be able to log into any machine from any other without using a password. I figured the only way something really bad could ever happen is if someone were to get ahold of one of my private keys – and what are the odds of that happening, right?
Turns out Murphy’s Law applies here too. Somehow, someone got ahold of one of my private keys last week. I won’t go into the gory details of how, mostly because I don’t know exactly how they did it. I spent a couple days last week frantically changing every password I have and regenerating all of my key pairs. I then nuked the offending machine from orbit.
To say the whole event was unsettling is the mother of all understatements. The fact of the matter is that I have no idea what they did, what they took or didn’t take, what files they accessed … it’s that lack of information that’s the most terrifying. I’m not a computer security expert by any means, but I’m not a complete layman; I took a lot of precautions on the server in question to make sure this wouldn’t happen, and it did anyway.
This whole event triggered a lot of research and soul-searching. Here are some thoughts/recommendations that have come out of that process.
One of the blogs I read on the subject said that password-less SSH keys are “like credit cards without PIN numbers”. The analogy is pretty appropriate, I think. Let my mistake serve as an example – just don’t use them. They just aren’t safe.
You should generate separate, strong key pairs for each machine you use. I used 1024-bit DSA keys before, but I’m using 4096-bit RSA key pairs everywhere now. 4096-bit keys have modest performance overheads relative to 2048-bit RSA keys (ssh-keygen’s default), and should still be really hard to compromise by brute force for the next couple of decades barring some major theoretical advance or access to an enormous botnet. Nothing’s foolproof, of course, but it’s a start.
The keys you generate should have distinct passwords. That is, keys’ passwords should be both distinct from your normal password for that machine and distinct from one another.
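As a rough sketch of the "one strong key pair per machine" advice, the Python helper below builds an `ssh-keygen` command line for a 4096-bit RSA key. The `-t`, `-b`, `-f` and `-C` options are standard `ssh-keygen` flags; the function name, the `id_rsa_<machine>` file naming and the comment text are just conventions I made up for the example. It deliberately leaves out `-N`, so `ssh-keygen` prompts for the key's password interactively.

```python
import shlex
from pathlib import Path


def keygen_command(machine, key_dir=None, bits=4096):
    """Build an ssh-keygen command for a per-machine RSA key pair."""
    if key_dir is None:
        key_dir = Path.home() / ".ssh"
    # Hypothetical naming scheme: one key file per machine label.
    key_path = Path(key_dir) / "id_rsa_{}".format(machine)
    return [
        "ssh-keygen",
        "-t", "rsa",          # key type
        "-b", str(bits),      # key length in bits
        "-f", str(key_path),  # where to write the private key
        "-C", "{} key".format(machine),  # comment stored in the public key
    ]


# Print the command instead of running it, so nothing is generated by accident.
print(shlex.join(keygen_command("laptop")))
```

To actually generate the key, pass the list to `subprocess.run` from a terminal and type a distinct password when prompted.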
Use ssh-agent to save yourself the headache of providing a password every time. The web is littered with tutorials on how to use ssh-agent, so I won’t talk about that here. These days, I start an ssh-agent process when I log in, and make sure it gets terminated when I log out. See this GitHub gist for the appropriate incantations to make this happen. On OS X, it’s even easier; ssh-agent has been nicely integrated into Keychain and launchd since Leopard. Really, there’s no excuse not to use ssh-agent.
Finally, minimize your attack surface. If a service can run in a separate user account with no privileges, it should. If a service must run as root, run it in a sandbox. If a port doesn’t absolutely have to be open, it shouldn’t be open. If a machine can still do its job while not being world-routable, make it non-world-routable.
Personally, I’m done running my own publicly-visible servers. Unless you’re in the business of running and securing servers (and have the know-how to keep them secure), I just don’t see how it’s worth the trouble.
Posted: October 6th, 2011 | Author: Alex | Filed under: Random | 1 Comment »
So today has been, to say the very least, eventful.
So much has already been said about Steve Jobs’ life and legacy that I feel like I might just be rehashing what others have already said more eloquently. The first computer I ever used was a Mac. I wrote my first program on a Mac. His work and the work of his company have affected me profoundly and his singular vision will remain in the DNA of the computing industry for decades to come, even if Apple were to vanish tomorrow. I wish I could have thanked him in person.
In much happier news, my friends Alex and KC welcomed their daughter into the world today! Congratulations to them both. Nadia could not have asked for better parents.
Posted: September 24th, 2011 | Author: Alex | Filed under: Advice (Unsolicited), School | 1 Comment »
It looks like it’s about time for school to start again. Inspired by Justine’s post, I’ve decided that it’s time for yet another set of unsolicited advice for new students. This is the start of my fifth (oh jeez) year as a graduate student, so I feel that I can share some things that I didn’t quite grok in the early part of my graduate career. I apologize if any of these pieces of advice are cliched or obvious. This is particularly geared toward students in the systems and networking sub-disciplines of computer science, since that’s what I know. YMMV.
I’ll start with the one that all first-year graduate students hear and most completely fail to act on: grades don’t matter as long as they’re good enough. By this I mean that, as long as you pass, your grade in a graduate-level course does not matter at all. Nobody will ever look at your grades in graduate coursework, for internships, jobs or otherwise.
This will be really hard for you to accept, because you have been in the business of performing well in classes your entire life. Resist the temptation to spend more time than absolutely necessary on coursework. Make every effort to make every course project you do relevant to your research or publishable in some way. Time spent on your research is time spent productively. Time spent on anything else is time you should be spending on research (or, heaven forbid, actually enjoying yourself outside of work).
Graduate school is an emotional rollercoaster. You will have really good weeks. Who’s-the-man, major-results-every-day, high-fives-all-around weeks. If you’re anything like me, you’ll also have weeks when you feel like you haven’t gotten anything done. This is completely normal. If it happens more than once or twice in a row, take some time to step back and reconsider what you’re doing or how you’re doing it.
Some of your papers will be rejected. Some of them will be rejected several times in a row. Some might never even see the light of day. This does not mean that you’re a failure as a graduate student or that your research is garbage. You probably aren’t and it’s probably not.
The thing that is hard to come to grips with coming out of college is that papers aren’t accepted or rejected based on some objective rubric. A great deal of the selection process is very unscientific. Program committees are made up of people, and everyone has their own opinions and biases. You might just have caught a reviewer on a bad day.
Treat every failed submission as a learning experience. Act on the legitimate complaints, ignore the inscrutable, bizarre and mean-spirited ones, and move on. Most importantly, don’t let it reflect on your opinions of yourself or your work. It doesn’t do you or anybody else any good. The only thing you can do is consider any constructive criticism and produce the highest quality work you are capable of producing. As long as you keep doing that, you’ll do fine.
Don’t be afraid to discard an idea you’ve been working on for a while or a piece of code that took you a long time to write if it’s clear that you’re going in the wrong direction. At the same time, don’t be too quick to abandon an idea if it doesn’t work out immediately.
Write down everything you try. If you run an experiment for a paper, write down how you ran it, when you ran it, and what the results were. In general, take good notes. They will save you a ton of time down the road.
There will be times during your career as a graduate student when you’ll ask yourself, “Why, oh why didn’t I just take that job at Large Software Company X out of college, with its hefty salary and reasonable hours?” The answer, hopefully, is that you wanted to gain a depth of understanding in a portion of your field and advance the state of the art. Eventually, probably when you start to see a tangible endpoint, you’ll feel like you’ve done that. Hang in there.
Posted: September 18th, 2011 | Author: Alex | Filed under: Random | Comments Off
What a convenient excuse for not having content this week!
As is usual for this time of year, I’m in the middle of a conference deadline push, so there isn’t going to be a whole lot of celebrating until that’s done, unfortunately. Thanks to everyone for your kind birthday well-wishes.
Posted: September 10th, 2011 | Author: Alex | Filed under: Computers | Comments Off
Sometimes when I’m bored or I’m losing focus at work, I’ll start doing what I call the “social network polling loop”:
- Repeat until several loops proceed without update:
- Check e-mail
- Refresh Facebook
- Refresh Google Plus
- Refresh Reddit
- Refresh Google Reader
- Load latest tweets on Twitter
- Check website stats on Google Analytics
It’s almost a reflex action, and it’s one that eats time. I’ve been trying to get myself out of the habit; it wastes time that I should be spending doing something productive. In addition to polling sites like these, I find that I spend far too much time every day looking at them when nothing has changed.
Thankfully, software can be a big help on both fronts here.
Eliminating the Need to Poll
There’s a lot to be said for just stopping yourself from polling websites in the first place. The knee-jerk reaction to this sort of approach is, “But what if I miss something?”
I’ve talked about RSS feeds here before; they’re a really good way to stay on top of changes to websites without polling them. Unfortunately, many social networking sites don’t make RSS feeds of their content available. I’ve basically given up checking Facebook regularly because of this; I’ll only look at it when the mobile app pings me or it sends me e-mail.
Other sites make RSS feeds available, but put them behind an authentication mechanism. Sadly, Google Reader still lacks support for authenticated RSS feeds. This is kind of a drag, since most major user-specific feeds are behind some sort of authentication these days.
My typical workaround for something like this is to build a wrapper around the protected RSS feed in Yahoo! Pipes. The wrapper performs the authentication and reads out the resulting RSS. After subscribing to the pipe’s private URL, I’ve got a feed that Google Reader will be able to read. The thing that’s great about Yahoo! Pipes is its ability to pass the feed through all manner of operators (filters, joins, and so on). This is great if you want to only get news on a particular topic from a site that only provides one “firehose” feed.
Changing the Access Method
Twitter is one of those services that lends itself quite well to polling; follow enough people and you’re guaranteed to be receiving at least one update every couple of minutes. They even make it easy to leave Twitter open and tab back in to load new tweets every few minutes “so that you don’t miss anything”.
Getting the Twitter RSS feed set up was pretty easy thanks to Steffen Grunwald’s status feed service; I was worried that I’d have to mess around with OAuth to make it work, but thankfully Steffen did the hard work for me.
Once the feed was up I found that there were just too many incoming tweets for me to get through, so I passed the feed through Yahoo! Pipes. Specifically, I filtered out any tweet that contains neither a link nor a question mark. This essentially creates a “tweets that are asking questions or sharing a link to something” feed, and those are the tweets I would least like to miss. I might expand this to include retweets at some point, but usually retweets include links anyway, so it works pretty well as-is.
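The filter rule is simple enough to sketch in a few lines of Python. This is not how Yahoo! Pipes does it; it is just the same keep/drop logic written out, and the function name and the URL pattern are my own assumptions about what counts as "a link".

```python
import re

# Assumption: anything that starts with http:// or https:// counts as a link.
LINK_PATTERN = re.compile(r"https?://\S+")


def is_worth_reading(tweet_text):
    """Keep a tweet if it shares a link or asks a question."""
    has_link = LINK_PATTERN.search(tweet_text) is not None
    has_question = "?" in tweet_text
    return has_link or has_question


tweets = [
    "Just had lunch.",
    "Anyone know a good tmux tutorial?",
    "New post: http://example.com/post",
]
for tweet in tweets:
    if is_worth_reading(tweet):
        print(tweet)
```

Of the three sample tweets, only the lunch update gets dropped.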
Upper-Bounding the Time Suck
When it comes to quashing this social network polling loop, the spirit is willing but the flesh is often weak. This is where enforcement comes in. In Chrome, I use the StayFocusd plugin to limit myself to 15 minutes of total social network/feed reader time between 8 AM and 8 PM. StayFocusd isn’t very feature-rich (it doesn’t support multiple block sets with different timings, for example), but it serves its purpose pretty well. Whenever I have to bypass the block, I can just open a window in Incognito mode or disable the plugin. Unfortunately it still takes a good deal of willpower to keep myself from abusing that ability.
E-Mail, The Time Waster Du Jour
The one piece of the polling loop that I haven’t managed to remove quite yet is e-mail. Usually I don’t poll my e-mail account, but I do get a lot of mail and have gotten myself into the habit of reading and/or responding to it pretty quickly after I receive it. I’m convinced that the frequent new e-mail notifications I keep getting are distracting, but the nature of my job and the way my co-workers and I typically use e-mail makes only checking my e-mail twice a day impractical. If any of you have strategies or experiences with this, I’d really like to hear them.
Posted: September 3rd, 2011 | Author: Alex | Filed under: Advice (Unsolicited), Computers | Comments Off
Last week, I talked about the bathtub curve and what it can tell you about bad hard drive reviews. I’m going to expand on that a little this week and talk about how replacing your drive doesn’t necessarily mean you’re solving the problem. Then we’ll briefly touch on another common source of consumer angst, hard drive sizes.
Correlated Failures
A common pattern in one-star hard drive reviews is the following:
First drive failed, sent it back. Replacement failed two weeks later. You computer people are all monsters. I’m going back to using a typewriter.
If you buy a drive from a company and it hits the wrong end of the bathtub curve, they will usually replace it. This is basically what hard drive warranties are for: they prevent customers on the wrong side of the bathtub from getting screwed over. Unfortunately, they will probably just pull the next hard drive box off the wall and send you that one. Those two drives probably arrived at their warehouse on the same shipping pallet, which probably means that they were manufactured and left the factory at approximately the same time. If there was an unnoticed defect in that particular production batch, you’re much more likely to see the same problem on the replacement that you had with the original.
Incidentally, this is why you should never buy multiple instances of the same drive at the same time if you’re planning on building a RAID array with them; correlated failures might come back and bite you in a big way.
Drive Sizes Lie to You
Stop me if you’ve heard this complaint before:
I bought a 500GB hard drive, but it’s only got 465.7GB of space! I want my 34.3GB back!
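I talked about this last year in the context of Wolfram Alpha. The short answer is that drive manufacturers advertise capacities in powers of ten (1 GB = 10^9 bytes), while operating systems have traditionally reported them in powers of two (1 GiB = 2^30 bytes) and still labeled the result “GB”. The drive really does hold 500 billion bytes; that just works out to about 465.7 of the larger binary units.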
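The arithmetic behind that complaint is quick to check. A small Python sketch (the function name is mine):

```python
def advertised_gb_to_gib(gigabytes):
    """Convert decimal gigabytes (10**9 bytes) to binary gibibytes (2**30 bytes)."""
    return gigabytes * 10**9 / 2**30


reported = advertised_gb_to_gib(500)
print(round(reported, 1))        # 465.7
print(round(500 - reported, 1))  # 34.3, the "missing" space from the review
```

No bytes have gone anywhere; the same number is just being divided by a bigger unit.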
Operating systems vendors seem to be converging on lying to their customers rather than confusing them; Apple’s Disk Utility, for example, gives capacities in powers of two and units in powers of ten (500GB when it’s really 500 GiB). In my opinion, this is like setting the value of pi to 3.2; not only does it mask the problem, it hides some of the fundamental truths underlying it.
Posted: August 27th, 2011 | Author: Alex | Filed under: Advice (Unsolicited), Computers | Comments Off
Last week, one of my external drives failed, and another indicated that it’s about to die by failing a read and causing my RAID volume to degrade. Neither of these failures was surprising; both drives were well outside of their warranty periods. The way these drives failed and the (sadly ongoing) quest to replace them has brought up a couple of things that I’ll talk about here.
Failed drives mean shopping for replacements. When it comes to external hard drives, we seem to be presented with a multitude of choices, none of which are good. Judging by reviews on NewEgg, external consumer-grade hard drives are some combination of:
- Unreliable
- Slow
- Feature-poor
- Plagued with awful customer support
I was surprised at how many of the one- and two-star reviews for hard drives on NewEgg (and virtually everywhere else that sells drives) display some of the same common misconceptions. It’s a sad indicator that as an industry, we still haven’t figured out how to make computers anything less than magical and inscrutable to the average consumer. I’m going to lay out a couple of those misconceptions in the next couple of posts. They’ve doubtlessly been rehashed elsewhere, but these are things that deserve repeating.
The Bathtub Curve
If you were to plot failure rate of hard drives versus time on a graph, the graph would probably look like the blue line in the graph below (thanks, Wikipedia!):

This blue line is what’s referred to in reliability engineering as a bathtub curve, because its shape is evocative of a bathtub. In plain English, the bathtub curve basically says:
- Things that are shipped with defects usually fail early.
- Things that work as designed still eventually wear out.
- In the middle, anything can happen, but failure is less likely.
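If you want to see the shape for yourself, one common way to model it is to add up three failure rates: one that decays over time (early defects), one that stays flat (random failures), and one that grows (wear-out). The Python sketch below does that with Weibull-style hazard terms. Every number in it is made up purely to produce the shape; none of it is real hard drive data.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate at time t > 0; falls over time if shape < 1, rises if shape > 1."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_years):
    """Illustrative failure rate at age t_years (must be > 0); parameters are invented."""
    infant_mortality = weibull_hazard(t_years, shape=0.5, scale=10.0)  # decays
    random_failures = 0.01                                            # flat
    wear_out = weibull_hazard(t_years, shape=5.0, scale=6.0)           # grows
    return infant_mortality + random_failures + wear_out


for age in (0.05, 2.0, 6.0):
    print(age, round(bathtub_hazard(age), 3))
```

The rate is high for a brand-new drive, bottoms out in the middle years, and climbs again as the drive wears out.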
Many one-star NewEgg reviews I came across were some variant of:
Drive fails after X days of use. What a piece of crap. I’m never buying from this company again.
These are people who have unfortunately hit the wrong end of the bathtub curve.
Why does this happen? Well, some of it has to do with manufacturing; with something this intricate there will inevitably be defects, regardless of how much quality assurance you put into it. Some of it might have to do with what happens to the drives during shipping. Sometimes there is actually a systemic defect in a particular model or production batch that goes undetected by quality assurance; this usually results in a class action lawsuit months or years down the road.
The best bet, as I’ve stated here several times in the past, is to never assume that the drive will last another day. I was shocked at the number of times I read a review like this:
Bought this drive and it died three days later. Now 50,000 photos of my cat Muffins are gone. I hate you, Seagate, and so does Muffins.
So you bought this drive, and copied your photos to it, and then … you deleted the originals?! I’ve said it before and I’ll say it again: if there’s only one copy, it is only a matter of time before you lose that data.
Next week: why the replacement for your failed drive is more likely to fail, and why hard drive manufacturers are lying to you.
Posted: August 19th, 2011 | Author: Alex | Filed under: Useful Software | 3 Comments »
OK, so I’m about 7 years late to the party on this one, but man oh man do I love Markdown.
I spend a lot of time dealing with text, but most of the time it’s text designed to be consumed by compilers and interpreters rather than people. When I write people-facing text, it’s almost always in LaTeX. In the process of dealing with these kinds of writing tasks, I’ve become really intolerant of WYSIWYG text editors. They’re just not precise enough.
Evernote’s a notorious culprit here. I tell it to bold a line, it bolds the next blank line too. I change the font, it gets changed back in weird places. Bulleted lists sometimes re-bullet or re-indent themselves in weird ways. It’s irritating. I’ve had similar problems with WordPress’ visual editor.
In short, I’m one of those weird people who doesn’t care what it looks like on-screen while I’m editing it as long as the finished product looks like I want it to look.
Markdown brings the kind of precision I’m used to in LaTeX to the realm of writing HTML. One thing that it loses is all the markup (hence the name, I suppose). For example:
**bold** produces bold text, *italicized* produces italicized text. Similarly terse, readable syntax for headers, links, images, and so on. All of the common stuff that you’re used to when writing HTML, without … well, the HTML.
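To make the mapping concrete, here is a toy Python sketch that turns just those two pieces of syntax into HTML. It is nowhere near a real Markdown implementation (no nesting rules, no escaping, no other syntax), and the function name is my own; it only shows how little the source text has to carry.

```python
import re


def tiny_markdown(text):
    """Convert **bold** and *italic* spans to HTML; everything else is left alone."""
    # Handle the double-asterisk form first so it isn't mistaken for two italics.
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    return text


print(tiny_markdown("**bold** and *italicized*"))
# <strong>bold</strong> and <em>italicized</em>
```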
Apparently Markdown got huge a few years ago, and obsessive programmers like me have integrated it into all sorts of things. There’s a Markdown plugin for WordPress (in which I’m editing this post). Of course, there’s a Markdown mode for Emacs. The one piece of my daily routine that lacks Markdown is Evernote, unfortunately. I think I might be able to get around that with a clever combination of evernote-mode and markdown-mode. If I can figure something out, I’ll post it here.
Posted: August 13th, 2011 | Author: Alex | Filed under: Computers | 2 Comments »
Ever since I started exploring the more advanced features of GNU Screen, I’ve been using it constantly. For years, I never really thought of it as much more than a way to keep a shell open when I switched machines. I was also turned off by its appropriation of ctrl-A for commands, since it overwrites “jump to beginning of line” in Bash and Emacs. You can still get at it, but the shortcut is ctrl-A + A, which never felt anything less than completely awkward.
Thankfully, Keaton sent me a copy of his screenrc which contains, among other things, an escape-sequence remapping command. I changed the control code to ctrl-O by adding escape ^Oo to my .screenrc, since I couldn’t find any command sequence that conflicted with it.
I’ve posted the rest of my .screenrc file as a GitHub Gist.
Screen’s Hidden Gems
I used to just use screen for attaching and detaching single terminal windows, if I needed to leave a process running in a shell for example. In the past couple of months, I’ve gotten more of a taste of its capabilities, which are numerous. Among its more awesome features:
- Multiple Terminal windows per session that can be attached and detached as a group.
- Horizontal and vertical screen splitting – just like iTerm and Terminator, but screen predates them both by years and allows you to attach and detach terminals from panes at will.
- Notification on activity and inactivity (that is, have screen notify you when a terminal starts or stops producing output).
I honestly wish that I’d started to use screen more heavily a long time ago.
Fixing Window Fragmentation
I typically have a lot of windows open in a screen session at a time. If you close and open windows a lot, you’ll find that the windows’ numbers start getting fragmented. Rather than having windows numbered 1,2,3,4, for example, you’ll have windows numbered 1,3,4,7. Maybe this pegs me as an obsessive, but that’s sort of irritating. Also, I’d like to have the windows sorted by name. Because, you know, I sort of have a thing for sorting.
There’s a Bash script that’s been floating around called screnum that renumbers windows within a session so that they’re sequential, but I could never really get it to work reliably; it had problems with windows that have the same name and was really slow. So I ported it to Python (because in my opinion, Bash scripts get gross once they get more complicated than short lists of straight-line commands), fixed the same-window-name bug, and dramatically sped up the window sorting routine.
I posted the script, screnum.py, to GitHub. If you’re interested in using it or have ideas for improving it, have at it.
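For the curious, the core of the renumbering problem can be sketched without touching screen at all. The Python below is not screnum.py; it is a minimal illustration that assumes windows are moved with a "renumber this window to N" operation that swaps the two windows if N is already taken (which is how I understand screen's `number` command to behave). The function names and the sample window list are invented for the example.

```python
def plan_renumbering(windows, start=0):
    """Return (from, to) moves that sort windows by name into sequential numbers.

    windows maps window number -> window name. Windows with the same name
    keep their relative order.
    """
    ordered = sorted(windows.items(), key=lambda item: (item[1], item[0]))
    position = {num: num for num in windows}  # original number -> current number
    occupant = {num: num for num in windows}  # current number -> original number
    moves = []
    for target, (orig, _name) in enumerate(ordered, start=start):
        current = position[orig]
        if current == target:
            continue
        moves.append((current, target))
        displaced = occupant.pop(target, None)
        del occupant[current]
        occupant[target] = orig
        position[orig] = target
        if displaced is not None:
            # The window that was sitting on the target slot swaps into ours.
            occupant[current] = displaced
            position[displaced] = current
    return moves


def apply_moves(windows, moves):
    """Simulate the moves with swap-if-occupied semantics; returns a new layout."""
    layout = dict(windows)
    for src, dst in moves:
        moving = layout.pop(src)
        if dst in layout:
            layout[src] = layout[dst]
        layout[dst] = moving
    return layout


session = {1: "mail", 3: "build", 4: "logs", 7: "build"}
moves = plan_renumbering(session, start=1)
print(moves)
print(apply_moves(session, moves))
```

Sorting on the pair (name, original number) is what keeps same-named windows from tripping over each other, and each planned move corresponds to one renumber command sent to the session.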