Cognitive Conga: a blog

Dancing the conceptual kerfuffle shuffle

Ratiocination, n. An instance of [reasoning]. Also: a conclusion arrived at by reasoning. Doubt the applicability of this at your peril leisure.

Archive for the ‘hacking’ Category

Rainstorms in cloud computing

Tuesday, February 17th, 2009

Losing your data isn't considered good practice or fun. Losing all your data, when it includes lots of other people's data, is surely much worse. When your business is in saving data for other people, losing it is dire indeed. The popular social bookmarking site Ma.gnolia was recently in just this predicament.

Why serving XHTML as application/xhtml+xml is cool

Sunday, February 1st, 2009

It means that if you accidentally break something in your draft, your browser (if compliant; most are, except Internet Explorer) will tell you:

  1. that you really did break something, and
  2. what that thing is.
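
As a rough illustration of that strictness, here is a minimal sketch that uses Python's standard-library XML parser as a stand-in for the browser. That substitution is an assumption: real browsers use their own parsers and show their own error pages. The sample markup and the function name check_well_formed are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical draft with a deliberate mistake: <em> is never closed.
BROKEN_DRAFT = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<p>Hello, <em>world</p>
</body>
</html>"""


def check_well_formed(markup):
    """Return None if markup parses as XML, else (message, line, column)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        line, column = err.position
        return (str(err), line, column)
    return None


if __name__ == "__main__":
    problem = check_well_formed(BROKEN_DRAFT)
    if problem is None:
        print("well-formed")
    else:
        message, line, column = problem
        print(f"broken at line {line}, column {column}: {message}")
```

A lenient HTML parser would quietly guess what was meant; an XML parser stops and points at the offending line, which is the behaviour described above.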

Equitable license

Saturday, January 31st, 2009

Not much to see at this point; more to come in future, time permitting!

Rolltop

Saturday, January 10th, 2009

Ever since my mother got a Thinkpad 240 to use for her work, most of a decade ago, I’ve been of the opinion that tiny laptops are really useful for their portability but irritating because of their tiny screens. The ideal thing would be something that was as handy to carry as a book, and with a keyboard just big enough to type on at a good speed (60wpm at least), but which also had a large enough screen that the user wasn’t constantly scrolling and/or suffering from eyestrain.

Some kind of multi-panel, unfolding screen, perhaps.

And all of a sudden, something like this is available on the mass market.

Seeing a real-world manifestation, after a couple of years of harbouring a utopian fantasy in which laptops unfurled giant screens, makes clear that current mass market computer engineering isn’t up to my vision. So I’ve just had a think for a minute about how a manufacturer might narrow the gap.

The answer’s pretty easy: use a resilient flexible display that rolls up around the laptop (like a firehose around its hub, but with fewer coils) and can be unrolled and stiffened with springs, struts or goosenecks for use. A tensegral approach to supporting the unrolled screen would probably give the best tradeoff between the twin goals of low weight and high rigidity.

But technology like this is probably still a few years off. In the meantime, the XO-2 will likely be the next best thing.

tar cannot stat

Saturday, January 3rd, 2009

Wtf? you might reasonably exclaim, unless you’re used to using *nix (Linux, UNIX, etc) from the command line.

tar is a command for packing a file or files into a single archive file (or for reversing the operation). It can also apply compression/decompression algorithms, and perform various other neat tricks. It’s useful and ubiquitous. Unfortunately, it’s also quite sensitive.

Just now, I was using tar -cfz mytarfile.tar.gz mydirectory to archive a directory, and this returned the error:

  tar: mytarfile.tar.gz: Cannot stat: No such file or directory
  tar: Error exit delayed from previous errors

Hmm, not good.

Easy to fix, though. I should have written tar -czf mytarfile.tar.gz mydirectory. In other words, the z and the f get swapped in the options. This is because the f option takes an argument: whatever follows it is read as the name of the archive file. So the first command I tried was looking for a file called mytarfile.tar.gz to add to an archive file called z. Since mytarfile.tar.gz didn’t exist, tar couldn’t locate, or stat, the file. Hence the error.
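
The way an option cluster gets carved up can be reproduced with Python's getopt module, which follows the same POSIX-style convention. This is only a model: the option string "cf:z" is a simplified stand-in for tar's real option table, and parse is a made-up helper name, not anything tar provides.

```python
import getopt

# "c" and "z" are plain flags; "f:" means f takes an argument (the archive name).
# This option string is a simplified stand-in for tar's real option table.
SHORT_OPTIONS = "cf:z"


def parse(command_line):
    """Split a tar-like command line into (options, remaining arguments)."""
    argv = command_line.split()[1:]  # drop the leading "tar"
    options, files = getopt.getopt(argv, SHORT_OPTIONS)
    return dict(options), files


if __name__ == "__main__":
    for cmd in ("tar -cfz mytarfile.tar.gz mydirectory",
                "tar -czf mytarfile.tar.gz mydirectory"):
        options, files = parse(cmd)
        print(cmd)
        print("  archive:", options["-f"])
        print("  files to add:", files)
```

With -cfz, the parser treats the rest of the cluster after f as f's argument, so the archive name comes out as z and mytarfile.tar.gz lands in the list of files to add. With -czf, f is last in the cluster, so it picks up the next word instead.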

Oh, what fun is to be had at the command line!

How to troubleshoot exim (or any other open-source software package)

Saturday, December 20th, 2008

Recently, I had some trouble getting email working the way I wanted it to, on a server running Debian Lenny. I’ve not really had to configure an MTA before (though I’d made a couple of half-hearted attempts on other occasions), so I wasn’t sure where to start. The route I ended up following to reach a solution illustrates the range of support that’s available for free software, especially popular free software like exim, the MTA I wanted to configure.

Free software, unless you buy or license a support package for it, usually comes unsupported. That’s not to say that there is no support available, just that it isn’t guaranteed. In practice, free software often has better support than some commercial software, and much of it is free of charge.

All software packages in a mature distribution like Debian will have some documentation included, in the form of man pages or other documentation (docs). Exim is no exception. However, man pages sometimes assume more knowledge than the user has, and are consequently not always sufficient for the novice.

In cases like this, especially when the software package is a mature one, it’s worth searching the web for tutorials, and for extra documentation, e.g. on the software package’s website. Exim has a lot of good docs on its website, but these still tended to assume a bit more than I knew. I needed something to introduce me to the fundamentals of email, especially the fundamentals of email on GNU/Linux, from a sysadmin’s perspective.

Enter the book. Most of the seriously popular free software packages going have had at least one decent book written about them. Very often, that book will have been published by O’Reilly, a publisher that has carved a niche for itself selling books that act as the missing manual for the software packages they are about. The Exim book isn’t an O’Reilly book, but it’s along similar lines, and it is mostly very well written. Fortunately, I was able to borrow it from a colleague, but like most software books, it’s readily available at online bookstores.

So the book gave me the background knowledge to make some sense of the docs, and the docs gave me a more detailed view of exim’s command and configuration files, but still I couldn’t get exim working the way I wanted. At this point, I had two further options, both free of charge: to send a mail to a mailing list, or to use IRC. Mailing lists are good for comprehensive discussions, but it can take a few hours (occasionally, days) to get a reply. IRC, on the other hand, happens synchronously, so other logged-on users will see your question almost immediately after you post it, and can respond straight away. I opted for IRC, and went to the #debian channel.

Within a few minutes, I was being helped by other users on the channel. Having already looked at the book and the docs reasonably well, I was able to understand fairly quickly why these other users proposed the troubleshooting techniques they did. And within an hour or so, all was working as I wanted it to. Bear in mind that the people helping me were all volunteers.

All in all, this was a far better experience than many I’ve had with proprietary software support requests, which can have similar end-to-end times (a day or two, start to finish), can be expensive, and can leave you with very little knowledge gain as a result. By contrast, I now have a passable understanding of the way an MTA operates and how to configure one, which may serve me well in the future, and I haven’t had to spend a dime. In fact, even if I had bought the Exim book, it might have still cost me less than the cost of a support incident with a typical old-school commercial software vendor, and I’d have a great reference book to boot!

It takes a while to acclimatise to the free software approach, but I’m glad I decided, a few years ago, to begin putting myself through the process. Hopefully, if you’ve read this far, what I’ve written might make your own efforts a little easier :)

Remotely booting an encrypted system

Saturday, December 6th, 2008

I’ve been trying to work out how to remotely boot an encrypted computer system – in particular, one running Debian GNU/Linux. It turns out that this is not trivial. The reason is that a Debian system with an encrypted system drive expects the user to be able to enter a password to decrypt the drive before it can finish booting up. That’s fine if your system is a laptop and you’ve got it in front of you, but it won’t work if your system is a server located hundreds of miles away, because even if you can remotely switch the server on (e.g. you have a remotely operable power strip, or a friend/contractor who will press the power switch for you), you will have to enter the password into the console before your SSH, HTTP or other server software will be able to run. They can’t run straight away because they, or at least their configuration files, are on the encrypted disk.

Assuming you really do want to keep your system drive encrypted, and you do need to be able to boot it remotely, there are a couple of solutions.

  1. Keep your /boot directory in a partition that is unencrypted, and install Dropbear (or another small SSH server) on it, configured to start up on boot in a way that will let you log in via SSH to enter your decryption password. This apparently works (see a tutorial with details for Debian and Ubuntu here), but is complex to configure, fragile (it can break during kernel upgrades) and will not show the remote user any BIOS or POST menus, since the /boot is accessed only after those have run.
  2. Use KVM-over-IP. This appears to be the standard solution among professional sysadmins. It seems that several server manufacturers (I’ve looked at the HP, Dell, Fujitsu-Siemens and Apple websites) now incorporate KVM-over-IP directly into their servers. Even some of the low-end servers have this built in. Alternatively, external KVM-over-IP boxes can be used if the server doesn’t have one built in. At the cheap end of the market (my end!), I’ve seen models for sale online by brands like Lindy, Belkin, Adder and Aten.

I’ve not tried any of these options, and I don’t know how good they are, but judging by forum posts, other people are interested in this problem too, so I figured I’d share what I’ve learned so far.

One thing that concerns me about both classes of solution is that there’s a remote login prompt sitting online, ripe for brute-forcing. I’d feel much happier deploying one of these systems if it had a rate-limiting option. I suppose that if I deployed it behind a firewall of some kind (e.g. a router-based firewall), I might be able to configure the firewall to limit login attempts; it would depend on the firewall, and on my abilities. Rate-limiting login attempts isn’t something I’ve yet done.
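
For what it's worth, the core of a rate limiter is small. Below is a minimal sketch of a sliding-window limiter keyed by client address. The class name, the limits (5 attempts per 60 seconds) and the idea of calling it from a login handler are all assumptions made for illustration; this is not a feature of Dropbear, of any KVM-over-IP box, or of any particular firewall.

```python
import time
from collections import defaultdict, deque


class LoginRateLimiter:
    """Allow at most max_attempts per window_seconds for each client address."""

    def __init__(self, max_attempts=5, window_seconds=60.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._attempts = defaultdict(deque)  # address -> timestamps of allowed attempts

    def allow(self, address):
        """Return True and record the attempt if allowed; return False otherwise."""
        now = self.clock()
        recent = self._attempts[address]
        # Forget attempts that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_attempts:
            return False
        recent.append(now)
        return True


if __name__ == "__main__":
    limiter = LoginRateLimiter(max_attempts=3, window_seconds=60.0)
    for attempt in range(1, 6):
        verdict = "allowed" if limiter.allow("203.0.113.7") else "refused"
        print(f"attempt {attempt}: {verdict}")
```

A brute-force attacker limited to a handful of guesses a minute needs a very long time to get through even a modest password space, which is the whole point. A real deployment would also need to cope with attackers who spread their guesses across many addresses.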

Earglasses*

Tuesday, June 10th, 2008

Mel Chua’s recent post about communication reminded me that I’m lagging on something I started many months ago: trying to find a filter chain in Wavelab that would let her hear things – music in particular – in a manner closer to that of someone with normal hearing.** A while ago she blogged about her auditory response being like that of a low-pass filter, and that made me think about what the most practical way to mitigate that response would be.

I don’t know which audio-processing algorithms hearing aids use, but it’s fairly clear that:

  • the user doesn’t have much ability to modify them, except perhaps in the case of highly expensive models.
  • they probably aren’t terribly powerful, because – judging by digital hearing aids’ power consumption – the processing power available to them is very small.
  • although amplification is one of the primary components of a hearing aid’s processing chain, amplification alone won’t combat most hearing difficulties (certainly not Mel’s).
  • they are optimised for speech, rather than music.

So I set about trying a different approach: CPU-intensive audio-processing using the best algorithms I could lay my hands on (i.e. Wavelab’s plugins), optimised for music. I set up an EQ stage at the end of the chain, modelled on the graph of Mel’s auditory response, and started applying filters before it in an attempt to make the end result sound as natural as possible. I had a hunch that multiband compression might be more effective for this than EQ – certainly more effective than EQ alone – and so it proved. Partly this is because multiple stages of EQ filtering can induce “ringing” (they become resonant – an unwanted side-effect). Although multi-band compressors include EQ filters to split the signal into bands, these filters didn’t seem to suffer from the same side effects, perhaps because the compression was attenuating any resonances that might have otherwise been present.
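
To make the idea of multiband compression a little more concrete, here is a minimal sketch of the gain arithmetic only. It is not Wavelab's algorithm and it does no filtering: it assumes the signal has already been split into bands and works on one level per band, in dB. The band names, thresholds, ratios and makeup gains are invented numbers, not the settings used in these experiments.

```python
def compress_level(level_db, threshold_db, ratio, makeup_db=0.0):
    """Static downward-compressor curve for one band, all values in dB."""
    if level_db > threshold_db:
        # Above the threshold, the overshoot is divided by the ratio.
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return level_db + makeup_db


# Hypothetical settings, one per band: (name, threshold dB, ratio, makeup gain dB).
BANDS = [
    ("low", -20.0, 2.0, 0.0),
    ("mid", -30.0, 3.0, 6.0),
    ("high", -40.0, 4.0, 18.0),
]


def process(band_levels_db):
    """Apply each band's own compressor settings to that band's input level."""
    return {
        name: compress_level(band_levels_db[name], threshold, ratio, makeup)
        for name, threshold, ratio, makeup in BANDS
    }


if __name__ == "__main__":
    levels = {"low": -10.0, "mid": -30.0, "high": -20.0}
    for name, out_db in process(levels).items():
        print(f"{name}: {levels[name]:.1f} dB in, {out_db:.1f} dB out")
```

The point of treating bands separately is that the high band can be squeezed hard and then lifted with makeup gain, so quiet high-frequency detail is raised a lot while loud peaks are raised much less, without touching the low band.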

Anyhow, a few months ago – long after my first experiment – I was blessed with a few minutes of Mel’s time, and we tried some of the filter chains, with The Decline as the test track (because it was the only CD I had to hand with a mix that I was familiar with). We were somewhat successful, but I haven’t had time to do much more with the algorithms since then.

I mentioned to Mel that I thought it would be cool to make portable Sharc DSP devices so people could carry their audio processors around with them, set up so that they could select and edit the algorithms. This isn’t very far-fetched. There are lots of battery-powered, pocket sized audio processors on the market at affordable prices (for instance, the Korg Pandora). I don’t know if they use Sharcs, but I do know that many price-breakthrough audio processors with pluggable algorithms that I’ve seen hit the market in recent years have used Sharcs, so they seemed like a good bet. (Mel knew the same company’s devices by a different name, Blackfin, and suggested those. Then we worked out we were talking about essentially the same thing :-] The Blackfin is a sibling product to the Sharc.)

And this is how I come to be writing a blog post called Earglasses. Sunglasses – filters for your eyes – are easy to get hold of, and not even very hard to make, but filters for your ears aren’t so straightforward. I doubt Mel and I can make earfilters possible by ourselves because we’re both too busy with other things, so this is where a community effort might come in handy. I’d love to see a bunch of people working to make earfilters – affordable, portable devices with an audio input and a headphone output and a ton of helpful, user-programmable algorithms running in between – a reality. So making this an open hardware, open software project seems like the way forward.

I’ve registered earfilter.org and will set up a wiki there shortly. It will be a place that people can post their filter chains (from Wavelab, Audacity, etc), links to useful plug-ins, suggestions for hardware architecture, and anything else that’s relevant.

* * *

*It turns out there actually is a company manufacturing what they call Earglasses, a kind of latter-day ear trumpet. Whaddya know. Their engineering reminds me of Big Ears, which I’ve known about ever since I took up the dubious habit of reading Canford Audio catalogues as a teenager. (Actually, I learned a lot from those catalogues – and from CA’s competitors’ catalogues – which included pinouts, specs, regulations, construction diagrams and all sorts of other nutritious information for enquiring minds.)

**Mel’s post was, of course, not just about hearing. I’ve focused on that aspect of it here because of the earfilter.org idea, which I wanted to let the world know about. Mel’s underlying point is about communication, and I couldn’t agree with her more: any barrier to communication or comprehension can be frustrating. One of the greatest joys, for me, of living in the information age, is that we’re better placed than any previous generation to reduce or eliminate those barriers. Another great joy is that there are so many people working passionately and enthusiastically to do just that.

Juicy TLDs not being eaten

Tuesday, June 10th, 2008

I often think of web services I’d like to be able to use. Often these services don’t exist (yet) or aren’t easy to find if they do. While trying to find these services, I ask myself what I would call the service if I had created it – or if I were to create it. The reasoning behind this is, of course, that if the service exists and has an obvious name, which I have guessed correctly, I will find it quickly.

Thinking along these lines yesterday, I realised that several obvious domains for these services could make good use of TLDs other than the usual .com, .org, .net, and so on. Specifically, they could have benefitted from .ly or .ng . So I looked into registering domains with these and discovered that in the first case it wouldn’t be affordable for me and in the second it wouldn’t be straightforward.

.ng is the Nigerian ccTLD, and although there exists a Nigeria Internet Registration Association with a form to help potential registrants register their “domains”, it in fact only allows the registration of subdomains below .com.ng, .edu.ng, .gov.ng, .net.ng and .org.ng . So even if I had the most interesting site in the world, I couldn’t do something cool like host it at http://interesti.ng .
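
The kind of name in question is easy to generate mechanically. The sketch below splits a word wherever it ends in one of a given list of TLDs; the function name domain_hacks is made up, and the code says nothing about whether a registry would actually accept any of the results.

```python
def domain_hacks(word, tlds):
    """Return 'name.tld' for every TLD that the word ends with."""
    word = word.lower()
    hacks = []
    for tld in tlds:
        tld = tld.lower().lstrip(".")
        # Skip empty TLDs, and require at least one character before the dot.
        if tld and word.endswith(tld) and len(word) > len(tld):
            hacks.append(f"{word[:-len(tld)]}.{tld}")
    return hacks


if __name__ == "__main__":
    for word in ("interesting", "quickly"):
        print(word, "->", domain_hacks(word, ["ng", "ly"]))
```

Given how many English words end in -ing, a second-level .ng registration would open up a very large pool of names like this.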

I think that’s a little crazy, because Nigeria could start making quite a healthy income from registrants who would be willing to pay for domains like that.

There’s another snag too: .ng is what’s known as a “closed” ccTLD, meaning that it’s supposed to only be used by organisations based in, or with a presence in, Nigeria. There has been high level criticism of the concept of “closed” ccTLDs for some time now, and I think much of it is valid. After all, what counts as a “presence in Nigeria” – or in any other country, for that matter? The registrar and hosting provider Web4Africa gives some guidance, and so do other sites, but it’s very vague. If I use a DNS server in Nigeria to host my domain, does that count as my having a physical presence there? I think it should, just as if I were renting an office there. But it’s not clear if it does. What is clear is that checking whether or not I have a physical presence in Nigeria is done manually. This means that Nigerian domain registration can’t happen quickly. That in turn means that there won’t be a Nigerian Go Daddy any time soon. Go Daddy is the largest domain registrar in the world by some margin, at the time of writing. It has built its business in large part, if I’m not mistaken, on its ability to perform automated domain registration. This is an opportunity that’s effectively denied to Nigerian registrars because of .ng’s closed status.

I should note at this point that I’m not a fan of all Go Daddy’s moral principles – here’s why – and for this reason, I avoid using Go Daddy (currently, I use Dreamhost and 123-reg for domain registration, but there are plenty of other good registrars about). But I do not believe that those ethics were necessary for the success of the business. What was necessary was a legal and technological infrastructure that permitted the automated registration of domains. This, and the ability to register whateveryoulike.ng, is all I am proposing herein that Nigeria should provide.

Another name I had in mind for a web service ended in .ly – the Libyan TLD. Here, the state of affairs is more promising, but still not quite ideal. There seems to be only one .ly registrar with a working web site in English: the intriguingly-named Libyan Spider Network. It seems that I could register whateverIwant.ly without too much trouble. The biggest snag is the price tag: $150 per year (for comparison, a .com typically costs $5-$15 per year). Clearly, what’s needed here is some competition. With a few more registrars in the marketplace, that price would likely fall to something a pauper like me could afford for a fledgling, unfunded web service.

Is there, you ask, a ray of sunshine in the ccTLD domain business? Well, yes. ccTLDs like .us, .uk, .jp, etc, are available through vast numbers of registrars. Competition keeps the prices low and the service reasonable (although there are opportunities to be fleeced if you’re foolish). But there are some great success stories from non-developed economies too. Tuvalu’s .tv ccTLD is wildly popular and widely available. Another island state with a thriving domain name business is São Tomé, whose .st TLD is modestly priced and available through a very efficient-looking site.

These cases highlight the shortcomings of Nigeria’s system. I don’t know why Nigeria doesn’t follow these other countries’ examples, and I wish it would.

How many humanities scholars does it take to change a professional paradigm?

Tuesday, May 13th, 2008

There are some tech-savvy humanities scholars, there are some who try to grok modern IT but don’t quite manage, and there are some who wish information technology had never progressed beyond the invention of the book (for the extremists, even the printing press was a step too far: bound manuscripts are the height of IT for these folks*). I recently had a conversation with an eminent Cambridge humanities professor who said to me, in the context of a longer conversation about information management, “It’s like when Windows [by which he meant Word] will run on Microsoft [by which he meant a PC running Windows] but won’t work on a Mac [by which he meant... who knows? Word and Windows will both run on Macs].”

This sort of comment bothers me for three reasons. One is that it is baldly nonsensical: one must interpret it – with little guidance except one’s own background knowledge and a few of the antagonist’s preceding slip-ups – in order to make sense of it. Another is that it shows a lack of concern about accuracy; a dangerous lack of concern, in fact, for someone who has responsibility for one of Cambridge’s extensive, unique, and breathtakingly expensive digital datasets (furthermore, the scholarly accuracy of the input to the system matters little if your data is being corrupted by both its storage and its delivery mechanisms, which it was being). The third reason, which is the one of greatest personal concern to me, is that somebody like this – and he really is a brilliant scholar – might not be able to use Interpreader. Without people like that using Interpreader, or something like it, the paradigm of keeping annotations private and interpretations informal will remain among at least some of the best (in a traditional sense) scholars. That is not what I want.

My usability hat just swivelled itself onto my head a little more firmly.

* I’ve nothing against bound manuscripts per se, but they are not an efficient means of mass-communication.