Fun with Self-Joins

In the course of doing a data migration yesterday, I was presented with an interesting challenge: I had a junction table creating a many-to-many relationship between people and counties. I needed to create a similar table, but with an extra column: an integer value, starting at zero, which ordered the counties for each person. E.g., for the (person, county) values { (1,1), (1,2), (2,1), (3,1) }, the resulting (person, county, position) values would need to be { (1,1,0), (1,2,1), (2,1,0), (3,1,0) } or { (1,1,1), (1,2,0), (2,1,0), (3,1,0) }. The straightforward approach, in my opinion, is procedural; select the distinct person values from the source table, then do queries for that person’s counties and just increment a variable to get the position value. Unfortunately, this would also be slow, and require more lines of SQL code, which means more possibility for error. So, I wondered if there was a non-procedural way to do it. After a few minutes of thought, I came up with the following:


SELECT j1.person, j1.county, COUNT(*)-1
  FROM junction_table j1
  INNER JOIN junction_table j2
    ON j1.person = j2.person
     AND j2.county < = j1.county   GROUP BY j1.person, j1.county;

It seems to have worked. 🙂

Remove Non-viewable Menu Items in Drupal

Out of the box, Drupal has no ability to restrict viewing of site content, beyond a blanket “this role can/cannot access site content”. Unsurprisingly, then, the menu system does not ensure that users can actually view a node that a custom menu item refers to before placing it in the menu. If you’re using an add-on access control module like Taxonomy Access Control Lite or Category Access Control Lite (included in the Category package), this will have the undesirable effect of leaving menu items visible for content the user has no access to, creating a usability and minor security problem.

This module goes through the {menu} table, finds nodes which are inaccessible to the user, and adds them to the navigation menu with the access attribute set to FALSE, preventing them from appearing. I’ll hopefully be putting this up on http://drupal.org soon, but for now, you can get it here. Not available in stores! Download today!

remove_nonviewable_items-5.x-1.0.tar.gz

Make Drupal act like a Content Management System

Before you say, “Drupal is a content management system!”, let me agree with you. It certainly does manage content, in roughly the same way that my “to do”/”to file” piles manage my paperwork. However, it lacks many features that I would consider critical for a CMS for a non-trivial website. Most fundamentally, it has no sense of hierarchy. Documents you create are just node/number. Files you upload all go in one directory (probably “files”). Any hierarchy in your navigation menus or URL aliases (which you will have to create manually!) exists only in your mind, not in Drupal. (To be fair, if you go to a document which has a menu item, it does highlight that as the “active” menu item.)

Out of the box, Drupal does come with support for categorization of documents, feed aggregation, and syndication. The default front page is a list of the most recently posted items. And, the two content types you can create are “Pages” and “Stories”. By now, this is probably starting to sound familiar: at its core, Drupal is a blogging engine…and not much more.

However, if you’re stuck using this product to create a non-blog website, it is possible to make it behave more like a fully-featured CMS through use of a few non-core modules. Specifically:

  1. Download and install these two modules.
  2. From Site Building→Modules, turn on “Category Menu”, “Category Pathauto”, and all dependencies
  3. From Site Building→Modules, turn off “Taxonomy” (Category replaces it)
  4. Under Site Configuration→Pathauto→Node Path Settings, set the Container and Category paths to “[categorypath]”
  5. If you haven’t already, under Site Configuration→Clean URLs, enable Clean URLs.

Now, create your content as type “Container”. (“Category” will also work, if you have already categorized pages that you’d like to show up as parts of the category page.) Two caveats:

  • When you’re setting the menu options for these items, do it under “Container information”, not menu settings. The latter will be overridden by the former.
  • This doesn’t solve the problem of a lack of hierarchy for uploaded files. Maybe there is a way to solve this; I haven’t really looked.

One other tip: the default input formats on a Drupal site leave something to be desired. If you want to be able to put in raw HTML, do this:

  1. Go to Site Configuration→Input Formats→Add Input Format
  2. Create a new format called Raw HTML with no filters selected
  3. Optionally, make it the default.

“Interesting” Linux Applications

In a small “introduction to Linux” class I was helping out with recently, I did a short demo of how Wireshark can show you what happens when you load a webpage (and, also, what gets broadcast on wireless networks 🙂 ). (No, it’s not a Linux-only program, but it is a mature, easy-to-use open-source application that runs on Linux and is in most software repositories.) I was asked for examples of other interesting (presumably, to broad swaths of geeks) applications, so here are a few:

I Use It And Like It

Wireshark
The continuation of Ethereal. Analyzes traffic going through your network interfaces.
Audacity
Easy-to-use audio-editing tool.
Totem (with Xine backend)
Totem is the GNOME movie player. It integrates nicely with the GNOME environment, but at least as of a year or so ago, you were better off using the version compiled using the Xine libraries because of broader media support. If you use Fedora, you may wish to add Livna to your repositories, as it includes totem-xine. It would also be worth checking out Stanton Finley’s Fedora Core 5 Release Notes for information on how to extend the number of codecs supported.
OpenOffice
Cross-platform open-source office productivity suite.
Firefox
Open-source, highly-extensible web browser. (I posted recently about a couple of my favorite extensions.)
Thunderbird
Open-source email client.

I Hear Good Things About It

Azureus
A popular and full-featured BitTorrent client. It’s Java-based, so it runs everywhere (*giggle*).
Eclipse
Eclipse seems to be the graphical open-source IDE for doing development in Java and C/C++ right now. It appears that there are also plugins for PHP, Python, and others, but I don’t know how mature they are.

It Sure Looks Cool

Ekiga
Open-source SIP/H.323 a audio- and video-conferencing program. Interoperates with programs like Microsoft’s Messenger and NetMeeting and Apple’s iChat.

Feel free to jump in with your thoughts and favorite applications.

A Remarkably Inefficient Gift-Giving Algorithm

For the past few years, I’ve created technological solutions to a problem clearly in need of the might of modern technology: choosing secret santa gift recipients for my girlfriend’s immediate family. (She’s proposed a solution involving a hat.)

In previous years, I’ve taken the easy and obvious approach: generate a random permutation, and see if anyone ends up giving themselves a gift. If they do, try again.

This approach has two flaws:

  1. It may take multiple tries to get a valid permutation. It could take three, four, or even more attempts! We could be delayed by milliseconds waiting to know the results.
  2. It’s way too efficient.

Fortunately, I found a way around both of these problems: generate all permutations that don’t fix any elements, then choose a random one. I’ve been poking around with learning Common Lisp, so I decided to write a solution in it.

I haven’t figured out exactly how inefficient this is, but I’m guessing it grows exponentially with the size of the input set. The number of permutations of the set is obviously exponential (Stirling’s approximation), and eliminating the permutations that fix an element doesn’t take out too many of the branches:

Input size Non-fixing permutations
1 0
2 1
3 2
4 9
5 44
6 265
7 1854
8 14833
9 133496

Conclusion: It’s a good thing her family isn’t too big.

Make the Web a Quiet Place In Firefox

A combination of three add-ons and a configuration tweak can go a long way toward making the Web a quiet, calm place.

Add-Ons

In the NoScript preferences page, go the Advanced tab and select Forbid Macromedia Flash and Forbid other plugins.

Animated GIFs

Unfortunately, some people still feel the need to put animated GIFs up on their web pages. Fortunately, there’s a hidden configuration option that will let you stop them from playing more than once or from playing at all. Go to about:config and find the image.animation_mode preference. Set it to once to let images animate once, or none to prevent them from animating at all. (Kudos to the mozillaZine article on animated images for documenting this.)

Java and Certificates

In setting up Wildfire at work, I discovered an unfortunate “feature” of Java: it encourages you to store certificates and private keys in Java Keystores rather than simple, easily-manipulated PEM-encoded files. This wouldn’t be terrible, since the Java Runtime Environment includes a keytool for manipulating the keystores, except for one thing: the keytool provides no way to import private keys into a keystore! I ended up having to write my own utility to do just that; I’ve posted the source on this site. Enjoy!

Good-bye Sendmail

As an experiment, when I built my new “home” server (the one hosting my personal email and websites, including this one), I chose Sendmail for my MTA. I’d been using Postfix in the server’s last incarnation, but I was a little tired of what I thought were the oddities in configuring it, and it looked like Sendmail had a nice interface for in-line content filtering.

The experiment is now concluded. Between the three MTAs with which I have experience (Postfix, Sendmail, and Qmail), Postfix is still handily the best.

The arcane and abstruse configuration for Sendmail is worse than setting up Postfix, to be sure. However, what finally tipped the scales is the poor way that virtual domains are handled. I was trying to set up Mailman for a domain that I host, and I discovered that Mailman can only create new lists on the fly if you’re running Postfix. This is due, at least in part, to the fact that you can’t have an address in a virtual domain delivered directly to a program. This effectively means that I’d have to create a local address along with every virtual address used by Mailman. This pollutes the namespace for one of my local domains with accounts which don’t need to be in it and requires me to shell in to my server and update the aliases file every time I add a list.

As an added blow, I discovered several days ago that Postfix also has a nice mechanism for delegating policy enforcement built into it, with a couple of sample SPF and greylisting policy scripts that seem to work well. So much for Sendmail having an advantage because of libmilter.

SORBS: Still Nuts

In my ongoing tweaks of mail systems at work, I reenabled the SORBS spam blacklist for a day.

Whoops!

Apparently, their quixotic crusade to blacklist every IP address from which they’ve ever received spam and demand a $50 ransom to be delisted continues. As you might imagine, given the number of users on their systems, providers like GMail, Ameritech, and Yahoo! have spam come through their mail servers from time to time, which lands them on the blacklist. As you also might imagine, these providers apparently have little interest in paying a fine to get delisted, presumably because you’d have to be a little daft to use a blacklist that blocks mail from them.

We’ll see if SORBS ever figures this one out or if their spam list is doomed to irrelevance.

Perl Has Some Uses

I don’t much care for Perl; I think that I must be the only system administrator on Earth who doesn’t. I see it as a Swiss Army Chainsaw; “there’s more than one way to do it” is not a feature when you (or deity forbid, someone else) have to read the code later and figure out what you were thinking when you wrote it. Or drinking when you wrote it.

Anyway, an alert fellow IMSA alumnus made me aware of a clever Perl implementation of a classic program.

I guess it does have some uses after all.