ad-hoc webserver from the shell
May 10th, 2011 § 0 comments § permalink
Here is a neat trick to make the current directory hierarchy available online:
$ cd /tmp
$ python -m SimpleHTTPServer
Serving HTTP on 0.0.0.0 port 8000 ...
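For what it's worth, on Python 3 the same module lives at http.server, so the one-liner becomes `python3 -m http.server`. Here is a rough script version of the same idea, assuming Python 3.7 or later; the `make_server` helper and its arguments are my own naming, not anything from the standard library:

```python
import functools
import http.server


def make_server(directory=".", host="0.0.0.0", port=8000):
    """Build (but do not start) an HTTP server exposing `directory`."""
    # SimpleHTTPRequestHandler accepts a `directory` keyword since Python 3.7.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer((host, port), handler)


# To serve the current directory on port 8000 until interrupted:
#     make_server().serve_forever()
```

As with the one-liner, binding to 0.0.0.0 exposes the directory to anyone who can reach the machine; pass host="127.0.0.1" to keep it local.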
"Why yes, I am God" — Great Firewall edition
March 12th, 2011 § 0 comments § permalink
The Joy of Censorship:
It must be an immensely satisfying job being a Chinese net censor, at least in an oversight role. 450 million people surging hither and yon across multiple platforms intent on a dizzying variety of satisfactions. Squeeze this. Promote that. Block the other. Occasionally, a call comes down for a real work of art: carving a Namibia shaped hole in the Chinese internet after a company associated with the president’s son gets itself in a little difficulty down there, for instance.
It’s a genuine problem that the devil has so many of the best technical jobs. Not just censorship, but data-mining, surveillance, military technology — many, many jobs which are technically fascinating and morally repulsive.
Posting from vim
January 15th, 2011 § 0 comments § permalink
A while ago, I got excited about posting to this blog from vim. Then I upgraded things, got distracted, forgot about it, and now it seems _not to work_.
The easiest way I can find is to save the text into a new file, then do:
google blogger post --title "Posting from vim" --tags "technology, vim,google,howto" %
email over ssh/socks with evolution (to dodge wifi cafe firewall)
December 1st, 2010 § 0 comments § permalink
I’ve just been working in a cafe whose wifi blocks outgoing email. So I had to figure out how to send mail through an ssh tunnel. That is, hustle it through the firewall by sending it encrypted to a server elsewhere, and send the email outgoing from there.
For future reference, and in case it’s useful to anybody else, here’s how. This is assuming you are running Ubuntu on your own machine, and have ssh access to a server somewhere else that’s capable of sending mail.
We use ssh to set up a SOCKS proxy, over an ssh tunnel. This establishes a port on the local machine (here, port 1234). Any traffic sent through that port will emerge from the server at the other end:
ssh -D 1234 username@server.net
Now, install tsocks. This lets you run another program with all outgoing connections sent via SOCKS:
sudo apt-get install tsocks
Configure tsocks to use the tunnel you’ve set up:
sudo vim /etc/tsocks.conf
Look for the default server settings, at the bottom. Edit so that:
server = 127.0.0.1
server_port = 1234
Now start your mail program under tsocks:
tsocks evolution
In order to make external mail sending work under this setup, I had to turn off TLS in evolution. I’m not sure if this is a problem inherent to the socks/ssh setup, or just with my particular situation.
more info: http://ubuntuforums.org/showthread.php?t=791323
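A quick way to check that the tunnel is really up before blaming tsocks or evolution is to try connecting to the local SOCKS port. This is only a sketch: `port_is_open` is a made-up helper name, and 1234 is just the port chosen in the ssh -D step. It confirms that something is listening there, not that it speaks SOCKS.

```python
import socket


def port_is_open(host="127.0.0.1", port=1234, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, unreachable, etc.
        return False


if port_is_open():
    print("tunnel port is listening")
else:
    print("nothing on port 1234; is the ssh -D session still running?")
```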
yeahconsole
July 28th, 2010 § 0 comments § permalink
Xmonad is my window manager. I’ve had it configured to use dmenu as an ersatz command-line, but have been fairly unimpressed by its slowness, and by the difficulty of getting any notification of errors.
So I’m turning to yeahconsole. This is a drop-down terminal, something similar to yakuake, and harking back ultimately to the headsup terminal in quake. To use: Ctrl-Alt-y to bring it up, type/run your command, Ctrl-Alt-y again to hide it.
I leave outstanding two jobs, one easy and one hard. The easy one is integrating it with xmonad, to launch on M-p. The hard one is making it vanish after executing a command; from a glance at the docs it seems this will only be possible by futzing with the source directly.
Saving firefox
July 25th, 2010 § 0 comments § permalink
Further to earlier grumbling about firefox, it seems the main culprit is the restore-session facility. This is something I hated anyway, even without realising that it was shutting down my hdd every 10 seconds to churn through all my tabs. Solution: go to about:config. browser.sessionstore.interval controls how often firefox stores its tab data. The value is in milliseconds and the default is 10 seconds (10000); setting it to a long string of nines has sped up my computer no end.
And so, firefox is saved for another day.
Also of note: the vimkeys plugin, providing j/k scrolling.
Nested dictionaries in python
July 15th, 2010 § 4 comments § permalink
Python’s defaultdict is perfect for making nested dictionaries, which is especially useful if you’re doing any kind of work with json or nosql. It provides a dict which returns a default value when a key isn’t found. Set that default value to an empty dict, and you have a convenient dict of dicts:
>>> from collections import defaultdict
>>> foo = defaultdict(dict)
>>> foo['x']
{}
But it breaks down when you go more than one layer deep:
>>> foo['x']['y']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'y'
You can get another layer by passing in a defaultdict of dicts as the default:
>>> bar = defaultdict(lambda: defaultdict(dict))
>>> bar['x']['y']
{}
But suppose you want deeply-nesting dictionaries. This means you can refer as deeply into the hierarchy as you want, without needing to check whether the intermediate dictionaries have already been created. You do need to be sure that intervening levels aren’t anything other than a recursive defaultdict, mind. But if you know you’re going to have your content filed away inside, say, quadruple-nested dicts, this isn’t necessarily a problem.
One approach would be to extend the method above, with lambdas inside lambdas:
>>> baz = defaultdict(lambda: defaultdict(lambda:defaultdict(dict)))
>>> baz[1][2][3]
{}
>>> baz[1][2][3][4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 4
>>>
It’s marginally more readable if we use partial rather than lambda:
>>> from functools import partial
>>> thud = defaultdict(partial(defaultdict, partial(defaultdict, dict)))
>>> thud[1][2][3]
{}
But still pretty ugly, and non-extending. Want infinite nesting instead? You can do it with a recursive function:
>>> def infinite_defaultdict():
...     return defaultdict(infinite_defaultdict)
...
>>> spam = infinite_defaultdict() #defaultdict(infinite_defaultdict) is equivalent
>>> spam['x']['y']['z']['l']['m']
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {})
This works fine. The __repr__ output is annoyingly convoluted, though:
>>> spam = infinite_defaultdict()
>>> spam['x']['y']['z']['l']['m']
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {})
>>> spam
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'x':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'y':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'z':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'l':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'m':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {})})})})})})
A cleaner way of achieving the same effect is to ignore defaultdict entirely, and make a direct subclass of dict. This is based on Peter Norvig’s original implementation of defaultdict:
>>> class NestedDict(dict):
...     def __getitem__(self, key):
...         if key in self: return self.get(key)
...         return self.setdefault(key, NestedDict())
...
>>> eggs = NestedDict()
>>> eggs[1][2][3][4][5]
{}
>>> eggs
{1: {2: {3: {4: {5: {}}}}}}
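Another way to get the same behaviour, sketched here for comparison: dict subclasses can define `__missing__`, which dict's own `__getitem__` calls when a key isn't found, so there is no need to override `__getitem__` at all. The class name `AutoDict` is made up for this example.

```python
import json


class AutoDict(dict):
    """A dict that creates nested AutoDicts on demand."""

    def __missing__(self, key):
        # Only called by dict.__getitem__ when `key` is absent.
        value = self[key] = type(self)()
        return value


ham = AutoDict()
ham["x"]["y"]["z"] = 1
print(json.dumps(ham))  # {"x": {"y": {"z": 1}}}
```

Since it is still a plain dict underneath, the repr stays readable and it serialises to json without any conversion. Membership tests (`"q" in ham`) and `.get()` don't trigger `__missing__`, so they won't create keys by accident.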
MongoDB
July 8th, 2010 § 0 comments § permalink
MongoDB (and nosql generally) is an appealing idea. The words written about it, though, are problematic: too much hype, too little documentation. That’ll change soon; we’re over the peak of the nosql hype cycle, into the trough. People are looking at the nosql systems they’ve eagerly implemented in recent months, noticing that they won’t solve every problem imaginable. For now, though, every blogpost with mongodb instructions is prefaced with grumbles about the lack of information.
So, I spent a ridiculous amount of time figuring out how to do grouping. I have a bunch of download logs, and want to break them down by country.
The simplest way I could find of doing this is:
db.loglines.group({ 'cond' : {}, initial: {count: 0}, reduce: function(doc, out){out.count++;if(out[doc.country] == undefined){out[doc.country] = 0;};out[doc.country] += 1;}});
Or, the version in pymongo:
> reduce_func = """function(doc, out){
out.total++;
if(out[doc.country] == undefined){
out[doc.country] = 0;};
out[doc.country] += 1;};
"""
> l.group(key = {},
condition = {},
initial = {'total':0},
reduce = reduce_func)
[{
u'AE': 215.0,
u'AG': 23.0,
u'AM': 140.0,
u'AN': 58.0,
u'AO': 56.0,
...
u'total' : 87901
}]
[apologies for formatting; I’ve not really figured out how to edit js within a python repl]
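For comparison, here is roughly what that reduce function is doing, written as plain Python over a list of dicts. This is only an illustration of the logic, not pymongo code; `count_by_country` and the sample log lines are invented for the example.

```python
from collections import Counter


def count_by_country(loglines):
    """Tally documents per country, plus a running 'total' key."""
    out = Counter()
    for doc in loglines:
        out["total"] += 1            # same as out.total++ in the JS
        out[doc["country"]] += 1     # Counter starts missing keys at 0
    return dict(out)


# Hypothetical log documents, standing in for the real collection.
sample = [{"country": "AE"}, {"country": "AG"}, {"country": "AE"}]
print(count_by_country(sample))  # {'total': 3, 'AE': 2, 'AG': 1}
```

Like the JavaScript version, this puts the total in the same dict as the per-country counts, which is convenient but would collide with a country code that happened to be 'total'.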
Patch: blogger post from stdin for googlecl
July 3rd, 2010 § 0 comments § permalink
Going to start hijacking this blog, to record/link to patches I submit to various open-source projects. As with everything else on here, it’s mainly to ensure I can find these little snippets a few months later.
So, to start, something intended for this blog itself. A patch to the google commandline tools enabling the “google blogger post” command to post content read from stdin (adding to the current options of supplying a string or a filename). Usage is the traditional '-' in place of a filename.
This enables two pieces of functionality I’d find very useful:
A) filter content through other programs. e.g. using markdown to HTMLify my content:
$ markdown post.txt | google blogger post -
B) make a blogpost from within vim, by selecting my post content and piping it to googlecl
finding and editing
July 1st, 2010 § 0 comments § permalink
Search for files containing some text, open them in vim (one per tab)
grep -l foo ./* | xargs vim -p
Alternatively, to get a single-line list that can be edited and then copy-pasted to a command-line:
grep -l foo ./* | xargs echo
There are more heavy-duty ways of removing lines in output listed here, but I see little reason for using them.
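If the shell quoting gets hairy (filenames with spaces, say), much the same thing can be sketched in Python. `files_containing` is a hypothetical helper; it only looks at regular files directly inside the directory, roughly like grep on ./*, and does a plain substring match rather than a regex.

```python
import shlex
from pathlib import Path


def files_containing(needle, directory="."):
    """Return paths of files in `directory` whose text contains `needle`."""
    matches = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories, like grep without -r
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue  # unreadable file; move on
        if needle in text:
            matches.append(str(path))
    return matches


# A single-line, shell-safe list for copy-pasting, e.g.:
#     print(" ".join(shlex.quote(p) for p in files_containing("foo")))
```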
notify-send
June 20th, 2010 § 0 comments § permalink
Yet another linux trick I keep on forgetting…
To display a notification on the desktop from the command-line:
# apt-get install libnotify-bin
$ notify-send "hello world"
obv. “from the command-line” really means “from a script”, unless you’re in some Evil Dead situation of independently-mobile hands
[reason for looking: trying to get xmonad+dmenu to notify me when I mistype a command, rather than just failing silently]
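Since the real use is from a script, here is a rough Python wrapper that runs a command and pops up a notification if it fails or doesn't exist. `run_and_notify` is a made-up name; the only thing assumed about notify-send is the `notify-send SUMMARY BODY` form, and the sketch quietly skips the notification if notify-send isn't installed.

```python
import shutil
import subprocess


def run_and_notify(argv):
    """Run argv; on failure, show a desktop notification. Returns True on success."""
    try:
        failed = subprocess.run(argv).returncode != 0
    except FileNotFoundError:
        failed = True  # the mistyped-command case
    if failed and shutil.which("notify-send"):
        subprocess.run(["notify-send", "Command failed", " ".join(argv)])
    return not failed


# e.g. run_and_notify(["firefx"]) should produce a notification
# rather than failing silently.
```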
markdown + vim
June 20th, 2010 § 0 comments § permalink
Since I’m spectacularly dim, it never occurred to me that I can run markdown from within vim. Select your text, run !markdown, and wham! bam! everything is replaced by its technicolor HTML twin.
June 8th, 2010 § 0 comments § permalink
A bit of Debian lore I always forget: finding which package is responsible for a certain file:
$ dpkg -S filename
e.g:
$ dpkg -S /usr/bin/lintian
lintian: /usr/bin/lintian
E-voting in Estonia
February 24th, 2010 § 0 comments § permalink
Certain small post-Soviet states have a tendency to be hooked into all the latest fads among global policy wonks. It’s an outgrowth of their size and history: ambitious young people who left for Western Europe or the US in the 90s have now returned and found themselves wealthy, skilled, and ready to govern. Georgia and Estonia, in particular, have been quick to dive into every technical/governmental trend, from twitter to linux to…e-voting.
As regards the latter, Estonia is forging ahead. 14% of votes in the European elections were cast online, and 44% in the municipal elections in Tallinn, which, to judge by the percentage of Berlin’s technorati vanishing there for mysterious projects, must be turning into something of an electronic mecca.
Opening up a tax haven
December 26th, 2008 § 11 comments § permalink
Panama is still one of the biggest and most important tax havens. As well as its absurd tax regime, its corporate disclosure regime means it is very difficult to get information about Panamanian companies.
Or rather, it was. Panama recently put online their company registry. You can now retrieve the names of the current directors of every Panamanian company, as well as all the company’s filings themselves (minutes of company meetings, details of shareholdings, ownership, certificates of incorporation etc. etc.).
Nice, but you can only search by the name of the company. If you want to find somebody who is dodging tax or doing something else dubious, you really need to search by director’s name.
This tool fixes that problem. I’ve scraped all 600,000 company records, going back 30 years, and indexed by directors.
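The indexing step is nothing fancy in principle: it is an inverted index from director name to companies. A toy sketch of the idea follows; the record layout, the field names and the sample companies are all invented here, and the real scraped data is certainly messier.

```python
from collections import defaultdict


def index_by_director(companies):
    """Map each (normalised) director name to the companies they appear in."""
    index = defaultdict(list)
    for company in companies:
        for director in company["directors"]:
            # Normalise case and whitespace so lookups are forgiving.
            index[director.strip().lower()].append(company["name"])
    return dict(index)


# Hypothetical records, standing in for the scraped registry entries.
records = [
    {"name": "Example Holdings S.A.", "directors": ["Director A", "Director B"]},
    {"name": "Sample Shipping Inc.", "directors": ["director a "]},
]
index = index_by_director(records)
print(index["director a"])  # ['Example Holdings S.A.', 'Sample Shipping Inc.']
```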
Now you can, for instance, look up recently-arrested arms dealer Monzer al-Kassar, and you find a couple of companies. Looking through the records, you find the company’s current treasurer is Hans-Ulrich Ming, chairman of Swiss firm Pax Anlage. Previous directors include Enrico Ravano, president of Contship, the Italian company that controls the Calabrian port of Gioia Tauro. A Feb 2008 report for the Italian parliament accused Ravano of complicity in cocaine smuggling by the Calabrian mafia through Gioia Tauro – the report cited Italian estimates that 80% of all Europe’s cocaine is smuggled through Gioia Tauro. Ravano’s connection to al Kassar could help to stand up accusations (which al Kassar has always denied) that al Kassar was involved in drug trafficking as well as weapons trafficking; and helps to undermine Ravano’s recent denials that he’s had anything to do with any trafficking of any sort. [This set of connections was in fact found manually, by Global Witness, and was part of the inspiration to build the search]
Or take Nadhmi Auchi: Iraqi-British billionaire, companion of Saddam Hussein in the ’50s, convicted of fraud in France (but appealing). I’ve not yet looked through the records of companies held by him and his friends – but there are plenty of records there, doubtless including some interesting connections.
And there are plenty more interesting names to look up. Most satisfyingly, it’s already proving useful in figuring out the activities of various currently-active arms dealers…
Want the raw data? Here is a database dump.
Westminster’s map
November 27th, 2006 § 0 comments § permalink
[Update: I finally got round to adding legends to the maps]
Which countries get talked about in parliament? With data from [They Work For You](http://www.theyworkforyou.com), I’ve put together these maps of the places MPs like to talk about. Here’s the number of mentions a country has had in parliament recently, adjusted for population:
Looking at this, I’m actually surprised at how globally-minded Parliament is. Sudan (pop. 34.2 million) gets 2,302 mentions; Germany (pop. 82.5 million) has only 3,695 mentions in parliament.
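To make the comparison concrete, here is the arithmetic, on the assumption that "adjusted for population" means mentions per million inhabitants (the maps themselves may use a different scale):

```python
# (mentions in parliament, population in millions), figures from the text.
figures = {"Sudan": (2302, 34.2), "Germany": (3695, 82.5)}


def mentions_per_million(mentions, population_millions):
    return mentions / population_millions


for country, (mentions, population) in figures.items():
    rate = mentions_per_million(mentions, population)
    print(f"{country}: {rate:.1f} mentions per million people")
# Sudan: 67.3 mentions per million people
# Germany: 44.8 mentions per million people
```

So per head, Sudan gets about one and a half times the attention Germany does.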
Far from being ignored, Africa actually gets mentioned well beyond its economic importance to the UK. South America, on the other hand, is basically ignored.
Then there’s the size bias: small countries get more mentions than big ones, once you adjust for population. Look at Mongolia: Westminster, it seems, finds Mongolians immensely more important than Chinese. The bias can partly be discounted as a problem with measurement: parliament is prone to lists of foreign relations and trade issues, for instance, which mention every country regardless of how small it is. Also, it’s possible MPs talk about areas within China or India, which I wouldn’t have picked up on.
But there’s more to it: larger countries really do get short-changed in the attention we give them. China has a population perhaps 150 times larger than that of Bolivia – but we don’t hear anything like 150 times as much news from China. We’re all biased by imagining a world made up of nations, and giving the same weight to nations of all sizes. Small islands got discussed an incredible amount – particularly places in the news, like Tuvalu and the Pitcairns, but others as well.
Memes: toxic in China
November 7th, 2006 § 0 comments § permalink
Remember the Free Hugs meme? Somebody in Australia started hugging people in the streets, it spread to Russia, Italy, Taiwan, Korea, Poland, and pretty much the rest of the world.
Then, some people in Shanghai tried it – and were promptly arrested

Before the arrest, presumably
The huggers were released after a couple of hours, but still: a big ‘meh!’ to the Chinese police
[cross-post from livejournal]
Conference reloaded
October 3rd, 2006 § 1 comment § permalink
How can you develop a service without sharing a language with your users?
Holed up in Budapest, my head too messed up to do any proper work (eep! the doom she is a-coming!), I’ve been listening to danah Boyd’s keynote at the blogtalk conference that’s just winding up in Vienna.
She touches on the fact that the creators of Orkut don’t have the faintest idea what their Portuguese or Hindi-speaking users are doing. I’d always vaguely assumed that there would be a fair few Portuguese-speakers within the Orkut development team, for instance. But obviously not.
It’d be a nice little project for a journalist or an anthropologist, to work out how much the developers of these sites know about their users.