ad-hoc webserver from the shell

May 10th, 2011 § 0 comments § permalink

Here is a neat trick to make the current directory hierarchy available online:

$ cd /tmp
$ python -m SimpleHTTPServer
Serving HTTP on 0.0.0.0 port 8000 ...
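
On Python 3 the SimpleHTTPServer module became part of http.server, so the equivalent one-liner is python3 -m http.server. If you want the same thing from inside a script, something like the following should do. It is only a sketch, assuming Python 3.7 or later; the function name serve_directory and the choice to bind to the loopback address by default are mine, for illustration.

```python
import functools
import http.server
import threading


def serve_directory(directory, port=8000, bind="127.0.0.1"):
    """Serve `directory` over HTTP in a background thread; return the server."""
    # SimpleHTTPRequestHandler accepts a `directory` argument from Python 3.7.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer((bind, port), handler)
    # Daemon thread, so the server does not keep the interpreter alive.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Call serve_directory("/tmp") to start serving, and server.shutdown() on the returned object to stop. Pass bind="0.0.0.0" to expose it to other machines, as the command-line version does.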

"Why yes, I am God" — Great Firewall edition

March 12th, 2011 § 0 comments § permalink

The Joy of Censorship:

It must be an immensely satisfying job being a Chinese net censor, at least in an oversight role. 450 million people surging hither and yon across multiple platforms intent on a dizzying variety of satisfactions. Squeeze this. Promote that. Block the other. Occasionally, a call comes down for a real work of art: carving a Namibia shaped hole in the Chinese internet after a company associated with the president’s son gets itself in a little difficulty down there, for instance.

It’s a genuine problem that the devil has so many of the best technical jobs. Not just censorship, but data-mining, surveillance, military technology — many, many jobs which are technically fascinating and morally repulsive.

Posting from vim

January 15th, 2011 § 0 comments § permalink

A while ago, I got excited about posting to this blog from vim. Then I upgraded things, got distracted, forgot about it, and now it seems _not to work_.

The easiest way I can find is to save the text into a new file, then do:


google blogger post --title "Posting from vim" --tags "technology, vim,google,howto" %

email over ssh/socks with evolution (to dodge wifi cafe firewall)

December 1st, 2010 § 0 comments § permalink

I’ve just been working in a cafe whose wifi blocks outgoing email. So I had to figure out how to send mail through an ssh tunnel. That is, hustle it through the firewall by sending it encrypted to a server elsewhere, and send the email out from there.

For future reference, and in case it’s useful to anybody else, here’s how. This is assuming you are running ubuntu on your own machine, and have ssh access to a server somewhere else that’s capable of sending mail.

We use ssh to set up a SOCKS proxy over an ssh tunnel. This opens a port on the local machine (here, port 1234). Any traffic sent through that port will emerge from the server at the other end:

ssh -D 1234 username@server.net
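
Before going any further it can be worth checking that the tunnel is actually listening. A minimal sketch in Python, using only the standard library; the function name port_is_open is made up for this example, and the defaults simply match the port chosen above. Note that it only confirms that something accepts TCP connections on the port, not that it speaks SOCKS.

```python
import socket


def port_is_open(host="127.0.0.1", port=1234, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError if the connection is refused
        # or times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

With the ssh command above still running, port_is_open() should return True; once the ssh session is closed it should go back to False.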

Now, install tsocks. This lets you run another program with all its outgoing connections sent via SOCKS:

sudo apt-get install tsocks

Configure tsocks to use the tunnel you’ve set up:

sudo vim /etc/tsocks.conf

Look for the default server settings, at the bottom. Edit them so that:


server = 127.0.0.1
server_port = 1234

Now start your mail program under tsocks

tsocks evolution 

In order to make external mail sending work under this setup, I had to turn off TLS in evolution. I’m not sure if this is a problem inherent to the socks/ssh setup, or just with my particular situation.

More info: http://ubuntuforums.org/showthread.php?t=791323

yeahconsole

July 28th, 2010 § 0 comments § permalink

Xmonad is my window manager. I’ve had it configured to use dmenu as an ersatz command-line, but have been fairly unimpressed by its slowness, and by the difficulty of getting any notification of errors.

So I’m turning to yeahconsole. This is a drop-down terminal, similar to yakuake, and harking back ultimately to the heads-up terminal in quake. To use: press Ctrl-Alt-y to bring it up, type and run your command, then press Ctrl-Alt-y again to hide it.

That leaves two jobs outstanding, one easy and one hard. The easy one is integrating it with xmonad, to launch on M-p. The hard one is making it vanish after executing a command; from a glance at the docs, it seems this will only be possible by futzing with the source directly.

Saving firefox

July 25th, 2010 § 0 comments § permalink

Further to earlier grumbling about firefox, it seems the main culprit is the restore-session facility. This is something I hated anyway, even without realising that it was shutting down my hdd every 10 seconds to churn through all my tabs. Solution: go to about:config, where browser.sessionstore.interval controls how often firefox stores its tab data. The default is 10 seconds; setting it to a long string of nines has sped up my computer no end.

And so, firefox is saved for another day.

Also of note: the vimkeys plugin, providing j/k scrolling.

Nested dictionaries in python

July 15th, 2010 § 4 comments § permalink

Python’s defaultdict is perfect for making nested dictionaries, which is especially useful if you’re doing any kind of work with json or nosql. It provides a dict which creates and returns a default value when a key isn’t found. Make that default an empty dict, and you have a convenient dict of dicts:


>>> from collections import defaultdict
>>> foo = defaultdict(dict)
>>> foo['x']
{}

But it breaks down when you go more than one layer deep:


>>> foo['x']['y']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'y'

You can get another layer by passing in a defaultdict of dicts as the default:


>>> bar = defaultdict(lambda: defaultdict(dict))
>>> bar['x']['y']
{}

But suppose you want deeply-nesting dictionaries. This means you can refer as deeply into the hierarchy as you want, without needing to check whether the intermediate dictionaries have already been created. You do need to be sure that intervening levels aren’t anything other than a recursive defaultdict, mind. But if you know you’re going to have your content filed away inside, say, quadruple-nested dicts, this isn’t necessarily a problem.
One approach would be to extend the method above, with lambdas inside lambdas:


>>> baz = defaultdict(lambda: defaultdict(lambda:defaultdict(dict)))
>>> baz[1][2][3]
{}
>>> baz[1][2][3][4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 4
>>>

It’s marginally more readable if we use partial rather than lambda:


>>> from functools import partial
>>> thud = defaultdict(partial(defaultdict, partial(defaultdict, dict)))
>>> thud[1][2][3]
{}

But still pretty ugly, and non-extending. Want infinite nesting instead? You can do it with a recursive function:


>>> def infinite_defaultdict():
...     return defaultdict(infinite_defaultdict)
...
>>> spam = infinite_defaultdict() #defaultdict(infinite_defaultdict) is equivalent
>>> spam['x']['y']['z']['l']['m']
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {})

This works fine. The __repr__ output is annoyingly convoluted, though:

>>> spam = infinite_defaultdict()
>>> spam['x']['y']['z']['l']['m']
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {})
>>> spam
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'x':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'y':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'z':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'l':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'m':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {})})})})})})
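
If the noisy repr is the only complaint, one workaround is to copy the structure into ordinary dicts before printing it. A small sketch; the helper name to_plain_dict is my own, and it assumes the values are either nested dicts or leaves that should be kept as they are.

```python
from collections import defaultdict


def infinite_defaultdict():
    return defaultdict(infinite_defaultdict)


def to_plain_dict(d):
    """Recursively copy nested dicts (including defaultdicts) into plain dicts."""
    return {
        key: to_plain_dict(value) if isinstance(value, dict) else value
        for key, value in d.items()
    }


spam = infinite_defaultdict()
spam['x']['y']['z']  # touching the keys is enough to create them
print(to_plain_dict(spam))  # {'x': {'y': {'z': {}}}}
```

The copy is a snapshot: it no longer auto-creates missing keys, so keep the original around for writing and use the plain copy for display or serialisation.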


A cleaner way of achieving the same effect is to ignore defaultdict entirely, and make a direct subclass of dict. This is based on Peter Norvig’s original implementation of defaultdict:


>>> class NestedDict(dict):
...     def __getitem__(self, key):
...         if key in self: return self.get(key)
...         return self.setdefault(key, NestedDict())
>>> eggs = NestedDict()
>>> eggs[1][2][3][4][5]
{}
>>> eggs
{1: {2: {3: {4: {5: {}}}}}}
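
The same behaviour can also be had without overriding __getitem__ at all, by using the __missing__ hook that dict calls when a key lookup fails (available since Python 2.5). This is an alternative sketch, not the implementation referred to above:

```python
class NestedDict(dict):
    def __missing__(self, key):
        # dict.__getitem__ calls this only when `key` is absent.
        # Store a fresh nested dict under the key and return it.
        value = self[key] = type(self)()
        return value


eggs = NestedDict()
eggs[1][2][3]
print(eggs)  # {1: {2: {3: {}}}}
```

Only square-bracket lookups create keys here; `key in eggs` and `eggs.get(key)` leave the dictionary untouched, which is handy when you want to test for a key without creating it.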

MongoDB

July 8th, 2010 § 0 comments § permalink

MongoDB (and nosql generally) is an appealing idea. The words written about it, though, are problematic: too much hype, too little documentation. That’ll change soon; we’re over the peak of the nosql hype cycle, into the trough. People are looking at the nosql systems they’ve eagerly implemented in recent months, noticing that they won’t solve every problem imaginable. For now, though, every blogpost with mongodb instructions is prefaced with grumbles about the lack of information.

So, I spent a ridiculous amount of time figuring out how to do grouping. I have a bunch of download logs, and want to break them down by country.
The simplest way I could find of doing this is:

db.loglines.group({ 'cond' : {}, initial: {count: 0}, reduce: function(doc, out){out.count++;if(out[doc.country] == undefined){out[doc.country] = 0;};out[doc.country] += 1;}});

Or, the version in pymongo:


> reduce_func = """function(doc, out){
out.total++;
if(out[doc.country] == undefined){
out[doc.country] = 0;};
out[doc.country] += 1;};
"""

> l.group(key = {},
condition = {},
initial = {'total':0},
reduce = reduce_func)
[{
u'AE': 215.0,
u'AG': 23.0,
u'AM': 140.0,
u'AN': 58.0,
u'AO': 56.0,
...
u'total': 87901
}]

[apologies for formatting; I’ve not really figured out how to edit js within a python repl]
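
For checking the result on a small sample, the same tally can be reproduced in plain Python without touching the database. This is only a sketch: it assumes each log line is a dict with a country field, as in the reduce function above, and the sample records are invented.

```python
from collections import Counter


def count_by_country(loglines):
    """Tally documents per country, plus an overall 'total' key."""
    counts = Counter(doc.get("country") for doc in loglines)
    result = dict(counts)
    # Mirrors the 'total' field kept by the reduce function.
    result["total"] = sum(counts.values())
    return result


sample = [{"country": "AE"}, {"country": "AM"}, {"country": "AE"}]
print(count_by_country(sample))  # {'AE': 2, 'AM': 1, 'total': 3}
```

This pulls every document into Python, so it is for sanity checks on a handful of records rather than a replacement for grouping on the server.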

Patch: blogger post from stdin for googlecl

July 3rd, 2010 § 0 comments § permalink

Going to start hijacking this blog, to record/link to patches I submit to various open-source projects. As with everything else on here, it’s mainly to ensure I can find these little snippets a few months later.

So, to start, something intended for this blog itself. A patch to the google commandline tools enabling the “google blogger post” command to post content read from stdin (adding to the current options of supplying a string or a filename). Usage is the traditional '-' in place of a filename.

This enables two pieces of functionality I’d find very useful:
A) filter content through other programs. e.g. using markdown to HTMLify my content:
$ markdown post.txt | google blogger post -
B) make a blogpost from within vim, by selecting my post content and piping it to googlecl
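
The '-' convention itself is easy to sketch in Python. To be clear, this is not the patch, and read_post_content is a hypothetical helper rather than anything in googlecl; it just illustrates the three cases described above (stdin, a filename, or a literal string).

```python
import os
import sys


def read_post_content(arg, stdin=None):
    """Resolve a content argument: '-' means stdin, a path is read, else literal text."""
    if arg == "-":
        # `stdin` is injectable so the function can be tried without a real pipe.
        return (stdin or sys.stdin).read()
    if os.path.isfile(arg):
        with open(arg) as f:
            return f.read()
    # Anything else is treated as the post body itself.
    return arg
```

Checking for '-' first means a file that happens to be called '-' has to be given as './-', which is the usual trade-off with this convention.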

finding and editing

July 1st, 2010 § 0 comments § permalink

Search for files containing some text, open them in vim (one per tab)

grep -l foo ./* | xargs vim -p

Alternatively, to get a single-line list that can be edited and then copy-pasted to a command-line:

grep -l foo ./* | xargs echo

There are more heavy-duty ways of removing lines in output listed here, but I see little reason for using them.

notify-send

June 20th, 2010 § 0 comments § permalink

Yet another linux trick I keep on forgetting…
To display a notification on the desktop from the command-line:
# apt-get install libnotify-bin
$ notify-send "hello world"

obv. “from the command-line” really means “from a script”, unless you’re in some Evil Dead situation of independently-mobile hands

[reason for looking: trying to get xmonad+dmenu to notify me when I mistype a command, rather than just failing silently]

markdown + vim

June 20th, 2010 § 0 comments § permalink

Since I’m spectacularly dim, it never occurred to me that I can run markdown from within vim. Select your text, run !markdown, and wham! bam! everything is replaced by its technicolor HTML twin.

June 8th, 2010 § 0 comments § permalink

A bit of Debian lore I always forget: finding which package is responsible for a certain file:

$ dpkg -S filename

e.g:
$ dpkg -S /usr/bin/lintian
lintian: /usr/bin/lintian

E-voting in Estonia

February 24th, 2010 § 0 comments § permalink

Certain small post-Soviet states have a tendency to be hooked into all the latest fads among global policy wonks. It’s an outgrowth of their size and history: ambitious young people who left for Western Europe or the US in the 90s have now returned and found themselves wealthy, skilled, and ready to govern. Georgia and Estonia, in particular, have been quick to dive into every technical/governmental trend, from twitter to linux to…e-voting.

As regards the latter, Estonia is forging ahead. 14% of votes in the European elections were cast online, and 44% in the municipal elections in Tallinn, which, to judge by the percentage of Berlin’s technorati vanishing there for mysterious projects, must be turning into something of an electronic mecca.

Opening up a tax haven

December 26th, 2008 § 11 comments § permalink

Panama is still one of the biggest and most important tax havens. As well as its absurd tax regime, its corporate disclosure regime means it is very difficult to get information about Panamanian companies.
Or rather, it was. Panama recently put online their company registry. You can now retrieve the names of the current directors of every Panamanian company, as well as all the company’s filings themselves (minutes of company meetings, details of shareholdings, ownership, certificates of incorporation etc. etc.).
Nice, but you can only search by the name of the company. If you want to find somebody who is dodging tax or doing something else dubious, you really need to search by director’s name.
This tool fixes that problem. I’ve scraped all 600,000 company records, going back 30 years, and indexed by directors.

Now you can, for instance, look up recently-arrested arms dealer Monzer al-Kassar, and you find a couple of companies. Looking through the records, you find the company’s current treasurer is Hans-Ulrich Ming,