ad-hoc webserver from the shell

May 10th, 2011 § 0 comments § permalink

Here is a neat trick to make the current directory hierarchy available online:

$ cd /tmp
$ python -m SimpleHTTPServer
Serving HTTP on 0.0.0.0 port 8000 ...
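
On Python 3 the SimpleHTTPServer module has moved to http.server, so the one-liner becomes `python3 -m http.server`. The same thing can be done from a script. A minimal sketch follows; `make_server` and its defaults are my own names, not part of the standard library, and unlike the one-liner it binds to localhost unless told otherwise:

```python
import functools
import http.server


def make_server(directory=".", host="127.0.0.1", port=8000):
    """Return an HTTP server exposing `directory` (hypothetical helper)."""
    # SimpleHTTPRequestHandler accepts a `directory` keyword on Python 3.7+.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer((host, port), handler)


# Usage (blocks until interrupted with Ctrl-C):
#   make_server("/tmp", host="0.0.0.0").serve_forever()
```

Passing host="0.0.0.0" reproduces the behaviour shown above, where the server listens on every interface.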

"Why yes, I am God" — Great Firewall edition

March 12th, 2011 § 0 comments § permalink

The Joy of Censorship:

It must be an immensely satisfying job being a Chinese net censor, at least in an oversight role. 450 million people surging hither and yon across multiple platforms intent on a dizzying variety of satisfactions. Squeeze this. Promote that. Block the other. Occasionally, a call comes down for a real work of art: carving a Namibia shaped hole in the Chinese internet after a company associated with the president’s son gets itself in a little difficulty down there, for instance.

It’s a genuine problem that the devil has so many of the best technical jobs. Not just censorship, but data-mining, surveillance, military technology — many, many jobs which are technically fascinating and morally repulsive.

Posting from vim

January 15th, 2011 § 0 comments § permalink

A while ago, I got excited about posting to this blog from vim. Then I upgraded things, got distracted, forgot about it, and now it seems _not to work_.

The easiest way I can find is to save the text into a new file, then do:


google blogger post --title "Posting from vim" --tags "technology, vim,google,howto" %

email over ssh/socks with evolution (to dodge wifi cafe firewall)

December 1st, 2010 § 0 comments § permalink

I’ve just been working in a cafe whose wifi blocks outgoing email. So I had to figure out how to send mail through an ssh tunnel. That is, hustle it through the firewall by sending it encrypted to a server elsewhere, and send the email out from there.

For future reference, and in case it’s useful to anybody else, here’s how. This is assuming you are running ubuntu on your own machine, and have ssh access to a server somewhere else that’s capable of sending mail.

We use ssh to set up a SOCKS proxy over an ssh tunnel. This opens a port on the local machine (here, port 1234). Any traffic sent through that port will emerge from the server at the other end:

ssh -D 1234 username@server.net

Now, install tsocks. This lets you run another program with all its outgoing connections sent via SOCKS:

sudo apt-get install tsocks

Configure tsocks to use the tunnel you’ve set up:

sudo vim /etc/tsocks.conf

Look for the default server settings, at the bottom. Edit so that:


server = 127.0.0.1
server_port = 1234

Now start your mail program under tsocks

tsocks evolution 

In order to make external mail sending work under this setup, I had to turn off TLS in evolution. I’m not sure if this is a problem inherent to the socks/ssh setup, or just with my particular situation.
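
Before launching the mail client, it can help to confirm that the tunnel is actually listening. A minimal sketch, assuming the port 1234 used above; `socks_port_is_open` is a made-up helper name, and it only checks that something accepts connections on the port, not that it speaks SOCKS:

```python
import socket


def socks_port_is_open(host="127.0.0.1", port=1234, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

If this returns False, the `ssh -D` session has probably died or was started on a different port.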

more info: http://ubuntuforums.org/showthread.php?t=791323

yeahconsole

July 28th, 2010 § 0 comments § permalink

Xmonad is my window manager. I’ve had it configured to use dmenu as an ersatz command-line, but have been fairly unimpressed by its slowness, and by the difficulty of getting any notification of errors.

So I’m turning to yeahconsole. This is a drop-down terminal, something similar to yakuake, and harking back ultimately to the heads-up terminal in quake. To use: Ctrl-Alt-y to bring it up, type/run your command, Ctrl-Alt-y again to hide it.

I leave two jobs outstanding, one easy and one hard. The easy one is integrating it with xmonad, to launch on M-p. The hard one is making it vanish after executing a command; from a glance at the docs it seems this will only be possible by futzing with the source directly.

Saving firefox

July 25th, 2010 § 0 comments § permalink

further to earlier grumbling about firefox, it seems the main culprit is the restore-session facility. This is something I hated anyway, even without realising that it was shutting down my hdd every 10 seconds to churn through all my tabs. Solution: go to about:config. browser.sessionstore.interval controls how often firefox stores its tab data. The default is 10 seconds; setting it to a long string of nines has sped up my computer no end.

and so, firefox is saved for another day.

also of note: the vimkeys plugin, providing j/k scrolling.

Nested dictionaries in python

July 15th, 2010 § 4 comments § permalink

Python’s defaultdict is perfect for making nested dictionaries, which is especially useful if you’re doing any kind of work with json or nosql. It provides a dict which builds a default value whenever a key isn’t found. Set dict as the factory for that default value, and you have a convenient dict of dicts:


>>> from collections import defaultdict
>>> foo = defaultdict(dict)
>>> foo['x']
{}

But it breaks down when you go more than one layer deep:


>>> foo['x']['y']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'y'

You can get another layer by passing in a defaultdict of dicts as the default:


>>> bar = defaultdict(lambda: defaultdict(dict))
>>> bar['x']['y']
{}

But suppose you want deeply-nesting dictionaries. This means you can refer as deeply into the hierarchy as you want, without needing to check whether the intermediate dictionaries have already been created. You do need to be sure that intervening levels aren’t anything other than a recursive defaultdict, mind. But if you know you’re going to have your content filed away inside, say, quadruple-nested dicts, this isn’t necessarily a problem.
One approach would be to extend the method above, with lambdas inside lambdas:


>>> baz = defaultdict(lambda: defaultdict(lambda:defaultdict(dict)))
>>> baz[1][2][3]
{}
>>> baz[1][2][3][4]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 4
>>>

It’s marginally more readable if we use partial rather than lambda:


>>> from functools import partial
>>> thud = defaultdict(partial(defaultdict, partial(defaultdict, dict)))
>>> thud[1][2][3]
{}

But still pretty ugly, and non-extending. Want infinite nesting instead? You can do it with a recursive function:


>>> def infinite_defaultdict():
...     return defaultdict(infinite_defaultdict)
...
>>> spam = infinite_defaultdict() #defaultdict(infinite_defaultdict) is equivalent
>>> spam['x']['y']['z']['l']['m']
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {})

This works fine. The __repr__ output is annoyingly convoluted, though:


>>> spam = infinite_defaultdict()
>>> spam['x']['y']['z']['l']['m']
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {})
>>> spam
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'x':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'y':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'z':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'l':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {'m':
defaultdict(<function infinite_defaultdict at 0x7fe4fb0c9de8>, {})})})})})})
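
One way round the noisy output is to copy the structure into ordinary dicts before printing it. A minimal sketch; `to_plain_dict` is a name invented here, not something from the collections module:

```python
from collections import defaultdict


def infinite_defaultdict():
    return defaultdict(infinite_defaultdict)


def to_plain_dict(d):
    """Recursively copy nested dicts/defaultdicts into ordinary dicts."""
    return {
        key: to_plain_dict(value) if isinstance(value, dict) else value
        for key, value in d.items()
    }


spam = infinite_defaultdict()
spam['x']['y']['z'] = 1
print(to_plain_dict(spam))  # {'x': {'y': {'z': 1}}}
```

The copy loses the auto-creating behaviour, so it is only useful for display or for handing the data to something like json.dumps.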


A cleaner way of achieving the same effect is to ignore defaultdict entirely, and make a direct subclass of dict. This is based on Peter Norvig’s original implementation of defaultdict:


>>> class NestedDict(dict):
...     def __getitem__(self, key):
...         if key in self: return self.get(key)
...         return self.setdefault(key, NestedDict())
...
>>> eggs = NestedDict()
>>> eggs[1][2][3][4][5]
{}
>>> eggs
{1: {2: {3: {4: {5: {}}}}}}
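
The same behaviour can also be had through the `__missing__` hook, which dict's own `__getitem__` calls when a key is absent. This is a sketch of an alternative, not the implementation referred to above:

```python
class NestedDict(dict):
    def __missing__(self, key):
        # Called by dict.__getitem__ only when `key` is absent:
        # create a child of the same class, store it, and return it.
        value = self[key] = type(self)()
        return value


eggs = NestedDict()
eggs[1][2][3]
print(eggs)  # {1: {2: {3: {}}}}
```

Note that `eggs.get(key)` and `key in eggs` do not go through `__missing__`, so they can be used to test for a key without creating it.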

MongoDB

July 8th, 2010 § 0 comments § permalink

MongoDB (and nosql generally) is an appealing idea. The words written about it, though, are problematic: too much hype, too little documentation. That’ll change soon; we’re over the peak of the nosql hype cycle, into the trough. People are looking at the nosql systems they’ve eagerly implemented in recent months, noticing that they won’t solve every problem imaginable. For now, though, every blogpost with mongodb instructions is prefaced with grumbles about the lack of information.

So, I spent a ridiculous amount of time figuring out how to do grouping. I have a bunch of download logs, and want to break them down by country.
The simplest way I could find of doing this is:

db.loglines.group({'cond': {}, initial: {count: 0}, reduce: function(doc, out){out.count++; if(out[doc.country] == undefined){out[doc.country] = 0;}; out[doc.country] += 1;}});

Or, the version in pymongo:


> reduce_func = """function(doc, out){
out.total++;
if(out[doc.country] == undefined){
out[doc.country] = 0;};
out[doc.country] += 1;};
"""

> l.group(key = {},
condition = {},
initial = {'total':0},
reduce = reduce_func)
[{
u'AE': 215.0,
u'AG': 23.0,
u'AM': 140.0,
u'AN': 58.0,
u'AO': 56.0,
...
u'total': 87901
}]

[apologies for formatting; I’ve not really figured out how to edit js within a python repl]
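
For checking the result on a small sample, the same tally can be done in plain Python without touching the database. This is a sketch only: the sample documents are made up, and `tally_by_country` is a hypothetical helper mirroring the reduce function above.

```python
from collections import Counter

# Hypothetical sample documents; only the 'country' field matters here.
loglines = [
    {'country': 'AE'},
    {'country': 'AM'},
    {'country': 'AE'},
]


def tally_by_country(docs):
    """Mirror the reduce function: a running total plus one count per country."""
    out = Counter()
    for doc in docs:
        out['total'] += 1
        out[doc['country']] += 1
    return dict(out)


print(tally_by_country(loglines))  # {'total': 3, 'AE': 2, 'AM': 1}
```

Like the JavaScript version, this keeps the total in the same namespace as the country codes, which only works because no country code is 'total'.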

Patch: blogger post from stdin for googlecl

July 3rd, 2010 § 0 comments § permalink

Going to start hijacking this blog, to record/link to patches I submit to various open-source projects. As with everything else on here, it’s mainly to ensure I can find these little snippets a few months later.

So, to start, something intended for this blog itself. A patch to the google commandline tools enabling the “google blogger post” command to post content read from stdin (adding to the current options of supplying a string or a filename). Usage is the traditional ‘-’ in place of a filename.

This enables two pieces of functionality I’d find very useful:
A) filter content through other programs. e.g. using markdown to HTMLify my content:
$ markdown post.txt | google blogger post -
B) make a blogpost from within vim, by selecting my post content and piping it to googlecl

finding and editing

July 1st, 2010 § 0 comments § permalink

Search for files containing some text, open them in vim (one per tab)

 grep -l foo ./* | xargs vim -p

Alternatively, to get a single-line list that can be edited and then copy-pasted to a command-line:

grep -l foo ./* | xargs echo

There are more heavy-duty ways of removing lines in output listed here, but I see little reason for using them.

notify-send

June 20th, 2010 § 0 comments § permalink

Yet another linux trick I keep on forgetting…
To display a notification on the desktop from the command-line:
# apt-get install libnotify-bin
$ notify-send "hello world"

obv. “from the command-line” really means “from a script”, unless you’re in some Evil Dead situation of independently-mobile hands

[reason for looking: trying to get xmonad+dmenu to notify me when I mistype a command, rather than just failing silently]
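
For the scripted case, here is a minimal sketch of a wrapper that runs a command and pops up a notification if it fails. The function names are invented for illustration, and it assumes notify-send takes a summary followed by an optional body, as in the example above:

```python
import shutil
import subprocess


def notify_command(summary, body=""):
    """Build the notify-send argument list (summary, then optional body)."""
    cmd = ["notify-send", summary]
    if body:
        cmd.append(body)
    return cmd


def run_and_notify(command):
    """Run `command` (a list of arguments); notify on the desktop if it fails."""
    try:
        returncode = subprocess.call(command)
        problem = "exit status %d" % returncode if returncode else ""
    except OSError as err:  # e.g. a mistyped program name
        returncode, problem = 127, str(err)
    # Stay quiet if notify-send is not installed.
    if problem and shutil.which("notify-send"):
        subprocess.call(notify_command("Command failed: " + command[0], problem))
    return returncode
```

The 127 returned for a command that cannot be started follows the shell convention for "command not found"; that choice is mine rather than anything notify-send requires.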

markdown + vim

June 20th, 2010 § 0 comments § permalink

Since I’m spectacularly dim, it never occurred to me that I can run markdown from within vim. Select your text, run !markdown, and wham! bam! everything is replaced by its technicolor HTML twin.

June 8th, 2010 § 0 comments § permalink

A bit of Debian lore I always forget: finding which package is responsible for a certain file:

$ dpkg -S filename

e.g:
$ dpkg -S /usr/bin/lintian
lintian: /usr/bin/lintian

E-voting in Estonia

February 24th, 2010 § 0 comments § permalink

Certain small post-Soviet states have a tendency to be hooked into all the latest fads among global policy wonks. It’s an outgrowth of their size and history: ambitious young people who left for Western Europe or the US in the 90s have now returned and found themselves wealthy, skilled, and ready to govern. Georgia and Estonia, in particular, have been quick to dive into every technical/governmental trend, from twitter to linux to…e-voting.

As regards the latter, Estonia is forging ahead. 14% of votes in the European elections were cast online. 44% in the municipal elections in Tallinn – which, to judge by the percentage of Berlin’s technorati vanishing there for mysterious projects, must be turning into something of an electronic mecca.

Opening up a tax haven

December 26th, 2008 § 11 comments § permalink

Panama is still one of the biggest and most important tax havens. As well as its absurd tax regime, its corporate disclosure regime means it is very difficult to get information about Panamanian companies.
Or rather, it was. Panama recently put online their company registry. You can now retrieve the names of the current directors of every Panamanian company, as well as all the company’s filings themselves (minutes of company meetings, details of shareholdings, ownership, certificates of incorporation etc. etc.).
Nice, but you can only search by the name of the company. If you want to find somebody who is dodging tax or doing something else dubious, you really need to search by director’s name.
This tool fixes that problem. I’ve scraped all 600,000 company records, going back 30 years, and indexed by directors.

Now you can, for instance, look up recently-arrested arms dealer Monzer al-Kassar, and you find a couple of companies. Looking through the records, you find the company’s current treasurer is Hans-Ulrich Ming, chairman of Swiss firm Pax Anlage. Previous directors include Enrico Ravano, president of Contship, the Italian company that controls the Calabrian port of Gioia Tauro. A Feb 2008 report for the Italian parliament accused Ravano of complicity in cocaine smuggling by the Calabrian mafia through Gioia Tauro – the report cited Italian estimates that 80% of all Europe’s cocaine is smuggled through Gioia Tauro. Ravano’s connection to al Kassar could help to stand up accusations (which al Kassar has always denied) that al Kassar was involved in drug trafficking as well as weapons trafficking; and helps to undermine Ravano’s recent denials that he’s had anything to do with any trafficking of any sort. [This set of connections was in fact found manually, by Global Witness, and was part of the inspiration to build the search]
Or take Nadhmi Auchi: Iraqi-British billionaire, companion of Saddam Hussein in the ’50s, convicted of fraud in France (but appealing). I’ve not yet looked through the records of companies held by him and his friends – but there are plenty of records there, doubtless including some interesting connections.

And there are plenty more interesting names to look up. Most satisfyingly, it’s already proving useful in figuring out the activities of various currently-active arms dealers…
Want the raw data? Here is a database dump.

Westminster’s map

November 27th, 2006 § 0 comments § permalink

[Update: I finally got round to adding legends to the maps]
Which countries get talked about in parliament? With data from [They Work For You](http://www.theyworkforyou.com), I’ve put together these maps of where MPs like to talk about. Here’s the number of mentions a country has had in parliament recently, adjusted for population:



[Map: mentions per country, adjusted for population. Legend runs from few mentions to many mentions.]

Looking at this, I’m actually surprised at how globally-minded Parliament is. Sudan (pop. 34.2 million) gets 2,302 mentions; Germany (pop. 82.5 million) has only 3,695 mentions in parliament.
Far from being ignored, Africa actually gets mentioned well beyond its economic importance to the UK. South America, on the other hand, is basically ignored.
Then there’s the size bias: small countries get more mentions than big ones, once you adjust for population. Look at Mongolia: Westminster, it seems, finds Mongolians immensely more important than Chinese. The bias can partly be discounted as a problem with measurement: parliament is prone to lists of foreign relations and trade issues, for instance, which mention every country regardless of how small it is. Also, it’s possible MPs talk about areas within China or India, which I wouldn’t have picked up on.
But there’s more to it: larger countries really do get short-changed in the attention we give them. China has a population perhaps 150 times larger than that of Bolivia – but we don’t hear anything like 150 times as much news from China. We’re all biased by imagining a world made up of nations, and giving the same weight to nations of all sizes. Small islands got discussed an incredible amount – particularly places in the news, like Tuvalu and the Pitcairns, but others as well.
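
To make the adjustment concrete, here is the arithmetic for the two countries quoted above. The post doesn’t spell out the exact formula behind the maps, so mentions per million inhabitants is an assumption:

```python
# Figures quoted above: (mentions in parliament, population in millions).
countries = {
    "Sudan": (2302, 34.2),
    "Germany": (3695, 82.5),
}


def mentions_per_million(mentions, population_millions):
    """Assumed adjustment: raw mentions divided by population in millions."""
    return mentions / population_millions


for name, (mentions, population) in countries.items():
    print(name, round(mentions_per_million(mentions, population), 1))
# Sudan 67.3
# Germany 44.8
```

On this measure Sudan gets roughly one and a half times as much attention per head as Germany, despite fewer raw mentions.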


Memes: toxic in China

November 7th, 2006 § 0 comments § permalink

Remember the Free Hugs meme? Somebody in Australia started hugging people in the streets, it spread to Russia, Italy, Taiwan, Korea, Poland, and pretty much the rest of the world.
Then, some people in Shanghai tried it – and were promptly arrested.

[Image: Shanghai Free Hugs. Before the arrest, presumably]

The huggers were released after a couple of hours, but still: a big ‘meh!’ to the Chinese police.
[cross-post from livejournal]

Conference reloaded

October 3rd, 2006 § 1 comment § permalink

How can you develop a service without sharing a language with your users?
Holed up in Budapest, my head too messed up to do any proper work (eep! the doom she is a-coming!), I’ve been listening to danah Boyd’s keynote at the blogtalk conference that’s just winding up in Vienna.
She touches on the fact that the creators of Orkut don’t have the faintest idea what their Portuguese or Hindi-speaking users are doing. I’d always vaguely assumed that there would be a fair few Portuguese-speakers within the Orkut development team, for instance. But obviously not.
It’d be a nice little project for a journalist or an anthropologist, to work out how much the developers of these sites know about their users.

Where Am I?

You are currently browsing the Technology category at Dan O'Huiginn.