Some projects I'm involved in

And now to some interesting things, at least for some of the audience. Or so I hope...

PostgreSQL related software

This section covers software complementary to PostgreSQL that I take part in.

pg_staging

pg_staging aims to let you easily control a lot of dev, pre-live and snapshot databases, restored straight from your production backups, either interactively from the console or in scripts run from cron.

pgloader

The PostgreSQL parallel ETL, written in Python. This little piece of software is a layer atop the PostgreSQL COPY command. The difference between pgloader and plain COPY is the behavior in case of errors in the source data files and the range of input formats supported.

There is also the little fact that pgloader is able to parallelize its tasks, loading several files at once or even using more than one CPU to load a single source file.
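To make the error-handling difference concrete, here is a minimal sketch in Python. It is not pgloader's code: `copy_batch` is a hypothetical stand-in for an all-or-nothing COPY, and the bisecting strategy is an assumption chosen for illustration, as the text above does not say how bad rows are isolated.

```python
def copy_batch(rows, table):
    """Hypothetical stand-in for COPY: all-or-nothing, fails on any bad row."""
    for row in rows:
        if len(row) != 2 or not row[0].isdigit():
            raise ValueError(f"bad row: {row!r}")
    table.extend(rows)


def load_with_rejects(rows, table, rejects):
    """Load rows, splitting failed batches in half to isolate the bad rows."""
    if not rows:
        return
    try:
        copy_batch(rows, table)
    except ValueError:
        if len(rows) == 1:
            # A single failing row cannot be split further: reject it.
            rejects.append(rows[0])
            return
        mid = len(rows) // 2
        load_with_rejects(rows[:mid], table, rejects)
        load_with_rejects(rows[mid:], table, rejects)


# Made-up source data: two of the five rows are malformed.
source = [("1", "a"), ("2", "b"), ("x", "c"), ("4", "d"), ("5",)]
table, rejects = [], []
load_with_rejects(source, table, rejects)
print(table)
print(rejects)
```

With a plain all-or-nothing load, the two malformed rows would make the whole file fail. Here the good rows still get loaded and the bad ones are set aside for review.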

SkyTools: Londiste & PGQ for PostgreSQL replication

The SkyTools project is a suite of tools that makes it easy (I mean, really easy) to implement a replication solution, and I'm keeping a SkyTools page here.

You'll have to run two daemons for the replication to be operational: first a PGQ maintenance one (pgqadm.py), whose role is to manage the provider queue, then the replication daemon itself, londiste.py, which is a PGQ consumer implementation.

That means londiste.py is using PGQ as a transport and queuing mechanism, and "only" cares about replaying modifications on the subscriber(s).
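The split of roles between the queue and the consumer can be pictured with a toy model in Python. This is only a sketch: the names `provider_write` and `consume_batch` are made up for the illustration and have nothing to do with the actual PGQ or londiste APIs, and plain dictionaries stand in for the tables.

```python
from collections import deque

queue = deque()     # stands in for the provider-side queue
provider = {}       # provider table: key -> value
subscriber = {}     # subscriber copy of the same table


def provider_write(op, key, value=None):
    """Apply a change on the provider and queue an event describing it."""
    if op == "delete":
        provider.pop(key, None)
    else:  # "upsert"
        provider[key] = value
    queue.append((op, key, value))


def consume_batch(max_events=100):
    """Consumer role: take queued events in order and replay them."""
    replayed = 0
    while queue and replayed < max_events:
        op, key, value = queue.popleft()
        if op == "delete":
            subscriber.pop(key, None)
        else:
            subscriber[key] = value
        replayed += 1
    return replayed


provider_write("upsert", 1, "one")
provider_write("upsert", 2, "two")
provider_write("delete", 1)
n = consume_batch()
print(n, subscriber == provider)
```

The consumer never looks at the provider table itself. It only replays what the queue hands over, in order, which is the sense in which the replication daemon "only" cares about replaying modifications.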

PostgreSQL extensions

The big extension I'm working on is prefix, but I sometimes work on other stuff too. Some of it is not clean enough to get published (even in a public CVS)...

prefix search indexing, a PostgreSQL GiST module

This project is described in its own page. In short, prefix is about making prefix lookups fast. The prefix here is not the literal but the column value:

  SELECT * FROM prefixes WHERE prefix @> 'literal';
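To spell out what that query means, here is a toy model in Python, with made-up table contents: a row matches when the stored prefix is a prefix of the literal. The linear scan below only illustrates the semantics. Avoiding such a scan is the point of the GiST index.

```python
# Made-up contents of the "prefix" column.
prefixes = ["01", "012", "0123", "02", "1"]


def matching_prefixes(rows, literal):
    """Return the stored prefixes that the literal starts with, longest first."""
    hits = [p for p in rows if literal.startswith(p)]
    return sorted(hits, key=len, reverse=True)


matches = matching_prefixes(prefixes, "0123456789")
print(matches)
```

Sorting the longest match first is a choice made for this illustration, not something the query above does by itself.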

preprepare

The pre_prepare module aims to prepare all your statements as soon as possible, so that client queries don't have to bother at all and can just call EXECUTE. It's written in C as a PostgreSQL extension so that it can be preloaded at backend start, but local_preload_libraries currently cannot initialize SPI, so for now it could just as well be a simple plpgsql script.

btree_fr_ops

A BTree OPERATOR CLASS that uses the following hardcoded collation setting in the index, so that sorting and comparing follow it even when the cluster is in another locale:

 #define BTREE_FR_OPS_COLLATION "fr_FR.UTF8"

The extension defines its own set of operators and, in order not to step on the toes of the standard operators, does so in a specific schema. So you'll have to have this schema in your search_path in order to use the index.

It's yet to be proven correct and useful, but if you're interested it's online: btree_fr_ops.

PostgreSQL backports (available as extensions)

Sometimes you need newer features of PostgreSQL in past releases. In those cases, and when it's possible, I tend to backport the new feature and make the work available in the backports project on pgfoundry. And here's a maintained list of those.

Please note that I'm not making releases at the moment, for lack of interest and demand, so take the code from the CVS. As I'm using debian you should be able to build a package with:

  debuild -us -uc

min_update

This is part of the backports project and is indeed a backport of the suppress_redundant_updates_trigger that Andrew Dunstan wrote and committed into 8.4. The name of the trigger is intentionally different, in order to smooth the upgrade:

BEGIN;
 CREATE TRIGGER ... ON tablename ... suppress_redundant_updates_trigger();
 ALTER TABLE tablename DISABLE TRIGGER _min_update;
COMMIT;
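For readers who have not met this trigger, the idea behind it fits in a few lines of Python: an UPDATE that would leave the row unchanged is skipped. This is only a model of the behaviour, with rows as dictionaries, and the `min_update` function here is a hypothetical helper, not the extension's code.

```python
def min_update(old_row, new_row):
    """Return new_row if it differs from old_row, else None (skip the update)."""
    return None if new_row == old_row else new_row


table = {1: {"name": "a"}}
writes = 0
for new in ({"name": "a"}, {"name": "b"}, {"name": "b"}):
    row = min_update(table[1], new)
    if row is not None:
        table[1] = row
        writes += 1
print(writes)
```

Of the three updates attempted, only the one that actually changes the row results in a write.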

uuid

New in 8.3, the uuid datatype allows one to store UUID values in your database, compare them, index them, etc. So here is an 8.2 version of it, as an extension.

uuid-ossp

Now that you're able to store UUIDs, you want to generate them too, right? For this you need to link to an existing external library. The PostgreSQL project chose the ossp one, but it's not available on as many platforms as the main project is, so it's an additional supplied module in 8.3.

For this to work while being simple to install and configure (read, you have nothing more than debuild or make install to care about) I've duplicated some code from uuid into uuid-ossp. So now it's possible to SELECT uuid_generate_v4(); from an 8.2 database.

debian packaging

Debian has a nice developer QA page where you can see the packages I'm currently maintaining. If you have something simple I can get my hands on easily, and want it to get to debian, I'm open to discussion :)

Emacs

Emacs is made in a way that being a user of it means you're maintaining some code... and (of course) I've published some of it.

ClusterSSH Emacs Lisp implementation, cssh.el

ClusterSSH is about opening any number of xterms, running ssh to a different remote machine in each one of them, and giving the user a single input line to rule them all.

ClusterSSH.el is about doing the same without quitting emacs, using term.el as the terminal emulator and the ClusterSSH major mode for ruling them all. It's available in ELPA and developed at the github cssh repository.

rcirc groups mode

This mode maintains a *Group* like buffer for rcirc, the IRC client included in Emacs. It's called rcirc-groups.el.

switch-window

This is a visual replacement for C-x o, provided by dim-switch-window.el.

el-get

Of course, my emacs setup is managed in a private git repository. Some people on #emacs are using git submodules (or was it straight import) for managing external repositories in there, but all I can say is that I frown on this idea. I want an easy canonical list of packages I depend on to run emacs, and I want this documentation to be usable as-is. Enter el-get!

(setq el-get-sources
      '((:name bbdb
               :type git
               :url "git://github.com/barak/BBDB.git"
               :load-path ("./lisp" "./bits")
               :info "texinfo"
               :build ("./configure" "make"))

        (:name magit
               :type git
               :url "http://github.com/philjackson/magit.git"
               :info "."
               :build ("./autogen.sh" "./configure" "make"))

        (:name vkill
               :type http
               :url "http://www.splode.com/~friedman/software/emacs-lisp/src/vkill.el"
               :features vkill)

        (:name yasnippet
               :type git-svn
               :url "http://yasnippet.googlecode.com/svn/trunk/")

        (:name asciidoc         :type elpa)
        (:name dictionary-el    :type apt-get)
        (:name emacs-goodies-el :type apt-get)))

(el-get)

So now you have pretty good documentation of the packages you want installed, where to get them, and how to install them. For the advanced methods (such as elpa or apt-get), you basically just need the package name. When relying on a bare git repository, you need to give some more information, such as the URL to clone and the build steps if any. Then there are also the features to require and maybe the location of the package's texinfo documentation, for automatic inclusion into your local Info menu.

The good news is that not only do you now have a solid readable description of all that in a central place, but this very description is all (el-get) needs to do its magic. This command will check that each and every package is installed on your system (in el-get-dir) and, if that's not the case, it will actually install it. Then it will init the packages: that means taking care of the load-path, the Info-directory-list (and dir texinfo menu building), the loading of the emacs-lisp files, and finally requiring the features.