Toxic Elephant

Don't bury it in your back yard!

How to Trim Spaces in TeX

Posted by matijs 20/07/2006 at 19h27

The problem

You created a macro package for LaTeX and promised that people could write either

\synttree[a label]

or

\synttree[ a label ]

and the result would be the same. Now your buggy macro for trimming spaces stands in the way of a much-needed bug fix.

In short, you need a macro \trim that strips spaces from the beginning and end of its argument. (The name says it all, doesn’t it?)


Amazon S3 versus rsync

Posted by matijs 08/06/2006 at 11h15

JungleDisk is a new tool that uses
Amazon’s S3 as a storage device but appears to the user as just another
disk. It is a closed source, open standard application, seen locally as a
WebDAV server, so it interfaces with most desktop file managers.

I can’t get it for my Linux-on-iBook system, but I suppose if JungleDisk
catches on, someone will come along and do a Free reimplementation (source
code for retrieving files can already be downloaded from JungleDisk’s
site). The question then becomes: Do I want to use it?

I myself use rsync to back up my files to a friend’s machine on the other
side of the ocean. JungleDisk doesn’t do rsync. It’s just a disk, so you
have to consciously copy your files there. That’s fine if you just want to
store some files on-line. For backups, on the other hand, I want an
automated solution.

The following scheme might work: Compute md5 and/or sha-1 hashes of all the
files sent to S3. The files stored there are simply indexed by hash, and we
keep a mapping of hashes to directory nodes. This way, a file that is moved
locally doesn’t have to be uploaded again; only the mapping changes. The
mapping has to be uploaded to S3 as well, of course. For more fine-grained
upload control, files can even be divided into equal-sized blocks, so that
only changed blocks have to be re-uploaded.
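
To make the scheme above a bit more concrete, here is a minimal sketch in Python. It is not an S3 client: FakeStore is a hypothetical in-memory stand-in for the remote bucket, and the names backup, restore and split_blocks, as well as the tiny block size, are made up for illustration.

```python
import hashlib


class FakeStore:
    """In-memory stand-in for a remote bucket (hypothetical, not the S3 API)."""

    def __init__(self):
        self.objects = {}   # hash -> block contents
        self.uploads = 0    # how many blocks were actually sent

    def has(self, key):
        return key in self.objects

    def put(self, key, data):
        self.objects[key] = data
        self.uploads += 1


def split_blocks(data, block_size):
    """Cut data into equal-sized blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def backup(files, store, block_size=4):
    """Send unseen blocks, keyed by SHA-1, and return a path -> hashes mapping.

    `files` maps a path to that file's contents as bytes.
    """
    mapping = {}
    for path, data in files.items():
        hashes = []
        for block in split_blocks(data, block_size):
            digest = hashlib.sha1(block).hexdigest()
            if not store.has(digest):   # only upload what is not there yet
                store.put(digest, block)
            hashes.append(digest)
        mapping[path] = hashes
    return mapping


def restore(path, mapping, store):
    """Reassemble a file from its list of block hashes."""
    return b"".join(store.objects[h] for h in mapping[path])
```

Backing up the same contents under a new path sends nothing new; only the mapping changes, and that mapping would itself have to be stored remotely. Note that equal-sized blocks only help with in-place changes: inserting a byte near the start of a file shifts every later block, so all of them would be uploaded again.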

As an aside, I wonder if S3 allows you to check md5 or sha-1 hashes of the
files stored there, or if there is some other way to check that the files
there are the same as the files here.
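
Whatever the remote side turns out to offer, the local half of such a check is simple. The sketch below assumes the service can report a hex digest for a stored object, possibly wrapped in double quotes; whether and how S3 does that is exactly the open question, so treat remote_hex as a hypothetical input. The function names are made up for illustration.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=65536):
    """Hash a file in chunks so that large files need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_as_remote(path, remote_hex, algorithm="md5"):
    """Compare the local digest with a hex digest reported by the remote side."""
    return file_digest(path, algorithm) == remote_hex.strip('"').lower()
```

A backup run could call same_as_remote for each file and re-upload only those for which it returns False.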


Misc MSGConvert Stuff

Posted by matijs 01/04/2006 at 14h09

MSGConvert Updated

A user alerted me to the fact that a .MSG file containing another .MSG file
as an attachment isn’t converted properly by my converter. It turns out
that Outlook stores such attachments in a different way.

MSGConvert now also converts such files properly. As a bonus, it
also works with perl versions before 5.6 now.

Outlook2Mac

Another reader pointed out some time ago that a program I pointed to in my
FAQ, Outlook2Mac, didn’t work very well for him. It
is meant to be used for converting large amounts of email from Outlook to
mbox, but it seems to have trouble with certain dates and encrypted
messages. Also, some messages went missing.

I haven’t tried this program
myself, but others have. John Tolva writes:

I had to extract several year’s worth of e-mail with attachments using
a variety of filters and applications (Outlook2vCal and Outlook2Mac) to
create mbox files. This was the least painful part of the transition and it
took me days.

And the Applepie blog says:

I turned to a little program called Outlook2Mac. It costs $10 but does
a decent, albeit primitive, job of transcoding my PST into Apple Mail Mail
Boxes. […] Transcoding was fast, but it was often interrupted by a
message box asking me if I wanted Outlook2Mac to fix the send date of
certain message, or if I wanted Outlook2Mac to save a digitally signed
message to an insecure file format.

In a more enthusiastic tone, Eric Evers writes:

Ten bucks yields switcher bliss!

I have not found any complaints of missing messages.

Still, I have removed the link from my FAQ, also because it is not Free
Software, and it requires Outlook to be installed, which sort of defeats
the purpose: If you have Outlook installed, there are
easier techniques available.


Creating Type1 Fonts Directly From Metafont Sources

Posted by matijs 10/07/2005 at 12h47

I take a great interest in type design. I'm not sure when it started, but no doubt it has something to do with the following: A long time ago, I invented a writing system for the language of my imaginary country. Since then, I've created several fonts for this writing system. Since I'm only an amateur, I can't afford to buy any of the professional tools for font making, so I have had to resort to free tools (for varying definitions of free).


Updated synttree Package

Posted by matijs 11/06/2005 at 15h45

It's been available on CTAN for a while now, so it's about time I made a formal announcement: I have updated my LaTeX package for typesetting syntactic trees. In this new version, the allowed number of branches has been increased from three to unlimited (Yay!).

To make this update, I had to refactor the code that was dedicated to zero, one, two and three branches into a more generic version. I must say, I've always enjoyed modifying code to make it more manageable, but refactoring becomes very satisfying once you have a good test suite to check that the code keeps doing the same thing, and a version control system to keep your known good versions. (This is no new insight, of course; it's right there in the book.) This is a highly recommended technique!


A LaTeX package for typesetting syntactic trees

Posted by matijs 07/11/2004 at 01h40

While studying linguistics, I often had to put pictures of syntactic trees in my papers about generative grammar. I made several systems to draw these automatically. One of these was a LaTeX package called synttree.

I actually uploaded the version from 1998 to CTAN. That version was clumsy, and the code was hard to read, a result of my struggles with TeX. I have always wanted to clean the package up, so it would be easier to extend it if needed.

In 2001 or so, I read an article from TUGboat that showed me how to write a parser that I could actually understand after I had written it. So, I wrote the new parser, and sort of bolted it onto the existing drawing macros.

Two days ago I finally came back to it. Now I have a new working version, cleaner and much more extensible, ready to be uploaded to CTAN.

[Update Dec 9, 2004: The new version can also be found at CTAN.]

[Update Jun 11, 2005: Information on downloading and using the package has been moved to the synttree page.]


Getting Bryar to work: A Summary

Posted by matijs 11/10/2004 at 12h40

As I mentioned, installing Bryar was not
completely straightforward. Luckily, it is architected well, so making the
necessary modifications was easy.


Tips for Gimp-Perl

Posted by matijs 11/10/2004 at 12h05

After my image filtering annoyance day, I spent some more time on the problem, with more positive results.

First, I found out how to do pixel manipulations properly. It needs some incantations that are not in the man pages, but are in the source to Gimp's plugins written in C. For the basic framework of a plug-in, see Gimp-Perl's documentation and examples. For the rest, see below.

To start, of course, you have to load the right modules.

use Gimp ":auto";
use Gimp::Fu;
use Gimp::Feature qw(pdl);
use PDL;

Then, the sub that actually does the work should look something like this:

sub do_something {
  my ($img, $dwb) = @_;

  my $w = $dwb->width;
  my $h = $dwb->height;
  my $gdrawable = $dwb->get;

  # make sure we can undo in one step.
  gimp_image_undo_group_start($img);

  # Read values from the source region, and write them to the
  # destination region. The destination region has its dirty and shadow
  # bits set. 
  my $src_rgn = $gdrawable->pixel_rgn(0,0,$w,$h,0,0);
  my $dst_rgn = $gdrawable->pixel_rgn(0,0,$w,$h,1,1);

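  # Get pixel data as a 'piddle'. $some_x, $some_y, $some_w and $some_h
  # are placeholders for the rectangle you want to process; use
  # 0, 0, $w, $h for the whole drawable.
  my $rect = $src_rgn->get_rect($some_x,$some_y,$some_w,$some_h);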

  # Do something with $rect's data.

  # Set pixel data
  $dst_rgn->set_rect($rect, $some_x, $some_y);

  # Magic incantations found in the C destripe plug-in.
  $gdrawable->flush();
  $dwb->update(0,0,$w,$h);
  $dwb->merge_shadow(1);
  gimp_displays_flush();

  # close the undo group opened above.
  gimp_image_undo_group_end($img);
  ();
}

Be sure to read the documentation for the PDL module. It explains how to manipulate the piddle with the pixel data. It helps to print part of the piddle now and then, or its dimensions (using the dims function).

Unfortunately, the method I had come up with to destripe my images didn't exactly work right. So, I went searching again. This time, I found a page describing an easy destriping method using the Gimp.

Since the method has several steps, I decided to automate it with a plug-in. I am quite happy with the result: It works on the selected layer even if that layer is invisible, it handles errors gracefully, and it restricts itself to the selection if one is present.


More Bugfixes

Posted by matijs 31/05/2004 at 18h00

I fixed up MSGConvert some more. All outstanding bug reports were taken care of.


Bugfixes

Posted by matijs 20/05/2004 at 18h00

I spent last week in Switzerland, where my mother lives, and this gave me time in between enjoying the beautiful outdoors to fix two bugs in MSGConvert. My earlier rewrite already paid off in making the less trivial one much easier to fix.
