Kragen Javier Sitaker | 16 Jul 09:37 2012

choosing powerful primitives for a simplified computing system

One of the basic principles of the [FONC/STEPS system] [STEPS] is that
you can reduce the overall complexity of software by choosing powerful
primitives that simplify many parts of a system.  OMeta is sort of their
poster child: a generic rewriting system powerful enough to write
simple, efficient parsers, but also compiler optimizers and code
generators.

[STEPS]: http://vpri.org/fonc_wiki/index.php/Glossary

Here are some candidates for other primitives which, in practice,
simplify other parts of a computing system by more than the complexity they
add.

LZ77 obsoletes curses and other things
--------------------------------------

Using zlib to compress your data streams and files greatly reduces the
performance penalty for not optimizing your file formats and protocols
to reduce file and message size.  Here are some examples.

### curses ###

At the Barracas [hacklab][] tonight, I was talking about why curses, the
programming library, is useless today.  Curses goes to a lot of effort
to minimize the number of bytes sent to your terminal, because at 300 or
2400 or 9600 baud, redrawing an entire 80×24 screen would take 66 or 8
or 2 seconds.  So it sought the minimum amount of change to your screen
to get it to match its internal screen image.
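To illustrate the point: if zlib is compressing the stream, naïvely retransmitting the entire screen costs only a handful of bytes per update, because the previous frame is still in the compressor's 32 KB window and LZ77 back-references cover it. A minimal sketch (the 80×24 frames and the clock update are invented for the example):

```python
import zlib

# Two nearly identical 80x24 text screens; only the clock cell changes.
frame1 = b" " * (80 * 24)
frame2 = b"12:34" + frame1[5:]

comp = zlib.compressobj()
first = comp.compress(frame1) + comp.flush(zlib.Z_SYNC_FLUSH)
second = comp.compress(frame2) + comp.flush(zlib.Z_SYNC_FLUSH)

# The second full redraw compresses to a few bytes: LZ77 finds the previous
# frame in its window, so no curses-style screen diffing is needed.
print(len(frame2), len(first), len(second))
```

Z_SYNC_FLUSH keeps the compression stream open between frames, which is what lets later frames reference earlier ones.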

[hacklab]: http://lab.hackcoop.com.ar/
(Continue reading)

Kragen Javier Sitaker | 15 Jul 07:15 2012

how to do a better predictive text input method for Android

The SwiftKey X trial just expired on me on the Android phone loaner
I’m using from my work.  I still have the keyboard layout, but the
predictive text no longer works.  This makes it a real bummer to use
the machine.

I guess it serves me right for depending on proprietary software.  I
should have known better.

So it occurred to me to think about what would be needed to replace
it with free software.

The first thing to note is that SwiftKey X can result in major
disclosures of confidential information.  If you borrow someone else’s
Android phone to check your Facebook account or whatever, and you
start accepting the default suggestion over and over again, it will
frequently reproduce substantial chunks of text from their history.
If they used SwiftKey to enter confidential information into their
phone, you may end up seeing bits of it purely by accident if you use
some of the same words.

So, given that this is apparently acceptable, a quick-and-dirty input
method that works more or less the same way could be done as follows:

1. Use one of the n-gram datasets published by Google (e.g. the 2009
   books dataset from <http://books.google.com/ngrams/datasets>; in
   English, 2 gigabytes raw compressed for 1-grams, 26 gigabytes raw
   for 2-grams, 88 gigabytes raw for 3-grams; other languages
   available are French, Chinese, German, Hebrew, Russian, and
   Spanish, but not Arabic or Hindi; there’s also the smaller 2006 web
   dataset,
(Continue reading)
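The heart of step 1, once an n-gram table is loaded, is plain frequency lookup. A toy sketch, with a made-up corpus standing in for the Google dataset:

```python
from collections import Counter, defaultdict

# Tiny stand-in for the Google n-gram data: bigram counts from a toy corpus.
corpus = "the cat sat on the mat the cat ate the rat".split()
bigrams = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    bigrams[a][b] += 1

def suggest(prev, k=3):
    """Return the k most frequent next words after `prev`."""
    return [w for w, _ in bigrams[prev].most_common(k)]

print(suggest("the"))  # ['cat', 'mat', 'rat']
```

A real input method would back off from 3-grams to 2-grams to 1-grams when a context is unseen, and would learn from the user's own typing, but the lookup itself is this simple.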

Kragen Javier Sitaker | 4 Jul 01:24 2012

washing machines don't have to use energy to heat water

I recently saw an article that claims that heating water to 40° for washing
laundry consumes around 5–10kWh per load.

However, it turns out that heating up water doesn't consume energy.  You need
energy to do it, to be sure — but that energy is still in the water after you
heat it up.  The Carnot limit prevents you from recycling most of that heat
into exergy to, say, drive the washing machine motor, but nothing is preventing
you from transferring that heat from dirty water into clean water, or into a
heat reservoir that holds it until your next wash.

First, is such a reservoir possible?  Most definitely.  You can buy
fractionated paraffin wax that melts at more or less whatever temperature you
want, to a few degrees of precision, with a heat of fusion near that of water
ice.  A big Thermos inside the washing machine, full of paraffin or another
phase-change material, could hold a laundry load's worth of heat for several
days, if not a week.  But how to get the heat into it?
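For scale, a back-of-envelope check of how much heat one load actually holds. The load size, temperatures, and heat of fusion below are my assumed numbers, not figures from the article:

```python
# Back-of-envelope: heat in one warm wash load, and the paraffin needed
# to bank it until the next wash.
water_kg = 20.0              # assumed: one load of wash water
specific_heat = 4186.0       # J/(kg*K), liquid water
delta_T = 40.0 - 15.0        # assumed: heated from 15 degC tap water to 40 degC
heat_J = water_kg * specific_heat * delta_T
print(heat_J / 3.6e6, "kWh")  # ~0.58 kWh actually stored in the water

paraffin_fusion = 200e3      # J/kg, typical paraffin heat of fusion
print(heat_J / paraffin_fusion, "kg of paraffin to hold one load's heat")
```

Note that ~0.6 kWh per load is well below the 5–10 kWh figure quoted above, which is part of why recycling the heat is so attractive.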

The key is a clever little machine called a "countercurrent heat exchanger": in
its simplest form, two parallel long pipes that have been welded together, with
water, or some other fluid, flowing in opposite directions through them.  As
the hot water flows in one direction, it loses heat to the cold water flowing
in the other direction — and the cold water, you might say, loses its cold to
the hot water.  When the formerly hot water exits, it's just a little warmer
than the cold water going in, and similarly, the formerly cold water exits just
a little cooler than the hot water was originally.
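In the balanced case, with equal flow rates on both sides, the standard effectiveness–NTU result for a counterflow exchanger is ε = NTU/(1 + NTU), so a long enough exchanger recovers almost all the heat. A quick illustration, with assumed inlet temperatures:

```python
# Balanced counterflow heat exchanger: effectiveness = NTU / (1 + NTU).
NTU = 9.0                          # assumed exchanger size (transfer units)
eff = NTU / (1 + NTU)              # 0.9 for NTU = 9
t_hot_in, t_cold_in = 40.0, 15.0   # assumed: dirty wash water vs. clean inflow
dT = t_hot_in - t_cold_in
t_hot_out = t_hot_in - eff * dT    # formerly hot water leaves near 17.5 degC
t_cold_out = t_cold_in + eff * dT  # formerly cold water leaves near 37.5 degC
print(t_hot_out, t_cold_out)
```

That is exactly the behavior described above: each stream exits close to the other's inlet temperature.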

The countercurrent heat exchanger is part of many animals (a nose is a variant
that needs only one tube, which acts as a heat reservoir, and is the reason you
don't get hypothermia just from breathing) and its use in human engineering
goes back decades, if not centuries.  And indeed devices like cement kilns and
(Continue reading)

Kragen Javier Sitaker | 28 Jun 03:49 2012

Meghan Saweikis on the development of human society

Meghan Saweikis wrote the following; I thought it was really excellent,
so with her permission, I am posting it here.  Maybe it should go to
kragen-fw instead, but kragen-fw is mostly dead, really.

I think the issues touched on in this mail are among the most important
issues for every human being to think about.

---

I’ve been thinking a lot about what you said and about how I quantify
“net positive” vs. “net negative” impact on the world. I agree with
you that the objective level of suffering may be decreasing. I agree
that there is increasing literacy, autonomy and access to basic health
care services. I agree that these are positive things. I agree with
you that humanity as a whole has accomplished these positive
things. But, I think humanity as a whole has also created some
significant problems (both for other humans and for the
ecosystem). I’m not sure I think the benefits you mention are positive
enough to suggest that humanity in general is a “net positive.” And
even if humanity in general is a “net positive,” I don’t think that
implies that any particular individual (being a part of humanity)
would necessarily be a net positive.

As far as literacy goes, I think literacy is often equated with
opportunity and in many ways that is why increasing literacy has been
so important. In the past, I think this was a relevant correlation
because literacy was generally both necessary and sufficient to
provide a living wage and economic opportunity. Currently in the US
nearly 10% of individuals between 18-64 with basic literacy (measured
as high school diploma) live in families with incomes below a living
(Continue reading)

Kragen Javier Sitaker | 14 Jun 09:37 2012

FeML: a skeleton of a femto-ML with nothing but polymorphic variants and functions

A year or two ago, Darius Bacon and I were talking about making a sort
of variant of Scheme with ML-like pattern-matching as the basic
primitive.  You have a slightly more complicated lambda, but you don’t
need cond, car, cdr, eq, or null? in the base language, just cons or
quasiquote, plus the pattern-matching lambda and either labels or
define.

For example:

    (define assoc 
      (lambda (key ((key . value) . more)) (cons key value)
              (key (x . more))             (assoc key more)
              (key ())                     '()))

Here `lambda` takes an arbitrary set of pattern-action pairs and
matches the patterns in order against the argument list, evaluating
and returning the action expression for the pattern that matches.  In
this case, first we check to see if the key is the caar of the list of
key-value pairs, and if so, we return its (reconstructed) car;
otherwise, if there are more items in the list, we recurse down the
list; otherwise, if the list is the empty list, we return the empty
list.

(It’s not clear what happens if you say `(assoc 3 4)`.  Should that be
an error, or should it have unspecified behavior to permit compiler
optimization?)

This is awfully close to Prolog, although you wouldn’t do it this way
in Prolog:

(Continue reading)

Kragen Javier Sitaker | 12 Jun 12:39 2012

short-term predictions: 2014, 2017, 2022

The release of Meteor in April reminded me about an abandoned project of my own
called Kogluktualuk, which was basically Meteor, but never implemented to any
significant extent. It was kind of obvious that something like Meteor needed to
exist. So I thought I would record some other predictions about things that
haven't happened yet, to see if my foresight is really as good as it seems in
hindsight. After all, it's easy to fool yourself into thinking about only your
correct predictions, forgetting the stupid ones. So this is, like debugging, a
sort of exercise in therapeutically feeling stupid. 

---

Two years from now, most iOS applications will be written in Cordova (formerly
PhoneGap) or a successor, rather than in ObjC. This is both because JS is a much
more productive language than ObjC and because it can target Android as well. 

Within 5 years, P2P protocols will resurge in importance. This is despite the
massive move from desktop and laptop computers to handheld computers running
iOS and Android and using cellphone networks. The driver will be better
connectivity and crackdowns on user-generated content on centrally-operated
network services like YouTube and Megaupload. 

When automated fabrication—the scenario where you get your next bicycle by
downloading bicycle blueprints over the network and sending them to a machine
that then produces a bicycle for you without human intervention—happens, it
will not be by means of 3-D printers, which work by depositing layers of a
small number of materials. Instead, it will take the form of automated assembly
by robots of parts mostly made by other means, such as laser cutting, torch
cutting, CNC machining, and planar printing processes. 

More and more communication between people will be mediated by computers. 
(Continue reading)

Kragen Javier Sitaker | 26 Apr 09:37 2012

math for 3-D printers without any linear actuators

How do you control a polar 3-D printer?

Like, you have a turntable with your workpiece on it, and another
turntable that moves your tool, that can swing it over the center of
the first turntable.  And you can move your other turntable up and
down.  Now how do you convert x and y coordinates into angular
positions?

Perhaps a handy thing to know is that the polar equation r = sin θ
produces a circle running through the origin, retracing the same
circle when θ is in [0, π] and in [π, 2π], with its center at r = ½, θ
= ½π.  So, given the radius r at which a point is located on the
workpiece turntable, you can rotate the workpiece turntable to any θ
such that sin θ = r, such as sin⁻¹ r, to give the tool turntable a
chance to hit the point.  Then it’s just a matter of finding the angle
for the tool turntable, a simple atan2.

In short, starting from x, y on the workpiece turntable, scaled so
that the distance between turntable centers is 1:

    θw = sin⁻¹ √(x² + y²)
    s = sin θw
    c = cos θw
    θt = atan2(1 - (c·x - s·y), s·x + c·y)

If you’re trying to design a toolpath, it may be worthwhile to take
advantage of the other alternatives available for θw.
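A direct transcription of the formulas above, in radians; this only evaluates the post's equations as given, with the center-to-center distance scaled to 1, and assumes whatever angle conventions the derivation uses:

```python
import math

def turntable_angles(x, y):
    """Workpiece and tool turntable angles for point (x, y), per the post."""
    theta_w = math.asin(math.hypot(x, y))   # any angle with sin(theta_w) = r
    s, c = math.sin(theta_w), math.cos(theta_w)
    theta_t = math.atan2(1 - (c*x - s*y), s*x + c*y)
    return theta_w, theta_t

tw, tt = turntable_angles(0.5, 0.0)
print(tw, tt)
```

Note that `asin` only returns one of the valid θw values; as the last paragraph says, a toolpath planner may prefer one of the alternatives.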
-- 
To unsubscribe: http://lists.canonical.org/mailman/listinfo/kragen-tol
Kragen Javier Sitaker | 23 Apr 22:42 2012

printing microfilm on ordinary laser printers on paper

#!/usr/bin/python
# -*- coding: utf-8 -*-
"""Compute proportional-font print size of a text.

The laser printer at my new workplace is 600dpi in both directions.
It prints on Letter-size or similar paper: 216×279 mm.

Simple multiplication yields a capacity of 33.6 megabits per page, or
about 4 megabytes.

At 600dpi, a 4×6 pixel character cell like the one I use in
<http://canonical.org/~kragen/sw/dofonts-1k.html> gives you an 80×66
page of 13.5 mm × 16.8 mm.  (Janne Kujala designed the font.) If you
can successfully control every pixel, the result should be clearly
readable with a magnifying glass.  (If we consider 5-point text as the
lower limit of comfortable readability, and these 6-pixel-tall
characters are 1/100 inch, you need 7× magnification to make the text
comfortably readable.)

Further calculation suggests that the printed sheet will contain an array
fitting 16 such reduced pages horizontally and 16.6 of them
vertically; practically this is probably 15 horizontally and 16
vertically, or 240 pages, 480 pages on the two sides of the paper, or
31680 80-column lines.  This is on the order of a megabyte and a half
of text, assuming an average of about 50 bytes per line.  You could
print the King James Bible on four sheets of paper.
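The arithmetic above, checked in a few lines:

```python
# Reproducing the post's numbers for a 600 dpi, 216 x 279 mm sheet.
dpi = 600
w_in, h_in = 216 / 25.4, 279 / 25.4       # sheet size in inches
megabits = dpi * w_in * dpi * h_in / 1e6  # ~33.6 Mbit, about 4 MB per side

page_w_mm = 80 * 4 / dpi * 25.4           # 80 columns of 4-px cells: ~13.5 mm
page_h_mm = 66 * 6 / dpi * 25.4           # 66 lines of 6-px cells: ~16.8 mm
pages_per_sheet = 15 * 16 * 2             # both sides: 480 reduced pages
lines = pages_per_sheet * 66              # 31680 80-column lines
print(round(megabits, 1), round(page_w_mm, 1), round(page_h_mm, 1), lines)
```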

In this form, the Rosetta Disk's 13000-page archive would require some
260 sheets of paper, the size of an average hardcover book.  Their
metal disk is probably more durable than the paper, but the paper
(Continue reading)

Kragen Javier Sitaker | 13 Apr 06:30 2012

backtracking HTML templating

This is an idea originally from 2010-04-08 that I just never got
around to publishing until now.

Now that I’ve written the below, I’m tempted to try to hack this
together tonight, but I think I should probably sleep instead, and
maybe do this on the weekend.

Motivation
----------

This week I’ve been working on a bibliographic website, which, among
other things, renders citations into HTML.  And this is resulting in
me writing a lot of HAML templates that say things like:

    - if @publication.booktitle?
      In #{@publication.booktitle}.
    - if @publication.month?
      #{@publication.month} #{@publication.year}.

The general pattern here is that there are one or more properties that
need to be present, and if they’re present, we format them together,
along with some other window dressing intended to format them
properly.  But this involves a lot of very local duplication in the
code, and even though that duplication is local, it is still
error-prone; consider what happens in the above case if the year is
missing.
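One non-backtracking way to factor out the duplication is a helper that emits a fragment only when every field it mentions is present. A hypothetical sketch (in Python rather than HAML; `maybe` and the sample record are inventions for illustration):

```python
# Emit a formatted fragment only if all of the named fields are present.
def maybe(fmt, record, *fields):
    vals = [record.get(f) for f in fields]
    return fmt.format(*vals) if all(v is not None for v in vals) else ""

pub = {"booktitle": "Proc. Examples", "month": None, "year": 2012}
out = maybe("In {}. ", pub, "booktitle") + maybe("{} {}. ", pub, "month", "year")
print(repr(out))  # 'In Proc. Examples. ' -- month missing, so that clause drops
```

This handles the missing-year case correctly by construction: the month/year clause appears only when both fields exist.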

Now, aside from the question of whether there are already existing
BibTeX HTML formatters for Rails, this is an interesting kind of
problem to solve.
(Continue reading)

Kragen Javier Sitaker | 9 Apr 10:03 2012

dithering by optimizing perceptually-weighted error in the Gabor transform

The spatial responses of neurons in the first stages of the human visual system
are well approximated by Gabor features (the wavelet that arises from
multiplying a spatial sinusoid sin(ω · (d · [x y])), for some direction d,
by a Gaussian), and the visual system is more sensitive to some
frequencies of these Gabor features than others.  (Interestingly, amblyopic
vision is sensitive to a different set of spatial frequencies.) JPEG, JPEG2000,
and MPEG compression exploit this feature of visual perception by spending more
bits on spatial frequencies that are likely to be perceptually salient, given a
typical viewing distance and pixel density.
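Such a feature is easy to construct numerically; a minimal sketch, with arbitrary size, frequency, and orientation parameters:

```python
import numpy as np

# 2-D Gabor kernel: a sinusoid of frequency omega along direction theta,
# windowed by an isotropic Gaussian.
def gabor(size=31, omega=0.5, theta=0.0, sigma=5.0):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    u = x * np.cos(theta) + y * np.sin(theta)   # dot(direction, [x y])
    return np.sin(omega * u) * np.exp(-(x**2 + y**2) / (2 * sigma**2))

g = gabor()
```

Convolving an image with a bank of these at different frequencies and orientations gives the transform whose coefficients the dithering error would be weighted in.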

(I'm going to stop saying "spatial frequencies".  All the frequencies in this
post are spatial; I'm not going to talk about how many terahertz is red or how
frequently you microsaccade.)

Dithering or halftoning is another sort of lossy data compression problem.  In
a typical case, given a picture with, say, 9 bits per pixel of luminance
information, you want to produce another picture with only 1 bit per pixel, but
which nevertheless appears identical to the original, to a human viewer.  This
is similar to the problem JPEG solves, but with the additional constraint that
you don't have the freedom to specify the "decompression algorithm", because
biology has already determined it for you.

There is clearly a tradeoff here between different frequencies: the bits you
use to encode higher frequencies are not available to encode lower frequencies,
and vice versa.  This implies that there's an unavoidable tradeoff between
precisely locating edges and precisely rendering levels of brightness.  But
that doesn't mean that standard algorithms are Pareto-optimal in these
tradeoffs, and because of the human visual system's varying sensitivity to
errors at different Gabor frequencies, even Pareto-optimal algorithms could
possibly be improved.
(Continue reading)

Kragen Javier Sitaker | 2 Apr 18:11 2012

GitHub's "responsible disclosure" stance is irresponsible

GitHub has generally been exemplary in their handling of crises, service
interruptions, and security, but there has recently arisen a significant
exception, one where I would like to see GitHub correct their irresponsible
behavior.

Four paragraphs explaining the `attr_accessible` problem
--------------------------------------------------------

Ruby on Rails by default allowed web requests to update arbitrary database
fields unless those fields were declared `attr_protected`, or unless some other
fields in the same table ("model") were declared `attr_accessible`.  

This means that, unless you're extremely careful, your Rails application would
allow random users to change things in your database that they shouldn't be
able to change or even see, just by guessing the names you'd given them and
typing those names into their browser.  Egor Homakov reported this as a bug in
Rails.  The Rails people argued it wasn't a bug, because people would be
extremely careful, unless they were stupid.  
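The vulnerability class is easy to state in any language: if a record is updated straight from request parameters, the client chooses which columns change. A language-neutral sketch (in Python, not Rails; `ACCESSIBLE` plays the role of `attr_accessible`):

```python
# Mass assignment in miniature: updating a record straight from request
# parameters lets the client set any field, unless a whitelist intervenes.
class User:
    def __init__(self):
        self.name = "alice"
        self.admin = False

ACCESSIBLE = {"name"}  # the moral equivalent of attr_accessible

def unsafe_update(user, params):
    for k, v in params.items():
        setattr(user, k, v)      # the client controls every field name

def safe_update(user, params):
    for k, v in params.items():
        if k in ACCESSIBLE:      # only whitelisted fields may change
            setattr(user, k, v)

u = User()
unsafe_update(u, {"name": "mallory", "admin": True})
print(u.admin)   # True: privilege escalation via a guessed field name

u2 = User()
safe_update(u2, {"name": "mallory", "admin": True})
print(u2.admin)  # False: the admin change is silently dropped
```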

The Rails project keeps its source code on GitHub, which is written in Rails,
by some of the world's top Rails experts.  Homakov found a place where they had
forgotten to use `attr_accessible` and used it to modify the Rails
project, adding a file explaining that even the best Rails developers made that
mistake.

The Rails project fixed the problem.

GitHub's irresponsible response
-------------------------------

(Continue reading)

