Planet

May 21, 2013
Haldor Topsøe: Computer Customer #1 & #2 (May 21, 2013, 05:05 UTC)
Haldor Topsøe has died. I will leave it to others to write the official obituaries and simply point out that he was computer customer number 1 and number 2 in the Danish computer industry. Haldor Topsøe's main business 50-60 years ago was ammonia plants based on the company's catalysts. Catalys...
May 18, 2013
NemID - do hardware tokens help? (May 18, 2013, 06:57 UTC)

I am about to register my change of address with the CPR register, and that, like a few other things, falls under the new law of December 1, 2012 on mandatory digital self-service. This presented me with a challenge, since neither my wife nor I trust NemID enough to have the OCES part (the signature part) of NemID.

I have found that many people I otherwise know to be highly technically competent do not understand at all why I have not activated the OCES part of my NemID, and in the end they fall back on the argument that I should then simply get the hardware token that has now (finally) been made available.

May 16, 2013
Why did my disk fill up? (May 16, 2013, 21:02 UTC)
When I was employed at Nokia, my bright UNIX guru Anders wrote a clever disk analysis program. The design idea was that at every directory level you could get information about how old the data underneath was and how many GB were stored underneath. In a large company, huge amounts of data often arrived and ing...

If you are compositing pixels with 16 bits per component, you often need this computation:

uint16_t a, b, r;

r = ((uint32_t)a * b + 0x7fff) / 65535;


There is a well-known way to do this quickly without a division:

uint32_t t;

t = (uint32_t)a * b + 0x8000;

r = (t + (t >> 16)) >> 16;
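As a sanity check, the two forms can be wrapped in functions and compared. This is only a minimal sketch: the names `mul_un16_exact`, `mul_un16_shift` and `mul_un16_agree_for` are made up for illustration, and the product is computed in 32 bits so that it cannot overflow `int`.

```c
#include <stdint.h>

/* Reference: rounded division by 65535 (hypothetical helper name). */
static uint16_t mul_un16_exact(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a * b + 0x7fff) / 65535);
}

/* Division-free form: add half, then fold the high word back in. */
static uint16_t mul_un16_shift(uint16_t a, uint16_t b)
{
    uint32_t t = (uint32_t)a * b + 0x8000;
    return (uint16_t)((t + (t >> 16)) >> 16);
}

/* Returns 1 if both forms agree for every b at the given a. */
static int mul_un16_agree_for(uint16_t a)
{
    for (uint32_t b = 0; b <= 0xffff; b++)
        if (mul_un16_exact(a, (uint16_t)b) != mul_un16_shift(a, (uint16_t)b))
            return 0;
    return 1;
}
```

Looping `mul_un16_agree_for` over all values of `a` checks every input pair, at the cost of 2^32 multiplications.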


Since we are compositing pixels we want to do this with SSE2 instructions, but because the code above uses 32 bit arithmetic, we can only do four operations at a time, even though SSE registers have room for eight 16 bit values. Here is a direct translation into SSE2:

a = punpcklwd (a, 0);
b = punpcklwd (b, 0);
a = pmulld (a, b);
a = paddd (a, 0x8000);
b = psrld (a, 16);
a = paddd (a, b);
a = psrld (a, 16);
a = packusdw (a, 0);


But there is another way that better matches SSE2:

uint16_t lo, hi, t, r;

hi = ((uint32_t)a * b) >> 16;
lo = ((uint32_t)a * b) & 0xffff;

t = lo >> 15;
hi += t;
t = hi ^ 0x7fff;

if ((int16_t)lo > (int16_t)t)
    lo = 0xffff;
else
    lo = 0x0000;

r = hi - lo;


This version is better because it avoids the unpacking to 32 bits. Here is the translation into SSE2:

t = pmulhuw (a, b);
a = pmullw (a, b);
b = psrlw (a, 15);
t = paddw (t, b);
b = pxor (t, 0x7fff);
a = pcmpgtw (a, b);
a = psubw (t, a);


This is not only shorter, it also makes use of the full width of the SSE registers, computing eight results at a time.
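The 16-bit-only version can also be written as a scalar function in which each statement stands for one of the instructions above, which makes it easy to compare against the plain rounded division. This is a sketch rather than production code; the function names are invented for illustration, and the casts to `int16_t` assume the usual two's complement representation.

```c
#include <stdint.h>

/* Reference: rounded division by 65535 (hypothetical helper name). */
static uint16_t mul_un16_exact(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a * b + 0x7fff) / 65535);
}

/* One 16-bit lane of the second version; comments name the matching
 * instruction from the listing above. */
static uint16_t mul_un16_lanes(uint16_t a, uint16_t b)
{
    uint32_t p  = (uint32_t)a * b;
    uint16_t hi = (uint16_t)(p >> 16);       /* pmulhuw */
    uint16_t lo = (uint16_t)(p & 0xffff);    /* pmullw  */
    uint16_t t  = (uint16_t)(lo >> 15);      /* psrlw   */
    hi = (uint16_t)(hi + t);                 /* paddw   */
    t  = (uint16_t)(hi ^ 0x7fff);            /* pxor    */
    /* pcmpgtw: signed compare, all ones when true */
    uint16_t mask = ((int16_t)lo > (int16_t)t) ? 0xffff : 0x0000;
    return (uint16_t)(hi - mask);            /* psubw: subtracting 0xffff adds 1 */
}

/* Returns 1 if the lane version matches the reference for every b. */
static int lanes_agree_for(uint16_t a)
{
    for (uint32_t b = 0; b <= 0xffff; b++)
        if (mul_un16_exact(a, (uint16_t)b) != mul_un16_lanes(a, (uint16_t)b))
            return 0;
    return 1;
}
```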

Unfortunately SSE2 doesn’t have 8-bit variants of pmulhuw, pmullw, and psrlw, so we can’t use this trick for the more common case where pixels have 8 bits per component.

Exercise: Why does the second version work?

Sysprof 1.1.8 (May 16, 2013, 05:14 UTC)

A new version 1.1.8 of Sysprof is out.

This is a release candidate for 1.2.0 and contains mainly bug fixes.

Gamma Correction vs. Premultiplied Pixels (May 16, 2013, 05:14 UTC)

Pixels with 8 bits per channel are normally sRGB encoded because that allocates more bits to darker colors where human vision is the most sensitive. (Actually, it’s really more of a historical accident, but sRGB nevertheless remains useful for this reason). The relationship between sRGB and linear RGB is that you get an sRGB pixel by raising each component of a linear pixel to the power of $1/2.2$.

A lot of graphics software does alpha blending directly on these sRGB pixels using alpha values that are linearly coded (i.e., an alpha value of 0 means no coverage, 0.5 means half coverage, and 1 means full coverage). Because alpha blending is best done with premultiplied pixels, such systems store pixels in this format:

[ alpha,  alpha * red_s,  alpha * green_s,  alpha * blue_s ]


where alpha is linearly coded, and (red_s, green_s, blue_s) are sRGB coded. As long as you are happy with blending in sRGB, this works well. Also, if you simply discard the alpha channel of such pixels and display them directly on a monitor, it will look as if the pixels were alpha blended (in the sRGB space) on top of a black background, which is the desired result.

But what if you want to blend in linear RGB? If you use the format above, some expensive conversions will be required. To convert to premultiplied linear, you have to first divide by alpha, then raise each color to 2.2, then multiply by alpha. To convert back, you must divide by alpha, raise to $1/2.2$, then multiply with alpha.

The conversions can be avoided if you store the pixels linearly, i.e., keeping the premultiplication, but coding red, green, and blue linearly instead of as sRGB. This makes blending fast, but the downside is that you need deeper pixels. With only 8 bits per channel, the linear coding loses too much precision in darker tones. Another problem is that to display these pixels, you will either have to convert them to sRGB, or if the video card can scan them out directly, you have to make sure that the gamma ramp is set to compensate for the fact that the monitor expects sRGB pixels.

A format that avoids both problems is this:

[ alpha,  alpha_s * red_s,  alpha_s * green_s,  alpha_s * blue_s ]


That is, the alpha channel is stored linearly, and the color channels are stored in sRGB, premultiplied with the alpha value raised to 1/2.2. In other words, the red component is now

(red * alpha)^(1/2.2),


where before it was

alpha * red^(1/2.2).
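
To make the difference between the two expressions concrete, here is a worked example with a linear red value of 1 and an alpha of 0.5 (values picked only for illustration):

\begin{align*} \text{before: } & \text{alpha} \cdot \text{red}^{1/2.2} = 0.5 \cdot 1^{1/2.2} = 0.5 \\ \text{now: } & (\text{red} \cdot \text{alpha})^{1/2.2} = 0.5^{1/2.2} \approx 0.73 \end{align*}

Raising the new value to 2.2 gives back 0.5, which is exactly the premultiplied linear red component.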


It is sufficient to use 8 bits per channel with this format because of the sRGB encoding. Discarding the alpha channel and displaying the pixels on a monitor will produce pixels that are alpha blended (in linear space) against black, as desired.

You can convert to linear RGB simply by raising the R, G, and B components to 2.2, and back by raising to $1/2.2$. Or, if you feel like cheating, use an exponent of 2 so that the conversions become a multiplication and a square root respectively.

This is also the pixel format to use with texture samplers that implement the sRGB OpenGL extensions (textures and framebuffers). These extensions say precisely that the R, G, and B components are raised to 2.2 before texture filtering, and raised to 1/2.2 after the final raster operation.

Over is not Translucency (May 16, 2013, 05:14 UTC)

The Porter/Duff Over operator, also known as the “Normal” blend mode in Photoshop, computes the amount of light that is reflected when a pixel partially covers another:

The fraction of bg that is covered is denoted alpha. This operator is the correct one to use when the foreground image is an opaque mask that partially covers the background:

A photon that hits this image will be reflected back to your eyes by either the foreground or the background, but not both. For each foreground pixel, the alpha value tells us the probability of each:

$a \cdot \text{fg} + (1 - a) \cdot \text{bg}$

This is the definition of the Porter/Duff Over operator for non-premultiplied pixels.

But if alpha is interpreted as translucency, then the Over operator is not the correct one to use. The Over operator will act as if each pixel is partially covering the background:

Which is not how translucency works. A translucent material reflects some light and lets other light through. The light that is let through is reflected by the background and interacts with the foreground again.

Let’s look at this in more detail. Please follow along in the diagram to the right. First with probability $a$, the photon is reflected back towards the viewer:

$a \cdot \text{fg}$

With probability $(1 - a)$, it passes through the foreground, hits the background, and is reflected back out. The photon now hits the backside of the foreground pixel. With probability $(1 - a)$, the foreground pixel lets the photon back out to the viewer. The result so far:

\begin{align*} &a\cdot \text{fg} \\ +&(1 - a) \cdot \text{bg} \cdot (1 - a) \end{align*}

But we are not done yet, because with probability $a$ the foreground pixel reflects the photon once again back towards the background pixel. There it will be reflected, hit the backside of the foreground pixel again, which lets it through to our eyes with probability $(1 - a)$. We get another term where the final $(1 - a)$ is replaced with $a \cdot \text{fg} \cdot \text {bg} \cdot (1 - a)$:

\begin{align*} &a\cdot \text{fg} \\ +&(1 - a) \cdot \text{bg} \cdot (1 - a)\\ +&(1 - a) \cdot \text{bg} \cdot a \cdot \text{fg} \cdot \text{bg} \cdot (1 - a) \end{align*}

And so on. In each round, we gain another term which is identical to the previous one, except that it has an additional $a \cdot \text{fg} \cdot \text{bg}$ factor:

\begin{align*} &a\cdot \text{fg} \\ +&(1 - a) \cdot \text{bg} \cdot (1 - a)\\ +&(1 - a) \cdot \text{bg} \cdot a \cdot \text{fg} \cdot \text{bg} \cdot (1 - a)\\ +&(1 - a) \cdot \text{bg} \cdot a \cdot \text{fg} \cdot \text{bg} \cdot a \cdot \text{fg} \cdot \text{bg} \cdot (1 - a) \\ +&\cdots \end{align*}

or more compactly:

$\displaystyle a \cdot \text{fg} + (1 - a)^2 \cdot \text{bg} \cdot \sum_{i=0}^\infty (a \cdot \text{fg} \cdot \text{bg})^i$

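Because we are dealing with pixels, $a$, $\text{fg}$, and $\text{bg}$ all lie between 0 and 1, so $x = a \cdot \text{fg} \cdot \text{bg}$ is less than 1 except in the degenerate case where all three are exactly 1, and the sum is a geometric series: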

$\displaystyle \sum_{i=0}^\infty x^i = \frac{1}{1 - x}$

Putting them together, we get:

$\displaystyle a \cdot \text{fg} + \frac{(1 - a)^2 \cdot \text{bg}}{1 - a \cdot \text{fg} \cdot \text{bg}}$
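
The closed form can be checked numerically against the bounce-by-bounce series. The following is a small sketch for a single non-premultiplied color channel with an opaque background; the function names are invented here and are not taken from any library.

```c
/* Over for one non-premultiplied channel, opaque background. */
static double over(double a, double fg, double bg)
{
    return a * fg + (1.0 - a) * bg;
}

/* Closed form of the translucency result. */
static double translucency(double a, double fg, double bg)
{
    return a * fg + (1.0 - a) * (1.0 - a) * bg / (1.0 - a * fg * bg);
}

/* Partial sum of the series: the direct reflection plus `rounds`
 * terms, each one a * fg * bg times the previous one. */
static double translucency_series(double a, double fg, double bg, int rounds)
{
    double x = a * fg * bg;
    double term = (1.0 - a) * (1.0 - a) * bg;
    double sum = a * fg;
    for (int i = 0; i < rounds; i++) {
        sum += term;
        term *= x;
    }
    return sum;
}
```

With `a`, `fg` and `bg` all equal to 0.5, Over gives 0.5 while the translucency result is roughly 0.39, so the translucent foreground lets less of the background through than a partially covering mask does.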

I have sidestepped the issue of premultiplication by assuming that background alpha is 1. The calculations with premultiplied colors are similar, and for the color components, the result is simply:

$\displaystyle r = \text{fg} + \frac{(1 - a_\text{fg})^2 \cdot \text{bg}}{1 - \text{fg}\cdot\text{bg}}$

The issue of destination alpha is more complicated. With the Over operator, both foreground and background are opaque masks, so the light that survives both has the same color as the input light. With translucency, the transmitted light has a different color, which means the resulting alpha value must in principle be different for each color component. But that’s not possible for ARGB pixels. A similar argument to the above shows that the resulting alpha value would be:

$\displaystyle r = 1 - \frac{(1 - a)\cdot (1 - b)}{1 - \text{fg} \cdot \text{bg}}$

where $b$ is the background alpha. The problem is the dependency on $\text{fg}$ and $\text{bg}$. If we simply assume for the purposes of the alpha computation that $\text{fg}$ and $\text{bg}$ are equal to $a$ and $b$, we get this:

$\displaystyle r = 1 - \frac{(1 - a)\cdot (1 - b)}{1 - a \cdot b}$

which is equal to

$\displaystyle a + \frac{(1 - a)^2 \cdot b}{1 - a \cdot b}$
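
The equality of the two expressions can be checked by putting both over the common denominator $1 - a \cdot b$:

\begin{align*} 1 - \frac{(1 - a) \cdot (1 - b)}{1 - a \cdot b} &= \frac{(1 - ab) - (1 - a - b + ab)}{1 - ab} = \frac{a + b - 2ab}{1 - ab} \\ a + \frac{(1 - a)^2 \cdot b}{1 - a \cdot b} &= \frac{(a - a^2 b) + (b - 2ab + a^2 b)}{1 - ab} = \frac{a + b - 2ab}{1 - ab} \end{align*}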

I.e., exactly the same computation as the one for the color channels. So we can define the Translucency Operator as this:

$\displaystyle r = \text{fg} + \frac{(1 - a)^2 \cdot \text{bg}}{1 - \text{fg} \cdot \text{bg}}$

for all four channels.
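
Applied to a premultiplied pixel, the operator might be sketched as follows. The `pixel_t` type, the function names, and the handling of the zero denominator are assumptions made for this example only.

```c
typedef struct { double a, r, g, b; } pixel_t;  /* premultiplied, range 0..1 */

/* One channel: fg + (1 - a)^2 * bg / (1 - fg * bg). */
static double translucency_channel(double fg, double bg, double fg_alpha)
{
    double denom = 1.0 - fg * bg;
    /* fg = bg = 1 means the foreground is fully opaque, so the second
     * term would be 0/0; the result is just the foreground. */
    if (denom == 0.0)
        return fg;
    return fg + (1.0 - fg_alpha) * (1.0 - fg_alpha) * bg / denom;
}

/* The same formula for all four channels, using the foreground alpha. */
static pixel_t translucency_op(pixel_t fg, pixel_t bg)
{
    pixel_t out;
    out.a = translucency_channel(fg.a, bg.a, fg.a);
    out.r = translucency_channel(fg.r, bg.r, fg.a);
    out.g = translucency_channel(fg.g, bg.g, fg.a);
    out.b = translucency_channel(fg.b, bg.b, fg.a);
    return out;
}
```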

Here is an example of what the operator looks like. The image below is what you will get if you use the Over operator to implement a selection rectangle. Mouse over to see what it would look like if you used the Translucency operator.

Both were computed in linear RGB. Typical implementations will often compute the Over operator in sRGB, so that’s what you see if you actually select some icons in Nautilus. If you want to compare all three, open these in tabs:

Over, in sRGB

Translucency, in linear RGB

Over, in linear RGB

And for good measure, even though it makes zero sense to do this,

Translucency, in sRGB

Sysprof 1.2.0 (May 16, 2013, 05:14 UTC)

A new stable release of Sysprof is now available. Download version 1.2.0.

Big-O Misconceptions (May 16, 2013, 05:14 UTC)

In computer science and sometimes mathematics, big-O notation is used to talk about how quickly a function grows while disregarding multiplicative and additive constants. When classifying algorithms, big-O notation is useful because it lets us abstract away the differences between real computers as just multiplicative and additive constants.

Big-O is not a difficult concept at all, but it seems to be common even for people who should know better to misunderstand some aspects of it. The following is a list of misconceptions that I have seen in the wild.

But first a definition: We write

$f(n) = O(g(n))$

when $f(n) \le M g(n)$ for sufficiently large $n$, for some positive constant $M$.
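
As a concrete instance of this definition (the function is chosen only for illustration), take $f(n) = 5n + 3$ and $g(n) = n$:

$5n + 3 \le 5n + n = 6n \quad \text{for all } n \ge 3,$

so $f(n) = O(n)$ with $M = 6$.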

Misconception 1: “The Equals Sign Means Equality”

The equals sign in

$f(n) = O(g(n))$

is a widespread travesty. If you take it at face value, you can deduce that since $5 n$ and $3 n$ are both equal to $O(n)$, then $3 n$ must be equal to $5 n$ and so $3 = 5$.

The expression $f(n) = O(g(n))$ doesn’t type check. The left-hand-side is a function, the right-hand-side is a … what, exactly? There is no help to be found in the definition. It just says “we write” without concerning itself with the fact that what “we write” is total nonsense.

The way to interpret the right-hand side is as a set of functions:

$O(f) = \{ g \mid g(n) \le M f(n) \text{ for some } M > 0 \text{ and all sufficiently large } n \}.$

With this definition, the world makes sense again: If $f(n) = 3 n$ and $g(n) = 5 n$, then $f \in O(n)$ and $g \in O(n)$, but there is no equality involved so we can’t make bogus deductions like $3=5$. We can however make the correct observation that $O(n) \subseteq O(n \log n)\subseteq O(n^2) \subseteq O(n^3)$, something that would be difficult to express with the equals sign.

Misconception 2: “Informally, Big-O Means ‘Approximately Equal’”

If an algorithm takes $5 n^2$ seconds to complete, that algorithm is $O(n^2)$ because for the constant $M=7$ and sufficiently large $n$, $5 n^2 \le 7 n^2$. But an algorithm that runs in constant time, say 3 seconds, is also $O(n^2)$ because for sufficiently large $n$, $3 \le n^2$.

So informally, big-O means approximately less than or equal, not approximately equal.

If someone says “Topological Sort, like other sorting algorithms, is $O(n \log n)$”, then that is technically correct, but severely misleading, because Topological Sort is also $O(n)$ which is a subset of $O(n \log n)$. Chances are whoever said it meant something false.

If someone says “In the worst case, any comparison based sorting algorithm must make $O(n \log n)$ comparisons” that is not a correct statement. Translated into English it becomes:

“In the worst case, any comparison based sorting algorithm must make fewer than or equal to $M n \log (n)$ comparisons”

which is not true: You can easily come up with a comparison based sorting algorithm that makes more comparisons in the worst case.

To be precise about these things we have other types of notation at our disposal. Informally:

$O()$: Less than or equal, disregarding constants
$\Omega()$: Greater than or equal, disregarding constants
$o()$: Strictly less than, disregarding constants
$\Theta()$: Equal to, disregarding constants

and some more. The correct statement about lower bounds is this: “In the worst case, any comparison based sorting algorithm must make $\Omega(n \log n)$ comparisons”. In English that becomes:

“In the worst case, any comparison based sorting algorithm must make at least $M n \log (n)$ comparisons”

which is true. And a correct, non-misleading statement about Topological Sort is that it is $\Theta(n)$, because it has a lower bound of $\Omega(n)$ and an upper bound of $O(n)$.

Misconception 3: “Big-O is a Statement About Time”

Big-O is used for making statements about functions. The functions can measure time or space or cache misses or rabbits on an island or anything or nothing. Big-O notation doesn’t care.

In fact, when used for algorithms, big-O is almost never about time. It is about primitive operations.

When someone says that the time complexity of MergeSort is $O(n \log n)$, they usually mean that the number of comparisons that MergeSort makes is $O(n \log n)$. That in itself doesn’t tell us what the time complexity of any particular MergeSort might be because that would depend on how much time it takes to make a comparison. In other words, the $O(n \log n)$ refers to comparisons as the primitive operation.

The important point here is that when big-O is applied to algorithms, there is always an underlying model of computation. The claim that the time complexity of MergeSort is $O(n \log n)$ is implicitly referencing a model of computation where a comparison takes constant time and everything else is free.

Which is fine as far as it goes. It lets us compare MergeSort to other comparison based sorts, such as QuickSort or ShellSort or BubbleSort, and in many real situations, comparing two sort keys really does take constant time.
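
To make "comparisons as the primitive operation" concrete, here is a sketch of a top-down MergeSort on integers that counts nothing except key comparisons. The names, the fixed 1024-element buffers, and the reversed test input are all choices made for this example.

```c
#include <stddef.h>
#include <string.h>

static unsigned long comparisons;  /* the only primitive operation we count */

static int less_than(int x, int y)
{
    comparisons++;
    return x < y;
}

/* Top-down merge sort of v[0..n), using tmp[0..n) as scratch space. */
static void merge_sort(int *v, int *tmp, size_t n)
{
    if (n < 2)
        return;
    size_t mid = n / 2;
    merge_sort(v, tmp, mid);
    merge_sort(v + mid, tmp, n - mid);

    size_t i = 0, j = mid, k = 0;
    while (i < mid && j < n)
        tmp[k++] = less_than(v[j], v[i]) ? v[j++] : v[i++];
    while (i < mid)
        tmp[k++] = v[i++];
    while (j < n)
        tmp[k++] = v[j++];
    memcpy(v, tmp, n * sizeof *v);
}

/* Sorts a reversed sequence of n <= 1024 ints and returns the number of
 * comparisons made, or 0 if the output is not sorted. */
static unsigned long count_comparisons(size_t n)
{
    static int v[1024], tmp[1024];
    for (size_t i = 0; i < n; i++)
        v[i] = (int)(n - i);
    comparisons = 0;
    merge_sort(v, tmp, n);
    for (size_t i = 1; i < n; i++)
        if (v[i - 1] > v[i])
            return 0;
    return comparisons;
}
```

Copying elements, recursion, and loop bookkeeping all cost real time, but in the comparison model they are free, which is why the count stays below $n \log_2 n$ no matter how expensive those other steps are.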

However, it doesn’t allow us to compare MergeSort to RadixSort because RadixSort is not comparison based. It simply doesn’t ever make a comparison between two keys, so its time complexity in the comparison model is 0. The statement that RadixSort is $O(n)$ implicitly references a model in which the keys can be lexicographically picked apart in constant time. Which is also fine, because in many real situations, you actually can do that.

To compare RadixSort to MergeSort, we must first define a shared model of computation. If we are sorting strings that are $k$ bytes long, we might take “read a byte” as a primitive operation that takes constant time with everything else being free.

In this model, MergeSort makes $O(n \log n)$ string comparisons each of which makes $O(k)$ byte comparisons, so the time complexity is $O(k\cdot n \log n)$. One common implementation of RadixSort will make $k$ passes over the $n$ strings with each pass reading one byte, and so has time complexity $O(n k)$.

Misconception 4: “Big-O Is About Worst Case”

Big-O is often used to make statements about functions that measure the worst case behavior of an algorithm, but big-O notation doesn’t imply anything of the sort.

If someone is talking about the randomized QuickSort and says that it is $O(n \log n)$, they presumably mean that its expected running time is $O(n \log n)$. If they say that QuickSort is $O(n^2)$ they are probably talking about its worst case complexity. Both statements can be considered true depending on what type of running time the functions involved are measuring.

Porter/Duff Compositing and Blend Modes (May 16, 2013, 05:14 UTC)

In the Porter/Duff compositing algebra, images are equipped with an alpha channel that determines on a per-pixel basis whether the image is there or not. When the alpha channel is 1, the image is fully there, when it is 0, the image isn’t there at all, and when it is in between, the image is partially there. In other words, the alpha channel describes the shape of the image, it does not describe opacity. The way to think of images with an alpha channel is as irregularly shaped pieces of cardboard, not as colored glass. Consider these two images:

When we combine them, each pixel of the result can be divided into four regions:

One region where only the source is present, one where only the destination is present, one where both are present, and one where neither is present.

By deciding on what happens in each of the four regions, various effects can be generated. For example, if the destination-only region is treated as blank, the source-only region is filled with the source color, and the ‘both’ region is filled with the destination color like this:

The effect is as if the destination image is trimmed to match the source image, and then held up in front of it:

The Porter/Duff operator that does this is called “Dest Atop”.

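There are twelve of these operators, each one characterized by its behavior in the three regions: source, destination and both. The ‘neither’ region is always blank. The source and destination regions can either be blank or filled with the source or destination colors respectively. The ‘both’ region can be blank, filled with the source color, or filled with the destination color, which gives $2 \cdot 2 \cdot 3 = 12$ combinations.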

The formula for the operators is a linear combination of the contents of the four regions, where the weights are the areas of each region:

$A_\text{src} \cdot [s] + A_\text{dest} \cdot [d] + A_\text{both} \cdot [b]$

Where $[s]$ is either 0 or the color of the source pixel, $[d]$ either 0 or the color of the destination pixel, and $[b]$ is either 0, the color of the source pixel, or the color of the destination pixel. With the alpha channel being interpreted as coverage, the areas are given by these formulas:

$A_\text{src} = \alpha_\text{s} \cdot (1 - \alpha_\text{d})$
$A_\text{dest} = \alpha_\text{d} \cdot (1 - \alpha_\text{s})$
$A_\text{both} = \alpha_\text{s} \cdot \alpha_\text{d}$

The alpha channel of the result is computed in a similar way:

$A_\text{src} \cdot [\text{as}] + A_\text{dest} \cdot [\text{ad}] + A_\text{both} \cdot [\text{ab}]$

where $[\text{as}]$ and $[\text{ad}]$ are either 0 or 1 depending on whether the source and destination regions are present, and where $[\text{ab}]$ is 0 when the ‘both’ region is blank, and 1 otherwise.

Here is a table of all the Porter/Duff operators:

Operator    $[\text{s}]$   $[\text{d}]$   $[\text{b}]$
Src         $s$            $0$            $s$
Atop        $0$            $d$            $s$
Over        $s$            $d$            $s$
In          $0$            $0$            $s$
Out         $s$            $0$            $0$
Dest        $0$            $d$            $d$
DestAtop    $s$            $0$            $d$
DestOver    $s$            $d$            $d$
DestIn      $0$            $0$            $d$
DestOut     $0$            $d$            $0$
Clear       $0$            $0$            $0$
Xor         $s$            $d$            $0$
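
The table translates fairly directly into code. Below is a sketch for a single color channel, with non-premultiplied inputs and an area-weighted (that is, premultiplied) result; the types `fill_t` and `px_t` and all function names are hypothetical.

```c
typedef enum { BLANK, SRC, DEST } fill_t;        /* contents of a region */
typedef struct { double alpha, color; } px_t;    /* one color channel */

static px_t make_px(double alpha, double color)
{
    px_t p = { alpha, color };
    return p;
}

static double pick(fill_t f, double s, double d)
{
    return f == SRC ? s : f == DEST ? d : 0.0;
}

/* fs, fd and fb are the [s], [d] and [b] columns of the table. */
static px_t porter_duff(px_t s, px_t d, fill_t fs, fill_t fd, fill_t fb)
{
    double a_src  = s.alpha * (1.0 - d.alpha);
    double a_dest = d.alpha * (1.0 - s.alpha);
    double a_both = s.alpha * d.alpha;
    px_t r;
    r.color = a_src  * pick(fs, s.color, d.color)
            + a_dest * pick(fd, s.color, d.color)
            + a_both * pick(fb, s.color, d.color);
    r.alpha = a_src  * (fs != BLANK)
            + a_dest * (fd != BLANK)
            + a_both * (fb != BLANK);
    return r;
}

/* The "Over" row of the table: s, d, s. */
static px_t pd_over(px_t s, px_t d)
{
    return porter_duff(s, d, SRC, DEST, SRC);
}
```

For instance, `porter_duff(s, d, SRC, BLANK, DEST)` is the DestAtop row.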

And here is how they look:

Despite being referred to as alpha blending and despite alpha often being used to model opacity, in concept Porter/Duff is not a way to blend the source and destination shapes. It is a way to overlay, combine and trim them as if they were pieces of cardboard. The only places where source and destination pixels are actually blended is where the antialiased edges meet.

Blending
Photoshop and the Gimp have a concept of layers which are images stacked on top of each other. In Porter/Duff, stacking images on top of each other is done with the “Over” operator, which is also what Photoshop/Gimp use by default to composite layers:

Conceptually, two pieces of cardboard are held up with one in front of the other. Neither shape is trimmed, and in places where both are present, only the top layer is visible.

A layer in these programs also has an associated Blend Mode which can be used to modify what happens in places where both are visible. For example, the ‘Color Dodge’ blend mode computes a mix of source and destination according to this formula:

$\begin{equation*} B(s,d)= \begin{cases} 0 & \text{if } d = 0, \\ 1 & \text{if } d \ge (1 - s), \\ d / (1 - s) & \text{otherwise} \end{cases} \end{equation*}$

The result is this:

Unlike with the regular Over operator, in this case there is a substantial chunk of the output where the result is actually a mix of the source and destination.

Layers in Photoshop and Gimp are not tailored to each other (except for layer masks, which we will ignore here), so the compositing of the layer stack is done with the source-only and destination-only region set to source and destination respectively. However, there is nothing in principle stopping us from setting the source-only and destination-only regions to blank, but keeping the blend mode in the ‘both’ region, so that tailoring could be supported alongside blending. For example, we could set the ‘source’ region to blank, the ‘destination’ region to the destination color, and the ‘both’ region to ColorDodge:

Here are the four combinations that involve a ColorDodge blend mode:

In this model the original twelve Porter/Duff operators can be viewed as the results of three simple blend modes:

Source: $B(s, d) = s$
Dest: $B(s, d) = d$
Zero: $B(s, d) = 0$

In this generalization of Porter/Duff the blend mode is chosen from a large set of formulas, and each formula gives rise to four new compositing operators characterized by whether the source and destination are blank or contain the corresponding pixel color.

Here is a table of the operators that are generated by various blend modes:

The general formula is still an area weighted average:

$A_\text{src} \cdot [s] + A_\text{dest} \cdot [d] + A_\text{both}\cdot B(s, d)$

where $[s]$ and $[d]$ are the source and destination colors respectively or 0, but where $B(s, d)$ is no longer restricted to one of $0$, $s$, and $d$, but can instead be chosen from a large set of formulas.

The output of the alpha channel is the same as before:

$A_\text{src} \cdot [\text{as}] + A_\text{dest} \cdot [\text{ad}] + A_\text{both} \cdot [\text{ab}]$

except that [ab] is now determined by the blend mode. For the Zero blend mode there is no coverage in the both region, so [ab] is 0; for most others, there is full coverage, so [ab] is 1.
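
The generalized formula might be sketched like this for one color channel, with the blend mode passed in as a function pointer. The names `blend_fn`, `blend_composite` and `blend_alpha` are made up for the example, and the inputs are non-premultiplied.

```c
typedef double (*blend_fn)(double s, double d);

/* The Color Dodge formula given earlier. */
static double color_dodge(double s, double d)
{
    if (d == 0.0)
        return 0.0;
    if (d >= 1.0 - s)
        return 1.0;
    return d / (1.0 - s);
}

/* Area-weighted color: keep_src and keep_dest say whether the
 * source-only and destination-only regions are filled or blank. */
static double blend_composite(double s, double as, double d, double ad,
                              int keep_src, int keep_dest, blend_fn B)
{
    double a_src  = as * (1.0 - ad);
    double a_dest = ad * (1.0 - as);
    double a_both = as * ad;
    return a_src  * (keep_src  ? s : 0.0)
         + a_dest * (keep_dest ? d : 0.0)
         + a_both * B(s, d);
}

/* Resulting alpha; both_covered is 0 for the Zero blend mode, 1 otherwise. */
static double blend_alpha(double as, double ad,
                          int keep_src, int keep_dest, int both_covered)
{
    return as * (1.0 - ad) * (keep_src != 0)
         + ad * (1.0 - as) * (keep_dest != 0)
         + as * ad * (both_covered != 0);
}
```

Passing a function that simply returns `s`, `d`, or `0` as `B` recovers the original twelve operators.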

May 15, 2013
Breaking down the binary boundary (May 15, 2013, 10:12 UTC)
A great deal of information is lost when we compile from source text to binary, and I am not just thinking of the few sporadically placed comments being eliminated. Reverse engineering is a good old sport in the industry, and it has flourished in particular in the virus/malware business. But it is still...
May 11, 2013
A DNSSEC book for the working sysadmin, likely to put you ahead of the pack in securing an essential Internet service.

I have a confession to make. Michael W. Lucas is a long time favorite of mine among tech authors. When Michael descends on a topic and produces a book, you can expect the result to contain loads of useful information, presented along with humor and real-life anecdotes so you will want to explore the topic in depth on your own systems.

In DNSSEC Mastery (apparently the second installment in what could become an extensive Mastery series -- the first title was SSH Mastery, reviewed here -- from Michael's own Tilted Windmill Press), the topic is how to make your own contribution to making the Internet name service more reliable by having your own systems present verifiable, trustworthy information.

Before addressing the book itself, I'll spend some time explaining why this topic is important. The Domain Name System (usually referred to as DNS or simply 'the name service' even if nitpickers would be right that there is more than one) is one of the old-style Internet services that was created to solve a particular set of problems (humans are a lot better at remembering names than strings of numbers) in the early days of networking when security was not really a concern.

Old-fashioned DNS moves data via UDP, the connectionless no-guarantees-ever protocol, mainly because the low protocol overhead in most cases means the answer arrives faster than it would have otherwise. Reliable delivery was sacrificed for speed, and in general, the thing just works. DNS is one of those things that makes the Internet usable for techies and non-techies alike.

The other thing that was sacrificed, or more likely never even considered important enough to care about at the time, was any hope of reliably verifying that the information received via the DNS service was in fact authentic and correct.

When you ask an application to look up a name, say you want to see if anything's new at bsdly.blogspot.com or if you want to send me mail to be delivered at bsdly.net, the answer comes back, not necessarily from the host that answers authoritatively for the domain, but more likely from the cache of a name server near you, and serves mainly one or more IP addresses, with no guarantee other than it is, indeed a record type that contains one or more IP addresses that appear to match your application's query.

Or to put it more bluntly, with traditional DNS, it's possible for a well positioned attacker to feed you falsified information (i.e., leading your packets to somewhere they don't belong or to somewhere you never intended, potentially along with your confidential data), even if the original DNS designers appear to have considered the scenario rather unlikely back then in the nineteen-eighties.

With the realization that the Internet was becoming mainstream during the 1990s and that non-techies would rely on it for such things as banking services came support for cryptographically enhanced versions of several of the protocols that take care of the bulk of Internet traffic payloads, and even the essential and mostly ignored (at least by non-techies) DNS protocol was enhanced several times over the years. Around the turn of the century came the RFCs that describe cryptographic signatures as part of the enhanced name service, and finally in 2005 the trio of RFCs (4033, 4034 and 4035) that form the core of the modern DNSSEC specification were issued.

But up until quite recently, most if not all DNSSEC implementations were either incomplete or considered experimental, and getting a working DNSSEC setup in place has been an admirable if rarely fulfilled ambition among already overworked sysadmins.

Then, at what seems to be exactly the right moment, Michael W. Lucas publishes DNSSEC Mastery, which is a compact and extremely useful guide to creating your own DNSSEC setup, avoiding the many pitfalls and scary manoeuvres you will find described in the HOWTO-style DNSSEC guides you're likely to encounter after a web search on the topic.

The book is aimed at the working sysadmin who already has at least basic operational knowledge of running a name service. Starting with one DNSSEC implementation that is known to be complete and functional (ISC BIND 9.9 -- Michael warns early on very clearly that earlier versions will not work -- if your favorite system doesn't have that packaged yet, you can build your own or start bribing or yelling at the relevant package maintainer), this book takes a very practical, hands on approach to its topic in a way that I think is well matched to the intended audience.

Keeping in mind that the one thing a working sysadmin is always short on is time, it is likely a strong advantage that this book is so compact. With 12 chapters, it comes in at just short of 100 pages in the PDF version I used for most of this review. With the stated requirement that the reader needs to be reasonably familiar with running a DNS service, the introductory chapters fairly quickly move on to give an overview of public key cryptography as it applies to DNSSEC, with pointers to wordier sources for those who would want to delve into details, before starting the steps involved in setting up secure name service using ISC BIND 9.9 or newer.

Always taking a practical approach, DNSSEC Mastery covers essentially all aspects of setting up and running a working service, including such topics as key management, configuring and debugging both authoritative and recursive resolvers, various hints for working with or around strengths or deficiencies in various client operating systems, how the new world of DNSSEC influences how you manage your zones and delegations, and did I mention debugging your setup? DNSSEC is a lot less forgiving of errors than your traditional DNS, and Michael includes both some entertaining examples and pointers to several useful resources for testing your work before putting it all into production. And for good measure, the final chapter demonstrates how to distribute data you would not trust to old fashioned DNS: ssh host key fingerprints and SSL certificates.

As I mentioned earlier, this title comes along at what seems to be the perfect time. DNSSEC use is not yet as widespread as it perhaps should be, in part due to incomplete implementations or lack of support in several widely used systems. The free software world is ahead of the pack, and just as the world is getting to realize the importance of a trustworthy Internet name service, this book comes along, aimed perfectly at the group of people who will need an accessible-to-techies book like this one. And it comes at a reasonable price, too. If you're in this book's target group, it's a recommended buy.

The ebook is available in several formats from Tilted Windmill Press, Amazon and other places. A printed version is in the works, but was not available at the time this review was written (May 11, 2013).

Note: Michael W. Lucas gives tutorials, too, like this one at BSDCan in Ottawa, May 15 2013.

Title: DNSSEC Mastery: Securing The Domain Name System With BIND
Author: Michael W. Lucas
Publisher: Tilted Windmill Press (April 2013)

But if you're interested in OpenBSD and haven't got your copy of that book yet, you're in for a real treat. If a firewall or other networking is closer to your heart, you could give my own The Book of PF and the PF tutorial (or here) it grew out of a try. You can even support the OpenBSD project by buying the books from them at the same time you buy your CD set; see the OpenBSD Orders page for more information.

Upcoming talks: I'll be speaking at BSDCan 2013, on The Hail Mary Cloud And The Lessons Learned. There will be no PF tutorial at this year's BSDCan; fortunately, my staple tutorial item was crowded out by new initiatives from some truly excellent people. (I will, however, be bringing a few copies of The Book of PF and, if things work out in time, some other items you may enjoy.)
The end of software patents? (May 11, 2013, 11:02 UTC)
There are signs that the ground is being dug out from under software patents in the USA. Their Supreme Court has sent a couple of cases back down, not least the much-discussed "In re Bilski", and the ripples in the lower courts have only just begun. A brand-new ruling from the Court of Appeals for the Federal Circuit that ...
May 09, 2013
Trademarking somebody else's idea behind their back is both a bad idea and highly immoral. If it wasn't your idea, you don't trademark and you don't patent. It really is that simple, people.

The news that the term hackathon had been trademarked in Germany reached me late last week, via this thread on openbsd-misc. The idea sounded pretty ludicrous to me at the time, but I was too busy with other stuff that couldn't wait to start reacting properly, and a few distractions later, I'd forgotten about the whole thing.

Then today, via the Twitter stream, came the news that an outfit trading under the name Young Targets (how cute) had now started sending invoices at EUR 2500 a pop to anybody in Germany who dared use the term. One example has been preserved here by Hannover-based doctape, who had hosted an informal developer meetup earlier this year.

It may come as a surprise to a select few, but if there is somebody, somewhere, who is entitled to making money off that fairly well-known term, it is not that group of Germans. The term hackathon has been in use for a decade at least, and it springs like many other good things from the free software movement. The exact origin of the term is not clear, but one of the more prominent contenders for the first original use is the OpenBSD project. As you can see from the project's hackathons page, informal developer gatherings have most likely been called just that since 1999 at least.

And as anyone with an Internet connection and minimal searching skills will find out, hackathons have been quite crucial in keeping the project moving forward and offering tech goodies everybody uses, all for free and under a permissive license anybody can understand.

These items include the Secure Shell client and server used by 97% of the Internet (OpenSSH), the much praised OpenBSD packet filter PF and a whole host of other useful software that's developed as integral parts of the OpenBSD system but tend to find their way into other products such as those offered by Apple, Blackberry and quite a few others, including Linux distributions.

My brief and not too exhaustive search of mailing list archives tonight seems to turn up this message from Theo de Raadt to openbsd-misc dated July 1st, 2001 as the earliest public reference to a hackathon, but reading Theo's message again today I'm pretty convinced that the term was in common use even back then. If anyone can come up with evidence of use earlier than this, I'd love to hear from you, of course (mail to peter at bsdly dot net, preferably with the word hackathon somewhere in the subject, will be read with interest, or leave a comment below if you prefer).

I'm no lawyer at the best of times, but trademarking a term that both originated elsewhere and has been in general use for more than a decade seems to me at least highly immoral, and if it's not illegal, it should be. Trademarking a free software term and proceeding to charge EUR 2500 a pop for its use? It will be in your best interest to stay out of my physical proximity, Meine Damen und Herren.

Hot on the heels of what must have been a hectic night for the newly targeted young Berliners comes an announcement that states that they kinda, sorta will consider not charging sufficiently non-profity people for the use anyway, in the fluffiest terms I have ever heard come out of a German.

I'll offer our new targets some practical advice: Stop your nonsense right now, and make a real effort to track down the originators of the hackathon concept. It's likely you will find that person is either Theo de Raadt or somebody else closely associated with the OpenBSD project around the last turn of the century. If you cannot unregister the trademark, transfer the rights, free of charge, to the concept's originator.

Then either return any fees collected from your wrongful registration, or, at your victims' option, donate the equivalent sum to OpenBSD or a charity of your individual victims' choice.

Doing the right thing this late in the game and after messing up this thoroughly most likely won't save you from being the target of some sort of mischief from young hotheads (note that I strongly caution against using extra-legal tactics in this matter), but at least you, members and employees of Young Targets, can hope that this embarrassing episode will be forgotten soon enough for you to resume some semblance of careers in a not too distant future. Please go hide under a rock for now, after you've done the right thing as outlined above.

For anyone else interested in the matter, I strongly urge you to go to the OpenBSD project's donations page to donate, grab some CD sets and/or other swag from the orders page, and if you think you can help out with one or more items listed on the hardware wanted page, that will be very welcome for the project too.

It should be noted that I do not serve in any official capacity for the OpenBSD project. The paragraphs above represent my opinion only, and what I have outlined here should not be considered any kind of offer or representation on behalf of the OpenBSD project.

If you're interested in OpenBSD in general, you have a real treat coming up in the form of Michael W. Lucas' Absolute OpenBSD, 2nd edition. If a firewall or other networking is closer to your heart, you could give my own The Book of PF and the PF tutorial (or here) it grew out of a try. You can even support the OpenBSD project by buying the books from them at the same time you buy your CD set; see the OpenBSD Orders page for more information.

Upcoming talks: I'll be speaking at BSDCan 2013, on The Hail Mary Cloud And The Lessons Learned, with a preview planned for the BLUG meeting a couple of weeks before the conference. There will be no PF tutorial at this year's BSDCan; fortunately, my staple tutorial item was crowded out by new initiatives from some truly excellent people. (I will, however, be bringing a few copies of The Book of PF and, if things work out in time, some other items you may enjoy.)
May 06, 2013
Feudalistic Startups (May 06, 2013, 08:01 UTC)
Therese has updated her list of "Danish startups", and that is all very well. Presumably it is understood that it has to be something hip and IT-cloud-ish to qualify for Therese's list. But the list provokes me, mostly because of the underlying assumptions, which have very little to do with...
May 05, 2013
The world's tallest fashion show (May 05, 2013, 12:57 UTC)

To mark the 100th anniversary of Herning as a market town, BON’A PARTE has taken part in a project together with a number of other companies and institutions, a project meant to result in the world's tallest fashion show.

An Australian artist group has used one of our halls to teach a large group of students from TEKO Designskole to build 4 metre tall puppets, which were to walk the catwalk in Herning on Friday, May 3. The students have built 10 puppets representing 10 textile companies from the Herning region, which besides BON’A PARTE also include companies such as JBS, Egetæpper, KABOOKI and others.

See the regional TV news clip from the event here:

http://www.tvmidtvest.dk/nettv/?id=24517

May 04, 2013
Keep smiling, waste spammers' time (May 04, 2013, 19:28 UTC)
When you're in the business of building the networks people need and the services they need to run on them, you may also be running a mail service. If you do, you will sooner or later need to deal with spam. This article is about how to waste spammers' time and have a good time while doing it.

Assembling the parts

To take part in the fun and useful things in this article, you need a system with PF, the OpenBSD packet filter. If you're reading this magazine you are likely to be running all important things on a BSD already, and all the fully open source BSDs by now include PF (as do the commercialized variants sold by Apple and Blackberry), developed by OpenBSD but also ported to the other BSDs. On OpenBSD, it's the packet filter, and if you're running FreeBSD, NetBSD or DragonFlyBSD it's likely to be within easy reach, either as a loadable kernel module or as a kernel compile-time option.

Getting started with PF is surprisingly easy. The official documentation such as the PF FAQ is very comprehensive, but you may be up and running faster if you buy The Book of PF or do what almost 150,000 others have done before you: Download or browse the free forerunner from http://home.nuug.no/~peter/pf. Or do both, if you like.

Network design issues
A PF setup can be, and to my mind should be, quite unobtrusive. For the activities in this article it does not matter much where you run your PF filtering, as long as it is somewhere in the default path of your incoming SMTP traffic. A gateway with PF is usually an excellent choice, but if it suits your needs better, it is quite feasible to do the filtering needed for this article on the same host your SMTP server runs on.

Enter spamd
OpenBSD's spamd, the spam deferral daemon (not to be confused with the program with the same name from the SpamAssassin content filtering system), first appeared in OpenBSD 3.3. The original spamd was a tarpitter with a very simple mission in life. Its spamd-setup program would take a list of known bad IP addresses, that is, the IP addresses of machines known to have sent spam recently, and load it into a table. The main spamd program would then have any SMTP traffic from hosts in that table redirected to it, and spamd would answer those connections s-l-o-w-l-y, by default one byte per second.

A minimal PF config
As man spamd will tell you, the bare minimum to get spamd running in a useful mode on systems with PF version 4.1 or later is

table <spamd-white> persist
table <nospamd> persist file "/etc/mail/nospamd"
pass in on egress proto tcp from any to any port smtp \
        rdr-to 127.0.0.1 port spamd
pass in on egress proto tcp from <nospamd> to any port smtp
pass in log on egress proto tcp from <spamd-white> to any port smtp
pass out log on egress proto tcp to any port smtp

Or, in the pre-OpenBSD 4.7 syntax still in use on some systems,

table <spamd-white> persist
table <nospamd> persist file "/etc/mail/nospamd"
no rdr inet proto tcp from <spamd-white> to any \
       port smtp
rdr pass inet proto tcp from any to any \
       port smtp -> 127.0.0.1 port spamd

This means, essentially, that any smtp traffic from hosts that are not already in the table spamd-white will be redirected to localhost, port spamd, where you have set up the spam deferral daemon spamd to listen for connections. Enabling spamd, on the other hand, is as easy as adding spamd_flags="" to your /etc/rc.conf.local if you run OpenBSD or /etc/rc.conf if you run FreeBSD. (Note that on FreeBSD, spamd is a port, so you need to install that before proceeding. Also, on recent FreeBSDs, the rc.conf lines are obspamd_enable="YES" to enable spamd and obspamd_flags="" to set any further flags.) You then start it with

$ sudo /usr/libexec/spamd

or if you are on FreeBSD,

$ sudo /usr/local/libexec/spamd

It is also worth noting that if you add the "-d" for Debug flag to your spamd flags, spamd will generate slightly more log information, of the type shown in the log excerpts later in this article.
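To make the effect of the first rule set above concrete, here is a minimal Python sketch of how PF's "last matching rule wins" evaluation sorts incoming SMTP connections. It is a model for illustration only: the function name, the return labels and the example addresses are assumptions made up for this sketch, and the real decision is of course made by PF itself.

```python
# Illustrative model only. PF evaluates the rules; nothing here is part of
# PF or spamd, and the labels "spamd" and "mailserver" are made up.

def smtp_destination(src_ip, nospamd, spamd_white):
    """Return where an incoming SMTP connection ends up.

    Mirrors the three inbound 'pass in' rules: PF applies the last rule
    that matches, so a later match overrides the catch-all redirection.
    """
    destination = "spamd"        # rule 1: everything is redirected to spamd
    if src_ip in nospamd:        # rule 2: hosts listed in /etc/mail/nospamd
        destination = "mailserver"
    if src_ip in spamd_white:    # rule 3: hosts that are in spamd-white
        destination = "mailserver"
    return destination
```

Calling smtp_destination("192.0.2.1", set(), set()) gives "spamd", while any address present in either table ends up at the real mail server.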

While earlier versions of spamd required a slightly different set of redirection rules and ran in blacklists-only mode by default, spamd from OpenBSD 4.1 onwards runs in greylisting mode by default. Let's have a look at what greylisting means and how it differs from other spam detection techniques before we explore the finer points of spamd configuration.

Content versus behavior: Greylisting
When the email spam deluge started happening during the late 1990s and early 2000s, observers were quick to note that the messages in at least some cases could be fairly easily classified by looking for certain keywords, and the bulk of the rest fit well in familiar patterns.

Various kinds of content filtering have stayed popular and are the mainstays of almost all proprietary and open source antispam products. Over the years the products have developed from fairly crude substring match mechanisms into multi-level rule based systems that incorporate a number of sophisticated statistical methods. Generally the products are extensively customizable and some even claim the ability to learn based on the users' preferences.

Those sophisticated and even beautiful algorithms do have a downside, however: For each new trick a spam producer chooses to implement, the content filtering becomes incrementally more complex and computationally expensive.

In sharp contrast to the content filtering, which is based on message content, greylisting is based on studying spam senders' behavior on the network level. The 2003 paper by Evan Harris noted that the vast majority of spam appeared to be sent by software specifically developed to send spam messages, and those systems typically operated in a 'fire and forget' mode, only trying to deliver each message once.

The delivery software on real mail servers, however, is a proper SMTP implementation, and since the relevant RFCs state that you MUST retry delivery in case you encounter some classes of delivery errors, in almost all cases real mail servers will retry 'after a reasonable amount of time'.

Spammers do not retry. So if we set up our system to say essentially

"My admin told me not to talk to strangers"

- we should be getting rid of anything the sending end does not consider important enough to retry delivering.

The practical implementation is to record for each incoming delivery attempt at least
1. the sender's IP address
2. the From: address
3. the To: address
4. time of first delivery attempt matching 1) through 3)
5. time delivery of retry will be allowed
6. time to live for the current entry
At the first attempt, the delivery is rejected with a temporary error code, typically "451 temporary local problem, try again later", and the data above is recorded. Any subsequent delivery attempts matching fields 1) through 3) that happen before the time specified in field 5) are essentially ignored, that is, treated to the same temporary error. When a delivery matching fields 1) through 3) is attempted after the specified time, the IP address (or in some implementations, the whole subnet) is /whitelisted/, meaning that any subsequent deliveries from that IP address will be passed on to the mail service.

The first release of OpenBSD's spamd to support greylisting was OpenBSD 3.5. spamd's greylisting implementation operates only on individual IP addresses, and by default sets the minimum time before a delivery attempt passes to 25 minutes, the time to live for a greylist entry to 4 hours, while a whitelisted entry stays in the whitelist for 36 days after the delivery of the last message from that IP address. With a properly configured setup, machines that receive mail from your outgoing mail servers will automatically be whitelisted, too.
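The bookkeeping described above can be sketched in a few lines of Python. This is a simplified model, not spamd's actual code: the class, method and return value names are assumptions made up for the illustration, and only the three default intervals are taken from the text.

```python
from dataclasses import dataclass

# Default intervals as described in the text, in seconds.
PASSTIME = 25 * 60                 # minimum wait before a retry passes
GREY_EXPIRY = 4 * 60 * 60          # time to live for a greylist entry
WHITE_EXPIRY = 36 * 24 * 60 * 60   # time to live for a whitelist entry


@dataclass
class GreyEntry:
    first_seen: float   # time of first delivery attempt
    pass_after: float   # time a retry will be allowed
    expires: float      # time to live for the entry


class Greylister:
    """Hypothetical model of per-IP greylisting, for illustration only."""

    def __init__(self):
        self.grey = {}    # (ip, sender, recipient) -> GreyEntry
        self.white = {}   # ip -> time the whitelist entry expires

    def check(self, ip, sender, recipient, now):
        """Return 'pass', 'whitelisted' or 'tempfail' for one attempt."""
        # Already whitelisted: pass on and refresh the whitelist entry.
        if self.white.get(ip, 0) > now:
            self.white[ip] = now + WHITE_EXPIRY
            return "pass"
        key = (ip, sender, recipient)
        entry = self.grey.get(key)
        if entry is None or entry.expires <= now:
            # First contact, or the old entry has expired: record and defer.
            self.grey[key] = GreyEntry(now, now + PASSTIME, now + GREY_EXPIRY)
            return "tempfail"
        if now < entry.pass_after:
            # Retried too early: same temporary error again.
            return "tempfail"
        # Retried after the waiting period: whitelist the IP address.
        del self.grey[key]
        self.white[ip] = now + WHITE_EXPIRY
        return "whitelisted"
```

A sender that retries the same message after 26 minutes moves to the whitelist, and any later message from that IP address, whatever its From: and To: addresses, is passed on.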

The great advantage to the greylisting approach is that mail sent from correctly configured mail servers will be let through. New correspondents will experience an initial delay for the first message to get through and their IP address is added to the whitelist. The initial delay will vary depending on a combination of the length of your minimum time before passing and the sender's retry interval. Regular correspondents will find that once they have cleared the initial delay, their IP addresses are kept in the whitelist as long as email contact is a regular affair.

And the technique is amazingly effective in removing spam. 80% to 95% or better reduction in the number of spam messages is frequently cited, but unfortunately only a few reports with actual numbers have been published. An often-cited report is Steve Williams' message on openbsd-misc (available among other places at marc.info), where Steve describes how he helped a proprietary antispam device cope with an unexpected malware attack. He notes quite correctly that the blocked messages were handled without receiving the message body, so their apparently metered bandwidth use was reduced.

Even after more than four years, greylisting remains extremely effective. Implementing greylisting greatly reduces the load on your content filtering systems, but since messages sent by real mail servers will be let through, it will sooner or later also let a small number of unwanted messages through, so it does not eliminate the need for content filtering altogether. You will also still occasionally encounter sites that do not play well with greylisting; see the references for tips on how to deal with those.

Do we need blacklists?
With greylisting taking care of most of the spam, is there still a place for blacklists? It's a fair question. The answer depends in a large part on how the blacklists you are considering are constructed and how much you trust the people who generate them and the methods they use.

The theory behind all good blacklists is that once an IP address has been confirmed as a source of spam, it is unlikely that there will be any valid mail sent from that IP address in the foreseeable future.

With a bit of luck, by the time the spam sender gets around to trying to deliver spam to addresses in your domain, the spam sender will already be on the blacklist and will in turn be treated to the s-l-o-w SMTP dialogue.

Knowing how a host makes it into a blacklist is important, but a clear policy for checking that the entries are valid and for removing entries is essential too. Once spam senders are detected, it is likely that their owners will do whatever it takes to stop the spam sending. Another reason to champion 'aggressive maintenance' of blacklists is that it is likely that IP addresses are from time to time reassigned, and some ISPs do in fact not guarantee that a certain physical machine will be assigned the same IP address the next time it comes online.

Your spamd.conf file contains a few suggested blacklists. You should consider carefully which ones to use. Take the time you need to look up the web pages listed in the list descriptions in the spamd.conf file and then decide which lists fit your needs. If you decide to use one or more blacklists, edit your spamd.conf to include those and set up a cron job to let spamd-setup load updated blacklists at regular intervals.

The lists I consider the more interesting ones are the nixspam list, with a 4 day expiry, and the uatraps list, with a 24-hour expiry. The nixspam list is maintained by ix.de, based on their logs of hosts that have verifiably sent spam to their mail servers. The uatraps list is worth looking into too, mainly because it is generated automatically by greytrapping.

Behavior based response: Greytrapping
Greytrapping is yet another useful technique that grew out of hands-on empirical study of spammer behavior, taken from the log data available at ordinary mail servers. You have probably seen spam messages offering lists of "millions of verified email addresses" available. However, verification goes only so far. You can get a reasonable idea of the quality of that verification if you take some time to actually browse mail server logs for failed deliveries to addresses in your domain. In most cases you will find a number of attempts at delivering to addresses that either have never existed or at least have no valid reason to receive mail.

The OpenBSD spamd developers saw this too. They also realized that what addresses are deliverable or not in your own domain is something you have complete control over, and they formulated the following rule to guide a new feature to be added to spamd:
"if we have one or more addresses that we are quite sure will never receive valid email, we can safely assume that any mail sent to those addresses is spam"
That feature was dubbed greytrapping, and was introduced in spamd in time for the OpenBSD 3.7 release. The way it works is that if a machine that is already greylisted tries to deliver mail to one of the addresses on the list of known bad email addresses, that machine's IP address is added to a special local blacklist called spamd-greytrap. The address stays in the spamd-greytrap list for 24 hours, and any SMTP traffic from hosts in that blacklist is treated to the tarpit for the same period.
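As a rough illustration of the greytrapping rule, the following Python sketch keeps a set of trap addresses and a table of trapped hosts with a 24-hour lifetime. The class and method names are hypothetical and the model leaves out everything else spamd does; it only shows the decision described in the paragraph above.

```python
TRAP_SECONDS = 24 * 60 * 60   # a trapped host stays blacklisted for 24 hours


class Greytrap:
    """Hypothetical model of a local greytrap list, for illustration only."""

    def __init__(self, trap_addresses):
        # Addresses we are quite sure will never receive valid email.
        self.trap_addresses = set(trap_addresses)
        self.trapped = {}   # ip -> time at which the entry expires

    def is_trapped(self, ip, now):
        """True while the host's entry in the local blacklist is still valid."""
        return self.trapped.get(ip, 0) > now

    def observe(self, ip, recipient, now):
        """Record one delivery attempt from a greylisted host.

        Returns True if the host is in the local blacklist afterwards.
        """
        if recipient in self.trap_addresses:
            self.trapped[ip] = now + TRAP_SECONDS
        return self.is_trapped(ip, now)
```

A host that tries a trap address once is then tarpitted for every recipient until its entry expires.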

This is the way the uatraps list is generated. Bob Beck put a list of addresses he has referred to as 'ghosts of usenet postings past' on his local greytrap list, and started exporting the IP addresses he collects automatically to a freely available blacklist. As far as I know Bob has never published the list of email addresses in his spamtrap list, but the machines at University of Alberta appear to be targeted by enough spammers to count. At the time this article was written, the uatraps list typically contained roughly 120,000 addresses, and the highest number of addresses I have seen reported by my spamd-setup was just over 180,000 (it peaked later at just over 670,000 addresses). See Figure 1 for a graphical representation of the number of hosts in the uatraps list over the period February 2006 through early March 2008.

Figure 1: Hosts in uatraps

By using a well maintained blacklist such as the uatraps list you are likely to add a few more percentage points to the amount of spam stopped before it reaches your content filtering or your users, and you can enjoy the thought of actively wasting spammers' time.

A typical log excerpt for a blacklisted host trying to deliver spam looks like this:

Jan 16 19:55:50 skapet spamd[27153]: 82.174.96.131: connected (3/2), lists: uatraps
Jan 16 19:59:33 skapet spamd[27153]: (BLACK) 82.174.96.131: <bryonRoe@boxerdelasgargolas.com> -> <schurkoxektk@ehtrib.org>
Jan 16 20:01:17 skapet spamd[27153]: 82.174.96.131: From: "bryon Roe" <bryonRoe@boxerdelasgargolas.com>
Jan 16 20:01:17 skapet spamd[27153]: 82.174.96.131: To: schurkoxektk@ehtrib.org
Jan 16 20:01:17 skapet spamd[27153]: 82.174.96.131: Subject: vresdiam
Jan 16 20:02:33 skapet spamd[27153]: 82.174.96.131: disconnected after 403 seconds. lists: uatraps

This particular spammer hung around at a rate of 1 byte per second for 403 seconds (six minutes, forty-three seconds), going through the full dialogue all the way up to the DATA part before my spamd rejected the message back to the spammer's queue.
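If you want to collect connection lengths like this one from your own logs, a small Python sketch along the following lines may help. The regular expression is written against the "disconnected after" line in the excerpt above; treating the trailing "lists:" part as optional is an assumption on my part, and the function names are made up for the example.

```python
import re

# Matches the closing line spamd logs for a connection, as in the excerpt.
# The trailing "lists: ..." part is treated as optional (an assumption).
DISCONNECT_RE = re.compile(
    r"spamd\[\d+\]: (?P<ip>\d+\.\d+\.\d+\.\d+): "
    r"disconnected after (?P<seconds>\d+) seconds\."
    r"(?: lists?: (?P<lists>.*))?$"
)


def parse_disconnect(line):
    """Return (ip, seconds, [list names]), or None if the line does not match."""
    match = DISCONNECT_RE.search(line.rstrip())
    if match is None:
        return None
    lists = match.group("lists").split() if match.group("lists") else []
    return match.group("ip"), int(match.group("seconds")), lists


def as_minutes_seconds(seconds):
    """Split a duration in seconds into whole minutes and remaining seconds."""
    return divmod(seconds, 60)
```

Fed the last line of the excerpt, parse_disconnect returns the address, 403 and the list name uatraps, and as_minutes_seconds(403) gives (6, 43), matching the six minutes and forty-three seconds noted above.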

Figure 2: Connection lengths measured at bsdly.net's spamd

That is a fairly typical connection length for a blacklisted host. Statistics from my sites (see Figure 2) show that most connections to spamd last from 0 to 3 seconds, a few hang on for about 10 seconds, and the next peak is at around 400 seconds. Then there's a very limited number that hang around for anywhere from 30 minutes to several hours, but those are too rare to be statistically significant (and damned near impossible to graph sensibly in relation to the rest of the data).

Interaction with a running spamd: spamdb
Your main interface to the contents of your spamd related data is the spamdb administration program. The command

$ sudo spamdb

without any parameters will give you a complete listing of all entries in the database, whether WHITE, GREY or others. In addition, the program supports a number of different operations on entries in spamd's data, such as adding or deleting entries or changing their status in various ways. For example,

$ sudo spamdb -a 192.168.110.12

will add the host 192.168.110.12 to your spamd's whitelist or update its status to WHITE if there was an entry for that address in the database already. Conversely, the command

$ sudo spamdb -d 192.168.110.12

will delete the entry for that IP address from the database. For greytrapping purposes, you can add or delete spamtrap email addresses by using a command such as

$ sudo spamdb -T -a wkitp98zpu.fsf@datadok.no
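When you have more than a handful of spamtrap addresses, typing the commands by hand gets old quickly. The Python sketch below only builds the argument lists for the spamdb -T -a command shown above, one per address; the function name is hypothetical, and actually running the commands (with the necessary privileges) is left to you.

```python
def spamdb_trap_commands(addresses):
    """Build one 'spamdb -T -a <address>' argument list per trap address.

    Surrounding whitespace and angle brackets are removed, and empty,
    malformed or repeated entries are skipped.
    """
    commands = []
    seen = set()
    for address in addresses:
        address = address.strip().strip("<>")
        if not address or "@" not in address or address in seen:
            continue
        seen.add(address)
        commands.append(["spamdb", "-T", "-a", address])
    return commands
```

Each resulting list can be handed to subprocess.run(command, check=True) from the standard library when you are ready to load the addresses for real.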

Hitting back, poisoning their well: Summary of my field notes
Up until July 2007, I ran my spamd installations with greylisting, supplemented by hourly updates of the uatraps blacklist and a small local list of greytrapping addresses like the one in the previous section, which is obviously a descendant of a message-id, probably harvested from a news spool or from some unfortunate malware victim's mailbox. Then something happened that made me take a more active approach to my greytrapping.

My log summaries showed me an unusually high number of attempted deliveries to non-existent addresses in the domains I receive mail for. Looking a little closer at the actual logs showed spam backscatter: Somebody, somewhere had sent a large number of messages with made up addresses in one of our domains as the From: or Reply-to: addresses, and in those cases where the To: address wasn't deliverable either, the bounce messages were sent back to our servers.

The fact that they were generating bounces to the spam messages indicates that any copies of those messages directed at actually deliverable addresses in those domains would have been delivered to actual users' mailboxes, not too admirable in itself.

Another variety that showed up when I browsed the spamd logs was this type:

Jul 13 14:36:50 delilah spamd[29851]: 212.154.213.228: Subject: Considered UNSOLICITED BULK EMAIL, apparently from you
Jul 13 14:36:50 delilah spamd[29851]: 212.154.213.228: From: "Content-filter at srv77.kit.kz" <postmaster@srv77.kit.kz>
Jul 13 14:36:50 delilah spamd[29851]: 212.154.213.228: To: <skulkedq58@datadok.no>

which could only mean that the administrators at that system had not yet learned that spammers no longer use their own From: addresses.

Roughly at that time it struck me:
1. Spammers, one or more groups, are generating numerous fake and nondeliverable addresses in our domains.
2. Adding those generated addresses to our local list of spamtraps is mainly a matter of extracting them from our logs.
3. If we could make the spammers include those addresses in their To: addresses, too, it gets even easier to stop incoming spam and shift the spammers to the one-byte-at-a-time tarpit. Putting the trap addresses on a web page we link to from the affected domains' home pages will attract the address slurping robots sooner or later.
or the short version: Let's poison their well!

(Actually in the first discussions about this with my BLUG user group friends, we referred to this as 'brønnpissing' in Norwegian, which translates as 'urinating in their well'. The more detailed descriptions of the various steps in the process can be tracked via blog entries at http://bsdly.blogspot.com, starting with the entry dated Monday, July 9th, 2007, Hey, spammer! Here's a list for you!.)

Over the following weeks and months I collected addresses from my logs and put them on the web page at http://www.bsdly.net/~peter/traplist.shtml.

After a while, I determined that harvesting the newly generated soon-to-be-spamtrap addresses directly from our greylist data was more efficient and easier to script than searching the mail server logs. Using spamdb, you can extract the current contents of the greylist with

$ sudo spamdb | grep GREY

which produces output in the format

GREY|96.225.75.144|Wireless_Broadband_Router|<aguhjwilgxj@bn.camcom.it>|<bsdly@bsdly.net>|1198745212|1198774012|1198774012|1|0
GREY|206.65.163.8|outbound4.bluetie.com|<>|<leonard159@datadok.no>|1198752854|1198781654|1198781654|3|0
GREY|217.26.49.144|mxin005.mail.hostpoint.ch|<>|<earle@datadok.no>|1198753791|1198782591|1198782591|2|0

where GREY is what you think it is, the IP address is the sending host's address, the third entry is what the sender identified as in the SMTP dialogue (HELO/EHLO), the fourth is the From: address, the fifth is the To: address. The next three are date values for first contact, when the status will change from GREY to WHITE and when the entry is set to expire, respectively. The final two fields are the number of times delivery has been blocked from that address and the number of connections passed for the entry.

For our purpose, extracting the made up To: addresses in our domains from backscatter bounces, it is usually most efficient to search for the "<>" indicating bounces, then print the fifth field. Or, expressed in grep and awk:

$ sudo spamdb | grep "<>" | awk -F\| '{print $5}' | tr -d '<>' | sort | uniq

will give you a sorted list of unique intended bounce-to addresses, in a format ready to be fed to a corresponding script for feeding to spamd. The data above and the command line here would produce

earle@datadok.no
leonard159@datadok.no

- in some situations, the list will be a tad longer than in this illustration.

This does not cover the cases where the spammers apparently assume that any mail with From: addresses in the local domain will go through, even when they come from elsewhere. Extracting the fourth column instead

# spamdb | grep GREY | awk -F\| '{print $4}' | grep mydomain.tld | tr -d '<>' | sort | uniq

will give you a list of From: addresses in your own domain to weed out a few more bad ones from.
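If you would rather do the extraction in a script than in a shell pipeline, here is a minimal Python sketch of the same two extractions, working on spamdb output captured as text. The function names are made up for the example, and the matching is slightly stricter than the grep versions: it only looks at GREY entries, tests the fourth field for exactly "<>", and matches the domain at the end of the From: address rather than anywhere in the line.

```python
def bounce_targets(spamdb_output):
    """Return sorted unique To: addresses of greylisted bounces.

    A bounce is recognised by the empty sender "<>" in the fourth field.
    """
    targets = set()
    for line in spamdb_output.splitlines():
        fields = line.strip().split("|")
        if len(fields) < 5 or fields[0] != "GREY":
            continue
        sender, recipient = fields[3], fields[4]
        if sender == "<>":
            targets.add(recipient.strip("<>"))
    return sorted(targets)


def local_domain_senders(spamdb_output, domain):
    """Return sorted unique From: addresses that claim to be in `domain`."""
    senders = set()
    suffix = "@" + domain.lower()
    for line in spamdb_output.splitlines():
        fields = line.strip().split("|")
        if len(fields) < 5 or fields[0] != "GREY":
            continue
        sender = fields[3].strip("<>")
        if sender.lower().endswith(suffix):
            senders.add(sender)
    return sorted(senders)
```

With the three sample GREY lines from above as input, bounce_targets returns earle@datadok.no and leonard159@datadok.no, the same two addresses the grep and awk pipeline produces.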

After a while, I started seeing very visible and measurable effects. At short intervals, we see spam runs targeting the addresses in the published list, working their way down in more or less alphabetical order. For example, in my field notes dated November 25, 2007, I noted

"earlier this month the address capitalgain02@gmail.com
 started appearing frequently enough that it caught my
 attention in my greylist dumps and log files. The earliest contact as far as I can see was at
 Nov 10 14:30:57, trying to spam wkzp0jq0n6.fsf@datadok.no
 from 193.252.22.241 (apparently a France Telecom customer).
 The last attempt seems to have been ten days later, at
 Nov 20 15:20:31, from the Swedish machine 217.10.96.36. My logs show me that during that period 6531 attempts
 had been made to deliver mail from capitalgain02@gmail.com
 via bsdly.net, from 35 different IP addresses, to 131 different
 recipients in our domains. Those recipients included three
 deliverable addresses, mine or aliases I receive mail for.
 None of those attempts actually succeeded, of course."

It is also worth noting that even the decrepit Pentium III 800MHz (since replaced with a Pentium 4 box, donations of more recent hardware gratefully accepted) at the end of the unexciting DSL line to my house has been able to handle about 190 simultaneous connections from TRAPPED addresses without breaking into a sweat. For some odd reason, the number of simultaneous connections at the other sites I manage with better bandwidth has not been as high as at my home gateway.

During the months I've been running the trapping experiment, the number of spamtrap addresses in the published list has grown to more than 10,000 addresses (by May 4th, 2013, the list had grown to 24431 entries). Oddly enough, my greylist scans still show up a few more every few days.

Meanwhile, my users report that spam in their mailboxes is essentially non-existent. On the other side of the fence, there are indications that it may have dawned on some of the spammers that generating random addresses in other people's domains might end up poisoning their own well, so they started introducing patterns to be able to weed out their own made up addresses from their lists. I take that as a confirmation that our harvesting and republishing efforts have been working rather well.

The method they use is to put some recognizable pattern into the addresses they generate. One such pattern is to take the victim domain name, prepend "dw" and append "m" to make up the local part and then append the domain, so starting from sia.com we get dwsiam@sia.com.

There is one other common variation on that theme, where the prepend string is "lin" and the append string is "met", producing addresses like linhrimet@hri.de. Then again, when they use that new, very recognizable, address to try to spam my spamtrap address malseeinvmk@bsdly.net, another set of recognition mechanisms is activated, and the sending machine is quietly added to my spamd-greytrap. (We've since seen other patterns come and go; scanning the list at http://www.bsdly.net/~peter/traplist.shtml will show examples of them all.)
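
The two patterns described above are simple enough to express in a few lines. The following Python sketch is only an illustration: it assumes that "the victim domain name" means the first label of the domain (sia for sia.com, hri for hri.de), which matches both examples, and the function names are made up here.

```python
# Sketch of the spammers' address patterns described in the article.
# Assumption: the "domain name" is the first label of the domain.
# Both function names are hypothetical.

def patterned_address(domain, prefix, suffix):
    """Build prefix + first domain label + suffix @ domain."""
    label = domain.split(".")[0]
    return f"{prefix}{label}{suffix}@{domain}"


def matches_pattern(address, prefix, suffix):
    """Check whether an address was built by patterned_address()."""
    local, _, domain = address.partition("@")
    if not domain:
        return False
    return local == prefix + domain.split(".")[0] + suffix


if __name__ == "__main__":
    print(patterned_address("sia.com", "dw", "m"))
    print(patterned_address("hri.de", "lin", "met"))
```

A check like matches_pattern() is presumably close to what the spammers run against their own lists; the same test works just as well for spotting their generated sender addresses in a greylist dump.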

And finally, there are clear indications that spammers use slightly defective relay checkers that tend to conclude that a properly configured spamd is an open relay, swelling my greylists temporarily. We already know that the spammers do not use From: addresses they actually receive mail for, and consequently they will never know that those messages were in fact never delivered.

If you've read this far and you're still having fun, you can find other anecdotes I would have had a hard time believing myself a short time back in my field notes at . By the time the magazine has been printed and distributed (or by the time you find this revised article online), there might even be another few tall tales there.

You might also want to read

The Book of PF, 2nd Edition, by Peter N. M. Hansteen, No Starch Press November 2010 (covers both pre-4.7 and post-4.7 syntax), available in better bookshops or from the publisher

The Next Step in the Spam Control War: Greylisting, by Evan Harris. Available at http://greylisting.org/articles/whitepaper.shtml

Maintaining A Publicly Available Blacklist - Mechanisms And Principles, April 14, 2013 describes the maintenance regime for the published version of my spamd-greytrap list

In The Name Of Sane Email: Setting Up OpenBSD's spamd(8) With Secondary MXes In Play - A Full Recipe, May 28, 2012, offers another, more OpenBSD-centric, recipe for setting up a spamd based system.

This article originally appeared in BSD Magazine #2, June 2008. This re-publication has suffered only minor updates and edits.

If you're interested in OpenBSD in general, you have a real treat coming up in the form of Michael W. Lucas' Absolute OpenBSD, 2nd edition. If a firewall or other networking is closer to your heart, you could give my own The Book of PF, and the PF tutorial it grew out of, a try. You can even support the OpenBSD project by buying the books from them at the same time you buy your CD set, see the OpenBSD Orders page for more information.

Upcoming talks: I'll be speaking at BSDCan 2013, on The Hail Mary Cloud And The Lessons Learned, with a preview planned for the BLUG meeting a couple of weeks before the conference. There will be no PF tutorial at this year's BSDCan; fortunately, my staple tutorial item was crowded out by new initiatives from some truly excellent people.
May 02, 2013
You've Installed It. Now What? Packages! (May 02, 2013, 00:05 UTC)
Once you've installed your OpenBSD system, packages are there to make your life easier. A works for me/life is good guide for your weekend reading.

Installing OpenBSD is easy, and takes you maybe 20 minutes. Most articles and guides you find out there will urge you to take a look at the files in /etc/ and explore the man pages to make the system do what you want. With a modern BSD, the base system is full featured enough that you can in fact get a lot done right away just by editing the relevant files and perhaps starting or restarting one or more services. If all you want to do is set up something like a gateway for your network with basic-to-advanced packet filtering, everything you need is already there in the basic install.

Then again, all the world is not a firewall, and it is likely you will want to use, for example, a web browser other than the venerable lynx or editing tools that are not vi or mg. That's where packages and package systems come in. I'll skip a little ahead of myself and make a confession: The machine I'm writing this piece on reports that it has some 381 packages installed.

Before we move on to the guts of this article, some ceremonial words of advice: If you're new to OpenBSD or it's your first time in a while on a freshly installed system, you could do a lot worse than spending a few minutes reading man afterboot. That man page serves as a handy checklist of things you should at least take a peek at to ensure that your system is in good working order.

Some packages will write important information, such as strings or stanzas to put in your rc.conf.local, rc.local or sysctl.conf files, to your terminal. If you're not totally confident what to do after the package install finishes, it may be a good idea to run your ports and packages installs in a script(1) session. See man script for details.

When dinosaurs roamed the Earth ...
The story of the ports and packages goes back to the early days of free software when we finally found ourselves with complete operating systems that were free and hackers^H^H^H^H^H^H system administrators found that even with full featured operating systems such as the BSDs, there were sometimes things you would want to do that were not already in there.

The way to get that something else was usually to fetch the source code, see if it would compile, make some changes (or a lot) to make it compile, possibly introduce the odd #ifdef block and keep at it until the software would compile, install and run. In the process you most likely found out what, if any, other software (tools or libraries) needed to be installed to complete the process. At that point, you could claim to have ported the software to your platform. If you had been careful and saved a copy of the original source files somewhere, you could use the diff(1) utility to create a patch you could then send to the program maintainer and hope that he or she would then incorporate your changes in the next release.

But then, why wait for the next release? Why not share those diffs with others? How about putting it into a CVS repository that would be available to everyone? That idea was tossed around on relevant mailing lists for a while, and the first version of the ports system appeared in FreeBSD 1.0 in December 1993.

The other BSD systems adopted the basic idea and framework soon after, with small variations. On NetBSD, the term port was already in use for ports of the operating system itself to specific hardware platforms, so on that operating system, the ports tree is referred to as 'package source', or pkgsrc for short. The ports and packages tools are still actively maintained and developed on all BSDs, and most notably Marc Espie rewrote the pkg_* tools for OpenBSD's 3.5 release. Marc and other OpenBSD developers have been refining the package tools with every release since then.

Parallel development has led to some differences in the package handling on the various BSDs, and some of the operations I describe here from an OpenBSD perspective may not be identical on other operating systems.

Around the same time the BSDs started including a ports tree and packages, people on the Linux side of the fence started developing package systems too. With distributed development taken to the point where the kernel, basic system tools and libraries are maintained separately, perhaps the need there was even greater than on the BSDs.

In fact, some Linux distributions such as the Debian based ones have taken the package management to the point where 'everything is a package' - every component on a running system is a package that is maintained via the package system, including basic system tools, libraries and the operating system kernel. In contrast, the BSDs tend to treat the base system as a whole, with the package management tools intended solely for managing software that does not come as a part of the default install.

The anatomy of ports and packages
The ports system consists of a set of 'recipes' to build third party software to run on your system. Each port supplies its own Makefile, whatever patches are needed in order to make the software build and, optionally, package message files with information that will be displayed when the software has been installed.

So to build and install a piece of software using the ports system, you follow a slightly different procedure than the classical fetch - patch - compile cycle. You will need to install the ports tree, either by unpacking ports.tar.gz from your CD set or by checking out an updated version via cvs. With a populated ports tree in hand, you can go to the port's directory, say

$ cd /usr/ports/misc/screen

to see about installing screen, the popular GNU multi-screen window manager. On a typical OpenBSD system, that directory contains the following files:

$ ls -l
total 20
drwxr-xr-x  2 root  wheel   512 Mar 31 16:46 CVS
-rw-r--r--  1 root  wheel  1047 Mar 28 17:34 Makefile
-rw-r--r--  1 root  wheel   283 Apr  5  2007 distinfo
drwxr-xr-x  3 root  wheel   512 Jun 26  2012 patches
drwxr-xr-x  3 root  wheel   512 Mar 11  2012 pkg

Here, the Makefile is the main player. If you open it now in a text editor or viewer such as less, you will see that the syntax is quite straightforward. What it does is mainly to define a number of variables such as the package name, where to fetch the necessary source files, which programs are required for the compile to succeed and which libraries the resulting program will need to have present in order to run correctly. The file defines a few other variables too, and you can look up the exact meaning of each in the man pages, starting with man ports and man bsd.port.mk.

With all relevant variables set, at the very end the file uses the line

.include <bsd.port.mk>

to pull in the common infrastructure it shares with all other ports. This is what makes the common targets work, so for example, typing

$ sudo make install

(probably the most common port-related make command for end users and administrators) in the port directory will start the process to install the software.

But before you type that command and press Enter, you may want to consider this: This command will generate a lot of output, most likely more than will fit in the terminal's buffer. If the build fails, it is likely that the message about the first thing that went wrong will have scrolled off the top of your screen and out of the terminal buffer. For that reason, it is good sysadmin practice to create a record of lengthy operations such as building a port by using the script command. Typing script in a shell will give you a subshell where everything displayed on the screen will be saved in a file. Escape sequences, asterisk-style progress bars and 'twirling batons' will end up a bit garbled, but that essential message you are looking for will be there too. man script will give you the details, and unless you're an incurable packrat, do remember to delete the typescript file afterwards.

That process will start with checking dependencies, go on with downloading the source archive and checking that the fetched file matches the cryptographic signatures stored in the distinfo file. If the signatures match, the source code is extracted to a working directory, the patches from the patches/ directory are applied, and the compilation starts. If the dependency check finds that one or more pieces are missing, you will see that the process fetches, configures and installs the required package before continuing with the build process for the original package.

After a while, the package build most likely succeeds and the install completes. At this point you will have a new piece of software installed on your system. You should be able to run the program, and the installed package will turn up in the package listings output by pkg_info, such as

$ pkg_info | grep screen
screen-4.0.3p3      multi-screen window manager

This information is taken from the package's subdirectory in /var/db/pkg, where the information about currently installed packages is stored.

If you paid close attention during the make install process, you may have noticed that the install step was performed from a binary package. This is one of the distinctive features of the OpenBSD version of the package system. The package build always generates an installable package based on a 'fake' install to a private directory, and software is always installed on the target system from a package. And now we should mention that on a typical modern OpenBSD system, you wouldn't want to install GNU Screen at all. Since the OpenBSD 4.6 release, equivalent (or better!) functionality has been included in the OpenBSD base system via tmux(1).

But you don't need to do that!

This means several things. If you have built and installed a package by typing make install in the relevant ports directory and later run make deinstall or pkg_delete to remove the software, any subsequent install of the software will take place from the package file stored in a subdirectory of /usr/ports/packages.

But more importantly, in most cases you can keep your system's packages up to date without a ports tree on the machine. The main exceptions to the rule that precompiled packages are available from the mirrors are software with licenses that do not allow redistribution or require the end user to do specific things such as go to a web site and click a specific button to formally accept a set of conditions. In those cases it can't be helped, and you will need to go via the ports system to create a package locally and install that.

For each release, a full set of packages is built and made available on the OpenBSD mirrors, and by the time you read this, there is reason to hope that running updates to -stable packages will be available for supported releases too.
The way to make good use of this is to set the PKG_PATH variable to include the packages directory for your release on one or more mirrors close to you and/or a local directory, and then run pkg_add with the -u flag.

My laptop runs -current and I'm based in Europe, so the PKG_PATH is set to
PKG_PATH=ftp://ftp.eu.openbsd.org/pub/OpenBSD/snapshots/packages/`uname -m`/

On a more conservatively run system, you may want to set it to something like
PKG_PATH=ftp://ftp.eu.openbsd.org/pub/OpenBSD/`uname -r`/packages/`uname -m`/
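
To make the structure of the two PKG_PATH values explicit, here is a small Python sketch that assembles them from their parts. The uname -m and uname -r parts of the values above stand for the output of those commands; the values "amd64" and "5.3" used below are only example stand-ins for that output, and the function name pkg_path is made up for this illustration.

```python
# Sketch: assemble a PKG_PATH value from mirror, machine and release.
# "amd64" and "5.3" are assumed example values for the output of
# `uname -m` and `uname -r`; pkg_path is a hypothetical helper.

def pkg_path(mirror, machine, release=None):
    """Return the snapshots path if no release is given, else the release path."""
    base = mirror.rstrip("/")
    if release is None:
        return f"{base}/snapshots/packages/{machine}/"
    return f"{base}/{release}/packages/{machine}/"


if __name__ == "__main__":
    mirror = "ftp://ftp.eu.openbsd.org/pub/OpenBSD"
    print(pkg_path(mirror, "amd64"))          # a system following -current
    print(pkg_path(mirror, "amd64", "5.3"))   # a system following a release
```

The only difference between the two cases is whether the release number or the fixed word snapshots selects the directory on the mirror.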

If you want to find out what packages are available at your favorite mirror, you can get a listing of package names by fetching the file $PKG_PATH/index.txt. Another nice resource is openports.se, which offers a nice clickable interface. Once your PKG_PATH is set to something sensible, you can use pkg_add and the package base name to install packages, so a simple

$ sudo pkg_add screen

would achieve the same thing as the 'make install' command earlier (minus the lengthy compilations, and still assuming that you would want to install the package instead of getting to know tmux(1), which is included in the base system), and most likely a lot faster too.

Once you have a set of packages installed, and keeping in mind that you need a meaningful PKG_PATH, you can keep them up to date using pkg_add -u. If you want more detailed information about the package update process and want pkg_add to switch to interactive mode when necessary, you can use something like this command:
$ sudo pkg_add -vui

I have at times tended to run my pkg_add -u with some of the -F flags in order to force resolution of certain types of conflict, but given the quality of the work that goes into the packages, most of the -F options are rarely needed.

pkg_add and its siblings in the pkg_* tools collection have a number of options we have not covered here, all intended to make your package management on OpenBSD as comfortable and flexible as possible. The tools come with readable man pages, and may very well be the topic of future articles. You should also be aware that Michael W Lucas's Absolute OpenBSD, 2nd Edition is about to be released (already available as an ebook), with a more in-depth treatment of the package system than what I've presented here. Look at the end of the article for further links.

How do I make a package then?
That is a large question, and the first question you should ask if you think you want to port a particular piece of software is, "Has this already been ported?". There are several ways to check. If you are thinking of creating a port, you most likely already have the ports tree installed, so using the ports infrastructure's search facility is the obvious first step. Simply go to the /usr/ports directory and run the command

$ make search key=mykeyword

where mykeyword is a program name or keyword related to the software you are looking for. One other option with even more flexible search possibilities is to install databases/sqlports. And of course, searching the ports mailing list archives (http://marc.info/?l=openbsd-ports) or asking the mailing list works too.

When you have determined that the software you want to port is not already available as a package, you can go on to prepare for the porting effort. Porting and package making is the subject of much usenet folklore and rumor, but in addition you have several man pages with specific information on how to proceed. These are ports(7), package(5), packages(7), packages-specs(7), library-specs(7) and bsd.port.mk(5).

Read those and use your familiarity with the code you are about to port to find your way. The OpenBSD web site offers quite a bit of information too. You could start with re-reading the main ports and packages page at http://www.openbsd.org/faq/faq15.html, and follow up with the pages about the porting process at http://www.openbsd.org/porting.html, testing the port at http://www.openbsd.org/porttest.html and finally the checklist for a sound port at http://www.openbsd.org/checklist.html.

All the while, try first to figure out the solution to any problems that pop up, read the supplied documentation, and only then ask port maintainers via the ports mailing list for help. Port maintainers are generally quite busy, but if you show signs of having done your homework first, there is no better resource available for helping you succeed in your porting or port maintenance efforts.

One fine resource for the aspiring porter is Bernd Ahlers' ports tutorial from OpenCon 2007 (hm. doesn't that need a refresh?). You can look up Bernd's slides at http://www.openbsd.org/papers/opencon07-portstutorial/index.html, and it is possible he can be persuaded to repeat the tutorial at a conference near you. And for some recent advances in the OpenBSD ports and packages system, see Marc Espie's EuroBSDCon 2012 presentation Advances in packages and ports in OpenBSD.

The main source of information about the OpenBSD ports and packages system is to be found on the OpenBSD project's web site. The FAQ's ports and packages section at http://www.openbsd.org/faq/faq15.html has more information about all the issues covered in this article, and goes into somewhat more detail than space allows here. If you encounter problems while installing or managing your packages, it is more than likely that you will find a solution or a good explanation there. And of course, if nothing else works or you can't figure it out, there is always the option of asking the good people at misc@openbsd.org or ports@openbsd.org (do read the OpenBSD Mailing Lists page before just butting in) or search the corresponding mailing list archives.

An earlier version of this article appeared in BSD Magazine 2/2008. You can now also find this updated version featured at OpenBSD Journal (aka undeadly.org), the primary OpenBSD news site.

April 30, 2013
a.k.a. phk
Sunshine into the constitution! (April 30, 2013, 08:11 UTC)
One of the more admired and quoted Supreme Court justices of the USA was Louis Brandeis, about whom much good has been said and written. Very few of those writings come anywhere near the level of his own pen, though, and he is most famous of all for the expression "Sunshine is said to be the best disinfectants", which originates ...
April 28, 2013
a.k.a. pto
UNIX tinkering: a faster uptime? (April 28, 2013, 18:11 UTC)
One of the places where I am a bit tired of UNIX is "uptime", which shows three system "load" values, averaged over the last one, five and ten minutes. We can discuss what load is, but my point is more about those time constants. How do you get the equivalent on a much smaller time scale? Example: $ uptime 17...
April 26, 2013
The Slime Also Evolves: New bruteforce ssh attempts come in at 10 second intervals, and they keep going even if we block them.
Is this the warmup to a new iteration of the Hail Mary Cloud?

Note: This post has been updated with a correction, see the end of the article.

Regular readers will remember the activities of the Hail Mary Cloud, which turned up in authentication logs with large numbers of unsuccessful ssh login attempts, apparently coordinated across a large number of source IP addresses and with any individual host in the attacker set making new attempts at intervals of anything from several seconds to several minutes.

At the time, commentators took these activities either as an indication of a truly inspired idea from a brilliant mind (after all, avoiding detection is essential) or a token of almost unimaginable ineptitude or perhaps just an overdose of faith that if you keep going long enough, even extremely unlikely things will happen.

It's been a little while now since we last saw the slow, distributed bruteforce attacks at work here at the BSDly labs (we've kept collecting data here), but one curious incident during the last week indicates that somebody, somewhere is still working on ssh cracking scripts that operate on fairly similar methods.

Bruteforce attacks can be fairly easy to detect and head off. In most cases the attacker comes in with a larger than usual number of login attempts in rapid succession from a single IP address, and with modern tools such as OpenBSD's PF packet filter, you can set up rules that use state tracking options to intercept.
The phenomenon is common enough that the bruteforce avoidance section is one of the more popular parts of my online PF tutorial (and of course, a slightly expanded version is available in The Book of PF). I wouldn't publish or recommend anything that I haven't at least tried myself, so just to illustrate,

[Fri Apr 06 14:48:21] peter@skapet:~$ sudo grep bruteforce /etc/pf.conf
table <bruteforce> persist counters
block log (all) quick from <bruteforce>
pass log (all) proto { tcp, udp } to port ssh keep state (max-src-conn 15, max-src-conn-rate 7/4, overload <bruteforce>

The PF rules on BSDly.net's gateway have something much like the published example. This means that a traditional bruteforce attempt will end up something like this:
[Fri Apr 06 15:30:38] peter@skapet:~$ grep 203.34.37.62 /var/log/authlog
Apr 5 17:42:36 skapet sshd[32722]: Failed password for root from 203.34.37.62 port 44936 ssh2
Apr 5 17:42:36 skapet sshd[32722]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]
Apr 5 17:42:38 skapet sshd[26527]: Failed password for root from 203.34.37.62 port 45679 ssh2
Apr 5 17:42:38 skapet sshd[26527]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]
Apr 5 17:42:41 skapet sshd[29912]: Invalid user db2inst1 from 203.34.37.62
Apr 5 17:42:41 skapet sshd[29912]: Failed password for invalid user db2inst1 from 203.34.37.62 port 46283 ssh2
Apr 5 17:42:41 skapet sshd[29912]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]
Apr 5 17:42:43 skapet sshd[30349]: Failed password for root from 203.34.37.62 port 46898 ssh2
Apr 5 17:42:43 skapet sshd[30349]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]
Apr 5 17:42:46 skapet sshd[25557]: Invalid user prueba from 203.34.37.62
Apr 5 17:42:46 skapet sshd[25557]: Failed password for invalid user prueba from 203.34.37.62 port 47495 ssh2
Apr 5 17:42:46 skapet sshd[25557]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]
Apr 5 17:42:48 skapet sshd[5380]: Failed password for bin from 203.34.37.62 port 48087 ssh2
Apr 5 17:42:48 skapet sshd[5380]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]
Apr 5 17:42:51 skapet sshd[23635]: Invalid user postgres from 203.34.37.62
Apr 5 17:42:51 skapet sshd[23635]: Failed password for invalid user postgres from 203.34.37.62 port 48658 ssh2
Apr 5 17:42:51 skapet sshd[23635]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]
Apr 5 17:42:54 skapet sshd[2450]: Failed password for root from 203.34.37.62 port 49307 ssh2
Apr 5 17:42:54 skapet sshd[2450]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]
Apr 5 17:42:56 skapet sshd[16673]: Failed password for root from 203.34.37.62 port 49910 ssh2
Apr 5 17:42:57 skapet sshd[16673]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]
Apr 5 17:42:59 skapet sshd[17522]: Failed password for root from 203.34.37.62 port 50503 ssh2
Apr 5 17:42:59 skapet sshd[17522]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]
Apr 5 17:43:02 skapet sshd[4633]: Invalid user mythtv from 203.34.37.62
Apr 5 17:43:02 skapet sshd[4633]: Failed password for invalid user mythtv from 203.34.37.62 port 51218 ssh2
Apr 5 17:43:02 skapet sshd[4633]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]
Apr 5 17:43:05 skapet sshd[25728]: Failed password for root from 203.34.37.62 port 51849 ssh2
Apr 5 17:43:05 skapet sshd[25728]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]
Apr 5 17:43:08 skapet sshd[10487]: Failed password for root from 203.34.37.62 port 52565 ssh2
Apr 5 17:43:08 skapet sshd[10487]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]
Apr 5 17:43:10 skapet sshd[31156]: Failed password for root from 203.34.37.62 port 53264 ssh2
Apr 5 17:43:11 skapet sshd[31156]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]
Apr 5 17:43:13 skapet sshd[31956]: Invalid user mmroot from 203.34.37.62
Apr 5 17:43:13 skapet sshd[31956]: Failed password for invalid user mmroot from 203.34.37.62 port 53958 ssh2
Apr 5 17:43:13 skapet sshd[31956]: Received disconnect from 203.34.37.62: 11: Bye Bye [preauth]

And looking up the current contents of the table shows our new perpetrator has indeed been caught:

[Fri Apr 06 15:34:23] peter@skapet:~$ sudo pfctl -t bruteforce -vT show
91.197.131.24
 Cleared:            Thu Apr  5 20:22:29 2012
 In/Block:           [ Packets: 1                  Bytes: 52                 ]
 In/Pass:            [ Packets: 0                  Bytes: 0                  ]
 Out/Block:          [ Packets: 0                  Bytes: 0                  ]
 Out/Pass:           [ Packets: 0                  Bytes: 0                  ]
200.11.174.131
 Cleared:            Thu Apr  5 19:09:30 2012
 In/Block:           [ Packets: 1                  Bytes: 52                 ]
 In/Pass:            [ Packets: 0                  Bytes: 0                  ]
 Out/Block:          [ Packets: 0                  Bytes: 0                  ]
 Out/Pass:           [ Packets: 0                  Bytes: 0                  ]
203.34.37.62
 Cleared:            Thu Apr  5 17:43:13 2012
 In/Block:           [ Packets: 1                  Bytes: 52                 ]
 In/Pass:            [ Packets: 0                  Bytes: 0                  ]
 Out/Block:          [ Packets: 0                  Bytes: 0                  ]
 Out/Pass:           [ Packets: 0                  Bytes: 0                  ]

The table data show us one more thing worth noting: All of these bruteforcers sent exactly one packet after they were blocked, and gave up right away when they noticed they were blocked.

On Sunday, April 1st 2012, I noticed an unusually high number of ssh login attempts coming from two Chinese addresses (58.214.5.51 and 61.160.76.123), amazingly persistent, and for some reason they had not been caught by my bruteforce avoidance rules. Thinking I'd just adjust my rate settings later, I added those addresses to the table by hand and started looking at the authentication log versus my rule set. Then a little while later, I noticed that instead of just bowing out after blocking, these two kept going. (I also tweeted about this at the time, although not accurately in all details.)
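
One plausible reason a slow attacker slips past a rate-based rule can be shown with a toy model. The Python sketch below treats a limit like max-src-conn-rate 7/4 as "more than 7 new connections inside any 4 second window". That sliding window is a simplifying assumption made for this illustration, not a description of how PF actually counts, and the function name first_overload is made up.

```python
from collections import deque

# Toy model of a connection rate limit such as "7/4": flag a source once
# more than max_conn new connections fall inside a window of `window`
# seconds. This is a simplified sliding window, NOT PF's real algorithm.

def first_overload(timestamps, max_conn=7, window=4.0):
    """Return the time of the connection that trips the limit, or None."""
    recent = deque()
    for t in sorted(timestamps):
        recent.append(t)
        # Forget connections that are a full window (or more) old.
        while t - recent[0] >= window:
            recent.popleft()
        if len(recent) > max_conn:
            return t
    return None


if __name__ == "__main__":
    burst = [i * 0.25 for i in range(20)]      # four attempts per second
    patient = [i * 10.0 for i in range(1000)]  # one attempt per ten seconds
    print(first_overload(burst))
    print(first_overload(patient))
```

In this model the burst trips the limit on its eighth connection, while a host that waits ten seconds between attempts never has more than one connection in the window, no matter how long it keeps going.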

A little later that same evening, the table looked like this:
[Sun Apr 01 22:58:02] peter@skapet:~$ sudo pfctl -t bruteforce -vT show
58.51.95.75
 Cleared:            Sun Apr  1 22:05:29 2012
 In/Block:           [ Packets: 1                  Bytes: 52                 ]
 In/Pass:            [ Packets: 0                  Bytes: 0                  ]
 Out/Block:          [ Packets: 0                  Bytes: 0                  ]
 Out/Pass:           [ Packets: 0                  Bytes: 0                  ]
58.214.5.51
 Cleared:            Sun Apr  1 14:06:21 2012
 In/Block:           [ Packets: 3324               Bytes: 199440             ]
 In/Pass:            [ Packets: 0                  Bytes: 0                  ]
 Out/Block:          [ Packets: 0                  Bytes: 0                  ]
 Out/Pass:           [ Packets: 0                  Bytes: 0                  ]
61.91.125.115
 Cleared:            Sun Apr  1 03:10:05 2012
 In/Block:           [ Packets: 1                  Bytes: 52                 ]
 In/Pass:            [ Packets: 0                  Bytes: 0                  ]
 Out/Block:          [ Packets: 0                  Bytes: 0                  ]
 Out/Pass:           [ Packets: 0                  Bytes: 0                  ]
61.160.76.123
 Cleared:            Sun Apr  1 14:07:08 2012
 In/Block:           [ Packets: 3262               Bytes: 195720             ]
 In/Pass:            [ Packets: 0                  Bytes: 0                  ]
 Out/Block:          [ Packets: 0                  Bytes: 0                  ]
 Out/Pass:           [ Packets: 0                  Bytes: 0                  ]

The two hosts kept coming, at a rate of roughly one attempt every ten seconds, and apparently ignored the fact that they were blocked in the packet filter rules and would be getting connection refused errors for each attempt.

Looking at the log data (preserved here along with data from various other attempts from other sources in the relevant period), both hosts were busy trying to guess root's password from the time they started until they were blocked.

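
The "roughly one attempt every ten seconds" figure can be checked against the table counters above. The Python sketch below does the arithmetic under two assumptions that are not stated in the original: that each blocked packet corresponds to one connection attempt, and that the timestamp in the shell prompt (22:58:02) is close enough to the time the table was read. The function name is made up for the illustration.

```python
from datetime import datetime

# Sketch: average spacing of blocked packets for one table entry.
# Assumptions: one blocked packet per connection attempt, and the shell
# prompt timestamp approximates the time the counters were read.

def seconds_per_packet(cleared, observed, packets):
    """Average number of seconds between blocked packets."""
    return (observed - cleared).total_seconds() / packets


if __name__ == "__main__":
    observed = datetime(2012, 4, 1, 22, 58, 2)
    entries = {
        "58.214.5.51": (datetime(2012, 4, 1, 14, 6, 21), 3324),
        "61.160.76.123": (datetime(2012, 4, 1, 14, 7, 8), 3262),
    }
    for host, (cleared, packets) in entries.items():
        interval = seconds_per_packet(cleared, observed, packets)
        print(f"{host}: one packet every {interval:.1f} seconds")
```

Both entries work out to a little under ten seconds per packet, which agrees with the rate quoted in the text.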
When the block expired after 24 hours, they had both apparently proceeded down similar lists of user names and were busy with rooter:

Apr 2 14:10:06 skapet sshd[13332]: Invalid user rooter from 61.160.76.123
Apr 2 14:10:06 skapet sshd[13332]: input_userauth_request: invalid user rooter [preauth]
Apr 2 14:10:06 skapet sshd[13332]: Failed password for invalid user rooter from 61.160.76.123 port 46578 ssh2
Apr 2 14:10:06 skapet sshd[13332]: Received disconnect from 61.160.76.123: 11: Bye Bye [preauth]
Apr 2 14:10:14 skapet sshd[30888]: Invalid user rooter from 58.214.5.51
Apr 2 14:10:14 skapet sshd[30888]: input_userauth_request: invalid user rooter [preauth]
Apr 2 14:10:14 skapet sshd[30888]: Failed password for invalid user rooter from 58.214.5.51 port 47587 ssh2
Apr 2 14:10:14 skapet sshd[30888]: Received disconnect from 58.214.5.51: 11: Bye Bye [preauth]

They both kept going afterwards, at roughly the same rates as before. The host at 61.160.76.123 kept varying its rate and at one point sped up enough that it triggered the automatic bruteforce blocking.

After running a fairly familiar alphabetic progression through a list of supposed user names, the remaining host finally gave up during the first hour of April 3rd, by CEST time:

Apr 3 00:36:24 skapet sshd[30287]: Received disconnect from 58.214.5.51: 11: Bye Bye [preauth]
Apr 3 00:36:33 skapet sshd[27318]: Invalid user clodia from 58.214.5.51
Apr 3 00:36:33 skapet sshd[27318]: input_userauth_request: invalid user clodia [preauth]
Apr 3 00:36:33 skapet sshd[27318]: Failed password for invalid user clodia from 58.214.5.51 port 58185 ssh2
Apr 3 00:36:33 skapet sshd[27318]: Received disconnect from 58.214.5.51: 11: Bye Bye [preauth]

Before we go into further details, I have a question for you, dear reader: Did anything like this turn up in your authentication logs during the same rough time frame? If your logs show something similar, please drop me a line at (lightly obfuscated) peter at bsdly dot se.

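
If you want to compare your own logs, counting attempts, source hosts and attempted user names is a short scripting exercise. The Python sketch below is one way to do it, not the script used for this article: it keys on the "Failed password for ..." lines in the sshd log format shown in the excerpts above, the sample lines are copied from those excerpts, and the names tally and FAILED are made up for the illustration.

```python
import re

# Sketch: tally attempts, source hosts and user names from sshd log
# lines in the format quoted in the article. One "Failed password" line
# is counted as one attempt. The names here are hypothetical.

FAILED = re.compile(
    r"Failed password for (?:invalid user )?(\S+) from (\S+) port \d+"
)

SAMPLE = """\
Apr 2 14:10:06 skapet sshd[13332]: Invalid user rooter from 61.160.76.123
Apr 2 14:10:06 skapet sshd[13332]: Failed password for invalid user rooter from 61.160.76.123 port 46578 ssh2
Apr 2 14:10:14 skapet sshd[30888]: Failed password for invalid user rooter from 58.214.5.51 port 47587 ssh2
Apr 3 00:36:33 skapet sshd[27318]: Failed password for invalid user clodia from 58.214.5.51 port 58185 ssh2
Apr 5 17:42:36 skapet sshd[32722]: Failed password for root from 203.34.37.62 port 44936 ssh2
"""


def tally(log_text):
    """Return (number of attempts, set of hosts, set of user names)."""
    attempts, hosts, users = 0, set(), set()
    for line in log_text.splitlines():
        match = FAILED.search(line)
        if match:
            attempts += 1
            users.add(match.group(1))
            hosts.add(match.group(2))
    return attempts, hosts, users


if __name__ == "__main__":
    attempts, hosts, users = tally(SAMPLE)
    print(attempts, len(hosts), len(users))
```

Collecting hosts and user names into sets is what keeps the distinct counts honest; it is worth double checking any such script against a hand count on a small sample before trusting the totals.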
It could be instructive to compare this last batch with the previous samples. The log format differs slightly, since the previous attempts were aimed at FreeBSD machines, while this last round was aimed at a single OpenBSD host.

The whois information for the two hosts (58.214.5.51 and 61.160.76.123) points to Chinese networks, as far as I can tell in the same province and possibly in the same city, Wuxi, which appears to be one of several Chinese tech cities.

The slow rate of the login attempts and the sequence of user names attempted are both similar enough to the earlier distributed attempts that it's possible this is a limited experiment by the developers of the previous bruteforcing malware. The rate of roughly one attempt per host per 10 seconds is a significant speedup compared to the previous attempts, and it fits in the interval where blocking due to the rate of connections would most likely produce an unacceptably high number of false positives.

It will be interesting to see what rate of incoming connection the next full scale attempts will be using. It is possible that the source addresses are somewhere close to the actual whereabouts of the malware developers, but at this point it's pure speculation.

At this point we can only keep watching our logs and make sure that our sshd configurations are in the best possible shape. If you need up to date advice on how to configure and use SSH safely, you could do significantly worse than grabbing Michael W. Lucas' recent SSH book SSH Mastery.

Update 2013-04-25: Revisiting the data in preparation for my BSDCan 2013 talk (also to be featured or rather previewed at tonight's BLUG meeting), I realized that a trivial scripting error had led me to draw false conclusions. The total number of attempts is correct, but both the number of hosts involved and the number of user names attempted were seriously off.
The two hosts I mentioned in the article were the most active, but actually a total of 23 hosts participated, trying for a total of 1081 user names. Full data available here. It seems the Hail Mary Cloud had shrunk, but not completely vanished as I thought at the time.

a.k.a. cb400f
Stallman speaks in Copenhagen (April 26, 2013, 13:36 UTC)
This week the founder of the GNU project and the Free Software Foundation, Richard Stallman (RMS), gave two presentations at the Technical University of Denmark. The events were organized by KLID.

On Wednesday the topic was Copyright vs. Community: the history of copyright, how it is being extended and (mis)used, and how Richard Stallman proposes to reform copyright.

On Thursday night the topic was A Free Digital Society, covering a wide range of topics including privacy, censorship, electronic voting, software freedom, DRM and streaming services, software patents, services as a software substitution and also the EU Unitary Patent.

Here is Stallman auctioning off a plush GNU at the end of the talk. My apologies for the poor photo.

As always it was a very interesting and entertaining experience. Video recordings were made and should be up on the KLID website before too long. If you can’t wait, you can download other audio and video recordings of Stallman’s speeches here: http://audio-video.gnu.org/

Stallman also gave away stickers and sold various other items for the support of the FSF.

April 21, 2013
a.k.a. pto
Haven't you learned to debug code? (April 21, 2013, 19:53 UTC)
In higher education you learn to develop code (to program), but it has struck me that you do not learn the art of debugging code during your education - that is just crazy. One of my t-shirts carries the text "Half of programming is coding. The other 90% is debugging". Not entirely off...

April 20, 2013
a.k.a.
jlouis
Acme as an editor (April 20, 2013, 11:38 UTC)

On using Acme as a day-to-day text editor

I've been using the Acme text editor from Plan9Port as my standard text editor for about 9 months now. Before that, I have used Emacs and Vim quite a lot. I never really got the hang of either Sublime Text or TextMate. The latter because I couldn't run it on all operating systems, the former because it was too new to bother with.

With Acme, you sacrifice almost everything. There is no configuration file. This is a plus. I have spent way too much time messing with configuration, where I should have been messing with adaptation. The acme defaults are designed to be sensible and to be easy to work with. The standard font choice works well, and even though it is not antialiased by default, I tend to like the font nonetheless.

Other sacrifices are syntax highlighting, automatic indentation, specific language support and so on. But in return the editor always works, and there are no upgrades to bother you while you are working. The editor is built to be a window manager of sorts and you use it as a hub to connect other software together. This hub becomes the main focus of everything you are doing.

What pleases me about the acme editor is that it is simple. You can learn most things it can do in a week, and then the power stems from combining those simple things. It is very much a Zen-style editor with few things going on. You will have to like that choices have been made for you and that you have to adapt to those. But with this editor I spend much more time working on code bases than trying to get my editor to behave.

Setting up acme

• Grab Plan 9 from User Space and ./INSTALL it somewhere. This gives you the basic environment, but it does require some help to get running.
• Grab Plan 9 setup, which is my collection of small shell-script tools for manipulating files.
• I have some changes to $HOME/.profile:
export BROWSER='chromium'
PLAN9=/home/jlouis/P/plan9
PATH=$PATH:$PLAN9/bin

# Plumb files instead of starting new editor.
EDITOR=E
unset FCEDIT VISUAL

# Get rid of backspace characters in Unix man output.
PAGER=nobs

# Default font for Plan 9 programs.
font=$PLAN9/font/lucsans/euro.8.font

# Equivalent variables for rc(1).
home=$HOME
prompt="$H=; "
user=$USER

export \
BROWSER\
⋯

PLAN9\
⋯
font\
home\
prompt\
user

• Acme is started once through the acme-start.rc script. This also starts the plumber service.
• $HOME/lib/plumbing is linked so I get some additional plumbing rules on top of the default rules, primarily for quick access to github stuff.

On mouse usage

Acme requires a good mouse to be really effective. Find a gaming mouse with good DPI resolution and then proceed to configure it so it has acceleration and sensitivity settings that you like. You shouldn't need to move the mouse too much, yet the movement should be precise. Some mice have microprocessors in them which smooth movement so when you sweep a line, it is easier to stay on the line. It all depends on what mouse you have.

In a modal editor like vim, you are usually either in command mode for cursor movement or in insert mode for entering text. Acme is much the same: either you have a hand on your mouse and are doing commands, or you are inserting text into the buffer at some place.

Note that in a system like acme, you can do a lot of tasks on the mouse alone. You can double-click next to a " character to select the whole string or to select pairs (), [] or {}. Clicking at the start of a line selects the whole line. And so on. Since you can also select the \n character, you can easily move around large textual parts of the code. Cut, copy and paste are also on the mouse alone. So most of the (common) things you do in the vim command mode are done with the mouse.

You do have access to a command language, which comes from the sam(1) editor. Learning how that language works helps a lot. I do a lot of my surgery on files by writing commands that change the contents of a selection.

Is the mouse more efficient than the keyboard? Hell yes! The more complex an editing task is, the more effective the mouse is. And for most other simple things, the speed is about the same as the keyboard movement for me.

Working with acme

The key concept of acme is that you use it as the main entry point for all work you do.
One of my screens is full-screen acme, and it usually runs at least two major windows: scratch and win. The latter is a standard shell so you can open files and operate on commands. I either open files by doing lc and then 3-clicking them or by executing B <filename>. Remember that ^F completes filenames.

The scratch is a file I continually write containing helper commands, small snippets, GH references and so on. I usually have a global one, and one per project I am working on. Usual contents:
• URLs to important web pages. 3-click them and chromium takes you there
• Github issues
• Git branch names enclosed in brackets: [jl/fix-eqc-test/37]
• Notes of importance, thoughts.
• Complex commands with notes on their usage so they can be copied in and used quickly
If I am working on a branch, there are usually commands helpful to that branch in the scratch buffer. If there is a gg foo command you can just Snarf it and then use the Send command in the shell window to fire it off in the source code. I usually keep done things in the scratch buffer as well for documentation. Every 2-3 months I then move it to a file scratch.$(date +%Y%m%d) to remember that.
I make heavy use of the fact that acme has a neat way to enter unicode directly, so there is a lot of correct punctuation and a lot of things you won't usually see in ASCII. The editor uses a variable width font by default, which is really good when reading and scratching stuff down. I do use the Font command when I need a fixed-width font, though.
Acme is a visual-spatial editor environment. By default, it doesn't hide information from you. At work, it is not uncommon that I have over 50 open buffers on a big 27 inch Mac display. You can do that with acme easily and since you have spatiality, you can also remember where you put a window and get back to it easily. I usually run 4 columns:
• One with documentation and scratch pads.
• One which contains the main code I am working on right now.
• One with shells, erlang shells and code mixed.
• A narrow strip containing directory output to quickly get at a specific file. This strip also holds windows with error output.
On smaller screens, I run a 2 column setup which is the default one.

Working with Erlang in acme

Most of what I do is Erlang. I sometimes work in different languages, but the operation is roughly the same.
• A shell is used to run rebar compile. I add [rebar compile] to the tag and then click at the right of [ to select it. A 2-click now recompiles. Other typical things in the tag could be make or make test and so on.
• The gg command is a shorthand for git grep -n. Need a specific thing? I gg it and then the filename comes up with a line number. 3-clicking it is understood by the plumber to open that file on that line.
• I tend to avoid using tools which can follow code paths, mainly because if you need such a tool, then chances are that the code itself is quite convoluted and nasty.
• I have a window which runs an erlang console for the project I am working on. I often dynamically load code into the erlang node and test it out. It is rare that I reboot the node unless I am doing startup-specific coding.
• Documentation: Edit , <erl -man lists in a dummy window for the purpose
• I often search for code. :/^keyfind will search for keyfind, but at the start of the line. I keep such a line around in the tag for searches.
• The Edit , d command clears a window by selecting all contents and then deleting it.
• I often utilize the shell commands: <date +%Y-%m-%d inserts the current date into the buffer for instance. Selecting text and sending it through |sort will sort export lists and atom tables.
• Each project is written to a dump file with Dump acme.projectname. This way, you can easily get back with Load acme.projectname which restores your current window layout and more.
• I use the shell a lot when I write code. In practice I see the UNIX system as the IDE and then I use acme to access that IDE. It works wonders.
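
The gg workflow above relies on git grep -n printing each match as path:line:text, which is the shape the plumber picks up on a 3-click. As a rough illustration only (this is not part of acme or the plumber, and the file name in the sample is hypothetical), here is a small Python sketch that splits such an output line into its parts:

```python
import re

# git grep -n prints one match per line as:  path:line:matched text
GREP_LINE = re.compile(r"^(?P<path>[^:]+):(?P<line>\d+):(?P<text>.*)$")

def parse_match(output_line):
    """Split one 'git grep -n' output line into (path, line number, text)."""
    match = GREP_LINE.match(output_line)
    if match is None:
        return None  # not a path:line:text line
    return match.group("path"), int(match.group("line")), match.group("text")

def file_address(output_line):
    """Return the 'path:line' part, the piece you would 3-click in acme."""
    parsed = parse_match(output_line)
    if parsed is None:
        return None
    path, line, _ = parsed
    return f"{path}:{line}"

# Hypothetical output line; the file name is made up for illustration.
example = "src/peer_table.erl:42:    lists:keyfind(Id, 1, Peers),"
print(parse_match(example))
print(file_address(example))
```

The matched text itself may contain colons (as Erlang calls do), which is why only the first two fields are split off.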
April 18, 2013
Introducing the Education Kit (April 18, 2013, 01:48 UTC)
Since I first visited my wife's home town in San Luis Potosi, Mexico, I have been contemplating how I could apply some of the knowledge gained through years of work as a software engineer to do good in a region that is very rich in natural resources - but starting to lose the battle against a declining economy, crime and most of all - brain drain.

In 2010, I started working with a technical high-school there on solutions that would boost the offering of educational content - and at the same time, introduce students to software development, empowering the teachers and students to do more, while not creating unnecessary overhead.

After a great deal of iterations on different ideas and designs, we settled on a model that would give the best benefit to the students at a very low cost, utilizing technology already present:  Nokia S40 and other devices running J2ME apps.

Some years have passed and today we have a range of very affordable hardware available that can do much more than run simple J2ME apps.

The Education Kit

For the last half year, I have been working on creating a brand new experience, building on the same basic concept done in collaboration with the technical school in Mexico.

Some of the schools outside the larger cities in the areas of Mexico I have visited don't have much more than an outside basketball court and a few connected rooms with chairs and a TV inside.  NOTE:  Of course this is worst case - but it's nice to know that the worst case is covered by the design.

The whole thing will be based around HTML5/WebApps, and the primary target platform consists of a Raspberry Pi running a software stack with node.js on the server side and a customized Qt5/QML2 WebKit2 WebApps centric browser on the client side.  The Raspberry Pi may not be the most powerful board available, but it's quite affordable and connects to old and new TVs out of the box.

Why not use <insert your favorite technology here>?

The reasons for using WebApps as the technology for the exercises, and not e.g. python or .net, are the following:

• It is becoming the industry standard for apps (phones, tablets, desktops, ...).  Even Microsoft is going in that direction.
• It's fairly easy to get started with HTML and JavaScript
• Many local technical schools typically offer courses in... Delphi, Visual FoxPro, Visual Basic, no C/C++ but HTML and JavaScript! - mainly for web design, but still:  The base knowledge is present.

So - what's the scope?

Everything is designed to live up to the following requirements:

• It must be VERY easy to use for both teachers and students.
• All WebApps exercises must strive to be directly supplementing the existing curricula
• Every exercise must be possible to complete within a standard session (~ 45 mins)
• Every exercise must have some stretch goal that involves injecting code (JavaScript) directly into the WebApp (to introduce software development)
One example is a color mixing app that is part of the standard official apps, where the normal exercise tasks involve matching colors using subtractive and additive color systems - and the stretch goal is:  "make the application write how many percent of each primary color is in the mix on screen".

Working outside the box

As one of the goals of the project is to function as a catalyst for the young bright minds to come up with new software/hardware solutions with local businesses, an addition to the minimal setup will also include an Arduino based board and libraries for the WebApps to easily be able to interact with connected motors and sensors.  This will provide a foundation for industrial prototypes that will otherwise be hard to access in these regions.  Where it makes sense, exercises will be able to interact with connected hardware, e.g. a color sensor would make a lot of sense to use in connection with a color mixer exercise.

Additionally, because of the split of the server and client, it will be possible for schools with LAN connectivity to hook up the device and offer several students to work on exercises from the same box at the same time.  Another possibility could be to have a WiFi dongle attached to the RaspberryPi, providing access to exercises to students with WiFi enabled devices in the classroom.

Where to go from here?

Well - the next milestone is a talk at OpenSourceDays 2013 in Denmark.

In parallel, I am working hard on getting the system ready, as well as looking for additional help, potential sponsors, publishers for content and schools for pilots.

If you want to support this in any way, leave a comment and I will get back to you!

April 16, 2013
When you publicly assert that somebody sent spam, you need to ensure that your data is accurate. Your process needs to be simple and verifiable, and to compensate for any errors, you want your process to be transparent to the public with clear points of contact and line of responsibility. Here are some pointers from the operator of the bsdly.net greytrap-based blacklist.

Regular readers will be aware that bsdly.net started sharing our locally generated blacklist of known spam senders back in July 2007, and that we've offered hourly updates for free since then.

The mechanics of maintaining a list boil down to a few simple steps, as described in the original article and the various web pages it references as well as several followups, but probably the most informative recipe for how it's all done was this one, written in May 2012 in response to (as usual) a heated exchange on openbsd-misc.

As I've explained in earlier articles, once the basic spamd(8) setup is in place, maintaining the blacklist starts with defining your list of known bad, never to become deliverable addresses in domains you control. It is worth noting that you can run spamd on any OpenBSD computer even if you do not run a real mail service (several of my correspondents do, and do evil things like crank up the time between response bytes to 10 seconds for entertainment), but as it happens we have a few real mail servers behind our spamd equipped gateways, so it seemed natural to restrict our pool of trap addresses to the domains that are actually served by our kit here.

Collecting addresses for the spamtraps list started with a totally manual process of fishing out addresses from the mail server logs, grepping for log entries for delivery attempts to non-existent addresses in our domains. Spammers would do (as they still do) Joe jobs on one or more of our domains, making up or generating fake addresses to use as From: or Reply-to: addresses on their spam messages, and messages that for one reason or the other were not deliverable would end up generating bounce messages that our mail service would need to deal with. But a manual process is error prone and we're bound to have missed a few, so not too long after I'd written the script that generates the downloadable blacklist, I had it checking the active greylist for any addresses not already in the pool of known bad addresses.

This is the process that has helped generate the current list of 'imaginary friends', now 24,324 entries long and with a growth of usually a handful per day (but there have been whole days without a single new entry) but up to a few hundred, in rare cases, whenever the script runs. I assume there will be more entries arriving as I write and post this article, but right now the latest entry so far, received 13 Apr 2013 15:10 CEST, was pfpeter@bsdly.net (which mildly suggests that somebody is having a bit of fun with my address and obvious keywords -- if you get the trap address list, you'll see that grep peter@bsdly.net sortlist turns up close to a hundred entries, mostly combinations of well-known keywords and my email address).

You could argue that quickly fishing bounce-to addresses out of the greylist for trapping purposes runs the risk of unfairly penalizing innocent third parties with badly configured mail services, and I must admit that risk exists. However, my experiment of planting my own made-up addresses in the spamtraps list reveals that the list is indeed read and used by spammers, and after all, early sufferers would be blacklisted from here only for 24 hours after their last attempt at bouncing back the worthless stuff to us.

And once an address is in the spamtrap list, attempts at delivering mail to that address turn up in the logs looking something like this:

Apr 14 15:19:22 skapet spamd[1733]: (GREY) 201.215.127.126: <switchbackiwh0@google.com> -> <aramforbess@bsdly.net>
Apr 14 15:19:22 skapet spamd[31358]: Trapping 201.215.127.126 for tuple 201.215.127.126 pc-126-127-215-201.cm.vtr.net <switchbackiwh0@google.com> <aramforbess@bsdly.net>
Apr 14 15:19:22 skapet spamd[31358]: sync_trap 201.215.127.126

The sync_trap line indicates that this spamd is set up to synchronize with a sister site, like I described in the In The Name Of Sane Email... article. When the miscreant returns, it looks something like this:

Apr 14 15:28:01 skapet spamd[30256]: 201.215.127.126: connected (3/3), lists: spamd-greytrap
Apr 14 15:30:15 skapet spamd[30256]: 201.215.127.126: disconnected after 134 seconds. lists: spamd-greytrap

most likely with repeat attempts until the sender gives up.
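
If you keep your logs around for verification, the "Trapping" lines are the ones that document why a host ended up in the list. Here is a minimal Python sketch for pulling the relevant fields out of lines in the format shown above; the function name trapped is my own for illustration, and the pattern assumes the log format is exactly as in the excerpt.

```python
import re

# Matches spamd lines such as:
#   ... spamd[31358]: Trapping 201.215.127.126 for tuple 201.215.127.126
#   pc-126-127-215-201.cm.vtr.net <sender@example> <trap@example>
TRAPPING = re.compile(
    r"spamd\[\d+\]: Trapping (?P<ip>\S+) for tuple \S+ (?P<helo>\S+) "
    r"<(?P<sender>[^>]*)> <(?P<rcpt>[^>]*)>"
)

def trapped(lines):
    """Return one dict per 'Trapping' line: source IP, HELO name, sender, trap address."""
    return [m.groupdict() for m in map(TRAPPING.search, lines) if m]

# Lines copied from the log excerpt above (hypothetical variable name).
log_lines = [
    "Apr 14 15:19:22 skapet spamd[1733]: (GREY) 201.215.127.126: "
    "<switchbackiwh0@google.com> -> <aramforbess@bsdly.net>",
    "Apr 14 15:19:22 skapet spamd[31358]: Trapping 201.215.127.126 for tuple "
    "201.215.127.126 pc-126-127-215-201.cm.vtr.net "
    "<switchbackiwh0@google.com> <aramforbess@bsdly.net>",
    "Apr 14 15:19:22 skapet spamd[31358]: sync_trap 201.215.127.126",
]

for entry in trapped(log_lines):
    print(entry["ip"], "hit trap address", entry["rcpt"])
```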

That's the basic mechanism. Now for the principles. I outlined some of the operating principles in a kind of terms of service statement here, but I'll offer a rehash here with a tiny sprinkling of tweaks I've made to the process in order to make the quality of the data I offer better.

First, as I already pointed out in the introduction, you want your process to be simple and verifiable. Run of the mill spamd greytrapping passes the first test with flying colors; after all, any host that ends up in the blacklist verifiably tried to send mail to a known bad address. Keep your logs around for a while, and you should be in good shape to verify what happened.

You also want your data to be accurate, with each entry representing a host that verifiably sent spam. This means watching out for errors of any kind, including but not limited to finding and removing false positives. The automatic 24-hour expiry that's part of the whole greytrapping experience helps a lot here. Any perpetrator or unlucky victim will be out of harm's or blockage's way within 24 hours of the last undesired action we register from their side. There is no requirement that the system administrator track down a web form and swear on their grandmother's pituitary gland that they have 'cleaned up the system'. We (perhaps naively) assume that anyone we don't hear from is no longer our problem.

However, spamd was designed to be a solution to a relatively simple and limited set of problems. Every day some spam messages will manage to get past the outer defenses and face the content filtering, which in most cases makes the right decision and drops any spam messages that reach it on the floor. And there is a small, but not entirely non-existent body of messages that are spam of some kind that will end up in users' inboxes.

For the case where messages are dropped by the content filtering, I found that it was fairly simple to extract the IP addresses of the last hop before entering our network from the logs generated by the content filtering, and at regular intervals these IP addresses are collected from the mail servers with the content filtering in place, and fed into the local greytrap via spamdb(8). It took more than a few dry runs before I trusted the process, but setting up something similar for your environment should be within any sysadmin's scripting skills. We use spamassassin and clamav here, but you should be able to extract fairly easily the information you need to fit the behavior of your particular combination of software. We also offer our users the option of saving messages in spam and not-spam folders on a network drive to train spamassassin's Bayesian engine, indirectly helping the quality of the generated blacklist via more accurate detection of spam. In addition, a so-minded administrator can even extract IP addresses from any headers the user had a mind to conserve and use spamdb(8) to manually insert offending IP addresses in the local greytrap list.
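
As a rough idea of what the feeding step could look like, here is a Python sketch that takes candidate addresses (however you extracted them from your content filter logs) and builds spamdb -t -a command lines for them. This is not the script we run here; the function name, the own_networks parameter and the decision to skip private and loopback addresses and anything that is not IPv4 are my assumptions for the example. The sketch only builds the argument lists and leaves running them (as root, on the spamd host) to you.

```python
import ipaddress

def spamdb_trap_commands(candidates, own_networks=()):
    """Build 'spamdb -t -a <ip>' argument lists for valid, external IPv4 addresses.

    candidates:   strings that are supposed to be IP addresses
    own_networks: networks (CIDR strings) that must never be trapped (assumption)
    """
    nets = [ipaddress.ip_network(net) for net in own_networks]
    seen = set()
    commands = []
    for raw in candidates:
        try:
            ip = ipaddress.ip_address(raw.strip())
        except ValueError:
            continue  # not an IP address; skip rather than guess
        if ip.version != 4 or ip in seen:
            continue  # this sketch handles IPv4 only and skips duplicates
        if ip.is_private or ip.is_loopback or any(ip in net for net in nets):
            continue  # never trap internal or explicitly protected hosts
        seen.add(ip)
        commands.append(["spamdb", "-t", "-a", str(ip)])
    return commands

# Example input; the address is taken from the log excerpt earlier in this article.
for command in spamdb_trap_commands(["201.215.127.126", "not-an-ip", "10.0.0.5"]):
    print(" ".join(command))
```

Doing a few dry runs that only print the commands, as here, is a cheap way to build trust in the extraction step before anything touches the live greytrap list.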

And finally to compensate for any errors, you want your process to be transparent to the public with clear points of contact and line of responsibility. In other words, make sure that you have people in place who are indeed accessible and responsive when somebody tries to contact you via any of the RFC 2142 required addresses. And post something like this article to somewhere reachable. At bsdly.net and associated domains, it's a distinct advantage that contact attempts happen from hosts not currently in the blacklists, but as far as I am aware any errors in the published list have been dealt with before anybody else noticed, and we have avoided being party to the blocklist vendettas and web forum flame wars that have plagued other blacklist maintainers (it has been suggested that the December 2012 DDOS incident could have been part of somebody's revenge, but we do not have sufficient evidence to point any fingers).

In short, you need to keep things simple, act responsibly and be responsive to anyone contacting you about your (mostly automatically generated) work product.

Good night and good luck.

2013-04-15 update: Clarified that manual spamdb(8) manipulation can be used to insert IP addresses in the blacklist too.

2013-04-16 update: It is also possible to fetch the hourly dump from the NUUG mirror here: http://home.nuug.no/~peter/bsdly.net.traplist. In fact, fetching from there should under most circumstances be faster than getting it from the original location. The file is copied at 15 minutes past the hour, while the generating starts at 10 past the hour.

In addition to the techniques described here, it is useful to know that OpenBSD developer Peter Hessler is working on distributing spamd data via BGP, as described in his AsiaBSDCon 2012 paper. Not part of the base distribution yet, but work continues and could come in useful in addition to the batch import of exported lists like the bsdly.net hourly dump.

If you're interested in setting up your own spamd, your main source of information is included in your OpenBSD (or FreeBSD or NetBSD) installation: the man pages such as the one I refer to here. Recommended secondary sources include my own The Book of PF and the PF tutorial (or here) it grew out of. You can even support the OpenBSD project by buying the book from them at the same time you buy your CD set, see the OpenBSD Orders page for more information.

If you're interested in OpenBSD in general, you have a real treat coming up in the form of Michael W. Lucas' Absolute OpenBSD, 2nd edition, also available from the OpenBSD site, and for a few hours more the auction of the first copy printed is running. Surely you can top USD 1145? With your boss' credit card, perhaps?

Upcoming talks: I'll be speaking at BSDCan 2013, on The Hail Mary Cloud And The Lessons Learned, with a preview planned for the BLUG meeting a couple of weeks before the conference. There will be no PF tutorial at this year's BSDCan, fortunately my staple tutorial item was crowded out by new initiatives from some truly excellent people. But you can lobby other organizers to host one.

Recipes in our field are all too often offered with little or no commentary to help the user understand the underlying principles of how a specific configuration works. To counter the trend and offer some free advice on a common configuration, here is my recipe for a sane mail setup.

Mailing lists can be fun. Most of the time the discussions on lists like openbsd-misc are useful, entertaining or both.

But when your battle with spam fighting technology ends up blocking your source of information and entertainment (like in the case of the recent thread titled "spamd greylisting: false positives" - starting with this message), frustration levels can run high, and in the process it emerged that some readers out there place way too much trust in a certain site offering barely commented recipes (named after a rare chemical compound Cl-Hg-Hg-Cl).

I did pitch in at various points in that thread, but then it turned out that the real problem was a misconfigured secondary MX, and I thought I'd offer my own recipe, in the true spirit of sharing works for me(tm) content. So without further ado, here is

Setting Up OpenBSD's spamd(8) With Secondary MXes In Play in Four Easy Steps

Yes, it really is that simple. The four steps are:
1. Make sure your MXes (both primary and secondary) are able to receive mail for your domains
2. Set up content filtering for all MXes, since some spambots actually speak SMTP
3. Set up spamd in front of all MXes
4. Set up synchronization between your spamds
These are the basic steps. If you want to go even further, you can supplement your greylisting and publicly available blacklists with your own greytrapping, but greytrapping is by no means required.

For steps 1) and 2), please consult the documentation for your MTA of choice and the content filtering options you have available.

If you want an overview article to get you started, you could take a peek at my Effective spam and malware countermeasures paper (originally a BSDCan paper - if you feel the subject could bear reworking into a longer form, please let me know). Once you have made sure that your mail exchangers will accept mail for your domains (checking that secondaries do receive and spool mail when you stop the SMTP service on the primary), it's time to start setting up the content filtering.

At this point you will more likely than not discover that any differences in filtering setups between the hosts that accept and deliver mail will let spam through via the weakest link. Tune accordingly, or at least until you are satisfied that you have a fairly functional configuration.

When you're done, leave top or something similar running on each of the machines doing the filtering and occasionally note the system load numbers.

Before you start on step 3), please take some time to read relevant man pages (pf.conf, spamd, spamd.conf and spamlogd come to mind), or you could take a peek at the relevant parts of the PF FAQ, or my own writings such as The Book of PF, the somewhat shorter Firewalling with PF online tutorial or the most up to date tutorial slides with slightly less text per HTML page.

The book and tutorial both contain material relevant to the FreeBSD version and other versions based on the older syntax too (really only minor tweaks needed). In the following I will refer to the running configuration at the pair of sites that serve as my main lab for these things (and provided quite a bit of the background for The Book of PF and subsequent columns here).

As you will have read by now in the various sources I cited earlier, you need to set up rules to redirect traffic to your spamd as appropriate. Now let's take a peek at what I have running at my primary site's gateway. Grepping for rules that reference smtp should do the trick:

peter@primary $ sudo grep smtp /etc/pf.conf

which yields

pass in log quick on egress proto tcp from <nospamd> to port smtp
pass in log quick on egress proto tcp from <spamd-white> to port smtp
pass in log on egress proto tcp to port smtp rdr-to 127.0.0.1 port spamd queue spamd
pass out log on egress proto tcp to port smtp

Hah. But these rules differ both from the example in the spamd man page and in the other sources! Why? Well, to tell you the truth, the only thing we achieve by doing the quick dance here is to make sure that SMTP traffic from any host that's already in the nospamd or spamd-white tables is never redirected to spamd, while traffic from anywhere else will match the first non-quick rule quoted here and will be redirected.

I do not remember the exact circumstances, but this particular construct was probably the result of a late night debugging session where the root cause of the odd behavior was something else entirely. But anyway, this recipe is offered in a true it works for me spirit, and I can attest that this configuration works.

The queue spamd part shows that this gateway also has an altq(9) based traffic shaping regime in place. The final pass out rule is there to make sure spamlogd records outgoing mail traffic and maintains whitelist entries.

And of course for those rules to load, you need to define the tables before you use them by putting these two lines

table <spamd-white> persist
table <nospamd> persist file "/etc/mail/nospamd"

somewhere early in your /etc/pf.conf file.

Now let's see what the rules on the site with the secondary MX look like. We type:

$ sudo grep smtp /etc/pf.conf

and get

pass in log on egress proto tcp to port smtp rdr-to 127.0.0.1 port spamd
pass log proto tcp from <moby> to port smtp
pass log proto tcp from <spamd-white> to port smtp
pass log proto tcp from $lan to port smtp

which is almost to the letter (barring only an obscure literature reference for one of the table names) the same as the published sample configurations.

Pro tip: Stick as close as possible to the recommended configuration from the spamd(8) man page. The first version here produced some truly odd results on occasion.

Once again the final rule is there to make sure spamlogd records outgoing mail traffic and maintains whitelist entries. The tables, again earlier on in the /etc/pf.conf file, are:

table <spamd-white> persist counters
table <moby> file "/etc/mail/nospamd"

At this point, you have seen how to set up two spamds, each running in front of a mail exchanger. You can choose to run with the default spamd.conf, or you can edit in your own customizations. The next "works for me" item is bsdly.net's very own spamd.conf file, which automatically makes you a user of my greytrapping based blacklist.

Once you have edited the /etc/rc.conf.local files on both machines so the spamd_flags= line no longer contains NO (change to spamd_flags="" for now), you can start spamd by running /usr/libexec/spamd and /usr/libexec/spamlogd, and then running /usr/libexec/spamd-setup manually. Or if you want, reboot the system and look for the spamlogd and spamd startup lines in the /etc/rc output.

The fourth and final required step for a spamd setup with backup mail exchangers is to set up synchronization between the spamds. The synchronization keeps your greylists in sync and transfers information on any greytrapped entries to the partner spamds. As the spamd man page explains, the synchronization options -y and -Y are command line options to spamd.
So let's see what the /etc/rc.conf.local on the primary has in its spamd_flags options line:

peter@primary-gw $ grep spamd /etc/rc.conf.local
spamd_flags="-v -G 2:8:864 -w 1 -y bge0 -Y secondary.com -Y secondary-gw.secondary.com "

Here we see that I've turned up verbose logging (-v), and for some reason I've fiddled with the greylisting parameters (-G). But more significantly, I've also set up this spamd to listen for synchronization messages on the bge0 interface (-y) and to send its own synchronization messages to the hosts designated by the -Y options.
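If you want to double check which partners a given flags line will sync with, a small shell function can pull out the -Y arguments. This is only a sketch: the function name sync_targets is made up for illustration, and it recognises just the options that appear in the flags line quoted above (-v, -G, -w, -y, -Y), not the full spamd(8) option set.

```sh
# Print the sync targets (-Y arguments) found in a spamd_flags string.
# Hypothetical helper; only the options used in this article are recognised.
sync_targets() {
    # Deliberately unquoted so the flag string is split into words.
    set -- $1
    OPTIND=1
    while getopts "vG:w:y:Y:" opt; do
        case "$opt" in
            Y) printf '%s\n' "$OPTARG" ;;
        esac
    done
}

# Usage:
#   sync_targets "-v -G 2:8:864 -w 1 -y bge0 -Y secondary.com"
```

Each target is printed on its own line, so the output is easy to compare between the two gateways.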

On the secondary, the configuration is almost identical. The only difference is the interface name and that the synchronization partner is the primary gateway.

$ sudo grep spamd /etc/rc.conf.local
spamd_flags="-v -G 2:8:864 -w 1 -y xl0 -Y primary-gw.primary.com -Y primary.com"

With these settings in place, you have more or less completed step four of our recipe. But if you want to make sure you get all spamd log messages in a separate log file, add these lines to your /etc/syslog.conf:

# spamd
!!spamd
daemon.err;daemon.warn;daemon.info;daemon.debug /var/log/spamd

After noting the system load on your content filtering machines, restart your spamds. Then watch the system load values on the content filterers and take a note of them from time to time, say every 30 minutes or so.

Step 4 is the last required step for building a multi-MX configuration. You may want to just leave the system running for a while and watch any messages that turn up in the spamd logs or the mail exchanger's logs.

The final embellishment is to set up local greytrapping. The principle is simple: If you have one or more addresses in your domain that you know will never be valid, you add them to your list of trapping addresses with a command such as

spamdb -T -a noreply@mydomain.nx

and any host that tries to deliver mail to noreply@mydomain.nx will be added to the local blacklist spamd-greytrap to be stuttered at for as long as it takes.

Greytrapping can be fun; you can search for posts here tagged with the obvious keywords. To get you started, I offer up my published list of trap addresses, built mainly from logs of unsuccessful delivery attempts here, at The BSDly.net traplist page, while the raw list of trap email addresses is available here. If you want to use that list in a similar manner for your site, please do, only remember to replace the domain names with one or more that you will be receiving mail for.

This is the list that is used to trap the addresses I publish here with a faster mirror here. The list is already in the spamd.conf file I offered you earlier.
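Swapping the domain part on a list of trap addresses is easy to script. The following is a minimal sketch, assuming the raw list has one address per line; the function name trap_commands and the example addresses are made up for illustration. It only prints the spamdb commands, so you can inspect them before running anything.

```sh
# Read trap addresses from stdin, replace the domain part with the one
# given as the first argument, and print the matching spamdb command.
# Hypothetical helper; it prints commands and does not run spamdb itself.
trap_commands() {
    newdomain=$1
    while IFS= read -r addr; do
        case "$addr" in
            ''|'#'*) continue ;;    # skip blank lines and comments
        esac
        # ${addr%@*} strips everything from the last @ onwards.
        printf 'spamdb -T -a %s@%s\n' "${addr%@*}" "$newdomain"
    done
}

# Usage (traplist.txt is a placeholder file name):
#   trap_commands mydomain.nx < traplist.txt
```

Once the output looks right, it can be piped to a root shell to add the addresses for real.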
If you want more background on the BSDly.net list, please see the How I Run This List, Or The Ethics of Running a Greytrapping Based Blacklist page or search this blog on the obvious keywords.

By the way, what happened to the load on those content filtering machines?

Update 2012-05-30: Titles updated to clarify that the main feature here is the spamd(8) spam deferral daemon from the OpenBSD base system, not the identically-named program from the SpamAssassin content filtering suite.

Update 2013-04-16: Added the Pro tip: Stick as close as possible to the recommended configuration from the spamd(8) man page. The first version here produced some truly odd results on occasion.

April 15, 2013
WordPress security and brute force attacks (April 15, 2013, 19:52 UTC)

Right now a lot of people are talking about a botnet that is reportedly busy brute forcing its way into WordPress sites everywhere. The attack is said to come from up to 90,000 different sources. Whether such an attack is actually under way I cannot say, but it has apparently made a great many WordPress users reconsider their current security level, so something good has come of it either way.

I see many people discussing the problem and possible solutions, which again is positive, but many of them do not have a firm grasp of what is good and bad, or why. So here I will try to offer some considerations you should work through before rushing headlong into installing random security plugins.

A number of supporters on the official WordPress support forum have put together a page on the WordPress Codex with tips and tricks for protecting yourself against the attack, so if you want the short version in English it can be found here. I will go into a bit more depth and try to explain the whole thing for Danish readers.

What is actually going on?
For the record, I will start by explaining the actual problem, so that we are all talking about the same things, and to give an idea of the considerations you should make.

Botnets? What?

The attack in question comes from a large botnet. A botnet is a number of computers that, usually without the owner knowing, are used by a third party for purposes other than what the owner intended. In this case the people behind the botnet are reportedly trying to seize control of more computers in order to expand their botnet. They do so by brute forcing administrator access to machines running websites built on the WordPress platform. If they succeed in gaining access to a WordPress site, they make that site part of their botnet, so the attack grows every time they manage to take over a WordPress site.

What about this "brute forcing"?

Brute forcing means gaining access by brute force. The attackers simply try to log in as the admin user by trying more or less random passwords; in theory you can guess any password given unlimited attempts and enough patience. Since the attack is automated it costs the people behind it no time, so they have plenty of patience.

But what can I do about it?

First of all, take your usual precautions. ALWAYS keep WordPress fully up to date. The same goes for all installed plugins and themes. This is not specific to the current attack; all software is insecure in one way or another. The people behind WordPress work hard to close every security hole that is found as quickly as possible, so make sure you are always up to date. Also deactivate every plugin you do not use. As mentioned, no software is 100% secure, so a rule of thumb is that the less software you have, the better.

As for the current attack, there are of course some extra things you can do.

Back up everything!

A backup does not, of course, protect you against attacks from outside. But should you be unlucky enough either to fall victim to the attack or to break your site while trying to secure it, having an up-to-date backup is a good deal nicer than starting from scratch. So back up both the site itself (via FTP) and the database (for example via phpMyAdmin), preferably right now, before you try to improve your security. The backup process itself depends on where your website is hosted, so I cannot be of much help there, but as usual there are a number of suggestions on the WordPress Codex.

Rename your admin user

The ongoing attack apparently exploits the fact that WordPress in the old days automatically created an administrator user with the username admin. From version 3.0 onward it has been possible to choose this username yourself, but not everyone has changed it. Changing this username makes it harder to brute force access to your site, since attackers then have to find both username and password rather than just the password. If you have an administrator user with the username admin, it can be changed fairly easily with a plugin, or with a few manual steps:

1. Log in as your administrator user.
2. In the admin panel go to "Users" -> "Add New".
3. Create a new user and set the user's role to administrator (with a decent password!).
4. Log out.
5. Log in with your newly created administrator user.
6. In the admin panel go to "Users" -> "All Users".
7. Hover over the admin user and click the "Delete" link that appears.

If you find that you cannot delete the admin user, double check that you logged in with your newly created user and are not still logged in as the admin user (yes, we can all make typos).
Use proper passwords

This is probably the most important item on the list, and unfortunately probably also the most tedious to do properly. Always use proper passwords that are not easy to guess. As mentioned, the current attack is a brute force attack in which the attackers try to guess your password, and that normally happens in one of two ways.

One is to start from one end, with "a" as the password, then "b", and once all possibilities are exhausted move on to two letters, "aa". This can obviously take a long time, but if your password is, say, "asdf", it does not take long at all.

Another method is based on a dictionary. Instead of trying all combinations of digits and letters, the attacker sticks to words from a predefined list. These lists often also contain commonly used variants of words in which, for example, letters are replaced by digits. This means that neither "password" nor "p4ssw0rd" is a particularly good password.

A great, great, great deal has been written about how to make your passwords secure. The WordPress Codex also mentions some good precautions to follow, and points to some services that can help generate and remember good passwords. The most important points must be:

• Do not use personal information such as your name, your birthday or similar public information.
• Do not use words from the dictionary.
• Longer passwords are generally better passwords.
• Use a mix of upper and lower case letters, digits and symbols.

But always keep in mind the methods above that are used to guess passwords. 5up3rM4nRul3z43v3r (superman rules forever) is therefore not necessarily a good password, since it is still a combination of words that are bound to appear in any password breaking dictionary.

Good passwords will most often (always?) be hard to remember. One way to make that easier is to use a password manager, a program or service that remembers your passwords for you.
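To put numbers on the exhaustive "a", "b", ... "aa" search described above, here is a rough sketch that counts how many candidates an attacker has to try to cover every password up to a given length. The function name keyspace is made up for illustration, and the count says nothing about how fast a real attacker can guess.

```sh
# Number of candidate passwords of length 1..maxlen over an alphabet of
# the given size, i.e. alphabet + alphabet^2 + ... + alphabet^maxlen.
# Hypothetical helper; large values can exceed the shell's integer range.
keyspace() {
    alphabet=$1
    maxlen=$2
    total=0
    power=1
    i=0
    while [ "$i" -lt "$maxlen" ]; do
        power=$((power * alphabet))
        total=$((total + power))
        i=$((i + 1))
    done
    printf '%s\n' "$total"
}

# Usage:
#   keyspace 26 4    # lower case letters only, up to 4 characters: 475254
```

Fewer than half a million candidates covers every lower case password of up to four letters, "asdf" included, which is why length and a larger alphabet matter so much.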
I use LastPass myself, which means I have one very strong password that I need to remember. This master password then gives access to all my passwords for various websites. It means I can use services to generate random passwords (or at least random enough) without having to remember all of them myself, or write them down on pieces of paper that I might throw away.

Block access to your login page

Since the ongoing attack tries to gain access to the administrator interface of WordPress sites via the normal login method, one option for added protection is to restrict access to the admin panel. The two most common methods are:

• Password protect the admin panel
• Block access to the file at the IP level

Common to both methods is that they can operate at one of two levels: either through WordPress itself, or directly at the server level. Which one you use may seem immaterial at first, but I would like to stress why it makes a difference. WordPress is written in PHP, which among other things is used to create dynamic websites. I have written elsewhere about the difference between static and dynamic websites (read the sections on static and dynamic websites; the rest of the article is not relevant to this topic). It means that whenever security is handled in WordPress, the server has to get all of WordPress up and running just to determine whether a user should have access to a resource. The same security can often be handled at the server level, which means much less work for the server, since a would-be attacker is denied access before WordPress is even started.

Password protect the admin panel

As mentioned, password protection can take place at both the server level and the WordPress level. Password protection at the WordPress level goes through WordPress's own login system; that is the protection we talked about earlier, with changing the admin username and using good passwords.
Password protection can, however, also be moved down to the server level with HTTP authentication. The result is that when you visit your WordPress login page you first meet a popup asking for a username and password, and after that the normal WordPress login page. This is of course a trade-off between security and convenience, as having to log in twice can be a bit annoying. If you choose this method, remember not to use the same username/password combination for both layers of protection. The WordPress Codex naturally also has instructions for setting up HTTP authentication on both Apache and Nginx servers.

Block access at the IP level

Another method is to block access to the WordPress login form based on the requester's IP address. If you know that you have a static IP address and will keep the same address for as long as you have your WordPress site, you can of course configure things so that only that one IP address has access to your login page, but few are that fortunate (and now is not the time for an IPv6 discussion). An alternative that will be more relevant to most people is to allow an IP address only X login attempts before it is blocked. This can be done at the WordPress level with a plugin such as WordFence (thanks to Kasper Bergholt for recommending the plugin, which apparently can also do many other nice things), but again we have the problem that the whole WordPress package has to start up before a decision can be made on whether a user should have access. Doing it at the server level instead is somewhat more involved, and probably not something everyone should just dive into, but a guide to setting up rate limiting on Apache can be found here. If you want to set up rate limiting in .htaccess, however, it requires running Apache with mod_security 2.7.3 or later.
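The "X attempts per IP address" idea can be illustrated with a few lines of shell. This is only a sketch of the counting logic, not how WordFence or mod_security implement it: the function name over_limit is made up, and the input format (one client IP per line, one line per failed login) is an assumption.

```sh
# Read one client IP per line on stdin (one line per failed login) and
# print, sorted, the IPs with more failed attempts than the given limit.
# Hypothetical helper illustrating the counting only.
over_limit() {
    limit=$1
    awk -v limit="$limit" '
        { count[$1]++ }
        END { for (ip in count) if (count[ip] > limit) print ip }
    ' | sort
}

# Usage (failed-logins.txt is a placeholder file name):
#   over_limit 5 < failed-logins.txt
```

The resulting list is what you would feed into whatever blocking mechanism you use at the server level.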
One problem with this approach to the ongoing attack is that, as explained, the attack is carried out by a botnet working from more than 90,000 different IP addresses (with room for expansion if they succeed). This means they can potentially continue the attack from a new IP if one of the addresses is blocked. Still, there must always be some resource cost associated with moving the attack to a new sender, since someone has to keep track of how far each attacker has got in the dictionary. So the more layers of security you can set up, the harder it becomes for the attacker.

Summary

So to sum up, what I think you should do to improve your WordPress security in general, and against the possibly ongoing attack, is:

1. Keep WordPress up to date, including
   • WordPress core
   • All installed plugins
   • All installed themes
2. Take backups. If nothing else, to make it easier to get back on your feet should the worst happen.
3. Do not have an admin user named admin.
4. Use proper passwords. Consider using a password manager.
5. Password protect your login page.
6. Block access for IP addresses that try to guess your password.

If you have other good advice for protecting WordPress, I would of course like to hear about it in the comments below, so we can spread the word about proper security. (Not using WordPress is not accepted as good advice, so if that is your mission, save yourself the trouble and do not write it here.)

April 14, 2013
QR codes - a challenge to you (April 14, 2013, 11:22 UTC)

I have long found QR codes interesting, and was relatively quick to adopt them when they first appeared. On the other hand, I have never really been convinced of their business potential.
Web agencies and QR consultants have long been contacting me to persuade me of their value, and although I am never really shown numbers that could demonstrate that value, I could still be convinced by the right arguments. I just have not been met with them yet, and I am not convinced that the resources needed to succeed with QR codes are worth it. So here is a challenge:

• How can QR codes tie offline and online together, in a creative, innovative way that WORKS?
• How can you market your QR code activity without spending large parts of the marketing budget on it?
• How do you get a target group aged 35-55 to use it effectively?
• And across many countries?

I cannot be bothered with big presentations and complex relationships. If the idea and the concept are not simple enough, they are not good enough either.

April 12, 2013
a.k.a. phk
Public IT battleships (April 12, 2013, 12:01 UTC)
There are two news items in the stream today, both about a lack of imagination. At one end we have NemID, which apparently runs on a discarded 28.8 modem and a 286 PC that nobody could ever imagine would get DoS'ed. On the other we have an unauthenticated upload to airplanes....

a.k.a. brother
Regular expressions from the real world (April 12, 2013, 08:51 UTC)
I write a great many regular expressions and know all the advanced features such as look-around assertions and independent subexpressions. I have a reasonable grasp of the theory behind regular expressions, so I have an idea of how fast they are. In short: I know my regular expressions. When beginners ask how...

April 11, 2013
Do you have People redundancy? (April 11, 2013, 10:52 UTC)
As an Operations consultant, I always see people focus on redundancy on all levels... Except for documentation. For some reason everyone says there should be documentation, but it never ends up being usable documentation. Either it's too much, too little or something else :)

a.k.a. phk
Computer art wanted (April 11, 2013, 10:46 UTC)
We have received an inquiry at datamuseum.dk from Louise Foo. (If the name seems familiar, it may be because Louise's sister, Sharin, makes up half of Raveonettes.) Louise is looking for sources on early Danish computer art, and there we come up a bit short at datamuseum.dk. Together with Ander...

April 10, 2013
a.k.a. brother
Reviewing my own regular expressions (April 10, 2013, 12:59 UTC)
Regular expressions are a wonderful tool, but like many other powerful tools they tend to be misused. If I had a penny for every time I recommended against using regular expressions for parsing HTML fragments …. yeah you know what I mean.

This time made me think a bit about my own use of regular expressions. Do I ever fall into the trap of misusing regular expressions myself? Looking at my current code base I couldn't find any glaring misuses, if I could only get a list of all the regular expressions in my current project. Regular expressions to the rescue? No, that would probably be quite a horrible adventure.

Luckily I have tools to parse Perl: PPI to the rescue:

A few hundred regular expressions. Most of them just match simple substrings (in some cases case insensitively) or are shorthand for testing a handful of equalities in one go. The only zero-width assertion we are using is the word boundary, and quantifiers are mostly used on simple character classes like \d, \s, [0-9a-f] and a few "anything but this one or two characters" ([^>]).

Lessons learned: 1) PPI is cool. 2) I really do use regular expressions as I preach.

April 09, 2013
a.k.a. pto
Forældreintra - please get it done better! (April 09, 2013, 12:45 UTC)
We parents of primary school children probably share "forældreintra" as our common "hate" system. Forældreintra is the schools' shared web service where you get homework from teachers, see communication between teachers and parents, and so on. The idea is great: away with the message book, forgotten notes from teachers and so on. The problem is the way the sy...
April 07, 2013
Keeping your OpenBSD systems in trim is dead easy. Occasional reboots are inevitable, but then that's where our famous redundancy features really shine, right? Read on for notes on upgrading your system.

My upgrades always start with the same couple of commands:

$ cd ~/upgrade
$ ls -l
$ ncftp eu-openbsd
ncftp ...penBSD/snapshots/amd64 > dir

The first two commands of this sequence show me the date of the latest snapshot I downloaded, and the next two give me the date on what is on the OpenBSD European mirror site.

The ncftp bookmark eu-openbsd in this case expands to

ftp://ftp.eu.openbsd.org/pub/OpenBSD/snapshots/amd64/

which is appropriate for a North European like myself with an AMD64 based system on hand. If you're not in Europe, it's likely you're better off with some other choice of mirror. Check the Getting Releases page on the OpenBSD site for a mirror near you. And of course, do pick the correct architecture for your system.

If the files on the mirror server are newer than the ones I have locally, I'll download them with

ncftp ...penBSD/snapshots/amd64 > get *

If there are no updates, well, that means I'll just check back later.
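If you would rather not eyeball the two directory listings, the comparison can be scripted. This is a minimal sketch, assuming you keep the index.txt from your last download and have fetched the mirror's current index.txt to a temporary file, and that index.txt changes whenever the snapshot does. The function name snapshot_changed is made up for illustration.

```sh
# Succeed (exit status 0) if the mirror's index.txt differs from the local
# copy, or if there is no local copy yet; fail if the two are identical.
# Hypothetical helper; both arguments are paths to index.txt files.
snapshot_changed() {
    local_index=$1
    remote_index=$2
    [ -f "$local_index" ] || return 0
    ! cmp -s "$local_index" "$remote_index"
}

# Usage (paths are placeholders):
#   if snapshot_changed ~/upgrade/index.txt /tmp/index.txt; then
#       echo "new snapshot available"
#   fi
```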

And of course, if you do not have the ncftp package installed, this command will work on any OpenBSD box with a full base system installed:

$ ftp -ia ftp://ftp.openbsd.org/pub/OpenBSD/snapshots/`uname -m`/{index.txt,*tgz,bsd*,INS*}

(thanks, Pedro Caetano!)

One of the things that makes doing OpenBSD upgrades so amazingly easy is the sysmerge(8) program. Much inspired by FreeBSD's classic mergemaster(8), this program is, as we shall see by executing the following commands,

$ which sysmerge
/usr/sbin/sysmerge
$ file `which sysmerge`
/usr/sbin/sysmerge: Korn shell script text executable

actually a shell script that makes extensive use of sdiff(1) to highlight the differences between your installed version of a configuration file and the one from the source tree or your install sets, and offers to merge (hence the name) your current customizations into the newer version files and install them on the spot if you like. Do take a peek at what the script does:

$ less /usr/sbin/sysmerge

possibly complemented by looking up the sdiff(1) man page if you want.

Or, if you just want to get on with the upgrading, you can do what I tend to do (in a slight deviation from orthodoxy) and -- making sure to check that your $PWD (present working directory) is the directory where you just downloaded the fresh install sets -- run this command:

$ sudo sysmerge -s etc52.tgz -x xetc52.tgz

which says to sysmerge, take as your source the files in the etc52.tgz and xetc52.tgz sets and perform the merge.

This of course reveals that at the time of writing, the system where I'm doing the merge is already running OpenBSD 5.2-beta snapshots. If you use these notes as a reference for some other version, please do insert the file names as they exist on your system.
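Since the set names simply embed the version number without the dot, they can be derived from a release string. A small sketch follows, assuming the etcNN.tgz/xetcNN.tgz naming used in this article; the function name merge_sets is made up for illustration. Note that uname -r reports the running system's version, which may be one behind the snapshot you just downloaded when a release boundary has been crossed, so check against the actual file names.

```sh
# Print the etc and xetc set file names for a release string such as "5.2".
# Hypothetical helper based on the naming scheme shown in this article.
merge_sets() {
    ver=$(printf '%s' "$1" | tr -d '.')
    printf 'etc%s.tgz xetc%s.tgz\n' "$ver" "$ver"
}

# Usage:
#   merge_sets 5.2             # prints: etc52.tgz xetc52.tgz
#   merge_sets "$(uname -r)"   # names for the running release
```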

If you run regular upgrades like I tend to, with snapshots only days or at most a few weeks apart, the differences sysmerge(8) detects are likely few and trivial. During the last few upgrades of my laptop, for example, the only merge I needed to do was to OK cranking the CVS id and carrying over my choice of smarthost as specified in /etc/mail/sendmail.cf.

If you do longer jumps, such as between releases, you will almost certainly find more differences, and in rare cases the changes in the configuration file baselines will be large enough that you will need to take some time out to edit the files by hand. In those cases, studying the upgrade notes relevant to your versions (and all intermediate ones if you do a long jump -- go to http://www.openbsd.org/faq/ and choose the Upgrade Guide link at the top of the left column) will be well worth your time.

With a successful sysmerge(8) run completed, my next step is to copy the fresh bsd.rd to the root directory of my boot disk:

$ sudo cp bsd.rd /

With the bsd.rd in place, the next step is to reboot your system. Either choose your window manager's reboot option or simply from a shell prompt type

$ sudo reboot

When the boot> prompt appears, type b bsd.rd and press Enter.

The familiar bsd.rd boot messages appear (much like the ones illustrated in my earlier installation piece, The Goodness Of Men And Machinery), and the first question you need to answer is:

Since we're upgrading, we type u (upper or lower case doesn't matter) and press Enter.

The next question up is the choice of keyboard layout:

Choose your keyboard layout ('?' or 'L' for list)

Here I type no for my Norwegian keyboard layout, if you want a different one you can get a list of available choices by typing L instead, and pressing Enter.

Once you've chosen your keyboard layout, the upgrade script prompts you to specify where your root file system is located. I've never seen the upgrader guess wrong, so more likely than not you'll just press Enter here.

The upgrader performs a full file system check on the designated root file system, and proceeds to configure any network interfaces it finds configuration files for in the designated root file system's etc directory. You will see the output of DHCP negotiations or similar.

When the network configuration is done, the upgrader prompts you to decide whether you want to perform a full file system check on the other partitions it finds. The default choice here is no, so if you press Enter here, the upgrade script simply runs a fsck -p on each of the file systems listed in the system's fstab. You'll see the output for each one of these relatively lightweight checks.

Location of sets? (cd disk ftp http or 'done') [cd]

Here, since I already downloaded the install set files to a local directory, the natural choice is disk. The upgrade script has already mounted all partitions from the system's fstab under /mnt/, so the files I downloaded to /home/peter/upgrade are now available under /mnt/home/peter/upgrade, and that's what I type in response to the prompt for where the sets are to be found.

Unless you have prepared a site.tgz for your site there is normally no reason to add or subtract sets from the default list, so pressing Enter will do nicely. The sets are copied and extracted, and when the extraction is done, you confirm that the upgrade is [done], and when the # shell prompt appears, type

# reboot

to let the system boot into the upgraded operating system.

Watch the system boot, log in and then, in case the updated sysmerge(8) program includes some new magic, re-run the sysmerge command using the etc and xetc sets:

$ cd ~/upgrade
$ sudo sysmerge -s etc52.tgz -x xetc52.tgz

That's it! There is a chance that there have been updates to packages you have installed (such as ncftp in my case), so I tend to do a package upgrade pretty soon after rebooting into a freshly upgraded system. The basic update command is pkg_add -u, but I tend to want more verbose output and prompts and generally choose this variant:

