<p><em>Brian's Blog (www.lothar.com/blog): "The Spellserver": A Generic Remote-Code-Execution Host, by Brian Warner, 2018-02-11</em></p>
<p>
A "Spellserver" is my name for a very generic server design: one in which
the server-side code is provided by the <strong>clients</strong> (but somehow
constrained by the server owner). I've been exploring ways to achieve
E-like delegation and attenuation-by-code, in which I can write a
program to enforce my restrictions (rather than pre-programming the
server with all conceivable ones). E provides this quite elegantly, but
depends upon a number of unusual platform details, many of which have
not actually been implemented yet. I'm looking for a way to achieve
similar goals but in a familiar language like Javascript, and invoked
via plain HTTP.
</p>
<h2>Background</h2>
<p>What is E? What's so special about it? Which of its properties am I
trying to achieve in something else?</p>
<p>The <a href="http://www.erights.org/index.html">E programming language</a> was
developed to explore object-capability-based programming patterns. It
has been the testbed for Eventual-Sends, Promises, Promise Pipelining,
Proxies, Facets, Rights-Amplification, Auditors, and other key ideas of
the objcap movement.</p>
<p><a href="https://research.google.com/pubs/pub40673.html">"Dr. SES"</a> is an
ongoing project to bring E's concepts to Javascript, making them
accessible to a larger community. Ten years of TC39 standardization work
have led to ECMAScript being a reasonable target for secure
object-capability-based programming.</p>
<h3>E's Abstract Ideas</h3>
<p>In E, objects contain references to other objects, which can be used to
invoke methods on those targets (and in fact are the <em>only</em> way to
communicate with them). These method invocations include arguments, and
these arguments can include other object references. Each method can
also return data (or a <a href="http://wiki.erights.org/wiki/Promise">Promise</a>
to some data), including object references.</p>
<p>Object references can span multiple computers. The
<a href="http://www.erights.org/elib/concurrency/vat.html">"Vat"</a> abstraction is
a collection of objects which can make "local" (immediate, blocking)
calls to each other. When the target object is in a foreign Vat, any
method calls must use the "eventual send" operator, which returns a
Promise for the result, and guarantees that the target will be unable to
affect any execution until the caller's stack frame has been unwound. In
practice, each Vat runs on a single computer.</p>
<p>The abstract Vat is immortal: it never shuts down, and any object
created in one will be persisted forever (or until all references to it
are lost, at which point its continued existence is indistinguishable
from its destruction).</p>
<p>The Vat basically contains a persistent object store (indexed by some
inter-Vat identifier scheme), a standard execution stack, and a queue of
inbound messages (either generated from remote Vats or from local
eventual-send operations). Each immediate method call pushes a new frame
on the stack, and yields a return value when it finishes. Each
eventual-send pushes a new message on a queue (either local or in some
foreign Vat). Eventual-sends provide a Promise to the caller, and
include a reference to the corresponding Resolver object as a special
argument in the remote message. The Vat knows how to take the return
value from a message invocation and deliver it to the Resolver (as a new
message).</p>
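<p>The queue-and-resolver machinery described above can be sketched in a few
lines of Python. This is my own toy model for illustration, not E's actual
API: a single local queue stands in for both the local and remote cases, and
the <code>Promise</code> here is deliberately minimal.</p>

```python
from collections import deque

class Promise:
    """Toy promise: just records whether and with what it was resolved."""
    def __init__(self):
        self.resolved = False
        self.value = None

    def resolve(self, value):
        self.resolved = True
        self.value = value

class Vat:
    """Toy Vat: an inbound-message queue plus a delivery loop."""
    def __init__(self):
        self.queue = deque()  # pending eventual-send deliveries

    def eventual_send(self, target, method, *args):
        # Instead of invoking immediately, enqueue the delivery and hand
        # the caller a Promise; the Resolver travels with the message.
        promise = Promise()
        self.queue.append((target, method, args, promise))
        return promise

    def run(self):
        # Each delivery runs on an empty stack; the return value from the
        # method invocation is fed to the corresponding Resolver.
        while self.queue:
            target, method, args, promise = self.queue.popleft()
            result = getattr(target, method)(*args)
            promise.resolve(result)
```

Note that the caller gets its <code>Promise</code> back before the target runs
at all, which models the guarantee that an eventual-send cannot affect
execution until the caller's stack has unwound.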
<h3>Realizing E For Real</h3>
<p>E is a
<a href="https://github.com/kpreid/e-on-java.git">fully-functional programming environment</a>,
implemented as a Java-based interpreter, but for various reasons it
hasn't caught on to the extent that Python, Node.js, Go, or Rust have.
Some of E's concepts have been brought into the Javascript world, but
many are not yet available outside of a real E interpreter. Porting
these ideas to a different environment would be pretty useful.</p>
<p>E's programming model imposes a number of requirements on any
implementation. The basic ones are the network protocols that define how
Vats on different computers should talk to each other. E includes
<a href="http://erights.org/elib/distrib/vattp/index.html">VatTP</a> and
<a href="http://erights.org/elib/distrib/captp/index.html">CapTP</a> for this
purpose, but they're somewhat dated (they use Java serialization, MD5,
SHA1, DSA, and RSA).</p>
<p>These protocols are connection-oriented: when a TLS/TCP connection is
severed, object references that cross that connection are "broken", and
both sides can react to the break (by restarting from some persistent
identifier, and rebuilding everything that was lost). The persistent
identifiers are called
<a href="http://wiki.erights.org/wiki/Walnut/Distributed_Computing#Live_Refs.2C_Sturdy_Refs.2C_and_URIs">SturdyRefs</a>,
and are basically an unguessable string, plus enough crypto bits to
identify and reach a specific TLS server.</p>
<p>This leads applications to distinguish between long-lived
externally-referenceable objects (which must live forever, because
there's no good way to learn when the secret string that grants access
has been forgotten by all callers), and ephemeral objects (which can be
called remotely, but the only way to reach them is via messages to one
of the long-lived objects). These long-lived objects are clear
candidates for persistence.</p>
<p>The <a href="http://waterken.net/">Waterken</a> system shares much heritage with E,
but provides a non-connection-oriented approach which imposes other
requirements: try-forever message delivery, and no ephemeral objects
(everything gets a persistent reference string).</p>
<p>The most significant constraint of E's model is orthogonal object
persistence. Vats pretend to hold all objects in RAM, forever, on a
durable and immortal computer, which nominally means everything gets
checkpointed after each message is processed. Application code doesn't
need to specify what data is persistent and what is ephemeral: in fact
objects are generally unaware of being serialized. At each checkpoint,
the Vat walks the table that maps external SturdyRefs to local objects,
then chases down everything which those objects reference, transitively.
The platform must know how to serialize each object, including cycles,
platform objects like file handles, and immutable containers that can
only be reconstructed in a specific order. E can serialize the code that
implements object methods, closures, and whatever other objects were
captured from their parent scope. This is important, since much of E's
encapsulation is based on lexically-scoped closures which capture
(otherwise-unreachable) private state variables.</p>
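<p>The checkpoint-time walk described above is an ordinary transitive-closure
traversal. A sketch, with a hypothetical <code>references_of</code> callback
standing in for whatever the platform's serializer uses to enumerate an
object's outgoing references:</p>

```python
def reachable(roots, references_of):
    """Return the ids of every object reachable from 'roots'.
    'references_of(obj)' must yield the objects that obj references.
    Cycles are handled by visiting each object at most once."""
    seen = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue  # already visited (possibly via a cycle)
        seen.add(id(obj))
        stack.extend(references_of(obj))
    return seen
```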
<p>If we assume that the Vat server stays up most of the time, but is
occasionally restarted, then we might persist only the long-lived
externally-referenceable objects. Ephemeral objects work normally until
the server is stopped, at which point all connections are dropped
anyways. Restarting the Vat is equivalent to breaking all external
connections, throwing out all objects that aren't reachable by one of
the persistent roots, then waiting for external callers to reconnect to
a root.</p>
<h2>Something Like E, But Different</h2>
<p>I want to use something <em>like</em> E, but with a familiar language
(Javascript, specifically Dr. SES), and which can run on a regular
non-immortal computer. I'd like an HTTP-based non-connection-oriented
transport, in which messages can be stored and forwarded from one node
to another safely (which implies signatures and encryption, rather than
transport-based security).</p>
<p>I don't know how to make Javascript do orthogonal persistence, and I'm
not sure it's a good idea anyways (in my experience with Twisted's old
"TAP" files, it was awfully easy to accidentally reference
non-serializable objects, or very large objects, and saving raw object
state interacts badly with attempts to upgrade program code). So I'm
interested in something that does <em>non-orthogonal</em> serialization, where
all persistent state lives in a single "Memory" object, and application
code calls <code>set()</code> and <code>get()</code> to access it.</p>
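<p>A minimal sketch of such a Memory object (illustrative names, not a settled
API): all durable state flows through <code>set()</code> and <code>get()</code>,
so the platform knows exactly what to checkpoint, and nothing else survives a
restart.</p>

```python
import json

class Memory:
    """Single persistent-state object: the only thing that gets saved."""
    def __init__(self, checkpoint=None):
        # restore from a previously-saved checkpoint dict, if given
        self._data = dict(checkpoint or {})

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value

    def checkpoint(self):
        # serialize just this one object; using JSON here also forces
        # application state to stay plain data, avoiding the
        # non-serializable-object trap mentioned above
        return json.dumps(self._data)
```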
<h2>Offline Uninvolved Attenuated Delegation</h2>
<p>The most interesting features of object-capability systems are
<strong>delegation</strong> and <strong>attenuation</strong>, wherein new kinds of authorities can
be created by wrapping and combining existing ones. Security analysis of
a system is much easier when you know an upper bound on the power
available to any given component, and object references make this pretty
easy to reason about.</p>
<p>When Alice holds a reference to some object on Bob's server, she holds
the authority to invoke that object. She can delegate that authority to
Carol by sharing her object reference (which might just be a secret
string). But the more interesting case is where Alice wants to delegate
a <em>subset</em> of that authority. The standard ocap way to attenuate
authority is to create a wrapper object: the wrapper can apply
restrictions on the method being called, or on the arguments passed, or
it can simply throw an exception instead of proxying the message.</p>
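<p>As a tiny concrete example of such a wrapper (my own illustrative sketch,
not from the post): a read-only attenuator that forwards queries to the
underlying object but refuses mutations.</p>

```python
class ReadOnlyWrapper:
    """Attenuator: delegates read authority, withholds write authority."""
    def __init__(self, target):
        self._target = target  # the more-powerful underlying object

    def get(self, key):
        # reads are proxied through unchanged
        return self._target.get(key)

    def set(self, key, value):
        # writes are refused instead of being proxied
        raise PermissionError("write authority was not delegated")
```

Anyone holding only the wrapper can query state but can never modify it,
no matter what code they run against it.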
<p>(I'm being pretty casual about whether "Alice" means the human who runs
the computer on which some object lives, or the anthropomorphized object
itself; hopefully this won't cause confusion.)</p>
<p>E lets you create these wrappers easily, but by default they live on the
local server. In our example, this would result in Carol holding a
reference to some wrapper object on Alice's computer, which must be
involved whenever Carol wants to invoke that object (this impacts
availability, latency, and overall bandwidth cost, as the messages must
traverse both the Carol->Alice and Alice->Bob links, instead of just the
Carol->Bob link).</p>
<p>To remove Alice's involvement, the wrapper must be created on Bob's Vat:
the same Vat as the target object. Creating it on Carol's Vat would be
obviously insecure, since she could bypass it any time she wanted. Bob
already has that power (since he owns the computer where the target
lives), so we don't lose any security by asking Bob to host the wrapper
too. To build it there, either Bob's object must offer an explicit
<code>eval()</code> method, or E's proposed <code>where()</code> feature must be
implemented (a generalized tool for creating an object on the same Vat
as some initial target).</p>
<p>In addition, these E wrappers/proxies must be constructed on Bob
<em>first</em>, so Alice can acquire a reference to them, which she can later
deliver to Carol. Bob must be online to perform attenuated delegation.</p>
<p>I'd like this delegation to be "offline", meaning that Alice can perform
this attenutation without talking to anyone else (Bob in particular),
I'd also like Alice to be uninvolved with the eventual exercise of this
delegated authority. And I'd like the wrapper to not consume persistent
space on Bob's computer: until it is exercised, only Alice should pay
the cost of creating and maintaining the wrapper object.</p>
<h2>What Is A Server, Anyways?</h2>
<p>At a high level of abstraction, all servers are equal. In a zero-latency
infinite-bandwidth universally-trusted Internet (presumably managed by
frictionless uniform-density spherical cows, favorites of physics
students everywhere), it doesn't matter where any object lives. Remote
object references are every bit as good as local ones (eventual-send vs
immediate-call being the only difference). Messages are executed
instantly, regardless of argument size or the complexity of their code.</p>
<p>Once we acknowledge that the speed of light is finite, we start to care
about latency. We'll want to move objects closer to the data they need,
and we discover some tradeoffs. Caching/mirroring moves the data to the
code, which helps if the data is small and the code is large. SQL-style
stored procedures and <a href="http://erights.org/elib/distrib/pipeline.html">Promise
pipelining</a> move the
code to the data, which helps in the opposite situation.</p>
<p>The next stage of descent from the abstractosphere reveals trust issues,
where we don't want to grant too much authority to the server that hosts
a given object. Any reference we give to an object on Alice's Vat will
be available to Alice herself, which we might not like. The subsequent
uncomfortable realization regards confidentiality: any information we
reveal to Alice's objects is also visible to Alice. This prompts the
development of tools which keep certain objects inside a security
perimeter: a set of approved Vats.</p>
<p>But the core proposition remains: if you like the server, and you can
move code to data or vice versa, then all servers are equal. Today's
question is how you get the code there in the first place: how to turn a
bare slab of CPU and storage into a server that's useful for your
particular purpose.</p>
<p>The traditional non-Vat-ish server (e.g. a LAMP stack) is highly
pre-configured. You split most web applications into a client half and a
server half, and write a bunch of code that defines how the server
operates. This server code might be bundled into a container image,
which bottoms out in Linux syscalls and a kernel TCP stack. The admin
must provision and launch a server first, then they can tell clients
where to find it (as well as granting access credentials).</p>
<p>The Vat-ish approach is for the server admin to evaluate a big chunk of
text, which produces a persistent object and a SturdyRef. </p>
<p><strong>The direction I want to explore is this:</strong> why should we need to
pre-configure the server? Since most API calls boil down to an object
identifier, a method name, and some serialized arguments, would it be
possible to build the server on the fly in response to (roughly) the
same amount of inbound message data?</p>
<h2>On-Demand Server Creation</h2>
<p>Imagine a very generic server, with just the following API at the root
URL path:</p>
<div class="highlight"><pre><span></span>@post("/")
def post(body):
    f = eval(body)
    return f()
</pre></div>
<p>(my pseudocode uses a Python/Flask-like syntax, but of course we'd use a
properly-confined language, where <code>eval()</code> doesn't grant the
client-supplied code access to anything else: it can't import a library
to write to the filesystem, or make additional network calls. It can
think furiously, but has neither voice nor hands.)</p>
<p>This isn't very useful yet. Perhaps if our server has a bigger CPU than
the client, this would let them offload some intensive computation. But
each request is isolated and independent.</p>
<p>So let's add some memory to this brain:</p>
<div class="highlight"><pre><span></span>memory = dict()

@post("/")
def post(body):
    f = eval(body)
    return f(memory)
</pre></div>
<p>Now that the client-supplied code can use a basic dictionary/map, clients
could use it to build a basic object store. One call can leave data for
a subsequent one. Entire
<a href="https://en.wikipedia.org/wiki/Redis">companies</a> revolve around
providing tools like this. The set-side code could implement clever
indexing tables, allowing the get-side code to locate data faster.
Arbitrary search criteria could be used. This could provide a database
with server-side map/reduce functions. Clients look like this:</p>
<div class="highlight"><pre><span></span>STORE = '''
def run(memory):
    memory["INDEX"] = "value"
'''
POST(url, body=STORE)
</pre></div>
<p>Does it feel wrong to supply the entire server program each time? Let's
cache it. First, let's just cache the text of the program. We augment
the protocol to make the client normally deliver just the hash, but if
the server doesn't recognize that hash, the client makes a second
request (to a different endpoint) that includes the complete program. If
the client believes that the server doesn't already have a copy, it will
preemptively fill the cache before sending the invocation.</p>
<div class="highlight"><pre><span></span>program_cache = dict()

@post("/cache")
def cache_program(program_hash, program_body):
    if program_hash not in program_cache:
        if not program_body:
            raise NeedProgramBody
        if hash(program_body) != program_hash:
            raise BadHash
        program_cache[program_hash] = program_body

def get_program(program_hash):
    return program_cache[program_hash]

memory = dict()

@post("/")
def http_post(program_hash):
    program = get_program(program_hash)
    f = eval(program)
    return f(memory)
</pre></div>
<p>Next, let's cache the compiled/evaluated program too. E defines the
notion of <a href="http://www.erights.org/elang/kernel/auditors/">Auditors</a>
which let one object ask questions about the code inside another object.
The <code>DeepFrozen</code> auditor asserts that a given object has no references
to any mutable state. Any mutability must be passed into the object on
each call. This makes it safe to cache the callable object.</p>
<p>As Clarke's maxim says, sufficiently advanced technology is
indistinguishable from magic. These programs define an intricate
process, and carry great power when invoked. The hashes are like magic
words which uniquely identify a program, so let's call them a <strong>spell</strong>.</p>
<div class="highlight"><pre><span></span>callable_cache = dict()

def get_callable(spell):
    if spell not in callable_cache:
        program_body = get_program(spell)
        f = eval(program_body)
        if not DeepFrozen(f):
            raise ProgramIsNotDeepFrozen
        callable_cache[spell] = f
    return callable_cache[spell]

@post("/")
def http_post(spell):
    f = get_callable(spell)
    return f(memory)
</pre></div>
<p>In the subsequent examples, I'll omit the details of this cache, but
remain confident that it can work, and with enough engineering, we won't
have to re-deliver the whole program every time.</p>
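<p>For concreteness, the program-to-spell mapping could be any standard
cryptographic hash. Here is one plausible sketch; I'm assuming SHA-256, since
the post doesn't commit to a specific hash function.</p>

```python
import hashlib

def spell_for(program_body: str) -> str:
    """Derive the 'spell' (cache key) for a program: a collision-resistant
    hash of its exact text. Any whitespace change yields a new spell."""
    return hashlib.sha256(program_body.encode("utf-8")).hexdigest()
```

Because the hash covers the exact program text, a spell uniquely (up to
collision resistance) identifies one program, which is what makes the
server-side caches safe to share between callers.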
<h2>Security of Spellcasting</h2>
<p>Obviously, our little server is completely unprotected: anyone on the
internet can guess the IP address and port number, and supply programs
(aka "cast spells"). And callers are not isolated from each other. Every
POST could do anything it wants to the stored data: steal secrets,
corrupt state, or just delete it entirely.</p>
<p>All security on the internet is based upon knowledge of secrets, and
upon various ways of proving that knowledge to others. Bearer
credentials (e.g. API tokens) sound simple, but they require several
things: a TLS stack, a live TCP connection, trust in some particular set
of CA roots, and the discipline to only reveal the token to the right
destination (each secret is scoped to a particular recipient's identity).</p>
<p>Digital signatures are more generic. If we put aside confidentiality for
a moment, signed request messages don't require any notion of server
identity, and they can be forwarded along arbitrary paths before
eventual delivery (making them usable in asynchronous environments, or
between machines with intermittent network connectivity). Using them
safely requires something to prevent replay attacks, but bearer
credentials did too; it's just easy to forget because the obligatory TLS
connection generally did that for free.</p>
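<p>A minimal replay guard might look like the following. This is my own
sketch; the post only notes that <em>something</em> must prevent replay, and in
practice the nonce set would need to be bounded (e.g. paired with timestamps
or sequence numbers) and kept in persistent storage.</p>

```python
class ReplayError(Exception):
    pass

_seen_nonces = set()  # in a real server this would be persisted

def check_fresh(nonce: bytes):
    """Reject any nonce the server has already accepted: each signed
    request carries a fresh random nonce, so a captured message cannot
    be delivered twice."""
    if nonce in _seen_nonces:
        raise ReplayError("this signed request was already delivered")
    _seen_nonces.add(nonce)
```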
<p>Signatures involve four things: a Signer (which holds the private key),
a Verifier (which holds the public key), the Message, and the Signed
Message. Apart from key generation, there are only two operations: the
Signer can turn a Message into a Signed Message, and the Verifier can
turn a Signed Message into a Message (or throw an exception if the
signature was invalid). I learned this API from
<a href="http://ed25519.cr.yp.to/">djb's Ed25519 library</a>, and I like it because
it discourages the use of unverified data: the only way to get the
Message is to go through the Verifier first. There are situations in
which detached signatures make sense, but usually as some sort of
optimization.</p>
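<p>The two-operation shape can be illustrated with a toy stand-in. Note the
loud caveat: this uses HMAC, which is symmetric (both sides hold the same
secret), so it demonstrates only the API shape, not Ed25519's public-key
security model, and the class names here are my own.</p>

```python
import hashlib
import hmac

class BadSignature(Exception):
    pass

class ToySigner:
    def __init__(self, key: bytes):
        self._key = key

    def sign(self, message: bytes) -> bytes:
        # Signed Message = tag || message: the signature is attached,
        # not detached, matching the djb-style API.
        tag = hmac.new(self._key, message, hashlib.sha256).digest()
        return tag + message

class ToyVerifier:
    def __init__(self, key: bytes):
        self._key = key

    def verify(self, signed_message: bytes) -> bytes:
        # The only way to obtain the Message is through this check.
        tag, message = signed_message[:32], signed_message[32:]
        expected = hmac.new(self._key, message, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise BadSignature("signature check failed")
        return message
```

The design point survives the substitution: callers never handle unverified
bytes, because <code>verify()</code> either returns the message or throws.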
<p>For our server, the simplest way to limit access to specific clients is
to generate a keypair, and embed the public verifier key in the server
(either we generate the private signing key and give it to the client,
or we have the client generate the signing key and tell us the
verifier). Only clients that know the corresponding private key will be
able to make calls, and they'll express this knowledge by signing each
request.</p>
<div class="highlight"><pre><span></span>OWNER_VERIFIER = Verifier("4bc6b2faa20a39acf51e7e..")
memory = dict()

@post("/")
def http_post(signed_spell):
    spell = OWNER_VERIFIER.verify(signed_spell)
    f = get_callable(spell)
    return f(memory)
</pre></div>
<p>Now only a client who knows the matching <code>Signer</code> private key can use
our storage and compute resources.</p>
<div class="highlight"><pre><span></span>OWNER_SIGNER = Signer("3d677d3fa31a8c88..")
STORE = '''
def run(memory):
    memory["INDEX"] = "value"
'''
signed_spell = OWNER_SIGNER.sign(STORE)
POST(url, body=signed_spell)
</pre></div>
<h3>Multiple Clients</h3>
<p>What if we want to have two clients, who use separate resources? Alice
has been given free rein on the server, and she'd like to hand out
"accounts" to her friends Bob and Bert (who have distinct keys). The
simplest way would be to give them isolated dictionaries, and use their
signing key to distinguish their requests:</p>
<div class="highlight"><pre><span></span># maps client key to their personal 'memory' dict
clients = {Verifier("70ccf6d1ae2edf896b2c887.."): dict(), # Bob
           Verifier("cd5afa8e4737f4313b0d3fc.."): dict(), # Bert
           }

@post("/")
def http_post(client_key, signed_spell):
    if client_key not in clients:
        raise UnknownClientID
    memory = clients[client_key]
    spell = client_key.verify(signed_spell)
    f = get_callable(spell)
    return f(memory)
</pre></div>
<p>But this has two problems:</p>
<ul>
<li>Inflexible: Adding more clients must be done by modifying the server's
code, which can only be done by the machine's owner.</li>
<li>Isolated: Clients cannot interact with each other, even if they wanted
to.</li>
</ul>
<h3>Chained Programs</h3>
<p>So how about we have the server accept <em>two</em> programs on each request?
The first program is controlled by the owner (Alice), and is signed by
her. The second comes from the client (signed by Bob or Bert), and gets
all its power from the first. Alice's program gets to do two things:</p>
<ul>
<li>decide which key is allowed to sign the second program</li>
<li>create an attenuated <code>power</code> object (a wrapper around <code>memory</code>) to
be passed into that second program</li>
</ul>
<p>We define the first "<strong>spell component</strong>" to be the signed serialization
of an "<strong>activation record</strong>": a 2-tuple of an attenuation program and a
verifying key. The key will be used to check the signature of the next
component. We don't currently need a verifying key in the second
activation record because there's no third program (yet!).</p>
<div class="highlight"><pre><span></span>OWNER_VERIFIER = Verifier("4bc6b2faa20a39acf51e7e..")
memory = dict()

@post("/")
def http_post(body):
    (first_component, second_component) = parse(body)
    first_activation_record = OWNER_VERIFIER.verify(first_component)
    (first_program, second_key) = parse(first_activation_record)
    second_activation_record = Verifier(second_key).verify(second_component)
    (second_program, _ignored) = parse(second_activation_record)
    f1 = eval(first_program)
    f2 = eval(second_program)
    power = f1(memory)
    return f2(power)
</pre></div>
<p>When Alice wants to create an account for Bob, she gets a verifying key
from Bob, and builds a program to attenuate <code>memory</code> down into the
Bob-specific subset. She concatenates it with the key the server will use to
recognize Bob's program, and signs the result:</p>
<div class="highlight"><pre><span></span>OWNER_SIGNER = Signer("3d677d3fa31a8c88..")
LIMIT_TO_BOB = '''
def run(memory):
    power = memory["bob"]
    return power
'''
record = build_record(LIMIT_TO_BOB, "70ccf6d1ae2edf896b2c887")
PREFIX_FOR_BOB = OWNER_SIGNER.sign(record) # this will be 'first_component'
</pre></div>
<p>Alice gives this prefix to Bob. Later, when Bob wants to store something
in his private memory space, Bob creates a program to use his attenuated
object, and signs it with his key:</p>
<div class="highlight"><pre><span></span>BOB_SIGNER = Signer("2f0f57f33e71ee58")
STORE_DATA_AT_INDEX = '''
def run(power):
    power["index"] = "data"
'''
record2_store = build_record(STORE_DATA_AT_INDEX, None)
POST(url, body=join(PREFIX_FOR_BOB, BOB_SIGNER.sign(record2_store)))
</pre></div>
<p>The server will run <code>limit_to_bob</code>, create a <code>power</code> object (to
Alice's original specifications) just for Bob, then pass it to Bob's
function (which stores some data). It runs <code>store_data_at_index</code> if
and only if it was signed with the key that was delivered next to
<code>limit_to_bob</code>.</p>
<p>When Bob wants to retrieve that data, he creates a different program:</p>
<div class="highlight"><pre><span></span>def get_index(power):
    return power["index"]
</pre></div>
<p>and signs and POSTs it as before, using <code>PREFIX_FOR_BOB</code> again.</p>
<p>When Alice wants to create an account for Bert, she builds a
<code>limit_to_bert()</code> that looks just like <code>limit_to_bob</code> except that it
requires a different public verifying key, and returns a <code>power</code> that
holds a different subset of the <code>memory</code> object:</p>
<div class="highlight"><pre><span></span>def limit_to_bert(memory):
    power = memory["bert"]
    return power
</pre></div>
<p>Bert gets this new prefix, and uses it for all his requests.</p>
<p>Suppose that Alice owns the execution host. She can write attenuator
programs Alice1, Alice2, Alice3, etc, all of which will be accepted and
executed by the host, with full power. With each program, she can attach
a single identifier that enables a client like Bob to execute a second
program with less authority. Bob may have several such programs, all of
which are signed (or otherwise protected) by the same value. So if
Alice1 is tied to Bob's key, then Alice1.Bob1 and Alice1.Bob2 are valid
sub-programs. Alice2 might be tied to Bert's key, enabling Alice2.Bert1
and Alice2.Bert2. But "Alice1.Bert1" would be rejected, because the
Alice1 program doesn't authorize Bert's key.</p>
<h3>Delegation</h3>
<p>Suppose Bob stores a number in his subset of memory, and wants to let
Carol increment this value (but never decrement it). If we add a third
level of programs, then Bob can get a verifying key from Carol, and give
her a second-level prefix which exposes an <code>increment()</code> function:</p>
<div class="highlight"><pre><span></span>def limit_to_increment(power):
    def increment(value):
        assert isinstance(value, float)
        assert value >= 0.0
        power["index"] += value
    return increment
</pre></div>
<p>and Carol uses this by calling the <code>power</code> object she receives:</p>
<div class="highlight"><pre><span></span>def increment_by_5(power):
    power(5.0)
</pre></div>
<h2>Spells are Arbitrary-Length Lists of Attenuation Programs</h2>
<p>We can generalize this, so the server accepts an entire list of signed
programs. It lets each one decide whether it likes the
following one, executes it to create an attenuated subset of the
passed-in power, then gives that subset to the next function. The last
function is the only one which actually does anything.</p>
<p>To be specific, a "spell" is a list of signed activation records (the
"spell components"). Each record identifies both a program and the
verifying key for the next component.</p>
<div class="highlight"><pre><span></span>OWNER_VERIFIER = Verifier("4bc6b2faa20a39acf51e7e..")
memory = dict()

@post("/")
def http_post(body):
    signed_records = parse(body)
    next_verifier = OWNER_VERIFIER
    power = memory
    for i in range(len(signed_records)):
        signed_record = signed_records[i]
        record = next_verifier.verify(signed_record)
        program, next_key = parse(record)
        next_verifier = Verifier(next_key)
        f = eval(program)
        if i+1 == len(signed_records):
            # this is the last component: 'next_key' is None and ignored
            return f(power) # finally invoke the leaf
        else:
            power = f(power) # attenuate power for the next step
</pre></div>
<p>All non-leaf programs are run with a <code>power</code> argument; however, we
require that none of them actually invoke it yet: they may only build
wrappers around it. No messages are sent to any <code>power</code> objects until
the final leaf function is executed. We might enforce this with an
Auditor of some sort, or by wrapping each <code>power</code> object in a proxy that
rejects invocation attempts until they are all switched on, just before
the final leaf runs.</p>
<p>All programs are DeepFrozen: the only mutability comes from the <code>power</code>
argument, and nothing invokes it during construction of the tree, so
evaluating these programs cannot make any changes to mutable state. The
lack of side-effects makes the results safe to cache, which is a big
win.</p>
<p>We'll build a tree cache: a tree in which each path from the root is a
list of components. At each node, we'll cache two things. The first is
the compiled program (the <code>power</code> value) for that path, which will
encapsulate a reference to the tree-parent's cached <code>power</code> object
(which itself references its own parent, and so on, up to the root,
which is just the original <code>Power</code> object, without any code). These
<code>power</code> objects get weaker as we get further from the root, since
there are more attenuators in the way. The other value at each node is
the verifier, which is used to recognize programs that are allowed to
use the <code>power</code> object.</p>
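<p>One plausible shape for those tree nodes, with invented names (the post
doesn't specify a data structure), is a node holding the cached
<code>power</code> and verifier plus a child map keyed by component hash.
Extending the tree is safe to memoize precisely because the attenuator
programs are side-effect-free:</p>

```python
class CacheNode:
    """One path-prefix in the spell tree."""
    def __init__(self, power, verifier):
        self.power = power        # attenuated power for this path
        self.verifier = verifier  # recognizes the next component
        self.children = {}        # component-hash -> CacheNode

def lookup_or_extend(node, component_hash, make_child):
    """Reuse the cached child for this component, or build it once.
    'make_child(parent)' runs the attenuator to derive the weaker power."""
    if component_hash not in node.children:
        node.children[component_hash] = make_child(node)
    return node.children[component_hash]
```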
<h2>Extensions</h2>
<p>So that's the basic idea. There are a lot of implementation details that
need to be worked out, so stay tuned for more posts on this subject. In
particular, there are several properties that we might want to add:</p>
<ul>
<li>
<p>Promises: The wrappers should at least be able to use Promises
internally, returning a Promise for the result. It'd even be nice to
let these Promises span multiple messages, although that probably
means we need a way to store the Resolvers in the persistent storage.</p>
</li>
<li>
<p>Confidentiality: In many contexts, we want privacy for the code we're
sending and the results we get back. We'd like to encrypt the programs
before sending them, in addition to signing them.</p>
</li>
<li>
<p>Caching and API Compression: We don't really want to send every
program in the chain, for every call. This can be compressed by
sending hashes instead of programs (and only supplying the programs
when necessary). We can compress it further by referring to entire
cached chains with a single truncated hash.</p>
</li>
<li>
<p>Evaluation Approaches: There are other possible ways to invoke parent
and child programs, giving one the results of the other, which might
be more natural for certain delegation patterns.</p>
</li>
<li>
<p>More Power: Memory is the simplest thing we might offer each program,
but there are other interesting powers to choose from. The first would
be the ability to send messages to other servers, presented as a basic
HTTP client, or using some other protocol. We might also allow
programs access to a clock, or a random-number generator, enabling
non-deterministic behavior. We can use these to prevent replay
attacks, or to expire a delegated authority after some amount of time.</p>
</li>
<li>
<p>Rights-Amplification: The chains we've seen so far are purely
<em>attenuating</em>: each child program can do no more than its parent.
"Rights Amplification" is about combining <em>two</em> program chains
together to do more than either one could separately. This will
involve a program taking multiple <code>power</code> arguments, each of which
derives from a full chain.</p>
</li>
<li>
<p>Creating power in response to the brand: In our current scheme, for
Carol to receive power, Bob must create a prefix for her (which
references her key). We could instead store information about Carol in
a table in the memory, and Bob could write a generic function that
would accept messages from any client listed in this table. Bob's
function would get to examine the "Brand" of Carol's message, and
would attenuate the power differently depending upon what it sees.
When the spells are confidential, Bob's program would also be
responsible for returning an Unsealer for the client program.</p>
</li>
</ul>
<h1>SPAKE2 Interoperability</h1>
<p>2017-07-31, Brian Warner</p>
<p>
I've been <a href="https://github.com/warner/spake2.rs">working on</a>
a <a href="https://crates.io/crates/spake2">Rust implementation</a> of SPAKE2.
I want it to be compatible with
my <a href="https://github.com/warner/python-spake2">Python version</a>. What do I
need to change? Where have I accidentally indulged in protocol design,
so a choice I make in this library might cause it to behave differently
than somebody else's library? How can I write unit tests for
interoperability?
</p>
<p>This post walks through the parts of the SPAKE2 protocol that are left
for each library author to decide, most of which have a direct bearing on
compatibility with other libraries. It describes the choices I happened
to make while writing python-spake2, none of which are necessarily the
best; together they effectively (but accidentally) define a
specification of sorts.</p>
<h2>Left As An Exercise For The Programmer</h2>
<p>The
<a href="http://www.di.ens.fr/~pointche/Documents/Papers/2005_rsa.pdf">SPAKE2 paper</a>,
like most self-respecting academic publications, leaves out a lot of
details that would be necessary to build a specific implementation.
Authors get tenure points for inventing new protocols and breaking
existing ones, but unfortunately not for the necessary engineering work
of defining test vectors and data-serialization formats.</p>
<p>If you want to actually implement the protocol, you must answer a number
of extra questions (listed below). For your program to interoperate with
someone else's, you must both use the same answers. For protocols that
are "grown up" enough, folks like the IETF will publish RFCs with those
details. SPAKE2 is not there yet
(<a href="https://www.rfc-editor.org/rfc/rfc8125.txt">RFC8125</a> defines some
considerations, and the CFRG has
an
<a href="https://datatracker.ietf.org/doc/draft-irtf-cfrg-spake2/">expired draft</a>),
so lucky us, we're on the cutting edge! The first few implementations
might choose mutually-incompatible approaches, but eventually we can
learn from each other and agree upon something interoperable.</p>
<p>The 0.7 release of my
<a href="https://github.com/warner/python-spake2">python-spake2 library</a>
incorporates some decisions we made on
<a href="https://github.com/bitwiseshiftleft/sjcl/pull/273">pull request #273</a>
of the <a href="https://github.com/bitwiseshiftleft/sjcl/">SJCL Project</a> (a
pure-javascript crypto library), where we were able to hash out a
mostly-interoperable pair of libraries (python and JS). Some of the
discussions there may be useful. Jonathan Lange and JP Calderone are
working on <a href="https://github.com/jml/haskell-spake2">haskell-spake2</a>, and
I'm working on a <a href="https://github.com/warner/spake2.rs">Rust library</a>,
and we're all aiming for compatibility with python-spake2.</p>
<p>I'm going to call this set of decisions the "<strong>0.7 protocol</strong>", to
emphasize my hope that some day we'll get a fully RFC-blessed
specification (deserving of the name "1.0"). When that happens, I'll
update python-spake2 to implement 1.0, and provide an "0.7 mode" for
backwards compatibility with older clients.</p>
<h2>SPAKE2 Overview</h2>
<p>SPAKE2 belongs to a family of protocols named PAKE, which stands for
Password-Authenticated Key Exchange.</p>
<p>The SPAKE2 protocol helps two people who start out knowing some shared
secret code or password. It lets our Alice and Bob exchange messages
(one each), and then both sides calculate a secret key. If Alice and Bob
used the same code, their keys will be the same, and nobody else will
know what that key is.</p>
<p>You can imagine that Alice gives the code (and some other identifying
data) to her friendly local SPAKE2 Robot, which she bought off the shelf
at SPAKE2 Robots R' Us. The robot gives her a message to deliver to
Bob's robot. Meanwhile Bob is doing almost exactly the same thing. When
Alice gives Bob's message to her local robot, her robot prints out a
random-looking key. At the same time, Bob gives Alice's message to his
robot, and his robot prints out (hopefully) the same key.</p>
<p>Both Alice and Bob have to tell their robots that the first person
playing this game is named "Alice", and the second person is named
"Bob". They must also tell their robots who is who: Alice must say
"Robot, I'm the first person, not the second", and Bob must say "I'm the
second person, not the first". That last item, the "side", is the only
thing which they do differently.</p>
<p>Internally, these robots are going to generate some random numbers, do
some math, generate a message, accept a message, do some more math,
assemble a record of everything they've done and seen and thought into a
<strong>transcript</strong>, and then turn the transcript into a key. Part of the
transcript will be secret, which is what makes the key secret too.</p>
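<p>The robots' math can be sketched end-to-end with a toy integer group.
Everything below (the tiny parameters, the element values, the function
names) is invented for illustration and is deliberately insecure; the
transcript-hashing step is omitted and the raw shared element is returned
instead:</p>

```python
import hashlib
import secrets

# Toy parameters: a tiny safe-prime group (p = 2q+1), NOT secure.
# g, M, N are squares mod p, so they lie in the order-q subgroup.
p, q, g = 2879, 1439, 4
M, N = 9, 25

def pw_to_scalar(pw: bytes) -> int:
    return int.from_bytes(hashlib.sha256(pw).digest(), "big") % q

def start(pw: bytes, side: str):
    s = pw_to_scalar(pw)
    r = secrets.randbelow(q)                 # private random scalar
    blind = M if side == "A" else N          # each side blinds differently
    msg = (pow(g, r, p) * pow(blind, s, p)) % p
    return msg, (s, r, side)

def finish(inbound: int, state) -> int:
    s, r, side = state
    unblind = N if side == "A" else M        # strip the peer's blinding
    peer_public = (inbound * pow(unblind, (q - s) % q, p)) % p
    return pow(peer_public, r, p)            # shared element g^(x*y)

msgA, stateA = start(b"our password", "A")
msgB, stateB = start(b"our password", "B")
assert finish(msgB, stateA) == finish(msgA, stateB)
```

With matching passwords, both sides unblind the other's message down to a
plain Diffie-Hellman public value and arrive at the same group element; a
wrong password unblinds to the wrong element, and the derived keys diverge.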
<h2>Your SPAKE2 Library API</h2>
<p>The first thing to define is your local library API (the "SPAKE2
Robot"). This isn't completely exposed externally, of course: different
libraries with different APIs (in different languages) should be able to
interoperate if they can agree on all the other decisions we'll explore
below. But you don't have complete freedom either: some constraints
bleed through. We must glue together local API requirements, the wire
format, and the SPAKE2 math itself.</p>
<p>We'll consider some <strong>application</strong> that sits above the <strong>library</strong>.
You, as the library author, are providing an API to those applications.
You don't know what they need SPAKE2 for, or how they're going to use
it, but your job is to enable:</p>
<ul>
<li>safety: give the application author a decent chance of getting things
right</li>
<li>interoperability: enable compatibly-written applications (perhaps
using different libraries, in other languages) to get the right results
when run against an application using your library</li>
</ul>
<p>The library API we need to provide is pretty simple, and basically
consists of two functions (but which could be expressed in different
ways):</p>
<div class="highlight"><pre><span></span><span class="p">(</span><span class="n">msg1</span><span class="p">,</span> <span class="n">state</span><span class="p">)</span> <span class="o">=</span> <span class="n">start</span><span class="p">(</span><span class="n">password</span><span class="p">,</span> <span class="n">idA</span><span class="p">,</span> <span class="n">idB</span><span class="p">,</span> <span class="n">side</span><span class="p">)</span>
<span class="c1"># somehow send msg1 to the other party</span>
<span class="c1"># somehow receive msg2 from the other party</span>
<span class="n">key</span> <span class="o">=</span> <span class="n">finish</span><span class="p">(</span><span class="n">msg2</span><span class="p">,</span> <span class="n">state</span><span class="p">)</span>
<span class="c1"># do something with the shared key</span>
</pre></div>
<p><code>start()</code> will accept the password (probably as a bytestring, but
maybe as unicode), the identities of the two sides, and some way to
indicate which side this instance is playing. It's a nondeterministic
function (since it must pick a random scalar), so some languages will
require passing in a source of randomness, but to be safe, application
code should not be required to deal with this. The first function
returns two things: an outbound message to be sent to the peer, and a
state object to be used later.</p>
<p><code>finish()</code> is deterministic. It accepts both the state object and the
message received from the peer, and emits the shared key (as a
fixed-length bytestring). The length of the shared key is up to the
library author, but it's typically 256 or 512 bits (since it's the
output of the transcript hash). Applications that need more key material
should derive it from the shared key with
<a href="https://tools.ietf.org/html/rfc5869">HKDF</a>, outside the scope of the
SPAKE2 library.</p>
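<p>That derivation can be sketched with the RFC 5869 "expand" step, assuming
the SPAKE2 output is already uniformly random (the <code>info</code> labels below
are made up for the example):</p>

```python
import hashlib
import hmac

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    # RFC 5869 "expand" step; prk is the (already uniform) SPAKE2 key
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

shared_key = b"\x00" * 32                # stand-in for the SPAKE2 output
enc_key = hkdf_expand(shared_key, b"app/encryption", 32)
mac_key = hkdf_expand(shared_key, b"app/mac", 32)
assert len(enc_key) == 32 and enc_key != mac_key
```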
<p>You run both halves of the API on both sides. Each side will generate
one message and accept one message. Two messages are generated in all.</p>
<p><img alt="SPAKE2 Message Flow" src="./flow.png" /></p>
<p>The outbound message will be a bytestring, and the application will be
responsible for encoding it in whatever way is needed for the channel
(e.g. the app might need to base64-encode it to put into an HTTP header,
but that's outside the scope of the SPAKE2 library). It will generally
be of a fixed length for any given group (see below), but it may be
easier to tell application authors to expect a variable-length byte
vector.</p>
<p>The state object could be an opaque in-memory struct. For example, in
object-oriented languages, the first function may be an object
constructor, and the second is just a method call on that same object.
If the object can be serialized, or if the first function returns a
serializeable state object, then the application may be able to shut
down and be resumed in between the generation of the first message and
the receipt of the second (so the two users don't have to be online at
the same time). If you offer serialization, make sure to warn authors
against using the same state object twice, since this will hurt
security.</p>
<p>For two different libraries to interoperate, they must use the same key
length. They must also encode the same password in the same way, as well
as the identities of the two sides.</p>
<p>The application author's responsibilities are:</p>
<ul>
<li>give their user a way to enter a code or password</li>
<li>deliver that password to your library (the first function)</li>
<li>take the message your library returns and deliver it correctly to some
remote application</li>
<li>take the matching inbound message from the remote application and
deliver it correctly to your library's second function</li>
<li>do something useful with the shared key</li>
</ul>
<h3>Symmetric Mode</h3>
<p>My python-spake2 library also offers a "Symmetric Mode" which isn't
defined by the Abdalla/Pointcheval paper. This is a variant that I
developed, with help from Mike Hamburg and other crypto folks. It
removes the "side" parameter from the API, so two identical clients can
establish a key without pre-arranged knowledge of which one is which.</p>
<h2>Protocol Definitions</h2>
<p>The details that any given SPAKE2 implementation must define are:</p>
<ul>
<li>how to represent/encode the "identities" of each side</li>
<li>what group to use</li>
<li>what generator element to use</li>
<li>how to represent/encode the code/password, both as a scalar and in the
transcript</li>
<li>what "arbitrary group elements" to use for M and N (and S)</li>
<li>how to encode the group element that is sent to the other party</li>
<li>how to decode+validate the group element received from the other party</li>
<li>how to assemble the transcript</li>
<li>how to hash the transcript into a key</li>
</ul>
<p>If the library offers serialization of the state object, then it must
also define a way to serialize and parse scalars, but this is private to
the library, so it doesn't need to be part of the interoperability
specification. Scalar parsing would also be needed by any private
deterministic testing interface, described below.</p>
<h3>Identities</h3>
<p><a href="http://www.di.ens.fr/~pointche/Documents/Papers/2005_rsa.pdf">The paper</a>
describes each side as having an "identity", such as a username or
server name. The idea here is to prevent an attacker from re-using
messages of one protocol execution in some other context. If Alice is
intending to establish a session key with Server1, then her message
should not be suitable for a similar process with Server2.</p>
<p>The protocol needs bytestrings (to put into the hashed transcript, since
hashes need bytes), so the library API should require bytestrings. </p>
<p>The canonical example of an identity is a username, and of course
usernames can include all sorts of interesting human-language
characters, so the application may want to accept unicode strings, and
convert them (deterministically) into bytestrings before passing them to
the library. There are, unfortunately, multiple ways to represent the
same text in unicode (look up "combining characters" sometime), but
there's one recommended canonical normalization form ("NFC") that makes
the UTF-8 encoding deterministic for all but the strangest of inputs.
<p>It's not entirely clear whether this conversion should be performed by
the application or the library. The problem with doing it in the
application is that using compatible SPAKE2 libraries may not be enough
to achieve interoperability (if the applications encode differently),
and everything will seem to work fine until a sufficiently novel
username is encountered. The problem with doing it in the library is
that it drags unicode into an otherwise somewhat clear-cut API, and
hampers the application from using full 8-bit bytestrings if it should
choose (perhaps the identities are public keys from some other system:
they'd have to be encoded as unicode before passing into such an API).</p>
<p>My recommendation is to have the library accept a bytestring, but
provide guidance on what the application should do with unicode
identities (which should be "encode with UTF-8 and NFC").</p>
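<p>That guidance amounts to a few lines of application-side code (the helper
name here is hypothetical):</p>

```python
import unicodedata

def identity_to_bytes(ident: str) -> bytes:
    # canonicalize first, so both sides derive identical bytes
    return unicodedata.normalize("NFC", ident).encode("utf-8")

# "é" as one code point vs "e" + combining accent encode differently
# without normalization, but identically after NFC
a = "Andr\u00e9"            # é as a single code point
b = "Andre\u0301"           # e followed by U+0301 COMBINING ACUTE ACCENT
assert a.encode("utf-8") != b.encode("utf-8")
assert identity_to_bytes(a) == identity_to_bytes(b)
```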
<p>The two identities must be given to <code>start()</code>, and must be included in
the state object so they can also be used inside <code>finish()</code>. They
should be passed as arguments with "A" and "B" in the names, so that the
other library gets them the same way around, even if the actual argument
names are different (e.g. <code>idA</code> vs <code>identity_A</code>).</p>
<p>If Carol and Dave are using this protocol, and Carol passes
<code>idA="Carol"</code> and <code>idB="Dave"</code>, then Dave must also pass
<code>identity_A="Carol"</code> and <code>identity_B="Dave"</code>. If Carol says
<code>side=A</code>, then Dave must say <code>side=B</code>.</p>
<ul>
<li>What python-spake2-0.7 does: bytestrings</li>
<li>What draft-irtf-cfrg-spake2-03 does: bytestrings</li>
<li>Symmetric mode: there is only one identity, named "idS", but it can
still be set to an arbitrary string to distinguish between different
applications</li>
</ul>
<h3>The Group and Generator</h3>
<p>The details are beyond the scope of this post, but SPAKE2 uses "Abelian
Groups", which contain a (huge) number of "elements". When our group is
an "elliptic curve" group, each element is also known as a "point" (the
elements of integer groups are integers, and other groups use other
kinds of elements). So you'll see references to "point encoding" and
"point validation". The other thing to know about groups is that there's
a second thing called a "Scalar", which is basically just an integer,
limited by a big prime number (which depends on the group). Sometimes
you'll need to deal with group elements, sometimes you'll need to work
with scalars.</p>
<p>Some groups are faster than others, or have smaller elements and
scalars, or are more or less secure.</p>
<p>The Ed25519 signature protocol defines a group (the curve is sometimes
named "edwards25519") with nice properties, but you could use others. My python
library defaults to Ed25519 but also implements a couple of integer
groups.</p>
<p>Both sides must use the same group, of course. Every group comes with a
standard generator to use for the base-point "scalarmult" operation, and
the group's order will constrain many other choices.</p>
<ul>
<li>What python-spake2-0.7 does: defaults to Ed25519, but offers
1024/2048/3072-bit integer groups too.</li>
<li>What draft-irtf-cfrg-spake2-03 does: left as an exercise for the
reader, but sample M/N values are generated for SEC1 P256/P384/P521,
and Ed25519 gets a passing mention</li>
</ul>
<h3>Code/Password</h3>
<p>The input password should be a bytestring, of any length (your library
shouldn't impose arbitrary length limits). Any necessary encoding should
be done by the application before submitting a bytestring to the SPAKE2
library (if the application needs to allow humans to choose the
password, then it may want to accept unicode and perform UTF-8 encoding
itself).</p>
<p>The same arguments for identities apply here, but I'm even more in favor
of a bytestring API (rather than unicode), because it's entirely valid
to have the password be the output of some other hash function (maybe
you stretch it with bcrypt, scrypt, or <a href="https://www.argon2.com/">Argon2</a>
first), in which case requiring a unicode string would be messy.</p>
<p>The password is needed in two places. The first (and most complicated)
is as a scalar, where it is used to blind the public Diffie-Hellman
parameters. The second is when it gets copied into the transcript (see
below).</p>
<p>For the first case, the password must be converted into a scalar of the
chosen group. This is enough of a nuisance that PAKE papers like to
pretend that users have hundred-digit integers as passwords rather than
strings, so they can avoid discussing how to get from one to the other.
We can't dodge this task: our SPAKE2 library will be responsible for
this conversion.</p>
<p>Turning a password into a scalar is closely related to generating a
uniformly random scalar, so see my
<a href="http://www.lothar.com/blog/56-Uniformly-Random-Scalars/">previous blog post</a>
for some discussion about the process. To summarize, we don't need
uniformity for SPAKE2, we just need to emit a non-negative integer less
than some large prime number P (equal to the order of the group we're
using), and preserve all the entropy of the original password
distribution.</p>
<p>In practice, this means something like:</p>
<div class="highlight"><pre><span></span>import binascii
from hashlib import sha256

def convert_password_to_scalar(password_bytes, P):
    # hash to a fixed length, then reduce into the scalar range [0, P)
    pw_hash_bytes = sha256(password_bytes).digest()
    pw_hash_int = int(binascii.hexlify(pw_hash_bytes), 16)
    pw_scalar = pw_hash_int % P
    return pw_scalar
</pre></div>
<p>We start with SHA256 to convert the arbitrarily-sized password into a
fixed-size bytestring, and also reduce the chance that distinct
pathological inputs (empty string, all-NUL) will give us distinct
pathological scalars (like 0, which might not be good). Then we treat
those 32 random-ish bytes as an integer, then take the result modulo P
to make sure it's in the right range (no matter what P actually is).</p>
<p>In large groups (<code>P > 2**256</code>), this won't fill the whole range, but
that's ok. In any group, this will have a bias (smaller scalars will
occur more frequently than large ones), but that's also ok for SPAKE2
(the password isn't uniformly distributed in the first place). It's
definitely <strong>not</strong> ok for Diffie-Hellman: see
the <a href="http://www.lothar.com/blog/56-Uniformly-Random-Scalars/">blog post</a>
for details. SHA256 is wide enough to preserve more entropy than any
conceivable password. A wider hash (SHA512, Blake2) would be fine too,
but e.g. a 16-bit hash would waste a lot of password entropy.</p>
<p>You might already have a function lying around which turns
uniformly-random seeds into uniformly-random scalars: this is safe to
use for the SPAKE2 password, but it's overkill, and you might get some
funny looks from the standards committee.</p>
<p>Note that some crypto libraries store scalars as opaque binary
structures (e.g. arrays of 51 bit integers,
to <a href="https://www.imperialviolet.org/2010/12/04/ecc.html">speed up</a> the
math), even when the language has built-in "bigint" support. So your
password-to-scalar function may need to use a library-specific
large-integer-to-scalar-object function.</p>
<p>Both sides must perform this conversion in exactly the same way,
otherwise they'll get mismatched keys. All aspects must be the same: the
overall algorithm, the hash function you use, the way the hash output is
turned into an integer (big-endian vs little-endian will trip you up),
and the final modulo operation.</p>
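<p>The endianness pitfall is easy to demonstrate: the same hash output
yields two very different scalars depending on byte order, so both sides
must pick one convention:</p>

```python
import hashlib

digest = hashlib.sha256(b"password").digest()
big = int.from_bytes(digest, "big")
little = int.from_bytes(digest, "little")
# same 32 bytes, two different integers: interoperating libraries
# must agree on which interpretation to use
assert big != little
```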
<p>When I wrote python-spake2, I was (incorrectly) worried about
uniformity, so I used an overly complicated approach (which mimics the
Ed25519 random-scalar-generation code: hash the seed to a much larger
range than is really necessary before moduloing down to P; this reduces
the bias to a tiny fraction of a bit). If I were to start again, I'd use
something simpler.</p>
<ul>
<li><a href="https://github.com/warner/python-spake2/blob/v0.7/src/spake2/groups.py#L70">What python-spake2-0.7 does</a>:
HKDF(password, info="SPAKE2 pw", salt="", hash=SHA256), expand to
32+16 bytes, treat as big-endian integer, modulo down to the Ed25519
group order (2^252+stuff)</li>
<li>What draft-irtf-cfrg-spake2-03 does: left as an exercise, although
key-stretching is recommended</li>
</ul>
<p>Note that key-stretching only matters if the same password is used for
multiple executions of the protocol. Stretching would be most useful on
a login system using the SPAKE2+ variant. In SPAKE2+, the "server" side
stores a derivative of the password, so a server compromise does not
immediately allow client impersonation: this password derivative must
first be brute-forced to reveal the original password, and each loop of
this process will be lengthened by the stretch. In magic-wormhole, a new
wormhole code is generated each time, and nothing is stored anywhere, so
stretching is not necessary. I think key-stretching should be done
outside the SPAKE2 library.</p>
<h3>Arbitrary Group Elements: M and N</h3>
<p>The M and N elements must be constructed in a way that
<a href="http://www.lothar.com/blog/54-spake2-random-elements/">prevents anyone from knowing their discrete log</a>.
This generally means hashing some seed and then converting the hash
output into an element. For integer groups this is pretty easy: just
treat the bits as an integer, and then clamp to the right range. For
elliptic-curve groups, you treat the bits as a compressed representation
of a point (i.e. pretend the bits are the Y coordinate, then recover the
X coordinate), but you must make sure that the point is correct too: it
must be on the correct curve (not the twist), and it must be in the
correct subgroup.</p>
<p>Why hash a seed? We definitely want different values for M and N, we
kind of want different values for different curves, and it would be cool
to reduce our ability to fiddle with the results too much: the elements
we pick should somehow be the most "obvious" choice. Another name for
this property is the "nothing up my sleeve" number. djb's
<a href="https://bada55.cr.yp.to/">bada55</a> site (and the delightfully amusing
<a href="https://bada55.cr.yp.to/bada55-20150927.pdf">paper</a>) touch on this.
Pretend that we're trying to build a sabotaged standard, defined to use
an element for which we (and we alone) know a discrete log. Maybe we
have some magic way to learn the discrete log of, say, one in every
million elements. And say that instead of a seed, we're just using an
integer. Now we could just keep trying sequential integers until we get
an element that we can DLOG, and then we write this not-huge integer
into our standard, and tell folks something like "oh, 31337 was my first
phone number, so that was the most obvious choice", muahaha.</p>
<p>Using a short seed, named something obvious like "M", gives us a warm
fuzzy feeling that there's not much wiggle room to perform this
hypothetical search for a bad element. It's not perfect, though, since
we can probably just wiggle the other aspects (which hash to use, which
other fields to include in the hash, the order to arrange them, etc). So
hash-small-seed is nominally a good idea, but the real safety comes from
the choice of group and the hardness of the DLOG assumption.</p>
<p>Using a string seed that includes the curve name means we'll definitely
get different (and quite unrelated) values for different curves, but to
be honest using different curves pretty much gives you that anyway, and
it's not clear how similar-looking elements in unrelated curves could be
used in an attack. The general concern is that you might use the
same password on two different instances of the protocol (one with each
curve), and then an attacker can somehow exploit confusion about which
messages go with which curve.</p>
<p>So that's how we get a "safe" element. Nominally, this only ever needs to
be done once: in theory, I could publish the program that turns a seed
of "M" into 0x19581b8f3.. in a blog post, and then just copy the big hex
values into python-spake2, and include a note that says "if you want
proof that these were generated safely, go run the program from my
blog". And if you actually went and downloaded that code and ran it and
compared the strings, then you'd get the same level of safety. But
nobody will actually do that, so we can inspire more confidence by
adding the seed-to-element code into the library itself, and starting
from a seed instead of a big hex string.</p>
<p>(note that doing this at each startup may add a
<a href="https://github.com/warner/python-spake2/commit/b77f73207494cc60553dbefa814f3284b989faaa">non-trivial slowdown</a> to
applications: a suitable compromise might be to hard-code the elements
for the regular code path, but compare them against recomputed values in
the library's unit tests)</p>
<p>Since we need all implementations to use the same M/N elements, this
means we may need to port the specific seed-to-arbitrary-element routine
from the original language (where maybe it was a pretty natural
algorithm) into each target language (where it may seem
overcomplicated).</p>
<ul>
<li><a href="https://github.com/warner/python-spake2/blob/v0.7/src/spake2/ed25519_basic.py#L271">What python-spake2-0.7 does</a>:
HKDF(seed, info="SPAKE2 arbitrary element", salt=""), with seed equal
to "M" or "N", expand to 32+16 bytes, treat as big-endian integer,
modulo down to field order (2^255-19), treat as a Y coordinate,
recover X coordinate, always use the "positive" X value, reject if
(X,Y) is not on curve, multiply by cofactor to get candidate point,
reject if the candidate is zero (i.e. we started with one of the 8 low-order
points), reject if candidate times cofactor is zero (i.e. candidate
was not in the right subgroup), return candidate. If the candidate is
rejected, increment the Y coordinate by 1, wrap to field order, try
again. Repeat until success. We expect this to loop 2*8=16 times on
average before yielding a valid point. This happens at module import
time.</li>
<li>Symmetric Mode: a group element named S is constructed in the same
way, with a seed of <code>S</code>, and is used for blinding/unblinding in both
directions (where SPAKE2 says "N", replace it with S, and where SPAKE2
says "M", also replace it with S).</li>
<li>What draft-irtf-cfrg-spake2-03 does: find the OID for the curve,
generate an infinite series of bytes (start with SHA256("$OID point
generation seed (M)") for that OID to get the first 32 bytes, then
SHA256(first 32 bytes) to get the second 32 bytes, repeat), slice into
encoded-element lengths, clamp bits as necessary, interpret as point,
if that fails repeat with the next slice. Do the same with "(N)".</li>
</ul>
<h3>Element Representation and Parsing (Encode/Decode)</h3>
<p>Element representation is the most obvious compatibility-impacting
decision to make, as the algorithm provides a group element (e.g. a
point) for the first message, but our library API returns a bytestring
(since we need to send bytes over the wire). So clearly we need to
define how we turn group elements into bytes, and back again.</p>
<p>(while you could define the API to return an abstract element, and push
the serialization job onto the application, that sounds unlikely to ever
interoperate with other libraries)</p>
<p>The X.509 certificate world has a fairly well-established process for
doing this, called BER or DER, and it includes things like multiple
compression mechanisms and built-in curve identifiers. The security
community has an equally well-established distaste for BER/DER, because
the parsers are hard to implement correctly (and without buffer
overflows), and these days flexibility is considered a misfeature.</p>
<p>For the Ed25519 group, points are represented as they are in the Ed25519
signature protocol: a 32-byte string, containing the Y coordinate as a
255-bit little-endian number, with the sign of the X coordinate appended
as the last bit.</p>
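<p>The byte-packing half of that encoding (ignoring the elliptic-curve math
of recovering and validating X, which a real decoder must also do) looks
like this:</p>

```python
def encode_point(y: int, x_is_negative: bool) -> bytes:
    # 255-bit little-endian Y, with the sign of X packed into the top bit
    assert 0 <= y < 2**255
    data = bytearray(y.to_bytes(32, "little"))
    if x_is_negative:
        data[31] |= 0x80
    return bytes(data)

def decode_point(data: bytes):
    # a real decoder must then recover X and validate the point
    y = int.from_bytes(data, "little") & ((1 << 255) - 1)
    return y, bool(data[31] & 0x80)

wire = encode_point(2**254 + 12345, True)
assert len(wire) == 32
assert decode_point(wire) == (2**254 + 12345, True)
```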
<p>On the receiving side, the library must parse the incoming bytes into a
group element. Obviously the encoder/parser pair must round-trip
correctly for anything generated by our library, but it must also work
<em>safely</em> for other random strings, including deliberate modifications of
otherwise-valid values by an attacker (intended to force the key to some
known value that's independent of the password). There
<a href="https://moderncrypto.org/mail-archive/curves/2017/000896.html">is</a>
<a href="https://moderncrypto.org/mail-archive/curves/2015/000551.html">some</a>
<a href="https://neilmadden.wordpress.com/2017/05/17/so-how-do-you-validate-nist-ecdh-public-keys/">debate</a>
about the issue, but for safety, the receiving side should reject any
message which turns into:</p>
<ul>
<li>a point that's not actually on the right curve</li>
<li>a point that's not actually a member of the expected subgroup</li>
</ul>
<p>The former is accomplished by testing the recovered X and Y coordinates
against the curve equation, which takes time. The latter is done by
multiplying by the order of the group (the details of which depend upon
the "cofactor"), and can take as much time as the main SPAKE2 math
itself (so potentially doubling the total CPU cost). However both are
important to do, and worth the slowdown: it will be trivial compared to
the network delay.</p>
<p>Note that whatever crypto library you use will probably implement
point-encoding and decoding for you. In general, this is great, because
they probably did a much better job of it than anything we could do. But
this also limits your ability to interoperate with a SPAKE2 function
that uses a different crypto library. And check the docs carefully to
make sure it's doing enough validation: you might be using a function
that assumes the encoded values come from a trusted source (e.g. saved
to disk), and that's not the case for us.</p>
<p>Finally, SPAKE2 is (usually) "sided": there are two roles to play, and
both participants must somehow choose (different) sides before they
start. A really common application mistake is to use the same side on
both ends: "Hello Alice?
<a href="http://www.imdb.com/title/tt1373156/">This is Alice</a>.". When this
happens, the keys won't match, and the result will be indistinguishable
from a password mismatch, which will take forever to debug.</p>
<p>To help programmers discover this error earlier, the library might want
to add a "side identifier" to the message. If the second API function is
given a message from the same side as it was told to be in the first
function, it can throw an exception which instructs the programmer to
assign different sides.</p>
<ul>
<li><a href="https://github.com/warner/python-spake2/blob/v0.7/src/spake2/ed25519_basic.py#L342">What python-spake2-0.7 does</a>:
encode points like Ed25519 does, reject not-on-curve and
not-in-correct-subgroup points during parsing. A one-byte "side
identifier" is prepended to the outgoing message, and this identifier
is checked and stripped on the inbound function.</li>
<li>What draft-irtf-cfrg-spake2-03 does: the encoding is specified by the
choice of group; the draft suggests SEC1 uncompressed encoding for
elliptic-curve points, and big-endian integers for integer groups</li>
</ul>
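<p>A sketch of what that side-identifier check might look like (the marker
bytes and function names here are illustrative, not python-spake2's actual
API):</p>

```python
SIDE_A, SIDE_B = b"A", b"B"

def add_side_marker(my_side, body):
    # prepend a one-byte role identifier to the outgoing message
    return my_side + body

def check_and_strip_side(my_side, inbound):
    # reject a message that claims the same role we chose
    their_side, body = inbound[:1], inbound[1:]
    if their_side == my_side:
        raise ValueError("reflection detected: both sides chose the same role")
    return body
```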
<h3>Transcript Generation</h3>
<p>Now that each side has sent their element, and received the other side's
element, the SPAKE2 math gets us a secret shared element. However this
isn't a key yet. The remainder of the protocol is responsible for
leveraging this secret element in the production of a proper shared key.</p>
<p>The secrecy of the shared key comes entirely from the secrecy of the
shared element (the original password is also involved, to make the
proof stronger, but doesn't add any meaningful security). However using
it alone would open us up to several mix-and-match attacks, where the
adversary redirects and reorders messages to confuse e.g. an Alice+Bob
session with an Alice+Carol session. In addition, the shared element
isn't a uniformly random key: for starters it isn't even a bytestring.
And serializing a random element doesn't get you a random bytestring:
there are usually distinctive patterns, like the high bit is always set,
or the low bits are always clear, or its value as an integer is always
smaller than the group order. Our goal is a fixed number of
independently uniformly random bits, usually 256 of them.</p>
<p>Modern protocols handle both these problems by building a "conversation
transcript", which contains every message that was exchanged (as well as
the "inner voice" intentions and computed secrets), and finally hashing
the whole thing. The hash function hides any structure from the secret
element, and the inclusion of the other messages prevents the
mix-and-match attacks.</p>
<p>It's as if Alice's SPAKE2 Robot keeps a journal as it works, with the
following entries:</p>
<ul>
<li>This is a journal about a SPAKE2 conversation.</li>
<li>I'm using a password of: <code>password</code></li>
<li>The first side's identity is: <code>Alice</code></li>
<li>The second side's identity is: <code>Bob</code></li>
<li>The first side sent a message to the second side with: <code>MESSAGE1</code></li>
<li>The second side sent a message to the first side with: <code>MESSAGE2</code></li>
<li>I derived a shared group element of: <code>SECRET SHARED ELEMENT</code></li>
</ul>
<p>Except that everything in the transcript needs to be a bytestring. Bob's
robot will have an identical journal: note that every statement is true
for both sides (assuming the shared element works out), and nothing is
specific to a given side (e.g. the phrase "I am Alice" or "I am the
first side" does not appear).</p>
<p>The password could be hashed in its original form (as a bytestring), or
as a scalar (which must then be serialized into a bytestring). We need
the scalar form in both the first and the second functions, so you have
a couple of choices of CPU and space usage (noting that both are
minuscule):</p>
<ul>
<li>store only the bytestring in the state vector, and re-convert to a
scalar in the second function, then hash the bytestring</li>
<li>store only the bytestring in the state vector, and re-convert to a
scalar in the second function, then hash the serialized scalar</li>
<li>store only the scalar in the state vector, and hash the serialized
scalar</li>
<li>store both in the state, and hash the bytestring (this is my
preference)</li>
<li>store both in the state, and hash the serialized scalar</li>
</ul>
<p>The messages could include the "side" marker, or not. Since the messages
need to be bytestrings for transmission anyway, it makes sense to use
these same encoded forms for the transcript too. The final shared
element should be encoded in the same way as the messages were, although
of course this encoded secret element is never sent over a wire.</p>
<p>The concatenation scheme must resist "format confusion" attacks: where
the combination of (A1, B1) results in the same bytes as a combination
of some different (A2, B2). This only really happens when either value
can be variable-length, and the length is not correctly included in the
combined form. For example:</p>
<div class="highlight"><pre><span></span>def unsafe_cat(a, b): return a+b
# collision! both calls produce "youlose"
assert unsafe_cat("youlo", "se") == unsafe_cat("you", "lose")
</pre></div>
<p>Adding a fixed delimiter is unsafe if the strings could contain the
delimiter:</p>
<div class="highlight"><pre><span></span>def unsafe_cat2(a, b): return a+":"+b
# collision! both calls produce "you:lo:se"
assert unsafe_cat2("you:lo", "se") == unsafe_cat2("you", "lo:se")
</pre></div>
<p>Escaping the delimiter can work, but is touchy (you must escape the
escape character too). The rule is that if you could reliably parse the
concatenated string back into the original pieces, no matter how weird
those pieces were, then you've got a secure concatenation function.</p>
<p>Two safe and easy ways to do this are:</p>
<ul>
<li>prefix all variable-length strings with a fixed-size length field</li>
<li>hash each variable-length string first, and concatenate the
fixed-length hashes</li>
</ul>
<div class="highlight"><pre><span></span>import struct

def safe_cat(a, b):
    assert len(a) < 2**64  # length fits in 8 bytes
    assert len(b) < 2**64
    return "".join([(struct.pack(">Q", len(x)) + x) for x in [a,b]])
</pre></div>
<div class="highlight"><pre><span></span>from hashlib import sha256

def safe_hashcat(a, b):
    return sha256(a).digest() + sha256(b).digest()
</pre></div>
<p>Of course, both sides must use the same order of elements, the same
encoding for each element, and the same final concatenation technique.</p>
<ul>
<li><a href="https://github.com/warner/python-spake2/blob/v0.7/src/spake2/spake2.py#L45">What python-spake2-0.7 does</a>:
transcript =
sha256(password)+sha256(idA)+sha256(idB)+msg_A+msg_B+shared_element.to_bytes()</li>
<li>Symmetric Mode: sort the two messages lexicographically to get
msg_first and msg_second, then transcript = sha256(password)+sha256(idS)+msg_first+msg_second+shared_element.to_bytes()</li>
<li>What draft-irtf-cfrg-spake2-03 does: len(idA)+idA+ len(idB)+idB+ len(B_msg)+B_msg+ len(A_msg)+A_msg+ len(shared_element)+shared_element+ len(password_scalar)+password_scalar</li>
</ul>
<p>In draft-irtf-cfrg-spake2-03, <code>len(x)</code> uses 8-byte little-endian
encoding. The shared element is encoded the same way as it would be on
the wire. The hash uses the password scalar, rather than the password
itself. All fields are length-prefixed even though most of them have
fixed lengths. And for some reason (maybe a typo) <code>B_msg</code> appears
before <code>A_msg</code>, even though <code>idA</code> appears before <code>idB</code>.</p>
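<p>The draft's length-prefixed concatenation can be sketched like this (the
function names are mine, and the final hash is SHA256 only for illustration,
since the draft leaves the hash choice open):</p>

```python
import struct
import hashlib

def lp(b):
    # 8-byte little-endian length prefix, as in the draft
    return struct.pack("<Q", len(b)) + b

def transcript_hash(idA, idB, msg_A, msg_B, shared_element, pw_scalar_bytes):
    # note the draft's ordering: B's message comes before A's
    t = (lp(idA) + lp(idB) + lp(msg_B) + lp(msg_A)
         + lp(shared_element) + lp(pw_scalar_bytes))
    return hashlib.sha256(t).digest()
```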
<p>Also note the context of that draft's protocol: IETF specifications are
frequently composed together. The SPAKE2 conversation defined therein
may get embedded in other protocols (e.g. TLS) which have their own
notion of a transcript, object encoding, or required hash functions. So
the draft might not want to overspecify the protocol, for fear of
inhibiting composition.</p>
<h3>Hashing the Transcript</h3>
<p>Finally, the transcript bytes are hashed, and the result is used as the
shared key. The library must choose the hash function to use (SHA256 is
a fine choice), which nails down exactly how large the shared key will
be. Libraries should stick to some fixed-length hash function (either
SHA256, SHA512, or BLAKE2) and return a single key. Applications which
want more key material should feed this shared key into
<a href="https://tools.ietf.org/html/rfc5869">HKDF</a>.</p>
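<p>As an illustration, HKDF's expand step (RFC 5869) is just an HMAC loop; a
real application should use its crypto library's HKDF rather than this sketch:</p>

```python
import hmac
import hashlib

def hkdf_expand_sha256(prk, info, length):
    # RFC 5869 HKDF-Expand: T(n) = HMAC(prk, T(n-1) | info | n)
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]
```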
<p>Hash functions can be specialized for specific purposes (the HKDF
"context" argument provides this, or the BLAKE2 "personalization"
string). This helps to ensure that hashes computed for one purpose won't
be confused with those used for some other purpose. For SPAKE2 this
doesn't seem likely, but a given library might choose to use a
personalization string that captures the other implementation-specific
choices that it makes. Alternatively, the transcript can include a fixed
string that encapsulates the rest of the protocol (the first line could
be "This is a journal about an RFC-NNNN -formatted SPAKE2 conversation",
where the RFC specifies the hashes and groups and fixed elements and
everything else). But of course using the name of the specification in
the specification itself is a circular definition that would probably
sink any chances of getting the spec published.</p>
<p>Both sides must use the same hash function and personalization choices.</p>
<ul>
<li><a href="https://github.com/warner/python-spake2/blob/v0.7/src/spake2/spake2.py#L45">What python-spake2-0.7 does</a>:
key = sha256(transcript)</li>
<li>What draft-irtf-cfrg-spake2-03 uses: left as an exercise</li>
</ul>
<h3>Things That Don't Matter (for interoperability)</h3>
<p>SPAKE2 implementations must also choose a random secret scalar. They'll
use this to compute the message that gets sent to their peer, and then
they use it again on the message they receive from their peer. It is
<a href="https://arstechnica.com/gaming/2010/12/ps3-hacked-through-poor-implementation-of-cryptography/">imperative</a>
to use a <a href="https://www.xkcd.com/221/">fresh</a> new scalar each time they
run the protocol.</p>
<p>This scalar must be chosen uniformly from the full range (<code>0 <= x <
P</code>). The considerations and techniques described
<a href="http://www.lothar.com/blog/56-Uniformly-Random-Scalars/">earlier</a> are
all important, however since this scalar is kept secret,
interoperability is not affected by how any given implementation does
it.</p>
<h3>Things That Do Matter</h3>
<p>A summary of the choices that both sides must agree upon to achieve
interoperability (some of these are made by the library's code, and
others are <em>inputs</em> to the library):</p>
<ul>
<li>the two identity strings</li>
<li>how identity strings are encoded into the transcript</li>
<li>the group to use</li>
<li>which generator of that group to use</li>
<li>how group elements are encoded (for messages and the transcript)</li>
<li>the "arbitrary elements" (M/N/S)</li>
<li>the password</li>
<li>how the password is turned into a scalar</li>
<li>how the password is encoded into the transcript</li>
<li>the order of things going into the transcript</li>
<li>the safe-concatenation technique of the transcript</li>
<li>how to hash the transcript into the final key</li>
</ul>
<p>And additional choices that affect security (but poor choices would not
show up as interoperability failures):</p>
<ul>
<li>using a group where discrete log is difficult</li>
<li>using "arbitrary elements" without a known discrete log</li>
<li>rejecting invalid encoded elements</li>
<li>safely concatenating the pieces of the transcript</li>
</ul>
<h2>Testing Interoperability</h2>
<p>The interactive nature of the protocol makes it particularly hard to
write unit tests of interoperability, especially the kind where you
compare a new execution transcript against a known "good" trace copied
earlier. SPAKE2 is a form of ZKP ("zero-knowledge proof"): Alice is
proving that she knows the same password as Bob, without revealing any
other knowledge about that password. In fact the way you prove SPAKE2 is
a ZKP is to demonstrate that someone who doesn't know the password could
still generate a transcript that's indistinguishable from a real one.</p>
<p>So we can't just take a transcript of some reference implementation
(say, python-spake2) and copy it into the non-interactive unit tests of
a new implementation (say, spake2.rs). Testing a non-modified SPAKE2
implementation requires something interactive: either having both
implementations in the same program (e.g. your Rust unit tests have to
run python code too), or using an online server to query the other
implementation (your unit tests must make network calls).</p>
<p>The key feature of SPAKE2 that enables the ZKP proof is the private
scalar (selected randomly during the first function, used to construct
the first message, stored in the state vector, and used again to process
the second message). For the algorithm to be secure, this scalar must be
selected uniformly at random from the full range of the group order, it
must never be revealed outside the library, and every single run of the
protocol must generate a fresh value.</p>
<p>So, to test two implementations against each other non-interactively, we
would have to break both implementations, by forcing them to use a known
scalar, rather than a unique random value. We run the reference
implementation as Alice with some fixed scalar, and then copy the
generated message into our unit tests. We run it again as Bob, with a
different fixed scalar, and copy that message too. We combine Alice and
Bob, and record the shared key.</p>
<p>Now, in our new implementation, we find a way to force it to use the
same scalar that we used in Bob. We assert that:</p>
<ul>
<li>the new implementation's Bob produces the same outbound message, given
the same secret scalar</li>
<li>when given Alice's message, the new code produces the same shared key</li>
</ul>
<p>This requires modifying the code. We can't exercise the random-scalar
part without interaction, but we can exercise everything beyond that
point. So to enable non-interactive unit tests, implementations should
be factored into two parts: an outer function (called by application
code) which generates a random scalar, and an inner function which
accepts the scalar as input and generates the first message (and the
state that's passed to the second half). The inner function should never
be exposed to applications: it should be private to the library's own
unit tests.</p>
<p>As an added benefit, the inner function is fully deterministic and
purely-functional.</p>
<h3>Testing Server</h3>
<p>Testing SPAKE2 implementations would benefit from an online server that
can answer protocol queries (using fully random values), emitting
both the normal protocol message <em>and</em> the normally-secret key. To help
with debugging mismatches, this test server should also reveal its
internal state: the secret scalar, and the full transcript. If your
implementation gets a different key, you can go back and compare the
intermediate values until you find the first one that doesn't match.</p>
<p>I'm working on one, and I know JP has something in the works too.</p>
<h2>Conclusions</h2>
<p>SPAKE2 is a neat protocol: it's pretty simple (as these things go), and
it enables a good chunk of functionality that none of the other crypto
primitives can provide. </p>
<p>There seems to be a long "lead time" as cryptographic protocols slowly
make their way from the academic world down into the hands of everyday
programmers. There's all sorts of exciting stuff waiting for you in the
literature, but most of the tools that your average developer (or
security engineer) will feel comfortable using are decades old. In rough
order of familiarity:</p>
<table>
<thead>
<tr>
<th>family</th>
<th>year introduced</th>
<th>age</th>
<th>modern example</th>
</tr>
</thead>
<tbody>
<tr>
<td>hashes</td>
<td>~1979</td>
<td>~40 years</td>
<td>SHA256 (2001)</td>
</tr>
<tr>
<td>symmetric encryption</td>
<td>~50 BC (Caesar)</td>
<td>~2000 years</td>
<td>AES (1998)</td>
</tr>
<tr>
<td>MACs</td>
<td>~1984 (MAA/ISO8731-2)</td>
<td>~30 years</td>
<td>HMAC (1996)</td>
</tr>
<tr>
<td>public-key signatures</td>
<td>1977 (RSA)</td>
<td>~40 years</td>
<td>Ed25519 (2011)</td>
</tr>
<tr>
<td>public-key encryption</td>
<td>1977 (RSA)</td>
<td>~40 years</td>
<td>NaCl Box (2010)</td>
</tr>
</tbody>
</table>
<p>And I think the PAKE family is ready to be the next thing through this
pipeline: the family as a whole was introduced with EKE in 1992 (25
years ago!), and SPAKE2 itself is now 12 years old.</p>
<p>But to get there, SPAKE2 must go through the same process as the rest of
these protocols. We need a good specification, as well as a certain
amount of evangelism, clear use cases, examples, prior art, visible
trailblazers, and developer confidence. Hopefully this list of
compatibility criteria can help us get there.</p>Uniformly Random Scalars2017-07-15T21:13:00-07:002017-07-15T21:13:00-07:00Brian Warnertag:www.lothar.com,2017-07-15:/blog/56-Uniformly-Random-Scalars/<p>
Many cryptographic protocols, like Diffie-Hellman and SPAKE2, require a
way to choose a uniformly random scalar from some prime-order range.
Why? What is the best way to do this?
</p>
<h2>What (is a scalar)?</h2>
<p>Classic
<a href="https://en.wikipedia.org/wiki/Diffie_hellman">Diffie-Hellman Key Exchange</a>
starts with each side choosing a random scalar. This is kept secret, but
is used to derive a "public ephemeral element" that is sent to the other
side. It is also used upon the peer's ephemeral element to build the
shared secret element, from which the final secret key is derived.
SPAKE2, as a modified DH protocol, relies on this secret random scalar
too.</p>
<p>Scalars are basically integers in a specific range, bounded by the order
of an Abelian group, and the order is generally a big prime number P. To
be precise, scalars are "equivalence classes of integers modulo P",
meaning that you're choosing a <em>class</em> of integers, all of which are
equal to each other if your idea of "equal" is modulo P. If P is 5, then
one such equivalence class is the integers 2, 7, 12, -3, -8, -13, etc.
Each of these classes can be <em>represented</em> by a single member, which is
an integer between 0 and P, so we usually pretend that scalars are just
integers with the constraint that <code>0 <= x < P</code>. We say "2" instead of
"the class that includes 2, 7, 12, etc".</p>
<p>Also note: there is some confusion, at least in my mind, about the
precise range of scalars. Some references (including the original SPAKE2
paper) say <code>Zp</code>, which means any non-negative integer less than P (<code>0 <=
x < P</code>).
The <a href="http://cacr.uwaterloo.ca/hac/">Handbook of Applied Cryptography</a>
section on Diffie-Hellman (protocol 12.47, page 516) says scalars should
be <code>1 <= x <= P-2</code> (excluding both 0 and -1). I'm pretty sure that 0
is a bad choice: in DH it will cause the resulting shared element to
always be the same thing (the identity element), independent of the
other party's message. It's a bit like
a <a href="https://en.wikipedia.org/wiki/Weak_key">weak key</a> in symmetric
ciphers. But P is huge, so the chance of accidentally getting a scalar
of 0 (or any other specific value) is effectively nil. As long as the
protocol only uses scalars from trusted sources (i.e. ourselves, not the
network), we don't need to worry about it.</p>
<p>So for simplicity, I'll define our task to be generating an integer
<code>x</code>, where <code>0 <= x < P</code> for some large (prime) P, such that the
value is uniformly randomly distributed in that range (all values are
equally likely).</p>
<h2>Why (do we need a random one)?</h2>
<p>The DSA and ECDSA signature algorithms also use a unique secret random scalar
(known as a "nonce", or just <code>k</code>), and
<a href="https://crypto.stackexchange.com/questions/44644/how-does-the-biased-k-attack-on-ecdsa-work">are vulnerable</a> to
attack if this nonce is biased. If you know the first or last few bits of
each nonce, and you have multiple signatures to work with, then a brute-force
search for the signer's private key is <em>much</em> easier than it should be. In
some cases, the private key can be recovered in a couple of hours.</p>
<p>Of course, if the implementation
<a href="https://www.xkcd.com/221/">doesn't even try to be random</a>, then you wind
up with things like the
<a href="https://arstechnica.com/gaming/2010/12/ps3-hacked-through-poor-implementation-of-cryptography/">Playstation 3</a>
where they used the same hard-coded value of <code>k</code> for every single message,
allowing the private key to be recovered trivially with just two signatures.</p>
<p>It isn't clear if other protocols (like DH) are quite this vulnerable.
<a href="https://weakdh.org/imperfect-forward-secrecy-ccs15.pdf">The Logjam paper</a>,
in section 3.5, mentions attacks on small-exponent DH in poorly-chosen
integer groups, and this
<a href="https://www.ietf.org/mail-archive/web/cfrg/current/msg05004.html">email about Curve25519 scalars</a>
points out the attack-resistance provided by their specific clamping
decisions (which constrain the scalar to certain values). But in
general, our security proofs are built around the assumption that the
scalar is unique and uniformly random, so to be safe we must follow
those rules.</p>
<h2>How (do we create one)?</h2>
<p>We can assume that our operating system gives us a source of random
bytes. <code>/dev/urandom</code> on a fully-initialized unix-like host will give
us as many as we need.</p>
<p>If our target range were <code>0 <= x < 256</code>, or <code>0 <= x < 65536</code>, or some other
power of 256, this would be trivial: just take the right number of raw bytes.
It would also be easy to produce integers in a range that's any integral
power of two (mask off the extra bits, then treat the result as an integer).
But since P is prime, we're never going to have a nice round size for
truncation.</p>
<p>So we need to use <code>/dev/urandom</code> to get a <strong>seed</strong>, and then convert some
number of these seed bytes to an integer. This is pretty easy: just treat the
array of bytes as a base-256 number. In Python2, we can exploit the
<code>hexlify()</code> and <code>int()</code> functions to make this really fast (python3 adds
<code>int.from_bytes()</code>, which is even better):</p>
<div class="highlight"><pre><span></span>import binascii

def bytes_to_integer(seed_bytes):
    return int(binascii.hexlify(seed_bytes), 16)
</pre></div>
<p>What's the range of this number? It will be 0 to <code>2**(8*len(seed_bytes)) - 1</code>.
If we use too few bytes, then it will obviously not even cover the
entire target range, so our first step is to make the seed larger than
the total range. This introduces the possibility of getting a number
that's too big, so we'll have to modulo down:</p>
<div class="highlight"><pre><span></span>import os

def make_random_scalar_with_bytes(seed_length_bytes, P):
    # check that our seed will produce sufficiently-large integers
    # the right-hand side is roughly equal to ln2(P)
    assert 8*seed_length_bytes > (4*len("{:x}".format(P)))
    seed_bytes = os.urandom(seed_length_bytes)
    hash_int = bytes_to_integer(seed_bytes)
    scalar = hash_int % P
    return scalar
</pre></div>
<p>What's a reasonable choice of seed length? For the Curve25519 group, P is
<code>2**252 + 27742317777372353535851937790883648493</code>, which lies on the low
end of the range between <code>2**252</code> and <code>2**253</code>. If we use 253 random bits
(which you get from 32 random bytes by doing something like <code>seed_bytes[0]
&= 0x1F</code> to mask out the top three bits), then we'll get a suitable value
slightly more than half the time, and the modulo function will kick in (i.e.
"aliasing" occurs) slightly less than half the time.</p>
<p>But that's pretty badly biased. Each time aliasing happens (e.g.
<code>hash_int >= P</code>) means that two values of <code>hash_int</code> (which <em>is</em>
uniform) are mapping to the same value of <code>scalar</code> (which therefore is
not uniform). Consider the simple case of <code>P = 2**8 - 1 == 255</code> (so we
want outputs from 0 to 254, inclusive, and exclude only 255), and our
<code>seed_length</code> is 1 byte. Seeds of 0 and 255 will both map to an output
of 0, so zeros will appear in the output twice as frequently as any
other value. The one case of aliasing will induce a bias in our output.</p>
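<p>This bias is easy to observe by enumerating every possible one-byte seed
for that <code>P = 255</code> example:</p>

```python
P = 255  # deliberately tiny "group order" for illustration

counts = {}
for seed in range(256):       # every possible one-byte seed
    scalar = seed % P
    counts[scalar] = counts.get(scalar, 0) + 1

# seeds 0 and 255 both map to 0; every other output appears exactly once
assert counts[0] == 2
assert all(counts[v] == 1 for v in range(1, 255))
```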
<p>The amount of bias, in a statistical sense, depends upon how many extra
bits we start with, and how close our target <code>P</code> is to a power of 2,
so it's something like <code>ln2(P) - floor(ln2(P))</code>, using the base-2
logarithm of our target P.</p>
<h2>The Best Good-Enough Solution</h2>
<p>The simplest solution that yields a minimal bias is to throw more bits
at the problem. Using a <code>seed_length</code> that's 16 bytes (128 bits)
larger than we really need reduces the bias to a statistically
insignificant level. In this case, we're aliasing almost <strong>all</strong> the
time:</p>
<div class="highlight"><pre><span></span>def make_random_scalar(P):
    # conversion that reduces the bias to a fraction of a bit
    minimal_length_bits = 4*len("%x" % P)
    safe_length_bits = minimal_length_bits + 128
    safe_length_bytes = safe_length_bits // 8
    # that gets us between 121 and 128 bits of safety margin
    return make_random_scalar_with_bytes(safe_length_bytes, P)
</pre></div>
<p>This is the approach used by the Ed25519 codebase to compute unbiased
deterministic nonces from the private key and the message being signed.
These nonces have the same requirements as ECDSA: they must be unique
and unbiased. The Ed25519 signing function creates a 512-bit hash and
then reduces it down to the ~252-bit group order: see the bottom of page
6 of the <a href="http://ed25519.cr.yp.to/ed25519-20110926.pdf">Ed25519 paper</a>,
where <code>r</code> is the nonce, and computations end up being performed mod
<code>P</code> (which they call <code>l</code>). They use about 258 extra bits.</p>
<p><a href="http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.186-4.pdf">FIPS 186-4</a>,
which defines DSA and ECDSA, says that 64 extra bits are sufficient (in
appendix B.2.1).</p>
<h2>The Exact Solution: Try-Try-Again</h2>
<p>There is a way to remove <strong>all</strong> the bias, but you might not like it. To
achieve zero bias, you remove the modulo-P step (so there's no chance of
aliasing), and you add a loop that keeps trying new random seeds over
and over again until the integer just happens to be in the right range.</p>
<div class="highlight"><pre><span></span>import os

def try_try_again(P):
    length_in_bits = 4*len("%x" % P)
    seed_length_bytes = (length_in_bits + 7) // 8  # round up to whole bytes
    while True:
        seed_bytes = os.urandom(seed_length_bytes)
        # mask: keep only the low 'length_in_bits' bits
        candidate = bytes_to_integer(seed_bytes) & ((1 << length_in_bits) - 1)
        if candidate < P:
            return candidate
        # else, try again
</pre></div>
<p>This takes an unpredictable amount of time, but provides a perfectly
uniform output. The number of trials that you'll need depends upon the
same bias that we're removing. If you mask the bytes down to the minimum
number of bits, then the worst case (where P is just slightly larger
than some power of 2) is an average of two passes. If you don't bother
masking individual bits, then the worst case is 255 average passes. If P
is just slightly <strong>smaller</strong> than a power of 2, the average is a single
pass.</p>
<p>But this is an exponential distribution: if you're really unlucky, it
could take thousands of iterations before you find a suitable integer,
or worse. The <strong>mean</strong> is small, but the <strong>maximum</strong> is infinite.</p>
<p>I used this "try-try-again" algorithm as an option in
<a href="https://github.com/warner/python-ecdsa">python-ecdsa</a>. But unbounded
runtime is a drag, so the recommended approach is to use the
extra-128-bits scheme described above (in <code>make_random_scalar()</code>).</p>
<p>This technique is also used (since around 2003 for large ranges, and
<a href="https://bugs.python.org/issue9025">since 2010</a> for all ranges) in
Python's <code>random.SystemRandom.randrange()</code> function, and
<code>secrets.randbelow()</code> in Python3.6.</p>
<p>Before that point, python2.4 had
a <a href="http://bugs.python.org/issue812202">bug</a> (reported by none other than
Ron Rivest, the R in RSA!) in which <code>random.SystemRandom</code> used
<code>/dev/urandom</code> as a seed correctly, but <code>randrange()</code> used that seed
to create a floating point number, then multiplied it out to the desired
range (and rounded the result to an integer). As a result, no matter how
large a range you asked for, the number could never have more than about
53 bits of entropy (and in fact the low-order bits were always zero,
which is exactly where ECDSA is vulnerable).</p>
<p>That bug was fresh in my mind when I wrote the python-ecdsa code, which
is why I avoided using the standard library functions. But at this point
it's probably safe to just use the following (though be sure to check
what the underlying functions are really doing, especially if you're
porting this to some other language which might have made the same
mistake as Python):</p>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">secrets</span>
<span class="k">def</span> <span class="nf">make_random_scalar</span><span class="p">(</span><span class="n">P</span><span class="p">):</span>
<span class="k">return</span> <span class="n">secrets</span><span class="o">.</span><span class="n">randbelow</span><span class="p">(</span><span class="n">P</span><span class="p">)</span>
</pre></div>
<p>or, on python2.7:</p>
<div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">random</span> <span class="kn">import</span> <span class="n">SystemRandom</span>
<span class="k">def</span> <span class="nf">make_random_scalar</span><span class="p">(</span><span class="n">P</span><span class="p">):</span>
<span class="k">return</span> <span class="n">SystemRandom</span><span class="p">()</span><span class="o">.</span><span class="n">randrange</span><span class="p">(</span><span class="n">P</span><span class="p">)</span>
</pre></div>
<h2>Scalars From Seeds</h2>
<p>For testing, it may be useful to break the function up into two pieces.
The private inner function is deterministic, and accepts the seed bytes
as an argument. The externally-visible outer function is where
<code>/dev/urandom</code> is sampled. The inner function can be unit tested.</p>
<div class="highlight"><pre><span></span>import binascii, os

def _bytes_to_integer(seed_bytes):
    return int(binascii.hexlify(seed_bytes), 16)

def _map_bytes_to_scalar(seed_bytes, P):
    # check that our seed will produce sufficiently-large integers
    # the right-hand side is roughly equal to ln2(P)
    assert 8*len(seed_bytes) > (4*len("{:x}".format(P)))
    hash_int = _bytes_to_integer(seed_bytes)
    scalar = hash_int % P
    return scalar

def make_random_scalar(P):
    # conversion that reduces the bias to a fraction of a bit
    minimal_length_bits = 4*len("%x" % P)
    safe_length_bits = minimal_length_bits + 128
    safe_length_bytes = safe_length_bits // 8
    # that gets us between 121 and 128 bits of safety margin
    seed_bytes = os.urandom(safe_length_bytes)
    return _map_bytes_to_scalar(seed_bytes, P)
</pre></div>
<p>This can also be used in a related function: mapping seeds to scalars.
This function is needed for protocols like SPAKE2, where the
<code>password</code> input must be converted into a scalar for the blinding
step. In this case, uniformity is not strictly necessary (the SPAKE2
password isn't randomly distributed, so any deterministic function of it
will have the same non-random distribution). But if your library already
has <code>_map_bytes_to_scalar()</code>, then it may be easiest to build on top
of that:</p>
<div class="highlight"><pre><span></span>from hashlib import sha256

def password_to_scalar(pw, P):
    seed = sha256(pw).digest()
    return _map_bytes_to_scalar(seed, P)
</pre></div>
<p>In addition, you might want the seed-to-scalar function to behave
differently for different protocols, so the same password used in two
different places doesn't produce values which could be mixed/matched in
an attack. The usual way to accomplish this is to feed some sort of
algorithm identifier into the hash function. Some options are:</p>
<ul>
<li>a simple prefix string: <code>sha256("my algorithm name" + pw)</code></li>
<li>a real key-derivation function: <code>HKDF(context="my algorithm name",
secret=pw)</code>. This also gives you exact control over the number of
bytes, not limited to the native output size of the hash function.</li>
<li>some modern hash functions like BLAKE2 have dedicated
"personalization" inputs: <code>blake2(input=pw, personalize="my algorithm
name")</code></li>
</ul>
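<p>All three options can be sketched with the Python standard library alone. The context strings and the minimal RFC 5869 HKDF below are illustrative, not taken from any particular protocol; note that BLAKE2b's <code>person</code> input is limited to 16 bytes:</p>

```python
import hashlib, hmac

def prefix_scalar_seed(pw):
    # option 1: a simple domain-separation prefix
    return hashlib.sha256(b"my algorithm name:" + pw).digest()

def hkdf_scalar_seed(pw, length=32):
    # option 2: a minimal HKDF (RFC 5869) over SHA-256, with the algorithm
    # name as the "info"/context input
    prk = hmac.new(b"\x00" * 32, pw, hashlib.sha256).digest()  # extract (empty salt)
    out, t, counter = b"", b"", 1
    while len(out) < length:
        t = hmac.new(prk, t + b"my algorithm name" + bytes([counter]),
                     hashlib.sha256).digest()  # expand
        out += t
        counter += 1
    return out[:length]

def blake2_scalar_seed(pw):
    # option 3: BLAKE2b's dedicated personalization input (max 16 bytes)
    return hashlib.blake2b(pw, digest_size=32, person=b"my-proto-v1").digest()
```

<p>All three are deterministic functions of the password, so any of them can feed <code>_map_bytes_to_scalar()</code>.</p>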
<h2>Use in python-spake2</h2>
<p>All of this is an attempt to explain why the password-to-scalar function
in my <a href="https://github.com/warner/python-spake2">python-spake2</a> library
is so over-complicated. When I wrote that function, I was worried that
the blinding scalar needed to be uniformly random (like most other
scalars in cryptographic protocols). So I combined all the techniques
above: both algorithm-specific hash personalization <em>and</em> using an
oversized hash output.</p>
<p>In retrospect, it would probably have been ok to just truncate a plain
SHA256 output to something less than the Curve25519 group order. In
fact, just using 128 bits would have been enough, which removes the need
for the modulo step.</p>
<div class="highlight"><pre><span></span>def password_to_scalar(pw, P):
    return _bytes_to_integer(sha256(pw).digest()[:16])
</pre></div>
<p>So if you're looking at the <code>password_to_scalar</code>
<a href="https://github.com/warner/python-spake2/blob/master/src/spake2/groups.py#L70">function</a>
in <a href="https://github.com/warner/python-spake2">python-spake2</a> and think
it's unnecessarily complicated, that's why.</p>
<h2>Conclusions</h2>
<p>Thanks to Thomas Ptáček, Sean Devlin, Thomas Pornin, and Zaki Manian for
their advice and feedback.</p>Git over Tahoe-LAFS2017-02-07T01:17:00-08:002017-02-07T01:17:00-08:00Brian Warnertag:www.lothar.com,2017-02-07:/blog/55-Git-over-Tahoe-LAFS/<p>
Tahoe-LAFS provides reliability, integrity, and confidentiality, so you
can store important data safely across multiple servers. Git provides
version control and merge tools, enabling better coordination between
multiple authors. By using Tahoe as a Git backend, we could get both.
</p>
<h2>Motivations</h2>
<h3>Dropbox-workalike</h3>
<p>Tahoe's main API looks a lot like an FTP server: you can
add/replace/remove whole files, and manage directories. It has mutable
files, but it doesn't handle write contention very well, and small
changes aren't as efficient as we'd like.</p>
<p>It would be nice to use Tahoe as a replacement for Dropbox: sharing a
directory among multiple computers, generally all owned by the same
person. But the simple approach (each computer reading and writing to
the same shared directory) would have several problems:</p>
<ul>
<li>two sides modifying the same file at about the same time will probably
result in one of the versions being lost</li>
<li>two sides modifying the directory at the same time could clobber the
directory entirely, depending upon the encoding settings and the
number of simultaneous writes</li>
</ul>
<p>One way to avoid simultaneous writes is to give each side exclusive
control over their own "publish" directory. All sides then watch the
publish directories from all the other sides, and merge their contents
into their local copy. Then, afterwards, they publish the merged
contents back out. With luck, this process will eventually converge, and
all sides will see the same thing. By recording some amount of history,
we can provide the merge process with more context to work with.</p>
<h3>Private Git Server</h3>
<p>Another use-case is to store Git repositories privately. Github offers
"private" repos, but in fact all the data in those repos is visible to
Github's servers. This enables a lot of valuable tooling, so it's a
reasonable tradeoff for many users. But if all you want is a secure
place to store a repository so multiple users can access it, and you
don't want the server to be able to read or modify the contents, then it
would be nice to use Tahoe as a Git backend.</p>
<p>This also dovetails with the Dropbox workflow, since we can use Git to
manage the merging between multiple publish-directories.</p>
<h2>Goals</h2>
<p>So what we want is a
<a href="https://git-scm.com/docs/gitremote-helpers">Git remote helper</a>
extension for Tahoe-LAFS. In case you aren't familiar with them,
remote-helpers let you drop a program named <code>git-remote-lafs</code> on your
<code>$PATH</code>, and then any <code>git clone lafs:...</code> will use that program
instead of the built-in HTTP/HTTPS/SSH functionality. Our helper will
also be involved with any subsequent pushes and pulls to that remote.</p>
<p>The general idea is that each Git client gets exclusive ownership of a
single Tahoe directory. These clients will push to their own directory,
then pull from all the other clients' directories. A higher-level tool
will somehow subscribe to those directories so it knows when to pull,
and can use inotify (or equivalent) to watch a local directory to know
when to <code>git commit</code> and <code>git push</code>.</p>
<h2>Git Objects, Git Push</h2>
<p>Time for a quick refresher about Git's
<a href="https://git-scm.com/book/en/v2/Git-Internals-Git-Objects">object model</a>:</p>
<ul>
<li>each Git repo is an object store, mapping SHA1 to an immutable chunk
of data and a type</li>
<li>files are stored as <strong>blob</strong> objects</li>
<li>directories are stored as <strong>tree</strong> objects, which just map child name to
an object-id (either a blob or another tree), plus a mode (chmod)</li>
<li><strong>commit</strong> objects reference parent commits, comments, and a tree object</li>
<li><strong>tag</strong> objects are annotated (and optionally GPG-signed) references to a commit</li>
<li>Git can store all of these objects "loose", one local file per object,
in <code>.git/objects/XX/XYZZ</code></li>
<li>or, it can "pack" many objects into a single file, in
<code>.git/objects/pack</code>, with zlib compression, inter-object deltas,
and an index for fast access. Local packs are self-contained, but
"thin packs" can have deltas that depend upon objects that aren't in
the packfile (e.g. when you know the recipient of the packfile already
has those objects).</li>
<li>the Git repo also remembers <strong>refs</strong>, which map branch/tag names to a
commit object</li>
<li><code>git add</code> is what adds blobs and trees to the store. It also updates
an internal reference called the <strong>index</strong>, which always points at a
tree object. <code>git commit</code> creates a commit object around that tree,
adds it to the store, and updates a ref.</li>
</ul>
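<p>For concreteness, an object's SHA1 name is just the hash of a short type-and-length header plus the payload, which is why identical content always maps to the same identifier. A minimal sketch:</p>

```python
import hashlib

def git_object_id(obj_type, payload):
    # git names every object by the SHA1 of "<type> <size>\0" + payload
    header = obj_type.encode() + b" " + str(len(payload)).encode() + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()

# git_object_id("blob", b"hello\n") matches what `git hash-object` prints
# for a file containing "hello\n"
```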
<p>There are a lot of similarities between Git's immutable blobs and
Tahoe's immutable files: this will help us map one to the other.</p>
<p>When you run <code>git push</code>, git runs a
<a href="https://git-scm.com/book/en/v2/Git-Internals-Transfer-Protocols">clever protocol</a>
that attempts to minimize the amount of data it has to transfer. First,
the receiving server tells the client about all the refs it already has.
Then the client runs a graph-reachability algorithm against the local
tree, to make a list of all objects that are reachable by the references
being pushed, but not reachable from the remote's current refs
(something like <code>git rev-list MINE ^THEIRS</code>: this is not guaranteed to
be minimal, but in practice it works quite well). Then it builds a
single packfile with all of these objects, expressed as deltas against
objects it knows the remote end has access to (<code>git pack-objects
--thin</code>). Then it merely sends this packfile over the wire.</p>
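<p>The reachability step can be sketched at the commit level. Real Git also walks trees and blobs, and <code>parents</code> here is a hypothetical in-memory map from each commit to its parent commits:</p>

```python
def reachable(heads, parents):
    # every commit reachable from `heads`, following parent links
    seen, stack = set(), list(heads)
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen

def commits_to_push(local_refs, remote_refs, parents):
    # like `git rev-list MINE ^THEIRS`: reachable from ours, minus theirs
    return reachable(local_refs, parents) - reachable(remote_refs, parents)
```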
<p>The remote end drops the packfile into the local object store (<code>git
index-pack</code> or <code>git unpack-objects</code>), then verifies that it can find
all the new refs it's been sent (and every object they can reach). If
that looks good, it updates the ref to whatever the client specified.
Later, a <code>git checkout</code> or a pull can rely upon the routine that reads
objects from <code>.git/objects/</code> (using packfiles if necessary) to take
care of the rest.</p>
<p>A similar routine happens when <code>git fetch</code> pulls objects from a remote
repo.</p>
<p>Git remote helpers can do whatever they want, as long as the right
objects wind up in the right place.</p>
<p>Finally, for clarity, let's distinguish between the different kinds of
objects we're dealing with:</p>
<ul>
<li><strong>source files</strong>: the files from the original workspace, which you
manipulate with your editor or other application, and add with <code>git
add</code></li>
<li><strong>git objects</strong>: the blobs and trees and commits that git knows about</li>
<li><strong>tahoe objects</strong>: CHK (immutable) and SSK (mutable) tahoe files and
directories</li>
</ul>
<h2>Efficiency</h2>
<p>So we need to build a tool that maps the Git object graph (or changes to
it) into the Tahoe filesystem, and back again. Clearly we have a large
design space for this tool. There are several things we might try to
optimize or achieve:</p>
<ul>
<li>size of data (in bytes) pushed into Tahoe for each change</li>
<li>number of objects added to Tahoe for each change</li>
<li>rate of Tahoe garbage (dereferenced tahoe objects) generated per
change</li>
<li>amount of overhead (garbage-collection / consolidation work) done
during push</li>
<li>number of objects fetched by up-to-date subscribers for each change</li>
<li>size of data fetched by subscribers</li>
<li>number of objects / size of data fetched by new clones who care
about getting the whole history</li>
<li>same, for new shallow clones (those who don't care about history)</li>
<li>does regular Tahoe garbage-collection work? or do we need a special
tool to decide when it's safe to delete a Tahoe object?</li>
<li><strong>direct representation</strong>: does the Tahoe-side object graph look just
like the Git-side graph?</li>
<li>do we store all history? or just enough for the known subscribers?</li>
<li>implementation complexity</li>
</ul>
<p>The optimizations will depend upon the file sizes we're storing, how
they change over time (tiny edits or major modifications), and how
quickly we make changes to them. They'll also depend upon the clients:
do we need to support a lot of history for clients that sync
infrequently, or can we ignore history altogether? Some choices will
make the uploads cheaper at the expense of downloads, or vice versa.</p>
<p>Storing the entire history gives clients more information to perform
merges. We can reduce the storage requirements by not retaining any
history, but then clients who see conflicting local changes may have
less information to work with. If we know how up-to-date each client is,
we can get away with less information.</p>
<p>One use case is for a "todo" list, shared between a desktop and a
laptop. In this case, the main todo file might be 1MB in size, and each
day we add a few hundred bytes to it. The additions are mostly
line-oriented, so Git's native merge routines are likely to work pretty
well.</p>
<h2>Necessary Deltas</h2>
<p>For the Dropbox use-case, we only need to retain history as far back as
the oldest client. Once all clients have caught up, we don't need to
keep the history any more (unless we're specifically trying to provide
revision control, as opposed to merely keeping directories in sync). The
most likely case is that we'll have some number of "live" clients, who
are generally up-to-date, and some number of "stale" clients (e.g.
laptops that haven't been plugged in for a while) which only get updated
rarely. And then occasionally we'll add a new client, who needs to be
brought up-to-date from nothing.</p>
<p>If you think about a linear history, we might have one "stale" client at
version 5, another stale client stuck at version 8, and then all the
"live" clients are at version 20. One of the live clients makes a change
(bringing us to 21), then the other live clients catch up. In that case,
we need to be able to bring a new client from "0" directly to 21, and we
also need enough information to get from 5->8 and from 8->21 (if the
stale clients ever reconnect).</p>
<h2>One CHK Per Object, Tahoe-style</h2>
<p>Given the close relationship between Git objects and Tahoe's immutable
CHK files, the simplest <em>conceptual</em> approach would be to store each Git
object blob in a separate CHK file, store trees/commits/tags as
immutable directories, and put the refs as children of a top-level
mutable directory.</p>
<p>When translating Git trees into (immutable) Tahoe directories, we'd
store each child's "mode" in Tahoe's metadata. Commits could be rendered
similarly: comments, tree object, and parent commits would all be
expressed as specially-named "children". The translation must be
reversible, so we can get the same SHA1 out from the far side.</p>
<p>In this scheme, you could browse the entire history with just <code>tahoe
ls</code>, and <code>git log</code> would look a lot like <code>tahoe get
$COMMIT/parent-1/parent-1/comments</code>. You could even do a checkout with
<code>tahoe cp -r</code>! However none of the Git SHA1 identifiers would appear in
the Tahoe tree. Any operations that needed to compare SHA1s against the
stored data (i.e. the "what objects need to be pushed" step of the
smart-transport protocol) would need a translation table. This would map
CHK filecap to SHA1 or vice versa. We could build this cache locally,
and not store it in the Tahoe directory, however we couldn't translate
an arbitrary SHA1 into the corresponding Tahoe object without first
processing every single Tahoe object.</p>
<p>Our "fetch" tool will do a recursive traversal of the tahoe directory
space, starting with the ref, copying everything it walks into the local
Git repo (with <code>git hash-object -w --stdin</code>), and pruning each time
the child link points at something which already exists in our CHK->SHA1
cache. When it finishes the walk, it allows the normal "git fetch" to
run against the local repo, where it ought to find everything it needs.</p>
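<p>A sketch of that traversal, with <code>download</code> and <code>store_in_git</code> standing in for a Tahoe read and a <code>git hash-object -w --stdin</code> call (both hypothetical), and <code>cache</code> as the CHK-to-SHA1 table described above:</p>

```python
def fetch_walk(root_cap, download, store_in_git, cache):
    stack = [root_cap]
    while stack:
        cap = stack.pop()
        if cap in cache:
            continue  # prune: this subtree is already in the local git repo
        payload, children = download(cap)   # read one Tahoe object
        cache[cap] = store_in_git(payload)  # copy it into .git/objects
        stack.extend(children)
    return cache
```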
<p>This would be pretty easy to describe, but is hard to implement, since
our tool must understand the internal Git object format, as well as
knowing the entire object graph. It is also maximally inefficient for
uploaders: Tahoe's per-file overhead would be paid for every file and
parent directory that was changed, as well as the commit itself. Making
a single byte change to a 1MB file would cause a whole megabyte to be
stored (and then downloaded on the other side), with no opportunity to
take advantage of the similarity between revisions. It would also upload
additional (smaller) files for the tree and the commit objects, tripling
the number of Tahoe upload operations (which are not as fast as we'd
like).</p>
<p>It's pretty good for downloaders who are grabbing single arbitrary
revisions, or are limiting themselves to specific directories. These
"browsing" clients can fetch exactly the commit and tree and blob they
need, without pulling anything else out of the Tahoe store.</p>
<p>But it's bad for the main use-case: downloaders who already have some
portion of the history (and just want to update to the current version),
or who want to retrieve the whole history. These folks must fetch lots
of nearly-identical blobs, with no deltas or compression to make things
faster (a full MB per commit, in our TODO-list example).</p>
<h2>One CHK Per Object, Git-style</h2>
<p>Another variant is to store each blob/tree/commit object verbatim, in a
CHK file, and then stash these filecaps in a shared object directory,
named after their Git-side SHA1. This structure would look just like the
<code>.git/objects</code> directory. </p>
<p>It means we can't use <code>tahoe ls</code> as a history-navigation tool, but that
probably wasn't a big win anyways.</p>
<p>Fetch would do a traversal like above, except that the tool must parse
the Git object to figure out what child objects are needed. Instead of a
bi-directional translation function, we just need a unidirectional
function that takes a Git object (as bytes) and returns a list of child
objects that it references. The tool doesn't need to know what the
relationship is (commit->commit, commit->tree, tree->tree, tree->blob).
It just needs to know how to walk the graph so it can find all the nodes
that must be downloaded from Tahoe and copied into the Git object store.</p>
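<p>For tree objects, that child-extraction function is straightforward, since a tree is just a sequence of mode/name/SHA1 entries (commits would need their own parser for the <code>tree</code> and <code>parent</code> header lines). A sketch:</p>

```python
def tree_children(tree_payload):
    # each git tree entry is b"<mode> <name>\0" followed by a raw 20-byte SHA1
    children, i = [], 0
    while i < len(tree_payload):
        nul = tree_payload.index(b"\x00", i)
        children.append(tree_payload[nul + 1:nul + 21].hex())
        i = nul + 21
    return children
```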
<p>This approach has the same efficiency problems as above.</p>
<h2>Just store .git in Tahoe</h2>
<p>The simplest thing to <em>implement</em> would be to just store the whole
<code>.git</code> directory in Tahoe. That would let us avoid parsing Git data
structures at all.</p>
<p>We could define our remote to point at a "bare" repo in a separate
(local) directory, then add a
<a href="https://git-scm.com/docs/git-receive-pack.html#_post_update_hook"><code>post-update</code> hook</a>
to do a <code>tahoe cp --recursive</code> after the push is complete. Git
conveniently only ever adds objects to the target directory; the only
mutation is to replace the <code>refs</code> files, and those are small.</p>
<p>Unfortunately Git is too clever when you push to a local filesystem, and
frequently stores loose objects instead of packfiles. In some cases it
knows it can hardlink the same loose object file instead of actually
making a copy, which is super efficient. But even when it can't, it
still prefers to avoid the computational overhead of generating a
packfile. There appears to be some heuristic involved: pushing a lot of
commits at the same time can create a packfile, but pushing a single
commit (where the new objects are loose in the source repo) seems to
push loose objects, not a packfile.</p>
<p>Storing a lot of loose objects into Tahoe is going to be inefficient,
both for the pusher and the later fetcher/cloner, due to Tahoe's
relatively-high per-file overhead. Making a one-byte change to a 1MB
file in the top-most directory will yield three new objects (a new copy
of the 1MB blob, a new tree object, and a new commit object), two of
which are probably very small. It would be nicer to combine all three
objects into a single tahoe upload.</p>
<p>In this scheme, <code>tahoe ls</code> would show you the same thing as regular
<code>/bin/ls</code> on the local Git object store, and Tahoe doesn't need to
know much about Git's internals. It remains completely unaware of Git's
object graph.</p>
<p>Git avoids putting too many files in a single directory by sharding the
<code>objects/</code> directory into 256 subdirs; this structure would be
replicated in the Tahoe directory. For the "objects/" subdir, we could
use a whole bunch of mutable directories, or stick to a tree of
immutable directories (with a single mutable at the top). For "refs/",
it'd be best to have a tree of immutables with a single mutable root, so
that a coordinator daemon can watch just the one "refs/" directory for
changes (which should happen exactly once per push). Then we necessarily
have a mutable container directory, for which the dircap goes into the
Git URL.</p>
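<p>The sharding rule itself is simple enough to state in a line of Python:</p>

```python
def loose_object_path(sha1_hex):
    # git shards .git/objects/ by the first two hex digits of the SHA1
    return "objects/{}/{}".format(sha1_hex[:2], sha1_hex[2:])
```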
<h2>One-packfile-per-push</h2>
<p>So to improve efficiency, we'd like to have one packfile for each push,
and never see loose objects in the Tahoe directory.</p>
<p>To force the creation of a new packfile for each push, we'd need our
helper program to pretend to be a remote repo, even though it's really
being stored on the same local disk as the original, as if we were doing
ssh-to-self. The actual implementation wouldn't use ssh, but could just
run a local copy of <code>git receive-pack</code>, just like the ssh remote would
normally do on the target host.</p>
<p>We could either <code>tahoe cp -r</code> the resulting directory, or use FUSE to
mount the local directory into a Tahoe dircap (perhaps with <code>sshfs</code>).
The resulting tahoe objects would look the same. Using FUSE would reduce
the total disk space used (one local copy of each object, instead of
two), but FUSE isn't an option on all platforms (it may require root
access, and kernel support). Also, <code>git receive-pack</code> assumes it has
fast read access to its "local" disk, so FUSE may be slower than
keeping an extra (cached) copy of the packfiles and indices.</p>
<p>Basically each time our helper is told to push something, it should
build a packfile that contains every object that isn't already in the
remote, which means each object that we don't remember pushing before.
This packfile can use the previous value of the ref as a basis (storing
deltas against those objects instead of complete copies, when that
helps).</p>
<p>This is pretty efficient for the uploader (although "thin" packs would
make it better, see below), as we do one Tahoe upload (the packfile) per
change, plus a mutable write to the Tahoe object that contains the refs.
For our 100-byte change to the 1MB TODO file, this will add a Tahoe
object slightly larger than 1MB for each commit.</p>
<p>It is similarly efficient for a subscriber, which reads the one packfile
for each revision.</p>
<p>However new clones have to read all the packfiles, which is roughly 1MB
per revision, and grows linearly with the size of history. This loses
out on the compression opportunities we could get if we could merge the
packfiles all together.</p>
<h3>Implementation Details</h3>
<p>So we want something like <code>tahoe cp -r</code>, except that most of the
source files will already be present in the target Tahoe directory.
<code>tahoe backup</code> uses a database to remember what it's written before,
to avoid duplicate uploads, and compares filenames and sizes to predict
equality, but it also retains old snapshots (as behooves a backup tool).</p>
<p>In this case, we know that Git behaves in a specific way: it never
modifies a non-tempfile in the objects/ directory, and most filenames
are based on a hash of the contents. So we probably need a new tool,
which can take advantage of Git's constraints to efficiently map the
object store into a Tahoe directory:</p>
<ul>
<li>ignore tempfiles (these should be cleaned up by the time the tool runs
anyways)</li>
<li>for all filenames in the source that are also in the destination,
assert that their filesizes are the same, then skip them</li>
<li>copy all new files, creating CHK immutables for them</li>
<li>this will include both .pack and .idx files, although we could
probably omit the index (see below)</li>
<li>update refs/ (which can hopefully be stored as LITs)</li>
<li>finally, delete any files that don't exist in the source: this will be
the result of a GC</li>
</ul>
<p>This tool will have an outgoing Git <code>objects/</code> and <code>refs/</code> directory
to copy from, and a Tahoe dircap to copy into.</p>
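<p>The copy steps above can be sketched as a pure planning function over name-to-size maps (the <code>tmp_</code> prefix check is an assumption about how Git names its tempfiles):</p>

```python
def plan_sync(local, remote):
    # local/remote map relative filename -> size; git object files are
    # immutable, so a name collision with a different size is an error
    to_upload, to_delete = [], []
    for name, size in sorted(local.items()):
        if name.startswith("tmp_"):
            continue  # ignore tempfiles
        if name in remote:
            assert remote[name] == size, "immutable git object changed size?"
        else:
            to_upload.append(name)  # new file: create a CHK immutable
    for name in sorted(remote):
        if name not in local:
            to_delete.append(name)  # removed by a source-side GC
    return to_upload, to_delete
```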
<p>On the download side, the tool will have a dircap to read from, and a
pair of Git directories to write into. Once it finishes populating them,
the usual Git "smart" protocol can be allowed to "pull" from the local
copy.</p>
<p>The easiest thing to do is to assume that all packfiles are necessary:
the downloading client can list them all, then feed all the new .pack
files into <code>git index-pack</code>. That will copy the .pack into the local
<code>objects/</code> cache, check connectivity of all objects, and finally build
a new index file (so we could probably avoid storing them in the first
place).</p>
<p>The downside is that shallow clones, which don't care about history,
only need the most recent packfile, and without the indices (and code
that knows how to parse them), we can't tell which one that might be.</p>
<h2>Storing "thin" packs</h2>
<p>If we're only changing one byte of a 1MB file, the need for packfiles to
be self-contained forces us to store the other 999999 bytes that weren't
changed. But Git's online protocol knows how to avoid this: it can
create a "thin" packfile, in which the new blob object is recorded as a
delta against some other blob that isn't in the packfile. It knows the
recipient has the old object, so it can reference it safely.</p>
<p>So if we have accurate information about what's already in the
tahoe-side repo, we can ask Git to create a thin pack (by piping object
references into <code>git pack-objects --thin --stdout</code>), and store the
result as a tahoe file.</p>
<p>These thin packs are not acceptable residents of a
<code>.git/objects/pack/</code> directory: Git insists that everything in the
local object store is self-contained. They normally only exist as a
stream, coming out of <code>git pack-objects --stdout</code> on one side of a
push, and being consumed by <code>git receive-pack</code> or <code>git index-pack</code>
on the other. But we can define the Tahoe directory to contain something
other than a normal Git object store: it represents a frozen copy of
what would have been sent over a wire, like how you might replace a
network link with a hand-delivered parcel of disk drives.</p>
<p>This reduces our new tahoe files to just the delta between revisions
(plus the tree and commit objects). For adding 100 bytes to a TODO file,
the packfile will be 100 bytes long (plus maybe 100 bytes for the other
objects), minus any zlib compression savings. Removing 100 bytes is even
cheaper.</p>
<h2>One-packfile-per-client-delta</h2>
<p>So far, we're creating one packfile per push. If we're doing the
Dropbox-like thing and committing each time inotify tells us something
has changed (ideally once per application-level "Save" command), we'll
get a constant stream of one-file-changed commits, and one push per
commit. This is ideal for live downloading clients, who already have the
previous state and want a cheap update operation. But new clones must
fetch a large number of (small) Tahoe files to complete their history.
Stale clients will also need to fetch a lot of files.</p>
<p>We can improve this by being aware of what our downstream clients
currently know, and consolidating adjacent packfiles when nobody needs
to (efficiently) read the intermediate state. The most efficient thing
possible for full-history downloaders (subscribers and new clones) would
be to have one packfile per old client version. In our example with
clients at versions 5, 8, and 21, we would need a "0->21" packfile (for
new clones), a "5->21" packfile (in case that version-5 client wakes up
and needs to update), and an "8->21" packfile (for the version-8
client). That doesn't minimize the tahoe-side <em>storage</em> (since we're
storing multiple copies of the same data), but it does minimize the work
that downloaders have to do.</p>
<p>For shallow-history downloaders, we don't need the whole "0->21"
packfile, we just need a "21" packfile (one which contains the current
tree, but none of the history). So we'd need "5->21", "8->21", and "21".</p>
<p>But again, we're storing three copies of the latest tree, just to allow
that version-5 client to only fetch one tahoe object. And we aren't even
convinced that this stale client is going to show up any time soon. So
we can compromise by storing a "5->8" packfile, and an "8->21" packfile.
If we're supporting full-history clones, we'll also store a "0->5"
packfile, and new clones will need to grab three tahoe objects. If not,
we'll store a "21" packfile, and new clones only grab one.</p>
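<p>One way to sketch that packfile-selection policy, where a <code>(None, v)</code> span is a hypothetical marker for a history-free snapshot of version <code>v</code>:</p>

```python
def plan_packfiles(stale_versions, current, full_history):
    # spans are (from_version, to_version) packfiles; consecutive spans
    # bridge each stale client up to the current version
    points = sorted(set(stale_versions)) + [current]
    spans = list(zip(points, points[1:]))
    if full_history:
        spans.insert(0, (0, points[0]))   # new clones replay from the start
    else:
        spans.insert(0, (None, current))  # shallow snapshot for new clones
    return spans
```

<p>With stale clients at versions 5 and 8 and the current version at 21, this yields the "0-&gt;5", "5-&gt;8", "8-&gt;21" packfiles when full history is kept, and a "21" snapshot plus "5-&gt;8" and "8-&gt;21" when it isn't.</p>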
<p>The general approach will be for each new push to add a new packfile
(e.g. 21->22, then 22->23, etc). Live subscribers will need to fetch
each packfile anyways, to catch up quickly, so having them around is
efficient. But eventually the number of these small (one-change)
packfiles will grow uncomfortable, and the work needed for new clones
will grow (linearly with the number of changes). So every once in a
while, we'll want to consolidate or "repack", replacing all the
one-change packfiles with a single many-change packfile. If there are
any stale clients, this packfile should go from the least-stale client's
last-known version, to the current version. If there are no stale
clients, it should just contain the entire history (0->current).</p>
<p>This incurs a certain amount of overhead for the periodic consolidation,
causing <code>git push</code> to take longer than usual, and creating tahoe-side
garbage (as the old small-packfiles are removed from the tahoe
directory). It also requires the consolidation process to be aware of
the subscribing clients. Normal <code>git push</code> doesn't need this
knowledge, only the occasional repack. This should probably be driven by
the synchronization daemon rather than the git-remote helper itself.</p>
<h2>Other Optimization Targets</h2>
<p>Specific optimization goals would prompt the use of alternate (mutually
exclusive) strategies, each at the expense of other goals.</p>
<p>To optimize strictly for uploader bandwidth, we would upload a single
packfile per push, which contains just the new objects, expressed as
deltas against objects that are already present in the old commit. This
is also optimal for live subscribers, but new clones and stale clients
must fetch O(commits) objects to catch up, and the storage costs will
grow similarly.</p>
<p>To achieve a "direct representation" (where standard Tahoe CLI commands
can show you the git tree), each push must write a full copy of each
modified git object into a separate Tahoe object. This minimizes the
bytes fetched by random-access ("browsing") clients, but maximizes the
number of tahoe objects that must be downloaded (increasing the per-file
overhead). And it is seriously inefficient for regular subscribers.</p>
<p>To optimize strictly for storage consumed, we would have exactly one
packfile (with all history) at all times: each push replaces it with a
new (slightly larger) one. This is optimal for new clones, but
worst-case for live subscribers, who must fetch O(reposize) each time.
It's pretty bad for stale clients too, depending upon how stale they
are. And it creates tahoe garbage at a tremendous rate.</p>
<p>To optimize for stale subscribers, we would store one packfile per known
stale-client version (which brings that version up to the present). Live
subscribers are just "stale" at version N-1. This is fine for uploaders
if all subscribers are live, but gets worse as you add stale clients,
and even worse as those clients get more stale, something like
O(commits^2). Storing just the inter-client deltas is a reasonable
compromise. Occasional "repack" consolidation is probably a good idea.</p>
<h2>Existing Tools</h2>
<p>Why do this at all? Can we achieve our goals with existing tools?</p>
<p>Tahoe's reliability comes from redundancy: spreading the data across
multiple servers, so you can tolerate the loss of some of them. Git
natively lets you achieve plain replication, by just pushing your data
to multiple git servers. Tahoe uses erasure-coding, which gives a better
robustness-per-expansion ratio than plain replication. But it's fair to
argue that the complexity of using Tahoe is greater than the robustness
improvement we get from its erasure-coding.</p>
<p>Tahoe's security comes from encrypting all data before it leaves the
client, and integrity comes from including hashes in the filecaps.
<a href="https://spwhitton.name/tech/code/git-remote-gcrypt/">git-remote-gcrypt</a>
provides a remote-helper which encrypts data before sending it to a
remote git server. git-remote-gcrypt uses GPG and requires coordination
between GPG keys on all clients, in addition to having access rights to
a shared backend Git repository. I'm not a big fan of GPG, but in my
quick scan of the code, the approach looks sound. It appears to achieve
one-packfile-per-push, which is great.</p>
<p>It uses a single shared repository; however, Git provides the
equivalent of locking, so clients aren't in danger of losing data when
simultaneous writes happen. Encryption is not convergent, so using the
one-write-repo-per-client topology described above will cause a few
rounds of redundantly-encrypted identical plaintexts to bounce around
before things settle down. While the backend is a git repository, it
does not use history in the normal Git way: there is only one commit,
which is replaced wholesale on each push, so the ciphertext cannot be
replicated with a plain "git pull". You can create a mirror with "git
clone", but subsequent pulls cannot be merged because there's no common
history; you'd need something like "git fetch && git update-ref
master FETCH_HEAD" to make updates, and the git-fetch might not be able
to efficiently select a minimal set of objects to deliver.</p>
<p>Adding a new client with git-remote-gcrypt requires giving the new
client write access to the git repo, and ensuring data is encrypted to
the new client's GPG key. The simplest workflow is to give the new
client a shared SSH private key (for repo access) and a shared GPG
private key (for data decryption). It's also possible to copy the new
client's SSH pubkey to the server (.ssh/authorized_keys), and copy their
GPG pubkey to the other clients, but some old client must then upload a
new version before the new one can read anything. In contrast,
git-over-tahoe clients could be configured with just a dircap (assuming
Tahoe was already configured).</p>
<p><a href="https://git-annex.branchable.com/">git-annex</a> lets you store selected
git files in a separate location, rather than including them directly
inside the git repository. It has several modes, including one which can
put these files into Tahoe. However, the main git repo is neither
encrypted nor redundant. I'm still looking for a way to take advantage of
git-annex for this use case.</p>
<h2>Synchronization Daemon, Membership Management</h2>
<p>Once we have a git-over-tahoe tool implemented, with reasonable
performance and efficiency for our expected use cases, we'll need a
daemon to drive it. The purpose of using Git is to allow this daemon to
manage conflicts better: git can be configured to manage merges however
you like, including refusing to merge at all. Dropbox itself doesn't
merge anything, but instead shows you both copies (yours and theirs),
and you use renaming and deleting to express your preferred solution.
With Git you can look at all three copies (yours, theirs, and the common
ancestor), which gives you more information to work with.</p>
<p>This daemon will need a way to ask Tahoe to notify it about remote
changes in the other client's outbound directories (maybe a <code>tahoe
watch DIRCAP</code> command). It should use inotify/fsevents to watch the
local filesystem, to trigger a <code>git add/commit/push</code> cycle. This could
be made more accurate with a pass-through FUSE filesystem that can tell
when an application still has a file open for writing: nothing should be
committed to git until the save process is complete.</p>
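<p>As a rough illustration of the local half of such a daemon, the sketch below uses a polling loop where the real thing would use inotify/fsevents; the <code>tree_digest</code> helper is invented for this sketch, and the Tahoe-notification side (the hypothetical <code>tahoe watch</code>) is omitted entirely:</p>

```python
import hashlib, os, subprocess, time

def tree_digest(root):
    # hash every file's path, size, and mtime, so any change to the tree
    # changes the digest (a stand-in for real filesystem notifications)
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            h.update(f"{path}|{st.st_size}|{st.st_mtime_ns}\n".encode())
    return h.hexdigest()

def watch_and_push(repo, interval=5.0):
    # poll for changes, then run the git add/commit/push cycle
    last = tree_digest(repo)
    while True:
        time.sleep(interval)
        current = tree_digest(repo)
        if current != last:
            # a real daemon would also wait until no application still has
            # a file open for writing before committing
            subprocess.run(["git", "-C", repo, "add", "-A"], check=True)
            subprocess.run(["git", "-C", repo, "commit", "-m", "autosync"],
                           check=True)
            subprocess.run(["git", "-C", repo, "push"], check=True)
            last = current
```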
<p>I'll examine this daemon more in a later blog post.</p>SPAKE2 "random" elements2016-01-19T13:10:00-08:002016-01-19T13:10:00-08:00Brian Warnertag:www.lothar.com,2016-01-19:/blog/54-spake2-random-elements/<p>
SPAKE2 requires two special "arbitrary" constants M and N. What
properties do these constants really need? What attacks are possible if
these requirements are not met?
</p>
<p><a href="http://www.di.ens.fr/~pointche/Documents/Papers/2005_rsa.pdf">SPAKE2</a>,
like all PAKE ("Password-Authenticated Key Exchange") protocols, allows
two people to start with a weak password and then agree upon a strong
shared key …</p><p>
SPAKE2 requires two special "arbitrary" constants M and N. What
properties do these constants really need? What attacks are possible if
these requirements are not met?
</p>
<p><a href="http://www.di.ens.fr/~pointche/Documents/Papers/2005_rsa.pdf">SPAKE2</a>,
like all PAKE ("Password-Authenticated Key Exchange") protocols, allows
two people to start with a weak password and then agree upon a strong
shared key, despite active attackers getting in the way. There are a
variety of protocols in this family: SRP is probably the most well-known
(but has the weakest security proofs), and J-PAKE is the one we used in
the original Firefox Sync. But SPAKE2 is my current favorite: it's
simpler, faster, and has a better security reduction.</p>
<h2>How SPAKE2 works</h2>
<p>Assume the following notation: we have some group with generator (base
element) B, we use additive notation (so <code>B*x</code> instead of <code>g^x</code>),
lower-case letters are scalars, upper-case letters are elements, and <code>*</code>
represents scalar multiplication.</p>
<p>Now the basic exchange looks like this:</p>
<p>one-time setup:</p>
<ul>
<li>choose <code>M</code> and <code>N</code> as random group elements</li>
</ul>
<p>the protocol:</p>
<ul>
<li>Alice knows <code>pw1</code>, Bob knows <code>pw2</code>, hopefully they are the same</li>
<li>Alice chooses random secret scalar <code>x</code>, sends <code>X = B*x + M*pw1</code></li>
<li>Bob chooses random secret scalar <code>y</code>, sends <code>Y = B*y + N*pw2</code></li>
<li>Alice computes <code>Z1 = (Y-N*pw1)*x</code>, then <code>K1 = hash(X,Y,Z1,pw1)</code></li>
<li>Bob computes <code>Z2 = (X-M*pw2)*y</code>, then <code>K2 = hash(X,Y,Z2,pw2)</code></li>
</ul>
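<p>To make the flow concrete, here is a toy run of the exchange in Python, written multiplicatively (the additive <code>B*x</code> above becomes <code>pow(B, x, p)</code>). The tiny <code>p = 2q+1</code> group, the element-derivation helper, and the hash layout are all illustrative choices; the group is far too small for real use, and passwords are modeled as small integer scalars (a real implementation hashes the password string down to a scalar):</p>

```python
import hashlib, secrets

p, q, B = 2039, 1019, 4            # toy group: p = 2q+1, B = 2^2 generates
                                   # the order-q subgroup of squares

def elem(seed):
    # arbitrary group element whose discrete log nobody knows
    h = int.from_bytes(hashlib.sha256(seed).digest(), "big") % (p - 1) + 1
    return pow(h, 2, p)            # squaring clears the cofactor of 2

M, N = elem(b"spake2 M"), elem(b"spake2 N")

def inv(E, s):                     # E^(-s), valid within the order-q subgroup
    return pow(E, q - s % q, p)

def key(X, Y, Z, pw):              # transcript hash -> session key
    return hashlib.sha256(b"%d,%d,%d,%d" % (X, Y, Z, pw)).hexdigest()

pw1 = pw2 = 42                     # both sides hold the same password scalar
x = 1 + secrets.randbelow(q - 1)   # Alice's secret scalar
X = (pow(B, x, p) * pow(M, pw1, p)) % p        # X = B*x + M*pw1
y = 1 + secrets.randbelow(q - 1)   # Bob's secret scalar
Y = (pow(B, y, p) * pow(N, pw2, p)) % p        # Y = B*y + N*pw2
Z1 = pow((Y * inv(N, pw1)) % p, x, p)          # Alice: Z1 = (Y - N*pw1)*x
Z2 = pow((X * inv(M, pw2)) % p, y, p)          # Bob:   Z2 = (X - M*pw2)*y
K1, K2 = key(X, Y, Z1, pw1), key(X, Y, Z2, pw2)
assert K1 == K2                    # same password -> same session key
```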
<p>The promise of PAKE is that <code>K1</code> and <code>K2</code> will be the same <strong>if and only
if</strong> the <code>pw1</code>/<code>pw2</code> passwords were the same. If they were different,
then the keys are completely unrelated. For SPAKE2, this property stems
from <code>Z1</code> and <code>Z2</code>. The <code>N*pw2</code> term is a "blinding factor": it obscures the
Diffie-Hellman <code>B*y</code> element in transit. Alice unblinds the element
(computing <code>Y - N*pw1</code>, which equals <code>B*y</code> exactly when the passwords match)
before using it to compute <code>Z1</code>.</p>
<p>This means a passive attacker has no hope of figuring out the shared
key: from their point of view, all keys are equally likely, as are all
passwords.</p>
<p>An active attacker only gets one guess (or maybe two, depending upon how
you count). They make this guess by pretending to be Bob and running the
protocol as normal. If their <code>pw2</code> guess was right, they'll get the same
key as Alice, and they win: they know the password <em>and</em> the key. Then
they turn around and pretend to be Alice, using the successfully-guessed
password in a protocol run with Bob, which should succeed (with a
different key). Now that they know both session keys, they can MitM the
Alice-Bob connection like they would with traditional unauthenticated
Diffie-Hellman.</p>
<p>If their guess was wrong, the session keys are independent and
unrelated, and they won't be able to talk with Alice (or Bob) at all.
For each time that Alice or Bob is willing to run the protocol, they get
another guess.</p>
<p>(incidentally, both sides usually include some sort of identity string
in their transcript hashes, so an attacker can't splice together
unrelated sessions: Alice runs this protocol with a specific intent to
construct a session key for server "foo.com", and can't be confused by a
response that was replayed from an earlier session with "bar.com". With
the identities included, the key derivation becomes <code>K1 =
hash(idA,idB,X,Y,Z1,pw1)</code>.)</p>
<h2>Where do M and N come from?</h2>
<p>The original paper describes M simply as "an element in G associated
with user A". A different paper (Boneh) describes M and N as "randomly
chosen elements of G". While it's probably obvious to an experienced
student of cryptography (which I am not), it turns out that what really
matters is that <strong>nobody knows the discrete log</strong> of M and N. That is to
say, nobody knows a scalar <code>m</code> for which <code>M = B*m</code>, and likewise for N.</p>
<p>If you don't realize this, and you're struggling to figure out how
groups work anyway, you'd probably make the same mistake that I did, and
construct M by choosing an arbitrary scalar (just a large number, modulo
the group order <code>|G|</code>), and scalar-multiply it by the base point. My
first implementation used <code>11*B</code> and <code>12*B</code>, which seemed sufficiently
arbitrary to me :-). When I showed it to Mike Hamburg, he kindly pointed
out the necessary properties of M and N, and I eventually figured things
out. You can't start with a scalar and multiply your way to an element:
you must somehow start with an element.</p>
<p>The traditional way to prove that nobody knows the scalar is to hash
some simple string (with limited wiggle room) and somehow convert the
hashed output into an element. Popular strings include pi, e, sin/cos
functions, and the names of the parameters themselves. The nominal
argument is that you'd have to tamper with the fundamental constants of
the universe to have enough control over the output to steer it towards
an element for which you already knew the discrete log.</p>
<p>It turns out to be much easier to choose an arbitrary element in an
integer group, like <code>Zp*</code>: you treat the hash output as a random member
of 0..p-1, then just raise it to <code>(p-1)/q</code> (since <code>q</code> is the order of
the subgroup, <code>(p-1)/q</code> is the "cofactor"). The result will be in the
right group and will be just as uniformly distributed as the hash output
itself. There's an example
<a href="https://github.com/warner/python-spake2/blob/v0.3/spake2/groups.py#L132">here</a>.</p>
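<p>The integer-group version really is only a few lines; the tiny <code>p = 2q+1</code> group below is an illustrative stand-in for the large MODP groups the linked code actually uses:</p>

```python
import hashlib

p, q = 2039, 1019                  # toy Schnorr group: p = 2q+1
cofactor = (p - 1) // q            # here just 2

def arbitrary_element(seed: bytes) -> int:
    # hash to a random-looking member of 1..p-1, then raise to the cofactor
    h = int.from_bytes(hashlib.sha256(seed).digest(), "big") % (p - 1) + 1
    elem = pow(h, cofactor, p)     # result now lies in the order-q subgroup
    assert pow(elem, q, p) == 1    # sanity check: element's order divides q
    return elem

M = arbitrary_element(b"M")        # an element with unknown discrete log
```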
<p>For elliptic curves, you must turn the hash output into an x (or y)
coordinate, recover the other coordinate (giving you a point on the
right curve, but not necessarily in the right group), then either check
the group order or just multiply (known as "clearing the cofactor").
<a href="https://github.com/warner/python-spake2/blob/v0.3/spake2/ed25519_basic.py#L269">Here</a>
is the function which does this in my
<a href="https://github.com/warner/python-spake2">python-spake2</a> implementation.</p>
<h2>Why must M and N be random?</h2>
<p>A thing that puzzled me up until now was why, exactly, it was so
important that nobody knows the discrete logs for these constants. I
knew there was some sort of attack possible, but I couldn't figure out
the details. Mike Hamburg pointed me in the right direction in
<a href="https://moderncrypto.org/mail-archive/curves/2015/000424.html">his discussion</a>
of "SPAKE2 - Elligator Edition" on the moderncrypto.org
<a href="https://moderncrypto.org/mailman/listinfo/curves">curves mailing list</a>.
But I didn't work out the attack until just recently.</p>
<p>Here's the deal: if Mallory (our active attacker) can pretend to be Bob
for the duration of the protocol, then later she can mount an offline
dictionary attack against the password that Alice used. In particular,
Mallory can construct a function that converts a potential password into
a potential session key, and this function can be run without further
interaction with Alice and Bob. As long as she has some way to check
whether a potential session key is correct or not, she can feed a list
of common passwords into the function and quickly identify which one (if
any) was correct. This violates PAKE's promise of security: an active
attacker is supposed to be limited to just one online guess.</p>
<p>Mallory almost certainly has a way to observe <em>some</em> use of the session
key, which will give her a way to test her guesses. Maybe Alice
immediately sends a key-confirmation message (a simple hash of the key)
so Bob can tell whether the PAKE succeeded or not: then Mallory just
hashes the potential key and sees if it matches the confirmation
message. Or if Alice uses the key for authenticated encryption, and
Mallory gets to see the ciphertext, then trial decryption of that
message lets her test each guess. Because this test is <strong>offline</strong>, the
attacker can test guesses of <code>K1</code> as frequently as she likes.</p>
<p>The math looks like this. Suppose that the attacker knows that <code>N = B*n</code>
(<code>n</code> being the discrete log of the no-longer-arbitrary point <code>N</code>).</p>
<p>Now, when our attacker Mallory pretends to be Bob, she picks a random
<code>y</code> and sends <code>Y = B*y</code>, omitting the <code>N*pw2</code> blinding term
entirely. Mallory receives <code>X = B*x + M*pw1</code> from Alice. Note that <code>B*x
= X - M*pw1</code>.</p>
<p>Alice will then compute <code>Z1 = (Y-N*pw1)*x</code>, which is really
<code>(Y-B*n*pw1)*x</code>, which is really <code>(B*y-B*n*pw1)*x</code>, which (since scalar
multiplication is associative) is really <code>(y-n*pw1)*B*x</code>, which means
<code>Z1=(y-n*pw1)*(X-M*pw1)</code>.</p>
<p>Note that every term in that final equation is known to Mallory except
the password <code>pw1</code>: she picked <code>y</code> herself, <code>n</code> is the discrete log of
<code>N</code>, and <code>X</code> was given to her by Alice. Mallory has all the time in the
world to try various values of <code>pw1</code>, compute a potential <code>Z1</code>, and test
the resulting session key <code>K1</code> until she finds the right password.</p>
<p>This only works if Mallory can factor out the <code>B*something</code> in Alice's
<code>Z1</code> computation, which is why she needs the discrete log of <code>N</code> to pull
it off.</p>
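<p>The attack is easy to demonstrate in the same kind of toy multiplicative group (all parameters illustrative, passwords again modeled as small integer scalars). The flaw is injected deliberately: <code>N</code> is built from a known discrete log <code>n</code>, while <code>M</code> is derived safely:</p>

```python
import hashlib, secrets

p, q, B = 2039, 1019, 4            # toy group: p = 2q+1

def elem(seed):                    # safe element: discrete log unknown
    h = int.from_bytes(hashlib.sha256(seed).digest(), "big") % (p - 1) + 1
    return pow(h, 2, p)

def inv(E, s):                     # E^(-s) in the order-q subgroup
    return pow(E, q - s % q, p)

def key(X, Y, Z, pw):
    return hashlib.sha256(b"%d,%d,%d,%d" % (X, Y, Z, pw)).hexdigest()

M = elem(b"spake2 M")
n = 7                              # the flaw: Mallory knows N = B*n
N = pow(B, n, p)

# Alice runs the protocol honestly with her real password
pw1 = 42
x = 1 + secrets.randbelow(q - 1)
X = (pow(B, x, p) * pow(M, pw1, p)) % p        # X = B*x + M*pw1
# Mallory, posing as Bob, sends an unblinded Y = B*y
y = 1 + secrets.randbelow(q - 1)
Y = pow(B, y, p)
# Alice computes her session key as usual
Z1 = pow((Y * inv(N, pw1)) % p, x, p)          # Z1 = (Y - N*pw1)*x
K1 = key(X, Y, Z1, pw1)

# Offline, with no further interaction: Z1 = (y - n*pw)*(X - M*pw)
def guess_key(pw):
    Zg = pow((X * inv(M, pw)) % p, (y - n * pw) % q, p)
    return key(X, Y, Zg, pw)

cracked = [pw for pw in range(100) if guess_key(pw) == K1]
assert cracked == [42]             # only Alice's real password matches
```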
<p>Of course, if <code>N</code> were chosen safely, but <code>M</code>'s discrete log were known,
then Mallory would pretend to be Alice instead of Bob. Hamburg pointed
out that if you can constrain the order of the messages, and have Alice
prove herself to Bob first, then it's safe to drop <strong>one</strong> of the
blinding factors entirely (send <code>B*x</code> instead of <code>B*x + N*pw</code>). But if
the protocol isn't constrained this way, you must have both.</p>
<h2>Weaker things</h2>
<p>M and N might not be completely independent, but still nobody knows the
discrete log of them. For example, maybe:</p>
<ul>
<li>we know that <code>N = M + B*k</code> for some known constant <code>k</code> (hence <code>n</code> is
<code>m</code> <strong>plus</strong> <code>k</code>)</li>
<li>we know that N == M (i.e. <code>k=0</code>)</li>
</ul>
<p>I don't think these result in attacks, but I'm still looking for papers
to prove it. I'm told that the
<a href="http://eprint.iacr.org/2003/038.pdf">2003 Kobara/Imai paper</a> proves
this, and Mike told me that M==N yields a proof that reduces to the
CDH-Squared problem instead of the usual CDH (Computational
Diffie-Hellman) problem, but that they're basically the same thing.</p>
<p>I especially want M==N to work because that makes the message flows in
<a href="https://github.com/warner/magic-wormhole">magic-wormhole</a> easier. If M
and N are distinct, then the two sides need to agree (ahead of time)
which role they're going to play. If M==N then the protocol is much more
symmetric, and the humans constructing offline wormhole codes don't need
to choose sides as well.</p>
<p>I use the M==N form in python-spake2's
<a href="https://github.com/warner/python-spake2/blob/v0.3/spake2/spake2.py#L209">SPAKE2_Symmetric</a>
class, and in
<a href="https://github.com/warner/magic-wormhole/blob/0.6.2/src/wormhole/blocking/transcribe.py#L284">magic-wormhole</a>
itself.</p>
<h3>Acknowledgments</h3>
<p>Many thanks to Mike Hamburg for pointers and feedback on this post, and
to Prof. Dan Boneh for telling me about SPAKE2 in the first place.</p>Petmail mailbox-server delivery protocol2015-07-25T16:38:00-07:002015-07-25T16:38:00-07:00Brian Warnertag:www.lothar.com,2015-07-25:/blog/53-petmail-delivery/<p><a href="https://github.com/warner/petmail">Petmail</a> senders use a
"<a href="https://github.com/warner/petmail/blob/48a712d8b0b6556dd608fbcb1d05178270ef3a8f/docs/mailbox.md">Mailbox server</a>"
to queue encrypted messages when their recipient is offline (and even when
they aren't). The recipient might pick up the message right away, or might
not learn about it until later. These mailboxes need a way to tell whether
they should spend their precious …</p><p><a href="https://github.com/warner/petmail">Petmail</a> senders use a
"<a href="https://github.com/warner/petmail/blob/48a712d8b0b6556dd608fbcb1d05178270ef3a8f/docs/mailbox.md">Mailbox server</a>"
to queue encrypted messages when their recipient is offline (and even when
they aren't). The recipient might pick up the message right away, or might
not learn about it until later. These mailboxes need a way to tell whether
they should spend their precious disk space to queue an incoming message, or
if it's just unwanted spam. The "delivery protocol" must convey enough
information for the mailbox server to make this decision.</p>
<p>We have several potential security goals for this delivery protocol,
categorized as follows (in each case, "0" is the best):</p>
<p>For the sender S:</p>
<ul>
<li>S0: two different senders cannot tell if they're talking to the same
recipient or not</li>
<li>S1: they can, by comparing their keys and delivery tokens.</li>
</ul>
<p>(when I started Petmail, I thought S0 was important, but I've since changed
my mind, and these days I'm not trying so hard to achieve it)</p>
<p>For the mailbox server M:</p>
<ul>
<li>M0: the mailbox server cannot tell which message came from which sender,
not even that two messages came from the same sender, nor can it determine
how many senders might be configured for each recipient</li>
<li>M1: the server cannot correlate messages with senders (or each other), but
<strong>is</strong> able to count (or at least estimate) how many senders there are per
server</li>
<li>M2: the server can correlate messages with each other, or with a specific
(pseudonymous) sender, and by extension can count senders too</li>
</ul>
<p>For the recipient R:</p>
<ul>
<li>R0: the recipient can use the transport information to accurately identify
the sender</li>
<li>R1: they cannot: the recipient depends upon information not visible to the
mailbox server to identify the sender, which means a legitimate (but
annoying) sender could flood the server without revealing which sender they
are</li>
</ul>
<p>And the revocation behavior:</p>
<ul>
<li>Rev0: R can revoke one sender without involving the remaining ones</li>
<li>Rev1: if R revokes one sender, they must somehow update all other senders
with new information</li>
</ul>
<p>Other design criteria include the amount of state that must be managed by the
mailbox for each recipient, and the computational and cryptographic
complexity of the protocol.</p>
<h2>Security Properties of Existing Delivery Protocols</h2>
<p>For example, a simple no-frills non-anonymous protocol would publicly sign
each message with a constant (per-sender) signing key, and would include a
constant per-recipient queue id. The mailbox server could hold a list of
approved senders for each recipient. This would achieve "S1 M2 R0 Rev0":
minimal anonymity, but very easy to implement, and revocations are trivial
(just remove the bad sender from the approved list).</p>
<p>Petmail's current protocol uses re-randomizable delivery tokens (the
implementation is currently a simulated stub, but the math is pretty easy to
do properly). This uses an encryption scheme in which anyone with the right
pubkey can re-encrypt a token into a new one, and nobody can correlate the
two except for the privkey holder, who can decrypt both to the same
plaintext. The mailbox server allocates the plaintext token for each
recipient, and holds the private key. The recipient randomizes a new token
for each sender, and the sender re-randomizes their token for each message.
This achieves "S0 M0 R1 Rev1". The annoying "R1" means that the recipient
depends upon a second field, encrypted out of reach of the mailbox server, to
determine who sent the message. This means that a malicious sender can flood
the mailbox with messages the recipient cannot attribute to a specific
sender, so R won't know which S to revoke. Apart from this defect, the
anonymity properties are ideal, and the implementation complexity is
moderate. The mailbox state for each recipient is minimal (one token, one
private key). However the revocation process requires allocating a new token
and updating the remaining senders, which is racy and can deanonymize senders
who are blocked from hearing about the new token.</p>
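<p>For concreteness, ElGamal over a cyclic group is one scheme with exactly this re-randomization property. This is a sketch rather than Petmail's actual (currently stubbed) math, and the tiny <code>p = 2q+1</code> group is illustrative only:</p>

```python
import secrets

p, q, g = 2039, 1019, 4            # g generates the order-q subgroup

priv = 1 + secrets.randbelow(q - 1)   # mailbox server's private key
pub = pow(g, priv, p)

def encrypt(m):                    # m is a group element (the token plaintext)
    r = secrets.randbelow(q)
    return (pow(g, r, p), (m * pow(pub, r, p)) % p)

def rerandomize(token):            # anyone holding pub can do this...
    c1, c2 = token
    r = secrets.randbelow(q)
    return ((c1 * pow(g, r, p)) % p, (c2 * pow(pub, r, p)) % p)

def decrypt(token):                # ...but only the privkey holder links them
    c1, c2 = token
    return (c2 * pow(c1, q - priv, p)) % p   # c2 / c1^priv

m = pow(g, 123, p)                 # the server-allocated plaintext token
t1 = encrypt(m)                    # recipient randomizes a token per sender
t2 = rerandomize(t1)               # sender re-randomizes it per message
assert decrypt(t1) == decrypt(t2) == m   # same plaintext, unlinkable outside
```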
<p><a href="https://pond.imperialviolet.org/">Pond</a>'s original protocol used BBS group
signatures, and achieves "S1 M0 R0 Rev1". The "S1" results from each sender
getting the same (group) public key, so senders can trivially compare keys to
confirm that they're talking to the same recipient. The improved "R0" results
from the group signatures: the same signature that allows the mailbox to
confirm group membership also allows the recipient to identify the specific
sender, so they know which sender needs to be revoked. Unfortunately the
cryptographic complexity is higher (fancy math), and the revocation story is
similarly tricky. I've heard that Pond will abandon the group signature
scheme in favor of a simpler "stash of tokens" approach, but I haven't seen
any details.</p>
<h2>Delivery+Decrypt Tokens</h2>
<p>Recently, on the
<a href="https://moderncrypto.org/mail-archive/messaging/">messaging</a> list, we've
<a href="https://moderncrypto.org/mail-archive/messaging/2015/001769.html">discussed</a>
how the delivery identifiers could interact with forward-security key
rotation. The simplest scheme I can think of would assume that keypairs are
cheap (yay Curve25519!) and replace an interactive two-key ratchet (updated
once per roundtrip) with an (interactive) explicit list of single-use
pubkeys. It would look like this:</p>
<ul>
<li>Recipient maintains a set of a few thousand Curve25519 keypairs. For each
one, they derive privkey -> pubkey -> HMAC key -> HKID (mostly by hashing),
and remember a table that maps HKID->(senderid, privkey). Each is
single-use, and the recipient creates more to replace them as they get used up.</li>
<li>Recipient gives the HMAC keys to the Mailbox server, keeping it up-to-date
as new ones are created. M maintains a table mapping HKID->(HMAC key,
recipient).</li>
<li>Recipient gives some pubkeys to each sender, keeping them stocked with
maybe 20 at a time.</li>
<li>Each time the sender creates a message, they derive the HMAC key and HKID,
encrypt their message (with an ephemeral Curve25519 keypair, attaching the
ephemeral pubkey to the message), append the HMAC tag, prepend the HKID,
then send the result to the mailbox server. The mailbox looks up the HKID
to get the HMAC key, validates the HMAC, and enqueues the whole message to
the recipient. The recipient fetches the queued messages, looks up the HKID
to find the sender and privkey, derives and validates the HMAC, then
decrypts the message and destroys the (HKID, privkey) pair.</li>
</ul>
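<p>A sketch of the bookkeeping half of this scheme: the Curve25519 operations are replaced by hash-based placeholders (a real implementation would derive the pubkey by scalarmult and actually encrypt the body), but the derivation chain, field sizes, and message layout follow the description above:</p>

```python
import hashlib, hmac, secrets

def derive(pubkey: bytes):
    # stand-in for the pubkey -> HMAC key -> HKID derivation chain
    hmac_key = hashlib.sha256(b"hmac:" + pubkey).digest()[:16]   # 128 bits
    hkid = hashlib.sha256(b"hkid:" + hmac_key).digest()[:4]      # 32 bits
    return hmac_key, hkid

# recipient: make a keypair, register the HMAC key with the mailbox server
privkey = secrets.token_bytes(32)
pubkey = hashlib.sha256(b"pub:" + privkey).digest()  # placeholder scalarmult
hmac_key, hkid = derive(pubkey)
mailbox_table = {hkid: hmac_key}                     # HKID -> HMAC key
recipient_table = {hkid: (b"sender-alice", privkey)} # HKID -> (senderid, priv)

# sender: wrap one message using a pubkey the recipient handed out earlier
def sender_wrap(pubkey: bytes, ciphertext: bytes) -> bytes:
    hmac_key, hkid = derive(pubkey)
    tag = hmac.new(hmac_key, ciphertext, hashlib.sha256).digest()
    return hkid + ciphertext + tag                   # HKID | body | tag

# mailbox: look up the HKID, validate the tag, then queue or drop
def mailbox_accept(msg: bytes) -> bool:
    hkid, body, tag = msg[:4], msg[4:-32], msg[-32:]
    key = mailbox_table.get(hkid)
    if key is None:
        return False                                 # unknown token: drop
    return hmac.compare_digest(tag, hmac.new(key, body, hashlib.sha256).digest())

msg = sender_wrap(pubkey, b"opaque ciphertext bytes")
assert mailbox_accept(msg)
```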
<p>This protocol achieves "S0 M1 R0 Rev0". We fail to get "M0" because the
mailbox can count outstanding tokens to estimate how many senders can send to
each recipient (although we can hide moderate numbers of senders pretty
easily). But otherwise it's ideal. The forward-security window (during which
a recipient state compromise reveals old messages) is absolutely as small as
possible: each private key can be destroyed as soon as the message is
received/decrypted. Revocation is trivial (just stop making new tokens), and
can probably be sped up by cancelling the outstanding ones. Traffic whitening
(to make each message uniformly random) could be achieved with Elligator, or
by deriving a symmetric key from each token and using it to encrypt the
ephemeral pubkey.</p>
<p>One downside is the storage space that mailbox servers must dedicate to these
tokens. Imagine 1000 recipients using the same server, each of which has 500
senders, with 20 tokens each: that's 10M tokens. But the HMAC keys don't need
to be particularly strong, as they're protecting server space, not
confidentiality. And we can afford a few collisions in the HKID values too,
since the mailbox can trial-verify against multiple tokens. 128-bit HMAC keys
and 32-bit HKID values give you 20 bytes per token (and one HKID collision
per 1000 messages), so 10M tokens would require 200MB of space on the mailbox
server, plus the actual messages that it's getting paid to queue.</p>
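<p>The arithmetic, as a quick check of that estimate:</p>

```python
# back-of-envelope check of the mailbox storage estimate above
recipients, senders_per_recipient, tokens_per_sender = 1000, 500, 20
tokens = recipients * senders_per_recipient * tokens_per_sender
bytes_per_token = 16 + 4             # 128-bit HMAC key + 32-bit HKID
total_bytes = tokens * bytes_per_token
assert tokens == 10_000_000
assert total_bytes == 200_000_000    # 200MB, before the queued messages
```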
<p>Using HMAC (instead of Ed25519) means the mailbox server can create its own
tags and mix-and-match messages with tokens, but they'll just be dropped when
the subsequent Curve25519 decrypt fails, and M already has the ability to
destroy arbitrary messages. It might be appropriate for R to ignore the HMAC
tag, and just treat HKID as a hint to avoid expensive trial-decryption.</p>
<p>Another minor downside is the interaction necessary to update both senders
and the mailbox server. We should probably amortize the updates: deliver
tokens to the mailbox in batches of 100, and to senders in batches of 5. Each
message should tell the receiving agent how many tokens the sender has left
(to tolerate lost messages better), and we might want a special mailbox
channel (not managed by tokens) to simply say "help! send more tokens!".</p>
<p>But the most significant downside is the "fail-stop" behavior of an offline
recipient. If your agent doesn't pick up messages frequently enough, a busy
sender can run out of tokens, and then they can't send you new messages until
you collect the old ones (and deliver more tokens). This behaves more like a
full voice-mail-box than an email server (except the limit is per-sender
rather than per-recipient). It's not necessarily a bad constraint on
individual human senders, but would be particularly annoying for aggregated
mailing lists or automated/machine-generated messages.</p>
<h2>Design Choices</h2>
<p>I think the last issue is unavoidable for protocols that use per-message
tokens. We basically have three design options, each with its own problems.
The mailbox server can either recognize:</p>
<ul>
<li>one thing per message (tokens): fail-stop on exhaustion</li>
<li>one thing per sender (public signatures): not sender-anonymous</li>
<li>one thing per recipient (group signatures or re-randomizable tokens):
hard to revoke a sender</li>
</ul>
<p>Or recognize nothing, accept all messages, and be vulnerable to DoS attacks
(not exactly spam, since the end user never sees it, but there won't be
server space left for the desired messages).</p>
<p>I think I'm going to prototype the one-token-per-message approach for Petmail
and see how it works out. I kind of like the current re-randomizable tokens
scheme, but I'm a bit worried about the unidentifiable junk-mail problem, and
per-message tokens are the only clean way I can see to avoid it.</p>
<p>This is definitely very "chatty": lots of little messages are being sent, in
the interests of preventing the wrong big messages from being delivered (as
well as providing forward-security).</p>Anonymity, Pseudonyms, and Linkability2015-06-01T14:10:00-07:002015-06-01T14:10:00-07:00Brian Warnertag:www.lothar.com,2015-06-01:/blog/52-linkability/<h2>Linkability</h2>
<p>
Communication systems can be evaluated on how much <em>linkability</em> they offer (or attempt to conceal) between different aspects of one's identity.
</p>
<p>At one extreme, a face-to-face conversation with your long-term friend provides extremely strong linkability between two things. One is your conception of them: the idea in your head …</p><h2>Linkability</h2>
<p>
Communication systems can be evaluated on how much <em>linkability</em> they offer (or attempt to conceal) between different aspects of one's identity.
</p>
<p>At one extreme, a face-to-face conversation with your long-term friend provides extremely strong linkability between two things. One is your conception of them: the idea in your head that points at them, something like "my old college friend Phil", along with all the feelings and memories you associate with them, the sound of their voice, the image of their face, and the tenor of their thoughts. The other is the words that you're hearing from them. You can be really confident that <em>those</em> words, and no others, are being produced by <em>that</em> person, and no other.</p>
<p>At the other extreme, a typewritten note in machine-translated english on untraceable paper, slipped under your door while nobody was looking and you were away on vacation for a month, is pretty much unlinkable to anything else. And if you're the sort of person who regularly receives such messages (perhaps you're a detective or journalist known for accepting anonymous tips), then a second one arriving under the door might or might not have anything to do with the first. We might say the message is unlinked even from the method of delivery.</p>
<h2>Identity and Pseudonyms</h2>
<p>As a basic goal, we usually want the recipient to be able to link each incoming message to some sender's "identity", which includes such notions as:</p>
<ul>
<li>how did we meet?</li>
<li>our reasons for talking to them</li>
<li>private things we think about them</li>
<li>who do we think they are?</li>
<li>our conversation history: the messages we've exchanged back and forth with them</li>
</ul>
<p>This history includes the correspondent's claimed name: frequently, the first message we receive from a new sender starts with "Hi, my name is XYZ". The French equivalent, "Je m'appelle XYZ", is literally translated as "I call myself XYZ".</p>
<p>Perhaps we believe that other people are conversing with the same person; the use of cryptographically signed messages might even prove it. In this case, we can add another piece:</p>
<ul>
<li>what other people have told us about them</li>
</ul>
<p>All the other forms of a "name" fall into this category. When Brian tells you "I have this friend named Zooko, I think you'd find him really interesting", this creates a slot in your brain that's labeled "the guy that Brian calls Zooko", and bound with a note that says "Brian thinks you'd find him really interesting". And if you're lucky enough to meet Zooko, this mental database row will eventually get bound to your future interesting conversations with him.</p>
<p>This includes ideas like "official identity". A drivers license binds together a picture, a name, a birthdate, a home address, and the implied-positive results of a driving test, with all the strength (or lack thereof) of a large-scale overworked bureaucratic government department with the moral and legal authority to throw counterfeiters in jail. </p>
<p>We almost always label these collections of identity notions with a name of some sort. People we meet in IRC chat rooms tend to get labeled with a screen name, then the entries get populated with our observations of their personalities and skills. Later, if we meet them in real life, we might bind this identity to a face and a voice, or with a "real name".</p>
<p>If this name is carefully never associated with a "real world" identity, and is cultivated to give it a long-term life of its own, we call it a "pseudonym".</p>
<p>And notably, the same human being might be bound to multiple identities, even within the same "identity database". You might read John's blog post that he prefaces with "speaking only for myself, not on behalf of my employer", and then later see him cited as the spokesperson for the organization in a more official communication. These are, at least nominally, two separate identity "slots" for the same person, cross-referenced but ideally distinct.</p>
<h2>Flavors of Anonymity</h2>
<p>"Sender Anonymity" frequently means that the recipient of a message doesn't know very much about the person who sent it. An "anonymous tip" is the usual example: the tipster sends a message (reporting a crime) to a well-known figure (the police). Everyone knows who the recipient is, but the system hides the sender's public identity to protect them from reprisals. This is what most systems mean when they claim to provide some sort of anonymity.</p>
<p>"Receiver Anonymity" means the <em>sender</em> doesn't know who they're sending the message <em>to</em>, even though the recipient might learn who the message is coming <em>from</em>. It rarely means <strong>no</strong> knowledge: yelling at random strangers is not a conversation per se. More commonly the sender has some vague idea or concept of "who" they're talking to (like "the people who run Wikileaks", or "the author of that intriguing-yet-scandalous essay I read yesterday", or Batman), but is unable to learn more conventional aspects of the recipient's identity like a birth name, address of residence, name of employer, IP address, or a list of what other scandalous essays they've written under different pseudonyms. Any of these might be used to harm the recipient (even when the IP address doesn't reveal a location, it subjects them to a DoS attack).</p>
<p>Receiver anonymity is less common. One example is when the police televise an appeal on the local news, asking for "anyone who has information on this crime to please call XYZ". Anyone watching TV that night will get the message, and the police won't know who they are (until they call). Commissioner Gordon doesn't know where Batman lives, but when he shines a bat-shaped spotlight into the clouds, the caped crusader will get the message.</p>
<p>These examples hint at a deeper property: to use either form of anonymity in a practical (bidirectional) communication system, you must generally have a "reverse channel" of some sort. Sender anonymity without a way to get messages back to the sender is pretty unsatisfying: it might be useful for delivering praise or anger, but not for having an actual conversation. Receiver anonymity without a response pathway would leave Batman wondering who, exactly, needs his help: fortunately for the comic books, there's only one Bat-Signal.</p>
<p><a href="https://www.torproject.org/">Tor</a>, because it is low-latency and circuit-based, makes this reverse channel easy, at least while the sender is delivering a message. High-latency mix networks typically do not provide such a reverse channel, requiring nymservers and other tools to enable two-way communication. And achieving both sender- and receiver- anonymity, at the same time, needs even more work. More on this in a later blog post.</p>
<p>Finally, note that it is rare for receiver-anonymous systems to include confidentiality too: if you can't name the intended recipient, it's hard to hide your message from everyone else. All of Gotham knows when the Bat-Signal is activated. But this <em>can</em> be achieved if the recipient's public identity includes a public encryption key. In the ideal system, this pubkey is the only thing we learn about the recipient.</p>
<h2>Relationship Hiding</h2>
<p>One other aspect of identity is the relationship: who else are they talking to? This might be revealed by reading the headers or routing information from a message, or by watching the network traffic of one or both ends to see where the packets are going (and then associating an IP address with other aspects of their identity). Even if you run all your connections over Tor, there are other subtle clues available to the attacker: correlation between timing of network activity (without IP addresses), public behavior provoked by information being revealed (imagine a <a href="http://tvtropes.org/pmwiki/pmwiki.php/Main/FeedTheMole">subtle disinformation campaign</a> used to reveal a leak), or merely the act of using a computer at the same time as someone else.</p>
<p>Some of these attacks can reveal a hidden correspondent ("who is Alice talking to?"), others are better at confirming or denying the potential connection between two known correspondents ("is Alice talking to Bob?"). And some require access to network traffic in multiple places: the two participants' computers, intermediate servers, or other potential correspondents.</p>
<p>And if intermediate servers are used (perhaps to queue inbound messages while the recipient is offline), there may be administrative or financial connections between the recipient and the server that could be revealed.</p>
<h2>Unlinkability is Hard</h2>
<p>Unlinkability is a noble goal, but it's hard. Most of the secure messaging systems currently in development don't even try (with <a href="https://pond.imperialviolet.org">Pond</a> being the notable exception).</p>
<p>Running all connections over Tor (and using Hidden Services for listening sockets) hides IP addresses from direct observers. But it costs: higher setup complexity, increased connection latency, and reduced throughput. We need to ship a Tor binary in our packages, or require users to install one on their own. It also excludes some interesting use cases that I want to explore, like publishing "one-to-everybody" data to the entire world.</p>
<p>No (practical) systems are secure against a "global passive adversary", but it's possible to hide timing correlations against weaker eavesdroppers who can only observe a subset of the machines. Pond does this by using randomized constant-rate access patterns for message delivery and retrieval. The downside is that it's very slow: Pond sends/receives an average of 16KiB every 5 minutes, which is 55 <em>bytes</em> per second. That's fine for email (I certainly can't type any faster than that), but would be frustrating for larger uses (Pond has a separate scheme for large attachments, but it's clearly marked as providing less anonymity).</p>
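That throughput figure is easy to sanity-check in a couple of lines:

```python
# Sanity check of Pond's constant-rate throughput: 16 KiB every 5 minutes.
KIB = 1024
message_bytes = 16 * KIB      # one fixed-size exchange
interval_seconds = 5 * 60     # every 5 minutes

rate = message_bytes / interval_seconds
print(f"{rate:.1f} bytes/second")  # prints "54.6 bytes/second", i.e. ~55 B/s
```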
<p>Using a recipient-side "mailbox" server presents another challenge. You need one to let you receive messages even when your computer is offline, but <em>somebody</em> has to manage (and pay for) that server. You can run or rent your own server, but buying server time or rack space anonymously is not very easy. Borrowing space from a friend implies a link between you two, which could be traced. AGL offers a Pond server for free use, but that won't scale if it becomes popular.</p>
<p>The final challenge lies in the invitation protocol by which two new users connect for the first time. This requires a server too, and timing correlations between key-exchange messages can be used to link users before they've sent their first real message.</p>
<h2>What To Include In Petmail?</h2>
<p>I've been trying to decide what sorts of anonymity to include in Petmail. I like the deployment path of a locally-hosted web-app, which basically rules out Tor (until it gets embedded directly into the browser, wouldn't that be cool?). That would let us leverage the WebRTC code that's in modern browsers to establish point-to-point connections between online agents for file-transfer and realtime chat, but that reveals IP addresses all over the place.</p>
<p>On the other hand, using Tor gets you NAT-traversal for free (via hidden services), and it's at least theoretically possible to bundle a Tor daemon along with a larger app (instead of asking the user to install one themselves). <a href="https://txtorcon.readthedocs.org/en/latest/">txtorcon</a> makes it pretty easy to launch and publish a hidden-service port.</p>
<p>I'm not sure where Petmail will go. It's unfortunate that unlinkability (like security) really needs to be designed in up-front. It's not the kind of thing you can bolt on later, so it has the potential to hold up the rest of a project. I may hack on the other communication features first, to get a sense of what the possibilities are, and then circle back around to anonymity issues (and redesign a lot of stuff).</p>
<p>(many thanks to Nick Sullivan for reviews and suggestions)</p>
<h1>Petmail, an introduction</h1>
<p>2014-09-28, by Brian Warner</p>
<p><a href="https://github.com/warner/petmail">Petmail</a> is a secure-communications project I've been noodling at for a couple of months now. To be honest, I guess I've been noodling at it for over a decade: this latest effort is really a reboot of a <a href="http://petmail.lothar.com/">project</a> that I did ten years ago, and <a href="http://petmail.lothar.com/CodeCon04/index.html">presented</a> (<a href="https://archive.org/download/codecon2004audio/CodeCon_2004-02-21_4.mp3">audio</a>) at a conference named <a href="http://web.archive.org/web/20110722174725/http://www.codecon.org/2004/">CodeCon04</a> (which was also the venue for one of the early Tor presentations). My original Petmail was about spam-resistant secure email-like messaging.</p>
<p>I decided to re-use the name because my latest project has been converging with that old one. I'm not paying so much attention to the spam problem this time, but it's still about establishing a cryptographic connection between two people's user agents, and using that connection for messaging of various sorts.</p>
<h2>What (the heck is Petmail)?</h2>
<p>First off, Petmail is still just a testbed: you can't actually do anything with it yet. Even if it passes the unit tests on your computer, it will probably steal your dog and eat all your ice cream. Don't give it the chance. If you foolishly checked out the repo from <code>https://github.com/warner/petmail.git</code> , my best advice to you is to delete it before it grows. You've been warned.</p>
<p>But, what Petmail aspires to be (some day) is a secure communication tool. By "communication" I mean to start with moderate-latency mostly-text point-to-point queued message delivery (like email). Then I want to add low-latency conversations (like IM), and some form of group messaging.</p>
<p>I'd also like to include file sharing, in several modes. The simplest is share-with-future-self (aka "backup"). The next is share-with-other-self (aka Dropbox). Then there's share-with-other (what most people think of as "file sharing"). I might try to incorporate share-with-world ("publish"), eventually.</p>
<p>By "secure", I mean that at the very least, an eavesdropper does not get to learn the contents of your conversations or that of the files you are storing/sharing. I also want forward-security, meaning that if your computer (and all of its secrets) gets stolen tomorrow, then we can limit what the thief learns about your conversations from yesterday. Deleting a message from your UI should actually delete it from your computer, not merely hide it from view. This is surprisingly difficult.</p>
<p>I'm also interested in various forms of anonymity, pseudonymity, and relationship-hiding. These are expensive (they increase protocol complexity, reduce performance, and generally make deployment more difficult), so I haven't yet decided whether to include them or not. I'll be writing more about the tradeoffs involved in later blog posts.</p>
<h2>Why (am I doing this)?</h2>
<p>This latest effort started as a <a href="https://github.com/warner/toolbed">testbed</a> where I could experiment with new UI and setup ideas for <a href="https://tahoe-lafs.org/">Tahoe-LAFS</a>, specifically invitation-code -based key-management, the database-backed node structure, and the secure all-web frontend.</p>
<p>There are a lot of secure messaging tools being developed right now (the moderncrypto.org <a href="https://moderncrypto.org/mailman/listinfo/messaging">Messaging</a> list is a good place to see them being discussed). I won't pretend that mine is any better. And I'll admit to a certain amount of Not-Invented-Here syndrome, where I prefer my own tool because of the language it's written in, the protocols/UX/architecture it uses, or simply because I get to make whatever changes I like. Or because it's easier for me to understand and trust my own code than to read and study someone else's.</p>
<p>But I've found that the best way to explain the ideas in my head, to other people, is to implement them and show them the code. Especially when the ideas include the way that two people establish a connection. I think a lot of security tools, mine included, get stuck because they were unable to start with the end user's experience in mind. There's much to be said for asking people to pretend to perform some task (where we might expect security to matter), see what (possibly crazy) assumptions they make about it, and then search for ways to make those assumptions come true.</p>
<p>For example, when someone runs a local mail client like Thunderbird, and sees a message with a "From:" line that has a name they recognize, it's reasonable for them to assume that this message was really written by that person. And if you know anything about SMTP, you know that's incredibly false. Likewise it's fair to expect that typing a recipient's name into a new message, or choosing someone from the addressbook, will result in a message that's only actually visible to that one person. Both assumptions are reasonable (in that lots of people would hold them), but are not met by existing systems, because they're pretty challenging to provide. We probably need to change users' expectations, but if at all possible we should find a way to meet them, because that's (by definition) the most intuitive mental model they're likely to construct and act upon.</p>
<p>I think that improving the security of communication tools will require a couple of efforts, working together:</p>
<ul>
<li>study what users want to do, and how they want to express it</li>
<li>build a framework in which most of that is possible</li>
<li>expose the limitations as clearly as we can: teach users what's possible and what's not</li>
</ul>
<p>This needs to be iterative and somewhat interactive. If we mandate that users are allowed to do impossible things (like expecting for-your-eyes-only security from a bare email address), then we'll never succeed.</p>
<p>One option is to compromise on end-to-end security to make those expectations work. For example, some systems use TLS to fetch the recipient's purported public key from the target domain's SMTP server. This has the advantage of being backwards-compatible with established email practice (i.e. you can still copy a familiar user@domain address off a business card), but adds both their server and the Certificate Authority roots to the reliance set.</p>
<p>I'm not too excited about that direction, and I'd rather leave it for other projects to explore. I'm more interested in how to teach folks to use a new model that <em>is</em> possible, even if it's different or not as immediately useful as the insecure system.</p>
<h2>How (does it work)?</h2>
<p>The basic idea is that you have an agent (a long-running program) working on your behalf, on your computer or phone. You introduce your agent to the agents of your friends, either by having the agents talk directly to each other (scanning QR codes, NFC pairing, etc), or by mediating the connection through the humans (you and your friend exchange a short code displayed by your computer):</p>
<div class="highlight"><pre><span></span>agent1 &lt;-&gt; human1 &lt;-&gt; human2 &lt;-&gt; agent2
</pre></div>
<p>This latter approach also allows you to bootstrap the new Petmail connection from some pre-existing relationship, like email or IM. Doing it this way yields different security properties -- it's hard to be sure you've connected with a specific person when they aren't actually standing in front of you -- but I think the result is good enough to be useful, and we can add post-introduction verification steps to close the gap. "I want to talk securely (to a person I've never physically met)" is a common enough user request, but kind of impossible, and raises some really deep questions that can sharpen our design.</p>
<p>When the introduction process finishes, the agents will have shared keys that they can use for encrypting subsequent messages. They also receive directions on how to reach the other agent, to deliver those messages (either for human consumption, or for internal administrative traffic).</p>
<p>People will actually have multiple agents, one per phone or computer. You introduce your own agents to each other with the same tools as before, but with a "meet your sibling" flag that tells the agents to trust each other more thoroughly. Your cluster of agents can then collude: to make sure that you see exactly one copy of each message, or that a correspondent added with your phone is also available from your laptop.</p>
<p>My initial system uses a python-based daemon that runs on your computer, and you talk to it with a local web browser (the <code>petmail open</code> command instructs the agent to open a new browser tab with the UI). Eventually I'd like to port it to a browser extension, then maybe as a standalone <a href="https://developer.mozilla.org/en-US/Apps">web app</a>, because the <a href="https://developer.mozilla.org/en-US/Marketplace/Options/Open_web_apps_for_desktop">WebRT feature</a> makes those easy to install directly to Windows/Mac/Linux/Android/FxOS (just like a native application, but I don't have to learn Windows or Mac programming tools). It might also be interesting to build a hosted form of Petmail, with the obvious security limitations that entails, as a stepping-stone to a local install.</p>
<h2>More Posts To Come</h2>
<p>I have a slew of design choices to make and/or explain: I plan to write up the resulting tradeoffs as future blog posts:</p>
<ul>
<li>What is relationship-hiding, pseudonymity, anonymity, and how much can Tor help us? Hidden services, low- vs high- latency mix networks, PIR retrieval systems. Is it really possible to receive messages anonymously?</li>
<li>Introduction protocols: how short can we make the code? What else do we need to make it secure?</li>
<li>Deployment modes: can we combine the invitation code with a "click here to install" URL? Safely?</li>
<li>Secure web UI: avoiding secret URL leaks, CSRF, and shared-origin attacks</li>
<li>Mailbox servers: queuing messages for agents that are offline, but hiding sender identities from the server.</li>
<li>Bundling Tor with your app?</li>
<li>How to rent a mailbox server: enabling an economy of services.</li>
<li>Backup tools and progress indicators: since you can't be fast, be transparent.</li>
<li>Configurable storage backends</li>
<li>As-secure-as-you-want-it filecap URLs: making Tahoe-LAFS filecaps useable by regular web browsers too.</li>
</ul>
<h1>The new Sync protocol</h1>
<p>2014-05-23, by Brian Warner</p>
<p>(This wraps up a two-part series on recent changes in Firefox Sync, based on <a href="http://www.lothar.com/presentations/fxa-rwc2014/">my presentation</a> at <a href="http://realworldcrypto.wordpress.com/">RealWorldCrypto 2014</a>. Part 1 was about problems we observed in the old Sync system. Part 2 is about the protocol which replaced it.)</p>
<p>
<a href="../../blog/49-pairing-problems">Last time</a> I described the user difficulties we observed with the pairing-based Sync we shipped in Firefox 4.0. In late April, we released Firefox 29, with a new password-based Sync setup process. In this post, I want to describe the protocol we use in the new system, and its security properties.</p>
<p>(For the cryptographic details, you can jump directly to the full <a href="https://github.com/mozilla/fxa-auth-server/wiki/onepw-protocol">technical definition</a> of the protocol, which we've nicknamed "onepw", since there is now just "one password" to protect both account access and your encrypted data.)
</p>
<h2>Design Constraints</h2>
<p>To recap the last post, the biggest frustration we saw with the old Sync setup process was that it didn't "work like other systems": users <em>thought</em> their email and password would be sufficient to get their data back, but in fact you need access to a device that was already attached to your account. This made it unsuitable for people with a single device, and made it mostly impossible to recover from the all-too-common case of losing your only browser. It also confused people who thought email+password was the standard way to set up a new browser.</p>
<p>In addition, we've been building a new system called Firefox Accounts, aka "FxA", which will be used to manage access to Mozilla's new server-based features like the application marketplace and FirefoxOS-specific services.</p>
<p>So our design constraints for the new Sync setup process were:</p>
<ul>
<li>must work well with Firefox Accounts</li>
<li>must sign in with traditional email and password: no pre-connected device necessary</li>
<li>all Sync data must be end-to-end encrypted, just like before, using a key that is only available to you and your personal devices</li>
</ul>
<h2>Firefox Accounts: Login + Keys</h2>
<p>To meet these constraints, we designed Firefox Accounts to both support the needs of basic login-only applications, <em>and</em> provide the secret keys necessary to safely encrypt your Sync data, while using traditional credentials (email+password) instead of pairing.</p>
<p>For login, FxA uses BrowserID-like certificates to affirm your control over a GUID-based account identifier. These are used to create a "<a href="https://github.com/mozilla/id-specs/blob/prod/browserid/index.md#backed-identity-assertion">Backed Identity Assertion</a>", which can be presented (as a bearer token) to a server. The Sync servers require one of these assertions before granting read/write access to the encrypted data they store.</p>
<p>Each account also manages a few encryption keys, one of which is used to encrypt your Sync data.</p>
<h2>What Does It Look Like?</h2>
<p>In Firefox 29, when you set up Sync for the first time, you'll see a box that asks for an email address and a (new) password:</p>
<p><img alt="FF 29 Sync Account-Creation Dialog" src="./create.png" width="270px" /></p>
<p>You fill that out, hit the button, then the server sends you a confirmation email. Click on the link in the email, and your browser automatically creates an encryption key and starts uploading ciphertext.</p>
<p>Connecting a second device to your account is as simple as signing in with the same email and password:</p>
<p><img alt="FF 29 Sync Sign-In Dialog" src="./sign-in.png" width="270px" /></p>
<h2>The Gory Details</h2>
<p>This section describes how the new Firefox Accounts login protocol protects your password, the Sync encryption key, and your data. For full documentation, take a look at the <a href="https://github.com/mozilla/fxa-auth-server/wiki/onepw-protocol">key-wrapping protocol specs</a> and the <a href="https://github.com/mozilla/fxa-auth-server">server implementation</a>.</p>
<p>This post only describes how the master "Sync Key" is managed. To learn about how the individual records are encrypted (which hasn't changed), take a look at the <a href="http://docs.services.mozilla.com/sync/storageformat5.html">storage format docs</a>.</p>
<h3>Encryption Keys</h3>
<p>Each account has two full-strength 256-bit encryption keys, named "kA" and "kB". These are used to protect two distinct categories of data: recoverable "class-A", and password-protected "class-B". Nothing uses class-A yet, so I'll put that off until a future article.</p>
<p>Sync data falls into class B, and uses the kB key, which is protected by your account password. In technical terms, the FxA server holds a "wrapped copy" of kB, which requires your password to unwrap. Nobody knows your password but you and your browser, not even Mozilla's servers. Not even for a moment during login. The same is true for kB.</p>
<p>To access any data encrypted under kB, you must remember your password. This means that anyone who <strong>doesn't</strong> know the password can't see your data.</p>
<p>If you forget the password, you'll have to reset the account and create a new kB, which will erase both the old kB and the data it was protecting. This is a necessary consequence of properly protecting kB with the password: if there were any other way for <strong>you</strong> to recover the data without the password, then a bad guy could do the same thing.</p>
<p>kB is a "root" key: it isn't used directly. Instead, we derive a distinct subkey for each application (like Sync) that wants to encrypt class-B data. That way, applications are prevented from decrypting data that wasn't meant for them. Sync is the only application we have so far, but we may add more in the future.</p>
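A minimal sketch of this kind of per-application subkey derivation, using an HKDF-style expansion (the label strings below are made up for illustration; they are not the labels FxA actually uses, which are specified in the onepw docs):

```python
import hashlib
import hmac
import os

def derive_subkey(root_key, label, length=32):
    """Single-block HKDF (RFC 5869) with SHA-256: extract, then expand
    with an application-specific label. Enough for 32-byte subkeys."""
    prk = hmac.new(b"\x00" * 32, root_key, hashlib.sha256).digest()  # extract, empty salt
    return hmac.new(prk, label + b"\x01", hashlib.sha256).digest()[:length]

kB = os.urandom(32)  # stand-in for the account's root class-B key

# Each application gets its own independent subkey, bound to its label,
# so one application cannot decrypt another application's data.
sync_key = derive_subkey(kB, b"example.org/demo/sync")        # hypothetical label
other_key = derive_subkey(kB, b"example.org/demo/other-app")  # hypothetical label

assert sync_key != other_key  # different labels -> unrelated subkeys
```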
<h3>Keeping your secrets safe</h3>
<p>To make sure your Sync data is really end-to-end encrypted, we must prevent anyone else from figuring out your password, otherwise they could learn kB and decrypt your data. "End-to-end" means we even have to exclude our own server from learning your password. The server is usually on your side, but we'd like to maintain security even if it gets compromised. A compromised server is the most powerful kind of attack we can handle. So if we can keep your password away from our server, we can keep it away from any attackers too.</p>
<p>We use multiple layers of security to protect your password. To start with, the server is never told your raw password: you must prove that you know the password, but that's not the same thing as revealing it. The client sends a hashed form of the password instead.</p>
<p>This hashed form uses "key-stretching" on the password before sending anything to the server, to make it hard for a compromised server to even attempt to guess your password. This stretching is pretty lightweight (1000 rounds of <a href="https://tools.ietf.org/html/rfc2898#section-5.2">PBKDF2-SHA256</a>), but only needs to protect against the attacker who gets to see the stretched password in-flight (either because they compromised the server, or they've somehow broken TLS).</p>
<p>Finally, the data stored on the server is stretched even further, to make a static compromise of the server's database less useful to an attacker. This uses the "<a href="http://www.tarsnap.com/scrypt.html">scrypt</a>" algorithm, with parameters of (N=64k, r=8, p=1). At these settings, scrypt takes 64MB of memory, and about 250ms of CPU time.</p>
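The two stretching layers can be sketched with Python's hashlib. This is only an illustration of the shape of the scheme: the salt constructions and input encodings here are placeholders, not the real onepw derivations (see the linked protocol docs for those).

```python
import hashlib

def client_stretch(email, password):
    # Light client-side stretching before anything leaves the browser:
    # 1000 rounds of PBKDF2-SHA256. Salt construction is illustrative only.
    return hashlib.pbkdf2_hmac("sha256", password.encode(),
                               b"demo-salt:" + email.encode(), 1000)

def server_stretch(stretched):
    # Heavy server-side stretching before database storage: scrypt with
    # the parameters from the post (N=64k, r=8, p=1 -> ~64 MiB of memory).
    return hashlib.scrypt(stretched, salt=b"demo-server-salt",
                          n=64 * 1024, r=8, p=1, maxmem=128 * 1024 * 1024)

authPW = client_stretch("user@example.com", "hunter2")  # what the client sends
stored = server_stretch(authPW)                         # what the database holds
```

Note the asymmetry: the cheap PBKDF2 layer only has to slow down an attacker who sees the in-flight value, while the expensive scrypt layer protects the at-rest database.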
<p>This complicated diagram shows how the password is processed before sending anything to the server, and then used to unwrap the server's response when it gets back:</p>
<p><img alt="FxA key handling diagram" src="./onepw.png" width="800px" /></p>
<h2>Security Properties</h2>
<p>Sync retains the same end-to-end security that it had before. The difference is that this security is now derived from your password, rather than pairing. So your security depends upon having a good password: if someone can guess it, they'll be able to connect their own browser to your account and then see all the Sync data you've stored there. </p>
<p>On the flip side, by using passwords, you can connect a new browser to your account without having an existing device nearby to pair with, and you can even access your Sync data after losing your last device, neither of which was possible with the old pairing process.</p>
<h3>How Hard Is It For Someone To Guess My Password?</h3>
<p>The main factor is how well you generate the password. The best passwords are randomly generated by your own computer. The process you use (or rather the process that an attacker <em>thinks</em> you used) determines a set of possible passwords, which an attacker would need to try, one at a time, until they find the right one. Hopefully this set is very very large: billions or trillions at least.</p>
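For a sense of scale, here is a sketch of what "randomly generated by your own computer" buys you: a 12-character random alphanumeric password (lengths and alphabet chosen for illustration) comes from a set of about 3&#215;10<sup>21</sup> possibilities.

```python
import math
import secrets
import string

alphabet = string.ascii_letters + string.digits   # 62 symbols
length = 12

# secrets (not random) for cryptographically-strong choices
password = "".join(secrets.choice(alphabet) for _ in range(length))

search_space = len(alphabet) ** length            # set the attacker must search
print(f"{search_space:.2e} possible passwords")   # ~3.23e+21
print(f"{math.log2(search_space):.1f} bits")      # ~71.5 bits of entropy
```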
<p>The difficulty of testing each guess depends upon what the attacker has managed to steal. Regular attackers out on the internet are limited to "online guessing", which means they just try to sign in with your email address and their guessed password, as if they were a regular user, and see whether it works or not. This is rate-limited by the server, so they'll only get dozens or maybe hundreds of guesses before the server cuts them off.</p>
<p>An attacker who gets a copy of the server's database (perhaps through an SQL injection attack, the sort you read about every month or two) has to spend about a second of computer time for each guess, which <a href="http://keywrapping.appspot.com/">adds up</a> when they must try a lot of them. The most serious kind of attack, where the bad guy gets full control of the server and can eavesdrop on login requests, yields an easier offline guessing attack (PBKDF rather than scrypt) which could be made cheaper with specialized hardware.</p>
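To put that one-second-per-guess figure in perspective (the billion-password search space below is an illustrative assumption, not a measurement):

```python
# Rough cost of exhausting a search space at scrypt speeds.
SECONDS_PER_YEAR = 365 * 24 * 3600

search_space = 10**9       # assumed: a (weak) billion-password space
per_guess_seconds = 1.0    # scrypt cost per guess for a stolen-database attacker

years = search_space * per_guess_seconds / SECONDS_PER_YEAR
print(f"{years:.1f} years to try every password")  # prints "31.7 years ..."
```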
<p>The security of old Sync didn't depend upon a password, because the pairing protocol it used meant there were no passwords.</p>
<h3>Can I Change My Password?</h3>
<p>Of course! From the Preferences menu, choose the Sync tab, and hit the "Manage" link. That will take you to the "Manage account" page, which has a "Change password" link, where you can just type your old and new passwords. Changing your password will automatically disconnect any other browsers that were syncing with your account. You'll need to re-sign-in on those browsers before they'll be able to resume syncing. All of your server-side data will be retained.</p>
<h3>What Happens If I Forget My Password?</h3>
<p>If you can't remember your password, you'll have to reset your account (by using the "Forgot password?" link from the login screen). This will send you a password-reset confirmation email with a link in it. Click on the link, and you'll be taken to a page where you can set up a new password. As with changing your password, this will disconnect all browsers from your account, so once you've finished the reset process, you'll need to sign back into each browser with your new password.</p>
<p>Resetting the account will necessarily erase any data stored on the server. To be precise, the old data is irretrievable (it was encrypted by a key that was wrapped by your now-forgotten password; and since you can't recover that key, you can't recover the data either), and the Sync storage server will erase the old data when it learns that you're using a new encryption key.</p>
<p>If you reset your account from a browser that was already syncing and up-to-date, then after you reconnect, your browser will simply repopulate the server with your bookmarks/etc, and nothing will be lost. It's also fine to reset your account from one (empty) browser, then reconnect a second (full) browser: your data will be merged, and everything will eventually be available on both devices.</p>
<p>The one case where you can't recover your old data is if you lose or break your only device and <strong>also</strong> forget your password. In this case, when you reset your account from a new (empty) browser, then your old Sync data is lost, and you'll have to start again from a blank slate. You may want to write down your password in a safe place at home to avoid this, sort of like leaving a spare housekey with a trusted neighbor in case you lose your own.</p>
<h3>What If I'm Already Running Sync?</h3>
<p>If you've been using Sync for a while now, you probably set up Sync with the pairing scheme from Firefox 28 or earlier. Never fear! Your browsers will continue to sync with each other even after you upgrade some or all of them to FF29.</p>
<p>If you're still running FF28, the FF24 ESR (<a href="http://www.mozilla.org/en-US/firefox/organizations/faq/">Extended Support Release</a>), or another pre-FF29 browser, you can still use the pairing flow to connect additional old browsers. We'll support this flow until at least the end of the ESR maintenance period (14-Oct-2014), maybe a bit longer, but eventually we'll shut down the servers necessary to support the old pairing flow, and pairing will stop working. We hope to have a new pairing system in place by then: see below.</p>
<p>Likewise, after most users have migrated to New Sync, and everyone has been given fair notice to upgrade, the old-style Sync storage servers will eventually be shut down. But for now, existing Sync users don't need to make any changes.</p>
<p>However, pairing-based Old Sync and password-based FxA-powered New Sync don't mix: if you used pairing to connect two FF28 browsers together, you won't be able to connect a third FF29 browser to them, even if you upgrade them all to FF29. You'll need to move everything to FxA to connect all three:</p>
<ul>
<li>upgrade everything to FF29</li>
<li>disconnect both old browsers from Sync</li>
<li>create a Firefox Account</li>
<li>sign all three browsers into your new account with the same email and password</li>
</ul>
<p>This process won't lose any data: everything will be merged together in the new account.</p>
<h2>Future Directions</h2>
<p>We're <a href="https://id.etherpad.mozilla.org/fxa-2FA">working</a> on adding two-factor authentication ("2FA") to Firefox Accounts. If you enable this, you'll need to type in an additional code (generally provided by your mobile phone) when you log in. The two main options we're investigating are TOTP one-time passwords (e.g. the Google Authenticator app), and SMS codes.</p>
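<p>For a sense of what a TOTP code actually is: it's just an HMAC of the current 30-second interval, dynamically truncated to a few digits. Here is a minimal sketch of the standard RFC 4226/6238 construction (an illustration, not Mozilla's implementation):</p>

```python
import hashlib, hmac, struct, time

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP: HMAC-SHA1 of a 64-bit counter, dynamically truncated."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret: bytes, now=None, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP: HOTP keyed by the current time interval."""
    t = int(time.time() if now is None else now)
    return hotp(secret, t // step, digits)

# RFC 6238 test vector: ASCII secret, T=59 seconds, 8 digits
totp(b"12345678901234567890", now=59, digits=8)  # → "94287082"
```

<p>The server stores the same shared secret, computes the same code, and compares; the phone never needs network access.</p>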
<p>We are also looking for ways to re-introduce pairing as an optional feature, after the main login. This might use an additional key "kC", which is only transferred via pairing. Once enabled, to set up Sync on a new device, you would need to grant it permission from an old device that is already connected. We think we can make the pairing experience better than it was before, because we'll have more information to work with (you've already logged in, so we know which other devices might be available for pairing).</p>
<h2>Conclusion</h2>
<p>The new Sync sign-up process is now live and adding thousands of users every day. The password-based login makes it possible to use Sync with just a single device: as long as you can remember the password, you can get back to your Sync data. It still encrypts your data end-to-end like before, but it's important to generate a good random password to protect your data completely.</p>
<p>To set up Sync, just upgrade to Firefox 29, and follow the "Get started with Sync" prompts on the welcome screen or the Tools menu.</p>
<p>Happy Syncing!</p>
<p>(Thanks to Karl Thiessen, Ryan Kelly, Chris Karlof, and Daniel Kahn Gillmor for their invaluable feedback. <a href="http://blog.mozilla.org/warner/2014/05/23/the-new-sync-protocol/">Cross-posted</a> to my <a href="http://blog.mozilla.org/warner">work blog</a>.)</p>Pairing Problems2014-04-02T10:09:00-07:002014-04-02T10:09:00-07:00Brian Warnertag:www.lothar.com,2014-04-02:/blog/49-pairing-problems/<p>(This begins a two-part series on upcoming changes in Firefox Sync, based on <a href="http://www.lothar.com/presentations/fxa-rwc2014/">my presentation</a> at <a href="http://realworldcrypto.wordpress.com/">RealWorldCrypto 2014</a>. Part 1 is about problems we observed in the old system. Part 2 will be about the system which replaces it.)</p>
<p>In March of 2011, <a href="https://www.mozilla.org/en-US/firefox/sync/">Sync</a> made its debut in Firefox 4 …</p><p>(This begins a two-part series on upcoming changes in Firefox Sync, based on <a href="http://www.lothar.com/presentations/fxa-rwc2014/">my presentation</a> at <a href="http://realworldcrypto.wordpress.com/">RealWorldCrypto 2014</a>. Part 1 is about problems we observed in the old system. Part 2 will be about the system which replaces it.)</p>
<p>In March of 2011, <a href="https://www.mozilla.org/en-US/firefox/sync/">Sync</a> made its debut in Firefox 4.0 (after spending a couple of years as the <a href="https://blog.mozilla.org/labs/2007/12/introducing-weave/">Weave</a> add-on). Sync is the feature that lets you keep bookmarks, preferences, saved passwords, and other browser data synchronized between all your browsers and devices (home desktop, mobile phone, work computer, etc).</p>
<p>Our goal for Sync was to make it secure and easy to share your browser state among two or more devices. We wanted your data to be encrypted, so that only your own devices could read it. We weren't satisfied with just encrypting during transmission to our servers (aka "data-in-flight"), or just encrypting it while it was sitting on the server's hard drives (aka "data-at-rest"). We wanted proper end-to-end encryption, so that even if somebody broke into the servers, or broke SSL, your data would remain secure.</p>
<p>Proper end-to-end encryption typically requires manual key management: you would be responsible for copying a large randomly-generated encryption key (like <code>cs4am-qaudy-u5rps-x/qca-hu63l-8gjkl-28tky-6whlt-fn0</code>) from your first device to the others. You could make this easier by using a password instead, but that ease-of-use comes at a cost: short, easy-to-remember passwords aren't very secure. If an attacker could guess your password, they could get your data.</p>
<p>We didn't like that tradeoff, so we designed an end-to-end encryption system that didn't use passwords. It worked by "pairing", which means that every time you add a new device, you have to introduce it to one of your existing devices. For example, you could pair your home computer with your phone, and now both devices could see your Sync data. Then later, you'd pair your phone with your work computer, and now all three devices could synchronize together.</p>
<p>The introduction process worked by copying a short single-use "pairing code" from one device to the other. This code was fed into some crypto magic (the J-PAKE protocol), allowing the two devices to establish a temporary encrypted connection. Then everything necessary to access your account (including the random long-term data-encryption key) was copied through that secure connection to the new device.</p>
<p>The cool thing about pairing is that your data is safely protected by a strong encryption key, against everyone (even the Mozilla server that hosts it), and you don't need to manage the key. You never even see it.</p>
<p><img alt="FF 4.0 pairing dialogs" src="./pairing-codes.png"></p>
<h2>Problems With Pairing</h2>
<p>But, we learned that our pairing implementation in Firefox Sync had some problems. Some were shallow, others were deep, but the net result is that a <em>lot</em> of people were confused by Sync, and we didn't get as many people using it as we'd hoped. This post is meant to capture some of the problems that we observed.</p>
<h2>Idealized Design vs Actual Implementation</h2>
<p>Back in those early days, four years ago now, I was picturing a sort of idealized Sync setup process. In this fantasy world, next to the rainbows and unicorns, the first machine would barely have a setup UI at all, maybe just a single button that said "Enable Sync". When you turned it on, that device would create an encryption key, and start uploading ciphertext. Then, in the second device, the "Connect To Sync" button would initiate the pairing process. At no point would you need a password or even an account name.</p>
<p>But, for a variety of reasons, by the time we had a working deliverable, our setup page looked like this:</p>
<p><img alt="FF 4.0 Sync Create-Account dialog" src="./Sync-create-account.png"></p>
<p>Some of the reasons were laudable: having an email address lets us notify users about problems with their account, and establishing a password enabled things like a "Delete My Account" feature to work. But part of the reason included historical leftovers and backward-compatibility with existing prototypes.</p>
<p>In this system, the email address identified the account, and the password was used in an HTTP Basic Auth header to enable read/write access to encrypted data on the server. The data itself was encrypted with a random key, which came to be known as the "recovery key". The pairing process copied all three things to the new device.</p>
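<p>For illustration, the HTTP Basic Auth credential in that scheme is nothing more than a base64-encoding of <tt class="docutils literal">email:password</tt> (a sketch, not the actual Sync client code):</p>

```python
import base64

def basic_auth_header(email: str, password: str) -> dict:
    # RFC 7617 Basic scheme: base64("email:password"), sent on every request
    token = base64.b64encode(f"{email}:{password}".encode()).decode()
    return {"Authorization": "Basic " + token}

basic_auth_header("user@example.com", "hunter2")
```

<p>Note that this credential only gates read/write access to the ciphertext on the server; decrypting it still requires the separate recovery key.</p>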
<h2>Deceptive UI</h2>
<p>The problem here was that users still had to pick a password. The account-creation screen gave the impression that this password was important, and did nothing to disabuse them of the notion that email+password would be sufficient to access their data later. But the data was encrypted with the (hidden) key, not the password. In fact, this password was never entered again: it was copied to the other devices by pairing, not by typing.</p>
<p>It didn't help that Firefox Sync came out at about the same time as a number of other products with "Sync" in the name, all of which <em>did</em> use email+password as the complete credentials.</p>
<p>This wasn't so bad when the user went to set up a second device: they'd encounter the unusual pairing screen, but could follow the instructions and still get the job done. It was most surprising and frustrating for folks who used Firefox Sync with only one device.</p>
<h2>"Sync", not "Backup"</h2>
<p>We, too, were surprised that people were using Sync with only one device. After all, it's obviously not a backup system: given how pairing works, it clearly provides no value unless you've got a second device to hold the encryption key when your first device breaks.</p>
<p>At least, it was obvious to me, living in that idealistic world with the rainbows and unicorns. But in the real world, you'd have to read the docs to discover the wondrous joys of "pairing", and who in the real world ever reads docs?</p>
<p>It turns out that an awful lot of people just went ahead and set up Sync despite having only one device. For a while, the <em>majority</em> of Sync accounts had only one device connected.</p>
<p>And when one of these unlucky folks lost that device or deleted their profile, then wanted to recover their data on a new device, they'd get to the setup box:</p>
<p><img alt="FF 4.0 Sync Setup dialog" src="./Sync-setup.png"></p>
<p>and they'd say, "sure, I Have an Account", and they'd be taken to the pairing-code box:</p>
<p><img alt="FF 4.0 pairing dialog" src="./Sync-pair-device.png"></p>
<p>and then they'd get confused. Remember, for these users, this was the first time they'd ever heard of this "pairing" concept: they were expecting to find a place to type email+password, and instead got this weird code thing. They'd have no idea what those funny letters were, or what they were supposed to do with them. But they were desperate, so they'd keep looking, and would eventually find a way out: the subtle "I don't have the device with me" link in the bottom left.</p>
<p>Now, this link was intended to be a fallback for the desktop-to-desktop pairing case, where you're trying to sync two immobile desktop-bound computers together (making it hard to transcribe the pairing code). It involved an extra step: you had to extract the recovery key from the first machine and carry it to the second one. By "I don't have the device with me", we meant "another device exists, but it isn't here right now". It was never meant to be used very often.</p>
<p>This also provided a safety net: if you had magically known about the recovery key ahead of time, and wrote it down, you could recover your data without an existing device. But since pairing was supposed to be the dominant transfer mechanism, this wasn't emphasized in the UI, and there were no instructions about this at account setup time.</p>
<p>So when you've just lost your phone, or your hard drive got reformatted, it's not unreasonable to interpret "I don't have the device with me" as something more like "that device is no longer with us", as in, <a href="http://www.youtube.com/watch?v=MH7KYmGnj40">"It's dead, Jim"</a>.</p>
<p>Following the link gets them to the fallback dialog:</p>
<p><img alt="FF 4.0 fallback Sign In dialog" src="./Sync-use-recovery-key.png"></p>
<p>Which looked <em>almost</em> like what they were expecting: there's a place for an account name (email), and for the password that they've diligently remembered. But now there's this "Sync Key" field that they've never heard of. The instructions tell them to do something impossible (since "your other device" is broken). A lot of very frustrated people wound up here, and it didn't provide for their needs in the slightest.</p>
<p>Finally, these poor desperate users would click on the only remaining ray of hope, the "I have lost my other device" link at the bottom. Adding insult to injury, this actually provides instructions to reset the account, regenerate the recovery key, and <em>delete</em> all the server-side data. If you understand pairing, it's clear why deleting the data is the only remaining option (erase the ciphertext that you can no longer decrypt, and reload from a surviving device). But for most people who got to this point, seeing these instructions only caused even more confusion and anger:</p>
<p><img alt="FF 4.0 reset-key dialog" src="./Sync-recovery-key.png"></p>
<p>(When reached through the "I have lost my other device" link, this dialog would highlight the "Change Recovery Key" button. This same dialog was reachable through Preferences/Sync, and is how you'd find your Recovery Key and record it for later use.)</p>
<h2>User Confusion</h2>
<p>The net result was that a lot of folks just couldn't use Sync. You can hear the frustration in these quotes from SUMO, the Firefox support site, circa December 2013:</p>
<p><img alt="SUMO Sync quotes" src="./pairing-wtf.png"></p>
<p>The upshot is that, while we built a multi-device synchronization system with excellent security properties and (ostensibly) no passwords to manage, a lot of people actually wanted a backup system, with an easy way to recover their data even if they'd only had a single device. And they wanted it to look just like the other systems they were familiar with, using email and password for access.</p>
<h2>Lessons Learned</h2>
<p>We're moving away from pairing for a while: Firefox 29 will transition to Firefox Accounts (abbreviated "FxA"), in which each account is managed by an email address and a password. Sync will still provide end-to-end encryption, but accessed by a password instead of pairing. My next post will describe the new system in more detail.</p>
<p>But we want to bring back pairing some day. How can we do it better next time? Here are some lessons I've learned from the FF 4.0 Sync experience:</p>
<ul>
<li>do lots of user testing, early in the design phase</li>
<li><em>especially</em> if you're trying to teach people something new</li>
<li>pay attention to all the error paths</li>
<li>if your application behaves differently than the mainstream, make it look different too</li>
<li>observe how people use your product, figure out what would meet their expectations, and try to build it</li>
<li>if you think their expectations are "wrong" (i.e. they don't match <em>your</em> intentions), that's ok, but now you have two jobs: implementation <em>and</em> education. Factor that into your development budget.</li>
</ul>
<p>I still believe in building something new when it's better than the status quo, even if it means you must educate your users. But I guess I appreciate the challenges more now than I did four years ago.</p>
<p>(<a href="http://blog.mozilla.org/warner/2014/04/02/pairing-problems/">cross-posted</a> to my <a href="http://blog.mozilla.org/warner">work blog</a>)</p>Remote Entropy2014-03-04T10:29:00-08:002014-03-04T10:29:00-08:00Brian Warnertag:www.lothar.com,2014-03-04:/blog/48-remote-entropy/<p><strong>Can you safely deliver entropy to a remote system?</strong></p>
<p>Running a system without enough entropy is like tolerating a toothache:
something you'd really like to fix, but not quite bothersome enough to deal
with.
</p>
<p><img alt="low-entropy munin graph" src="./entropy-week.png"></p>
<p>I recently bought a <a href="http://www.entropykey.co.uk/">Simtec EntropyKey</a> to fix
this locally: it's a little USB dongle with …</p><p><strong>Can you safely deliver entropy to a remote system?</strong></p>
<p>Running a system without enough entropy is like tolerating a toothache:
something you'd really like to fix, but not quite bothersome enough to deal
with.
</p>
<p><img alt="low-entropy munin graph" src="./entropy-week.png"></p>
<p>I recently bought a <a href="http://www.entropykey.co.uk/">Simtec EntropyKey</a> to fix
this locally: it's a little USB dongle with avalanche-noise generation
hardware and some firmware to test/whiten/deliver the resulting stream to the
host. The dongle-to-host protocol is encrypted to protect against even USB
man-in-the-middle attacks, which is pretty hardcore. I like it a lot. There's
a simple <a href="http://packages.debian.org/squeeze/ekeyd">Debian package</a> that
continuously fills /dev/random with the results, giving you something more
like this (which would look even better if Munin didn't use entropy-consuming
TCP connections just before each measurement):</p>
<p><img alt="high-entropy munin graph" src="./entropy-day-good.png"></p>
<p>But that's on local hardware. What about virtual servers? I've got several
remote VPS boxes, little Xen/KVM/VirtualBox slices running inside real
computers, rented by the hour or the month. Like many "little" computers
(including routers, printers, embedded systems), these systems are usually
starved for entropy. They lack the sources that "big" computers usually have:
spinning disk drives and keyboards/mice, both of which provide mechanical- or
human- variable event timing. The EntropyKey is designed to bring good
entropy to "little" machines. But I can't plug a USB device into my remote
virtual servers. So it's pretty common to want to deliver the entropy from my
(real) home computer to the (virtual) remote boxes. Can this be done safely?</p>
<h2>Nope!</h2>
<p>Well, mostly nope: it depends upon how you define the threat model. First,
let's go over some background.</p>
<h2>Guessing Internal State</h2>
<p>Remember that entropy is how you measure uncertainty, and it's always
relative to an observer who knows some things but not others. If I roll an
8-sided die on my desk right now, the entropy from your point of view is 3
bits. From my point of view it's 0 bits: <em>I</em> know I just rolled a five. And
now that <em>you</em> know that I rolled a five, it's 0 bits from your POV too.</p>
<p>Computers use entropy to pick random numbers for cryptographic purposes:
generating long-term SSH/GPG/TLS keys, creating ephemeral keys for
Diffie-Hellman negotiation, unique nonces for DSA signatures, IVs, and TCP
sequence numbers. Most of these uses are externally visible: the machine is
constantly shedding clues as to its internal state. If the number of possible
states is limited, and an eavesdropper can observe all (or most) of these
clues, then they can deduce what that internal state is, and then predict
what it will be next. The amount of computation Eve needs to do this depends
upon how uncertain she is, and on the nature of the clues.</p>
<p>The most conservative model assumes that Eve sees every packet going into and
out of the system, with perfect timing data, and that she knows the complete
state of the system before the game begins (imagine that Eve creates a VM
from the same EC2 AMI as you do). If she is truly omniscient, and the system
is deterministic, then she will know the internal state of the system
forever: all she has to do is feed her own clone the same input as your box
receives, at the same time, and watch how its internal state evolves. She doesn't even need to watch what your box outputs: it will always emit the same things as her clone.</p>
<p>If she misses a few bits (maybe she can't measure the arrival time of a
packet perfectly), or if there are hidden (nondeterministic) influences, then
she needs to guess. For each guess, she needs to compare her subsequent
observations against the predicted consequences of that guess, to determine
which guess was correct. It's as if she creates a new set of nearly-identical
VMs for each bit of uncertainty, and then throws out most of them as new
measurements rule them out.</p>
<p>There might be a lot of potential states, and it might take her a lot of CPU
time to test each one. She might also not get a lot of observations, giving
her fewer opportunities to discard the unknowns. Our goal is to make sure she
can't keep up: at any important moment (like when we create a GPG key), the
number of possibilities must be so large that all keys are equally likely.</p>
<p>(In fact, our goal is to make sure she can't retroactively catch up either.
If we create a key, and then immediately reveal all the internal state,
without going through some one-way function first, she can figure out what
the state was <em>earlier</em>, and then figure out the key too. So the system also
needs forward-security.)</p>
<h2>Bootstrapping Towards Entropy Is Technically Impossible ...</h2>
<p>To get out of this compromised Eve-knows-everything state, you have to
feed the system enough entropy (bits that Eve doesn't see) to exceed her
ability to create and test guesses. But she's watching the network, so you
must feed entropy in locally (via the keyboard, locally-attached hardware, or
non-deterministic execution).</p>
<p>Could you deliver entropy remotely if you encrypted it first? Sure, but you
have to make sure Eve doesn't know the key, otherwise she can see the data
too, and then it isn't entropy anymore. Encrypting it symmetrically (e.g.
AES) means your remote random-number generator machine shares a secret key
with the VM, but we already assumed that Eve knows the VM's entire state, so
it has no pre-existing secrets from her. To encrypt it asymmetrically (via a
GPG public key) means the VM has a corresponding private key: again, Eve's
insider knowledge lets her decrypt it too.</p>
<p>Can you use authenticated Diffie-Hellman to build a secure connection <em>from</em>
the VM to the remote entropy source? This would put a public key on the VM,
not a private one, so Eve doesn't learn anything from the key. But DH
requires the creation of a random ephemeral key (the "x" in "g^x"), and Eve
can still predict what the VM will do, so she can guess the ephemeral key
(using the published g^x to test her guesses), determine the shared DH key,
and decrypt the data.</p>
<p>So, in the most conservative model, there's no way to get out of this
compromised state using externally-supplied data. You <em>must</em> hide something
from Eve, by delivering it over a channel that she can't see.</p>
<h2>But It Might Be Possible In Practice</h2>
<p>The real world isn't quite this bad, for a few reasons:</p>
<ul>
<li>
<p>watching every input is actually pretty hard. The packet sniffer must be
running 24x7, never drop a packet, and must capture high-resolution
timestamps very close to the VM's inputs</p>
</li>
<li>
<p>busy computers have an awful lot of state, making Eve's worst-case modeling
job pretty expensive. It's still deterministic, but depends on a lot of
race conditions. The ideal kernel RNG would hash all of memory all the
time, to make it maximally sensitive to system state. Unfortunately, that's
expensive and intrusive ("hey! the kernel is reading my private user data
and publishing some derivative of it to the world!"), and good engineering
practice (modularity) prefers small sub-systems with <em>reduced</em> sensitivity
to unrelated inputs, so we may not get as much benefit from this as we'd
like.</p>
</li>
<li>
<p>kernel RNGs are designed to be forward-secure: it's not as if /dev/urandom
just returns the raw entropy pool. Every read and write causes the pool to
be stirred. So observations don't reveal state directly, and Eve has to do
(significant) computation to check her guesses.</p>
</li>
<li>
<p>RNGs also batch inputs into larger chunks to prevent small incremental
attacks. If we added one bit of entropy at a time (say, one per second),
then let Eve make some observations, she could probably deduce that one bit
in time to repeat the process for the next bit. But if we hide it in memory
(i.e. not allow it to influence anything remotely observable) for a few
minutes, and then dump 128 bits in all at once, Eve has 128 seconds to test
2^128 possibilities, and won't be able to keep up.</p>
</li>
</ul>
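<p>That last point, the incremental-dribble attack, can be demonstrated with a toy pool: if one secret bit is mixed in at a time and an output is published after each addition, an attacker who knows the initial state never has more than two candidates to test per round, and tracks the state perfectly (a toy sketch with a hash as the pool, not the real kernel RNG):</p>

```python
import hashlib, os

def stir(state: bytes, data: bytes) -> bytes:
    # Toy pool update: mix new data into the state with a one-way hash
    return hashlib.sha256(state + data).digest()

def output_of(state: bytes) -> bytes:
    # Observable output derived (forward-securely) from the pool state
    return hashlib.sha256(b"out" + state).digest()

state = b"known-initial-state"   # Eve knows where we started
candidates = {state}             # Eve's set of possible pool states
for _ in range(32):
    bit = os.urandom(1)[0] & 1                  # one bit of real entropy
    state = stir(state, bytes([bit]))
    observed = output_of(state)                 # Eve sees this each round
    # Eve tries both possible bits against every candidate, keeps the matches:
    candidates = {stir(c, bytes([b]))
                  for c in candidates for b in (0, 1)
                  if output_of(stir(c, bytes([b]))) == observed}

# Despite 32 bits of "entropy" added, Eve still knows the exact state:
assert state in candidates and len(candidates) == 1
```

<p>Batching those same 32 bits into one update would have forced her to test 2^32 candidates against a single observation instead of 2 candidates per round.</p>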
<h2>How To Do It</h2>
<p>So in practice, once the kernel pool gets perhaps 128 or 256 bits of real
entropy, Eve's job becomes impossible. This needs to happen before any
significant secrets are generated. How can we get to this point?</p>
<ul>
<li>
<p>the best tool is a local hardware RNG that can feed entropy to the kernel
without traversing the network. This might be a special CPU instruction
(e.g. Intel's RdRand) that can be used by the guest VM. Or the guest VM
should be able to ask the host OS (dom0) for entropy, which can get it from
an on-chip HWRNG (VIA Padlock) or USB-attached EntropyKey. This source
should be used very early during startup, before first-boot SSH host keys
are generated. It can be periodically refreshed afterwards, but it's the
initial seed that really matters.</p>
</li>
<li>
<p>next best is for the guest VM creation process to include a unique initial
seed. Linux systems typically save a few kB of entropy to disk at shutdown,
and write it back into the kernel at boot: if the contents of disk remain
secret, rebooting a box doesn't cause it to lose entropy. The control
system that creates VMs could pre-populate this entropy file from a real
RNG, with fresh data for each one. I don't know if EC2 AMIs work this way:
I suspect the disk image is identical each time an AMI is instantiated, but
the startup process might do something better.</p>
</li>
<li>
<p>failing that, the VM should make network requests for entropy. These
requests should go to a box that already has good entropy (perhaps relayed
from box to box, ultimately supplied by some kind of HWRNG). And the
requests should be as local as possible, so Eve would have to get her
packet sniffer into the datacenter network to see it. Pulling entropy from
multiple directions might help (maybe she can watch one router but not all
of them). Pulling large amounts of data might help (maybe she won't be able
to keep up with the data), as might pulling it frequently over a long
period of time (maybe the sniffer breaks down every once in a while: if you
can get 256 bits through while it's offline, you win). Try to include
high-resolution timing data too (sample the TSC when you receive each
packet and write the contents into the kernel pool along with the data).</p>
</li>
</ul>
<p>You'd probably think you ought to encrypt these network requests, but as
described above it's not really clear what this buys you. The best hope is
that it increases the cost of Eve's guess-testing. You might not bother with
authenticating this link: if the RNG is well-designed, then it can't hurt to
add more data, even attacker-controlled data (but note that entropy counters
could be incorrectly incremented, which means it can hurt to <em>rely</em> on
attacker-controlled data).</p>
<p>Continuing this analysis, you might not even bother decrypting the data
before adding it to the pool, since that doesn't increase the entropy by more
than the size of the decryption key, so you can get the same effect by just
writing the key into the pool too. (But it might be more expensive for Eve if
her guess-testing function must include the decryption work).</p>
<p>And if you don't bother decrypting it, then clearly there's no point to
encrypting it in the first place (since encrypted random data is
indistinguishable from unencrypted random data). Which suggests that really
you're just piping /dev/urandom from one box into netcat, plus maybe some
timestamps, and just have to hope that Eve misses a packet or two.</p>
<h3>Entropy Counters</h3>
<p>What about entropy counters, and the difference between /dev/random and
/dev/urandom? They're trying to provide two different things. The first is to
protect you against using the RNG before it's really ready, which makes a lot
of sense (see <a href="https://factorable.net/">Mining Your Ps and Qs</a> for evidence
of failures here). The second is to protect you against attackers who have
infinite computational resources, by attempting to distinguish between
computational "randomness" and information-theoretic randomness. This latter
distinction is kind of silly, in my mind. Like other folks, I think there
should be one kernel source of entropy, it should start in the "off" mode
(return errors) until someone tells it that it is ready, and switch to the
"on" mode forevermore (never return errors or block).</p>
<p>But I'll have to cover that in another post. The upshot is that it isn't safe
to make this startup-time off-to-on mode switch unless you have some
confidence that the data you've added to the kernel's entropy pool is
actually entropy, so attacker-supplied data shouldn't count. But after you've
reached the initial threshold, when (in my opinion) you don't bother counting
entropy any more, then it doesn't hurt to throw anything and everything into
the pool.</p>
<p>(<a href="http://blog.mozilla.org/warner/2014/03/04/remote-entropy/">cross-posted</a> to my <a href="http://blog.mozilla.org/warner">work blog</a>)</p>urllib32012-06-25T04:42:00-07:002012-06-25T04:42:00-07:00Brian Warnertag:www.lothar.com,2012-06-25:/blog/47-urllib3/<p>Today I learned about the <a class="reference external" href="http://pypi.python.org/pypi/urllib3">urllib3</a>
module. The biggest feature (from my point of view) is that this one can
properly validate SSL sessions.
The python 2.x <tt class="docutils literal">urllib</tt>, <tt class="docutils literal">urllib2</tt>, and
<tt class="docutils literal">httplib</tt> libraries all vaguely speak SSL, but none of them actually look
at the certificate they receive (and will …</p><p>Today I learned about the <a class="reference external" href="http://pypi.python.org/pypi/urllib3">urllib3</a>
module. The biggest feature (from my point of view) is that this one can
properly validate SSL sessions.
The python 2.x <tt class="docutils literal">urllib</tt>, <tt class="docutils literal">urllib2</tt>, and
<tt class="docutils literal">httplib</tt> libraries all vaguely speak SSL, but none of them actually look
at the certificate they receive (and will cheerfully connect to a
man-in-the-middle). This mode provides protection only against passive
eavesdroppers. Until recently it wasn't even obvious that the stdlib modules
had this problem (so a lot of folks writing HTTPS clients have not been
getting the protections they imagined they would get), but at least the py2.7
docs include a big red warning.</p>
<p>It's not a trivial thing to fix (at least without changing the API), but the
temptation to do just that is indicative of a deeper issue in security
engineering. As a consumer of the urllib API, you want your job to end when
you provide the URL: you expect it will be "secure" by default. But you've
provided a (domain) name that isn't "Secure" in the sense of <a class="reference external" href="http://en.wikipedia.org/wiki/Zooko%27s_triangle">Zooko's
Triangle</a>, so there is
some other party who has the right to change what your reference points to.
The traditional web/SSL approach is to start by letting DNS tell you which IP
address to aim your packets at, then let the internet's routing layers
deliver your packets to <em>somebody</em>, then set up an SSL connection with that
somebody, then accept the connection if the somebody's SSL certificate was
signed by one of the CA roots that you trust. Ultimately your configured CA
roots get to choose who you connect to. We're content with making the DNS
layer implicit (using a pre-configured list of a.gtld-servers.net addresses
and their responses), as well as the routing layer (leaving that up to your
ISP), but a library author isn't quite so comfortable making assumptions
about which CA roots you'd like to use. Web browsers, which are making
decisions like this for their users all the time, come with a slowly-changing
list of CAs which get reviewed frequently. Embedding a list into a static
library is probably more power than the authors really wanted to claim.</p>
<p>So with <tt class="docutils literal">urllib3</tt>, you must give it a list of CA roots (and pass
<tt class="docutils literal"><span class="pre">cert_reqs="CERT_REQUIRED"</span></tt>), and then it only connects if the site's cert
is signed by one of those CAs. If you know which CA a given site uses, you
can achieve a limited form of cert-pinning by only including that one CA root
in the list you give it. That reduces the set of people who can control your
connection to just the one CA. It looks like the companion <a class="reference external" href="http://pypi.python.org/pypi/requests">requests</a> library has facilities to grab the
CA list from your OS if it provides one, which makes it easy to delegate the
decision to your OS vendor.</p>
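<p>As a sketch of what "actually looking at the certificate" buys you, the
modern stdlib <tt class="docutils literal">ssl</tt> module (which postdates this
post) bakes the same two checks into its default client context; the pinning
comment below uses a made-up CA filename:</p>

```python
import ssl

# A default client context enables the two checks the old stdlib HTTP
# modules skipped: the peer must present a certificate chain signed by
# a trusted CA, and that certificate must match the hostname we dialed.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # → True
print(ctx.check_hostname)                    # → True

# Limited cert-pinning, as described above: trust only the single CA a
# given site is known to use ("my-one-ca.pem" is a hypothetical path):
#   pinned = ssl.create_default_context(cafile="my-one-ca.pem")
```

<p>Anyone whose CA is <em>not</em> in the context's trust store can no longer
silently man-in-the-middle the connection.</p>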
<p><tt class="docutils literal">urllib3</tt> also provides connection pooling: keeping an HTTP(S) connection
open for multiple requests, which is a big win for performance, especially
when doing a lot of little requests.</p>
<p>It looks like Python3 fixes this too, with a stdlib <tt class="docutils literal">urllib.request</tt> module
that takes a CA list just like <tt class="docutils literal">urllib3</tt>.</p>
New Blog Software2012-06-19T09:16:00-07:002012-06-19T09:16:00-07:00Brian Warnertag:www.lothar.com,2012-06-19:/blog/46-New-Blog-Software/<p>Just finished moving the web site to a new host, and switching (yet again!)
to new blog software in the process. I wanted to get rid of CGIs on the new
host, so I switched to a static blog-site generator named <a class="reference external" href="https://github.com/ametaireau/pelican">Pelican</a>.
I'm still trying to work things
out, but …</p><p>Just finished moving the web site to a new host, and switching (yet again!)
to new blog software in the process. I wanted to get rid of CGIs on the new
host, so I switched to a static blog-site generator named <a class="reference external" href="https://github.com/ametaireau/pelican">Pelican</a>.
I'm still trying to work things
out, but it's looking pretty good so far. I had to make a few patches to let
me retain the permalinks from the old (PyBlosxom) site, they're in my <a class="reference external" href="https://github.com/warner/pelican">Github
fork</a>.</p>
<p>Managed to get everything updatable by git over git-foolscap too, for
objcap-inspired goodness.</p>
Zombie T-Shirts2011-05-28T14:37:00-07:002011-05-28T14:37:00-07:00Brian Warnertag:www.lothar.com,2011-05-28:/blog/45-Zombie-T-Shirts/<p>Just wanted to say hi to Dave and mention his nerd t-shirt store at
<a class="reference external" href="http://www.nerdkungfu.com">http://www.nerdkungfu.com</a> .
</p>
<p>He's a regular at our weekly Bad Movie Night, and I think a lot of the movies
we've screened have shown up as t-shirts on his site a few weeks later. If …</p><p>Just wanted to say hi to Dave and mention his nerd t-shirt store at
<a class="reference external" href="http://www.nerdkungfu.com">http://www.nerdkungfu.com</a> .
</p>
<p>He's a regular at our weekly Bad Movie Night, and I think a lot of the movies
we've screened have shown up as t-shirts on his site a few weeks later. If
you're looking for shirts from your favorite old video games, sci-fi movies,
or TV shows, you'll probably find them there. I picked up a delightful (if
dated) School House Rock <a class="reference external" href="http://www.nerdkungfu.com/School_House_Rock_Conjunction_Junction_T_Shirt_p/scas2002.htm">Conjunction Junction</a>
t-shirt, which I later learned is only meaningful to people within about 5
years of my own age.. guess those after-school specials weren't on TV for
very long. His Star Trek collection is pretty impressive too.</p>
emacs command of the day2011-04-12T12:09:00-07:002011-04-12T12:09:00-07:00Brian Warnertag:www.lothar.com,2011-04-12:/blog/44-emacs-command-of-the-day/<p>C-x 4 c : clone-indirect-buffer-other-window
</p>
<p>I keep learning new tricks in emacs. Today I was studying an overstuffed
file, with two large classes, and I needed to navigate around both as I
followed the code paths bouncing back and forth between them. I frequently
use the "narrow-to-region" command (C-x n n …</p><p>C-x 4 c : clone-indirect-buffer-other-window
</p>
<p>I keep learning new tricks in emacs. Today I was studying an overstuffed
file, with two large classes, and I needed to navigate around both as I
followed the code paths bouncing back and forth between them. I frequently
use the "narrow-to-region" command (C-x n n) to temporarily clip the buffer
to a single class or method of interest, because then jumping to the
beginning/end of the buffer really takes me quickly to the beginning/end of
the class, and searching is limited to the class, etc. But this time, I
needed to narrow the buffer to two separate regions.</p>
<p>Enter today's interesting command: "clone-indirect-buffer-other-window",
reached by C-x 4 c . All the "C-x 4" commands put things in a new window
(which means a new region of the current "frame", where each frame gets a new
OS-level window). This one makes an "indirect buffer" that's a mirror of the
current one, but with a slightly different name (it gets a <2> tacked on to
the end). Both buffers are looking at the same file, so any changes you make
in one will also appear in the other. But you can narrow each one separately.
So I was able to narrow to the first class in the main buffer, and narrow to
the second class in the second buffer, and then search/study/explore as if
they were two entirely different files. When you're done, just close the
second buffer.</p>
phishing training2010-11-30T10:57:00-08:002010-11-30T10:57:00-08:00Brian Warnertag:www.lothar.com,2010-11-30:/blog/43-phishing-training/<p>I stopped by the bank this morning to make a deposit. While fussing with the
ATM machine, I was listening to a nearby bank employee making a phone call.
His side of the conversation went like: "Hi, this is Bob from $YOURBANK. Your
father just opened an account with us …</p><p>I stopped by the bank this morning to make a deposit. While fussing with the
ATM machine, I was listening to a nearby bank employee making a phone call.
His side of the conversation went like: "Hi, this is Bob from $YOURBANK. Your
father just opened an account with us, and I'd like to give you the referral
credit for it, but I don't have your account number here. Could you read it
off your ATM card to me?"</p>
<p>Wow. Step one: decide what is secret and what isn't, and then be consistent
in how you ask users to deal with them. Training users to reveal secrets to
anyone with a convincing pitch may not be serving them well in the long run.</p>
<p>It also reminds me of the joke: the definition of "secret" is a piece of
information that, when you tell it to someone, you also tell them to not tell
it to anyone else.</p>
projects2010-11-24T13:49:00-08:002010-11-24T13:49:00-08:00Brian Warnertag:www.lothar.com,2010-11-24:/blog/42-projects/<p>Must.. write.. more. I'm trying to get over the temptation to rewrite my blog
software again (probably using <a class="reference external" href="https://github.com/mojombo/jekyll/">Jekyll</a>).
My blog-yak-shaving process works like this: "Oh, here's an interesting idea,
I should blog about it. But my blog software is kind of annoying, I should
really rewrite it first. Maybe …</p><p>Must.. write.. more. I'm trying to get over the temptation to rewrite my blog
software again (probably using <a class="reference external" href="https://github.com/mojombo/jekyll/">Jekyll</a>).
My blog-yak-shaving process works like this: "Oh, here's an interesting idea,
I should blog about it. But my blog software is kind of annoying, I should
really rewrite it first. Maybe I'll do it in git this time. (/me goes to
start rewriting it). Oh, look, it's already in git, I guess I've been over
this before. Hm, what other ways might I put off actually writing English
instead of Python?".</p>
<p>Anyways. A few things of note from the last year:</p>
<ul class="simple">
<li>my day job has me working on <a class="reference external" href="https://github.com/mozilla/addon-sdk">Jetpack</a>, an SDK to make it easier/safer
to build add-ons for Firefox (and other Mozilla applications)</li>
<li><a class="reference external" href="http://nodejs.org/">node.js</a> is looking interesting: like Twisted for
Javascript</li>
<li>I still hack on <a class="reference external" href="http://tahoe-lafs.org/">Tahoe-LAFS</a> each day on the
train.</li>
<li><a class="reference external" href="http://www.bitcoin.org/">Bitcoin</a> is my new fascination, a
decentralized cryptographically-backed currency</li>
<li>I use Git everywhere I can.</li>
<li><a class="reference external" href="http://foolscap.lothar.com/">Foolscap</a> is still going strong. I've got
a hack to let you push/pull Git over a FURL, but now that Git has
"git-remote-helpers" support, I need to rewrite that hack.</li>
<li>My coworker Atul has a project named <a class="reference external" href="http://www.toolness.com/wp/?p=678">Pydermonkey</a> which is a Python binding for the
SpiderMonkey JS engine. I want to tie this into either Foolscap or Tahoe,
to create a safe-remote-code-execution environment. I'm still trying to
sort out how it ought to work (in particular, starting with what it might
be good for).</li>
</ul>
darcs-fast-export2009-06-24T12:03:00-07:002009-06-24T12:03:00-07:00Brian Warnertag:www.lothar.com,2009-06-24:/blog/41-darcs-fast-export/<p>So idnar just turned me on to <a class="reference external" href="http://vmiklos.hu/project/darcs-fast-export/">darcs-fast-export</a>, which can be used with
git-fast-import to quickly convert a repository from darcs to git. I've been
using Git more and more in the last few months, and I'm growing quite fond of
it. Tahoe is managed in darcs, and I've been …</p><p>So idnar just turned me on to <a class="reference external" href="http://vmiklos.hu/project/darcs-fast-export/">darcs-fast-export</a>, which can be used with
git-fast-import to quickly convert a repository from darcs to git. I've been
using Git more and more in the last few months, and I'm growing quite fond of
it. Tahoe is managed in darcs, and I've been using a private Git mirror to
manage the several dozen feature branches that I work on at any given moment.
I wanted to make a more-official mirror that would be reasonable to publish
on <a class="reference external" href="http://github.com/">GitHub</a>.</p>
<p>I had to patch the darcs-fast-export script a little bit, one because our
darcs repository happens to have some bad (non-UTF8) characters in some old
patches (before darcs started rejecting those), and two because I wanted to
preserve our tag names (like "allmydata-tahoe-1.4.1", and darcs-fast-export
was squashing the hyphens down to underscores).</p>
<p>Tahoe has about 4000 patches. darcs-fast-export started doing about
170ms/patch (20 patches per second), and towards the end of the job is
slowing to about 1.1s/patch. In contrast, when I first tried the conversion
with tailor, the "darcs pull" operation was taking about 20 seconds per
patch. Tailor finished after 13.5 hours. darcs-fast-export took 42 minutes.</p>
<p>darcs-fast-export also takes care of incremental updates, so I can update the
mirror later as more darcs patches arrive. It also suggests that it can be
used bidirectionally. I might start using this to move my git patches back
into trunk.</p>
Foolscap-0.4.2 released2009-06-20T12:59:00-07:002009-06-20T12:59:00-07:00Brian Warnertag:www.lothar.com,2009-06-20:/blog/40-Foolscap/<p>I've released foolscap-0.4.2 .. download it from
<a class="reference external" href="http://foolscap.lothar.com/trac">http://foolscap.lothar.com/trac</a> .
</p>
<p>I made the relase last week, and as usual
I've managed to not send out the announcement email yet. One reason for that
is that I wanted to blog about it first, and I've started using a …</p><p>I've released foolscap-0.4.2 .. download it from
<a class="reference external" href="http://foolscap.lothar.com/trac">http://foolscap.lothar.com/trac</a> .
</p>
<p>I made the release last week, and as usual
I've managed to not send out the announcement email yet. One reason for that
is that I wanted to blog about it first, and I've started using a
foolscap-0.4.2 -based tool to manage my blog, and I effectively got into a
circular dependency between the blog and the blog software, with the email
depending upon both.</p>
<p>The big new feature in this release is the "FooLscap APPlication SERVER", or
"flappserver". It's like twistd for foolscap, enabling non-programmers to
deploy pre-written tools without needing to write new code. twistd makes it
easy to create and launch things like a web server or FTP server. flappserver
makes it easy to create and launch a service which is accessed remotely via a
secure FURL. There is a corresponding "flappclient" which takes a FURL (and
some arguments) and does something with that service. The service runs as
whichever user started the server, and it's easy to daemonize the server and
run it in the background. Typically you'd start the server from a @reboot
crontab entry or /etc/init.d script or LaunchAgent.plist file.</p>
<p>Eventually flappserver will have a plugin mechanism, but for now it comes
with two remarkably useful basic services. The first is named "upload-file":
the client provides the file and basename, the server provides the directory.
It's like a write-only drop-box, accessed with a FURL. This is great for
buildslaves that need to drop a generated package into some world-visible
directory: the buildslave can touch that one directory and no others, and
there are no funny filenames or shell-escape tricks it can use to break out
of there.</p>
<p>The second service is named "run-command": the server controls everything
about the command: executable, arguments, and working directory. The client
just gets to push the button. It's like a remote-garage-door-opener for
program execution. Optionally, the client can pass stdin and get stdout,
letting you use it like a secure network pipe to a server that's run
on-demand, sort of like inetd but with actual security.</p>
<p>It is nominally possible to do this sort of thing over SSH, but you have to
start by creating a keypair for each purpose and adding it to your
authorized_keys file, then figure out what sort of command= option to add
to keep that key from being able to control your entire account (which
usually means writing a script to implement the exact functionality you <em>do</em>
want to offer), then hope that nothing they send as an environment variable
will compromise your security, then give them the 600-plus-character-long
pubkey, then have them write a script which translates their input arguments
into some "ssh -i single-purpose-key hostname args-for-processing"
command.</p>
<p>With a running flappserver, it's just:</p>
<blockquote>
flappserver add ~/server upload-file ~/incoming  # returns FURL<br />
flappclient --furl FURL upload-file foo.jpg</blockquote>
<p>As a demo of what you can do with those two tools, I've started to update
this very blog's back-end Git repository over a flappserver-based connection.
The half-a-dozen computers that I use all have a copy of the "update my blog"
FURL (really the "run git-daemon in the blog entries directory" FURL). The
details are in the foolscap source tree, in doc/examples (in TRUNK, not in
0.4.2). More about this in the next post.</p>
moved blog to git2009-06-19T13:20:00-07:002009-06-19T13:20:00-07:00Brian Warnertag:www.lothar.com,2009-06-19:/blog/39-moved-blog-to-git/<p>I just finished moving this weblog to be managed in a Git repository, using
the scheme described in
<a class="reference external" href="http://joemaller.com/2008/11/25/a-web-focused-git-workflow/">http://joemaller.com/2008/11/25/a-web-focused-git-workflow/</a> . Plus, I'm
running the connection over Foolscap.. more on that in a moment if this
update actually works..</p>
<p>I just finished moving this weblog to be managed in a Git repository, using
the scheme described in
<a class="reference external" href="http://joemaller.com/2008/11/25/a-web-focused-git-workflow/">http://joemaller.com/2008/11/25/a-web-focused-git-workflow/</a> . Plus, I'm
running the connection over Foolscap.. more on that in a moment if this
update actually works..</p>
web updates2008-05-29T18:59:00-07:002008-05-29T18:59:00-07:00Brian Warnertag:www.lothar.com,2008-05-29:/blog/38-web-updates/<p>I finally updated the system that hosts <a class="reference external" href="http://buildbot.net">http://buildbot.net</a> and
<a class="reference external" href="http://foolscap.lothar.com">http://foolscap.lothar.com</a> (a dedicated VM that just runs apache for CGIs,
needed to make trac and mod_python work well). Upgrading it from edgy to
anything newer was a hassle, because the "update-manager" package that I
wanted to …</p><p>I finally updated the system that hosts <a class="reference external" href="http://buildbot.net">http://buildbot.net</a> and
<a class="reference external" href="http://foolscap.lothar.com">http://foolscap.lothar.com</a> (a dedicated VM that just runs apache for CGIs,
needed to make trac and mod_python work well). Upgrading it from edgy to
anything newer was a hassle, because the "update-manager" package that I
wanted to use wasn't installed, and because edgy is now too old to appear on
most Ubuntu mirrors. It does appear on <a class="reference external" href="http://old-releases.ubuntu.com">http://old-releases.ubuntu.com</a>,
but unfortunately the update-manager package doesn't work unless both
the "from" and the "to" releases are available on the same APT repository.
Since feisty isn't old enough for old-releases yet, there's nothing you can
put in your /etc/apt/sources.list that will appease update-manager.</p>
<p>So I had to do it the old-fashioned way: change sources.list, apt-get update,
apt-get dist-upgrade . That worked, but then trac broke: the default version
of python switched from 2.4 to 2.5, and none of the trac plugins I was using
had eggs that were built for 2.5 . I decided to upgrade all the way to hardy
before trying to fix anything else.</p>
<p>After fixing the eggs, it turned out that python-clearsilver in hardy is just
broken: it doesn't include a 2.5 version, and I guess it was trying to make
do with a 2.4 version, because I was getting errors about missing symbols. I
finally found <a class="reference external" href="https://bugs.launchpad.net/ubuntu/+source/trac/+bug/114930">https://bugs.launchpad.net/ubuntu/+source/trac/+bug/114930</a> and
followed the advice to rebuild the python-clearsilver package with the right
version of python.</p>
<p>I also had to upgrade the trac databases in the process, but that's an easy
"trac-admin TRACDIR upgrade".</p>
<p>And now everything is working again, with only an hour of unexpected
downtime.</p>
pastebinit2008-05-28T18:34:00-07:002008-05-28T18:34:00-07:00Brian Warnertag:www.lothar.com,2008-05-28:/blog/37-pastebinit/<p>Another package that appeared in debian today: pastebinit, which is a
command-line tool to upload bits of code to some of the various pastebin web
servers out there (handy when you want to discuss some code over IRC and
don't want to jam the whole thing into the channel.. it …</p><p>Another package that appeared in debian today: pastebinit, which is a
command-line tool to upload bits of code to some of the various pastebin web
servers out there (handy when you want to discuss some code over IRC and
don't want to jam the whole thing into the channel.. it is much more polite
to put it in a pastebin and then refer to it by URL).</p>
<p>Now what I want is an emacs interface to this, since the code I'd be
referring to would always come from one of my emacs buffers anyways.</p>
Mutation Testing2008-05-28T18:24:00-07:002008-05-28T18:24:00-07:00Brian Warnertag:www.lothar.com,2008-05-28:/blog/36-Mutation-Testing/<p>I've often thought that it would be a great idea to test your test suite by
randomly changing bits of code and seeing if the tests catch it. It turns out
that other people feel the same way: I just saw a Ruby library named "Heckle"
show up in debian …</p><p>I've often thought that it would be a great idea to test your test suite by
randomly changing bits of code and seeing if the tests catch it. It turns out
that other people feel the same way: I just saw a Ruby library named "Heckle"
show up in debian sid (the package is named libheckle-ruby). The blurb says:</p>
<blockquote>
Heckle is a mutation tester. It modifies your code and runs your tests to
make sure they fail. The idea is that if code can be changed and your tests
don't notice, either that code isn't being covered or it doesn't do
anything.</blockquote>
<p>In a security context, this is similar to an approach thought up by (I
believe) David Wagner, Ka-Ping Yee, and Mark Miller, during the security
analysis of Ping's electronic voting software. The unusual challenge was that
the defined security goal was to be safe against the author of the software,
not just the usual malicious attackers (who try to provide bad input, or make
the code act in surprising ways). Their scheme was to have one team modify
the code to insert intentional errors (or opportunities for mischief), then
the second team try to find those errors. If the second team finds other
errors, then the code is obviously buggy, and loses. If the second team can't
find the errors, then the code is too complicated to analyze, and it loses.
If the design of the code is so straightforward that bugs and backdoors stand
out like a sore thumb, the code wins.</p>
<p>Of course, this requires really good, really tightly specified unit tests. In
my experience, if you're using the right language, a test that specifies the
desired result so precisely is effectively your functional code anyways, so
you have to be careful to define your tests in some way that doesn't mean
you're writing the same code twice.</p>
<p>I don't know Ruby, but I may need to learn enough about it to be able to read
this Heckle library and see if it can be ported to Python.</p>
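<p>The core loop is small enough to sketch in Python directly. This is my
guess at the shape of a Heckle-like tool, not Heckle's actual design: parse a
function, flip one operator, recompile, and see whether the "test suite"
notices:</p>

```python
import ast
import copy

# A toy program-under-test and its (tiny) test suite.
SRC = "def add(a, b):\n    return a + b\n"

def run_tests(ns):
    return ns["add"](2, 3) == 5

tree = ast.parse(SRC)

# Baseline: the unmutated code must pass its tests.
ns = {}
exec(compile(tree, "<src>", "exec"), ns)
assert run_tests(ns)

# Mutate: replace every Add operator with Sub, then re-run the tests.
mutant = copy.deepcopy(tree)
for node in ast.walk(mutant):
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        node.op = ast.Sub()
ast.fix_missing_locations(mutant)

ns2 = {}
exec(compile(mutant, "<mutant>", "exec"), ns2)
caught = not run_tests(ns2)
print(caught)  # → True: the suite noticed the mutation
```

<p>A test suite that still passes after the flip either isn't covering that
line or is asserting too little, which is exactly Heckle's point.</p>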
Emacs Trick of the Day2008-05-28T18:15:00-07:002008-05-28T18:15:00-07:00Brian Warnertag:www.lothar.com,2008-05-28:/blog/35-Emacs-Tricks/<p>There are a few million gems hidden inside emacs. The two that I ran into
most recently are:</p>
<p>C-x r m, C-x r b, C-x r l : these create named bookmarks, each of which
records the file that you're visiting and a position within that file. When I
need to …</p><p>There are a few million gems hidden inside emacs. The two that I ran into
most recently are:</p>
<p>C-x r m, C-x r b, C-x r l : these create named bookmarks, each of which
records the file that you're visiting and a position within that file. When I
need to hold my place while I look elsewhere, I usually split the window
(C-x 2) and leave one of them fixed while I move around in the other one to
find something. Then C-x 0 makes that window go away, leaving me in my
original position. But if you do that too deeply, the windows get too small.</p>
<p>C-x r m creates a bookmark, and the name defaults to the name of the file (so
if you only use one bookmark per file, you don't even have to type anything).
Then C-x r b jumps back to that bookmark. C-x r l lists all your bookmarks.</p>
<p>Bookmarks can also be persistent.</p>
<p>highlight-trailing-space: by setting this to 't', any trailing whitespace
will be highlighted in an ugly orange color that makes you want to delete it
right away. Darcs does the same thing when you're committing code (it shows
you a special "[_$_]" -like symbol to make you aware of the whitespace at
the end of the line), so I've been in the habit of deleting that whitespace
anyways.. even wrote a little python tool to find it all for me. With
highlight-trailing-space turned on, I get to see the whitespace as I'm
editing, so I can remove it earlier.</p>
Levenshtein Distance2008-04-28T18:45:00-07:002008-04-28T18:45:00-07:00Brian Warnertag:www.lothar.com,2008-04-28:/blog/34-Levenshtein-Distance/<p>A library just showed up in debian ("python-levenshtein") to measure the
<a class="reference external" href="http://en.wikipedia.org/wiki/Levenshtein_Distance">Levenshtein Distance</a>
between two strings: the minimum number of edits (inserts, changes, deletes)
necessary to turn one string into another.</p>
<p>I've been thinking about ways to implement efficiently-edited large mutable
files for <a class="reference external" href="https://tahoe-lafs.org">Tahoe</a>, and it seems like a tool …</p><p>A library just showed up in debian ("python-levenshtein") to measure the
<a class="reference external" href="http://en.wikipedia.org/wiki/Levenshtein_Distance">Levenshtein Distance</a>
between two strings: the minimum number of edits (inserts, changes, deletes)
necessary to turn one string into another.</p>
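<p>The distance itself is a classic dynamic-programming exercise; a minimal
pure-Python version (not the C-accelerated code in python-levenshtein) looks
like:</p>

```python
def levenshtein(a: str, b: str) -> int:
    # prev[j] holds the cost of turning a[:i-1] into b[:j]; we sweep
    # one row per character of `a`, keeping only two rows in memory.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # change (or keep)
        prev = cur
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # → 3
```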
<p>I've been thinking about ways to implement efficiently-edited large mutable
files for <a class="reference external" href="https://tahoe-lafs.org">Tahoe</a>, and it seems like a tool
like this might help. Something clever like what rsync does is probably going
to be involved too. The trick is that you want to determine what deltas to
store without reading the whole file over the wire, from a server who isn't
allowed to see the plaintext. You can store whatever ciphertext hashes you
want on the far end. We're planning to provide insert/delete delta messages
on the server side, using something like Mercurial's "revlog" format. The
question is how to efficiently figure out the deltas on a very large file.</p>
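<p>The non-rolling half of the rsync idea can be sketched as comparing
fixed-size block hashes, which a server can store and compare without ever
seeing plaintext (the block size and data here are toys, and a real version
would hash ciphertext blocks):</p>

```python
import hashlib

BLOCK = 4  # toy block size; real systems use kilobytes

def block_hashes(data: bytes) -> list:
    # One SHA-256 digest per fixed-size block.
    return [hashlib.sha256(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)]

old = b"the quick brown fox!"
new = b"the quick green fox!"

# Only blocks whose hashes differ need to be re-uploaded.
changed = [i for i, (a, b) in
           enumerate(zip(block_hashes(old), block_hashes(new)))
           if a != b]
print(changed)  # → [2, 3]
```

<p>The catch is that a single-byte insertion shifts every later block, which
is exactly why rsync layers a rolling checksum on top of this, and why a
revlog-style insert/delete delta is the more natural storage format.</p>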
sparkfun toys2007-07-20T18:46:00-07:002007-07-20T18:46:00-07:00Brian Warnertag:www.lothar.com,2007-07-20:/blog/33-sparkfun-toys/<p>I was thumbing through some of my old <a class="reference external" href="http://del.icio.us/warner">del.icio.us</a> bookmarks today, and came across <a class="reference external" href="http://sparkfun.com">sparkfun
electronics</a> again. Man, their coolness doubles in
size every six months. $25 for a half-inch square self-contained <a class="reference external" href="http://www.sparkfun.com/commerce/product_info.php?products_id=152#">radio data
link</a>,
serial interface that you can run with a microcontroller, 3V, built-in
antenna. Wow …</p><p>I was thumbing through some of my old <a class="reference external" href="http://del.icio.us/warner">del.icio.us</a> bookmarks today, and came across <a class="reference external" href="http://sparkfun.com">sparkfun
electronics</a> again. Man, their coolness doubles in
size every six months. $25 for a half-inch square self-contained <a class="reference external" href="http://www.sparkfun.com/commerce/product_info.php?products_id=152#">radio data
link</a>,
serial interface that you can run with a microcontroller, 3V, built-in
antenna. Wow. $6 for a white Luxeon 1W LED ($8 for 3W, $25 for 5W). $5 for a
1W Luxeon that's TWO FRIGGING MILLIMETERS on a side. Holy crap.</p>
<p>And $20 for a color LCD like the ones from a cellphone. And speaking of
cellphones, $184 gets you a quad-band cellphone module with a GPS receiver,
camera driver, and a python interpreter. Add an antenna, a battery, a serial
port, and a SIM card, and you've got a mobile data node. And I think you can
even get prepaid SIM cards that can be topped-off online.</p>
<p>(note to self, places like <a class="reference external" href="http://www.myworldphone.com/prepaidsim.html">this</a> sell such cards, generally 5
to 20 cents per minute, which can be recharged with scratch-off coupons. And
it looks like you can buy them from retail cellphone shops too. They all come
with a phone number.. no wonder the phone numberspace is getting so crowded,
you can buy them from vending machines in some countries..)</p>
<p>Each time I visit these folks (or browse through the digikey catalog, or just
look through my old notebooks), I feel such a strong drive to build
something. The delay involved in actually getting the parts usually means I
don't get around to doing it. But maybe if I just keep buying stuff and
stocking my workbench then the next time I'm in a construction mood I'll have
everything I need already at hand and I can just start soldering away...</p>
trac spam2007-07-17T00:47:00-07:002007-07-17T00:47:00-07:00Brian Warnertag:www.lothar.com,2007-07-17:/blog/32-trac-spam/<p>Oh happy day! The <a class="reference external" href="http://buildbot.net">buildbot.net</a> trac instance just
recently got visited by the link spammers. They haven't caused any actual
damage yet, just a user account created with advertising in the profile text,
but I'm afraid it's only a matter of time before the bots descend upon us and …</p><p>Oh happy day! The <a class="reference external" href="http://buildbot.net">buildbot.net</a> trac instance just
recently got visited by the link spammers. They haven't caused any actual
damage yet, just a user account created with advertising in the profile text,
but I'm afraid it's only a matter of time before the bots descend upon us and
we're smothered by a wave of sentient AIs dedicated to filing mass buildbot
bug reports containing nothing but links to offshore casinos and faux
designer watches.</p>
<p>sigh.</p>
<p>I guess I should add some sort of "prove you can read" test to the
account-creation page, just barely enough to make the script kiddies work for
a living. Something like "what is 1+2?" or "type the word 'please' in here"
or something.</p>
<p>Reminds me of a suggestion someone made to me while I was working on <a class="reference external" href="http://petmail.lothar.com">petmail</a>: you don't need super-clever CAPTCHA
techniques if you can manage to have a whole bunch of different requirements
instead, like each user creating their own simple technique. A bot could be
written to mass-solve any particular one, but since everybody is creating
their own, the bot-writers job is that much harder.</p>
<p>And sometimes, just raising the bar a bit is good enough for now. As the joke
goes, I don't have to outrun the lion.. I just have to outrun <strong>you</strong> :-).</p>
foolscap.lothar.com2007-07-13T12:41:00-07:002007-07-13T12:41:00-07:00Brian Warnertag:www.lothar.com,2007-07-13:/blog/31-foolscap.lothar.com/<p>I just finished building a Trac instance for Foolscap, now online at
<a class="reference external" href="http://foolscap.lothar.com/trac">http://foolscap.lothar.com/trac</a> . It's got a (mercurial-based) code browser,
tickets, and a wiki.</p>
<p>Setting it up required some twisted.web hacking, because my setup puts a
twisted.web server out front, and reverse-proxies certain requests to …</p><p>I just finished building a Trac instance for Foolscap, now online at
<a class="reference external" href="http://foolscap.lothar.com/trac">http://foolscap.lothar.com/trac</a> . It's got a (mercurial-based) code browser,
tickets, and a wiki.</p>
<p>Setting it up required some twisted.web hacking, because my setup puts a
twisted.web server out front, and reverse-proxies certain requests to a
separate Xen virtual machine which handles all CGI (for multiple sites, like
buildbot.net and foolscap.lothar.com). That CGI host is running apache, and
since URLs inside returned pages are not being rewritten, I had to use named
virtual hosts to distinguish between, say, <a class="reference external" href="http://buildbot.net/trac">http://buildbot.net/trac</a> and
<a class="reference external" href="http://foolscap.lothar.com/trac">http://foolscap.lothar.com/trac</a> .</p>
<p>But the normal twisted.web.proxy <a class="reference external" href="http://twistedmatrix.com/trac/browser/trunk/twisted/web/proxy.py#L158">ReverseProxyResource</a>
clobbers the Host: header when it forwards the request (setting it equal to
the new host being targeted). I suppose this is to hide the presence of the
proxy from the new host, but in my situation it has the effect of making it
impossible to use vhosts on the apache side to distinguish between requests
that were received for different hostnames.</p>
<p>So I subclassed and commented out that line, and apache is happy. Now that I
can have more than one trac instance on this box, I'm creating Tracs for
everything. Whee!</p>
mercurial2007-07-09T21:29:00-07:002007-07-09T21:29:00-07:00Brian Warnertag:www.lothar.com,2007-07-09:/blog/30-mercurial/<p>Wow, so long since I updated this. Each time I remember that I <strong>do</strong> have
a technical blog, and think to add something to it, I am tempted to start by
rewriting the whole blog system in some brand new way that will make it
easier to post to (and …</p><p>Wow, so long since I updated this. Each time I remember that I <strong>do</strong> have
a technical blog, and think to add something to it, I am tempted to start by
rewriting the whole blog system in some brand new way that will make it
easier to post to (and, the theory goes, therefore make me more likely to
write in it). The process of writing more code creates something that I'm
even less likely to understand next time, and code begets more code. It's
like a depth-first search through an infinite design space. Bad idea.</p>
<p>And speaking of technical distractions, I've been playing with <a class="reference external" href="http://www.selenic.com/mercurial/wiki/">Mercurial</a> recently. I like it. I moved
<a class="reference external" href="http://foolscap.lothar.com">Foolscap</a> from Darcs to Mercurial last week,
mostly to learn more about it, and I've been pleased. My main reason was to
make it easier for folks to hack on Foolscap: darcs is all fine if you're
running debian and someone else has compiled it for you, but if you have to
build it yourself you have to start by building GHC, which is a non-trivial
adventure.</p>
<p>Mercurial's plugin architecture is pretty nice: one line in the .hgrc file
tells it to import a .py file, which registers a set of new subcommands with
the main /usr/bin/hg entry point. Which reminds me that I want to adapt
Trac's plugin mechanism (which lets you drop an .egg file in a specific
directory and then reference modules inside it from the config file) to
Buildbot, to make it easier for users to get interesting code into their
master.cfg files. Not that huge of a change, but it would make the
installation instructions for that code simpler; no need to change
sys.path from within master.cfg.</p>
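<p>The hook-up is small enough to show. The module below follows the classic
Mercurial extension shape (a command function taking ui and repo, registered
through a module-level cmdtable); the "hello" command and the file path are
made-up examples, not a real extension:</p>

```python
# Hypothetical extension module, loaded via a line like
#   [extensions]
#   hello = ~/hgext/hello.py
# in .hgrc. Mercurial imports the module and merges cmdtable into
# its own command dispatch table, so "hg hello" becomes available.

def hello(ui, repo, **opts):
    """print a greeting from the repository root"""
    ui.write("hello from %s\n" % repo.root)

# command name -> (function, option list, synopsis string)
cmdtable = {
    "hello": (hello, [], "hg hello"),
}
```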
<p>And because the plugin approach makes it easy, people are writing fun
plugins. The Tk-based graphical revision browser is great (and has a little
tram-line-style graph of which revisions got merged into which, very cute).
The 'bisect' extension helps you do an efficient binary search for the
revision which introduced (or fixed) a bug.</p>
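<p>What the bisect extension automates is an ordinary binary search over the
revision history; a minimal sketch of the idea (the function name is mine,
not the extension's):</p>

```python
def first_bad(revisions, is_bad):
    """Binary search for the first bad revision, assuming history is
    good up to some point and bad from there on. Calls is_bad() only
    about log2(len(revisions)) times, which is the whole appeal:
    testing one revision (update, build, run the test) is the
    expensive step."""
    lo, hi = 0, len(revisions) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid          # the culprit is at mid or earlier
        else:
            lo = mid + 1      # the culprit is after mid
    return revisions[lo]
```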
<p>I'm still trying to figure out the "forest" extension, though. I think it's
what I want for tracking a couple dozen separate small projects (things I've
been doing in CVS for years, since I can update just one at a time, or commit
the whole lot of them and push the work from my laptop to my desktop). But
for the life of me I can't figure out how to use it, and the documentation is
heavy on the per-subcommand reference and light on the big-picture
descriptions.</p>
<p>And mercurial is <strong>fast</strong>. The cgi-based web server lets them speed up the
initial checkout: for the full Foolscap repository, doing a 'darcs get'
through the naive (twisted.web) server took 22 seconds (of which probably 17
was network), whereas doing the equivalent 'hg clone' from a hgwebdir.cgi
server (under apache) took 6 seconds total. Mercurial manages to store the history
more compactly too: the tree with full history under darcs was 4.4MB, and
2.9MB in hg.</p>
<p>I've been using Darcs for a year or two now, and we've been using it
extensively at <a class="reference external" href="http://allmydata.org">work</a>, and it's fun (the incremental
commit feature is amazing; I miss it in hg, though it wouldn't be impossible
to add). But every once in a while something explodes (possibly because we've
used 'darcs oblit' more than once, and that seems to be an underexplored
corner of the darcs jungle). I <strong>really</strong> like the append-only and
cryptographically-secure nature of hg revisions, and regret that you can't
securely and concisely name a specific darcs revision the way you can with
mercurial. Having spent a lot of time defining sha-256 hash-based identifiers
recently, I'm coming to be wary of any system that doesn't let me create
strong references like that.</p>
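<p>The kind of strong reference meant here is simple to state: name a thing by
the hash of its contents, so the name can be verified rather than merely
trusted. A minimal sketch using Python's hashlib:</p>

```python
import hashlib

def strong_name(content):
    """Name a blob of bytes by its SHA-256 digest (hex). Anyone
    holding the name can check that a blob claiming to be that
    revision actually is, by recomputing the hash; no server or
    registry needs to be trusted."""
    return hashlib.sha256(content).hexdigest()

def verify(name, content):
    # a reference is "strong" exactly because this check exists
    return strong_name(content) == name
```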
<p>So I'm looking forward to playing with it more. Commuting patches is nifty,
but for things like Buildbot and Foolscap I'm not really creating crazy
branches with patches that need to be held out of trunk for months at a time.
So I think hg has a lot of promise.</p>
forgetfulness-based development2007-03-05T17:55:00-08:002007-03-05T17:55:00-08:00Brian Warnertag:www.lothar.com,2007-03-05:/blog/29-forgetfullness-based-development/<p>You're probably familiar with eXtreme Programming, and branch-based
development, and agile development. But I've discovered that I've been using
a new technique recently, that I call Forgetfulness-Based Development. The
way it works is this: I come up with something insanely complicated, that
takes me weeks to get my head around …</p><p>You're probably familiar with eXtreme Programming, and branch-based
development, and agile development. But I've discovered that I've been using
a new technique recently, that I call Forgetfulness-Based Development. The
way it works is this: I come up with something insanely complicated, that
takes me weeks to get my head around and document and implement and test, but
seems like it's the best way to solve whatever the current problem is. And
then I go away on vacation for two weeks, and forget absolutely everything
about it. And then I come back, and look at it again, and discover how little
I can understand. After a few days of cursing the fool who wrote the insane
thing, I start seeing ways that it could be done more simply, or more
generally, or more robustly, or more understandably. And then I write some
more code to replace the old stuff.</p>
<p>Lather, rinse, repeat, and eventually you wind up with a design that solves
the problem <em>and</em> makes sense to a new employee/developer. As the python
folks say, Readability Matters. And as Brian Kernighan says: "Debugging is
twice as hard as writing the code in the first place. Therefore, if you write
the code as cleverly as possible, you are, by definition, not smart enough to
debug it."</p>
<p>(of course, to make this work right, you have to take a lot of vacations. but
usually it's a sacrifice I'm willing to make.)</p>
PyCon2007, Buildbot2007-03-01T14:06:00-08:002007-03-01T14:06:00-08:00Brian Warnertag:www.lothar.com,2007-03-01:/blog/28-PyCon/<p>I just got back from <a class="reference external" href="http://us.pycon.org/TX2007/HomePage">PyCon</a>.
Highly inspirational as always, saw some fascinating projects and some
thought-provoking keynotes. r0ml's talk in particular has me thinking about
how to structure code as a narrative, trying to bring the world of
human-to-human communication and the world of human-to-machine communication
closer together. He …</p><p>I just got back from <a class="reference external" href="http://us.pycon.org/TX2007/HomePage">PyCon</a>.
Highly inspirational as always, saw some fascinating projects and some
thought-provoking keynotes. r0ml's talk in particular has me thinking about
how to structure code as a narrative, trying to bring the world of
human-to-human communication and the world of human-to-machine communication
closer together. He had a lot to say about parallels between the development
of writing systems (the introduction of random-access pages in a book rather
than linear-access scrolls, the use of standardized fonts, the use of spaces
between words) and the development of programming languages.</p>
<p>I ran a Buildbot BOF, and had about 25 people show up! There are a lot of
folks out there using this thing. Very gratifying.</p>
<p>I spent a few days sprinting, mostly working with Eric Mangold (aka teratorn)
on a Buildbot <a class="reference external" href="http://buildbot.net/repos/trac-plugin/">plugin</a> for Trac.
It's starting to take shape nicely.</p>
<p>Also, I foolishly walked into a room where a bunch of people were playing a
PyGame space-themed production game called <a class="reference external" href="http://pygame.org/projects/20/340/">Galcon</a>, and stupidly installed it. It's
amazingly addictive for such a straightforward game. Now I'm seeing little
spaceships launching and crashing into planets every time I close my eyes.
I'm hopeful that the hallucinations will only last a few days.</p>
Trac2007-01-29T02:29:00-08:002007-01-29T02:29:00-08:00Brian Warnertag:www.lothar.com,2007-01-29:/blog/27-Trac/<p>I've been setting up a <a class="reference external" href="http://trac.edgewall.org/">Trac</a> instance for
<a class="reference external" href="http://buildbot.sf.net">Buildbot</a>, to make it easier for people other
than me to publish advice and tips in a persistent and easily-searchable
fashion, also to make the Buildbot web page a little bit less ugly. Trac is
quite spiffy, and I've been looking over …</p><p>I've been setting up a <a class="reference external" href="http://trac.edgewall.org/">Trac</a> instance for
<a class="reference external" href="http://buildbot.sf.net">Buildbot</a>, to make it easier for people other
than me to publish advice and tips in a persistent and easily-searchable
fashion, also to make the Buildbot web page a little bit less ugly. Trac is
quite spiffy, and I've been looking over the <a class="reference external" href="http://trac-hacks.org/wiki">Trac Hacks</a> page at the wide variety of neat plugins that
are available. In particular the one that exposes wiki-page editing via
XMLRPC (in conjunction with the emacs wiki-editing tool) is quite intriguing.</p>
<p>I hope that one day Buildbot will have a list of plugins like that.</p>
utilities2006-10-07T15:12:00-07:002006-10-07T15:12:00-07:00Brian Warnertag:www.lothar.com,2006-10-07:/blog/26-utilities/<p><tt class="docutils literal">/usr/bin/watch</tt> is a little utility that will erase the screen, run a
command, sleep for a few seconds, then repeat. You can use it to follow files
in /proc without continually re-typing the command.</p>
<p>This program has been around since 1991. How is it that I've been unaware …</p><p><tt class="docutils literal">/usr/bin/watch</tt> is a little utility that will erase the screen, run a
command, sleep for a few seconds, then repeat. You can use it to follow files
in /proc without continually re-typing the command.</p>
<p>This program has been around since 1991. How is it that I've been unaware of
it all this time? How many other thousands of useful tools like this are
lurking on my system <strong>right now</strong> that I've remained ignorant of?</p>
<p>So the advice of the day: spend some time getting to know your <tt class="docutils literal">/usr/bin</tt>
directory. Tomorrow, make it a point to learn a new emacs keybinding that
you've never used before (if you don't know about M-/, start with that one).</p>
promise syntax2006-09-25T23:32:00-07:002006-09-25T23:32:00-07:00Brian Warnertag:www.lothar.com,2006-09-25:/blog/25-promise-syntax/<p>Zooko's in town, and already I feel 20% smarter. I roped him into a
discussion about the Promise syntax I'm developing for Foolscap, and he
suggested an alternative that has some good properties.</p>
<p>I'll illustrate with an example where promise-pipelining actually does you
some good. (many of the use cases …</p><p>Zooko's in town, and already I feel 20% smarter. I roped him into a
discussion about the Promise syntax I'm developing for Foolscap, and he
suggested an alternative that has some good properties.</p>
<p>I'll illustrate with an example where promise-pipelining actually does you
some good. (many of the use cases I've been thinking of involve some sort of
publish/subscribe scheme, and in those cases you win almost nothing with
pipelining). I'm imagining a theoretical Buildbot status interface using
newpb, and a tool that wants to connect to the buildmaster and retrieve the
results of the latest build for a given Builder. The oldpb code would look
like this:</p>
<pre class="literal-block">
# Example 1
def checkResults(results):
    if results == SUCCESS:
        print "yay!"
def oops(failure):
    print "boo"
#
s = getStatus()
d = s.callRemote("getBuilder", "python-2.4-full")
d.addCallback(lambda builder: builder.callRemote("getBuild", -1))
d.addCallback(lambda build: build.callRemote("getResults"))
d.addCallback(checkResults)
d.addErrback(oops)
</pre>
<p>The syntax I've currently got in Foolscap would make it look like this:</p>
<pre class="literal-block">
# Example 2
s = getStatus()
b = send(s).getBuilder("python-2.4-full")
b1 = send(b).getBuild(-1)
r = send(b1).getResults()
when(r).addCallback(checkResults).addErrback(oops)
</pre>
<p>The big win with the promise pipelining is that all 3 calls (4 if you include
getStatus) take place in one round trip, whereas the oldpb approach requires
3 or 4 separate roundtrips. As MarkM has said, the pipes are getting wider
but not shorter, and eventually the round-trip latency will be the biggest
bottleneck.</p>
<p>The syntax that Zooko suggested would make this all look much more like the
(blocking) synchronous form:</p>
<pre class="literal-block">
# Example 3
s = getStatus()
b = s.getBuilder("python-2.4-full")
b1 = b.getBuild(-1)
r = b1.getResults()
r._then(checkResults)._except(oops)
</pre>
<p>Or you could chain it all into a single column, which my editor wouldn't like
(you'd have to add some outer parentheses to keep it indenting happily) but
which python will still accept:</p>
<pre class="literal-block">
# Example 4
getStatus().getBuilder("python-2.4-full")
.getBuild(-1)
.getResults()
._then(checkResults)
._except(oops)
</pre>
<p>which is a lot easier to read than the same collapsed form with my send()
syntax:</p>
<pre class="literal-block">
# Example 5
when(send(send(send(getStatus()).getBuilder("python-2.4-full")).getBuild(-1)).getResults()).addCallback(checkResults).addErrback(oops)
</pre>
<p>Now, a syntax which looks synchronous is great for programmers who aren't
familiar with asynchronous control flows: they can look at example 3 or 4
and, except for the funny _then clause, it all looks exactly like what they
expect from xmlrpclib or other blocking RPC mechanisms. The problem with this
syntax is that they might forget that they're actually dealing with Promises,
and try to do something like:</p>
<pre class="literal-block">
results = b.getResults()
if results == SUCCESS:
    print "yay!"
</pre>
<p>and forget that 'results' is actually a Promise, and the only things you can
do with a promise is to send messages to it, or invoke _then or _except. In
some cases this could just raise an exception:</p>
<pre class="literal-block">
counter = b.getCounter()
print counter + 1
# TypeError: unsupported operand type(s) for +: 'instance' and 'int'
</pre>
<p>And in other cases (like 'results is SUCCESS') it might fail silently, always
returning False. Whereas the send() syntax would make it obvious that you're
dealing with a Promise.</p>
<p>One thing I like about Zooko's approach is that I can have the _then and
_except methods be simplified wrappers for the more general purpose _when or
_when_resolved method, the one that returns a Deferred:</p>
<pre class="literal-block">
results = b.getResults()
d = results._when()
d.addCallback(checkResults)
</pre>
<p>That way <em>I</em> can use Deferreds for my control flow, while the newcomers for
whom Deferreds still seem magical can use a somewhat-familiar _then(callback)
approach. (without this, we'd be walking backwards in time to the beginning
of the evolutionary path that has resulted in Deferreds as a general-purpose
callback management tool).</p>
<p>In addition, these two syntaxes aren't necessarily mutually exclusive. I
could have one kind of Promise that implements the __getattr__ magic
necessary to make Zooko's syntax work, but if you call send() on one, it sets
a flag to disable that magic, so that you end up using the send/when syntax.</p>
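<p>That dual-mode Promise could look roughly like this. It's a toy sketch, not
Foolscap's real classes (Promise, _Sender, and the flag are all my names):
attribute access queues an eventual send, Zooko-style, until send() claims
the promise for the explicit syntax. This only records the calls; real
resolution and pipelining are omitted.</p>

```python
class Promise:
    def __init__(self):
        self._magic = True      # is Zooko-style attribute magic on?
        self._queued = []       # recorded (methodname, args) sends

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        if not self._magic:
            raise TypeError("this promise requires send()")
        def record(*args):
            self._queued.append((name, args))
            return Promise()    # every eventual send yields a promise
        return record

class _Sender:
    """wrapper returned by send(); queues sends on the promise"""
    def __init__(self, promise):
        self._promise = promise
    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        def record(*args):
            self._promise._queued.append((name, args))
            return Promise()
        return record

def send(p):
    p._magic = False            # disable the attribute magic for good
    return _Sender(p)
```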
<p>There was more to the discussion but it's all in a notebook in the other room
and I'm too sleepy to express it all right now.</p>
new microcontrollers2006-09-24T18:19:00-07:002006-09-24T18:19:00-07:00Brian Warnertag:www.lothar.com,2006-09-24:/blog/24-new-microcontrollers/<p>I've been playing with <a class="reference external" href="http://www.phidgets.com/">Phidgets</a> recently,
having a lot of <a class="reference external" href="http://www.lothar.com/Projects/Phidgets/">fun</a>. They're
great for prototyping, but they would be too expensive to use for most of the
production purposes I have in mind. I've been thinking that for gadgets I
plan to make more than one of, I'd use an …</p><p>I've been playing with <a class="reference external" href="http://www.phidgets.com/">Phidgets</a> recently,
having a lot of <a class="reference external" href="http://www.lothar.com/Projects/Phidgets/">fun</a>. They're
great for prototyping, but they would be too expensive to use for most of the
production purposes I have in mind. I've been thinking that for gadgets I
plan to make more than one of, I'd use an <a class="reference external" href="http://www.ftdichip.com/index.html">FTDI</a> usb-to-serial chip (somewhere around
2.5UKP from their <a class="reference external" href="http://apple.clickandbuild.com/cnb/shop/ftdichip?op=catalogue-categories-null">web store</a>,
and I think about $5 from the parallax store) and a small AVR microcontroller
(for another few dollars). The FTDI web store also sells adapter modules (USB
B on one side, header pins on the other) for 10UKP. For the basic
make-lights-blink peripheral I have in mind, the FTDI chip alone would
suffice, as it's got 5 GPIO pins in addition to the serial port.</p>
<p>I've played with the AnchorChips/Cypress EZUSB before, and it's pretty handy,
and you can get them from digikey (page 493 of the digikey catalog lists the
full-speed ones at about $10, and the high-speed ones from $15 to $20), but
it uses an 8051 core, which is a real drag to program.</p>
<p>So I was pleased to see that Atmel is in the USB game, with their
<a class="reference external" href="http://www.atmel.com/dyn/products/product_card.asp?part_id=3874">AT90USB1286</a> and
related parts. 128K flash, 8K ram, firmware that lets you program the flash
over the USB bus, sample code and libraries to do mouse/keyboard/HID stuff
(although it doesn't look like the sample code is under a free software
license, boo), and a $31 evaluation kit (basically a USB dongle with breakout
headers). Digikey has the chips for $14, and a cheaper 64K-flash version is
due out soon.</p>
<p>And Atmel also has a <a class="reference external" href="http://www.atmel.com/products/avr/z-link/">handful</a>
of ZigBee/802.15.4 chips available, which could be really cool. They include
the MAC stack. It's not clear where to buy them or how much they'll cost,
though. It looks like there's enough RF goo that you'd want to go with the
eval board, and that probably means a couple hundred bucks. But eventually
this stuff will make it out to smaller boards.</p>
<p>They're also coming out with a <a class="reference external" href="http://www.atmel.com/products/AVR/picopower/Default.asp">new series</a> of AVRs with
<strong>really</strong> low power consumption, down below a microamp.</p>
Promises2006-09-23T10:25:00-07:002006-09-23T10:25:00-07:00Brian Warnertag:www.lothar.com,2006-09-23:/blog/23-promises/<p>Aaaagh! Promises are hurting my brain.</p>
<p>I'm trying to figure out how to provide a useful subset of E's <a class="reference external" href="http://www.erights.org/elib/concurrency/refmech.html">reference
mechanics</a> in
newpb/<a class="reference external" href="http://twistedmatrix.com/trac/wiki/FoolsCap">foolscap</a>.
Specifically, one of the clever things that E does is to provide <a class="reference external" href="http://www.erights.org/elib/distrib/pipeline.html">Promise
Pipelining</a>, a limited
form of remote code execution, in which I can ask …</p><p>Aaaagh! Promises are hurting my brain.</p>
<p>I'm trying to figure out how to provide a useful subset of E's <a class="reference external" href="http://www.erights.org/elib/concurrency/refmech.html">reference
mechanics</a> in
newpb/<a class="reference external" href="http://twistedmatrix.com/trac/wiki/FoolsCap">foolscap</a>.
Specifically, one of the clever things that E does is to provide <a class="reference external" href="http://www.erights.org/elib/distrib/pipeline.html">Promise
Pipelining</a>, a limited
form of remote code execution, in which I can ask you for an object and tell
you to deliver a message to that object in a single round trip (rather than
the usual two). So I want to be able to do something like:</p>
<pre class="literal-block">
# target, record, and results are all Promise objects
target = tub.getReferenceAsPromise(sturdyref)
record = send(target).getRecord(args)
results = send(record).getField(otherargs)
def printResults(r):
    print r
when(results).addCallback(printResults) # when() returns a Deferred
</pre>
<p>You can also include Promises as arguments:</p>
<pre class="literal-block">
record = send(target).getRecord(args)
send(laserprinter).printRecord(record)
</pre>
<p>So I'd like to provide this feature in python/foolscap, both because using
Promises as a programming technique holds a lot of promise (as it were) for
being a cleaner asynchronous style, and because it opens up the possibility
of doing pipelining (which is an actual performance win).</p>
<p>The challenge is that E has very different reference mechanics than python.
In E, <strong>any</strong> reference could be a Promise. (specifically, each reference is
in any one of <a class="reference external" href="http://www.erights.org/elib/concurrency/refmech.html">5 states</a>: LocalPromise,
RemotePromise, Near, Far, and Broken). Whereas in python, references are
always Near, and we have to fake everything else with wrapper objects.</p>
<p>My current approach is to have the Promise class be the wrapper and have it
handle everything except Near references. The basic Promise is created with a
matching resolver:</p>
<pre class="literal-block">
promise, resolver = foolscap.makePromise()
resolver(result) # resolves the promise
</pre>
<p>But the most common way to get one is to do an eventual send to something:</p>
<pre class="literal-block">
from foolscap import send
class Adder:
    def add(self, arg):
        return arg + 1
a = Adder()
promise = send(a).add(4)
</pre>
<p>There are only two things you can do with a promise: send it more messages,
and wait for it to resolve. The former is done with <tt class="docutils literal">send</tt> (which accepts
either a promise or a regular object, and <strong>always</strong> does an eventual-send),
the latter is done with <tt class="docutils literal">when</tt>:</p>
<pre class="literal-block">
from foolscap import when
d = when(promise)
d.addCallback(printResults)
</pre>
<p>The <tt class="docutils literal">when</tt> always returns a Deferred that will fire with the
resolution of the Promise. So <tt class="docutils literal">send</tt> moves us from the synchronous
world to the asynchronous+promise world, while <tt class="docutils literal">when</tt> and
<tt class="docutils literal">addCallback</tt> move us back to the synchronous one. (<tt class="docutils literal">when</tt> by
itself moves us from the asynchronous+promise world to the
asynchronous+Deferred world).</p>
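<p>The turn semantics can be mocked up in a few lines. This is a toy model,
not Foolscap's implementation (the dict-based "promise" and every name here
are mine): deliveries go on the vat's turn queue, so <tt class="docutils literal"><span class="pre">send(a).foo()</span></tt>
never runs <tt class="docutils literal">a.foo()</tt> in the caller's stack, and <tt class="docutils literal">when</tt> hooks a callback
to the eventual resolution. It takes a callback directly rather than
returning a Deferred, purely to keep the sketch dependency-free.</p>

```python
from collections import deque

_turn_queue = deque()               # pending deliveries for this "vat"

class _Eventual:
    def __init__(self, target):
        self._target = target
    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        def deliver(*args):
            promise = {"resolved": False, "value": None, "callbacks": []}
            def turn():             # runs later, in its own vat turn
                value = getattr(self._target, name)(*args)
                promise["resolved"], promise["value"] = True, value
                for cb in promise["callbacks"]:
                    cb(value)
            _turn_queue.append(turn)
            return promise
        return deliver

def send(obj):
    return _Eventual(obj)

def when(promise, callback):        # fire callback upon resolution
    if promise["resolved"]:
        callback(promise["value"])
    else:
        promise["callbacks"].append(callback)

def run_turns():                    # the reactor/vat turn loop
    while _turn_queue:
        _turn_queue.popleft()()
```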
<p>So far so good. But here are some problems:</p>
<ul class="simple">
<li>can Promises be resolved multiple times? I don't think so. The state
transitions between LocalPromise and RemotePromise don't count.</li>
<li>how should this interact with eventual-send? certainly when you do
<tt class="docutils literal"><span class="pre">send(a).foo()</span></tt>, and a is a normal reference (not a Promise), that
<tt class="docutils literal">a.foo()</tt> call should not happen in the same call stack (i.e. Vat
turn, i.e. Reactor tick). But should all the events be sent in a big batch
as soon as the promise is resolved? Or should they be sent out one at a time
somehow? I suppose if Promises can only ever be resolved once, this is not
as complicated as I'd first thought.</li>
<li>Should we try to support immediate calls to resolved Promises? In E, if
you have a Near reference, you can do both immediate and eventual sends. In
python, it would look like:</li>
</ul>
<pre class="literal-block">
p = send(obj).foo(args)
# later.. p is probably resolved
send(p).bar() # eventual send
p.baz() # immediate send
</pre>
<p>Hm, maybe that isn't such a great idea.</p>
newpb-0.0.2 released2006-09-18T00:45:00-07:002006-09-18T00:45:00-07:00Brian Warnertag:www.lothar.com,2006-09-18:/blog/22-newpb-released/<p>I finally got some twisted time this weekend, so I fixed ticket <a class="reference external" href="http://twistedmatrix.com/trac/ticket/1999">#1999</a> and moved newpb out of the
Twisted subdirectory entirely, renaming it to <a class="reference external" href="http://twistedmatrix.com/trac/wiki/FoolsCap">Foolscap</a> in the process. I also
released version <a class="reference external" href="http://twistedmatrix.com/~warner/Foolscap/foolscap-0.0.2.tar.gz">0.0.2</a>, so
there's a complete tarball ready to install and play with.</p>
<p>Having it live …</p><p>I finally got some twisted time this weekend, so I fixed ticket <a class="reference external" href="http://twistedmatrix.com/trac/ticket/1999">#1999</a> and moved newpb out of the
Twisted subdirectory entirely, renaming it to <a class="reference external" href="http://twistedmatrix.com/trac/wiki/FoolsCap">Foolscap</a> in the process. I also
released version <a class="reference external" href="http://twistedmatrix.com/~warner/Foolscap/foolscap-0.0.2.tar.gz">0.0.2</a>, so
there's a complete tarball ready to install and play with.</p>
<p>Having it live outside the Twisted tree has a number of advantages. Twisted
is mature enough to have moved to a slower development model that preserves
stability at the expense of making new development easy. Each potential
change to the codebase must be reviewed before being applied to the trunk, so
all development takes place on branches and must serve to fix a specific
ticket. Very little of the newpb development falls under this model, and
there is a distinct scarcity of people able to review newpb code. By moving
it outside the Twisted tree, I can continue to work on it in a more suitable
development model.</p>
<p>In addition, moving it outside the <tt class="docutils literal">twisted.</tt> package makes it <strong>much</strong>
easier to test and deploy. When it lived in <tt class="docutils literal"><span class="pre">twisted/pb/*.py</span></tt>, you had to
actually install it before using it, into the same directory as the rest of
Twisted. Now that it lives in <tt class="docutils literal"><span class="pre">foolscap/*.py</span></tt> instead, you can run it from
the source tree. This will make things easier for everybody.</p>
<p>The new name is a bit of a compromise, though. I'm not entirely satisfied
with "Foolscap". It has some good properties (google thinks it is fairly
unique, it has "cap" which might make you think of capabilities, it has "oo"
which might make you think of objects, there's the visual of a twisted
foolscap of paper, the jester's hat-and-bells could make a nice logo). But it
also has some bad ones (MarkM points out that there's enough negative baggage
around the word "capabilities" that you might not want "cap" in your protocol
name, using the word "fool" gives some negative connotations, the
promise-pipelining aspects are really more interesting than the capabilities
ones, and anyways "foolscap" doesn't really flow off the tongue in a glib
manner). But it needed a name to live outside Twisted, and now it has one.
That might change, but Foolscap should get us through the next couple of
months.</p>
<p>I've been staring at E's CapTP protocol a lot, thanks to help from Mark
Miller, trying to understand what their goals are, how they accomplish them,
and what pieces would be useful to implement in Foolscap. What I learned last
week was how the CapTP 3-Vat introduction system works. I think I can
implement it in Foolscap, but I'm trying to decide if it's worth it. CapTP
does some funny tricks to make sure that messages which introduce two Vats
are delivered in the correct order relative to other messages between those
Vats (this is called E-Order in MarkM's papers). I assume this is a good
property to maintain (my general approach is to assume that everything MarkM
does has a good reason behind it, and that if I work at it long enough I may
learn that reason for myself, but for now just shut up and implement it).</p>
<p>But a lot of CapTP is tied up in Promises, and I'm still getting my head
around how to provide something in python that resembles a Promise and is
still useable. We don't have a lot of the language features that E does, in
particular the way that an E object holding a reference to a Promise will
eventually discover (after the promise has been resolved) that they're
holding a reference to some other object. We don't have that sort of silent
slot mutation in Python, so I'm trying to figure out what would be a
meaningful equivalent. So far the Promise syntax is looking something like:</p>
<pre class="literal-block">
p2 = send(p1).foo(args)
# equivalent of E's: p2 = p1 <- foo(args)
</pre>
<p>Of course you can also use <tt class="docutils literal">send()</tt> on non-promises if you just want
to do an eventual-send. This is a more precise way to accomplish what I've
been (crudely) doing with <tt class="docutils literal"><span class="pre">reactor.callLater(0,..)</span></tt> all these years.
I'm also writing a <tt class="docutils literal">sendOnly</tt> for when you want to throw away the
return value. E has compiler support for this, it knows whether the results
of the <tt class="docutils literal">send</tt> are used or not, and can switch between <tt class="docutils literal">send</tt>
and <tt class="docutils literal">sendOnly</tt> automatically. Python does not have such a context
sensor, so we have to do it by hand.</p>
<p>Then, when you want to interface back to the synchronous world, you use
<tt class="docutils literal">when()</tt> to turn the promise into a Deferred, to which you can then
attach some code to run:</p>
<pre class="literal-block">
def _stuff(value):
    print value
d = when(p2)
d.addCallback(_stuff)
</pre>
<p>Trying to get this to work with the actual eventual-send queue and make the
result Promises work correctly is making my head spin. I need to sit down
with Zooko on this stuff, he'll understand it well enough to help me get my
brain around it.</p>
antispam2006-03-29T01:45:00-08:002006-03-29T01:45:00-08:00Brian Warnertag:www.lothar.com,2006-03-29:/blog/21-antispam/<p>I ran some stats on my spambuckets tonight, comparing which of my email
addresses get a lot of spam now versus 6 months ago, and noticed a few
addresses that had stopped getting spam altogether. This gives me hope that
by making my 10-year-old primary address less harvestable, the 500-plus …</p><p>I ran some stats on my spambuckets tonight, comparing which of my email
addresses get a lot of spam now versus 6 months ago, and noticed a few
addresses that had stopped getting spam altogether. This gives me hope that
by making my 10-year-old primary address less harvestable, the 500-plus spams
per day might start drying up somewhat.</p>
<p>So a bit of find and perl later, and my web site no longer has a bare email
address on it. I obfuscated it with just a character entity, so cut-and-paste
will still work. Now I'll give it a few months and re-run the stats, to see
if there is any noticeable effect.</p>
new kernel options2005-10-29T14:04:00-07:002005-10-29T14:04:00-07:00Brian Warnertag:www.lothar.com,2005-10-29:/blog/20-new-kernel-options/<p>I'm in the process of upgrading my systems to linux-2.6.14, and noticed a
couple of neat patches that made it into the kernel this time around.</p>
<p>One is that FUSE (<a class="reference external" href="http://fuse.sourceforge.net">http://fuse.sourceforge.net</a>) has finally gotten in. One
thing I'd like to use this for is setting …</p><p>I'm in the process of upgrading my systems to linux-2.6.14, and noticed a
couple of neat patches that made it into the kernel this time around.</p>
<p>One is that FUSE (<a class="reference external" href="http://fuse.sourceforge.net">http://fuse.sourceforge.net</a>) has finally gotten in. One
thing I'd like to use this for is setting up a UML-based jail, to limit the
authority of applications to the minimum necessary. Each app would get a
separate jail. The guest code runs in a virtual kernel that has read-only
access to things like /bin and /usr/lib (so system administration isn't a
nightmare, plus you don't have to have multiple copies of your base system,
plus memory-mapped libraries like libc.so can be shared amongst <em>everything</em>
in the system rather than each kernel keeping its own copy around). The jail
would then have a private read-write copy of everything it's supposed to have
read-write (say /tmp and /var).</p>
<p>The nice thing about this approach as opposed to the usual
big-file-as-block-device scheme that usually gets used with UML is that you
can look into the filesystem from the outside. If you want to see what
exactly the program has changed on its "disk", you just diff -r their
workspace with a checkpoint that you cp -r'ed out earlier. In contrast, the
fake-block-device approach requires that you <em>log in</em> to the guest system and
examine it from the inside, and if you assume that a malicious program has
already compromised as much as it can from the inside, you may no longer be
able to trust the tools that you would use to perform the comparison.</p>
<p>Of course, you still have to trust that the guest code is unable to
compromise the UML kernel, otherwise it now has control of a local user on
the host system, and may be able to bootstrap that upwards. But it limits the
immediate danger of a root compromise within the guest system, and allows for
better monitoring of the jail.</p>
<p>And you still need to patch the host kernel with the SKAS patch
(<a class="reference external" href="http://www.user-mode-linux.org/~blaisorblade/">http://www.user-mode-linux.org/~blaisorblade/</a>) because, while the necessary
arch-specific code to create a UML guest kernel has been merged into the
linux source, the changes that enable fast and safe UML operation in the host
have not.</p>
<p>The other neat feature that just showed up is CONFIG_SECCOMP. From the blurb:</p>
<blockquote>
Once seccomp is enabled via /proc/&lt;pid&gt;/seccomp, it cannot be disabled and
the task is only allowed to execute a few safe syscalls defined by each
seccomp mode.</blockquote>
<p>The idea is that you have a parent process that opens up a couple of pipes to
itself, forks off the child, then throws the child into seccomp mode. After
that, the child relies upon RPC over those pipes to make requests of the
parent. In this way, you get to run compiled languages at full speed, but
they are dependent upon an external entity to actually <em>do</em> anything. The
parent process can then grant capabilities to the child. Someone at the
cap-talk meeting at HP mentioned an approach like this about a month ago:
they had speculated about setting up an SELinux policy that prohibited
all syscalls except read(), write(), and select(). It appears that
CONFIG_SECCOMP does exactly this without requiring a full SELinux setup
(although SELinux might be trivial to set up.. I've never tried).</p>
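<p>The parent/child pattern can be sketched in a few lines of Python. Note the hedge: the actual <tt class="docutils literal">prctl(PR_SET_SECCOMP)</tt> call is deliberately omitted (so the sketch runs anywhere); with it in place, the child really would be limited to read/write on its inherited descriptors and would depend on the parent for everything else:</p>

```python
import json
import os
import socket

def jailed_add(a, b):
    """Fork a child that, in the real pattern, would enter seccomp mode
    and then only be able to talk RPC over the inherited socketpair.
    The parent keeps all other authority.  (prctl(PR_SET_SECCOMP) is
    omitted so this illustration is portable; Unix-only due to fork.)"""
    parent_end, child_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # child: this is where prctl(PR_SET_SECCOMP, 1) would go; after
        # that, only read/write/exit/sigreturn are permitted.
        parent_end.close()
        req = json.loads(child_end.recv(4096).decode())
        child_end.sendall(json.dumps(req["a"] + req["b"]).encode())
        os._exit(0)
    # parent: acts as the capability-granting RPC server/client
    child_end.close()
    parent_end.sendall(json.dumps({"a": a, "b": b}).encode())
    result = json.loads(parent_end.recv(4096).decode())
    os.waitpid(pid, 0)
    return result
```

<p>Here the child does the (potentially untrusted) computation at full native speed, while every externally visible action has to pass through the pipe to the parent.</p>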
<p>SECCOMP comes from Andrea Archangeli, who is using it to provide exactly
these sorts of services on a pennies-per-CPU-second basis
(<a class="reference external" href="http://www.cpushare.com">http://www.cpushare.com</a>), using a bunch of Twisted-based code, no less.</p>
<p>Less exciting: CONFIG_CONNECTOR, which makes it easier to write kernel-side
event-driven interfaces that userspace can access through normal
socket/bind/send/recv/poll calls (via special netlink sockets). I've built
interfaces like this through magic /dev/foo files, but you have to build up
your own queueing routines, and implementing the necessary poll() hooks is a
nuisance. This unifies everything into an existing event-oriented interface,
so things like /dev/foo can stick to synchronous "give me the current state
<em>now</em>" -style applications. Also RELAYFS, which creates a filesystem
interface for efficiently transferring large streams of data from userspace
to kernelspace.</p>
<p>Also of interest to me: netfilter's netlink-socket interface has been
unified, so the IPv4-only ipt_ULOG target is turning into an all-protocol
NFNETLINK target. This is also intended to replace the syslog-based ipt_LOG
target. Queueing packets to userspace is being changed the same way, with the
more-flexible TARGET_NFQUEUE. Finally the kernel interface allows multiple
queues to userspace, which addresses some of the traffic-control problems
inherent to multiple kinds of traffic all sharing the same queue.</p>
<p>Plus, the ieee80211 code made it into the kernel, so I don't need to keep
building a separate module for my laptop's ipw2200 driver. And HOSTAP is now
in the kernel, for my PCMCIA prism2 card.</p>
concurrency2005-09-15T23:26:00-07:002005-09-15T23:26:00-07:00Brian Warnertag:www.lothar.com,2005-09-15:/blog/19-concurrency/<p>Had a great chat with <a class="reference external" href="http://ulaluma.com/pyx/">Donovan</a> today, about
newpb and E and secure python and concurrency management. It turns out we
have some of the same ideas about interesting things to do with these kinds
of tools. He pointed me at <a class="reference external" href="http://www.iolanguage.com">a language named Io</a> that's doing some neat stuff …</p><p>Had a great chat with <a class="reference external" href="http://ulaluma.com/pyx/">Donovan</a> today, about
newpb and E and secure python and concurrency management. It turns out we
have some of the same ideas about interesting things to do with these kinds
of tools. He pointed me at <a class="reference external" href="http://www.iolanguage.com">a language named Io</a> that's doing some neat stuff with lightweight
coroutines, and had some interesting thoughts on coroutines in python (making
protocol-parsing code look a good bit simpler than the purely data-driven
model that twisted Protocol classes tend to have).</p>
happy birthday!2005-07-29T20:07:00-07:002005-07-29T20:07:00-07:00Brian Warnertag:www.lothar.com,2005-07-29:/blog/18-happy-birthday/<pre class="literal-block">
% whois lothar.com
...
domain: LOTHAR.COM
person: Brian Warner
nic-hdl: BW116-GANDI
address: The Castle Lothar
...
reg_created: 1995-07-29 00:00:00
</pre>
<p>Ten years ago today, I registered my little personal domain, with a woman at
best.com named Pandora, who was nicely amused by the "company name". In the
intervening time …</p><pre class="literal-block">
% whois lothar.com
...
domain: LOTHAR.COM
person: Brian Warner
nic-hdl: BW116-GANDI
address: The Castle Lothar
...
reg_created: 1995-07-29 00:00:00
</pre>
<p>Ten years ago today, I registered my little personal domain, with a woman at
best.com named Pandora, who was nicely amused by the "company name". In the
intervening time, it has been through two registrars, three hosting
companies, four IP addresses, and five server platforms. For a while it lived
as a verio vhost, for a while it ran on a Cobalt Qube on the near end of a
DSL line, and for a while on a mini-ITX board booting from a read-only USB
drive. These days
it is a UML slice at linode.com.</p>
<p>I keep meaning to do more with it, but overall I'm pretty happy just to have
a little corner of the 'net that I can call home.</p>
hacking2005-07-13T23:08:00-07:002005-07-13T23:08:00-07:00Brian Warnertag:www.lothar.com,2005-07-13:/blog/17-hacking/<p>The last few weeks have been mostly filled with <a class="reference external" href="http://buildbot.sf.net/">hacking</a>. I'm neck-deep in the implementation
phase of a big new set of features, and it's taking <em>forever</em>. But I think
I'm finally past the hardest part, the design issues that remain to be solved
are at last medium-sized ones …</p><p>The last few weeks have been mostly filled with <a class="reference external" href="http://buildbot.sf.net/">hacking</a>. I'm neck-deep in the implementation
phase of a big new set of features, and it's taking <em>forever</em>. But I think
I'm finally past the hardest part, the design issues that remain to be solved
are at last medium-sized ones instead of huge ones, and even the unit tests
pass. So I'm feeling pretty good about that.</p>
<p>I'm also trying to hack on <a class="reference external" href="http://petmail.lothar.com/">Petmail</a> a little
bit more. There's a spam conference at Stanford next week that I'll be
attending, and even though it's unlikely I'll be showing it off to anyone,
I'd like to be sufficiently back in the Petmail mindset that I can discuss it
intelligently while I'm there.</p>
<p>I'm trying to shift Petmail's configuration interface from the current Gtk
app into a web page one, using <a class="reference external" href="http://nevow.org/">Nevow</a>, because
eventually (when Bill gets some time to work on the Thunderbird plugin) the
send/receive mail interface will be through XMLRPC (or whatever Mozilla code
can get to most conveniently). I haven't figured it out yet, though: nevow
provides some nice features for free, but I don't yet know if they're the
ones that I need to implement this sort of add/edit/remove configuration
stuff.</p>
<p>Also, I'm moving Petmail development over to <a class="reference external" href="http://abridgegame.org/darcs/">Darcs</a>. I've been a bit frustrated with my recent
Buildbot development push, because I'm using Bazaar on my laptop, with a
local repository so I can make commits offline, but pushing changes back and
forth between repositories is enough of a hassle that I just don't do it. So
I'm doing all the buildbot work slouched over my laptop (which I really like,
but the keyboard is making my hands just a little bit uncomfortable), rather
than the desktop with the proper keyboard and proper chair. It looks like
Darcs would make it a bit easier to fling changes from one place to another,
so using it might encourage me to do development anywhere I feel like. (plus,
I should really get a new monitor for the desktop machine.. my
ex-brother-in-law has a gorgeous 20" LCD, something from Dell, which I'm
really tempted to splurge for).</p>
<p>So anyway, there's a Darcs tree for Petmail available at
<a class="reference external" href="http://petmail.lothar.com/repos/trunk">http://petmail.lothar.com/repos/trunk</a> , which replaces the old CVS repository
on that same site. I don't have a Darcs equivalent for ViewCVS up yet,
though. I've seen a web-based Darcs patch viewer, but I wasn't really
impressed. So I'll keep looking.</p>
Go Tools2005-05-28T12:29:00-07:002005-05-28T12:29:00-07:00Brian Warnertag:www.lothar.com,2005-05-28:/blog/16-Go-Tools/<p>I was talking with my brother-in-law about a gadget to make playing Go online
a bit more like playing it in person. The feel of the board and the THWACK!
as you plunk down stones adds a lovely touch to the game, but you don't get
that when clicking on …</p><p>I was talking with my brother-in-law about a gadget to make playing Go online
a bit more like playing it in person. The feel of the board and the THWACK!
as you plunk down stones adds a lovely touch to the game, but you don't get
that when clicking on the <a class="reference external" href="http://cgoban1.sourceforge.net/">cgoban</a>
window. We talked about using a real Go board at each end, pointing a camera
at it to figure out where you've just moved and relay it to the server, and
using a targetable laser pointer (on a pair of servos) to point to where
your partner has just played.</p>
<p>I ran into <a class="reference external" href="http://www.lychnis.net/blosxom/go/index.lychnis">this blog</a>
today about a guy who's interested in part of this problem, specifically
using image-processing software to create a log of a game in progress. He
also has a link to a japanese academic paper about doing the same thing
(specifically creating a game log, aka <a class="reference external" href="http://senseis.xmp.net/?Kifu">Kifu</a>, from a recording of a TV program).</p>
<p>I visited the <a class="reference external" href="http://www.sfgoclub.com/">SF Go Club</a> for the first time
last week, and had a great time.. looking forward to going again next week.</p>
Twist-E2005-05-27T18:00:00-07:002005-05-27T18:00:00-07:00Brian Warnertag:www.lothar.com,2005-05-27:/blog/15-Twist-E/<p>Spent another great day down at HP, talking about implementing E and
web-calculus concepts within Twisted and newpb. <a class="reference external" href="http://www.waterken.com">Tyler Close</a> was kind enough to spend the entire afternoon
with me, explaining how his <a class="reference external" href="http://www.waterken.com/dev/Web/">web-calculus</a> works and the design decisions behind
it. I'm really excited about implementing this stuff in newpb …</p><p>Spent another great day down at HP, talking about implementing E and
web-calculus concepts within Twisted and newpb. <a class="reference external" href="http://www.waterken.com">Tyler Close</a> was kind enough to spend the entire afternoon
with me, explaining how his <a class="reference external" href="http://www.waterken.com/dev/Web/">web-calculus</a> works and the design decisions behind
it. I'm really excited about implementing this stuff in newpb: I think we can
make a system that's both secure <em>and</em> highly usable. Some of the ideas I
came away with that I want to write up before I forget:</p>
<p>Promises: In addition to Deferred, we can build a Promise. The usage syntax
would look like:</p>
<pre class="literal-block">
p = tub.getReference(url)
p.authorize(credentials).subscribe(self)
when(p.getReady()).addCallback(lambda res: p.trigger())
p2 = Promise(d1) # turn "deferred which fires with an instance" into a Promise
p3 = p2.invoke()
d2 = when(p3)
d2.addCallback(stuff)
</pre>
<p>The Promise object is basically a wrapper around any Deferred that expects to
fire with an instance. It has a __getattr__ which lets it pretend to
implement any method. Such methods just queue the call and its arguments,
then finish immediately, returning a new Promise. Something like:</p>
<pre class="literal-block">
class Promise:
    def __getattr__(self, methname):
        if self.resolved:
            m = getattr(self.resolution, methname)
            assert callable(m)
            return m
        def newmethod(*args, **kwargs):
            self.calls.append((methname, args, kwargs))
            # except more cleverness in case the method is invoked after
            # the promise is resolved
        return newmethod
<p>When the Deferred fires, all pending calls are invoked on the instance it
fired with. Each call also returns a Promise, possibly already fulfilled,
with the results of that call, so that <tt class="docutils literal"><span class="pre">p.meth1().meth2()</span></tt> is the
asynchronous equivalent of <tt class="docutils literal"><span class="pre">o.meth1().meth2()</span></tt>, or
<tt class="docutils literal">func2(func1(o))</tt>. '<tt class="docutils literal"><span class="pre">p.meth1();</span> p.meth2()</tt>' means that meth2
must be invoked <em>after</em> meth1: I'm not sure what other kind of sequencing
promises to make (should we wait until meth1 has finished before invoking
meth2?).</p>
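<p>A runnable toy version of that pipelining behavior (illustrative only; this is <em>not</em> newpb's Promise, and it resolves chained calls eagerly rather than deciding the sequencing question above) shows how calls queued before resolution are replayed, each feeding the next promise in the chain:</p>

```python
class MiniPromise:
    """Toy promise for illustration: method calls made before resolution
    are queued; resolve() replays them and resolves each call's own
    promise with the result, so p.meth1().meth2() pipelines naturally."""
    def __init__(self):
        self.resolved = False
        self.resolution = None
        self.calls = []   # (methname, args, kwargs, promise-for-result)

    def __getattr__(self, methname):
        def queued(*args, **kwargs):
            p = MiniPromise()
            if self.resolved:
                # already resolved: invoke immediately
                p.resolve(getattr(self.resolution, methname)(*args, **kwargs))
            else:
                self.calls.append((methname, args, kwargs, p))
            return p
        return queued

    def resolve(self, value):
        self.resolved = True
        self.resolution = value
        for methname, args, kwargs, p in self.calls:
            p.resolve(getattr(value, methname)(*args, **kwargs))

class Counter:
    """Sample target object for the demonstration."""
    def __init__(self, n):
        self.n = n
    def add(self, k):
        return Counter(self.n + k)
```

<p>With this, <tt class="docutils literal">p2 = <span class="pre">p.add(3).add(4)</span></tt> queues both calls before anything exists; a later <tt class="docutils literal"><span class="pre">p.resolve(Counter(10))</span></tt> replays the chain and leaves <tt class="docutils literal">p2</tt> resolved to a Counter holding 17.</p>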
<p>If the Deferred errbacks instead, then the Promise is "smashed", which is
like an errback. No further method calls are made, any dependent Promises are
smashed too.</p>
<p>The idea is to make the asynchronous domain be the normal case, and mark the
boundary with the synchronous domain specially. <tt class="docutils literal">when()</tt> would be a
function that turns a Promise into a Deferred, with which the transition
could be scheduled:</p>
<pre class="literal-block">
def when(p):
    if not isinstance(p, Promise):
        return defer.succeed(p)  # immediate value: nothing to unwrap
    if p.resolved:
        return defer.succeed(p.resolution)
    else:
        d = defer.Deferred()
        p.waiting.append(d)
        return d
</pre>
<p>He pointed out that E currently has two separate method invocation syntaxes:
'o.foo()' requires a local reference, and may or may not return a Promise. 'p
<- foo()' can accept either a local reference or a Promise, and always
returns a Promise. (actually I'm not sure I'm getting this right, but the
implication was that there were two forms, one for local and one for remote,
whereas Tyler felt that there should only be one).</p>
<p>Then, later, we'll create the RemotePromise, which is a Promise that's
associated with a RemoteReference. <tt class="docutils literal">rp.foo(args)</tt> is equivalent to
<tt class="docutils literal">d.addCallback(lambda res: <span class="pre">res.callRemote("foo",</span> args))</tt> . When
Promises are serialized, they get a clid and show up as another Promise on
the far end. You push the waiting as far away as possible; apparently this is
the way to reduce the probability of deadlocks.</p>
<p>My main concern with this syntax is that it may confuse the
synchronous-domain developers that we (as Twisted) have been trying to gently
nudge into the world of asynchronous programming. We're not blocking, but the
code looks a lot like that's what's happening. But, once you've stopped
thinking that the lack of a <tt class="docutils literal">.callLater</tt> implies immediate execution,
the <tt class="docutils literal">p.meth(args)</tt> syntax really is a lot cleaner. You just assume
that <strong>everything</strong> could be a promise, and you use <tt class="docutils literal">when()</tt> if you
need to assure that you have an immediate value.</p>
<p>One problem with reference counting is that your peer can force you to retain
an object for arbitrarily long times, by just never sending you the decref
(and Gifts make things even worse). Tyler's hunch is that distributed
reference counting is the wrong approach, and it is more practical to manage
object lifetime with the Vat/Tub. Break application processing into units,
create a Tub for each unit, when the unit is finished, destroy the Tub. All
objects that pass through a Tub are registered (under an unguessable name) in
that Tub, so they remain accessible for the lifetime of the Tub, and then
become inaccessible when the Tub is destroyed.</p>
<p>To use this well, it must be easy to create new Tubs and destroy them later.
These Tubs must be able to share listener ports, which can distinguish the
desired Tub by its keyid. To accomplish this with newpb, I think we may need
a module-level registry of Listeners, so that two Tubs that are asked to
listen on the same port will register with the same Listener. (it might also
make sense to use <tt class="docutils literal">newtub = oldtub.makeTub()</tt>, and have the Listener
be inherited). We should pay attention to the possibility of sharing a TCP
connection to an existing Tub, but keep in mind that separate TLS keys will
require separate TCP connections.</p>
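<p>The module-level registry could be as simple as the following sketch (all names here are hypothetical, not newpb's actual API; dispatching incoming connections by keyid and the separate-TLS-keys caveat are left out):</p>

```python
class Listener:
    """Stand-in for a listening port shared by several Tubs."""
    def __init__(self, port):
        self.port = port
        self.tubs = {}          # keyid -> tub, for connection dispatch

    def register(self, keyid, tub):
        self.tubs[keyid] = tub

# module-level registry: port -> Listener
_listeners = {}

def listen_on(port, keyid, tub):
    """Give `tub` a Listener on `port`, sharing one if it already
    exists.  (Sketch only: real code must also dispatch each incoming
    connection to the right Tub by its keyid.)"""
    if port not in _listeners:
        _listeners[port] = Listener(port)
    listener = _listeners[port]
    listener.register(keyid, tub)
    return listener
```

<p>Two Tubs asked to listen on the same port then get the identical Listener object back, which is the sharing property described above.</p>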
<p>Secure PB URLs want a key as the primary specifier, followed by a list of
location hints, followed by a Tub-scoped name:</p>
<pre class="literal-block">
PBY url: pby://key@1.2.3.4,foo.com,[::1],loc2,loc3/name
key is base32(sha1(tub.pubkey))
unix socket is trickier
non-authenticated url still requires Tub ID
</pre>
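<p>Pulling that layout apart is just string-splitting. A quick parser sketch (purely illustrative, since the format itself was still tentative):</p>

```python
def parse_pby(url):
    """Split pby://KEY@hint1,hint2,.../NAME into its three parts.
    KEY would be base32(sha1(tub.pubkey)), the hints are location
    guesses, and NAME is scoped to the Tub.  Illustrative only: the
    pby: format above was still a sketch when this was written."""
    prefix = "pby://"
    if not url.startswith(prefix):
        raise ValueError("not a pby: URL")
    rest = url[len(prefix):]
    key, rest = rest.split("@", 1)      # key comes first, as the primary specifier
    hints, name = rest.split("/", 1)    # then hints, then the Tub-scoped name
    return key, hints.split(","), name
```
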
<p>He also feels that DoS prevention (one of the three reasons for Constraints,
the other two being semantic typechecking assertions and API documentation)
is difficult to implement and hard to get right, and unlikely to do the
complete job that you'd want out of it. He said MarkM burned a lot of cycles
trying to build DoS prevention techniques into CapIDL, and it would be worth
asking him for his thoughts.</p>
<p>He said one deployment pattern would be to put security proxies in a set of
separate processes, which perform deserialization, check arguments, etc, and
then pass the results on to the real object. The security proxies would be
CPU/memory limited, and there would be one per connection, so that if someone
started to abuse their connection, only they would suffer. Once you get to a
service large enough to be worried about DoS attacks, you'd want this
architecture anyway because then you can distribute it out to multiple
machines. I was skeptical about how to go about implementing this sort of
proxy: how much CPU time do you give it? If it takes 1ms to deserialize a
message that then consumes 1s of server time, do you have to restrict it to
1/1000th the CPU time of the server? Note that other possibilities include
strict prioritization of the processes/threads (so the connections are
starved until the server becomes idle), and enforcing one-at-a-time
processing of messages.</p>
<p>His approach in web-amp was just to limit each serialized argument to 8kb.
The objection that this might not be enough is countered by the fact that if
you're sending more data than that, you should mark it explicitly (by
creating a publish/subscribe model), because there's a good chance that the
data is being used on the wrong side of the wire. The attacker is allowed to
do whatever evil they can accomplish in 8kb, maybe that means a 2k-deep
nested series of lists, but whatever it is won't be too big. I feel that at
some point you have to enforce a limit.. in web-amp, you must limit the total
number of arguments they can send you, or the number of method calls per
second, or something.</p>
<p>The non-DoS-related semantic typechecking (I'm expecting an int, is it really
an int?) is just as easily done with assert()s inside the method body. I want
this kind of checking to happen as close to the top of the method as
possible.. doing it in a RemoteInterface in some separate file feels wrong to
me. One approach is a func.guard method attribute (whose constructor takes
arguments much like the RemoteInterface methods do), which could be pulled up
to the top of the method body with a decorator. The big difference in thought
here is the idea of providing objects (which happen to implement a certain
set of methods) versus providing methods (which happen to be bound to a
particular object).</p>
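<p>One way that <tt class="docutils literal">func.guard</tt> idea might look as a decorator (hypothetical names throughout; this just moves the RemoteInterface-style check to the top of the method body, as argued above):</p>

```python
import functools

def guard(**specs):
    """Hypothetical decorator: do the semantic typechecks at the top of
    the method body instead of in a RemoteInterface in a separate file."""
    def deco(f):
        @functools.wraps(f)
        def wrapper(self, **kwargs):
            for name, expected in specs.items():
                if not isinstance(kwargs.get(name), expected):
                    raise TypeError("%s must be %s" % (name, expected.__name__))
            return f(self, **kwargs)
        return wrapper
    return deco

class Adder:
    @guard(count=int)           # the check sits right on the method
    def remote_add(self, count):
        return count + 1
```

<p>The decorator keeps the constraint next to the code it protects, which is the "as close to the top of the method as possible" property I want.</p>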
<p>A lot of the typechecking concerns are eased with finer-grained capabilities.
Ideally, the worst they can do by sending you a weird object type is to cause
an exception. As long as you haven't registered an Unslicer that gives the
resulting object some ambient authority, you aren't going to give them any new
privileges by invoking a method on something they <em>can</em> give you. Tyler says
you only do typechecking when you're considering granting them some new
privileges. The notion is that it's the bound-method capability that is the
basis of power, not what they do with it or what they send to it.</p>
<p>The constraints are useful for method documentation, especially if they can
be serialized and passed to an object browser, but can only document the list
of methods and the names/types of their arguments. The actual API description
still needs to be in epydoc, which can provide (non-machine-parseable)
argument name/type docs too.</p>
<div class="section" id="positional-parameters-for-interoperability-with-java">
<h2>positional parameters for interoperability with java:</h2>
<p>java doesn't have keyword args. To provide interoperability, the python-newpb
method call serializer needs to send args in strict order, the java newpb
receiver would ignore the argument names (only using the values). In the
other direction, the java method call serializer would send None for the
argument names, and the python receiver would use the local RemoteInterface
to turn the argument list into a kwargs dict.</p>
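<p>Both directions reduce to a declared argument order. A sketch (the schema table and every name in it are invented for illustration):</p>

```python
# Hypothetical per-(interface, method) argument order, as a local
# RemoteInterface declaration would provide it.
ARG_ORDER = {("RIAdder", "add"): ("a", "b")}

def kwargs_to_positional(iface, methname, kwargs):
    """python -> java direction: serialize args in strict declared
    order, so a keyword-less receiver can use the values positionally."""
    return [kwargs[name] for name in ARG_ORDER[(iface, methname)]]

def positional_to_kwargs(iface, methname, values):
    """java -> python direction: the receiver uses its local declaration
    to rebuild the kwargs dict from the bare value list."""
    return dict(zip(ARG_ORDER[(iface, methname)], values))
```
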
<p>Finally, I need to study the XML schemas in the web-calculus more closely. In
it, the bound method closure URL can be used for two purposes: a GET returns
the method schema (a description of what types the positional parameters will
accept), while a POST will invoke the closure. However, the object which
provided that URL has a class, and the method clause had a name, and the
method schema is always the same for any given (class, methodname) pair, so
even a fully send-time-checking implementation doesn't have to retrieve any
method schema more than once. I had first thought that there was some
redundancy in the XML data being returned, but Tyler's put a lot of thought
and time into it to minimize the round-trips and avoid redundancy. newpb
would be well-served by studying his approach carefully.</p>
</div>
books2005-05-26T12:57:00-07:002005-05-26T12:57:00-07:00Brian Warnertag:www.lothar.com,2005-05-26:/blog/14-books/<p>I started in on Alastair Reynolds' <cite>Century Rain</cite> last night, got about
halfway through before I finally succumbed to sleep. It's a good read:
finally he gets to have at least a few chapters that don't involve pervasive
nanotechnology or uploaded personality constructs or galaxy-spanning machine
intelligences.</p>
<p>I was thrown …</p><p>I started in on Alastair Reynolds' <cite>Century Rain</cite> last night, got about
halfway through before I finally succumbed to sleep. It's a good read:
finally he gets to have at least a few chapters that don't involve pervasive
nanotechnology or uploaded personality constructs or galaxy-spanning machine
intelligences.</p>
<p>I was thrown at first, however, because he's got a system-wide human
government named The Polity, and just last week I had finished reading Neal
Asher's <cite>Line Of Polity</cite>, in which <em>his</em> galaxy-wide human government (also
named The Polity) is considerably more powerful, and somewhat less
conflicted, and certainly motivated by different things. It took me a while
to put that Polity out of my mind.</p>
and a calendar too2005-05-24T01:18:00-07:002005-05-24T01:18:00-07:00Brian Warnertag:www.lothar.com,2005-05-24:/blog/13-and-a-calendar-too/<p>Hey, that wasn't too bad. I also added some CSS to make everything a tiny bit
less ugly.</p>
<p>Now all I need is auto-completion on the category elisp..</p>
<p>Hey, that wasn't too bad. I also added some CSS to make everything a tiny bit
less ugly.</p>
<p>Now all I need is auto-completion on the category elisp..</p>
adding subcategories2005-05-23T20:29:00-07:002005-05-23T20:29:00-07:00Brian Warnertag:www.lothar.com,2005-05-23:/blog/12-adding-subcategories/<p>I think I've gotten my elisp code to handle pyblosxom categories now.
pyblosxom was easy, but I have to add the glue to let you choose a category.
Unfortunately creating new categories requires manual work (registering the
CVS directory).</p>
<p>Next step: find a pyblosxom plugin to create that spiffy little …</p><p>I think I've gotten my elisp code to handle pyblosxom categories now.
pyblosxom was easy, but I have to add the glue to let you choose a category.
Unfortunately creating new categories requires manual work (registering the
CVS directory).</p>
<p>Next step: find a pyblosxom plugin to create that spiffy little category
sidebar I've seen on so many other blogs.</p>
great week2005-05-21T15:08:00-07:002005-05-21T15:08:00-07:00Brian Warnertag:www.lothar.com,2005-05-21:/blog/11-great-week/<p>Man, what a great week. I spent a couple of days working with <a class="reference external" href="http://ulaluma.com/pyx/">Donovan</a> at his office on a couple of issues: making
<a class="reference external" href="http://codespeak.net/py/current/doc/test.html">py.test</a> capable of
running Twisted test cases, improving <a class="reference external" href="http://www.nevow.com/">LivePage</a>
event notification, and setting up a BuildBot for their in-house test suite.</p>
<p>Thursday night was the <a class="reference external" href="http://www.baypiggies.net/">BayPIGgies …</a></p><p>Man, what a great week. I spent a couple of days working with <a class="reference external" href="http://ulaluma.com/pyx/">Donovan</a> at his office on a couple of issues: making
<a class="reference external" href="http://codespeak.net/py/current/doc/test.html">py.test</a> capable of
running Twisted test cases, improving <a class="reference external" href="http://www.nevow.com/">LivePage</a>
event notification, and setting up a BuildBot for their in-house test suite.</p>
<p>Thursday night was the <a class="reference external" href="http://www.baypiggies.net/">BayPIGgies</a> meeting (a
local Python users group), held at Google's spiffy office complex in mountain
view. I handed off some JavaButton hardware that I'm loaning to Pavel for a
month, and wound up hanging out with <a class="reference external" href="http://zooko.com/">Zooko</a> for the
rest of the evening, talking about some software licensing ideas he's been
thinking about. We agreed that they need a bit of work, but were still quite
promising, and we were up pretty late arguing about the details. When you
start talking about metalicenses, you know it's getting late.</p>
<p>Friday I spent at HP hanging out with some of the E/Capabilities people. In
the discussion I happened to mention <a class="reference external" href="http://oblomovka.com/entries/2003/10/13#1066058820">an essay I'd seen</a> about expectations of
privacy in online spaces, unfortunately I wasn't able to remember the site or
the author in realtime. Of course it turns out that it was written by Danny
O'Brien, whom I met at CodeCon and when we talked to the ZigBee people about
licensing their technology and brands in a way that would make them more
compatible with free-software. Small world.</p>
<p>The afternoon was spent at <a class="reference external" href="http://lists.canonical.org/pipermail/kragen-tol/2005-May/thread.html">Kragen's</a>
office watching him and Donovan and Mark work on <a class="reference external" href="http://www.wheatfarm.org/">Wheat</a>. When Tyler showed up we spent about half an
hour talking about how newpb could incorporate some of the ideas of his
<a class="reference external" href="http://www.waterken.com/dev/Web/">web-calculus</a> model. This was really
useful; it sounds like he's addressed most of the problems we've encountered
in building newpb. I think there exists a possibility that we could use his
serialization scheme and (since they're working on making E speak the same
protocol) thus make newpb interoperate with E. <em>That</em> would be a nice
accomplishment.</p>
SPF2005-05-16T20:05:00-07:002005-05-16T20:05:00-07:00Brian Warnertag:www.lothar.com,2005-05-16:/blog/10-SPF/<p>I've been trying to decide whether to publish an SPF record for lothar.com or
not. The last few days have seen an absolute deluge of spam from some german
bastards, much of which is being forged in my name. The only real solution
is, of course, to sign everything …</p><p>I've been trying to decide whether to publish an SPF record for lothar.com or
not. The last few days have seen an absolute deluge of spam from some german
bastards, much of which is being forged in my name. The only real solution
is, of course, to sign everything and make sure the entire rest of the world
knows about that practice. Or magically switch everybody over to my
<a class="reference external" href="http://petmail.lothar.com/">http://petmail.lothar.com/</a> project.</p>
<p>But I'm starting to think that SPF might address the specific frustration I'm
feeling with this forgery. And I'm seeing about 2-3 TXT record lookups per
hour, so <em>somebody</em> out there is using it.</p>
<p><a class="reference external" href="http://homepages.tesco.net/~J.deBoynePollard/FGA/smtp-spf-is-harmful.html">http://homepages.tesco.net/~J.deBoynePollard/FGA/smtp-spf-is-harmful.html</a>
<a class="reference external" href="http://www.anders.com/projects/sysadmin/djbdnsRecordBuilder/">http://www.anders.com/projects/sysadmin/djbdnsRecordBuilder/</a></p>
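<p>For reference, the sort of record under consideration is a single DNS TXT
entry. The following zone-file fragment is just an illustrative sketch (the
specific mechanisms are a policy choice, and this is not necessarily the
record I'd publish):</p>
<pre class="literal-block">
; "v=spf1" marks this as an SPF policy; "a" and "mx" authorize the
; domain's A and MX hosts to send mail; "-all" asks receivers to
; hard-fail everything else (forged mail included).
lothar.com.   IN TXT "v=spf1 a mx -all"
</pre>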
iButtons2005-05-16T11:27:00-07:002005-05-16T11:27:00-07:00Brian Warnertag:www.lothar.com,2005-05-16:/blog/9-iButtons/<p>I was talking with Pavel (aka PenguinOfDoom, on #twisted) last week about
iButtons, and mentioned the JavaButton I picked up years ago that I haven't
really managed to do anything with yet. That prompted me to poke around the
web site (was dalsemi.com, since bought by <a class="reference external" href="http://www.maxim-ic.com">http://www.maxim-ic …</a></p><p>I was talking with Pavel (aka PenguinOfDoom, on #twisted) last week about
iButtons, and mentioned the JavaButton I picked up years ago that I haven't
really managed to do anything with yet. That prompted me to poke around the
web site (was dalsemi.com, since bought by <a class="reference external" href="http://www.maxim-ic.com">http://www.maxim-ic.com</a>), and it
turns out they have a new-ish version of the portable C code that interfaces
with the things. The last time I looked (version 300b2), there was a single
function left unimplemented which prevented the use of JavaButtons on a USB
adapter under linux. (non-java buttons were ok, serial port adapters were ok,
it was just the combination that didn't work). I don't yet know if that's
been fixed in the "new" (2004) version 300 library.</p>
<p>Trying to buy a JavaButton looks hard, much harder than it was when I got
mine. It probably requires talking to a sales rep. I got a starter kit that
included a DS1957 on a USB key fob, very nicely designed. The only part I can
see listed on their web site is the DS1955, which has like 8kB of RAM (the
DS1957 has more like 150kB). The JavaButtons include cryptographic code, so
they require a license/export agreement, but it would have been nice if they
made it clear how you obtain such a thing.</p>
<p>Anyway, here are a handful of links, since their web site seems particularly
hard to navigate.</p>
<p><a class="reference external" href="http://www.maxim-ic.com/products/ibutton/software/1wire/wirekit.cfm">http://www.maxim-ic.com/products/ibutton/software/1wire/wirekit.cfm</a></p>
<dl class="docutils">
<dt><a class="reference external" href="http://www.maxim-ic.com/pl_list.cfm/filter/22">http://www.maxim-ic.com/pl_list.cfm/filter/22</a></dt>
<dd>list of iButton data sheets</dd>
<dt><a class="reference external" href="http://www.maxim-ic.com/1-Wire.cfm">http://www.maxim-ic.com/1-Wire.cfm</a></dt>
<dd>regular ICs (not in a steel can) using the same protocol, usually TO-92</dd>
<dt><a class="reference external" href="http://www.maxim-ic.com/products/microcontrollers/crypto_ibutton_license_application.cfm">http://www.maxim-ic.com/products/microcontrollers/crypto_ibutton_license_application.cfm</a></dt>
<dd>might be the entry point to buying a JavaButton, or maybe just one of their
crypto iButtons</dd>
</dl>
<p>UPDATE: no, version 300 still does not support JavaButtons over USB. The
specific issue is that JavaButtons require a strong pullup to provide lots of
power while they're crunching away in the crypto routines. The USB adapter
can do this, but the Linux interface code doesn't know how to turn it on.
<tt class="docutils literal">lib/general/Link/USB_Linux/usblnk.c</tt> has a routine named
<tt class="docutils literal">hasPowerDelivery</tt>, which currently reads:</p>
<pre class="literal-block">
SMALLINT hasPowerDelivery(int portnum)
{
   // Adapter supports it but not implemented yet
   return FALSE;
}
</pre>
<p>Sigh.</p>
sparklines2005-05-07T12:40:00-07:002005-05-07T12:40:00-07:00Brian Warnertag:www.lothar.com,2005-05-07:/blog/8-sparklines/<p>My friend Drew just sent this one along:</p>
<blockquote>
<a class="reference external" href="http://bitworking.org/news/Sparklines_in_data_URIs_in_Python">http://bitworking.org/news/Sparklines_in_data_URIs_in_Python</a></blockquote>
<p>I'm pondering things I might do with this. I've been using Data: URIs for one
of my projects, they're pretty handy and both Firefox and Safari are more
than happy to take ridiculously large ones (50k or …</p><p>My friend Drew just sent this one along:</p>
<blockquote>
<a class="reference external" href="http://bitworking.org/news/Sparklines_in_data_URIs_in_Python">http://bitworking.org/news/Sparklines_in_data_URIs_in_Python</a></blockquote>
<p>I'm pondering things I might do with this. I've been using Data: URIs for one
of my projects, they're pretty handy and both Firefox and Safari are more
than happy to take ridiculously large ones (50k or more). Like Drew, I'm
wondering what I could do with sparklines.</p>
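<p>The data: URI trick itself is simple enough to sketch. Assuming you already
have a sparkline rendered to PNG bytes (the helper name below is mine, not
from Drew's article), embedding it is just base64:</p>

```python
import base64

def png_data_uri(png_bytes):
    # Embed the image directly in the page: the browser decodes the
    # base64 payload from the src attribute instead of making a
    # separate HTTP request for the image.
    return "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii")
```

You'd then drop the result straight into an <tt class="docutils literal">&lt;img
src="..."&gt;</tt> attribute, which is what makes the glyphs so easy to
sprinkle through a page.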
<p>The first thing that comes to mind is a compact representation of BuildBot
test results. When you look at the history of a single builder, a series of
builds over time, what you care about is how the results have changed from
one build to the next. I've been thinking about having the buildbot pay
attention to things like when any given test starts failing or starts passing
again, but until I get around to writing that code, you could use a sparkline
to represent the test results in a compact glyph, and then just show the last
50 of those. The user could then scan them visually to look for changes.</p>
<p>I'm not sure where else to use them yet. I'm tempted to write a Nevow
renderer to create them, though, because that would make it a lot easier to
insert them into other pages. That would let you use some HTML like: <tt class="docutils literal"><div
<span class="pre">nevow:render="sparkline"</span> <span class="pre">nevow:data="stuff"</span> /></tt> and then implement a
<cite>data_stuff</cite> method that would return whatever you wanted to put into the
sparkline.</p>
pyblosxom-noindex2005-05-04T03:13:00-07:002005-05-04T03:13:00-07:00Brian Warnertag:www.lothar.com,2005-05-04:/blog/7-pyblosxom-noindex/<p>After some amount of perseverance, I finally figured out how to make
pyblosxom insert "noindex" meta tags in the top-level index page. This was
the last barrier keeping me from linking this blog to the main site, since I
didn't want Google indexing a page that's going to change every …</p><p>After some amount of perseverance, I finally figured out how to make
pyblosxom insert "noindex" meta tags in the top-level index page. This was
the last barrier keeping me from linking this blog to the main site, since I
didn't want Google indexing a page that's going to change every few days
anyway.</p>
<p>For reference, here's the plugin I made. It's remarkably simple, after I
traced through the code for several hours to figure out what function needed
to be hooked:</p>
<pre class="literal-block">
#! /usr/bin/python
import sys

template = \
"""<html>
<head><title>$blog_title_with_path</title>
<meta name="robots" content="follow,noindex" />
</head>
<body><h1>$blog_title</h1><p>$pi_da $pi_mo $pi_yr</p>
"""

def cb_head(args):
    """This replaces the HEAD portion of the template whenever a 'directory'
    is being rendered. The modified template adds special 'noindex' meta tags
    to tell google that it shouldn't bother indexing the main page (since it
    will change), but to index the permalink pages instead.
    """
    #print >>sys.stderr, args['template']
    if args['request'].getData()['bl_type'] == "dir":
        args['template'] = template
    return args
</pre>
buildbot versus windows2005-04-27T15:14:00-07:002005-04-27T15:14:00-07:00Brian Warnertag:www.lothar.com,2005-04-27:/blog/6-buildbot-versus-windows/<p>I just spent several hours getting a reasonable python environment working
under Windows, something I had hoped to never have a need for. The Buildbot
is having some.. disagreements.. with Windows, and it became clear that being
able to reproduce the problem locally was the only sane way to fix …</p><p>I just spent several hours getting a reasonable python environment working
under Windows, something I had hoped to never have a need for. The Buildbot
is having some.. disagreements.. with Windows, and it became clear that being
able to reproduce the problem locally was the only sane way to fix it.</p>
<p>Man, was that painful.</p>
<p>For the record, here's what I did. Many thanks to Bear for creating this
checklist and walking me through the process:</p>
<pre class="literal-block">
0. Check to make sure your PATHEXT environment variable has ";.PY" in
   it -- if not, set your global environment to include it.
   Control Panels / System / Advanced / Environment Variables / System variables

1. Install python -- 2.4 -- http://python.org
   * run win32 installer - no special options needed so far

2. install zope interface package -- 3.0.1final --
   http://www.zope.org/Products/ZopeInterface
   * run win32 installer - it should auto-detect your python 2.4
     installation

3. python for windows extensions -- build 203 --
   http://pywin32.sourceforge.net/
   * run win32 installer - it should auto-detect your python 2.4
     installation
   The installer complains about a missing DLL. Download mfc71.dll from the
   site mentioned in the warning
   (http://starship.python.net/crew/mhammond/win32/) and move it into
   c:\Python24\DLLs

4. at this point, to preserve my own sanity, I grabbed cygwin.com's setup.exe
   and started it. It behaves a lot like dselect. I installed bash and other
   tools (but *not* python). I added C:\cygwin\bin to PATH, allowing me to
   use tar, md5sum, cvs, all the usual stuff. I also installed emacs, going
   from the notes at http://www.gnu.org/software/emacs/windows/ntemacs.html .
   Their FAQ at http://www.gnu.org/software/emacs/windows/faq3.html#install
   has a note on how to swap CapsLock and Control.

   I also modified PATH (in the same place as PATHEXT) to include C:\Python24
   and C:\Python24\Scripts . This will allow 'python' and (eventually) 'trial'
   to work in a regular command shell.

5. twisted -- 2.0 -- http://twistedmatrix.com/projects/core/
   * unpack tarball and run
       python setup.py install
   Note: if you want to test your setup - run:
       python c:\python24\Scripts\trial.py -o -R twisted
   (the -o will format the output for console and the "-R twisted" will
   recursively run all unit tests)

   I had to edit Twisted (core)'s setup.py, to make detectExtensions() return
   an empty list before running builder._compile_helper(). Apparently the test
   it uses to detect if the (optional) C modules can be compiled causes the
   install process to simply quit without actually installing anything.

   I installed several packages: core, Lore, Mail, Web, and Words. They all
   got copied to C:\Python24\Lib\site-packages\

   At this point
       trial --version
   works, so 'trial -o -R twisted' will run the Twisted test suite. Note that
   this is not necessarily setting PYTHONPATH, so it may be running the test
   suite that was installed, not the one in the current directory.

6. I used CVS to grab a copy of the latest Buildbot sources. To run the
   tests, you must first add the buildbot directory to PYTHONPATH. Windows
   does not appear to have a Bourne-shell-style syntax to set a variable just
   for a single command, so you have to set it once and remember it will
   affect all commands for the lifetime of that shell session.
       set PYTHONPATH=.
       trial -o -r win32 buildbot.test
   To run against both buildbot-CVS and, say, Twisted-SVN, do:
       set PYTHONPATH=.;C:\path to\Twisted-SVN
</pre>
buildbot hacking2005-04-23T03:50:00-07:002005-04-23T03:50:00-07:00Brian Warnertag:www.lothar.com,2005-04-23:/blog/5-buildbot-hacking/<p>I'm pushing to get a new <a class="reference external" href="http://buildbot.sf.net">BuildBot</a> release out on
monday, so the last few days have been a flurry of commits (and the weekend
will probably be the same). I was very pleased to hear that the Boost crew
have implemented a <a class="reference external" href="http://build.redshift-software.com:9990">Buildbot</a> to
run their (very large) regression …</p><p>I'm pushing to get a new <a class="reference external" href="http://buildbot.sf.net">BuildBot</a> release out on
monday, so the last few days have been a flurry of commits (and the weekend
will probably be the same). I was very pleased to hear that the Boost crew
have implemented a <a class="reference external" href="http://build.redshift-software.com:9990">Buildbot</a> to
run their (very large) regression test suite, especially because Dave Abrahams
and I talked about setting one up two years ago, at PyCon, and I was never
able to give them the time to make it happen. I was even more pleased to hear
that their goal is to move all their testing over to buildbot. You couldn't
ask for better marketing than for the STL heir-apparent to be using your
project :).</p>
<p>Both Thomas (at <a class="reference external" href="http://build.fluendo.com:8080/">Fluendo</a>) and the Boost
folks have patched their buildbots to allow the waterfall display to be themed
with CSS, and the results look great. I'm looking forward to getting Thomas's
code pulled into the mainline sources.. finally a way to make the waterfall
display less ugly.</p>
<p>Finally, the metabuildbot is shaping up. This is a buildbot that works to run
the buildbot's own unit tests. I need to find a reasonable hostname and link
for it, then I'll make it publicly visible. Bear has put a lot of time so
far into making the win32 slave work correctly, with no success yet (the
specific problem is that I'm using Arch to get up-to-date sources out to the
buildslaves, and tla is not happy on win32, some kind of 260-character limit
on pathnames that tla runs up against when it does a checkout). I've dropped
back to CVS for now (with a three-hour timeout in the hopes of getting around
sf.net's enormous anoncvs latency), but a separate bug in the buildslave,
compounded by a bug in the buildslave's error-handling code, has conspired
to get the win32 slave into a state that requires manual intervention to
un-jam. Grr, stupid windows.</p>
twisted talk2005-04-20T01:15:00-07:002005-04-20T01:15:00-07:00Brian Warnertag:www.lothar.com,2005-04-20:/blog/4-twisted-talk/<p>So I think the talk went really well. I spoke for about an hour before the
room was needed for another meeting, to about 10 or 15 OSAF developers. I
managed to cover the reactor, Protocols, Factories, building higher-level
protocols, Failures, Deferreds, reactor.run() vs twistd -y vs mktap/twistd …</p><p>So I think the talk went really well. I spoke for about an hour before the
room was needed for another meeting, to about 10 or 15 OSAF developers. I
managed to cover the reactor, Protocols, Factories, building higher-level
protocols, Failures, Deferreds, reactor.run() vs twistd -y vs mktap/twistd
-f, and even a bit of twisted.web (the resource-tree model) and threads
(reactor.runInThread/runFromThread). The things that were on my list but
which I didn't get to cover were Cred, usage.Options, PB, and Interfaces.</p>
<p>But all in all I think the session helped a lot of people get their heads
around the architecture.. I think they're now in a position to understand the
existing HOWTOs and other documentation.</p>
<p>After the session, I sat down with Brian and two other OSAF folks: Lisa
Dusseault and Grant Baillie. They are working on WebDAV, and have a strong
interest in a functional WebDAV client library. As I understand it, this
library's top-level API would need to look like an abstract file system, with
directory lookups, pathnames, something like file handles, and file
attributes. Inside, it would need to have a back end which actually speaks
WebDAV to some server, creating new connections when necessary, or re-using
persistent connections when possible. There would also need to be some sort of
cache-management policy thing, since smart caching can make or break the
performance of a WebDAV session.</p>
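<p>The abstract-filesystem shape we talked about can be sketched with a toy
in-memory stand-in (all the names below are invented for illustration; the
real library would put a WebDAV-speaking, cache-aware back end behind the
same interface rather than a dict):</p>

```python
class InMemoryFileSystem:
    """Toy stand-in for the abstract-filesystem API: pathnames,
    directory listings, file contents, and attributes. A WebDAV client
    library would implement the same methods by talking to a server."""

    def __init__(self):
        self._files = {}  # maps "/dir/name" -> file contents (bytes)

    def put(self, path, data):
        self._files[path] = data

    def get(self, path):
        return self._files[path]

    def listdir(self, dirpath):
        # Return the immediate children of a directory, like os.listdir().
        prefix = dirpath.rstrip("/") + "/"
        children = set()
        for path in self._files:
            if path.startswith(prefix):
                children.add(path[len(prefix):].split("/")[0])
        return sorted(children)

    def attributes(self, path):
        # File attributes; a WebDAV back end would fetch these via PROPFIND.
        return {"size": len(self._files[path])}
```

The point of the shape is that callers never see HTTP at all, which is what
leaves room for connection re-use and caching behind the scenes.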
<p>Given their needs, we agreed that a Twisted WebDAV client library would be a
great solution, and they've got the motivation and the knowledge (apparently
Lisa was one of the primary WebDAV folks at Microsoft) to pull it off.</p>
<p>I described the recent work that's gone into an abstract file system (by spiv
and others, for twisted.ftp), thinking that it would be the best place to
start. The next step will probably be to introduce them to spiv, and float a
post on twisted-python to see who else has an interest.</p>
<p>Brian also gave me a quick demo of Chandler, giving me a better idea about
where they're going and what their plans are. It's funny, about 15 years ago
I had a summer job at a research lab with a similar goal. They were
working on OCR and search technology, and wanted to make a box that could
digitize and read all the random bits of paper that you produce in the course
of a day, then let you index the information contained on them in a useful
way. The Chandler folks want to take all the random bits of digital
information that you create in the course of a day (email, IMs, calendar
entries, todo lists) and organize/share them in a useful way. Kinda neat. I
look forward to seeing where it goes.</p>
OSAF Twisted talk2005-04-19T02:44:00-07:002005-04-19T02:44:00-07:00Brian Warnertag:www.lothar.com,2005-04-19:/blog/3-OSAF-Twisted-talk/<p>This is a rough outline of the talk I'll be giving at the OSAF tomorrow.</p>
<pre class="literal-block">
definition of Twisted, resources:
  http://www.twistedmatrix.com
  svn://svn.twistedmatrix.com/svn/Twisted/trunk
  http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
  http://twistedmatrix.com/bugs/
  http://twistedmatrix.com/buildbot/
#twisted, #twisted.web on freenode …</pre><p>This is a rough outline of the talk I'll be giving at the OSAF tomorrow.</p>
<pre class="literal-block">
definition of Twisted, resources:
  http://www.twistedmatrix.com
  svn://svn.twistedmatrix.com/svn/Twisted/trunk
  http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
  http://twistedmatrix.com/bugs/
  http://twistedmatrix.com/buildbot/
  #twisted, #twisted.web on freenode

relationship of subprojects, dependencies:
  core, names, mail, web, words, conch, trial
  zope.interface, python2.2
  optional: pyopenssl, db stuff

directory overview:
  twisted.python: usage.Options, Failure, log
  twisted.internet: reactors, base classes for Protocol+Factory, Deferred
  twisted.protocols: simple protocols: finger, socks, telnet
  subproject directories
  doc/*/howto
  doc/core/howto/tutorial/listings/finger/*.py

motivation:
  simple client
  simple server
  not-so-simple server
  client+server
  need for a generalized solution
  threads, processes, event loop

event loop:
  asyncore
  reactor
  picture: reactor with select() call, sockets in .readers/.writers
    sockets have .doRead, .doWrite, are scheduled with .addReader/etc
  timers
  different kinds of reactors, using other event loops: gtk, kqueue

picture: Protocol with Transports, reactor
  Protocol: connectionMade, dataReceived, connectionLost, transport.write

how do those Protocols get created?
  reactor.listenTCP(port, factory)
    picture (server): Protocols, Factory
      listening socket (Port) points to Factory, creates new Protocols
      Factory gets startFactory, stopFactory, buildProtocol
      Protocols generally have .factory
  reactor.connectTCP(host, port, factory)
    picture (client):
      Factory gets startedConnecting, clientConnectionFailed, clientConnectionLost
        as well as startFactory, stopFactory, buildProtocol
      Connector is responsible for getting a connection to host+port+factory
        possibly multiple times, for ReconnectingClientFactory
      skip over Connector stuff

writing Protocols, using existing ones
  picture: t.p.finger.Finger
    overridable methods for getUser, getDomain, forwardQuery
  subclass, override method
  make a Factory which instantiates your new subclass
  attach to listenTCP
  Protocols are used for both clients and servers
  state machine
  return one-shot results with Deferreds
  return multi-shot results by overriding methods

larger protocols have more complex setup
  names: protocol parses the query, hands to factory
    factory does self.handleQuery, asks self.resolver, calls self.sendReply
    # good example of API, use of deferred: t.n.server.py:120, dns.py:1050
  web: basic HTTP protocol creates Requests, then does req.process
    twisted.web.site implements a Resource tree
    picture(web): root, getChild(), isLeaf, render(req)
    specialized subclasses provide CGI processing, static.File, distrib
  imap: involves cred, Mailbox objects, Message objects

top-level invocation:
  __main__, reactor.run()
    connectTCP, listenTCP
  or, creating an Application, then using twistd
    motivation: daemonization, logging, setuid/chroot, reactor, profiling
    think /etc/init.d
    picture: trees of Service/MultiService objects
      each gets startService, stopService
      t.a.internet.TCPServer(port, factory), TCPClient
    twistd -y foo.tac, script which creates an Application object
      sidebar: python as a configuration language
    serialize the Application, then launch it again later: twistd -f foo.tap
    shortcuts for common applications: mktap
      mktap plugins: Options, makeService(), register with plugins.tml

threads:
  nothing here needs threads
  where are they useful?
    wrapping blocking APIs: adbapi in particular
    integrating with other code
  threadpool: run a function in a thread, tell me when it is done

t.p.log:
  log.msg(msg, msg) emits a log
  log.err() emits the current exception
  log.err(f) emits a Failure object
  log output goes to an observer
    running from twistd: goes to twistd.log, or syslog
    running from __main__: log messages are discarded
    log.startLogging()

Failure:
  encapsulates a python exception
  can be serialized, printed, queried about what caused it
  Failure() inside an except: block wraps the current exception

Deferred:
  callback management
  use web.client.getPage as an example
  synchronous style:
    a=foo()
    b=bar(a)
    baz(b)
  asynchronous style:
    d=foo()
    d.addCallback(bar)
    d.addCallback(baz)
  callback vs errback, ladder diagram
  fire-before-addCallback is safe
  callbacks can return Deferreds: sub-ladders

usage.Options:
  create subclass, attributes indicate valid options
    optFlags, optParameters, subCommands
    define opt_foo(self,str) to implement --foo=str
  methods can customize processing further
    parseArgs, postOptions
  str() provides usage message
  Options implements the dict interface, opts['foo'], opts['v']
  usually invoked with opts.parseOptions(), which grabs sys.argv
  why? mktap plugins use the 'Options' class from the plugin to parse argv

lore:
  turn .xhtml into .html (or .latex, others)
  inline listings, pretty-print python code
  links to epydoc-generated API docs

pb:
  translucent RPC
  f=pb.PBServerFactory(root); reactor.listenTCP(port, f)
  cf=pb.PBClientFactory(); reactor.connectTCP(host, port, cf)
  d=cf.getRootObject(); d.addCallback(dostuff)
  ref.callRemote("method", args)
  def remote_method(self, args)

cred: howto is really good
  avatar, portal, realm, credentials, checker, mind
  portal has a set of checkers
  checker gets credentials, decides if they're ok, provides an avatarID
  realm gets avatarID and desired interfaces, returns an avatar
  protocol gets back the avatar, does stuff with it

interfaces: PEP245-style
  twisted/python/components.py
  zope.interface, tiny portion of Zope3
  many APIs want "object that can be adapted to IFoo" rather than an instance
    of a specific class
  some systems use it extensively: nevow's 'context': IRequest,ISession,ISite
</pre>
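<p>The synchronous-vs-asynchronous comparison in the outline can be made
concrete with a toy sketch. This is just the core callback-chaining idea for
the talk, not Twisted's actual (much richer) Deferred -- no errbacks, no
sub-Deferreds, and the class name is invented:</p>

```python
class TinyDeferred:
    """Toy sketch of the Deferred idea: a result flows through a chain
    of callbacks, each feeding the next, and firing before addCallback
    is safe."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self.result = None

    def addCallback(self, fn):
        if self._fired:
            # fire-before-addCallback is safe: run the new callback now
            self.result = fn(self.result)
        else:
            self._callbacks.append(fn)
        return self

    def callback(self, result):
        # Fire the chain: each callback's return value feeds the next.
        self._fired = True
        self.result = result
        for fn in self._callbacks:
            self.result = fn(self.result)
        self._callbacks = []
```

So <tt class="docutils literal">d.addCallback(bar); d.addCallback(baz)</tt> is
the asynchronous spelling of <tt class="docutils literal">baz(bar(a))</tt>,
with the reactor free to do other work until the result arrives.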
emacs2005-04-18T02:13:00-07:002005-04-18T02:13:00-07:00Brian Warnertag:www.lothar.com,2005-04-18:/blog/2-emacs/<p>I set up a few tools to post blog entries from emacs. All entries are kept in
CVS, and the whole tree is rsync'ed over to the web server. The elisp which
actually publishes the entry looks like this:</p>
<pre class="literal-block">
(defvar pyblosxom-entry-dir "~/stuff/Projects/WebLog/entries")
;; adapted from http://wiki.woozle …</pre><p>I set up a few tools to post blog entries from emacs. All entries are kept in
CVS, and the whole tree is rsync'ed over to the web server. The elisp which
actually publishes the entry looks like this:</p>
<pre class="literal-block">
(defvar pyblosxom-entry-dir "~/stuff/Projects/WebLog/entries")

;; adapted from http://wiki.woozle.org/BlogdorEngine
;; and http://list-archive.xemacs.org/xemacs/200211/msg00022.html

(defun char-isalpha-p (thechar)
  "Check to see if thechar is a letter"
  (and (or (and (>= thechar ?a) (<= thechar ?z))
           (and (>= thechar ?A) (<= thechar ?Z)))))

(defun char-isnum-p (thechar)
  "Check to see if thechar is a number"
  (and (>= thechar ?0) (<= thechar ?9)))

(defun char-isalnum-p (thechar)
  (or (char-isalpha-p thechar) (char-isnum-p thechar)))

(require 'cl-seq)

(defun blog-publish ()
  "Publish the blog entry in the current buffer"
  (interactive)
  (shell-command (format "cvs commit -m 'blog entry' %s"
                         (file-name-nondirectory buffer-file-name)))
  (shell-command "make -C .. publish") ; publish
  )

(define-minor-mode pyblosxom-post-minor-mode
  "Minor mode for blog posts"
  nil
  " blog-post" ; mode-line indicator
  '(
    ("\C-c\C-c" . blog-publish)
    )
  () ; forms run on mode entry/exit
  )

(defun blog-post (title)
  "Create a journal entry"
  (interactive "sTitle: ")
  (let ((filetitle (substitute-if-not ?_
                                      (lambda (c) (char-isalnum-p c))
                                      title)))
    (find-file (concat pyblosxom-entry-dir "/"
                       filetitle
                       (format-time-string "-%Y-%m-%d-%H-%M")
                       ".txt"))
    (goto-char (point-min))
    (insert title "\n\n")
    (save-buffer)
    (vc-register)
    (pyblosxom-post-minor-mode 1)
    ))
</pre>
blog startup2005-04-18T01:47:00-07:002005-04-18T01:47:00-07:00Brian Warnertag:www.lothar.com,2005-04-18:/blog/1-blog-startup/<p>I've been trying to get my project notes online for years now, and I finally
realized that I need to start smaller. After a week of intermittent activity,
I finally got <a class="reference external" href="http://pyblosxom.sourceforge.net">PyBlosxom</a> set up and
behaving fairly well.</p>
<p>In the process, I discovered that the CGI specification doesn't actually
require …</p><p>I've been trying to get my project notes online for years now, and I finally
realized that I need to start smaller. After a week of intermittent activity,
I finally got <a class="reference external" href="http://pyblosxom.sourceforge.net">PyBlosxom</a> set up and
behaving fairly well.</p>
<p>In the process, I discovered that the CGI specification doesn't actually
require the web server to close the child process' stdin: the child is
supposed to read exactly CONTENT_LENGTH bytes and then stop. Pyblosxom
violates this (it just reads until EOF), but it usually doesn't matter
because most web servers are nice enough to close it. It turns out that
Twisted's CGI handling module does not. I fixed it upstream (at least in the
old 'web' module.. web2 is another matter), and filed a pyblosxom bug. For
now I have to monkeypatch Twisted in my web setup program, at least until
Twisted-2.0.1 comes out.</p>
<p>My goal for this web log is twofold. The first is to contain an archive of
useful things I come across, cool stuff, new ideas, things I've learned, the
usual blog fodder. The second is to publish the diaries that I keep on each
of my projects, diaries that I use to help remember what I was thinking when
I last put energy into that project, in the hopes of reducing the
context-switching penalty that comes about from having a dozen active
projects at once. I have something like 1.5MB of diary entries on these
projects, going back to mid-2002 when I started keeping them. By getting
these projectlogs online (and enabling comments on them), I hope to give the
rest of the world a chance to look at my workbench and tell me what they find
interesting. Maybe someone else will pick up on an idea that I haven't had
time to pursue, maybe they will leave a note with useful directions to go or
tools to use. Perhaps by letting others in, and hopefully forming a bit of a
community here, I'll be more encouraged to work on the projects that have a
chance of forming communities of their own.</p>
<p>I plan to use pyblosxom's categories to put all the projects in places like
Projects/BuildBot . Non-project related things like site-setup notes or
personal rants can go in other categories next to Projects.</p>
<p>The blog is pretty basic for now, and not necessarily pleasant to look at,
but it's a start, and I've learned that waiting for perfection is the best
way to never get anything finished at all.</p>
<p>Welcome!</p>
<blockquote>
-Brian</blockquote>