Exploring Tor Exit Lists and ECC Signatures
At some point, I might write up a series of articles on the Tor Directory Protocol and how it all works, but first there are some things I’m trying to get my head around in order to finish off a new specification for Tor Exit Lists that will include signatures. Tor Exit Lists are produced by an “exit scanner”. We archive exit lists and use them to provide data for ExoneraTor and Onionoo.
Some background: it is possible to configure an exit relay to use an IP address for exit connections that is different from the IP address that appears in the consensus for Onion Routing (OR) connections or directory connections. Knowing the addresses that are actually used for exit connections is what lets us say whether a connection from a given IP at a given time was likely to be from a Tor user or not.
Looking over 72 hours of data, we see an average of 7435 distinct IP addresses (IPv4 and IPv6) in each consensus. In the same period, there are an average of 938 distinct IP addresses in the exit lists. Not all addresses found in the consensus will be used for exit connections, even if they belong to relays with the exit flag. Using the consensus addresses to “block Tor” will result in overblocking, incorrectly blocking over 7 times as many addresses as would be correctly blocked.
Using the list of addresses in the consensus would also result in underblocking. In this period, we see an average of 73 addresses that are only to be found in the exit lists, and are not found in the consensus. Active measurement is able to show the addresses that are used by exit relays by controlled experimentation as opposed to guesswork.
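To make the arithmetic behind those two claims explicit, here is a rough back-of-the-envelope sketch. It treats the three averages as if they described a single snapshot, which is an approximation I am making for illustration only; the variable names are made up for this post.

```python
# Averages quoted above, treated as one snapshot (an approximation).
consensus_addrs = 7435   # distinct addresses in a consensus
exit_list_addrs = 938    # distinct addresses in the exit lists
exit_only_addrs = 73     # in the exit lists but not in the consensus

# Addresses that appear in both the consensus and the exit lists.
in_both = exit_list_addrs - exit_only_addrs

# Consensus addresses never seen as exit addresses: blocked for no reason.
overblocked = consensus_addrs - in_both
overblock_ratio = overblocked / in_both

# Share of real exit addresses that a consensus-based block would miss.
underblock_share = exit_only_addrs / exit_list_addrs

print(f"correctly blocked:   {in_both}")
print(f"incorrectly blocked: {overblocked} ({overblock_ratio:.1f}x)")
print(f"exit addresses missed: {underblock_share:.1%}")
```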
Up to now, there have not been any signatures on the exit lists used by Tor Metrics. As Tor Metrics are both the primary producers and consumers of this format, we are able to extend it, and so we are currently working on a (not backwards compatible) revision of the specification.
Tor relay “router identity” keys are limited to 1024-bit RSA keys. When you see relays mentioned in tor log files, or are looking up a relay using Relay Search, the fingerprint you see identifying routers is based on this 1024-bit key. We don’t want to implement new code with 1024-bit RSA as that wouldn’t be very future-proof, but luckily Tor is already moving towards another scheme that was first proposed in 2013 and was implemented in tor 0.3.0.1-alpha: Ed25519.
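As a rough illustration of where that fingerprint comes from: dir-spec describes it as a hash of the ASN.1 encoded identity key. The sketch below assumes that this is SHA-1 over the DER bytes found inside the signing-key block, and uses the signing-key from the example descriptor quoted later in this post. The function name is made up here. If the key is transcribed faithfully, the result should line up with that descriptor's fingerprint line.

```python
import base64
import hashlib

# Body of the signing-key block from the example descriptor in this post.
SIGNING_KEY_B64 = (
    "MIGJAoGBANIzr5wvMsZhSUxACSSTeuOqcjHgRw1iJWOXaTqWwdUXRudBsOR1+RaG"
    "6i9E3UFMcCCMv+haLzVnlOCuUkvhSWIgeEn0RxgpaO41EieXnKcTWWMfPlXQoe2K"
    "j0MhzCaI5YRxFK0AnFovBBX7eV9x+2fASo/+5aaavqBCy7ZFoTQJAgMBAAE="
)


def rsa_fingerprint(key_b64):
    """Hex SHA-1 digest of the DER-encoded RSA public key (assumed scheme)."""
    der = base64.b64decode(key_b64)
    return hashlib.sha1(der).hexdigest().upper()


fingerprint = rsa_fingerprint(SIGNING_KEY_B64)
# Group into blocks of four characters, as the fingerprint line does.
print(" ".join(fingerprint[i:i + 4] for i in range(0, len(fingerprint), 4)))
```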
Roughly 29% of the relays running in the Tor network are using pre-0.3.0 versions of tor which do not yet use Ed25519 signatures and so 1024-bit RSA is still the primary mechanism used for identifying routers. Around 15 months ago, work was going on to add the Ed25519 relay identity keys to the consensus but due to performance issues that work was shelved at the time.
So, let’s take a look at “server descriptors” (§2.1.1 dir-spec). These are the signed documents that relays publish to the directory authorities, that the authorities then use for constructing their votes. Server descriptors contain information that the relay is asserting about itself, like what addresses it has and what features it supports. This is also where it publishes the keys that are needed to use it with the Tor protocol (the onion and ntor keys).
For example:
router uoaerg 137.50.19.11 443 0 80
identity-ed25519
-----BEGIN ED25519 CERT-----
AQQABpmpAaxBnHwb8In0z5hnq8RpxXNnV0VSrLeDxBoasZv9ozcqAQAgBAB6FFq4
tV9O7gFcWKb7xZdYLR5XHkAz9MecwXB7Ohnu3Uj2oUaFVDboy1LkzuK4PoGako1X
c3WIO4I1/VHrUugGYgzrXKVNDu5pFNf8ibf7ksBid0Cv4tK6pWMCS6r2agQ=
-----END ED25519 CERT-----
master-key-ed25519 ehRauLVfTu4BXFim+8WXWC0eVx5AM/THnMFwezoZ7t0
platform Tor 0.3.5.8 on Linux
proto Cons=1-2 Desc=1-2 DirCache=1-2 HSDir=1-2 HSIntro=3-4 HSRend=1-2 Link=1-5 LinkAuth=1,3 Microdesc=1-2 Relay=1-2
published 2019-04-08 09:43:45
fingerprint CF0C C69D E1E7 E75A 2D99 5FD8 D9FA 7D20 9835 31DA
uptime 1674318
bandwidth 1658880 16588800 2994620
extra-info-digest 0BFA51DEE097612601868748DB058B5377B69F5A 1b2JkVb0ZjpiLZfjGbVfl2RSJNzf//MCbKxeQvuSALM
onion-key
-----BEGIN RSA PUBLIC KEY-----
MIGJAoGBALsZpcDVk13jt0588OeA/Lf2PKhTjfQTwjiqau9RLot0UZyA8ee5nlkf
2Rr2R2asT/tMMFI3PvClqJfx+QHnVGZDsXhOuxcNSZBzUqnsxooeejBSuFWzCLlP
VHqqj58cg43GmW4p9EqaitQD9uiP2Ov5gZu30yGy9IFHhhuoYF2ZAgMBAAE=
-----END RSA PUBLIC KEY-----
signing-key
-----BEGIN RSA PUBLIC KEY-----
MIGJAoGBANIzr5wvMsZhSUxACSSTeuOqcjHgRw1iJWOXaTqWwdUXRudBsOR1+RaG
6i9E3UFMcCCMv+haLzVnlOCuUkvhSWIgeEn0RxgpaO41EieXnKcTWWMfPlXQoe2K
j0MhzCaI5YRxFK0AnFovBBX7eV9x+2fASo/+5aaavqBCy7ZFoTQJAgMBAAE=
-----END RSA PUBLIC KEY-----
onion-key-crosscert
-----BEGIN CROSSCERT-----
QCps8gFRHDyPvzWAQZAQlSmjxD3t2R1xdiMJ0w3Yzmu9fDuOFIaJz1mFD0U02Pl5
VU/ZfztQXmhCx9s/1Nt0YIJ1BUkg23Pn9nBkfPkLh+i/scH++cCg/0aGu14AqSbi
B+c2jMOF6fNtdXr1SCaHuPTaAl8Ja6IVozg+dcwJ14g=
-----END CROSSCERT-----
ntor-onion-key-crosscert 0
-----BEGIN ED25519 CERT-----
AQoABpmaAXoUWri1X07uAVxYpvvFl1gtHlceQDP0x5zBcHs6Ge7dADNSKRx3aKBy
adX9gcFBmJ4g7LTGWvKp8n6md6KwtnUAIaJjSr3JTgJ4eZj6gtyH0uzDwxfRpElE
ITqc2DHfzg8=
-----END ED25519 CERT-----
family $E59CC0060074E14CA8E9469999B862C5E1CE49E9
hidden-service-dir
contact 0xF540ABCD Iain R. Learmonth <irl@fsfe.org>
ntor-onion-key etwB4T2yCuYne+BZMBE7IU/s9cWkxwyPYpoNcUJMAHQ=
reject *:*
tunnelled-dir-server
router-sig-ed25519 2o32rFB0FvGO4ajKl65JaQpwDWQ4XZHr2Q4qdncQICK2avcEDNyGC05Ny1f82e59JMT/LyAZ7cvsZO3Z7q5tBA
router-signature
-----BEGIN SIGNATURE-----
pA0gSNGipcCXiCQ5A6vg+bwCR10cWQsIerdo2eTomfSo8LTp+9lQGjtokyjJKK4c
7A0JSG+DH33EogN2t4/J3PH508WU9l5A8xkXIBa/PW8U4ZhHWPj8+tEBQWkCn1Mu
+aWkEo6hCmOyr4c4Qj7ha+1KO507hgfa0XRVPmRNJPY=
-----END SIGNATURE-----
There are three keywords here that we’ll focus on relevant to the use of Ed25519 signatures: identity-ed25519, master-key-ed25519, and router-sig-ed25519.
The first of these, identity-ed25519, contains an Ed25519 certificate (cert-spec). That base64 encoded data isn’t very human friendly but luckily stem has a module for dissecting these.
The master-key-ed25519 keyword line is here for convenience only as this is the same key as is used to sign the certificate for identity-ed25519. This key is also bundled with the certificate, and the specification requires that these keys are the same. Let’s see if that is happening:
>>> from stem.descriptor.certificate import Ed25519Certificate
>>> cert = Ed25519Certificate.parse("""AQQABpmpAaxBnHwb8In0z5hnq8RpxXNnV0VSrLeDxBoasZv9ozcqAQAgBAB6FFq4
... tV9O7gFcWKb7xZdYLR5XHkAz9MecwXB7Ohnu3Uj2oUaFVDboy1LkzuK4PoGako1X
... c3WIO4I1/VHrUugGYgzrXKVNDu5pFNf8ibf7ksBid0Cv4tK6pWMCS6r2agQ=
... """)
>>> cert.extensions
[Ed25519Extension(type=4, flags=[], flag_int=0, data=b'z\x14Z\xb8\xb5_N\xee\x01\\X\xa6\xfb\xc5\x97X-\x1eW\x1e@3\xf4\xc7\x9c\xc1p{:\x19\xee\xdd')]
>>> # We can see here a single extension, with type 4, which is
>>> # the extension type containing the bundled key. The data
>>> # does not match what we see in the master-key-ed25519 line
>>> # because there it is base64 encoded.
>>> from base64 import encodebytes
>>> encodebytes(cert.extensions[0].data).strip(b'\n=')
b'ehRauLVfTu4BXFim+8WXWC0eVx5AM/THnMFwezoZ7t0'
>>> # Now it matches!
As we can extract the identity key successfully from the certificate, let’s ignore the master-key-ed25519 field going forward. In the certificate, the master key is certifying the signing key, which can expire. This is what makes the offline master key setup workable: the master key is only needed to issue a new certificate when the signing key is rotated, so it can be kept off the relay the rest of the time. If the signing key is compromised, it is only useful to an attacker until the certificate expires.
>>> # the signing key is the key in the certificate
>>> encodebytes(cert.key).strip(b'\n=')
b'rEGcfBvwifTPmGerxGnFc2dXRVKst4PEGhqxm/2jNyo'
>>> # which is a medium term key
>>> cert.expiration.isoformat()
'2019-05-07T01:00:00'
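For a look underneath stem, the same fields can be pulled out of the certificate by hand. This is a minimal sketch based on my reading of the layout in cert-spec: a one byte version, one byte certificate type, four byte expiration in hours since the epoch, one byte key type, the 32 byte certified key, an extension count, the extensions, and finally a 64 byte signature. The function names are made up for this post and there is no error handling.

```python
import base64
import struct
from datetime import datetime, timezone

# identity-ed25519 certificate from the descriptor quoted above.
CERT_B64 = (
    "AQQABpmpAaxBnHwb8In0z5hnq8RpxXNnV0VSrLeDxBoasZv9ozcqAQAgBAB6FFq4"
    "tV9O7gFcWKb7xZdYLR5XHkAz9MecwXB7Ohnu3Uj2oUaFVDboy1LkzuK4PoGako1X"
    "c3WIO4I1/VHrUugGYgzrXKVNDu5pFNf8ibf7ksBid0Cv4tK6pWMCS6r2agQ="
)
# "published" line of the same descriptor.
PUBLISHED = datetime(2019, 4, 8, 9, 43, 45, tzinfo=timezone.utc)


def dissect_cert(cert_b64):
    """Split a certificate into its fields (layout assumed from cert-spec)."""
    raw = base64.b64decode(cert_b64)
    version, cert_type, exp_hours, key_type = struct.unpack(">BBIB", raw[:7])
    fields = {
        "version": version,
        "cert_type": cert_type,
        # The expiration is stored as hours since the Unix epoch.
        "expires": datetime.fromtimestamp(exp_hours * 3600, tz=timezone.utc),
        "key_type": key_type,
        "certified_key": raw[7:39],
        "extensions": [],
    }
    offset = 40
    for _ in range(raw[39]):  # raw[39] is the number of extensions
        # Each extension: 2-byte length, 1-byte type, 1-byte flags, data.
        ext_len, ext_type, ext_flags = struct.unpack(">HBB", raw[offset:offset + 4])
        data = raw[offset + 4:offset + 4 + ext_len]
        fields["extensions"].append((ext_type, ext_flags, data))
        offset += 4 + ext_len
    fields["signature"] = raw[offset:]
    return fields


def b64_nopad(data):
    """Base64 without trailing padding, as used in descriptor lines."""
    return base64.b64encode(data).decode().rstrip("=")


cert_fields = dissect_cert(CERT_B64)
print("signing key:", b64_nopad(cert_fields["certified_key"]))
print("master key: ", b64_nopad(cert_fields["extensions"][0][2]))
print("expires:    ", cert_fields["expires"].isoformat())
print("valid for:  ", cert_fields["expires"] - PUBLISHED, "after publication")
```

The last line gives a feel for how "medium term" the signing key is: it compares the expiration against the descriptor's published time rather than against the current time.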
Now we can check that the signature on the descriptor was generated using the signing key in the certificate.
>>> # /tmp/server_descriptor.txt contains the server descriptor quoted above
>>> from stem.descriptor import parse_file
>>> desc = next(parse_file("/tmp/server_descriptor.txt", descriptor_type="server-descriptor 1.0"))
>>> cert.validate(desc)
>>> # not throwing an exception means success
We can check what happens when we use another certificate, which should fail as the signature will have used the wrong key:
>>> cert2 = Ed25519Certificate.parse("""AQQABpmWAVLI15lBP2vLQtbMpvZmBTzFsMIcmN1WC3CNmUaqXn+CAQAgBADi5wOy
... PA8fELwr32F2BkVcd3h8PLnOkQGDa4Edv3PKCeu+KGVoXmY6WnEB2qsYkvsWQk+a
... MRmS18Sbekg0aWPPfmpajEs4kq2dK3eKRL6Mtoad0s4fLkhRYLCmRvQuAA4=""")
>>> cert2.validate(desc)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    cert2.validate(desc)
  File "/usr/local/lib/python3.7/dist-packages/stem/descriptor/certificate.py", line 254, in validate
    raise ValueError('Ed25519KeyCertificate signing key is invalid (%s)' % exc)
ValueError: Ed25519KeyCertificate signing key is invalid (Signature was forged or corrupt)
Looking at the source code for the validate function, it’s not entirely clear to me what this error means, but an error was expected and we got one so that’s good. It would be nice to be able to validate the certificate independently of validating a server descriptor but that doesn’t currently seem to be possible.
I’ve also not entirely figured out yet if it’s comparing the certificate against the key that is contained in the server descriptor or using the keys in the certificate.
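On validating a certificate on its own: as I read cert-spec, the signature covers every byte of the certificate before the final 64 byte signature, and the type 4 extension carries the key that made it. Under those two assumptions, a check that needs no descriptor at all might look like the sketch below. It uses a small pure-Python Ed25519 verifier following the RFC 8032 equations so that nothing beyond the standard library is needed. It is slow and not constant time, so a maintained library would be the better choice for anything real. All function names are made up for this post.

```python
import base64
import hashlib

# identity-ed25519 certificate from the descriptor quoted earlier.
CERT_B64 = (
    "AQQABpmpAaxBnHwb8In0z5hnq8RpxXNnV0VSrLeDxBoasZv9ozcqAQAgBAB6FFq4"
    "tV9O7gFcWKb7xZdYLR5XHkAz9MecwXB7Ohnu3Uj2oUaFVDboy1LkzuK4PoGako1X"
    "c3WIO4I1/VHrUugGYgzrXKVNDu5pFNf8ibf7ksBid0Cv4tK6pWMCS6r2agQ="
)

# Ed25519 curve constants (RFC 8032).
P = 2**255 - 19
D = -121665 * pow(121666, P - 2, P) % P
SQRT_M1 = pow(2, (P - 1) // 4, P)


def _recover_x(y, sign):
    """Return the x coordinate for y with the given low bit, or None."""
    xx = (y * y - 1) * pow(D * y * y + 1, P - 2, P) % P
    x = pow(xx, (P + 3) // 8, P)
    if (x * x - xx) % P != 0:
        x = x * SQRT_M1 % P
    if (x * x - xx) % P != 0 or (x == 0 and sign):
        return None
    return P - x if (x & 1) != sign else x


def _decode_point(data):
    """Decode a 32-byte compressed point, or return None if it is invalid."""
    y = int.from_bytes(data, "little")
    sign, y = y >> 255, y & ((1 << 255) - 1)
    x = _recover_x(y, sign) if y < P else None
    return None if x is None else (x, y)


def _add(a, b):
    """Add two points on the twisted Edwards curve (affine coordinates)."""
    (x1, y1), (x2, y2) = a, b
    t = D * x1 * x2 * y1 * y2 % P
    x3 = (x1 * y2 + x2 * y1) * pow(1 + t, P - 2, P) % P
    y3 = (y1 * y2 + x1 * x2) * pow((1 - t) % P, P - 2, P) % P
    return (x3, y3)


def _mul(point, scalar):
    """Double-and-add scalar multiplication (not constant time)."""
    result = (0, 1)  # neutral element
    while scalar:
        if scalar & 1:
            result = _add(result, point)
        point = _add(point, point)
        scalar >>= 1
    return result


BASE_Y = 4 * pow(5, P - 2, P) % P
BASE = (_recover_x(BASE_Y, 0), BASE_Y)


def ed25519_verify(public_key, signature, message):
    """Check [S]B == R + [h]A with h = SHA-512(R || A || message)."""
    a = _decode_point(public_key)
    r = _decode_point(signature[:32])
    if a is None or r is None:
        return False
    s = int.from_bytes(signature[32:], "little")
    digest = hashlib.sha512(signature[:32] + public_key + message).digest()
    h = int.from_bytes(digest, "little")
    return _mul(BASE, s) == _add(r, _mul(a, h))


def verify_cert(cert_b64):
    """Verify a certificate against the master key bundled in extension 4."""
    raw = base64.b64decode(cert_b64)
    signed, signature = raw[:-64], raw[-64:]
    master_key, offset = None, 40
    for _ in range(raw[39]):  # raw[39] is the number of extensions
        ext_len = int.from_bytes(raw[offset:offset + 2], "big")
        if raw[offset + 2] == 4:
            master_key = raw[offset + 4:offset + 4 + ext_len]
        offset += 4 + ext_len
    if master_key is None or len(master_key) != 32:
        return False
    return ed25519_verify(master_key, signature, signed)


print(verify_cert(CERT_B64))
```

Note that this only shows the certificate is internally consistent. Anyone can mint a certificate that verifies against a master key they bundled themselves, so the bundled key still has to be compared with one obtained from somewhere trusted, such as the master-key-ed25519 line of a descriptor you already trust.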