
Subject: SIP in higher education

Re: [] Call Notes - 6/8

  • From: (Dennis Baron)
  • Subject: Re: [] Call Notes - 6/8
  • Date: Thu, 15 Jun 2006 11:29:48 -0400

Conference Call June 8, 2006


Dennis Baron, MIT
Joel Dunn, MCNC
Candace Holman, Harvard
Deke Kassabian, Penn
Jerry Keith, UC Riverside
Jeff Kuure, Internet 2
Toby Murray, Kansas State
Christian Schlatter, UNC Chapel Hill
Stanislav Shalunov, Internet 2
Ben Teitelbaum, Internet 2
John Todd, Tello
Mike Van Norman, UCLA
Garett Yoshimi, Hawaii
Kurt Zoglmann, Kansas State
Phil Zimmermann


Before the official start of the call, Dennis asks if anyone has tested
the GoogleTalk to VOIP gateway service offered by [...]. John
Todd has done some GoogleTalk to VOIP testing using a native Asterisk
module, but only locally and never with anyone else's service.

Mike Van Norman asks if anyone is familiar with the CC1 ENUM trial. He
has spoken to a person at Verizon and is willing to pass on some
information if anyone is interested. The trial is a temporary endeavor,
using temporary numbers and seems to be focused more on procedural and
process issues rather than technology. John Todd has notes from a public
email list that he will forward to the list.

The remainder of today's call is an overview of the ZRTP protocol and
the Zphone product which implements it, provided by Phil Zimmermann. Phil
is the developer of ZRTP and Zphone, and the creator of PGP. ZRTP is a
key agreement protocol for VOIP traffic; the keys it negotiates are used
to encrypt the media with SRTP.

When he began developing Zphone, Phil considered how PSTN secure phones
operated. These were devices that were inserted into the line between
the phone and the network, and had two buttons labeled "secure" and
"clear". If the secure button was pressed, the modem chip in the device
performed a Diffie-Hellman exchange with the device at the other end of
the call. The devices then displayed some digits on a small LCD, which the
participants could compare to confirm that the call was secure. Phil
thought that this would be a good model for Internet phone traffic, and
this was implemented in a product called PGP Phone which was originally
for PSTN traffic.

Zphone is architecturally similar. When the RTP traffic is underway,
RTP header extensions are sent so each side knows that the other is a
ZRTP endpoint and what algorithms are supported. Calls don't have to
be secured, as encryption can be turned on at any time. In the current
version of Zphone, however, calls are secure by default. When a call is
encrypted, the two sides conduct a Diffie-Hellman exchange which results
in a shared secret that can be used to encrypt the call using SRTP.
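
As a rough illustration of the exchange described above, the following
Python sketch runs a toy Diffie-Hellman agreement and hashes the result
into a key. The prime, the generator, and the function names are
assumptions made for this example; they are not taken from ZRTP or Zphone,
and real deployments use much larger standardized groups.

```python
import hashlib
import secrets

# Toy parameters, assumed for illustration only.
P = 2**127 - 1   # a Mersenne prime, far too small for real use
G = 3

def make_keypair():
    """Return (private exponent, public value G^x mod P)."""
    private = secrets.randbelow(P - 3) + 2   # value in [2, P-2]
    public = pow(G, private, P)
    return private, public

def shared_key(my_private, their_public):
    """Combine our private exponent with the peer's public value."""
    secret = pow(their_public, my_private, P)
    # Hash the raw secret so it can serve as fixed-length key material.
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()

# Each side generates a key pair and sends only the public value.
alice_priv, alice_pub = make_keypair()
bob_priv, bob_pub = make_keypair()

# Both sides arrive at the same key without ever transmitting it.
alice_key = shared_key(alice_priv, bob_pub)
bob_key = shared_key(bob_priv, alice_pub)
```

Only the public values cross the network, which is why a server relaying
the media never needs to see the session key.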

This method involves the media only and is not dependent on the
servers. Other methods rely on the servers or signaling layer, which Phil
sees as insecure in various ways. For example, a session key is sometimes
inserted into the SIP packets, meaning that the SIP server knows the
session key; only the other client should know the key. Other methods
involve storing the key in an email message encrypted with S/MIME. This
requires the SIP client to support an email encryption standard and
depends on persistent keys. Persistent keys are valuable for email,
which may be read immediately, in two weeks, or in two years. The keys
must persist in order for the message to be read in the future. Phone
calls are much more temporary, and the keys do not need to be kept
around for future use. ZRTP creates keys at the beginning of the call and
destroys them at the end, which prevents a call from being retroactively
compromised with leaked keys.

PKI is yet another method, but this requires a complex bureaucracy
in order to maintain the public keys. PGP showed that you don't need
to depend on a centrally managed key registry to encrypt email. Phil
believes that you don't need to burden phone calls with the overhead of
PKI. But PKI is very good at detecting man-in-the-middle attacks, which
Zphone detects by the display of a short authentication string. A hash
of the session key is created, and digits of this hash are displayed on
each client which can be compared by the participants. If these match,
there is no man-in-the-middle. Reading these digits aloud does not
change the security characteristics of the call; they just alert the
participants to the presence of a wiretapper.
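
The idea of showing a few digits of a hash can be sketched as follows. This
is a minimal example, assuming SHA-256 and a four-digit decimal rendering;
the function name and the encoding are hypothetical and may differ from
what Zphone actually displays.

```python
import hashlib

def short_auth_string(session_key: bytes, digits: int = 4) -> str:
    """Derive a short, human-comparable string from a session key."""
    digest = hashlib.sha256(session_key).digest()
    # Take the first four bytes of the hash and reduce them to `digits`
    # decimal digits, left-padded with zeros.
    value = int.from_bytes(digest[:4], "big") % (10 ** digits)
    return str(value).zfill(digits)

# Both clients hold the same key, so both display the same digits.
key = b"example session key shared by both clients"
caller_display = short_auth_string(key)
callee_display = short_auth_string(key)
```

If a man-in-the-middle had negotiated a separate key with each side, the
two displays would be derived from different keys and would almost
certainly differ when read aloud.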

At the end of the call, ZRTP keys are destroyed but a hash of the key
is cached. In the next call, the client performs a new exchange, and
it checks to see if there is an existing shared secret from an earlier
call. If so, this is incorporated into the new shared secret. This means
that if the man-in-the-middle was not present on the earlier call,
they will also not be present on this call. Any wiretapper must be on all
calls between two clients for the keys to match, which is very unlikely.
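
A minimal sketch of this key continuity idea, assuming SHA-256 as the
mixing function: the function names, the "retained" label, and the way the
values are combined are assumptions for illustration, not the ZRTP key
derivation.

```python
import hashlib
from typing import Optional

def combine_secrets(dh_secret: bytes, cached: Optional[bytes]) -> bytes:
    """Mix the fresh Diffie-Hellman result with any cached secret."""
    h = hashlib.sha256()
    h.update(dh_secret)
    if cached is not None:
        h.update(cached)
    return h.digest()

def next_cached_secret(session_secret: bytes) -> bytes:
    """Value kept after the call; the session key itself is discarded."""
    return hashlib.sha256(b"retained" + session_secret).digest()

# Call 1: no attacker. Both sides derive and cache the same value.
call1_secret = combine_secrets(b"dh result of call 1", None)
retained = next_cached_secret(call1_secret)

# Call 2: an attacker completes a DH exchange with one side but does not
# hold the cached value, so the derived secrets differ.
honest_side = combine_secrets(b"dh result of call 2", retained)
attacker_side = combine_secrets(b"dh result of call 2", None)
```

Because `honest_side` and `attacker_side` differ, the short authentication
strings derived from them would no longer match.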

Candace asks if it matters if you call a person who happens to be at
a different computer with the same URI, or a different URI on the same
computer. Phil says that in this case, a different cache is used. Every
Zphone client has a different 96 bit random number associated with it, and
a user interface element will indicate that a call is using a new cache
if the client on the other end is different. Currently, Zphone monitors
the IP stack and does not care which VOIP client is being used. However,
if the ZRTP client is integrated into the VOIP client, the cache used
by one client might be different than one used by a different client.
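
One way to picture the per-client cache is a table keyed by the other
client's 96-bit identifier. The class and method names below are
hypothetical and only meant to show why a different computer, or a
different client on the same computer, starts with an empty entry.

```python
import secrets

def new_client_id() -> bytes:
    """Generate a 96-bit (12-byte) random client identifier."""
    return secrets.token_bytes(12)

class SecretCache:
    """Cached secrets, indexed by the remote client's identifier."""

    def __init__(self):
        self._entries = {}

    def is_new_peer(self, peer_id: bytes) -> bool:
        # A user interface could use this to flag a new cache entry.
        return peer_id not in self._entries

    def lookup(self, peer_id: bytes):
        return self._entries.get(peer_id)

    def store(self, peer_id: bytes, secret: bytes) -> None:
        self._entries[peer_id] = secret

cache = SecretCache()
office_client = new_client_id()
home_client = new_client_id()   # same person, different computer

cache.store(office_client, b"secret from an earlier call")
```

Here a call to `home_client` would be reported as a new peer even though
the same person answers, since the identifier belongs to the client and
not to the URI.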

Phil mentions that the PGP key fingerprints are very long, which he
always felt was inconvenient. The authentication string in Zphone is
similar, but is a function of both sides' public keys. Only four digits
of the string are read, due to the way that the string is generated. In
a classic Diffie-Hellman man-in-the-middle attack, the attacker replaces
both parties' public parameters with his own and sends those to each
side. If the attacker knows that you will only display and confirm a
small number of digits, he can search for values whose hashes begin with
the same digits on both sides. This can be prevented by reading lots of
digits, but
this limits the usability for most end users.
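
To see why a short string alone is weak, the sketch below searches for a
value whose hash starts with a chosen four-digit string. The hash function,
the digit encoding, and the names are assumptions for illustration; with
four decimal digits an attacker needs on the order of ten thousand
attempts.

```python
import hashlib

def first_digits(value: bytes, digits: int = 4) -> str:
    """First few decimal digits derived from a hash of the value."""
    digest = hashlib.sha256(value).digest()
    number = int.from_bytes(digest[:4], "big") % (10 ** digits)
    return str(number).zfill(digits)

def find_matching_value(target: str, limit: int = 1_000_000):
    """Try candidate values until one yields the target digits."""
    for counter in range(limit):
        candidate = counter.to_bytes(8, "big")
        if first_digits(candidate, len(target)) == target:
            return candidate, counter + 1   # value found, attempts used
    return None

target = first_digits(b"honest party's public value")
result = find_matching_value(target)
```

The search is cheap precisely because the attacker can keep generating
candidates after seeing the value to be matched, which is the freedom that
a commitment step removes.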

In Zphone this is solved by using hash commitment. Instead of sending the
public parameter, a hash of this value is sent to the first client by the
second client. The first client replies with their public parameter. Then
the second client sends the actual public parameter which will match if
there is no man-in-the-middle. This method keeps the man-in-the-middle
from knowing both values at the same time. Attacks where the eavesdropper
generates many different values that hash the same way on both sides
are prevented, and short authentication strings are possible.
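
The three-message ordering described above can be sketched as follows.
Random bytes stand in for the Diffie-Hellman public parameters, and the use
of SHA-256 and the function names are assumptions for this example rather
than the ZRTP message format.

```python
import hashlib
import hmac
import secrets

def commit(public_value: bytes) -> bytes:
    """Hash sent first, binding the sender to a value not yet revealed."""
    return hashlib.sha256(public_value).digest()

def verify(commitment: bytes, revealed: bytes) -> bool:
    """Check that the revealed value matches the earlier commitment."""
    return hmac.compare_digest(commitment, hashlib.sha256(revealed).digest())

# Step 1: the second client picks its value and sends only the hash.
second_public = secrets.token_bytes(32)
commitment = commit(second_public)

# Step 2: the first client replies with its own public value.
first_public = secrets.token_bytes(32)

# Step 3: the second client reveals its value; the first client checks it.
accepted = verify(commitment, second_public)

# A value chosen after seeing first_public would fail the check.
swapped = verify(commitment, second_public + b"x")
```

Since the second client is bound to its value before seeing the first
client's, an attacker in the middle cannot adjust one side's value to make
the short strings collide.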

Ben asks how this would work where you want a trusted man-in-the-middle,
such as in a three-party call. Phil says that a typical three-party call
looks like a V, with two end points connected by a common point. Two
ZRTP sessions could be conducted with the common point verifying both
authentication strings. Ben asks if this is a good thing, as the two end
points are unable to confirm the security of the call, but Phil mentions
that the human element is an important factor - the participants must
be able to trust the person at the other end of the call. If they don't
trust the person they're talking to, all the encryption in the world
won't matter.

Ben asks about session-oriented instant messaging, which does not seem
to be solved by ZRTP. Phil mentions that there is an API element in the
ZRTP SDK which deals with the short authentication string. It would be
possible to incorporate this string into an instant messaging session,
where the IM clients could compare this string as it was sent over a
secure channel. He does not recommend this, but it would be possible.

Phil then discusses scenarios where you may use a server that you don't
completely trust. You may not trust the server with the session key for
an encrypted VOIP conversation, but you do trust it to provide a secure
channel that can be used to send the short authentication string. One of
his clients, for example, mentioned wanting to send the authentication
string through an SSL channel.

Stanislav Shalunov asks about caching, and how it identifies other
clients. Phil explains that in the cache, there is a text field
associated with the shared secret. This field is displayed in the GUI
as a name field for the other party. This value is entered by the user
of the client, not by the party at the other end. The cache management
depends on the environment that you are operating in.

Candace interjects towards the end of the hour to say that she will post
comments to the email list. She also offers up the next half-hour for
Zphone testing and provides her SIP URI.

Phil warns that the current version of Zphone does not always work with
some forms of NAT traversal used by VOIP clients. Some use TCP tunneling
instead of UDP, and Zphone only looks for the UDP packets. Ideally,
the ZRTP protocol will be embedded into the VOIP client, but Zphone
operates by intercepting the packets and hoping that the client follows
the established protocols.
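
As an illustration of what "looking for the UDP packets" might involve, the
sketch below checks whether a UDP payload resembles an RTP packet and
whether its header extension bit is set, using the fixed RTP header layout
from RFC 3550. The function name and the returned fields are hypothetical;
this is not Zphone's actual detection logic.

```python
def parse_rtp_header(payload: bytes):
    """Return basic RTP header fields, or None if this is not RTP."""
    if len(payload) < 12:          # the fixed RTP header is 12 bytes
        return None
    first = payload[0]
    if first >> 6 != 2:            # top two bits carry the version (2)
        return None
    return {
        "padding": bool(first & 0x20),
        "extension": bool(first & 0x10),   # header extension present
        "csrc_count": first & 0x0F,
        "payload_type": payload[1] & 0x7F,
        "sequence": int.from_bytes(payload[2:4], "big"),
    }

# Version 2 with the extension bit set, sequence number 1.
sample = bytes([0x90, 0x00, 0x00, 0x01]) + bytes(8)
fields = parse_rtp_header(sample)
```

A client that tunnels its media over TCP never produces UDP payloads of
this shape, which is consistent with the limitation Phil describes.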

The call concludes with Phil offering some final philosophical remarks. He
says that protocols resemble the institutions that make them. Lots
of VOIP encryption protocols are created by companies that make the
server equipment, who build protocols that depend on their servers
and CALEA compliance. Phil believes that if two people want to get on
the phone and speak Navajo, for example, they should be able to do it
without needing any special dispensation from the phone company. Any
VOIP phone should be able to use the ZRTP protocol in the media stream,
which puts the control in the hands of the users. Some in the IETF view
this as a layer violation, but Phil believes that doing encryption in
the signaling layer is a violation. Encryption is about protecting the
rights of the individuals.

The next call will be on June 22nd.
