Here are some quick notes that I took while studying SRTP from the standard document. They are not meant to be complete, but rather a quick overview of the protocol.
SRTP: Secure Real-time Transport Protocol
– is a profile of the Real-time Transport Protocol (RTP); its predefined encryption transforms work like stream ciphers (a keystream is XORed onto the payload).
– provides confidentiality, message authentication, message integrity, and replay protection.
– other goals are a small footprint and a low bandwidth cost.
– to simplify key management, it introduces a single MK (Master Key); all security services derive their keys from the MK using a key derivation function.
note: quoting the standard
SRTP provides a framework for encryption and message authentication of RTP and RTCP streams (Section 3). SRTP defines a set of default cryptographic transforms (Sections 4 and 5), and it allows new transforms to be introduced in the future (Section 6). With appropriate key management (Sections 7 and 8), SRTP is secure (Section 9) for unicast and multicast RTP applications (Section 11).
SRTP Framework
SRTP is defined as a profile of RTP, as an extension of the RTP Audio/Video profile. It can be visualized as residing between the RTP application and the transport layer.
SRTCP is to RTCP what SRTP is to RTP: it provides the same services, but message authentication is mandatory.
1. SRTP Packet
Payload size doesn’t change after encryption.
MKI [Optional] (Master Key Identifier)
– identifies the master key from which session keys are derived.
– shall not identify the cryptographic context.
Authentication tag [Recommended]
– carries message authentication data
– encryption shall be applied before authentication
        0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
     |V=2|P|X|  CC   |M|     PT      |       sequence number         | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                           timestamp                           | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |           synchronization source (SSRC) identifier            | |
     +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
     |            contributing source (CSRC) identifiers             | |
     |                               ....                            | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                   RTP extension (OPTIONAL)                    | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | |                          payload  ...                         | |
   | |                               +-------------------------------+ |
   | |                               | RTP padding   | RTP pad count | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
   | ~                     SRTP MKI (OPTIONAL)                       ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | :                 authentication tag (RECOMMENDED)              : |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   |                                                                   |
   +- Encrypted Portion*                      Authenticated Portion ---+
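To make the packet layout more concrete, here is a minimal Python sketch that splits a received SRTP packet into its header, encrypted payload, MKI, and authentication tag. The names `SrtpParts` and `split_srtp_packet` are my own, not from the standard. I am also assuming that the MKI length and tag length are known from the cryptographic context, since the packet itself does not carry them, and that the tag defaults to 10 bytes (80 bits).

```python
import struct
from typing import NamedTuple


class SrtpParts(NamedTuple):
    header: bytes    # RTP header incl. CSRC list and extension (sent in the clear)
    payload: bytes   # the Encrypted Portion
    mki: bytes       # optional Master Key Identifier (may be empty)
    auth_tag: bytes  # authentication tag (may be empty)
    seq: int         # 16-bit RTP sequence number
    ssrc: int        # 32-bit synchronization source identifier


def split_srtp_packet(packet: bytes, mki_len: int = 0, tag_len: int = 10) -> SrtpParts:
    """Split an SRTP packet; mki_len and tag_len are assumed known from the context."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    first, _second, seq, _timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    if first >> 6 != 2:
        raise ValueError("unexpected RTP version")
    header_len = 12 + 4 * (first & 0x0F)          # add the CSRC identifiers
    if first & 0x10:                              # X bit: header extension present
        if len(packet) < header_len + 4:
            raise ValueError("truncated header extension")
        _profile, ext_words = struct.unpack("!HH", packet[header_len:header_len + 4])
        header_len += 4 + 4 * ext_words
    payload_end = len(packet) - mki_len - tag_len
    if payload_end < header_len:
        raise ValueError("packet too short for MKI and authentication tag")
    return SrtpParts(
        header=packet[:header_len],
        payload=packet[header_len:payload_end],
        mki=packet[payload_end:payload_end + mki_len],
        auth_tag=packet[payload_end + mki_len:],
        seq=seq,
        ssrc=ssrc,
    )
```

Note that the header plus the payload form the Authenticated Portion, while the MKI sits outside of it.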
2. SRTP Cryptographic context
is the cryptographic state information that the sender and the receiver must maintain (e.g. keys, encryption algorithms used). Session keys are derived from master keys and used directly in the cryptographic transforms.
the cryptographic context parameters are either transform-independent (independent of the particular encryption or authentication transform used) or transform-dependent.
the cryptographic context of a packet is identified by the triplet: context identifier = <SSRC, destination network address, destination transport port number>
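A small Python sketch of how contexts could be stored and looked up by that triplet. The class names and the fields of `CryptoContext` are hypothetical and only cover a few of the parameters a real context holds.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (SSRC, network address, port number) as described above
ContextId = Tuple[int, str, int]


@dataclass
class CryptoContext:
    # Illustrative subset of the state; field names are my own.
    roc: int = 0                       # rollover counter
    s_l: int = 0                       # highest sequence number seen (receiver side)
    encryption: str = "AES-CTR"
    authentication: str = "HMAC-SHA1"
    master_key: bytes = b""
    master_salt: bytes = b""


class ContextStore:
    """Maps the context identifier triplet to its cryptographic context."""

    def __init__(self) -> None:
        self._contexts: Dict[ContextId, CryptoContext] = {}

    def add(self, ssrc: int, address: str, port: int, ctx: CryptoContext) -> None:
        self._contexts[(ssrc, address, port)] = ctx

    def find(self, ssrc: int, address: str, port: int) -> Optional[CryptoContext]:
        return self._contexts.get((ssrc, address, port))
```

Two streams with the same SSRC but different addresses or ports end up in different contexts, which is the point of using the whole triplet as the key.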
3. SRTP Packet Processing
@sender
– determine the cryptographic context to use.
– determine index of packet (from ROC, RTP packet sequence number, cryptographic context sequence number)
– determine the master key and master salt, and derive the session keys and session salt from them.
– encrypt the payload with the algorithm defined by the cryptographic context.
– append the MKI if the MKI indicator is set to 1.
– compute the authentication tag with the algorithm defined by the cryptographic context and append it to the packet.
@receiver
– find out the cryptographic context to use.
– get packet index
– determine the master key and master salt: if the MKI indicator is set to 1, use the MKI from the packet, otherwise use the index from the previous step. Then determine the session keys and session salt.
– authenticate
– decrypt
– update ROC and cryptographic context sequence number.
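The "packet index" step in both lists can be sketched in Python. The packet only carries the 16-bit sequence number (SEQ), so the 48-bit index is built from the ROC and SEQ. On the receiver side the ROC has to be guessed from the highest sequence number seen so far (s_l). The logic below follows the index estimation pseudocode in the standard's appendix as I understand it; the function names are my own.

```python
def sender_index(roc: int, seq: int) -> int:
    """Sender side: i = 2^16 * ROC + SEQ."""
    return (roc << 16) | seq


def estimate_receiver_index(roc: int, s_l: int, seq: int) -> int:
    """Receiver side: guess which rollover period seq belongs to.

    roc: receiver's current rollover counter
    s_l: highest sequence number received so far
    seq: sequence number of the packet being processed
    """
    if s_l < 32768:
        # seq far ahead of s_l is taken as a late packet from before the wrap
        v = (roc - 1) % 2**32 if seq - s_l > 32768 else roc
    else:
        # seq far behind s_l is taken as a packet from after the wrap
        v = (roc + 1) % 2**32 if s_l - 32768 > seq else roc
    return seq + v * 65536
```

For example, with ROC = 0 and s_l = 65530, a packet with SEQ = 3 is treated as having wrapped around, so its index is computed with ROC + 1.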
4. Predefined Algorithms for SRTP
for encryption
The encryption transforms defined in SRTP map the SRTP packet index
and secret key into a pseudo-random keystream segment. Each
keystream segment encrypts a single RTP packet. The process of
encrypting a packet consists of generating the keystream segment
corresponding to the packet, and then bitwise exclusive-oring that
keystream segment onto the payload of the RTP packet to produce the
Encrypted Portion of the SRTP packet. In case the payload size is
not an integer multiple of n_b bits, the excess (least significant)
bits of the keystream are simply discarded. Decryption is done the
same way, but swapping the roles of the plaintext and ciphertext.
– AES-CTR
– AES-f8
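The XOR step described in the quote is the same for both transforms; only the way the keystream is generated differs. A minimal Python sketch, assuming the keystream segment has already been produced by one of the transforms (the function name is my own):

```python
def xor_keystream(payload: bytes, keystream: bytes) -> bytes:
    """XOR a keystream segment onto the payload.

    The keystream is generated in whole cipher blocks, so it is usually
    longer than the payload; zip() stops at the end of the payload, which
    discards the excess keystream bytes.
    """
    if len(keystream) < len(payload):
        raise ValueError("keystream segment shorter than payload")
    return bytes(p ^ k for p, k in zip(payload, keystream))
```

Calling the same function again with the same keystream decrypts, and the output always has the same length as the input, which is why the payload size doesn't change after encryption.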
for authentication
We describe the process of computing authentication tags as follows.
The sender computes the tag of M and appends it to the packet. The
SRTP receiver verifies a message/authentication tag pair by computing
a new authentication tag over M using the selected algorithm and key,
and then compares it to the tag associated with the received message.
If the two tags are equal, then the message/tag pair is valid;
otherwise, it is invalid and the error audit message "AUTHENTICATION
FAILURE" MUST be returned.
– HMAC-SHA1
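A sketch of the tag computation and verification with Python's standard `hmac` module. Two details here are not in my notes above and come from my reading of the standard, so treat them as assumptions: for SRTP, M is the Authenticated Portion followed by the 32-bit ROC, and the tag is the leftmost 80 bits (10 bytes) of the HMAC output by default. The function names are my own.

```python
import hashlib
import hmac
import struct


def compute_auth_tag(auth_key: bytes, authenticated_portion: bytes,
                     roc: int, tag_len: int = 10) -> bytes:
    """HMAC-SHA1 over M = Authenticated Portion || ROC, truncated to tag_len bytes."""
    m = authenticated_portion + struct.pack("!I", roc)
    return hmac.new(auth_key, m, hashlib.sha1).digest()[:tag_len]


def verify_auth_tag(auth_key: bytes, authenticated_portion: bytes,
                    roc: int, received_tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = compute_auth_tag(auth_key, authenticated_portion, roc,
                                tag_len=len(received_tag))
    return hmac.compare_digest(expected, received_tag)
```

If `verify_auth_tag` returns False, the receiver discards the packet and reports the authentication failure.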
5. An example of an encryption process using AES-CTR
Mainly, the encryption process is as simple as XORing the payload with a keystream segment. The keystream segment SHALL be the concatenation of the 128-bit output blocks of the AES cipher in the encrypt direction, using key k = k_e, in which the block indices are in increasing order. Symbolically, each keystream segment looks like
E(k, IV) || E(k, IV + 1 mod 2^128) || E(k, IV + 2 mod 2^128) ...
where the 128-bit integer value IV SHALL be defined by the SSRC, the SRTP packet index i, and the SRTP session salting key k_s, as below.
IV = (k_s * 2^16) XOR (SSRC * 2^64) XOR (i * 2^16)
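The IV formula is plain integer arithmetic, so it can be sketched in Python without any crypto library. This only builds the IV and the successive counter blocks that would be fed to AES; the AES encryption itself is left out. The function names are my own, and I am assuming a 112-bit (14-byte) session salt.

```python
MASK_128 = (1 << 128) - 1


def aes_ctr_iv(session_salt: bytes, ssrc: int, index: int) -> int:
    """IV = (k_s * 2^16) XOR (SSRC * 2^64) XOR (i * 2^16), as a 128-bit integer."""
    if len(session_salt) != 14:
        raise ValueError("expected a 112-bit session salt")
    k_s = int.from_bytes(session_salt, "big")
    return ((k_s << 16) ^ (ssrc << 64) ^ (index << 16)) & MASK_128


def counter_blocks(iv: int, count: int) -> list:
    """The AES input blocks IV, IV + 1, IV + 2, ... (mod 2^128) as 16-byte strings."""
    return [((iv + j) & MASK_128).to_bytes(16, "big") for j in range(count)]
```

The keystream segment is then E(k_e, block) for each of these blocks, concatenated in order. The shift by 16 bits leaves the low 16 bits of the IV free for the block counter, so a single packet can use up to 2^16 blocks before the counter would run into the index bits.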