A Linux bash script to download all PDF files from a page

Are you trying to download multiple files from a web page and tired of clicking again and again?

I needed to download about 100 PDFs from a single web page, so I started looking for a bash script to automate the process. I found this interesting article by Guillermo Garron, which combines several useful programs into a neat script that downloads all the links from a page using the lynx command-line web browser and the wget downloader.

First, install the browser:

$ sudo apt-get install lynx

Lynx has a nice feature that lets you grab all the links from a page:

$ lynx --dump http://mlg.eng.cam.ac.uk/pub/ > ~/links.txt

The dump ends with a numbered list of the page's links, one per line, each preceded by its reference number.

So we need to strip the leading numbering column and drop all non-PDF links so that wget can read the output. Note the single > here, which overwrites the file instead of appending the filtered list to the earlier dump:

$ lynx --dump http://mlg.eng.cam.ac.uk/pub/ | awk '/http/{print $2}' | grep -i '\.pdf' > ~/links.txt

The result is a clean list of PDF URLs, one per line, ready to be fed to wget.

The last step is to pass this file to wget to download all the PDFs:

$ while read -r url; do wget "$url"; done < ~/links.txt

Voilà! All the files are downloaded.

Quick notes about SRTP (Secure Real-time Transport Protocol)

Here are some quick notes that I took while studying SRTP from the standard document (RFC 3711). They are not meant to be complete, but rather a quick overview of the protocol.

SRTP: Secure Real-time Transport Protocol
is a profile of the Real-time Transport Protocol (RTP); its predefined encryption transforms work like a stream cipher
provides confidentiality, message authentication, message integrity, and replay protection
other goals are a small footprint and a low bandwidth cost
to simplify key management, it introduces a single MK (Master Key); all security services derive their keys from the MK using a key derivation function
Note, quoting the standard:
SRTP provides a framework for encryption and message authentication of RTP and RTCP streams (Section 3). SRTP defines a set of default cryptographic transforms (Sections 4 and 5), and it allows new transforms to be introduced in the future (Section 6). With appropriate key management (Sections 7 and 8), SRTP is secure (Section 9) for unicast and multicast RTP applications (Section 11).
SRTP Framework
SRTP is defined as a profile of RTP, an extension of the Audio/Video profile. It can be pictured as sitting between the RTP application and the transport layer.
SRTCP is to RTCP what SRTP is to RTP: it provides the same services, but message authentication is mandatory.
1. SRTP Packet
Payload size doesn’t change after encryption.
MKI [Optional] (Master Key Identifier)
    – identifies the master key from which the session keys are derived.
    – shall not identify the cryptographic context.
Authentication tag [Recommended]
    – carries the message authentication data.
    – encryption shall be applied before authentication.
   

        0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
     |V=2|P|X|  CC   |M|     PT      |       sequence number         | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                           timestamp                           | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |           synchronization source (SSRC) identifier            | |
     +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
     |            contributing source (CSRC) identifiers             | |
     |                               ....                            | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                   RTP extension (OPTIONAL)                    | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | |                          payload  ...                         | |
   | |                               +-------------------------------+ |
   | |                               | RTP padding   | RTP pad count | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
   | ~                     SRTP MKI (OPTIONAL)                       ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | :                 authentication tag (RECOMMENDED)              : |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   |                                                                   |
   +- Encrypted Portion*                      Authenticated Portion ---+
2. SRTP Cryptographic context
    is the cryptographic state information that the sender and the receiver must maintain (e.g. keys and the encryption algorithms in use). Session keys are derived from master keys and used directly in the cryptographic transforms.
    the cryptographic context parameters are either transform-independent (independent of the particular encryption or authentication transform used) or transform-dependent.
    the cryptographic context of a packet is identified by the triplet: context identifier = <SSRC, destination network address, destination port number>
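As a minimal sketch of how an implementation might store and look up contexts by that triplet: the class name, field names and example values below are hypothetical and are not taken from the standard.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Context identifier: (SSRC, destination address, destination port).
ContextId = Tuple[int, str, int]


@dataclass
class CryptoContext:
    """Hypothetical container for per-stream SRTP state."""
    master_key: bytes
    master_salt: bytes
    roc: int = 0   # rollover counter
    s_l: int = 0   # highest sequence number seen so far (receiver side)


contexts: Dict[ContextId, CryptoContext] = {}


def find_context(ssrc: int, dst_addr: str, dst_port: int) -> CryptoContext:
    """Return the context for a packet, or raise KeyError if none is known."""
    return contexts[(ssrc, dst_addr, dst_port)]


# Example values are made up for illustration only.
contexts[(0xDEADBEEF, "192.0.2.10", 5004)] = CryptoContext(
    master_key=bytes(16), master_salt=bytes(14)
)
ctx = find_context(0xDEADBEEF, "192.0.2.10", 5004)
```

A plain dictionary keyed by the triplet is enough here because the identifier is an exact-match key; a real stack would also need to handle context creation and expiry.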
3. SRTP Packet Processing
    @sender
    – determine the cryptographic context to use.
    – determine the packet index (from the ROC, i.e. the rollover counter, the RTP packet sequence number, and the sequence number stored in the cryptographic context).
    – determine the master key and master salt, and derive the session keys and session salt from them.
    – encrypt the payload with the algorithm defined by the cryptographic context.
    – append the MKI if the MKI indicator is set to 1.
    – compute the authentication tag with the algorithm defined by the cryptographic context and append it to the packet.
    @receiver
    – find the cryptographic context to use.
    – get the packet index.
    – determine the master key and master salt: if the MKI indicator is set to 1, use the MKI from the packet, otherwise use the packet index from the previous step. Then derive the session keys and session salt.
    – authenticate.
    – decrypt.
    – update the ROC and the sequence number stored in the cryptographic context.
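The "packet index" step can be sketched as follows. The notes above do not give the formula; this follows my reading of Section 3.3.1 and Appendix A of RFC 3711, where the index is i = 2^16 * ROC + SEQ and the receiver guesses the rollover counter from the highest sequence number it has seen (s_l). The function names are my own.

```python
def packet_index(seq: int, roc: int) -> int:
    """Sender side: 48-bit index i = 2^16 * ROC + SEQ."""
    return (roc << 16) | seq


def estimate_index(seq: int, s_l: int, roc: int) -> int:
    """Receiver side: guess the index of a packet with sequence number seq.

    s_l is the highest sequence number received so far and roc the current
    rollover counter. The guess picks ROC-1, ROC or ROC+1, whichever puts
    the packet closest to s_l (assumed from RFC 3711, Appendix A).
    """
    if s_l < 32768:
        # seq far ahead of s_l means a late packet from before the wrap.
        v = (roc - 1) % 2**32 if seq - s_l > 32768 else roc
    else:
        # seq far behind s_l means the 16-bit counter has wrapped.
        v = (roc + 1) % 2**32 if s_l - 32768 > seq else roc
    return seq + v * 65536


# The sequence number wrapped from 65530 to 5: the guess moves to ROC + 1.
wrapped = estimate_index(seq=5, s_l=65530, roc=0)
# A late packet from before the wrap: the guess moves back to ROC - 1.
late = estimate_index(seq=65530, s_l=5, roc=1)
```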
4. Predefined Algorithms for SRTP
    
    For encryption, the standard says:
   The encryption transforms defined in SRTP map the SRTP packet index
   and secret key into a pseudo-random keystream segment.  Each
   keystream segment encrypts a single RTP packet.  The process of
   encrypting a packet consists of generating the keystream segment
   corresponding to the packet, and then bitwise exclusive-oring that
   keystream segment onto the payload of the RTP packet to produce the
   Encrypted Portion of the SRTP packet.  In case the payload size is
   not an integer multiple of n_b bits, the excess (least significant)
   bits of the keystream are simply discarded.  Decryption is done the
   same way, but swapping the roles of the plaintext and ciphertext.
    – AES-CTR
    – AES-f8
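The XOR step described in the quoted text can be illustrated with a short sketch. The keystream here is a placeholder byte string, not real AES-CTR or AES-f8 output, and the function name is my own. For simplicity it works on whole bytes rather than bits.

```python
def xor_keystream(data: bytes, keystream: bytes) -> bytes:
    """XOR data with a keystream segment, dropping any excess keystream."""
    if len(keystream) < len(data):
        raise ValueError("keystream segment is shorter than the payload")
    # zip stops at the end of data, so unused keystream bytes are discarded.
    return bytes(d ^ k for d, k in zip(data, keystream))


# Placeholder keystream; a real one would come from the cipher.
keystream = bytes(range(32))
payload = b"hello, SRTP"

ciphertext = xor_keystream(payload, keystream)
# Decryption is the same operation with plaintext and ciphertext swapped.
recovered = xor_keystream(ciphertext, keystream)
```

Because XOR is its own inverse, one function serves for both directions, and the ciphertext has the same length as the payload.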
 
    For authentication, the standard says:
   We describe the process of computing authentication tags as follows.
   The sender computes the tag of M and appends it to the packet.  The
   SRTP receiver verifies a message/authentication tag pair by computing
   a new authentication tag over M using the selected algorithm and key,
   and then compares it to the tag associated with the received message.
   If the two tags are equal, then the message/tag pair is valid;
   otherwise, it is invalid and the error audit message "AUTHENTICATION
   FAILURE" MUST be returned.
    – HMAC-SHA1    
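A minimal sketch of the tag computation and check with HMAC-SHA1, using Python's standard hmac module. Two details are assumptions from my reading of RFC 3711 rather than from the notes above: M is the authenticated portion of the packet followed by the 32-bit ROC, and the tag is truncated to 80 bits. The key and packet bytes are made-up example values.

```python
import hashlib
import hmac
import struct

TAG_LEN = 10  # bytes; 80-bit tag (assumed default)


def auth_tag(auth_key: bytes, authenticated_portion: bytes, roc: int) -> bytes:
    """Compute a truncated HMAC-SHA1 tag over M = authenticated portion || ROC."""
    m = authenticated_portion + struct.pack("!I", roc)
    return hmac.new(auth_key, m, hashlib.sha1).digest()[:TAG_LEN]


def verify(auth_key: bytes, authenticated_portion: bytes, roc: int,
           tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = auth_tag(auth_key, authenticated_portion, roc)
    return hmac.compare_digest(expected, tag)


key = bytes(20)                       # example 160-bit session auth key
packet = b"rtp-header|encrypted-payload"
tag = auth_tag(key, packet, roc=0)

ok = verify(key, packet, 0, tag)
tampered = verify(key, packet + b"x", 0, tag)
```

hmac.compare_digest is used instead of == so that the comparison time does not leak how many leading bytes of the tag matched.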
5. An example of the encryption process using AES-CTR
   In essence, the encryption process is as simple as XORing the payload
   with a keystream segment.
   The keystream segment SHALL be the concatenation of the 128-bit output
   blocks of the AES cipher in the encrypt direction, using key k = k_e,
   in which the block indices are in increasing order.  Symbolically,
   each keystream segment looks like

      E(k, IV) || E(k, IV + 1 mod 2^128) || E(k, IV + 2 mod 2^128) ...

   where the 128-bit integer value IV SHALL be defined by the SSRC, the
   SRTP packet index i, and the SRTP session salting key k_s, as below.

      IV = (k_s * 2^16) XOR (SSRC * 2^64) XOR (i * 2^16)
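The IV formula translates directly into integer arithmetic, since multiplying by 2^n is a left shift by n bits. The sketch below assumes a 112-bit salt, a 32-bit SSRC and a 48-bit packet index; the function names and the example salt are mine. The AES encryption of each counter block is not shown because Python's standard library has no AES.

```python
MASK_128 = (1 << 128) - 1


def srtp_iv(k_s: int, ssrc: int, index: int) -> int:
    """IV = (k_s * 2^16) XOR (SSRC * 2^64) XOR (i * 2^16), as a 128-bit int."""
    return ((k_s << 16) ^ (ssrc << 64) ^ (index << 16)) & MASK_128


def counter_blocks(iv: int, count: int) -> list:
    """Return the first `count` AES input blocks: IV, IV+1, ... mod 2^128."""
    return [((iv + j) & MASK_128).to_bytes(16, "big") for j in range(count)]


salt = 0xF0F1F2F3F4F5F6F7F8F9FAFBFCFD   # example 112-bit session salt
iv = srtp_iv(salt, ssrc=0, index=0)
blocks = counter_blocks(iv, 3)
# Each block would then be encrypted with AES under k_e, and the outputs
# concatenated to form the keystream segment.
```

The low 16 bits of the IV are always zero, which leaves room for the block counter to run within a single packet without touching the bits derived from the salt, SSRC and index.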