Script writing
For work, I was asked to help port a project to Linux. This involved working on a computer behind a firewall. I am sure that everyone is familiar with ssh. For those of you who are not, ssh is a program that provides an encrypted remote login, much like an encrypted telnet session. And telnet is a program that allows someone to log into a computer remotely. So, to access the computer, all I would need to do is ssh into it. Unfortunately, it was not that easy.
The first problem is the firewall, which exists to provide some measure of security. The computer that I wanted to access (bob) was behind a firewall and, as a result, was not accessible. To get past the firewall, I was provided with an account on the firewall machine (bill). Once I logged into bill, I could then turn around and access bob.
ssh hamzy@bill
and then from within the ssh session
ssh hamzy@bob
What if I wanted to run the second ssh on my own machine (localhost)? Fortunately, ssh provides a solution called port forwarding. Port forwarding is essentially a small program that reads input from one port and sends it to another and, at the same time, reads output from that second port and sends it back to the first. Ssh uses port 22. If I create a local port (7777) on my machine and use port forwarding through bill to connect it with port 22 on bob (7777 <-> 22), then I can ssh from my machine as if bob were directly accessible.
ssh -L 7777:bob:22 hamzy@bill
and then from my machine
ssh -p 7777 hamzy@localhost
It is still easy, right? Unfortunately, it wasn't. The second problem was that my connection was unreliable. Whatever the cause, some machine along the chain from my computer to the remote computer would have a temporary problem and I would lose the connection. Whatever work I was in the middle of would be lost and I would have to start over. To solve this I turned to another powerful unix program called screen. Screen essentially allows one session to control many child sessions. On top of that, screen can run in the background and does not mind if you exit the session (log out), which would normally end everything running in it. So, on my local machine, I would normally start screen as follows:
screen
Ssh allows any program to be run instead of the default login shell by naming it on the command line:
ssh -p 7777 hamzy@localhost screen
Every time you start screen, a new screen process is created. To access a previously running screen process, I need to provide some command line arguments:
screen -DR
This tells screen to detach the first screen process it finds, if necessary, and reattach to it. All I have to do is add the -t option to ssh to provide a terminal for screen to use. So now I will not lose my work if the connection is lost. All I need to do is put the commands into a loop that sleeps for a little bit (to be nice) and then retries the connection.
So, was everything solved? No. A dropped connection went unnoticed until I went back to my session and pressed a key; only then would the program wake up and realize that the connection was lost. After waiting for a little while, it would automatically reconnect. I could improve this. The version of ssh that I was using was old. When I used the latest version, I found that I could create a file called ~/.ssh/config and add the following:
ServerAliveInterval 15
ServerAliveCountMax 4
TCPKeepAlive yes
This would tell ssh to ping the remote machine every 15 seconds and terminate the connection if the remote machine did not respond after four attempts. The connection would be lost and my loop would reconnect.
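A minimal sketch of such a retry loop might look like this. The function name reconnect and its delay and attempt-limit arguments are hypothetical, chosen for illustration, and the sketch assumes that a clean exit (status 0) means the session was ended on purpose, so it stops retrying in that case.

```shell
#!/bin/sh
# reconnect DELAY MAX_TRIES COMMAND [ARGS...]
# Hypothetical helper: rerun COMMAND until it exits cleanly.
# MAX_TRIES=0 means "keep trying forever".
reconnect() {
    delay=$1
    max_tries=$2
    shift 2
    tries=0
    while [ "$max_tries" -eq 0 ] || [ "$tries" -lt "$max_tries" ]; do
        tries=$((tries + 1))
        # Assumption: exit status 0 means the session ended on purpose.
        "$@" && return 0
        # Be nice: pause before the next attempt.
        sleep "$delay"
    done
    return 1
}
```

With the tunnel on port 7777 in place, it could be called as: reconnect 5 0 ssh -t -p 7777 hamzy@localhost screen -DR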
This all runs automatically, but every time I log in to a machine it wants a username and a password. I provide the username on the command line since I do not care who knows that fact, but I do not want to provide the password on the command line. How can I solve this? Ssh can authenticate with a key pair instead of a password. The private key is normally protected by a passphrase, and if I do not want to be asked for one, I can tell ssh-keygen to use an empty passphrase. I do this by the following:
cd ~/.ssh
ssh-keygen -t dsa
Ssh-keygen will ask me what passphrase I want and generate a DSA key pair. The public half is a plain text file called ~/.ssh/id_dsa.pub . I can notify ssh of this on the remote machine, bob, by putting the contents of this file into another file called ~/.ssh/authorized_keys . This is of course not as secure as asking for a password every time, since anyone who obtains the private key file can log in. But the tradeoff is that it allows you to automate the connection.
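Installing the public key can be sketched as follows. The helper name add_key is hypothetical, and the sketch assumes the public key file holds a single line. It appends the key only if that exact line is not already present, so running it twice does no harm.

```shell
#!/bin/sh
# add_key PUBKEY_FILE AUTHORIZED_KEYS_FILE
# Hypothetical helper, meant to run on the remote machine (bob)
# after the .pub file has been copied there.
add_key() {
    key=$(cat "$1") || return 1
    mkdir -p "$(dirname "$2")"
    touch "$2"
    # Append only when this exact line is missing.
    grep -qxF -- "$key" "$2" || printf '%s\n' "$key" >> "$2"
    # sshd commonly refuses key files that other users can write to.
    chmod 600 "$2"
}
```

On bob this would be called as: add_key id_dsa.pub ~/.ssh/authorized_keys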
Now, I can run commands on the remote machine and port the program. To test the program, I need to run it locally. I will build the program on the remote machine and copy the data to my local machine. This is called mirroring. How can I copy the files? I could run tar on bob to bundle the files up and copy that bundle to a tar running on localhost that would unbundle them. The steps to accomplish this are 1) run tar to create a bundle, 2) copy the bundle, and 3) run tar to unbundle the files. That is too many separate steps. I can improve this.
Unix allows you to chain programs together by taking program A's output and forwarding it to program B. Ssh follows this philosophy. So, using ssh, I can connect task 1 and task 3 together without performing task 2. Now, I do the following:
cd location_of_files_on_localhost
ssh -p 7777 hamzy@localhost "(cd location_of_files_on_bob; tar c *)" | tar xv
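The same pipe works without ssh in the middle, which makes the mechanism easier to see. Here is a minimal local sketch with a hypothetical helper name, copy_tree. It passes "-f -" so that both tar processes use the pipe explicitly instead of relying on tar's default device.

```shell
#!/bin/sh
# copy_tree SRC_DIR DEST_DIR
# Hypothetical helper: copy a directory tree through a tar pipe.
copy_tree() {
    mkdir -p "$2" || return 1
    # The first tar bundles SRC_DIR to stdout;
    # the second tar unbundles from stdin into DEST_DIR.
    (cd "$1" && tar cf - .) | (cd "$2" && tar xf -)
}
```

Running the first half of the pipe through ssh, as in the command above, turns this local copy back into the remote one.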
However, there are more efficient ways to mirror files. If the data has not changed, then nothing needs to be copied. Also, to speed the copying process, if only small differences exist between two files, then only those differences need to be copied. The program that does this in Unix is rsync. The people who wrote rsync knew about ssh. Connecting the dots, I would do the following (the -a option makes rsync recurse into directories and preserve file attributes):
rsync -a --rsh="ssh -l hamzy -p 7777" localhost:location_of_files_on_bob location_of_files_on_localhost
But what if I only wanted to copy files? I first need to set up the port forwarding, but that creates a login session. This session needs to be closed manually by logging out. After I copy the files, I would leave this session running. This is not very clean. To solve this I could instead run some other program that ends automatically. I chose to run sleep, which runs for some amount of time that I arbitrarily set as long enough (one day).
ssh -L 7777:bob:22 hamzy@bill "sleep 1d"
# run rsync
One day is of course not a perfect fit, but I am not looking for perfect, just "good enough." At least it consumes a minimal amount of resources on bill, where the sleep runs. And I can programmatically find it and kill it (unix terminology).
# kill the sleep session
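One way to find and kill the session is to remember its process ID when starting it. This is a minimal sketch; the helper names start_bg and stop_bg and the variable bg_pid are hypothetical.

```shell
#!/bin/sh
# start_bg COMMAND [ARGS...]
# Run COMMAND in the background and remember its process ID in bg_pid.
start_bg() {
    # stdin comes from /dev/null so the background job never waits on the terminal.
    "$@" < /dev/null &
    bg_pid=$!
}

# stop_bg
# Kill the remembered process and wait for it to go away.
stop_bg() {
    kill "$bg_pid" 2>/dev/null
    wait "$bg_pid" 2>/dev/null
    return 0
}
```

The sequence would then be: start_bg ssh -L 7777:bob:22 hamzy@bill "sleep 1d", a short pause to let the tunnel come up, the rsync command, and finally stop_bg. This stops the local ssh client and closes the forwarded port; the remote sleep may keep running on bill until its time is up.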
Now, I am happy. I can edit files and copy files. But inspiration hit me and I thought I could use what I just learned to do something else. Screen is a pretty powerful program. You can name screen sessions. You can list running screen sessions. Screen even allows programmatic manipulation.
I can start a named screen session that is initially detached:
screen -dm -S AR
I can check to see if that session is running:
screen -r AR -ls -q
I can close a named screen session:
screen -dr AR -X quit
I can add a program Y to a named screen session:
screen -dr AR -X screen Y
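The commands above could be combined into one small helper, sketched below under a few assumptions. The function names and the SCREEN variable are hypothetical. The sketch checks for the session by searching the output of screen -ls rather than using the exit status, and it addresses the session with -S NAME -X, which is another common way to send a command to a named session.

```shell
#!/bin/sh
# SCREEN lets the screen binary be swapped out (an assumption of this sketch).
SCREEN=${SCREEN:-screen}

# session_exists NAME
# True if "screen -ls" lists a session called NAME.
session_exists() {
    "$SCREEN" -ls | grep -q "\.$1[[:space:]]"
}

# start_in_session NAME PROGRAM [ARGS...]
start_in_session() {
    session=$1
    shift
    # Create the detached session only when it is not already running.
    session_exists "$session" || "$SCREEN" -dm -S "$session"
    # Open a new window in that session running PROGRAM.
    "$SCREEN" -S "$session" -X screen "$@"
}
```

For example, start_in_session AR Y would start program Y in the session named AR, creating the session first if needed.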
One thing that I do is use a program called bittorrent. Of course, bittorrent is despised by Hollywood since it allows people to digitally copy media. And I am not going to open that can of worms. It does have its legal uses, such as copying programs that allow themselves to be freely copied, like Linux. This downloading takes a lot of time. So I want to start the downloading process simply and walk away.
Now I have all the tools that I need to accomplish that. I can remotely control a machine. I can remotely start a standalone session (screen). And I can remotely start bittorrent with screen. So what did I do? I wrote a shell script to do all of that. And how did I start that process? Email! Huh? Unix mail systems can be set up to run a program when a message arrives. When I send a specially formatted email to a machine, that machine will turn around and run my script. My script will then start the downloading. Mission accomplished!