------------------------------------------------
The VMware Virtual Virtual Network Sandbox (VNS)
------------------------------------------------
Ian! D. Allen - idallen@idallen.ca - www.idallen.com

INDEX to this file:

    Downloading, Installing, and Booting the VNS
    Configuring the environment
    Selecting the Virtual Networking style
    Setting up a virtual disk file
    VMware image files
    Resuming or (Re-)Booting your VNS
    Saving and Moving the VNS images
    Connecting and Copying files to/from VNS
    Networking: VNS and DHCP
    Errors you may encounter
    Removing a hung UML machine
    Changing the screen resolution of the VNS
    VNS Problem Reporting Form

The Virtual Network Sandbox (VNS) was created by Algonquin Professor
Linda MacEwan based on Knoppix.  This VMware version was packaged by
Ian! D. Allen.

VNS runs a base Knoppix-style in-memory Linux machine, with Knoppix-style
hardware detection, and then allows you to start many separate user-mode
linux (UML) images inside that Knoppix machine.  Each UML image behaves
as if it were a separate machine attached to its own network hub,
allowing many possibilities for safely playing with network
configurations.

By default, none of the virtual UML images have outside network access;
so, nothing you do in the "sandbox" will affect your local network.
Your UML machines have no outside Internet access unless you explicitly
set that up.

The VMware implementation packages up the above Knoppix-style VNS machine
into its own virtual machine (virtual in virtual), allowing you to
suspend/resume the entire VNS environment and/or move it from one machine
to another.  You can resume a suspended VNS on any machine that can run
the free VMware VMplayer (Windows/Linux/etc.).

The VMware version of VNS boots off a (virtual) CDROM *.iso image and
loads a fresh CD image into (virtual) memory on each cold reboot, without
requiring any (virtual) hard disk.  It runs entirely in (virtual) memory,
and any changes to its memory are lost if you shut it down.
Instead of shutting it down, you can use the "Suspend" feature of
VMware/VMplayer to suspend the entire virtual machine so that your
in-memory changes are preserved.

Remember that to type into VMware, you need to first click in the window;
putting your mouse over the VMplayer window isn't sufficient.

Hardware detection and network configuration are done only when first
booting; so, if you suspend and move your virtual machine to a different
network environment and resume it, you will need to renew your DHCP lease
for the new network environment.  See the section below on
Networking: VNS and DHCP.

Downloading, Installing, and Booting the VNS
--------------------------------------------

See Lab #6: http://teaching.idallen.com/net2003/07w/notes/lab06.txt
    Steps 1-6

Configuring the environment
---------------------------

See Lab #6: http://teaching.idallen.com/net2003/07w/notes/lab06.txt
    Steps 7-12

I prefer a slower, less sensitive mouse inside the VNS:

    # xset m 1 1

You can install a larger, more complete version of vim using apt-get;
but, be aware that it does take more memory and you may not be able to
run as many UML systems and VIM at the same time.  The UML systems
themselves only have 24MB of memory and may not be able to run the larger
VIM at all.  (You will see out-of-memory errors from the UML machine.)

Selecting the Virtual Networking style
--------------------------------------

Normally the VMplayer defaults to "bridged" networking, where your VNS
system is on the same network as your host system.  (But remember that it
does not *share* your host's network configuration - the VNS is a
completely separate machine.)  If your host system requires a VPN or
wireless key to connect to the network, your VNS will also require VPN
software or the same wireless keys.

As an alternative VNS network config for those using wireless or VPN or
who have only a single IP address at home:

 - in your VMplayer click on the Ethernet device and change the type
   of network on eth0 from "bridged" to "NAT" (or try "host-only")
 - start your VNS
 - make sure pump is running on eth0 (should be there already)
 - "ifconfig eth0" should show a private address
 - "ip route" should show a route to your VNS host machine gateway
   private IP
 - should be able to ping this gateway private IP
 - see if you can ping your VNS host machine external address
 - see if you can ping something on the Internet
 - details: see http://cri.ch/linux/docs/sk0020.html
 - you can also directly edit your *.vmx file and add/change this line:
     - ethernet0.connectionType = "nat", "hostonly" or "bridged"

Setting up a virtual disk file
------------------------------

See Lab #6: http://teaching.idallen.com/net2003/07w/notes/lab06.txt
    Steps 13-15 (set-up, partition, create file system)
    Steps 16-18 (mount an existing disk)

Once you have a virtual disk set up, you can mount it and use it after
you (re-)boot your VNS (steps 16-18):

    # mkdir -p /mnt/hdb1
    # mount /dev/hdb1 /mnt/hdb1
    # df /mnt/hdb1
    # find /mnt/hdb1/

Disks that you mount remain mounted after a VMware suspend/resume; you
don't need to re-mount them unless you reboot.

You can create multiple virtual disk files by adding their *.vmdk config
files to your VNS-NET2003-07W.vmx config file and rebooting your VNS.
Each new disk has to be partitioned, formatted, and mounted before it can
be used to store files under Unix/Linux.
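
The mount steps above can be wrapped in a small helper so that running
them again after a resume is harmless.  This is only a sketch: the
function names mount_point_for and mount_vdisk are made up for
illustration, and it assumes the naming used above (/dev/hdb1 mounted
on /mnt/hdb1) and that /proc/mounts is readable inside the VNS.

```shell
# Hypothetical helpers; they are not part of the VNS itself.

# Print the mount point this file uses for a partition,
# e.g. /dev/hdb1 -> /mnt/hdb1
mount_point_for() {
    echo "/mnt/$(basename "$1")"
}

# Mount a partition unless it is already mounted (as it will be after
# a suspend/resume), then show its free space.  Run this as root.
mount_vdisk() {
    dev=$1
    dir=$(mount_point_for "$dev")
    mkdir -p "$dir"
    if grep -q "^$dev " /proc/mounts; then
        echo "$dev is already mounted"
    else
        mount "$dev" "$dir" || return 1
    fi
    df "$dir"
}
```

After a reboot you would run "mount_vdisk /dev/hdb1" as root.  The
already-mounted check is there only to avoid a confusing error message
when the disk survived a suspend/resume.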

VMware image files
------------------

Your VMware VNS image consists of these VMware files plus a CDROM image:

    VNS-NET2003-07W.vmdk
        - text description/config file for your virtual hard disk
    VNS-NET2003-07W-s001.vmdk
        - data portion of your virtual hard disk (sparse file)
    VNS-NET2003-07W.nvram
    VNS-NET2003-07W.vmem
    VNS-NET2003-07W.vmsd
    VNS-NET2003-07W.vmss
        - support files (including memory images)
    VNS-NET2003-07W.vmx
        - text master config file for the overall virtual machine
    vmware*.log
        - text log files

The above files are updated when you run and save/suspend your VNS
machine.  You can edit the two config files directly (e.g. to change the
name or location of the virtual devices) if you know what you are doing.

In addition to the above VMware files, you also have the Knoppix-based
VNS CDROM ISO image needed to load the VNS:

    virtual_network_sandbox_2006-01-10.iso
        - CDROM image (does not change)

(Note: You can use this VMware machine to boot from *any* CDROM image;
simply replace the above *.iso file with another file of the same name
and it will boot that; the VNS CDROM image is not special in any way.)

An unofficial list of VMX file parameters is here:
    http://sanbarrow.com/vmx.html#Minimalvmxfile

Resuming or (Re-)Booting your VNS
---------------------------------

If you have suspended your VNS, then VMware will automatically resume it
when you start it up again from the *.vmx file.  The resumed machine will
probably wake up with the time wrong; reset it if you need accurate time.

The NTP daemon will not reset your system clock if the time difference is
"too big", as is the case after a VMware suspend/resume.  (Your VNS has
no way of knowing it was suspended.)
You have to shut down NTP and use ntpdate to fix a large time change:

    # /etc/init.d/ntp-server stop     # stop NTP (to allow ntpdate to work)
    # /etc/init.d/ntpdate restart     # fix the large time difference
    # /etc/init.d/ntp-server start    # restart NTP to keep the time

If you do need to cold-start your VNS image, realize that it may try to
boot from a configured virtual disk and may hang.  Don't boot from disk.
When VMware first starts up your image as a cold boot (not a resume from
suspend), it gives a splash screen that says "push ESC" for boot options.
Do that, and select to boot from the (virtual) CDROM image.

When you reboot your VNS machine (or fail to recover from suspend), your
machine will be freshly loaded from the *.iso virtual CDROM and will have
none of your previous in-memory changes applied.  If you have configured
your (virtual) hard disk and used "Backup" to save virtual machine images
there, you can mount that disk and then use the "Recover" from backup
option to restore your saved virtual machines from hard disk and restart
them.

To mount your virtual hard disk after a reboot:

    # fdisk -l                   # make sure the disk is visible
    Disk /dev/hdb: 536 MB, 536870912 bytes
    [...]
    /dev/hdb1       1    1040    524128+  83  Linux
    # mkdir -p /mnt/hdb1         # create a place to mount the partition
    # mount /dev/hdb1 /mnt/hdb1  # mount the partition on the directory
    # find /mnt/hdb1             # show all the files on the partition

Make sure that any editing you do is done with files on the (virtual)
hard disk; otherwise, your work may be lost if the virtual machine
reboots.

Saving and Moving the VNS images
---------------------------------

To save your work and move it to another machine, you only need to
suspend your VNS and then copy the VMware files; you don't need to save
another copy of the CDROM *.iso image.  You can get the CDROM image via
download from the web site later - it doesn't change.  (But make sure you
remember to fetch it before you resume your machine!)

These are the files that change when you suspend a VNS and that you must
save if you want to move/resume your machine later:

    VNS-NET2003-07W.vmdk
    VNS-NET2003-07W-s001.vmdk
    VNS-NET2003-07W.nvram
    VNS-NET2003-07W.vmem
    VNS-NET2003-07W.vmsd
    VNS-NET2003-07W.vmss
    VNS-NET2003-07W.vmx

You also need to download a copy of the virtual CDROM image
virtual_network_sandbox_2006-01-10.iso to your VNS directory; but, you
don't need to save copies of it since it's available online.

You can copy your VNS image files to your Course Linux Server account.
I've also set up a general class FTP account for you here:

    ftp.idallen.org
    username: u35482050-ftp

Email me for the FTP password.

Connecting and Copying files to/from VNS
----------------------------------------

Treat the VNS base machine (and every UML machine running inside it) as
if it were a separate computer plugged into your local network.  (It
truly is a separate network device.)  The VNS needs to run DHCP to get a
network address.  The VNS has its own firewall and network configuration.

The network configuration of the machine *hosting* the VNS is of little
consequence to the VNS itself.  In particular, running a VPN on the
machine *hosting* the VNS won't give the VNS itself any VPN access - they
are separate machines.  Your VNS, and every UML machine inside it, are
all separate machines and they will not be part of any VPN of which your
host might be a member.  (Software run on the machine hosting the VNS
does not affect how the VNS, with its own IP address, sees the world.)

If you did want the VNS to access a machine on the VPN, you would either
have to run Linux-based VPN software on the VNS machine itself (e.g. the
vpnc package); or, you would have to route packets for the VPN via the
machine hosting the base VNS machine.  These are both tricky to get right
and I don't recommend trying.

The easiest solution is to treat the VNS and all the UML machines inside
it as separate machines and use SSH, SCP, and SFTP to transfer files.

You can use SSH to connect or copy files from your VNS to other machines,
including to the machine hosting your VNS.

Your VNS comes with external incoming SSH access disabled.  To enable SSH
incoming to the VNS, run the "startsshd" command as root.  Your VNS needs
a password set on the root account to permit "root" logins via SSH.
(This is a security risk and is not recommended for production network
servers; but, it's convenient for an academic system.)

After starting the SSH daemon on the VNS and setting a root password, you
can connect to your VNS using its IP address from the host machine
running VMplayer.

You may find it useful to enable the SSH daemon on your VNS machine so
that you can use SCP or SFTP on your host machine to copy a file between
the VNS and your host machine.  If your VNS host machine is Windows, look
for an SFTP or SCP client that will let you do this.  You will need to
know the IP address of the VNS and the machine hosting the VNS needs
un-firewalled access to the SSH port 22 on that IP address.

Once a file is copied from your VNS to your host machine, you are, of
course, free to copy it elsewhere, including to destinations on the VPN
of which your host machine might be a part.

If you use a Windows machine to host your VNS, remember that you have to
use network copy programs to move files between your Windows host machine
and your VNS because they are different computers.  You can either use
VNS clients to talk to external servers, or you can use external clients
to talk to VNS servers.

Your VNS already has installed and ready-to-use FTP and SCP clients on it
that can copy files between the VNS and any FTP or SCP servers on other
machines anywhere else on the planet.  You can probably find Windows
versions of FTP or SCP server software, if you want your Windows host
machine to run servers that will receive FTP or SCP connections from
clients on your VNS computer.
Install the Windows server software and connect to it from the VNS using
the VNS clients: "ftp" or "scp".  (You must ensure that your Windows
firewall allows this access.)

Going the other way, Windows clients connecting to VNS servers, the VNS
does not, by default, run an FTP server or an SSH server.  I've told you
how to start an SSH server in the VNS.  Once you do that, you can use the
Windows client version of SCP to transfer files between your Windows host
and your VNS server machine in either direction.  Install the Windows SCP
client software and then use the client software to connect to the VNS.
(You must ensure that your Windows firewall allows this access.)

The VNS already has all the necessary client and/or server software
installed to transfer files between the VNS and your Windows machine.
All you have to do is add the missing software (either client or server)
to your Windows machine, and make sure your Windows machine firewall
permits the required access.

Networking: VNS and DHCP
------------------------

Hardware detection and network configuration are done only when first
booting; so, if you suspend and move your virtual machine to a different
network environment and resume it, you will need to renew your DHCP lease
for the new network environment or perhaps even restart your DHCP client.

The DHCP client software managing your IP address and /etc/resolv.conf
file on your VNS is named "pump".  A process listing will show pump
listening on eth0.

If no "pump" is running, you need to start it to ask for a DHCP address.
Usually it is smart enough to find your network card, so just starting it
without options may be sufficient:

    # pump

After that, "pump -s" will show you if pump got a network address, and
"ifconfig" will confirm that your eth0 network card is up and running.
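
The check-then-start sequence above can be sketched as a pair of shell
functions.  The names pump_in_listing and ensure_pump are made up for
this example, and the sketch assumes that "ps ax" prints the command
name in its fifth column (PID TTY STAT TIME COMMAND).  It starts pump
without options, exactly as shown above.

```shell
# Hypothetical helpers; they are not part of the VNS itself.

# Read a "ps ax" listing on stdin; succeed only if some process has
# the command name "pump" (with or without a leading directory).
pump_in_listing() {
    awk '{ n = split($5, parts, "/"); if (parts[n] == "pump") found = 1 }
         END { exit !found }'
}

# Start pump if it is not already running, then show its status.
# Run this as root.
ensure_pump() {
    if ps ax | pump_in_listing; then
        echo "pump is already running"
    else
        echo "pump is not running; starting it"
        pump
    fi
    pump -s
}
```

Matching on the command-name column, rather than using a plain
"grep pump", keeps the check from being fooled by the grep process
itself.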

The pump man page shows options for getting the current status, renewing,
and releasing leases, e.g.:

    # pump -R      # renew all DHCP leases (must be run as root)

The act of releasing a lease will cause the interface to go down and the
current pump process to exit, and further changes to the interface won't
be possible until you (as root) restart pump on that interface again.

Q: From which start-up script is pump run at boot time?
Q: What options are passed to pump at boot time?
Q: What command line starts up pump on eth0?

Errors you may encounter
------------------------

When handling errors from the UML machines running inside your VNS, it
helps to know where the UML images and config files are.

The UML console socket and config files are kept under $HOME/.uml/ :

    # find .uml | grep red
    .uml/red
    .uml/red/mconsole
    .uml/red/pid

UML saved copy-on-write virtual images live under $HOME/virtuals/ :

    # find virtuals | grep red
    virtuals/red

These images are the saved virtual memory images for each UML.  If you
keep a copy of this image, you can restore a UML back to that state.
These are the files that the UML "Backup" operation saves.

Common errors:

1. bash: programname: Input/output error

   You probably resumed your virtual machine without having the VNS
   CDROM attached to it.  The in-memory copy of bash is finding
   commands, but the commands can't be pulled off the CDROM because you
   forgot to attach it to your virtual machine.  "dmesg" might also show
   you errors.  Your machine is damaged and should be shut down or
   abandoned.

2. No route to host

   If your VNS can't ping anything, even hosts on the local network,
   perhaps its IP address is wrong.  Have you made sure that your VNS
   has an IP address that is valid on your local network?  You may need
   to kill and restart "pump" to get a new DHCP address; or, if you have
   no DHCP server, you will need to use static addressing.

   Use "ifconfig" to set a static IP address, netmask, and broadcast
   address.  Use "route" to set up a default route.
   Edit /etc/resolv.conf and add your DNS servers.

    # ifconfig eth0 192.168.1.10 netmask 255.255.255.0 broadcast 192.168.1.255
    # route add default gw 192.168.1.254
    # vi /etc/resolv.conf
    # ping 192.168.1.254

   Unless your gateway machine is configured to do Network Address
   Translation (NAT) for external addresses, you will not be able to
   access the Internet through it if your VNS is using a private
   (RFC1918) address.  MSWindows users should look into enabling
   "connection sharing".

3. Running the "uml" command doesn't start that uml.

   You probably have a uml running without a console window.  See the
   section in this file on "Removing a hung UML machine".

4. When starting a UML, the uml starts running but then closes.

   The UML is unable to locate either its .uml or its virtuals
   directory, so it can't start up.  Verify that /host/.uml/ and
   /host/virtuals both exist and that /host/.uml/umlsettings is a
   non-empty config file.  Remove any existing virtuals/ file for the
   machine having problems.

Removing a hung UML machine
---------------------------

If you kill the window containing a UML, or kill the "uml" command that
started the UML, the UML is likely still running in the background.  You
can't restart it, since it's already running - the "uml" command simply
returns, doing nothing.

Removing a hung UML involves these steps:

1. Try to shut the UML down using uml_mconsole.
2. Find and kill all the UML processes.
3. Remove all the config files under .uml/ and virtuals/

With a hung UML, you will likely find "linux" processes with the UML
machine name running, and a network DGRAM socket open:

    # ps gx | grep 'umid=red'
    ...many lines...
    # pgrep -lf 'umid=red'
    ...many lines...
    # netstat -p | grep red
    unix  2  [ ]  DGRAM  8954  44444/linux  /home/root/.uml/red/mconsole

In the output of netstat, "44444" will be the process PID of the UML
process "linux" that has the given console socket open.  Sometimes just
killing that one process will cause the hung UML to exit.
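
Picking that PID out of the netstat line by eye is error-prone, so here
is a small sketch that does it with awk.  The function name
mconsole_pid is made up for this example; it assumes the netstat line
looks like the one shown above, with a "PID/linux" field and the
console socket path as the last field.

```shell
# Hypothetical helper; it is not part of the VNS itself.

# Read "netstat -p" output on stdin and print the PID of the "linux"
# process that holds the console socket of the UML named in $1.
mconsole_pid() {
    awk -v sock="/.uml/$1/mconsole" '
        index($NF, sock) > 0 {
            for (i = 1; i < NF; i++)
                if ($i ~ /^[0-9]+\/linux$/) {
                    split($i, part, "/")
                    print part[1]
                }
        }'
}
```

Run as root, "netstat -p | mconsole_pid red" would print the one PID to
try killing first.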

You may find entries for the orphaned UML console in your .uml/
directory:

    # find .uml/. | grep red
    .uml/./red
    .uml/./red/pid
    .uml/./red/mconsole

The UML console may (or may not) still be open by various UML processes:

    # fuser .uml/red/mconsole
    .uml/red/mconsole:   44444 44445 44446 44447
    # lsof .uml/red/mconsole
    COMMAND   PID USER  FD TYPE     DEVICE SIZE NODE NAME
    linux   44444 root 10u unix 0xcf6d4e40      8954 /home/root/.uml/red/mconsole
    linux   44445 root 10u unix 0xcf6d4e40      8954 /home/root/.uml/red/mconsole
    linux   44446 root 10u unix 0xcf6d4e40      8954 /home/root/.uml/red/mconsole
    linux   44447 root 10u unix 0xcf6d4e40      8954 /home/root/.uml/red/mconsole

First try to brutally shut down the orphaned UML using the uml_mconsole
"halt" command (this will lose any saved state in that one UML):

    # uml_mconsole red help
    [... read the list of available commands ...]
    # uml_mconsole red halt

Verify that the .uml/ directory no longer has any entries for that UML.
Verify that no more of those "linux" UML processes exist for that UML.

If the uml_mconsole command doesn't work, you need to kill the hung UML.

To clean out a UML machine that is hung or that will not boot properly
(this will lose any saved state in that one UML):

1. Find all the related process PIDs for that one UML:

    # fuser .uml/red/mconsole   # this may or may not show process IDs
    # lsof .uml/red/mconsole    # this may or may not show process IDs
    # netstat -p | grep red     # this may or may not show one process ID
    # ps gx | grep 'umid=red'   # this should always work

   Kill all the UML processes and make sure they are all gone:

    # kill 44444 44445 44446    # select all the correct pids for your hung UML
    # ps gx | grep 'umid=red'   # make sure all those UML processes are gone

   Make sure all the UML processes for that machine are gone.

2. Remove the left over virtual COW and config files (if any):

    # rm -r .uml/red virtuals/red

Now the UML should start cleanly again.
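
The whole clean-up can be sketched as one shell function.  This is an
illustration under assumptions, not a tested VNS tool: the function
names uml_pids and cleanup_uml are made up, the two-second pauses are
arbitrary, and it assumes each UML process carries the argument
umid=NAME on its command line, as in the "ps gx" searches above.

```shell
# Hypothetical helpers; they are not part of the VNS itself.

# Read "ps gx" output on stdin and print the PID of every process
# that has the exact argument umid=NAME on its command line.
uml_pids() {
    awk -v want="umid=$1" '{
        for (i = 2; i <= NF; i++)
            if ($i == want) { print $1; break }
    }'
}

# Halt, kill, and clean up one UML.  This loses any saved state in
# that UML.  Run this as root.
cleanup_uml() {
    name=$1
    [ -n "$name" ] || { echo "usage: cleanup_uml NAME" >&2; return 1; }

    # Ask the UML to halt; if this fails, carry on and kill it.
    uml_mconsole "$name" halt
    sleep 2

    # Kill whatever is left, then check again.
    pids=$(ps gx | uml_pids "$name")
    if [ -n "$pids" ]; then
        kill $pids
        sleep 2
    fi
    left=$(ps gx | uml_pids "$name")
    if [ -n "$left" ]; then
        echo "still running:" $left >&2
        return 1
    fi

    # Remove the leftover console and COW files, if any.
    rm -rf "$HOME/.uml/$name" "$HOME/virtuals/$name"
}
```

The empty-name check matters: without it, a missing argument would make
the final rm remove every UML's files, not just one.  The exact match
on umid=NAME also keeps "red" from matching a UML named "redhat".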

Changing the screen resolution of the VNS
-----------------------------------------

Some people have asked me how to reconfigure screen resolution once the
system is up and running.  Here's how.

Make sure you make the "Modifying the VNS not to halt when X server
exits" modification before you kill the current X server, or else your
VNS will shut down when you kill the X server.

To change the screen resolution of your X11 desktop, you have to kill and
restart the X server.  (Unfortunately, you can't ask the Xvesa X11 server
to simply resize itself; it's not smart enough.)  Killing the X server
means anything using the X server, including UML windows and shell
windows, will be killed or orphaned.  Before you kill the X server, close
any editor windows; stop all virtual UML machines.  Make sure you have an
empty desktop before you kill the X server.

1) Edit the file /etc/sysconfig/vns and set the resolution you want; or,
   run /etc/init.d/xsetup to bring up a text menu to do it for you.

2) Close any editor windows; stop all virtual machines.
   Restarting the X server will cause all windows to be killed.

3) The standard xsession start-up script halts the machine when the X
   server exits.  Make sure you have killed the xsession script before
   continuing:

    # pkill -x xsession

4) Use CTRL-ALT-BACKSPACE to kill the X server and all its windows; or,
   run "/etc/init.d/xsession stop"

   You will be returned to a simple console screen.  If you get logged
   out and the machine halts, you forgot to kill the xsession script.

5) At the console prompt, restart the X server:

    # /etc/init.d/xsession start &

   Did you remember to remove /bin/halt from that xsession file?

The new Xvesa server will use the new screen size.

VNS Problem Reporting Form
--------------------------

I get a lot of questions of the form "my VNS doesn't work - do you know
why?".  I don't; because, you don't tell me a thing about it.  Fill in
the form below and I'll be happy to help diagnose your problem.

If any commands fail, copy the EXACT error message and send it to me with
your problem reporting form.  Also review the "Errors you may encounter"
section, above.

1. Verify that all eight VNS files are present in the same directory and
   have approximately the right size (most are not zero size):

    VNS-NET2003-07W.vmdk                     1K (text config file)
    VNS-NET2003-07W-s001.vmdk                128K or larger (depends on usage)
    VNS-NET2003-07W.nvram                    8.5K
    VNS-NET2003-07W.vmem                     256M or larger (VNS memory dump)
    VNS-NET2003-07W.vmsd                     0
    VNS-NET2003-07W.vmss                     17M
    VNS-NET2003-07W.vmx                      1K (text config file)
    virtual_network_sandbox_2006-01-10.iso   176M (CDROM image)

   a) Are all eight files present (count them) and readable?

2. Is the pump DHCP client running in your VNS?

    # ps ax | grep pump
    # pump -s

   a) What IP address is shown by "pump -s"?

   If you can't use DHCP, see above: Common Errors: No route to host
   You will need to set a static IP address.

3. Does eth0 have an IP address valid on your local network?

    # ifconfig eth0

   a) What is your IP address and mask?

4. Does the kernel have a route to your network and a default gateway
   set?

    # netstat -nr

   a) What is the route to your local network?
   b) What is the route to the default gateway?
   c) Is your gateway IP reachable by your current IP and mask?

5. Can you ping your gateway?

    # ping -n 1.2.3.4     # use your own gateway IP address

6. Do you have valid DNS servers in your /etc/resolv.conf file?

    # grep nameserver /etc/resolv.conf

7. Can you ping your name servers (though they may not respond to ping)?

    # ping -n 1.2.3.4     # try each nameserver IP address

8. Can you ping google.com and/or idallen.com and/or idallen.org ?

    # ping -n google.com.     # 64.233.167.99 72.14.207.99 64.233.187.99
    # ping -n idallen.com.    # 72.18.159.15
    # ping -n idallen.org.    # 82.165.138.2

9. From the VNS, can you ping the machine hosting your VNS?

    # ping -n 1.2.3.4     # use the IP of the machine hosting your VNS

10. From the machine hosting your VNS, can you ping your VNS IP address?

11. Is the ssh daemon running in the VNS?

    # ssh localhost       # or: ssh 127.0.0.1

   a) on the VNS did you set the root password?
   b) on the VNS can you ssh to localhost and login as root?
   c) on a Windows host machine, can you use PuTTY to login to the VNS
      IP address as root?

12. Does "dmesg" show any recent errors?

    # dmesg | less

13. Are any errors written on virtual console #3?

    to go to virtual console 3: CTRL-ALT-F3
    to return to VNS: ALT-F2
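
Several of the commands in the form need no addresses of your own, so
they can be gathered in one pass.  The sketch below is one possible way
to do that: the function names run_check and collect_report and the
report file name are made up for this example, and the choice of the
last 20 dmesg lines is arbitrary.  The ping and ssh questions still
have to be answered by hand.

```shell
# Hypothetical helpers; they are not part of the VNS itself.

# Print a heading, the command line, and the command's output,
# including its exact error message and exit status if it fails.
run_check() {
    label=$1
    shift
    echo "=== $label"
    echo "# $*"
    "$@" 2>&1 || echo "(command failed with exit status $?)"
    echo
}

# Run the form's address-free commands in the order they appear.
collect_report() {
    run_check "2. pump status"       pump -s
    run_check "3. eth0 address"      ifconfig eth0
    run_check "4. kernel routes"     netstat -nr
    run_check "6. DNS servers"       grep nameserver /etc/resolv.conf
    run_check "12. recent dmesg"     sh -c 'dmesg | tail -n 20'
}
```

As root in the VNS, "collect_report > vns-report.txt" would save
everything to one file that you can copy out and attach to your form.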