------------------------------------------------
The VMware Virtual Virtual Network Sandbox (VNS)
------------------------------------------------
Ian! D. Allen - idallen@idallen.ca - www.idallen.com

INDEX to this file:
  Downloading, Installing, and Booting the VNS
  Configuring the environment
  Selecting the Virtual Networking style
  Setting up a virtual disk file
  VMware image files
  Resuming or (Re-)Booting your VNS
  Saving and Moving the VNS images
  Connecting and Copying files to/from VNS
  Networking: VNS and DHCP
  Errors you may encounter
  Removing a hung UML machine
  Changing the screen resolution of the VNS
  VNS Problem Reporting Form

The Virtual Network Sandbox (VNS) was created by Algonquin Professor
Linda MacEwan based on Knoppix.  This VMware version was packaged by
Ian! D. Allen.

VNS runs a base Knoppix-style in-memory Linux machine, with Knoppix-style
hardware detection, and then allows you to start many separate user-mode
linux (UML) images inside that Knoppix machine.  Each UML image behaves
as if it were a separate machine attached to its own network hub, allowing
many possibilities for safely playing with network configurations.

By default, none of the virtual UML images have outside network access;
so, nothing you do in the "sandbox" will affect your local network.
None of your UML machines have outside Internet access unless you
explicitly set that up.

The VMware implementation packages up the above Knoppix-style VNS
machine into its own virtual machine (virtual in virtual), allowing you
to suspend/resume the entire VNS environment and/or move it from one
machine to another.  You can resume a suspended VNS on any machine that
can run the free VMware VMplayer (Windows/Linux/etc.).

The VMware version of VNS boots off a (virtual) CDROM *.iso image and
loads a fresh CD image into (virtual) memory on each cold reboot, without
requiring any (virtual) hard disk.  It runs entirely in (virtual) memory,
and any changes to its memory are lost if you shut it down.  Instead of
shutting it down, you can use the "Suspend" feature of VMware/VMplayer
to suspend the entire virtual machine so that your in-memory changes
are preserved.

Remember that to type into VMware, you need to first click in the window;
putting your mouse over the VMplayer window isn't sufficient.

Hardware detection and network configuration are done only when first
booting; so, if you suspend and move your virtual machine to a different
network environment and resume it, you will need to renew your DHCP lease
for the new network environment.  See the section below on Networking:
VNS and DHCP.

Downloading, Installing, and Booting the VNS
--------------------------------------------

See Lab #6:   http://teaching.idallen.com/net2003/07w/notes/lab06.txt
Steps 1-6

Configuring the environment
---------------------------

See Lab #6:   http://teaching.idallen.com/net2003/07w/notes/lab06.txt
Steps 7-12

I prefer a slower, less sensitive mouse inside the VNS:

    # xset m 1 1

You can install a larger, more complete version of vim using apt-get; but,
be aware that it does take more memory and you may not be able to run as
many UML systems and vim at the same time.  The UML systems themselves
only have 24MB of memory and may not be able to run the larger vim at all.
(You will see out-of-memory errors from the UML machine.)

Selecting the Virtual Networking style
--------------------------------------

Normally the VMplayer defaults to "bridged" networking, where your VNS
system is on the same network as your host system.  (But remember that
it does not *share* your host's network configuration - the VNS is a
completely separate machine.)  If your host system requires a VPN or
wireless key to connect to the network, your VNS will also require VPN
software or the same wireless keys.

As an alternative VNS network config for those using wireless or VPN or
who have only a single IP address at home:
 - in your VMplayer click on the Ethernet device and change the type of
   network on eth0 from "bridged" to "NAT" (or try "host-only")
 - start your VNS
 - make sure pump is running on eth0 (should be there already)
 - "ifconfig eth0" should show a private address
 - "ip route" should show a route to your VNS host machine gateway private IP
   - should be able to ping this gateway private IP
 - see if you can ping your VNS host machine external address
 - see if you can ping something on the Internet
 - details: see  http://cri.ch/linux/docs/sk0020.html
   - you can also directly edit your *.vmx file and add/change this line,
     using one of the values "nat", "hostonly", or "bridged":
   - ethernet0.connectionType = "nat"
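
If your host is Linux, the *.vmx edit can be sketched as a small shell
function.  The name set_vmx_nettype is made up for this example; only the
ethernet0.connectionType key and its three values come from the notes
above.  It is safest to assume the virtual machine should not be running
while you edit its *.vmx file.

```shell
# Hypothetical helper: set ethernet0.connectionType in a .vmx file.
# Usage: set_vmx_nettype FILE.vmx nat|hostonly|bridged
set_vmx_nettype () {
    vmx=$1
    type=$2
    case "$type" in
        nat|hostonly|bridged) ;;
        *) echo "unknown network type: $type" >&2; return 1 ;;
    esac
    [ -f "$vmx" ] || { echo "no such file: $vmx" >&2; return 1; }
    # Drop any existing connectionType line, then append the new one.
    grep -v '^ethernet0\.connectionType' "$vmx" > "$vmx.new" || true
    echo "ethernet0.connectionType = \"$type\"" >> "$vmx.new"
    mv "$vmx.new" "$vmx"
}
```

For example:  set_vmx_nettype VNS-NET2003-07W.vmx nat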

Setting up a virtual disk file
------------------------------

See Lab #6:   http://teaching.idallen.com/net2003/07w/notes/lab06.txt
Steps 13-15 (set-up, partition, create file system)
Steps 16-18 (mount an existing disk)

Once you have a virtual disk set up, you can mount it and use it after
you (re-)boot your VNS (steps 16-18):

    # mkdir -p /mnt/hdb1
    # mount /dev/hdb1 /mnt/hdb1
    # df /mnt/hdb1
    # find /mnt/hdb1/

Disks that you mount remain mounted after a VMware suspend/resume; you
don't need to re-mount them unless you reboot.
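
A minimal sketch of checking whether the disk is still mounted before
trying to mount it again.  It assumes a Linux /proc/mounts file; the
function names is_mounted and mount_once are invented for this example.

```shell
# is_mounted DIR [MOUNTS]: succeed if DIR appears as a mount point in
# the MOUNTS file (default /proc/mounts).
is_mounted () {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# mount_once DEV DIR: mount DEV on DIR unless DIR is already mounted.
# Must be run as root, like the mount commands above.
mount_once () {
    is_mounted "$2" && return 0
    mkdir -p "$2" && mount "$1" "$2"
}
```

For example:  mount_once /dev/hdb1 /mnt/hdb1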

You can create multiple virtual disk files by adding their *.vmdk config
files to your VNS-NET2003-07W.vmx config file and rebooting your VNS.
Each new disk has to be partitioned, formatted, and mounted before it
can be used to store files under Unix/Linux.

VMware image files
------------------

Your VMware VNS image consists of these VMware files plus a CDROM image:

  VNS-NET2003-07W.vmdk
     - text description/config file for your virtual hard disk
  VNS-NET2003-07W-s001.vmdk
     - data portion of your virtual hard disk (sparse file)
  VNS-NET2003-07W.nvram
  VNS-NET2003-07W.vmem
  VNS-NET2003-07W.vmsd
  VNS-NET2003-07W.vmss
     - support files (including memory images)
  VNS-NET2003-07W.vmx
     - text master config file for the overall virtual machine
  vmware*.log
     - text log files

The above files are updated when you run and save/suspend your VNS
machine.  You can edit the two config files directly (e.g. to change the
name or location of the virtual devices) if you know what you are doing.

In addition to the above VMware files, you also have the Knoppix-based
VNS CDROM ISO image needed to load the VNS:

  virtual_network_sandbox_2006-01-10.iso
     - CDROM image (does not change)

(Note: You can use this VMware machine to boot from *any* CDROM image;
simply replace the above *.iso file with another file of the same name
and it will boot that; the VNS CDROM image is not special in any way.)

An unofficial list of VMX file parameters is here:

    http://sanbarrow.com/vmx.html#Minimalvmxfile

Resuming or (Re-)Booting your VNS
---------------------------------

If you have suspended your VNS, then VMware will automatically resume it
when you start it up again from the *.vmx file.

The resumed machine will probably wake up with the time wrong; reset it
if you need accurate time.  The NTP daemon will not reset your system
clock if the time difference is "too big", as is the case after a VMware
suspend/resume.  (Your VNS has no way of knowing it was suspended.)
You have to shut down NTP and use ntpdate to fix a large time change:

    # /etc/init.d/ntp-server stop      # stop NTP (to allow ntpdate to work)
    # /etc/init.d/ntpdate restart      # fix the large time difference
    # /etc/init.d/ntp-server start     # restart NTP to keep the time

If you do need to cold-start your VNS image, realize that it may try to
boot from a configured virtual disk and may hang.  Don't boot from disk.
When VMware first starts up your image as a cold boot (not a resume from
suspend), it gives a splash screen that says "push ESC" for boot options.
Do that, and select to boot from the (virtual) CDROM image.

When you reboot your VNS machine (or fail to recover from suspend),
your machine will be freshly loaded from the *.iso virtual CDROM and
will have none of your previous in-memory changes applied.  If you have
configured your (virtual) hard disk and used "Backup" to save virtual
machine images there, you can mount that disk and then use the "Recover"
from backup option to restore your saved virtual machines from hard disk
and restart them.  To mount your virtual hard disk after a reboot:

   # fdisk -l                    # make sure the disk is visible
   Disk /dev/hdb: 536 MB, 536870912 bytes
   [...]
   /dev/hdb1               1        1040      524128+  83  Linux
   # mkdir -p /mnt/hdb1          # create a place to mount the partition
   # mount /dev/hdb1 /mnt/hdb1   # mount the partition on the directory
   # find /mnt/hdb1              # show all the files on the partition

Make sure that any editing you do is done with files on the (virtual) hard
disk; otherwise, your work may be lost if the virtual machine reboots.

Saving and Moving the VNS images
--------------------------------

To save your work and move it to another machine, you only need to
suspend your VNS and then copy the VMware files; you don't need to save
another copy of the CDROM *.iso image.  You can get the CDROM image via
download from the web site later - it doesn't change.  (But make sure
you remember to fetch it before you resume your machine!)

These are the files that change when you suspend a VNS and that you
must save if you want to move/resume your machine later:

  VNS-NET2003-07W.vmdk
  VNS-NET2003-07W-s001.vmdk
  VNS-NET2003-07W.nvram
  VNS-NET2003-07W.vmem
  VNS-NET2003-07W.vmsd
  VNS-NET2003-07W.vmss
  VNS-NET2003-07W.vmx

You also need to download a copy of the virtual CDROM image
virtual_network_sandbox_2006-01-10.iso to your VNS directory; but,
you don't need to save copies of it since it's available online.
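
On a Linux host, collecting the seven files into one archive for copying
might look like the sketch below.  The function name save_vns is an
invention for this example, and it assumes the standard tar and gzip
tools are installed on the host.

```shell
# The seven VMware files listed above (the *.iso is deliberately left out).
VNS_FILES="VNS-NET2003-07W.vmdk VNS-NET2003-07W-s001.vmdk
VNS-NET2003-07W.nvram VNS-NET2003-07W.vmem VNS-NET2003-07W.vmsd
VNS-NET2003-07W.vmss VNS-NET2003-07W.vmx"

# save_vns DIR OUT.tar.gz: archive the VNS files found in directory DIR.
# Suspend the VNS first.  tar reports an error if any file is missing.
save_vns () {
    ( cd "$1" && tar -czf - $VNS_FILES ) > "$2"
}
```

For example:  save_vns ~/vns /tmp/vns-saved.tar.gz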

You can copy your VNS image files to your Course Linux Server account.
I've also set up a general class FTP account for you here:

    ftp.idallen.org
    username: u35482050-ftp

Email me for the FTP password.

Connecting and Copying files to/from VNS
----------------------------------------

Treat the VNS base machine (and every UML machine running inside it)
as if it were a separate computer plugged into your local network.
(It truly is a separate network device.)  The VNS needs to run DHCP
to get a network address.  The VNS has its own firewall and network
configuration.  The network configuration of the machine *hosting*
the VNS is of little consequence to the VNS itself.

In particular, running a VPN on the machine *hosting* the VNS won't give
the VNS itself any VPN access - they are separate machines.  Your VNS,
and every UML machine inside it, are all separate machines and they
will not be part of any VPN of which your host might be a member.
(Software run on the machine hosting the VNS does not affect how the VNS,
with its own IP address, sees the world.)

If you did want the VNS to access a machine on the VPN, you would either
have to run Linux-based VPN software on the VNS machine itself (e.g. the
vpnc package); or, you would have to route packets for the VPN via the
machine hosting the base VNS machine.  These are both tricky to get
right and I don't recommend trying.

The easiest solution is to treat the VNS and all the UML machines inside
it as separate machines and use SSH, SCP, and SFTP to transfer files.

You can use SSH to connect or copy files from your VNS to other machines,
including to the machine hosting your VNS.  Your VNS comes with external
incoming SSH access disabled.  To enable SSH incoming to the VNS, run the
"startsshd" command as root.  Your VNS needs a password set on the root
account to permit "root" logins via SSH.  (This is a security risk and
is not recommended for production network servers; but, it's convenient
for an academic system.)  After starting the SSH daemon on the VNS and
setting a root password, you can connect to your VNS using its IP address
from the host machine running VMplayer.

You may find it useful to enable the SSH daemon on your VNS machine so
that you can use SCP or SFTP on your host machine to copy a file between
the VNS and your host machine.  If your VNS host machine is Windows,
look for an SFTP or SCP client that will let you do this.  You will need
to know the IP address of the VNS and the machine hosting the VNS needs
un-firewalled access to the SSH port 22 on that IP address.
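
If your host machine is Linux or another Unix with the OpenSSH scp
client, the copy in each direction can be sketched as below.  The
address 192.168.1.10 is only a placeholder and the function names
vns_get and vns_put are made up for this example; use the address that
"ifconfig eth0" shows inside your own VNS.

```shell
# Placeholder address: replace with the IP address of your own VNS.
VNS_IP=${VNS_IP:-192.168.1.10}

# vns_get REMOTE LOCAL: copy a file from the VNS to the host machine.
vns_get () {
    scp "root@$VNS_IP:$1" "$2"
}

# vns_put LOCAL REMOTE: copy a file from the host machine to the VNS.
vns_put () {
    scp "$1" "root@$VNS_IP:$2"
}
```

For example:  vns_get /mnt/hdb1/notes.txt .
(Run this on the host, after "startsshd" and setting the root password
on the VNS.)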

Once a file is copied from your VNS to your host machine, you are, of
course, free to copy it elsewhere, including to destinations on the VPN
of which your host machine might be a part.

If you use a Windows machine to host your VNS, remember that you have to
use network copy programs to move files between your Windows host machine
and your VNS because they are different computers.  You can either use
VNS clients to talk to external servers, or you can use external clients
to talk to VNS servers.

Your VNS already has installed and ready-to-use FTP and SCP clients on it
that can copy files between the VNS and any FTP or SCP servers on other
machines anywhere else on the planet.  You can probably find Windows
versions of FTP or SCP server software, if you want your Windows host
machine to run servers that will receive FTP or SCP connections from
clients on your VNS computer.  Install the Windows server software and
connect to it from the VNS using the VNS clients: "ftp" or "scp".
(You must ensure that your Windows firewall allows this access.)

Going the other way (Windows clients connecting to VNS servers), the VNS
does not, by default, run an FTP server or an SSH server.  I've told you
how to start an SSH server in the VNS.  Once you do that, you can use the
Windows client version of SCP to transfer files between your Windows host
and your VNS server machine in either direction.  Install the Windows
SCP client software and then use the client software to connect to the VNS.
(You must ensure that your Windows firewall allows this access.)

The VNS already has all the necessary client and/or server software
installed to transfer files between the VNS and your Windows machine.
All you have to do is add the missing software (either client or server)
to your Windows machine, and make sure your Windows machine firewall
permits the required access.

Networking: VNS and DHCP
------------------------

Hardware detection and network configuration are done only when first
booting; so, if you suspend and move your virtual machine to a different
network environment and resume it, you will need to renew your DHCP lease
for the new network environment or perhaps even restart your DHCP client.

The DHCP client software managing your IP address and /etc/resolv.conf
file on your VNS is named "pump".  A process listing will show pump
listening on eth0.  If no "pump" is running, you need to start it to ask
for a DHCP address.  Usually it is smart enough to find your network card,
so just starting it without options may be sufficient:

  # pump
  
After that, "pump -s" will show you if pump got a network address, and
"ifconfig" will confirm that your eth0 network card is up and running.

The pump man page shows options for getting the current status, renewing,
and releasing leases, e.g.:

  # pump -R            # renew all DHCP leases (must be run as root)

The act of releasing a lease will cause the interface to go down and the
current pump process to exit, and further changes to the interface won't
be possible until you (as root) restart pump on that interface again.
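
The renew-or-restart decision after moving a suspended VNS to a new
network can be sketched as follows.  The function name dhcp_refresh is
made up for this example; it uses only the pump invocations shown above
and must be run as root.

```shell
# dhcp_refresh: renew the lease if pump is running, else start pump,
# then show the resulting status.
dhcp_refresh () {
    if pgrep -x pump >/dev/null; then
        pump -R          # pump is running: renew the DHCP leases
    else
        pump             # no pump process: start one to ask for an address
    fi
    pump -s              # show what address (if any) was obtained
}
```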

Q: From which start-up script is pump run at boot time?
Q: What options are passed to pump at boot time?
Q: What command line starts up pump on eth0?

Errors you may encounter
------------------------

When handling errors from the UML machines running inside your VNS,
it helps to know where the UML images and config files are.  The UML
console socket and config files are kept under $HOME/.uml/ :

   # find .uml | grep red
   .uml/red
   .uml/red/mconsole
   .uml/red/pid

UML saved copy-on-write virtual images live under $HOME/virtuals/ :

   # find virtuals | grep red
   virtuals/red

These images are the saved virtual memory images for each UML.  If you
keep a copy of this image, you can restore a UML back to that state.
These are the files that the UML "Backup" operation saves.
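
A small sketch that lists which UML machines have a console directory
under .uml/ and whether each has a saved copy-on-write image under
virtuals/.  The function name uml_list and its output format are
invented for this example.

```shell
# uml_list [BASE]: list UML names found under BASE/.uml (default $HOME)
# and report whether BASE/virtuals/NAME exists.
uml_list () {
    base=${1:-$HOME}
    for d in "$base"/.uml/*/; do
        [ -d "$d" ] || continue        # skip if there are no UML directories
        name=$(basename "$d")
        if [ -e "$base/virtuals/$name" ]; then
            echo "$name cow=yes"
        else
            echo "$name cow=no"
        fi
    done
}
```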

Common errors:

1.  bash: programname: Input/output error

    You probably resumed your virtual machine without having the VNS CDROM
    attached to it.  The in-memory copy of bash is finding commands,
    but the commands can't be pulled off the CDROM because you forgot
    to attach it to your virtual machine.  "dmesg" might also show you
    errors.  Your machine is damaged and should be shut down or abandoned.

2.  No route to host

    If your VNS can't ping anything, even hosts on the local network,
    perhaps its IP address is wrong.  Have you made sure that your VNS
    has an IP address that is valid on your local network?  You may
    need to kill and restart "pump" to get a new DHCP address; or, if
    you have no DHCP server, you will need to use static addressing.  Use
    "ifconfig" to set a static IP address, netmask, and broadcast address.
    Use "route" to set up a default route.  Edit /etc/resolv.conf and
    add your DNS servers.

    # ifconfig eth0 192.168.1.10 netmask 255.255.255.0 broadcast 192.168.1.255
    # route add default gw 192.168.1.254
    # vi /etc/resolv.conf
    # ping 192.168.1.254

    Unless your gateway machine is configured to do Network Address
    Translation (NAT) for external addresses, you will not be able
    to access the Internet through it if your VNS is using a private
    (RFC1918) address.  MSWindows users should look into enabling
    "connection sharing".

3.  Running the "uml" command doesn't start that uml.
 
    You probably have a uml running without a console window.  See the
    section in this file on "Removing a hung UML machine".

4.  When starting a UML, the uml starts running but then closes.

    The UML is unable to locate either its .uml or its virtuals directory,
    so it can't start up.  Verify that /host/.uml/ and /host/virtuals
    both exist and that /host/.uml/umlsettings is a non-empty config file.
    Remove any existing virtuals/ file for the machine having problems.
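
For the static addressing in error 2 above, the broadcast address passed
to "ifconfig" follows from the IP address and netmask: every bit that is
zero in the netmask is set to one.  A sketch of that arithmetic in shell
(the function name broadcast_of is made up for this example, and it
assumes plain dotted-decimal input without leading zeros):

```shell
# broadcast_of IP NETMASK: print the broadcast address for that network.
broadcast_of () {
    oldifs=$IFS
    IFS=.
    set -- $1 $2         # split both arguments into eight octets
    IFS=$oldifs
    # OR each address octet with the inverted netmask octet.
    echo "$(( $1 | (255 - $5) )).$(( $2 | (255 - $6) )).$(( $3 | (255 - $7) )).$(( $4 | (255 - $8) ))"
}
```

For example, "broadcast_of 192.168.1.10 255.255.255.0" prints
192.168.1.255, matching the ifconfig line in error 2.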

Removing a hung UML machine
---------------------------

If you kill the window containing a UML, or kill the "uml" command that
started the UML, the UML is likely still running in the background.
You can't restart it, since it's already running - the "uml" command
simply returns, doing nothing.

Removing a hung UML involves these steps:

  1. Try to shut the UML down using uml_mconsole.
  2. Find and kill all the UML processes.
  3. Remove all the config files under .uml/ and virtuals/.

With a hung UML, you will likely find "linux" processes with the UML
machine name running, and a network DGRAM socket open:

   # ps gx | grep 'umid=red'
   ...many lines...

   # pgrep -lf 'umid=red'
   ...many lines...

   # netstat -p | grep red
   unix 2 [] DGRAM 8954 44444/linux /home/root/.uml/red/mconsole

In the output of netstat, "44444" will be the process PID of the UML
process "linux" that has the given console socket open.  Sometimes just
killing that one process will cause the hung UML to exit.

You may find entries for the orphaned UML console in your .uml/ directory:

    # find .uml/. | grep red
    .uml/./red
    .uml/./red/pid
    .uml/./red/mconsole

The UML console may (or may not) still be open by various UML processes:

    # fuser .uml/red/mconsole
    .uml/red/mconsole:   44444 44445 44446 44447

    # lsof .uml/red/mconsole
    COMMAND PID USER  FD TYPE     DEVICE SIZE NODE NAME
    linux 44444 root 10u unix 0xcf6d4e40 8954 /home/root/.uml/red/mconsole
    linux 44445 root 10u unix 0xcf6d4e40 8954 /home/root/.uml/red/mconsole
    linux 44446 root 10u unix 0xcf6d4e40 8954 /home/root/.uml/red/mconsole
    linux 44447 root 10u unix 0xcf6d4e40 8954 /home/root/.uml/red/mconsole

First try to brutally shut down the orphaned UML using the uml_mconsole
"halt" command (this will lose any saved state in that one UML):

   # uml_mconsole red help
   [... read the list of available commands ...]
   # uml_mconsole red halt

Verify that the .uml/ directory no longer has any entries for that UML.
Verify that no more of those "linux" UML processes exist for that UML.

If the uml_mconsole command doesn't work, you need to kill the hung UML.
To clean out a UML machine that is hung or that will not boot properly
(this will lose any saved state in that one UML):

1.  Find all the related process PIDs for that one UML:

    # fuser .uml/red/mconsole  # this may or may not show process IDs
    # lsof .uml/red/mconsole   # this may or may not show process IDs
    # netstat -p | grep red    # this may or may not show one process ID
    # ps gx | grep 'umid=red'  # this should always work

    Kill all the UML processes:

    # kill 44444 44445 44446 44447  # use the pids of your own hung UML
    # ps gx | grep 'umid=red'       # make sure all those processes are gone

2.  Remove the left over virtual COW and config files (if any):

    # rm -r .uml/red virtuals/red

Now the UML should start cleanly again.
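
The whole clean-up can be sketched as one function.  The name
uml_cleanup and the two-second waits are assumptions made for this
example; the commands themselves are the ones used above.  Run it as
the user that started the UML, and remember that it loses any saved
state in that one UML.

```shell
# uml_cleanup NAME: halt, kill, and remove the files of one hung UML.
uml_cleanup () {
    name=$1
    # Ask the UML to halt (it may be too hung to answer).
    uml_mconsole "$name" halt >/dev/null 2>&1 || true
    sleep 2
    # Kill whatever "linux" processes are left for this UML.
    # Note: the pattern "umid=red" would also match a UML named "redhat".
    pids=$(pgrep -f "umid=$name" || true)
    if [ -n "$pids" ]; then
        kill $pids
        sleep 2
    fi
    if pgrep -f "umid=$name" >/dev/null; then
        echo "UML $name still has processes; not removing files" >&2
        return 1
    fi
    # Remove the leftover console and COW files.
    rm -rf "$HOME/.uml/$name" "$HOME/virtuals/$name"
}
```

For example:  uml_cleanup red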

Changing the screen resolution of the VNS
-----------------------------------------

Some people have asked me how to reconfigure screen resolution once the
system is up and running.  Here's how.

Make the "Modifying the VNS not to halt when X server exits"
modification before you kill the current X server, or else your VNS
will shut down when you kill the X server.

To change the screen resolution of your X11 desktop, you have to kill
and restart the X server.  (Unfortunately, you can't ask the Xvesa X11
server to simply resize itself; it's not smart enough.)  Killing the
X server means anything using the X server, including UML windows and
shell windows, will be killed or orphaned.

Before you kill the X server, close any editor windows; stop all virtual
UML machines.  Make sure you have an empty desktop before you kill the
X server.

1)  Edit the file /etc/sysconfig/vns and set the resolution you want;
    or, run /etc/init.d/xsetup to bring up a text menu to do it for you.

2)  Close any editor windows; stop all virtual machines.  Restarting
    the X server will cause all windows to be killed.

3)  The standard xsession start-up script halts the machine when the X
    server exits.  Make sure you have killed the xsession script before
    continuing:

      # pkill -x xsession

4)  Use CTRL-ALT-BACKSPACE to kill the X server and all its windows;
    or, run "/etc/init.d/xsession stop"

    You will be returned to a simple console screen.  If you get logged
    out and the machine halts, you forgot to kill the xsession script.

5)  At the console prompt, restart the X server:

      # /etc/init.d/xsession start &
      
    Did you remember to remove /bin/halt from that xsession file?
    The new Xvesa server will use the new screen size.

VNS Problem Reporting Form
--------------------------

I get a lot of questions of the form "my VNS doesn't work - do you
know why?".  I don't, because you haven't told me a thing about it.
Fill in the form below and I'll be happy to help diagnose your problem.
If any commands fail, copy the EXACT error message and send it to me
with your problem reporting form.

Also review the "Errors you may encounter" section, above.

1.  Verify that all eight VNS files are present in the same directory and
    have approximately the right size (most are not zero size):

    VNS-NET2003-07W.vmdk          1K (text config file)
    VNS-NET2003-07W-s001.vmdk   128K or larger (depends on usage)
    VNS-NET2003-07W.nvram       8.5K
    VNS-NET2003-07W.vmem        256M or larger (VNS memory dump)
    VNS-NET2003-07W.vmsd          0
    VNS-NET2003-07W.vmss         17M
    VNS-NET2003-07W.vmx           1K (text config file)
    virtual_network_sandbox_2006-01-10.iso   176M (CDROM image)

    a) Are all eight files present (count them) and readable?

2.  Is the pump DHCP client running in your VNS?

    # ps ax | grep pump
    # pump -s

    a) What IP address is shown by "pump -s"?

    If you can't use DHCP, see above: Common Errors: No route to host
    You will need to set a static IP address.

3.  Does eth0 have an IP address valid on your local network?
    
    # ifconfig eth0

    a) What is your IP address and mask?

4.  Does the kernel have a route to your network and a default gateway set?

    # netstat -nr

    a) What is the route to your local network?
    b) What is the route to the default gateway?
    c) Is your gateway IP reachable by your current IP and mask?

5.  Can you ping your gateway?

    # ping -n 1.2.3.4           # use your own gateway IP address

6.  Do you have valid DNS servers in your /etc/resolv.conf file?

    # grep nameserver /etc/resolv.conf

7.  Can you ping your name servers (though they may not respond to ping)?

    # ping -n 1.2.3.4           # try each nameserver IP address

8.  Can you ping google.com and/or idallen.com and/or idallen.org ?

    # ping -n google.com.       # 64.233.167.99 72.14.207.99 64.233.187.99
    # ping -n idallen.com.      # 72.18.159.15
    # ping -n idallen.org.      # 82.165.138.2

9.  From the VNS, can you ping the machine hosting your VNS?

    # ping -n 1.2.3.4           # use the IP of the machine hosting your VNS

10. From the machine hosting your VNS, can you ping your VNS IP address?

11. Is the ssh daemon running in the VNS?

    # ssh localhost             # or: ssh 127.0.0.1

    a) on the VNS did you set the root password?
    b) on the VNS can you ssh to localhost and login as root?
    c) on a Windows host machine, can you use PuTTY to login to the
       VNS IP address as root?

12. Does "dmesg" show any recent errors?

    # dmesg | less

13. Are any errors written on virtual console #3?

    to go to virtual console 3:  CTRL-ALT-F3
    to return to VNS:            ALT-F2
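
To gather the output for several of the steps above in one go, a sketch
like the following could be run as root inside the VNS.  The function
names run and vns_report and the "###" header format are made up for
this example; the commands are the ones from steps 2, 3, 4, 6, and 12.

```shell
# run CMD...: print the command as a header, then its output and errors.
run () {
    echo "### $*"
    "$@" 2>&1 || echo "(exit status $?)"
    echo
}

# vns_report: collect the answers to several questions on the form.
vns_report () {
    run pump -s                              # step 2
    run ifconfig eth0                        # step 3
    run netstat -nr                          # step 4
    run grep nameserver /etc/resolv.conf     # step 6
    run sh -c 'dmesg | tail -n 20'           # step 12 (last 20 lines only)
}
```

For example:  vns_report > /mnt/hdb1/report.txt
Send that file along with your answers to the other questions.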