=================================================================
Assignment #06 - Floating Point Numbers II & Character Encoding
=================================================================
- Ian! D. Allen - idallen@idallen.ca - www.idallen.com

1.  What is floating-point underflow?

    Any nonzero number that is "too close to zero" to be represented.
    In IEEE-754 single-precision, numbers closer to zero than about
    10**(-38) fall into this "zero hole".  (You can actually get numbers
    smaller than this, using de-normalized floating point - see the notes
    for details.)

2.  Give an example of a number that would cause floating-point underflow
    if you tried to calculate it or represent it using IEEE-754
    single-precision floating-point.

    Anything closer to zero than about 10**(-38), e.g. 1.0e-100

3.  Why must you never test floating-point numbers for equality, i.e. why
    is "if ( a == b )" a bad idea for floating point a and b?

    Due to errors in precision, floating-point numbers may not achieve
    exact equality.  They may be extremely close together, but not equal.
    Since you already accept some error when using floating-point numbers,
    comparison for exact equality makes no sense.  The comparisons must be
    done with some tolerance, or "epsilon", that you choose depending on
    your application.

4.  What is the correct way to test for floating-point "equality"?

    Decide how close the numbers should be to be considered equal.
    Call this small difference value "epsilon".  Then use:

        IF ( abs(a - b) < epsilon ) THEN
            consider a and b to be "equal"

    where a and b are the numbers being compared and epsilon is a small
    number, e.g. 1.0e-9

5.  Write a tiny program fragment that uses a floating point loop that
    starts at 0.0, adds 0.01 each time through the loop, and stops when
    the loop counter is within 1.0e-8 ("epsilon") of the floating-point
    value 1000.1.  Give the code for the loop here:

        for( counter=0.0; abs(counter-1000.1) >= 1.0e-8; counter += 0.01 ){
        }

6.
Find a hexadecimal table of EBCDIC letters and note the many "gaps" in
    the sequential order of the letters.  Between what letters do the gaps
    in numbering occur in EBCDIC?  (In other words, which letters differ by
    more than 1, such that you can't add one to a letter to get the next
    letter?)  Write down all the gaps in both letter and hexadecimal form:

    Gap between i (89) and j (91)
    Gap between I (c9) and J (d1)
    Gap between r (99) and s (a2)
    Gap between R (d9) and S (e2)

7.  Find a hexadecimal table of ASCII characters and translate this string
    of hexadecimal bytes into ASCII characters:

    4C 69 6E 75 78 20 52 6F 63 6B 73 5C 21 0D 0A
    L  i  n  u  x  SP R  o  c  k  s  \  !  CR LF

8.  In EBCDIC, which comes first (i.e. has lower hex values): upper-case
    letters or lower-case letters?

    Lower case letters precede upper-case letters in EBCDIC.
    They sort first.

9.  In ASCII, which comes first (i.e. has lower hex values): upper-case
    letters or lower-case letters?

    Upper case letters precede lower-case letters in ASCII.
    They sort first.

10. What *printable* characters do EBCDIC and ASCII have in common (if
    any)?  That is, if you sent an EBCDIC file consisting of letters and
    spaces to an ASCII printer, what printable character(s) would you see
    on the ASCII printer?

    EBCDIC letters are all 8-bit characters: none are printable ASCII.
    Only the EBCDIC spaces will appear printable, as ASCII '@' characters.

11. There is only one bit of difference between an upper-case EBCDIC
    character and its lower-case equivalent.  What is the hexadecimal and
    decimal value of this bit?  What EBCDIC character does this bit value
    represent?

    'a' = 81h = 1000 0001(2)
    'A' = C1h = 1100 0001(2)
    It is the 2**6 bit, with value 64 decimal or 40h or SP.
    'A' - 'a' = ' ' = 0x40 = SP  (EBCDIC)

12. There is only one bit of difference between an upper-case ASCII
    character and its lower-case equivalent.  What is the hexadecimal and
    decimal value of this bit?  What ASCII character does this bit value
    represent?
    'A' = 41h = 0100 0001(2)
    'a' = 61h = 0110 0001(2)
    It is the 2**5 bit, with value 32 decimal or 20h or SP.
    'a' - 'A' = ' ' = 0x20 = SP  (ASCII)

13. What ASCII characters result from the following ASCII arithmetic
    (characters in single quotes are ASCII characters):

    a) 'A' + 1   = 41h + 1   = 42h = 'B'
    b) 'Z' - 1   = 5Ah - 1   = 59h = 'Y'
    c) 'A' + ' ' = 41h + 20h = 61h = 'a'
    d) 'z' - 'Z' = 7Ah - 5Ah = 20h = ' '
    e) '9' - '0' = 39h - 30h = 9h  = 9 (decimal - not printable)

14. How do the ASCII character set and the Unicode character set relate
    to each other?

    The first 128 characters of Unicode (U+0000 to U+007F) are the ASCII
    characters, with the same numeric values.  Only the storage size
    differs:

    'A' =   41h in ASCII (7-bit characters)
    'A' = 0041h in 16-bit Unicode (UTF-16); UTF-8 stores it as the
                single byte 41h

15. What is the default character set used in the Java language?

    Unicode:

    http://www.exampledepot.com/egs/java.nio.charset/ConvertChar.html
        "Java's native character encoding is Unicode."
    http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=/rzaha/charenc.htm
        "Internally, the Java(TM) virtual machine (JVM) always operates
        with data in Unicode."

16. Without decoding using any ASCII table, explain why the following
    sequence of hexadecimal bytes is or is not likely to be from an ASCII
    text file:

    84 03 fd ff ff c7 44 24 08 05 00 00 00 31 f6 c7

    It is not likely to be ASCII text.  Many bytes above are either 8-bit
    characters (e.g. 84, ff) or are unprintable characters (values less
    than 0x20 such as 03 and 00).  ASCII text contains 7-bit printable
    characters in the range 20h to 7Eh, plus a few control characters
    such as TAB, LF, and CR.

17. You look into memory and see the hexadecimal value 40h.  Based on what
    you know in this course so far, what different things might this 40h
    represent?

    - an EBCDIC space
    - an ASCII '@' symbol
    - the decimal value 64 in 8-bit two's complement, or sign-magnitude,
      or one's complement (sign bit is off)
    - the excess-127 number -63 (e.g. an IEEE754 exponent)

--
| Ian! D.
Allen - idallen@idallen.ca - Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/
| Defend digital freedom: http://eff.org/ and have fun: http://fools.ca/
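
Several of the answers above (questions 3, 4, 5, 7, 11 and 12) can be checked
with a few lines of code.  What follows is only a minimal Python sketch, not
part of the original assignment.  The function names nearly_equal, count_up_to
and case_bit are made up for illustration, the max_steps safety limit is an
addition of this sketch, and Python's "cp037" codec is assumed to stand in for
"EBCDIC" (it is one common EBCDIC code page; other code pages differ in some
punctuation).

```python
def nearly_equal(a, b, epsilon=1.0e-9):
    """Question 4: treat a and b as "equal" if they differ by less than epsilon."""
    return abs(a - b) < epsilon


def count_up_to(target, step=0.01, epsilon=1.0e-8, max_steps=1_000_000):
    """Question 5: add step until the counter is within epsilon of target.

    Returns (final counter, number of additions).  max_steps is a safety
    limit added for this sketch, so that a loop which never comes within
    epsilon of target fails loudly instead of running forever.
    """
    counter = 0.0
    steps = 0
    while abs(counter - target) >= epsilon:
        if steps >= max_steps:
            raise RuntimeError("counter never came within epsilon of target")
        counter += step
        steps += 1
    return counter, steps


def case_bit(encoding):
    """Questions 11 and 12: the bit that differs between 'a' and 'A'."""
    lower = "a".encode(encoding)[0]
    upper = "A".encode(encoding)[0]
    return lower ^ upper  # XOR leaves only the differing bit(s) set


if __name__ == "__main__":
    # Question 3: exact equality fails, the tolerance test succeeds.
    print(0.1 + 0.2 == 0.3)              # False
    print(nearly_equal(0.1 + 0.2, 0.3))  # True

    # Question 5: Python floats are double-precision.
    counter, steps = count_up_to(1000.1)
    print(steps, counter)

    # Question 7: decode the hexadecimal bytes as ASCII text.
    hex_bytes = "4C 69 6E 75 78 20 52 6F 63 6B 73 5C 21 0D 0A"
    print(repr(bytes.fromhex(hex_bytes).decode("ascii")))

    # Questions 11 and 12: the case bit in ASCII and in EBCDIC (cp037).
    print(hex(case_bit("ascii")), hex(case_bit("cp037")))
```

The tolerance in count_up_to only works because the counter is
double-precision.  With a single-precision counter, representable values near
1000 are about 6.1e-5 apart and none of them lies within 1.0e-8 of 1000.1, so
the same loop would never stop; that is the reason for the max_steps guard.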