Subject: Re: 8-bit floating-point number representations?
Newsgroups: lugnet.off-topic.geek
Date: Fri, 5 Jan 2001 04:59:48 GMT
  
Frank Filz wrote:
> What are the range and characteristics of the numbers you are dealing
> with?

I was thinking about this some more, and am really wondering what you
will be doing with the numbers. If the only computations you need to do
with them are comparisons, then once converted they will be extremely
cheap to work with (the reason a bias is used on the exponent is so the
numbers can be ordered just by comparing them as unsigned integers,
except for the sign bit). Another interesting thing is that for 8-bit
floats, any operation on two values can be implemented as a 64 KB
lookup table.
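
A rough, untested C sketch of both ideas, assuming the unsigned layout
described below (biased exponent in the high four bits, no sign bit);
f8_max is just a placeholder for whatever operation you actually need:

#include <stdint.h>

/* With the biased exponent in the high bits and no sign bit, the raw
   bytes sort in the same order as the values they encode, so ordering
   needs nothing more than an unsigned byte compare. */
static uint8_t f8_max(uint8_t a, uint8_t b)
{
    return a > b ? a : b;
}

/* Any two-operand operation can be precomputed into a 64 KB table
   indexed by the two operand bytes; at run time the "computation" is
   a single fetch, table[a][b]. */
static uint8_t table[256][256];

static void build_table(uint8_t (*op)(uint8_t, uint8_t))
{
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            table[a][b] = op((uint8_t)a, (uint8_t)b);
}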

The other thing I realized is that an 8-bit float won't really be able
to express a much larger range of numbers than a 16-bit fixed-point
number. The reason: you probably don't want more than 4 bits of
exponent (5 bits of exponent would just give you the range of a 32-bit
fixed-point number). Computations on fixed-point numbers are of course
relatively quick, although a 16-bit fixed-point number can't express
the same precision for all values that a float can. Since you mentioned
the floats need not be signed, assuming a 4-bit exponent you have a
5-bit mantissa (remember one bit is free thanks to the implicit leading
1). If the bias is 1 (exponents range from 0 to 14), you can express
numbers like 1.0625 (1.0001 in binary; float representation 0001 0001),
2.125 (0010 0001), and 4.25 (0011 0001), all the way up to 31744
(1111 1111), but you would see rounding errors of up to almost 1 part
in 32 (33 must be rounded to either 32 [0110 0000] or 34 [0110 0001]).
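
For what it's worth, here is a quick (untested) C sketch of a decoder
for that layout (4-bit exponent in the high nibble with a bias of 1,
4 stored mantissa bits below an implicit leading 1, no sign bit, no
zero handling), just to check that the worked examples above come out
right:

#include <math.h>
#include <stdio.h>

/* Decode the unsigned 8-bit float described above. */
static double decode8(unsigned char f)
{
    double mantissa = 1.0 + (f & 0x0F) / 16.0;  /* implicit leading 1 */
    int exponent = (f >> 4) - 1;                /* remove the bias of 1 */
    return ldexp(mantissa, exponent);           /* mantissa * 2^exponent */
}

int main(void)
{
    printf("%g\n", decode8(0x11));  /* 1.0625 */
    printf("%g\n", decode8(0x21));  /* 2.125  */
    printf("%g\n", decode8(0x31));  /* 4.25   */
    printf("%g\n", decode8(0xFF));  /* 31744  */
    printf("%g\n", decode8(0x60));  /* 32; next representable value is 34 (0x61) */
    return 0;
}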

Where the 8-bit float could really come in handy is when the smallest
value you need to express is, say, 32: then you can use a bias of -4
and express numbers almost as large as 2^20 (2^19*1.9375, i.e. 2^15*31,
or 1015808). If you don't need to represent 0, you can also get an
extra step of exponent. Alternatively, you could state that a byte of
all 0s represents 0, at the cost of not being able to express the
smallest non-zero number. For example, if you wanted to express numbers
from 0 to 2^15, your bias would be 0, which would normally mean 1 is
represented as 0000 0000, so you would either round 1 up to 1.0625 or
round it down to 0.
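
Again a rough sketch (same layout as above, but with the bias passed in
as a parameter, and with the all-zeros byte taken to mean exactly zero
as suggested):

#include <math.h>
#include <stdio.h>

/* Decode with an arbitrary bias.  Byte 0x00 is treated as zero, so the
   very smallest non-zero value can no longer be expressed. */
static double decode8_biased(unsigned char f, int bias)
{
    if (f == 0)
        return 0.0;
    return ldexp(1.0 + (f & 0x0F) / 16.0, (f >> 4) - bias);
}

int main(void)
{
    /* With a bias of -4, byte 0x10 decodes to 32 and 0xFF to 1015808,
       as worked out above. */
    printf("%g %g\n", decode8_biased(0x10, -4), decode8_biased(0xFF, -4));
    return 0;
}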

Frank



Message has 1 Reply:
  Re: 8-bit floating-point number representations?
 
(...) Well, basically, whatever is fast and flexible -- whatever that turns out to be. It might even turn out that a fixed-point representation is flexible enough and maybe even faster since it would involve only a single floating-point multiply by (...) (7-Jan-01, to lugnet.off-topic.geek)

Message is in Reply To:
  Re: 8-bit floating-point number representations?
 
(...) I've been a pretty serious geek, especially with low level stuff like this. My first real job involved developing a Fortran style formatted I/O for the Apple II which involved digging into the internal guts of Applesoft BASIC so that we (...) (4-Jan-01, to lugnet.off-topic.geek)
