Over the past few months I spent a lot of time writing and testing new arithmetic routines.
The "addition" was completely rewritten; the cores of the "division" and the "multiplication" have not changed (they were just reorganized), but their pre- and post-operations were rewritten.
The 3(4) basic arithmetic routines are already in a usable state; they are now a bit faster and smaller (by 57+9 bytes) than the originals.
How I tested them:
- I made a simple visual test program for each routine: first the Zeddy's BASIC computes and displays the result, then the new routine is called, so you can compare the two by eye (you can find these in the "VisualTst" folder in the attachment);
- there were also 10 mass tests, to check the accuracy and the speed of the new arithmetic simultaneously. They are based on producing 2000 numbers from ~8000 bytes of the ZX81 ROM (the picture shows how the operands were constructed).
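Since the picture is not reproduced here, the exact construction is a guess on my part; one reading consistent with the numbers (2000 operands × 4 mantissa bytes = 8000 ROM bytes) is to take consecutive 4-byte windows of the ROM as mantissas. A minimal sketch (function name and layout are my assumptions, not the actual generator):

```python
def build_operands(rom, count=2000, base_exp=0x80):
    """Hypothetical operand builder: each 5-byte ZX float takes its 4 mantissa
    bytes from consecutive ROM bytes (count * 4 = ~8000 bytes for count=2000).
    The real construction may differ - see the picture in the original post."""
    operands = []
    for i in range(count):
        mantissa = rom[4 * i : 4 * i + 4]
        operands.append(bytes([base_exp]) + bytes(mantissa))
    return operands
```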
How do the test programs work?
- Step 1: the program fills the array defined by the first BASIC line - DIM A(2000) - and displays the address of the first array element, A(1).
Now these 2000 numbers can be saved as operands - in EO via the "File/Save Memory Block" menu (10000 bytes).
- Step 2: it calls the old arithmetic routine 1000 times and displays the run-time in "frames". The results are stored in the first half of the array - A(1) to A(1000) - and can also be saved ("File/Save Memory Block", 5000 bytes).
- Step 3: finally it fills the array again, calls the new arithmetic routine 1000 times and prints the running time. (Save: "File/Save Memory Block", 5000 bytes.)
I wrote a little tool to compare the "old" and "new" values. It reads the operands and the outputs of the old and new arithmetic routines (which were saved as memory blocks), computes etalon (reference) results as well, and finally can save all of these to a text file, byte by byte, as hexadecimal values. The etalon is based on the IEEE 754 double-precision binary floating-point format: the operands are converted from 40-bit ZX floats to 64-bit doubles, the operation is performed, and the result is converted back to the 40-bit ZX format.
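The etalon round-trip can be sketched like this - a sketch, assuming the usual ZX81 5-byte layout (a biased exponent byte, then a 4-byte mantissa whose top bit doubles as the sign, with the leading 1 implied); function names are mine:

```python
import math

def zx_to_double(b):
    """5-byte ZX81 float -> Python float (64-bit double).
    Byte 0: exponent biased by 0x80 (0 means the number zero).
    Bytes 1-4: mantissa, MSB first; bit 7 of byte 1 holds the sign,
    the mantissa's leading 1 being implied in that position."""
    if b[0] == 0:
        return 0.0
    sign = -1.0 if b[1] & 0x80 else 1.0
    mant = ((b[1] | 0x80) << 24) | (b[2] << 16) | (b[3] << 8) | b[4]
    return sign * (mant / 2**32) * 2.0 ** (b[0] - 0x80)

def double_to_zx(x):
    """Python float -> 5-byte ZX81 float, rounding the mantissa to nearest."""
    if x == 0.0:
        return bytes(5)
    sign = 0x80 if x < 0 else 0x00
    m, e = math.frexp(abs(x))           # abs(x) == m * 2**e, 0.5 <= m < 1
    mant = round(m * 2**32)
    if mant == 2**32:                   # rounding carried into the next binade
        mant >>= 1
        e += 1
    return bytes([e + 0x80,
                  ((mant >> 24) & 0x7F) | sign,   # drop implied 1, add sign
                  (mant >> 16) & 0xFF,
                  (mant >> 8) & 0xFF,
                  mant & 0xFF])
```

An etalon addition is then `double_to_zx(zx_to_double(a) + zx_to_double(b))`. Note that values whose exponent byte would fall outside 0..255 would need exactly the overflow/underflow handling asked about at the end of this post.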
In the case of the "multiplication" there was no difference between the outputs of the new and the old algorithm, and both were identical to the etalon. The new solution required 8% less running time.
In the case of the "division" there were more differences to begin with (200+/1000), all in the last bit of the mantissa, but the output of the new solution was identical to the etalon, and the speed was also better (-11%).
The results of the "addition" were really interesting. Comparing the original and the new routine, the deviation rate is above 2.5% (25+/1000), always in one of the last three (!) bits of the mantissa. Compared to the etalon values, the outputs of both routines (new and old) differ in more cases (35+/1000), but only in one of the last two (!) bits of the mantissa, and the new solution is slightly more accurate (fewer differences).
The new "addition" routine requires ~35% less running time than the original.
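The "differs in the last N bits of the mantissa" comparisons above can be expressed as a ULP distance between two 5-byte values. A sketch, assuming both values are non-zero and share the same sign (the helper is mine, not part of the tool):

```python
def ulp_distance(a, b):
    """Distance between two 5-byte ZX floats, in units of the last mantissa
    bit. Assumes both are non-zero with the same sign. The key is monotonic:
    dropping the implied leading 1 leaves a 31-bit fraction, so consecutive
    representable magnitudes (even across an exponent step) differ by 1."""
    def key(v):
        frac = ((v[1] & 0x7F) << 24) | (v[2] << 16) | (v[3] << 8) | v[4]
        return (v[0] << 31) | frac
    return abs(key(a) - key(b))
```

A deviation confined to the last three mantissa bits then shows up as a distance of at most 7; the last-bit-only division differences show up as a distance of 1.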
Finally, I made other variants in which the distance between the operands' exponents was fixed:
Distance=0 (OP1.Exp=OP2.Exp=$80): there were more (60+/1000) differences in the last bit of the mantissa, but the outputs of the new solution were identical to the etalon and the speed was better (-42%).
Distance=32 and 33 bits: both algorithms gave the same results - which one is faster?
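Operand pairs for such a fixed-distance variant could be generated like this (a sketch under my assumptions: positive operands, random mantissas, the lower exponent pinned at $80 as in the Distance=0 case; the function name is mine):

```python
import random

def fixed_distance_pair(distance, rng=random):
    """Two 5-byte ZX floats whose exponent bytes differ by exactly
    `distance`; mantissas are random, sign bits are 0 (positive).
    distance must stay small enough that 0x80 + distance <= 0xFF."""
    def pack(exp):
        frac = rng.getrandbits(31)          # 31 stored mantissa bits
        return bytes([exp,
                      (frac >> 24) & 0x7F,  # implied leading 1 not stored
                      (frac >> 16) & 0xFF,
                      (frac >> 8) & 0xFF,
                      frac & 0xFF])
    return pack(0x80), pack(0x80 + distance)
```

With Distance=33 the smaller operand's entire 32-bit mantissa (plus guard position) falls below the larger one's last bit during the addition's alignment shift, which is presumably why both algorithms agree there.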
All tests were performed on EO v1.2, using the SG81E_BN ROM image. The results (hex_xxx.txt files) and the test programs are in the attachment (MassTest_xx folders).
I still have no idea how to test the limits (underflow and overflow), so any HELP - an idea, a program, a program idea, a test result - would be very welcome.