Let's calculate! - renewed arithmetic routines

zsolt · Post by **zsolt** » Thu Apr 05, 2018 1:17 pm

Dear Zeddy Fans,

Over the past few months I spent a lot of time writing and testing new arithmetic routines.
The "addition" was completely rewritten, the core of the "division" and the "multiplication" have not changed (just got reorganized ), but the pre- and post-operations were rewritten.

The 3(4) basic arithmetic routines are already in usable format, they are now a bit faster and smaller (57+9 bytes) than the originals.

How i tested them:

i made for each a simple visual testing program, where at first the Zeddy's BASIC is computing and displaying the result, thereafter the new routine is called, and you can compare them using your eyes (you can find them in the attachment in the "VisualTst" folder);

there were 10 mass-tests also to simultaneously check the accuracy and speed of the new arithmetics. They are based on the production of 2000 numbers, using ~8000 bytes of the ZX81 ROM. (the picture shows how the operands were constructed)

: This algorithm ensures that the two operands have a distance of less than 32 bits: abs(EXP1-EXP2)<32

How the test programs work?

Step 1: the program fills the array defined by the first BASIC line - DIM A(2000) - and displays the address of the 1st array element - A(1).
Now these 2000 numbers can be saved as operands - EO: "File/Save Memory Block" menu (10000 bytes).

Step 2: it calls the old arithmetic routine 1000 times and diplays the run-time in "frames". The results are stored in the first half of the array - A(1) to A(1000) - which can be also saved ("File/Save Memory Block", 5000 bytes)

Step 3: finally it fills again the array, calls 1000 times the new arithmetic operation and prints the running time. (save: "File/Save Memory Block", 5000 bytes)

The results:

I wrote a little tool, which helped to compare the "old" and "new" values. It reads the operands, the outputs of the old and new arithmetic routines (which were saved as memory blocks), and creates etalon results also, finally it can save all these in a text file, byte by byte as hexadecimal values. The etalon value is based on IEEE 754 double-precision binary floating-point format. The operands are converted from 40 bit ZX float to 64 bit double, then the operation is performed, finaly the result is reconverted to 40 bit ZX format.

In case of the "multiplication" there was no difference between the outputs of the new and the old algorithm, and both were identical to the etalon. The new solution required 8% less running time.

In case of the "division", there were already more differences (200+/1000) to begin with in the last bit of the mantissas, but the outputs of the new solution was identical to the etalon and the speed was also better (-11%).

The results of the "addition" were really interesting. If the results of the original and the new routines are compared, then the deviations are more than 2.5% (25+/1000) in one of the last three (!) bits of the mantissas. When compared to the etalon values, the outputs of both routines (new and old) are different in more cases (35+/1000), but these differences are only in one of the last two (!) bits of the mantissas, and the new solution is a bit more accurate (less differences).
The new "addition" routine requires ~35% less running time than the original.

Finally i made other variants, where the distance of the operands was fixed:

Distance=0 (OP1.Epx=OP2.Exp=$80): there was more (60+/1000) differences in the last bit of the mantissas, but the outputs of the new solution were identical to the etalon and the speed was better (-42%).

Distance=32 and 33bits: both algorithms gave the same results - Which one is faster?

All tests were performed on EO v1.2, using the SG81E_BN ROM image. The results (hex_xxx.txt files), and the test programs are in the attachment (MassTest_xx folders).

I still have no ideas for testing the limits (underflow and overflow), so any HELP - idea, program, program idea, test result - would be very welcome.

nollkolltroll · Post by **nollkolltroll** » Thu Apr 05, 2018 5:00 pm

Really impressive coding and testing!

zsolt · Post by **zsolt** » Fri Jun 08, 2018 11:04 am

Hi,

As stefano suggested, the next "rocket stage"

is the updated CALCULATOR.
I have chosen an other way, because the literals of the lambda calculator were completely rearranged, so they are incompatible with the ZX81.

The first step was a new "INT" function (using direct calls) and an improved "TRUNCATION" routine.
These are not only faster, but a part of it can be used as a new function: "subtract one".
The "SGN" and "COS" functions have also been renewed - the latter is now calculated as COS(X)=SIN(X + PI/2).

The new "calculator literals" made it possible to replace some common literal sequences, resulting faster and shorter code:

Code: Select all

	new			old			(calc. stack)

	sub-one			stk-one			(x, 1)
				subtract		(x-1)

	mul-by-2		duplicate		(x, x)
				addition		(x+x =2x)

	mul-by-10		stk-ten			(x, 10)
	(2x+8x)			multiply		(x*10)

	stk-square		duplicate		(x, x)
				duplicate		(x, x, x)
				multiply		(x, x*x)

The affected functions and procedures are: ASIN, ATAN, EXP, FP-CALC, FP2BC, INT2FP, LN, NXTDGT1, RND, SER-GEN, SIN, TO-POWER

The "get-argt" literal now does nothing (it points to a simple RET), because the improved "GETARGT" function has been moved to the improved "SINE" function.

During tests of this "rocket stage" I found a bug in the "exponent correction" part of the 'MULTIPLICATION' and 'DIVISION' routines - the attachment contains the improved solution.

: The attachment also includes the test programs

I made some patches also in PRINTING, POINTERS, LINE-ENDS,IF and in SCROLL routines (see "changes" folder in the ZIP).

Please help in testing - run your favourite BASIC programs, etc - because...

: I want to surpass the 150%; sg81_g.png (4.69 KiB) Viewed 7745 times

... i have a "BIG BANG" for you

Regards,
Zsolt

Edit: here is the bug fixed version!

siggi · Post by **siggi** » Sat Jun 09, 2018 3:09 pm

Hi Zsolt
I loaded the ROM into flash of ZxMore (where it is easy to load a new rom), but it worked only in "compatibility" mode, where I cannot LOAD/SAVE programs from USB. Is there something different in handling the NMI, which is used by the ZxMore master program?

But it worked when I loaded the ROM into flash of ZxBlast. There I did first tests, running MEMOCALC, which does use the floating point routines in ROM.
Calculating some "equations" (sin(C1/PI)**2+cos(C1/PI)**2) gave correct results:

Here is MEMOCALC for more tests (is copied into RAM@8K and started by USR 13824):

MEMOCALC.P: (6.42 KiB) Downloaded 345 times

Siggi

olofsen · Post by **olofsen** » Sun Jun 10, 2018 9:29 am

The LD HL,($400C) in the NMI routine is three bytes earlier, at $0071, so that there is no display. But DS-L perhaps still works before going to compatibility mode?

siggi · Post by **siggi** » Sun Jun 10, 2018 10:12 am

olofsen wrote: But DS-L perhaps still works before going to compatibility mode?

Hmm, if I load a P-file (by master program instance 0) into ram of instance 5 (SG81G), it will be deleted when I switch to instance 5 (compat. mode), because the SG81G rom then boots the Zeddy and tests and clears all memory ...

Anyway: even if it would work: without any display it does not really help

Siggi

olofsen · Post by **olofsen** » Sun Jun 10, 2018 11:10 am

Siggi: does the attached version work? It has the original NMI routine (so it is a little slower), and the change to detect NMIs on the ZXmore.

siggi · Post by **siggi** » Sun Jun 10, 2018 12:38 pm

Yes, it works

Siggi

siggi · Post by **siggi** » Sun Jun 10, 2018 1:51 pm

AT ZxMore (3,25 MHz) CLCKFREQ result is 187% (maybe also caused by Wilf's WHY WAIT mod in ZxMore)

zsolt · Post by **zsolt** » Sun Jun 10, 2018 7:17 pm

Hi,

siggi wrote: ↑Sun Jun 10, 2018 10:12 am Anyway: even if it would work: without any display it does not really help

I am very sad, because ...

... the "3T pach" is one of the very first improvments of the ZX81 ROM (i know it since '85)

... this is the only sw solution that accelerates both BASIC and MC programs

... the "zxmore" project started later than this patch was incorporated into the SG81 upgrade project (see Need for speed), and now you see it as the cause of the problem

... the "sgmore.rom" above is good, but it is a little step back.

Regards,
Zsolt

Sinclair ZX80 / ZX81 / Z88 Forums

Let's calculate! - renewed arithmetic routines

Let's calculate! - renewed arithmetic routines

Re: Let's calculate! - renewed arithmetic routines

Re: Let's calculate! - renewed arithmetic routines

Re: Let's calculate! - renewed arithmetic routines

Re: Let's calculate! - renewed arithmetic routines

Re: Let's calculate! - renewed arithmetic routines

Re: Let's calculate! - renewed arithmetic routines

Re: Let's calculate! - renewed arithmetic routines

Re: Let's calculate! - renewed arithmetic routines

Re: Let's calculate! - renewed arithmetic routines