Boriel Basic Forum

Faster Trigonometry

Posted by: britlion - 2010-02-20, 04:39 AM - Forum: How-To & Tutorials - Replies (3)

Sometimes we're willing to lose accuracy for speed. ZX BASIC uses the Spectrum's ROM routines for many of its math functions, and because those are general-purpose routines working on 40-bit numbers, they can be a bit slow.

See my article on square roots for a good example.

Here is a routine to produce SIN(angle), where angle is in degrees. It uses a lookup table, and it's a very quick and dirty method: it only actually knows 32 angles (evenly spaced between 0 and 90 degrees), and those only to 8-bit precision. It does linear interpolation between the known angles, however, which improves things somewhat.

I was actually surprised how precise it was: it's good for at least 2 decimal places, probably 3 as a rule of thumb. The average error is 0.002. That's probably good enough for games that need to calculate angles. It's about 4-5 times faster than the SIN(FLOAT) function, and it isn't even written in native assembler.

If you need better accuracy, it would be fairly easy to change the method to use a bigger table - perhaps 2 bytes per entry, even.

Remember that COS and TAN can also use this function: COS(x) is SIN(90+x) and TAN(x) is SIN(x)/COS(x). It should be easy to write COSINE and TANGENT functions that do the adjustments and call the SINE function.

(And this one I didn't copy. It's all mine! Bugs and all. And now it's free for anyone to use.)

Code:
FUNCTION SINE(num as FIXED) as FIXED
 DIM quad as byte
 DIM est1,dif as uByte

 while num>360
  num=num-360
 end while

 IF num>180 then
    quad=-1
    num=num-180
 ELSE
    quad=1
 END IF

 IF num>90 then num=180-num: end if

 num=((num*31)/90)
 dif=num : rem Cast to byte loses decimal
 num=num-dif : rem so this is just the decimal bit
 est1=PEEK (@sinetable+dif)
 dif=PEEK (@sinetable+dif+1)-est1 : REM this is just the difference to the next up number.
 num=est1+(num*dif): REM base +interpolate to the next value.

 return (num/255)*quad

 sinetable:
 asm
 DEFB 000,013,026,038,051,064,076,088
 DEFB 100,112,123,134,145,156,166,175
 DEFB 184,193,201,209,216,223,229,234
 DEFB 239,243,247,250,252,254,255,255
 end asm
END FUNCTION
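
For anyone who wants to check the table method on a desktop machine before compiling, here is a rough Python model of the routine above, plus the COS and TAN wrappers described earlier. It is only a sketch: it uses ordinary floats rather than the FIXED type, the names `sine_deg`, `cosine_deg` and `tangent_deg` are made up for this example, and it wraps negative angles with a modulo, which the ZX BASIC version does not do.

```python
# The same 32-entry table as the DEFB lines: sin() of 32 angles from 0 to 90
# degrees, scaled to the range 0..255.
SINE_TABLE = [
    0, 13, 26, 38, 51, 64, 76, 88,
    100, 112, 123, 134, 145, 156, 166, 175,
    184, 193, 201, 209, 216, 223, 229, 234,
    239, 243, 247, 250, 252, 254, 255, 255,
]


def sine_deg(angle):
    """Approximate sin(angle), angle in degrees, by table lookup."""
    angle = angle % 360.0           # assumption: also folds negative angles
    sign = 1.0
    if angle > 180.0:               # lower half of the circle is negative
        sign = -1.0
        angle -= 180.0
    if angle > 90.0:                # mirror the second quadrant onto the first
        angle = 180.0 - angle
    scaled = angle * 31.0 / 90.0    # 0..90 degrees -> table position 0..31
    index = int(scaled)             # whole part selects the table entry
    fraction = scaled - index       # fractional part drives the interpolation
    low = SINE_TABLE[index]
    high = SINE_TABLE[min(index + 1, 31)]   # stay inside the table at 90 degrees
    return sign * (low + fraction * (high - low)) / 255.0


def cosine_deg(angle):
    """COS(x) = SIN(90 + x)."""
    return sine_deg(angle + 90.0)


def tangent_deg(angle):
    """TAN(x) = SIN(x) / COS(x); raises ZeroDivisionError where COS is 0."""
    return sine_deg(angle) / cosine_deg(angle)


for degrees in (0, 30, 45, 60, 90, 200):
    print(degrees, sine_deg(degrees), cosine_deg(degrees))
```

Comparing the printed values against a calculator gives a feel for how far the 8-bit table can be trusted.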

Sometimes it just goes crazy

Posted by: britlion - 2010-02-20, 04:03 AM - Forum: Bug Reports - Replies (1)

I was having trouble with a variable not being where I expected. In the end, I have this code:

Code:
FUNCTION SINE(num as FIXED) as FIXED

[skipped some code]

 PRINT at 0,0;((num*32)/90)
 PRINT INT ((num*32)/90)

So num is a fixed, and the skipped code makes it 0<=num<=90

The routine prints out for those two:
0.35554504......
23301

Now, I'm thinking that INT(0.35) is 0. The compiler landed about 23,000 too high on that one.

Any idea why?

Compiler Speed Trials

Posted by: britlion - 2010-02-18, 05:20 PM - Forum: ZX Basic Compiler - Replies (76)

I know that ZX Basic is amazing, but I was wondering how it stood up to other basic compilers that were around for use on the ZX Spectrum. We know that Hisoft basic was pretty fast, for example, and LCD mentioned another compiler the other day that was pretty amazing too.

Let me borrow from an article in Crash Magazine: http://www.crashonline.org.uk/19/compilers.htm

In this article, Simon Goodwin talks about several compilers. Hisoft Basic isn't one of them - it wasn't out yet. He doesn't list the benchmark programs either, but they can be worked out from these descriptions:

Code:
Benchmark BM1 : A null-action FOR, REPEAT or DO loop, executed 1000 times.
Benchmark BM2 : A null-action explicitly-coded loop executed 1000 times.
Benchmark BM3 : BM2 plus A=K/K*K+K-K in the loop.
Benchmark BM4 : BM2 plus A=K/2*3+4-5 in the loop.
Benchmark BM5 : BM4 plus a branch to null-action subroutine from inside
                the loop.
Benchmark BM6 : BM5 plus an array declaration M(5), and a null-action
                FOR loop (of 1-5) also in the loop.
Benchmark BM7 : BM6 plus M(L)=A in this 1-5 loop.
Benchmark BM8 : A square function, log function and sin function in an
                explicitly-coded FOR loop, repeated 100 times.
Benchmark BM9 : Prime numbers in the range 1-1000 are printed to the
                screen, calculated in an outer loop of 1000 and an inner
                loop of 500, with no tricks at all. This is a very bad
                prime number routine indeed, but a very useful basis for
                inter-machine, interpreter and compiler comparisons.

Simon didn't use Benchmark 9, and I can see why - it's not clearly specified. BM1 to BM8 are pretty clear, however.

My own testing with Sinclair Basic gave very slightly differing results. In all cases, my programs were very slightly faster than the timings Goodwin gave in the magazine article. Perhaps he specified things a little differently; perhaps he was timing with a stopwatch in hand, and human error was the result; perhaps a different version of the ZX Spectrum was used. I got the computer to time the programs using the 50 frames per second interrupt timer. For very fast running programs I increased the number of loops by a factor of 10 or 100 and scaled the result back down.
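
To make the timing arithmetic concrete, here is a small Python sketch of it. The addresses 23672 to 23674 are the three bytes of the Spectrum's frame counter that the timing functions in these posts read; the function names and the `loop_scale` parameter are invented for this illustration only.

```python
FRAMES_PER_SECOND = 50          # the interrupt fires 50 times a second
COUNTER_RANGE = 1 << 24         # the frame counter is three bytes wide


def read_frames(byte_low, byte_mid, byte_high):
    """Combine the bytes PEEKed from 23672, 23673 and 23674 into one count."""
    return byte_low + 256 * byte_mid + 65536 * byte_high


def seconds_per_run(start_frames, end_frames, loop_scale=1):
    """Time for one benchmark run, in seconds.

    loop_scale is how many times longer than specified the loop was made
    (10 or 100 for very fast programs), so the result is scaled back down.
    """
    elapsed = (end_frames - start_frames) % COUNTER_RANGE  # survives wrap-around
    return elapsed / FRAMES_PER_SECOND / loop_scale


# Example: a loop run 100 times longer than specified that took 190 frames.
print(seconds_per_run(0, 190, loop_scale=100))
```

Scaling the loop count up matters because one frame is 0.02 seconds, which is coarser than some of the compiled timings being measured.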

The compilers Goodwin tested were:

A Mehmood's "Compiler".
MCODER
Softek's FP and IS
And a little cheekily, Zip 1.5. He wrote that himself, I believe.

The first two rows are for Sinclair Basic. The first being Simon Goodwin's numbers, the second being my own. All times are in seconds, smaller is better.

Code:
Compiler                 BM1     BM2     BM3     BM4     BM5     BM6     BM7     BM8                       BMDRAW
Sinclair                 4.46    8.46    21.56   19.82   25.34   60.82   87.44   23.30                     80.18
Boriel's ZX BASIC        0.038   0.032   0.30    0.15    0.16    0.328   2.20    24.0
ZX Basic 1.26-r1603 -O3                                                  0.94    20.78 (17.14 with fSin)
ZX Basic 1.2.8-s682 -O3                                                  0.88    20.56 (16.94 with fSin)   21.14
ZX Basic 1.2.8-s758 -O3                                                  0.90    20.76 (17.10 with fSin)   21.32
HiSoft FP                0.82    1.34    7.26    7.30    7.32    12.52   14.40   21.9
HS Integer               0.042   0.67    0.08    0.088   0.334   0.50    10.76
Mehmood                  *       0.065   9.0     4.2     4.2     *       *       *
ZIP 1.5                  0.031   0.064   0.194   0.108   0.115   0.29    0.46    *
TOBOS                    0.58    0.82    2.02    1.76    2.34    6.68    8.72    0.746
SOFTEK FP                1.75    2.1     8.7     9.4     9.4     19.7    24.0    22.5
SOFTEK IS                0.058   0.076   0.57    0.98    0.99    1.32    *       *
MCODER2                  0.043   0.097   0.62    0.90    0.92    1.17    1.47    *

The actual code used is listed below. It's possible to work out what BM1-BM6 are, because each one simply adds code until you end up with BM7. BM8's main loop is listed separately.

Code:
REM BM7
FUNCTION t() as uLong
asm
    LD DE,(23674)
    LD D,0
    LD HL,(23672)
end asm
end function

goto start

subroutine:
return

start:
DIM time,i as uInteger
DIM k,var,j as uByte

let time =t()
LET k=5
LET i=0

label:
LET i=i+1
LET var=k/2*3+4-5
gosub subroutine
DIM M(5) as uInteger
FOR j=0 to 4
LET M(j)=i
NEXT j
IF i<1000 then GOTO label: END IF

print (CAST (FLOAT,t())-time)/50

BM 8 replaces most of the code with:

Code:
REM BM8
DIM i,j as ubyte
j=2
FOR i=1 to 100
result=j^2
result=ln(j)
result=sin(j)
next i

This was changed from using constants, to stop the compiler folding the expressions into constants at compile time.

RESULTS and DISCUSSION

First up, passing all the benchmarks and more, clearly Boriel's work is by far the most flexible and comprehensive compiler available. It blows the spots off everything else in terms of WHAT it can compile, and all credit to him for creating it. It is excellent!

In terms of performance, it's pretty amazing, too. It's the second fastest of all the compilers listed here. Only ZIP goes faster, generally. BM7 is a little disappointing, in that the produced code seems to be slower than both MCODER 2 and ZIP by quite a significant margin. Perhaps some examination of the array handling code could improve this. With version 1.25 beta, sadly, I couldn't use -O3 as an option - the programs all failed to compile with this option enabled, so I couldn't see if peephole optimization would make a difference. It's worth noting that most on-Spectrum compilers refused to deal with floating point numbers. In this roundup, only Softek FP could do it, and that barely faster than Basic. Boriel's compiler blew me away with the FP result, frankly. I had to check to see if it was doing it correctly, it was so amazing! There might be some sneaky optimization happening, but printing the numbers as it created them did seem to work fine. (Note: It WAS cheating. It was putting in constants at compile time. A clever option, but not what we were aiming to test. This number has been changed.)

Fixed the Hisoft Basic numbers. The corrected numbers do in fact show it produces some of the fastest code available, sometimes beaten by ZIP 1.5. It far outmatches what ZIP can do, however, in that it deals with FP as well as integer - and it seems to do both faster than the competition. Of course ZX BASIC excels at being FP and integer aware as well.

Added in Tobos. It's fully FP, so tends to be slow where integer math could improve things. But look at BM8!

ZX BASIC, in short: solid and well optimized. Seems to be slow in BM7 (array handling). Very clever use of constant insertion produced a BM8 time of 0.1, but the times are now corrected because that was cheating a little!
[Edit] - Array handling speed has been dramatically increased with later versions. Boriel has stated that he will be looking into further array optimizations similar to Hisoft Basic methods - so we can hope for another doubling of speed, perhaps!

Read from a txt file

Posted by: omontero - 2010-02-18, 02:50 PM - Forum: Off-Topic - No Replies

	Greetings, friends. I need to know how to read the information in a txt file and place it on any web page of my Drupal site. If anyone can help me I would be grateful. Drupal was developed in PHP, so many PHP functions run in Drupal. Regards to all, and thanks in advance. Osbel

read from a txt file

Posted by: omontero - 2010-02-18, 02:47 PM - Forum: Off-Topic - Replies (1)

	Hi, I am developing a web site with Drupal and need some help. I need to read a txt file and then show its information in a web page. Thanks anyway. Bye

Function calls

Posted by: britlion - 2010-02-17, 02:15 PM - Forum: ZX Basic Compiler - Replies (3)

When doing assembler based function calls, there's something that confuses me about the stack.

Code:
FUNCTION thing (num1 as uByte, num2 as uByte) as uByte
asm
DI
HALT
end asm
END FUNCTION

When the virtual computer crashes, you can look at the registers and the stack and find out what it's doing.

The stack seems to have
uinteger <something>
uinteger <return address>
uinteger <44,num1>
uinteger <44,num2>

A is set to num1

First question: What's the <something> ? I end up popping it off the stack and dumping it. This worries me.
Second question: If I can trust A to be num1 already, why do I have to go through num1 to get to num2?
Third question: Why the 44's strapped to each byte parameter?

So right now I end up:

Code:
POP BC  ; throw this away
POP HL  ; return address
POP AF  ; num 1 -> A
POP DE  ; num2 -> D.

And since that's less than helpful:

LD E,D
LD D,0

To make DE the value of num2.

Question 1 worries me most. What IS that extra value on the stack?

Is there a better way of handling parameters - with IX+offset, say?
Does the compiler set it to clean up the stack afterwards, and I shouldn't POP it at all?

Sorry to whine, but this isn't documented, and I'm trying to reverse engineer it! I've got the hang of fastcall, but sometimes I want more than one parameter.

Memory corruption (Bugs 1.25 Beta) (*solved*)

Posted by: britlion - 2010-02-16, 02:43 AM - Forum: Bug Reports - Replies (8)

* Local array issues:
I noticed that a table I had in a subroutine wasn't returning the correct values. Finally pinned down a short program that demonstrates this:
(Both printed lines should be the same)

Code:
SUB failing(a as ubyte,b as ubyte, c as byte, d as byte, text as string)
DIM table2(25) as uByte => {18,24,16,14,14,12,20,12,12,16,14,6,12,10,18,14,26,18,24,8,12,24,12,26,18,10}
print table2(0);" ";table2(1)
end sub

sub working()
DIM table(25) as uByte => {18,24,16,14,14,12,20,12,12,16,14,6,12,10,18,14,26,18,24,8,12,24,12,26,18,10}
print table(0);" ";table(1)
end sub

cls
working()
failing(1,2,3,4,"A")

Interestingly, if you swap the order in which the two subs are defined, the one that breaks swaps too - it's always the first one defined that breaks.

Wishlist

Posted by: britlion - 2010-02-13, 12:44 PM - Forum: Wishlist - Replies (8)

* Faster Printing routine
* Ability to print anywhere on the 256*192 grid, with optional attributes
* Different sized printing (I'll probably tackle 64 Char printing as well as that 42 character one eventually)
* Interrupt driven routines
* Automated 128K support
* More people adding to function and subroutine libraries

Faster Square Roots

Posted by: britlion - 2010-02-13, 02:16 AM - Forum: How-To & Tutorials - Replies (12)

I've got a version that does LONG as well as uInteger later in this thread.

I note that the compiler uses the Spectrum ROM square root routine. This routine is hideously slow. It actually calculates x^(0.5) instead, and takes ages about it. The Newton-Raphson method would be a lot faster, and is pretty easy to put in.

If you are willing to sacrifice accuracy, an integer square root would be faster still. For a lot of situations, an integer root would be just fine. For example, if I had to work out which of two objects on screen is nearer, I would have to use Pythagoras' theorem [ A^2 = B^2 + C^2 ] to calculate the distances, and that needs square roots. But a result to the nearest whole pixel would probably be perfectly good enough!
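
As an illustration of what an integer root involves, here is a Python sketch of a bit-by-bit integer square root and the pixel-distance use described above. It is not a transcription of any assembler routine; it is the textbook "try each result bit from the top down" method, and the names `isqrt16` and `pixel_distance` are invented for this example.

```python
def isqrt16(n):
    """Whole-number square root (rounded down) of a 16-bit unsigned value."""
    if not 0 <= n <= 0xFFFF:
        raise ValueError("expected a value in the range 0..65535")
    root = 0
    bit = 0x80                  # the root of a 16-bit number fits in 8 bits
    while bit:
        trial = root | bit      # tentatively switch this result bit on
        if trial * trial <= n:  # keep it only if the square still fits
            root = trial
        bit >>= 1
    return root


def pixel_distance(x1, y1, x2, y2):
    """Distance between two points, to the whole pixel below.

    Assumption: the points are close enough that dx*dx + dy*dy fits in
    16 bits; widely separated points would need a 32-bit version.
    """
    dx = x1 - x2
    dy = y1 - y2
    return isqrt16(dx * dx + dy * dy)


print(pixel_distance(10, 10, 13, 14))   # a 3-4-5 triangle
```

The loop only ever needs eight passes, using integer compares and shifts, which is why this kind of routine can beat a floating point root by such a wide margin.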

So, here are two functions, and some code to demonstrate them. One is a perfect replacement for sqr.asm in the library, actually: it's fully floating point compatible, 100% accurate, and about 3-6 times faster. It uses the FP calculator in the ROM; it just doesn't use the SQR command. [Note: It returns a result instead of an error for a negative square root request. Boriel - you might want to change that behavior. Not sure.] The integer version... well - look for yourself! I reckon it's about 40 times faster than the fast version.

Also: it should be possible to do something similar for a 32-bit LONG integer.

Copy and compile this program. I hope you like it:

Code:
FUNCTION FASTCALL SQRT (radicand as FLOAT) as FLOAT
ASM
; FLOATS arrive in A ED CB
;A is the exponent.
          AND   A               ; Test for zero argument
          RET   Z               ; Return with zero.

          ;Strictly we should test the number for being negative and quit if it is.
          ;But let's assume we like imaginary numbers, hmm?
          ; If you'd rather break it change to a jump to an error below.
          ;BIT   7,(HL)          ; Test the bit.
          ;JR    NZ,REPORT       ; back to REPORT_A
                                ; 'Invalid argument'
          RES 7,E               ; Now it's a positive number, no matter what.

          call __FPSTACK_PUSH   ; Okay, We put it on the calc stack. Stack contains ABS(x)

          ;   Halve the exponent to achieve a good guess.(accurate with .25 16 64 etc.)
                                ; Remember, A is the exponent.
          XOR   $80             ; toggle sign of exponent
          SRA   A               ; shift right, bit 7 unchanged.
          INC   A               ;
          JR    Z,ASIS          ; forward with say .25 -> .5
          JP    P,ASIS          ; leave increment if value > .5
          DEC   A               ; restore to shift only.
ASIS:     XOR   $80             ; restore sign.

          call __FPSTACK_PUSH   ; Okay, NOW we put the guess on the stack
          rst  28h    ; ROM CALC    ;;guess,x
          DEFB $C3              ;;st-mem-3              x,guess
          DEFB $02              ;;delete                x

SLOOP:    DEFB  $31             ;;duplicate             x,x.
          DEFB  $E3             ;;get-mem-3             x,x,guess
          DEFB  $C4             ;;st-mem-4              x,x,guess
          DEFB  $05             ;;div                   x,x/guess.
          DEFB  $E3             ;;get-mem-3             x,x/guess,guess
          DEFB  $0F             ;;addition              x,x/guess+guess
          DEFB  $A2             ;;stk-half              x,x/guess+guess,.5
          DEFB  $04             ;;multiply              x,(x/guess+guess)*.5
          DEFB  $C3             ;;st-mem-3              x,newguess
          DEFB  $E4             ;;get-mem-4             x,newguess,oldguess
          DEFB  $03             ;;subtract              x,newguess-oldguess
          DEFB  $2A             ;;abs                   x,difference.
          DEFB  $37             ;;greater-0             x,(0/1).
          DEFB  $00             ;;jump-true             x.
          DEFB  SLOOP - $       ;;to sloop              x.

          DEFB  $02             ;;delete                .
          DEFB  $E3             ;;get-mem-3             retrieve final guess.
          DEFB  $38             ;;end-calc              sqr x.

          jp __FPSTACK_POP
END ASM
END FUNCTION

FUNCTION FASTCALL SQRT16(radicand as uInteger) as uByte
asm
    XOR A
    AND A
    ld  a,l
    ld  l,h
    ld    de,0040h    ; 40h appends "01" to D
    ld    h,d
    ld b,7

sqrt16loop:
    sbc    hl,de        ; IF speed is critical, and you don't mind spending the extra bytes, you could unroll this loop 7 times instead of DJNZ.
    jr    nc,$+3
    add    hl,de
    ccf
    rl    d
    rla
    adc    hl,hl
    rla
    adc    hl,hl
    DJNZ sqrt16loop

    sbc    hl,de        ; optimised last iteration
    ccf
    rl    d
    ld a,d
end asm
END FUNCTION

FUNCTION t AS ULONG
   RETURN INT((65536 * PEEK (23674) + 256 * PEEK(23673) + PEEK (23672)))
END FUNCTION

CLS
DIM a,b as float
DIM i as uInteger
DIM time as long

PRINT "ROM","FAST"

REM show it's as accurate
for i=1 to 15
    LET a=rnd * 32768
    PRINT SQR(a),SQRT(a)
next i

PRINT
PRINT "Over 500 Cycles:"
PRINT

REM ROM version 500 times.
LET time=t()
for i=1 to 500
    b=SQR(a)
next i
PRINT "Rom routine: ";t()-time;" Frames."

REM MY version 500 times.
LET time=t()
for i=1 to 500
    b=SQRT(a)
next i
PRINT "Fast routine: ";t()-time;" Frames."

PRINT
PRINT "PRESS A KEY"
PAUSE 1: PAUSE 0
CLS

PRINT "NUM   FAST      INTEGER"

REM show it's as accurate
for i=1 to 15
    LET a=INT(rnd * 32768)
    PRINT a;TAB 6;SQRT(a);TAB 16;SQRT16(a)
next i

PRINT
PRINT "Over 500 Cycles:"
PRINT

REM MY version 500 times.
LET time=t()
for i=1 to 500
    b=SQRT(a)
next i
PRINT "Fast routine: ";t()-time;" Frames."

REM MY Integer version 500 times.
LET time=t()
for i=1 to 500
    b=SQRT16(a)
next i
PRINT "Integer routine: ";t()-time;" Frames."
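
For readers who don't speak ROM calculator codes, the idea behind the floating point SQRT function above can be modelled in a few lines of Python. This is only a sketch with ordinary Python floats: the name `newton_sqrt` is invented here, and the stopping test is deliberately a little different from the calculator loop (it stops when the guess no longer gets smaller, instead of waiting for two guesses to be exactly equal), so that it cannot bounce forever between two neighbouring values.

```python
import math


def newton_sqrt(x):
    """Square root by Newton-Raphson, seeded by halving the exponent."""
    if x == 0 or math.isnan(x):
        return x
    x = abs(x)  # like the RES 7,E above: negative input is treated as positive
    _, exponent = math.frexp(x)              # x = mantissa * 2**exponent
    guess = math.ldexp(1.0, exponent // 2)   # first guess: half the exponent
    # One step first, so that the guesses approach the root from above.
    guess = (x / guess + guess) * 0.5
    while True:
        new_guess = (x / guess + guess) * 0.5   # same formula as the SLOOP body
        if new_guess >= guess:                  # no further improvement
            return guess
        guess = new_guess


for value in (0.25, 2.0, 16.0, 32768.0):
    print(value, newton_sqrt(value), math.sqrt(value))
```

Halving the exponent puts the first guess within a small factor of the true root, so only a handful of passes round the loop are needed for any input.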

How to make your code faster

Posted by: britlion - 2010-02-11, 01:20 AM - Forum: How-To & Tutorials - Replies (5)

One of the reasons you are probably looking at this is that you have some idea how to program in Sinclair BASIC, and no idea how to code in machine code (or Z80 assembler, as it's sometimes called). You want to go play with the old Spectrum stuff, and want faster programs - and it must be easier these days, right?

Well, with Boriel's compiler, it is. Most programs can be put into the compiler in a form almost identical to the original Sinclair BASIC program, and they will work. They will be faster. But you want to make them as fast as you can, right?

First thing then: variable types. (see http://www.boriel.com/wiki/en/index.php/ZX_BASIC:Types for details on what variable types the compiler supports.)

Nothing you can do to your program will give as big a speed increase as making sure you use the smallest variable type possible in every case. A byte is better than an integer, which is better than a long, and all of those are better than floating point numbers if you can avoid them.

Have a look at this program:

Code:
FUNCTION t AS ULONG
    RETURN INT((65536 * PEEK (23674) + 256 * PEEK(23673) + PEEK (23672)))
END FUNCTION

DIM i,j,k,fake as <insert type here>
DIM time as uLong

let fake=0
CLS
PRINT "Loop Start"
LET TIME=t()

for k=1 to 20
    for j=1 to 125
        for i = 1 to 125
            LET fake=fake+1-(fake/2)
        next i
    next j
next k

PRINT "loop End"
print t()-TIME

If we set the type of variable for i,j,k,fake as FLOAT up there at the top, this program will disappear for ages before it reports that it took 119,551 frames to come back. That's almost 40 minutes! If you change the type of variable there to UBYTE it comes back in 839 frames. That's under 17 seconds. To put it another way, the code runs over 142 times faster. Variable types make a BIG difference!

NOTE: The nearest Sinclair BASIC equivalent of this program runs in 235,726 frames, or just over 78 minutes to do the same thing. Even using the same variable types as Sinclair BASIC (Which always uses five byte FLOAT types), a compiled program is quite a lot faster!

For the above program, here are the times in frames. (A frame is 1/50th of a second, so divide by 50 if you want a time in seconds. I left them in frames to make the speed comparison easier.)

Code:
uByte    =     839
Byte     =     861
uInteger =    1126
Integer  =    1178
uLong    =   31792
Long     =   32895
Fixed    =   36711
Float    =  119551
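
If you would rather see those figures as seconds and as a slowdown relative to uByte, the conversion is easy to script. This Python sketch just re-does the arithmetic on the frame counts listed above; the names in it are made up for the example.

```python
FRAMES_PER_SECOND = 50

# Frame counts exactly as listed in the table above.
frames_by_type = {
    "uByte": 839,
    "Byte": 861,
    "uInteger": 1126,
    "Integer": 1178,
    "uLong": 31792,
    "Long": 32895,
    "Fixed": 36711,
    "Float": 119551,
}


def seconds(frames):
    """Convert a frame count to seconds."""
    return frames / FRAMES_PER_SECOND


def slowdown(type_name, baseline="uByte"):
    """How many times slower a type is than the baseline type."""
    return frames_by_type[type_name] / frames_by_type[baseline]


for name, frames in frames_by_type.items():
    print(f"{name:<9}{seconds(frames):>10.2f} s {slowdown(name):>8.1f}x")
```

Seen this way, the big jump is between the 16-bit and 32-bit types: everything from uLong upwards costs many times more than the byte and integer types.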

The rule is: use the smallest type you can, every time, especially in loops! If you're only going round a FOR/NEXT loop about 10 times, use uByte.

If you can get away with positive numbers only, the unsigned types (uByte, uInteger and uLong) are a little bit faster than the signed ones.

You may also be able to eliminate floating point numbers by multiplying up - for example store $3.02 as 302 pennies.

[Note: In computing terms generally (not just on the Spectrum) there are good reasons not to store money in floating point numbers anyway - floating point numbers are NOT perfectly accurate, and you may get rounding errors that could cause problems later on. Just as in decimal you can't write 1/3 without an infinitely long 0.33333333->forever happening, you can't store something like 0.1 in binary without an infinitely long binary number. So it's far better to store currency as the smaller unit in an integer or long type. A long would allow you to keep track of up to +/- 2,147,483,647 pennies - or about 21 million currency units. If you want to track more than that you can definitely afford a more powerful computer than a Spectrum!]
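
The rounding problem is easy to demonstrate on any machine. Here is a short Python sketch of the "store it as pennies" idea; `format_pennies` is a made-up helper for this example, and `LONG_MAX` is simply the largest value a signed 32-bit type can hold.

```python
def format_pennies(pennies):
    """Show a whole number of pennies as units.pp without using floats."""
    sign = "-" if pennies < 0 else ""
    units, rest = divmod(abs(pennies), 100)
    return f"{sign}{units}.{rest:02d}"


# Binary floating point cannot hold 0.1 exactly, so small errors creep in.
print(0.1 + 0.2 == 0.3)        # the float comparison fails
print(10 + 20 == 30)           # the same sum in pennies is exact

price = 302                    # 3.02 stored as 302 pennies
print(format_pennies(price * 3))

LONG_MAX = 2**31 - 1           # largest signed 32-bit value
print(format_pennies(LONG_MAX))
```

All the arithmetic stays in whole numbers; the decimal point is only put back when the amount is displayed.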
