| Faster Trigonometry |
|
Posted by: britlion - 2010-02-20, 04:39 AM - Forum: How-To & Tutorials
- Replies (3)
|
 |
Sometimes we're willing to lose accuracy for speed. ZX BASIC uses the Spectrum's ROM routines for many of its math functions, and because those are general-purpose routines that work on 40-bit numbers, they can be a bit slow.
See my article on square roots, for a good example.
Here is a routine to produce SIN(angle), where angle is in degrees. It uses a lookup table to calculate sin values, and it's a very quick and dirty method, in that it only actually knows 32 angles, and those only to 8-bit precision. It does linear interpolation between these known angles, however, which improves things somewhat.
I was actually surprised how precise it was - it's good for at least 2 decimal places, probably 3 as a rule of thumb. The average error is 0.002. That's probably good enough for games that need to calculate angles. It's about 4-5 times faster than the SIN(FLOAT) function, and it's not even written in native assembler.
If you need better accuracy, it would be fairly easy to change the method to use a bigger table - perhaps 2 bytes per entry, even.
Remember that COS and TAN can also be worked out with this function - COS(x) is SIN(90+x) and TAN(x) is SIN(x)/COS(x). It should be easy to write COSINE and TANGENT functions that do the adjustments and call the SINE function.
(And this one I didn't copy. It's all mine! Bugs and all. And now it's free for anyone to use.)
Code:
FUNCTION SINE(num as FIXED) as FIXED
    DIM quad as byte
    DIM est1,dif as uByte
    while num>360
        num=num-360
    end while
    IF num>180 then
        quad=-1
        num=num-180
    ELSE
        quad=1
    END IF
    IF num>90 then num=180-num: end if
    num=((num*31)/90)
    dif=num : rem Cast to byte loses decimal
    num=num-dif : rem so this is just the decimal bit
    est1=PEEK (@sinetable+dif)
    dif=PEEK (@sinetable+dif+1)-est1 : REM this is just the difference to the next up number.
    num=est1+(num*dif): REM base +interpolate to the next value.
    return (num/255)*quad
    sinetable:
    asm
    DEFB 000,013,026,038,051,064,076,088
    DEFB 100,112,123,134,145,156,166,175
    DEFB 184,193,201,209,216,223,229,234
    DEFB 239,243,247,250,252,254,255,255
    end asm
END FUNCTION
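The COS and TAN adjustments described above could be sketched roughly like this. It is only a sketch: the names COSINE and TANGENT come from the post, but the bodies are my own guess at what such wrappers might look like. It assumes the SINE function above is already defined, that angles are non-negative (SINE does not fold negative angles), and it returns 0 as an arbitrary placeholder where TAN is undefined.

```basic
REM Sketch only: wrappers around the SINE function above.
FUNCTION COSINE(num AS FIXED) AS FIXED
    REM COS(x) = SIN(90+x); SINE folds angles above 360 back into range
    RETURN SINE(num + 90)
END FUNCTION

FUNCTION TANGENT(num AS FIXED) AS FIXED
    DIM c AS FIXED
    LET c = COSINE(num)
    REM TAN is undefined where COS is 0 (90, 270 degrees);
    REM returning 0 here is an arbitrary placeholder, not a real answer
    IF c = 0 THEN
        RETURN 0
    END IF
    RETURN SINE(num) / c
END FUNCTION
```

TANGENT calls the table routine twice, so it costs roughly two SINE calls plus a FIXED division, and the table's small errors are magnified near 90 and 270 degrees where COS is close to zero.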
|
|
|
| Sometimes it just goes crazy |
|
Posted by: britlion - 2010-02-20, 04:03 AM - Forum: Bug Reports
- Replies (1)
|
 |
I was having trouble with a variable not being where I expected. In the end, I have this code:
Code:
FUNCTION SINE(num as FIXED) as FIXED
[skipped some code]
PRINT at 0,0;((num*32)/90)
PRINT INT ((num*32)/90)
So num is a fixed, and the skipped code makes it 0<=num<=90
The routine prints out for those two:
0.35554504......
23301
Now, I'm thinking that INT(0.35) is 0. The compiler landed about 23,000 too high on that one.
Any idea why?
|
|
|
| Compiler Speed Trials |
|
Posted by: britlion - 2010-02-18, 05:20 PM - Forum: ZX Basic Compiler
- Replies (76)
|
 |
I know that ZX Basic is amazing, but I was wondering how it stood up to other BASIC compilers that were around for use on the ZX Spectrum. We know that HiSoft BASIC was pretty fast, for example, and LCD mentioned another compiler the other day that was pretty amazing too.
Let me borrow from an article in Crash Magazine: http://www.crashonline.org.uk/19/compilers.htm
In this article, Simon Goodwin talks about several compilers. HiSoft BASIC isn't one of them - it wasn't out yet. He doesn't list the benchmark programs, either, but they can be inferred from these descriptions:
Code:
Benchmark BM1 : A null-action FOR, REPEAT or DO loop, executed 1000 times.
Benchmark BM2 : A null-action explicitly-coded loop executed 1000 times.
Benchmark BM3 : BM2 plus A=K/K*K+K-K in the loop.
Benchmark BM4 : BM2 plus A=K/2*3+4-5 in the loop.
Benchmark BM5 : BM4 plus a branch to null-action subroutine from inside the loop.
Benchmark BM6 : BM5 plus an array declaration M(5), and a null-action FOR loop (of 1-5) also in the loop.
Benchmark BM7 : BM6 plus M(L)=A in this 1-5 loop.
Benchmark BM8 : A square function, log function and sin function in an explicitly-coded FOR loop, repeated 100 times.
Benchmark BM9 : Prime numbers in the range 1-1000 are printed to the screen, calculated in an outer loop of 1000 and an inner loop of 500, with no tricks at all. This is a very bad prime number routine indeed, but a very useful basis for inter-machine, interpreter and compiler comparisons.
Simon didn't use Benchmark 9, and I can see why - it's not clearly specified. BM1 to BM8 are pretty clear, however.
My own testing with Sinclair BASIC gave very slightly different results. In all cases, my programs were very slightly faster than the timings Goodwin gave in the magazine article. Perhaps he specified things a little differently; perhaps he was timing with a stopwatch in hand, and human error was the result; perhaps a different version of the ZX Spectrum was used. I got the computer to time the programs using the 50 frames per second interrupt timer. For very fast running programs I increased the number of loops by a factor of 10 or 100 and scaled the result back down.
The compilers Goodwin tested were:
A Mehmood's "Compiler".
MCODER
Softek's FP and IS
And a little cheekily, Zip 1.5. He wrote that himself, I believe.
The first two rows are for Sinclair Basic. The first being Simon Goodwin's numbers, the second being my own. All times are in seconds, smaller is better.
Code:
BM1 BM2 BM3 BM4 BM5 BM6 BM7 BM8 BMDRAW
Sinclair 4.46 8.46 21.56 19.82 25.34 60.82 87.44 23.30 80.18
Boriel's ZX BASIC 0.038 0.032 0.30 0.15 0.16 0.328 2.20 24.0
ZX Basic 1.26-r1603 -O3 0.94 20.78 (17.14 with fSin)
ZX Basic 1.2.8-s682 -O3 0.88 20.56 (16.94 with fSin) 21.14
ZX Basic 1.2.8-s758 -O3 0.90 20.76 (17.10 with fSin) 21.32
HiSoft FP 0.82 1.34 7.26 7.30 7.32 12.52 14.40 21.9
HS Integer 0.042 0.67 0.08 0.088 0.334 0.50 10.76
Mehmood * 0.065 9.0 4.2 4.2 * * *
ZIP 1.5 0.031 0.064 0.194 0.108 0.115 0.29 0.46 *
TOBOS 0.58 0.82 2.02 1.76 2.34 6.68 8.72 0.746
SOFTEK FP 1.75 2.1 8.7 9.4 9.4 19.7 24.0 22.5
SOFTEK IS 0.058 0.076 0.57 0.98 0.99 1.32 * *
MCODER2 0.043 0.097 0.62 0.90 0.92 1.17 1.47 *
The actual code used is listed below. It's possible to extrapolate what BM1-6 are, because they simply add code to end up with BM7. BM8's main loop is listed separately.
Code:
REM BM7
FUNCTION t() as uLong
    asm
    LD DE,(23674)
    LD D,0
    LD HL,(23672)
    end asm
end function
goto start
subroutine:
return
start:
DIM time,i as uInteger
DIM k,var,j as uByte
let time =t()
LET k=5
LET i=0
label:
LET i=i+1
LET var=k/2*3+4-5
gosub subroutine
DIM M(5) as uInteger
FOR j=0 to 4
    LET M(j)=i
NEXT j
IF i<1000 then GOTO label: END IF
print (CAST (FLOAT,t())-time)/50
BM8 replaces most of the code with:
Code:
REM BM8
DIM i,j as ubyte
j=2
FOR i=1 to 100
    result=j^2
    result=ln(j)
    result=sin(j)
next i
This is changed from using constants to prevent constant folding optimizations.
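Going the other way, stripping BM7 back down to the BM1 description (a null-action FOR loop executed 1000 times) might give something like the sketch below. This is my reading of the description rather than the exact program that was timed; it reuses the t() timer and the timing line from the BM7 listing unchanged. It assumes the compiler does not remove the empty loop, and for a run this short the loop count would probably need to be raised by 10 or 100 times and scaled back down, as mentioned earlier.

```basic
REM BM1 (sketch): a null-action FOR loop, executed 1000 times
FUNCTION t() as uLong
    asm
    LD DE,(23674)
    LD D,0
    LD HL,(23672)
    end asm
end function

DIM time,i as uInteger
let time=t()
FOR i=1 to 1000
NEXT i
REM frames elapsed, divided by 50 to give seconds
print (CAST (FLOAT,t())-time)/50
```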
RESULTS and DISCUSSION
First up, passing all the benchmarks and more, clearly Boriel's work is by far the most flexible and comprehensive compiler available. It blows the spots off everything else in terms of WHAT it can compile, and all credit to him for creating it. It is excellent!
In terms of performance, it's pretty amazing, too. It's the second fastest of all the compilers listed here. Only ZIP goes faster, generally. BM7 is a little disappointing, in that the produced code seems to be slower than both MCODER 2 and ZIP by quite a significant margin. Perhaps some examination of the array handling code could improve this. With version 1.25 beta, sadly, I couldn't use -O3 as an option - the programs all failed to compile with this option enabled, so I couldn't see if peephole optimization would make a difference. It's worth noting that most on-Spectrum compilers refused to deal with floating point numbers. In this roundup, only Softek FP could do it, and that was barely faster than BASIC. Boriel's compiler blew me away with the FP result, frankly. I had to check to see if it was doing it correctly, it was so amazing! There might be some sneaky optimization happening, but printing the numbers as it created them did seem to work fine. (Note: It WAS cheating. It was putting in constants at compile time. A clever option, but not what we were aiming to test. This number has been changed.)
Fixed HiSoft BASIC numbers. These corrected numbers do in fact show it produces some of the fastest code available, sometimes beaten by ZIP 1.5. It far outmatches what ZIP can do, however, in that it deals with FP as well as integer - and it seems to do both faster than the competition. Of course ZX BASIC excels at being FP and integer aware as well.
Added in Tobos. It's fully FP, so tends to be slow where integer math could improve things. But look at BM8!
ZX BASIC, in short: solid and well optimized. Seems to be slow in BM7 (array handling). Very clever use of constant insertion produced a BM8 time of 0.1, but the times are now corrected because that was cheating a little!
[Edit] - Array handling speed has been dramatically increased with later versions. Boriel has stated that he will be looking into further array optimizations similar to HiSoft BASIC's methods - so we can hope for another doubling of speed, perhaps!
|
|
|
| read from a txt file |
|
Posted by: omontero - 2010-02-18, 02:50 PM - Forum: Off-Topic
- No Replies
|
 |
Greetings, friends,
I need to know how to read the information in a txt file and display it on any web page of my Drupal site. If anyone can help me, I would be grateful. Drupal was developed with PHP, so many PHP functions run in Drupal.
Regards to all, and thanks in advance.
Osbel
|
|
|
| read from a txt file |
|
Posted by: omontero - 2010-02-18, 02:47 PM - Forum: Off-Topic
- Replies (1)
|
 |
Hi,
I have developed a web site with Drupal, and I need some help. I need to read a txt file and then put the information from it in a web page.
Thanks anyway.
bye
|
|
|
| Function calls |
|
Posted by: britlion - 2010-02-17, 02:15 PM - Forum: ZX Basic Compiler
- Replies (3)
|
 |
When doing assembler based function calls, there's something that confuses me about the stack.
Code:
FUNCTION thing (num1 as uByte, num2 as uByte) as uByte
    asm
    DI
    HALT
    end asm
END FUNCTION
When the virtual computer crashes, you can look at the registers and the stack and find out what it's doing.
The stack seems to have
uinteger <something>
uinteger <return address>
uinteger <44,num1>
uinteger <44,num2>
A is set to num1
First question: What's the <something> ? I end up popping it off the stack and dumping it. This worries me.
Second question: If I can trust A to be num1 already, why do I have to go through num1 to get to num2?
Third question: Why the 44's strapped to each byte parameter?
So right now I end up:
Code:
POP BC ; throw this away
POP HL ; return address
POP AF ; num 1 -> A
POP DE ; num2 -> D.
And since that's less than helpful:
LD E,D
LD D,0
To make DE the value of num2.
Question 1 worries me most. What IS that extra value on the stack?
Is there a better way of handling parameters - with IX+offset, say?
Does the compiler set it to clean up the stack afterwards, and I shouldn't POP it at all?
Sorry to whine, but this isn't documented, and I'm trying to reverse engineer it! I've got the hang of fastcall, but sometimes I want more than one parameter.
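For what it's worth, here is what the IX+offset idea might look like. This is a sketch under assumptions that nothing in this post confirms: that the unknown first word is the caller's saved IX, that the compiler's entry code leaves IX pointing at that word, that a uByte result is returned in A, and that the compiler's own exit code restores the stack (so nothing is popped by hand). The offsets follow directly from the stack layout listed above: two bytes of <something>, two bytes of return address, then each byte parameter in the high byte of its word.

```basic
FUNCTION thing (num1 as uByte, num2 as uByte) as uByte
    asm
    ; Assumed layout relative to IX (see the stack listing above):
    ;   IX+0,1  <something>  (assumed: saved IX)
    ;   IX+2,3  return address
    ;   IX+4,5  <44,num1>    -> num1 is the high byte, at IX+5
    ;   IX+6,7  <44,num2>    -> num2 is the high byte, at IX+7
    LD A,(IX+5)    ; A = num1
    LD E,(IX+7)    ; E = num2
    LD D,0         ; DE = num2 as a 16-bit value
    ADD A,E        ; illustrative result only: num1+num2 left in A
    end asm
END FUNCTION
```

If those assumptions hold, reading parameters this way avoids disturbing the stack at all, which would also sidestep the worry about what the extra value is.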
|
|
|
| Memory corruption (Bugs 1.25 Beta) (*solved*) |
|
Posted by: britlion - 2010-02-16, 02:43 AM - Forum: Bug Reports
- Replies (8)
|
 |
* Local array issues:
I noticed that a table I had in a subroutine wasn't returning the correct values. Finally pinned down a short program that demonstrates this:
(Both printed lines should be the same)
Code:
SUB failing(a as ubyte,b as ubyte, c as byte, d as byte, text as string)
    DIM table2(25) as uByte => {18,24,16,14,14,12,20,12,12,16,14,6,12,10,18,14,26,18,24,8,12,24,12,26,18,10}
    print table2(0);" ";table2(1)
end sub
sub working()
    DIM table(25) as uByte => {18,24,16,14,14,12,20,12,12,16,14,6,12,10,18,14,26,18,24,8,12,24,12,26,18,10}
    print table(0);" ";table(1)
end sub
cls
working()
failing(1,2,3,4,"A")
Interestingly if you swap the order that the two subs are in, which one breaks swaps too - it's the first one defined that breaks each time.
|
|
|
| Wishlist |
|
Posted by: britlion - 2010-02-13, 12:44 PM - Forum: Wishlist
- Replies (8)
|
 |
* Faster Printing routine
* Ability to print anywhere on the 256*192 grid, with optional attributes
* Different sized printing (I'll probably tackle 64 Char printing as well as that 42 character one eventually)
* Interrupt driven routines
* Automated 128K support
* More people adding to function and subroutine libraries
|
|
|
| Faster Square Roots |
|
Posted by: britlion - 2010-02-13, 02:16 AM - Forum: How-To & Tutorials
- Replies (12)
|
 |
I've got a version that does LONG as well as uInteger later in this thread.
I note that the compiler uses the Spectrum ROM square root routine. This routine is hideously slow. It actually calculates x^(0.5) instead, and takes ages about it. The Newton-Raphson method would be a lot faster, and pretty easy to put in.
If you are willing to sacrifice accuracy, an integer square root would be faster still. For a lot of situations, an integer root would be just fine - for example, if I had to work out which of two objects on screen is nearer, I'm going to have to use Pythagoras' theorem to calculate distances [ A^2 = B^2 + C^2 ], and that needs a square root to turn A^2 into a distance. But the nearest whole pixel would probably be a perfectly good enough result!
So, here are two functions, and some code to demonstrate them. One is a perfect replacement for sqr.asm in the library, actually - it's fully floating point compatible, 100% accurate, and about 3-6 times faster. It actually uses the FP calculator in the ROM. It just doesn't use the SQR command. [Note: It comes back with something instead of an error in case of a negative square root request. Boriel - you might want to change that behavior. Not sure.] The integer version... well - look for yourself! I reckon it's about 40 times faster than the fast version.
Also: it should be possible to do something similar for a 32 bit LONG integer.
Copy and compile this program. I hope you like it:
Code:
FUNCTION FASTCALL SQRT (radicand as FLOAT) as FLOAT
ASM
; FLOATS arrive in A ED CB
;A is the exponent.
AND A ; Test for zero argument
RET Z ; Return with zero.
;Strictly we should test the number for being negative and quit if it is.
;But let's assume we like imaginary numbers, hmm?
; If you'd rather break it change to a jump to an error below.
;BIT 7,(HL) ; Test the bit.
;JR NZ,REPORT ; back to REPORT_A
; 'Invalid argument'
RES 7,E ; Now it's a positive number, no matter what.
call __FPSTACK_PUSH ; Okay, We put it on the calc stack. Stack contains ABS(x)
; Halve the exponent to achieve a good guess.(accurate with .25 16 64 etc.)
; Remember, A is the exponent.
XOR $80 ; toggle sign of exponent
SRA A ; shift right, bit 7 unchanged.
INC A ;
JR Z,ASIS ; forward with say .25 -> .5
JP P,ASIS ; leave increment if value > .5
DEC A ; restore to shift only.
ASIS: XOR $80 ; restore sign.
call __FPSTACK_PUSH ; Okay, NOW we put the guess on the stack
rst 28h ; ROM CALC ;;guess,x
DEFB $C3 ;;st-mem-3 x,guess
DEFB $02 ;;delete x
SLOOP: DEFB $31 ;;duplicate x,x.
DEFB $E3 ;;get-mem-3 x,x,guess
DEFB $C4 ;;st-mem-4 x,x,guess
DEFB $05 ;;div x,x/guess.
DEFB $E3 ;;get-mem-3 x,x/guess,guess
DEFB $0F ;;addition x,x/guess+guess
DEFB $A2 ;;stk-half x,x/guess+guess,.5
DEFB $04 ;;multiply x,(x/guess+guess)*.5
DEFB $C3 ;;st-mem-3 x,newguess
DEFB $E4 ;;get-mem-4 x,newguess,oldguess
DEFB $03 ;;subtract x,newguess-oldguess
DEFB $2A ;;abs x,difference.
DEFB $37 ;;greater-0 x,(0/1).
DEFB $00 ;;jump-true x.
DEFB SLOOP - $ ;;to sloop x.
DEFB $02 ;;delete .
DEFB $E3 ;;get-mem-3 retrieve final guess.
DEFB $38 ;;end-calc sqr x.
jp __FPSTACK_POP
END ASM
END FUNCTION
FUNCTION FASTCALL SQRT16(radicand as uInteger) as uByte
asm
XOR A
AND A
ld a,l
ld l,h
ld de,0040h ; 40h appends "01" to D
ld h,d
ld b,7
sqrt16loop:
sbc hl,de ; IF speed is critical, and you don't mind spending the extra bytes, you could unroll this loop 7 times instead of DJNZ.
jr nc,$+3
add hl,de
ccf
rl d
rla
adc hl,hl
rla
adc hl,hl
DJNZ sqrt16loop
sbc hl,de ; optimised last iteration
ccf
rl d
ld a,d
end asm
END FUNCTION
FUNCTION t AS ULONG
RETURN INT((65536 * PEEK (23674) + 256 * PEEK(23673) + PEEK (23672)))
END FUNCTION
CLS
DIM a,b as float
DIM i as uInteger
DIM time as long
PRINT "ROM","FAST"
REM show it's as accurate
for i=1 to 15
LET a=rnd * 32768
PRINT SQR(a),SQRT(a)
next i
PRINT
PRINT "Over 500 Cycles:"
PRINT
REM ROM version 500 times.
LET time=t()
for i=1 to 500
b=SQR(a)
next i
PRINT "Rom routine: ";t()-time;" Frames."
REM MY version 500 times.
LET time=t()
for i=1 to 500
b=SQRT(a)
next i
PRINT "Fast routine: ";t()-time;" Frames."
PRINT
PRINT "PRESS A KEY"
PAUSE 1: PAUSE 0
CLS
PRINT "NUM FAST INTEGER"
REM show it's as accurate
for i=1 to 15
LET a=INT(rnd * 32768)
PRINT a;TAB 6;SQRT(a);TAB 16;SQRT16(a)
next i
PRINT
PRINT "Over 500 Cycles:"
PRINT
REM MY version 500 times.
LET time=t()
for i=1 to 500
b=SQRT(a)
next i
PRINT "Fast routine: ";t()-time;" Frames."
REM MY Integer version 500 times.
LET time=t()
for i=1 to 500
b=SQRT16(a)
next i
PRINT "Integer routine: ";t()-time;" Frames."
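For anyone who doesn't read calculator byte codes, the Newton-Raphson idea behind the SQRT routine above can be sketched in plain ZX BASIC. This is an illustration only, not a replacement: the name NRSQRT, the starting guess and the cap of 40 passes are my own choices for the sketch, and it will be far slower than the assembler version. Each pass replaces the guess with the average of the guess and x/guess, and stops early once the guess stops changing.

```basic
REM Sketch only: Newton-Raphson square root in plain BASIC.
FUNCTION NRSQRT(x AS FLOAT) AS FLOAT
    DIM guess, previous AS FLOAT
    DIM i AS uByte
    REM zero or negative input: just return 0 in this sketch
    IF x <= 0 THEN
        RETURN 0
    END IF
    LET guess = x
    FOR i = 1 TO 40
        LET previous = guess
        REM new guess = average of guess and x/guess
        LET guess = (guess + x / guess) / 2
        IF guess = previous THEN
            RETURN guess
        END IF
    NEXT i
    RETURN guess
END FUNCTION
```

Starting from guess = x is a poor first guess for large numbers, because the early passes only roughly halve it. That is why the assembler routine halves the exponent first: it starts much closer and needs far fewer passes.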
|
|
|
| How to make your code faster |
|
Posted by: britlion - 2010-02-11, 01:20 AM - Forum: How-To & Tutorials
- Replies (5)
|
 |
One of the reasons you are probably looking at this is that you have some idea how to program in Sinclair BASIC, and no idea how to code in machine code (or Z80 assembler, as it's sometimes called). You want to go and play with the old Spectrum stuff, and want faster programs - and it must be easier these days, right?
Well, with Boriel's compiler, it is. Most programs can be put into the compiler in a form almost identical to an original sinclair basic program, and it will work. It will be faster. But you want to make it as fast as you can, right?
First thing then: variable types. (see http://www.boriel.com/wiki/en/index.php/ZX_BASIC:Types for details on what variable types the compiler supports.)
Nothing you can do to your program will make as big a speed increase as making sure you use the smallest variable type possible in every case. A byte is better than an integer is better than a long and all those are better than using floating point numbers if you can avoid them.
Have a look at this program:
Code:
FUNCTION t AS ULONG
    RETURN INT((65536 * PEEK (23674) + 256 * PEEK(23673) + PEEK (23672)))
END FUNCTION
DIM i,j,k,fake as <insert type here>
DIM time as uLong
let fake=0
CLS
PRINT "Loop Start"
LET TIME=t()
for k=1 to 20
    for j=1 to 125
        for i = 1 to 125
            LET fake=fake+1-(fake/2)
        next i
    next j
next k
PRINT "loop End"
print t()-TIME
If we set the type of variable for i,j,k,fake as FLOAT up there at the top, this program will disappear for ages before it reports that it took 119,551 frames to come back. That's almost 40 minutes! If you change the type of variable there to UBYTE it comes back in 839 frames. That's under 17 seconds. To put it another way, the code runs over 142 times faster. Variable types make a BIG difference!
NOTE: The nearest Sinclair BASIC equivalent of this program runs in 235,726 frames, or just over 78 minutes to do the same thing. Even using the same variable types as Sinclair BASIC (Which always uses five byte FLOAT types), a compiled program is quite a lot faster!
For the above program, here are the times, in frames. (A frame is 1/50th of a second. Divide by 50 to get a time in seconds if you want - I left it this way to make the speed comparison easier.)
Code:
uByte = 839
Byte = 861
uinteger= 1126
integer = 1178
uLong = 31792
Long = 32895
Fixed = 36711
Float = 119551
The rule is use the smaller one every time you can, especially in loops! If you're only going round a for/next loop about 10 times, use uByte.
If you can get away with positive numbers, unsigned types (uByte, uInteger and uLong) are a little bit faster than signed ones.
You may also be able to eliminate floating point numbers by multiplying up - for example store $3.02 as 302 pennies.
[Note: In computing terms generally (not just on the Spectrum) there are good reasons not to store money in floating point numbers anyway - floating point numbers are NOT perfectly accurate and you may get rounding errors that could cause problems later on. Just as in decimal you can't write 1/3 without an infinitely long 0.33333333->forever happening, you can't store something like 0.1 in binary without an infinitely long binary number. So, far better to store currency as the smaller unit in an integer or long type. A long would allow you to keep track of up to +/- 2,147,483,647 pennies - or about 21 million currency units. If you want to track more than that you can definitely afford a more powerful computer than a Spectrum!]
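As a small illustration of the multiplying-up idea, money can be kept as whole pennies in a Long and only split into dollars and pennies when it is printed. This is a sketch: the variable names are made up for the example, and it assumes amounts are not negative.

```basic
REM Sketch only: keep money as whole pennies in a Long.
DIM pennies, cents AS Long
LET pennies = 302 : REM $3.02 stored as 302 pennies
LET pennies = pennies + 199 : REM add $1.99, giving 501 pennies
LET cents = pennies MOD 100 : REM the pennies left over after whole dollars
REM integer division gives the whole dollars
PRINT "$"; pennies / 100; ".";
REM pad single-digit pennies with a leading zero
IF cents < 10 THEN
    PRINT "0";
END IF
PRINT cents
```

All the arithmetic here is integer, so there is nothing to round, and it avoids the FLOAT type that the timings above show to be by far the slowest.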
|
|
|
|