ASCIIZ strings

I'm having a bit of trouble writing a routine to find the length of zero-terminated ASCII strings. If anyone has good references, or code snippets they have written that would make a good example, any help would be much appreciated.

Comments

  • The following is one of my assignments. At the very beginning it counts the length of a "$"-terminated string (the sub AX,2 drops the trailing 10, 13 line-ending bytes from the count). You can do the same thing for a 0-terminated string by comparing against 0 instead of "$".

    [code]
    ;Data Dictionary
    ; (data)
    ;Exist: a boolean to indicate exist or not
    ;Source: str
    ;Target: Target
    ;AnsYes: string displayed if exist
    ;AnsNo: string displayed if not exist
    ;Prefix1,Prefix2: strings make the display informative
    ; (not initialized)
    ;PtrStr,PtrTar: points to element of strings
    ;StrLen,TarLen: length of strings


    [BITS 16] ;16 bit code
    [ORG 0x0100] ;will assemble to .com file

    [SECTION .text]
    Init:
    mov byte [Exist], 0
    mov DX, 0
    mov [TmpDX], DX

    DispProblem:
    mov AH, 9
    mov DX, Prefix1
    int 21H
    mov DX, Source
    int 21H
    mov DX, Prefix2
    int 21H
    mov DX, Target
    int 21H

    CountLenStr:
    ;initialize the loop
    mov AX, 0 ;use AX to store the StrLen temporary
    mov BX, Source ;use BX to be temporary pointer
    ;the loop
    jmp FlowCtrl1
    Increment1:
    inc AX ;increase the length
    inc BX ;step forward the pointer
    FlowCtrl1:
    cmp byte [BX], "$"
    jne Increment1
    ;update mem data
    sub AX,2 ;don't count the trailing 10, 13
    mov [StrLen],AX
    mov BX, Source
    mov [PtrStr], BX

    CountLenTar:
    ;initialize the loop
    mov AX, 0
    mov BX, Target
    ;the loop
    jmp FlowCtrl2
    Increment2:
    inc AX
    inc BX
    FlowCtrl2:
    cmp byte [BX], "$"
    jne Increment2
    ;update mem data
    sub AX,2 ;don't count the trailing 10, 13
    mov [TarLen],AX
    mov BX, Target
    mov [PtrTar], BX

    Search:
    ;initialize the counter
    mov CX, [StrLen]
    sub CX, [TarLen]
    add CX, 1
    mov [TmpCX], CX ;temp CX for internal loop
    ;the loop
    jmp FlowCtrl3
    Increment3:
    dec CX
    mov [TmpCX], CX
    ;init internal loop counter
    mov CX, [TarLen]
    ;internal loop
    jmp FlowCtrl4
    Increment4:
    dec CX
    ;load first char
    mov BX, [PtrStr]
    add BX, CX
    add BX, [StrLen] ;BX=CX+[PtrStr]+[StrLen]
    sub BX, [TmpCX] ; -[TarLen]-[TmpCX]
    sub BX, [TarLen]
    mov AL, byte [BX] ;use BX to load a char to AL
    ;load the second char
    mov BX, [PtrTar]
    add BX, CX
    ;compare chars
    cmp AL, Byte [BX] ;1) if chars do not match
    jne FlowCtrl3 ; go to next outer loop
    cmp CX, 0 ;2) if chars match and no
    je WhenExistBegin ; more target chars, exist
    FlowCtrl4:
    cmp CX, 0
    jne Increment4
    FlowCtrl3:
    mov CX, [TmpCX]
    cmp CX, 0
    jne Increment3


    jmp WhenExistEnd
    WhenExistBegin:
    mov byte [Exist], 1
    mov DX, [StrLen]
    sub DX, [TmpCX] ;DX=[StrLen]-[TmpCX]-[TarLen]
    sub DX, [TarLen] ;save DX, since DX is needed for the interrupt
    mov [TmpDX], DX
    WhenExistEnd:


    DispResultNo:
    mov DX, AnsNo
    cmp byte [Exist], 1
    jne No
    mov DX, AnsYes
    No:
    mov AH, 9H
    int 21H

    ForTestByTA:
    mov DX, [TmpDX] ;prepare DX for checking

    Terminate:
    mov AX, 04C00H
    int 21H


    [SECTION .data]
    Source db "abcdefgdododthijk", 10, 13, "$"
    Target db "dodt", 10, 13, "$"
    Exist db 0
    AnsYes db "Target", 32, "in", 32, "str", 10, 13, "$"
    AnsNo db "Target", 32, "not", 32, "in", 32, "Str", 10, 13, "$"
    Prefix1 db "str=", "$"
    Prefix2 db "Target=", "$"

    [SECTION .bss]
    PtrStr resb 2
    PtrTar resb 2
    StrLen resb 2
    TarLen resb 2
    TmpCX resb 2
    TmpDX resb 2
    [/code]
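
    Adapting the counting loop above to a zero-terminated string might look like the sketch below. ZStrLen, ZNext, ZCheck and Msg are names made up for this example, and it assumes DS already points at the string's segment (true in a .com file). It has not been run through an assembler here.

    [code]
    [BITS 16] ;16 bit code
    [ORG 0x0100] ;will assemble to .com file

    [SECTION .text]
    mov BX, Msg ;DS:BX = address of the string
    call ZStrLen ;AX = 5 here, for the Msg below
    mov AX, 04C00H ;terminate
    int 21H

    ;ZStrLen (hypothetical name)
    ;In: DS:BX = address of a 0 terminated string
    ;Out: AX = number of bytes before the 0, BX = address of the 0
    ZStrLen:
    mov AX, 0 ;length so far
    jmp ZCheck
    ZNext:
    inc AX ;count this character
    inc BX ;step forward the pointer
    ZCheck:
    cmp byte [BX], 0 ;terminator reached?
    jne ZNext
    ret

    [SECTION .data]
    Msg db "hello", 0
    [/code]

    The only real change from the "$" version is the compare value, and there is no sub AX,2 because this string carries no line ending before its terminator.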
  • [code]
    ;Nasm data
    STRING DB "123456789ABCDEF0",0 ;string +nul ending
    STRleng EQU $-STRING ;length of string + its nul
    DB 0 ;if the nul were here instead, STRleng would be the string's length

    ;Nasm code
    CLD ;make LODSB step forward through the string
    XOR CX,CX ;CX=0 counter
    MOV SI,STRING ;DS:SI = addr of string
    TOP: INC CX
    LODSB
    OR AL,AL ;check for zero
    JNZ TOP ;loop
    DEC CX ;remove the nul from the count
    JZ NOlength ;error, no length (JCXZ works too)

    ;another way is
    MOV BX,STRING ;or MOV BX,STRING-1 and drop the DEC BX below
    DEC BX
    TOP2: INC BX ;renamed so it does not clash with TOP above
    CMP BYTE [BX],0 ;DS:BX = string ptr
    ; CMP AL,[BX] ;would be faster if AL=0 or another reg
    JNZ TOP2
    SUB BX,STRING ;end ptr minus STRING addr: BX=length
    JZ NOlength
    [/code]

    SCASB (and CMPSB, for comparing two strings) might be other instructions to look into,
    along with the REPE or REPNE prefixes.

    I hope that helps somehow.
    Bitdog


  • I do this on a regular basis, and here is the subroutine I use. The code could be made smaller by using an INC/CMP/JNE set of instructions, but this code is faster for long strings (a single REPNE SCASB instruction is faster than several independent instructions). I normally use the A86 Assembler, so the format of this may not be exactly what you're expecting:

    ;-----------------------------------------------------------------------
    ;CALCULATE THE LENGTH OF AN ASCIIZ STRING
    ;Inputs: DS:[DX] = Pointer to the string
    ; CLD already issued
    ;Outputs: CX = Length of the string
    ; ZF = Set if string is 0 length
    ; = Clear if not
    ;Changes:
    ;-----------------------------------------------------------------------
    CalcStrSizeDX:
    PUSH AX,DI,ES ;Save used registers
    MOV ES,DS ;Point ES:[DI]
    MOV DI,DX ; at the string
    XOR AL,AL ;Look for a 0
    MOV CX,-1 ;Start with max count for REPNE
    REPNE SCASB ;Find the end of the string
    MOV CX,DI ;Need the character count in CX
    DEC CX ;Adjust and calculate
    SUB CX,DX ; the number of characters (this also sets ZF)
    POP ES,DI,AX ;Restore used registers
    RET

    The above subroutine is called by the following subroutine, which is what I normally use to output messages to the screen. I prefer using ASCIIZ strings for messages instead of $-terminated strings for two reasons. They are aesthetically MUCH easier to look at, and I can write things to either the CON device (handle 1, standard output) or the ERR device (handle 2, standard error), which lets me control whether or not DOS is allowed to redirect the output away from the screen.

    ;-----------------------------------------------------------------------
    ;WRITE ASCIIZ STRING TO CON OR TO ERR
    ;Inputs: DS:[DX] = Pointer to string
    ;Outputs: Writes the string to CON or ERR
    ;Changes:
    ;-----------------------------------------------------------------------
    WriteZCon: ;Write to CON
    PUSH BX ;Save used register
    MOV BX,1 ;Device #1 (CON)
    JMP >Z10 ;Jump to do it
    WriteZErr: ;Write to ERR
    PUSH BX ;Save used register
    MOV BX,2 ;Device #2 (ERR)
    Z10: ;Write to CON or ERR
    PUSH AX,CX ;Save used registers
    CALL CalcStrSizeDX ;Calculate the size of the string (returns CX)
    JZ >Z80 ;If nothing to write, just quit
    MOV AH,40h ;Function 40h (Write to Device)
    INT 21h ;Do it
    Z80: ;We're done
    POP CX,AX ;Restore used registers
    POP BX ;Restore used register
    RET
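
    For anyone assembling with NASM rather than A86, the two routines above might translate roughly as in the sketch below. It has not been assembled or run here. The A86 multi-register PUSH/POP and MOV ES,DS are spelled out as individual instructions, the A86 local labels (Z10, Z80) become WriteZ and .done, and the Start code and the Hello string are made up for the example.

    [code]
    [BITS 16]
    [ORG 0x0100]

    Start:
    cld ;string instructions must count upward
    mov dx, Hello ;DS:DX = pointer to the string (CS=DS in a .com)
    call WriteZCon
    mov ax, 0x4C00 ;terminate with return code 0
    int 0x21

    ;In: DS:DX = pointer to ASCIIZ string, CLD already issued
    ;Out: CX = length, ZF set if the length is 0
    CalcStrSizeDX:
    push ax
    push di
    push es
    push ds ;ES = DS, done with PUSH/POP because the CPU
    pop es ; has no MOV between segment registers
    mov di, dx
    xor al, al ;look for a 0
    mov cx, -1 ;maximum count for REPNE
    repne scasb ;DI ends up one past the 0
    mov cx, di
    dec cx
    sub cx, dx ;CX = length (this also sets ZF)
    pop es ;POP leaves the flags alone
    pop di
    pop ax
    ret

    WriteZCon: ;write to handle 1
    push bx
    mov bx, 1
    jmp short WriteZ
    WriteZErr: ;write to handle 2
    push bx
    mov bx, 2
    WriteZ:
    push ax
    push cx
    call CalcStrSizeDX ;returns CX
    jz .done ;nothing to write
    mov ah, 0x40 ;function 40h: write CX bytes at DS:DX to handle BX
    int 0x21
    .done:
    pop cx
    pop ax
    pop bx
    ret

    Hello db "Hello from an ASCIIZ string", 13, 10, 0
    [/code]

    The ZF result survives the register restores in CalcStrSizeDX because POP does not change the flags, which is what lets the caller branch with JZ straight after the CALL.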
  • Your coding got me messing around with it a bit & I came up with this.
    [code]
    ;;Nasm macro gets string length in CX.
    ;; STRLENm LABEL,0
    %macro STRLENm 2 ;Receives:
    MOV DI,%1 ; ES:DI = address of string (ES must be already set, CLD issued)
    MOV AL,%2 ; AL = end char, nul=0, $=36, LF=10, .=46, space=32
    MOV CX,-1 ;Returns:
    REPNE SCASB ; ES:DI = ptr to byte after end char (nul,LF,etc)
    NOT CX ; CX= string length (counting the end char)
    %endmacro

    ;Example code gets the length of a nul ending string
    PUSH CS ;push the seg that LABEL is in
    POP ES ;set ES
    STRLENm LABEL,0 ;LABEL=offset, 0=End_Of_String (EOS) character
    DEC CX ;Remove nul from CX= string length.
    JZ NOlength ;If CX=0 there is no string length.
    [/code]
    [green]
    The NASM macro above uses NOT CX to get the string length,
    which is fast & small coding.

    The macro allows any End Of String (EOS) character,
    so you can get the length of a regular sentence including the period or CRLF.

    Following it with DEC CX & JZ NOlength checks for 0 length strings
    & removes the nul=0 ending character from CX = the string length.

    Using a macro basically removes a PROC's CALL & RETurn, which speeds things up,
    but adds 4 bytes more than a call to a STRLEN proc would.
    The guts of the macro (MOV CX,-1 / REPNE SCASB / NOT CX) = 7 bytes,
    CALL LABEL = 3 bytes, 7-3=4 is the math.
    Align 16 for a proc & the proc size itself cost bytes,
    so if the macro is only used a few times, there is a size saving.
    [/green][code]
    ;A macro can also be used to make a proc.
    MYPROC: PUSH regs ;Call with, DX=string offset & AH=end character
    PUSH CS
    POP ES
    STRLENm DX,AH
    POP regs
    DEC CX ; Returns, CX=length & ZF set if no string length
    RET ; JZ NoLENGTH
    [/code][green]
    An included macro does not add size to your executable unless it's used,
    whereas any included proc does, whether it's used or not.
    (Include libraries & proc groups in include files, do this often?)
    (IFDEF stuff is a workaround, but doubles the work.)

    For speed & size, macros or procedures should NOT save/restore regs.
    Then if a reg needs preserving, add the PUSH/POP code for the needed regs.
    You may have debugged code that saves & restores all the regs a few times,
    then alters most of the regs right after that,
    which makes all the save/restore code a waste of size, clocks, & work.

    I could be off on all this stuff above,
    but there might be some usable ideas in there somewhere,
    and I'm open for suggestions or error corrections.

    Bitdog
    PS, what do you think about the PUSH/POP procedure below ?
    Your procedure can CALL PUSHALL at its start & CALL POPALL at the end.
    [/green]
    [code]
    align 16
    PUSHALL:
    POP WORD [$+13]
    ; PUSH AX ; PUSH ALL the regs except AX (most regs anyway)
    PUSH BX
    PUSH CX
    PUSH DX
    PUSH DI
    PUSH SI
    PUSH BP
    PUSH DS
    PUSH ES
    DB 0x68 ;= PUSH WORD
    DW 0 ;= return address that gets pushed
    RET

    ;align 16 ;this proc is already aligned at 16
    POPALL: POP WORD [$+13] ; store return address at PUSH WORD DW below
    POP ES ; POP ALL the regs except AX
    POP DS
    POP BP
    POP SI
    POP DI
    POP DX
    POP CX
    POP BX
    ; POP AX
    DB 0x68 ;= PUSH WORD
    DW 0 ;= return address that gets pushed
    RET
    [/code][green]
    ;These procs are 32 bytes in size, total.
    ;They save 5 bytes of code every time they are called/needed.
    ;CALL PUSHALL = 3 bytes & PUSH 8 regs is 8 bytes, 8-3= 5 bytes saved.
    ;But it is slow coding & if a proc only needs 2 regs pushed (=2 bytes)
    ;it wastes a byte at each call + a lot of clocks.
    ;Some coding doesn't need speed though. (kbd input code can be slow?)
    ;Note: the procs can only push 8 regs each & still stay aligned.
    ;There are 9 regs total, so one reg has to be commented out in each proc.
    [/green]

  • The main point of the subroutine was to show the advantage of using the REPNE SCASB instruction, as opposed to doing a CMP/JNE/INC type of loop. Implementation-specific issues for any particular program would involve:

    "Better" to implement as a Macro, PROCess, Subroutine, or in-line?
    Need to worry about saving the Registers?
    Need a Flag (ZF) indicating a 0-length string?
    Can simply assume that ES is already set correctly?
    What is the maximum string length?

    Using a NOT CX or a NEG CX (followed by a DEC CX or SUB CX,2 to compensate for the end-of-string byte) limits the string to 32766 bytes, while the original code (MOV CX,DI / DEC CX / SUB CX,DX) allows for strings up to 65534 bytes. Granted, ASCIIZ strings are almost never more than a few hundred bytes long. The original code is more general, even though it is slightly less efficient for "normal" strings.

    ******

    From a philosophical perspective, let me tell you where I'm coming from. My programs tend to look different than most other people's programs, for a few reasons. I use the A86 and A386 Assemblers, rather than MASM or NASM as most people use. ASSUMEs and PROCesses are basically foreign concepts in A86. As Eric Isaacson states in his A86 documentation, the concepts of ASSUMEs and PROCesses have created far, far more confusion over the years than they have ever solved. This is especially true in a program that is small enough to get compiled into a .COM format instead of an .EXE format.

    Also, nearly all of my programs are TSR's. Maintaining the integrity of the Stack, Registers, and Flags is absolutely imperative in the Resident portion of a TSR. Also, it is fairly critical to keep the size of the Resident code as small as possible (size generally taking precedence over speed, though the two are highly related). Also, I tend to write in a fairly structured manner, dividing things up into very well defined subroutines. The subroutines have definite inputs and outputs (Registers and Flags), and each subroutine is designed so that it can pretty much stand-alone. If even a small stretch of code needs to be used more than once, I find it is almost always better to implement it as a Subroutine rather than a Macro (to keep the overall program size as small as possible). I also try to keep a subroutine small enough so that I can see the entire subroutine code on a 50-line screen at the same time.

    I have found that by following these rules, my Code is relatively easy to maintain. My Code does have a lot of CALLs and RETs, and also a lot of PUSHes and POPs (a lot more than most other people's programs, anyway). This overhead associated with using the Stack so much probably makes a program a little bit slower than it could otherwise be, but I also believe it is much easier to maintain. Also, with the caching capabilities of modern CPU's, a re-used subroutine can actually be processed faster than a Macro, which won't be cached at runtime. I have found that in all but the simplest TSR's, it is just too difficult for me to keep track of all the Registers and Flags and Variables at a "global" level. I have found that I really need to manage things at a Subroutine Level (basically, one screen at a time), or I get myself so confused that I never get anywhere.

    While maintaining the integrity of everything is absolutely critical in the Resident portion of a TSR, it is usually not very important in the Transient portion (or in a non-TSR program). Unfortunately, I can't make myself change habits even when I'm working on non-Resident code. I end up writing almost everything as small subroutines, instead of using Macros and more linear-type code. It is simply the "style" of coding I have adopted. I also find at times that I need to "move" a Transient piece of Code into the Resident portion of a program (for one reason or another), and having it in the correct "format" ahead of time usually makes things a whole lot easier.

    Hope that makes sense.



  • "Also, with the cacheing capabilities of modern CPU's, a re-used subroutine can actually be processed faster than a Macro, which won't be cached at runtime."

    Wouldn't the repeated macro code be cached when read from the HD or other device? Something tells me I'm wrong, like separation of data and code caches.
  • I think you might be confusing two different kinds of Caches. There can be Caches associated with Mass Media devices (Floppies, Hard Drives, CD-ROMs, etc.). Sometimes those devices have Internal Caches, but there can also be external Caches (like SMARTDRV) that use RAM to Cache data from Mass Media devices.

    The Cache that is being referred to here is the CPU Cache, which is different. The "memory" associated with the CPU Cache is even faster than RAM, and is used to Cache data that has already made it into RAM. EVERYTHING that is in RAM originally came from somewhere else (ROM, battery-backed CMOS, Floppy, Hard Drive, Keyboard, Mouse, Network, etc.). When the computer first turns on, the RAM is completely void.


    I could be wrong about how the CPU Cache works, though. From what I understand, Code that has been run "recently" will be in the Cache. The way the Cache is able to know what was run recently will be from the Address (CS:IP) of the Code, not from the contents of the Code itself. At assembly time, Macro Code is expanded and written (effectively copied over and over again) to the assembled program each time it is referenced in the Source Code. Every time this happens, it is in a different place (different CS:IP) in the end program.

    A subroutine is not copied multiple times like a Macro is. It is only stored once, and then CALLed each time it is needed, and its Address does not change. That being said, I believe modern CPUs also have some "look-ahead" type capabilities in the Cache, where they try to load several instructions directly after the one currently being processed into the Cache, also. So, as long as the Code is fairly linear, Code is effectively being Cached even if it has not already been Run. So, in the end, it may not make much difference whether it is a Subroutine or a Macro (at least from a Caching perspective).

    The real essence of all this is to point out the fact that while Macros may provide for more efficient Coding, they do not necessarily provide for more efficient Programs.

  • A macro expands into code at assembly time & can be used in a procedure.
    A macro does not have to be used or have to be used multiple times.
    It can be used once, most any where, even in a proc.
    At run time the CPU can't tell the difference between one piece of code & another.
    The CPU doesn't know what bytes to cache, and
    what bytes NOT to overwrite on the next read,
    so stating that procs get cached and macros don't sounds like
    you're just making stuff up as you go, Bret.

    NOT CX handles counts from 0 to 65535, not 32766 as you stated.
    You're confused about signed & unsigned.
    REPNE SCASB treats CX as unsigned, and
    NOT CX is a good quick code tip for the code you posted.

    I read through your post and found too many contradictions.
    You claim quick code, then paragraphs justify your many slow coding
    techniques.
    I started picking apart your post, and it got so long
    that it made me look bad so I didn't post it.

    You can code any way you want, but good tips are appropriate posts.
    Your REPNE SCASB was a good tip & so was my NOT CX.
    You don't have to use any tips here that you don't want to use.

    Sometimes what happens to me is, I first say to myself NO,
    then after thinking about the tip, I find I'm working it into my code
    and it was a good tip. You might try experimenting with NOT CX ?
    I'm going to look into caching.

    Bitdog

  • You're correct about NOT CX working correctly. I thought CX needed to be adjusted by one once it crossed into positive numbers, but it doesn't. I stand corrected, and will be changing things in my programs.
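
    To spell out the arithmetic, treating CX as an unsigned 16-bit value: REPNE SCASB decrements CX once for every byte it examines, including the end byte, so after a string of n characters plus its end byte CX = FFFFh - (n+1). NOT CX is the same as FFFFh - CX, which gives n+1 back. The values in the table are worked by hand from that formula, not taken from a run:

    [code]
    MOV CX,-1 ;CX = FFFFh
    REPNE SCASB ;CX = FFFFh - (n+1)
    NOT CX ;CX = n+1 (length counting the end byte)
    DEC CX ;CX = n, ZF set when n = 0

    ; n      CX after REPNE SCASB   after NOT CX   after DEC CX
    ; 0      FFFEh                  0001h          0000h
    ; 1      FFFDh                  0002h          0001h
    ; 32766  8000h                  7FFFh          7FFEh
    ; 32767  7FFFh                  8000h          7FFFh
    ; 65534  0000h                  FFFFh          FFFEh
    [/code]

    Nothing special happens between n = 32766 and n = 32767. The sign bit of CX changes there, but neither REPNE nor NOT cares about the sign.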

    You're also correct in that a Macro does not need to be copied more than once (or even at all), and that they can be used in Subroutines. My point is that some people use Macros as if they were Subroutines (issuing them multiple times in the same program), and simply assume that using Macros "automatically" creates efficient Programs. That is not necessarily the case.

    You're also correct in stating that a Cache actually doesn't know what it should overwrite and what it should keep. So, a Cache pretty much needs to keep everything until it starts getting full, and then start to overwrite what is old and hasn't been reused. One thing a Cache won't do is compare two pieces of code to see if they are the same, and just use the already-Cached copy of it (which can happen if you use a Macro as if it were a Subroutine).

    We're definitely getting into philosophical discussions here, and this is absolutely the wrong format for those. If you want to take it off-line, we can certainly do that.


  • On line is fine. I don't have any issues.
    I got something out of the posts.
    I made a string length macro that works for me in the coding I use. Your post enlightened me to the fact that a maximum string length of 0xFFFF is just fine, as long as you know that the string DOES HAVE the end byte you're looking for. Nul, LF, etc.
    I failed to look at REPNE SCASB that way before
    so NOT CX never came to mind as a quickie code possibility.
    So Thankx, I needed that.

    Then my included PUSHALL proc post turned out to be a failure,
    because it requires DS to be aimed at CS at CALL
    which isn't always the case. So I'm working on a new version.
    JMP $+some number
    doesn't require a seg reg set since it uses CS, so I thought
    POP WORD [$+some number] was the same, but it wasn't.
    Brackets change it to an offset address that uses DS as default. (Darn it)

    Bitdog
    PS, you took my bad post fairly well.
    I fumed over yours for a day or so.
    I'm getting over it though.

  • [blue]It is all good, but one thing: the CPU also tries to predict where jumps will go, and a correctly predicted jump costs very little. Example: a simple loop -

    LABEL_0001:
    ...
    LOOP LABEL_0001

    Normally a taken jump makes the CPU throw away the instructions it has already fetched and start fetching again from the target address. But not in the above case.

    After a couple of jumps to the same spot (the address of LABEL_0001) the CPU expects the jump and keeps feeding the body of that loop from the cache without refetching. It is called [b]branch prediction[/b]. Modern CPUs have some tricky algorithms to 'predict' jumps.

    The jump will be mispredicted only once - when the LOOP ends and execution falls through to the next instruction AFTER the LOOP.

    BTW, recently I was optimizing my IDE and I found out that LOOP is slower than using the pair:

    DEC REG32
    JNZ LOOP_BEGIN_ADDRESS

    I was surprised...[/blue]
  • If one was concerned about whether the End-of-String character was actually going to be there or not, you could calculate and set CX at the beginning of the routine to make sure that DI doesn't "roll over" to 0. Then, immediately after the REPNE SCASB, you could do a JCXZ test to handle the error. Of course, if you did that, you couldn't use the NOT CX trick any more to calculate the size, and would need to do something more like my original code.
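
    A bounded scan along those lines might look like the sketch below (NASM syntax, not assembled or run here). MAXLEN, BoundedLen and the carry-flag error convention are all assumptions made up for the example. One wrinkle: CX also reaches 0 when the end byte happens to be the very last byte allowed, so this sketch tests ZF (which SCASB sets on a match) instead of using JCXZ.

    [code]
    MAXLEN equ 256 ;hypothetical limit, counting the end byte

    ;In: ES:DI = pointer to string, AL = end character, CLD already issued
    ;Out: CF clear and CX = length (not counting the end byte),
    ; or CF set if no end byte was found within MAXLEN bytes
    BoundedLen:
    mov cx, MAXLEN
    repne scasb ;stops on a match or when CX reaches 0
    jne .missing ;ZF clear = the last byte compared did not match
    neg cx
    add cx, MAXLEN-1 ;CX = MAXLEN - 1 - (old CX) = bytes scanned - 1
    clc ;the ADD above can leave CF set, so clear it
    ret
    .missing:
    stc
    ret
    [/code]

    With a fixed count like this the length has to be rebuilt from MAXLEN and the leftover CX, which is the extra work mentioned above compared with the plain NOT CX version.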

    Whenever I've had to deal with strings in the past, they've either been strings inside my program (that I have complete control of), or are command-line parameters or environment variables that I simply assume DOS sets up properly. So, I've never concerned myself with that type of error. Truly general-purpose, idiot-proof code would need to handle those kinds of situations, though.


    Like I said, I deal a lot with TSR's. Two things that STILL trip me up from time to time, even after all these years, are Segment Registers (usually, assuming CS=DS=ES=SS when it's not true), and Flags (usually, assuming STI and/or CLD when it's not true).

    One possibility to consider for your PushAll Macro might be to simply use a CS override:

    POP W CS:[$+x]

    If your program is in a .COM format (where CS=DS=ES=SS by default), that might work. If it's in an .EXE format (where it's hard to tell exactly what the Segment Registers might be), then it may not work. I haven't looked at or experimented with your code in enough detail to sort it all out.


  • "[purple]One possibility to consider for your PushAll Macro might be to simply use a CS override:

    POP W CS:[$+x]

    If your program is in a .COM format (where CS=DS=ES=SS by default), that might work. If it's in an .EXE format (where it's hard to tell exactly what the Segment Registers might be), then it may not work. I haven't looked at or experimented with your code in enough detail to sort it all out.[/purple]"

    It wouldn't matter what executable format he was using, because of the CS override. Hehe, why not just use PUSHA(D)/POPA(D), unless you need to keep certain registers from being overwritten by the POP? For example:
    [code]
    [section .text]

    ;...

    Push_AX_BX_CX_DX:
    pop word [cs:TempIP]
    push ax
    push bx
    push cx
    push dx
    jmp word [cs:TempIP]

    Pop_AX_BX_CX_DX:
    pop word [cs:TempIP]
    pop dx ;pop in the reverse order of the pushes
    pop cx
    pop bx
    pop ax
    jmp word [cs:TempIP]

    ;...

    [section .bss]

    TempIP resw 1

    [/code]
  • That's interesting info Asmguru62
    Something I never really considered before.
    I don't quite know how to apply it with Bret's info
    so I can write improved code yet though.

    Correct me if I'm wrong on these guesses below:
    I take it that a call in a loop gets the called proc in cache
    after a couple of calls using "branch prediction", which speeds things up
    (apart from the CALL & RET instructions themselves, which cost a few clocks).

    I can't see a proc at the beginning of your code being called enough
    for it to qualify for "branch prediction"
    and then a call to the same proc at the end of your code an hour later
    expecting the code to still be held in cache.

    Bitdog
    PS, Bret, after trying everything I could think of, the CS override
    was the only thing that allowed DS to be anywhere at call in PUSHALL proc.

    POP W CS:[$+x] ;this worked, but it makes a lot of machine code/bytes.
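
    For reference, a NASM version of PUSHALL with the CS override might look like the sketch below. It is untested. It uses a local label (.retaddr, a name made up here) instead of a hand-counted $+x, because the override prefix adds a byte to the POP and shifts the offset. It still assumes the code segment is writable (as in real-mode DOS) and that PUSH imm16 is available, which needs a 186 or later.

    [code]
    PUSHALL:
    pop word [cs:.retaddr] ;store the return address in the PUSH below
    push bx
    push cx
    push dx
    push di
    push si
    push bp
    push ds
    push es
    db 0x68 ;= PUSH WORD
    .retaddr:
    dw 0 ;= return address that gets pushed (patched by the POP above)
    ret ;back to the caller, saved regs stay on the stack
    [/code]

    By my count that comes to 17 bytes rather than 16, so it no longer fits a single 16-byte slot. Because it patches an instruction only a few bytes ahead of where it is executing, it may also misbehave on older CPUs that have already prefetched the unpatched bytes.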
