Detailed Confusion

I've searched through many places for getting started with assembly, and the only assembler that makes a lick of sense is the one that comes with GNU CC (GCC). If you could please, hear me out and see how much I understand, I will make it entertaining.

I assembled the following working code (project.s) by:
gcc -o Program.exe project.s
[code]
CALLED:
.ascii "Hello, world!\n"
.string "line$: 2"
.globl _main
_main:
pushl %ebp
movl %esp, %ebp
movl $CALLED, (%esp)
call _printf
movl $0, %eax
leave
ret
[/code]

To start off around the main, < .globl _main > basically sets the _main label globally so that the assembler can find it, since that's what it looks for first.

The _main label shows the start of every assembly program of this type.

pushl is a command to put the frame pointer (Footnote:1) onto the stack. This somehow is required for some unknown fairy dust reason. The suffix 'l' stands for long, which denotes 32-bit (Footnote:2).

movl copies the stack pointer(esp) to the frame pointer (ebp). This is also because of how fairy dust controls the AT&T System V/386 assembler language.

The next movl command calls the CALLED label, and puts all those pretty variables into the stack pointer address in memory. (%address) is for RAM, %address is for registers.

Now the printf function is finally called with (%esp) as the parameter. Maybe %ebp is also a parameter; I'm not clear on that.

Next line just puts 0 into the return register. All values that are to be returned or that are returned from calls go into the %eax register.

ret just says the program is over, go home.
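
To check the walkthrough above, here is a small Python model of the same instruction sequence. It is a sketch rather than real assembly: the addresses, the `trace_main` name and the `reserve` parameter are all made up for illustration. In this model, printf only sees what sits at (%esp), not %ebp, and the store to (%esp) lands on the slot where %ebp was just saved. Compiler-generated code typically reserves stack space first with something like `subl $8, %esp`, which `reserve` imitates as an assumption.

```python
def trace_main(reserve=0, caller_ebp=0x2000, return_address=0x401000,
               called=0x403000):
    """Trace the posted _main on a toy 32-bit stack (all addresses invented)."""
    mem = {}
    esp = 0x1000
    ebp = caller_ebp

    # The startup code's `call _main` has already pushed a return address.
    esp -= 4
    mem[esp] = return_address

    # pushl %ebp: save the caller's frame pointer on the stack.
    esp -= 4
    mem[esp] = ebp

    # movl %esp, %ebp: the frame pointer now marks the saved-ebp slot.
    ebp = esp

    # Optional `subl $reserve, %esp` (an assumption, not in the posted code).
    esp -= reserve

    # movl $CALLED, (%esp): store the string's address where esp points.
    mem[esp] = called

    # call _printf: push a return address; printf finds its first
    # argument one slot above that, then returns and pops the address.
    esp -= 4
    mem[esp] = "address after call"
    printf_arg = mem[esp + 4]
    esp += 4

    # movl $0, %eax: the return value of main.
    eax = 0

    # leave: esp = ebp, then pop the saved frame pointer into ebp.
    esp = ebp
    ebp = mem[esp]
    esp += 4

    # ret: pop the return address and jump back to the startup code.
    returned_to = mem[esp]
    esp += 4

    return {"eax": eax, "ebp": ebp, "esp": esp,
            "printf_arg": printf_arg, "returned_to": returned_to}


if __name__ == "__main__":
    for reserve in (0, 8):
        state = trace_main(reserve=reserve)
        print(reserve, {k: hex(v) for k, v in state.items()})
```

With `reserve=0` the model ends with ebp holding the string's address instead of the caller's value; with `reserve=8` the caller's ebp comes back intact. Whether that matters in practice depends on what the startup code does after main returns.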


A random question:
What is the purpose of assembled object files built by AS.exe?

A few shouts of starting ASM:

Lots of people say to start with TASM, NASM, FASM; basically put a couple of random letters in front of ASM and I've heard it. But the most documentation I've seen comes with 'The Art of Assembly Language'. However, I have stereotyped HLA (High Level Assembly) as the same thing as using C or C++. I'm sure there is some good background info that can be learned from The Art, but I don't see any 'assembly language'.

I've footnoted all my sources for this AT&T language, which sums up to not a lot. If GCC changes everything from C++ into this AT&T language, then it's a very powerful low-level language that I want to learn.

What places, if any, have tutorials specifically for this AT&T language that is used by GNU AS (GAS)?



Footnote 1:
http://www.gnu.org/software/binutils/manual/gas-2.9.1/html_mono/as.html#SEC200
Footnote 2:
http://www.gnu.org/software/binutils/manual/gas-2.9.1/html_mono/as.html#SEC198
Another source:
http://en.wikibooks.org/wiki/X86_Assembly/GAS_Syntax

Comments

  • BitByBit_Thor Member Posts: 2,444
    : I've searched through many places for getting started with assembly,
    : and the only assembler that makes a lick of sense is the one that
    : comes with GNU CC (GCC). If you could please, hear me out and see
    : how much I understand, I will make it entertaining.
    :
    : I assembled the following working code (project.s) by:
    : gcc -o Program.exe project.s
    : [code]:
    : CALLED:
    : .ascii "Hello, world!\n"
    : .string "line$: 2"
    : .globl _main
    : _main:
    : pushl %ebp
    : movl %esp, %ebp
    : movl $CALLED, (%esp)
    : call _printf
    : movl $0, %eax
    : leave
    : ret
    : [/code]:
    :
    : To start off around the main, < .globl _main > basically sets the
    : _main label globally so that the [color=Red]assembler [/color]can find it, since that's
    : what it looks for first.
    :

    I'm afraid I can't answer that part. My ASM programming has been limited to the basic Intel command set and the flat binary format, but I would guess that what you say is correct if you replace 'assembler' with 'linker'.

    : The _main label shows the start of every assembly program of this
    : type.

    The C standard dictates that "int main(...)" is the entry point for the application. The prepended underscore ('_') follows the C symbol-naming convention of your toolchain, which prepends an underscore to each C function name (not every platform does this).


    : pushl is a command to put the frame pointer (Footnote:1) into the
    : stack. This somehow is required for some unknown fairy dust reason.
    : The suffix 'l' stands for long which denotes 32bit (Footnote:2).
    :

    The C calling convention dictates that the function preserves ebp (among others).
    It's common practice to set up ebp to be the entry value of esp (the stack pointer). That way, ebp is a solid, non-changing pointer to the beginning of the function stack. Anywhere in the function, ebp+8 is the first argument passed to the function, ebp+12 the second, and so on.
    The stack looks like this in your function (after pushing ebp):
    [code]
    esp + 12    Second parameter    ebp + 12
    esp + 8     First parameter     ebp + 8
    esp + 4     Return address*     ebp + 4
    esp         Saved ebp           ebp

    *) Any function (including _main) is called by a 'call' instruction
    The call instruction pushes the return address (the address of the
    instruction to execute after the call line) on to the stack and
    then jumps to the function
    [/code]
    Notice that at this point esp = ebp.

    Then you ask, why not use esp, since it's meant for that?
    Because of the following (note that I am unfamiliar with the GCC syntax, so I'm not going to put it in code):
    First you get the first parameter using esp+8. But then if you [italic]push[/italic] something on to the stack, you have to use esp+12. When you pop it off again it's back to esp+8, and if you push 20 things it'll be esp+88 for the first parameter.
    Using ebp, it's ALWAYS ebp+8. Nifty.
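
The esp-versus-ebp point can be sketched with a toy stack in Python. This is only a model under stated assumptions: `Stack32`, `enter_function` and `demo` are hypothetical names, the stack grows downward in 4-byte slots, and the caller pushes its arguments right to left before the call.

```python
class Stack32:
    """Toy downward-growing stack with 4-byte slots (illustrative only)."""

    def __init__(self, top=0x1000):
        self.mem = {}
        self.esp = top
        self.ebp = 0

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem.pop(self.esp)
        self.esp += 4
        return value


def enter_function(stack, first, second):
    """Mimic a call plus the usual prologue; return the address of `first`."""
    stack.push(second)            # caller pushes arguments right to left
    stack.push(first)
    first_address = stack.esp
    stack.push("return address")  # pushed by `call`
    stack.push(stack.ebp)         # pushl %ebp
    stack.ebp = stack.esp         # movl %esp, %ebp
    return first_address


def demo():
    """Record (esp offset, ebp offset) of the first parameter at three moments."""
    stack = Stack32()
    first_address = enter_function(stack, "arg1", "arg2")
    offsets = []

    def record():
        offsets.append((first_address - stack.esp, first_address - stack.ebp))

    record()                      # right after the prologue
    stack.push("temp")
    record()                      # one extra push
    stack.pop()
    for i in range(20):
        stack.push(i)
    record()                      # twenty extra pushes
    return offsets


if __name__ == "__main__":
    for esp_offset, ebp_offset in demo():
        print(f"first parameter: esp+{esp_offset}, ebp+{ebp_offset}")
```

In this model the esp-relative offset moves from 8 to 12 to 88 as values are pushed, while the ebp-relative offset stays at 8 throughout.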
  • qhimq Member Posts: 16
    Thanks, a lot of what you said made sense.

    : : A random question:
    : : What is the purpose of assembled object files built by AS.exe?
    :
    : GCC is a C compiler, but it can not compile ASM code. You compile
    : the ASM code using AS.EXE into a machine code with still unresolved
    : external labels (such as _printf is still unresolved after AS.EXE).
    : Then, the GCC compiler compiles the _printf function and creates an
    : object file for that.
    : The linker solves the 'externals' (basically fills in an address for
    : _printf) and throws it all together in an executable format.
    :
    : To better understand this, look into how C files are turned into an
    : executable (compiling and linking):
    :
    So if I understand the process right:
    GCC compiles the .C to a .S
    AS.exe assembles the .S files into an object file...
    ld.exe links the object file with some other stuff (libraries?) to create a .exe
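
As a rough illustration of what "resolving externals" means, here is a toy linker in Python. It is a sketch only: real object files and linkers are far more involved, and the `link` function, the dictionary layout and every size and address below are invented for the example.

```python
def link(objects, base=0x401000):
    """Toy linker: lay out defined symbols, then resolve every reference."""
    addresses = {}
    cursor = base
    for obj in objects:
        for symbol, size in obj["defines"].items():
            if symbol in addresses:
                raise ValueError(f"duplicate symbol: {symbol}")
            addresses[symbol] = cursor
            cursor += size

    resolved = {}
    for obj in objects:
        for symbol in obj["references"]:
            if symbol not in addresses:
                raise KeyError(f"undefined reference to {symbol}")
            resolved[(obj["name"], symbol)] = addresses[symbol]
    return addresses, resolved


# What the assembler might leave behind for project.s (sizes are invented).
PROJECT_O = {
    "name": "project.o",
    "defines": {"CALLED": 24, "_main": 20},
    "references": ["CALLED", "_printf"],
}

# Stand-in for the C library that supplies _printf (hypothetical contents).
C_LIBRARY = {
    "name": "libc",
    "defines": {"_printf": 64},
    "references": [],
}

if __name__ == "__main__":
    addresses, resolved = link([PROJECT_O, C_LIBRARY])
    for (obj_name, symbol), address in resolved.items():
        print(f"{obj_name}: {symbol} -> {hex(address)}")
```

Linking `PROJECT_O` on its own fails in this model because nothing defines `_printf`, which is the sense in which the object file still has an unresolved external until the library is added.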

  • BitByBit_Thor Member Posts: 2,444
    : so If I understand the process right:
    : GCC compiles the .C to a .S
    : AS.exe assembles the .S files into an object file...
    : ld.exe links the object file with some other stuff(libraries?) to
    : create a .exe
    :

    I think GCC might be able to compile ASM code.
    If so, it would be: GCC compiles the .C into an object file
    The linker links the object file(s) with the libraries to create an exe.

    I think this because you said "I compiled the following working code with ... gcc". In which case, GCC is responsible for compiling the Assembly code (of course, it's possible that GCC simply automates the call to an assembler program for you).
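
The three stages discussed here can be written out as separate commands. The Python sketch below only builds and prints the command lines, it does not run them. `build_commands` is a hypothetical helper, and `project.c` is an assumed C source (the thread only has project.s). The `-S` and `-o` options of gcc and the `-o` option of as are real; the final step goes through the gcc driver, which in turn calls the linker with the C runtime and libraries.

```python
def build_commands(stem):
    """Return one command per build stage for a file named <stem>.c."""
    return [
        ["gcc", "-S", f"{stem}.c", "-o", f"{stem}.s"],  # compile C to assembly
        ["as", f"{stem}.s", "-o", f"{stem}.o"],         # assemble to an object file
        ["gcc", f"{stem}.o", "-o", f"{stem}.exe"],      # link into an executable
    ]


if __name__ == "__main__":
    for command in build_commands("project"):
        print(" ".join(command))
```

The gcc driver picks the stages to run from the file extension, so a single `gcc -o Program.exe project.s` covers the assemble and link steps in one go. Adding `-v` to a gcc command prints the sub-commands it runs.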

    Best Regards,
    Richard

    The way I see it... Well, it's all pretty blurry