Linked-list or array?

SephirothSephiroth Fayetteville, NC, USA
I'm working on a project that will load 3D models from an archive into memory for use by an OpenGL rendering engine. I am creating a C++ class (the entire project is C++ and will be cross-platform) for loading, manipulating, and freeing the objects. The thing is, I can allocate the class in an array and loop through the array or I could create it as a linked-list. My question is which method is better? Speed is a priority since there may be several thousand loaded at any point in time, and I can't wait for ages for it to locate a single object. Examples of my current usage ideas are below.

Array Method:
[code]
ObjectData *pObject;

pObject = (ObjectData)malloc(sizeof(ObjectData));

//Assume we loaded a bunch of objects in
for(int iLoop = 0; iLoop < ObjectCount; iLoop++)
if(pObject[iLoop].ID == WhateverID)
return pObject[iLoop];
[/code]

Linked-list Method:
[code]
ObjectData *pObject;

pObject = new class ObjectData;

//Assume we loaded a bunch in, again
return pObject->Find(WhateverID);
[/code]
In the second example, the "Find(int)" method would compare the specified ID to its own ID. If the ID matches, it returns itself, but if not and the pointer to the next "ObjectData" is not NULL, it calls the next object find method, and this continues until the object is found or until the next object pointer is NULL, and it returns NULL.
-[italic][b][red]S[/red][purple]e[/purple][blue]p[/blue][green]h[/green][red]i[/red][purple]r[/purple][blue]o[/blue][green]t[/green][red]h[/red][/b][/italic]

Comments

  • Of course the Array Method will be a lot faster.

    [code]ObjectData *pObject;

    pObject = (ObjectData)malloc(sizeof(ObjectData));

    //Assume we loaded a bunch of objects in
    for(int iLoop = 0; iLoop < ObjectCount; iLoop++)
    if(pObject[iLoop].ID == WhateverID)
    return pObject[iLoop];[/code]

    Because the Linked-list involves a function call which is (assembler/cpu) expensive.
    At least you get a sequence of...
    // pushl %ebp
    // movl %esp, %ebp
    // ...
    // leave
    // ret
    on each call, which obiously you dont have on Array Method.

    I dissasembled and tested both of them, with the minimal construction.

    -----

    I got this kind of problem writting an IRC Server, and writting a little game that checked collisions.
    The problem arises when you have a lot of objects, and want to locate only one, based on a ID or anything.
    So I apply the statement : Divide and Conquer.

    I used a tree, in the first case by number, and the second case by the type.
    I share my aproach, maybe it could speed things up. If you find this useless, ignore the post.

    [code] +--------+
    |--> | 1..100 | --> pObjects
    +--------+ | +--------+
    | MAIN |<---|
    +--------+ | +--------+
    |--> |100..200| --> pObjects
    +--------+

    +--------+
    |--> | WALL | --> pObjects
    +--------+ | +--------+
    | MAIN |<---|
    +--------+ | +--------+
    |--> | FLOOR | --> pObjects
    +--------+[/code]

    So, I got...

    ObjectMain *pObjectMain;
    ObjectKind *pObjectKind;
    ObjectData *pObjectData;

    I insert the record by the meaning of Main, it locates the correct hive, and creates a new hive if it doesnt exists.
    When I need one record, I search by Main too, but the hives make the search smaller and faster. At the end, Main does all the work. I hope it helps.
    [red]Good luck![/red]
    [blue]Hackman[/blue]
  • : Of course the Array Method will be a lot faster.
    :
    : [code]: ObjectData *pObject;
    :
    : pObject = (ObjectData)malloc(sizeof(ObjectData));
    :
    : //Assume we loaded a bunch of objects in
    : for(int iLoop = 0; iLoop < ObjectCount; iLoop++)
    : if(pObject[iLoop].ID == WhateverID)
    : return pObject[iLoop];[/code]:
    :
    : Because the Linked-list involves a function call which is
    : (assembler/cpu) expensive.
    : At least you get a sequence of...
    : // pushl %ebp
    : // movl %esp, %ebp
    : // ...
    : // leave
    : // ret
    : on each call, which obiously you dont have on Array Method.
    :
    : I dissasembled and tested both of them, with the minimal
    : construction.
    :
    : -----
    :
    : I got this kind of problem writting an IRC Server, and writting a
    : little game that checked collisions.
    : The problem arises when you have a lot of objects, and want to
    : locate only one, based on a ID or anything.
    : So I apply the statement : Divide and Conquer.
    :
    : I used a tree, in the first case by number, and the second case by
    : the type.
    : I share my aproach, maybe it could speed things up. If you find this
    : useless, ignore the post.
    :
    : [code]: +--------+
    : |--> | 1..100 | --> pObjects
    : +--------+ | +--------+
    : | MAIN |<---|
    : +--------+ | +--------+
    : |--> |100..200| --> pObjects
    : +--------+
    :
    : +--------+
    : |--> | WALL | --> pObjects
    : +--------+ | +--------+
    : | MAIN |<---|
    : +--------+ | +--------+
    : |--> | FLOOR | --> pObjects
    : +--------+[/code]:
    :
    : So, I got...
    :
    : ObjectMain *pObjectMain;
    : ObjectKind *pObjectKind;
    : ObjectData *pObjectData;
    :
    : I insert the record by the meaning of Main, it locates the correct
    : hive, and creates a new hive if it doesnt exists.
    : When I need one record, I search by Main too, but the hives make the
    : search smaller and faster. At the end, Main does all the work. I
    : hope it helps.
    : [red]Good luck![/red]
    : [blue]Hackman[/blue]

    Getting an individual object from an array is faster if you know the index of the object. Walking through a linked list is faster than walking through an array, if you use a non-recursive algorthm:
    [code]
    pObject = (ObjectData)malloc(sizeof(ObjectData));

    pTemp = pObject;
    while ((pTemp != null) && (pTemp.ID != WhateverID)) {
    pTemp = pTemp->next;
    }
    [/code]
    In this case the code doesn't need to find the pointer to the object based on the index in the array. You also don't need to allocate space on the stack during the loop.
  • Since you are dealing with a large amount of data, you would most likely want to store it in a sorted way, to increase efficiency. To just iterate through the container from start to end is inefficient no matter if it is an array or linked list.

    If you want to add/free objects in the middle of the list, an array is very unsuitable. A common linked list is more suitable, but it is however slow.

    I would suggest using a hash table with chaining. That way, finding/adding/freeing an object will take pretty much the same amount of time no matter where it is located. A hash table is also one of the most efficient containers when dealing with large amounts of data.
  • Yes, its a little bit faster.

    You loose the code for the " Loop index iLoop++ ", but you gain some code for the " pTemp = assignment ". The code generated is almost " cpu/cycles " equal.

    At the end, depends of your data structure, if you need an array, or a linked-list.

    [color=Blue]BUT AVOID THE FUNCTIONS CALLS.[/color]


    -----
    :
    : Getting an individual object from an array is faster if you know the
    : index of the object. Walking through a linked list is faster than
    : walking through an array, if you use a non-recursive algorthm:
    : [code]:
    : pObject = (ObjectData)malloc(sizeof(ObjectData));
    :
    : pTemp = pObject;
    : while ((pTemp != null) && (pTemp.ID != WhateverID)) {
    : pTemp = pTemp->next;
    : }
    : [/code]:
    : In this case the code doesn't need to find the pointer to the object
    : based on the index in the array. You also don't need to allocate
    : space on the stack during the loop.

    [red]Good luck![/red]
    [blue]Hackman[/blue]
  • : : Getting an individual object from an array is faster if you know the
    : : index of the object. [color=Red]Walking through a linked list is faster than
    : : walking through an array[/color], if you use a non-recursive algorthm:

    [color=Blue]In theory, how list walking can be faster, if to reach each element you need to use additional indirection by a pointer? Besides that, nodes of a list have bigger probability of causing page faults, because they (nodes) may be located in different locations in memory and memory can be paged out of physical room.

    Write a simple model: a few thousands nodes and an array of few thousand items and measure the walking for both cases with QueryPerformanceCounter().[/color]
  • That is true, linked lists will always be slower for iteration. If you think the CPU cycles are identical, you don't understand the difference in assembler/op-codes between direct access and access through a pointer.

    To prove this:

    [code]#include
    #include
    #include

    #define N 5000

    typedef struct node
    {
    int x;
    struct node* next;

    } Node;

    int main()
    {
    int* array;
    Node* head, *tmp;
    int i;
    LARGE_INTEGER ticks1, ticks2;



    array = malloc(N*sizeof(int));

    QueryPerformanceCounter(&ticks1);

    for(i=0; inext = tmp;
    }

    QueryPerformanceCounter(&ticks1);

    for(tmp=head; tmp!=NULL; tmp=tmp->next)
    {
    if(tmp->x == 123) /* "dummy code" */
    tmp->x = 456;
    }

    QueryPerformanceCounter(&ticks2);
    printf("Linked list: %ld
    ", ticks2.LowPart-ticks1.LowPart);

    while(head != NULL)
    {
    tmp = head->next;
    free(head);
    head = tmp;
    }

    getchar();
    return 0;
    }
    [/code]


    Anyway, go for a hash table for optimal speed, as I suggested in my previous post.
  • : If
    : you think the CPU cycles are identical, you don't understand the
    : difference in assembler/op-codes between direct access and access
    : through a pointer.
    :

    Ok... we are getting out... but...

    This code ...
    [code]class ObjectData{
    public:
    int ID;
    ObjectData *next;
    ObjectData *Find(int IDs) {
    if(IDs == ID)
    return this;
    else
    return next->Find(ID);
    }
    };

    ObjectData *pObject;

    int main() {
    int ObjectCount = 0;
    for(int iLoop = 0; iLoop < ObjectCount; iLoop++)
    if(pObject[iLoop].ID == 1) {
    break;
    }
    return 0;
    }[/code]
    Generates this code ...
    [code] movl $0, -12(%ebp)
    movl $0, -8(%ebp)
    jmp .L2
    .L3: <-- START THE LOOP
    movl -8(%ebp), %eax
    sall $3, %eax
    movl %eax, %edx
    movl pObject, %eax
    leal (%edx,%eax), %eax
    movl (%eax), %eax
    cmpl $1, %eax
    je .L4
    incl -8(%ebp)
    .L2:
    movl -8(%ebp), %eax
    cmpl -12(%ebp), %eax
    jl .L3
    .L4: <-- END THE LOOP 12 Ops Incl: 1(sall) & 1(leal)[/code]

    And this code ...
    [code]class ObjectData{
    public:
    int ID;
    ObjectData *next;
    ObjectData *Find(int IDs) {
    if(IDs == ID)
    return this;
    else
    return next->Find(ID);
    }
    };

    ObjectData *pObject;

    int main() {
    int ObjectCount = 0;
    ObjectData *pTemp = pObject;
    while ((pTemp != 0) && (pTemp->ID != 1)) {
    pTemp = pTemp->next;
    }
    return 0;
    }[/code]
    Generates this code ...
    [code] movl $0, -12(%ebp)
    movl pObject, %eax
    movl %eax, -8(%ebp)
    jmp .L2
    .L3: <-- START THE LOOP
    movl -8(%ebp), %eax
    movl 4(%eax), %eax
    movl %eax, -8(%ebp)
    .L2:
    cmpl $0, -8(%ebp)
    je .L4
    movl -8(%ebp), %eax
    movl (%eax), %eax
    cmpl $1, %eax
    jne .L3
    .L4: <-- END THE LOOP 9 Ops[/code]
    I don't have time to convert/count each op/cpu cycles, but [color=Red]AsmGuru62[/color] could tell us which one is faster. I believe the second example is faster.
    Of course it depends on my compiler which is GNU gcc4.1 with cpp4.1 on Linux. Without optimizations.


    [red]Good luck![/red]
    [blue]Hackman[/blue]
  • [color=Blue]
    Weird syntax, however, code for an array is not optimized properly - you have to look at both versions in an optimized build.

    Overall, code is the same, but pObject is always reloaded to access an array element and it should be optimized by a compiler. VC++ will do it nicely, but I am not sure about the GNU compiler - I do not use these.

    I stand by my statement of the array being faster and not only in cycles - you have to take into account the fact that contiguous memory (array) will not produce as many page faults as the nodes of a list scattered around in memory. One page fault is about a million cycles!

    Here is how the optimized code will look in theory:
    [/color]
    [code]
    ;
    ; ARRAY OF STRUCTURES
    ; TObject array [100];
    ;
    MOV ESI, array ; Load starting address of the array
    MOV EDX, 100 ; Load the array size

    DoSomething: ; Body of the loop
    MOV EAX, [ESI + offset of m_ID] ; Access data in a structure
    ...

    ADD ESI, sizeof (TObject) ; Move address to next element
    SUB EDX, 1 ; Checking if counter ended?
    JNZ DoSomething ; Not yet

    ; Loop ends here!
    ; ---------------------------------------------

    ;
    ; LINKED LIST
    ; TObject* pNode;
    ;
    MOV ESI, pNode ; Load the first node

    DoSomething: ; Body of list walking sequence
    MOV EAX, [ESI + offset of m_ID] ; Access data in a structure
    ...

    MOV ESI, [ESI + offset of pNode->Next] ; Get next node
    TEST ESI, ESI ; Check it for NULL
    JNZ DoSomething ; Not null, so continue

    ; Walk ends here!
    ; ---------------------------------------------
    [/code]
    [color=Blue]
    I think these codes are pretty close in size, however, they differ in a method of finding the next structure. ADD is faster than MOV with address calculation, maybe very slightly. And the code accessing a different memory areas can hit page faults.

    The best way is to check it is to use measurements on random allocations with a large number of elements in array and list.
    [/color]
  • : I'm working on a project that will load 3D models from an archive
    : into memory for use by an OpenGL rendering engine. I am creating a
    : C++ class (the entire project is C++ and will be cross-platform) for
    : loading, manipulating, and freeing the objects. The thing is, I can
    : allocate the class in an array and loop through the array or I could
    : create it as a linked-list. My question is which method is better?
    : Speed is a priority since there may be several thousand loaded at
    : any point in time, and I can't wait for ages for it to locate a
    : single object. Examples of my current usage ideas are below.
    :
    : Array Method:
    : [code]:
    : ObjectData *pObject;
    :
    : pObject = (ObjectData)malloc(sizeof(ObjectData));
    :
    : //Assume we loaded a bunch of objects in
    : for(int iLoop = 0; iLoop < ObjectCount; iLoop++)
    : if(pObject[iLoop].ID == WhateverID)
    : return pObject[iLoop];
    : [/code]:
    :
    : Linked-list Method:
    : [code]:
    : ObjectData *pObject;
    :
    : pObject = new class ObjectData;
    :
    : //Assume we loaded a bunch in, again
    : return pObject->Find(WhateverID);
    : [/code]:
    : In the second example, the "Find(int)" method would compare the
    : specified ID to its own ID. If the ID matches, it returns itself,
    : but if not and the pointer to the next "ObjectData" is not NULL, it
    : calls the next object find method, and this continues until the
    : object is found or until the next object pointer is NULL, and it
    : returns NULL.
    : -[italic][b][red]S[/red][purple]e[/purple][blue]p[/blue][green]h[/gre
    : en][red]i[/red][purple]r[/purple][blue]o[/blue][green]t[/green][red]h
    : [/red][/b][/italic]

    Use an array and use std::vector to take care of it [1]. Consider using a std::list when you use more insertions/erasures then random access.

    See ya, Bilderbikkel

    [1] Herb Sutter and Andrei Alexandrescu. C++ coding standards: 101 rules, guidelines, and best practices. ISBN: 0-32-111358-6, chapter 76: 'Use vector by default.

    bilderbikkel
  • [b]Your right ![/b]

    Weird syntax, ... [color=Blue]don't you like it?, I like it a lot![/color].

    This thing could last forever!. I think your right, but for the ultimate word about this, depends of the needs of the programmer; sometimes could not be done with some pre-defined way.

    I had done applications in which I had to drop speed for specific needs. Anyway, I would not rely on C++ or std:: for this kind of applications, I had seen 50,000 tri's on OpenGL @ 15 FPS, without proper optimizations and without proper video drivers; its a headache. OpenGL is C.

    Nice to "post" with you AsmGuru!
    [red]Good luck![/red]
    [blue]Hackman[/blue]
  • : Use an array and use std::vector to take care of it [1]. Consider
    : using a std::list when you use more insertions/erasures then random
    : access.
    :
    : See ya, Bilderbikkel
    :
    : [1] Herb Sutter and Andrei Alexandrescu. C++ coding standards: 101
    : rules, guidelines, and best practices. ISBN: 0-32-111358-6, chapter
    : 76: 'Use vector by default.
    :
    : bilderbikkel


    std::vector may or may not put the elements in sorted order. It is however always forced to behave as an array and will be inefficient when insertions need to be done in the middle of it. Using std::set or std::map will be more efficient, since elements will be added in sorted order and find() will then automatically use binary search or some other efficient way of searching.

    STL might still be a good idea though. Use std::map to implement a hash table and you will get code that is both efficient and easy to maintain. Some compilers provide a hashtable which they fake being part of STL, but there are none which is part of the standard, which is why you have to write it yourself.

    Or take one from the [link=http://www.codeguru.com/cpp/cpp/algorithms/hash/article.php/c5101/]internet[/link].
  • : [code]:
    : ;
    : ; ARRAY OF STRUCTURES
    : ; TObject array [100];
    : ;
    : MOV ESI, array ; Load starting address of the array
    : MOV EDX, 100 ; Load the array size
    :
    : DoSomething: ; Body of the loop
    : MOV EAX, [ESI + offset of m_ID] ; Access data in a structure
    : ...
    :
    : ADD ESI, sizeof (TObject) ; Move address to next element
    : SUB EDX, 1 ; Checking if counter ended?
    : JNZ DoSomething ; Not yet
    :
    : ; Loop ends here!
    : ; ---------------------------------------------
    :
    : ;
    : ; LINKED LIST
    : ; TObject* pNode;
    : ;
    : MOV ESI, pNode ; Load the first node
    :
    : DoSomething: ; Body of list walking sequence
    : MOV EAX, [ESI + offset of m_ID] ; Access data in a structure
    : ...
    :
    : MOV ESI, [ESI + offset of pNode->Next] ; Get next node
    : TEST ESI, ESI ; Check it for NULL
    : JNZ DoSomething ; Not null, so continue
    :
    : ; Walk ends here!
    : ; ---------------------------------------------
    : [/code]:

    Hmm, this proves that the array is faster.
    The linked list accesses two memory locations, while the array version only accesses one. This improves caching, and the locality of the array version is also very important.

    The array version could easily be optimized further (written from head):

    [code]
    MOV ESI, array + offset of m_ID
    MOV EDI, array + offset of m_ID + sizeof(TObject) * 101
    ; the extra to compensate so the loop doesn't end before it should later

    DoSomething:
    MOV EAX, [ESI] ; 1st cycle
    ADD ESI, sizeof (TObject) ; 2nd
    CMP EAX, 0 ; 2nd too, because of the pipe.
    JE found ; a lot of waiting... could be one cycle...

    CMP ESI, EDI ; 4th cycle
    JNE DoSomething ; same as above

    ; Could be 5 cycles in total

    ;;;;;;;;;;;;;;;;;;;;; Linked list ;;;;;;;;;;;;;;;;;;;;;;
    ; (and the linked list for consistency)

    MOV ESI, pNode ; Load the first node

    DoSomething:
    MOV EAX, [ESI + offset of m_ID] ; 1st cycle
    CMP EAX, 0 ; 2nd cycle
    JE found ; same as above

    MOV ESI, [ESI + offset of pNode->Next] ;4rd cycle
    TEST ESI, ESI ; 5th cycle
    JNZ DoSomething ; Same as above

    ; 6 cycles in the loop, one cycle slower...
    [/code]

    I'm not counting the caching here, that would give a lot of favor for the array version...
  • : I'm working on a project that will load 3D models from an archive
    : into memory for use by an OpenGL rendering engine. I am creating a
    : C++ class (the entire project is C++ and will be cross-platform) for
    : loading, manipulating, and freeing the objects. The thing is, I can
    : allocate the class in an array and loop through the array or I could
    : create it as a linked-list. My question is which method is better?
    : Speed is a priority since there may be several thousand loaded at
    : any point in time, and I can't wait for ages for it to locate a
    : single object. Examples of my current usage ideas are below.
    :
    : Array Method:
    : [code]:
    : ObjectData *pObject;
    :
    : pObject = (ObjectData)malloc(sizeof(ObjectData));
    :
    : //Assume we loaded a bunch of objects in
    : for(int iLoop = 0; iLoop < ObjectCount; iLoop++)
    : if(pObject[iLoop].ID == WhateverID)
    : return pObject[iLoop];
    : [/code]:
    :
    : Linked-list Method:
    : [code]:
    : ObjectData *pObject;
    :
    : pObject = new class ObjectData;
    :
    : //Assume we loaded a bunch in, again
    : return pObject->Find(WhateverID);
    : [/code]:
    : In the second example, the "Find(int)" method would compare the
    : specified ID to its own ID. If the ID matches, it returns itself,
    : but if not and the pointer to the next "ObjectData" is not NULL, it
    : calls the next object find method, and this continues until the
    : object is found or until the next object pointer is NULL, and it
    : returns NULL.
    : -[italic][b][red]S[/red][purple]e[/purple][blue]p[/blue][green]h[/gre
    : en][red]i[/red][purple]r[/purple][blue]o[/blue][green]t[/green][red]h
    : [/red][/b][/italic]

    Michael Abrash has written a very good book on this, which I would recommend you to read.

    My solution is to use a separate sorted balanced binary tree of indexes of an array.

    This can search in O(log2 n), which is the fastest there is.

    [code]
    p = root
    while(p!=null){
    if(ID

    p.value)
    p=p.right;
    else
    break; // a match!
    };
    [/code]
    And some assembly: :
    [code]
    mov ebx, root
    mov eax, ID
    loop:
    mov ecx, [ebx offset to value]
    cmp eax, ecx
    ja left
    jb right
    jmp found ; a match


    left:
    mov ebx, [ebx offset to left]
    cmp ebx, 0
    jne loop
    jmp notfound ; no match

    right:
    mov ebx, [ebx offset to right]
    cmp ebx, 0
    jne loop
    jmp notfound ; no match
    [/code]

    This is about 6 and a half cycle, but it searches much faster than those O(n) searches.

    HLGF!

  • Yeah but insertion will become slower as the amount of data increases, and you will need to balance the three too. So it depends on what's critical: the searching time, the insertion time, or both.
Sign In or Register to comment.

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Categories