Advanced hierarchical query - how to?

In my recent work I’ve encountered the following problem:

The scheme:

There is a table TASKS having the columns as (each column has the type NUMBER):


The ID is a primary key on the table.
The OPERATION_ID groups the task into operations.
The CREATION_ORDER defines a partial-order-relation on the TASKS for each OPERATION_ID (if the values for two records are equals they can be executed concurrently).
The SUB_OPERATION_ID works as an “include” and refers to the OPERATION_ID of this same table (here is the recursion! :-)).
The TASK_TYPE_ID is a foreign key to the to-be-executed task description.

The last two fields are excluding each other! (In other words: there is EXACTLY one of them being not null!)

The problem:

For a given OPERATION_ID, I have to give back the graph of tasks in the form:


where the first two columns refer to the ID field of the TASKS table. (PARENT_ID is NULL for the root tasks).

Restrictions and other information:
1. The number of rows affected in the query is a few dozens (not more than 100).
2. The whole table contains a few thousands of rows.
3. The algorithm could work in parallel and distributed environment.
4. Time is not essential while it is not more than a few seconds.
5. I cannot use tables for storing temporary data.

The drawback:

Take this example:

1 1 0 1
2 1 1 2
3 1 1 3
4 1 2 2
5 2 0 3
6 2 1 4
7 3 0 4
8 4 0 5

The result should be:

1 1
1 5 3
1 7 4
5 8 5
7 4 2
8 4 2

The problems:

1. In the result, the parent not always the immediate ancestor in the original database, because all the sub-operation calls removed from the graph. For example, the 1-2-5-7-8-4 is reduced to 1-5-8-4.
2. It is not a tree. There could be more than one parent of any node, so I could not take the parent by a stored function.

I tried a dozen of ways, using START WITH ... CONNECT BY, but either received wrong result, exceeded the time threshold or bumped into SQL restrictions.

Thank you for any idea,
Balazs VISSY


  • First, why can you not create a cursor? To combine the task and sub_0 into one?

    Can you post an example of your closest code to help me understand what you are trying to do?

  • : First, why can you not create a cursor? To combine the task and sub_0 into one?

    I have to use this query as an input for an interface that accepts only a query. Also, I cannot use a stored function for retreiving the parent, because any node could have more parents, and they could even be not on the same "level" of the original graph.

    : Can you post an example of your closest code to help me understand what you are trying to do?

    The first, almost complete -- but extremely slow -- solution was to build up all the node-pairs, and then check all pairs by a function wheather they are in parent-child relation. SO I received a table with the sheme (OPERATION_TASK_ID, OPERATION_TASK_ID, TASK_TYPE_ID, CONNECTED) where the last collumn contains the flag, calculated by the function. The problem, that this kind of query cannot be used in the FROM clause so I wasn't able separate the real pairs (CONNECTED=1).

    The second, slow, but woriking solution is based on several assumptions:

    1. The whole graph isn't deeper than 5 level (there is at most 5 level of operation-sub_operation recursion).
    2. There is no "bare" operation (operation not containing any real labels (TASK_TYPE_ID is not null)).

    Based on the above restrictions, I build a rather long (more than 330 lines) and incredible awful ;-) query, that is slow, but works well. I would gladly post it to you by mail, but I would not like to publish it here, for its size and hack-style. Please let me know if you are interested!

    Thax for your interest and possible future help.


Sign In or Register to comment.

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!