Community Server


Possible bug in REMOVE command

Last post 03-08-2010, 4:42 PM by virgule. 2 replies.
  •  03-07-2010, 4:47 AM 105

    Possible bug in REMOVE command

    Hello John,

    It's been a few days since I last harassed you (!), so here's a new one:
    The REMOVE command crashes when the position vector is empty. This is odd because the TAKE command works fine under the same conditions. Try the code below using:
    - List = 0,10 (no problem)
    - List = 1,10 (crash on REMOVE)

    PS: it would be worth mentioning in the help that TAKE and REMOVE are not only complements of each other. If execution speed matters and you are looking at a very small subset of a much bigger array, the choice between TAKE and REMOVE can change the execution time by orders of magnitude: TAKE or REMOVE with a small position vector, not with a very large one.

    Rgds
    Gus
    ----------------sample code to crash REMOVE---------------
    CLEAROUTPUT
    DATA 1,10 List
    TAGS List = 0 NulList
    SIZE NulList SNL
    PRINT SNL
    TAKE List NulList List_A
    PRINT List_A
    REMOVE List NulList List_B
    PRINT List_B
    -----------------------------------------------------------
  •  03-08-2010, 8:34 AM 107 in reply to 105

    Re: Possible bug in REMOVE command

    Gus,

    As it turns out, I was already working on the REMOVE command because I had run into a different issue with it: I didn't like its side effect of reordering the elements of the position vector. In most cases that doesn't matter, but in one algorithm I was writing, it did. So I was fixing that when you posted your bug report. I also found that REMOVE fails if the position vector has any duplicates. Duplicates are probably rare too, but the command shouldn't fail in that circumstance.

    So, I've fixed all those known problems, including yours, in the REMOVE command in my working version and the next version (after version 1.4.7) of Statistics101 will have the corrected command.

    Re the relative speeds of REMOVE and TAKE (after the above fixes are made):

    1. REMOVE: if the position vector is empty and the input vector is a different vector from the output vector, REMOVE copies the input to the output. If the position vector is not empty, it sorts a copy of the position vector, then walks through the input vector and the sorted position vector, copying elements one by one from input to output unless they are listed in the position vector. As it walks through the vectors, it looks for and ignores duplicate position indexes. So the cost of REMOVE is approximately one memory allocation and sort for the position vector copy, plus one complete scan each of the input vector and the sorted position vector. The cost is higher if the input vector is also the result vector:
    REMOVE vecA vecB vecA
    That requires that the input be copied to a temporary vector so it isn't destroyed as it is being copied to the output, which is itself. So that costs an extra memory allocation and a scan (copy) of the input vector.
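
    The REMOVE steps described above can be sketched in Python. This is only an illustration of the description, not the Statistics101 source: the function name `remove`, the 1-based positions, and the choice to silently ignore out-of-range positions are all assumptions.

```python
def remove(input_vec, positions):
    """Return a copy of input_vec without the elements at the given positions.

    Assumptions (not from the Statistics101 source): positions are 1-based,
    and positions outside the input vector are silently ignored.
    """
    if not positions:
        # Empty position vector: the result is simply a copy of the input.
        return list(input_vec)

    # Sort a copy so the caller's position vector keeps its original order.
    sorted_pos = sorted(positions)

    result = []
    p = 0  # index into sorted_pos
    for i, value in enumerate(input_vec, start=1):
        # Advance past position indexes already behind the current element;
        # this is where duplicates get skipped.
        while p < len(sorted_pos) and sorted_pos[p] < i:
            p += 1
        if p < len(sorted_pos) and sorted_pos[p] == i:
            continue  # listed in the position vector: do not copy
        result.append(value)
    return result


vec = [10, 20, 30, 40, 50]
pos = [4, 2, 2]              # unsorted, with a duplicate
print(remove(vec, pos))      # elements at positions 2 and 4 are dropped
print(remove(vec, []))       # empty position vector: plain copy
```

    Because the sketch always builds a new list, it has no counterpart to the extra temporary copy needed when the input and result are the same vector.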

    Bottom line: The shorter the input vector and the position vector are, the lower the time cost of REMOVE. Also, if possible, avoid using the same vector for both input and output or for position vector and output.

    2. TAKE: This command walks over only the position vector, copying the elements of the input vector that are at the positions specified in the position vector. So a short position vector leads to a small time cost. Again, that cost increases by one complete copy and one memory allocation for each of the input and position vectors that is the same as the result vector.
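
    For comparison, a minimal Python sketch of TAKE as described above. As before, the function name `take`, the 1-based positions, and the error raised for an out-of-range position are assumptions made for illustration, not the actual Statistics101 behavior.

```python
def take(input_vec, positions):
    """Return the elements of input_vec at the given positions, in that order.

    Assumptions (not from the Statistics101 source): positions are 1-based,
    and an out-of-range position raises IndexError.
    """
    result = []
    # Only the position vector is scanned, so the cost grows with its
    # length and not with the length of the input vector.
    for p in positions:
        if p < 1 or p > len(input_vec):
            raise IndexError(f"position {p} is outside the input vector")
        result.append(input_vec[p - 1])
    return result


vec = [10, 20, 30, 40, 50]
print(take(vec, [4, 2, 2]))  # duplicates are taken twice
print(take(vec, []))         # empty position vector: empty result
```

    The loop never touches the input elements that are not requested, which is why a short position vector keeps TAKE cheap even on a very large input.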

    Regards,

    John
  •  03-08-2010, 4:42 PM 109 in reply to 107

    Re: Possible bug in REMOVE command

    Thank you, looking forward to it.
    Gus