
Reading a large file

OdDjOb Posts: 13 · Member
I'm trying to open a large text file (180 MB) with roughly 500,000 records inside. I have to process each line and insert it into my DB once it passes validation. The problem is that it processes about 8 rec/sec, which means it takes a couple of hours to do the lot. Can anyone help me with a faster/more efficient way of reading files?

Comments

  • zibadian Posts: 6,349 · Member
    I'm more inclined to suspect the validation or the insert statements as the guilty party. Reading a text file line by line is quite fast: for one of my projects I had to read and parse 40,000 lines (~7 MB), which took about 30 seconds. Extrapolating from that, reading your whole file should take on the order of 10 minutes.
    You should test that using only the read part, then add the validation, and finally add the insertions (with some general exception handling). This should give you a clear idea of where the lack of speed comes from.
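    The staged test described above could be sketched roughly like this (the file name and the commented-out `ValidateRecord`/`InsertRecord` calls are placeholders for the poster's own code, not anything from the thread):

    ```pascal
    program ReadBench;
    {$APPTYPE CONSOLE}
    uses
      SysUtils;

    var
      f: TextFile;
      Line: string;
      Count: Integer;
      T0: TDateTime;
    begin
      AssignFile(f, 'data.txt');   // placeholder file name
      Reset(f);
      Count := 0;
      T0 := Now;
      while not Eof(f) do
      begin
        ReadLn(f, Line);           // stage 1: time the read on its own
        // ValidateRecord(Line);   // stage 2: re-enable validation, time again
        // InsertRecord(Line);     // stage 3: re-enable the DB inserts
        Inc(Count);
      end;
      CloseFile(f);
      WriteLn(Format('%d lines in %.1f s', [Count, (Now - T0) * SecsPerDay]));
    end.
    ```

    Comparing the three timings shows which stage dominates.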
  • OdDjOb Posts: 13 · Member
    Hi zibadian, thanks for the reply. I took your advice and commented out all my validation & insert code. After 10 minutes I was on record 30,648, which means my validations & inserts are indeed slow, but the file reading on its own is still way too slow. I'm using the AssignFile & ReadLn procedures. Any suggestions on how I can speed this up?
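    One standard trick with AssignFile/ReadLn, not mentioned in the thread but worth trying first, is to give the text file a much larger I/O buffer via SetTextBuf; the default buffer is only 128 bytes, so every ReadLn goes back to the OS far more often than necessary. A sketch, with the file name as a placeholder:

    ```pascal
    var
      f: TextFile;
      Buf: array[0..65535] of AnsiChar;  // 64 KB buffer instead of the 128-byte default
      Line: string;
    begin
      AssignFile(f, 'data.txt');         // placeholder file name
      SetTextBuf(f, Buf, SizeOf(Buf));   // must be set before any I/O on f
      Reset(f);
      while not Eof(f) do
        ReadLn(f, Line);
      CloseFile(f);
    end;
    ```

    The buffer variable must stay in scope for as long as the file is open.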
  • zibadian Posts: 6,349 · Member
    You could try reading the whole file into a TStrings object, then use that object to parse the records.
    Another option is to load the entire file into a TMemoryStream and parse that to read the records. This is more difficult, because TMemoryStream has no built-in method to read a line.
    If those two options don't help, then you'll need faster hardware.
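    The first suggestion could be sketched like this (the file name and the `ProcessRecord` call are placeholders for the poster's own validation and insert code):

    ```pascal
    var
      Lines: TStringList;
      i: Integer;
    begin
      Lines := TStringList.Create;
      try
        Lines.LoadFromFile('data.txt');  // one buffered read of the whole file
        for i := 0 to Lines.Count - 1 do
          ProcessRecord(Lines[i]);       // placeholder: validate + insert
      finally
        Lines.Free;
      end;
    end;
    ```

    Note that loading a 180 MB file this way also costs at least that much RAM, which matters on the hardware of this thread's era.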
  • OdDjOb Posts: 13 · Member
    I have already tried reading it into a TStrings object, but the speed is roughly the same. I will try the TMemoryStream approach, see if I have any success with it, and let you know.
  • rento Posts: 2 · Member
    Use TMemoryStream: I can read 700 MB of small files (3 KB each) in 3 seconds.
    It's faster to read the whole file into memory at once than to read 3 KB from the hard disk every time you want the next line.
    Finding the lines in a TMemoryStream after loading isn't hard; you can write a function like this:

    function readLineFromMemory(x: TMemoryStream): string;
    var
      c: AnsiChar;
    begin
      Result := '';
      while x.Position < x.Size do
      begin
        x.Read(c, SizeOf(c));    // Read advances Position by itself
        if c = #10 then
          Break;                 // LF ends the line
        if c <> #13 then         // drop the CR of a CRLF pair
          Result := Result + c;
      end;
    end;

    I haven't tested it.
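    Putting the suggestion together, the reading loop might look like this (a sketch; `readLineFromMemory` is the function above, the file name is a placeholder, and the validate/insert step is left as a comment):

    ```pascal
    var
      ms: TMemoryStream;
      Line: string;
    begin
      ms := TMemoryStream.Create;
      try
        ms.LoadFromFile('data.txt');   // placeholder file name: one big read
        ms.Position := 0;
        while ms.Position < ms.Size do
        begin
          Line := readLineFromMemory(ms);
          // validate Line and insert it into the DB here
        end;
      finally
        ms.Free;
      end;
    end;
    ```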