Efficient text file parsing

thegreenstar Posts: 173 Member
I want to make a parser that will load info into a struct from a file. The format looks like this:

[code]
%Block Number One%
3
000.222.112.003.005.013+
321.012.012.350.015.015+
556.651.651.854.984.120+
[/code]

The first line (%Block Number One%) is loaded into a std::string, 3 is the number of block lines, and each block line is subdivided into cells. The cells are separated by periods, with + marking the end of a line. I would actually rather eliminate the + and use a newline instead. Each cell is supposed to be loaded into a struct containing three integers, with each digit (like each 0 in 000, or the two 1s and the 2 in 112) going into one of the three integers. For example, given

[code]
struct cell
{
int one, two, three;
};
[/code]


loading 152 into it would set one to 1, two to 5, and three to 2. It also needs to set one, two and three to -1 when the separator is a + sign (or a newline, which would be better). Sorry about the huge post, but I have no idea how to do this efficiently. Thanks.

Comments

  • blitz Posts: 620 Member

    This assumes that you have the file already open in text mode and the file pointer is positioned at the first line to be read:
    [code]
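    // Needs <cctype>, <cstdio>, <vector> and the cell struct from the question.
    // Returns false on a read error or a malformed line.
    bool read_lines(FILE *f, size_t n_lines, std::vector<cell> &cells)
    {
        char buff[1024];
        cell tmp_cell;

        while (n_lines--) {
            if (!fgets(buff, sizeof buff, f))
                return false;                   // EOF or read error

            const char *p = buff;
            for (;;) {
                // A cell is three digits followed by '.' or '+'.
                if (!isdigit((unsigned char)p[0]) ||
                    !isdigit((unsigned char)p[1]) ||
                    !isdigit((unsigned char)p[2]))
                    return false;

                tmp_cell.one = p[0] - '0';
                tmp_cell.two = p[1] - '0';
                tmp_cell.three = p[2] - '0';
                cells.push_back(tmp_cell);

                if (p[3] == '+') break;         // end of this line
                if (p[3] != '.') return false;  // unexpected separator
                p += 4;
            }

            // A cell of -1s marks the end of each line.
            tmp_cell.one = tmp_cell.two = tmp_cell.three = -1;
            cells.push_back(tmp_cell);
        }
        return true;
    }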
    [/code]
    It also assumes that the lines have a known maximum length (they must fit in the 1024-byte buffer in this case). For arbitrarily long lines, you may have to read the input character by character.

    Also, here's a test function that uses the values stored in the cells vector to get back to the initial format (from the input file):
    [code]
    // Needs <cstdio>, <vector> and the cell struct from the question.
    void print_cells(std::vector<cell> const &cells)
    {
        char buff[] = "000";
        bool first = true;

        for (std::vector<cell>::const_iterator it = cells.begin(); it != cells.end(); ++it) {
            if (it->one >= 0) {
                buff[0] = '0' + it->one;
                buff[1] = '0' + it->two;
                buff[2] = '0' + it->three;
                if (!first) putchar('.'); else first = false;
                printf("%s", buff);
            } else {
                // A cell of -1s marks the end of a line.
                puts("+");
                first = true;
            }
        }
    }
    [/code]
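read_lines expects the file to be positioned at the first row already, so the block name and the row count have to be read first. A minimal sketch of that step follows. The helper name read_block_header is hypothetical, it keeps the % signs in the name because the question does not say whether to strip them, and it assumes both header lines fit in the same 1024-byte buffer.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper: reads the "%Name%" line and the row-count line,
// leaving the file positioned at the first row of cells.
// Returns false on a read error or if the count line has no digits.
bool read_block_header(std::FILE *f, std::string &name, std::size_t &n_lines)
{
    char buff[1024];

    if (!std::fgets(buff, sizeof buff, f))
        return false;
    buff[std::strcspn(buff, "\r\n")] = '\0';    // drop the line ending
    name = buff;

    if (!std::fgets(buff, sizeof buff, f))
        return false;
    char *end;
    unsigned long n = std::strtoul(buff, &end, 10);
    if (end == buff)
        return false;                           // no number on the count line
    n_lines = n;
    return true;
}
```

After it returns true, n_lines can be passed straight to read_lines with the same file pointer.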

    Regards,
    Blitz
