Subject: Re: how does a line ends?
Newsgroups: lugnet.cad
Date: Sun, 8 Apr 2007 12:04:22 GMT
In lugnet.cad, Anders Isaksson wrote:
|
Travis Cobbs wrote:
|
It seems to work, but I'm not 100% confident in its lack of bugs.
Timing with 8464.mpd results in about 550ms for loading using fgets
and 750ms using myFgets from above. On the one hand, that's a 50%
slowdown based purely on that one change. On the other hand, 750ms
isn't very long, and while 8464.mpd isn't exactly a huge file, it's
big enough to prove your point that the performance is fine.
|
As long as you only have one character to 'unget' you could probably speed
it up by introducing a static char which holds the 'ungetted' char (or
null), instead of going through ungetc() -- fgetc(). OTOH, the if-statement
to check if there is something in that variable will also take time (see
below for better ways).
|
I thought about using a static instead of ungetc() in the first version of
readLine() that I posted. This means that you can only read from one open file
at a time, but that is usually an OK restriction as long as the programmer is
aware of it. Otherwise it's a lurking time bomb, though.
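
For illustration, here is a minimal sketch of the static-variable idea. The earlier readLine() is not reproduced in this message, so the names readLineStatic, nextChar, and pushedBack are hypothetical, and the sketch assumes that CR, LF, and CRLF should each count as one newline. The static variable is exactly what limits it to one open file at a time.

```c
#include <stdio.h>

/* Hypothetical one-character pushback shared by every caller.
   EOF doubles as "nothing pushed back", since EOF itself is never
   pushed. Being static, this only works for one open file at a time. */
static int pushedBack = EOF;

static int nextChar(FILE *file)
{
    if (pushedBack != EOF)
    {
        int c = pushedBack;

        pushedBack = EOF;
        return c;
    }
    return fgetc(file);
}

/* Reads one line, treating CR, LF, and CRLF each as a single newline.
   Like fgets(), the newline is stored (always as '\n'). */
char *readLineStatic(char *buf, int bufSize, FILE *file)
{
    int i = 0;

    if (bufSize < 2)
        return NULL;
    while (i < bufSize - 1)
    {
        int c = nextChar(file);

        if (c == EOF)
            break;
        if (c == '\r')
        {
            /* The pushback slot is empty here, so read straight
               from the file to look for the LF of a CRLF pair. */
            int next = fgetc(file);

            if (next != '\n' && next != EOF)
                pushedBack = next;  /* replaces the ungetc() call */
            c = '\n';
        }
        buf[i++] = (char)c;
        if (c == '\n')
            break;
    }
    if (i == 0)
        return NULL;
    buf[i] = 0;
    return buf;
}
```

Switching to a second file while a character is still sitting in pushedBack would hand that character to the wrong file, which is the time bomb mentioned above.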
|
You might also win a bit of speed by using a switch instead of the
if-statements. Switch statements are usually well optimised by the compiler.
Having one case for \r first, and another case for \n first will also
eliminate some more of the if-statements (mis-predicted branching is
expensive on today's CPUs).
|
The ordering of the if statements was one area where I thought this code might
be optimized, and probably where the most speed could be reclaimed from this
implementation. (But see below...)
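
A rough sketch of what the switch-based layout could look like follows. It is not the code from the earlier message: readLineSwitch is a hypothetical name, and the sketch assumes CR, LF, and CRLF are the only line endings that need handling. Whether it actually beats the if-chain would have to be measured.

```c
#include <stdio.h>

/* Hypothetical switch-based line reader. Each line ending gets its
   own case, so ordinary characters fall straight through to default. */
char *readLineSwitch(char *buf, int bufSize, FILE *file)
{
    int i = 0;

    if (bufSize < 2)
        return NULL;
    while (i < bufSize - 1)
    {
        int c = fgetc(file);

        switch (c)
        {
        case EOF:
            buf[i] = 0;
            return i > 0 ? buf : NULL;
        case '\r':
            /* CR first: swallow an immediately following LF (CRLF),
               otherwise push the character back for the next call. */
            c = fgetc(file);
            if (c != '\n' && c != EOF)
                ungetc(c, file);
            buf[i++] = '\n';
            buf[i] = 0;
            return buf;
        case '\n':
            /* LF first: a plain newline. */
            buf[i++] = '\n';
            buf[i] = 0;
            return buf;
        default:
            buf[i++] = (char)c;
            break;
        }
    }
    /* Buffer full before a newline was seen, as with fgets(). */
    buf[i] = 0;
    return buf;
}
```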
|
But you would probably get the best performance by opening the file in
binary mode, reading full disk blocks into a buffer, and implementing MyFgets
on top of that. No library calls, ungetc() is only a Ptr--; and so on.
|
This is a great idea, but a much more complex routine to implement and
debug/verify. Frankly, I would expect a decent standard library implementation
to do this anyway.
But really, the first rule of optimization is that before you start optimizing
how you're doing it, optimize what you're doing to eliminate unnecessary
steps. If we relax the semantics of fgets() slightly so that we can ignore
empty lines, there is a much simpler way to do this that doesn't use ungetc() at
all. Recognize either CR or LF as a newline, and simply discard any newlines
that occur at the start of a line:
    #include <stdio.h>

    char *myFgets(char *buf, int bufSize, FILE *file)
    {
        int i = 0;
        int c;

        while (i < bufSize - 1)
        {
            c = fgetc(file);
            if (c == EOF)
            {
                // End of file: return whatever was collected, if anything.
                buf[i] = 0;
                if (i > 0)
                    return buf;
                else
                    return NULL;
            }
            else if (c == '\r' || c == '\n')
            {
                if (i > 0)
                {
                    // End of a non-empty line; always store it as '\n'.
                    buf[i] = '\n';
                    buf[i + 1] = 0;
                    return buf;
                }
                // else discard extra newlines at start/end of line
            }
            else
                buf[i++] = (char)c;
        }
        // Buffer full before a newline was seen.
        buf[bufSize - 1] = 0;
        return buf;
    }
This implementation needs no ungetc() support whatsoever. If we were
then to optimize this further using the fread() buffering optimization, it
should be even faster.
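
As a rough, untimed sketch of that combination: the version below keeps the same relaxed semantics but pulls characters from a block filled by fread() instead of calling fgetc() per character. LineReader, lineReaderInit, lineReaderGets, and the 4096-byte block size are all hypothetical choices made for this example, and the file is assumed to be opened in binary mode ("rb").

```c
#include <stdio.h>

#define LINE_READER_BLOCK 4096  /* assumed block size, not tuned */

/* Hypothetical reader state: one block of raw bytes plus a cursor.
   Keeping it in a struct avoids the one-file-at-a-time restriction. */
typedef struct
{
    FILE *file;
    size_t pos;  /* next unread byte in block */
    size_t len;  /* number of valid bytes in block */
    unsigned char block[LINE_READER_BLOCK];
} LineReader;

void lineReaderInit(LineReader *reader, FILE *file)
{
    reader->file = file;
    reader->pos = 0;
    reader->len = 0;
}

/* Same relaxed semantics as myFgets: CR or LF ends a line, and
   newlines at the start of a line (empty lines) are discarded. */
char *lineReaderGets(LineReader *reader, char *buf, int bufSize)
{
    int i = 0;

    if (bufSize < 2)
        return NULL;
    while (i < bufSize - 1)
    {
        unsigned char c;

        if (reader->pos == reader->len)
        {
            /* Block exhausted: refill with a single fread() call. */
            reader->len = fread(reader->block, 1, sizeof reader->block,
                                reader->file);
            reader->pos = 0;
            if (reader->len == 0)
            {
                /* End of file (or read error). */
                buf[i] = 0;
                return i > 0 ? buf : NULL;
            }
        }
        c = reader->block[reader->pos++];
        if (c == '\r' || c == '\n')
        {
            if (i > 0)
            {
                buf[i] = '\n';
                buf[i + 1] = 0;
                return buf;
            }
            /* else: newline at the start of a line, skip it */
        }
        else
            buf[i++] = (char)c;
    }
    /* Buffer full before a newline was seen. */
    buf[i] = 0;
    return buf;
}
```

Because no character ever needs to be pushed back, the block buffer needs no Ptr-- style rewind either; the cursor only moves forward.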