Subject: Re: how does a line ends?
Newsgroups: lugnet.cad
Date: Sat, 7 Apr 2007 20:22:47 GMT
|
In lugnet.cad, Travis Cobbs wrote:
|
In lugnet.cad, Chris Phillips wrote:
|
I've been writing text parsing programs for over 20 years, and have found
that the approach I've suggested works very well at detecting line ends in a
consistent manner.
|
You'll note that I didn't really say that there was anything wrong with your
parsing routine (other than some personally negative feelings about fgetc and
ungetc). After fixing the bugs, it will do exactly what you say it will do.
It's just that since I haven't ever run into CR or LF+CR line endings in any
LDraw file in close to seven years, I don't personally feel that it is
necessary.
|
Yes, sorry that my post sounded overly defensive. I have something of a chip on
my shoulder from years of working alongside programmers who want to take
shortcuts at the expense of their users. Very few seem to appreciate that it is
worth a lot of effort by one developer to save a small amount of effort by many
users. (What did Mr. Spock say about the needs of the many...?) It will
likely be rare that anyone will be affected by this, but it is also not a lot of
coding effort to sidestep the issue, either.
|
|
You can pick apart my code until the cows come home, but the underlying
heuristic works. Either CR or LF indicates end of line, and if the other
character immediately follows, clump them together as a single line break.
|
I apologize. I wasn't trying to pick apart your algorithm. There's nothing
wrong with it; my main argument was that I felt that fgets was acceptable
instead for LDraw files.
|
Again, my choice of wording was clumsy, and I wasn't taking offense. But I have
learned that it is often a mistake to focus too heavily on optimizing code
everywhere. The book "Inner Loops" tells some interesting stories about
well-intentioned optimizations that actually hurt performance as CPU hardware
evolved and the rules of the game shifted. Modern compilers can perform
optimizations (e.g., function inlining) that are erasing many of the old rules
as well.
|
|
How big is a typical CAD file? How often do you need to load one from disk?
Is the microscopic performance difference even noticeable? Maybe back in
the
|
Actually, quite a bit of file I/O goes into reading an LDraw file, due to the
way the parts are formatted. This can be observed by loading a medium size
file after a fresh reboot (timing the load), and then repeating the process
after the files end up in the cache. The second load will be a little
faster. On my computer, LDView takes 4-5 seconds to do the file reading for
the 8464.mpd that comes with LDView the first time you load it, and 1-2
seconds the second time. (LDView says "Loading..." in the status bar during the
file reading, then switches to "Parsing..." after that stage is complete.)
|
I suspect that a lot of the file I/O is actually due to the number of files that
must be opened and closed, rather than the total number of characters that must
be read from those files. (Windows actually seems to cache directory tables as
they are accessed, making even a simple directory search run faster the second
time through.) My gut tells me that you will not even be able to measure the
performance difference between these two approaches.
Regardless, I imagine that most LDView users spend a very low percentage of
their time loading a model as opposed to spinning, zooming, and viewing it.
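One way to check that gut feeling empirically is a small harness along these lines. It is only a sketch: the names count_lines_fgets, count_lines_fgetc and time_counter are made up for illustration and are not taken from LDView, and clock() reports CPU time rather than wall-clock time, so time spent waiting on the disk is deliberately left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Count lines with fgets(); in binary mode only LF-terminated lines
 * (including CR+LF) are recognized as separate lines. */
long count_lines_fgets(FILE *fp)
{
    char buf[4096];
    long lines = 0;
    int partial = 0;            /* nonzero while inside an unterminated line */

    assert(fp != NULL);
    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t n = strlen(buf);
        if (n > 0 && buf[n - 1] == '\n') {
            lines++;
            partial = 0;
        } else {
            partial = 1;        /* long line chunk or last line without LF */
        }
    }
    return lines + partial;
}

/* Count lines one character at a time; CR, LF, CR+LF and LF+CR
 * each count as a single line break. */
long count_lines_fgetc(FILE *fp)
{
    long lines = 0;
    int partial = 0;
    int c;

    assert(fp != NULL);
    while ((c = fgetc(fp)) != EOF) {
        if (c == '\r' || c == '\n') {
            int other = (c == '\r') ? '\n' : '\r';
            int next = fgetc(fp);
            if (next != other && next != EOF)
                ungetc(next, fp);   /* not part of this line break */
            lines++;
            partial = 0;
        } else {
            partial = 1;
        }
    }
    return lines + partial;
}

/* Run one counter over the file at 'path' and return the CPU seconds used,
 * or -1.0 if the file cannot be opened. */
double time_counter(long (*counter)(FILE *), const char *path, long *lines_out)
{
    FILE *fp = fopen(path, "rb");
    clock_t start;

    if (fp == NULL)
        return -1.0;
    start = clock();
    *lines_out = counter(fp);
    fclose(fp);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_counter once per counter on the same model file, after a warm-up pass so the file is already cached, would show how much the per-character calls actually cost relative to fgets.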
|
|
(I've surely wasted more time typing this sentence than I have
spent waiting for fgetc() calls to return over the past 20 years.)
|
I can agree that may be true, but only if by "spent waiting" you mean the extra
time spent waiting vs. fgets (which I think is what you mean).
|
Right.
|
|
Splitting hairs over a few CPU cycles in some infrequently-used routines
does little or nothing for the overall performance of the program. OTOH, if
the program has a nervous breakdown because of an entirely predictable
situation, the user can waste a lot of time trying to work around the
problem.
|
I'll tell you what. I'll drop your algorithm into LDView and do some
empirical tests on the timing, and get back to you. I'll post the final
version of my fgets replacement along with the timing results.
|
I think a universal, cross-platform fgets() routine would be very useful to
everybody. If we slap it into a DLL, even non-C programmers could use it.
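As a rough sketch of what such a routine might look like, following the heuristic quoted earlier in this message (either CR or LF ends a line, and if the other character follows immediately the pair is one line break): the name universal_fgets is hypothetical, and unlike the real fgets() this version drops the terminator and silently truncates overlong lines. Those are simplifying assumptions, not anything agreed on in this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of any library).
 * Reads one line from fp into buf, storing at most size-1 characters and
 * no line terminator. CR, LF, CR+LF and LF+CR each count as one line break.
 * Returns buf, or NULL if end of file is reached before any character
 * is read. */
char *universal_fgets(char *buf, size_t size, FILE *fp)
{
    size_t len = 0;
    int c;

    assert(buf != NULL && size > 0 && fp != NULL);

    c = fgetc(fp);
    if (c == EOF)
        return NULL;

    while (c != EOF) {
        if (c == '\r' || c == '\n') {
            /* If the *other* terminator follows immediately, swallow it;
             * anything else belongs to the next line and is pushed back. */
            int other = (c == '\r') ? '\n' : '\r';
            int next = fgetc(fp);
            if (next != other && next != EOF)
                ungetc(next, fp);
            break;
        }
        if (len + 1 < size)
            buf[len++] = (char)c;   /* characters past size-1 are dropped */
        c = fgetc(fp);
    }
    buf[len] = '\0';
    return buf;
}
```

The file should be opened in binary mode ("rb") so that the C runtime does not translate line endings before this routine sees them.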
|
|
I guess the point I'm trying to make is that truly great software goes the
extra mile to handle special cases so that the user doesn't ever have to
worry about them. If some users are having problems with line termination
(and I assume they are since this is the second discussion thread on this
topic in less than 2 weeks) then the software should be fixed. Changing the
spec doesn't help a user to load a poorly-formed file, it only gives the
developer an excuse not to care.
|
While you're correct here, my main point wasn't that the program shouldn't be
made to take care of line endings, but that CR and LF+CR don't seem to ever
show up in LDraw files.
|
This may be generally true, but I think my point is that nobody has tight
control over the universe of applications that may be used to generate CAD files
for LDraw. I suspect that the reason LDraw uses a human-readable text format
instead of a more compact binary file format was to facilitate the use of a wide
range of editors to create content.
In fact, this gives me another idea: If file load times are a significant
factor in the performance/responsiveness of CAD applications, maybe it would be
worth pre-compiling the parts library into a binary format. All parts could
be imported into one large file. Granted, storing floating-point numbers as
ASCII is often more compact than native IEEE format, but you could basically
blast the entire part library into RAM in one fell swoop without parsing
anything. Of course, the program would need to be able to rebuild the
compiled parts database whenever new parts were downloaded.
I'll bet load times would drop down into the fractional second range with this
one optimization.
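To make the idea concrete, here is a minimal sketch of a "compiled" cache that is written once and later read back with a single fread(). Everything in it is an assumption for illustration: the CachedTriangle layout, the function names and the one-count-then-records file layout are invented here and are not an existing LDraw or LDView format. Because raw structs are written, the file is only valid on the machine and compiler that produced it, which fits the "rebuild when new parts are downloaded" approach.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record: one triangle with a color code and raw floats. */
typedef struct {
    int   color;
    float v[9];     /* three x,y,z vertices */
} CachedTriangle;

/* Write 'count' records preceded by the count itself.
 * Returns 1 on success, 0 on failure. */
int cache_write(const char *path, const CachedTriangle *tris, size_t count)
{
    FILE *fp;
    int ok;

    assert(path != NULL && (tris != NULL || count == 0));
    fp = fopen(path, "wb");
    if (fp == NULL)
        return 0;
    ok = fwrite(&count, sizeof count, 1, fp) == 1 &&
         fwrite(tris, sizeof *tris, count, fp) == count;
    return (fclose(fp) == 0) && ok;
}

/* Read the whole cache into one malloc'd array with no text parsing.
 * Returns NULL on failure or if the cache is empty; the caller frees
 * the result. A sketch only: the stored count is trusted as-is. */
CachedTriangle *cache_read(const char *path, size_t *count_out)
{
    FILE *fp;
    CachedTriangle *tris = NULL;
    size_t count = 0;

    assert(path != NULL && count_out != NULL);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;
    if (fread(&count, sizeof count, 1, fp) == 1 && count > 0) {
        tris = malloc(count * sizeof *tris);
        if (tris != NULL && fread(tris, sizeof *tris, count, fp) != count) {
            free(tris);         /* short read: treat the cache as invalid */
            tris = NULL;
        }
    }
    fclose(fp);
    if (tris != NULL)
        *count_out = count;
    return tris;
}
```

A real version would also need some way to notice that the parts library has changed (for example a stored timestamp or version number) so that it knows when to rebuild the cache.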
|
And while it's true that good programs should handle
unusual input conditions, you didn't mention the flip side, which is that
every extra line of code is an opportunity for new bugs.
|
True, but by that logic we should never write any code at all... ;)
|
|