Subject: Re: how does a line ends?
Newsgroups: lugnet.cad
Date: Sat, 7 Apr 2007 20:22:47 GMT
  
In lugnet.cad, Travis Cobbs wrote:
   In lugnet.cad, Chris Phillips wrote:
   I’ve been writing text parsing programs for over 20 years, and have found that the approach I’ve suggested works very well at detecting line ends in a consistent manner.

You’ll note that I didn’t really say that there was anything wrong with your parsing routine (other than some personally negative feelings about fgetc and ungetc). After fixing the bugs, it will do exactly what you say it will do. It’s just that since I haven’t ever run into CR or LF+CR line endings in any LDraw file in close to seven years, I don’t personally feel that it is necessary.

Yes, sorry that my post sounded overly defensive. I have something of a chip on my shoulder from years of working alongside programmers who want to take shortcuts at the expense of their users. Very few seem to appreciate that it is worth a lot of effort by one developer to save a small amount of effort by many users. (What did Mr. Spock say about “the needs of the many...?”) It will likely be rare that anyone will be affected by this, but it is also not a lot of coding effort to sidestep the issue, either.

  
   You can pick apart my code until the cows come home, but the underlying heuristic works. Either CR or LF indicates end of line, and if the other character immediately follows, clump them together as a single line break.

I apologize. I wasn’t trying to pick apart your algorithm. There’s nothing wrong with it; my main argument was that I felt that fgets was acceptable instead for LDraw files.

Again, my choice of wording was clumsy, and I wasn’t taking offense. But I have learned that it is often a mistake to focus too heavily on optimizing code everywhere. The book Inner Loops tells some interesting stories about well-intentioned optimizations that actually hurt performance as CPU hardware evolved and the rules of the game shifted. Modern compilers can perform optimizations (e.g., function inlining) that are erasing many of the old rules as well.

  
   How big is a typical CAD file? How often do you need to load one from disk? Is the microscopic performance difference even noticeable? Maybe back in the

Actually, quite a bit of file I/O goes into reading an LDraw file, due to the way the parts are formatted. This can be observed by loading a medium size file after a fresh reboot (timing the load), and then repeating the process after the files end up in the cache. The second load will be a little faster. On my computer, LDView takes 4-5 seconds to do the file reading for the 8464.mpd that comes with LDView the first time you load it, and 1-2 seconds the second time. (LDView says Loading... in the status bar during the file reading, then switches to Parsing... after that stage is complete.)

I suspect that a lot of the file I/O is actually due to the number of files that must be opened and closed, rather than the total number of characters that must be read from those files. (Windows actually seems to cache directory tables as they are accessed, making even a simple directory search run faster the second time through.) My gut tells me that you will not even be able to measure the performance difference between these two approaches.

Regardless, I imagine that most LDView users spend a very low percentage of their time loading a model as opposed to spinning, zooming, and viewing it.

  
   (I’ve surely wasted more time typing this sentence than I have spent waiting for fgetc() calls to return over the past 20 years.)

I can agree that may be true, but only if by “spent waiting” you mean the extra time spent waiting vs. fgets (which I think is what you mean).

Right.

  
   Splitting hairs over a few CPU cycles in some infrequently-used routines does little or nothing for the overall performance of the program. OTOH, if the program has a nervous breakdown because of an entirely predictable situation, the user can waste a lot of time trying to work around the problem.

I’ll tell you what. I’ll drop your algorithm into LDView and do some empirical tests on the timing, and get back to you. I’ll post the final version of my fgets replacement along with the timing results.

I think a universal platform fgets() routine would be very useful to everybody. If we slap it into a DLL, even non-C programmers could use it.
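
For illustration, the heuristic discussed above (either CR or LF ends a line, and if the other character immediately follows, the pair counts as one break) might look roughly like the sketch below. This is not the routine posted earlier in the thread, and it is not LDView code; the name read_line_any_eol and its fgets-like signature are my own assumptions. It also assumes the file was opened in binary mode ("rb") so the C runtime does not translate line endings first, and unlike fgets it drops the terminator instead of storing it.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (name and signature are assumptions, not from any
 * LDraw tool). Reads one line from fp into buf, treating CR, LF, CR+LF,
 * and LF+CR each as a single line break. The terminator is not stored.
 * Returns buf, or NULL at end of file when nothing was read.
 * size must be at least 2. A line of size-1 or more characters is
 * returned in pieces, and the last piece may be empty. */
char *read_line_any_eol(char *buf, size_t size, FILE *fp)
{
    size_t n = 0;
    int c;

    if (buf == NULL || fp == NULL || size < 2)
        return NULL;

    while (n + 1 < size && (c = fgetc(fp)) != EOF) {
        if (c == '\r' || c == '\n') {
            /* Peek at the next character: swallow it only if it is the
             * *other* terminator, so CR+LF and LF+CR become one break
             * while LF+LF or CR+CR still mean an empty line. */
            int next = fgetc(fp);
            int is_pair = (next == '\r' || next == '\n') && next != c;
            if (next != EOF && !is_pair)
                ungetc(next, fp);
            buf[n] = '\0';
            return buf;
        }
        buf[n++] = (char)c;
    }

    buf[n] = '\0';
    /* n == 0 here means the very first read hit end of file. */
    return (n == 0) ? NULL : buf;
}
```

A caller would loop with while (read_line_any_eol(line, sizeof line, fp) != NULL) just as with fgets. Note the one ambiguity the heuristic accepts: an LF-terminated line followed by an empty CR-terminated line is read as a single LF+CR break.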

  
   I guess the point I’m trying to make is that truly great software goes the extra mile to handle special cases so that the user doesn’t ever have to worry about them. If some users are having problems with line termination (and I assume they are since this is the second discussion thread on this topic in less than 2 weeks) then the software should be fixed. Changing the spec doesn’t help a user to load a poorly-formed file, it only gives the developer an excuse not to care.

While you’re correct here, my main point wasn’t that the program shouldn’t be made to take care of line endings, but that CR and LF+CR don’t seem to ever show up in LDraw files.

This may be generally true, but I think my point is that nobody has tight control over the universe of applications that may be used to generate CAD files for LDraw. I suspect that the reason LDraw uses a human-readable text format instead of a more compact binary file format was to facilitate the use of a wide range of editors to create content.

In fact, this gives me another idea: If file load times are a significant factor in the performance/responsiveness of CAD applications, maybe it would be worth “pre-compiling” the parts library into a binary format. All parts could be imported into one large file. Granted, storing floating-point numbers as ASCII is often more compact than native IEEE format, but you could basically blast the entire part library into RAM in one fell swoop without parsing anything. Of course, the program would need to be able to rebuild the “compiled” parts database whenever new parts were downloaded.

I’ll bet load times would drop down into the fractional second range with this one optimization.
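
To make the “pre-compiled” idea a bit more concrete, here is a minimal sketch of the load path, assuming a made-up cache layout: a 32-bit count followed by that many floats in the machine's native representation. The function names save_cache and load_cache and the layout itself are hypothetical, and a real parts database would need much more (part names, line types, colours, a version stamp to trigger rebuilds). The point is only that loading becomes one fread with no text parsing.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* Hypothetical cache layout (an assumption for illustration only):
 * a uint32_t count, then count floats in native byte order and format.
 * Such a file is not portable between machines with different layouts. */

/* Writes count floats to path. Returns 0 on success, -1 on failure. */
int save_cache(const char *path, const float *values, uint32_t count)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    int ok = fwrite(&count, sizeof count, 1, fp) == 1 &&
             fwrite(values, sizeof *values, count, fp) == count;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Reads the whole cache into a freshly allocated array in one fread.
 * Returns NULL on failure or if the cache is empty; otherwise the caller
 * owns the array and *count_out holds the number of floats. */
float *load_cache(const char *path, uint32_t *count_out)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    uint32_t count = 0;
    float *values = NULL;
    if (fread(&count, sizeof count, 1, fp) == 1 && count > 0) {
        values = malloc((size_t)count * sizeof *values);
        if (values != NULL &&
            fread(values, sizeof *values, count, fp) != count) {
            free(values);   /* truncated or corrupt cache file */
            values = NULL;
        }
    }
    fclose(fp);

    if (values != NULL)
        *count_out = count;
    return values;
}
```

Whether this actually beats parsing text depends on how much of the load time is file opening versus parsing, which is the open question above, so treat it as something to measure, not a guaranteed win.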

   And while it’s true that good programs should handle unusual input conditions, you didn’t mention the flip side, which is that every extra line of code is an opportunity for new bugs.

True, but by that logic we should never write any code at all... ;)



©2005 LUGNET. All rights reserved. - hosted by steinbruch.info GbR