Subject: 
Re: URL characters
Newsgroups: 
lugnet.publish
Date: 
Mon, 26 Jul 1999 13:25:25 GMT
  
Todd Lehman:

BTW, I'm consciously going against what W3 says about the ~ symbol.
According to the 'national' production here...

   http://www.w3.org/Addressing/URL/5_BNF.html

...the tilde character (ASCII code 126) isn't valid in URLs.  I've
seen people write %7E instead of ~, which is apparently the only
correct way to write it, but almost no one is aware of that (as
judged by the huge number of ~'s seen in URLs *everywhere* on the
net).  So practically speaking, it would be more broken (in users'
minds) to disallow ~ than to allow it.
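To make the %7E point concrete, here is a minimal Python sketch of how the escaped form relates to the plain tilde. The helper names (percent_encode_char, escape_tilde) and the sample path are made up for illustration and are not taken from any code mentioned in this thread.

```python
from urllib.parse import unquote


def percent_encode_char(ch: str) -> str:
    """Return the %XX escape for a single ASCII character."""
    return "%{:02X}".format(ord(ch))


def escape_tilde(url_path: str) -> str:
    """Replace every literal tilde with its percent-escaped form."""
    return url_path.replace("~", percent_encode_char("~"))


# Hypothetical user home page path, escaped and then decoded again.
escaped = escape_tilde("/~sp/index.html")
print(escaped)
print(unquote(escaped))
```

Both spellings decode to the same path, which is why a server can accept either one.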

The primary reason for disallowing ~ is the special
treatment it gets in several European languages.

If people type <slash> <tilde> <s> <p> (expecting "/~sp")
some systems will just return "sp".

Similarly if people type <slash> <tilde> <n> <o> (expecting
"/~no"...) most systems will return "/ño".

I have actually seen the latter of these two misprints in a
Danish newspaper.
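The dead-key effect described above can be modelled roughly as follows. This is only a sketch under an assumed rule: the tilde waits for the next key, merges with it when a precomposed letter exists, and is silently dropped otherwise. Real keyboard drivers differ from system to system, and the function names are hypothetical.

```python
import unicodedata

COMBINING_TILDE = "\u0303"


def dead_key_tilde(next_char: str) -> str:
    """Assumed rule: merge tilde and letter if possible, else drop the tilde."""
    composed = unicodedata.normalize("NFC", next_char + COMBINING_TILDE)
    if len(composed) == 1:
        return composed   # e.g. n becomes n-with-tilde
    return next_char      # no such letter exists, so the tilde is lost


def type_keys(keys: str) -> str:
    """Simulate typing a sequence of keys on a layout where ~ is a dead key."""
    out = []
    pending_tilde = False
    for key in keys:
        if key == "~":
            pending_tilde = True   # dead key: nothing is emitted yet
        elif pending_tilde:
            out.append(dead_key_tilde(key))
            pending_tilde = False
        else:
            out.append(key)
    return "".join(out)


print(type_keys("/~sp"))
print(type_keys("/~no"))
```

Under this model the first input loses its tilde and the second turns into the n-with-tilde form, matching the two misprints described above.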

So what's up with that, anyway?  How the heck did ~ gain such huge
popularity if it's not officially allowed in URLs?

Both the original CERN http daemon and Apache use ~ for
user home pages.

Was it allowed once upon a time?

Dunno.

It was probably allowed until people started thinking
about which characters give trouble.

"ð" is also disallowed, just because of a few silly American
operating systems.

Play well,

Jacob

      ------------------------------------------------
      --  E-mail:        sparre@cats.nbi.dk         --
      --  Web...:  <URL:http://www.ldraw.org/FAQ/>  --
      ------------------------------------------------



