Subject: Re: URL characters
Newsgroups: lugnet.publish
Date: Mon, 26 Jul 1999 13:25:25 GMT
Viewed: 4707 times

Todd Lehman:
> BTW, I'm consciously going against what W3 says about the ~ symbol.
> According to the 'national' production here...
>
> http://www.w3.org/Addressing/URL/5_BNF.html
>
> ...the tilde character (ASCII code 126) isn't valid in URLs. I've
> seen people write %7E instead of ~, which is apparently the only
> correct way to write it, but almost no one is aware of that (as
> judged by the huge number of ~'s seen in URLs *everywhere* on the
> net). So practically speaking, it would be more broken (in users'
> minds) to disallow ~ than to allow it.
The primary reason for disallowing ~ is the special
treatment it gets on keyboards for several European
languages, where it acts as a dead key that combines with
the next letter typed.
If people type <slash> <tilde> <s> <p> (expecting "/~sp"),
some systems will just return "/sp".
Similarly, if people type <slash> <tilde> <n> <o> (expecting
"/~no"), most systems will return "/ño".
I have actually seen the latter of these two misprints in a
Danish newspaper.
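
To make the two misprints above concrete, here is a rough
Python sketch. The dead-key table and the function name
type_with_dead_tilde are made up for illustration and do not
describe any particular keyboard driver; real layouts differ
in which letters combine and in what happens to an
unmatched tilde. The last lines show why the %7E form
mentioned in the quoted text sidesteps the problem: there is
no tilde keystroke to be swallowed.

```python
from urllib.parse import unquote

# Hypothetical dead-key table: letters that combine with a preceding tilde.
TILDE_COMPOSITIONS = {"n": "ñ", "N": "Ñ", "a": "ã", "A": "Ã", "o": "õ", "O": "Õ"}


def type_with_dead_tilde(keys: str) -> str:
    """Simulate typing `keys` on a layout where ~ is a dead key (assumed model)."""
    out = []
    pending = False  # True right after a tilde keystroke
    for key in keys:
        if pending:
            pending = False
            # Combine if possible; otherwise the tilde is silently dropped.
            out.append(TILDE_COMPOSITIONS.get(key, key))
        elif key == "~":
            pending = True
        else:
            out.append(key)
    return "".join(out)


print(type_with_dead_tilde("/~sp"))  # /sp
print(type_with_dead_tilde("/~no"))  # /ño

# Writing the tilde as %7E (hex for ASCII code 126) avoids the dead key.
escaped = "/" + "%{:02X}".format(ord("~")) + "no"
print(type_with_dead_tilde(escaped))           # /%7Eno
print(unquote(type_with_dead_tilde(escaped)))  # /~no
```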
> So what's up with that, anyway? How the heck did ~ gain such huge
> popularity if it's not officially allowed in URLs?
Both the original CERN http daemon and Apache use ~ for
user home pages.
> Was it allowed once upon a time?
Dunno.
It was probably allowed until people started thinking
about which characters give trouble.
"ð" is also disallowed, just because of a few silly American
operating systems.
Play well,
Jacob
------------------------------------------------
-- E-mail: sparre@cats.nbi.dk --
-- Web...: <URL:http://www.ldraw.org/FAQ/> --
------------------------------------------------