From time to time I will share some stories based on true events, maybe someone will learn something from them. Then again, maybe not. To protect the innocent, some names and events might be edited. Here comes the first one.
Someone raised a ticket that their application cannot access a certain url, let’s say “http://My.url.tld”. You dutifully log in to the system in question and try to access the url. Since the app is using the “libcurl” library, you naturally try to test with the respective utility. You confirm that it does not work:
[user@someserver ~]$ curl http://My.url.tld
Error message.
[user@someserver ~]$
In the same time a colleague also sees the ticket but for some reason he does the testing by the way of “wget”. It’s working for him:
[user@someserver ~]$ wget http://My.url.tld
Correct result.
[user@someserver ~]$
You go back and forth with “it’s working”, “no, it’s not” messages until both of you realize that you test differently. So, it’s working with “wget” but not with “curl”. Baffling. What could be wrong ?
After running both utils in debug mode you spot a minute difference:
[user@someserver ~]$ curl -v http://My.url.tld
* About to connect() to My.url.tld port 80 (#0)
* Trying 1.1.1.1... connected
* Connected to My.url.tld (1.1.1.1) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl
> Host: My.url.tld:80
> Accept: */*
>
<
< Error message
<
<
* Connection #0 to host My.url.tld left intact
* Closing connection #0
[user@someserver ~]$
[user@someserver ~]$ wget -d http://My.url.tld
DEBUG output created by Wget 1.12 on linux-gnu.
Resolving My.url.tld... 1.1.1.1
Caching My.url.tld => 1.1.1.1
Connecting to My.url.tld|1.1.1.1|:80... connected.
Created socket 3.
Releasing 0x000000000074fb60 (new refcount 1).
---request begin---
GET / HTTP/1.0
User-Agent: Wget (linux-gnu)
Accept: */*
Host: my.url.tld:80
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200 OK
Correct answer
---response end---
200 OK
Registered socket 3 for persistent reuse.
Length: 242 [text/xml]
Saving to: “filename”
100%[=========================================================================================================================================================================>] 242 --.-K/s in 0s
“filename” saved [242/242]
[user@someserver ~]$
Have you seen it ?
.
.
.
.
.
(suspense drumroll)
.
.
.
.
.
.
.
Turns out that wget is doing the equivalent of a tolower(“url”) so in the actual http request it’s sending “Host: my.url.tld” and curl it’s just taking what I specified in the command line, namely “Host: My.url.tld”. Taking the test test further it turns out that calling curl with the “only lowercase” url is producing the expected results (i.e. working).
I know what you are thinking, it should not matter how you call an hostname. True. Except that in this story there is a load balancer in the way, who tries (and mostly succeeds) to do smart stuff. Well, it turns out that there was an host-based string match in that load balancer that did not quite matched the mixed-case cases.
But a question remains. What is the correct behavior ? The “curl” or the “wget” one ? I lean on the “curl” approach but maybe I am biased. What do you think ?