Blog posts tagged: http
News and other things I find interesting
Last modified: Sunday, July 10, 2011
We see the http URL scheme just about every day, at the start of most web addresses. The http part of such an address is the URL scheme.
But there are also dozens of other URL schemes, including: ftp, mailto, irc, smb, chrome, about, snmp, and data. This post talks about one called the data URL scheme.
The data URL scheme, amongst other things, allows you to embed images into your HTML pages. That means that no separate HTTP request/response is needed to obtain such an image. It looks like this:
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==" alt="Red dot" />
The first part of the URL is the scheme, data, followed by the MIME type, image/png, optionally followed by base64 (if base64 is not specified, the content is assumed to be ASCII characters, with non-printable characters percent-encoded).
The last part of the URL, after the comma, is the content of the file in the appropriate encoding.
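To make the layout concrete, here is a minimal Python sketch that builds and parses a data URL following the scheme, MIME type, optional base64, comma, content order described above. The function names make_data_url and parse_data_url are my own for illustration, and the parser ignores extra parameters such as charset.

```python
import base64
from typing import Tuple
from urllib.parse import quote, unquote_to_bytes


def make_data_url(data: bytes, mime_type: str, use_base64: bool = True) -> str:
    """Build a data URL of the form data:<mime type>[;base64],<content>."""
    if use_base64:
        payload = base64.b64encode(data).decode("ascii")
        return f"data:{mime_type};base64,{payload}"
    # Without base64, anything outside the safe ASCII set is percent-encoded.
    return f"data:{mime_type},{quote(data)}"


def parse_data_url(url: str) -> Tuple[str, bytes]:
    """Split a data URL into (mime type, decoded content bytes)."""
    header, _, payload = url.partition(",")
    if not header.startswith("data:"):
        raise ValueError("not a data URL")
    parts = header[len("data:"):].split(";")
    mime_type = parts[0] or "text/plain"
    if "base64" in parts[1:]:
        return mime_type, base64.b64decode(payload)
    return mime_type, unquote_to_bytes(payload)
```

Calling make_data_url on the bytes of a small PNG file with the MIME type image/png gives a string you could paste into the src attribute of an img tag.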
The data URL scheme was specified in 1998 in RFC 2397 and has been implemented by most major browsers as of HTML4. Most major browsers already have pretty good coverage for HTML5. IE1-IE7 lack support.
The benefit of using the data URL scheme is that if the image is small, the overhead is less than the HTTP request/response headers. It also frees up concurrent connections, since each browser has a maximum number of connections it can make in total and to each domain.
You wouldn't want to use the data URL scheme for large images, or if you require support for IE7 and below. Your image won't be separately cached either, which means that it will be downloaded with each request to the parent HTML page. You can get around this last limitation, though, by specifying your data URL inside an already cached CSS file with a CSS rule such as background: url('data:image/png;base64,...');
Overall it is a good thing to use and I'd use it for social icons in HTML.
There are also many other uses of the data URL scheme mentioned below.
Uses of the data URI scheme:
You may have noticed that sometimes emails with images don't have a separate attachment. You can use the data URI scheme inside HTML email messages without having a separate image attachment.
The new HTML5 <canvas> element allows you to export your canvas to a data URL. You can do this with <canvas>.toDataURL.
How it relates to me:
I recently improved the BMP and ICO decoder (refactoring plus adding support for PNG ICOs) for Firefox. I also have to implement BMP and ICO encoders so that we can have better shell integration with Windows 7.
A side effect of doing these ICO and BMP encoders is that Firefox will support BMP and ICO generation via the <canvas> element's toDataURL.
This makes Firefox a pretty good image conversion program.
This also makes it possible, for example, for a web page developer to implement a favicon creator without server-side code. No other browsers currently implement BMP and ICO MIME types for canvas exporting.
Last modified: Sunday, June 12, 2011
This article will cover the following topics:
- An overview of the basics of HTTP
- What is HTTP pipelining?
- What problems can appear with HTTP pipelining?
- Why you should care about HTTP pipelining?
- Which web servers support HTTP pipelining?
- Which browsers support HTTP pipelining? (And how to enable it)
- Which programming languages/libraries support HTTP pipelining?
An overview of the basics of HTTP
The HTTP protocol works by sending requests and getting responses back for those requests.
I will not get into the details of the HTTP protocol syntax (headers, HTTP methods, paths, parameters, etc.), as this post would be too long. Instead I'll just cover some basics and then dive right into explaining HTTP pipelining. But I will show a basic HTTP GET request and response.
A typical HTTP request looks something like this:
GET / HTTP/1.1
Host: www.brianbondy.com
User-Agent: Mozilla/5.0
Connection: keep-alive
A typical HTTP response looks something like this:
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Server: Google Frontend
Content-Length: 12100

...content...
On a single socket, a single request is sent out, and then a single response is retrieved.
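As a rough illustration of the wire format, here is a small Python sketch that assembles a GET request like the one above and splits a response into its status line, headers, and body. The helper names build_get_request and parse_response_head are assumptions of mine, and the parser is deliberately simplistic (it does not handle repeated headers or chunked bodies).

```python
from typing import Dict, Tuple


def build_get_request(host: str, path: str = "/") -> bytes:
    """Assemble a GET request; each line ends with CRLF, and a blank line ends the headers."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "User-Agent: Mozilla/5.0",
        "Connection: keep-alive",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response_head(raw: bytes) -> Tuple[int, str, Dict[str, str], bytes]:
    """Split a raw response into (status code, reason, headers, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(status), reason, headers, body
```

The bytes returned by build_get_request("www.brianbondy.com") are what a client would write to the socket, and parse_response_head can be fed whatever comes back.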
A browser or other HTTP client could create multiple sockets to a server and make multiple requests. The picture on the right shows 2 HTTP requests and responses on 2 different sockets.
Pretty much all web browsers do multiple connections per server today.
In Firefox you can adjust this number by going to about:config and adjusting the network.http.max-connections-per-server setting.
Mine initially defaulted to 15.
Several requests to a single server are very typical. For example an HTML file can have several referenced images.
To avoid creating several connections, HTTP 1.1 introduced persistent connections.
The picture on the right shows 3 requests and responses on a single persistent connection.
Having several connections can give better speed, but if you need to create a new connection for each and every request, it will use many more resources, require more TCP handshakes, and be susceptible to TCP slow-start.
If you look back at the example HTTP request above, the HTTP header: "Connection: keep-alive" indicates that you would like to use a persistent connection. The default is to use a persistent connection, but the server is not forced to do this, and it can send a "Connection: close" header.
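Here is one way to see a persistent connection in action with Python's standard library. This is only a sketch: it starts a throwaway local server (the PortHandler class and the client_ports_seen function are names I made up) that replies with the client port it saw. If every request reports the same port, they all travelled over a single TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PortHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets the server keep the connection open between requests.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Reply with the client's TCP port so the caller can see which socket was used.
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def client_ports_seen(paths):
    """Request each path on one HTTPConnection; return the client port the server saw each time."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), PortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        ports = []
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            # The body must be read fully before the next request can reuse the socket.
            ports.append(int(response.read().decode("ascii")))
        return ports
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
```

Note that each response is read completely before the next request goes out, so this is a persistent connection but not yet pipelining.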
What is HTTP pipelining?
HTTP pipelining is a feature of HTTP 1.1 persistent connections. It means that you can send multiple requests on the same socket without waiting for each response.
The picture on the right shows 6 requests and responses using at most 3 requests at a time.
HTTP is based on TCP, and one of TCP's guarantees is ordered delivery. This means that all of the requests sent out on the same socket will be received in that order on the server. An HTTP server that supports HTTP pipelining will send its responses in the same order.
Pipelining is also possible over secure HTTP (HTTPS) connections, where it gives an even greater speedup because it avoids the extra SSL/TLS handshakes that additional connections would need.
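The ordering guarantee can be sketched with a raw socket in Python: write all the requests first, then read the responses back one by one. As before, the local test server, the PathHandler class, and the helper names are assumptions for illustration only. The server simply echoes the requested path, so the order of the bodies shows the order of the responses.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PathHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep the connection open between requests

    def do_GET(self):
        body = self.path.encode("ascii")  # echo the requested path back
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def read_response_body(reader):
    """Read one response from the stream, using Content-Length to find where it ends."""
    reader.readline()  # status line, ignored in this sketch
    length = 0
    while True:
        line = reader.readline()
        if line in (b"\r\n", b""):
            break
        name, _, value = line.decode("iso-8859-1").partition(":")
        if name.strip().lower() == "content-length":
            length = int(value.strip())
    return reader.read(length).decode("ascii")


def pipelined_get(paths):
    """Send every GET before reading any response, then read the responses in order."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), PathHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        with socket.create_connection(("127.0.0.1", server.server_port), timeout=5) as sock:
            batch = b"".join(
                f"GET {path} HTTP/1.1\r\nHost: localhost\r\n\r\n".encode("ascii")
                for path in paths
            )
            sock.sendall(batch)  # all requests go out back to back
            with sock.makefile("rb") as reader:
                return [read_response_body(reader) for _ in paths]
    finally:
        server.shutdown()
        server.server_close()
```

A real client would also need to handle a server that closes the connection partway through, which is where the problems in the next section come from.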
What problems can appear with HTTP pipelining?
Although the HTTP 1.1 RFC indicates that HTTP implementations should support persistent connections, it is possible that they will not.
You can't be sure if an HTTP server supports HTTP pipelining before making a request.
The server may even send a "Connection: Close" header after your first request is sent indicating it does not want to use a persistent connection.
There could also be proxies in between that cause problems, which makes it less than ideal for an HTTP client to rely on a blacklist to determine which servers support persistent connections.
Based on the HTTP 1.1 RFC, if a client finds that a pipelined connection is not supported, the client should re-attempt the failed requests.
To avoid problems with a server getting 2 of the same requests and the client not knowing it, the client should only use pipelining on HTTP methods which are idempotent. In general, idempotence means that you can apply the same operation 1 or many times, and it will have the same effect. Example: setting a variable x to the value of 3 is an idempotent operation. Setting a variable to one more than its last value is NOT an idempotent operation.
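The two examples from the paragraph above can be written out in a few lines of Python. The function names are mine, and apply_times simply stands in for a client that retries the same request a number of times.

```python
def set_to_three(state):
    """Idempotent: the result is the same however many times it runs."""
    state["x"] = 3


def increment(state):
    """Not idempotent: every extra run changes the result again."""
    state["x"] += 1


def apply_times(operation, times, start=0):
    """Apply an operation repeatedly, as a retrying client might, and return the final x."""
    state = {"x": start}
    for _ in range(times):
        operation(state)
    return state["x"]
```

Retrying set_to_three is harmless, since one run and five runs leave x in the same place. Retrying increment is not, which is why a non-idempotent request should not be blindly re-sent.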
In terms of HTTP, PUT and DELETE are idempotent operations. GET, HEAD, OPTIONS and TRACE should be idempotent, and HTTP POST is probably not. In practice, most browsers that do support pipelining only do so for GET and HEAD requests.
Sometimes it's hard for a client to determine if the server's response is valid or garbage.
Requests using pipelining to servers which don't support pipelining need to be retried and so it would be slower.
It would be nice, but servers do not currently tell a client that they support pipelining. If all servers did, then only the first request would need to be non-pipelined if the client didn't already know if the server had support.
Why you should care about HTTP pipelining?
Pipelining can reduce the number of TCP/IP packets. The typical maximum segment size (MSS) is in the range of 536 to 1460 bytes, so several HTTP requests can fit into a single packet. There are also wins with the congestion control strategy, connection handshake, connection teardown and SSL handshake.
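To get a feel for the numbers, here is a back-of-the-envelope Python sketch. The request below is a made-up minimal example (the image path is hypothetical). Real browser requests carry more headers, so fewer of them fit, but the arithmetic is the same.

```python
def requests_per_segment(request: bytes, mss: int) -> int:
    """How many whole copies of a request fit into one segment of `mss` bytes."""
    return mss // len(request)


# A hypothetical minimal request; real ones are usually several hundred bytes.
sample_request = b"GET /images/logo.png HTTP/1.1\r\nHost: www.brianbondy.com\r\n\r\n"

for mss in (536, 1460):
    count = requests_per_segment(sample_request, mss)
    print(f"MSS {mss}: {count} requests of {len(sample_request)} bytes each")
```

Without pipelining, each of those requests would go out in its own packet, after waiting for the previous response.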
What this means is that you can get much faster page loads by using HTTP pipelining.
I've been using it in Opera and Firefox and have not run into problems.
Which web servers support HTTP pipelining?
Most modern web servers support HTTP pipelining. IIS 4.0 is said to not have support for it.
Which browsers support HTTP pipelining? (And how to enable it)
- Google Chrome: No
- Safari: No
- Internet Explorer: No
- Opera: Yes
- Firefox: Yes, but you need to enable it by following the below steps.
You can adjust HTTP pipelining settings in Firefox by changing the following settings in about:config.
For HTTP pipelining:
For HTTP proxy pipelining: (Use this if you want to try pipelining and you use a proxy server)
For HTTPS pipelining:
To adjust the number of requests to send at once, set network.http.pipelining.maxrequests to 8. The pipelining picture above would have a value of 3 here.
The network.http.max-connections-per-server setting is clamped between 1 and 255. (This setting has nothing to do with pipelining, but you can adjust it.)
The network.http.pipelining.maxrequests setting is clamped between 1 and NS_HTTP_MAX_PIPELINED_REQUESTS, which is defined to be 8. Unless you compile your own builds, a value of 8 is the most you can try with Firefox.
Which programming languages/libraries support HTTP pipelining?
Many popular programming libraries across most programming languages support pipelining.
For example, here's a small subset of the libraries that support pipelining:
- Python: httplib2, Twisted
- .NET Framework: System.Net.HttpWebRequest
- C++: Qt's QNetworkRequest, libcurl
Last modified: Friday, April 22, 2011
Here's a quick tutorial on how email works, SMTP, POP3, IMAP, Webmail, ...
What is a Standard?
A standard is a set of rules that are followed by all developers around the world. Some standards include HTTP, SMTP, POP3, …
There is official documentation that describes each individual standard and most standards have been around for 0 to 30 years.
Each standard document is a very detailed explanation of what the standard is and how it works. Typically a standard has an RFC number associated with it, but there are many different types of standards.
SMTP and POP3 are ‘standards’. Each standard describes a different protocol. A protocol is any kind of communication between 2 or more computers.
What is SMTP?
SMTP is the ‘standards’ protocol that is used to send email. Your computer uses SMTP to send email. See RFC 821, August 1982.
What is POP3?
POP3 is the ‘standards’ protocol that is used to receive email. Your computer uses POP3 to receive email. POP3 is also referred to as simply POP. See RFC 1939, May 1996.
POP3 will typically connect to the mail server and download messages to your computer. It can then optionally delete the messages from the server (which it is usually set up to do).
How Email works
- User A wants to send an email to user B.
- User A writes up an email and presses send.
- User A’s computer uses SMTP communication to send the email to User A’s (yes A, not B) SMTP server.
- User A’s SMTP server sends the email to User B’s SMTP server using SMTP communication.
- User B, when he feels like it, contacts his mail server and uses POP3 to download the messages.
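The sending and receiving ends of these steps can be sketched with Python's standard smtplib and poplib modules. Treat this as an outline under assumptions: the function names are mine, and the host names, addresses, and credentials you would pass in are placeholders. Only build_message runs without a real mail server.

```python
import poplib
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """User A writes up an email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_smtp(msg, smtp_host, port=25):
    """User A's computer hands the message to User A's own SMTP server."""
    with smtplib.SMTP(smtp_host, port) as smtp:
        smtp.send_message(msg)


def download_via_pop3(pop_host, user, password):
    """User B connects with POP3 and downloads every waiting message."""
    mailbox = poplib.POP3(pop_host)
    try:
        mailbox.user(user)
        mailbox.pass_(password)
        count, _size = mailbox.stat()
        messages = []
        for number in range(1, count + 1):
            _response, lines, _octets = mailbox.retr(number)
            messages.append(b"\r\n".join(lines))
        return messages
    finally:
        mailbox.quit()
```

The hop from User A's SMTP server to User B's SMTP server happens between the servers themselves, so it does not appear in client code like this.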
Some important notes:
The only way to send email is to use SMTP. (Actually you can also use MAPI and some other things but let's not get into that)
The only way to receive email is to use POP3. (Actually there is also IMAPv4, but we'll pretend that POP3 is the only way)
How Email Applications work:
SMTP communication is present on your computer, no matter what email client you use. Any time an email is sent out, your computer uses SMTP to send the email. It doesn’t matter if you're using Eudora, Outlook, Outlook Express, Mozilla Thunderbird, or a custom made program. All programs use SMTP to send emails.
By using standards you are guaranteed that, even though User A uses Outlook and User B uses Eudora, and they both have different SMTP servers, both users will be able to communicate.
What is HTTP?
Before I can get to what web mail is, you first need to know what HTTP is. HTTP is just another standard protocol. But HTTP is meant to download files and web pages, unlike SMTP which is meant to send emails. See HTTP 1.1 RFC 2616, June 1999.
What is web mail?
Web mail is an online web page that allows you to send and receive emails using HTTP.
But wait a minute, didn’t I just say that the ONLY way to send email was using SMTP?
Yes! What the web page does, is provide you with a form that you fill out. Your computer doesn’t know that it is any different from a form that you fill out to enter your credit card information, or a form that you fill out to enter your home address, or a form that you fill out to sign into another web site. All your computer knows is that you are filling out a form.
When you press the send button, your web browser sends the form to the server. The server knows that this form is for email though. So the server interprets the form and extracts the needed information. The HTTP server then uses SMTP to send the message. Because the only way that a message is going to get from User A to User B is using SMTP.
What the web browser has done is fool you into thinking that you are sending an email. But what’s really happening is that your web browser is filling out a form, and then the web server is using SMTP to send your email.
Can you give me a web mail walkthrough?
- User A wants to send an email to User B, and User A is going to use web mail.
- User A uses his browser to type in an internet address (for example: www.hotmail.com).
- User A’s computer uses HTTP to contact the server and ask for the web page that is used for web mail in this case.
- The server responds (using HTTP) to User A’s computer with a web page that gives him options to compose mail, check mail, …
- User A clicks on the compose a message link. Again User A’s computer uses HTTP to contact the server.
- The server responds (using HTTP) to User A’s computer with the web page (which contains a form) that allows User A to compose a message.
- User A fills in the web page and presses send. The page is sent back to the server using HTTP.
- In the background, unknown to User A, the web server uses SMTP to send the email to User B. Why? Because the only way to send an email is to use SMTP.
- The server responds (using HTTP) to User A’s computer with a web page that says the email was sent.
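The server-side step in this walkthrough, turning a submitted form into an email, might look roughly like the Python sketch below. The form field names (to, subject, body) and the function name form_to_message are hypothetical. A real web mail service would also authenticate the user and validate the input.

```python
from email.message import EmailMessage
from urllib.parse import parse_qs


def form_to_message(form_body: str, sender: str) -> EmailMessage:
    """Turn a URL-encoded form submission into an email message.

    The field names "to", "subject" and "body" are assumptions for this sketch.
    """
    fields = parse_qs(form_body)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = fields["to"][0]
    msg["Subject"] = fields.get("subject", [""])[0]
    msg.set_content(fields.get("body", [""])[0])
    # The web server would now pass msg to its SMTP server,
    # for example with smtplib.SMTP(...).send_message(msg).
    return msg
```

From the browser's point of view this is just another form post over HTTP. Only the server knows that the fields are destined for SMTP.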
How does the web server use SMTP?
Since SMTP is a standard protocol, the web server uses SMTP in the same way any other program would. See the section ‘How Email works’.
What is IMAPv4?
I mentioned IMAPv4 earlier. IMAPv4 is a second method used by email clients to retrieve your emails. IMAPv4 is also referred to as more simply IMAP. IMAPv4 is more complex than POP3, but gives you the ability to work on your email from multiple computers. If you use more than one computer, and you'd like to access your email from both computers, IMAP is the way to go.
IMAP stores all of its data on the mail server. In that way each mail client from each different computer can be in sync. When you read an email from one computer, your work computer will also see that the message is read. Since data is stored on the server, IMAP email accounts are typically more expensive.