Blog posts for year 2011 page 2

News and other things I find interesting




Jul
10
2011

Web optimizations and canvas exporting with data URLs

Last modified: Sunday, July 10, 2011

We see the http URL scheme just about every day:

http://www.brianbondy.com

The http part in the example above is the URL scheme.

But there are also dozens of other URL schemes, including: ftp, mailto, irc, smb, chrome, about, snmp, and data. This post talks about one called the data URL scheme.

The data URL scheme, amongst other things, allows you to embed images into your HTML pages. That means that no separate HTTP request/response is needed to obtain such an image. It looks like this:

<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==" alt="Red dot" />

The first part of the URL is the scheme, data, followed by the MIME type, image/png, optionally followed by base64. If base64 is not specified, the content is assumed to be ASCII text with non-printable characters percent-encoded. The last part of the URL, after the comma, is the content of the file in the chosen encoding.
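Here is a minimal Python sketch of that layout, building a data URL from raw bytes and taking one apart again. The helper names make_data_url and parse_data_url are made up for illustration, and the parser only handles the simple "MIME type plus optional base64" form described above, not extra parameters such as charset.

```python
import base64
from typing import Tuple
from urllib.parse import quote, unquote_to_bytes


def make_data_url(payload: bytes, mime_type: str = "image/png",
                  use_base64: bool = True) -> str:
    """Build a data URL: scheme, MIME type, optional ;base64, comma, content."""
    if use_base64:
        body = base64.b64encode(payload).decode("ascii")
        return f"data:{mime_type};base64,{body}"
    # Without base64 the content is percent-encoded text.
    return f"data:{mime_type},{quote(payload)}"


def parse_data_url(url: str) -> Tuple[str, bytes]:
    """Split a simple data URL back into (mime_type, payload)."""
    header, _, body = url.partition(",")
    if not header.startswith("data:"):
        raise ValueError("not a data URL")
    meta = header[len("data:"):]
    is_base64 = meta.endswith(";base64")
    if is_base64:
        meta = meta[:-len(";base64")]
    mime_type = meta or "text/plain"  # assumed default when no type is given
    payload = base64.b64decode(body) if is_base64 else unquote_to_bytes(body)
    return mime_type, payload


# The first 8 bytes of every PNG file (the PNG signature).
png_signature = b"\x89PNG\r\n\x1a\n"
url = make_data_url(png_signature)
print(url)
print(parse_data_url(url))
```

The printed URL begins with the same iVBORw0KGgo prefix as the image tag above, because that prefix is simply the base64 form of the PNG signature.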

The data URL scheme was specified in 1998 in RFC 2397 and has been implemented by most major browsers as of HTML4. Most major browsers already have pretty good coverage for HTML5. IE1-IE7 lack support.

The benefit of using the data URL scheme is that if the image is small, the overhead is less than that of the HTTP request/response headers. It also frees up concurrent connections, since each browser has a maximum number of connections it can make in total and to each domain.

You wouldn't want to use the data URL scheme for large images, or if you require support for IE7 and below. Your image won't be separately cached either, so it will be downloaded with each request for the parent HTML page. You can get around this last limitation, though, by specifying your data URL inside an already cached CSS file with a CSS rule such as background:url('data:image/png;base64,...');

Overall it is a good thing to use and I'd use it for social icons in HTML.
There are also many other uses of the data URL scheme mentioned below.


Uses of the data URI scheme:

You may have noticed that sometimes emails with images don't have a separate attachment. You can use the data URI scheme inside HTML email messages without having a separate image attachment.

The new HTML5 <canvas> element allows you to export your canvas to a data URL. You can do this with <canvas>.toDataURL


How it relates to me:

I recently improved the BMP and ICO decoder (refactoring plus adding support for PNG ICOs) for Firefox. I also have to implement BMP and ICO encoders so that we can have better shell integration with Windows 7.

A side effect of writing these ICO and BMP encoders is that Firefox will support BMP and ICO generation via <canvas>.toDataURL('image/ico') and <canvas>.toDataURL('image/bmp'). This makes Firefox a pretty good image conversion program. It also makes it possible, for example, for a web developer to implement a favicon creator without server-side code. No other browsers currently implement the BMP and ICO MIME types for canvas exporting.

1 comment(s)

anon on Wednesday, September 28, 2011 (10:09:04) says:

Mad props! Sounds neat.





Jun
12
2011

What you should know about HTTP pipelining

Last modified: Sunday, June 12, 2011

This article will cover the following topics:

  • An overview of the basics of HTTP
  • What is HTTP pipelining?
  • What problems can appear with HTTP pipelining?
  • Why you should care about HTTP pipelining?
  • Which web servers support HTTP pipelining?
  • Which browsers support HTTP pipelining? (And how to enable it)
  • Which programming languages/libraries support HTTP pipelining?

An overview of the basics of HTTP

The HTTP protocol works by sending requests and getting responses back for those requests.

I will not get into the details of the HTTP protocol syntax (headers, HTTP methods, paths, parameters, etc.), as this post would be too long. Instead I'll just cover some basics, show a basic HTTP GET request and response, and then dive right into explaining HTTP pipelining.

A typical HTTP request looks something like this:

GET / HTTP/1.1
Host: www.brianbondy.com
User-Agent: Mozilla/5.0 
Connection: keep-alive

A typical HTTP response looks something like this:

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Server: Google Frontend
Content-Length: 12100

...content...

On a single socket, a single request is sent out, and then a single response is retrieved.
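To make the request and response format concrete, here is a small Python sketch that builds a GET request as bytes and picks apart the head of a response. It does no network I/O, and build_get_request and parse_response_head are hypothetical helper names used only for illustration. The sample response is invented and much shorter than a real one.

```python
from typing import Dict, Tuple


def build_get_request(host: str, path: str = "/") -> bytes:
    """Serialize a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: keep-alive",
        "",  # the empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response_head(raw: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status code, headers with lower-cased names, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, status, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(status), headers, body


print(build_get_request("www.brianbondy.com"))

sample = (b"HTTP/1.1 200 OK\r\n"
          b"Content-Type: text/html; charset=utf-8\r\n"
          b"Content-Length: 5\r\n"
          b"\r\n"
          b"hello")
print(parse_response_head(sample))
```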

A browser or other HTTP client could create multiple sockets to a server and make multiple requests. The picture on the right shows 2 HTTP requests and responses on 2 different sockets.

Pretty much all web browsers open multiple connections per server today.

In Firefox you can adjust this amount by going to about:config and adjusting:
network.http.max-connections-per-server

Mine defaulted to 15.


Several requests to a single server are very typical. For example an HTML file can have several referenced images.

To avoid creating several connections, HTTP 1.1 introduced persistent connections.

The picture on the right shows 3 requests and responses on a single persistent connection.

Having several connections can give better speed, but if you need to create a new connection for each and every request, it will use many more resources, require more TCP handshakes, and be susceptible to TCP slow-start.

If you look back at the example HTTP request above, the HTTP header: "Connection: keep-alive" indicates that you would like to use a persistent connection. The default is to use a persistent connection, but the server is not forced to do this, and it can send a "Connection: close" header.


What is HTTP pipelining?

HTTP pipelining is a feature of HTTP 1.1 persistent connections. It means that you can send multiple requests on the same socket without waiting for each response.

The picture on the right shows 6 requests and responses using at most 3 requests at a time.

HTTP is based on TCP, and one of TCP's guarantees is ordered delivery. This means that all of the requests sent out on the same socket will be received in that order on the server. An HTTP server that supports HTTP pipelining will send its responses in the same order.
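The ordering guarantee is what lets a client match responses to requests. The Python sketch below shows the idea without touching the network: several requests are written back to back, and the concatenated responses are split apart and paired with the requests by position. The function names are hypothetical, the response bytes are canned, and the splitter assumes every response carries a Content-Length header (no chunked encoding).

```python
from typing import Dict, List


def build_pipelined_requests(host: str, paths: List[str]) -> bytes:
    """Concatenate several GET requests so they can be sent in one write."""
    return b"".join(
        f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
        for path in paths
    )


def split_responses(stream: bytes) -> List[bytes]:
    """Split back-to-back responses into bodies using Content-Length."""
    bodies = []
    while stream:
        head, _, rest = stream.partition(b"\r\n\r\n")
        length = 0
        for line in head.split(b"\r\n")[1:]:  # skip the status line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.decode("ascii"))
        bodies.append(rest[:length])
        stream = rest[length:]
    return bodies


def match_responses(paths: List[str], stream: bytes) -> Dict[str, bytes]:
    """Pair each request path with its response body, purely by order."""
    return dict(zip(paths, split_responses(stream)))


paths = ["/a.png", "/b.png", "/c.png"]
print(build_pipelined_requests("www.brianbondy.com", paths))

canned = (b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nAAA"
          b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nBB"
          b"HTTP/1.1 200 OK\r\nContent-Length: 1\r\n\r\nC")
print(match_responses(paths, canned))
```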

Pipelining is also possible over HTTPS connections, where it gives an even greater speedup because it avoids the extra SSL/TLS handshakes that additional connections would need.


What problems can appear with HTTP pipelining?

Although the HTTP 1.1 RFC indicates that HTTP implementations should support persistent connections, it is possible that they will not.

You can't be sure if an HTTP server supports HTTP pipelining before making a request.
The server may even send a "Connection: Close" header after your first request is sent indicating it does not want to use a persistent connection.

There could also be proxies in between that cause problems, so it is not ideal for an HTTP client to rely on a blacklist to determine which servers support persistent connections.

Based on the HTTP 1.1 RFC, if a client finds that a pipelined connection is not supported, the client should re-attempt the failed requests.

To avoid problems with a server getting 2 of the same requests and the client not knowing it, the client should only use pipelining on HTTP methods which are idempotent. In general, idempotence means that you can apply the same operation 1 or many times, and it will have the same effect. Example: setting a variable x to the value of 3 is an idempotent operation. Setting a variable to one more than its last value is NOT an idempotent operation.

In terms of HTTP, PUT and DELETE are idempotent operations; GET, HEAD, OPTIONS and TRACE should be idempotent; and POST is probably not. In practice, most browsers that do support pipelining only do so for GET and HEAD requests.
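The variable example can be written out in a few lines of Python. This is only an illustration: is_idempotent checks a single starting state rather than proving anything in general, and can_pipeline just encodes the GET and HEAD rule of thumb mentioned above. All of the names are made up.

```python
from typing import Callable, Dict

State = Dict[str, int]


def set_x_to_three(state: State) -> State:
    """Idempotent: applying it again changes nothing."""
    return {**state, "x": 3}


def increment_x(state: State) -> State:
    """Not idempotent: every application changes the result."""
    return {**state, "x": state["x"] + 1}


def is_idempotent(operation: Callable[[State], State], state: State) -> bool:
    """Check, for one starting state, that applying twice equals applying once."""
    once = operation(state)
    twice = operation(once)
    return once == twice


# Methods a cautious client would be willing to pipeline (and retry).
PIPELINE_METHODS = frozenset({"GET", "HEAD"})


def can_pipeline(method: str) -> bool:
    return method.upper() in PIPELINE_METHODS


print(is_idempotent(set_x_to_three, {"x": 0}))
print(is_idempotent(increment_x, {"x": 0}))
print(can_pipeline("GET"), can_pipeline("POST"))
```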

Sometimes it's hard for a client to determine whether the server's response is valid or garbage. Pipelined requests sent to servers that don't support pipelining need to be retried, which makes them slower.

It would be nice, but servers do not currently tell a client that they support pipelining. If all servers did, then only the first request would need to be non-pipelined if the client didn't already know if the server had support.


Why you should care about HTTP pipelining?

Pipelining can reduce the number of TCP/IP packets. The typical maximum segment size (MSS) is in the range of 536 to 1460 bytes, so several HTTP requests could fit into a single packet. There are also wins with the congestion control strategy, connection handshake, connection teardown and SSL handshake.
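As a back-of-the-envelope check on the packet claim, this Python snippet counts how many copies of the small example request from earlier in this post would fit in one segment at each end of that MSS range. It is a rough sketch: requests_per_segment is a made-up helper, and real requests are usually larger because of cookies and extra headers.

```python
def requests_per_segment(request_size: int, mss: int) -> int:
    """How many whole requests of request_size bytes fit in one segment."""
    if request_size <= 0:
        raise ValueError("request_size must be positive")
    return mss // request_size


# The example GET request from the HTTP overview section.
EXAMPLE_REQUEST = (
    b"GET / HTTP/1.1\r\n"
    b"Host: www.brianbondy.com\r\n"
    b"User-Agent: Mozilla/5.0\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n"
)

for mss in (536, 1460):
    fits = requests_per_segment(len(EXAMPLE_REQUEST), mss)
    print(f"MSS {mss}: {fits} request(s) of {len(EXAMPLE_REQUEST)} bytes fit")
```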

What this means is that you can get much faster page loads by using HTTP pipelining.

I've been using it in Opera and Firefox and have not run into problems.


Which web servers support HTTP pipelining?

Most modern web servers support HTTP pipelining. IIS 4.0 is said to not have support for it.


Which browsers support HTTP pipelining? (And how to enable it)

  • Google Chrome: No
  • Safari: No
  • Internet Explorer: No
  • Opera: Yes
  • Firefox: Yes, but you need to enable it by following the steps below.

You can adjust HTTP pipelining settings in Firefox by changing the following settings in about:config:

For HTTP pipelining: Set network.http.pipelining to true

For HTTP proxy pipelining: (Use this if you want to try pipelining and you use a proxy server) Set network.http.proxy.pipelining to true

For HTTPS pipelining: Set network.http.pipelining.ssl to true

To adjust the number of requests to send at once: Set network.http.pipelining.maxrequests to 8. The pipelining picture above would have a value of 3 here.

Note:

  • The network.http.max-connections-per-server setting is clamped between 1 and 255. (This setting has nothing to do with pipelining but you can adjust it)
  • The network.http.pipelining.maxrequests setting is clamped between 1 and NS_HTTP_MAX_PIPELINED_REQUESTS which is defined to be 8. Unless you compile your own builds, a value of 8 is the most you can try with Firefox.
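"Clamped" here means that out-of-range values are silently pulled back into the allowed range. The Python sketch below shows that behavior for the two settings. It is not Firefox's code, and the function names are invented; only the limits (1 to 255, and 1 to 8) come from the notes above.

```python
# Limit quoted above; the constant name mirrors the one in the Firefox source.
NS_HTTP_MAX_PIPELINED_REQUESTS = 8


def clamp(value: int, low: int, high: int) -> int:
    """Pull value back into the inclusive range [low, high]."""
    return max(low, min(value, high))


def effective_max_requests(requested: int) -> int:
    """Value that network.http.pipelining.maxrequests would end up using."""
    return clamp(requested, 1, NS_HTTP_MAX_PIPELINED_REQUESTS)


def effective_max_connections_per_server(requested: int) -> int:
    """Value that network.http.max-connections-per-server would end up using."""
    return clamp(requested, 1, 255)


for requested in (0, 3, 8, 32):
    print(requested, "->", effective_max_requests(requested))
```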

Which programming languages/libraries support HTTP pipelining?

Many popular programming libraries across most programming languages support pipelining.

For example, here's a small subset of libraries that support pipelining:

  • Python: httplib2, Twisted
  • .NET Framework: System.Net.HttpWebRequest
  • C++: Qt's QNetworkRequest, libcurl

1 comment(s)

Lennie on Saturday, March 10, 2012 (07:03:06) says:

A lot of mobile browsers these days have pipelining enabled by default.





Jun
8
2011

Switching careers, starting soon with Mozilla

Last modified: Wednesday, June 08, 2011

On July 6th, I will be going through a major change in my life: I’ll be leaving the company I co-founded and worked at for nearly a decade, and will be starting at Mozilla as a contractor.

I'm confident that the company I co-founded will continue to prosper under its new parent company, and I will be excited to hear about the company's success over time.

I'll leave behind many great memories, exciting projects, and extremely intelligent co-workers.

A few months ago, I was approached by a Mozilla technical recruiter and after careful consideration actually turned down the offer for contract work. For several months, I was torn on whether to stay at my current comfy job, or to step out of my comfort zone, and do what I've always wanted to do.

Several months later, reading John Resig's (creator of jQuery) advice, I reconsidered and took the job:

"It’s been an incredible experience working with everyone at Mozilla. The company is easily one of the most developer-friendly organizations I can imagine, with some of the smartest coders in the world. Mozilla is hiring across the board – I strongly encourage you to apply if you’re looking for one of the best jobs you’ve ever had."

I'll be working as a contractor at Mozilla with the title of Platform Engineer. I'll be working on Firefox and Core components which includes working with XPCOM, XUL, XBL, JavaScript, CSS, Python, and C++. I feel very privileged to be working on such an exciting project.



Why I chose Mozilla

A few reasons why I chose Mozilla:

  • Much of the web's technology and innovation was developed at least in part by Mozilla.
  • The Internet wouldn't be what it is today without Mozilla.
  • They are a relatively small and open source organization; your work will be seen.
  • They support openness and stand behind their beliefs to deliver a free and open web.
  • Some of the smartest people in the world work there. My old company also had this, but the people there had backgrounds similar to mine.
  • Firefox has a significant market share and is used by millions of people around the world.
  • Mozilla has a thriving, extremely intelligent community behind it.
  • They are not forcing me to move and they embrace a distributed team.
  • They do what is right, and not what makes the most money.
  • I get to be part of something larger than myself.
  • Mozilla is one of the coolest places and code bases I could ever have the privilege to work on and contribute to.

Mozilla is also accelerating its release cycle, which is exciting in itself. Firefox 4 was released in early 2011, and they plan to also release versions 5, 6 and 7 in 2011.

If you are a developer interested in contributing to Mozilla related technologies, a good place to start is the Developer Guide. If you'd like to find out more about pursuing a career at Mozilla, read here.

2 comment(s)

Joshua Kehn on Monday, June 13, 2011 (01:06:19) says:

I wish you the best, it sounds like an awesome opportunity.

mikez on Thursday, July 07, 2011 (07:07:13) says:

I am ensure that you selected a right way. Hope you only the best on your new job. Mozilla is a really huge figure in our world.





Apr
26
2011

StackExchange average age of users for each tag

Last modified: Thursday, April 28, 2011

I thought it would be interesting to calculate the average age of users on each StackExchange site, and even more interesting to see each tag within those sites. I did a calculation using the April 2011 data dump and came up with the following data. I call the statistic the Expected Age of a tag because it is calculated using the expected value.

Observations:

  • The expected age of the whole StackOverflow site is ~30 years old.
  • On StackOverflow, the tag with the youngest expected age comes in at 26 years old and the tag with the oldest at 36. I was surprised they were so close together.
  • The site with the youngest users of the StackExchange network is Gaming, followed, surprisingly, by Game dev and then Ask Ubuntu.
  • The site with the oldest users of the StackExchange network is: Do It Yourself, followed by Photography, and then by Geographic Information Systems.
  • A funny one, on ServerFault one of the tags with the oldest expected age is old-hardware. Apparently older people know more about old-hardware than anything else.
  • I'm not sure if this is true, but perhaps the tags with younger ages are more cutting edge. For example vb6 and COBOL have ages of over 36 on Programmers SE. I don't think this assertion is true in general though.

And as for the other sites, the expected age is:

You can see the per user tag data by clicking on the site name in the above list.

You could probably say that the StackExchange network could use younger contributors. I've said this before, but I think it would be advantageous for the StackExchange team to do some events at universities. When I previously helped with some Microsoft events at the University of Waterloo (the top computer science university in Canada, and one of the top in the world), several students didn't know what StackOverflow was.

How I made the calculations per tag

The calculations below use the April 2011 StackOverflow data dump.

For each StackExchange site, I calculated the average age of the users answering within each tag.

To do this, I calculated the Expected Age of each site.

Expected Age = Summation over each age X of: P(X) * X

Where P(X) is the probability that a given answer within the tag comes from a user of age X. You can calculate this probability by counting the answers from users of age X and dividing by the total number of answers within that tag.
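The formula is short enough to sketch in Python. This is not the script I ran against the data dump. It is a minimal illustration that assumes the input is simply a list with one age per answer (the age of the user who wrote it), and the sample ages are invented.

```python
from collections import Counter
from typing import Iterable


def expected_age(answer_ages: Iterable[int]) -> float:
    """Expected Age = sum over each age X of P(X) * X.

    answer_ages holds one entry per answer: the age of the answering user.
    """
    counts = Counter(answer_ages)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no answers with a known age")
    # P(X) is the share of answers written by users of age X.
    return sum((count / total) * age for age, count in counts.items())


# Invented sample: four answers within one tag.
print(expected_age([20, 30, 30, 40]))
```

Because every answer adds one count for its author's age, a user who answers often pulls the result toward their own age.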

I also only considered the top 3000 tags. The top tags may not match up exactly since I only consider tags if the answerer has an age specified in their profile.

Other attempts at these stats

I initially tried to compute this statistic by weighting each age by the reputation of each user, but it did not generate interesting data. The problem was that the result was weighted so heavily toward high-reputation users that it effectively only included the top 1% or so of users.

Limitations of this study

  • Several users don't enter their age in their profile, so answers from users without a specified age are not counted.
  • Users who are very young and users who are very old may be less likely to enter their age.
  • Each user may be counted more than once, since I count +1 toward an age for every answer posted by a user of that age.
  • Some users may be entering fake age values, although I ignored age values out of an acceptable range.
  • We are talking about averages here, so this doesn't mean there aren't a lot of younger and older contributors.
    For example, if an average is 20 years old, there could be an equal number of 10 and 30 year olds answering, or there could be only 20 year olds answering.
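The last limitation is easy to see with two invented samples in Python: both have the same mean, but the spread is very different. The summarize helper is hypothetical, and the standard deviation is not part of the statistics reported in this post.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def summarize(ages: List[int]) -> Tuple[float, float]:
    """Return (mean, population standard deviation) for a list of ages."""
    return mean(ages), pstdev(ages)


mixed = [10, 30, 10, 30]    # equal numbers of 10 and 30 year olds
uniform = [20, 20, 20, 20]  # only 20 year olds

print("mixed:  ", summarize(mixed))
print("uniform:", summarize(uniform))
```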

1 comment(s)

Andrew Steele on Tuesday, April 26, 2011 (04:04:23) says:

For some reason I find this data to be really interesting. This is not a study I would have performed, but I am glad you did. Kudos!





Apr
25
2011

Twitter, LinkedIn, and Facebook lists updated for StackExchange April dumps

Last modified: Thursday, April 28, 2011

I refreshed my lists of social networking accounts (Twitter, LinkedIn, and Facebook) for StackExchange users. The lists are sorted by reputation and updated for the April 2011 data dump.

The data dumps surface every 2 months, so I will update the lists on my site around the same frequency.

This month 7 new sites appeared since they came out of the StackExchange beta:

  • Android
  • Apple
  • Do It Yourself
  • Electronics
  • Geographic Information Systems
  • Unix
  • Wordpress

You can view all of the links for each list on this section of my site.

For the first time there are over 20 StackExchange sites, and so I ran into a problem of Twitter only allowing you to host 20 lists. For each site I use an automatically maintained list of the top 500 users.

I tried to contact Twitter support to raise my limit of 20 lists but they could not help. I ended up getting my 2 sons to host the lists, so I have all automatic lists up and room for another 36 StackExchange sites. Thanks @linkbondy and @ronniebondy.





