Friday, February 26, 2010

Migration tool postponed until first part of next week

We've been working around the clock to get the migration tool launched this week as originally promised, but ran into two bugs that we are in the process of correcting. We are not happy about delaying this by a couple of days, but launching it with the knowledge that a few bugs remained is definitely not the way to go into a weekend.

We'll have the video walkthrough and announcement post on this blog as soon as it's ready.

Monday, February 22, 2010

Migration tool almost ready, still on track

When we announced the migration tool last month, we pointed to the week of February 22 for its launch. That week is upon us, and we're just about ready to launch. We're doing some final QA on the tool, and will get it launched this week. Exact launch date is subject to a few variables - but we will update the blog as soon as it's available.

Once launched, you'll see an alert in your dashboard if you have any FTP blogs to migrate, and will be able to get started right away.

Friday, February 12, 2010

Migration deadline extended to May 1

When we originally announced that we were shutting down FTP publishing on Blogger, one of the common complaints we heard was that March 26 was too soon. Many of you asked for more time to manage the transition off of FTP publishing, either to migrate your blog(s) to Blogger hosting or to another platform.

March 26 was not an arbitrary date: it was tied to an upcoming infrastructure change at Google that will render our current FTP code unworkable. We relayed your feedback to the infrastructure team, and they have agreed to postpone their change, which allows us to extend the deadline to May 1.

We are finishing up the FTP migration tool, and are still working to have it out the week of February 22.

Finally, we've done a sweep through your recent comments, and will be updating the FAQ later today with more answers.

Thursday, February 11, 2010

The Technical Side of FTP

by Noah Fiedel, Blogger Engineering Tech Lead

Many of you have posted asking why Google doesn't continue supporting FTP publishing indefinitely or even charge money to those who would like this support. I'd like to give you some insight into our decision making and technology.

The Protocol

FTP is one of the earliest protocols on the Internet. It was drafted in 1971 before security was a concern, as the Internet at the time connected universities and research labs. Unlike nearly all other Internet protocols, it uses two insecure and unencrypted ports simultaneously. This makes securing FTP effectively impossible on both the server and network levels. FTP servers at ISPs are therefore vulnerable to attack, and your password can be 'sniffed' by anyone with access to the traffic to or within your ISP. sFTP, while more secure than FTP, still requires us to store your user credentials — which itself is undesirable from a security perspective.
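
To make the "sniffing" point concrete, here is a minimal sketch in Python. It builds the login commands an FTP client sends on its control connection (USER and PASS, as defined in RFC 959) and shows that anyone who captures those bytes can read the password directly. The function names, user name, and password are hypothetical, and the exchange is simplified: a real session interleaves these commands with server replies.

```python
from typing import Optional

# Illustrative only: no network traffic is sent. The user name and password
# below are made-up values.


def ftp_login_bytes(user: str, password: str) -> bytes:
    """Return the raw login commands as they would cross the network."""
    return f"USER {user}\r\nPASS {password}\r\n".encode("ascii")


def sniff_password(captured: bytes) -> Optional[str]:
    """What an eavesdropper can do: scan captured bytes for a PASS command."""
    for line in captured.decode("ascii").split("\r\n"):
        if line.startswith("PASS "):
            return line[len("PASS "):]
    return None


captured = ftp_login_bytes("blogowner", "not-a-real-password")
print(sniff_password(captured))
```

Nothing in the protocol hides the PASS line, which is why protecting the password depends entirely on who can see the traffic.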

Compare this to the HTTP protocol, drafted 20 years later in 1991: FTP doesn't have a mechanism to discover whether an FTP server is up, down, slow, or temporarily unavailable. HTTP supports all of these and more, and is now the basis for nearly all activity on the Internet.

Due to FTP's weaknesses, many ISPs restrict access to their FTP servers. They do this by limiting your FTP account to a list of approved Internet addresses (via an IP whitelist), which makes your account less likely to be hijacked. In the next section you will see why this also makes it difficult for Blogger to reliably provide FTP publishing.

Google Datacenters

Google runs many datacenters, and Blogger runs in several of them. Each datacenter has a different Internet address (a.k.a. "IP address") when it connects to your FTP server, so if your hosting service requires an IP whitelist, it has to list every IP address associated with each of our active datacenters. We really don't like having a "primary" datacenter for anything, and instead prefer to let our traffic flow to the most efficient and lowest-latency datacenter for our users. This makes the IP whitelist problem even worse, as some ISPs allow only a single IP (or very few) to be whitelisted. And since our datacenter addresses change regularly, your ISP's whitelist would have to be updated each time. This leads to a significant amount of user frustration, and regularly results in blogs failing to publish successfully. Diagnosing these issues has taken up a large part of our engineering and support teams' time.
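
The whitelist problem can be sketched in a few lines of Python. This is only an illustration, assuming an ISP that checks the connecting address against a fixed list. The list contents and the three "datacenter" addresses are hypothetical, taken from the IP ranges reserved for documentation.

```python
import ipaddress

# Hypothetical whitelist an ISP might hold for one FTP account.
WHITELIST = [
    ipaddress.ip_network("192.0.2.10/32"),    # a single approved address
    ipaddress.ip_network("198.51.100.0/28"),  # a small approved range
]


def is_allowed(source_ip, whitelist=WHITELIST):
    """Return True if the connecting address falls inside any approved network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in network for network in whitelist)


# Publishes arriving from three datacenters; the last address is one the
# ISP was never told about (for example, a newly added datacenter).
for ip in ["192.0.2.10", "198.51.100.5", "203.0.113.7"]:
    print(ip, "accepted" if is_allowed(ip) else "rejected")
```

A publish routed through the third address is refused even though the user's credentials are correct, which is the kind of failure that is hard to diagnose from either side.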

FTP Web Hosting Providers

Late last year, during scheduled maintenance on a datacenter, we moved FTP publishing to another datacenter and updated our publicly posted IPs for your ISP's whitelists. Even after doing this, there were considerable complaints from users unable to publish via FTP to thousands of ISPs. Many of these ISPs maintain their own IP whitelists, often in an undocumented way. Troubleshooting this is extremely difficult and time-consuming for us (and for you), as it's rarely clear where the underlying issue is. Our engineering, product and support teams often ended up directly contacting ISPs and waiting on hold, frequently without resolution, all to support a single user's report of "can't publish via FTP".

In many cases the user or ISP simply entered the IP whitelist incorrectly. In other cases the hosting service's FTP server was unreachable. On more than one occasion, the ISP had set up "staging" and "production" environments without telling their users what was happening, so while Blogger was successfully publishing (to the staging server), the posts were not visible on the web and the user had no idea why their posts weren't showing up.

In a great many cases, FTP publishing works but is extremely slow, because shared hosting plans give each user slow or limited network or disk capacity. We have seen cases of full FTP republishes taking over a month, entirely due to the FTP server being slow.

What about sFTP?

If sFTP addresses some of the concerns with FTP, why are we shutting it down too? Fewer than 15% of users have adopted sFTP as their publishing mechanism, and many of the same challenges apply to both sFTP and FTP. Even with sFTP, a republish of a blog can take longer than a month to complete. Not all ISPs support sFTP and of those that do, many lock down FTP and sFTP with the same IP whitelist.

Engineering Effort

Blogger's current FTP support was completely rewritten for stability and maintainability in 2008. We added redundant queues on our side to make sure we never missed a file. We have spent a significant amount of engineering time improving FTP support in the last two years, not including support and troubleshooting of user issues. Even after this effort, approximately 10% of all FTP publishes fail.

As the original blog post mentioned, Google infrastructure is changing and would require us to again rewrite a major portion of our FTP support. Even after that rewrite, things would be no better than they are today in terms of stability, and we would be running just to stay in place.

Decision

I hope this post helps you understand the issues we face, and why it is not simply a question of money or a small bit of time. We want to deliver a best-in-class product experience for all users. Supporting a protocol with known security vulnerabilities and dependencies on downstream ISPs was preventing us from delivering the stable, reliable, and functional product we want and our users demand. It was also preventing us from doing more for the 99.5% of users who host their blogs with us, either on their own domain or on blogspot.com. While we deeply regret the impact this has on some of our users, many of whom have relied on Blogger for years, we remain confident that this was the right decision.

Wednesday, February 3, 2010

Who's affected?

There has been some confusion in the comments about who is impacted by the announcement that we're deprecating FTP on Blogger. The easiest way to determine if you're affected is to go to Blogger, select "settings" for your blog and click on "publishing". If you're on FTP, you'll see a notice such as "You're publishing via FTP" (or "You're publishing via S/FTP").

A bit more information:
  • If your blog URL ends in "blogspot.com", you are not affected
  • If you don't see "You're publishing via FTP" (or "You're publishing via S/FTP") when you go to "Settings | Publishing", you are not affected

Missing Files Host: what it does, why it helps

For people contemplating a switch to custom domains, one good question that's shown up repeatedly in the comments concerns files other than blog posts. These could be images, PDFs, Word docs, etc. - and if you're moving to a custom domain, you will want to think about what happens to those files.

Before we get to the specifics of what's happening, it's useful to take a step back and understand the mechanics of DNS and how it relates to your blog. Let's assume for a moment your blog is published to http://www.yourdomain.com/. There are several parts to that URL that are important for this discussion:

  • yourdomain.com: The domain you registered, from your registrar.
  • www: the CNAME (often referred to as the subdomain), commonly configured by default by your registrar, to point to a nameserver.
  • Nameserver: the server that stores the right IP address to direct users to for the CNAME
  • IP Address: the numerical address (something like 123.45.6.7) that identifies the server which hosts the content

When someone types "www.yourdomain.com" into their browser, the CNAME is passed to the nameserver, which translates it into an IP address. The server at that IP address then receives the request (for the homepage, for instance) and sends the page back.

Important note: For this illustration, we're using the example of someone who's moving from www.yourdomain.com (hosted by someone else) to www.yourdomain.com (hosted by Blogger). If you are considering setting your blog up on a subdomain (e.g., blog.yourdomain.com), your setup will be slightly different. The purpose of this post is to explain Blogger's Missing Files Host; later posts (and the migration tool) will provide more guidance about addressing specific situations.

When you move a CNAME (in this example, let's assume that you were hosting www.yourdomain.com with "Joe's Webhost", and have opted to move www.yourdomain.com to Blogger), you're simply instructing the Internet to direct requests for "www" to Blogger's IP address, not to Joe's Webhost.
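
As a rough illustration of what moving a CNAME changes, the sketch below models DNS records as a plain Python dictionary and resolves a name before and after the move. It is a toy, not a real resolver. The address 123.45.6.7 is the example address used above for Joe's Webhost, while the name blogger-host.example and the address 203.0.113.20 are hypothetical stand-ins for Blogger's side.

```python
def resolve(name, records, max_hops=5):
    """Follow CNAME records until an A record (an IP address) is found."""
    for _ in range(max_hops):
        record_type, value = records[name]
        if record_type == "A":
            return value
        name = value  # CNAME: continue the lookup under the aliased name
    raise RuntimeError("too many CNAME hops")


# Old setup: "www" leads to Joe's Webhost.
old_records = {
    "www.yourdomain.com": ("A", "123.45.6.7"),
}

# New setup: "www" is a CNAME pointing at a (hypothetical) Blogger hostname.
new_records = {
    "www.yourdomain.com": ("CNAME", "blogger-host.example"),
    "blogger-host.example": ("A", "203.0.113.20"),
}

print(resolve("www.yourdomain.com", old_records))
print(resolve("www.yourdomain.com", new_records))
```

The name the visitor types never changes. Only the record behind it does, so the same URL now lands on a different server.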

The important thing to recognize at this point is that Joe's Webhost still has your old content, but there is no URL to request the old content. Blogger's Missing Files Host is designed to address this, by watching for requests to us that 404 (meaning we can't find them) and rewriting them to look for them in another location.

Using the above example, let's spell this out:

  • Old setup: www.yourdomain.com --> 123.45.6.7 (maintained by Joe's Webhost)
  • New setup: CNAME "www" to point to Blogger, which results in requests for www.yourdomain.com --> Blogger's IP address
  • Create a CNAME (we'll use "files") to point to Joe's Webhost (Joe's Webhost can help you with this, our help file discussing CNAMEs is here)
  • In Blogger's Custom Domain options, enable the Missing Files Host and input "http://files.yourdomain.com"

Result? You have a PDF stored on your domain today (managed by Joe) at http://www.yourdomain.com/uploads/resume.pdf. Once you point "www" to Blogger, requests for that URL will fail - we don't have that content on our servers. By enabling the Missing Files Host, requests will first go to us, and when we can't find the content in question, we'll automatically redirect the request to the backup URL, in this case: http://files.yourdomain.com/uploads/resume.pdf. This will be invisible to the end user, and will ensure that all of your old content will surface as intended.
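
The fallback described above can be sketched as follows. This is a simplified model in Python, not Blogger's actual implementation: the function name, the set of hosted paths, and the use of a 302 status for the redirect are all assumptions made for illustration.

```python
def handle_request(path, hosted_paths, missing_files_host=None):
    """Return (status, redirect_location) for a request path.

    hosted_paths stands in for the content the blog host actually has.
    """
    if path in hosted_paths:
        return (200, None)  # found locally: serve it
    if missing_files_host:
        # Not found locally: send the visitor to the same path on the backup host.
        return (302, missing_files_host.rstrip("/") + path)
    return (404, None)  # no fallback configured


blog_content = {"/", "/2010/02/first-post.html"}  # hypothetical hosted paths

print(handle_request("/", blog_content, "http://files.yourdomain.com"))
print(handle_request("/uploads/resume.pdf", blog_content, "http://files.yourdomain.com"))
print(handle_request("/uploads/resume.pdf", blog_content))
```

With the Missing Files Host set, the request for /uploads/resume.pdf is pointed at http://files.yourdomain.com/uploads/resume.pdf. Without it, the same request simply fails.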

Tuesday, February 2, 2010

FAQ: SFTP counts as FTP

Addressing a common question I’ve heard: SFTP support is also being discontinued along with FTP.

We’re playing a bit fast and loose with the terminology of these actually quite different protocols, so whenever you read us saying something about “FTP” just add in “and SFTP” to it and you’re set.

For answers to this and other questions, take a look at our FTP (and SFTP) FAQ.

FAQ page published

Just added an FAQ page to address some of the most common questions.

For blogs that are no longer updated

If you no longer update your blog, but it is still configured to publish via FTP, then you received an e-mail from us today announcing the FTP shut-down. If you don't intend to update your blog, then you don't have to do anything. You won't have the ability to update the blog from within Blogger, but if you don't intend to publish new content to the blog, the HTML files will remain intact on your server and your existing posts will continue to function just fine.

If, on the other hand, you intend to update the blog down the road, you will be able to use our migration tool to convert your FTP blog to a Blogger-hosted blog (either at your own domain or at Blogspot), and you can then update the blog at a later date.

Some background on the process

Long-time Blogger engineer Pete Hopkins wrote a great blog post last week on his personal blog that captures the thought process behind our approach to shutting down support for FTP. From his post:

"Though we’ve tried to put together a migration process that will work smoothly for everyone, I’m sure it won’t be perfect; there are too many moving parts in FTP publishing to guarantee that everyone will have a great experience. Nevertheless, I believe that our overall plan is sound, so I’d like to tell you about what we came up with, as well as some of the alternatives that we considered (and that might work better for you if you want to try them out)."

That last part is key: he outlines a number of more "advanced" approaches that may be more appropriate for users, given certain situations. We'll be talking more about those in the next couple of weeks here, and will answer questions as you raise them.

If you're interested in understanding more of the thinking behind what we're doing and why we're doing it, give Pete's post a read.