Weird URL data encoding

Sunday 30 August 2009

This is over 15 years old. Be careful.

As a builder of web applications, I’m interested to see how others do it. This morning I received a promotional email from Snapfish about printing Facebook photos, and noticed the “view this email as a web page” link at the top. I figured the URL had to be unique enough to identify the campaign and the recipient, and it had to be obscure enough to prevent hacking so that I couldn’t peek in on others’ emails.

The typical way to do this is to include some keys in the URL, along with a hash of those keys computed with a secret that only the server knows. The server can check the URL for authenticity, then use the keys to retrieve the data to display. The URL can’t be tampered with, because if I fiddle with the keys, the hash will no longer match.
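A minimal sketch of that keys-plus-hash scheme, using an HMAC as the keyed hash. The secret, the parameter names (`campaign`, `recipient`, `sig`), and both function names are made up for illustration; nothing here is taken from Snapfish's actual system.

```python
import hashlib
import hmac
from typing import Dict, Optional
from urllib.parse import parse_qsl, urlencode

# Hypothetical secret; in a real system this lives only on the server.
SECRET = b"server-side-secret"


def _signature(query: str) -> str:
    """Keyed hash (HMAC-SHA256) of the canonical query string."""
    return hmac.new(SECRET, query.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_params(params: Dict[str, str]) -> str:
    """Return a query string with the keys plus a signature over them."""
    query = urlencode(sorted(params.items()))
    return f"{query}&sig={_signature(query)}"


def verify_query(query: str) -> Optional[Dict[str, str]]:
    """Return the keys if the signature matches, otherwise None."""
    pairs = parse_qsl(query, keep_blank_values=True)
    sig = next((v for k, v in pairs if k == "sig"), "")
    params = [(k, v) for k, v in pairs if k != "sig"]
    expected = _signature(urlencode(sorted(params)))
    # Constant-time comparison, on bytes so odd input cannot raise.
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return None
    return dict(params)


signed = sign_params({"campaign": "228", "recipient": "42"})
print(verify_query(signed))                                        # the original keys
print(verify_query(signed.replace("recipient=42", "recipient=43")))  # None
```

Changing any key without knowing the secret leaves a stale signature, so the server rejects the URL.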

This is the actual URL (broken to fit):

http://email.snapfish.com/servlet/cc6?
kgHiMpkoQSUYSQSVgLKxgLKIHlJoLtKLjQJhuVaVSVupjjhjiHnLmjtVolli://
LuHptQkgHiMpkoQJhu/kLjNtLl/OLIkplL/yLjkhgHtpFLKzghju? tnapfithGze228X27X42XQWRaceboojXPSNX3zeGQTAMBILXBQ
TLQTLPQTAFFGnedzgnedbauchekdes9colGthoyGn

The odd things here are the almost-words that appear in it: “tnapfith”, “Racebooj”, and “nedzgnedbauchekdes9col”. That’s almost “Snapfish”, “Facebook”, and my email address. And there’s the tell-tale “://” sequence with “olli” before it, which looks like ROT-13 “http” but is not.
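A quick way to check the ROT-13 hunch, as a small sketch. It assumes that “olli” really does stand for “http”; the `shifts` helper is just for illustration.

```python
import codecs
from typing import List


def shifts(cipher: str, plain: str) -> List[int]:
    """Per-letter alphabet shift (mod 26) that turns cipher into plain."""
    return [(ord(p) - ord(c)) % 26 for c, p in zip(cipher, plain)]


# What ROT-13 would actually produce for "http".
print(codecs.encode("http", "rot_13"))  # uggc

# The shift needed at each position to get from "olli" to "http".
print(shifts("olli", "http"))  # [19, 8, 8, 7]
```

The shifts differ from letter to letter, so under that assumption it is neither ROT-13 nor any other fixed rotation.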

I understand why the URL is so long: if you can store all of the data about the message in the URL itself, then you don’t need to store it on your server and then retrieve it by key when the link is clicked. But what’s with the grade-school encryption going on here?


Comments

Assuming there's a hash, I don't see why they bother with the encryption at all. But someone probably told them they have to encrypt and they need something fast, so they are just doing a simple XOR. In this case, the byte string that they are XORing with happens to have a lot of zero bytes or near-zero bytes, so you see a lot of chars that haven't changed at all, or have had only one or two of the low-order bits changed.
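One way to look at the XOR idea is to XOR the almost-words against the words they resemble and see which bits differ. This is only a sketch: the guessed plaintexts (lowercase "snapfish", and "Facebook") are assumptions, and `xor_diff` is a name made up here.

```python
from typing import List


def xor_diff(observed: str, guess: str) -> List[int]:
    """XOR each observed character with the guessed plaintext character."""
    return [ord(a) ^ ord(b) for a, b in zip(observed, guess)]


print(xor_diff("tnapfith", "snapfish"))  # [7, 0, 0, 0, 0, 0, 7, 0]
print(xor_diff("Racebooj", "Facebook"))  # [20, 0, 0, 0, 0, 0, 0, 1]
```

A zero means the character came through unchanged. The small non-zero values fit the description above, but they don't prove that XOR is what was actually used.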

Well, that's mildly interesting.
The URL seems to be composed of two parts, where the first part:

kgHiMpkoQSUYSQSVgLKxgLKIHlJoLtKLjQJhuVaVSVupjjhjiHnLmjtVolli://
LuHptQkgHiMpkoQJhu/kLjNtLl/OLIkplL/yLjkhgHtpFLKzghju?

is using a substitution code and decodes to

snapfish.****.**ned@nedbatchelder.com*H***mirrorpageurl*http://email.snapfish.com/servlet/website/PersonalizedForm

where * signifies a character I didn't resolve.
Though most of the plaintext characters are represented by a single cipher character, q, x, and z are special cases.
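A rough sketch of how a substitution table like this can be recovered from known plaintext. It lines up pieces of the encoded URL with the text they appear to decode to, using only the "email.snapfish.com", "servlet", and "website" pieces from the decoding above. The helper names are made up, and the special cases are not handled.

```python
from typing import Dict, Iterable, Tuple


def build_table(cribs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Build a cipher-to-plain mapping, failing if two cribs disagree."""
    table: Dict[str, str] = {}
    for cipher, plain in cribs:
        if len(cipher) != len(plain):
            raise ValueError(f"length mismatch: {cipher!r} vs {plain!r}")
        for c, p in zip(cipher, plain):
            if table.setdefault(c, p) != p:
                raise ValueError(f"inconsistent mapping for {c!r}")
    return table


def decode(text: str, table: Dict[str, str]) -> str:
    """Decode known characters; '*' marks unresolved letters and digits."""
    return "".join(table.get(ch, "*" if ch.isalnum() else ch) for ch in text)


# Encoded fragments paired with their presumed plaintext.
CRIBS = [
    ("LuHptQkgHiMpkoQJhu", "email.snapfish.com"),
    ("kLjNtLl", "servlet"),
    ("OLIkplL", "website"),
]

table = build_table(CRIBS)
print(decode("olli://LuHptQkgHiMpkoQJhu/kLjNtLl/OLIkplL/", table))
# http://email.snapfish.com/servlet/website/
print(decode("kgHiMpkoQ", table))
# snapfish.
```

The three cribs never contradict each other, and the table they produce also turns "olli" into "http". That is consistent with a one-to-one substitution for this part of the URL.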

As for the second part, I can't say how it's encoded, but it may well be a simple XOR, as Richard suggested.
