We really only expect developers to find this part interesting, so we assume that if you’re still reading, you are one.
We created the Forge project to provide a means for our customers to securely connect from our website, bitmunk.com, to their Bitmunk P2P application, which they run on their home computers. It is also related to two of our other projects: PaySwarm and Monarch.
In designing Forge, we used a similar approach to how we design many of our products here at Digital Bazaar. First, we start with a bottom-up approach. There are a lot of little individual parts that must all come together to make a complete TLS+HTTP stack. Once we’ve built and tested those little parts, we switch to a top-down approach. Here we discover exactly how the code best flows in a TLS implementation. We study how state-changes are driven in the system and which code paths can be or should be shared and reused. We design the API that a user of our TLS implementation will interact with and integrate it with our system design. This way we understand the bigger picture of what is going on, how the implementation will be used, and we can better ensure that we don’t make poor critical design decisions. Once we’ve got the code flow worked out, we can go in and fill out the middle details by gluing together the pieces we created during the bottom-up design phase.
Next we’ll provide an overview of the pieces you need to build a TLS implementation and explain how we decided the code ought to flow and interact with a user. We won’t really talk about the middle glue (for the most part it simply involves following the TLS spec), but we will briefly mention how we tested our implementation to sort out any problems we had.
The Cross-domain Problem
Since Flash 184.108.40.206 Adobe has made it possible to create raw sockets in Flash provided that the server they are to communicate with can serve up a cross-domain policy. That policy is in XML format and is, by default, served from port 843. You can, however, specify a different port from which to obtain the policy. There’s another option, which involves serving it directly in-line with the HTTP protocol, but not as part of the protocol itself. We thought that a bit hackish so we opted for using a custom port. This way our application can select a port (or you can configure one) that will serve up the policy. The port that gets selected can be uploaded to our website and stored in a database, along with the SSL certificate generated for the particular application, its current IP address, and access port. Dealing with firewalls is beyond the scope of this document other than to say that we use a UPnP implementation that handles the issue on many routers.
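To make the policy exchange concrete, here is a minimal sketch of the reply a policy port could send. It assumes the usual Flash socket policy handshake, in which the client sends `<policy-file-request/>` followed by a null byte and expects a null-terminated XML policy back. The function names and port 8443 are hypothetical, chosen only for illustration; this is not Forge's or Bitmunk's actual code.

```javascript
// Sketch only: builds the reply to a Flash socket policy request.
// handlePolicyRequest and buildPolicy are hypothetical names.
const POLICY_REQUEST = '<policy-file-request/>\0';

// Build a null-terminated cross-domain policy that lets pages served
// from `domain` open raw sockets to the listed ports.
function buildPolicy(domain, ports) {
  return '<?xml version="1.0"?>' +
    '<cross-domain-policy>' +
    '<allow-access-from domain="' + domain + '"' +
    ' to-ports="' + ports.join(',') + '"/>' +
    '</cross-domain-policy>\0';
}

// Return the policy if `data` is a policy request, otherwise null so
// the caller can treat the bytes as ordinary traffic.
function handlePolicyRequest(data, domain, ports) {
  return data === POLICY_REQUEST ? buildPolicy(domain, ports) : null;
}

// Example: allow pages from bitmunk.com to reach a hypothetical port 8443.
const policy = handlePolicyRequest(POLICY_REQUEST, 'bitmunk.com', [8443]);
```

In a real server this function would sit behind a TCP listener on the chosen policy port, writing the returned string to the socket and then closing it.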
We use Flash for only a handful of tasks:
- Create or destroy a raw socket.
- Send or receive data on that raw socket.
- Deflate or inflate data using the DEFLATE algorithm (technically zlib, not just raw DEFLATE).
- Store data (read: cookies) on the local disk using Flash’s SharedObject.
We tried to minimize our Flash usage as much as possible and we aim to replace what we can in the future with features from HTML5.
A TLS implementation requires a lot of little pieces to come together. We made some choices to try to create or acquire those little pieces as quickly as possible without sacrificing too much performance. Also, we were originally going to start out by implementing the latest version of TLS (1.2) or maybe 1.1 if 1.2 was still too new. It turns out that 1.1 is still too new. Our application uses OpenSSL for its server-side TLS, and OpenSSL hasn’t yet implemented anything higher than 1.0.
The Technical Requirements of a TLS Implementation
- Raw byte storage
- At least 1 cipher suite
The TLS 1.0 spec has a mandatory cipher suite: TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA. That name means the key exchange uses ephemeral Diffie-Hellman parameters signed with a DSA key, data is encrypted using triple DES (in Encrypt-Decrypt-Encrypt Cipher-Block-Chaining mode), and SHA-1 is used for message authentication. The TLS 1.2 spec has a different mandatory cipher suite: TLS_RSA_WITH_AES_128_CBC_SHA. Since we started out implementing TLS 1.2, and AES is quicker, newer, in wide adoption, and believed to be more secure, we preferred it. We also already need RSA to handle the SSL certificates we generate. So we decided to do a little less work and sacrifice official compliance with TLS 1.0; in practice we are still compatible with just about every major TLS server out there.
- AES Encryption
- RSA Encryption
- SHA-1 and MD5
We found some implementations of these on the web, but they were all more or less the same and simply followed the MD5 and SHA-1 pseudocode available on Wikipedia. Since they would be easy to write and we had some interest in how they worked, we quickly implemented them ourselves by following the pseudocode.
- HMAC (Hash-based Message Authentication Code)
Writing an HMAC implementation is really easy when you’ve already got the actual hash functions. There is pseudocode available on Wikipedia. Our HMAC implementation is just a wrapper around one of the supported message digests (SHA-1 or MD5) that integrates a secret key properly.
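To show how thin that wrapper can be, here is a minimal sketch of the standard HMAC construction. It is an illustration, not Forge's code: it assumes Node.js and borrows the hash functions from Node's built-in `crypto` module rather than using hand-written digests, and the function name `hmac` is ours.

```javascript
const crypto = require('crypto');

// MD5 and SHA-1 both process input in 64-byte blocks.
const BLOCK_SIZE = 64;

// Compute HMAC(key, message) on top of a plain hash function.
// `algorithm` is 'md5' or 'sha1'; the result is a hex string.
function hmac(algorithm, key, message) {
  const hash = (data) => crypto.createHash(algorithm).update(data).digest();

  // Keys longer than one block are hashed first.
  let k = Buffer.from(key);
  if (k.length > BLOCK_SIZE) {
    k = hash(k);
  }

  // Zero-pad the key to exactly one block.
  const padded = Buffer.alloc(BLOCK_SIZE);
  k.copy(padded);

  // XOR the padded key with the inner and outer pad constants.
  const ipad = Buffer.alloc(BLOCK_SIZE);
  const opad = Buffer.alloc(BLOCK_SIZE);
  for (let i = 0; i < BLOCK_SIZE; i++) {
    ipad[i] = padded[i] ^ 0x36;
    opad[i] = padded[i] ^ 0x5c;
  }

  // HMAC = H(opad || H(ipad || message))
  const inner = hash(Buffer.concat([ipad, Buffer.from(message)]));
  return hash(Buffer.concat([opad, inner])).toString('hex');
}
```

A quick sanity check is to compare the result with `crypto.createHmac(algorithm, key).update(message).digest('hex')`, which should match for both digests.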
- ASN.1 (Abstract Syntax Notation Number 1)
- A cryptographically-secure PRNG (Pseudo Random Number Generator)
The PRNG used in the current implementation is based on the Fortuna algorithm, designed by Bruce Schneier and Niels Ferguson. The Fortuna algorithm removes the need to estimate the amount of true entropy in the data you add to your entropy pool. It does this by evenly spreading the entropy from all of your various entropy sources out over 32 pools and then only periodically taking from some subset of those pools based on the number of reseeds that have occurred. To generate its random numbers, the Fortuna algorithm uses a cryptographic function, typically whichever is already available in the system the PRNG will be used in. In our case, we used AES as the cryptographic backend.
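The pool schedule is the part of Fortuna that is easiest to misread, so here is a minimal sketch of it. It assumes the rule from the published Fortuna design: pool i contributes to reseed number r (counting from 1) only when 2^i divides r. The names are hypothetical, and the hashing and AES-based generator are left out entirely.

```javascript
const NUM_POOLS = 32;

// Return the indices of the pools that feed reseed number `reseedCount`
// (counting from 1). Pool 0 is used every time, pool 1 every second
// reseed, pool 2 every fourth, and so on.
function poolsForReseed(reseedCount) {
  const pools = [];
  for (let i = 0; i < NUM_POOLS; i++) {
    if (reseedCount % Math.pow(2, i) !== 0) {
      break;
    }
    pools.push(i);
  }
  return pools;
}

// Spread incoming entropy events over the pools in round-robin order.
function makeCollector() {
  const pools = Array.from({ length: NUM_POOLS }, () => []);
  let next = 0;
  return {
    pools,
    collect(bytes) {
      pools[next].push(bytes);
      next = (next + 1) % NUM_POOLS;
    }
  };
}

console.log(poolsForReseed(1)); // [ 0 ]
console.log(poolsForReseed(4)); // [ 0, 1, 2 ]
```

Because the higher-numbered pools are drained so rarely, they keep accumulating entropy even if an attacker can predict or control some of the sources, which is why no entropy estimate is needed.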
Once all of these pieces were ready, we could begin implementing TLS. To do this we read over the spec to get a general idea of what was going on and then began doing a top-down design. We also decided to save time by implementing only the client side of TLS, since we didn’t have a use case for the server side. Should someone need it, however, extending our current implementation to support server-side TLS doesn’t seem too daunting. As we mentioned earlier, how each of these pieces fits together is the “middle glue” that we don’t cover in this article but that can be easily gleaned from the TLS spec. Next we’ll discuss just a little bit of our top-down design and APIs without going into too many boring details.
Handling TLS Records
TLS traffic is broken down into records. Each record has a maximum size of 16 KiB. The records contain application data, alerts (errors or warnings), or a message related to the TLS handshake protocol. The handshake protocol is used to establish a session, which will contain and make use of the cipher suite that a client and server agree upon to secure their traffic. The records a connection receives drive its state changes.
Our TLS connection object can either accept incoming records (typically from a web server but this is abstracted) or produce outgoing ones (again typically intended to be sent to a web server). When data comes in, we check to see if we are buffering a record already. If we are, then we add more data to the record. If not, we start a new record and take note of its size. If a record is part of the TLS handshake protocol then its full message size can be found inside of the record’s handshake message header. This can be used to determine whether or not a record has been fragmented so we know how long to keep reading until the full message has arrived.
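The buffering logic described above can be sketched as follows. This is an illustration under stated assumptions rather than Forge's code: it uses Node's `Buffer`, relies on the TLS record header layout (1 byte content type, 2 bytes version, 2 bytes length) and the handshake header layout (1 byte message type, 3 bytes length), and the function names are hypothetical.

```javascript
const RECORD_HEADER_LENGTH = 5;

// Create a function that accepts arbitrary chunks of incoming bytes and
// calls onRecord once for every complete TLS record.
function createRecordReader(onRecord) {
  let buffer = Buffer.alloc(0);
  return function receive(chunk) {
    buffer = Buffer.concat([buffer, chunk]);
    while (buffer.length >= RECORD_HEADER_LENGTH) {
      // The record length is a big-endian 16-bit value at offset 3.
      const length = buffer.readUInt16BE(3);
      const end = RECORD_HEADER_LENGTH + length;
      if (buffer.length < end) {
        break; // keep buffering until the whole record has arrived
      }
      onRecord({
        type: buffer[0],
        version: { major: buffer[1], minor: buffer[2] },
        fragment: buffer.subarray(RECORD_HEADER_LENGTH, end)
      });
      buffer = buffer.subarray(end);
    }
  };
}

// Read the body length from a handshake message header, so the caller
// can tell whether the message continues in a later record.
function handshakeBodyLength(fragment) {
  return fragment.readUIntBE(1, 3);
}
```

Comparing `handshakeBodyLength(fragment) + 4` with the number of handshake bytes collected so far tells you whether the message is complete or has been fragmented across records.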
Once a full message has arrived we ship it off to update our current connection state. There are state tables that keep track of what the next valid state is based on the next record or handshake message type that we’re expecting. We enter the record type and message type into our state tables which call the appropriate function to handle the record and its message. If the message is unexpected, our error handler takes over and generates a TLS alert indicating there was an error, which will terminate the TLS connection. Otherwise, we will process either a handshake message, if we are still negotiating our session with the other end, or application data. For details on how the TLS handshake protocol works see the RFC or our source code.
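A table-driven dispatcher of this kind might look like the sketch below. The state names, the table layout, and `createConnection` are hypothetical, and only three server handshake messages are covered. The numeric values are the content types, handshake types, and alert codes defined by the TLS spec.

```javascript
// Values from the TLS spec.
const ContentType = { change_cipher_spec: 20, alert: 21, handshake: 22, application_data: 23 };
const HandshakeType = { server_hello: 2, certificate: 11, server_hello_done: 14 };
const AlertLevel = { fatal: 2 };
const AlertDescription = { unexpected_message: 10 };

// Hypothetical client-side table: for each state, the handshake message
// types that are valid next, each mapped to a handler that returns the
// following state.
const clientStates = {
  expectServerHello: { [HandshakeType.server_hello]: () => 'expectCertificate' },
  expectCertificate: { [HandshakeType.certificate]: () => 'expectServerHelloDone' },
  expectServerHelloDone: { [HandshakeType.server_hello_done]: () => 'sendKeyExchange' }
};

function createConnection() {
  const conn = { state: 'expectServerHello', alert: null, closed: false };
  conn.process = function (recordType, messageType) {
    if (conn.closed) {
      return;
    }
    // This sketch only dispatches handshake records.
    const row = recordType === ContentType.handshake ? clientStates[conn.state] : undefined;
    const handler = row && row[messageType];
    if (!handler) {
      // Anything the table does not allow ends the connection with a fatal alert.
      conn.alert = { level: AlertLevel.fatal, description: AlertDescription.unexpected_message };
      conn.closed = true;
      return;
    }
    conn.state = handler(conn);
  };
  return conn;
}
```

Keeping the valid transitions in data rather than in nested conditionals makes it easy to see at a glance which messages are accepted in which state.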
Handling SSL Certificates
The only detail we’ll discuss here concerning the handshake protocol is how certificates are handled. This is important because our design provides a useful callback during the certificate verification process.
When the server’s SSL certificate is checked, it is part of a chain of certificates. If the certificate is self-signed there will be only one certificate in the chain. Otherwise, each subsequent certificate in the chain must be the issuer of the previous one and must have digitally signed it. The certificate chain verification process therefore checks for this condition and ensures that some other details about the certificates in the chain are valid (e.g., expiration dates). Every time a certificate in the chain has been examined, an optional user callback is called with four arguments: the TLS connection; a verified flag, which is true if the certificate passed all verification checks and is otherwise the TLS alert value corresponding to how it failed; the depth of the certificate in the chain; and the chain as an array with the server’s certificate at index 0. The user function can return true to indicate that the certificate should be considered verified (trusted), or a TLS alert indicating why it isn’t trusted. This allows customized certificate verification.
A similar callback can be found in OpenSSL, so developers who have worked with that project should recognize it.
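The callback contract can be illustrated with the sketch below. It is not Forge's implementation: certificates are plain objects with hypothetical `subject`, `issuer`, `notBefore`, and `notAfter` fields, signature checks are omitted, and the order in which the chain is walked is an assumption. The alert numbers are the values defined by the TLS spec.

```javascript
// Alert values from the TLS spec.
const Alert = { bad_certificate: 42, certificate_expired: 45, unknown_ca: 48 };

// Built-in checks for one certificate. Returns true or an alert value.
// A real implementation would also verify the issuer's signature.
function checkCertificate(cert, issuer, now) {
  if (now < cert.notBefore || now > cert.notAfter) {
    return Alert.certificate_expired;
  }
  if (!issuer) {
    // Last certificate in the chain: this sketch only accepts self-signed.
    return cert.issuer === cert.subject ? true : Alert.unknown_ca;
  }
  return cert.issuer === issuer.subject ? true : Alert.bad_certificate;
}

// Walk the chain and give the optional `verify` callback the final say
// on each certificate, using the four arguments described above.
function verifyChain(connection, chain, now, verify) {
  for (let depth = 0; depth < chain.length; depth++) {
    let verified = checkCertificate(chain[depth], chain[depth + 1], now);
    if (verify) {
      verified = verify(connection, verified, depth, chain);
    }
    if (verified !== true) {
      return verified; // the alert explaining the failure
    }
  }
  return true;
}

// Example callback: trust one known application certificate even if a
// built-in check failed, and leave every other result unchanged.
function trustKnownApp(connection, verified, depth, chain) {
  return chain[0].subject === 'my-app' ? true : verified;
}
```

A callback like `trustKnownApp` is how an application could accept a certificate it generated itself, for example one whose details were stored in a database ahead of time.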
Once the TLS Handshake is Complete
Luckily for us, we have written some HTTP implementations here before, so we just ported the simplest parts. One of those parts was chunked encoding, something our servers often use and we didn’t want to be without.
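For readers unfamiliar with chunked encoding, here is a minimal decoding sketch. It is not the code we ported: it assumes the entire body is already available as a string with one character per byte, ignores trailer headers, and the function name is hypothetical.

```javascript
// Decode an HTTP body that uses chunked transfer encoding. Each chunk
// is a hexadecimal size line, CRLF, that many bytes of data, and CRLF.
// A chunk of size 0 ends the body.
function decodeChunked(body) {
  let offset = 0;
  let decoded = '';
  while (true) {
    const lineEnd = body.indexOf('\r\n', offset);
    if (lineEnd === -1) {
      throw new Error('incomplete chunk header');
    }
    // Anything after ';' on the size line is a chunk extension; skip it.
    const size = parseInt(body.slice(offset, lineEnd).split(';')[0], 16);
    if (Number.isNaN(size)) {
      throw new Error('invalid chunk size');
    }
    offset = lineEnd + 2;
    if (size === 0) {
      break;
    }
    if (offset + size > body.length) {
      throw new Error('incomplete chunk data');
    }
    decoded += body.slice(offset, offset + size);
    offset += size + 2; // skip the data and its trailing CRLF
  }
  return decoded;
}

console.log(decodeChunked('4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n')); // Wikipedia
```

A streaming client would apply the same parsing incrementally as data arrives instead of waiting for the full body.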
You can provide optional callbacks for when a connection has been made, an HTTP header has arrived, an HTTP body has arrived, or an error has occurred. Using these callbacks you can handle most of your needs using HTTP.
An XMLHttpRequest API
To make this even easier, we added an XMLHttpRequest API implementation that wraps our HTTP client. Technically speaking, it wraps one HTTP client per domain, since it provides cross-domain support. Using jQuery, you can specify a callback that creates our XHR object and use it to communicate over HTTP with a cross-domain application using standard APIs.
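The "one HTTP client per domain" idea can be sketched as a small lookup keyed by origin. Everything here is hypothetical and for illustration only: `createClientPool` is not a Forge function, and `createClient` stands in for whatever constructs the real HTTP client.

```javascript
// Return a function that hands out one client per origin
// (scheme + host + port), creating each client on first use.
function createClientPool(createClient) {
  const clients = new Map();
  return function clientFor(url) {
    const origin = new URL(url).origin;
    if (!clients.has(origin)) {
      clients.set(origin, createClient(origin));
    }
    return clients.get(origin);
  };
}

// Example with a stand-in client factory.
const clientFor = createClientPool((origin) => ({ origin, requests: [] }));
const a = clientFor('https://example.com:8443/api/status');
const b = clientFor('https://example.com:8443/api/files');
console.log(a === b); // true: both URLs share an origin
```

An XHR wrapper built this way can reuse each client's connections and session state across requests to the same application.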
With our top-level design and API in place, we started filling out the middle glue to connect our low-level pieces with our top-level design. Simply following the TLS spec will cover most of the issues here, with one obvious exception: testing everything with a TLS-compliant server.
Connecting to an OpenSSL Test Server
Since TLS is a conglomerate of many different pieces, we knew that implementations are difficult to get right and that we were bound to make mistakes. The easiest way to correct those mistakes was to have a correctly implemented server to communicate with, one built from open source code to which we could add debugging information. Since we have worked with OpenSSL before, we downloaded their source and built their test server. From there, we added whatever debugging information we needed to the entire TLS handshake process until we worked out all of the kinks. This was the easiest way to find out exactly what was going wrong when something wasn’t quite right.
Well, that’s it. Once all of these pieces are put together, you can communicate using HTTP over TLS with a server running on a different domain, provided that you begin with a website that you trust.
Please check out the Forge project over at GitHub and tell us what you think.
If you liked this article, please check out some of our other projects that are related to Forge: