Chair: I would very much like to welcome, again following the Google Wave theme, David Wang of Google. Please welcome David Wang. David Wang is one of the lead technical architects of Wave, so please take it away, David.
David:
Hi, thank you for coming to the Wave Federation talk. I've heard a lot of excitement in the audience just before this talk, and I hope you all went to Lars and Stephanie's talk just now and saw a quick demonstration of Google Wave. At this point, I hope you're all excited about Google Wave.
In this talk, I will give a very quick overview of what federation is and where we are up to today. For those of you who know Google Wave, I would like to reiterate that Google Wave is actually a product. Wave is actually a technology. You can imagine Wave is to Google Wave as email is to Gmail. That means that anyone is welcome to write another Wave server just like Google Wave.
What is Wave federation in that context? Wave federation is basically about enabling different Wave providers to interoperate with each other. A Wave provider is someone who is running their own Wave server. We already have a draft federation protocol spec available at waveprotocol.org and I welcome everyone to have a quick read of it. Also note that this specification is currently being iterated on. It's not final. I welcome everybody to give feedback on it, and we do need everybody here to give us insight.
Here, I have a diagram of three organizations. As you can see, there are two organizations here that are actually not Google, both of which are running their own Wave servers, and those people, we call them Wave providers. In the protocol spec, it simply says how you can run your own Wave server, and how you should talk to other Wave providers.
Why does Google want to do Wave federation? In short, I think Google wants to be successful at Google Wave, but we also want Wave, the technology, to be successful. We believe that for Wave the technology to be successful, we need wide adoption of Wave, just like how email is a widely adopted technology. We want to do that by being open and, because the Internet is built on open APIs and open protocols, we believe we can do the same, just like email, by letting everyone participate in this. Even more importantly, we don't want to be an isolated communication tool like the ones you have today, such as IM, where you have five different IM clients just so you can talk to your friends.
Because this is an open protocol, anyone can actually develop their own Wave servers, and users will have a choice. They can pick whichever Wave provider they want, based on price, features, or whatever. As an outcome of this federation, we also want to avoid different organizations building Wave-like systems that don't really interoperate. That means you can develop Wave servers yourself, but we would really like users not to have to suffer the consequences of having different Wave clients open just so they can communicate with their friends.
This brings up the next point: if you're thinking about federation, you're probably thinking about what it takes to become a Wave provider. The bottom line is that anyone can be a Wave provider, as simple as that. It's just like running your own SMTP server. Let me give you some technical background about Wave federation. Hopefully it will make it easier to understand how simple it is to run your own Wave server.
Show of hands: does anyone know what this diagram is about? Ah, great - I can breeze through this now. The Wave data model is very simple. A Wave is basically a collection of Wavelets. These Wavelets are basically just boxes, each of which contains a list of participants and a list of documents. You can think of a participant as just an address, and a document as just an XML document. It's technically not an XML document, but it's much easier to describe it as one. It contains annotations as well.
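As a rough illustration of the data model just described, here is a minimal Python sketch. The class names, field names, and the sample document ID are assumptions made for illustration; they are not taken from the Wave specifications.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """An XML-like document with annotations (both simplified here)."""
    doc_id: str                 # hypothetical identifier
    content: str = ""           # XML-like content held as a plain string
    annotations: dict = field(default_factory=dict)


@dataclass
class Wavelet:
    """A box holding a list of participants and a list of documents."""
    participants: list = field(default_factory=list)  # addresses
    documents: list = field(default_factory=list)     # Document objects


@dataclass
class Wave:
    """A wave is a collection of wavelets."""
    wavelets: list = field(default_factory=list)


wavelet = Wavelet(participants=["bob@example.com"])
wavelet.documents.append(Document(doc_id="b+1", content="<body>Hello</body>"))
wave = Wave(wavelets=[wavelet])
```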
A Wavelet is what we call a "unit of concurrency control". That is, this is the thing that runs the live, concurrent editing code, the operational transformation (OT). It is also what I call the "unit of Wave federation": this is the object that is shared between different Wave providers. In terms of federation, it's actually really straightforward. All you have to do is run your own Wave server, which runs the OT algorithm, and it's very important that everybody runs the same algorithm. Wave servers simply share updates to their Wavelets with each other, much like how clients do it today. You have a bunch of updates, you send them across the wire, and magically things will get synced together by the operational transformation algorithm.
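To give a feel for what operational transformation does, here is a deliberately simplified Python sketch for plain-text inserts only. It is not Wave's actual algorithm, which works on XML-like documents with a richer set of operations; the `Insert` type, the `site` tie-breaker, and the function names are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index at which the text is inserted
    text: str   # text to insert
    site: str   # hypothetical tie-breaker for inserts at the same position


def apply_op(doc, op):
    """Apply a single insert to a string document."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op, against):
    """Adjust `op` so that it can be applied after `against` has been applied."""
    shifts = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if shifts:
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


# Two concurrent edits made against the same starting document.
doc = "Hello"
a = Insert(0, "A", "a")
b = Insert(5, "!", "b")

# One side applies a first, the other applies b first; both must converge.
result_ab = apply_op(apply_op(doc, a), transform(b, a))
result_ba = apply_op(apply_op(doc, b), transform(a, b))
```

Both orders end with the same text, which is the property that makes it essential for every server to run the same transformation rules.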
It's very important that only one server owns a Wavelet. That is, if you are in initech [initechcorp.com domain] and somebody started a Wavelet on your server, you actually own the Wavelet forever and ever. It's currently unspecified whether, if your server goes down, someone else will take over, but we welcome comments about that.
When you look at a Wavelet, how do you know who owns the actual Wavelet? This is defined by its ID. A Wavelet ID contains both a domain and an arbitrary string. The domain tells you where you can go and fetch a copy of the Wavelet, well, a canonical copy. When does federation actually happen? It's a very simple process: the minute you add someone who is from another Wave provider, you actually have to go and start sending operations to that provider. Straightforward?
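The ownership rule can be sketched in a few lines of Python. The serialized form used here (domain, a `!` separator, then the arbitrary string) and the names `WaveletId`, `parse_wavelet_id`, and `is_owned_by` are assumptions for illustration; the draft specification defines the real format.

```python
from typing import NamedTuple


class WaveletId(NamedTuple):
    domain: str    # the provider that owns the canonical copy
    local_id: str  # arbitrary string chosen by that provider


def parse_wavelet_id(serialized, separator="!"):
    """Split a serialized ID into its domain and arbitrary-string parts."""
    domain, sep, local_id = serialized.partition(separator)
    if not sep or not domain or not local_id:
        raise ValueError(f"malformed wavelet id: {serialized!r}")
    return WaveletId(domain.lower(), local_id)


def is_owned_by(wavelet_id, my_domain):
    """True if the server for `my_domain` holds the canonical copy."""
    return wavelet_id.domain == my_domain.lower()


wid = parse_wavelet_id("initech-corp.com!conv+root")
```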
A very simple example: Bob has a Wave open and he adds a new participant called milton@initech-corp.com. Bob's Wave server, which is where his provider is, looks up [Initech Corp's] Wave server and says, "It's not me, it's someone else," so it basically starts pushing the "add participant" operation and any further operations to that external server. I won't go into more details about that.
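The decision the owning server makes here can be sketched as follows. This is only an illustration: Bob's domain (`example.com`) and the helper names are assumptions, and a real server would also have to look up and connect to the remote provider.

```python
def domain_of(address):
    """Return the domain part of a participant address."""
    return address.rpartition("@")[2].lower()


def remote_domains(participants, local_domain):
    """Domains other than our own that must receive this wavelet's operations."""
    domains = {domain_of(p) for p in participants}
    return sorted(domains - {local_domain.lower()})


participants = ["bob@example.com", "alice@example.com"]
before = remote_domains(participants, "example.com")

# Adding a participant from another provider is what triggers federation.
participants.append("milton@initech-corp.com")
after = remote_domains(participants, "example.com")
```

While every participant is local, the list of remote domains is empty and nothing needs to leave the owning server.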
From a very high-level architectural perspective, you can see that when you have your Wave server and you own the Wavelet, you have to propagate all the deltas, each of which is essentially a list of operations, to external parties through this thing called the federation host. Basically, you're like a web page. You're hosting this Wavelet and you're constantly broadcasting Wavelets to everybody else.
If you were a remote Wave server, someone who doesn't own the Wavelet, you'll be constantly receiving operations through this and you can optionally store it for caching purposes, so that when users of your Wave provider come online, you don't have to go all the way to the root. You can serve up a cached copy so it's much faster. Of course, from this, you can see there is a mirror image at the bottom, as well.
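The host and remote roles described above can be mimicked in memory with a short Python sketch. Everything here is an assumption for illustration: the class and method names are invented, deltas are plain lists of strings, and there is no network, XMPP, or signing involved.

```python
class RemoteWaveServer:
    """A provider that does not own the wavelet; it caches what it receives."""

    def __init__(self, domain):
        self.domain = domain
        self.cache = {}  # wavelet id -> list of deltas received so far

    def receive_delta(self, wavelet_id, delta):
        self.cache.setdefault(wavelet_id, []).append(delta)

    def cached_deltas(self, wavelet_id):
        """Serve local users from the cache instead of asking the owner."""
        return list(self.cache.get(wavelet_id, []))


class FederationHost:
    """The owning server's side: pushes each delta to interested remotes."""

    def __init__(self):
        self.subscribers = {}  # wavelet id -> list of RemoteWaveServer

    def subscribe(self, wavelet_id, remote):
        self.subscribers.setdefault(wavelet_id, []).append(remote)

    def publish(self, wavelet_id, delta):
        for remote in self.subscribers.get(wavelet_id, []):
            remote.receive_delta(wavelet_id, delta)


host = FederationHost()
initech = RemoteWaveServer("initech-corp.com")
host.subscribe("example.com!conv+root", initech)
host.publish("example.com!conv+root", ["add_participant milton@initech-corp.com"])
host.publish("example.com!conv+root", ["insert 0 'Hello'"])
```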
This hopefully makes it clear that, because there is one Wave server owning a Wavelet, data stays in your network, provided you don't add anyone outside of the network. You can actually have an on-premise solution for your corporation and talk among yourselves, and nothing will go out of your network until you add someone from outside the network. It's much like how you can send emails to each other today: provided you don't forward the email to someone outside the corporation, nothing goes out of your network.
Where are we today with federation? Today we've actually published two main specifications, the Wave Federation Protocols Specification, and the Wave Conversation Model Specification. Both are available on the website. We've also open sourced about 40 thousand lines of code, and it's available at the address below. It's Java, and we've released it under a very liberal license, which is Apache 2.0. I don't really know the exact details, but basically you can do whatever you want with it; that's my understanding.
One of the reasons we want to give you code is we believe it's critical that everybody runs the same algorithm for operational transformation. It's much easier if we just give you the code. You can see we'll give you the operational transformation and the model, which is how to interpret the operations into an XML-like structure.
But also, we're going to publish a very basic prototype which is sort of the simple Wave provider prototype. It's very dumb and it has a very simple crypto library attached to it. Crypto is actually very important in Wave context because with crypto, you can actually validate who actually sent the message. This is very important. This is a big problem we're addressing that we see in email today. You don't know where the email came from. With the crypto library, you know where the Wave came from. In fact, when we exchange messages over the wire between various Wave providers, we're exchanging them over TLS, in XMPP as well. In that sense, it's also encrypted on the wire.
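The idea of checking who sent a message can be illustrated with Python's standard library. Note the substitution: this sketch uses an HMAC with a shared secret, which is not the scheme in the Wave crypto library (the talk does not describe that scheme). The key, the delta contents, and the function names are all hypothetical.

```python
import hashlib
import hmac


def sign_delta(secret_key: bytes, delta: bytes) -> str:
    """Return a hex signature for the delta (HMAC-SHA256 stand-in)."""
    return hmac.new(secret_key, delta, hashlib.sha256).hexdigest()


def verify_delta(secret_key: bytes, delta: bytes, signature: str) -> bool:
    """Check that the delta was signed by a holder of the key, unmodified."""
    expected = sign_delta(secret_key, delta)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature)


key = b"hypothetical-shared-secret"
delta = b"add_participant milton@initech-corp.com"
signature = sign_delta(key, delta)
```

A receiver that verifies the signature before applying a delta can reject anything that was altered in transit or that came from a party without the key.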
The client that I've just mentioned, which is a very simple client-server pair, basically contains code that tells you how to use the wire protocol. Please don't use it and think it's a reference implementation. It is a very basic piece of code that demonstrates how things work, like something from a first-year computer science course. You wouldn't take that and make a product out of it.
This is what it looks like, for those computer geeks out there. I'm sure you prefer this over our live client. What are we doing right now? We're trying to open up the Outdoor Federation Port on wavesandbox.com. This port will be very experimental. This is because we're constantly iterating on it and we haven't ironed out all the bugs yet, and we do want everybody to participate and help us to achieve, in a sense, a reference implementation from that.
We are also updating the Fed-1 client that you saw previously to do a much better job of using OT, so that characters are more live and concurrent. This is work which is continuously going on right now. Also, if you want to contribute to the code base, we've just released the licensing agreement so you can actually go in and send code reviews to us. We'll liberally accept your code. I've just heard, right before the presentation, that someone in Sydney hadn't slept for three days to cook this up for you guys here.
This is my regular Google Wave account, and this is the Google 6 or Solos Org or some other domain. It's a Wave service provider in another domain. I can, in this case, create a Wave and say, "Hey the other David, I'm in Amsterdam. Where are you?" I can go and add this person in acmeWave.com, so he's in a completely different domain to me. He's being added.
Back to my cool text client, I can see a Wave has just popped in. I open the thing. Don't tell me about UI. I'm no UI expert here [laughs]. I can reply to the person and say, "I'm here too. Want to get drinks at 6?" The Blip will appear on the other client, as you see here. If I make this a little bit smaller, like so, I will show you a very cool feature that somebody went without sleep for many days to implement for you guys.
As you can see in the Fed-1 client, characters are showing up, character-by-character. It's a bit flaky right now because it's in Sydney so the ping time is a bit bad. But you get the picture. An even more important thing is this, "Let's invite Alex." I'm going to invite Alex, who is a great software engineer on the team. I'm going to make a private reply to Alex and say, "Knowing you, don't be late." As you can see, I can see the private reply on my screen, but it does not appear, in fact the bytes don't even arrive at the other organization. The other organization has no idea what's happening in the background, and this is critical if you want to protect your information and Federation provides that.
So, where are we heading? As you can see, the demo is very primitive. We're still iterating through it and we want to get to a reasonable set of specifications, and of course using everyone's feedback here. Using your feedback, we want to gain more experience and we want to open the federation port to everybody. When I say to everybody, I mean open it on the actual production server which is wave.google.com. Currently, we're just doing it on the test platform which is wavesandbox.com.
We also, of course, want to open source the lion's share of Google's client and server code, but right now we're really busy implementing lots of features everybody has been requesting. Certainly, some key components we're shooting to open source very soon include the editor, which is critical if you actually want to implement an online, live, concurrent editing platform.
Also, we want to iterate our existing implementation, the open source implementation to provide something that is of production quality, and that can only be done if we open source some of our server components too. We want to do this soon, but we can't do this without everybody's help here. Bottom line is I think we'd like to work with everybody here to achieve a true open standard. I have about 3 minutes left for Q&A.
Chair: Thank you David.
Let's open it up to the floor. One thing I noticed, before we go to Jay here, and I'm glad you made the point so clear at the beginning, is that far too many people seem to look at Google Wave, instantly look at the UI, and think this is Google Wave. They didn't seem to get that this is technologies. This is standards. This is a platform for essentially changing the way we communicate. That transformation is going to take a couple of years. Thanks for making that point. I've been really surprised at the press just thinking this is a Google app and here it is, today.
Audience: I know you haven't decided how you're going to handle server failures in the federation protocol, but what are some of the ideas that you are tossing around for how you could handle that?
David: There are many ideas. One idea that has been suggested already was that there is some method of passing ownership of a Wavelet to an external server. That was one idea. There are some complications then in deciding which server to pass it to, etc. The other option which has also been suggested is simply not to pass the ownership at all; you just say the Wavelet is dead. Because other Wave service providers will have a cached copy of it, users can choose to make a copy of that Wavelet and keep going. We are still discussing the pros and cons of each option.
Audience: Looking at Google Wave as it stands, and the fact that you're going to integrate Google Talk with it, it looks a lot as if you're putting a lot of effort into transferring things from the HTTP world to an XMPP world. Do you have any plans to provide Google Search in a real-time format?
David: I'm not sure I'm the right person to answer that question.
Audience: It struck me as significant.
Chair: This is a tech lead. What's the tech question? Thank you.
Lars: I'm not 100% sure I heard the last part. The question was whether we would be able to search? I guess I just stole your microphone. I didn't hear the last part. Tell me and I'll repeat it in the microphone. The question is whether we will do a real time Google without also doing the search in real time. It's not something we've thought through a great deal.
You'll see that the search we provide over Waves is a real time search. You can see if you do it with public, you see things bubbling up really quickly, but as a gentleman in the back pointed out after our talk, the quality of the search is not very good. The challenge we have is to marry the real time aspect, which is hard from a systems point of view, with the ranking, which is hard in an entirely different way. I think when Wave takes off, there will be a tremendous amount of goodness to come from that, and enough people will work on it that we'll figure it out, if I can make it any vaguer than that.
Chair: That sounded good to me. Last question. Any more questions? Thank you David.
David: Thanks guys.