How to implement Polymorphic protocol buffers in Java
Posted by Jim Morris on Wed Nov 23 00:08:34 -0800 2011
I decided to use Google's Protocol Buffers for a new server. I have been writing custom binary protocols for my various servers for over 15 years, and have used XML and JSON on occasion.
I prefer binary protocols as they tend to be lean and mean when it comes to bandwidth.
Protocol Buffers are a very nice way of defining the wire-level protocols you send back and forth between client and server. The approach is similar to one I took several years ago, when I designed a binary protocol for a C++ client talking to a Java server, although mine was not nearly as flexible as the protocol buffers that Google has kindly released to the Open Source community.
In this case I don't need the multi language capability, although it has come in handy. I am using Java.
Now when you send messages like these over TCP they need to be framed, because TCP is a byte stream and will happily coalesce or split the buffers you write. This seems to trip up newbies, based on my reading of StackOverflow and the newsgroups.
I am using Netty for the socket-level library. It is very similar to the raw NIO stuff I have written in the past, but much nicer to use. It also has some support for protocol buffers by providing encoder and decoder codecs, which are not necessary but handy. Netty also provides a relatively easy way to frame messages.
The way I frame messages is to prepend each one with a 2-byte big-endian count of its size. I use 2 bytes because that limits a message to 65,535 bytes, which avoids a potential DoS attack by someone sending huge packets and consuming all of your server's memory.
Netty allows this nicely by using a couple of provided codecs...
See the Netty docs, but on incoming messages this reads a 2-byte length count and then reads that number of bytes before passing them on to the protocol buffers decoder; on outgoing messages it prepends each message with a 2-byte count of its size.
Thus framing is neatly taken care of.
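To make the framing rule concrete, here is a minimal sketch of the same 2-byte big-endian length prefix written against the plain JDK rather than Netty. The class and method names (LengthFraming, frame, unframe) are made up for illustration and are not part of Netty or of my server; Netty's codecs do the equivalent work for you.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Sketch of 2-byte big-endian length-prefixed framing (hypothetical helper, not Netty code). */
final class LengthFraming {
    static final int MAX_FRAME = 0xFFFF; // largest size a 2-byte unsigned count can express

    /** Prepends the payload with its length as a 2-byte big-endian count. */
    static byte[] frame(byte[] payload) {
        if (payload.length > MAX_FRAME) {
            throw new IllegalArgumentException("payload too large: " + payload.length);
        }
        ByteBuffer buf = ByteBuffer.allocate(2 + payload.length); // ByteBuffer is big-endian by default
        buf.putShort((short) payload.length);
        buf.put(payload);
        return buf.array();
    }

    /**
     * Extracts every complete frame from a buffer that is ready for reading,
     * leaving any trailing partial frame unread for the next call.
     */
    static List<byte[]> unframe(ByteBuffer in) {
        List<byte[]> frames = new ArrayList<>();
        while (in.remaining() >= 2) {
            in.mark();
            int length = in.getShort() & 0xFFFF; // treat the count as unsigned
            if (in.remaining() < length) {
                in.reset(); // body not fully received yet, wait for more bytes
                break;
            }
            byte[] payload = new byte[length];
            in.get(payload);
            frames.add(payload);
        }
        return frames;
    }

    public static void main(String[] args) {
        // Two messages coalesced into a single read, as TCP may deliver them.
        byte[] a = frame("hello".getBytes(StandardCharsets.UTF_8));
        byte[] b = frame("world!".getBytes(StandardCharsets.UTF_8));
        ByteBuffer in = ByteBuffer.allocate(a.length + b.length);
        in.put(a).put(b);
        in.flip();
        for (byte[] payload : unframe(in)) {
            System.out.println(new String(payload, StandardCharsets.UTF_8));
        }
    }
}
```

The reader never has to allocate more than 65,535 bytes for a single message, which is the point of keeping the count to 2 bytes.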
Anyway this post is about how to get Polymorphic protocol buffers to work in Java.
This blog explains the preferred way to handle it. However, the example given is in Python, Java is a little different, and it took me some time to figure out how to do the same thing in Java.
Why do you need polymorphism? Well, my server protocol is not an RPC (Remote Procedure Call) protocol, it is asynchronous: the client sends a command, and some time later the server sends one or more responses to that request. The client does not block on the request as it would with an RPC (including the RPC support built into protocol buffers).
So I need to send one of several different commands/messages to the server. These messages all have arguments, which is what the .proto file describes, but protocol buffers needs to know what the message type is before it can decode it, so you need some way to send a command indicator that tells the server what the message is and how to decode it. If you don't have many commands/messages, the simplest method is to use a union of messages as described here, and you can stop reading this blog ;)
As I have quite a large number of commands I decided to use the Extension technique described in the cited blog post above.
Extensions turn out to be quite messy to use in Java, hence this post.
Here is the .proto file for this example, similar in concept to the Animal example given above. I won't explain how to compile this with protoc as it is documented in the protocol buffers docs. I only have two extension messages here, but in reality I'd have a bunch more. The reasoning for the BaseCommand and the CommandType is explained in the cited post above.
package example;

option java_outer_classname = "Commands";

message BaseCommand {
    extensions 100 to max;

    enum CommandType {
        VERSION = 1;
        LOGIN = 2;
    }
    required CommandType type = 1;
}

message Version {
    extend BaseCommand {
        // protoc rejects 'required' on extension fields, so this is optional
        optional Version cmd = 100;
    }
    required fixed32 protocol = 1;
    required fixed32 versions = 2;
}

message Login {
    extend BaseCommand {
        // protoc rejects 'required' on extension fields, so this is optional
        optional Login cmd = 101;
    }
    required string username = 1;
    required string password = 2;
}
The first thing to note is that if you want the Java-based Netty protocol buffers decoder to decode the extensions, you need to register them and tell Netty about it. Even if you are not using the Netty decoder, you still need to register the extensions before you decode.
The incoming messages are decoded by the Netty codec, and passed onto your handlers, where you would use a switch on the CommandType enum to extract the relevant extension/message, something like this...
This is how you extract the extension in Java, a little harder than the Python example but not too much.
However encoding the extensions into the BaseCommand in Java is a lot messier than the Python examples...
First you need to build the message, here we build the Version message...
Commands.Version vers = Commands.Version.newBuilder()
    .setProtocol(v1)
    .setVersions(v2)
    .build();
This is not so bad and chaining the setters is a nice touch.
Then you need to build the BaseCommand and set the Extension in it...
BaseCommand bcmd = BaseCommand.newBuilder()
    .setType(BaseCommand.CommandType.VERSION)
    .setExtension(Commands.Version.cmd, vers)
    .build();
That is really ugly, but is all required, as it sets the BaseCommand CommandType enum, and then sets the extension.
It took me a while to refactor this into something less ugly. The result is still not optimal, but I don't see a way to simplify it further. The following code is a helper method that builds the wrapped command shown above...
and to call it you would do this...
wrap(BaseCommand.CommandType.VERSION, Commands.Version.cmd, vers)
Simpler than the hard way, but you still need to provide three pieces of information for each command, and I don't see a way to simplify that.
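To show why those three pieces of information are needed, here is a self-contained plain-Java model of the pattern. It is only a sketch: every type in it (Ext, BaseCommand, Version, Login, and the VERSION_CMD and LOGIN_CMD keys) is a hand-written stand-in that I made up for illustration, not the classes protoc generates, and nothing is serialized. The generic wrap method mirrors the shape of the helper above, and describe mirrors the switch on CommandType on the receiving side.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java model of the pattern; every type here is a stand-in, not protobuf-generated code. */
final class ExtensionPatternSketch {
    enum CommandType { VERSION, LOGIN }

    /** Typed key standing in for a generated extension such as Commands.Version.cmd. */
    static final class Ext<T> {
        final int fieldNumber;
        Ext(int fieldNumber) { this.fieldNumber = fieldNumber; }
    }

    static final class Version {
        final int protocol;
        final int versions;
        Version(int protocol, int versions) { this.protocol = protocol; this.versions = versions; }
    }

    static final class Login {
        final String username;
        final String password;
        Login(String username, String password) { this.username = username; this.password = password; }
    }

    static final Ext<Version> VERSION_CMD = new Ext<>(100);
    static final Ext<Login> LOGIN_CMD = new Ext<>(101);

    /** Envelope: a type tag plus payloads stored under their extension keys. */
    static final class BaseCommand {
        final CommandType type;
        private final Map<Ext<?>, Object> extensions = new HashMap<>();

        BaseCommand(CommandType type) { this.type = type; }

        <T> BaseCommand setExtension(Ext<T> ext, T value) {
            extensions.put(ext, value);
            return this;
        }

        @SuppressWarnings("unchecked") // safe: values are only stored through setExtension
        <T> T getExtension(Ext<T> ext) {
            return (T) extensions.get(ext);
        }
    }

    /** The three inputs: type tag, extension key, payload. The type parameter ties key and payload together. */
    static <T> BaseCommand wrap(CommandType type, Ext<T> ext, T cmd) {
        return new BaseCommand(type).setExtension(ext, cmd);
    }

    /** Receiving side: switch on the tag, then read the matching extension. */
    static String describe(BaseCommand bcmd) {
        switch (bcmd.type) {
            case VERSION: {
                Version v = bcmd.getExtension(VERSION_CMD);
                return "version " + v.protocol + "." + v.versions;
            }
            case LOGIN: {
                Login l = bcmd.getExtension(LOGIN_CMD);
                return "login " + l.username;
            }
            default:
                throw new IllegalArgumentException("unknown command: " + bcmd.type);
        }
    }

    public static void main(String[] args) {
        BaseCommand bcmd = wrap(CommandType.VERSION, VERSION_CMD, new Version(1, 2));
        System.out.println(describe(bcmd));
    }
}
```

The compiler can check that the payload matches the extension key, but nothing ties the enum value to the key, which is why the caller has to pass both.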
Putting it all together I wrote a little sample in Groovy, but it can be easily translated to Java...
So that is how you do it in Java.
An update to this post: I ditched this method for the union method. Dealing with extensions is a pain, and there seems to be no advantage over unions; the files are still huge and the actual protocol binaries are exactly the same size.
Thank you for this nice post. I'm learning netty+protobuf right now. Maybe you have an idea where I can find a tutorial for a netty application using protobuf which is a bit more newbie-friendly than your post :)
Thank you for this nice post
Thank you for this nice post. As you said you had ditched this method, can you introduce the new method?
In GPB 2.6, there is a feature called "oneof" which covers a fair amount of the above requirements.
Commands.proto:18:26: Message extensions cannot have required fields.
Commands.proto:27:26: Message extensions cannot have required fields.
This is the error I get when I try out your code.