Protocol Buffers - are they really faster than XML?
It seems Google is claiming that their Protocol Buffers are faster than XML... without any proof.
Consider AsmXml, which can process XML at over 200MB/s on old machines.
Google's Protocol Buffers also generate wrappers for different languages, among other nice things. But for loading structures into and out of memory, XML can be very fast.
Before claiming things like that, I think proof in the form of benchmarks is needed.
I don't doubt they believed XML was slower, since many implementations are slow. Maybe XML is slower, but there is no proof yet. Also, I'm sure the other nice features of Protocol Buffers make them perfectly suited to their task.
URL encoding could have worked nicely too.
Comments
Unfortunately I don't have enough time for such science... I'll have to leave adding science to the Protocol Buffers claims to their authors.
It definitely would be interesting to compare the speed of AsmXml to protocol buffers.
Basically, data is encoded as a tag to identify the field, followed by the data type, followed by an optional data length for fields like strings, followed by the data.
At runtime the data is represented in your program by a generated object that can parse itself with very simple logic. It just scans through the binary fields until it recognizes a tag number and then it does a very simple copy/decode and stores the value directly in a member field.
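To make the scan-and-decode loop described above concrete, here is a minimal sketch in Python. It assumes the published Protocol Buffers wire format, in which the field number and the wire type are packed together into a single varint key, and it handles only two wire types (0 for varints, 2 for length-delimited data such as strings). The names read_varint and parse_fields are made up for this illustration; this is not the code Google's compiler generates, which stores values directly in typed member fields instead of a dictionary.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift   # low 7 bits carry the payload
        if not (byte & 0x80):              # high bit clear: last byte
            return result, pos
        shift += 7


def parse_fields(buf):
    """Scan a buffer and return {field_number: value}.

    Illustrative only: supports wire type 0 (varint) and
    wire type 2 (length-delimited) and rejects everything else.
    """
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            value = bytes(buf[pos:pos + length])
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields[field_number] = value
    return fields


# Field 1 holds the varint 150, field 2 holds the string "testing".
message = bytes([0x08, 0x96, 0x01, 0x12, 0x07]) + b"testing"
print(parse_fields(message))  # {1: 150, 2: b'testing'}
```

Note that there is no tokenizing and no text-to-number conversion anywhere in the loop: every step is a shift, a mask, or a slice.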
Since this is an extremely simple data format that is easy to serialize and parse (it's hard to imagine it getting much simpler!), it would be very difficult for structured XML to compete speed-wise, given all the tokenizing and string -> data conversions it has to do. An XML representation of the same data would also be significantly larger, which would slow down communication between machines.
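As a rough illustration of the size point, the sketch below encodes one toy record both ways and compares byte counts. The record layout, the XML element names (record, id, name) and the helper functions are all hypothetical, chosen only for this example, and the encoder again assumes the published wire format with the field number and wire type packed into one varint key. It says nothing about parsing speed, which would still need a real benchmark.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, high bit clear
            return bytes(out)


def encode_int_field(field_number, value):
    """Key with wire type 0, followed by the varint value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number, text):
    """Key with wire type 2, followed by the length, followed by the bytes."""
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


# The same toy record in both representations.
binary = encode_int_field(1, 150) + encode_string_field(2, "testing")
xml = b"<record><id>150</id><name>testing</name></record>"

print(len(binary), len(xml))
```

For this record the binary form is 12 bytes and the XML form is 49. The exact ratio depends heavily on the data: long strings shrink the gap, while many small numeric fields widen it.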
Not to mention that the schema is handled offline, with native code generated for you.
Maybe their schema is not up to the mark.