XS::Manifesto - Shared XS modules manifesto
Out of the box, Perl offers tools to create XS-modules, i.e. to write code in C/C++ so that it is both fast and available from Perl, using tools like ExtUtils::MakeMaker.
This approach works well; however, it has limited extensibility. First, it's hard to call C/XS-code from other C/XS-code: while it can be done, it's possible only via the Perl layer, which carries a huge performance penalty compared to a direct C/C++ function call. Type safety is also lost, which matters a lot for early compile-time error detection in C/C++. Second, there is no way to share source code, e.g. header files and native-type-to-Perl-type mappings (aka typemaps). Without that information, C/C++ types and templates from one module cannot be reused in another C/C++ module.
There is an alternative approach on CPAN - Alien, which tries to make non-Perl libraries available to Perl. It either reuses system libraries or downloads and builds them. It successfully solves the problem of sharing native code, i.e. C/C++ headers, but not the problem of sharing XS-code. One could argue that it's Perl's own fault that it ships no mechanism to share typemaps. Maybe it's not even possible to implement this at all with a pure C layer.
The Alien-specific issue is that it doesn't aim to share binary (executable) code (see Alien CAVEATS). While this approach has benefits, such as an Alien-library being a build-only (compile-only) dependency, the ease of module upgrades (without rebuilding dependencies), etc., it has its own limitations:
First, not sharing library code means duplicating it in a process's memory. Let's say there is an Alien::libX and XS-libraries My::libA and My::libB, which both use the C-interface of libX. The statically compiled Alien::libX will be duplicated in both XS-libraries. While memory is considered cheap nowadays, this can still be an issue.
Second, as the XS-libraries My::libA and My::libB use Alien::libX independently, they can upgrade Alien::libX independently. While this is a benefit in some circumstances, it can lead to them losing binary compatibility with each other, i.e. data structures created via My::libA::libX might not be safe to pass to My::libB::libX. In other words, it's only possible to have final XS-modules without binary inter-dependencies.
Third, as Alien has no support for XS-modules, it is impossible to have a cross-dependent binary hierarchy of XS-modules, like:
alien::libX <-- xs::libA <-- xs::libB <-- xs::libC <-- xs::libD
Because the modules are statically compiled, there is no runtime dependency between xs::libA (libA.a) and xs::libB (libB.a); libB.a just embeds (copies) libA.a directly into its own code. The object code copying propagates through all further dependencies, up to xs::libD. The opposite approach is to have shared libraries, without any duplication.
It should be possible to have fast applications in Perl. It should be possible to have low-level components (like parsers, event loops, protocol handlers, etc.) written in C/C++, while being able to access as much as possible of their functions from Perl. The middleware components (like application servers, session managers, etc.) can be written in Perl or in C/C++ for performance-critical parts. The higher level application logic is suitable mostly for Perl, with the exception of very limited performance-critical parts.
There is an exception to the last rule: if the application models/code have to be shared with non-Perl applications (e.g. in a game application built with the Unity or Unreal engines), they obviously should be written in C/C++. But they should still be accessible from Perl servers.
Summa summarum, any part of an application can be written in Perl and/or in C/C++, and the transition should be transparent. And as it's much faster to create code in Perl, it's most likely that during the early stages of development Perl code dominates, while later, as an application grows and performance becomes an issue, parts of it are replaced with XS modules written in C/C++. This makes it possible to have a gradual evolution instead of radical solutions like "let's rewrite everything in Go!".
It's like Alien modules built with system modules or with shared library support.
Provided by XS::Install.
If XS::libraryX was compiled against XS::libraryY v1.0, but at load time it finds that the installed version of XS::libraryY is different (e.g. v1.1), by default it refuses to load and asks to be recompiled against the actual dependency version, XS::libraryY v1.1.
Perl itself does not guarantee ABI-compatibility between major releases. Maintaining ABI-compatibility has its own costs, including a possible performance penalty: e.g. instead of directly accessing a public field of a C-structure, the field must be accessed via a function, which prevents the C/C++ compiler from inlining the access.
Usually C++ libraries do not maintain ABI-compatibility, so let it be. As a consequence, if a base libraryX is upgraded, all other modules which depend on it must be recompiled. Here we trade build time for runtime performance.
There have been several attempts to share typemaps, i.e. to make it possible to reuse in xs::libraryX the conversion rules between the C and Perl layers already defined in xs::libraryY. Unfortunately, all of these attempts are unsatisfying, mostly because C offers neither tools nor language support for that.
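The load-time check described above can be sketched roughly as follows. This is only an illustration in C++, not the actual XS::Install implementation: the names `BinDependency` and `check_dependencies`, and the idea of passing the installed versions as a map, are assumptions made for the example.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record: a dependency and the version it had when the
// dependent module was compiled.
struct BinDependency {
    std::string module;
    std::string compiled_version;
};

// Compare the versions recorded at build time with the versions that are
// actually installed, and refuse to continue if any of them differ.
// `installed` maps a module name to its currently installed version.
inline void check_dependencies(const std::string& loading_module,
                               const std::vector<BinDependency>& deps,
                               const std::map<std::string, std::string>& installed) {
    assert(!loading_module.empty());
    for (const auto& dep : deps) {
        auto it = installed.find(dep.module);
        if (it == installed.end()) {
            throw std::runtime_error(loading_module + ": dependency " +
                                     dep.module + " is not installed");
        }
        if (it->second != dep.compiled_version) {
            throw std::runtime_error(
                loading_module + " was compiled with " + dep.module + " " +
                dep.compiled_version + ", but " + it->second +
                " is installed; please recompile " + loading_module);
        }
    }
}
```

With a recorded dependency on XS::libraryY 1.0, the call passes silently when 1.0 is installed and throws when 1.1 is installed. The comparison is deliberately an exact match rather than "greater or equal", since no ABI-compatibility between versions is assumed.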
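The cost mentioned above can be illustrated with two ways of exposing the same data. The types and functions below (`PointOpen`, `PointOpaque`, `point_x`, etc.) are hypothetical and exist only for this sketch.

```cpp
#include <cassert>

// Variant 1: the layout is part of the public interface.
// Dependents read `p.x` directly, which the compiler can turn into a single
// load, but any change to the layout breaks already-compiled dependents.
struct PointOpen {
    int x;
    int y;
};

// Variant 2: the layout is hidden behind functions (ABI-stable style).
// In a real library only these declarations would be in the public header.
struct PointOpaque;
PointOpaque* point_new(int x, int y);
int point_x(const PointOpaque* p);
void point_free(PointOpaque* p);

// What would live inside the library's own .cpp file. Dependents would not
// see these bodies, so every access is a real call across the library
// boundary that cannot be inlined.
struct PointOpaque {
    int x;
    int y;
};
PointOpaque* point_new(int x, int y) { return new PointOpaque{x, y}; }
int point_x(const PointOpaque* p) {
    assert(p != nullptr);
    return p->x;
}
void point_free(PointOpaque* p) { delete p; }

// Code in a dependent module: direct field access.
inline int sum_x_open(const PointOpen* pts, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i) sum += pts[i].x;
    return sum;
}

// Code in a dependent module: one function call per element.
inline int sum_x_opaque(PointOpaque* const* pts, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i) sum += point_x(pts[i]);
    return sum;
}
```

Both loops compute the same result. The first depends on the exact layout of `PointOpen`; the second survives layout changes at the price of a call per access.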
However, with modern C++ the situation is a bit different, because C++ provides a powerful template mechanism. It can be used to share C++ typemaps. It is even possible to share C-typemaps this way, as long as they are compile-time compatible with C++.
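A minimal sketch of how templates can carry typemaps between modules is shown below. It does not reproduce the XS::Framework API: `Scalar` is a simplified stand-in for a Perl SV, and `Typemap`, `from_scalar`, `to_scalar` and `Seconds` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Simplified stand-in for a Perl scalar (real XS code would work with SV*).
struct Scalar {
    enum Kind { Int, Str } kind;
    long i;
    std::string s;
};

// Primary template is declared but not defined: using a type that has no
// typemap is a compile-time error.
template <class T> struct Typemap;

// Typemaps that a base module could ship in its public header.
template <> struct Typemap<long> {
    static long in(const Scalar& sv) {
        if (sv.kind != Scalar::Int) throw std::invalid_argument("not an integer");
        return sv.i;
    }
    static Scalar out(long v) { return Scalar{Scalar::Int, v, std::string()}; }
};

template <> struct Typemap<std::string> {
    static std::string in(const Scalar& sv) {
        if (sv.kind != Scalar::Str) throw std::invalid_argument("not a string");
        return sv.s;
    }
    static Scalar out(const std::string& v) { return Scalar{Scalar::Str, 0, v}; }
};

// Generic entry points available to every module that includes the header.
template <class T> T from_scalar(const Scalar& sv) { return Typemap<T>::in(sv); }
template <class T> Scalar to_scalar(const T& v) { return Typemap<T>::out(v); }

// A type from a second, dependent module. Its typemap reuses Typemap<long>
// from the base module instead of re-implementing the conversion.
struct Seconds {
    long value;
};

template <> struct Typemap<Seconds> {
    static Seconds in(const Scalar& sv) { return Seconds{Typemap<long>::in(sv)}; }
    static Scalar out(const Seconds& v) {
        assert(v.value >= 0);  // invariant of this hypothetical type
        return Typemap<long>::out(v.value);
    }
};
```

A third module that includes both headers can call `from_scalar<Seconds>(sv)` or `to_scalar(Seconds{90})` directly. The conversion is resolved at compile time, so no call through the Perl layer is involved.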
Provided by XS::Framework.
Every XS-module should have a Perl interface, so that it can be used from Perl as a self-contained module. It should also have a C/C++ interface, so that other XS-modules can use it from their C/C++ code, i.e. embed it as a type or inherit from it.
A C++ typemap should also be provided to make the module easy to use. It is considered part of the C/C++ interface of an XS-module.
There is a module Date, which provides methods for serializing and parsing dates in various formats. It is very fast, and when Date.pm is loaded it loads its C++ XS backend as Date.so (or Date.dll on Windows). It depends on XS::Framework, so Framework.so is also loaded and its C++ functions are used directly from Date.so.
There is a module URI::XS (say uri.so), which parses/serializes URIs. It also depends on XS::Framework.
Now there is a module Protocol::HTTP. When its http.so is loaded, it also loads Framework.so, uri.so and Date.so. So when dates or URIs are parsed/serialized from Perl code using Protocol::HTTP, the code from uri.so or Date.so is invoked directly, without routing via Perl (which isn't fast).
When URI/Date objects are returned to the Perl layer from the Protocol::HTTP XS API, they're exactly the same C++ objects (i.e. there are no additional memory allocations for C++ objects), just wrapped into Perl SVs (scalars), and it's possible to simply use them with the Perl APIs of the corresponding modules (URI::XS or Date), which exist completely outside of the Protocol::HTTP space.
XS::Install (tooling support) and XS::Framework (XS/C++ support) make this kind of sharing easy.
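The "same object, only wrapped" idea can be sketched as follows. This is not how XS::Framework is implemented: `Wrapped` is a simplified stand-in for a Perl SV holding a C++ object, and `Uri`, `Request`, `wrap`, `unwrap` and `request_uri` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for a C++ type that belongs to another module (e.g. a URI class).
struct Uri {
    std::string host;
};

// Stand-in for a Perl scalar that wraps a C++ object: it holds a reference
// to the object, not a copy of it.
struct Wrapped {
    std::shared_ptr<void> object;
    const char* perl_class;
};

// Wrap an existing object; only the reference count changes.
template <class T>
Wrapped wrap(const std::shared_ptr<T>& obj, const char* perl_class) {
    return Wrapped{obj, perl_class};
}

// Get the original C++ object back. The caller must name the same type the
// object was wrapped with.
template <class T>
T* unwrap(const Wrapped& w) {
    return static_cast<T*>(w.object.get());
}

// Stand-in for an HTTP request object that owns a URI.
struct Request {
    std::shared_ptr<Uri> uri;
};

// What an accessor exposed to Perl might do: hand out the very same Uri.
inline Wrapped request_uri(const Request& r) {
    assert(r.uri != nullptr);
    return wrap(r.uri, "URI::XS");
}
```

Here `unwrap<Uri>(request_uri(req))` yields the same address as `req.uri.get()`, so a change made through the wrapper is visible to the request and vice versa. No second `Uri` is ever allocated.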
Pronin Oleg (SYBER) <firstname.lastname@example.org>, Crazy Panda LTD
Sergey Aleynikov (RANDIR) <email@example.com>, Crazy Panda LTD
Ivan Baidakou (DMOL) <firstname.lastname@example.org>, Crazy Panda LTD
You may distribute this code under the same terms as Perl itself.