Code Refractor - Virtual Machines/Compiler performance musings: July 2013

Wednesday, July 31, 2013

Status Updates - Part 3

As I was on vacation, I only worked on smaller tasks in my free time, but there are some noteworthy updates, mostly on the optimization front:
- there is a [PureMethod] attribute that you can use to mark functions. If this attribute is found, the function is considered pure and, as a consequence, if you call it with constants, the call is evaluated at compile time. It would be great if in the future functions were analyzed for purity automatically, but this is a longer-term plan (it is possible, but there are many cases to handle)
- inlining is possible (at least for simple functions), but the optimization is disabled as it requires a lot of testing. Still, this opens a lot of possibilities for performance: if you call a function with a constant and that function is inlined, more optimizations can apply. The medium-term plan is to bug-fix and test the inliner so that it works for most small cases
- the compiler optimizer is split into parallel and serial optimizations. The good part is that, since the functions are all known up front, every core can be used, each one optimizing functions independently. The inliner (and a future purity analysis) are serial optimizations. This reduces the time to generate the C++ code for NBody (on my first-generation i5) from 200 ms to 150 ms; compiling the C++ code still takes the longest
- function bodies are defined as a sequence of simple operations. So optimizations that delete items were rewritten to be much faster by doing the deletes in batches
- unit tests are a bit weaker right now, but they compile and run much faster. They test the capability of the compiler to produce an output, not the output of executing the program. They now run properly, so unit testing is working again
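
The batched deletes mentioned in the list above can be illustrated with a small sketch. This is not CodeRefractor's code (which is written in C#); the operation strings and function names are made up for illustration. The idea is that removing items one at a time shifts the tail of the sequence on every delete, while collecting the dead indices first lets the body be rebuilt in a single pass.

```python
def delete_one_by_one(operations, dead_indices):
    """Naive approach: every 'del' shifts the remaining tail of the list."""
    ops = list(operations)
    # Delete from the highest index down so the lower indices stay valid.
    for index in sorted(dead_indices, reverse=True):
        del ops[index]
    return ops


def delete_in_batch(operations, dead_indices):
    """Batched approach: gather the dead indices, then rebuild the list once."""
    dead = set(dead_indices)
    return [op for i, op in enumerate(operations) if i not in dead]


# Hypothetical function body: a flat sequence of simple operations.
body = ["load a", "nop", "load b", "nop", "add", "store c"]
dead = [i for i, op in enumerate(body) if op == "nop"]
compacted = delete_in_batch(body, dead)
```

Both functions return the same result; the batched one just touches each element once, however many items are removed.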

So in short, you will get the same generated code as before (unless you mark functions with [PureMethod]), only faster.
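
The split between parallel and serial optimizations described above could look roughly like the sketch below. Everything in it is an assumption made for illustration: the pass names, the string-based operations and the use of a thread pool. In CPython the threads mostly show the structure rather than deliver a real speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def remove_nops(body):
    """Per-function pass: looks at one body only, so it can run in parallel."""
    return [op for op in body if op != "nop"]


def drop_calls_to_empty_functions(functions):
    """Whole-program pass: reads other function bodies, so it runs serially."""
    empty = {name for name, body in functions.items() if not body}
    return {
        name: [op for op in body
               if not (op.startswith("call ") and op[5:] in empty)]
        for name, body in functions.items()
    }


def optimize(functions):
    names = list(functions)
    # Parallel stage: each function body is optimized independently.
    with ThreadPoolExecutor() as pool:
        bodies = list(pool.map(remove_nops, (functions[n] for n in names)))
    # Serial stage: passes that need a view of the whole program.
    return drop_calls_to_empty_functions(dict(zip(names, bodies)))


# Hypothetical program: function name -> list of simple operations.
program = {
    "main": ["nop", "call helper", "call work", "ret"],
    "helper": ["nop"],
    "work": ["load a", "nop", "ret"],
}
optimized = optimize(program)
```

The serial pass here is a stand-in for an inliner: it has to see that `helper` became empty before it can drop the call from `main`, which is why it cannot run per function.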

I added code to reflect over APIs; it will be needed to generate stubs for OpenRuntime code. This code needs some love, and if any readers are interested, it would be great if someone could look into generating empty library wrappers.

Future plans:
- enums support
- string type improvements (it depends on enum in part)
- string merging - all occurrences of the same string should be defined just once in a string table
- (maybe) fixes to the inliner: the function call overhead should be eliminated at least in some cases that can be detected, such as empty functions and simple setters/getters
- (maybe) a purity checker: computing purity gives large speedups if the developer uses constants. If functions can be checked for purity automatically (without [PureMethod] annotations), calls made with constant arguments will have zero overhead at execution time
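
As a rough idea of what the planned string merging could mean, here is a minimal sketch of a string table. The class and method names are hypothetical and not taken from the CodeRefractor sources: each distinct literal is stored once, and every use site keeps only an index into the table.

```python
class StringTable:
    """Stores each distinct string once and hands out stable indices."""

    def __init__(self):
        self._index_of = {}   # string -> index in 'entries'
        self.entries = []     # the table that would be emitted once

    def intern(self, text):
        index = self._index_of.get(text)
        if index is None:
            # First occurrence: append it and remember where it lives.
            index = len(self.entries)
            self._index_of[text] = index
            self.entries.append(text)
        return index


table = StringTable()
literals = ["Hello", "World", "Hello", "Hello", "World"]
indices = [table.intern(s) for s in literals]
```

Five literal uses end up as two table entries; the generated code would then refer to the indices instead of repeating the string data.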

Tuesday, July 9, 2013

Status Updates - Part 2

This entry will be brief (as I will soon be on vacation):
- the switch CIL instruction is working, which basically means that you can write switch statements
- the code was moved to GitHub because of its ubiquity:
https://github.com/ciplogic/CodeRefractor
- I was looking into the delegates code and I will postpone it, as it looks like a big feature; it is unlikely that I will have an implementation early enough to be useful in the next two to three months
- I've removed the need for a static library written in C++ that had to be linked alongside your code. All the code is now taken from the OpenRuntime assembly

Future (planned) developments:
- in one month from now I will try to fix the unit tests (many refactorings were done and the tests are still pending fixes). Automatic testing is critical, at least in the future, as some components (like optimizations) interact and it is critical that they work correctly
- add annotations for functions that are optimizer friendly (for people who know what these are about). I am thinking here about marking functions without side effects. This is critical for calling functions with constant arguments and replacing the call with its value. Let's say you call Math.Cos(0): you will want this call not to be executed, but to be replaced by the value 1.0. Similarly, string functions can be computed if their parameters are constant. It is also important to let you annotate your own code with a "no side effects" flag; the runtime will trust you and evaluate parts of your code at compile time.
- look into string merging (as was done for ldtoken buffers): if you use the same string in several places, it should be loaded from a string table, not replicated over and over again
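
The Math.Cos(0) example above can be sketched as a small folding pass. This is only an illustration under my own assumptions: the `pure_method` decorator stands in for the planned "no side effects" annotation, and the tuple-based instruction format is invented for the sketch, not taken from the compiler.

```python
import math

PURE_FUNCTIONS = {}


def pure_method(func):
    """Hypothetical stand-in for a 'no side effects' annotation."""
    PURE_FUNCTIONS[func.__name__] = func
    return func


@pure_method
def cos(x):
    return math.cos(x)


def fold_pure_calls(operations):
    """Replace calls to pure functions with constant arguments by their value."""
    folded = []
    for op in operations:
        if op[0] == "call":
            _, name, args = op
            # In this sketch numbers are constants and strings are variable names.
            all_constant = all(isinstance(a, (int, float)) for a in args)
            if name in PURE_FUNCTIONS and all_constant:
                folded.append(("const", PURE_FUNCTIONS[name](*args)))
                continue
        folded.append(op)
    return folded


code = [
    ("call", "cos", (0,)),      # pure, constant argument: folded
    ("call", "cos", ("x",)),    # pure, but the argument is a variable: kept
    ("call", "print", (1,)),    # not marked pure: kept
]
result = fold_pure_calls(code)
```

Only the first call is replaced by a constant; the other two are left untouched because either the argument is unknown at compile time or the function was never declared free of side effects.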

Longer term:
- I would love a "hello world" SDL (maybe OpenGL) application to work
- a PInvoke Pinta plugin to work (without delegates)
- extract resources
- enums
- structs code
- make a small "IDE" integrating AvalonEdit (on Windows) or something similar to enable a fast testing cycle, or find some other way for developers to shorten the cycle of testing their small applications against CodeRefractor

Monday, July 1, 2013

Status Updates (I)

More work has been done to advance the compiler and the runtime, but for the first time there are some user-visible changes inside the repository (there is still no downloadable package, but this will be improved soon).

Bug fixes:
- bug fix: there were cases where using both the CROpenRuntime C# methods and the C++ methods in the same source code (like Console.WriteLine(double) and Console.WriteLine(float)) made the final code emit WriteLine(double) twice; now WriteLine(double) is added just once
- PInvoke calls for very simple cases are executed correctly. Loading the native code uses a Windows-only (LoadLibrary, GetProcAddress) implementation (which works on both 32 and 64 bit); the equivalent for Linux/OS X (dlopen, dlsym) has not been done yet. Also, there is no marshaling yet
- ldtoken: this instruction shows up when you initialize a long array using array initializers, because .Net performs a memory copy from an address inside the assembly. This requires a constant array table. Because we compile ahead of time, the implementation does one thing which the .Net implementation doesn't: it merges this data, and the executing code points to an index in this array table
- in the past, instructions were dispatched by the string name of their opcode. Right now, at least some of them are dispatched using the opcode value directly. This translates into a somewhat faster lookup. I will move more instructions over in the next iteration
- the compiler starts with Mono and MonoDevelop from the latest Linux Mint version (Olivia); small version changes were made in the solution. Linux support is not complete, but for people wanting to try, test and tweak it, it is a small step in the right direction
- there is an Inno Setup script. This means that there will be an installer somewhat soon
- there is a compiler launcher UI:

Picker of input assembly and output exe

Pick the compiler options

- the command line logic is done, meaning that you can now pick the input and output assembly, and an assembly for reading the runtime (most likely the default will be fine)
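
The move from string-based to value-based opcode dispatch mentioned in the list above can be sketched as follows. The handlers and the tiny stack machine are invented for illustration; the numeric values are meant to match the CIL encodings of ldc.i4, add and nop, but treat them as illustrative rather than as a quote from the compiler.

```python
# Opcode values (assumed to follow the CIL encoding; illustrative only).
NOP, LDC_I4, ADD = 0x00, 0x20, 0x58


def op_ldc(stack, operand):
    stack.append(operand)


def op_add(stack, operand):
    right = stack.pop()
    left = stack.pop()
    stack.append(left + right)


def op_nop(stack, operand):
    pass


# Old style: look the handler up by the opcode's string name.
HANDLERS_BY_NAME = {"ldc.i4": op_ldc, "add": op_add, "nop": op_nop}

# New style: a flat table indexed directly by the opcode value.
HANDLERS_BY_VALUE = [None] * 256
HANDLERS_BY_VALUE[LDC_I4] = op_ldc
HANDLERS_BY_VALUE[ADD] = op_add
HANDLERS_BY_VALUE[NOP] = op_nop


def run_by_name(instructions):
    stack = []
    for name, operand in instructions:
        HANDLERS_BY_NAME[name](stack, operand)
    return stack


def run_by_value(instructions):
    stack = []
    for opcode, operand in instructions:
        HANDLERS_BY_VALUE[opcode](stack, operand)
    return stack


by_value = run_by_value([(LDC_I4, 2), (NOP, None), (LDC_I4, 3), (ADD, None)])
by_name = run_by_name([("ldc.i4", 2), ("nop", None), ("ldc.i4", 3), ("add", None)])
```

Both variants compute the same result; the value-indexed table simply avoids hashing or comparing strings for every instruction.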

As I will be on holiday, July will most likely be a fairly quiet month for progress, but I will probably go into bug fixing, and I will try to smooth the way for the first installable package (Windows only).
The longer-term roadmap would be:
- support for unsafe code
- try to compile a Pinta effect (one which is not multi-threaded) as a target
- support for switch keyword (opcode)
- (later) add support for delegates and MulticastDelegate. Without them, most of the goodness of lambdas, LINQ, or just calling callbacks is not possible.
