IronPython: Reusing Import Symbols to Avoid Performance Hits

December 18th, 2008

I struggled with IronPython today after stumbling over a pretty annoying performance problem with dynamically compiled scripts that contain namespace imports.

Let’s look at an artificial sample. The snippet below just increments a variable by one. As expected, it executes blazingly fast: once compiled, a few thousand executions take less than a millisecond in total.

value = value + 1

 

Now look at the following example:

import clr
clr.AddReference('System.Xml')
from System.Xml import *
value = value + 1

 

This snippet performs the same logic (incrementing the ‘value’ variable), but it also contains an import of the System.Xml namespace. The import isn’t needed for the increment, but it is still part of the script and runs every time. Executing this (compiled!) script 4000 times takes over 5 seconds!
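To get a feel for this kind of measurement, here is a minimal timing sketch in plain Python. It is only an analogy: it runs on a standard Python interpreter rather than through the IronPython hosting API, it uses xml.dom as a stand-in for System.Xml, and the helper name time_runs is made up for illustration. The absolute numbers will not match the ones quoted above.

```python
import time

# Compile both variants once, as the host application would.
plain = compile("value = value + 1", "<plain>", "exec")
with_import = compile(
    "from xml.dom import *\nvalue = value + 1", "<with_import>", "exec"
)

def time_runs(code, runs=4000):
    """Execute precompiled code `runs` times; return (seconds, final value)."""
    scope = {"value": 0}
    start = time.perf_counter()
    for _ in range(runs):
        exec(code, scope)
    return time.perf_counter() - start, scope["value"]

plain_seconds, plain_value = time_runs(plain)
import_seconds, import_value = time_runs(with_import)
print(f"plain:       {plain_seconds:.4f} s")
print(f"with import: {import_seconds:.4f} s")
```

Both variants are compiled only once, so any difference between the two timings comes from executing the import statement on every run.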

 

However, if you are lucky (as I was), you can separate namespace imports from business logic, so you end up with two scripts:

  • The import script, which is executed only once to get the imported symbols
  • A script that performs the actual work
 
# script 1: imports, executed only once
import clr
clr.AddReference('System.Xml')
from System.Xml import *

# script 2: the actual work
value = value + 1

 

Of course, you still need the imports from the first script to run the second one, which is where symbol dictionaries come into play:

//a dictionary that receives the import symbols
var importSymbols = new SymbolDictionary();

//compile the import script (the 'imports' string holds script 1)
ScriptSource importSource = engine.CreateScriptSourceFromString(imports,
    SourceCodeKind.Statements);
CompiledCode importCompiled = importSource.Compile();
//create a scope and execute the imports to populate the dictionary
//with the imported types
ScriptScope importScope = engine.CreateScope(importSymbols);
importCompiled.Execute(importScope);

 

The code above executes the import script against an empty SymbolDictionary. If you check this dictionary after execution, you can see that it now contains a symbol for every type of the imported namespace:

importSymbols.Key = __builtins__
value = IronPython.Runtime.PythonDictionary

importSymbols.Key = clr
value = Microsoft.Scripting.Runtime.Scope

importSymbols.Key = IHasXmlNode
value = IronPython.Runtime.Types.PythonType

importSymbols.Key = IXmlLineInfo
value = IronPython.Runtime.Types.PythonType

importSymbols.Key = IXmlNamespaceResolver
value = IronPython.Runtime.Types.PythonType

... many more
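The same effect can be observed in plain Python, where a module-level script is executed against an ordinary dictionary. The sketch below is an analogy only: it does not use the IronPython hosting API, and xml.dom stands in for System.Xml.

```python
# Run the import script once against an empty dictionary.
symbols = {}
exec("from xml.dom import *", symbols)

# The dictionary now holds the imported names plus __builtins__.
for name in sorted(symbols)[:5]:
    print(name, "=", type(symbols[name]).__name__)
```

As in the listing above, the dictionary ends up with a __builtins__ entry next to the imported types.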

 

What we can do now is reuse these retrieved symbols for our actual worker snippet (but beware, there’s a catch):

//compile worker script
ScriptSource scriptSource = engine.CreateScriptSourceFromString(source,
  SourceCodeKind.Statements);
CompiledCode workerScript = scriptSource.Compile();

//creating our scope with the new symbols allows us to access
//the XML namespace
ScriptScope workerScope = engine.CreateScope(importSymbols);
workerScope.SetVariable("value", 10);
workerScript.Execute(workerScope);

 

However, this code is not thread-safe, and not even safe for sequential reuse, because different scopes share not only the import symbols but all variables through the symbol dictionary. Look at the code below, which creates two script scope instances that are initialized with individual values:

//create a new scope with the symbols of the import scope
ScriptScope run1 = engine.CreateScope(importSymbols);
//start with a value of 10
run1.SetVariable("value", 10);
workerScript.Execute(run1);

//create a second scope with the symbols of the import scope
ScriptScope run2 = engine.CreateScope(importSymbols);
//start with a value of 20
run2.SetVariable("value", 20);
workerScript.Execute(run2);

//both scopes actually share the same variable
Console.Out.WriteLine(run1.GetVariable<int>("value"));
Console.Out.WriteLine(run2.GetVariable<int>("value"));

 

The console outputs the same value both times because both scopes operate on the same variable:

21
21
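The pitfall is easy to reproduce outside the hosting API. The following sketch is a plain-Python analogy, with a regular dict playing the role of the symbol dictionary and xml.dom standing in for System.Xml. Both "scopes" are backed by the same dictionary, so the second run overwrites the first run's variable.

```python
worker = compile("value = value + 1", "<worker>", "exec")

# One dictionary holds the imported symbols ...
symbols = {}
exec("from xml.dom import *", symbols)

# ... and both "scopes" are just references to that same dictionary.
run1 = symbols
run2 = symbols

run1["value"] = 10
exec(worker, run1)   # value becomes 11

run2["value"] = 20   # overwrites the 11 written by the first run
exec(worker, run2)   # value becomes 21

print(run1["value"])  # 21
print(run2["value"])  # 21
```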

 

So basically, we have two requirements:

  • Store import symbols (or any other shared variables) in a reusable cache.
  • Store worker variables that belong to a given script scope in another dictionary.
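These two requirements boil down to a layered lookup: reads fall through to the shared cache, writes stay in a per-scope dictionary. As a rough illustration, here is how that idea looks in plain Python with collections.ChainMap. This is an analogy and not the DLR API; the helper name run_worker and the extra kind assignment are made up for the example.

```python
from collections import ChainMap

# Run the import script once; this dictionary is the shared cache.
shared = {}
exec("from xml.dom import *", shared)

worker = compile(
    "value = value + 1\nkind = Node.ELEMENT_NODE", "<worker>", "exec"
)

def run_worker(start):
    """Run the worker in a fresh scope layered over the shared symbols."""
    local = {"value": start}
    # Lookups try `local` first, then `shared`; assignments go to `local`.
    scope = ChainMap(local, shared)
    exec(worker, {}, scope)
    return local

run1 = run_worker(10)
run2 = run_worker(20)
print(run1["value"], run2["value"])  # 11 21
```

The worker can still see the imported Node type through the shared layer, while value and kind end up only in each run's own dictionary.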

 

We can do this by subclassing the CustomSymbolDictionary class of the Microsoft.Scripting.Runtime namespace (it derives from BaseSymbolDictionary). It took me ages to find this solution, but the implementation was a breeze:

/// <summary>
/// A symbol dictionary that provides a fixed set of
/// symbols through a <see cref="SharedScope"/>. As new variables
/// are not added to the <see cref="SharedScope"/>, these cached
/// symbols can be reused across different scopes.
/// </summary>
public class SharedSymbolDictionary : CustomSymbolDictionary
{
  /// <summary>
  /// A script scope that provides a reusable set of symbols.
  /// Any variables that are being created by the <see cref="ScriptScope"/>
  /// that owns this cache are not stored within <see cref="SharedScope"/>,
  /// but the parent scope's own symbol dictionary.
  /// </summary>
  public ScriptScope SharedScope { get; private set; }


  /// <summary>
  /// Creates a new cache instance
  /// </summary>
  /// <param name="sharedScope">A reusable <see cref="ScriptScope"/> that provides
  /// a set of symbols that are supposed to be used across several scopes.</param>
  /// <exception cref="ArgumentNullException">If <paramref name="sharedScope"/>
  /// is a null reference.</exception>
  public SharedSymbolDictionary(ScriptScope sharedScope)
  {
    if (sharedScope == null) throw new ArgumentNullException("sharedScope");
    SharedScope = sharedScope;
  }



  /// <summary>
  /// Invoked if a given variable or symbol is being requested. This method
  /// tries to find the requested item in the underlying <see cref="SharedScope"/>.
  /// </summary>
  /// <param name="key">Identifier of the requested symbol.</param>
  /// <param name="value">Receives the symbol's value if it was found.</param>
  /// <returns>True if the <see cref="SharedScope"/> provides the requested
  /// symbol.</returns>
  protected override bool TryGetExtraValue(SymbolId key, out object value)
  {
    //look up the value in the shared scope, if possible.
    lock (SharedScope)
    {
      return SharedScope.TryGetVariable(SymbolTable.IdToString(key), out value);
    }
  }



  /// <summary>
  /// Gets a list of the extra keys that are cached by the optimized
  /// implementation of the module.
  /// </summary>
  public override SymbolId[] GetExtraKeys()
  {
    lock (SharedScope)
    {
      return SharedScope.GetItems().Select(pair =>
        SymbolTable.StringToId(pair.Key)).ToArray();
    }
  }


  /// <summary>
  /// Tries to set the extra value and return true if the specified key
  /// was found in the list of extra values.<br/>
  /// Any attempts to store extra values are being denied, which causes
  /// them to be stored in the scope itself rather than the local
  /// <see cref="SharedScope"/>. This ensures that runtime variables are
  /// not shared between different instances of the cache.
  /// </summary>
  /// <param name="key">The key that is used to store the submitted
  /// value.</param>
  /// <param name="value">Value to be cached.</param>
  /// <returns>Always false because runtime symbols are not supposed
  /// to be stored within the cache. This causes the value to be stored
  /// within the internal dictionary of the base class.</returns>
  protected override bool TrySetExtraValue(SymbolId key, object value)
  {
    return false;
  }

}

 

 

With this implementation, we can change our snippet accordingly:

//create a new scope with the symbols of the import scope
SharedSymbolDictionary cache = new SharedSymbolDictionary(importScope);
ScriptScope run1 = engine.CreateScope(cache);
//start with a value of 10
run1.SetVariable("value", 10);
workerScript.Execute(run1);

//create a second scope with the symbols of the import scope
cache = new SharedSymbolDictionary(importScope);
ScriptScope run2 = engine.CreateScope(cache);
//start with a value of 20
run2.SetVariable("value", 20);
workerScript.Execute(run2);

//each scope now has its own variable
Console.Out.WriteLine(run1.GetVariable<int>("value"));
Console.Out.WriteLine(run2.GetVariable<int>("value"));
 
Because the SharedSymbolDictionary class does not store the value variable in the shared importScope, the variables of each worker scope can be set independently, which produces the correct output:
 
11
21

 

And the performance gain is remarkable: execution time is back down to 17 milliseconds for 10,000 iterations 🙂

Download cache implementation and test script: symbolcache.zip
