21 July, 2009

Xtext Linking one step at a time

The Eclipse environment with Xtext provides an astoundingly powerful toolset. The EMF/TMF folks are doing a great job providing good capabilities with straightforward means of tailoring and extension. That said, there's a lot to the environment that can make some tasks appear more daunting than need be.

After reading Sven Efftinge's post on an experimental Xtext feature that automates linking among models in a project, I was definitely inclined to make use of it. On the other hand, it's a solution much bigger than the problem I currently face: resolving references within the same model (a program, actually) while assuming that references without a local definition are external.

My grammar allows references to imported constructs in a manner similar to Java's implied import of all class names that reside in the same package. The approach in Sven's article on integrating with the EMF index will ultimately be implemented in this project, if time allows. For now, I don't want the parser to actually load all the models, partly because the main goal is to analyze individual models and the content of imported models is extraneous.

The simple way to keep errors from being flagged where implicit imports can't be resolved is to extend the DefaultLinkingService. To start with, the ILinkingService binding has to be replaced by adding an override in the LanguageRuntimeModule class in the main parser project's directory:

 public Class<? extends ILinkingService> bindILinkingService() {
  return PdlLinkingService.class;
 }
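
For reference, here's how that method sits in the runtime module that Xtext generates for the language. This is a sketch using my Pdl names; the abstract superclass name follows what the generator produced for me, so adjust it for your language:

import org.eclipse.xtext.linking.ILinkingService;

public class PdlRuntimeModule extends AbstractPdlRuntimeModule {

 // Xtext's module infrastructure discovers bind...() methods by name,
 // so declaring this method is enough to install the custom service.
 public Class<? extends ILinkingService> bindILinkingService() {
  return PdlLinkingService.class;
 }
}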

The custom linking service is a minimal extension of DefaultLinkingService. Thanks to Sebastian for guidance on optimizing the integration with Xtext. Knut's pointers for dealing with the ECore resources were crucial to implementing a proper solution. Here's the complete class (I'm sure there's still room for improvement):

/**
 * Provide the linking semantics for PDL. The references to 'imported' table
 * definitions are stubbed out by this class until the Indexer is implemented.
 * 
 * @author John Bito
 */
public class PdlLinkingService extends DefaultLinkingService {

 private PdlFactory factoryInstance = null;
 /**
  * Keep stubs so that new ones aren't created for each linking pass.
  */
 private final Map<String, PhysicalFileName> stubbedRefs;
 private Resource stubsResource = null;

 public PdlLinkingService() {
  super();
  stubbedRefs = new Hashtable<String, PhysicalFileName>();
 }

 /**
  * Retrieve the factory for model elements
  * 
  * @return the factory for the model defined by the language
  */
 private PdlFactory getFactory() {
  if (null == factoryInstance)
   factoryInstance = PdlPackage.eINSTANCE.getPdlFactory();
  return factoryInstance;
 }

 /**
  * Use a temporary 'child' resource to hold created stubs. The real resource
  * URI is used to generate a 'temporary' resource to be the container for
  * stub EObjects.
  * 
  * @param source
  *            the real resource that is being parsed
  * @return the cached reference to a resource named by the real resource
  *         with the added extension 'xmi'
  */
 private Resource makeResource(Resource source) {
  if (null != stubsResource)
   return stubsResource;
  URI stubURI = source.getURI();
  stubURI = stubURI.appendFileExtension("xmi");
  stubsResource = source.getResourceSet().getResource(stubURI, false);
  if (null == stubsResource)
   // TODO find out if this should be cleaned up so as not to clutter
   // the project.
    stubsResource = source.getResourceSet().createResource(stubURI);
  return stubsResource;
 }

 /**
  * Override default in order to supply a stub object. If the default
  * implementation isn't able to resolve the link, assume it to be a local
  * resource.
  * 
  * @param context
  *            the model element containing the reference
  * @param ref
  *            the reference defining the type that must be resolved
  * @param node
  *            the parse tree node containing the text of the reference (ID)
  * @return the default implementation's return if non-empty or else an
  *         internally-generated PhysicalFileName
  * @throws IllegalNodeException
  *             if detected by the default implementation
  * @see org.eclipse.xtext.linking.impl.DefaultLinkingService#getLinkedObjects(org.eclipse.emf.ecore.EObject,
  *      org.eclipse.emf.ecore.EReference,
  *      org.eclipse.xtext.parsetree.AbstractNode)
  */
 @Override
 public List<EObject> getLinkedObjects(EObject context, EReference ref,
   AbstractNode node) throws IllegalNodeException {
  List<EObject> result = super.getLinkedObjects(context, ref, node);
  // If the default implementation resolved the link, return it
  if (null != result && !result.isEmpty())
   return result;
  // Is this a reference to be stubbed?
  if (PdlPackage.Literals.FILE_NAME
    .isSuperTypeOf(ref.getEReferenceType())) {
   // Get the stub's name from the text of the parse tree node.
   String name = getCrossRefNodeAsString(node);
    PhysicalFileName stub = stubbedRefs.get(name);
    if (null == stub) {
     // Create the model element instance using the factory
     stub = getFactory().createPhysicalFileName();
     stub.setName(name);
     // Attach the stub to the resource that's being parsed
     makeResource(context.eResource()).getContents().add(stub);
     // Cache the stub so subsequent linking passes reuse the same instance
     stubbedRefs.put(name, stub);
    }
   result = Collections.singletonList((EObject) stub);
  }
  return result;
 }
}

Of course, this is a stop-gap that's not appropriate for complete language processing, but it's illustrative of one simple way semantics may be adjusted within the Xtext framework. There are two likely problems making this implementation sub-optimal:

  • The 'temporary' resource's lifetime isn't managed, so the local cache of EObject references can get out of sync with it. Intuitively, one would expect the resource to outlive the LinkingService, so it would be surprising to find the LinkingService handing out a reference to an EObject that's no longer stored in the resource.
  • Since the code doesn't try to find EObject instances that are already in the resource, there can be a memory leak if the LinkingService is created multiple times during a session (a possible lookup is sketched below). It may make sense to use the delete method on the Resource, but it looks a bit dangerous and I don't have the wherewithal to test that at the moment.
I searched a bit for examples of code managing ECore resources; my search terms weren't very fruitful. Pointers would be greatly appreciated.
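
On the second point, looking for an existing stub in the resource before creating a new one might look like the method below (it would live in PdlLinkingService). I haven't tried this, so treat it as a sketch rather than a tested fix:

 private PhysicalFileName findExistingStub(Resource stubs, String name) {
  // Walk the stub resource's direct contents looking for a stub
  // that already carries the requested name.
  for (EObject candidate : stubs.getContents()) {
   if (candidate instanceof PhysicalFileName
     && name.equals(((PhysicalFileName) candidate).getName()))
    return (PhysicalFileName) candidate;
  }
  return null;
 }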

11 July, 2009

Embedding Java in a C language application

The Java Native Interface is a powerful API that allows Java to call native libraries and also allows native executables to invoke Java (or any JVM-based) classes.

An important addition to the Java Native Interface in Java 1.4 was support for java.nio.ByteBuffer parameters. This allows the native code (C in my case) to efficiently share data with classes running in the JVM. Often, when C code is invoking Java methods, it has to create objects, particularly Strings, to pass to the methods. While this provides some safety to the Java side, it has some performance implications.

With Java 1.4, we have the option to pass data across the JNI boundary in a single buffer that is shared by the C and Java code and referenced by a java.nio.ByteBuffer. The use of this technique comes with many caveats, particularly if the application runs multiple threads. The ideal application for this approach is when a fixed-size message with an unvarying or self-defining format is passed across the interface. The performance advantage comes because both the C and Java code can change the data, so the ByteBuffer object only has to be created once. The C code can change the data in the buffer, pass the same ByteBuffer object to multiple method invocations, and then make use of changes the Java code makes to the buffer contents.

JNIEnv *jni = get_jniEnv();
jclass cls = getClassRef();
jmethodID getdata_mid = getMethodID(DATA_LOADER);
static jobject jbuff = NULL;
static char *databuff = NULL;

//allocate the C side of the shared buffer once (cbuff, the size, is defined elsewhere)
if (!databuff)
  if ((databuff = malloc(cbuff)))
    memset(databuff, '\0', cbuff);

//wrap the C buffer in a direct ByteBuffer once; if jbuff is cached across
//calls or threads, promoting it to a global reference with NewGlobalRef
//would be safer
if (jni && !jbuff && databuff)
  jbuff = (*jni)->NewDirectByteBuffer(jni, databuff, cbuff);
if (!(jni && cls && getdata_mid && jbuff))
  //quit unless all references are available
  return NULL;
//the Java method reads and updates the shared buffer in place
(*jni)->CallStaticVoidMethod(jni, cls, getdata_mid, jbuff);
if (check_exception())
  return NULL;
return databuff;
The code above assumes that all the work to initialize the JVM, load the class and look up the method is handled elsewhere. The caller can examine and change the data retrieved by the Java code. It can then modify the contents of the returned buffer and call the function again; the Java code can then use the changes made in the buffer between calls. This saves object creation and destruction as well as the time spent copying data from one buffer to another.

On the Java side, there are some limitations on the use of the ByteBuffer. It's not backed by an array, so some of the operations declared by the abstract class (array(), for example) are unsupported when the buffer is created by the C code.
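
To illustrate, the Java method that the C code invokes (the one looked up with the DATA_LOADER ID) might look like the sketch below. The class and method names are placeholders of mine, not anything mandated by JNI:

import java.nio.ByteBuffer;

public class DataLoader {
 // Invoked from C via CallStaticVoidMethod with the direct ByteBuffer.
 public static void loadData(ByteBuffer buffer) {
  // A buffer created with NewDirectByteBuffer has no backing array:
  // hasArray() returns false and array() throws UnsupportedOperationException,
  // so use the get/put operations instead.
  buffer.rewind();
  byte request = buffer.get();           // read what the C side wrote
  buffer.put(0, (byte) (request + 1));   // update the buffer in place for the C side
 }
}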

JRE Internal Error – "exception happened outside interpreter, nmethods and vtable stubs (1)"?!

When calling Java via JNI, there's no checking of the method parameters. As a result, if your code passes parameters that don't match the method signature in a Call...Method JNI call, the JVM will likely panic and terminate with a message like "exception happened outside interpreter, nmethods and vtable stubs (1)" if you're lucky (and your JVM is at least 1.6). If you're really lucky, the JVM will write an error log that includes a stack trace pointing to the C code that made the JNI call.

Unfortunately, the 'Internal Error' reported by the JVM is really a generic panic message, so it's also likely to result from more insidious problems like your code writing into the Java heap. In that case you could try valgrind or a similar tool to track down stray memory usage, but I can't say whether it'll work on a process that embeds a JVM.

07 July, 2009

Adding function to Xtext-generated plugins

The Xtext parser generator generates much more than a parser—otherwise ANTLR by itself would be a simpler solution. As noted in my post on migrating the Xtext grammar, just by deploying the plugins generated by Xtext, you get a basic outline while editing the defined language.

For my analyzer, I have to add checks that flag errors for programs that contain multiple definitions for certain types of structures. The legacy language allows a single name to be associated with multiple, different structure definitions as the program executes, but part of the upgrade will cause this to be illegal, so it's the job of the static analyzer (parser) to detect this case so the programmers don't have to search out and test each program manually.

The Xtext manual's section on Validation includes Custom Validation, which describes the classes involved in implementing validations for the language. In order to get the Xtext customization framework for validation, add

<fragment class="org.eclipse.xtext.generator.validation.JavaValidatorFragment"/>
to the GenerateLanguageName.mwe file in the parser project. Unfortunately, the Java-based checks are not as easy to write as the oAW Check language. One of the nice things about oAW was the content assist—the editor for Check knows the AST model and makes it easy to write checks that work. There's another fragment to add to the MWE file in order to get the Check capability:
<fragment class="org.eclipse.xtext.generator.validation.CheckFragment"/>
Since the TMF Xtext plugins don't seem to include a Check editor with content assist, it's actually more convenient to write the validations in Java. Referring to the LanguageName.ecore file in the visual editor (it should open when you double-click the ecore file) makes it pretty easy to see the types and their features when working on validations.

I just needed to add two validations (so far). The Java is pretty small, but it took me a little while to get comfortable with the CST classes.

/**
 * Issue warning if the View object is using the dynamic resolution syntax
 * @param view object to be validated
 */
@Check
public void checkViewNotDynamic(View view) {
  if (null != view.getDynamic())
    warning("View defined with dynamic target", PdlPackage.VIEW__DYNAMIC );
}
/**
 * Issue an error if there exist multiple Join objects identified by the same name
 * @param join one of the Join objects in the current ProcessDefinition
 */
@Check
public void checkJoinUnique(final Join join) {
  final String name = join.getName(); // This name must not be used to identify any other Join object
  // Now loop over all the other elements in the ProcessDefinition
  for (EObject sibling : ((ProcessDefinition) join.eContainer()).getCommands()) {
    // Check for another object instance that's a Join with the same name as the one being validated
    if (join != sibling && sibling instanceof Join && name.equals(((Join) sibling).getName()))
      error("duplicate join '" + name + "'", PdlPackage.JOIN__NAME);
  }
}
These two validators are picked up by the Xtext framework and invoked by the editor and the generator. The warnings and errors show up in the editor as well as the Package Explorer, Problems and other views.

With the parser doing the right thing, I copied the Xpand template that I wrote under oAW into the generator project as templates/Template.xpt. The workflow/LanguageNameGenerator.mwe had a problem—it got a NoClassDefFoundError because it specified a class name in the register element with an initial lowercase letter. I don't know if that was because of something I did or a bug in the project wizard. Changing the MWE file was all that was needed to test the template. The template language is the same as oAW's, which is to say it's a convenient way to traverse the CST and emit code.

«DEFINE main FOR ProcessDefinition»
  «FILE "test"»
    «EXPAND viewDef FOREACH commands.typeSelect(View)»
    «EXPAND joinDef FOREACH commands.typeSelect(Join)»
  «ENDFILE»
«ENDDEFINE»

//In the output file, there will be one line for each instance of Join
//Within the template, the properties of the instance are accessed like
//local variables.
«DEFINE joinDef FOR Join »
// Pay no attention to the man behind the curtain named «name»
«ENDDEFINE»

«DEFINE viewDef FOR View »
 «IF null!=name && null!=source -» «REM» Generate mappings for statically-defined views «ENDREM»
 views["«name.toUpperCase()»"] = joins.createView("«source.toUpperCase()»");
 «ENDIF -»
«ENDDEFINE»
You'll notice that the template language uses guillemets to enclose the executable instructions. To make sure that you read all of the documentation, the authors only tell you at the end that the characters are bound to CTRL-< and CTRL->. The Xpand editor was pretty nice to use in oAW. I haven't tried it yet in TMF, partly because it wasn't installed. I'm not quite sure how I got the runtime plugin for Xpand installed—perhaps p2 resolved a dependency. I'm pretty sure that once I install the Xpand UI from the M2T Xpand update site, I'll get the content assist that I used when I was originally developing the template.

The Software Life: Antlr Frustrations

Andrew McKinlay's post The Software Life: Antlr Frustrations on "no start rule" warnings gives a succinct explanation of a problem that can be somewhat elusive. The ANTLR warning "no start rule (no rule can obviously be followed by EOF)" may come and go as other issues with the grammar are resolved. One answer is to always start the grammar with a rule like:

prog : expr ;
The prog rule doesn't change anything about the language that can be parsed. It can be used as an entry point by the code calling the parser, like an interface that can be stable even as the grammar changes underneath.
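
To make that concrete, the Java code driving an ANTLR 3 generated parser can always invoke prog, no matter how expr is reorganized later. This is a sketch assuming the ANTLR 3 Java runtime; MyLangLexer and MyLangParser stand in for whatever classes ANTLR generates from your grammar:

import org.antlr.runtime.ANTLRFileStream;
import org.antlr.runtime.CommonTokenStream;

public class ParseDriver {
 public static void main(String[] args) throws Exception {
  MyLangLexer lexer = new MyLangLexer(new ANTLRFileStream(args[0]));
  MyLangParser parser = new MyLangParser(new CommonTokenStream(lexer));
  // Always enter through prog; it delegates to expr, so this call
  // stays stable even as the grammar changes underneath.
  parser.prog();
 }
}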

03 July, 2009

Migrating from oAW XText to Eclipse TMF XText

Part of my project is analyzing the structure of programs processed by the interpreter that I'm working to enhance. Xtext and the related tools from oAW allowed me to define a subset of the language for analysis and code generation. As there's a need for more functionality (and hopefully more speed), it's great that a new Xtext is available for Eclipse Galileo.

The reference for Xtext includes a section on changes from oAW to TMF that's quite informative, but doesn't include any advice on how to change the existing Eclipse projects. That's in the Eclipsepedia wiki section on Xtext Migration.

The guide is a bit ominous-sounding when it talks about specifying the language name in the Xtext project wizard. Since the previous name for my package wasn't very good, and there is very little extension work in the previous project—the main purpose is the generator—the approach for the moment is to copy the text into the shells created by the project wizard. The wizard doesn't give much advice about the naming, but the editor complains if the last element of the language name is lowercase.

In the old oAW system, the plugin added a context-menu entry named 'generate Xtext artifacts'. This doesn't appear anymore. Since I'm building my Eclipse configuration from the 'Classic SDK', it's necessary to install the Modeling Workflow Engine (MWE) separately. MWE is required to run the GenerateLanguageName.mwe workflow in the parser project.

As the documentation says, unlike oAW, TMF Xtext doesn't enable parser backtracking by default. I was able to restructure the grammar rules to get rid of many of the 'warning(200): Decision can match input such as "whatever" using multiple alternatives' messages. (These aren't really warnings, since the resulting parser ignores portions of the grammar that I specified.) In the end, I found that there is a construct in the legacy language, accepted by its recursive interpreter, that's indeterminate. So I modified the Xtext MWE file as recommended by Sebastian Zarnekow, replacing AntlrDelegatingFragment with XtextAntlrGeneratorFragment and DelegatingGeneratorFragment with XtextAntlrUiGeneratorFragment and adding the child <options backtrack="true"/> element to each (don't remove the JavaBasedContentAssistFragment). I wonder if Xtext provides (or will provide) a way to enable backtracking for a subset of the grammar rules.

Once the grammar was compiling again, I was able to check it out quickly by following the steps suggested in the Getting Started with Xtext post by Peter Friese. Invoking 'Run As → Eclipse Application' from the context menu of the parser project brings up a new IDE in a separate, initially empty, workspace. From there, create a new project and, within that, a new file with the extension named in the Xtext wizard. Copying an existing program into it populated the outline view—a vast improvement over oAW, which ate all the heap so the outline had to be disabled. One caveat: with the generated plugin configured, the editor is sure to fail with an NPE if you open a file named with the extension associated with your editor outside of a project (using File → Open).

Now, on to working with linkages in the AST!

02 July, 2009

Jevopi's Developer Blog: User Report: Migrate from oAW Xtext to TMF Xtext

Just starting to work on the conversion of the Xtext grammar and code generator to the new technology developed by TMF. My case is a bit different as the language definition is for an existing general-purpose programming language. The current scope of the processing is limited to analyzing and extracting specific constructs in the language.

It's nice that Jevopi provided such a thorough explanation of the process required to migrate that language processor from oAW XText to the Eclipse TMF XText. I'll be working on some additional details like analyzing cross references and code generation.

More posts to come.

 

Copyright 2009-2010 John Bito. Creative Commons License
This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.