
If you’re using Apache Storm on HDInsight and consuming events from Event Hubs, the current ‘get started’ approach is to use a hybrid topology with a Java spout provided by Microsoft, and code your own bolts in C#.

For integration testing your topology, you’ll want a stub that can emit events that look like they come from the Java spout. I’ve pushed a reusable stub as a Gist here: Simple stub for emitting Event Hub-style tuples to Storm context, for testing .NET Storm applications.

The real Java Event Hub Spout

You can get the latest version of the spout from Maven, but if you create a Visual Studio project from the Storm Event Hub Reader Sample project template, you’ll get a bundled version of the spout and its dependencies in your project, under JavaDependency/eventhubs-storm-spout-0.9-jar-with-dependencies.jar.

When you deploy to HDInsight you need to include that dependency, but the project template does that too by adding the folder in the SubmitConfig.xml file:

<?xml version="1.0" encoding="utf-8"?>
<StormSubmissionConfig xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <CurrentClusterName>clusterName</CurrentClusterName>
  <IsHybrid>true</IsHybrid>
  <HybridPath>JavaDependency</HybridPath>
</StormSubmissionConfig>

That’s all straightforward, but it doesn’t lend itself to testing your topology. You don’t get Storm in the HDInsight Emulator, so if you want to test your topology as-is you need a full stack in Azure: an Event Hub, something to generate events, and a running Storm cluster.

For proper integration testing that you can run locally and in your CI build, you can use the stub spout instead. The Java spout isn’t your own code, so you can treat it as a dependency and swap it out.

The stub Event Hub spout

My stub is very easy to use. It has a generic argument for the type of object you want the spout to emit as tuples. You instantiate it with your LocalContext (see Unit testing .NET Storm applications) and a factory delegate for generating the events, then call NextTuple() as many times as you want to load up the context:

var spout = new StubEventHubSpout<TimingEvent>(context, () => events.Pop());
for (var i = 0; i < eventCount; i++)
{
    spout.NextTuple(dictionary);
}
context.WriteMsgQueueToFile(_queuePaths.First());

In this case, the factory simply pops event objects from a typed stack I load up earlier in the test:

var events = new Stack<TimingEvent>();
foreach (var racerId in _racerIds)
{
    events.Push(new TimingEvent
    {
        TimerId = startTimerId,
        RacerId = racerId,
        Timestamp = (long)_Random.Next(1441097480, 1441097590) * 1000
    });
}
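The real stub is in the Gist. Purely to illustrate the pattern in the snippets above, here is a rough sketch of how a generic stub driven by a factory delegate could be put together. SketchContext and SketchStubSpout<T> are hypothetical names, not the SCP.NET types; the TimingEvent property types are guesses; emitting one JSON string per tuple is an assumption; and System.Text.Json stands in for Json.NET only to keep the sketch dependency-free.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical stand-in for the SCP.NET context: it only records what was emitted.
public sealed class SketchContext
{
    public List<string> Emitted { get; } = new List<string>();

    public void Emit(string payload) => Emitted.Add(payload);
}

// Hypothetical generic stub: T is the event type, the factory supplies each event.
public sealed class SketchStubSpout<T>
{
    private readonly SketchContext _context;
    private readonly Func<T> _eventFactory;

    public SketchStubSpout(SketchContext context, Func<T> eventFactory)
    {
        _context = context ?? throw new ArgumentNullException(nameof(context));
        _eventFactory = eventFactory ?? throw new ArgumentNullException(nameof(eventFactory));
    }

    // Each call asks the factory for one event and emits it as a JSON string
    // (assumption: one string field per tuple).
    public void NextTuple()
    {
        T nextEvent = _eventFactory();
        _context.Emit(JsonSerializer.Serialize(nextEvent));
    }
}

// Assumed shape of the event; the property types are guesses.
public sealed class TimingEvent
{
    public string TimerId { get; set; } = "";
    public string RacerId { get; set; } = "";
    public long Timestamp { get; set; }
}

public static class StubSpoutSketch
{
    public static void Main()
    {
        var events = new Stack<TimingEvent>();
        events.Push(new TimingEvent { TimerId = "start", RacerId = "r1", Timestamp = 1441097480000 });
        events.Push(new TimingEvent { TimerId = "start", RacerId = "r2", Timestamp = 1441097481000 });

        var context = new SketchContext();
        var spout = new SketchStubSpout<TimingEvent>(context, () => events.Pop());

        // Capture the count first, because every NextTuple() call pops the stack.
        var eventCount = events.Count;
        for (var i = 0; i < eventCount; i++)
        {
            spout.NextTuple();
        }

        foreach (var payload in context.Emitted)
        {
            Console.WriteLine(payload);
        }
    }
}
```

Passing the factory as a Func<T> keeps the stub unaware of where events come from, so the same stub can be fed from a stack, a fixed list or a random generator without changes.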

The spout uses Json.NET to serialize the event to a string, so this will mimic your end-to-end flow if the objects your factory gives the stub are the same as the objects you send to Event Hubs.
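To see why using the same types matters, here is a small sketch of what can happen when the type on the receiving side drifts from the type that was serialized. DriftedTimingEvent is a hypothetical class invented for this example, the property types are assumptions, and System.Text.Json is used in place of Json.NET to keep the sketch dependency-free. With default settings, Json.NET also ignores members it cannot match.

```csharp
using System;
using System.Text.Json;

// The type the test feeds to the stub (property types are assumptions).
public sealed class TimingEvent
{
    public string TimerId { get; set; } = "";
    public string RacerId { get; set; } = "";
    public long Timestamp { get; set; }
}

// Hypothetical drifted copy of the type, as a consumer might define it by mistake.
public sealed class DriftedTimingEvent
{
    public string TimerId { get; set; } = "";
    public string RiderId { get; set; } = ""; // renamed, so it no longer matches the JSON
    public long Timestamp { get; set; }
}

public static class RoundTripSketch
{
    public static void Main()
    {
        var sent = new TimingEvent { TimerId = "start", RacerId = "r1", Timestamp = 1441097480000 };
        string json = JsonSerializer.Serialize(sent);

        // Same type on both sides: every field survives the round trip.
        var same = JsonSerializer.Deserialize<TimingEvent>(json);
        Console.WriteLine(same?.RacerId == "r1");

        // Drifted type: no exception is thrown, the unmatched property just keeps its default.
        var drifted = JsonSerializer.Deserialize<DriftedTimingEvent>(json);
        Console.WriteLine(drifted?.RiderId == "");
    }
}
```

The mismatch does not fail loudly, which is why a test that feeds the stub the same event types you send to Event Hubs is worth having.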

The stub spout is also configured to use a custom serializer, the CustomizedInteropJSONSerializer, when it emits to the context, so it outputs tuples in the same format as the real Java spout.

When you use the stub spout, your test can verify that your .NET bolt is using the correct schema, serializer and stream IDs, which gives you confidence in the wiring of your Storm application.

I’ll get on to integration testing whole topologies in a post coming soon.
