17 April 2019

First look at the HoloLens 2 emulator

Intro

Today, without much fanfare, the HoloLens 2 emulator became available. I first saw Mike Taulty tweeting about it and later more people chiming in. I immediately downloaded it and started to get it to work, to see what it does and how it can be used. The documentation is still a bit limited, so I just happily blundered along with the emulator, trying some things, and showing you what I found and how.

Getting it is half the fun

Getting it is easy - from this official download page you can get all the emulator versions, including all versions of the HoloLens 1 emulator - but of course we are only interested in HoloLens 2 now:

Just like the previous instances, the emulator requires Hyper-V. This requires you to have hardware virtualization enabled in your BIOS. Consult the manual of your PC or motherboard on how to do that. If you don't know what I am talking about, for heaven's sake stop here and don't attempt this yourself. I myself found it pretty scary already. If you make mistakes in your BIOS settings, your whole PC may become unusable. You have been warned.

Starting the Emulator from Visual Studio

The easiest way to start is from Visual Studio. If you have installed the whole package, you will get this deployment target. You can choose whether you want debug or release - the latter is faster.

But mind you: use x86 as the deployment target, otherwise the emulator is not available. The HoloLens 2 may have an ARM processor, but your PC has not. For an app, I just cloned the Mixed Reality Toolkit 2 dev branch, opened the project with Unity 2018.3.x and built the app. Then I opened the resulting app with Visual Studio. See my previous post on how to do that using IL2CPP (that is, generating a C++ app).

If the emulator starts for the first time in your session, you might see this:

Just click and the emulator starts up. Be aware this is a heavy beast. It might take some time to start, and it might drag down the performance of your PC somewhat. Accept the elevation prompt, and then most likely Visual Studio will throw an error, as it tries to deploy as soon as the emulator has started, while the emulator is far from ready to accept deployment of apps - the HoloLens OS is still booting. After a while you will hear the (for HoloLens users familiar) "whooooooomp" sound indicating the OS shell is starting.

Starting the emulator directly

Assuming you have installed everything in the default folder, you should be able to start the emulator with the following command:

"%ProgramFiles(x86)%\Windows Kits\10\Microsoft XDE\10.0.18362.0\XDE.exe" /name "HoloLens 2 Emulator 10.0.18362.1005" /displayName "HoloLens 2 Emulator 10.0.18362.1005" /vhd "%ProgramFiles(x86)%\Windows Kits\10\Emulation\HoloLens\10.0.18362.1005\flash.vhdx" /video "1968x1280" /memsize 4096 /language 409 /creatediffdisk "%USERPROFILE%\AppData\Local\Microsoft\XDE\10.0.18362.1005\dd.1968x1280.4096.vhdx" /fastShutdown /sku HDE

I worked this out using the information in this blog post, which basically does the same trick for the HoloLens 1 emulators.
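
If you want to wrap that in code rather than a batch file, a few lines of C# will do it too. This is a sketch only - the paths and version numbers are the ones from the command above and will differ per installed version:

using System;
using System.Diagnostics;

class EmulatorLauncher
{
    static void Main()
    {
        // Path as used in the command line above - adjust to your installed version.
        var xde = Environment.ExpandEnvironmentVariables(
            @"%ProgramFiles(x86)%\Windows Kits\10\Microsoft XDE\10.0.18362.0\XDE.exe");

        // Paste the full argument list from the command above here; this is shortened.
        var arguments = "/name \"HoloLens 2 Emulator 10.0.18362.1005\" /sku HDE";

        Process.Start(new ProcessStartInfo(xde, arguments) { UseShellExecute = true });
    }
}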

Either way, it will look like this:

If you have followed the Mixed Reality development in the 19H1 Insider's preview, you will clearly recognize that the Mixed Reality crew are aligning HoloLens 2 with the work that has been done for immersive WMR headsets.

Controlling the Emulator

The download page gives some basic information about how you can use keystrokes, mouse or an Xbox controller to move your viewpoint around and do stuff like air tap and bloom. This page gives some more information, but it indicates it is still for the HoloLens 1 emulator.

However, it looks like most of the keys are in there already. The most important one (initially) is the Escape key, which - just like in the HoloLens 1 emulator - will reset your viewpoint and your hand positions. And believe me, you are going to need it.

Basic control

This is more or less unchanged. You move around using the left stick, you turn around using the right stick. Rotating sideways and moving up/down is done using the D-pad. Selecting still happens using the triggers.

Basic hand control

If you use an Xbox Controller, you will need to do the following:

  • To move the right hand, press the right bumper and slightly move the left stick. If you move it forward, you will see the right hand moving forward.
  • To move the left hand, press the left bumper, and still use the left stick.

Hands are visualized as shown on the right. The little circle visualizes the location of the index finger; the line is a projection from the hand forward, to a location you might activate from afar - like ye olde air tap, although I am not quite sure of the actual gesture in real life.

It's a bit hard to capture in a picture what's happening, so I made a little video of it:

With the right stick, you control the hand's rotation.

Additional hand control

If you click the red marked icon on the floating menu to the right of the emulator, you will get the perception control window. If you press the right bumper, the right hand panel expands and you can select a gesture. Having a touch screen then comes in mightily handy, I can tell you.

Some final thoughts (for now)

You can also see the button "Eyes". If you click that, I presume you can simulate eye tracking. But if you do, the only thing I can see is that you can't move your position anymore. So I am probably missing something here.

I have done more things, like actually deploying an app (the demo shown by Julia Schwarz, the technical lead for the new input model, who so amazingly demoed the HoloLens 2 at MWC), but that's for another time. This really whets my appetite for the real device, but in the meantime we have this, and we need to be patient ;) No code this time, sorry, but there is nothing to code. Just download the emulator and share your thoughts.

11 March 2019

Debugging C# code with Unity IL2CPP projects running on HoloLens or immersive headsets

Intro

My relation with Unity is a complex one. I adore the development environment. It allows me to weave magic and create awesome HoloLens and Windows Mixed Reality apps with an ease that defies imagination for someone who has never tried it. I have also cursed them to the seventh ring of hell for the way they move (too) fast and break things. Some time ago Unity announced they would do away with the .NET backend. This does not mean you can't develop in C# anymore - you still do, but debugging becomes quite a bit more complicated. You can find how to do it in various articles, forum posts, etc. They all have part of the story, but not everything. I hope this fills the gap and shows the whole road to IL2CPP debugging in one easy to find article.

Context

Typically, when you build an app for Mixed Reality, you have a solution with C# code that you use while working inside the Unity editor. You use this for typing code and trying things out. I tend to call this "the Unity solution" or "the editor solution". It is not a runnable or deployable app, but you can attach the Visual Studio debugger to the Unity editor by pressing Start in Visual Studio, and then the play button in Unity. Breakpoints will be hit, you can set watches, all of it. Very neat.

When you are done or want to test with a device, you build the app. This generates another solution (I call that the deployment solution) that actually is a UWP app. You can deploy that to a HoloLens or to your PC with a Mixed Reality headset attached. This is essentially the same code, but in a different solution. The nice part of that is that if you compile it for debug, you can also put in breakpoints and analyze code on a running device. Bonus: if you change just some code, you don't have to rebuild the deployment solution over and over again to do another test on the device.

Enter IL2CPP (and bit of a rant, feel free to skip)

Unity, in their wisdom, have decided the deployment solutions in C# are too slow, so they have deprecated the .NET 'backend', and instead of generating a C# UWP solution, they generate a C++ UWP solution. If you build, your C# code will be rewritten in C++, and you will need to compile that C++ and deploy the resulting app to your device. Compilation takes a whole lot longer, if you change as much as a comma you need to build the whole deployment solution again, and the actual running code (C++) no longer resembles any code you have written yourself. And when they released this, you could also forget about debugging your C# code in a running app. Unity did not only move the cheese, they actually blew up part of the whole cheese storehouse.

With Unity 2018.2.x they've basically sent over some carpenters to cover up the hole with plywood and plaster. And now you can sort-of debug your C# code again. But it's a complicated and rather cumbersome process.

Brave new world - requirements

I installed all of the Desktop and UWP C++ development bits, which is probably a bit over the top.

At one point I got complaints about the "VC++ 2015 (140) toolset" missing while compiling, so I added that too. This is apparently something the Unity toolchain needs. Maybe this can be done more efficiently, needing less of this stuff, but this works on my machine. I really don't know anything about C++ development. I tried it somewhere in the mid '90s and failed miserably.

Also crucial: install the Visual Studio tools for Unity, but chances are you already have, because we needed this with the .NET backend too:

I did uncheck the Unity Editor option, as I used Unity 2018.3.6f1 instead of the one Visual Studio tries to install. I tend to manage my Unity installs via the Unity Hub.

Build settings

In Unity, I use these settings for building the debuggable C++ app:

I am not entirely sure if "Copy References" is really necessary, but I have enabled it anyway. The warning about missing components is another nice Unity touch - apparently something is missing, but they don't tell you what. My app is building, so I suppose it's not that important for my setup.

App capability settings

Now this one is crucial. To enable debugging, Unity builds a specialized player with some kind of little server in it that enables debuggers to attach to it. This means it needs to have network access. The resulting app is still a UWP app, so its network capabilities need to be set. You can do that either in the resulting C++ solution's manifest or in the Unity editor, using the "Player Settings" button. Under "Publishing Settings" you will find this box where you can set capabilities:

I just added all network related stuff for good measure. The advantage of doing it here is that it will be added back even if you need to rebuild the deployment solution from scratch. The drawback is that you might forget to remove capabilities you don't need and you will end up with an app asking for a lot of capabilities it doesn't use. For you to decide what works best.
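
If you prefer doing this from code, the same capabilities can also be set from an editor script using Unity's PlayerSettings API. This is just a little helper of my own, not something that ships with Unity or the MRTK:

#if UNITY_EDITOR
using UnityEditor;

// Sketch: enables the network capabilities the Unity debugger needs to connect.
public static class DebugCapabilities
{
    [MenuItem("Tools/Enable network capabilities for debugging")]
    public static void Enable()
    {
        PlayerSettings.WSA.SetCapability(PlayerSettings.WSACapability.InternetClient, true);
        PlayerSettings.WSA.SetCapability(PlayerSettings.WSACapability.InternetClientServer, true);
        PlayerSettings.WSA.SetCapability(PlayerSettings.WSACapability.PrivateNetworkClientServer, true);
    }
}
#endif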

Selecting the IL2CPP backend

In case Unity or the MRTK2 does not do this for you automatically, you can find this setting by pressing the Player Settings button as well. In "Other settings" you can find the "Scripting Backend". Set this to IL2CPP.

Building and deploying the UWP C++ app.

A C++ UWP app generated by Unity looks like this:

Now maybe this is obvious for C++ developers but make sure the app that is labeled "Universal Windows" is the startup project. I was initially thrown off kilter by the "Windows Store 10.0" link and assumed that was the startup project.

It is important to build and deploy the app for Debug - that has not changed since the .NET backend days. Choose the target and processor architecture as required by your device or PC.

Make sure the app actually gets deployed to wherever you want to debug it. Use Deploy, not Build (from the Build menu in Visual Studio).

And now for the actual debugging

First, start the app on the machine where it needs to be debugged - be it a PC or a HoloLens, that does not matter.

Go back to your Unity C# ('editor') solution. Set breakpoints as desired. And now comes the part that really confused me for quite some time. I am used to debug targets showing up here:

But they never do. So don't go there. This is only useful when you are debugging inside the Unity editor. Instead, what you need to do is go to the Debug menu of the main Visual Studio window and select "Attach Unity Debugger".

I've started the app both on my HoloLens and as a Mixed Reality app on my PC, and I can now choose from no fewer than three debug targets: the Unity editor on my PC, the app running on the HoloLens, and the app running on the PC.

"Fudge" is the name of the gaming quality rig kindly built by a colleague a bit over a year ago, "HoloJoost" is my HoloLens. I selected the "Fudge" player. If you select a player, you will get an UAC prompt for the "AppContainer Network Isolation Diagnostics Tool". Accept that, and then this pops open:

Leave this alone. Don't close it, don't press CTRL-C.

Now just go over to your Mixed Reality app, be it on your HoloLens or your immersive headset, and trigger an action that will touch code with a breakpoint in it. In my case, that happens when I tap the asteroid:

And then finally, Hopper be praised:

The debugger is back in da house.

Conclusions

This is not something I get overly happy about, but at least we are about three-quarters of the way back to where we were before. We can again debug C# code in a running app, but with a more convoluted build process, less development flexibility, and the need to install the whole C++ toolchain. But as usual in IT, the only way is forward. The Mixed Reality Toolkit 2, which is used to build this asteroid project, requires Unity 2018.2.x. HoloLens 2 apps will be built with MRTK2 and thus we will have to deal with it, move forward and say goodbye to the .NET backend. Unless we don't want to build for HoloLens 2 - which is no option at all for me ;)

No test project this time, as this is not about code but merely about configuration. I will start blogging MRTK2 tidbits soon, though.

Credits

There is a whole host of people who gave me pieces of the puzzle that made it possible for me to piece the whole thing together. In order of appearance:

29 January 2019

Labeling Toy Aircraft in 3D space using an ONNX model and Windows ML on a HoloLens

Intro

Back in November I wrote about a POC I built to recognize and label objects in 3D space, using a Custom Vision Object Recognition project for that. Back then, as I wrote in my previous post, you could only use this kind of project by uploading the images you needed to the model in the cloud. In the meantime, Custom Vision Object Recognition models can be downloaded in various formats - one of them being ONNX, which can be used in Windows ML. And thus it can run on a HoloLens to do AI-powered object recognition.

Which is exactly what I am going to show you. In essence, the app still does the same as in November, but now it does not use the cloud anymore - the model is trained and created in the cloud, but can be executed on an edge device (in this case a HoloLens).

The main actors

These are basically still the same:

  • CameraCapture watches for an air tap, and takes a picture of where you look
  • ObjectRecognizer receives the picture and feeds it to the 'AI', which is now a local process
  • ObjectLabeler shoots for the spatial map and places labels.

As I said - the app is basically still the same as the previous version, only now it uses a local ONNX file.

Setting up the project

Basically you create a standard empty HoloLens project with the MRTK and configure it as you always do. Be sure to enable Camera capabilities, of course.

Then you simply download the ONNX file from your model. The procedure is described in my previous post. Then you need to place the model file (model.onnx) into a folder "StreamingAssets" in the Unity project. This procedure is described in more detail in this post by Sebastian Bovo of the AppConsult team. He uses a different kind of model, but the workflow is exactly the same.

Be sure to adapt the ObjectDetection.cs file as I described in my previous post.

Functional changes to the original project

Like I said, the differences between this project and the online version are for the most part inconsequential. Functionally only one thing changed: instead of the app showing the picture that it took prior to starting the (online) model, it now plays a click sound when you air tap to start the recognition process, and then either a 'pringg' sound or a buzz sound, indicating that the recognition process succeeded (i.e. found at least one toy aircraft) or failed (i.e. did not find a toy aircraft) respectively.
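
Just to illustrate that last bit (this is a sketch with field names of my own, not the actual CameraCapture/ObjectLabeler code), the sound feedback boils down to little more than wiring up a few AudioSources and playing the right one:

using UnityEngine;

public class RecognitionSoundFeedback : MonoBehaviour
{
    [SerializeField] private AudioSource _clickSound;   // played on the air tap
    [SerializeField] private AudioSource _successSound; // 'pringg' - at least one aircraft found
    [SerializeField] private AudioSource _failureSound; // buzz - nothing found

    public void PlayClick()
    {
        _clickSound.Play();
    }

    public void PlayResult(bool foundAnything)
    {
        (foundAnything ? _successSound : _failureSound).Play();
    }
}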

Technical changes to the original project

  • The ObjectDetection file, downloaded from CustomVision.ai and adapted for use in Unity, has been added to the project
  • CustomVisonResult, containing all the JSON serialization code to deal with the online model, is deleted. The ObjectDetection file contains all classes we need
  • In all classes I have adapted the namespace from "CustomVison" *cough* to "CustomVision" (sorry, typo ;) ).
  • The ObjectDetection file uses root class PredictionModel instead of Prediction, so that has been adapted in all files that use it. The affected classes are:
    • ObjectRecognitionResultMessage
    • ObjectLabeler
    • ObjectRecognizer
    • PredictionExtensions
  • Both CameraCapture and ObjectLabeler have sound properties and play sound on appropriate events
  • ObjectRecognizer has been extensively changed to use the local model. This I will describe in detail

Object recognition - the Windows ML way

The first part of the ObjectRecognizer initializes the model

using UnityEngine;
#if UNITY_WSA && !UNITY_EDITOR
using System.Threading.Tasks;
using Windows.Graphics.Imaging;
using Windows.Media;
#endif

public class ObjectRecognizer : MonoBehaviour
{
#if UNITY_WSA && !UNITY_EDITOR
    private ObjectDetection _objectDetection;
#endif

    private bool _isInitialized;

    private void Start()
    {
        Messenger.Instance.AddListener<PhotoCaptureMessage>(
          p=> RecognizeObjects(p.Image, p.CameraResolution, p.CameraTransform));
#if UNITY_WSA && !UNITY_EDITOR _objectDetection = new ObjectDetection(new[]{"aircraft"}, 20, 0.5f,0.3f ); Debug.Log("Initializing..."); _objectDetection.Init("ms-appx:///Data/StreamingAssets/model.onnx").ContinueWith
(p => { Debug.Log("Intializing ready"); _isInitialized = true; }); #endif }

Notice here, too, the liberal use of preprocessor directives, just like in my previous post. In the Start method we create a model from the ONNX file that's in StreamingAssets, using the Init method I added to ObjectDetection. Since we can't make the Start method awaitable, the ContinueWith needs to finish the initialization.
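
Incidentally, the same thing can also be written as a separate async method that Start calls - a variation of my own, not the actual project code, but it avoids the ContinueWith:

#if UNITY_WSA && !UNITY_EDITOR
    // Alternative sketch: wrap the awaitable initialization in an async method,
    // since Start itself cannot usefully be made async.
    private async void InitializeDetectorAsync()
    {
        _objectDetection = new ObjectDetection(new[] {"aircraft"}, 20, 0.5f, 0.3f);
        Debug.Log("Initializing...");
        await _objectDetection.Init("ms-appx:///Data/StreamingAssets/model.onnx");
        Debug.Log("Initializing ready");
        _isInitialized = true;
    }
#endif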

As you can see, the arrival of a PhotoCapture message from the CameraCapture behavior fires off RecognizeObjects, just like in the previous app.

public virtual void RecognizeObjects(IList<byte> image, 
                                     Resolution cameraResolution, 
                                     Transform cameraTransform)
{
    if (_isInitialized)
    {
#if UNITY_WSA && !UNITY_EDITOR
        RecognizeObjectsAsync(image, cameraResolution, cameraTransform);
#endif

    }
}

But unlike the previous app, it does not fire off a Unity coroutine, but a private async method:

#if UNITY_WSA && !UNITY_EDITOR
private async Task RecognizeObjectsAsync(IList<byte> image, Resolution cameraResolution, Transform cameraTransform)
{
    using (var stream = new MemoryStream(image.ToArray()))
    {
        var decoder = await BitmapDecoder.CreateAsync(stream.AsRandomAccessStream());
        var sfbmp = await decoder.GetSoftwareBitmapAsync();
        sfbmp = SoftwareBitmap.Convert(sfbmp, BitmapPixelFormat.Bgra8,
            BitmapAlphaMode.Premultiplied);
        var picture = VideoFrame.CreateWithSoftwareBitmap(sfbmp);
        var prediction = await _objectDetection.PredictImageAsync(picture);
        ProcessPredictions(prediction, cameraResolution, cameraTransform);
    }
}
#endif

This method is basically 70% converting the raw bits of the image into something the ObjectDetection class's PredictImageAsync can handle. I have this post in the Unity forums and this post on the MSDN blog site by my friend Matteo Pagani to thank for helping me piece this together. This is because I am a stubborn idiot - I want to take a picture instead of using a frame from the video recorder, but then you have to convert the photo to a video frame.

The second-to-last line actually calls PredictImageAsync - essentially a black box for the app - and then the predictions are processed more or less like before:

#if UNITY_WSA && !UNITY_EDITOR
private void ProcessPredictions(IList<PredictionModel>predictions, 
                                Resolution cameraResolution, Transform cameraTransform)
{
    var acceptablePredications = predictions.Where(p => p.Probability >= 0.7).ToList();
    Messenger.Instance.Broadcast(
       new ObjectRecognitionResultMessage(acceptablePredications, cameraResolution, 
                                          cameraTransform));
}
#endif

Everything with a probability lower than 70% is culled, and the rest is sent along to the messenger, where the ObjectLabeler picks it up again and starts shooting rays at the Spatial Map through the center of all rectangles in the predictions to find out where the actual object may be in space.
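
To give an idea of that last step, here is a rough sketch of how such a 'shot' could look. This is not the actual ObjectLabeler code - the helper name, the naive mapping from image coordinates to a direction, and the spatial mapping layer mask are all my own assumptions:

// Hypothetical sketch: fire a ray from the camera position roughly through the center
// of a predicted bounding box and see where it hits the spatial map.
private Vector3? GetObjectPosition(PredictionModel prediction, Transform cameraTransform,
                                   LayerMask spatialMappingLayer)
{
    // Bounding box values are normalized (0..1) relative to the captured image
    var box = prediction.BoundingBox;
    var centerX = (float)(box.Left + box.Width / 2);
    var centerY = (float)(box.Top + box.Height / 2);

    // Very naive mapping of image coordinates to a world direction - it assumes the
    // picture roughly matches the camera's field of view.
    var direction = cameraTransform.forward +
                    cameraTransform.right * (centerX - 0.5f) +
                    cameraTransform.up * (0.5f - centerY);

    RaycastHit hitInfo;
    if (Physics.Raycast(cameraTransform.position, direction, out hitInfo, 10f,
                        spatialMappingLayer))
    {
        return hitInfo.point;
    }
    return null;
}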

Conclusion

I have had some fun experimenting with this, and the conclusions are clear:

  • For a simple model like this, even with a fast internet connection, using a local model instead of a cloud-based model is way faster
  • Yet - the hit rate is notably lower - the cloud model is definitely more 'intelligent'. I suppose improvements to Windows ML will fix that in the near future. Also, the AI coprocessor in the next release of HoloLens will undoubtedly contribute to both speed and accuracy.
  • With 74 pictures of a few model airplanes, almost all on the same background, my model is nowhere near equipped to recognize random planes in random environments. This highlights the crux of machine learning a bit - you will need data, more data and even more than that.
  • This method of training models in the cloud and executing them locally provides exciting new - and very usable - features for Mixed Reality devices.

Using Windows ML on edge devices is not hard, and on a HoloLens it is only marginally harder because you have to circumvent a few differences between full UWP and Unity, and be aware of the differences between C# 4.0 and C# 7.0. This can easily be addressed, as I showed before.

The complete project can be found here (branch WinML) - since it now operates without a cloud model, it is actually runnable by everyone. I wonder if you can actually get it to recognize model planes you may have around. I've got it to recognize model planes up to about 1.5 meters away.

27 January 2019

Adapting Custom Vision Object Recognition Windows ML code for use in Mixed Reality applications

Intro

In November I wrote about a Custom Vision Object Detection experiment that I did, which allowed the HoloLens I was wearing to recognize not only what objects were in view, but also approximately where they were in space. You might remember this picture:

You might also remember this one:

Apart from being a very cool new project type, it also showed a great limitation. You could only use an online model. You could not download it in the form of, for instance, an ONNX model to use with Windows ML. It worked pretty well, don't get me wrong, but maybe you are out and about and your device can't always reach the online model. Well guess what recently changed:

Yay! Custom Vision Object Detection now supports downloadable models that can be used in Windows ML.

Download model and code

After you have changed the type from "General" to "General (compact)" and saved that change, hit the "Performance" tab, and you will see the "Export" option appear (no idea why this is under "Performance", but what the heck):

So if you click that, you get a bit of an unwieldy screen that looks like this:

We are going to select the ONNX standard, because that is what we can use in Windows Machine Learning - inside a UWP app running on the HoloLens. Please select version 1.2:

The result is a ZIP file containing the following folders and files:

We are only going to need the model.onnx file (in the next blog post). For now I want to concentrate on the file that is inside the CSharp folder - ObjectDetection.cs. That file is perfectly fine for use in a regular UWP app. However, although they run on top of UWP, HoloLens apps are anything but regular UWP apps.

Challenges in incorporating the C# code in an Unity project

Some interesting challenges lay ahead:

  • Unity for HoloLens knows this unusual concept of having two Visual Studio solutions: one for use in the Unity editor, and a second one that is generated from the first. But the first one, the Unity solution, needs to be able to swallow all the code, even if it's UWP-only and will never run in the editor. To make that possible, we will have to put some stuff into preprocessor directives to be able to generate the deployment project at all
  • The code uses C# 7.0 concepts - tuples - that are not supported by the C# version (4.0) that all but the newest versions of Unity are limited to, and I am not using those newest versions here for various reasons
  • I also found a pretty subtle bug in the code that only happens in a Unity runtime

I will address all three things.

Testing in a bare bones project - here come the errors

So, I created an empty HoloLens project basically doing nothing: just imported the Mixed Reality Toolkit and hit all three configuration options in the Mixed Reality Toolkit/config menu. Then I added the ObjectDetection.cs to the project and immediately Unity started to balk:

Round 1 - preprocessor directives

The first round of fixing is pretty simple - just put everything the editor balks about between preprocessor directives:

#if !UNITY_EDITOR 
#endif

You can do this the rough way - by basically putting the whole file in these directives - or only putting the minimum stuff in directives. I usually opt for the second way. So we need to put the following parts between these preprocessor directives.

First, this part in the using section of the start of the file:

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
#if !UNITY_EDITOR
    using System.Threading.Tasks;
    using Windows.AI.MachineLearning;
    using Windows.Media;
    using Windows.Storage;
#endif

Then this part, at the start of the ObjectDetection class:

    public class ObjectDetection
    {
        private static readonly float[] Anchors = ....

        private readonly IList<string> labels;
        private readonly int maxDetections;
        private readonly float probabilityThreshold;
        private readonly float iouThreshold;
#if !UNITY_EDITOR
        private LearningModel model;
        private LearningModelSession session;
#endif

Then the following methods need to be put entirely between these preprocessor directives:

  • Init
  • ExtractBoxes
  • Postprocess

And then both Unity and Visual Studio stop complaining about errors. So let's build the UWP solution...

Oops. Well I already warned you about this.

Round 2 - Tuples are a no-no

Although the very newest versions of Unity support C# 7.0, the majority of the versions that are used today for various reasons (mainly hologram stability) do not. But the code generated by CustomVision has some tuples in it. The culprit is ExtractBoxes:

private (IList<BoundingBox>, IList<float[]>) ExtractBoxes(TensorFloat predictionOutput,
float[] anchors)

So we need to refactor this to C# 4 style code. Fortunately, this is not exactly rocket science.

First of all, we define a class with the same properties as the tuple:

internal class ExtractedBoxes
{
    public IList<BoundingBox> Boxes { get; private set; }
    public IList<float[]> Probabilities { get; private set; }

    public ExtractedBoxes(IList<BoundingBox> boxes, IList<float[]> probs)
    {
        Boxes = boxes;
        Probabilities = probs;
    }
}

I have added this to the ObjectDetection.cs file, just behind the end of the ObjectDetection class definition. Then we need to change the method ExtractBoxes: its return type goes from

private (IList<BoundingBox>, IList<float[]>)

to

private ExtractedBoxes

and the return statement at the end of the method becomes

 return new ExtractedBoxes(boxes, probs);

We also have to change the method Postprocess, the place where ExtractBoxes is used:

private IList<PredictionModel> Postprocess(TensorFloat predictionOutputs)
{
    var (boxes, probs) = this.ExtractBoxes(predictionOutputs, ObjectDetection.Anchors);
    return this.SuppressNonMaximum(boxes, probs);
}

needs to become

private IList<PredictionModel> Postprocess(TensorFloat predictionOutputs)
{
    var extractedBoxes = this.ExtractBoxes(predictionOutputs, ObjectDetection.Anchors);
    return this.SuppressNonMaximum(extractedBoxes.Boxes, extractedBoxes.Probabilities);
}

and then, dear reader, Unity will finally build the deployment UWP solution. But there is still more to do.

Round 3 - fix a weird crashing bug

When I tried this in my app - and you will have to take my word for it - my app randomly crashed. The culprit, after long debugging, turned out to be this piece of code:

private IList<PredictionModel> SuppressNonMaximum(IList<BoundingBox> boxes, 
IList<float[]> probs) { var predictions = new List<PredictionModel>(); var maxProbs = probs.Select(x => x.Max()).ToArray(); while (predictions.Count < this.maxDetections) { var max = maxProbs.Max();

I know, it doesn't make sense. I have not checked this in plain UWP, but apparently the implementation of Max() in the Unity player on top of UWP doesn't like to calculate the Max of an empty list. My app worked fine as long as there were recognizable objects in view. If there were none, it crashed. So, I changed that piece to check for probs not being empty first:

private IList<PredictionModel> SuppressNonMaximum(IList<BoundingBox> boxes, IList<float[]> probs)
{
    var predictions = new List<PredictionModel>();
    // Added JvS
    if (probs.Any())
    {
        var maxProbs = probs.Select(x => x.Max()).ToArray();

        while (predictions.Count < this.maxDetections)
        {
            var max = maxProbs.Max();

And then your app will still be running when there are no predictions.

Round 4 - some minor fit & finish

Because I am lazy and it makes life easier when using this from a Unity app, I added this little overload of the Init method:

public async Task Init(string fileName)
{
    var file = await StorageFile.GetFileFromApplicationUriAsync(new Uri(fileName));
    await Init(file);
}

This will need to be inside an #if !UNITY_EDITOR preprocessor directive as well. This overload allows me to call the method like this, without first getting a StorageFile:

_objectDetection.Init("ms-appx:///Data/StreamingAssets/model.onnx");

Conclusion

With these adaptations you have a C# file that will allow you to use Windows ML from both Unity and regular UWP apps. In a following blog post I will show a refactored version of the Toy Aircraft Finder to show how things work IRL.

There is no real demo project this time (yet) but if you want to download the finished file already, you can do so here.

17 January 2019

Making lines selectable in your HoloLens or Windows Mixed Reality application

Intro

Using the Mixed Reality Toolkit, it's so easy to make an object selectable. You just add a behavior to your object that implements IInputClickHandler, fill in some code in the OnInputClicked method, and you are done. Consider for instance this rather naïve implementation of a behavior that toggles the color from the original to red and back when clicked:

using HoloToolkit.Unity.InputModule;
using UnityEngine;

public class ColorToggler : MonoBehaviour, IInputClickHandler
{
    [SerializeField]
    private Color _toggleColor = Color.red;

    private Color _originalColor;

    private Material _material;

    void Start()
    {
        _material = GetComponent<Renderer>().material;
        _originalColor = _material.color;
    }

    public void OnInputClicked(InputClickedEventData eventData)
    {
        _material.color = _material.color == _originalColor ? _toggleColor : _originalColor;
    }
}

If you add this behavior to, for instance, a simple Cube, the color will flip from whatever the original color was (in my case blue) to red and back when you tap it. But add this behavior to a line and attempt to tap it - and nothing will happen.

So what's a line, then?

In Unity, a line is basically an empty game object containing a LineRenderer component. You can access the LineRenderer using the standard GetComponent, then use its SetPosition method to actually set the points. You can see how it's done in the demo project, in which I created a class LineController to make drawing the line a bit easier:

public class LineController : MonoBehaviour
{
    public void SetPoints(Vector3[] points)
    {
        var lineRenderer = GetComponent<LineRenderer>();
        lineRenderer.positionCount = points.Length;
        for (var i = 0; i < points.Length; i++)
        {
            lineRenderer.SetPosition(i, points[i]);
        }
        //Stuff omitted
    }
}

This is embedded in a prefab "Line". Here you can already see the root cause of the problem. The difference between a line and, for instance, a cube is simple: there is no mesh, but more importantly - there is no collider. Compare this with the cube next to it:

So... how do we add a collider, then?

That is not very hard. Find the prefab "Line" and add a "Line Collider Drawer" component. This is sitting in "HoloToolkitExtensions/Utilities/Scripts".

Once you have done that, try to click the line again.

And hey presto - the line is not only selectable, but even the MRTK Standard Shader Hover Light option, which I selected when creating the line material, actually works.

And in code, it works like this

First of all, in the LineController, I wrote "//Stuff omitted". That stuff actually calls the LineColliderDrawer (or at least, it tries to):

public class LineController : MonoBehaviour
{
    public void SetPoints(Vector3[] points)
    {
        var lineRenderer = GetComponent<LineRenderer>();
        lineRenderer.positionCount = points.Length;
        for (var i = 0; i < points.Length; i++)
        {
            lineRenderer.SetPosition(i, points[i]);
        }
        
        var colliderDrawer = GetComponent<LineColliderDrawer>();
        if (colliderDrawer != null)
        {
            colliderDrawer.AddColliderToLine(lineRenderer);
        }
    }
}

The main part of LineColliderDrawer is this method:

private void AddColliderToLine(LineRenderer lineRenderer,
    Vector3 startPoint, Vector3 endPoint)
{
    var lineCollider = new GameObject(LineColliderName).AddComponent<CapsuleCollider>();
    lineCollider.transform.parent = lineRenderer.transform;
    lineCollider.radius = lineRenderer.endWidth;
    var midPoint = (startPoint + endPoint) / 2f;
    lineCollider.transform.position = midPoint;

    lineCollider.transform.LookAt(endPoint);
    var rotationEulerAngles = lineCollider.transform.rotation.eulerAngles;
    lineCollider.transform.rotation =
        Quaternion.Euler(rotationEulerAngles.x + 90f, 
        rotationEulerAngles.y, rotationEulerAngles.z);

    lineCollider.height = Vector3.Distance(startPoint, endPoint);
}

This is partially inspired by this post in the Unity forums, and partially by this one. Although I think they are both not entirely correct, they certainly put me on the right track.

Basically it creates an empty game object and adds a capsule collider to that. The collider's radius is set to the end width of the line, which is assumed to be of constant width. Its midpoint is set exactly halfway along the line (segment), and then it is rotated to look at the end point. Oddly enough, it is then at 90 degrees to the actual line segment, so the collider is rotated 90 degrees over its X axis. Finally, it is stretched to cover the whole line segment.

The rest of the class is basically a support act:

public class LineColliderDrawer : MonoBehaviour
{
    private const string LineColliderName = "LineCollider";

    public void AddColliderToLine(LineRenderer lineRenderer)
    {
        RemoveExistingColliders(lineRenderer);

        for (var p = 0; p < lineRenderer.positionCount; p++)
        {
            if (p < lineRenderer.positionCount - 1)
            {
                AddColliderToLine(lineRenderer, 
                    lineRenderer.GetPosition(p), 
                    lineRenderer.GetPosition(p + 1));
            }
        }
    }

    private void RemoveExistingColliders(LineRenderer lineRenderer)
    {
        for (var i = lineRenderer.gameObject.transform.childCount - 1; i >= 0; i--)
        {
            var child = lineRenderer.gameObject.transform.GetChild(i);
            if (child.name == LineColliderName)
            {
                Destroy(child.gameObject);
            }
        }
    }
  }

This first removes any existing colliders, then adds a collider to the line for every segment - so a line of n points gets n-1 colliders.
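
To tie it all together, here is a hypothetical usage sketch (the prefab field and the point values are mine, not part of the demo project) that draws an air-tappable two-segment line:

using UnityEngine;

public class LineDemo : MonoBehaviour
{
    // A prefab with a LineRenderer, a LineController and a LineColliderDrawer on it
    [SerializeField] private GameObject _linePrefab;

    private void Start()
    {
        var line = Instantiate(_linePrefab);
        line.GetComponent<LineController>().SetPoints(new[]
        {
            new Vector3(0f, 0f, 1f),
            new Vector3(0.5f, 0f, 1.5f),
            new Vector3(1f, 0f, 2f)
        });
    }
}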

Concluding words

And that's basically it. Now lines can be selected as well. Thanks to both Unity forum posters who gave me the two halfway parts that allowed me to combine this into one working solution.

19 December 2018

Improving Azure Custom Vision Object Recognition by using and correcting the prediction pictures

Intro

Last month I wrote about integrating Azure Custom Vision Object Recognition with HoloLens to recognize and label objects in 3D space. I wrote that the prediction went pretty well, although I used only 35 pictures. I also wrote that the process of taking, uploading and labeling pictures is quite tedious.

Improving on the go

It turns out Custom Vision retained all the pictures I uploaded in the course of testing. So every time I used my HoloLens and asked Custom Vision to locate toy aircraft, it stored that picture in the cloud, together with its predictions. And the fun thing is, you can use those pictures to actually improve your model again.

After some playing around with my model (for the previous blog post about this subject), I clicked the Predictions tab and found about 30 pictures - one for every time I used the model from my HoloLens. I could use those to improve my model. After that, I did some more testing using the HoloLens to show you how it's done. So I clicked the Predictions tab again and there were a couple more pictures:

If we select the first picture, we see this:

The model has already annotated, in red, the areas where it thinks there is an airplane. Interestingly, the model is now a lot better than it originally was (when it only featured my pre-loaded images), as it recognizes the DC-3 Dakota on top - which it has never seen before - as an airplane! And even the X-15 (the black thing on the left) is recognized. Although the X-15 had a few entries in the training images, it barely looks like an airplane (for all intents and purposes it was more a spaceship with wings to facilitate a landing).

I digress. You need to click every area you want to confirm:

And when you are done, and all relevant areas are white:

Simply click the X at the top right. The image will now disappear from the "Predictions" list and end up in the "Training images" list.

Some interesting things to note

The model really improved from adding the new images. Not only did it recognize the DC-3 'Dakota' that had not been in the training images, but also this Tiger Moth model (the bright yellow one) that it had never seen before:

Also, it stopped recognizing or doubting things like the HoloLens pouch that's lying there, and my headphones and hand were also recognized as 'definitely not an airplane'.

Yet, I also learned it's dangerous to take the same background over and over again. Apparently the model starts to rely on that. If I put the Tiger Moth on a dark blue desk chair instead of a light blue bed cover:

Yes... the model is quite confident there is an airplane in the picture, but it's not very good at pinpointing it.

And as far as the Curtiss P-40 'Kittyhawk' goes - even though it has been featured extensively in both the original training pictures and the ones I added from the Predictions, this is no success either. The model is better at pinpointing the aircraft, but considerably less sure it is an aircraft. And the outer box, which includes the chair, gives a mere 30.5%. So it looks like, to make this model even more reliable, I still need more pictures, on other backgrounds, with more varied lighting, etc.

Conclusion

You don't have to take very many pictures up front to incrementally improve a Custom Vision Object Recognition model - you can just iterate on its predictions and improve them. It feels a bit like teaching a toddler how to build something from Legos - you first show the principle, then let them muck around, and every time something goes wrong, you show how it should have been done. Gradually they get the message. Or at least, that's what you hope. ;)

No (new) code this time, as the code from last time is unchanged.

Disclaimer - I have no idea how many prediction pictures are stored and for how long - I can imagine not indefinitely, and not an unlimited amount. But I can't attach numbers to that.

08 December 2018

Mixed Reality Toolkit vNext–dependency injection with extension services

Intro

The Mixed Reality Toolkit vNext comes with an awesome mechanism for dependency injection. This also takes away a major pain point – all kinds of behaviors that are singletons and are called from everywhere, leading to all kinds of interesting timing issues - and tightly coupled classes. This all ends with extension services, which piggyback on the plugin structure of the MRTK-vNext. In this post I will describe how you make, configure and use such an extension service.

Creating an extension service

A service that can be used by the extension service framework (and be found by the inspector dropdown that I will show later) needs to implement IMixedRealityExtensionService at the very least. But of course we want the service to actually do something useful, so I made a child interface:

using Microsoft.MixedReality.Toolkit.Core.Interfaces;

namespace Assets.App.Scripts
{
    public interface ITestDataService : IMixedRealityExtensionService
    {
        string GetTestData();
    }
}

The method GetTestData is the one we want to use.

Any class implementing IMixedRealityExtensionService needs to implement six methods and two properties. And to be usable by the framework, it needs to have this constructor:

<ClassName>(string name, uint priority)

To make this a little simpler, the MRTK-vNext contains a base class BaseExtensionService that provides a default implementation for all the required stuff. And thus we can keep TestDataService very simple, as the base class a) implements all the required members and b) forces us to provide the necessary constructor:

using Microsoft.MixedReality.Toolkit.Core.Services;
using UnityEngine;

namespace Assets.App.Scripts
{
    public class TestDataService : BaseExtensionService, ITestDataService
    {
        public TestDataService(string name, uint priority) : base(name, priority)
        {
        }

        public string GetTestData()
        {
            Debug.Log("GetTestData called");
            return "Hello";
        }
    }
}

Registering the service in the framework

Check if a custom profile has been selected. Assuming you have followed the procedure I described in my previous post, you can do this by selecting the MixedRealityToolkit game object in your scene and then double-clicking the “Active Profile” field.

If the UI is read-only, there’s no active custom profile. Check if there’s a profile in MixedRealityToolkit-Generated/CustomProfiles and drag that on top of the Active Profile field of the MixedRealityToolkit object. If there’s no custom profile at all, please refer to my previous blog post.

Scroll all the way down to Additional Service Providers.

Click the </> button. This creates a MixedRealityRegisteredServiceProvidersProfile in
MixedRealityToolkit-Generated/CustomProfiles and shows this editor.

Click “+ Register a new Service Provider”. This results in a “New Configuration 8” that, if you expand it, looks like this:

If you click the “Component Type” dropdown you should be able to select “Assets.App.Scripts” and then “TestDataService”.

I also tend to give this component a somewhat more understandable name, so the final result looks like this:

Calling the service from code

A very simple piece of code shows how you can then retrieve and use the service from the MixedRealityToolkit:

using Microsoft.MixedReality.Toolkit.Core.Services;
using UnityEngine;

namespace Assets.App.Scripts
{
    public class TestCaller : MonoBehaviour
    {
        private void Start()
        {
            var service  = MixedRealityToolkit.Instance.GetService<ITestDataService>();
            Debug.Log("Service returned " + service.GetTestData());
        }
    }
}

Notice I can retrieve the implementation using my own interface type. This is very similar to what we are used to in ‘normal’ IoC containers like Unity (the other one), AutoFac or SimpleIoC. If you attach this behaviour to any game object in the hierarchy (I created an empty object “Managers” for this purpose) and run this project, you will simply see this in the console:

It’s not spectacular, but it proves the point that this is working as expected.

Conclusion

MRTK-vNext provides a very neat visual selection mechanism for wiring up dependency injection that is very easy to use. I can also easily retrieve implementations of the service using an interface, just like in any other IoC platform. The usage of profiles makes it very flexible and reusable. This alone makes it a great framework, and then I have not even looked into the cross-platform stuff. That I will do soon. Stay tuned.

In the meantime, the demo project can be found here.