Sunday 30 March 2008

30/03/08 - Farewell!

I just took a brief glance at Graham's last post and realised that this is the last time the blog will be checked for marking purposes!? That I know of, at least. So I figured I'd best say a few last words!

  • Firstly, these won't really be a few last words. I fully intend (even though I hate blogging) to keep posting updates here until the project is fully completed (i.e., until we have to hand in everything). So, should anyone actually be interested, please feel free to continue reading!
  • I completely forgot to mention that we decided my application editor should be called ASEDIT (short for Audio Spatial EDITor). It makes sense to us anyway ;-)
  • THANKS VERY MUCH to anyone who was reading my blog purely out of interest in the project! I definitely appreciate it. If you have any feedback, criticism or anything else to share, please do leave a comment.
  • As Graham said in his post - words, videos and images are not really enough to describe this project. It needs to be experienced. So, if any readers want a chance to be one of the guinea pigs (that is, if you want a go), please let either myself or Graham know. We need a good few people to test this and to tell us how well it works! Leave a comment or drop me an email (d kereten AT g mail DOT com).
  • Both myself and Graham see this project as a great success. It was conceived as a research project: to develop a framework that eases the development of audio- and location-based augmented reality applications. I believe we have managed to do just that; in fact, I hope that by the end of April we will have surpassed it. I hope the people marking our project can see that too. If not, please get in touch, so that we may take action to fix any potential problems (read: aspects which would cause a loss of marks). Thank you very much!
  • I feel I learnt a lot through working on this project. If nothing else, I think the learning experience made it all worthwhile.
  • The system we built is crude compared to what it could have been. The hardware we used was the cheapest we could find that would do the job. The software was (or, should I say, still is) written to give sufficient results before the deadline. Unfortunately this means we were a little lax in certain areas: no unit testing, for example. Basically, there are a huge number of improvements which could be made. I feel that myself and Graham could, with sufficient time and money, design and develop a realistically usable augmented reality platform. There is so much more we wanted to add, but had to scrap due to time/money constraints.
All in all, I enjoyed working on this project and feel it was an astounding success! I hope to be able to post more success stories over the last few weeks of the project. Check back in a few days, more is bound to have happened!

30/03/08 - Framework and tools.

Admittedly, I posted the previous post five minutes ago, though it should have been posted on Friday. I got home late on Friday, so never got a chance, and yesterday... yeah. Anyway, on to today's real post:

I didn't really do much since Friday, mainly because of time constraints - unfortunately I have a lot more needing doing than just the project. I did, however, get to plan out the feature set for the application editor some more, and have also decided upon an architecture for the integration of the application editor and the message router/state machine.
I have decided to merge the two programs. If I had more time for the project, I would probably keep them separate, as that allows for greater flexibility in the long run, but since the deadline is fast approaching, I decided it was more important to simply get it running. Merging them also makes it a lot easier to expose functionality between the two - because they are now one and the same! The application editor also benefits from the Twisted networking code already implemented in the message router, and it now has direct access to the state machine. In the morning, I hope to merge the two and refactor the resulting code to ensure there is no decrease in code quality. I expect this program to be an integral part of the framework: not only will it be the central point for users to access the framework's built-in functionality through a graphical editor, routing messages between the various daemons and handling states and state transitions, but it will also provide a platform for Python code to access the framework's features for custom-coded applications which are not possible through the editor alone - and those applications should be able to add tools to the editor, for convenient access! (A rough sketch of what such an application might look like follows below.)
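To give an idea of the direction I have in mind, here is a very rough sketch of what a custom-coded application might look like once the merge is done. None of this exists yet; the names (add_editor_tool, on_state_change, add_sound_source) are purely hypothetical placeholders, not a real API:

    # Hypothetical sketch only -- the merged editor/router does not expose
    # this API yet, and all of these names are placeholders.

    class VirtualZooApp(object):
        def __init__(self, framework):
            self.framework = framework
            # Custom applications should be able to add their own tools to
            # the graphical editor...
            framework.add_editor_tool("Place animal", self.place_animal)
            # ...and hook directly into the state machine, since it now
            # lives in the same process.
            framework.on_state_change(self.state_changed)

        def place_animal(self, x, y):
            # Ask the audio daemon (via the built-in message router) to
            # create a looping sound source at the clicked position.
            self.framework.add_sound_source("lion.wav", position=(x, y), loop=True)

        def state_changed(self, old_state, new_state):
            print "state changed from %s to %s" % (old_state, new_state)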
Hopefully I can manage to complete the code merger in a single day, in which case there should be another post here tomorrow. This will leave me to implement features on Tuesday and Wednesday. The goal now is to have all core functionality complete by Thursday, so that we can then build a few simple demonstration applications. I hope the demo apps will prove the framework and application editor to be intuitive and easy to work with. These demo apps can then also be used for our guinea pig testing :-)

If the rest of the project was a success so far, then the demo apps should be trivial to implement. After all, that is the whole point of implementing a framework.


A quick look at the calendar and schedule shows that we are still on track, though care must now be taken to ensure we don't fall behind. We need to have the demonstration applications complete within about a week and a half, meaning the framework needs to be in a useful state before then. That leaves us with a week or two for testing and evaluation and a week for documentation. The schedule should really be revised, since I don't think we will be able to do anything after April. As it stands though, I believe we are about where we should be - perhaps a week behind, but nothing we cannot recover from.
At the start, most of the work was done by Graham, since there was little that could be done without the hardware and code that interfaced with said hardware. Now that it's mostly complete, it's my turn to have more work to do. So, now Graham is mostly testing and debugging, while I'm writing a bucketload of code. Almost IBM standards - millions of KLOCS! Ok, maybe not, but I wrote more code last week than I have any other week since the start of the project!

28/03/08 - Meeting with supervisor

Today we had a meeting with Alan to see if we are on track for the second milestone. For the meeting, we prepared a small demo, which showed him all the completed hardware working together with the application editor so that he could see the test environment we intend to use over coming weeks to test and evaluate our project. I'll take a moment to talk about the application editor first.

The application editor, in its current state, is a more powerful form of the compass test program written back in February - you can position sound sources in virtual space, and the listener (now represented by a little graphic of a person, instead of simply a dot) is positioned so as to represent the position and orientation of the user wearing the headset. While the compass test only took orientation into account, the application editor also reacts to Ubisense data. Another difference is that multiple sound sources can be active (i.e., playing sound) at once, each playing a different sound. This is accomplished by giving each sound its own set of properties, which can be adjusted separately. Finally, sound sources can now be repositioned/moved after having been placed, which could not be done in the compass test.

While this is not a significant improvement feature-wise, the infrastructure is now in place to allow more advanced (and useful) features to be implemented over the coming days - per-sound-source property support being the most important, but also the graphical widget for representing the environment. In the compass test code, I was simply drawing to a GTK Drawable widget, while this time I created my own custom widget, derived from both Drawable and the GtkGLExt OpenGL class. This allows me to handle events in a more localised fashion, as well as to draw more complex images. Hopefully I'll have a video online showing the program in action; the still screenshots don't really do it justice. Speaking of screenshots, here's one now:


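For anyone curious about the custom widget mentioned above, here is roughly how such a widget can be put together with PyGTK and PyGtkGLExt. This is a simplified sketch rather than my actual code (the real widget also handles mouse events and draws the listener and sound sources):

    # Simplified sketch of an OpenGL-capable GTK widget using PyGtkGLExt.
    import pygtk
    pygtk.require('2.0')
    import gtk
    import gtk.gdkgl
    import gtk.gtkgl
    from OpenGL.GL import *

    class SoundCanvas(gtk.gtkgl.DrawingArea):
        def __init__(self):
            glconfig = gtk.gdkgl.Config(mode=(gtk.gdkgl.MODE_RGB |
                                              gtk.gdkgl.MODE_DOUBLE))
            gtk.gtkgl.DrawingArea.__init__(self, glconfig)
            self.connect('expose_event', self.on_expose)

        def on_expose(self, widget, event):
            gldrawable = widget.get_gl_drawable()
            glcontext = widget.get_gl_context()
            if not gldrawable.gl_begin(glcontext):
                return False
            glClearColor(0.0, 0.0, 0.0, 1.0)
            glClear(GL_COLOR_BUFFER_BIT)
            # ...draw the listener and sound sources here...
            if gldrawable.is_double_buffered():
                gldrawable.swap_buffers()
            else:
                glFlush()
            gldrawable.gl_end()
            return True

    win = gtk.Window()
    win.set_default_size(400, 400)
    win.add(SoundCanvas())
    win.connect('destroy', gtk.main_quit)
    win.show_all()
    gtk.main()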
Over the next few days, I plan to scrap my current socket-based networking code in favour of Twisted. This should make the networking aspect of the program much more robust and flexible. Currently, certain scenarios are somewhat error prone, due to the use of blocking sockets in a multithreaded GUI application (it needs to be multithreaded so that a blocking socket does not stall the entire GUI, but this causes problems when a thread needs to terminate...). Twisted will solve all of these issues.
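The rough plan is to let Twisted's gtk2reactor drive the GTK main loop, so that networking callbacks and GUI events all run in a single thread with no blocking sockets. A minimal sketch of the idea (the host, port and line-based protocol here are placeholders, not our real ones):

    # Sketch only: Twisted's gtk2reactor runs the GTK main loop and the
    # networking together, so no threads or blocking sockets are needed.
    from twisted.internet import gtk2reactor
    gtk2reactor.install()          # must be called before importing the reactor

    from twisted.internet import reactor
    from twisted.internet.protocol import ReconnectingClientFactory
    from twisted.protocols.basic import LineReceiver

    class DaemonClient(LineReceiver):
        def lineReceived(self, line):
            # Safe to touch the GUI here, since everything runs in one thread.
            print "daemon said:", line

    class DaemonClientFactory(ReconnectingClientFactory):
        protocol = DaemonClient

    # Placeholder address -- the real daemons use their own ports.
    reactor.connectTCP("localhost", 9000, DaemonClientFactory())
    reactor.run()                  # runs GTK and the network code together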
Besides the networking code, there are a lot of planned features that still need to be implemented. Currently the program is little more than a testing tool; when complete, I hope it will be an application editor capable of allowing one to design and build augmented reality applications with our framework. After all, what is a framework without intuitive tools?

So, that's basically what I've been working on since my last post. Other than that, I fixed some issues with the audio daemon - I added a command to reset the audio environment by removing all sound sources at once. This is needed so that each application can easily ensure it has a clean environment to work in. I also spent a couple of hours with Graham testing the system - that is, wearing the headset, walking around trying to find sounds and ensuring that my application editor actually did what it was intended to do. Eventually, everything was working and we could test everything together, though the system still has a few bugs, making the whole thing a little brittle. In fact, during our demonstration to Alan, the application editor crashed... not good! But it was easily restarted and everything was fine. I guess I now need to make sure that doesn't happen again!

So, yes, there are still some bugs which need to be fixed, but overall, everything seems to be working. I guess that puts us on track.

To recap, what I want to (or need to) do next is:
  • Debugging! I certainly don't want anything to crash during demo day.
  • Adding more features to the application editor. I posted a "TODO" list a few days ago, and, even though I had hoped to have it completed by the end of the week, this didn't happen.
  • Testing testing testing! Myself, Graham and Alan agreed that it would be beneficial to find some guinea pigs to test our project, so that we can evaluate and report on our findings, with respect to sound localisation, the usefulness of our hardware and the effectiveness of our techniques.
  • Documentation. It's a large project with many aspects. Everything needs to be thoroughly documented before the project deadline, which is now approaching faster than we would like.

Wednesday 26 March 2008

26/03/08 - Application Editor

Currently, I am working on an editor which will allow you to position multiple potentially moving sound sources around the listener and set a number of properties for each. This will be similar to the demo program I wrote, except that it will provide a lot more functionality.
Each sound source will have a number of associated properties which can be edited. So far the planned properties are:
  1. Position
  2. Sound file to play
  3. Duration
  4. Which state the user must be in for the sound to play
  5. Radius
  6. State to change to if within radius
  7. Path along which the sound will move
  8. Speed of movement along path
Other properties will be added as I think of them. (A rough sketch of how these properties might fit together is shown below.)
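To make that a little more concrete, here is a rough sketch of how a sound source and its editable properties might be represented internally. This is not the actual editor code; the names and defaults are purely illustrative:

    # Illustrative sketch of a sound source and its editable properties.
    class SoundSource(object):
        def __init__(self, position, sound_file):
            self.position = position        # (x, y) within the Ubisense-enabled area
            self.sound_file = sound_file    # sample to play
            self.duration = None            # seconds to play for, or None to loop
            self.required_state = None      # state the user must be in for it to play
            self.radius = 1.0               # metres; entering this radius triggers...
            self.next_state = None          # ...a transition to this state
            self.path = []                  # list of (x, y) waypoints for moving sources
            self.speed = 0.0                # metres per second along the path

    # Example: a stationary lion for the virtual zoo demo.
    lion = SoundSource(position=(2.5, 4.0), sound_file="lion.wav")
    lion.radius = 2.0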

This should help us to not only test the hardware better, but also to develop some simple, yet rich, demo applications, such as the virtual zoo or band.

This program will be used to either drive the router/state machine program written a few months back (after it is updated, later this week), or it will be integrated into it. I have yet to decide which approach I will take. For the time being, it will be a standalone application (and will not support the state properties until I have it working together with the state machine program, using whichever method I decide upon).

Here is a screenshot of the program so far:

Nothing terribly interesting there yet, besides the basic GUI. Over the next couple of hours I will be coding a custom OpenGL-based GTK widget for editing the sound sources.

The left-hand bar will be the toolbar, containing buttons for each type of action that can be performed. The existing buttons represent (from top to bottom): editing the dimensions of the environment (i.e., setting a rectangular area which represents the Ubisense-enabled area the application will be run in), placing and editing sound sources, and the cogwheels at the bottom, which will run the application.

The black area in the middle is the OpenGL-enabled canvas. This is where all the magic will happen ;-)
The area to the right will contain a list of properties for the currently selected sound source.

The plan is to have the basic version of this working tonight and then to test it in DCU tomorrow morning. If all goes well, I will be adding the state properties and whatever else I think of to the program over Friday and the weekend. The hope is that it will be completed on or before Monday - before, if at all possible.

Ok, best get back to work :-)

Tuesday 25 March 2008

25/03/08 - Plans for this week

I figured I may as well post my plans for this week. It will help me remember what I want to do, provide me with a checklist, and show whoever may be reading my blog what I intend to accomplish over the coming days.

  1. The audio feedback component needs some more work. Before the weekend, I'd very much like for it to be more or less complete, which means I need to implement the ability to add sounds that decay over a given time, fading in volume until they are inaudible, at which point they will be removed. The ability to reset a sound's timeout should also exist (a rough sketch of the decay logic follows this list). This would allow for some interesting applications - one use which myself and Graham have discussed is to have objects detected by the ultrasonic sensor decay over time: as there is no easy way to determine whether an object is static or moving, storing its position indefinitely would cause inaccurate results, but having it decay provides a useful representation to the user.
  2. I want to refactor the router application to accommodate the advances made in the project so far. It is now time for the router/state machine application to be used (and indeed, it should be quite useful now), but its current state does not really reflect the rest of the project. I hope to have this completed and working by Friday, so that it may be used to implement the demo applications we have planned.
  3. Tools. I have a number of tools planned which would be used to configure the framework and develop applications. The tools are what I was hinting at in previous posts - one (or more, depending on how successful I am) of these tools will have a nice pretty drag'n'drop interface, which I will write in Python and PyGTK using OpenGL. If all goes to plan, this should make using the framework much, much easier than the current voodoo needed to make everything work.
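Regarding the decay mentioned in the first item: the behaviour boils down to a simple fade-out over a lifetime that can be reset. A small illustrative sketch of the logic (the real implementation will live in the audio daemon, not in Python, and the details may well change):

    import time

    # Illustrative sketch of the intended decay logic for a sound source.
    class DecayingSource(object):
        def __init__(self, lifetime):
            self.lifetime = lifetime    # seconds until the sound becomes inaudible
            self.created = time.time()

        def reset_timeout(self):
            # e.g. when the ultrasonic sensor detects the same object again
            self.created = time.time()

        def volume(self):
            # Fade linearly from full volume to silence over the lifetime.
            age = time.time() - self.created
            return max(0.0, 1.0 - age / self.lifetime)

        def expired(self):
            # Once inaudible, the source can be removed entirely.
            return self.volume() <= 0.0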
I guess we will find out soon enough whether I'm successful with this little todo list... today looks to be an exceptionally busy day though, so I don't think I'll get a chance to work on the project until tomorrow :-(

Thursday 20 March 2008

20/03/08 - Neglecting to update blog

It is waaaay too easy to neglect the blog.. I really am not a blog person. GRR BLOGS...

Now that I have that out of my system, onto the real post.
Graham has posted about the hardware problems we've had, so I won't say any more about it now. Instead I'll write about the coding I've done.

Since my last post, I was still sick, so the Friday demo plan was completely ruined. After that, I ported my FMOD code to Windows, since FMOD cannot use hardware acceleration under Linux.. boo! This mainly just involved minor edits to the socket code to make it work under Windows and removing the Linux-specific headers. The FMOD code itself did not need any changes, besides changing the initialization flags to tell it to use hardware acceleration. Simple. Easy. Great.
... Except there were problems. There are always problems. Basically, on Windows, FMOD only supports the C++ API when compiling with Microsoft Visual C++. I wanted to compile using MinGW, because that's what I've always used on Windows, and since I use GCC on Linux, I figured why not use it on Windows too. To cut a long story short, I downloaded Visual Studio 2008 Express and tried to compile with that. Endless hassle. If I had used Visual Studio before, I'm sure I could have got it working, but it was taking me longer than the time I had available, with no success. At first it kept trying to compile into managed .NET code; eventually I found the option to disable that so that it would produce unmanaged native code, though even then I couldn't get it compiling without problems. So, instead of spending yet more time trying to fix it, I decided to scrap the FMOD C++ API entirely and use the C API instead (which is supported by all compilers, since the C ABI is standard and the $*%& C++ ABI is not).
Luckily, porting FMOD from the C++ API to the C API was extremely easy and painless and now everything works. Yay.

Graham and I also ported the C# Ubisense code which Lorcan Coyle sent us to the new Ubisense 2.0 API. This wasn't difficult, but it took a while to match the old functions to the new ones, as the API seems to have been reorganized quite a bit. The documentation wasn't terribly useful in teaching you how to use the API, but it served as a good reference and the Ubisense code works now too. Success!

I have also been working on updating my demo app to allow for multiple sound sources to be playing at once and also to allow the listener to move around - controlled by Ubisense. This will allow us to test the complete system, once the hardware issues have been overcome.

Finally, I have been playing with GtkGLExt in PyGTK. I can already create a GUI using Python and PyGTK (the demo app, for example), but GtkGLExt also allows me to draw onto GTK widgets using OpenGL. This will be useful for some tools I am slowly working on, since they have a visual component which would be a lot easier to implement (and prettier) using OpenGL rather than GTK's native drawing functions. I should be ready to post about the tools about a week from now.

So, our project now contains some C, C++, PIC assembly, PIC BASIC, C# and Python code. Interesting how a TCP/IP-based modular design allows for a nice mix of languages ;-)

Wednesday 5 March 2008

05/03/08 - Quick Update

Ok. It has been a while since my last post. Last week I was away, attending the Irish Web Technology Conference (thanks to the folks over at Python Ireland for the free ticket), and this week I've been sick. Hopefully tomorrow I'll be better and can get back to work, since our hardware has arrived. That leads me on to the first part of my update:

Hardware has arrived
As mentioned in a previous post, Alan has been kind enough to purchase a soundcard and wireless headphones for us. They arrived sometime yesterday evening, so we are keen to test them out. Graham has already modified our headset to incorporate the wireless headphones.

Ubisense is ready
On a related note, the Ubisense is finally ready and we have received a Ubisense tag from Kirk Zhang, one of the techies here in the CDVP. Graham emailed Lorcan yesterday and he has kindly provided us with their C# code, which simply transmits the location data from the Ubisense over TCP/IP. This fits in well with the rest of our system, as all our components communicate over TCP/IP.
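Since everything communicates over TCP/IP, consuming the location data from any of our Python components should be straightforward. A rough sketch of what a consumer might look like; note that the port number and the "tag x y z" line format are pure assumptions for illustration, not the actual protocol of Lorcan's code:

    # Illustrative only: reads newline-delimited "tag x y z" records over TCP.
    # The port and line format are assumptions, not the real Ubisense protocol.
    import socket

    def read_locations(host="localhost", port=7000):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.connect((host, port))
        buf = ""
        while True:
            data = sock.recv(1024)
            if not data:
                break
            buf += data
            while "\n" in buf:
                line, buf = buf.split("\n", 1)
                tag, x, y, z = line.split()
                yield tag, float(x), float(y), float(z)

    if __name__ == "__main__":
        for tag, x, y, z in read_locations():
            print "tag %s is at (%.2f, %.2f, %.2f)" % (tag, x, y, z)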

Tomorrow morning, I plan to test the hardware-accelerated HRTF. Hopefully we will now have much improved 3D sound localization. Besides that, I will modify my test application so that the listener's position can be controlled by the Ubisense. Once we have the Ubisense working, the hardware side of our project will be more or less ready and we can work on demonstration applications and flashy development tools to ease the use of our framework.
Besides this, over the next week, I plan on revising the message router code which I wrote before Christmas and then designing an easy-to-use, event-driven API to allow applications to be written in Python. The foundation for this already exists; it just needs to be cleaned and refined (and the actual features of the other components need to be exposed to the applications). This will mean that applications can be developed in a single unified place, rather than having to manually connect everything together via TCP/IP (though that option will still exist, just in case it is needed).
Finally, I want to develop a graphical configuration tool which would be used as a convenient (and user-friendly) means of configuring and setting up the framework and applications. Graham knows what I mean, since we have discussed it in great detail, but I won't post more about it until I've started work on it - since it's easier for me to write about something as I'm actually working on it.