September 13, 2009

Nuke 5.2 GPU Support

I found it somewhat unintuitive and unpublished, so I thought I'd post my findings here.

Nuke's brand new GPU support is currently set up only to run GPU operations for the viewport.  It will not pass the results to the next node in the chain.

This is across the board in all 3 hidden GPU nodes that come with Nuke 5.2 and appears to be built into the IOp calling mechanism (the closed source part of Nuke).  So for all those with visions of GPU accelerated compositing... not yet.

I cannot really intuit the design intentions to the level that I can say if this will change in future versions.  The one really odd part is the hidden node: BlockGPU.  It makes no sense to have a node that blocks the usage of the GPU engine in a tree, if the only way to use the GPU engine is to view a node that uses the GPU engine directly.  And secondarily... it doesn't block the node if you view it directly.

Anyhow, to me it looks like they got just enough code in there to accelerate their new GPU LUT functionality, and left it at that.  That's probably why the GPU nodes are hidden rather than exposed.  Maybe in Nuke 6.0?

May 03, 2008

Iron Man Released

Iron Man finally hit theaters, so I finally get to talk in the open about it and about what I did on it.

If you watch the credits, under Pixel Liberation Front, you'll find me credited as Brad Friedman, Technical Artist.  This is actually a mistake.  My credit is supposed to be Technical Lead, as my supervisor submitted it.  I'll assume it was a transcription error and not a willful demotion.  If I'm really lucky it might be fixed for the DVD release, but that's pretty unlikely.

So, what exactly did I do on the film?  What does previz do on films like that? How does it relate to FIE?  How will it help my clients?

Well, I'll answer the last questions first.  On Iron Man, I developed a brand new previsualization pipeline for Pixel Liberation Front, in Autodesk Maya.  PLF has, in the past, been primarily an XSI based company.  Though they have worked in Maya before, it's usually for short jobs (commercials and the like) or working within someone else's custom Maya pipeline (such as working with Sony Imageworks on Spider-Man 2).  This was the first time PLF was going to really take on a modern feature film as the lead company, completely under its own sails in Maya.  And it was my job to make it possible.

The tools and techniques I developed were the backbone of the previsualization department on Iron Man for the two years PLF was actively on the project, even after I left the film.  The pipeline also moved over to "The Incredible Hulk" as we had a second team embedded in that production.  I've actually been talking to PLF about further expansion and will probably be expanding their pipeline in the coming months for upcoming projects.  All those designs and ways of working are currently being updated and reimplemented into my licensable toolkit.  They will be available to my clients.

I'll get into the specifics of the tools and designs as I describe the challenges of the production further.

When we were first approached by Marvel and the production, one of the requirements for being awarded the job was that we work in Maya.  We quickly evaluated if we could do it.  And the answer was "yes."  Very quickly we were awarded the job and we sent our first team to the production offices, embedded in Marvel's offices in Beverly Hills.  The team consisted of Kent Seki, the previsualization supervisor, Mahito Mizobuchi, our asset artist and junior animator, and myself, the technical director and artist.

My first major task was to help the designers in the art department with the Iron Man suit design.  At the time, there was no visual effects vendor.  And the hope was that I could do visual effects development and vetting with the art department, avoiding the need to hire the VFX vendor early in the production.  John Nelson, the VFX supervisor, was particularly concerned with the translation of the designs on paper to a moving three-dimensional armor.  He didn't want obvious interpenetrations and obvious mechanical impossibilities to be inherent in the design.  He wanted the range of motion evaluated and enhanced.  He needed us to be able to build the art department designs in 3d and put them through movement tests.  He needed us to suggest solutions to tricky problems.  All that weight fell on my shoulders with the assistance of Mahito's modeling skills.

For a number of months, we rebuilt the key trouble areas again and again as the design was iterated back and forth between us and the art department.  The abdomen and the shoulder/torso/arm areas were where we focused most of our time. I rigged a fully articulated version (metal plate for metal plate) of the trouble areas about once a week, trying to solve new and old problems different ways each time.

Finally, we arrived at what has stuck as the final articulation and engineering of the suit after a few months of work.  The suit has gone through many superficial redesigns and reproportions since then.  However, its mechanical design was sound and survived all the way to the screen.  The design eventually went to Stan Winston Digital for further work and to be built as a practical suit.  It eventually went to ILM for final VFX work, once they were awarded the show.

I think what this aspect of the project shows is a very smart VFX Supervisor and VFX Producer figuring out how best to utilize a top notch previs team to save money and get experts in the right places, without having to pay a premium for a full VFX company early in the production process.  This allowed them to go through a full bidding process for the VFX vendors, without being rushed into an early decision, as their VFX needs were met by us.  To give you an idea of how early in the process this was:  Robert Downey Jr. was not cast at this point.  The production designer started work a week after we did.  The script was not complete.  The Director of Photography was not chosen.  The main villain in the film was NOT Obadiah Stane at the time.  This is really early and ultimately the final decisions on VFX vendors didn't come until about a year later.

Once our initial suit design chores were handled, I moved on to the main previs effort and we started working on sequences of the film.

Previs has some very special requirements.  It has to be done fast.  It has to look good.  It has to be cheap.  I think it's the most demanding animation project type there is these days.  On the triangle of necessity, it hits all three points.  Fast, Good and Cheap.  Usually you can pick only two of those to accomplish at the expense of the third.

Historically, previs got away with skimping on the "Good."  Characters would slide along the ground, making no effort to pump their legs.  It was pure blocking.  It was fine though, because the previs was a technical planning tool.  The aesthetics were irrelevant as long as the previs was technically accurate enough to plan with.  However, as previs has become more of an aesthetic tool and less of a purely technical tool, the quality of animation has had to increase exponentially.  It has gotten to the point that on a lot of shows, we're doing first pass animation.  And in the extreme, even first pass animation is not enough to convey real human acting.  We are expected to be able to generate real human motion without giving up on speed or price.  Some might find this antithetical to the concept of previs (in fact a lot of people do) but our reality as a previs company is that "the customer is always right" and if the customer decides they want it, we need to be able to provide it.

So to sum it up, I was faced with being the only experienced Maya user on the team.  I needed to hit all three points in the triangle of necessity: fast, cheap and (exceptionally) good.

By the end of my time on Iron Man and Hulk, I had created the following:

  • Multi Rig Marionette System
    • One character can have any number of control rigs in a given scene.  The character can blend from rig to rig as the animator pleases.  Addition of more rigs and characters to the scene is handled via an integrated GUI.  Control rigs can be of any type and therefore can use any animation for their source, from hand animation to mocap.  Think hard about what doors it opens up.  A rig devoted to a walk cycle.  A rig devoted to a run cycle and a rig to animate on by hand.  All connected to the same character with the freedom to add as many other rigs as you need to make the scene work.
    • Accelerated skinning based on a geometry influence system integrated into the Marionette.
  • Motion capture Integration
    • Accelerated pipeline for moving raw motion capture from MotionBuilder to any number of previs characters in a batch oriented fashion.
    • Motion Capture editing rig that takes mocap as a source and gives the animator an array of FK and IK offset controls with which to modify the mocap.
  • Auto Playblast/Capture/Pass System
    • A fully functional settings system that allows assets to be populated with "terminal strips" and passes to be populated with "settings" for the terminal strips that control aspects of the asset via connections, expressions, etc.  Activating a pass applies the settings to any matching terminal strips in the scene.
    • Passes can be members of any number of shots in the scene.
    • Passes can optionally activate associated render layers.  In this way, I've incorporated the existing render layers system into my own (if the new render layers ever become stable enough to use in production, I might actually condone their use)
    • A shot references a camera, a number of passes and keeps information on Resolution, Aspect ratio, shot, sequence, version, framerange, etc.
    • Smart playblast commands capture the selected shots, the selected passes in shots, etc.  One command and it sets up the playblasts itself and fires them off one by one.  Go get coffee.
  • Guide and AutoRig solutions
    • A custom biped guide allows easy joint placement and alignment.
    • A very comprehensive rig based on the ZooCST scripts is automatically generated from the guide.
    • Motion capture rigs suitable for movement into MotionBuilder are generated off of the guide.
    • Motion capture editing rigs suitable for use in Maya are generated off of the guide.
  • An exceptional Goat rig
    • I made a really good goat rig.  Which didn't really end up getting used.  But it was a spectacular rig.  Really.
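
The pass and terminal strip idea from the list above can be sketched roughly as follows.  This is a minimal, hypothetical model in plain Python, not the actual tool: the names `TerminalStrip`, `Pass` and `activate` are assumptions made for illustration, and a real version would drive Maya attributes through connections or expressions rather than a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class TerminalStrip:
    """Named set of controllable values exposed by an asset (hypothetical)."""
    name: str
    values: dict = field(default_factory=dict)


@dataclass
class Pass:
    """Maps terminal strip names to the settings this pass should apply."""
    name: str
    settings: dict = field(default_factory=dict)

    def activate(self, strips):
        """Apply settings to every strip whose name matches; return names touched."""
        touched = []
        for strip in strips:
            if strip.name in self.settings:
                strip.values.update(self.settings[strip.name])
                touched.append(strip.name)
        return touched


# Two strips in the scene; the pass only knows about one of them.
scene = [
    TerminalStrip("suit_visibility", {"visible": True}),
    TerminalStrip("ground_shadow", {"enabled": True}),
]
beauty = Pass("beauty", {"ground_shadow": {"enabled": False}})
touched = beauty.activate(scene)
```

Matching by name is what lets one pass apply to any asset that happens to carry the right strip; strips the pass does not mention are left alone.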

Beyond the tools I built directly for the previs jobs, Iron Man also opened the door for us when it came to motion capture.  The production had scheduled a series of motion capture shoots with which to test out the suit designs and see how they moved.  We made it a point to capture some action for the previs while we were there.  Our previs characters were already loaded into the systems since that was what the production was using to test out the suit design.  We used some of the mocap for the previs on Iron Man in a more traditional MotionBuilder pipeline that was functional, but slow.  I immediately realized that mocap could be the solution to the better-faster-cheaper problem, once the bottlenecks were sorted out with custom tech.  And I moved in that direction.  Within a year, PLF had bought a Vicon mocap system.  I set to the task of making it work for previs and within the year, we were in full swing of doing previs mocap for "The Incredible Hulk" due out later this summer.

The remaining question is: do these technologies have relevance outside of previs?  And the answer is yes.  All of these technologies were built to dual target previs animation and production animation.  They're generic in nature.  We have used all of these systems on finished render jobs, game trailers and the like.  What I've built is a better way to animate that is rig agnostic.  All the old animation techniques and rigs still work.  I've just incorporated them all into a higher level system.  I then built a high volume motion capture pipeline as one potential animation source in that higher level system.

But it doesn't stop there.  In reality, I only really got to develop about half of the systems and tools I felt should be a part of the pipeline.  In their new incarnations, the tools will be even more functional.

I saw the movie the other night and it was a real thrill to see all our shots and sequences finished and put together.  A number of my own shots made it through the gauntlet and into the final film.  More so than that, it was great to see people enjoying it.  I felt good knowing that I had, in a very real way, built a large section of the foundation for it.  My tools allowed Jon Favreau to work on the film in a more fluid and intuitive manner through the previs team, which created an unprecedented amount of previs at an astounding quality.  The more previs we were able to do, the more revisions and iterations Jon could make before principal photography.  And the film is better for it.

March 31, 2008

Introducing GlobalStorage

GlobalStorage solves a lot of problems I've been pondering for a long time.  Here are some of those problems.

  1. You're not really supposed to work directly on the file server in most production environments.
    1. It clogs the network for everyone.
    2. Network fabric is slower than internal data buses like SATA.  You get better performance on a local drive.
    3. Redundancy can save you.  If you break your local copy, you can pull the original from the server to fix it.
  2. But there are advantages to working on the server
    1. Organizationally, it's easier to work on a common filesystem.  Changes others make to the filesystem are immediately seen by you.  You and your fellow artists won't miss each other's changes as they live in the same place.
    2. You don't have to manage which files you have "changed" and which you have not when publishing your changes back to the server, since changes are applied immediately and... that's it.
    3. Absolute filepaths that are part of the files you are working on (references from one file to another) are not an issue if everyone maps the shared directory to the same place.  Even the renderfarm can work directly with the filesystem this way.  Otherwise, you have to manage absolute paths and artists mess this up all the time.
    4. You won't as often "forget" to publish new files to the server that you had local on your hard drive, as your first instinct will be to save directly to the server.
  3. There are things that neither solution fixes
    1. Without actually locking the files you are working on when they are loaded into memory you risk two artists changing the same file and overwriting each other's changes when they publish (or write) them back.
    2. When files are eventually published to the server they are overwritten and the old version is lost.  So if you mess something up, you're out of luck.

An experienced artist will look at my list of issues and immediately start listing application features, workflows and tools to deal with the problems one by one.  And let's not be unreasonable here.  Most medium sized facilities have, at the very least, solutions, standards and practices that mitigate a lot of these problems to varying degrees.  Here are a few:

  1. Alienbrain
  2. Perforce
  3. Versioning files manually
    1. Never create myfile.txt
    2. Always make myfile_v01_01.txt
  4. Use "incremental save" features in your 3d app
  5. Keep your whole project in a Subversion or CVS repository
  6. Make artists responsible for individual assets, reducing the number of people who may be working on a file.
  7. Use the verbal Checkout Checkin system (i.e. "I'm Checking Out SC_02!")
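
As an illustration of the manual versioning convention in item 3, here is a minimal sketch of a helper that bumps the last version field of a filename.  The pattern `<name>_v<NN>_<NN>.<ext>` is taken from the example above, but reading the two numbers as a major and a minor version is my assumption, and `next_minor_version` is a hypothetical name.

```python
import re

# Assumed pattern: <name>_v<major>_<minor>.<ext>, e.g. myfile_v01_01.txt
_VERSION_RE = re.compile(
    r"^(?P<stem>.+)_v(?P<major>\d+)_(?P<minor>\d+)(?P<ext>\.[^.]+)$"
)


def next_minor_version(filename):
    """Return the filename with its last version field incremented by one."""
    match = _VERSION_RE.match(filename)
    if match is None:
        raise ValueError(f"no version suffix found in {filename!r}")
    minor = match.group("minor")
    # Keep the zero padding the original name used.
    bumped = str(int(minor) + 1).zfill(len(minor))
    return f"{match.group('stem')}_v{match.group('major')}_{bumped}{match.group('ext')}"


new_name = next_minor_version("myfile_v01_01.txt")
```

A helper like this only removes the typing; it does nothing about two artists bumping the same file at the same time, which is the problem the SCM systems address.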

The better solutions listed here fall into the category of Source Control Management (SCM) systems.  SCMs solve a lot of problems.  They were created to manage the first real digital assets, computer source code, many years ago.  Modern SCMs manage locking and versioning of complex directory trees.  They can manage collisions down to the file level and if your files are text based, they can often manage them down to the line level.

Perforce and Alienbrain have been optimized to work with digital media assets (which are usually characterized as being big binary files rather than text files).  They are, however, proprietary and expensive.  If you choose one of these solutions (and many digital media production facilities do) you will be stuck licensing each artist seat, or buying a rather hefty site license.  They are proprietary and therefore closed source.  And as much as they can provide plugin APIs, anything that's closed source is more difficult to customize than an open source solution.

Subversion on the other hand, is open source and has a large volume of support.  Subversion has been my favorite SCM for years and I've used it in production a number of times.

However, Subversion is not the end solution to the problem.  All SCMs I've worked with including Subversion, have a few problems.

  1. You don't necessarily want to version every file in your tree.  Some files are meant to be replaced.  Especially large files that are generated from small files.  It's probably good enough to generate them and push them to the server.  There's no need to track their every incarnation over time.  It's wasteful of space and processing power.
  2. If you accidentally commit large amounts of data to the server, it's often quite hard to get rid of it for good.  It's part of the history and SCM systems are kind of built NOT to lose historical data.
  3. Archiving granularity is an issue.  You can create a repository per project but then the projects are separate.  Or you can keep everything under one tree, but it becomes hard to delete a project after archiving it.  Also, when archiving a project, you may want to keep versioning information for some parts but only the latest version for others.  This is even more complex, if not impossible.

Anyhow, what I've built and have running in Alpha right now is what I'm terming GlobalStorage.  It's a suite of tools that use Subversion to implement a more robust SCM that's tailored better to production.  Basically, it's a system built on top of Subversion and more common filesystem tools, to act as the single storage solution for a digital production studio.

Here are some features in no particular order:

  1. Generic storage solution.  Even the production accountant can use it as his/her data store, regardless of his/her completely different tools and workload.  Producers can use it for their storage needs.  It's not 3d or video specific in nature.
  2. Written mostly in Python and therefore able to integrate directly and easily into leading digital production packages.
  3. Assets can be SVN backed or FLAT. So they are either under full historical version control, or just a flat copy on the server, depending on the appropriate storage for the asset.
  4. Assets can show up multiple times in the directory tree.  A single asset (say, HDRI Skies) can be in the textures directory for an XSI project, Maya project, and a central asset library, all without making redundant copies of the asset.
  5. Dependencies.  Assets can be set to be dependent on other assets.  Dependencies can optionally be updated and committed in lockstep with one another from a single call on the top level asset.
  6. Disconnected Mode.  When the system is disconnected from its server, it can create and work with new assets locally, as if they were on a server.  When you reconnect to the server, these assets are then able to be transferred to the server.
  7. Assets can have their history deleted when it's time to save space.
  8. Assets can be filtered at the path level, allowing the permanent deletion of parts of an asset's history without affecting the history of the rest of the asset.
  9. Assets are easily copied and moved from server to server for archival purposes.
  10. Assets are stored via hashcode and will never collide at the storage level.  The entire history of your production at the company will be able to live on a single storage system if it's big enough.  Historical projects can be brought back into an existing server without worry of data loss.
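
To make a few of these features concrete, here is a minimal sketch of what an asset record might look like.  Nothing here is the actual GlobalStorage code: the `Asset` class, its field names, the choice of SHA-1 over a random id, and the `update_order` helper are all assumptions for illustration.  With a hash of a random id, collisions are astronomically unlikely rather than strictly impossible.

```python
import hashlib
import uuid
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical asset record; field names are illustrative only."""
    name: str
    backing: str = "SVN"  # "SVN" (full history) or "FLAT" (latest copy only)
    depends_on: list = field(default_factory=list)
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)

    @property
    def storage_key(self):
        # Hash a unique id rather than the name, so two assets that share
        # a name still land in different places on the server.
        return hashlib.sha1(self.uid.encode("utf-8")).hexdigest()


def update_order(asset, seen=None):
    """Return the asset and its dependencies, dependencies first."""
    seen = set() if seen is None else seen
    ordered = []
    if asset.storage_key in seen:
        return ordered
    seen.add(asset.storage_key)
    for dep in asset.depends_on:
        ordered.extend(update_order(dep, seen))
    ordered.append(asset)
    return ordered


skies = Asset("HDRI Skies", backing="FLAT")
suit = Asset("suit_model", depends_on=[skies])
shot = Asset("shot_010", depends_on=[suit, skies])
order = [a.name for a in update_order(shot)]
```

A single call on the top level asset walks its dependencies once each, which is the shape a lockstep update or commit would need.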

The 900 pound gorilla in the room has the word "Scalability" shaved into his chest hair.  This of course being a serious issue and the cause of many growing pains.

There are a few ways to deal with this.

Firstly, I'm going to add a "round robin" load balancing system into GlobalStorage, where a newly created asset is put on a randomly selected server from a pool.  Assets will also be able to be created on specific servers at the user's request.  And assets will be able to be moved to specific servers at a user's request.  GlobalStorage will magically merge the assets into a directory structure on the user's machine when they are checked out.  Their location on the network is irrelevant as long as it has access to the repositories.

The round robin solution is pretty powerful and will probably meet the needs of a large number of facilities.  With the application of minimal brain power by the artists, assets will move to unencumbered servers every once in a while.
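
A rough sketch of the placement logic might look like this.  The `ServerPool` class and the server names are hypothetical.  Note that the plan above picks a server at random, while strict round robin rotates through the pool in order; the sketch shows both so the difference is visible.

```python
import itertools
import random


class ServerPool:
    """Chooses a server for each new asset (hypothetical names throughout)."""

    def __init__(self, servers, seed=None):
        if not servers:
            raise ValueError("the pool needs at least one server")
        self._servers = list(servers)
        self._cycle = itertools.cycle(self._servers)
        self._rng = random.Random(seed)

    def pick_random(self):
        """Random placement, as described in the plan above."""
        return self._rng.choice(self._servers)

    def pick_round_robin(self):
        """Strict rotation through the pool, the textbook meaning of the term."""
        return next(self._cycle)

    def place(self, server=None):
        """Honor an explicit server request, otherwise choose one at random."""
        if server is not None:
            if server not in self._servers:
                raise ValueError(f"unknown server: {server}")
            return server
        return self.pick_random()


pool = ServerPool(["store01", "store02", "store03"])
rotation = [pool.pick_round_robin() for _ in range(4)]
requested = pool.place(server="store02")
```

Either policy spreads new assets evenly over time, but neither looks at how full or busy a server is, which is why moving assets by hand stays part of the picture.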

However, what you'd really want is what's known as a clustered file system, where it appears a single server does all the work and it runs really fast. In reality, a cluster of servers is moving data around and load balancing in a logical data driven manner. You also would want redundancy at every step of the way to avoid having a single point of failure to keep your uptime in the 99.9% range even when you have a bad week and 4 drives and a network switch fail on you.

Clustered file systems are a pain to set up and usually quite expensive to license.  However, one of my goals over the next few months is to put together a set of virtual machines and infrastructure to make the deployment of a clustered filesystem based on commodity hardware and open source software a simple matter.

GlobalStorage is designed to work just as well in a Clustered environment as in a round robin environment.   But there's no doubt that at some size, you'll really want to put a storage cluster into your facility rather than maintain many individual servers. 

February 24, 2008

Professional Software Forums

This is a rant.  If you do not like rants, move on.

So, I've frequented a number of professional level media creation software forums in my time.  I even worked with the creator of XSIBase for a couple of years (professionally in animation, not on XSIBase itself). I've also spent quite a bit of time in forums for MMOGs (Massively Multiplayer Online Games) both as a member and as a moderator.

As you might imagine, the MMOG forums are cesspools of childish behavior, ego jockeying and all around nastiness.  They're also a lot of fun.  And no one really expects them to be anything other than a waste of time.

Professional forums however, are a different matter.  What I'm talking about here are forums for software like Maya, XSI, Avid, Final Cut etc.  The expectations for these forums are different.  They're a lot of things to a lot of people and I'd like to rant about some of the typical forum diseases I see a lot.

Top Dog Syndrome

This syndrome is common.  Users who play into this syndrome have their egos attached to an image of being a top dog within the forum.  They feel that they are seen as having a superior technical or professional ability and they have to keep that image up.  When this is practiced to a minor degree, it's beneficial, as it provides a push toward answering questions.  It gets out of hand more often than not and has seriously detrimental effects.

Now, I KNOW I have a tendency toward falling to this syndrome myself.  And I try to keep a check on it.  That being said, it's difficult.  Specifically because I actually am much more learned and experienced on most of these matters than the general professional community.  My usual position within an animation team is that of "Technical Director" which is defined as being the guy/gal with all the answers on technical and technique matters.  So I'm paid to be top dog.

So, what I use to define a healthy top dog versus an unhealthy top dog is whether the pressure to answer questions results in answers that should never have been given.  When a person on a forum starts skimming questions and responding with erroneous information, it's a problem.  And more importantly, some forums are pandemic with it, to the point that it's the norm.

For example, I logged into the central forum for a very high end video editing application recently, because I was having a problem with a feature.  I was unsure if what I was seeing was user error, or a bug, or a design limitation after searching the forum for answers and reading the documentation.  I did have enough evidence that I was fairly certain it was user error or a bug, as I had been able to force the software to work correctly under some very specific settings that were unfortunately, not good enough to let me work in the general case.  Anyhow, I posted a good detailed explanation of what I was seeing, what I thought I should be seeing, forum threads that had talked about similar problems and what resulted when I tried to implement the recommendations in those threads.  I asked if anyone had any experience or ideas that were related to what I was seeing.  So I wrote what I believe to be any professional forum's dream post.

What did I get?  I got a guy with over 20,000 posts responding almost immediately with a suggestion that I was doing everything wrong and should change my entire workflow.  He also recommended that I read the manual.

Now, I've been using non-linear editing software for over 12 years at this point.  I was using it professionally for broadcast before a single system cost less than $200,000.  I'm currently acting as a combo post production supervisor and visual effects supervisor on a feature film and I know more about video compression and post production workflow than most people with the title of "post production supervisor" on this planet. I work with the software.  I can write the software.  I developed those skills in a professional environment as the technology developed over the past decade.

So as you might imagine, when a 20,000 post top dog fails to actually read my post and comprehend it, and gives me a canned line for amateurs, I don't say "thank you" and throw away my post production pipeline because he said so.  Instead, I posted that while the advice was appreciated, it's not valid due to reasons a, b, c, d and e that were explained in the initial post.  I also added that I'd prefer it if the thread remained focused on the features I was having an issue with, rather than commenting on the general post production workflow.  This again, is a professional way of dealing with the issue.  Keep the thread on target.  And if the problem is not solved, don't let the thread die, if for no other reason than that others will find the thread when they run into the problem, and they're as entitled to a reasonable conclusion as you are.

So, a 2,000 post user then came to his defense and reaffirmed that I was doing everything wrong and made even more suggestions that were immediately invalidated by the information in the original post (he didn't read it).

So what's really going on here?  It's top dog syndrome.  They are not actually interested in the problem.  They're interested in being seen answering my question, especially since my tone and technical explanation indicate that I'm a threat to their top dog status.  By composing an initial post that's very high level, I've put myself in their line of fire.  I'm a threat and they have to respond.

There was a little back and forth while I refuted their claims with tests and information to the contrary.  They continued to tell me everything I was doing was wrong.  I made an extreme effort to not make personal attacks and stop at the level of suggesting the topic was steering off course.

Eventually I gave up, frustrated and angry.  I posted a quick rant at the end of the thread where I declared the forum useless due to a focus on ego polishing and rampant misinformation.

I then proceeded to investigate the problem further myself until I was convinced I understood the behavior enough to classify it a bug or design flaw.  Either way, at that point it should actually be submitted to the developer in the form of a bug report.  It's also clear at that point that you won't get any relief from it.  Possibly ever.  Just because you can isolate a bug and give full repro steps and get it into a developer's system doesn't mean that it's ever going to be fixed.  In fact, it often won't be if the developer is large enough.  Internal politics and bureaucracy almost always get in the way.  So at that point, if the functionality is important, you have to find another solution (a workaround).  And that's what the forums are for really at this level.  They allow exchange of information on bugs and software misbehaviors.  More importantly, they provide workarounds and ideas.  But this particular forum was not serving that purpose and probably never will.  All because of the rampant top dog syndrome.  The results of my attempts to combat it in just my one little post because I really needed someone to take the problem seriously?  I was belittled and attacked.  Some forums are beyond help.

In the past, I've been able to overcome the top dog issue with the approach I tried here.  Generally, repeated appeals to fact and reason result in forcing the top dogs to actually deal with the problem in order to be seen ultimately solving the problem, or at least be part of the confirmation of the problem.  However, that only seems to work if the larger forum community is technical enough to see those facts and reasons for what they are, even if they can't provide an answer.  If the top dog feels the general community is smart enough to see them messing up, they'll try to save their skin.  Film and Video editors working in a generally Macintosh community do not meet that threshold and therefore, the top dogs on that community had no fear of being seen playing ego games when it's clear to a technically inclined individual that there's something wrong going on.  So it didn't work.  And I declared the forum a lost cause.

Professionals vs Amateurs vs Prosumers

These forums tend to be populated by users at varying levels of usage.  Amateurs tend to be looking for training and answers to questions that require a certain level of expertise in order to research oneself.  These users drive professionals mad.  Because they often are asking to be able to do incredibly complex or difficult things without actually studying and training enough to even understand what they're asking.  Add to that, they often belittle the fact that it does take a lot of training and dedication.  They often have a certain level of entitlement to these more difficult techniques but don't feel the answer that it requires time and experience is fair, and it breeds anger.

Prosumers are people who see themselves as professional but are actually unaware what the professional level actually means.  For example, animators who work on projects of 30 to 300 seconds with teams of fewer than 10 people.  They don't comprehend the issues involved with projects of 20 or more minutes with teams of 50 - 500 people.  They think it's just a matter of hiring more people and being organized.  So their responses and approaches to issues are often not scalable and would bring a full scale production to a halt.  But they and their peers don't understand that and therefore are unable to evaluate or comprehend it.  These users make up the majority of the user base.  These types of users are frustrating to Professionals but often not infuriating.  They're frustrating for a number of reasons.  Firstly, because they often spurn the advice of professionals because they don't fully understand it and see it as being overly complex.  Secondly, because the software is usually written for prosumers and not professionals.  The developers often confuse the prosumers for the professionals and cater to them, often creating features that are useless in a professional environment at the expense of professional level features or functionality.  Thirdly, it is the prosumer userbase that professionals recruit from, and it's frustrating to see the prosumer base become accustomed to working in a non-scalable manner, because you know you're going to have to retrain them when you eventually recruit them.  Both they and you would be better off if they'd just listen and try to understand... but well, that won't happen.  So you just let it go and move on.  But the chorus of prosumer voices completely overpowers the professional voice.

What's the solution?  The forum moderators need to try to categorize their forums.  Create subforums.  Create beginners forums.  Create topic forums.  This keeps everyone from getting in everyone else's way.  This is the way XSIBase is organized actually.  And it's a good approach.  You'll find most of the professional level users who are concerned with scalable solutions in the "scripting" and "programming" forums.

WingIDE for Python

Thought I'd just put in a quick shout out for my favorite Python coding tool, ever.  WingIDE from WingWare. WingIDE is by far the best Python coding environment I've ever used.


I know the first question that comes to mind when looking at the price tag:  "With all the free Python IDEs and script editors, why bother buying one?  They're all about the same."  Well, that's mostly true.  Most Python script editors I've used are about the same.  They provide some mediocre code completion and code folding.  Not bad... just not as good as it could be.

For me, it's all about code completion.  Smart code completion.  The kind that reads APIs on the fly, knows what kind of object you're working with, and tells you what is possible with that object.  It's sort of a combination of code completion and an object browser.  Visual Studio is renowned for its ability to do this on the fly.  Most good Python script editors attempt this level of completion but they're confounded by Python's dynamic typing.  I'll give an example.

[code] 

import xml.dom.minidom

def myFunction (doc, element):
    pass

[/code]

So, here's the question.  Since Python is dynamically typed (the opposite of statically typed), when I try to code with the objects doc and element, how is the editor to know what types of objects they are so it can tell me what I can do with them?  It might be able to look at the code that is calling the function, but that's backwards.  A function can be called multiple times from anywhere.  Perhaps with completely different object types.  And it's possible both of those calls could be valid.  The same problem shows up when trying to figure out what type of object a function returned.  There's no rule that a function always has to return the same kind of object.  So how could the system know?

It's at this point that most script editors give up.  Code completion stops working the moment you get outside the scope of objects you create yourself within a single function.

With WingIDE, you can hint the system and you get your code completion back.  All you have to do is put in a particular type of assert statement.  For example:

[code]

import FIE.Constraints

def myFunction(obj, const):
    assert isinstance(const, FIE.Constraints.ParentConstraint)

[/code]

From the assert statement on down, code completion now works again.  There's also an added benefit, in that the script will throw an exception should the assertion fail.  In Python, without the assert check that cuts straight to the heart of the matter, my script could go for another 20 lines working on the wrong type of object before giving me a vague error.
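Since FIE.Constraints is my own library, here is a rough sketch of the same hinting trick using only the standard library, so it can be run anywhere.  The function name count_children is made up for illustration.  One thing to keep in mind: Python strips assert statements when run with the -O flag, so the runtime check only fires in a normal (non-optimized) run.

```python
import xml.dom.minidom


def count_children(doc, element):
    # The asserts tell both the reader and a type-inferring editor
    # what kinds of objects doc and element are expected to be.
    assert isinstance(doc, xml.dom.minidom.Document)
    assert isinstance(element, xml.dom.minidom.Element)
    # From here on, an editor can offer Element attributes such as childNodes.
    return len(element.childNodes)


doc = xml.dom.minidom.parseString("<root><a/><b/></root>")
print(count_children(doc, doc.documentElement))
```

Passing a plain string as element fails right at the assert with an AssertionError, instead of failing later with an AttributeError somewhere deeper in the function.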

WingIDE will also parse source files for documentation and display it for you as you code, eliminating the need to constantly look up the API docs yourself.

Now, I know there's a hardcore base of programmers out there who say all they need is a text editor and be damned with all these fancy IDEs and their crutches.  Well, I simply disagree.  I'm sure if you are a coder who has maybe 2 APIs to work with on a regular basis, perhaps that is all you need.  But in my job, I am required to learn a new API within a few hours and repeat that as much as necessary.  That can sometimes be 2-3 APIs a day.  Do I know the full API?  No.  I know enough to get the job done.  And that's what I'm paid to do.  For that kind of coding (and scripting, I think, lends itself to that kind of coding more than development does) there is no better tool than WingIDE.  Call me a weak coder if you wish.  I'll just keep coding, getting the job done faster and better, and keep getting paid to do it.  I have a job to do.

 

 

 

February 01, 2008

How Mocap Works: Trajectorization and Labeling

So far in the series, we've started in the middle at reconstruction.  Then we took a step back and talked about reflectivity and markers.  Now, we're going to move forward again, into the steps after reconstruction.

This article will be a little different than the previous ones, in that it's more theoretical than practical.  That is to say, it's the theory of how these kinds of things are done, not necessarily how it's done in Arena or in Vicon's IQ.  Both systems are really closed boxes when it comes to a lot of this.  I can say that the theory explained here is the basis for a series of operators in Kinearx, my "in development" mocap software.  And most of the theory is used in some form or another in Arena and IQ as well.  It just may not quite work exactly as I'm describing it.  Also, it's entirely possible I'm overlooking some other techniques.  It would be good if this post spurred some discussion of alternate techniques.

So, to review, the mocap system has triangulated the markers in 3d space for each frame.  However, it has no idea which marker is which.  They are not strung together in time.  Each frame simply contains a bunch of 3d points that are separate from the 3d points in the previous and next frames. I'll term this "raw point cloud data."

Simple Distance Based Trajectorization

Theory:  Each point in a given frame can be compared to each point in the previous frame.  If the current point is closer than a given distance to a point in the previous frame, there's a good chance it's the same marker, just moved a little.

Caveats:  The initial desire here will be to turn up the threshold, so that when the marker is moving, it still registers as being close enough.  The problem is that the distance one would expect markers to be from one another on a medium to small object is close to the distance they would be expected to travel between frames if the object were moved at a medium speed.  It's the same order of magnitude.  Therefore, there's a good chance that it will make mistakes.

Recommendation:  This can be a useful heuristic.  However, the threshold must be kept low.  What will result will be trajectorization of markers that are moving slowly, or are mostly still.  However, movement will very quickly pass over the threshold and keep moving markers from being trajectorized.  This technique could be useful for creating a baseline or starting point.  However, it should probably be ignored if another more reliable heuristic disagrees with it.
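To make the idea concrete, here is a minimal sketch of distance based matching between two frames.  It is not how Arena, IQ or Kinearx do it.  The function name, the greedy nearest-point strategy and the representation of points as (x, y, z) tuples are all my own assumptions for illustration.

```python
import math


def match_by_distance(prev_points, curr_points, threshold):
    """Return {current_index: previous_index} for points closer than threshold."""
    matches = {}
    used_prev = set()
    for ci, cp in enumerate(curr_points):
        best, best_dist = None, threshold
        for pi, pp in enumerate(prev_points):
            if pi in used_prev:
                continue  # each previous point may continue only one trajectory
            d = math.dist(cp, pp)
            if d < best_dist:
                best, best_dist = pi, d
        if best is not None:
            matches[ci] = best
            used_prev.add(best)
    return matches


prev_frame = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
curr_frame = [(10.2, 0.0, 0.0), (0.1, 0.0, 0.0), (50.0, 0.0, 0.0)]
matches = match_by_distance(prev_frame, curr_frame, threshold=1.0)
```

Here the first two current points pick up the trajectories of the nearby previous points, and the third is left unmatched because nothing is within the threshold.  A greedy pass like this is order dependent, which is one more reason to keep the threshold low.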

Trajectorization Based on Velocity

Theory:  When looking at an already trajectorized frame, one can use the velocity of a trajectory to predict the location of a point in the next frame.  Comparing every point in the new frame against the predicted location, using a small distance threshold, should yield a good match.  Since we are tracking real world objects that actually have real world momentum, this should be a valid assumption.  This technique can also be run in reverse.  This technique can be augmented further by measuring acceleration and using it to modify the prediction.

Caveats:  Since there is often a lot of noise involved in raw mocap data, a simple two frame velocity calculation could be WAY off.  A more robust velocity calculation taking multiple samples into consideration can help, but it increases the likelihood that the data samples are from too far back in time to be relevant to the current velocity and acceleration of the marker (by now, maybe the muscle has engaged and is pushing the marker in a different direction entirely).  An elastic collision will totally throw this algorithm off.  Since the orientation of the surfaces that are colliding is unknown to the system, it's not realistic for it to be able to predict direction.  And since most collisions are only partially elastic, the distance cannot be predicted.  Therefore, an elastic collision will almost always result in a break of the trajectory.

Recommendation:  This heuristic is way more trustworthy than the simple distance calculation.  The threshold can be left much lower and should be an order of magnitude smaller than the velocity of a moving marker.  It can also be run multiple times with different velocity calculations and thresholds.  The results should be biased appropriately, but in general, confidence in this technique should be high.
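Here is a rough sketch of the simplest version of this: a two frame, constant velocity prediction followed by a nearest-point check.  The function names and the tuple representation are assumptions of mine, and a real implementation would want the more robust multi-sample velocity estimate discussed in the caveats.

```python
import math


def predict_next(trajectory):
    """Predict the next position from the last two points (constant velocity)."""
    (x0, y0, z0), (x1, y1, z1) = trajectory[-2], trajectory[-1]
    return (2 * x1 - x0, 2 * y1 - y0, 2 * z1 - z0)


def extend_trajectory(trajectory, candidates, threshold):
    """Append the candidate nearest the prediction if it is within threshold.

    Returns True if the trajectory was extended, False if it breaks here.
    """
    predicted = predict_next(trajectory)
    best = min(candidates, key=lambda p: math.dist(p, predicted), default=None)
    if best is not None and math.dist(best, predicted) < threshold:
        trajectory.append(best)
        return True
    return False


moving = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
extended = extend_trajectory(moving, [(2.05, 0.0, 0.0), (1.0, 0.0, 0.0)], 0.2)
```

Note that the marker moved a full unit per frame, yet a threshold of 0.2 was enough, because the comparison is against the predicted location rather than the previous one.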

Manual Trajectorization

Theory:  You, the human, can do the work yourself.  You are trustworthy.  And it's your own fault if you're not.

Caveats:  Who has time to click click click every point in every frame to do this?

Recommendation:  Manual trajectorization should be reserved for extremely difficult small sections of mocap, and for sparse seeding of the data with factual information.  Confidence in a manual trajectory should be extremely high however.

Labeling enforces Trajectorization

Theory:  If the labeling of two points says they're the same label, then they should be part of the same trajectory.

Caveats:  Better hope that labeling is right.

Recommendation:  We're about to get into labeling in a bit.  So you might think of this as a bit of a circular argument.  The points are not labeled yet.  And they're trajectorized before we get to labeling.  So it's too late, right?  Or too early?  Not necessarily.  I can only really speak for Kinearx here, not Arena or IQ.  However, Kinearx will approach the labeling and trajectorization problems in parallel.  So in a robust pipeline, there will be labeling data and trajectorization data available.  The deeper into the pipeline, the more data will be available.  So, assuming you limit a trajectorization decision to labeling data that is highly trusted, this technique can also be highly trusted.

Trajectorization enforces Labeling

Theory:  If a string of points in time is trajectorized, and one of those points is labeled, all the points in the trajectory can be labeled the same.

Caveats: Better hope that trajectorization is right.

Recommendation:  Similar to the previous technique, this one is based on execution order.  IQ uses this very clearly.  You can see it operate when you start manually labeling trajectories.  The degree to which Arena uses it is unknown, but I suspect it's in there.  Kinearx will make this part of its parallel solving system.  It will also likely split trajectories based on labeling, if conflicting labels exist on a single trajectory.  I prefer to rely on this quite a bit.  I prefer to spot label the data with highly trusted labeling techniques, erring on the side of not labeling if you're not sure, and have this technique fill in the blanks.
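As a minimal sketch of the "fill in the blanks" step, assume a trajectory is stored as a list with one entry per frame, holding either a label string or None.  That representation and the function name are hypothetical, not taken from IQ, Arena or Kinearx.

```python
def propagate_labels(trajectory_labels):
    """Fill in missing labels along one trajectory.

    Returns the filled list, or None if the existing labels conflict,
    which is a hint that the trajectory should be split instead.
    """
    known = {label for label in trajectory_labels if label is not None}
    if not known:
        return list(trajectory_labels)  # nothing to propagate yet
    if len(known) > 1:
        return None  # conflicting labels on a single trajectory
    (label,) = known
    return [label] * len(trajectory_labels)


filled = propagate_labels([None, "LWRIST", None, None])
conflict = propagate_labels(["LWRIST", None, "RWRIST"])
```

A single spot label is enough to label the whole trajectory, while two different labels on the same trajectory are reported rather than silently resolved.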

Manual Labeling

Theory:  You, the human, can do the work yourself.  You are trustworthy.  And it's your own fault if you're not.

Caveats:  Who has time to click click click every point in every frame to do this?

Recommendation:  Manual labeling should be reserved for extremely difficult sections of mocap, and for sparse seeding of the data with factual information.  Confidence in a manual label should be extremely high however.  When I use IQ, I take an iterative approach to the process and have the system do an automatic labeling pass, to see where it's having trouble on its own.  I then step back to before the automatic labeling pass and seed the trouble areas with some manual labeling.  Then I save and set off the automatic labeling again.  Iterating this process, adding more manual labeling data, eventually results in a mostly correct solve.  Kinearx will make sure to allow a similar workflow, as I've found it to be the most reliable to date.

Simple Rigid Body Distance Based Labeling

Theory:  If you know a certain number of markers to move together because they are attached to the same object, you can inform the system of that fact.  It can measure their distances from one another (calibrate the rigid body) and then use that information to identify them on subsequent frames.

Caveats:  Isosceles triangles and equilateral triangles cause issues here.  There is a lot of inaccuracy and noise involved in optical mocap and therefore, the distances between markers will vary to a point.  When it comes to the human body, there is a lot of give and stretch.  Even though you might want to treat the forearm as a single rigid body, the fact is, it twists along its length and markers spread out over the forearm will move relative to one another.

Recommendation:  This is still the single best hope for automatic marker recognition.  When putting markers on objects, it's important to play to the strengths and weaknesses of this technique.  So, make sure you vary the distances between markers.  Avoid making equilateral and isosceles triangles with your markers.  Always look for a scalene triangle setup.  When markering similar or identical objects, make sure to vary the marker locations so they can be individually identified by the system (this includes left and right sides of the human body).  If this is difficult, consider adding an additional superfluous marker on the objects in a different location on each, simply for identification purposes.  On deforming objects (such as the human body), try to keep the markers in an area with less deformation (closer to bone and farther from flesh).  Make good use of slack factors to forgive deformation and inaccuracy.  Know the resolution of your volume.  Don't place markers so close that your volume resolution will get in the way of an accurate identification.
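The sketch below shows one brute force way to do the calibrate-then-identify step, and why scalene layouts matter.  The function names, the slack parameter and the try-every-permutation search are assumptions of mine for illustration.  A real solver would prune the search rather than enumerate it.

```python
import itertools
import math


def calibrate(labeled_points):
    """Record the distance between every pair of labeled markers."""
    return {
        (a, b): math.dist(labeled_points[a], labeled_points[b])
        for a, b in itertools.combinations(sorted(labeled_points), 2)
    }


def label_rigid(distances, points, slack):
    """Return every {label: point} assignment whose distances fit within slack."""
    labels = sorted({name for pair in distances for name in pair})
    fits = []
    for candidate in itertools.permutations(points, len(labels)):
        assignment = dict(zip(labels, candidate))
        if all(
            abs(math.dist(assignment[a], assignment[b]) - d) <= slack
            for (a, b), d in distances.items()
        ):
            fits.append(assignment)
    return fits


# A scalene (3-4-5) layout, seen later in a different place and point order.
scalene = {"A": (0, 0, 0), "B": (3, 0, 0), "C": (0, 4, 0)}
scalene_fits = label_rigid(
    calibrate(scalene), [(10, 4, 0), (10, 0, 0), (13, 0, 0)], slack=0.05
)

# An isosceles layout: A and B can be swapped without changing any distance.
isosceles = {"A": (0, 0, 0), "B": (2, 0, 0), "C": (1, 3, 0)}
isosceles_fits = label_rigid(
    calibrate(isosceles), list(isosceles.values()), slack=0.05
)
```

The scalene layout produces exactly one fit, while the isosceles one produces two, which is the ambiguity the recommendation above is trying to avoid.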

Articulated Rigid Body Distance and Range of Motion Based Labeling

Theory:  This is an expansion of the previous technique, to include the concept of connected, jointed or articulated rigid body systems.  If two rigids are connected by a joint (humerus to radius in a human arm for example) the joint location can be considered an extra temporary marker for distance based identification on either rigid.  Therefore, if one rigid is labeled enough to find the location of the joint, the joint can be used to help label the other rigid.  Furthermore, information regarding the range of motion of the joint can help cull misidentifications.

Caveats:  It's possible that the limits on a joint's rotation could be too restrictive compared with the reality of the subject, and cull valid labels.

Recommendation:  This is perhaps the most powerful technique of all.  It's nonlinear and therefore somewhat recursive in nature.  However, most importantly, it has a concept of structure and pose and therefore can be a lot more intelligent about what it's doing than other more generic methods.  It won't help you track a bunch of marbles or a swarm of ants, but anything that can be abstracted to an articulated jointed system (most things you'd want to mocap) is greatly assisted by this technique.  You can also go so far as to check the pose of the system from previous frames against the current solution to throw out labeling that would create too much discontinuity from frame to frame.
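A heavily simplified sketch of the two checks follows: treat the joint as an extra marker for a distance test, then cull candidates that would bend the joint past its limit.  Everything here is assumed for illustration: the function names, a single calibrated joint-to-marker distance, and a range of motion reduced to one maximum angle between the parent bone direction and the joint-to-marker direction.

```python
import math


def joint_angle(parent_dir, child_dir):
    """Angle in degrees between two bone direction vectors."""
    dot = sum(a * b for a, b in zip(parent_dir, child_dir))
    norm = math.hypot(*parent_dir) * math.hypot(*child_dir)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def cull_by_joint(joint, parent_dir, points, joint_distance, slack, max_degrees):
    """Keep points at the calibrated distance from the joint and within range."""
    kept = []
    for p in points:
        child_dir = tuple(c - j for c, j in zip(p, joint))
        if abs(math.hypot(*child_dir) - joint_distance) > slack:
            continue  # wrong distance from the joint
        if joint_angle(parent_dir, child_dir) > max_degrees:
            continue  # would bend the joint past its limit
        kept.append(p)
    return kept


elbow = (0.0, 0.0, 0.0)
upper_arm_dir = (1.0, 0.0, 0.0)  # direction from shoulder to elbow
frame_points = [(0.25, 0.0, 0.0), (0.0, 0.6, 0.0), (-0.25, 0.0, 0.0)]
wrist_candidates = cull_by_joint(
    elbow, upper_arm_dir, frame_points,
    joint_distance=0.25, slack=0.01, max_degrees=150.0,
)
```

Two of the three points sit at the right distance from the elbow, but one of them would fold the forearm straight back through the upper arm, so only one candidate survives.  Setting max_degrees too tight is exactly the failure described in the caveats.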

Conclusion

These techniques get you what you need to trajectorize and label your data.  However, there are plenty of places to go from here.  These steps serve multiple purposes.  They'll be executed for realtime feedback.  They'll be the first steps in a cleanup process.  They may be used and their results exported to a 3rd party app such as MotionBuilder.  Later steps may include:

  • more cleanup
  • export
  • tracking of skeletons and rigids
  • retargeting
  • motion editing

IQ, Arena, Blade and Kinearx may or may not support all of those paths.  For example, currently, Arena will allow more cleanup.  It will track skeletons and rigids.  It will stream data into MotionBuilder.  It will export data to MotionBuilder.  It will not retarget.  It will not get into motion editing.  MotionBuilder can retarget and motion edit, and it also has some cleanup functionality.  IQ will allow more cleanup, export and tracking.  It does not perform retargeting or motion editing.  Blade supports all of this.  Kinearx will likely support some retargeting but will stay clear of too much motion editing in favor of a separate product that will be integrated into an animator's favorite 3d package (Maya or XSI for example).

The next topic will likely be tracking of skeletons and rigids.  You might notice that we've kind of gotten into this a bit with the labeling of articulated rigid systems.  And you'd be correct in making that identification.  A lot of code would be shared between the labeler and the tracker.  However, what's best for labeling may not be best for tracking.  So the implementation is usually different at a higher level because the goals are different.

January 03, 2008

How Mocap Works: Markers and Retroreflectivity

The NaturalPoint cameras, as well as your typical Vicon and Motion Analysis systems, are what are known as Optical Motion Capture Systems.  More specifically, in their more common configuration, they're Retroreflective Optical Motion Capture Systems.  They can also be configured as active marker systems.  It's just less common.

Diffuse Bounce, Reflectivity and Retroreflectivity

Wikipedia has a page on these different types of reflected light (doesn't it always?).  However, it's a bit dense.  I'll summarize and provide context.

There are plenty of potential light sources in your mocap space.  Light can come through a window.  It can come from light bulbs.  It can come from the LED ring around the lens of your cameras.  When light hits the surface of an object, you tend to think about it as a whole bunch of individual rays generally coming from the same direction if it comes from a single light source, and generally having the same angle (orientation).  Anyhow, when the light strikes the surface, lots of different things happen to it.  For example, some of the light can be absorbed.  The resulting energy needs to go somewhere and can become heat, light, electricity etc.  This is how most pigments work.  Most of the light is usually not absorbed however.  It's either reflected or refracted.  A simplified explanation of refracted light is that it passes through the object, like say, glass.  Reflected light however, is what we're more concerned with.

Simple reflection, or specular reflection, is what you find in a mirror.  The light ray bounces off a surface as per the law of reflection (the angle of reflection equals the angle of incidence).  More important than any one ray following the law of reflection, in a material that has high specularity, most if not all the rays follow the law and end up having a similar angle after being reflected.  Hence an image as seen in a highly specular material maintains its general appearance.  It doesn't blur or distort beyond recognition.  This is true of a mirror as an extreme example.  It's also true of say, car paint.  You can see things reflected in car paint and as such, it can be said that a significant number of light rays hitting car paint exhibit a tight specular reflection.  Or you could say car paint has high specularity (not as high as a mirror).
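For the 3d minded, the law of reflection has a standard vector form: the reflected direction is r = d - 2(d·n)n, where d is the incoming direction and n is the unit surface normal.  A minimal sketch, with a function name of my own choosing:

```python
def reflect(direction, normal):
    """Mirror-reflect a ray direction about a unit-length surface normal."""
    dot = sum(d * n for d, n in zip(direction, normal))
    return tuple(d - 2 * dot * n for d, n in zip(direction, normal))


# A ray heading down and to the right hits a floor whose normal points up.
bounced = reflect((1, -1, 0), (0, 1, 0))
```

The ray keeps its sideways motion and flips its vertical motion, which is the equal-angle bounce you see in a mirror.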


Diffuse bounce light is another form of reflection.  Diffuse bounce light is the light that you see when looking at a matte object, such as say, concrete or paper.  In the case of diffuse light, the incoming rays still respect the law of reflection.  However, the material is rough enough that it's highly faceted at a microscopic level.  That is to say, at any given point on the surface, its orientation or surface normal is somewhat random.  So while individual rays reflect, as a whole, they scatter all over the place because the material doesn't exhibit a single smooth uniform surface for all the rays to bounce the same direction off of.  The appearance and general characteristics of such a surface can generally be predicted through Lambert's cosine law.  Hence why, in 3d animation, we're often applying "Lambert" shaders to objects for their diffuse component.  Diffuse bounce light makes up the majority of light you see when looking at objects in our world.  Anything that's sorta matte finish is putting out a lot more diffuse bounced light than other types of light.

Retroreflected light is light that manages to reflect directly back at the light source.  Retroreflection doesn't usually happen naturally all that much.  However, it is incredibly useful for optical motion capture and safety.  "Reflective" paint on the road at night, and road signs, are examples of man made retroreflective materials used for safety.  Also, those strips of "reflective" material you put on Halloween costumes are good examples.  Notice these materials are marketed as "reflective" when in reality it's not their simple reflective characteristics that make them desirable.  It's their retroreflective characteristics, a subset of reflectivity, that make them work.  Marketing often isn't concerned with being precise.  Technically, a roll of masking tape is reflective tape.  It's just mostly diffuse reflection is all.  And it probably won't alert anyone driving a car as to its presence.
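Lambert's cosine law says the diffuse brightness of a surface is proportional to the cosine of the angle between the surface normal and the direction to the light.  Here is a bare-bones sketch of what a "Lambert" shader computes at its core, assuming unit-length vectors and ignoring color and light intensity.  The function name is mine.

```python
import math


def lambert(normal, to_light):
    """Diffuse term for unit vectors: cosine of the angle, clamped at zero."""
    return max(0.0, sum(n * l for n, l in zip(normal, to_light)))


up = (0.0, 0.0, 1.0)
overhead = lambert(up, (0.0, 0.0, 1.0))
angle = math.radians(60)
tilted = lambert(up, (math.sin(angle), 0.0, math.cos(angle)))
behind = lambert(up, (0.0, 0.0, -1.0))
```

A light directly overhead gives full brightness, a light 60 degrees off the normal gives about half, and a light behind the surface contributes nothing.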

What does this have to do with Mocap?

So, how do we use this knowledge to get our mocap cameras to see markers and nothing else?  Hence making the task of tracking those markers easier?  Well, it's generally a matter of contrast.  If you can make your markers brighter than anything else in the frame, you can adjust your exposure and threshold the image to knock everything else out of contention, leaving you with a mostly black image, with little gray and white dots that are your markers.

It's probably worth noting that this is not the only way to accomplish the task of tracking markers.  Another approach would be pattern recognition.  A system based on pattern recognition would probably count as an optical mocap system but doesn't fall into the historical category of an optical system as used in the entertainment industry.
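Thresholding itself is about as simple as image processing gets.  As a toy sketch, assume a grayscale frame stored as a list of rows of 0-255 pixel values.  The function name and the cutoff value are made up, and real cameras do this in hardware or firmware rather than in Python.

```python
def threshold(image, cutoff):
    """Zero out every pixel darker than cutoff in a grayscale image."""
    return [[px if px >= cutoff else 0 for px in row] for row in image]


frame = [
    [12, 30, 18],
    [25, 240, 33],
    [10, 28, 200],
]
markers_only = threshold(frame, cutoff=128)
```

Only the two bright pixels survive.  The whole game in setting up a volume is making sure the only things brighter than the cutoff are your markers.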

Anyhow, back to contrast.  The task of making your markers brighter than everything else.  Simple specular reflectivity makes some pretty bright highlights.  You could theoretically conceive of a scenario where you know where your light source is and if you catch a reflection in a marker in a camera, you could solve for the marker.  In reality though, this isn't useful.  It's rare that you'll catch a reflection of a light source in a camera.  You'd need way too many light sources to make it common enough to use.  It's possible you could take this to an extreme and set up a colored dome and then use the color of the dome reflected in a marker to track the ray back to its source location, but again, this is speculative and the kind of setup you'd need is expensive and quite disruptive to the shooting environment.  Remember, one of the goals of viable mocap systems is to be able to be used in parallel with principal photography on a movie set.

Diffuse light is potentially useful.  However, fact of the matter is, most things are fairly diffuse.  Things that are white, or light gray are highly diffuse.  A diffuse object can only put out as much light as it takes in.  It's not possible to be SO much more efficient than a white piece of paper.  So instead, approaches to using diffuse light to generate contrast go the other direction.  You try to make everything in your environment matte black (full absorption, no diffuse bounce).  That way, your markers show up bright by contrast.  Again, this solution isn't ideal.  The room, the cameras, the people, everything but the markers must be matte black to get contrast this way.

As you might imagine, the solution here is retroreflection.  Again, retroreflection is light that reflects back at the light source.  So it's super bright like specular reflection, but unlike specular reflection, it's easy to pick up.  You know exactly where it's going, right back to the source.  All you need to do is make sure your light source is also your camera lens (or close enough).  This is of course why NaturalPoint cameras, and optical mocap cameras in general, tend to have LED rings around the lens.  NP camera LEDs show up a dull pink when they're active but don't let this fool you.  They are actually putting out a ton of light.  It's just infrared... about 850 nanometers in wavelength.  According to Jim Richardson, the CMOS sensors in the cameras are actually more responsive to visible light than IR.  However, IR light is usually used in mocap because a) we can't see it, so it doesn't distract us, and b) motion picture film and video cameras already filter it out because they are mimicking our own visual response.  This way, the mocap system's lighting doesn't interfere with human vision based imaging.

Markers

If you've got your light source and camera all set up to pick up retroreflective light, then all that's left to do is make sure your marker actually is retroreflective.  There are typically two ways this is done by contemporary humans.

Firstly, we can use "corner reflectors."  An example of a corner reflector is a bicycle reflector.  Corner reflectors are made by butting three mirrors together at right angles.  A bicycle reflector often has hundreds of little mirrors set up in triplets in this manner.  Believe it or not, this does actually work.  I have to cover up my bicycle all the time when I use cameras in my apartment.  I have looked into getting a bunch of small 1" bicycle reflectors to use as markers and in some situations, they may actually be useful.  Though, there are better solutions.
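If the three-mirror trick seems like magic, the vector math shows why it works.  Each mirror flips one component of the ray's direction, so three mutually perpendicular mirrors flip all three and send the ray back the way it came.  This sketch assumes ideal mirrors aligned with the x, y and z axes and a ray that hits all three faces.  The function names are mine.

```python
def reflect(direction, normal):
    """Mirror-reflect a ray direction about a unit-length surface normal."""
    dot = sum(d * n for d, n in zip(direction, normal))
    return tuple(d - 2 * dot * n for d, n in zip(direction, normal))


def corner_reflect(direction):
    """Bounce a ray off three mutually perpendicular mirrors in turn."""
    for normal in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
        direction = reflect(direction, normal)
    return direction


incoming = (-1, -2, -3)
outgoing = corner_reflect(incoming)
```

Whatever direction goes in, the exact opposite direction comes out, which is all a retroreflector is.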

The second retroreflective material is what's known as 3M Scotchlite.  Pretty much any retroreflective material you can think of besides corner reflectors comes back to 3M and Scotchlite.  Even those reflective paints on the road are made with materials bought from 3M.  I have a can of "reflective" spray paint from Rust-Oleum.  They bought their materials from 3M.  Scotchlite is based on glass beads and can be bought in many forms, from raw beads (sand like) to textiles to tapes to paints.  Scotchlite comes in different grades and colors.  Generally though, the best retroreflectivity comes from Scotchlite products in which the beads have been bonded to a material by 3M, rather than bonding done by other parties.  So, buying 3M tape or textile is your best bet for mocap.  The material that NaturalPoint sells in their own store is actually the highest quality material I've come across.  Markers built from that material perform better than some of the "hard" markers in their store, that clearly had the material sprayed on by a 3rd party.

Emissive Markers

You may have noticed that to this point, we've been talking about generating contrast on materials that are bouncing light from a separate light source.  However, it's possible that a marker could emit its own light.  Generally, these types of markers are known as active markers.  I have actually constructed active markers in the past and will probably do so again within the year.  NaturalPoint actually sells wide throw 850nm LEDs in their store for this kind of application.  Mocap systems by PhaseSpace also work off of active LED markers.  Active markers have benefits and drawbacks.  They often put out a lot more light than a retroreflective marker will and therefore are really easy to track.  They are however, expensive, and they do require mounting electronics on your mocap talent.  This can be problematic in some cases.  In some cases, they heat up quite a bit, though this problem can be designed away.

Hopefully some of this has helped give an understanding of what is going on in your mocap volume.  You can use this information to help get better quality captures.  Throwing your cameras into grayscale mode and looking at the environment as the camera sees it will let you see these concepts in action.  It should also give you a better idea of how to go about optimizing your mocap environment and exposure settings for capture.