Media Player Framework Guidelines

The following recommendations are based on my experience of creating two media player implementations. The first was based on the MediaElement in Silverlight and provided out-of-the-box integration with SilverHD DRM, as well as some smart transport channels and performance tweaks. The other was based on GStreamer in embedded software and featured tight coupling with the HW decoder and support for 19 different pipelines. Surprisingly, there are design commonalities between these media players.

1. If your player is going to support playlists, do not define the playlist format. Typically, playlists come from external sources, and you cannot change their format. It is better to write an ad-hoc playlist parser for each playlist type needed and provide the parsed playlist to the media player in the form of an object tree. In other words, you want to exclude playlist parsing from the media player framework, because playlist formats are often so weird and non-standard that there is no hope the playlist parsing code will ever be reusable.

2. Playlists consist of playlist items (if your player does not support playlists, you can still think of it as playing a single playlist item). Do not expect a playlist item to be just a URL or file path. In general, it is not possible to reliably detect the format of the media, the transport protocol, or many other parameters just by parsing a URL. Create a full-featured object describing the playlist item, for example with the following properties:

  • Transport (file, progressive download, RTSP, Smooth Streaming, HLS, HTTP streaming, etc.)
  • Container format (MP4, ASF, fMP4, AVI, etc)
  • Video codec
  • Audio codec(s) per audio stream
  • DRM settings
  • Additional parameters needed for your player to work, e.g. the MPEG2 TS program number or a live streaming flag.

3. It seems to be a cleaner design to separate the player state from the playlist item, but surprisingly I got better source code by combining them. Therefore, I have added the following fields to those mentioned above:

  • Play state
  • Play position
  • Media duration as detected by the player
  • Media bytesize as detected by the player
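As a sketch, the combined playlist-item object from points 2 and 3 might look like the following. Python is used here purely for illustration, and all names and field choices are hypothetical, not from any real framework:

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Optional

class Transport(Enum):
    FILE = auto()
    PROGRESSIVE = auto()
    RTSP = auto()
    SMOOTH_STREAMING = auto()
    HLS = auto()
    HTTP_STREAMING = auto()

class PlayState(Enum):
    STOPPED = auto()
    PLAYING = auto()
    PAUSED = auto()
    FINISHED = auto()

@dataclass
class PlaylistItem:
    # static media description (point 2)
    uri: str
    transport: Transport
    container: str                                    # e.g. "MP4", "ASF", "fMP4"
    video_codec: Optional[str] = None
    audio_codecs: list = field(default_factory=list)  # one entry per audio stream
    drm_settings: Optional[dict] = None
    extra: dict = field(default_factory=dict)         # e.g. {"ts_program": 3, "live": True}
    # runtime state detected by the player (point 3)
    play_state: PlayState = PlayState.STOPPED
    position_sec: float = 0.0
    duration_sec: Optional[float] = None              # None until detected by the player
    byte_size: Optional[int] = None                   # None until detected by the player
```

Note that the transport and container are passed in by whoever parsed the playlist; the runtime fields start out unknown and are filled in by the player as playback proceeds.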

4. Speaking of the play state, two questions are the most important. What is the difference between STOPPED and PAUSED? The difference is that at STOPPED you must destroy the playing pipeline, release all the memory, and reset the DRM state – basically, revert to the state before playback. Do we need a separate FINISHED state? Yes, we do: many apps rely on the player’s ability to detect that the media item has been watched fully (and not stopped by user interaction).
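The STOPPED-vs-FINISHED distinction can be sketched like so (hypothetical names; the actual teardown details depend on your pipeline):

```python
from enum import Enum, auto

class PlayState(Enum):
    STOPPED = auto()   # pipeline destroyed, memory released, DRM state reset
    PLAYING = auto()
    PAUSED = auto()    # pipeline alive, position kept
    FINISHED = auto()  # media fully watched; teardown just like STOPPED

class Player:
    def __init__(self):
        self.state = PlayState.STOPPED

    def _teardown(self):
        # destroy the pipeline, release memory, reset DRM state --
        # i.e. revert to the state before playback started
        pass

    def stop(self):
        # user-initiated: the app cannot assume the item was watched fully
        self._teardown()
        self.state = PlayState.STOPPED

    def on_end_of_stream(self):
        # pipeline-initiated: the item *was* watched fully, and apps rely
        # on being able to tell this apart from a user-initiated stop
        self._teardown()
        self.state = PlayState.FINISHED
```

Both paths run the same teardown; the only difference is the final state the app observes.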

5. Play position – I inevitably end up expressing it both in time units and in percent. There are situations when you can only seek in time units, but not in percent. There are other situations when you can seek only in percent, but not in time units. And most of the time, when displaying the play position as time, you also need to know it in percent.
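A tiny helper that keeps both representations in sync (again, an illustrative sketch, not from any real player) could be:

```python
class PlayPosition:
    """Play position kept both in seconds and in percent of duration."""

    def __init__(self, seconds=0.0, duration=None):
        self.seconds = seconds
        self.duration = duration   # None while the duration is still unknown

    @property
    def percent(self):
        # percent is only meaningful once a non-zero duration is known
        if not self.duration:
            return None
        return 100.0 * self.seconds / self.duration
```

For example, `PlayPosition(30.0, 120.0).percent` yields `25.0`, while a position with an unknown duration reports `None` so the UI can fall back to time-only display.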

6. Another thing I inevitably end up with is creating a repeated timer that fetches the current play position, say, 10 times per second, and fires all corresponding events to allow the UI to update itself. You might be wondering why this separate thread is needed at all, bearing in mind that there is a separate decoding thread anyway, and it knows exactly the PTS of the video frame it is going to display, so it could check whether the seconds portion of the new PTS differs from that of the previous PTS and fire all necessary events (preferably by posting them onto the main loop of the UI thread). But no, the reality is different.

7. Media duration could, in theory, also be detected by the player synchronously. After all, most formats store it near the beginning of the file, and the demuxer has to parse that header anyway, so nothing would prevent it from posting a corresponding event to the UI thread. Again, the reality is different, so I always end up implementing it in the same repeated timer routine I mentioned in the previous point – i.e. checking a flag for whether the duration has been determined, and if not, querying the pipeline.
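The repeated timer from points 6 and 7 boils down to something like this. This is a sketch: `pipeline` is a hypothetical object offering `query_position()`/`query_duration()` methods, and in a real player the callbacks would be posted onto the UI thread's main loop rather than invoked directly:

```python
import threading

class PositionPoller:
    """Polls the pipeline ~10 times per second, firing events for
    position changes and (once) for the detected duration."""

    def __init__(self, pipeline, on_position, on_duration, interval=0.1):
        self.pipeline = pipeline        # assumed to offer query_position()/query_duration()
        self.on_position = on_position  # in practice: post these onto the UI main loop
        self.on_duration = on_duration
        self.interval = interval
        self._duration_known = False
        self._stop = threading.Event()

    def _tick(self):
        # wait() returns False on timeout (normal tick) and True once stop() is called
        while not self._stop.wait(self.interval):
            self.on_position(self.pipeline.query_position())
            if not self._duration_known:
                duration = self.pipeline.query_duration()
                if duration is not None:
                    self._duration_known = True
                    self.on_duration(duration)

    def start(self):
        threading.Thread(target=self._tick, daemon=True).start()

    def stop(self):
        self._stop.set()
```

The duration flag ensures the duration event fires exactly once, no matter how long the pipeline takes to determine it.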

8. Another thing that might be controversial from a clean-design point of view, but worked for me, is the handling of pipeline asynchronicity. You can’t change the play state synchronously. Therefore, you can’t issue a command to the pipeline to change its state and then immediately write the new state to your playlist item – for a couple of milliseconds (or even a couple of seconds when seeking), this state would be wrong, which leads to unpleasant race conditions. Following a theoretically clean design, for each call of the pause(), play() or seek() method of your player you would create a job object describing the change, add it to a queue, and have another thread execute the queue, waiting for every job to finish before starting the next one. This is complex. What worked for me is cloning the playlist item object. Basically, the player always keeps two copies of the playlist item object: the actual state and the desired state. When the player is instantiated and a playlist item is passed to it to be played, it sets that item as the desired state, sets a copy of it as the actual state (resetting the play state to STOPPED), and then initiates playback. All subsequent events coming from the pipeline (current play position, play state, duration, etc.) are applied to the actual state. When the user calls pause(), the play state of the desired playlist item object is set to PAUSED immediately, and then pausing is initiated. When the pipeline actually pauses, the play state of the actual playlist item is updated.
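The desired/actual scheme can be sketched as follows (illustrative names only; the real pipeline command would be asynchronous):

```python
import copy
from enum import Enum, auto

class PlayState(Enum):
    STOPPED = auto()
    PLAYING = auto()
    PAUSED = auto()

class PlaylistItem:
    def __init__(self, uri):
        self.uri = uri
        self.play_state = PlayState.STOPPED
        self.position_sec = 0.0

class Player:
    def __init__(self, item):
        self.desired = item                  # what the user most recently asked for
        self.actual = copy.deepcopy(item)    # what the pipeline has actually reported
        self.actual.play_state = PlayState.STOPPED

    def pause(self):
        # the desired state flips immediately; no job queue needed
        self.desired.play_state = PlayState.PAUSED
        self._pipeline_pause()               # asynchronous; completes later

    def on_pipeline_state_changed(self, state):
        # only events coming back from the pipeline touch the actual state
        self.actual.play_state = state

    def _pipeline_pause(self):
        pass  # issue the asynchronous pause command to the real pipeline here
```

The UI can then read the desired state for things like the play/pause button toggle (which should react instantly) and the actual state for anything that must reflect reality, such as the current position.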

9. Generally, you should expect to be forced to write a lot of workarounds. I don’t quite understand why this is always needed, but it is just a fact of life. In Silverlight, you would get some wrong play states reported, you wouldn’t get some events you’d expect to get, you could easily overload the pipeline, and you could not do a seek and wait on an HTTP response at the same time (a deadlock). With GStreamer, you wouldn’t get NEW SEGMENT events after a seek, it would report an error when seeking in FLV but still perform the seek, and it wouldn’t post the state change from READY to NULL onto the bus.

10. Never ever think you can correlate a byte offset in the file with its time position. You cannot divide the byte offset by the bitrate to get the time. There is no codec (short of turning compression off) capable of holding an exact bitrate. There are VBR encodings. There are encodings with unused streams or ID tags embedded in them (ever seen an MP3 file with 1 MB of cover image embedded in it?). This is especially true if your player is used to play long content, and not just clips of a couple of seconds.

11. You will need a lot of test content in various formats, quality levels, bitrates and DRM protection levels. And you will need this test content to be stored locally, on a web server, and on a DLNA server or any other server that streams in the protocols you’re going to support. Preparing this test suite and configuring all the needed software is a huge amount of work, so you probably want to find somebody (an intern, a tester, a product manager, an admin, etc.) who would be willing to manage it for you.

12. I was always dreaming of creating the YouTube-like “immediate drag-and-drop seek” user experience, but it never came to be. Maybe the media platforms I was using really were that limited. Or maybe I just didn’t try hard enough. In any case, do not expect your pipeline to handle it for you automatically.


Anti-Pirate

Hello, freedom fighter and member of the pirate party. Welcome to my blog. Let’s team up and make a movie. Better yet: let’s make a great movie and make a lot of profit out of it. It will be a world hit. Let’s say we call it “Harry Potter and the Order of the Phoenix” and base it on the corresponding volume of the Harry Potter saga. Yeah, this should surely work. I mean, there are 400 million fans out there, and if they went to see the previous four movies, they will also go to see the fifth.

We will be rich and famous!

So we quickly check with Ms. Rowling, check with the director of the previous movies (who was that guy? I dunno), check with each of the actors, and make a 100% correct estimation of the movie costs, which happens to be $315 mln. This sounds like way too much for us. I mean, hello, that’s $40,000 per movie second! But we take it easy at first. Let’s be good and allow those cool guys to profit from our movie too.

But we don’t have this money.

So we go to a bank and ask for a loan. I mean, this will surely be a world hit, won’t it? But the bank laughs and tells us they need some security. Way too many movies flop out there.

So we have to decide on financing. We try to follow the typical Silicon Valley way and pitch the idea to investors. Unfortunately, we don’t have any talent in our founding team, as all the real talents (including the writer and the actors) are somehow not really willing to be co-founders, but rather want to be paid, and paid well. We also can’t show any mockup or dummy version of our movie. You see, when you create a new app by touching your iPod while sitting in Starbucks, with very little investment of your time you can create the important parts, like the UX, and mock the unimportant parts, like any backend logic. With movies it is not the same: they don’t have any backend logic, and there is no difference between mocking their experience and creating an 80%-ready movie.

Well, we have to be creative now. What if we put the title on Amazon and allow fans to pre-order it? Let’s say each will pay $20 in advance. After 10–15 mln pre-orders we can already start filming… But after some consideration, we reject this idea. First, it is unrealistic to attract this number of pre-orders without huge publicity, and that needs money we don’t have. Second, relying only on word-of-mouth and Harry Potter fan forums, we might get only 10,000 or maybe 100,000 pre-orders. Besides, even if we were phenomenally successful in our viral campaign, all those people would have to wait several years until we had 10 mln pre-orders, and then yet another year or two for filming and post-production. That’s very unlikely to work. Third, we would immediately get a lot of mail from our potential backers telling us that they are ready to pay $10 in advance, but only under the promise that we will release the movie into the public domain, i.e. without DRM and under a CC license. Thus we realize that we cannot rely on any after-release sales whatsoever and have to gather all the money (including our profit – remember, we wanted to get rich?) up-front, before starting to film. This increases the required number of pre-orders to 20 mln or so. Fourth, what if we ultimately can’t make the movie? We would need to return all the money collected so far, and that’s a huge cost (just sitting in our office and wiring the money back via PayPal would take 40 man-years per mln of pre-orders).

Well, what if we cut our costs? We don’t need the plot from Rowling, because there is a lot of open-source fan fiction on the Internet. We can use students from our local acting academy as actors, and replace the special effects with some painted paper and glue. But then we do some quick market research and realize that by factoring out the key success factors of the movie, we will also lose most of our paying customers. Basically, around 5,000 fans will watch our movie, and forty of them will even pay for it. $10 each, with a CC and no-DRM clause.

Okay. So, what other possibilities do we have? We can register our company in some country that gives tax reductions for movie production and then sell those reductions for hard cash. But this would also require some guarantee that our movie will eventually be made. We can go to a ministry of culture and ask for sponsorship. Perhaps they will even pay us twenty bucks or so. We can try product placement, but it’s unlikely that we can raise enough money from it to cover $315 mln. Besides, all those sponsors will also surely require some guarantees or securities.

And at last we realize our Denkfehler, our failure to reason properly. For the previous five paragraphs, we were trying to sell an idea. But ideas cost nothing! Well, except for patent trolls, but that’s another sad story. It is not possible to earn money with just the idea of a movie (or the idea of a piece of software, for that matter). It isn’t even possible to gather enough money to start making the movie with just an idea of it. If at all, our revenues have to be extracted from a finished and released movie. Only by selling the movie can we make enough money to refinance its production and also earn some profit.

To borrow money for the movie, we need a guarantee that we can sell it, and sell it well. So we go to a distributor, like Warner Brothers, Inc. They have contracts in place with several thousand movie theaters all around the world and can arrange a release window for our movie so that all theaters will show it simultaneously. “Coincidentally” (of course not), they also have output deals with Pay-TV, which ensures that the notorious non-movie-goers can still pay to watch our movie. Besides, once all the money has been extracted from movie theaters and Pay-TV, the distributor will continue earning us money on the DVD market and by licensing the movie to free TV.

Let’s see how it went with the real “Harry Potter and the Order of the Phoenix” according to the leaked distribution report.

First, the expenses.

  • Movie production costs (called negative costs): $315 mln
  • Interest (supposedly on the borrowed money): $57 mln
  • Guild, union, trade associations: $14 mln
  • Preprint, dubbing, subtitles, editing: $5.6 mln
  • Prints: $29 mln
  • Advertising and publicity: $131 mln

The latter two items are the most interesting. It seems that over there in the US they still use analog technology, so they need to actually print the movie on a film roll and ship it to the theater physically. Second, they spent $131 mln on advertising. But in both cases, I believe the distributor knows what they are doing, as they pay for those expenses out of their own pocket. The grand total of all expenses was $564 mln.

Now, the revenue.

  • All movie theaters together: $460 mln
  • Pay TV: $42 mln
  • Video: $87 mln
  • Free TV: $2 mln

Including other minor revenue sources, and deducting $211 mln of distribution fees, we finally get (only) $398 mln, so the movie was in the red as of September 2009. On the other hand, the so-called back-end (the time window where revenue is mostly collected from non-theatrical sources) had only just started at that point (in total, $131 mln had been earned from TV and video). According to Edward Jay Epstein, the upcoming revenues from the TV and video markets are expected to grow significantly. To bring the movie out of the red, and considering the distribution fees, the combined non-theatrical revenues must reach at least $160 mln. This includes Pay and free TV, the DVD market, and, well, Video on Demand streaming on the Internet.

Let’s talk now about the freedom of information and about DRM, shall we?

A free, unprotected video file downloadable from the Internet doesn’t earn any money. Yes, there are some people who are ready to donate their $100 for the movie they’ve just watched, even if they aren’t forced to do so. But even collecting money from all four of them costs more than the resulting $400. Moreover, if we want our movie business to be at least profitable (and ideally to get rich, remember?), we have to extract more money, much more money, than can be done with free-will donations.

Besides, the more such things as media centers, XBOXes, Smart TVs, DivX players and the like there are, the easier it is for lay people to play pirated content instead of buying or renting a DVD. They will also have less motivation to watch the movie on free TV, where the user experience is often all but destroyed by annoying advertisements. And Pay-TV subscriptions might also be threatened, albeit to a lesser extent, because the Pay-TV channels have some unique value propositions (uncompromising full-HD quality as well as live sports coverage) and can bundle all this stuff into one subscription plan.

Yes, I know the reasoning along the lines of “those who watch a pirated movie would never pay for it”. This might be true. But, as you can see from the distribution report above, there are millions of consumers who have, as a matter of fact, generated $131 mln (and counting) of revenue for just one movie by buying or renting DVDs, paying for their TV or enduring the movie on free TV. These consumers also have access to all those Smart TV devices and to illegal content, and they can stop paying for the content. Likewise, free TV can stop licensing movies if their market research shows that their audience would rather watch some self-produced talk shows than movies they have already seen as pirated content.

So, relying only on movie theaters would either prevent such movies as “Harry Potter and the Order of the Phoenix” from happening, or seriously shift the movie experience – for example, movies would be made even more shallow so as to attract an even wider audience to the theaters.

I think we’re in agreement now that we have to sell our movie; we cannot afford to just give it away for free. Now, do we absolutely need to protect it with technical means (DRM)? Well, what would happen if we sold a movie without DRM? The first person buying it will be a good fellow, so they will just watch it themselves and, maybe, lend it to their close family and friends. Nothing tragic. So will the second person. And the third one.

But let’s agree that one person in a thousand has enough criminal energy and urge for cheap fame to upload this movie to Dropbox or some other free file-exchange service. From that point on, our sales will rapidly decline. It is hard to predict how rapidly. Let’s just use the Pareto principle and say we will earn 80% of total sales before the movie is published for free. This means that on average we will sell 1,250 copies of the movie. If we want to earn just $30 mln on the video-on-demand market, we have to sell the movie at $24,000 per copy. First, it is hardly possible to find anybody willing to pay this price tag for “Harry Potter and the Order of the Phoenix”. Second, this would mean that a tiny group of 1,250 people would pay for the movie, even though hundreds of millions would watch it. Which is unfair, as simple as that.

OK, so let’s apply DRM, but utilize it in a sell-through fashion: once you pay for the movie, you can watch and use it an unlimited number of times for an unlimited time period, on any device compatible with this DRM technology. This is not unheard of. For example, in Germany you can buy and download sell-through movies from the MediaMarkt web site. The only three things you won’t be able to do are to make screenshots of the movie, to re-cut or transcode it yourself, and to play it on “just any” cheap piece of junk – your devices have to be compatible with the DRM.

Now, this mode is completely appropriate for the Harry Potter fans, who indeed will be watching and enjoying the movie several times. But what about us mere mortals? Do I always need to pay $25 for a movie, even if I know a priori that I most probably won’t watch it a second time? Is it fair to have me pay the same amount as the Harry Potter fans? Well, no – DRM can also support a fairer price. I can rent a movie from the aforementioned web site or from maxdome for just a fraction of this price, and have 24 hours to watch it, which is more than enough in most circumstances. And if, against expectations, I want to watch it again, I can still rent it again from maxdome (at that bargain price, I don’t really care). Or download it from MediaMarkt. Or buy it as a premium DVD box together with Potter’s magic wand.

Résumé.
We pay for movies to enable the creation of high-budget, high-impact films. The more we pay for movies, the deeper they will be, because the studios will not be required to make them as shallow as possible to extract as much profit as possible from male teenage movie-goers. If we support DRM, we support selling movies on the Internet, and thus we support Internet services, which are way cheaper and more convenient than Pay-TV and DVDs. If we support DRM, we allow for fairer prices, getting everybody to pay at least something to watch the movie, and taxing heavy movie watchers with a higher price tag.

This Week in Twitter

  • Huh, have I missed something? You can rent Hollywood movies on YouTube or Google Play? RT @verge http://t.co/biByj200 #
  • NuGet 1.7 released. NuGet, VS and .NET are order of magnitude better for OSS app development than GNU Autotools. #
  • I liked a @YouTube video http://t.co/ULEf4zLa ????? 1 #
  • Have I already told how I love the web design of @verge ? #
  • More and more web sites don't support IE9. WTF!? You've always wanted open web, now BE open to ALL standard-conforming browsers!! #

Powered by Twitter Tools

Forgiveness

Can you please picture this? Somebody takes your hand and nails it firmly to a piece of wood. Simply with a hammer, without anesthesia, with a nail right through the middle. And your hand shifts back and forth a little on the nail as long as it is not yet hammered tight. And afterwards, with the back of your hand, you feel the wood pushed apart by the nail. And you know that very soon you will be hanging on the cross by this hand.

Could you then forgive this deed, and this man, on the spot? And I mean really forgive, from the bottom of your heart, not just say “it’s fine, it’s fine” with an artificial smile? And afterwards even worry that your own forgiveness is not enough, and ask the Father to forgive him too?

And I don’t think that the fact that he was the Son of God made it any easier for Jesus to forgive – in the end, he was at the same time also a human being!

For me, this is a reason to reflect on what, to what extent, and how quickly I can forgive. Then to recognize the potential. And then to try to improve.

This Week in Twitter

  • Vala programming language: combine C# syntax with GObject implementation. Interesting idea! http://t.co/Q9tVUqKp #
  • RT @verge CBS CEO turned down Steve Jobs over Apple TV subscriptions to protect 'existing revenue streams' http://t.co/DUxgxQ4t #
  • RT @verge New online TV network from 'world's richest man' will bring Larry King out of retirement http://t.co/2iOnYc8H #
  • Was steereo.de just ahead of time? Rdio, Simfy, now Spotify… RT @verge Spotify coming to Germany tomorrow http://t.co/YSF8ZdAD #
  • "If clients don’t trust you they will eventually stop doing business with you. It doesn’t matter how smart you are." #
  • A conversation between three boys, all under 10: “Vigilante justice! – Nooo, it wasn’t property damage, and it wasn’t bodily harm either” #omg #

Powered by Twitter Tools

The secret of git

The secret of git’s popularity is simple. It is very logical, quick, and has awful usability. A perfect combination for its primary audience – the geeks, who love tinkering with a complicated, beautiful thing to make it work for them. Another lesson learned: great UX doesn’t have to be ergonomic (in the Tayloristic sense).

In particular, I couldn’t imagine I would enjoy working with an SCM in the command line. In TFS, I frowned every time somebody told me you have to use tf or even some 3rd-party tool to accomplish some task, because it was obvious to me that if I see all my projects and sources in Visual Studio, that is also the proper place to perform all SCM-related tasks. I didn’t want to work with my files using two completely different UIs, let alone work with my files and change history from the command line.

But with git, working in the command line is the only option for me. First, git is too dangerous a tool to allow opaque GUI wrappers to do uncontrollable magic around it. Second, using it for embedded C development feels surprisingly harmonious: compared with high-level languages and projects, you have dramatically fewer files when working in C, so the scope is still manageable.

Nevertheless, after tinkering with it for a couple of months, trying to devise a workable development process, I want to give up on some of its areas. Perhaps some good git wizard will stumble upon this post and share his wisdom with me.

Basically, our current process looks like this. We have a central git server with several bare repositories. They are cloned onto each dev PC, as well as onto a release PC. In most repos, there is a branch called “develop”; this is the integration branch. Working on source code would typically look like this:

git checkout develop
git pull
git checkout -b maxim_bug1234
# Switch to Eclipse, code, build, test, debug, ready
git status
git add <files> && git commit -m "Changed abc"   # OR simply:
git commit -a -m "Changed abc"
# repeat until bug fixed or feature done
git checkout develop
git pull
git checkout maxim_bug1234
git rebase develop
git checkout develop
git rebase maxim_bug1234
git push

We normally don’t push our developer branches to the central server, because the commits themselves are still pushed (and thus backed up) as part of the develop branch, and we are currently afraid of cluttering the server with thousands of dev branches.

Besides the develop branch, there is a release branch. Our software versioning format is <major>.<minor>.<release>. To prepare for a release, we do the following on the development PC:

git checkout develop
git pull
git checkout <major>.<minor>   # or "git checkout -b <major>.<minor>" if the branch does not exist yet
git pull
git merge develop
git push origin <major>.<minor>
# If we want to make an internal release for debugging purposes, that's it
# Otherwise, we increment the release number and tag it
git tag -f <major>.<minor>.<release>
git push origin --tags

Now, in our process we can either release a branch (that is, its tip) for debugging purposes only, or a tag. Given the <tag-or-branch> we want to release, the procedure is as follows (on the release PC):

git fetch
git fetch --tags
git checkout <tag-or-branch>
git merge origin/<tag-or-branch>

# automatically generate version string of this release
# this will give either the tag, or (in the case of a branch tip) something like <tag>-<commits-since-tag>-g<sha>
echo "#define VERSION \"`git describe --tags  --match=[[:digit:]]\.[[:digit:]]\.[[:digit:]]`\"" > version.h

make all

# Now automatically tag the released state to be able to return to it later
stamp="release_<tag-or-branch>_<date>"
git tag $stamp
git push origin $stamp

The issues we have with this process are:

1) It seems to be in git’s nature to be very selective about what you commit at a time. For example, before git I would regularly do several tasks at once and then commit all the files in one changeset, possibly describing all the changes in its message. Now I first stage the changes related to one task, commit them, then stage the changes of another task, create a second commit, and so forth. It seems to be a good idea to keep commits atomic; this should help when merging or cherry-picking. The problem is that there are SO MANY COMMITS! And the git log command, at least when used without additional parameters, is very unhelpful for keeping an overview of the work done in the last several days. It prints a lot of useless information in a verbose format, so we have a feeling of getting lost in this sea of commits. GUI tools like gitk and SmartGit are no better at this. There are still way too many commits to keep track of. Compared to old-style SCMs, where you typically did one or two commits a day, it is a quantum leap. We need to find an alternative to commit messages to understand what has been done lately.

2) Perhaps we’re doing something wrong with how we merge or rebase, but I expected to see the develop branch as a straight line in visualization tools like gitk or SmartGit. It is not; the branches look like a complex graph, so you cannot see visually which changes have been merged into develop and by whom, because you can’t make out the clear line of the develop branch.

3) In a traditional SCM, the server tracks all the files I’m currently working on (i.e. changed but not yet committed), no matter where I work on them (in which working folder and on which PC). We work in virtual machines on several PCs; for example, I have at least six different VMs. Each VM has a local clone of the git repository. When working on a release (often under time pressure), I tend to make changes in several VMs at the same time. Now, if I forget to push a change to the central git server and turn off the VM, I have no means of remembering that change. Typically, I would forget about it, then proceed with development, then at some point wonder why the change is not there (“haven’t I already fixed this thing?!”), then perhaps re-implement the change and push it to origin. Afterwards, there are good chances I will resume the other VM, make some changes there and try to push everything, including my previous implementation. At this point, I will be presented with a nasty merge conflict, and I have absolutely no history spanning my VMs that would help me understand what has just happened.

4) Fundamentally, I don’t like how merging is implemented. git can only merge into the branch currently checked out. In more than 50% of cases, this is the wrong way around, because I typically want to “uplift” changes from my current branch first into the develop branch, and then from the develop branch into a release branch. Constantly switching between branches just to merge something is waste. We would need a way to “uplift” changes without the checkout dance, at least in the fast-forward case.

Any ideas?