Friday 28 March 2008

more ffmpeg for 3G

My 3gp video files generated by the rendering worked, but took up to 5s to be displayed... This was kind of irritating for an interactive demo.

Turns out the 3gp file was not really to the liking of the 3G gateway/phone, because it had only 1 intra (full) frame, and no intermediate frames. I naively thought this would be optimum for a static image....

Bizarrely, changing from a rate of 0.5 frames/sec for a 2-second video (i.e. 1 frame only) to 5 fps (i.e. 10 frames) makes the image appear in around 1s instead of 4-5s! Why sending more data makes the display go faster, I have no idea.....

So, my ffmpeg parameters are now:

ffmpeg -loop_input -i INPUT.PNG -i_qfactor 3 -qscale 8 -b_qfactor 6 -ps 128 -bufsize 43000 -b 10000 -vcodec h263 -y -an -r 5 -t 2 -f 3gp OUTPUT.3gp

This is using my SVN 11003 version of ffmpeg. Your mileage may vary...

I also changed to a black background rather than a white one before changing the frame rate, but I didn't see any impact on the latency....
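
The frame counts quoted above (1 frame at 0.5 fps, 10 frames at 5 fps, both for a 2s clip) are just rate times duration. A minimal sketch of that arithmetic, where frame_count is a helper name made up for this note and not part of ffmpeg:

```shell
#!/bin/sh
# frame_count RATE DURATION  (hypothetical helper, not an ffmpeg feature)
# Prints how many frames a clip of DURATION seconds holds at RATE frames/sec.
# awk is used because plain sh arithmetic cannot handle 0.5.
frame_count() {
    awk -v r="$1" -v t="$2" 'BEGIN { printf "%d\n", r * t + 0.5 }'
}

frame_count 0.5 2   # the old settings (-r 0.5 -t 2): prints 1
frame_count 5 2     # the new settings (-r 5 -t 2): prints 10
```

This only counts frames; it says nothing about why the gateway or phone reacts faster to the 10-frame clip.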

Tuesday 25 March 2008

lame wav files to mp3

And a useful lame command line to convert basic 16-bit linear 8 kHz audio wav files (as generated by a certain VoiceXML platform) into nice and small mp3 files:

lame -m m --bitwidth 16 -S --silent -r -s 8000 input.wav output.mp3
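
For a whole directory of prompts, the same command can be generated once per file. The sketch below is a dry run under a few assumptions: mp3_name and print_lame_commands are names invented here, the output file sits next to the input with the extension swapped, and the options are copied from the command above minus the second silent switch (-S already turns off the progress report). It only prints the commands, so they can be checked before being piped to sh.

```shell
#!/bin/sh
# mp3_name FILE.wav  (hypothetical helper)
# Prints FILE.mp3: same directory, extension swapped.
mp3_name() {
    printf '%s\n' "${1%.wav}.mp3"
}

# print_lame_commands DIR  (hypothetical helper)
# Prints one lame command per .wav file in DIR; nothing is executed.
print_lame_commands() {
    for wav in "$1"/*.wav; do
        [ -e "$wav" ] || continue   # the glob matched nothing
        printf 'lame -m m --bitwidth 16 -S -r -s 8000 "%s" "%s"\n' \
            "$wav" "$(mp3_name "$wav")"
    done
}

# Usage: review the output, then run it with
#   print_lame_commands ./prompts | sh
```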

ffmpeg : png to 3gp

ffmpeg. What a fantastic tool... but so many command line options.....

For the renderer project, ffmpeg is currently used to create the 3gp files with the generated video that is the output bitmap from the markup rendering. Longer term, this can/should be done using the ffmpeg api directly from the bitmap. Short term, I just generate a PNG file, and transform it into a 2s video clip in 3gp container format with h.263 codec.

But: when creating 3gp files, it's important to note that if you want to play them on a 3G phone over the mobile network, you have to watch a whole lot of parameters to ensure that the bit rate and buffering don't make it break.....

For the good of humanity, here's a note of a command line to convert a PNG into a 3GP that will actually work.... Anyway, currently I'm using:

ffmpeg -i input.png -s 176x144 -i_qfactor 10 -qscale 8 -b_qfactor 6 -ps 128 -bufsize 80000 -intra -b 40000 -vcodec h263 -y -an -r 0.5 -t 2 -loop_input -f 3gp output.3gp

However, I'm not sure this is too good for 2 reasons:
1/ it converts a <1K PNG file (at 176x144 QCIF) into a 3-4K 3gp file (which should contain only 1 intra frame which is the same image!!)
2/ the 3gp file seems to take about 4s to get from the media server to the phone....
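
As a rough sanity check on reason 1, the nominal payload size implied by -b and -t can be worked out. This is only a sketch under stated assumptions: budget_bytes is a name made up here, -b is taken to be in bits per second (which matches the values used above), and container overhead is ignored. With -qscale fixing the quantizer, the encoder may well not aim for that bitrate at all, so treat the result as a ceiling to compare against, not a prediction.

```shell
#!/bin/sh
# budget_bytes BITRATE DURATION  (hypothetical helper)
# BITRATE in bits per second (the -b value), DURATION in seconds (the -t value).
# Prints the nominal size of the video payload in bytes.
budget_bytes() {
    echo $(( $1 * $2 / 8 ))
}

budget_bytes 40000 2   # -b 40000 -t 2: prints 10000
```

So the 3-4K file is well inside what -b 40000 nominally allows for 2s, which suggests the file size on its own is not what breaks anything.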

VoiceXML, and 3G video

I've been working on VoiceXML (www.w3c.org/voice, http://www.w3.org/TR/2004/REC-voicexml20-20040316/) for quite a while now at work: development of a VoiceXML 2.0 interpreter, operation with HP OCMP media platforms, operation with a SIP audio media platform, creation of VXML applications for products and customers.....

Latest requirement : 3G video. How to create or update an audio IVR service to add video capabilities?

Well, using a video capable platform (such as HP's OCMP 4.0 or the NMS VoiceXML gateway), it's pretty easy to do the basics: play a 3gp file to get a video image up. Generally, the audio tag just does it if the file referenced is in the right format.... But I find 2 big issues in creating a video service:
1/ creating those pesky video files for prompting
2/ dealing with 'dynamic' data

For audio services, the situation is pretty simple:
Point 1 : use a studio or a 'batch' interface to a speech synthesis engine (TTS) to get .wav files
Point 2 : use TTS directly in the VXML - say whatever you want (a name, a date, a value etc)

For video, not so easy. In fact, what we're missing is sort of TTS for video : data+markup -> video stream or video file. Thus was born the "renderer server"!

In its current form, it takes markup (xhtml at the moment) and an output format (.3gp files currently), and does the necessary to create a static 2s long video file, of the correct format and encoding to work with a 3G call. This is already pretty much what we need for point 1!

You can try the thing here:
https://www.eloquant.net/rend/file/render/genVideo.html
(ok, time for license and disclaimer : this is provided purely for demo or personal interest, with no support or guarantees, if it breaks and takes your arm off then you've only yourself to blame)
Note that it only likes 'pure' xhtml and has no 'plugins' such as flash or javascript...

I used this in a batch mode to create my static video prompts for my first demo video service!

Next : dynamic data
Often, in a vxml service, we get some text or details that need to be communicated to the caller. When we're in audio mode, TTS is used. Easy enough, but limited (there's only so much info you can absorb via your ears!). In video mode, we can potentially communicate a LOT more (a picture is worth a thousand words and all that). But we need to be able to generate that picture, on the fly and in 3G compatible files.

Ordering up the generation currently is pretty simple : a subdialog call to the renderer, with the xhtml content, and it does the job! (even better would be to call up the renderer from the vxml interpreter whenever we get a text prompt during a video call.... but that requires modifications to the vxml interpreter which I can't yet do....)

Then we have to get back the video: the best method would be to generate it as a RTP H264 stream, directly to the 3G gateway. But I haven't got an RTP encoder from the static images working yet.... Otherwise, an RTSP controlled RTP stream back to the media platform. Nope, still blocked at the RTP encoder level......

So instead, the renderer generates the same 3gp file, and the vxml just 'plays' it. As these platforms generally just repeat the last image from a video when they run out of file, then that's good enough for the moment.

In the VXML, this just boils down to
a/ create the xhtml to markup the image you want (with an inline CSS generally to deal with the small mobile screens)
b/ get the renderer server to create or update the 3gp file via a subdialog call
c/ play the resulting file directly from the renderer server
[I was going to embed some vxml here but the tags break the html...]
So get it here:
http://eloquant.nerim.net/blog_bits/render_demo.vxml
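
Step a/ above can be sketched outside of VXML too. The shell function below wraps a piece of text in a minimal XHTML page whose CSS is carried in the page itself and sized for the 176x144 QCIF screen. Everything in it is an assumption made for illustration: the function name xhtml_page, the CSS values, and the page layout are not the renderer's required input, and how the page is then handed to the renderer (step b/) is left out because that interface isn't described here.

```shell
#!/bin/sh
# xhtml_page TEXT  (hypothetical helper)
# Prints a minimal XHTML page showing TEXT on a 176x144 (QCIF) canvas.
# Markup and CSS values are illustrative only.
xhtml_page() {
    # Escape the three characters that would otherwise break the markup.
    text=$(printf '%s' "$1" |
        sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g')
    cat <<EOF
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>prompt</title>
<style type="text/css">
body { width: 176px; height: 144px; margin: 0; background: black; color: white; }
</style>
</head>
<body><p>$text</p></body>
</html>
EOF
}

# Usage: write the page to a file for the renderer to pick up
#   xhtml_page "Your balance: 42 EUR" > prompt.xhtml
```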

Friday 21 March 2008

Clutter

What clutters up my bag?

Nokia N800 Internet Tablet
+ forget the name, it's a wee portable linux box
+ a fantastic touch screen
+ a proper OS (based on Debian).
+ does media player, games machine, web surfing, email, instant messaging etc etc
+ 2 SDHC slots (got 2x4GB in there at the mo)
+ if you hack up a USB OTG cable you can plug in a USB disk, keyboard etc etc....
Think 'open' iTouch....
- Java support is not there yet..

Nokia E90
+ not a bad smartphone, probably the best for my needs
+ full keyboard, big screen, wifi+3G
- S60 just isn't as good as S80 on my old 9500
- there's NO PROPER JVM!
(so can't run my old java based SSH tunnel VPN: http://sourceforge.net/projects/sshforwarder/)
- no open source VPN code (no SSH tunneling, no pptp, no ipsec, nothing!)

bunch of cables: 2.5mm-3.5mm jack converter, usb extender, usb otg cable, car adapter to nokia charger, headphones etc etc.
SD card USB reader + mini/micro SD adapter (Nokia)

And a Swiss Army knife of course...

Software...

A random list of useful/recommended software I've used recently.

Converting video files (avi, dvd, divx etc) to a Nokia N800 friendly format:
https://garage.maemo.org/projects/mediaconverter/

Drawing architect type plans:
Sweet Home 3D

SIP softphone (audio)
SJPhone : http://www.sjlabs.com/

start...

Is it a blog? or just a place to store stuff? who cares....

I wanted to get a url like devrandom or devslashrandom, coz here is where I plan to post various bits-n-pieces about development stuff, but there are (of course) too many geeks on blogger to get any url around that theme....

Anyway, the idea is to post up hints-n-tips about software dev (and anything else that seems neat... to me...) for the benefit of mankind. I've got a lot of bits of code and help off other folks' web pages/blogs over the years, and this is a way to put them together somewhere (so at least I'll be able to find them). If anyone else finds this useful, great....