Video communication has been around for decades, but certain anthropological issues have kept it at bay. Men prefer not to see the other bloke, and women do not want to be seen. A pre-arranged video call has a better chance. Even when you video chat on a client such as Skype, you start off with a voice chat and only then ask the other person to turn on the webcam. The other factor is that video will not work in mainstream business communication, because the business world has neither the need nor the time for visual communication. So we are left with a residential offering that emphasizes pre-arranged sessions.
There is one additional cultural factor, though: we associate video more with TV than with a communication device such as a phone or a PC. On TV we are used to seeing multiple people participating visually, not a one-to-one exchange. Video as we know it is one-to-many, many-to-one, and many-to-many communication, not one-to-one.
So my guess is that we need to (1) move video communication over to the TV, because that is what we associate it with and where it culturally belongs; (2) instead of one-to-one video communication, introduce an application that facilitates one-to-many, many-to-many, and many-to-one communication; (3) target residential users with multiple family members; and, most importantly, (4) target markets and countries where relatives communicate with each other more. If we do that, we could make video communication work.
The form it could take is a moderated video session, generated within an IP TV application, that brings two or more households together, with the session moderated by whoever initiates it. Multiple families could attend such sessions. Sessions like these could be a great source of content for local TV and virtual TV channels on IP TV, which right now are confined to YouTube-type personal videos.
