October 16, 2013

What encoding parameters for video representations in adaptive streaming?

Dynamic Adaptive Streaming over HTTP (DASH) is a technology that was implemented and deployed while the scientific literature was still nonexistent. Simply put, the server offers several representations of the same video; clients can choose the representation that best fits their capacities. Since 2008, many researchers have deciphered the global behavior of client-based adaptive mechanisms. However, one key piece of the theoretical cake is still missing: what is the optimal set of video representations the server should offer?

As far as we know, there are no commonly accepted rules on how to choose the encoding parameters of each representation (resolution and rate). Providers typically use somewhat arbitrary rules of thumb or follow manufacturers’ recommendations (e.g. Apple and Microsoft), which take into account neither the nature of the video streams nor the characteristics of the user base. These parameters can, however, have a large influence on both user QoE and delivery cost.

With fellow researchers from EPFL (Laura and Pascal), we have recently investigated this topic from an optimization standpoint. The objective is to maximize the average user satisfaction. We formulated an optimization problem with the following inputs, which any content provider hopefully knows:
  • for each video in the catalog, the expected QoE of users for any rate-resolution pair. This can be easily obtained from a rate-distortion curve computed on a sample of the video at every resolution.
  • for each video in the catalog, the characteristics of the population of viewers. I mean here the client device (tablet, TV, smartphone, ...) and the available bandwidth of the network connection (xDSL, fiber, 3G, ...). This requires a priori knowledge of the viewer population, but we guess it can be obtained from previous statistics.
  • the minimum ratio of viewers that must be served, i.e. the users who actually get a video, even at a relatively bad quality.
  • for the delivery part, the overall bandwidth budget that can be provisioned. Typically, we consider that the cost of the CDN should be bounded, and so the overall used bandwidth is bounded too.
  • finally, the total number of representations that we want to encode. The idea here is to limit the storage and encoding costs, and to avoid huge, hard-to-administer Manifest files.
We solved the problem on a set of synthetic configurations (the above inputs). Our goal was twofold: (i) measure the performance of the recommended sets of representations, and (ii) provide guidelines for content providers.
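To make the shape of the problem concrete, here is a minimal sketch for a single video. It is not our actual formulation or solver: the candidate representations, QoE scores, viewer classes and budget are all made-up numbers, device resolution is ignored, and clients are assumed to pick the best representation that fits their bandwidth. The names `evaluate` and `best_set` are hypothetical, and the search is a plain exhaustive enumeration.

```python
from itertools import combinations

# Hypothetical toy inputs for a single video; none of these numbers come from the study.
# Candidate representations: (name, bitrate in kbps, expected QoE in [0, 1]).
CANDIDATES = [
    ("240p", 400, 0.30),
    ("360p", 800, 0.50),
    ("480p", 1500, 0.65),
    ("720p", 3000, 0.85),
    ("1080p", 6000, 1.00),
]
# Viewer classes: (share of the population in percent, available bandwidth in kbps).
VIEWERS = [(20, 500), (50, 2000), (30, 8000)]


def evaluate(rep_set, viewers):
    """Return (average QoE, served percent, average delivered kbps) for one set."""
    qoe_sum = 0.0
    served = 0
    kbps_sum = 0
    for share, capacity in viewers:
        feasible = [rep for rep in rep_set if rep[1] <= capacity]
        if not feasible:
            continue  # this viewer class gets no video at all
        # Assumption: each client picks the feasible representation with the best QoE.
        best = max(feasible, key=lambda rep: rep[2])
        qoe_sum += share * best[2]
        served += share
        kbps_sum += share * best[1]
    return qoe_sum / 100, served, kbps_sum / 100


def best_set(candidates, viewers, max_reps, min_served, budget_kbps):
    """Exhaustively search all sets of at most max_reps representations."""
    best = None
    for size in range(1, max_reps + 1):
        for rep_set in combinations(candidates, size):
            qoe, served, kbps = evaluate(rep_set, viewers)
            if served < min_served or kbps > budget_kbps:
                continue  # violates the served-ratio or bandwidth constraint
            if best is None or qoe > best[0]:
                best = (qoe, rep_set)
    return best


result = best_set(CANDIDATES, VIEWERS, max_reps=2, min_served=100, budget_kbps=1500)
if result is not None:
    qoe, reps = result
    print(round(qoe, 2), [name for name, _, _ in reps])
```

With these toy numbers, the budget rules out the pair containing 1080p, and the search settles on 240p plus 480p. A real catalog has many videos and resolutions, so exhaustive enumeration would not scale; the sketch only illustrates how the five inputs interact.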

Regarding the former goal, our observation is that recommended sets are not that bad in terms of average QoE but, for a given expected quality, the number of representations in these recommended sets is almost twice the number of representations in the optimal solutions. In other words, the average QoE is obtained at the price of more video representations, which means more encoders, more storage, more delivery bandwidth in the CDN infrastructure, and more complexity in the management. We also showed that these recommended sets perform poorly for more specific configurations. For instance, a content provider specialized in live e-sport videos or a content provider targeting mobile phones must absolutely not follow the recommendations.

We also derived a series of guidelines from our analysis. Some of them may be obvious, but it is never bad to recall obvious things, especially when nobody seems to follow them.
  1. How many representations per video? The distribution of representations among videos needs to be content-aware. Put emphasis on the videos that are the most complex to encode (e.g. sports).
  2. For a given video, how many representations per resolution? It mainly follows the distribution of devices in the user population. Put a slight emphasis on the highest resolutions.
  3. How to decide bit-rates for representations in a given resolution? The higher the resolution, the wider the range of rates should be. Put emphasis on lower rates.
  4. How to save CDN bandwidth? Reduce the range of rates for representations in a resolution. Reduce the number of representations at high resolution.
These first results are just preliminary tests. We have plenty of new topics to explore. Stay tuned!

July 21, 2013

We forced students to enroll in a MOOC... and they liked it!

We made a MOOC, and it was anything but easy backstage. This MOOC was integrated into the regular curriculum of Telecom Bretagne students, so we kind of forced students to follow a MOOC. These students were neither volunteers nor MOOC enthusiasts.

We just got feedback (i) from the traditional survey, which our administration conducts every semester, and (ii) from a specific survey we conducted. Here is a short analysis.

Students enjoyed videos!
Students were unanimously positive about the MOOC, although they had been unanimously negative about other distance learning experiments, for example watching videos captured during a regular lecture (even with several cameras), or lectures through videoconference. As far as I know, it is the first time that Telecom Bretagne students have been positive about a distance learning experiment.

To be honest, we did not expect such enthusiastic feedback, especially given the troubles we experienced during the course preparation. Typically, we received suggestions to replace all regular lectures with MOOC videos. Some students enrolled in another (traditional) course about cellular networks did not attend that course because they preferred following the MOOC instead. Less passionate but more useful, students were satisfied with the pace and the clarity of the videos. They admitted they had worked more than expected overall, but they did not especially complain about it. And students who are not native French speakers said that their level of French was sufficient to watch the videos.

Of course, these results have to be validated by another experiment, but they confirm the high level of acceptance for KhanAcademy-like short videos.

Quizzes matter, peer reviewing does not
A MOOC is expected to be something more complex than just a bunch of videos. What we did in this MOOC was nothing spectacular: some quizzes after the videos, a forum, some assignments, and a peer-review system, which allowed students to review the assignments of other students. At the end of the day, how useful are these beyond-videos learning tools?

From our survey, quizzes are what matters the most. The main purpose of these quizzes is to offer students a way to check whether they were attentive during a video. In short, if you cannot answer the quiz, then you should probably watch the video again. Intuitively, quizzes are not magical learning tools. But think twice about it and recall when you were a student. If you were sure the teacher would ask you a question in, say, five minutes, you would certainly be very focused on the teacher during these five minutes. Now think about a teacher asking you a question every five minutes! Today's quizzes are very simple, but this positive feedback may encourage us to enhance them.

By contrast, peer reviewing was not appreciated. Students did not find it useful to review the assignments of other students, and they found it even less useful to receive reviews of their work from other anonymous students. I am disappointed because I had high hopes for this learning tool, which is the most "connectivist" tool we implemented. Well, we have to work further on it!

It is not easy to take notes while watching videos
When we interviewed students (very informally), a recurring worry was note-taking. How do you take notes while the video is playing? A video is not a lecture. It is focused and it does not include any downtime. Almost every sentence matters and requires a note. Moreover, you cannot only listen; you must watch, at least a bit.

Students suffered from being unable to follow the videos and take notes simultaneously. Some of them paused the video regularly. Some others played the video twice, a first time to get the global picture and a second time to take selected notes. From our survey, students playing videos more than twice are rare (less than 10%).

This feedback emphasizes that students have to acquire new methods in order to follow such video-based courses. In a way, students who followed our MOOC acquired some specific skills, which will be useful if they have to follow other MOOCs in their life (which is highly probable). Should we include some courses about how to follow a MOOC in the curriculum?

Multiple experiences are possible
There is no single way to follow a MOOC. We got confirmation of this, in case you had doubts about it.

We booked some classrooms with computers and headphones in the regular schedule. Some students told us they appreciated it. They attended these "free" hours because, for them, it was a guarantee of maintaining a regular learning pace. Some others of course worked in bursts, watching several hours of videos in one night when they got assignments. Overall, I like this freedom, which calls for unconstrained MOOC schedules.

Finally, a group of students told us they used to watch the videos together. I can only imagine beers, chips, a TV screen... and MOOC videos! (Debates over the quizzes must have been lively.) This way of experiencing MOOC videos is great since it allows students to discuss the learning material. When we give a lecture in a lecture hall, we usually ban in-class chats because we assume most of it is not related to the lecture. But chats among students can be useful. This experience also raises questions about next-generation campuses: dedicated tiny classrooms equipped with a TV screen are options to consider.

May 27, 2013

I made a MOOC and I survived!

Xavier Lagrange, Alexander Pelov and I made a MOOC introducing Cellular Networks!

It is supposed to be a 20-hour course for students with a minimum background in networks. It attracted around 350 students, including 35 students from my institution, for whom this course is part of the curriculum.

I do not discuss here our motivations to create a MOOC and the way students have experienced it. I focus on the teacher's standpoint when making this MOOC.

We decided to make our own MOOC from scratch without using external products (except YouTube to host the videos). In other words, we did not use third-party companies like Coursera and Udacity, which host content, advertise it, provide a hotline for technical troubles, and so on.

Time Spent
As a very rough estimate, we spent 240 hours on this MOOC, including:
  • 20 hours preparing the pedagogical material. This part is actually enjoyable for teachers. Transforming classic 3-hour lectures into 7-minute, to-the-point videos (+ quizzes) is a nice challenge. There is room to do more, though: we did not change our exercises much. Moreover, the only collaborative tool we experimented with was peer reviewing for homework. In other words, the transition to a cMOOC would require more time.
  • x hours interacting with students. Since our MOOC was not that crowded, x was close to epsilon, but I guess there should be some formula linking teacher interaction time and number of students.
  • 30 hours installing and testing the MOOC platform. We chose OpenMOOC because it was the only available, viable, open-source platform at that time. It is a good basis overall, but it is relatively hard to install for people who are not familiar with server administration. Moreover, the statistics modules are very incomplete. Still, OpenMOOC is OK: it provides basic functionality and the teacher interface is friendly.
  • 180 hours generating the video of courses, including:
    • Warm-up: it is anything but easy to write on a tablet while looking at a camera, to master all the recording elements, to feel comfortable, to find the right tone and the right pace. For each teacher, the first attempts at recording videos were disastrous.
    • Recording: we made a lot of errors, for example speaking for twenty minutes with the microphones off. Even when everything runs perfectly, speaking for such videos is totally different from lecturing. Overall, 2 minutes of recorded video yielded 1 minute of video that could actually be used.
    • Producing: we discovered how to use studio software. With today's tools, it took us on average 10 minutes to deal with each recorded minute, so 20 minutes for each minute of video that ended up online. And we had around 90 videos averaging 6 minutes each.
  • 5 hours advertising. We were not affiliated with a well-known platform, so we needed to attract people (despite the dryness of the topic). Clearly, we did not do enough.
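The figures above can be cross-checked with a few lines of arithmetic. The sketch below only uses numbers quoted in the list; the variable names are mine, and reading the 180 hours as mostly production time is an interpretation of the result, not something we measured separately.

```python
# Figures quoted in the list above.
videos = 90
avg_minutes_per_video = 6
recorded_minutes_per_online_minute = 2      # 2 recorded minutes yield 1 usable minute
editing_minutes_per_recorded_minute = 10

online_minutes = videos * avg_minutes_per_video
recorded_minutes = online_minutes * recorded_minutes_per_online_minute
editing_minutes = recorded_minutes * editing_minutes_per_recorded_minute
video_hours = editing_minutes / 60

# Other items of the list, in hours (the x hours of student interaction are left out).
other_hours = {"pedagogical material": 20, "platform": 30, "advertising": 5}
total_hours = video_hours + sum(other_hours.values())

print(online_minutes, recorded_minutes, video_hours, total_hours)
```

Production alone accounts for the 180 hours (20 minutes for each of the 540 online minutes), and the listed items sum to 235 hours, which is consistent with the rough 240-hour estimate once the x hours of interaction are added.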

The teacher should first decide how to cut a full course into units, each unit being four to ten chunks, each chunk being a 7-minute video. We opted for the format that has been popularized by Khan Academy. This format is now widely used across platforms: the background is (almost) empty, the teacher speaks in the background, we sometimes see his/her face, and the most important point is that he/she writes on the slide while speaking.

Our process was to first create the target slides, i.e. what we want to have at the end of the video. Then, we only kept what is actually hard to draw in real time, for example a hexagonal tiling. This is the background slide. The goal of the video is to start with the background slide, to write on it, and to finish with something that is close to the target slides. During the recording, the target slides were displayed behind the camera so that the teacher does not forget anything (and looks at the camera).

We decided not to use a teleprompter because we wanted to keep it as natural as possible. We did not write the script in advance, but Xavier and Alex master the topic so well that they did not need it. Note that it is also possible to pause during the recording so that the teacher can take time to think about the next sentences. It is also possible to repeat something in a better way when the previous sentences were not totally satisfactory. These pauses and repeated sentences can be cut afterwards.

To generate the first videos, we used software, cameras and microphones that we found on our shelves. We were able to generate some videos but the overall quality was borderline. Then, we got some extra funding and we were able to get professional equipment and to build our own studio. The quality is far better. Our studio includes a tablet, a powerful Mac with enough disk space (we needed around 1 terabyte for this MOOC alone), some wireless microphones, and a semi-professional camera.

Stuff I Would Have Done Differently
Here are some miscellaneous thoughts:
  • We would have chosen a sexier title. In our case, it would have been appropriate to include a buzzword like LTE, LTE-advanced or femtocells in the title. We identified three ways to attract a large population of students: 
    • The MOOC is affiliated with a top-ranked university, which knows how to advertise, or with a highly-visible platform like Coursera. These websites attract millions of visitors, so any course can enroll thousands of students. It is possible that these enrolled people are less committed to complete the course though;
    • The MOOC is about a very trendy topic, say quantum computing, software-defined networks or any other buzzword. It should be remembered that the majority of "MOOC students" are professionals who want to keep in touch with new topics they have heard about;
    • You create the buzz about your MOOC. We spent only 5 hours advertising and we are not professionals. Press and web buzz campaigns are one way. It is also possible to convince colleagues from other universities to make their students enroll in your MOOC.
  • We would have found a better video format. Let's be honest: without a dedicated team, it is indecently long and tedious to create Khan Academy-like videos. We went overboard on the videos. Compare this video and this other video. It took us 20 minutes to produce each minute of the former video while it took us only 4 minutes for the latter. Is it worthwhile? Based on our experience, it is probably possible to divide the overall time spent on video by at least 3.
Overall, it was a great experience. We learned a lot about the potential of such online courses and we had a lot of fun playing with videos. We developed a lot of nice ideas for the next MOOC, and we significantly improved the process of video recording and editing.

But it was also a huge investment. Xavier told me that making this MOOC was as demanding as writing a book. I often compare books with MOOCs when I have to explain our motivations for making a MOOC. Both are knowledge, both are supposed to be done by experts, both target a wide population… it seems that both require very committed authors.

January 29, 2013

MOOC and Grandes Ecoles: surfing the tsunami

In North America, the development of Massive Open Online Courses (MOOCs) is seen as a panacea, a way to fix some of the multiple flaws of the higher education system. From a buzz standpoint, this belief culminated when Stanford's president claimed "there is a tsunami coming." The debate is less lively in France. Yet the tsunami has a good chance of affecting French higher education as well.
A distinctive feature of the French higher education system is the prominent position of Grandes Ecoles. I work in one of them, Telecom Bretagne, which is part of the Institut Mines-Telecom.


My first claim is that Grandes Ecoles are in danger. Reasons include:
  • Grandes Ecoles have to face new competitors. The emergence of start-ups like Udacity and Coursera has transformed Grandes Ecoles into an oligopoly of dinosaurs. Grandes Ecoles are used to competition among themselves. In short, Grandes Ecoles share the "market" of producing highly qualified professional students, which is a market that universities do not address adequately. The reality is that all Grandes Ecoles have approximately the same offer: roughly the same size, same structure, same diploma, same standardized courses. The aforementioned new competitors are start-ups with limitless ambition and nothing to lose. They have almost no administrative cost, they do not waste money on research, they can fail and revise their strategies on a monthly basis, and they can address students worldwide. These start-ups are actually reshuffling the cards.
  • Grandes Ecoles' main asset is the diploma, in a world of certification. Companies like Cisco and Microsoft have developed professional certification systems for years. Students become super-experts in a given specialty and receive a certificate, which demonstrates their employability. However, these efforts have never disrupted the higher education system. By offering individual courses, MOOCs challenge again the notion of a degree, which is commonly seen as a set of certifications (including many useless ones). The companies that are not convinced by the degree system will find in MOOCs a great opportunity to revisit their Human Resources processes and to bypass Grandes Ecoles.
  • Grandes Ecoles are not attractive for the targeted students. Grandes Ecoles are very attractive to bright French students, but the target population of MOOCs is all over the world. Unfortunately, Grandes Ecoles are visible neither in international rankings nor on the web. Grandes Ecoles have also not demonstrated strong relationships with the companies that really matter to students (especially Apple and Google). Finally, the prospect of living in France for three years for courses that are not all considered worthwhile is a key weakness.


My second claim is that Grandes Ecoles are in an excellent position. Here is a selection of advantages.
  • Grandes Ecoles are adaptive. These are small institutions, which are directed by managers with long experience in industry. Grandes Ecoles are far more flexible and reactive than any other institutions. They can reorganize, they can develop strategic plans, they can reinvent themselves and they can embrace new ways to fulfill their missions without delay. MOOCs are an opportunity for Grandes Ecoles to develop new businesses and to improve their offers.
  • Grandes Ecoles excel at what complements MOOCs. It is commonly understood that a MOOC is about knowledge. A set of MOOCs is not enough to turn students into smart workers. Many other competencies should be developed, including teamwork, communication skills, and social networking among classmates. Grandes Ecoles focus on these aspects through project-based pedagogy, personalized and tutored curricula, campuses designed as learning centers, and good placement in attractive companies. In Europe, Grandes Ecoles excel in all these aspects and find here a way to differentiate themselves positively from other institutions. Grandes Ecoles can leverage MOOCs rather than suffer from them.
  • Grandes Ecoles already have a strong relationship with companies. Curricula are typically discussed with companies on a regular basis so that learning matches the requirements of targeted employers. Grandes Ecoles have also developed programs for "continuing education" in partnership with Human Resources departments. Thus Grandes Ecoles are used to selling learning programs elaborated by their faculties from a business perspective. The diploma is a virtual good that has made sense since the 19th century, often challenged but never surpassed, because companies like employees who are more than just super-experts in a couple of areas.
The next couple of years will be key for Grandes Ecoles. It will be very interesting to observe what the executives of Grandes Ecoles will do. Undoubtedly, executives will have to be brave if they want to transform their institutions. They have to make Grandes Ecoles able to compete on a global scale, to leverage their assets, to catch up with emerging trends, and to focus on what really makes Grandes Ecoles unique learning places. Strong decisions will have to be taken. For example: giving up academic research to refocus on education, cutting faculty jobs in departments that have no activities in core scientific domains, developing business related to buying/selling MOOCs...

This increasingly frustrating peer review process

Academics rarely share their bad personal experiences related to peer reviewing. But everybody has papers rejected by conferences… and these decisions sometimes generate legitimate frustration since they seem to be due to some "random bad outcome from this plain old flawed reviewing process". On my side, I have the feeling that the reviewing process is getting worse and worse. I am not alone. Following this example, I describe below some recent rejection notifications that illustrate some of these flaws. And I propose some ways to fix them.

The un-rebutted rebuttal
In 2012, both the ICME and Sigcomm conferences introduced a rebuttal phase in the reviewing process. I know a lot of scientists who call for such a rebuttal process. Unfortunately, my experience of rebuttal was absolutely disastrous in both cases. It is interesting to note that these conferences are definitely not in the same league.

For ICME, I suspect one of the reviewers was a weak graduate student: he gave us a strong reject based on his claim that one of the four proofs of the paper was wrong in a specific equation. Unfortunately, his mathematical statement was false. This bad review was the perfect case where a rebuttal can help to fix a clear misunderstanding and a wrong analysis. We spent a significant part of our rebuttal trying to politely correct the mathematical error of this reviewer. Alas, we received our negative notification. The reviewers did not change a single word of their reviews. And the meta-reviewer gave us this unforgivable remark: "The authors thinks that the reviewer 2 misunderstand the work in this paper. From the comment, the reviewer should be an expert in this field". This meta-reviewer does not understand rebuttal, does he?

For Sigcomm, one of the reviewers claimed that what our 14-page paper proposes could be done by tweaking another existing system. More precisely, the reviewer "believes that with simple changes to your problem, one can use the [other] system to tackle it, probably by just changing the utility function." We knew this other system well… and we double-checked again. No, there is no way: both papers share some words, but they are like apples and oranges. However, this was the main drawback raised by this reviewer, so we were full of hope that we could make our case by carefully explaining the differences with this previous work. Alas, triple alas, one month later, the reviews arrived, unchanged.

In both cases, the reviews came back without any changes, even though we had highlighted major errors in the analysis.

Proposal: I don't believe much in rebuttal, but at least this proposal deserves a better implementation. In particular, reviewers must address the remarks that authors made about their reviews.

The anonymous reviewer
We submitted a reasonable paper to a special issue of IEEE Transactions on Multimedia. One reviewer was vaguely positive, one reviewer was vaguely negative, and then came the third reviewer… This guy did not find any positive comment to make. It looks like none of these 14 pages was worth anything. Moreover, all his negative comments were excessively aggressive and mostly based on wrong self-proclaimed facts. The review was just a collection of harsh and assertive remarks. This paper was not worth a Nobel Prize, for sure, but it was an honest, valid paper, with a motivation based on a series of observations from well-established measurement systems, some theoretical developments, and a non-trivial simulation. Maybe not worth a publication in this journal, but why so much hate?

One well-known issue of peer reviewing in computer science is the excessive harshness of reviewers, often young scientists, comfortably protected by anonymity. In the excellent "Guide for Peer Reviewing", it is said that, as far as possible, the first paragraph of a review should summarize the goals, approaches and conclusions of the paper (including positive assessments) while the second paragraph should provide a conceptual overview of the contribution.

Proposal: Some reviewers would be less assertive and less aggressive if there were some probability that their identity would be revealed. Why not have a rule such as "out of the k reviews you do for a conference, one of them will be randomly chosen to be de-anonymized"? Or "one out of ten reviews is de-anonymized"?

The no-room-for-cold-topics program chair
We sent a P2P paper to Globecom, although it is well known that P2P is now a very cold topic. We received two clearly positive reviews, and one review slightly more negative in the grades, but with comments like "The addressed problem is relevant, the paper is well-written and technically solid". Globecom has a 37% acceptance ratio, but despite these grades, our paper was rejected. My first rejection at Globecom.

I asked the TPC chair for additional explanations, and he kindly answered that "in the confidential comments, there was a voiced concern about novelty". In other words, it seems that anonymity is not enough for reviewers: they still require an even more anonymous place to voice the judgments they are least proud of. According to the guide for peer reviewing, the "confidential comments" are just a bad habit, which affects the overall transparency of the reviewing process. On my side, I never use them, and I do not see any convincing reason to use them.

Proposal: ban the confidential comments.