How can we make algorithms that generate new images, video or music? For many researchers, a natural extension of artificial intelligence is artificial creativity, and there is a titanic ongoing effort in the machine learning community to build algorithms that can generate new information in meaningful and creative ways.
In this post we review the current state of the art in machine learning for creative content generation, going through the hottest developments from the computer science community in the automatic generation of text, images, video and speech.
Generative algorithms allow us to create synthetic data "out of thin air". They are "trained" on a set of examples provided by the user and may generalize over those examples to produce new ones that aren't part of the data. These algorithms can be applied to any form of data, such as images, speech, text and video.
However, it is incredibly hard to scale these algorithms even to medium-sized images and videos. In the last 5 years, thanks to a rapid increase in computational power and several innovative new algorithms based on Deep Learning, the field has seen enormous progress.
These generative algorithms are finally getting to a point where they can become useful for generating content. These are very exciting times.
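To make the "train on examples, then generate new ones" idea concrete, here is a minimal sketch in Python. It is a toy stand-in and not one of the deep learning models discussed in this post: the "model" is a single Gaussian fitted to a handful of numbers, and the training data and the function names `fit_gaussian` and `sample` are made up for illustration.

```python
import random
import statistics

def fit_gaussian(examples):
    """'Train' the model: estimate the mean and standard deviation of the examples."""
    return statistics.mean(examples), statistics.stdev(examples)

def sample(model, n, rng):
    """Generate n new values from the fitted distribution."""
    mean, std = model
    return [rng.gauss(mean, std) for _ in range(n)]

# Hypothetical training set: eight measurements clustered around 5.0.
training_data = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]

rng = random.Random(0)              # fixed seed so repeated runs agree
model = fit_gaussian(training_data)
new_examples = sample(model, 5, rng)

# The generated values resemble the training data without copying it.
print(new_examples)
```

Real generative models replace the two-parameter Gaussian with a neural network that has millions of parameters, which is roughly why scaling them to images and video is so demanding.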
Below we show examples of images, video, speech and text generated by various algorithms developed in the last 5 years.
Image and Video Generation
The human visual system is incredibly sophisticated machinery that, through a combination of evolution and learning, can capture a lot of the rich statistics of the world around us. This sophistication means that generating believable and interesting images artificially is a very hard problem, one that has eluded researchers in machine learning and computer vision for many years. In the last few years, however, we have seen a real explosion in the diversity, quality and complexity of image generation algorithms. Some of them can even pass a simple form of visual Turing test, which you can verify for yourself here.
Check out our References List or continue below to see a few remarkable examples of unique images generated by machine learning algorithms.
Google Inception Network
Facebook DCGAN Generative Model
Google DeepMind DRAW Generative Model
Audio and Speech Generation
As with images and video, generating believable speech and music is a very hard task.
Speech has many applications across various industries worldwide, so it is not surprising that companies like Google and Microsoft [14,15] invest a substantial amount of resources in modelling speech.
Music has a long history with machine learning, but only recently have learning algorithms been able to generate meaningful music fragments. Check out our References List or continue below to see a few interesting examples.
Music Generation (Example from )
Speech Synthesis (Examples from )
Generating plausible and meaningful text is still an open problem. However, machine learning has come a long way from simple hand-crafted grammar rules and chatbot systems to algorithms that learn grammar and semantics from scratch by going through thousands of pages of text data (for example, from Wikipedia or Shakespeare).
These systems are already being used for text translation, free-style text generation [6,7], Obama speeches, Donald Trump quotes, clickbait titles and even TED Talks [8,11].
Check out our References List or continue below to see a few interesting examples.
Example of Obama speech from 
Example of Shakespeare text from 
Examples of baby names from 
Example of a whole website generated by Recurrent Neural Networks from 
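As a toy illustration of learning text statistics from examples instead of hand-crafting rules, the sketch below trains a character-level Markov chain in Python. This is a much simpler technique than the recurrent neural networks behind the examples above, and it is swapped in here only because it fits in a few lines. The tiny corpus, the function names `train` and `generate`, and the `order` parameter are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def train(text, order=3):
    """Map every `order`-character context to the characters seen right after it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model

def generate(model, seed, length, rng):
    """Extend `seed` one character at a time by sampling from the learned followers."""
    order = len(seed)  # the seed must be exactly as long as the training order
    out = seed
    for _ in range(length):
        followers = model.get(out[-order:])
        if not followers:  # context never seen in training: stop early
            break
        out += rng.choice(followers)
    return out

# Hypothetical miniature corpus; a real system would read thousands of pages.
corpus = "to be or not to be that is the question to be is to do"

model = train(corpus, order=3)
text = generate(model, seed="to ", length=40, rng=random.Random(0))
print(text)
```

Every four-character window of the output occurs somewhere in the corpus, so the model recombines what it has seen locally while the overall sentence is new. Recurrent networks play the same "predict the next character" game with a far longer and richer memory of the preceding text.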
There is an enormous effort from the machine learning and computer science communities to develop better algorithms and methods to generate images, videos, text and speech, and they have come a long way in these directions.
It seems that the different pieces of the "content automation" puzzle are starting to crystallize as powerful independent toolboxes. Now it is time to integrate these different technologies and algorithms into something even more powerful.
Generating high-quality and meaningful automated content is still very hard. Such challenges are now being embraced by a few startups such as The Grid.
One can easily extrapolate that in the near future such algorithms will generate the vast majority of the textual, visual and auditory content we will experience. Even converting an entire book into a movie or producing short synthetic video-clips from text is a plausible outcome.
Want to try for yourself?
As we hope to have demonstrated so far, generating high-quality and useful content with these amazing machine learning techniques is still an open research question.
To our benefit, most of these research efforts produce open-source code that anyone can try. Check out our reference list below for links to all the GitHub repositories mentioned in this post.
Note: Running these algorithms requires large-scale computing systems, typically involving multiple GPUs and CPUs. Please be aware of the requirements before getting too excited.
DCGAN GitHub