I've never been one for social media: Facebook, Twitter, Snapchat. I've never felt the need to produce or consume content on any of these platforms. However, most people do, and because of that each platform is a major target for any large corporation's marketing team, particularly Instagram, which has very high 'engagement' among its users. Companies are more than happy to throw money at users with large followings in order to market their products in 'organic' streams, so much so that being a successful Instagrammer can be a serious money grab. Just look at these headlines:
That sounds pretty appealing, maybe even appealing enough to convince me to give it a shot. Of course, in order to learn to become Instafamous, we must do some market research to find what qualities successful Instagrammers have. Taking a look at the most followed people on Instagram, we predictably find that they are all already established celebrities.
Looking past the obvious, we also find that talented artists do pretty well on Instagram.
And then there's me. I'm neither a celebrity nor talented, so the outlook is poor for my success on Instagram :(
I'm not going to become Instafamous by being myself, so my only shot is to pretend to be someone else (this is the internet, after all; it's only expected that you pretend to be someone different). I'd have to be someone talented. A celebrity. A talented celebrity. Well, there's one clear choice that fits these criteria:
Bob Ross. Talented, famous, and also no longer around to turn his success into Instafame. An obvious choice. And that led to...
The Birth of Robob Ross and his Quest for Instafame
The idea is something like this: content images will be chosen from the day's popular #landscape posts and the poor ones filtered out. The remaining photos will have style transfer applied to them to render them in the style of Bob Ross paintings. The product will be posted to Instagram, growing Robobross's following while I sit back and wait for BigCo Inc. to come to me for a paid spot of a Bob Ross style painting of a TastyBurger (tm)!
More formally, here's what that looks like:
Looks like it could work. Let's take a closer look at the different components.
Developing a content candidate filter
Garbage in = Garbage out
A lot of photos retrieved from a specific hashtag, e.g. #landscape, aren't very representative of a landscape, or may simply not be worth trying to stylize. To filter out the obviously bad content candidates, we could train a new image classifier on some labeled data; however, training a deep network from scratch is infeasible on my laptop in any reasonable amount of time. Another common approach would be to take a pretrained network, such as one from the ILSVRC, and train an additional layer on top of it using my specific data set. This transfer learning approach would significantly speed up training; however, it would still mean pushing my data through a rather large network, repeatedly, on my laptop.
The final choice that I settled on is a more confined version of the transfer learning approach. I will use a pretrained image classifier to produce classification tags for an image, then train a much simpler model on those tags to produce predictions.
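To make the "much simpler model" concrete, here is a minimal sketch of one option: a tiny logistic regression over tag scores, written in plain Python. The post does not say which model was actually used, so the model choice, the function names (`train_logistic`, `predict_good`) and the hyperparameters are all assumptions for illustration.

```python
import math


def _sigmoid(z):
    # Clamp to keep math.exp from overflowing on extreme scores.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=200, lr=0.5):
    """Fit a logistic regression with plain SGD.

    rows:   list of {tag: score} dicts (hypothetical format, one per image)
    labels: 1 for a good landscape, 0 for a reject
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, y in zip(rows, labels):
            z = bias + sum(weights.get(t, 0.0) * v for t, v in feats.items())
            err = _sigmoid(z) - y
            bias -= lr * err
            for t, v in feats.items():
                weights[t] = weights.get(t, 0.0) - lr * err * v
    return weights, bias


def predict_good(feats, weights, bias):
    """Probability that an image with these tags is worth stylizing."""
    z = bias + sum(weights.get(t, 0.0) * v for t, v in feats.items())
    return _sigmoid(z)


# Toy data standing in for real tag scores.
rows = [
    {"lakeside": 0.8, "valley": 0.1},
    {"alp": 0.7},
    {"comic_book": 0.9},
    {"web_site": 0.6, "comic_book": 0.2},
]
labels = [1, 1, 0, 0]
weights, bias = train_logistic(rows, labels)
```

Because the inputs are a handful of tag scores rather than raw pixels, something this small trains in well under a second on a laptop, which is the whole point of the approach.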
Hand labeling a data set yourself may sound rather off-putting to many, but not to fear, it's actually quite easy in this case. I started with only about 500 images from the #landscape hashtag. To label them, I merely used Finder (or any file browser) to select all the images I knew would not work well, then moved them into a rejects directory. The rest went into an accepted directory. The process only took about 15 minutes and the results were quite good!
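Turning those two directories into labels takes only a few lines. This is a rough sketch: the `accepted` and `rejects` directory names come from the description above, while the function name, the root directory and the list of image suffixes are assumptions.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the scraper saved.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def load_labels(root):
    """Return (path, label) pairs: 1 for accepted images, 0 for rejects."""
    root = Path(root)
    labeled = []
    for dirname, label in (("accepted", 1), ("rejects", 0)):
        for path in sorted((root / dirname).iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                labeled.append((path, label))
    return labeled
```

Calling `load_labels("data")` on a hypothetical `data/` folder would then give one list that can be counted, shuffled, or split into training and test sets.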
The final tally: 257 bad landscapes and 308 good landscapes.
Producing Training Data (meta-features)
Seeing as I'm trying to avoid training any sort of neural network, I need to extract features out of the images. I produce these meta-features using the VGG19 model pretrained on ImageNet for ILSVRC-2014. Keras makes this very easy by shipping the pretrained model in its applications module.
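Here is roughly what that looks like, as a sketch rather than the exact code I ran. It assumes a TensorFlow 2.x install where `tensorflow.keras.applications.vgg19` and `tensorflow.keras.utils.load_img` are available; the helper names and the `min_score` cutoff are made up for illustration. The heavy imports live inside the functions so the small helper can be used on its own.

```python
def tags_from_decoded(decoded, min_score=0.01):
    """Turn decode_predictions output into a {tag: score} dict.

    decoded:   list of (class_id, class_name, score) tuples for one image
    min_score: hypothetical cutoff for dropping very unlikely tags
    """
    return {name: float(score) for _, name, score in decoded if score >= min_score}


def load_tagger():
    """Load VGG19 with its ImageNet weights (downloaded on first use)."""
    from tensorflow.keras.applications.vgg19 import VGG19
    return VGG19(weights="imagenet")


def tag_image(model, image_path, top=10):
    """Run one image through VGG19 and return its top tags with scores."""
    import numpy as np
    from tensorflow.keras.applications.vgg19 import (
        decode_predictions,
        preprocess_input,
    )
    from tensorflow.keras.utils import img_to_array, load_img

    # VGG19 expects 224x224 RGB input.
    img = load_img(image_path, target_size=(224, 224))
    batch = preprocess_input(np.expand_dims(img_to_array(img), axis=0))
    decoded = decode_predictions(model.predict(batch), top=top)[0]
    return tags_from_decoded(decoded)
```

Typical use would be `model = load_tagger()` once, followed by `tag_image(model, "some_photo.jpg")` for each image.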
I then run this model over all the training images and generate tags (meta-features) for each one and store them all in a .csv.
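One way to lay out that .csv is one row per image and one column per tag, with 0.0 wherever a tag was not predicted. The layout, the function name and the `{image: {tag: score}}` input format are assumptions here, not necessarily the format I ended up with.

```python
import csv


def write_tag_csv(tags_by_image, csv_path):
    """Write {image_name: {tag: score}} as a wide CSV and return the tag columns."""
    # The column set is the union of every tag seen across all images.
    columns = sorted({tag for tags in tags_by_image.values() for tag in tags})
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image"] + columns)
        for image, tags in sorted(tags_by_image.items()):
            writer.writerow([image] + [tags.get(col, 0.0) for col in columns])
    return columns
```

A wide table like this can be loaded straight into whatever simple model gets trained on the tags, since every row has the same columns in the same order.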
Learning to Paint
Neural style transfer was introduced a couple of years ago as a method of applying one image's artistic style to another image. It works by running a content image and a style image through a pretrained CNN, then optimizing a new image until its deeper-layer activations match those of the content image while its feature statistics match those of the style image. This results in an image in the same artistic style and color as the style image, but with the subject of the content image, and it tends to work quite well (see paper). However, after experimenting a lot with neural-style-tf and its handful of parameters, I found that the results were too inconsistent and often filled with artifacts. Here's an example:
Not so satisfactory. The results all end up with only a slight color shift and an over-pronounced, brush-like texture applied to them. Fortunately, there have been more developments in style transfer since the original paper came out.
Instead of trying to adapt neural style transfer, I opted to try CycleGAN. CycleGAN is able to translate an image from one space (photo) to another (Bob Ross painting) using only images from each category (they don't even need to be paired). It does this by constructing two GANs, one per direction: one to translate images from category A to category B, the other to translate images from category B to category A. Using CycleGAN, I can generate a Bob Ross painting not by simply copying a single image's style to another, but by applying a learned knowledge of many Bob Ross paintings to an image!
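What ties the two translators together is a cycle-consistency term: translating a photo to a painting and back again should return something close to the original photo, and likewise in the other direction. Below is a toy sketch of just that term, using flat lists of pixel values and trivial stand-in "generators" (`darken`, `brighten`, `identity`) that are purely hypothetical. The real model adds adversarial losses and uses deep networks for both translators.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(photo, painting, photo_to_painting, painting_to_photo):
    """L1 error after a round trip in each direction."""
    photo_round_trip = painting_to_photo(photo_to_painting(photo))
    painting_round_trip = photo_to_painting(painting_to_photo(painting))
    return (l1_distance(photo_round_trip, photo)
            + l1_distance(painting_round_trip, painting))


# Toy stand-ins for the two generators.
def darken(pixels):
    return [p * 0.5 for p in pixels]


def brighten(pixels):
    return [p * 2.0 for p in pixels]


def identity(pixels):
    return list(pixels)


photo = [0.25, 0.5, 0.75]
painting = [0.125, 0.25, 0.5]

# brighten undoes darken exactly, so the round trips are lossless.
perfect = cycle_consistency_loss(photo, painting, darken, brighten)
# identity does not undo darken, so the round trips drift from the originals.
broken = cycle_consistency_loss(photo, painting, darken, identity)
```

A pair of generators that invert each other scores zero here, while a mismatched pair is penalized. That pressure is what lets training work without paired photo/painting examples.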
This approach, however, required training a model (two GANs) on a set of Bob Ross paintings and another set of landscape photos. It turned out the CycleGAN source included a repository of over 6000 landscape photos from Flickr, and I used The Bob Ross Painting Index to grab about 400 of Bob Ross' paintings (screen captures taken from his show). The training was very slow. At first, I could only use a CPU, which gave me about one epoch per day (!!). I could only bear it through about 5 epochs before I had a GPU I could put to the task. I was able to get some nice results after about 50 epochs, or two days, of training.
Here are some examples of how it progressed over 50 epochs.
And here are some results from the final model, including the original content image on the right.
There's only one problem with CycleGAN: I'm unable to generate pictures much larger than 256x256. My GPU only has 4GB of RAM, which limits the size of the generator network I'm able to load. It also seems Torch models trained on a GPU won't work on a CPU, although I need to investigate further. Instagram scales everything to 1080x1080, which would make my images look rather ridiculous, so rather than trying to scale up a single image, I'll combine a couple to make photo sheets. This is less than optimal because it won't allow me to record feedback image by image, but there aren't many better options at the moment.
Hashtags are a central part of Instagram and play a large part in determining the success of a post and the growth of a userbase. There's a whole problem of optimizing the hashtags to reach the largest user base possible, but for now, remember the tags we had VGG generate for filtering out bad photos? We can use those as hashtags!
Here's what a post will look like:
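Converting tags to hashtags is mostly string cleanup. A small sketch, assuming the tags arrive as a `{class_name: score}` dict with ImageNet-style names such as `steel_arch_bridge`; the function name, the `limit` parameter and the choice to drop underscores are all mine, made for illustration.

```python
import re


def tags_to_hashtags(tags, limit=10):
    """Return hashtags ordered from most to least confident tag."""
    ranked = sorted(tags.items(), key=lambda kv: kv[1], reverse=True)
    hashtags = []
    for name, _score in ranked:
        # Keep only lowercase letters and digits; dropping underscores is a style choice.
        cleaned = re.sub(r"[^0-9a-z]", "", name.lower())
        if cleaned and "#" + cleaned not in hashtags:
            hashtags.append("#" + cleaned)
    return hashtags[:limit]
```

For example, `tags_to_hashtags({"lakeside": 0.6, "steel_arch_bridge": 0.1, "valley": 0.3})` gives `["#lakeside", "#valley", "#steelarchbridge"]`.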