deep learning and neural art

Discussion in 'Other Genres' started by piggsy, Mar 21, 2016.

  1. piggsy

    piggsy Mu-43 All-Pro

    Do you remember the horrifying/fascinating neural art from Google's deep learning AI a while ago?

    Google’s Artificial Brain Is Pumping Out Trippy—And Pricey—Art

    Anyway, there are several paid and free hosted versions of this running on AWS and other hosts, but I got bored waiting and installed Neural Styles at home. That basically turned into an 8-hour trip reminding me why I dislike Linux, but I finally got something working.

    Basically you take a style image, say,


    and cross it with another image, say,


    The deep learning AI works out what makes up the first image and smashes it into the second one over xxxx iterations, and eventually you get -


    Some others - some of which work better than others but I'm kind of at a point where I'm happy to see anything coming out of this at all. See if anyone can guess some of the style images :D








    If anyone has something they'd like to throw through it (styles with a strong, distinctive look work best), post a style image and a content image and I'll get around to it. About 550px is what I can handle with the setup as it is now (long story, but holy crap, wow, does Linux ever have a case of DLL hell these days).
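    For anyone curious what "learns what makes up the first image" roughly means, here is a minimal NumPy sketch, assuming the Gram-matrix style loss used by the usual neural style approach. The feature maps below are random stand-ins for CNN activations (a real run pulls them from a pretrained network), and the function names are made up for illustration, not taken from the program itself.

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel correlations of a (channels, height, width) feature map."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    # Dividing by the pixel count keeps maps of different sizes comparable.
    return flat @ flat.T / (h * w)

def style_loss(generated, style):
    """Mean squared difference between the two Gram matrices."""
    diff = gram_matrix(generated) - gram_matrix(style)
    return float(np.mean(diff ** 2))

# Random stand-ins for CNN activations (assumption: 8 channels).
rng = np.random.default_rng(0)
style_feats = rng.standard_normal((8, 20, 12))      # style image features
generated_feats = rng.standard_normal((8, 16, 16))  # output image features

print(gram_matrix(style_feats).shape)            # (8, 8)
print(style_loss(generated_feats, style_feats))  # positive: styles differ
print(style_loss(style_feats, style_feats))      # 0.0: identical style
```

    Each iteration nudges the output image so this number shrinks while a separate content term keeps it looking like the second image. The Gram matrix throws away where things are and keeps only which features occur together, which is why the style image doesn't need to match the content image in size.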
    Last edited: Mar 21, 2016
  2. tkbslc

    tkbslc Mu-43 Legend

    Pretty decent art, really!

    ...................And we continue to engineer our own obsolescence.
  3. TassieFig

    TassieFig Mu-43 Top Veteran

    Oct 28, 2013
    Tasmania, Australia
    Never heard of this before but it's not bad.
    Surely you have a van Gogh in there. Starry Night butterfly maybe?
  4. piggsy

    piggsy Mu-43 All-Pro

    Yup, starry night is pretty identifiable :D Some of the earlier iterations it pushed were "starrier" but I wasn't paying attention to what it was doing and it overwrote them when I pushed through the next set.
  5. piggsy

    piggsy Mu-43 All-Pro

    Jim Thirlwell vs Basil Brush

    Rihanna vs potato

    Hillary Clinton vs green tree python

    Donald Trump vs bacon
  6. piggsy

    piggsy Mu-43 All-Pro

    Here's another couple of cool toys, in case anyone still hasn't played with this yet.

    1) Prisma - app for iOS/Android that takes any photo and does the above magic on it for you in a number of preselected art styles. Much less hassle than installing Linux to do it.


    By default it works only with your cameraphone and only as an app, but you can feed it any image from your regular camera over USB/SD card and have it uploaded and processed by them in a few seconds.

    Here's an example -



    It doesn't do a fantastic job all the time, but you retain a lot more of the torch/cuda/lua version's functionality than with most of the uploaders you'll find, such as being able to control the strength of the effect with a slider. And while it doesn't come with a "bacon" or "snek" style, it has a lot of different art styles.

    One downside - it spits out fixed-resolution, fixed-quality JPG images. But! Deep learning has another toy for us to help with that.

    2) Waifu2x


    This one again has a Linux/CUDA version, but it also has a dead simple Windows version that uses OpenCV and is about as easy to use as this kind of thing can get - both can be found at the GitHub link. If you are happy to put up with limited output size, use the web portal one at the link. Waifu2x is a deep learning resizer and denoiser that works a lot better than typical upscaling methods on simple image styles - it's an AI "trained" on anime with thick bold lines and colours and no noise, and it tries as hard as it can to make any image it sees look like that.

    So say we take our little JPG that Prisma spat out - which has a lot of coloured areas and bold lines - and we want to make it bigger. It's only 1080x1080, which is a bit shithouse if we wanted to do much more than stare at it on a phone (though I should mention, the Neural Styles program I was using above is extremely VRAM hungry, and even a GTX 970 with ~3.6 GB available will have trouble doing even that high a res, and switching to CPU processing is very slow in comparison). Here's what Waifu2x can do to a 1080px, quality-80 JPG, set to 4x scale and low-level denoise.

    Original - 100%

    Enlargement 4x - 100%

    I could have gone more aggressive on de-JPG/denoise, but I only had so much time. Personally, for anything that isn't 100% cartoonish I'd probably stick to Neat Image for another round of final denoising; it just seems like it would help to control the noise at the enlargement stage, before its frequency gets too low for other programs to see it.

    But yeah! Not bad, eh? Here's the original full size file for comparison (edit - and the Waifu'd Prisma'd 4x conversion here).

    The real fun starts when you realise that pretty much anything with a cartoonish art style can be transformed from a ~720p or even lower noisy JPG into something that will look great at several tens of megapixels. So here's a bit of fun with that, which is simply unbelievable given the size and quality of the starting image (should note - it spits out PNGs, so ignore any JPG artifacts resulting from re-saving it here).

    A fun Google search is, say, "Mondo movie poster", specifying "large" or "2mp" in search tools, then watching Waifu2x eat those files and spit out flawlessly unpixelated and noiseless versions at 8000x10000 px or more. Other neat part - Waifu's a lot less complicated and a lot less intensive than Neural Styles, so it can generally be relied upon to spit out something of a (very) large size in about 20-40 minutes on my 4.3 GHz Devil's Canyon i5. With NS you'd be hard pressed, even on a 12 GB Titan card, to have something nice at that res before your next meal.
    Last edited: Jul 31, 2016
  7. piggsy

    piggsy Mu-43 All-Pro

    Well now I feel dumb, it turns out there's a Windows-native, GPU-accelerated (both OpenCL for AMD and CUDA for Nvidia) version over there also -

    GitHub - tanakamura/waifu2x-converter-cpp: waifu2x re-implementation in C++ using OpenCV

    It is dynamite fast. After having 4 PCs cranking out images all day (obtw: for this at least, Haswell > Ivy Bridge > Sandy Bridge by a LOT more than you might think), watching my 970 tear through in about 2 minutes what was taking hours is pretty humbling (it's not like the CPU one is slow, just, if you want it done in several iterations at several NR levels for hundreds of images...). This one seems extraordinarily memory efficient too: even this one, with a final output size of 13644x10500 from a 53 MB lossless TIFF source, used ~700 MB of VRAM doing it.

    (ed - whoops, seems I spoke too soon on this, that was just background usage. The GPU version doesn't support non-power-of-2 scaling factors, and going for 8x on a 16 MP image crashes it by running out of VRAM - running it again on the other, CPU-only version indicates it is going to try and allocate in excess of 16 GB of memory!)
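    Some back-of-envelope arithmetic shows why 8x on a 16 MP image blows up. This is a rough sketch, assuming a 4608x3456 source (a typical 16 MP 4:3 frame, picked for illustration) and that the converter keeps at least one full-size float32 RGB buffer; how it really allocates memory isn't something I've checked.

```python
def upscaled_size(width, height, scale):
    """Output dimensions after scaling both sides by `scale`."""
    return width * scale, height * scale

def buffer_gib(width, height, channels=3, bytes_per_value=4):
    """Size in GiB of one uncompressed buffer (defaults assume float32 RGB)."""
    return width * height * channels * bytes_per_value / 2**30

# Hypothetical 16 MP source at 8x.
out_w, out_h = upscaled_size(4608, 3456, 8)
print(out_w, out_h)                     # 36864 27648
print(round(out_w * out_h / 1e6))       # 1019 (megapixels)
print(round(buffer_gib(out_w, out_h), 1))  # 11.4 (GiB for a single copy)
```

    A single copy of the output is already far beyond the 970's VRAM, and any intermediate layers held at full size would come on top of that, so an allocation past 16 GB on the CPU version isn't surprising.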

    PS, here's how it handles a regular image also -

    Original - 100%


    Waifu2x - 4x scaling - 100%


    Bear in mind this is the simplest possible thing you can do, scaling only, no noise reduction, no changed sharpening to adjust to the new size, etc. If you went and re-did it with this as part of the whole toolchain you could go pretty nuts.

    If you manage to find lossless source images of posters (what I've been using it for) it looks like someone went in and hand-redrew the poster as vector art perfectly - thankfully someone did one of the Mondo The Thing posters by hand in Illustrator and did some ultra-high res outs of it here:

    The Thing Tyler Stout Mondo Poster Vector Drawing DOWLOAD - Blu-ray Forum

    so we know what that would look like. And it looks .. like that. Google has an advanced image searcher here -

    that has a couple of handy options you normally don't get, like being able to specifically search for PNGs. Really, Waifu2x's only weakness is that it treats JPG compression as just another kind of noise, or rather, the not-so-great quality settings of ~60-75 at most that people use on the net. Nothing can really, truly fix that without some hand work involved; it's fine for display at less than 100% on a monitor, but otherwise ... yeah. Anyway. Go nuts.
    Last edited: Aug 1, 2016
  8. barry13

    barry13 Super Moderator; Photon Wrangler Subscribing Member

    Mar 7, 2014
    Southern California
    @piggsy, could images be broken into tiles before processing, to get around RAM limits, or would that create artifacts at the joints?

  9. piggsy

    piggsy Mu-43 All-Pro

    Yep, though I seem to remember that for some of these deep learning things you could end up with shitty artifacts doing that - you can see it on that first Prisma sample image, where the edges of the image have a thin border as it runs out of things to neatly feed into it. You'd have to play with it and see how it works. The other alternative is what they do on the web site version where it offers a 1.5x factor: just shrink the image down by a % first.
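    The usual way around seams is to give each tile some overlap with its neighbours and crop it off afterwards. Here's a minimal NumPy sketch of that idea; `upscale_nearest` is only a stand-in for the real upscaler, and the function names plus the tile and pad sizes are assumptions for illustration, not anything Waifu2x exposes.

```python
import numpy as np

def upscale_nearest(img, scale):
    """Stand-in for the real upscaler: nearest-neighbour enlargement."""
    return np.repeat(np.repeat(img, scale, axis=0), scale, axis=1)

def upscale_tiled(img, scale, tile=64, pad=8, upscale=upscale_nearest):
    """Upscale tile by tile, giving each tile `pad` pixels of surrounding context."""
    h, w = img.shape[:2]
    out = np.zeros((h * scale, w * scale) + img.shape[2:], dtype=img.dtype)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            # Padded tile bounds, clamped to the image edges.
            y0, x0 = max(y - pad, 0), max(x - pad, 0)
            y1, x1 = min(y + tile + pad, h), min(x + tile + pad, w)
            big = upscale(img[y0:y1, x0:x1], scale)
            # Crop the padding back off so only the tile's own pixels are kept.
            ty, tx = (y - y0) * scale, (x - x0) * scale
            th = (min(y + tile, h) - y) * scale
            tw = (min(x + tile, w) - x) * scale
            out[y * scale:y * scale + th, x * scale:x * scale + tw] = \
                big[ty:ty + th, tx:tx + tw]
    return out

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(150, 200, 3), dtype=np.uint8)
tiled = upscale_tiled(image, scale=2)
print(tiled.shape)                                       # (300, 400, 3)
print(np.array_equal(tiled, upscale_nearest(image, 2)))  # True
```

    With the stand-in the tiled result matches the one-shot result exactly. With a real network the padding would need to be at least as wide as the area each output pixel depends on, and the outer edges of the whole image still have nothing extra to feed in, which fits the thin border on the Prisma sample.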

    Oh also, just sitting in a subfolder of that 2nd version of Waifu2x I linked is an incredibly handy thing that can add Waifu2x to the Windows "Send to..." right-click menu. It supports everything the command line version does, as well as things like custom processing options for different image types, and renaming of output files based on wildcards and file handling options.