Behind the scenes of image making

By Time News

True or False? Relying on photos found online is increasingly risky, as images generated by artificial intelligence (AI) are multiplying. While some visuals are deliberately artistic and dreamlike, drawing inspiration from the world of science fiction, Japanese manga or even impressionist painting, others are disconcertingly realistic.

Like several other world celebrities, Pope Francis has seen his image hijacked countless times in recent weeks: on Twitter he has appeared in a white puffer jacket, with his arms covered in tattoos, and even as a nightclub DJ. These fake photos were made with Midjourney, software that was released in July 2022 and is already in its fifth version.

A booming phenomenon for five years

Deepfakes (or “hyperfaking”) are nothing new, however: they date back to 2018. That year, a video showed a phlegmatic Barack Obama insulting Donald Trump on camera, with the American flag in the background. Three years later, a larger-than-life Emmanuel Macron announced an imminent nuclear attack to the French, before a fake Brigitte joined him for a hug.

These manipulations of photos and videos first emerged in the pornography industry, before gaining ground in entertainment, politics and even art. In 2018, a painting “painted” by algorithms was sold at auction for $350,000 in New York. The technology was still far from mature at the time and the result was much less impressive than today's, as the blurred face in the portrait shows.

Tools now accessible to the general public

In five years, the situation has changed, starting with the technology used. Until 2022, image generation was based on a class of algorithms called GANs (“generative adversarial networks”), introduced in 2014. This type of unsupervised learning algorithm trains itself on a very large number of captioned illustrations to produce new, ever more realistic images of a given type.
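To give a rough idea of what “adversarial” means here, the sketch below pits two tiny programs against each other: a generator that produces fakes and a discriminator that tries to tell them from real samples. It is only a toy under stated assumptions: the “images” are single numbers close to 4, both models have two parameters each, and every name and setting (`train_toy_gan`, the learning rate, the number of steps) is hypothetical rather than taken from any real tool.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function, returns a value between 0 and 1.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    """Toy 1-D GAN. All names and settings are illustrative assumptions."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator G(z) = a*z + b (the "forger")
    w, c = 0.0, 0.0  # discriminator D(x) = sigmoid(w*x + c) (the "detective")
    for _ in range(steps):
        real = rng.gauss(4.0, 0.5)  # a "real" sample: a number near 4
        z = rng.gauss(0.0, 1.0)     # random input given to the generator
        fake = a * z + b

        # Discriminator step: push D(real) up and D(fake) down
        # (gradient of -log D(real) - log(1 - D(fake))).
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        grad_w = -(1.0 - d_real) * real + d_fake * fake
        grad_c = -(1.0 - d_real) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: adjust a and b so that D(fake) goes up
        # (gradient of -log D(fake) with respect to the fake sample).
        d_fake = sigmoid(w * fake + c)
        grad_fake = -(1.0 - d_fake) * w
        a -= lr * grad_fake * z
        b -= lr * grad_fake
    return a, b, w, c
```

Calling `train_toy_gan()` returns the four learned parameters. Real image generators follow the same tug-of-war but with deep neural networks and millions of pictures, and this kind of training is known to oscillate rather than settle neatly.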

The year 2022 saw the emergence of new tools that are more efficient, more precise and more accessible to the general public, based on so-called “diffusion” algorithms. The image is “hallucinated” by the artificial intelligence through progressive “denoising”, which adjusts the result, pixel by pixel, to match the instructions given.
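The progressive “denoising” described above can be sketched in a few lines. This is a simplified illustration, not how Midjourney or any other product is actually implemented: the “image” is a short list of pixel values, the noise schedule is linear, and the trained neural network that would normally guess the clean image from the text instructions is replaced by an oracle that already knows the target. The function name `denoise_toy` and its parameters are hypothetical.

```python
import math
import random


def denoise_toy(target, steps=50, seed=0):
    """Turn random noise into `target` through gradual denoising (toy sketch).

    `target` is a list of pixel values standing in for "what the prompt
    asks for". In a real tool a trained network estimates the clean image;
    here an oracle simply returns `target` (an assumption for illustration).
    """
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in target]  # start from pure noise
    snapshots = [list(image)]
    for t in range(steps, 0, -1):
        keep_now = 1.0 - t / steps         # share of signal at step t
        keep_next = 1.0 - (t - 1) / steps  # slightly more signal at step t-1
        clean_guess = target               # oracle stand-in for the network
        new_image = []
        for pixel, clean in zip(image, clean_guess):
            # Noise implied by the current pixel and the clean guess.
            noise = (pixel - math.sqrt(keep_now) * clean) / math.sqrt(1.0 - keep_now)
            # Re-mix the guess with a little less noise than before.
            new_image.append(math.sqrt(keep_next) * clean
                             + math.sqrt(1.0 - keep_next) * noise)
        image = new_image
        snapshots.append(list(image))
    return snapshots
```

For example, `denoise_toy([0.0, 0.5, 1.0, 0.25], steps=20)` returns 21 snapshots: the first is random noise and the last matches the target, with each intermediate step a little less noisy than the one before.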

Once a first image has been obtained, the user can request modifications, and the more detailed the description, the more closely the image produced will match expectations. The comic book author Joann Sfar was thus able to recreate a photo of his late mother, a very close likeness, simply by describing her face to Midjourney.

Even if we don’t all have the ability to describe a profile or a landscape so finely, we no longer need to be computer experts to create highly precise images from scratch. Nor does imagination have any limits, and the results are sometimes surprising: a giraffe jumping rope on an ice floe, for example.

Before Midjourney, the first tool of this type to be developed was Dall-E (a contraction of “Salvador Dali” and the robot from the animated film “Wall-E”). It was unveiled in January 2021 by the American startup OpenAI, which in November 2022 launched the now famous chatbot ChatGPT.

In the wake of Dall-E, several alternatives have emerged, some of them free or open source. The main ones are Midjourney, which is the reference today, but also Craiyon, which offers a mosaic of nine images each time, and Stable Diffusion, a British software with 10 million daily users.

They all have free versions, but Midjourney’s was suspended on March 30 because of an influx of users that its CEO, David Holz, considered excessive. A monthly subscription now costs between 10 and 60 dollars.

How to recognize an AI-generated image?

These technologies are improving day by day, but some anomalies can still give doctored images away. Hands, in particular, are the part of the body that artificial intelligence finds hardest to represent, because of their highly structured appearance. Automatically generated images will sometimes contain twisted knuckles or disproportionate fingers, as in a fake shot revisiting the slap Will Smith gave Chris Rock during the 2022 Oscars ceremony.

Background details may also lack precision or consistency: misaligned windows, crooked walls and so on. Finally, human skin rarely has the “grain” of an unretouched photo and often still resembles the silicone-like skin of video game heroes.
