Visualizing the Descriptions in Novels with AI Generation

In an article recommending the novel Salt Town, Liu Shen Lei Lei mentioned the novel's scenery descriptions, one of which described "the sun soaked in salt".

What would a "salt-soaked sun" look like? My imagination falls short. Many AI programs developed abroad can now draw pictures: give one a few keywords and it generates a new image. By analogy, what would "the sun stained with honey" or "the sun fried with peanuts" look like? Some novels describe landscapes that humans can hardly imagine, and machines may now be able to draw them. Turning verbal description into images can be visually striking.

As artificial intelligence develops further, "reverse visualization" should become possible: hand the software paintings by Picasso, Van Gogh or Monet, and it generates a set of descriptive words about the content and style of each painting.

There are now a variety of AI image generators that use artificial intelligence algorithms to convert text into images. Enter a text prompt or description, and within a few seconds these tools turn your idea or concept into a visual representation, that is, a picture. They are based on deep learning algorithms trained on large image datasets and the corresponding text descriptions.

As training scale expands rapidly and experience accumulates, this kind of AI image generator will become smarter, more flexible and more creative. At present, however, it works in one direction only, "from text to image". I believe someone will design the reverse operation, "from image to text". We could then use software to generate a paragraph about a picture that reflects the content and style of the work. That would be the beginning of AI analysis of artworks, even if it is a bit rough at first.

Beyond two-way generation between text and images, once video analysis is deep enough and video datasets are large enough, we can expect training to produce two-way generation between text and video. Write a prompt, and the AI generates a video matching its content and style; conversely, give it a video, and it generates a paragraph explaining the video's content and style. The former would set a precedent for generating videos automatically from nothing but spoken or written words, while the latter would be the starting point for AI analysis and criticism of film and television. If one day people can generate a film or television work automatically by supplying the screenplay in spoken or written language, would that still seem incredible?

Two-dimensional "text to image" generation has already been realized. Can three-dimensional "text to video" generation be far behind? If I had "text to video" software, I might be interested in "producing" short films such as Guan Gong vs. Qin Qiong, Marx entering the Confucius Temple, or even more imaginative stories, and they might well turn out to be very realistic works. How about you?

Let us call "text to image" the forward direction and "image to text" the reverse direction. What would happen if a set of words generated in the reverse direction were used as a prompt and a picture were then generated from it in the forward direction? It is like translating a Chinese poem into English and then translating the English poem back into Chinese. After a few cycles, as elements of language and culture are unconsciously lost and gained, the result is as unrecognizable as the final message in a game of telephone.

Imagine further that AI software repeats forward and reverse generation again and again, so that an oscillation forms. Compared with the initial input, the final output is certain to be unrecognizable, yet every step of "transcription" and "translation" remains documented and traceable.
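To make this loop concrete, here is a minimal Python sketch. It is only an illustration under loud assumptions: `round_trip`, `toy_forward` and `toy_reverse` are hypothetical names, and the two toy functions are not real AI models. They merely imitate "loss" (the forward step drops the last word) and "gain" (the reverse step adds a word of its own), while the loop records every intermediate result.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, object]  # ("text" or "image", value)


def round_trip(
    prompt: str,
    forward: Callable[[str], object],
    reverse: Callable[[object], str],
    cycles: int,
) -> List[Step]:
    """Alternate forward (text -> image) and reverse (image -> text)
    generation, keeping every intermediate result so the chain stays
    traceable."""
    history: List[Step] = [("text", prompt)]
    text = prompt
    for _ in range(cycles):
        image = forward(text)
        history.append(("image", image))
        text = reverse(image)
        history.append(("text", text))
    return history


def toy_forward(text: str) -> List[str]:
    """Hypothetical stand-in for text-to-image: the "image" is just the
    list of words, and the last word is lost along the way."""
    return text.split()[:-1]


def toy_reverse(image: List[str]) -> str:
    """Hypothetical stand-in for image-to-text: describes the "image"
    but adds a word that was never in the prompt."""
    return " ".join(["vivid"] + list(image))


if __name__ == "__main__":
    for kind, value in round_trip(
        "the sun soaked in salt", toy_forward, toy_reverse, cycles=5
    ):
        print(kind, value)
```

With these toy stand-ins, none of the original words survive five cycles, yet the returned history still shows exactly where each one was lost.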

After forward and reverse generation have been superimposed many times, the result has undergone a qualitative change and become something new, perhaps a kind of "emergence". Each step is trivial, but the accumulated loss and gain make the whole so intricate that its ins and outs can no longer be understood. By comparison, the language of modernist writers and painters has gone through only low-frequency oscillation. Perhaps a sense of innovation is born unexpectedly in such a process. Modernist creations such as Misty Poetry suggest as much.

With the advent of the camera, "likeness" was no longer rare, and the value of realistic painting plummeted. Modernist and postmodernist figures of every stripe found a way out in "unlikeness", claiming that it is the truer likeness in essence. Now AI (artificial intelligence) is everywhere. What will our future works of words, images and videos look like?

Thoughts prompted by the novel Salt Town, recommended by Liu Shen Lei Lei

2023-03-06
