Convert Any Image into a Short Story

Falah Gatea
5 min read · Sep 8, 2022

In this article, we will learn how to convert any artwork into a short story using an easy Python package that I created with the BLIP and GPT-3 models.

A painting designed with artificial intelligence technology by the author

Note

This package is demonstrated on Google Colab with some examples. Colab comes with the common packages and libraries already prepared, which avoids the trouble of installing difficult libraries, so you do not need a high-specification computer. So let us begin.

What is the BLIP Model?

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

BLIP was introduced in early 2022 and was the state of the art on a range of vision-language tasks at the time of writing. It can generate more accurate captions than earlier state-of-the-art models. If you need more information about this model, see the research paper [1].

The model uses a Vision Transformer, ViT (Dosovitskiy et al., 2021), which divides the input image into patches and encodes them as a sequence of embeddings, with an additional [CLS] token to represent the global image feature. As the authors mention, ViT has a lower computation cost, is a straightforward method, and is being adopted by recent methods.

BLIP makes effective use of noisy web data by bootstrapping the captions: a captioner generates synthetic captions and a filter removes the noisy ones. The authors report state-of-the-art results on a wide range of vision-language tasks, such as image-text retrieval (+2.7% in average recall@1), image captioning (+2.8% in CIDEr), and VQA (+1.6% in VQA score). You can read about the BLIP model architecture in the project repository [2].
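To make the patch idea concrete, here is a minimal sketch in plain Python of how an image grid can be cut into patches and turned into a sequence with a leading [CLS] entry. It is only an illustration: the function names are made up for this example, the "image" is a tiny grid of numbers, and a real ViT would project each patch to a learned embedding and use a learned [CLS] vector rather than a placeholder string.

```python
def split_into_patches(image, patch_size):
    """Split a 2-D grid (list of rows) into flattened, non-overlapping square patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch row by row.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


def build_token_sequence(image, patch_size, cls_token="[CLS]"):
    """Prepend a placeholder [CLS] entry to the row-major list of patches."""
    return [cls_token] + split_into_patches(image, patch_size)


# A toy 4x4 single-channel "image" whose pixels are numbered 0..15.
toy_image = [[row * 4 + col for col in range(4)] for row in range(4)]
sequence = build_token_sequence(toy_image, patch_size=2)
print(len(sequence))  # one [CLS] entry plus four patches
print(sequence[1])    # the top-left patch, flattened
```

The sequence length is always one more than the number of patches, which is why the [CLS] position can act as a summary slot for the whole image.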

What is the GPT-3 Model?

Generative Pre-trained Transformer 3 (GPT-3) is an auto-regressive language model that uses deep learning to produce human-like text. It is the third-generation language prediction model in the GPT-n series (and the successor to GPT-2) created by OpenAI, a San Francisco-based artificial intelligence research laboratory.
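"Auto-regressive" means the model produces text one piece at a time, and each new piece depends on what has already been written. The toy sketch below shows only that loop. The probability table is invented for illustration, and it looks at just the last word, whereas GPT-3 conditions on the whole preceding context with a neural network.

```python
def generate(next_word_table, start, max_words):
    """Greedy auto-regressive loop: keep appending the most likely next word."""
    words = [start]
    while len(words) < max_words:
        candidates = next_word_table.get(words[-1])
        if not candidates:
            break  # the toy table has no continuation for this word
        words.append(max(candidates, key=candidates.get))
    return " ".join(words)


# Hypothetical next-word probabilities, made up for this example.
toy_table = {
    "a": {"painting": 0.7, "forest": 0.3},
    "painting": {"of": 0.9, "a": 0.1},
    "of": {"trees": 0.6, "a": 0.4},
}

print(generate(toy_table, "a", max_words=6))
```

Real models usually sample from the probabilities instead of always taking the top choice, which is one reason the same prompt can give different stories.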

GPT-3 can write poetry and stories, translate text, chat convincingly, and answer abstract questions. It’s being used to code, design, and much more. The model has 175 billion parameters. To put that figure into perspective, its predecessor GPT-2, which was considered state of the art and shockingly massive when it was released in 2019, had 1.5 billion parameters. That was soon eclipsed by Nvidia’s Megatron with 8 billion parameters, followed by Microsoft’s Turing NLG with 17 billion parameters. OpenAI then turned the tables by releasing a model that is roughly 10 times larger than Turing NLG. GPT-3 is largely recognized for its language capabilities: when properly primed by a human, it can write creative fiction. More information about the GPT-3 model is available from OpenAI [3].
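The size comparison is easy to check with a few lines of arithmetic. This sketch uses only the parameter counts quoted above (the 17-billion-parameter Microsoft model is Turing NLG); the dictionary and function names are just for this example.

```python
# Parameter counts in billions, as quoted in the paragraph above.
PARAMETERS_IN_BILLIONS = {
    "GPT-2": 1.5,
    "Megatron": 8,
    "Turing NLG": 17,
    "GPT-3": 175,
}


def size_ratio(larger, smaller):
    """How many times bigger one model is than another, by parameter count."""
    return PARAMETERS_IN_BILLIONS[larger] / PARAMETERS_IN_BILLIONS[smaller]


for name in ("GPT-2", "Megatron", "Turing NLG"):
    print(f"GPT-3 is about {size_ratio('GPT-3', name):.1f}x the size of {name}")
```

So "10 times larger" is a rounded figure for the step from 17 billion to 175 billion parameters, and the jump from GPT-2 is more than a hundredfold.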

Now Let’s Begin

Using the two models to convert an image into a short story

Step-1

Install the package from PyPI, the default software repository where Python developers publish their packages.

!pip install Image2Story==0.0.6
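After installing, it can be handy to confirm which version the runtime actually picked up. The following is a small sketch using the standard library's importlib.metadata; the helper name installed_version is made up for this example, and the distribution name is the one used in the pip command above.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("Image2Story")
if version is None:
    print("Image2Story is not installed in this runtime")
else:
    print("Image2Story version:", version)
```

If the printed version is not the one you pinned, restarting the Colab runtime and running the install cell again is usually the first thing to try.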

Step-2

Import the package's image2caption module and clone the BLIP Git repository with this code:

from Image2Story import image2caption as dblip
dblip.download_blip()

Step-3

Download the BLIP model used for image captioning with this code:

dblip.model_download()

Step-4

Upload an image or any artwork to get a caption or description, then set its path:

#upload image
image_path='/content/name_of_image.png'
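A wrong path is the most common reason the later steps fail, so a quick check before captioning can save time. This is a minimal sketch with the standard library; the helper name and the list of accepted suffixes are assumptions made for this example, not part of the package.

```python
from pathlib import Path

# Assumption: a small allow-list of common image suffixes; extend it as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def image_path_problem(path_text):
    """Return a short description of what is wrong with the path, or None if it looks usable."""
    path = Path(path_text)
    if path.suffix.lower() not in IMAGE_SUFFIXES:
        return f"unsupported suffix: {path.suffix or '(none)'}"
    if not path.is_file():
        return f"file not found: {path}"
    return None


image_path = '/content/name_of_image.png'
problem = image_path_problem(image_path)
print(problem or "image path looks fine")
```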

Display the image:

#display image
from IPython.display import Image
Image(image_path, width = 600, height = 300)

Step-5

Call the package's image caption function with this code:

#image caption call function
caption=dblip.image_caption(image_path)
caption

The function returns the image caption:

“a painting of a building in the middle of a forest”

The final step is to convert the image caption into a short story by calling the package's GPT-3 function with the code below. You must have access to the GPT-3 model, which means getting an API key from the OpenAI site [4].

API_KEY='sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
prompt=caption
short_story=dblip.english_story(prompt,API_KEY)
short_story=short_story.replace('.','\n')
print('short story \n',short_story)
Image(image_path, width = 600, height = 300)

We will see the resulting short story:

“In the middle of a forest, there stands a tall, proud building It is a beautiful sight, surrounded by tall trees and green foliage The building is a school, and it is the only one in the forest It is a place of learning and growth, and it is the only place where the children of the Forest can go to get an education The school is very important to the people of the Forest, and it is the only place where they can learn about the world outside of their own”
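Pasting the API key straight into a notebook makes it easy to leak when the notebook is shared. One common alternative, sketched here, is to read the key from an environment variable. The variable name OPENAI_API_KEY and both helper names are assumptions for this example; the package itself only needs the key passed in as a string.

```python
import os


def read_api_key(variable_name="OPENAI_API_KEY", environ=None):
    """Fetch the key from the environment instead of hard-coding it in the notebook."""
    environ = os.environ if environ is None else environ
    key = environ.get(variable_name, "").strip()
    if not key:
        raise RuntimeError(f"set the {variable_name} environment variable first")
    return key


def mask_key(key, visible=4):
    """Hide all but the last few characters so the key can be printed safely."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


# Demo with a fake environment so the sketch runs without a real key.
demo_key = read_api_key(environ={"OPENAI_API_KEY": "sk-example-not-a-real-key"})
print(mask_key(demo_key))
```

In the notebook you would then write API_KEY = read_api_key() and pass API_KEY to the story function exactly as before.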

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — -

In the same way, call the Arabic function to convert the image caption into an Arabic short story, which is especially useful for Arabic developers:

short_story=dblip.arabic_story(prompt,API_KEY)
short_story=short_story.replace('.','\n')
print('short story \n',short_story)
Image(image_path, width = 600, height = 300)

We will see the resulting short story:

“ تظهر اللوحة مبنى في وسط غابة كانت لوحة جميلة ، وجعلت المشاهد يشعر كما لو كانوا في وسط الغابة بأنفسهم كان المبنى في اللوحة محاطًا بالأشجار ، وبدا كما لو كان في وسط غابة هادئة وهادئة جعلت اللوحة المشاهد يشعر كما لو كانوا في عالم مختلف ، وكانت لوحة مهدئة ومريحة للغاية”
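Both stories above pass through replace('.', '\n'), which puts each sentence on its own line but also throws the periods away. If you would rather keep them, a small helper like the one below is one option. It is a sketch with a made-up function name, and it splits on every period, so abbreviations or decimal numbers in a story would be split too.

```python
def one_sentence_per_line(story):
    """Put each sentence on its own line and keep the closing period."""
    sentences = [part.strip() for part in story.split(".")]
    return "\n".join(sentence + "." for sentence in sentences if sentence)


sample = "In the middle of a forest, there stands a tall, proud building. It is a beautiful sight."
print(one_sentence_per_line(sample))
```

To use it in the notebook, replace the short_story.replace('.', '\n') line with short_story = one_sentence_per_line(short_story).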

All the code and the original notebook are on Google Colab, and the package is on PyPI.

Thanks for reading. If you love this tutorial, give it some claps and share it with your friends on social media.

Connect with me on FB, GitHub, LinkedIn, my blog, PyPI, Google Play Store, and my YouTube channel.

Email: falahgs07@gmail.com

References

[1] https://arxiv.org/abs/2201.12086

[2] https://github.com/salesforce/BLIP

[3] https://openai.com/

[4] https://beta.openai.com

Falah Gatea

Developer and programmer in Python and deep learning. IoT microcontroller developer. iraqprogrammer.wordpress.com