Implementation of "StorIA: In defense of children happiness in hospitals" application.

joanlafuente/StorIA

StorIA

Sketch2Image

Project overview

StorIA is a tablet app that takes a child's drawing and a brief text prompt and offers a story that fits them. The child draws a sketch (and optionally enters a prompt), from which the app produces an improved, painted and colored version of the drawing along with a text narrating it. The amount of text can be adjusted to the age of the target audience. Because a story is built from multiple drawings, several children can create one together, each contributing a new page. One of the main objectives is for children to make stories in groups, so they can play with their friends or meet new ones.

Lastly, StorIA will offer the option to share a story on an internal social page. This network will hold a collection of stories created by kids from all around the world. This way, even a kid who doesn't want to draw can read the posted stories and connect with children abroad through them. The internal network will also include a ranking system in which kids rate the stories they have read, and StorIA will award the best stories of the month a prize, yet to be decided.

More details on the StorIA project can be found here.

Example page generation:

The left image shows an example of a possible sketch, and the right image shows the image generated by our application, conditioned on the text "A horse in a field".

Sketch Generated image

The text generator produces the following story:

  • Generated story: Once upon a time, 100 horses were in a field. They were all happy and healthy. One day, a farmer came and took 99 of the horses away. The last horse was left alone in the field. The horse was sad and lonely.

Repository Structure

The repository has the following structure:

  • /APP: This directory contains the code of the StorIA application.
  • /Code_VM: This directory contains the files that must be copied to a virtual machine in order to use StorIA's AI capabilities.
  • /Design APP: This directory contains an overview of the application's initial design.
  • /Examples: Contains a folder with the original sketch and the image generated from it.
  • /Sketches: Some example sketches that can be used with the ImageGenerator script.
  • DrawSketch.py: Source code for the Tkinter interface, which lets a user draw and save a sketch using a simple toolbox.
  • ImageGenerator.py: Main file that contains the logic of the application. It defines the code for creating the Gradio application and contains the functions that call the auxiliary models, namely Sketch2Image and Image2History.
  • TextGenerator.ipynb: Tests of BLIP-2 for describing the image.
  • environment.yml: The conda environment required to run the code in the different files.
  • Mistral-7B.py: File that takes an image, runs it through BLIP-2 to describe it, and then uses Mistral-7B to create the story.

Models used

StorIA uses the following pretrained models for its generative tasks:

  • Sketch2Image: StableDiffusionXL with ControlNet and a Variational Auto-Encoder
    • Combination of models used to transform the sketch into an image, taking into account the text provided to condition the generation.
  • Image2Text: BLIP-2
    • Model used to provide a description of the generated image.
  • Text2Text: Mistral
    • Model used to generate the text of a story page.

A more detailed explanation of the pipeline is in the report.

Installation and Usage

Before starting, clone the repository:

git clone https://github.com/joanlafuente/StorIA.git

Then create a conda environment with the following command:

conda env create -f environment.yml

To use StorIA, you must add a .env file to the /APP folder. This file provides the configuration for connecting over SSH to a Linux virtual machine, which must contain the files and folders of /Code_VM and which runs the generative AI models. Set the HOME_CLUSTER variable in /APP/main.py to the path on the virtual machine that contains the /Code_VM files. The conda environment must also be installed on the virtual machine.

If you do not have a machine available, you can still create the .env file with placeholder values; the application will then run, but without the generative AI capabilities.

The .env file must have the following structure:

HOSTNAME = '<Host IP>'
PORT = '<Host port>'
USERNAME_CLUSTER = '<Username>'
PASWORD = '<Password>'
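
To check a .env file before launching the app, a small parser like the one below can help. It is a standard-library sketch that assumes only the `KEY = 'value'` layout shown above; it is not how /APP/main.py actually loads the file. The key names, including the spelling `PASWORD`, are copied from that structure, while `parse_env` and `missing_keys` are hypothetical helper names.

```python
from typing import Dict, List

# Key names exactly as they appear in the .env structure above.
REQUIRED_KEYS = ("HOSTNAME", "PORT", "USERNAME_CLUSTER", "PASWORD")


def parse_env(text: str) -> Dict[str, str]:
    """Parse lines of the form KEY = 'value' into a dictionary."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and single or double quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    from pathlib import Path

    env_path = Path("APP") / ".env"  # assumed location, per the text above
    if env_path.exists():
        print("Missing keys:", missing_keys(parse_env(env_path.read_text())))
    else:
        print(f"No .env file found at {env_path}")
```

Keeping the parsing in a function that takes text rather than a path makes it easy to check a candidate configuration without touching the real file.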

The Text2Text model (Mistral) also requires an access token from Hugging Face. Set this token in the HG_TOKEN_MISTRAL variable in Code_VM/img2text.py so that the model can run inference.

Executing StorIA

Run /APP/main.py and the application will launch on the main page, where you can generate a new story or access previously created ones. If you have any problems, do not hesitate to contact us.

The application is developed so that it can be exported and used on iOS or Android devices; the export has to be done using Kivy.

Executing the Gradio interface

Run /ImageGenerator.py to launch the Gradio interface, where you can generate an image from a sketch as well as the start of a story from that drawing. The interface is hosted on your local machine, and its IP address is printed on the command line.

Contributors

Joan Lafuente Baeza, joan.lafuente@autonoma.cat

Maria Pilligua Costa, maria.pilligua@autonoma.cat

Nil Biescas Rue, nilbiescas3@gmail.com

Jordi Longaron Carbonell, jordilongaroncarbonell@gmail.com

Xavi Soto Picón, xaviminisoto@gmail.com
