aicomicfactory

AI Comic Factory: Generate Comic Books with AI

1 min


AI Comic Factory

Generate Comic Books with AI

Github logo

The AI Comic Factory is an innovative project that showcases the power of AI models. It uses Llama-2 70b, a language model that can create coherent and diverse texts, to write the captions for each comic panel. It also uses SDXL 1.0, a stable diffusion model that can generate realistic and high-quality images, to draw comic scenes. The project is open-source and anyone can run it locally with some modifications in the code.

Run the project locally

To begin with, We want to emphasize that all components of the project are available as open-source. (see herehereherehere).

The project cannot be replicated and run instantly since it’s not a single entity. It needs a range of components to operate, such as the frontend, backend, LLM, and SDXL.

If you want to duplicate the project, open the .env you will see it requires some variables.

Provider config:

  • LLM_ENGINE: can be one of: “INFERENCE_API”, “INFERENCE_ENDPOINT”, “OPENAI”
  • RENDERING_ENGINE: can be one of: “INFERENCE_API”, “INFERENCE_ENDPOINT”, “REPLICATE”, “VIDEOCHAIN” for now, unless you code your custom solution

Auth config:

  • AUTH_HF_API_TOKEN: If you decide to use an inference API model or a custom inference endpoint, you will need to use OpenAI for the LLM engine.
  • AUTH_OPENAI_TOKEN: Only when you choose to use OpenAI for the LLM engine.
  • AITH_VIDEOCHAIN_API_TOKEN: secret token to access the VideoChain API server
  • AUTH_REPLICATE_API_TOKEN: in case you want to use Replicate.com

Rendering config:

  • RENDERING_HF_INFERENCE_ENDPOINT_URL: necessary if you decide to use a custom inference endpoint
  • RENDERING_REPLICATE_API_MODEL_VERSION: url to the VideoChain API server
  • RENDERING_HF_INFERENCE_ENDPOINT_URL: optional, default to nothing
  • RENDERING_HF_INFERENCE_API_BASE_MODEL: optional, defaults to “stabilityai/stable-diffusion-xl-base-1.0”
  • RENDERING_HF_INFERENCE_API_REFINER_MODEL: optional, defaults to “stabilityai/stable-diffusion-xl-refiner-1.0”
  • RENDERING_REPLICATE_API_MODEL: optional, defaults to “stabilityai/sdxl”
  • RENDERING_REPLICATE_API_MODEL_VERSION: optional, in case you want to change the version

Language model config:

You can disregard some variables that are for community-sharing purposes. They are not necessary to use the AI Comic Factory on your own site or device (they are for connecting with the Hugging Face community, and only apply to official Hugging Face apps):

  • NEXT_PUBLIC_ENABLE_COMMUNITY_SHARING: you don’t need this
  • COMMUNITY_API_URL: you don’t need this
  • COMMUNITY_API_TOKEN: you don’t need this
  • COMMUNITY_API_ID: you don’t need this

For further details, consult the default .env configuration file. You can override any variable locally by creating a .env.local file (this file should not be committed as it may contain sensitive data).

To use local, cloud-based, or private models, you will have to write some code.

The LLM API

Currently, the AI Comic Factory uses Llama-2 70b through an Inference Endpoint.

You have four options:

1: Use an Inference API model

They have recently introduced a new feature that allows you to choose from the Hugging Face Hub models. It is recommended to use CodeLlama 34b, as it will give you better outcomes than the 7b model.

To activate it, create a .env.local configuration file:

LLM_ENGINE="INFERENCE_API"

HF_API_TOKEN="Your Hugging Face token"

# codellama/CodeLlama-7b-hf" is used by default, but you can change this
# note: You should use a model able to generate JSON responses,
# so it is storngly suggested to use at least the 34b model
HF_INFERENCE_API_MODEL="codellama/CodeLlama-7b-hf"

2: Use an Inference Endpoint URL

To use the AI Comic Factory with your own LLM hosted by the Hugging Face Inference Endpoint service, you need to set up a .env.local configuration file:

LLM_ENGINE="INFERENCE_ENDPOINT"

HF_API_TOKEN="Your Hugging Face token"

HF_INFERENCE_ENDPOINT_URL="path to your inference endpoint url"

If you want to use this type of LLM on your own machine, you can install TGI (Please refer to this post for the licensing details).

3: Use an OpenAI API Key

OpenAI API Key is a feature that is introduced recently. It allows you to access OpenAI API and its capabilities.

To activate it, create a .env.local configuration file:

LLM_ENGINE="OPENAI"

# default openai api base url is: https://api.openai.com/v1
LLM_OPENAI_API_BASE_URL="Your OpenAI API Base URL"

LLM_OPENAI_API_MODEL="gpt-3.5-turbo"

AUTH_OPENAI_API_KEY="Your OpenAI API Key"

4: Fork and modify the code to use a different LLM system

A possible alternative is to turn off the LLM entirely and use a different LLM protocol and/or provider (eg. Claude, Replicate), or a manually crafted story instead (by sending fake or fixed data).

The Rendering API

This API allows you to create panel images from text. I developed this API for some of my projects at Hugging Face.

  • The hysts/SD-XL Space by @hysts
  • And other APIs for making videos, adding audio etc.. but you won’t need them for the AI Comic Factory

1: Deploy VideoChain yourself

You will have to clone the source-code

2: Use Replicate

To use Replicate, you have to create a .env.local configuration file:

RENDERING_ENGINE="REPLICATE"

RENDERING_REPLICATE_API_MODEL="stabilityai/sdxl"

RENDERING_REPLICATE_API_MODEL_VERSION="da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf"

AUTH_REPLICATE_API_TOKEN="Your Replicate token"

3: Use another SDXL API

You can customize the code by forking the project and applying the Stable Diffusion technology that suits your needs (local, open-source, proprietary, your own HF Space etc).

You can also try something different, like Dall-E.

Example of use cases

Website built with AI Comic Factory Project

aicomicfactory

Join Guidady AI Mail List

Subscribe to our mailing list and get interesting stuff and updates to your email inbox.

Thank you for subscribing.

Something went wrong.


Like it? Share with your friends!

0
79 shares

0 Comments

Your email address will not be published. Required fields are marked *

Belmechri

I am an IT engineer, content creator, and proud father with a passion for innovation and excellence. In both my personal and professional life, I strive for excellence and am committed to finding innovative solutions to complex problems.
Choose A Format
Personality quiz
Series of questions that intends to reveal something about the personality
Trivia quiz
Series of questions with right and wrong answers that intends to check knowledge
Poll
Voting to make decisions or determine opinions
Story
Formatted Text with Embeds and Visuals
List
The Classic Internet Listicles
Countdown
The Classic Internet Countdowns
Open List
Submit your own item and vote up for the best submission
Ranked List
Upvote or downvote to decide the best list item
Meme
Upload your own images to make custom memes
Video
Youtube and Vimeo Embeds
Audio
Soundcloud or Mixcloud Embeds
Image
Photo or GIF
Gif
GIF format