GPT-Migrate: Convert codebase across frameworks or languages.

GPT-Migrate is a project that helps you convert your codebase from one framework or language to another. It uses a large language model (GPT-4 by default) to analyze your code and generate equivalent code in the target framework or language, for example from React to Vue, from Python to Ruby, or from C# to Java.

Usage

To use GPT-Migrate, you need to have Docker installed and running. You also need access to a GPT-4 model with a large context window, such as gpt-4-32k.
Get your OpenAI API key and install the Python requirements:

export OPENAI_API_KEY=<your key>

pip install -r requirements.txt

Run the main script with the target language you want to migrate to:

python main.py --targetlang nodejs

(Optional) If you want GPT-Migrate to validate the unit tests it generates against your original app before running them against the migrated app, make sure your original app is running and reachable, and pass its port with the --sourceport flag. To try this with the benchmark, open another terminal, go to the benchmarks/language-pair/source directory, and run python app.py. This exposes the app on port 5000, which you then pass as the --sourceport value.

This script runs the flask-nodejs benchmark unless you change the settings. To customize the language, source directory, and more, see the options guide below.

Options

To modify how GPT-Migrate operates, you can use these options when you run the main.py script:

--model: The Large Language Model to be used. Default is "gpt-4-32k".
--temperature: Temperature parameter for the AI model. Default value is 0.
--sourcedir: Source path containing the code to be migrated. Default is "../benchmarks/flask-nodejs/source".
--sourcelang: Source language or framework of the code to be migrated. No default value.
--sourceentry: Entry point filename relative to the source path. For example, this may be an app.py or main.py file for Python. Default is "app.py".
--targetdir: The directory in which the migrated code will be located. Default is "../benchmarks/flask-nodejs/target".
--targetlang: Target language or framework for migration. Default value is "nodejs".
--operating_system: OS for the Dockerfile. The usual options are 'linux' or 'windows'. Default option is 'linux'.
--testfiles: Comma-separated list of files containing the functions to be tested. For example, this might be an app.py or main.py file for a Python app where your REST endpoints are defined. Include the full relative path. Default is "app.py".
--sourceport: (Optional) Port on which the original application is running, used to check the unit tests file against it. No default value. If this flag is not provided, GPT-Migrate will not test the unit tests against your original app.
--targetport: Port for testing the unit tests file against the migrated app. Default port is 8080.
--guidelines: Stylistic or small functional guidelines that you would like followed during the migration. For example, "Use tabs, not spaces". Default is an empty string.
--step: Step to run. Options are 'setup', 'migrate', 'test', 'all'. Default is 'all'.

For instance, to migrate a Python codebase to Node.js, you can run:

python main.py --sourcedir /path/to/my-python-app --sourceentry app.py --targetdir /path/to/my-nodejs-app --targetlang nodejs

This will take the Python code in /path/to/my-python-app, migrate it to Node.js, and write the resulting code to /path/to/my-nodejs-app.
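
The options and defaults listed above could be declared roughly as follows. This is only a sketch using Python's standard argparse module; the actual main.py may use a different CLI library, and the build_parser name is hypothetical. The flag names and default values are taken from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    p = argparse.ArgumentParser(prog="main.py")
    p.add_argument("--model", default="gpt-4-32k")
    p.add_argument("--temperature", type=float, default=0.0)
    p.add_argument("--sourcedir", default="../benchmarks/flask-nodejs/source")
    p.add_argument("--sourcelang", default=None)  # no default value
    p.add_argument("--sourceentry", default="app.py")
    p.add_argument("--targetdir", default="../benchmarks/flask-nodejs/target")
    p.add_argument("--targetlang", default="nodejs")
    p.add_argument("--operating_system", default="linux")
    p.add_argument("--testfiles", default="app.py")  # comma-separated list
    p.add_argument("--sourceport", type=int, default=None)  # optional
    p.add_argument("--targetport", type=int, default=8080)
    p.add_argument("--guidelines", default="")
    p.add_argument("--step", choices=["setup", "migrate", "test", "all"],
                   default="all")
    return p


# Parse an explicit argument list instead of sys.argv so the sketch is
# reproducible when run as a script.
args = build_parser().parse_args(
    ["--sourcedir", "/path/to/my-python-app", "--testfiles", "app.py,api/routes.py"]
)
test_files = args.testfiles.split(",")
```

Any flag that is not passed keeps its documented default, so args.targetlang is "nodejs" and args.step is "all" in this example.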

How It Works

To migrate a repository from --sourcelang to --targetlang…

GPT-Migrate sets up a Docker environment for --targetlang, which it either receives as an argument or infers from the source code.
It scans your code recursively and finds third-party --sourcelang dependencies. Then, it chooses suitable --targetlang dependencies for them.
It rewrites your code in --targetlang by applying a recursive process that starts from the --sourceentry file you specify. You can start from this step with the --step migrate option.
It spins up the Docker environment with the new codebase, exposing it on --targetport and iteratively debugging as needed.
It develops unit tests using Python’s unittest framework and optionally tests these against your existing app if it’s running and exposed on --sourceport, iteratively debugging as needed. You can start from this step with the --step test option.
It tests the new code on --targetport against these unit tests.
It iteratively debugs the code for you with context from logs, error messages, relevant files, and directory structure. It does so by choosing one or more actions (move, create, or edit files) and then executing them. If it wants to execute any sort of shell script (for example, to move files around), it will first ask for clearance. Finally, if at any point it gets stuck or the user ends the debugging loop, it will output directions for the user to follow to move to the next step of the migration.
The new codebase is completed and exists in --targetdir.
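
The recursive migration step described above, which starts from the --sourceentry file, can be pictured as a depth-first walk over the files that the entry point imports. The sketch below is an illustration under stated assumptions, not GPT-Migrate's implementation. It only handles Python sources, it discovers dependencies with the standard ast module rather than with the language model, and the translate callback is a hypothetical stand-in for the call that rewrites one file in --targetlang.

```python
import ast
from pathlib import Path
from typing import Callable, List, Optional, Set


def local_imports(path: Path, root: Path) -> List[Path]:
    """Return files under `root` that `path` imports (Python sources only)."""
    tree = ast.parse(path.read_text())
    names: List[str] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.append(node.module)
    # Map dotted module names to file paths; third-party modules have no
    # matching file under root and are dropped here.
    candidates = [root / (name.replace(".", "/") + ".py") for name in names]
    return [c for c in candidates if c.is_file()]


def migrate_recursively(
    entry: Path,
    root: Path,
    translate: Callable[[Path], None],
    visited: Optional[Set[Path]] = None,
) -> None:
    """Translate the dependencies of `entry` first, then `entry` itself."""
    visited = set() if visited is None else visited
    if entry in visited:
        return
    visited.add(entry)  # mark before recursing so import cycles terminate
    for dep in local_imports(entry, root):
        migrate_recursively(dep, root, translate, visited)
    translate(entry)  # hypothetical: rewrite this one file in the target language
```

Translating dependencies before the file that uses them means each file can be rewritten with its already-migrated dependencies as context. Calling migrate_recursively(Path(sourcedir) / "app.py", Path(sourcedir), translate) would visit every locally imported file exactly once.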

📝 Prompt Design

Subprompts are organized in the following fashion:

HIERARCHY: This defines the notion of preferences. There are four levels of preference, and each level takes priority over the previous one.
p1: Preference Level 1. These are the most general prompts and consist of broad guidelines.
p2: Preference Level 2. These are more specific prompts and consist of guidelines for certain types of actions (e.g., best practices and philosophies for writing code).
p3: Preference Level 3. These are even more specific prompts and consist of directions for specific actions (e.g., creating a certain file, debugging, and writing tests).
p4: Preference Level 4. These are the most specific prompts and consist of formatting for output.

Prompts are a combination of subprompts. This concept of tagging and composability can be extended to other properties as well to make prompts even more robust. This is an area we’re highly interested in actively exploring.

In this repo, the prompt_constructor() function takes one or more subprompts and returns a string that can be formatted with variables. In the following example, GUIDELINES is a p1 subprompt, WRITE_CODE is a p2 subprompt, and so on:

prompt = prompt_constructor(HIERARCHY, GUIDELINES, WRITE_CODE, DEBUG_TESTFILE, SINGLEFILE).format(targetlang=targetlang, buggyfile=buggyfile)
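
To make the composition concrete, here is a minimal sketch of how such a function might behave. It is not the repo's implementation. The subprompt texts below are invented placeholders, and joining the subprompts with blank lines is an assumption. Only the names HIERARCHY, GUIDELINES, WRITE_CODE, DEBUG_TESTFILE, SINGLEFILE, and prompt_constructor, and the targetlang and buggyfile variables, come from the example above.

```python
# Placeholder subprompt texts (hypothetical), ordered from most general
# (p1) to most specific (p4).
HIERARCHY = "PREFERENCE LEVELS: p1 to p4; a higher level takes priority over a lower one."
GUIDELINES = "PREFERENCE LEVEL 1: Follow idiomatic {targetlang} conventions."
WRITE_CODE = "PREFERENCE LEVEL 2: Write complete, runnable {targetlang} code."
DEBUG_TESTFILE = "PREFERENCE LEVEL 3: Find and fix the failure in {buggyfile}."
SINGLEFILE = "PREFERENCE LEVEL 4: Respond with the full contents of a single file."


def prompt_constructor(*subprompts: str) -> str:
    """Join subprompts into one template string (assumed behavior)."""
    return "\n\n".join(s.strip() for s in subprompts)


# The template placeholders are filled in afterwards with str.format.
prompt = prompt_constructor(
    HIERARCHY, GUIDELINES, WRITE_CODE, DEBUG_TESTFILE, SINGLEFILE
).format(targetlang="nodejs", buggyfile="app.js")
```

Because the variables are filled in only after the subprompts are joined, the same subprompts can be reused for any target language. One consequence of using str.format is that literal braces inside a subprompt would need to be escaped as {{ and }}.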

Performance

GPT-Migrate is an alpha-stage project that aims to translate code across different programming languages, and it still has many limitations. For example, it handles relatively simple languages such as Python or JavaScript with only moderate success, and it cannot get through more complex languages such as C++ or Rust without human intervention.