User:Gso321/LLVM codegen GCC/libgccjit only

From Gentoo Wiki

This page explains how to compile LLVM IR with the GCC backend, using the LLVM C API and libgccjit, all in C. An example use case is to optimize C code with Clang's -O2 optimizations and then with GCC's -O2 optimizations. For full source code, see this project.

Requirements

Make sure to have GCC with JIT support, LLVM, and Clang:

root #echo "sys-devel/gcc jit" >> /etc/portage/package.use/gcc
root #emerge "llvm-core/llvm"
root #emerge "llvm-core/clang"

Accept the GCC GPLv3+ license, which carries the following copyright notice:

Copyright (C) 2013-2025 Free Software Foundation, Inc.

Create a C file with the following headers:

FILE main.cC Code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <llvm-c/Core.h>
#include <llvm-c/BitReader.h>
#include <libgccjit.h>

Create a shell script that compiles and links everything properly (modify the paths if necessary):

FILE build.shBash Code
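#!/bin/sh

# C compiler used to build the converter itself
cc="gcc"
# Compiler and linker flags for the LLVM C API (bitreader is listed because main.c parses bitcode)
llvmflags=$(llvm-config --cflags --ldflags --libs core bitreader executionengine interpreter analysis native bitwriter --system-libs)
# Compile the input C file to optimized LLVM bitcode
clang $(llvm-config --cflags) -I/usr/lib/gcc/x86_64-pc-linux-gnu/14/include -emit-llvm -c -O2 input.c -o input.bc
# Build the converter against libgccjit and LLVM
$cc main.c -O2 -I/usr/lib/gcc/x86_64-pc-linux-gnu/14/include -lgccjit $llvmflags -Wall -Wextra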

Convert

Warning
This page is not finished. Do not compile this yet!

First, an LLVM bitcode file (the binary encoding of LLVM IR) is created using Clang:

user $clang -emit-llvm -c -O2 input.c -o input.bc

Then, the bitcode is parsed into an LLVMModuleRef:

FILE main.cC Code
LLVMModuleRef codegen_gcc_filename_to_LLVMModuleRef(const char * restrict path){
	LLVMMemoryBufferRef buffer = NULL;
	char *outmessage = NULL;
	LLVMModuleRef ret_module = NULL;
	// Read the whole bitcode file into an LLVM memory buffer (nonzero means failure)
	if(LLVMCreateMemoryBufferWithContentsOfFile(path, &buffer, &outmessage)){
		fprintf(stderr, "%s() reading %s failed with:\n%s\n", __func__, path, outmessage);
		abort();
	}
	// Parse the bitcode eagerly into a module in the global context (nonzero means failure)
	if(LLVMParseBitcode2(buffer, &ret_module)){
		fprintf(stderr, "%s() parsing %s failed\n", __func__, path);
		abort();
	}
	// The eager parser does not take ownership of the buffer, so release it here
	LLVMDisposeMemoryBuffer(buffer);
	return ret_module;
}

An LLVM module contains all the relevant code, such as every global variable and every user function.

Function declaration

Important
libgccjit's enum gcc_jit_fn_attribute does not cover every function attribute; aligned, for example, is missing. The LLVM C backend may be an alternative.

Every function in an LLVM module is represented as an LLVMValueRef and can be reached through these iterator functions:

FILE llvm-c/Core.hLLVM C Core.h
LLVMValueRef LLVMGetFirstFunction(LLVMModuleRef M);
LLVMValueRef LLVMGetLastFunction(LLVMModuleRef M);
LLVMValueRef LLVMGetNextFunction(LLVMValueRef Fn);
LLVMValueRef LLVMGetPreviousFunction(LLVMValueRef Fn);

To convert a module function to a gcc_jit_function, the converter needs its function type, return type, name, number of parameters, parameter types, and whether it is variadic:

FILE main.cReturn gcc_jit_function *
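// NOT FINISHED: codegen_gcc_LLVMTypeRef_to_gcc_jit_type() is not shown on this page
gcc_jit_function *codegen_gcc_declare_gcc_jit_function(gcc_jit_context *gctxt, enum gcc_jit_function_kind kind, LLVMValueRef Fn){
	gcc_jit_function *gfunc;
	// LLVMTypeOf(Fn) yields the pointer type of the function value, so ask for the value type instead
	LLVMTypeRef FnTypeRef = LLVMGlobalGetValueType(Fn);
	gcc_jit_type *FnRetType = codegen_gcc_LLVMTypeRef_to_gcc_jit_type(gctxt, LLVMGetReturnType(FnTypeRef));
	size_t FnNameLen;
	const char *FnName = LLVMGetValueName2(Fn, &FnNameLen);
	int num_params = (int)LLVMCountParamTypes(FnTypeRef);
	LLVMBool is_variadic = LLVMIsFunctionVarArg(FnTypeRef);
	// calloc(0, ...) may return NULL, so always request at least one slot
	gcc_jit_param **params = calloc(num_params ? num_params : 1, sizeof(gcc_jit_param *));
	if(!params){
		fprintf(stderr, "%s() calloc failed\n", __func__);
		abort();
	}
	for(int i = 0; i < num_params; i++){
		// LLVM parameters may be unnamed, but libgccjit requires a name (the argN scheme is an assumption)
		char ParamName[32];
		snprintf(ParamName, sizeof(ParamName), "arg%d", i);
		gcc_jit_type *ParamType = codegen_gcc_LLVMTypeRef_to_gcc_jit_type(gctxt, LLVMTypeOf(LLVMGetParam(Fn, (unsigned)i)));
		params[i] = gcc_jit_context_new_param(gctxt, NULL, ParamType, ParamName);
	}
	gfunc = gcc_jit_context_new_function(gctxt, NULL, kind, FnRetType, FnName, num_params, params, is_variadic);
	free(params);
	return gfunc;
}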

Function definition

Note
This does not apply to functions declared with GCC_JIT_FUNCTION_IMPORTED, which are expected to be defined elsewhere, such as C printf().

After a function is declared in libgccjit, it needs a definition that describes how it executes. Both GCC (gcc_jit_block) and LLVM (LLVMBasicBlockRef) split functions into basic blocks. Such a function has exactly one entry block and may have many blocks that return. Every GCC block must end by either returning or jumping to another block. Each basic block consists of zero or more non-terminator instructions followed by exactly one terminator instruction.

Compiling

Finally, add these calls to compile the LLVM IR with the GCC backend:

FILE main.cC Code
gcc_jit_context_add_command_line_option(ctxt, "-O2");
gcc_jit_context_compile_to_file(ctxt, GCC_JIT_OUTPUT_KIND_OBJECT_FILE, "gccoutput0.o");