Some useful tips for debugging the LLVM source code with VSCode on Linux (all commands below assume a bash shell). These notes are not a substitute for the excellent LLVM documentation.
cmake libedit-dev python-dev swig # optional clang

Compilation commands:
git clone https://github.com/llvm/llvm-project.git
mkdir build
cd build
# make sure you have cmake and build dependencies (go to llvm.org)
# LLVM will NOT build with a pre-C++14 compiler
# Optionally, you can build with RelWithDebInfo for a slightly smaller build (still with debug symbols)
cmake -DLLVM_ENABLE_PROJECTS="clang;lldb;clang-tools-extra" -DCMAKE_BUILD_TYPE=Debug ../llvm-project/llvm
# the build may take anywhere from a couple of minutes to 1-2 hours depending on your hardware
cmake --build .
Note that LLDB is only built if lldb is listed in LLVM_ENABLE_PROJECTS, for example -DLLVM_ENABLE_PROJECTS="clang;lldb".
Note that CMake picks up the default CC and CXX compilers on your system. I highly advise using clang/clang++ so you can debug with LLDB (set these before running cmake):
export CC=clang
export CXX=clang++
I highly suggest you use LLDB instead of GDB, and clangd for code navigation. You can get the LLDB/clangd extensions for free in VSCode, and configure clangd to use the executable you just built from source.
Clang itself is composed of two parts: the driver and the actual invocation of the frontend (FE). Thus, to get the actual call into the compiler for a simple program, you need to pass the -v flag:
clang++ -v foo.cpp
This will run the driver in verbose mode and will output a call to clang with the -cc1 flag and a bunch of other flags. This is the actual command you want to use as a debug target within VSCode. To quickly convert the space-separated string into a JSON-friendly encoding for VSCode, save it to command.txt and run the following:
awk -v RS='' -v OFS='","' 'NF { $1 = $1; print "\"" $0 "\"" }' command.txt

Then create a simple VSCode debug target with contents similar to:
{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "lldb",
      "request": "launch",
      "name": "Debug",
      "program": "/media/luis/TI10657400D/llvm/build/bin/clang++",
      "args": [
        "-cc1",
        "-triple", "x86_64-unknown-linux-gnu",
        "-emit-obj",
        "-mrelax-all",
        "-disable-free",
        "-main-file-name", "test.cpp",
        "-mrelocation-model", "static",
        "-mthread-model", "posix",
        "-mframe-pointer=all",
        "-fmath-errno",
        "-fno-rounding-math",
        "-masm-verbose",
        "-mconstructor-aliases",
        "-munwind-tables",
        "-fuse-init-array",
        "-target-cpu", "x86-64",
        "-dwarf-column-info",
        "-debugger-tuning=gdb",
        "-v",
        "-resource-dir", "/media/luis/TI10657400D/llvm/build/lib/clang/10.0.0",
        "-I", "/usr/lib/gcc/x86_64-linux-gnu/7/include/",
        "-internal-isystem", "/usr/lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0",
        "-internal-isystem", "/usr/lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/x86_64-linux-gnu/c++/7.4.0",
        "-internal-isystem", "/usr/lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/x86_64-linux-gnu/c++/7.4.0",
        "-internal-isystem", "/usr/lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/backward",
        "-internal-isystem", "/usr/local/include",
        "-internal-isystem", "/media/luis/TI10657400D/llvm/build/lib/clang/10.0.0/include",
        "-internal-externc-isystem", "/usr/include/x86_64-linux-gnu",
        "-internal-externc-isystem", "/include",
        "-internal-externc-isystem", "/usr/include",
        "-O0",
        "-fdeprecated-macro",
        "-fdebug-compilation-dir", "/media/luis/TI10657400D/llvm/build/bin",
        "-ferror-limit", "19",
        "-fmessage-length", "0",
        "-fopenmp",
        "-fgnuc-version=4.2.1",
        "-fobjc-runtime=gcc",
        "-fcxx-exceptions",
        "-fexceptions",
        "-fdiagnostics-show-option",
        "-fcolor-diagnostics",
        "-faddrsig",
        "-o", "/tmp/test-c457c8.o",
        "-x", "c++",
        "/home/luis/test.cpp"
      ],
      "cwd": "${workspaceFolder}"
    }
  ]
}
You should now be able to set breakpoints anywhere in the compiler. A good starting point to set a breakpoint is in CodeGenModule::CodeGenModule.
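As a rough alternative to the manual copy-and-convert steps above, the following bash sketch pulls the first -cc1 line out of the driver's verbose output and prints its arguments in the quoted, comma-separated form that the "args" array expects. The function names cc1_line and json_args are made up for illustration. The splitting is deliberately naive: it assumes no argument contains spaces, double quotes, or backslashes, and it drops the first word (the compiler path), since that belongs in "program".

```shell
#!/usr/bin/env bash

# Print the first line of driver output (read from stdin) that contains -cc1.
# Only the first match is kept, because the later "clang -cc1 version ..."
# banner also contains -cc1 and is not part of the command.
cc1_line() {
  grep -m1 -- ' -cc1 '
}

# Convert one command line (read from stdin) into "arg","arg",... form.
# Assumption: arguments contain no spaces, double quotes, or backslashes.
json_args() {
  awk '{
    out = ""
    for (i = 2; i <= NF; i++) {        # skip field 1, the compiler path
      arg = $i
      gsub(/^"|"$/, "", arg)           # strip quotes the driver may add
      out = out (i > 2 ? "," : "") "\"" arg "\""
    }
    print out
  }'
}

# Usage (the driver writes its verbose output to stderr):
#   clang++ -v foo.cpp 2>&1 | cc1_line | json_args
```

The output can be pasted between the square brackets of "args" in the debug target.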
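To get LLVM_DEBUG macros to run, pass the -debug CLI flag (when going through clang, -mllvm -debug).
make check-all runs the regression tests.
The driver assembles the -cc1 command line in ConstructJob in clang/lib/Driver/ToolChains/Clang.cpp.
Invoking clang -cc1 by hand will not correctly handle system includes. Use clang -Xclang instead.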
See https://releases.llvm.org/6.0.1/docs/CodeGenerator.html
MachineInstr is a target-agnostic representation of low-level instructions (opcode + operands).
If you have no idea where to break in the source, you can get an idea of which transforms are run as part of a clang invocation with the -Rpass and -Rpass-analysis flags. These are opt-in flags in which not every transform participates, but they give you a good idea of what's happening "under the hood". To view all participating transforms, issue: -Rpass=.* -Rpass-analysis=.*
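The remark output can get long, so a small tally helps when deciding which pass to break in. The sketch below counts remarks per pass. It assumes each remark line ends with a tag such as [-Rpass=inline] or [-Rpass-analysis=loop-vectorize], which is how clang labels these diagnostics; count_remarks is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Tally optimization remarks per pass.
# Reads clang diagnostics on stdin and prints "<count> <pass>" lines,
# most frequent pass first.
count_remarks() {
  grep -oE '\[-Rpass(-analysis|-missed)?=[^]]+\]' \
    | sed -E 's/^\[-Rpass(-analysis|-missed)?=//; s/\]$//' \
    | sort | uniq -c | sort -rn \
    | awk '{ print $1, $2 }'
}

# Usage (remarks go to stderr; the patterns are quoted so the shell
# never tries to expand them):
#   clang++ -O2 -Rpass='.*' -Rpass-analysis='.*' -c foo.cpp 2>&1 | count_remarks
```

The pass names it prints are the ones to search for in the LLVM source when picking a breakpoint.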
Other good places to break on are PassManager::run (new pass manager) and legacy::PassManager::run (legacy pass manager).
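If you prefer to set these breakpoints from a script rather than by clicking in the editor, one option is an LLDB command file. The sketch below only prints such a file; lldb_breakpoint_script is a hypothetical helper name, and the regex for the pass manager entry points is an assumption that will likely match more functions than you need, so expect to narrow it down.

```shell
#!/usr/bin/env bash

# Print an LLDB command script that sets breakpoints at the places
# suggested above. The --func-regex pattern is intentionally broad
# because the pass manager run functions are templated or namespaced.
lldb_breakpoint_script() {
  cat <<'EOF'
breakpoint set --name CodeGenModule::CodeGenModule
breakpoint set --func-regex PassManager.*::run
breakpoint list
EOF
}

# Usage (assumes lldb is on your PATH; <cc1 args> is the command
# printed by clang++ -v):
#   lldb_breakpoint_script > breakpoints.lldb
#   lldb --source breakpoints.lldb -- build/bin/clang++ <cc1 args>
```

Inside VSCode, the same breakpoint commands should also work when typed into the debug console of the LLDB extension.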
Once you're stopped at a breakpoint in VSCode, there are a couple of useful commands you can call from LLDB to inspect the state of the IR, including:
OBJECT.dump() - dump the contents of the current object to stderr (can be llvm::Module, llvm::CallGraph, llvm::BasicBlock, llvm::Instruction, llvm::Function)
OBJECT.viewCFG() - dump the contents of the current llvm::Function object (with basic block statements) to a dot graph. Assumes dot is on your path
OBJECT.viewCFGOnly() - similar to viewCFG() but will not print statements in a basic block. Useful for inspecting the CFG only and for more complex graphs

clang++ -cc1 -ast-view <filename> - dump the AST (as a DOT graph) for all functions in <filename>
clang++ -emit-llvm -S <filename> - emit LLVM IR
clang -c -Xclang -dump-tokens <filename> - emit the tokens parsed by the lexer
clang-check -ast-dump filename - view the color-coded AST produced by Clang
opt -debugify - attach synthetic debug info to the IR
llvm-dis - convert *.bc into *.ll

The list below is a non-exhaustive list of resources I found helpful while hacking on LLVM. I take no credit for them. They are listed in no particular order.
gem5.opt --debug-help | less
gem5.opt --debug-flags=XXX

sudo apt-get install libstdc++-10-dev-mipsel-cross
sudo apt-get install binutils-mipsel-linux-gnu
sudo apt-get install gcc-mipsel-linux-gnu

Generate code (GCC):
mipsel-linux-gnu-gcc -O0 -g test.c

Note that you need to use the mipsel toolchain, as GEM5 only supports little-endian programs.

Generate code (Clang):
clang -static --target=mipsel-linux-gnu test.c

Run code on GEM5:
./gem5.opt ../../configs/example/se.py -c ~/sandbox/a.out

Run dynamically linked code on GEM5:
./gem5.opt configs/example/se.py --cmd=/home/luis/research/gem5/a.out --redirects /lib=/usr/mipsel-linux-gnu/lib --interp-dir /usr/mipsel-linux-gnu

The additional flags are needed to set up the right library search paths for resolving dynamically linked symbols.