Python Embedded C Compiler
This project is in the planning stage. But because of the
richness of the Python language and the tools (modules,
libraries) available in Python, it appears that a set of tools
for embedded programming can be developed.
The goal of this project is to create a complete C compiler toolchain
in Python. Like most modern compilers this will have a
frontend and a backend. This compiler toolchain will be easily
retargetable for different embedded processors and microcontrollers.
The most important features of this project will be the
intermediate format (after the C code is parsed, TBD XML?) and the ISA
description (XML?). These two descriptions will make supporting
a new embedded processor simple. There already exist
tools in Python that make this project possible. In many ways
this project is more about the overall design and pulling the different
Python modules together than about reimplementing the different parts.
- Number one goal is KISS: keep this design and
implementation as clean as possible, sacrificing performance and
optimized output if needed.
- Easily retargetable. Retarget for a new ISA in less than a day,
with basic support for most common C code available quickly.
- Educational tool and platform for experimentation.
Encourage experimentation with new ISAs by making it easy to create
toolchains for them.
- Ease of use. The toolchain can be used to quickly
evaluate different processors, then move to a commercial,
highly optimized compiler if needed. It gives a free option to
start with, but also a platform for commercial developers to
develop optimization plug-ins, different target plug-ins, etc.
- 100% Python. All modules and code used are to be
Python (other than plug-ins that may be created to increase performance,
with every plug-in API clearly defined and usable from Python).
Provide a simple platform (one language) for
contribution, and Python will take care of the portability. As
far as performance goes, I think there are enough folks looking to
improve Python performance (ucpy) that the performance will come with
time. Python wrapped C++ libs ok (wxPython, etc) but want to
Current items to be completed
- Compiler designer to review the following and make any
suggestions on the 30000ft design.
- Decide on Intermediate format (AST, C--, ???).
- ISA XML description design. A complete description
that fits into the C programming paradigm. If the ISA XML
describes both the ISA and how it maps to C programming, everything
can be automated.
Proposed Python Coding Standard for the Project.
I would like to use generic Python as much as possible.
And regrettably (for some), any non-Python
projects that make sense to incorporate will be rewritten in
Python rather than wrapping the pre-existing modules. The reason
for the rewrite would be to keep the number one goal of a set of tools that
encourages participation and experimentation. The Python
Python Modules Used.
Front End Compiler
Python C Preprocessor
At one time there existed a Python preprocessor, PYM,
"A Macro Preprocessor". It used the C
preprocessor language but made it generic for any language (HTML, etc).
If an existing preprocessor implemented in Python is not
available, the plan is to create a standalone C preprocessor that can be
used on C files and/or assembly files.
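To give a feel for how small a standalone preprocessor could start out, here is a minimal sketch. It is only an illustration under stated assumptions: the names preprocess and expand are hypothetical, only object-like #define/#undef and #ifdef/#ifndef/#else/#endif are handled, and other directives (such as #include) are silently dropped. Because it works line by line on plain text, the same function could be run over assembly files.

```python
import re

_IDENT = re.compile(r"[A-Za-z_]\w*")


def expand(line, macros, depth=0):
    """Replace identifiers that name a macro, rescanning a bounded number of times.

    Simplification: text inside string literals is expanded too.
    """
    new = _IDENT.sub(lambda m: macros.get(m.group(0), m.group(0)), line)
    if new != line and depth < 16:
        return expand(new, macros, depth + 1)
    return new


def preprocess(source, defines=None):
    """Expand object-like macros and evaluate #ifdef/#ifndef regions."""
    macros = dict(defines or {})
    active = [True]  # stack: is the current region being emitted?
    output = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            parts = stripped[1:].split(None, 2)
            directive = parts[0] if parts else ""
            if directive in ("ifdef", "ifndef"):
                defined = parts[1] in macros
                wanted = defined if directive == "ifdef" else not defined
                active.append(active[-1] and wanted)
            elif directive == "else":
                current = active.pop()
                active.append(active[-1] and not current)
            elif directive == "endif":
                active.pop()
            elif active[-1] and directive == "define":
                macros[parts[1]] = parts[2] if len(parts) > 2 else ""
            elif active[-1] and directive == "undef":
                macros.pop(parts[1], None)
            continue  # directive lines never reach the output
        if active[-1]:
            output.append(expand(line, macros))
    return "\n".join(output)


SOURCE = """#define SIZE 8
#ifdef DEBUG
int dbg = 1;
#else
int dbg = 0;
#endif
char buf[SIZE];"""

if __name__ == "__main__":
    print(preprocess(SOURCE))
    print(preprocess(SOURCE, {"DEBUG": ""}))
```

Function-like macros, #if expressions and include handling would be the next pieces to add.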
Python C Parser
The goal is to use an (E)BNF description of C (C99).
A Python module will be used to parse the code: it
will read in the BNF description and create the pyparsing
code that parses the input code. The following is an example of a C
language BNF and the pyparsing code produced from the BNF description.
- C BNF Description
- Resulting pyparsing code
Intermediate Format (AST Described in XML)
The intermediate format is a verbose optimized
representation of the original C code. Everything up to here
is the frontend of the compiler: parsing the C code and
generating the intermediate representation. This
representation can then be run through the optimizer, which will generate
another AST representation. The project will also try to leverage any
current intermediate descriptions, while maintaining the goal of the
complete compiler chain being in Python!
Like many other portions of this project, this one will try to
leverage previous work. The only difference is that
this portion of the
design will be borrowed but rewritten in Python.
Again, the reason for writing the complete compiler in Python
is to make the tool easily portable and to have a common programming
language for a clean implementation.
The intermediate format has to be flexible enough to represent all the
C properties and map easily to assembly and hardware. The
hardware mapping is a secondary goal but is one worth pursuing.
Verilog and/or MyHDL can be produced from the intermediate
format. The intermediate format should also represent
objects, mainly the OOP provided by ObjC.
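As one possible shape for an AST described in XML, the sketch below serializes a tiny AST for the C statement x = a + 2; and reads it back. Everything here is an assumption made for illustration: the nested-tuple node layout, the element names (assign, binop, name, const) and the helper names ast_to_xml and xml_to_ast are hypothetical, since the actual intermediate format is still undecided.

```python
import xml.etree.ElementTree as ET


def ast_to_xml(node):
    """Convert a nested-tuple AST node into an XML element.

    Node shapes assumed for this sketch:
      ("assign", target_name, expr)
      ("binop", op, left, right)
      ("name", identifier)
      ("const", integer_value)
    """
    kind = node[0]
    if kind == "assign":
        elem = ET.Element("assign", target=node[1])
        elem.append(ast_to_xml(node[2]))
    elif kind == "binop":
        elem = ET.Element("binop", op=node[1])
        elem.append(ast_to_xml(node[2]))
        elem.append(ast_to_xml(node[3]))
    elif kind == "name":
        elem = ET.Element("name", id=node[1])
    elif kind == "const":
        elem = ET.Element("const", value=str(node[1]))
    else:
        raise ValueError("unknown node kind: %r" % (kind,))
    return elem


def xml_to_ast(elem):
    """Inverse of ast_to_xml, so a later pass can reload the format."""
    if elem.tag == "assign":
        return ("assign", elem.get("target"), xml_to_ast(elem[0]))
    if elem.tag == "binop":
        return ("binop", elem.get("op"), xml_to_ast(elem[0]), xml_to_ast(elem[1]))
    if elem.tag == "name":
        return ("name", elem.get("id"))
    if elem.tag == "const":
        return ("const", int(elem.get("value")))
    raise ValueError("unknown element: %r" % (elem.tag,))


# AST for the C statement:  x = a + 2;
TREE = ("assign", "x", ("binop", "+", ("name", "a"), ("const", 2)))
XML_TEXT = ET.tostring(ast_to_xml(TREE), encoding="unicode")

if __name__ == "__main__":
    print(XML_TEXT)
```

A round trip through text like this is what would let the frontend, optimizer and backend run as separate tools that only share the XML file.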
This is a Python C compiler written for a compiler course.
A different frontend Python parsing Module was used but
it created an abstract syntax tree (AST) intermediate format.
- Abstract Syntax Tree, compiler design using AST
1. Convert the program into an AST
2. Perform type-checking and semantic analysis on the tree
3. Rearrange the tree to perform optimizations
4. Convert the tree into the target code.
- Abstract Semantic Graph
- Tree Compiler Compiler - A discussion of how an aspect-oriented
programming approach using treecc can be used to build a compiler.
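The four AST steps listed above can be sketched end to end on a toy expression language (integers, names, + and *). This is a hedged illustration, not the project's design: the tuple node layout, the function names, the "is every name declared" check standing in for semantic analysis, and the PUSH/LOAD/ADD/MUL stack-machine target are all assumptions.

```python
import re

TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")


def tokenize(text):
    tokens = []
    for number, name, op in TOKEN.findall(text):
        if number:
            tokens.append(("const", int(number)))
        elif name:
            tokens.append(("name", name))
        elif op.strip():
            tokens.append(("op", op))
    return tokens


def parse(text):
    """Step 1: convert the program into an AST of nested tuples."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("end", None)

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():  # expr := term ('+' term)*
        node = term()
        while peek() == ("op", "+"):
            advance()
            node = ("binop", "+", node, term())
        return node

    def term():  # term := atom ('*' atom)*
        node = atom()
        while peek() == ("op", "*"):
            advance()
            node = ("binop", "*", node, atom())
        return node

    def atom():  # atom := const | name | '(' expr ')'
        kind, value = advance()
        if kind in ("const", "name"):
            return (kind, value)
        if (kind, value) == ("op", "("):
            node = expr()
            if advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError("unexpected token %r" % (value,))

    tree = expr()
    if peek()[0] != "end":
        raise SyntaxError("trailing input")
    return tree


def check(node, declared):
    """Step 2: semantic analysis; here only 'is every name declared?'."""
    if node[0] == "name" and node[1] not in declared:
        raise NameError("undeclared identifier %r" % (node[1],))
    if node[0] == "binop":
        check(node[2], declared)
        check(node[3], declared)


def fold(node):
    """Step 3: rearrange the tree; here, constant folding."""
    if node[0] != "binop":
        return node
    op, left, right = node[1], fold(node[2]), fold(node[3])
    if left[0] == "const" and right[0] == "const":
        value = left[1] + right[1] if op == "+" else left[1] * right[1]
        return ("const", value)
    return ("binop", op, left, right)


def emit(node):
    """Step 4: convert the tree into code for a hypothetical stack machine."""
    if node[0] == "const":
        return ["PUSH %d" % node[1]]
    if node[0] == "name":
        return ["LOAD %s" % node[1]]
    opcode = {"+": "ADD", "*": "MUL"}[node[1]]
    return emit(node[2]) + emit(node[3]) + [opcode]


def compile_expr(text, declared):
    tree = parse(text)
    check(tree, declared)
    return emit(fold(tree))


if __name__ == "__main__":
    print(compile_expr("a + 2 * (3 + 4)", {"a"}))
```

Each step only consumes and produces trees, which is what makes it possible to swap in a different optimizer or a different target without touching the parser.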
C-- is another format, has a virtual machine, etc. This may be a
good source for design ideas, but I am not sure if it is 100% applicable.
They want compiler frontends to produce C-- code??
This area is definitely not my expertise. I will gladly take
any suggestions on any of the frontend or backend work.
Python Optimizer, Pre and Post Intermediate format
A compiler wouldn't be a compiler without optimizing the code.
But as already stated, this is a secondary goal.
Want to provide the hooks and environment for code
Back End Compiler
This is where most of my work will exist. Hopefully some
interest will be developed for the frontend and a complete compiler
tool chain can be developed.
XML Instruction Set Architecture Description
An extensible, portable definition / description of a processor design.
This will lead to the automatic creation of instruction set
simulators, assemblers, etc.
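Since the ISA XML schema has not been designed yet, the following is only a guess at what a description and its loader might look like. The element and attribute names (isa, register, instruction, mnemonic, opcode, operands, c_op), the toy8 processor and the load_isa helper are all hypothetical; c_op is an assumed way to record which C operator an instruction maps to.

```python
import xml.etree.ElementTree as ET

# Hypothetical ISA description for an imaginary 8-bit processor.
ISA_XML = """<isa name="toy8" word_bits="8">
  <register name="r0" code="0"/>
  <register name="r1" code="1"/>
  <instruction mnemonic="NOP" opcode="0x00" operands=""/>
  <instruction mnemonic="ADD" opcode="0x10" operands="reg,reg" c_op="+"/>
  <instruction mnemonic="LDI" opcode="0x20" operands="reg,imm"/>
</isa>"""


def load_isa(xml_text):
    """Turn the XML description into plain dictionaries for other tools."""
    root = ET.fromstring(xml_text)
    registers = {
        reg.get("name"): int(reg.get("code"))
        for reg in root.findall("register")
    }
    instructions = {}
    for ins in root.findall("instruction"):
        operands = [o for o in ins.get("operands", "").split(",") if o]
        instructions[ins.get("mnemonic")] = {
            "opcode": int(ins.get("opcode"), 0),  # accepts 0x.. or decimal
            "operands": operands,
            "c_op": ins.get("c_op"),  # None when there is no direct C mapping
        }
    return {
        "name": root.get("name"),
        "word_bits": int(root.get("word_bits")),
        "registers": registers,
        "instructions": instructions,
    }


if __name__ == "__main__":
    isa = load_isa(ISA_XML)
    for mnemonic, info in isa["instructions"].items():
        print(mnemonic, hex(info["opcode"]), info["operands"])
```

An assembler, a simulator and a code generator could all be driven from the dictionaries this returns, which is the sense in which one description leads to several generated tools.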
Python XIF to ASM, Optimizer
Generate annotated assembly code.
Python Regular Expression Assembler. The assembler will
automatically be generated based on the ISA XML Description. (xisad)
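A regular-expression assembler generated from an ISA description might work roughly as follows. This is a sketch under assumptions: the REGISTERS and INSTRUCTIONS tables stand in for a parsed ISA XML description, the one-byte-per-field encoding is invented, and build_patterns and assemble_line are hypothetical names.

```python
import re

# Stand-in for a parsed ISA XML description; all names are hypothetical.
REGISTERS = {"r0": 0, "r1": 1, "r2": 2, "r3": 3}
INSTRUCTIONS = {
    "NOP": (0x00, []),
    "ADD": (0x10, ["reg", "reg"]),
    "LDI": (0x20, ["reg", "imm"]),
}


def build_patterns(instructions, registers):
    """Generate one compiled regex per instruction from the ISA tables."""
    names = sorted(registers, key=len, reverse=True)
    operand_patterns = {
        "reg": "(" + "|".join(re.escape(n) for n in names) + ")",
        "imm": r"(0[xX][0-9a-fA-F]+|\d+)",
    }
    patterns = []
    for mnemonic, (opcode, operands) in instructions.items():
        body = r"\s*,\s*".join(operand_patterns[o] for o in operands)
        sep = r"\s+" if operands else ""
        text = r"^\s*" + re.escape(mnemonic) + sep + body + r"\s*$"
        patterns.append((re.compile(text, re.IGNORECASE), opcode, operands))
    return patterns


def assemble_line(line, patterns, registers):
    """Assemble one source line into a list of byte values."""
    code = line.split(";", 1)[0]  # drop ';' comments
    if not code.strip():
        return []
    for regex, opcode, operands in patterns:
        match = regex.match(code)
        if match:
            out = [opcode]
            for kind, text in zip(operands, match.groups()):
                if kind == "reg":
                    out.append(registers[text.lower()])
                else:
                    base = 16 if text.lower().startswith("0x") else 10
                    out.append(int(text, base) & 0xFF)
            return out
    raise SyntaxError("cannot assemble: %r" % (line,))


PATTERNS = build_patterns(INSTRUCTIONS, REGISTERS)

if __name__ == "__main__":
    for src in ["  ldi r1, 0x2A ; load constant", "ADD r0,r1", "NOP"]:
        print(assemble_line(src, PATTERNS, REGISTERS))
```

Retargeting in this scheme means changing only the two tables; labels, directives and multi-word encodings are left out of the sketch.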
Python Blend and Pyastra
Along with writing C and assembly code, want other tools that assist in
writing highly optimized code. These two tools will assist
in writing assembly code. Also I imagine that these will be
the C frontend and then can be used to generate code.
This is a tool based on the Circuit Cellar Java Blend tool.
It provides very minimal C flow control statements blended
with assembly statements. This helps develop all the start-up
and control code easily.
This is a branch of the PyAstra project. It is intended to
take a small subset of Python code and create assembly code.
Python Linker and Python Binary Formatter
Pulls everything together.
Creates hex, srecord, elf, etc files.
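As an example of the binary formatter stage, the sketch below writes a memory image as Intel HEX records. The function names ihex_record and to_ihex are hypothetical, and the sketch is limited to 16-bit addresses (data and end-of-file records only, no extended address records).

```python
def ihex_record(address, record_type, data=b""):
    """Format one Intel HEX record: ':' count, address, type, data, checksum."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, record_type])
    body += bytes(data)
    checksum = (-sum(body)) & 0xFF  # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()


def to_ihex(image, base_address=0, row_size=16):
    """Split a bytes image into data records and append the EOF record."""
    lines = []
    for offset in range(0, len(image), row_size):
        chunk = image[offset:offset + row_size]
        lines.append(ihex_record(base_address + offset, 0x00, chunk))
    lines.append(ihex_record(0, 0x01))  # end-of-file record
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_ihex(bytes([0x02, 0x33, 0x7A]), base_address=0x0030), end="")
```

An S-record or ELF writer would sit behind the same kind of interface: bytes plus a load address in, a text or binary file out.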
Why Python is a Good Language to Develop a Compiler
Why Python is a good language for a toolchain.
Why Python is not a good language for a toolchain
Is it not?
Resources and Related Projects
Currently this project will focus on a C compiler toolchain
(preprocessor, C compiler, assembler, assembly ext and linker), but if
there is enough interest and support it will extend to a larger
toolchain: simulator, debugger, and profilers.
Commercial products that automatically (similar/same goals) generate
ISS (instruction set simulators), assemblers, linkers and debuggers.
These commercial tools are extremely expensive. Hopefully
this project can generate some cheap competition.