llvm-journeyLLVM Journey | |
| git clone git://0xff.ir/g/llvm-journey.git | |
| Log | Files | Refs | README | LICENSE |
My journey through "My First Language Frontend with LLVM Tutorial".
kaleidoscope_lexer.hpptests/kaleidoscope_lexer.test.cppThe lexer has been implemented as a STL-like algorithm. It accepts a pair of
ForwardIterators and a function object; for each token, the function object
will be called as follow: funcobj(kal::TkToken, FwdIt tkBegin, FwdIt tkEnd).
kal::lex function templateTokenize the input sequence.
template<typename FwdIt, typename TokenOp>
FwdIt
lex(FwdIt f, FwdIt l, TokenOp op);
The function template tokenize the range [f, l), and will call op with
the following arguments:
kal::TkToken)FwdIt)FwdIt)f Beginning of the source code sequencel End of the source code sequenceop Function object that will be called for each recognized tokenReturns the end of input range (l).
File: examples/ex_kaleidoscope_lexer.cpp
#include <iostream>
#include <string>
#include "kaleidoscope_lexer.hpp"
int
main()
{
std::string source{
R"(
extern sin(arg); # C standard library function
sin(3.14159265/2)
)"
};
auto sBegin = source.cbegin();
auto sEnd = source.cend();
kal::lex(
sBegin, sEnd, [&](kal::TkType type, const auto& tkBegin, const auto& tkEnd) {
auto b = std::distance(sBegin, tkBegin);
auto e = std::distance(sBegin, tkEnd);
// clang-format off
std::cout << "Token{"
<< "\n /*type*/ kal::TkType::"
<< kal::TKTYPE_STR[static_cast<int>(type)]
<< ",\n /*begin*/ " << b
<< ",\n /*end*/ " << e
<< ",\n /*str*/ \"" << source.substr(b, e - b) << '"';
// clang-format on
if (type == kal::TkType::Num)
std::cout << ",\n /*num*/ " << source.substr(b, e - b);
std::cout << ",\n}\n";
});
return 0;
}
Output of the program:
Token{
/*type*/ kal::TkType::Extern,
/*begin*/ 1,
/*end*/ 7,
/*str*/ "extern",
}
Token{
/*type*/ kal::TkType::Id,
/*begin*/ 8,
/*end*/ 11,
/*str*/ "sin",
}
Token{
/*type*/ kal::TkType::Etc,
/*begin*/ 11,
/*end*/ 12,
/*str*/ "(",
}
Token{
/*type*/ kal::TkType::Id,
/*begin*/ 12,
/*end*/ 15,
/*str*/ "arg",
}
Token{
/*type*/ kal::TkType::Etc,
/*begin*/ 15,
/*end*/ 16,
/*str*/ ")",
}
Token{
/*type*/ kal::TkType::Etc,
/*begin*/ 16,
/*end*/ 17,
/*str*/ ";",
}
Token{
/*type*/ kal::TkType::Comment,
/*begin*/ 18,
/*end*/ 47,
/*str*/ "# C standard library function",
}
Token{
/*type*/ kal::TkType::Id,
/*begin*/ 49,
/*end*/ 52,
/*str*/ "sin",
}
Token{
/*type*/ kal::TkType::Etc,
/*begin*/ 52,
/*end*/ 53,
/*str*/ "(",
}
Token{
/*type*/ kal::TkType::Num,
/*begin*/ 53,
/*end*/ 63,
/*str*/ "3.14159265",
/*num*/ 3.14159265,
}
Token{
/*type*/ kal::TkType::Etc,
/*begin*/ 63,
/*end*/ 64,
/*str*/ "/",
}
Token{
/*type*/ kal::TkType::Num,
/*begin*/ 64,
/*end*/ 65,
/*str*/ "2",
/*num*/ 2,
}
Token{
/*type*/ kal::TkType::Etc,
/*begin*/ 65,
/*end*/ 66,
/*str*/ ")",
}
Token{
/*type*/ kal::TkType::END,
/*begin*/ 67,
/*end*/ 67,
/*str*/ "",
}
kaleidoscope_ast.hpp, kaleidoscope_ast.cpptests/kaleidoscope_ast.test.cppASTNode classEach node of AST (Abstract Syntax Tree) can be saved in ASTNode class. It's a
concepts-based polymorphic type that accepts an instance of type T that
satisfies the following requirements:
T is a Regular typestd::string to_string(const T&);kal::NodeType node_type(cons T&);kal::NodeType is an enum class:
enum class NodeType
{
None = -1,
Number,
Variable,
BinaryOp,
Call,
Prototype,
Function,
};
For more information about implementation details of concepts-based polymorphic types and their benefits, you can watch this great talk by Sean Parent. If you're impatient you can go directly to time 49:13.
kal::cast function templatetemplate<typename T>
bool
cast(const ASTNode& n, T& val);
This function will copy the value type inside the polymorphic type n to val.
On successful copy, this will return true.
n Polymorphic type ASTNode that (possibly) contains the value of type Tval Destination value of type T (this is an output)Returns true on success (when dynamic_cast can casts to T).
Current implementation provides the following regular types for AST nodes:
kal::Number
double valuekal::Variable
std::string namekal::BinaryOp
char opASTNode lhsASTNode rhskal::Call
std::string calleestd::vector<ASTNode> argskal::Prototype
std::string namestd::vector<std::string> paramskal::Function
Prototype protoASTNode bodyAll of these types are implemented as struct (all members are public).
Because there's no invariant to establish.
See C++ Core Guidelines C.2.
#include "kaleidoscope_ast.hpp"
using namespace kal;
int
main()
{
Number n{ 3.14 };
Variable v{ "var1" };
BinaryOp b{ '+', n, v }; // 3.14 + var1
Variable x{ "x" };
Variable y{ "y" };
Call c{ "tan2", { x, y } }; // tan2(x, y)
Prototype p{ "add3", { "x", "y", "z" } }; // extern add3(x, y, z)
Variable z{ "z" };
Function f{ p, BinaryOp{ '+', BinaryOp{ '+', x, y }, z } };
return 0;
}
kaleidoscope_parser.hpp, kaleidoscope_parser.cpptests/kaleidoscope_parser.test.cppAlgorithms to make sure validity of input program.
kaleidoscope_codegen.hpp, kaleidoscope_codegen.cpptests/kaleidoscope_codegen.test.cppCode generator assumes input program is valid; feeding code generator with invalid program leads to undefined behavior (UB).
kal::Number, kal::Variable, and kal::BinaryOpkal::codegen function overload for const kal::Number&Generates code for kal::Number.
llvm::Value* kal::codegen(const kal::Number&);
kal::codegen function overload for const kal::Variable&Generates code for kal::Variable.
llvm::Value* kal::codegen(const kal::Variable&);
kal::codegen function overload for const kal::BinaryOp&Generates code for binary operation kal::BinaryOp.
llvm::Value* kal::codegen(const kal::BinaryOp&);
kal::codegen function overload for const kal::ASTNode&Generates code for the concrete type inside the node.
llvm::Value* kal::codegen(const kal::ASTNode& node);
The pre-condition is that node must not be empty (default constructed), or
contain Prototype or Function nodes.
kaleidoscope_opt.hpp, kaleidoscope_opt.cpptests/kaleidoscope_opt.test.cpp