Lex And Yacc – Programming Languages


Welcome to office hours, week 3. The first question we have from the forums is: are the tools Lex and YACC that we use throughout lecture and in the homeworks used outside of the classroom?

That’s a good question, Peter, and the answer is a resounding yes. A number of times, in the real world as well as in my research career, I have used tools like the lexer and parser generators we’re learning about in this class, Programming Languages. It turns out that this notion of generating lexical analyzers and parsers is effectively a standard. It’s so common and so popular that there is support for it in almost every language.

The original tool for this was called Lex, a lexical analyzer generator, and the word “generator” in the name is important. The idea was that you would just write out some regular expressions, and the tool would automatically make the finite state machine, the lexical analyzer, for you, saving you a lot of time and the grungy implementation work of converting regular expressions down to finite state machines. Lex was proprietary software at the time, and it only worked for C.

Similarly, there were many attempts to make compilers or interpreters, so it was really important to be able to write down a context-free grammar and spit out a parser, something that would recognize the language and make an abstract syntax tree or a parse tree. So there were a number of so-called compiler compilers, tools that would allow you to write your own compiler, and one of the most famous was known as Yet Another Compiler-Compiler, or YACC.

Both Lex and YACC were proprietary software, so free replacements were written: Flex, a fast lexical analyzer generator, and the GNU Project’s Bison, Bison being a pun on YACC. Both were then very widely used. Initially, they only supported the languages C and C++.
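To make the "regular expressions in, lexical analyzer out" idea concrete, here is a minimal sketch using only Python's standard `re` module. It is not Lex, Flex, or PLY: the rule table, the token names, and the `tokenize` helper are all hypothetical, chosen for illustration, and Python's regular expression engine stands in for the finite state machine a real generator would build.

```python
import re

# Hypothetical token rules: (name, regular expression, action).
# The action converts the matched text into the token's value;
# an action of None means "match it, but do not emit a token".
RULES = [
    ("NUMBER", r"\d+", int),
    ("IDENT", r"[A-Za-z_]\w*", str),
    ("PLUS", r"\+", str),
    ("TIMES", r"\*", str),
    ("SKIP", r"\s+", None),
]

# Combine every rule into one pattern with named groups, so a single
# match call tells us which rule fired.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern, _ in RULES))
ACTIONS = {name: action for name, _, action in RULES}


def tokenize(text):
    """Return a list of (token_name, value) pairs for text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at position {pos}")
        name = match.lastgroup          # name of the rule that matched
        action = ACTIONS[name]
        if action is not None:
            tokens.append((name, action(match.group())))
        pos = match.end()
    return tokens


print(tokenize("x + 42 * y"))
```

The only part a user of such a tool writes is the rule table; the matching loop is the boilerplate that a lexical analyzer generator produces for you.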
As time has gone by, though, you can find such tools for many other languages. For example, Ruby, a scripting language a bit like Python, has Ruby-Lex and Ruby-YACC. Python has things like PLY. Java has similar tools; CUP is one example. OCaml, a language near and dear to my heart that you may know through its Microsoft relative, F#, has ocamllex and ocamlyacc. They all rest on exactly the same ideas. For the lexer, you write out a regular expression and then some code to run when it matches. For the parser, you write out a grammar, and then for each rule you write out how to build up the abstract syntax tree. That’s exactly the same format that is used in the lexical analyzer and parser generator tools for all of the other languages, so the techniques that you’re learning in this class really carry over directly. The next time you want to make a little scripting language, you can use exactly the same sorts of things we’ve learned here. So yes, the real world and I use, more or less, exactly these tools, depending on the language we’re targeting.
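The pairing of a grammar rule with the code that builds a tree node can be sketched in plain Python. This is an assumption-laden illustration, not how YACC, Bison, or PLY work internally: those tools derive the parser from the grammar automatically, whereas here the parser is written by hand in recursive-descent style so that each method corresponds to one rule. The grammar, the tuple-based tree shape, and the names `Parser` and `parse` are hypothetical.

```python
import re


def tokenize(text):
    # Simplified tokenizer: numbers, identifiers, and the operators + * ( ).
    # Any other character is silently skipped in this sketch.
    return re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    # Rule: exp -> term ('+' term)*
    def exp(self):
        tree = self.term()
        while self.peek() == "+":
            self.advance()
            tree = ("plus", tree, self.term())      # action: build a "plus" node
        return tree

    # Rule: term -> factor ('*' factor)*
    def term(self):
        tree = self.factor()
        while self.peek() == "*":
            self.advance()
            tree = ("times", tree, self.factor())   # action: build a "times" node
        return tree

    # Rule: factor -> NUMBER | IDENT | '(' exp ')'
    def factor(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            return ("number", int(token))
        if token == "(":
            tree = self.exp()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return tree
        if token in ("+", "*", ")"):
            raise SyntaxError(f"unexpected token {token!r}")
        return ("identifier", token)


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.exp()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


print(parse("1 + 2 * x"))
# ('plus', ('number', 1), ('times', ('number', 2), ('identifier', 'x')))
```

With a parser generator, you would write only the grammar rules and the node-building actions; the control flow in the methods above is the part the tool generates.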
