Install Flex Bison In Windows

www.aquamentus.com
What are Flex and Bison?
lex vs. flex, yacc vs. bison

what we'll do!
Flex
reading from a file
Bison
a makefile
forcing carriage returns
line numbers
Tips
directly returning terminal characters
actions before the end of the grammar
whitespace in flex/bison files
avoiding -lfl and link problems with yywrap
renaming identifiers
moving output
stalling in Windows
Advanced
start states
reading gzipped input
Regular expression overview
Acknowledgements

Flex and Bison are aging unix utilities that help you write very fast parsers for almost arbitrary file formats. Formally, they implement LALR parsing (Look-Ahead, Left-to-right scan, Rightmost derivation, as opposed to 'recursive descent') of non-ambiguous context-free (as opposed to 'natural language') grammars.

Why should you learn Flex/Bison pattern syntax when you could just write your own parser? Well, several reasons. First, Flex and Bison will generate a parser that is virtually guaranteed to be faster than anything you could write manually in a reasonable amount of time. Second, updating and fixing Flex and Bison source files is a lot easier than updating and fixing custom parser code. Third, Flex and Bison have mechanisms for error handling and recovery, which is something you definitely don't want to try to bolt onto a custom parser. Finally, Flex and Bison have been around for a long time, so they are far freer from bugs than newer code.

This webpage is supposed to be a tutorial for complete novices needing to use Flex and Bison for some real project. I wrote it as I was learning them, so hopefully it explains things right as you need to know them. (The Flex and Bison tutorials I found Googling around tended to be really heavy on theory and arcane details.) However, to really learn these tools you should take a look at a few books:

  • Flex & Bison by John Levine. This is the definitive text for learning these tools.
  • Effective Flex & Bison, which covers advanced topics and good practice.

lex vs. flex, yacc vs. bison

In addition to hearing about 'flex and bison', you will also hear about 'lex and yacc'. 'lex and yacc' are the original tools; 'flex and bison' are their almost completely compatible newer versions. Only very old code uses lex and yacc; most of the world has moved on to Flex and Bison.

All four of the above are C-based tools; they're written in C, but more importantly their output is C code. However, my project was in C++ -- so this is also a tutorial on how to use C++ with Flex and Bison.

what we'll do!

Let's create a Flex/Bison parser for a silly format that I'll call 'snazzle'. Snazzle files have four general sections:

  • a header that just says 'sNaZZle' followed by a version number,
  • a section that declares type names, like 'type foo' and 'type bas',
  • a section with actual data consisting of exactly 4 numbers and one of the aforementioned type names,
  • and a footer that just says 'end'.
(I don't have a particular file format or program in mind, I'm just coming up with this.) Here's an example of our silly format:

For making parsers, you want to start at the high level because that's how Bison thinks about them. Happily, we already started at the high level by saying that a 'snazzlefile' is comprised of four general parts; we'll call them 'header', 'typedefs', 'body', and 'footer'. Then we break down those four general parts into the pieces that comprise them, and so on until we get all the way down to what are called terminal symbols. Terminal symbols are the smallest units of your grammar -- for our example here, integers like '15' are one of our terminal symbols. (But not the individual characters '1' and '5' that make it up.) Terminal symbols are the boundary between Flex and Bison: Flex sees the individual '1' and '5' characters but combines them into '15' before giving it to Bison.

For our silly snazzle files, we'll make 3 terminal symbol types: a NUMBER that's basically an integer, a FLOAT that's a floating-point number (which we only need for the version), and a STRING which is pretty much everything else. Furthermore, since this is a tutorial, let's show off Flex's power by making a program that just lexes our input without using Bison to parse it. Here's the first whack:

snazzle.l

Flex and Bison files have three sections:

  1. the first is sort of 'control' information,
  2. the second is the actual token (Flex) or grammar (Bison) definitions,
  3. the last is C code to be copied verbatim to the output.
These sections are divided by %%, which you see on lines 7 and 12. Let's go through this line by line.
  • Lines 1 (and 5) are delimiters that tell Flex that lines 2 through 4 are C code to be copied directly to the generated lexer.
  • Lines 2 and 3 are to get access to cout.
  • Line 4 is to declare the yylex function that we're going to call later. The yylex function is actually what Flex generates, so it will magically appear when we run our snazzle.l through Flex. Note that mixing C and C++ has several delicate points, one of which is 'name-mangling' - C++ compilation of a function named yylex will result in a symbol named something like _Z5yylexv. That's fine as long as everyone calling yylex is also compiled in C++ mode, so that their linkage is similarly mangled. However, if you want to compile any part of it in C mode, you will probably have to turn off name-mangling for it like so:
    • declare with extern "C" int yylex();
    • define it with #define YY_DECL extern "C" int yylex()
  • Line 6 is a handy Flex option that keeps it from looking for (or trying to use) the yywrap function. We're not going to use yywrap at all here, and I don't want to have to add '-lfl' to the compilation just to pull in the default empty definition of it.
  • Line 7 is %%, which means we're done with the control section and moving on to the token section. Notice that we don't have much in our control section -- the Bison control section gets a lot more usage than the Flex one.
  • Lines 8-11 are all the same (simple) format: a regular expression (a topic in its own right, which I touch on not at all exhaustively at the end of this page) and an action. When Flex is reading through an input file and can match one of the regular expressions, it executes the action. The regular expression is not Perl's idea of a regular expression, so you can't use '\d', but the normal stuff is all available. The action is just C code that is copied into the eventual flex output; accordingly, you can have a single statement or you can have curly braces with a whole bunch of statements. Some specifics on the file format I discovered: the pattern has to be left-justified (if there's whitespace beginning a line where a pattern is expected, the line is not treated as a rule!); the separation between the pattern and the action is just whitespace (even just a single space will do); the action is not limited to a single line if you use curly braces.
  • Line 12 is another %% delimiter, meaning we're done with the second section and we can go onto the third.
  • Lines 13-16 are the third section, which is exclusively for copied C code. (Notice that, unlike the control section at the top, there is no '%{' or '%}'.) We don't normally need to put anything in this section for the Flex file -- but for this example, if we put the main() function in here, then all of our code is in one file and then we don't have to link a separate main.o.
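As a small illustration of the name-mangling note for line 4, here is a rough sketch of how the extern "C" declaration looks from the C++ side. The yylex body is a hypothetical stand-in (Flex normally generates the real one), and run_lexer_once and demo are names made up for this example.

```cpp
#include <cassert>

// Declared with C linkage, so the compiler emits the plain symbol "yylex"
// instead of a mangled C++ name along the lines of _Z5yylexv.
extern "C" int yylex();

// Hypothetical stand-in so this sketch links by itself. In a real project
// this definition comes from the Flex-generated lex.yy.c.
extern "C" int yylex() {
    return 0;  // a Flex lexer returns 0 once it reaches end of input
}

// Made-up helper: C++ code calls yylex exactly as it would any function.
int run_lexer_once() {
    return yylex();
}

// Tiny self-check for the sketch.
void demo() {
    assert(run_lexer_once() == 0);
}
```

The declaration and the definition have to agree on linkage, which is why the two bullets above come as a pair.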

This example can be compiled by first running snazzle.l through flex. This will produce the file 'lex.yy.c', which we can then compile with g++.

If all went as planned, you should be able to run it and enter stuff on STDIN to be lexed:

Pretty cool! Notice that the exclamation mark at the very end was just echoed: when Flex finds something that doesn't match any of the regexes it echoes it to STDOUT. Usually this is a good indication that your token definitions aren't complete enough, but to get rid of it for now you could just add . ; to the token section -- that will match any one character (the '.') and do nothing with it (the empty C statement ';').

Now for a little more detail on the syntax for the middle section. In general, it is really as simple as 'a regex match' followed by 'what to do with it'. The 'what to do with it' can vary a lot though - for example, most parsers completely ignore whitespace, which you can do by making the 'action' just a semicolon.

Most parsers also want to keep track of the line number, which you would do by catching all the carriage returns and incrementing a line counter. However, you want the carriage return itself to be ignored as if it were whitespace, so even though you have an action performed here you want to not put a return statement in it.

Pretty much everything else will need to return a token to Bison, but more on that later.
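To make the 'regex match, then action' idea concrete without Flex itself, here is a toy classifier written with std::regex. This is only a sketch: the three patterns are assumptions about what NUMBER, FLOAT and STRING rules could look like, the Kind enum and function names are made up, and unlike Flex it splits on whitespace first instead of scanning character by character.

```cpp
#include <cassert>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

enum class Kind { Int, Float, String, Unknown };

// Try the patterns in order; the first one that matches the whole word wins.
// Int is checked before String because "15" would match both.
Kind classify(const std::string& word) {
    static const std::regex float_re("[0-9]+\\.[0-9]+");
    static const std::regex int_re("[0-9]+");
    static const std::regex string_re("[a-zA-Z0-9]+");
    if (std::regex_match(word, float_re)) return Kind::Float;
    if (std::regex_match(word, int_re)) return Kind::Int;
    if (std::regex_match(word, string_re)) return Kind::String;
    return Kind::Unknown;
}

// Whitespace is skipped with no action at all, much like a ';' action.
std::vector<Kind> lex_line(const std::string& line) {
    std::istringstream in(line);
    std::vector<Kind> kinds;
    std::string word;
    while (in >> word) kinds.push_back(classify(word));
    return kinds;
}

// Tiny self-check for the sketch.
void demo() {
    assert(classify("15") == Kind::Int);
    assert(classify("1.3") == Kind::Float);
    assert(classify("sNaZZle") == Kind::String);
}
```

A real Flex lexer would treat something like "hello!" as a STRING followed by a stray '!', whereas this toy simply reports the whole word as Unknown.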

reading from a file

OK, first big upgrade: reading from a file. Right now we're reading from STDIN, which is kind of annoying, because most of the time you'd really like to pick a file to read from. Flex reads its input from a global pointer to a C FILE variable called yyin, which is set to STDIN by default. All you have to do is set that pointer to your own file handle, and it'll read from there instead. That changes our example as follows (differences are in bold):
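A minimal sketch of that pattern, assuming nothing beyond the standard C I/O functions. The yyin and yylex below are stand-ins for what the Flex-generated lex.yy.c would normally provide, and chars_seen, lex_file and demo are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstdio>

// Stand-ins for what the Flex-generated lex.yy.c normally provides.
FILE* yyin = nullptr;   // falls back to stdin when left unset, see yylex
long chars_seen = 0;    // hypothetical counter, only for this sketch

// Hypothetical yylex: consumes everything from yyin, like a lexer whose
// rules never return a token, then returns 0 at end of input.
int yylex() {
    if (!yyin) yyin = stdin;
    chars_seen = 0;
    while (std::fgetc(yyin) != EOF) ++chars_seen;
    return 0;
}

// The pattern from the text: open a file, point yyin at it, then lex.
int lex_file(const char* path) {
    FILE* f = std::fopen(path, "r");
    if (!f) return -1;  // could not open the file
    yyin = f;
    int rc = yylex();
    std::fclose(f);
    yyin = nullptr;
    return rc;
}

// Tiny self-check for the sketch.
void demo() {
    assert(lex_file("no/such/dir/missing.snazzle") == -1);
}
```

Checking the fopen result before handing the pointer to the lexer is worth the extra two lines, since a null yyin would otherwise quietly fall back to STDIN.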

Here's how to run our new code, though there's nothing earth-shattering:

Bison

The first thing we have to do in our silly parser is to start defining the terminal tokens; for us, those are ints, floats, and strings. Even though it is Flex that figures out and returns tokens by type, defining those types is done in the Bison file. Also, let's move main() into the third section ('copied-C-code') of the Bison file:

Now that the Bison file has declared the terminal types (%tokens), we can implement them in the Flex file:

Since the Flex file now has to #include a file generated by Bison (snazzle.tab.h), we have to run Bison first. Also note that bison must be run with -d to create a .h file for us.

You'll notice that the input is echoed out backward! This is because for each of the rules you define, Bison does not perform the action until it matches the complete rule. In the above example, all of the recursive rules were right-recursive (i.e. they looked like 'foo: bar foo', instead of 'foo: foo bar'). The right-recursive search will print the output backwards since it has to match EVERYTHING before it can start figuring out what's what, AND it has another major drawback: if your input is big enough, right-recursive rules will overflow Bison's internal stack! So a better implementation of the Bison grammar would be left-recursion:

(Notice that we also had to change the $1 in the cout to a $2 since the thing we wanted to print out is now the second element in the rule.) This gives us the output we would hope for.
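Here is a toy model of why the recursion direction changes both the print order and the stack depth. It is not Bison's actual algorithm, just a hand-rolled stack that mimics when rule actions would fire; Trace, right_recursive and left_recursive are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Trace {
    std::vector<std::string> printed;  // order in which rule actions fire
    std::size_t max_stack = 0;         // deepest the toy parser stack got
};

// Right recursion (list: item list): nothing can be reduced until the last
// item is seen, so every item is stacked first and actions fire back to front.
Trace right_recursive(const std::vector<std::string>& items) {
    Trace t;
    std::vector<std::string> stack;
    for (const auto& item : items) {
        stack.push_back(item);
        if (stack.size() > t.max_stack) t.max_stack = stack.size();
    }
    while (!stack.empty()) {
        t.printed.push_back(stack.back());
        stack.pop_back();
    }
    return t;
}

// Left recursion (list: list item): each item is reduced as soon as it
// arrives, so actions fire in input order and the stack stays shallow.
Trace left_recursive(const std::vector<std::string>& items) {
    Trace t;
    std::vector<std::string> stack;
    for (const auto& item : items) {
        stack.push_back(item);
        if (stack.size() > t.max_stack) t.max_stack = stack.size();
        t.printed.push_back(stack.back());
        stack.pop_back();
    }
    return t;
}

// Tiny self-check for the sketch.
void demo() {
    std::vector<std::string> in{"a", "b", "c"};
    assert(right_recursive(in).max_stack == 3);
    assert(left_recursive(in).max_stack == 1);
}
```

In the right-recursive model the stack grows with the input, which is the overflow risk described above; in the left-recursive one it never holds more than a single pending item.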

Now that the groundwork is completed, we can finally define the real file format in the Bison file. First we make the easy tokens in Flex: the ones that are used to represent constant strings like SNAZZLE, TYPE, and END, respectively representing the strings 'sNaZZle', 'type', and 'end'. Then we flesh out the actual rules in the Bison grammar, and end up with this beautiful objet d'art:

This is compiled and run just like the others:

So, that concludes this little tutorial, or at least the baseline part of it. I'm now going to pick a few upgrades at random and show you how they might be done.

a makefile

When you finally get sick of manually rerunning Flex and Bison, and also forgetting in which order to do it, I heartily recommend setting up a makefile. Here's a sample one I made:

forcing carriage returns

Next tweak: you'll notice that this completely ignores whitespace, so that you could put the entire snazzle file on a single line and it'll still be okay. You may or may not want this behaviour, but let's assume that's bad, and require there to be carriage returns after the lines just like there is in the sample file 'in.snazzle'. To do this, we need to do two things: recognize the '\n' token (flex), and add it to the grammar (bison):

But when we go to run it, we get a parse error:

Why?? Well, it turns out that in.snazzle has an extra carriage return at the end:

And if we remove it, snazzle would be happy:

But that really isn't the best solution, as requiring all input files to have exactly one carriage return after data is a little unreasonable. We need to make it more general. Specifically, we should allow any number of carriage returns between lines, which we'll do by defining an ENDLS rule that matches one or more ENDL tokens, and swapping our grammar over to using it instead of using ENDL directly:

Note that this didn't require any changes to the flex file -- the underlying tokens didn't change, just how we used them. And happily, this works perfectly:

line numbers

Next little tweak: wouldn't it have been nice if that parse error had given us the line to look at, so that we didn't have to guess-and-check the grammar? Unfortunately Flex has no guaranteed way to get the line number (well, there's yylineno, but it's almost completely manual, and in some situations makes your parser very slow, so you might as well just do it yourself). The best way to keep track of the line number is to have a global variable that you update whenever you see a carriage return:

Which is pretty cool when you make a 'type oops' definition in the middle of the body instead of where it's supposed to be:

Of course, 'syntax error' isn't very helpful, but at least you've got the line number. :)
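The global-counter idea can be sketched in plain C++ like this. The scan function is a hypothetical stand-in for the lexer's newline rule, and linenum, describe_error and demo are assumed names rather than anything Flex or Bison defines.

```cpp
#include <cassert>
#include <string>

int linenum = 1;  // global line counter; the first line is line 1

// Hypothetical stand-in for the lexer's newline rule: every '\n' bumps the
// counter and is otherwise ignored, just like whitespace.
void scan(const std::string& input) {
    for (char c : input) {
        if (c == '\n') ++linenum;
    }
}

// Hypothetical error reporter, in the spirit of what an error callback
// could print once it has the counter available.
std::string describe_error(const std::string& message) {
    return "parse error on line " + std::to_string(linenum) + ": " + message;
}

// Tiny self-check for the sketch.
void demo() {
    linenum = 1;
    scan("sNaZZle 1.3\ntype foo\n");
    assert(linenum == 3);
}
```

Because the counter is global, both the lexer side (which updates it) and the parser side (which reports it) can reach it without any extra plumbing.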

directly returning terminal characters

If you have individual characters that are also terminal symbols, you can have Flex return them directly instead of having to create a %token for them. You could also list a bunch of them in a single line as a shortcut.

On the Bison side, you can also use them directly:

actions before the end of the grammar

The coolest thing I've discovered about Bison so far is that you can put action statements in the middle of the grammar instead of just at the end. This means you can get information as soon as it's read, which I find is very useful. For example, a C++ function call might look like this:

You can always add actions at the end of the line, like this:

But then you get the function name after 'body' is evaluated, which means the whole thing has been parsed before you find out what it was! However, you can embed the code block in the middle of the grammar and it will be run before 'body' is evaluated:

whitespace in flex/bison files

I've discovered that, unlike most unix tools, Flex and Bison are surprisingly lenient about whitespacing. This is really nice, because it's hard enough to understand their syntax without the ability to reindent things. The following are all equivalent:

avoiding -lfl and link problems with yywrap

If you know that you only want to parse one file at a time (usually a good assumption), you don't need to provide your own yywrap function that would just return 1. Inside your flex file, you can say %option noyywrap and the generated output will define yywrap as a local macro returning 1. This has the added bonus of removing the requirement to link with the flex libraries, since they're only needed to get the default yywrap defined as a function returning 1.

renaming identifiers

Flex and Bison generate code that uses the global namespace, which means that if you ever try to have more than one in a single program you're going to have identifier collisions. To get around that, you can tell both Flex and Bison to prefix their identifiers with a custom string you specify.

In Flex, you can use either the command line option -P foo or the option syntax %option prefix="foo" in the first section of the file.

In Bison, you can use either the command line option -p foo or the option syntax %name-prefix "foo".

moving output

Similar to why you rename identifiers, you usually don't want to use the default output file names because multiple parsers will all step on each other.

In Flex, you can specify either -o lex.foo.cc on the command line (it has to be before your input file!) or %option outfile="lex.foo.cc" in the first section of the .l file.

In Bison, you can specify either -o 'foo.tab.cc' on the command line or %output "foo.tab.cc" in the first section of the .y file. However, Bison usually names its output after its input anyway; an input file named 'foo.y' will already generate 'foo.tab.c'.

stalling in Windows

I have heard from a loyal reader (oh hai there Alena) that running these examples from Windows presents a logistical problem -- Visual C++ runs it in its own terminal, and when the program is done it immediately closes the terminal, so you can't see any output.

The suggestion is to add a call to the Windows-specific pause program, which stalls until you press a key. Thus, add a call such as system("pause") to your main function after yylex/yyparse and before return.
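As a hedged sketch of that suggestion: the helper below wraps the pause call so it only runs on Windows. The _WIN32 guard and the pause_before_exit and demo names are assumptions of this example, not part of the original advice.

```cpp
#include <cassert>
#include <cstdlib>

// Returns true if the pause was attempted, false where it was skipped.
bool pause_before_exit() {
#ifdef _WIN32
    std::system("pause");  // runs the Windows "pause" command
    return true;
#else
    return false;          // no-op elsewhere; the guard is an assumption here
#endif
}

// Tiny self-check for the sketch (skipped on Windows so it does not block).
void demo() {
#ifndef _WIN32
    assert(!pause_before_exit());
#endif
}
```

You would call pause_before_exit() as the last thing in main, right before the return statement.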

start states

In a Flex/Bison grammar, it is almost impossible to allow multiline commenting such as C's /* ... */. What you really want is for the parser to go into a sort of 'ignore everything' state when it sees '/*', and go back to normal when it sees '*/'. Clearly you wouldn't want Bison to do this, or else you'd have to put optional 'comment' targets all over every construct in your syntax; you can see how it fell to Flex to implement some way to do this.

Flex allows you to specify start states, which are just regex pattern rules like any other but they're only matched if you're in a particular state. The syntax for a pattern that should only be matched if we're in state 'FOO' looks like this:

Note that you come up with the state names -- they're not predefined or anything. Though when you create a state, you do have to declare it in your Flex file's control section (that's the first section, before the first '%%'):

So how do you get into the state? There's a special Flex construct, BEGIN (I think it's ultimately a C macro), that goes into any regular code block to get there:

And how about getting back out of that state and going back to where you were initially? Instead of something obvious (say, 'END()'?), they decided to make a default state called 'INITIAL'. Any Flex match pattern that doesn't have a state in front of it is assumed to be in the INITIAL state. To get back to it, you just jump to it:

Note that the BEGIN thing is for all intents and purposes normal C code, despite the fact that it's not. :) What that means is that you can treat it like code and put it anywhere you want -- you can even make it conditional:

Back to the original problem -- multiline commenting. Here's a way to do C-style block commenting using start states:

Note that you can also have a Flex match pertain to multiple states by listing them all:

This occasionally comes in handy, such as keeping you from having to duplicate the line-counting pattern/code above.
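To show the state idea outside of Flex, here is a hand-written C++ state machine that strips C-style block comments. It only models the behaviour described above: the State enum, Result struct and strip_block_comments function are made up for this sketch, and real Flex start states are declared and entered with Flex's own syntax rather than anything shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class State { INITIAL, COMMENT };

struct Result {
    std::string text;  // input with block comments removed
    int lines;         // line count, tracked in both states
};

Result strip_block_comments(const std::string& in) {
    State state = State::INITIAL;
    Result out{"", 1};
    for (std::size_t i = 0; i < in.size(); ++i) {
        // Shared by both states, like one rule listed for several states.
        if (in[i] == '\n') ++out.lines;
        const bool has_next = i + 1 < in.size();
        if (state == State::INITIAL) {
            if (in[i] == '/' && has_next && in[i + 1] == '*') {
                state = State::COMMENT;  // saw the comment opener: start ignoring
                ++i;                     // skip the '*'
            } else {
                out.text += in[i];
            }
        } else {
            if (in[i] == '*' && has_next && in[i + 1] == '/') {
                state = State::INITIAL;  // saw the comment closer: back to normal
                ++i;                     // skip the '/'
            }
        }
    }
    return out;
}

// Tiny self-check for the sketch.
void demo() {
    Result r = strip_block_comments("a /* b\nc */ d");
    assert(r.text == "a  d");
    assert(r.lines == 2);
}
```

Note how the newline check sits outside the state branches, which is the same duplication-avoiding trick as listing several states on one pattern.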

reading gzipped input

This is, surprisingly, rather easy. The basic idea is to filter Flex's input through libz. libz is very nice because it will pass through, unchanged, any input which isn't zipped -- so we don't need to check if the input is zipped and handle it differently!

For reference, the API to libz is available on the zlib website (last I looked). I have been consistently surprised with how easy it is to use libz directly to avoid silly hacks like 'popen'ing 'gunzip -c' and redirecting its output. Kudos to the libz team!

In the first section of your Flex file, you'll want to declare that you have a better input function, and then tell Flex it has to use yours instead:

Somewhere or another (I put it in my top-level files because it doesn't have to go in with the Flex/Bison source) you need to define your input function. Mine looks like this:

Then, instead of using fopen, you use libz:

Regular expression overview

You've almost used regexes before without knowing it, for example when you use wildcards on file names, as in 'ls *asdf*'.

The shell's idea of regular expressions doesn't quite line up with the 'real' definition of regular expressions, but at least the idea is the same. Here, we're telling ls to list all the files that have 'asdf' somewhere in the name. We could ask for just the files starting with asdf by saying 'asdf*', or all the files ending with asdf with '*asdf'. The asterisk basically means 'anything can go here'.

Regular expressions are really just the scientific deconstruction of such pattern matching. There's actually an entire O'Reilly book dedicated to them, if you really want to see what they're all about. (I'd love to make a wisecrack about being great for insomnia, but right now it's ranked #5,076 on Amazon's sales rank. Guess it can't be too unbearable!)

With that said, my little overview here is clearly not going to be exhaustive, but should give you the general idea. Flex's regular expressions consist of 'characters' and 'meta-characters'. 'Characters' are interpreted literally as real text, whereas 'meta-characters' change how the search works. For example, listing 'foo' as a regular expression will match exactly one string, which is 'foo'. But if you add the metacharacter '+' (which means 'one or more of the previous character') after the 'f' to get 'f+oo', it will match 'foo', 'ffoo', and 'ffffffffffffffffoo'. A table of meta-characters follows:

metacharacter   description
+               previous expression can match one or more times
*               previous expression can match zero or more times
?               previous expression can match zero or one time
.               can match any character except the carriage return '\n'

I say 'expression' above instead of just 'character' because you can also make groups of things by enclosing whatever you want in parentheses.

Brackets are very cool -- by saying '[abcde]', it will match any one of the characters in the brackets, so you'll match 'a' and 'b' but not 'ac'. If you add a plus after the closing bracket, though, then you will match 'ac' as well. Brackets also allow negation: '[^abc]+' will match anything that's not in the brackets, so 'd' and 'foo' would pass but not 'b'.

Most useful of all, brackets allow for ranges: '[a-e]' is the same as '[abcde]'. So you will frequently see stuff like '[0-9]+' to match integer values, or '[a-zA-Z]+' to match words, etc.

So how do you match an actual bracket, or an actual period, if the regular expression searcher is immediately going to think you're defining a meta-character? By adding a backslash (\) in front of it! So 'a\.b' will match the string 'a.b' but not 'acb', whereas 'a.b' as a regular expression will match both. And to be complete, backslash itself can be backslashed, so that 'a\\.' will match the string 'a\b'. (What fun!)

This stuff just takes some playing with to get used to. It's not really that bad. :)
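If you want something to play with, the examples above can be tried with std::regex. Treat this as a sketch only: std::regex uses ECMAScript syntax by default, which is not Flex's dialect, but it agrees with Flex on the basic meta-characters covered here. The whole_match helper and demo function are made-up names.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True when the pattern matches the entire text, not just a piece of it.
bool whole_match(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}

// Self-check covering the examples from the overview above.
void demo() {
    assert(whole_match("f+oo", "foo"));
    assert(whole_match("f+oo", "ffoo"));
    assert(!whole_match("f+oo", "oo"));          // '+' needs at least one 'f'

    assert(whole_match("[abcde]", "a"));
    assert(!whole_match("[abcde]", "ac"));       // one character only
    assert(whole_match("[abcde]+", "ac"));       // the plus allows more

    assert(whole_match("[^abc]+", "foo"));
    assert(!whole_match("[^abc]+", "b"));        // negated set rejects 'b'
    assert(whole_match("[0-9]+", "15"));

    // In C++ source each regex backslash is written twice.
    assert(whole_match("a\\.b", "a.b"));         // regex a\.b
    assert(!whole_match("a\\.b", "acb"));
    assert(whole_match("a.b", "acb"));           // unescaped '.' matches anything
    assert(whole_match("a\\\\.", "a\\b"));       // regex a\\. matches a, '\', any char
}
```

The doubled backslashes are purely a C++ string-literal thing; in a Flex file you would write the pattern with single backslashes as in the text.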

Acknowledgements

Many people over the years have contacted me with valuable suggestions for this tutorial. In no particular order:

  • Pieter Nobel
  • Robert Selberg
  • Brian Zarnce
  • Philip Mulcahy
  • Federico Tomassetti
  • Brandon Ortiz
  • Bill Evans
  • Alena Kirillova
  • Brandon Finley
  • Ofer Biran
Chris verBurg
2018-12-16
