Abstract: This column shows some simple but powerful techniques to make TCL data persistent in a text file and to parse it again from the file at a later point. We make use of TCL's flexible syntax to make the data files easily readable and even editable. This column is also available as a paper including runnable examples. Many thanks to everybody on comp.lang.tcl who scrutinized the paper and suggested improvements, in particular to Andreas Kupries and Morten Jensen.

Introduction

A typical TCL script stores its internal data in lists and arrays (the two primary data structures in TCL). Suppose you want to write a TCL application that can save its data on disk and read it back again. For example, this allows your users to save a project and load it back later. You need a way to write the data from the place where it is stored internally (lists and arrays) to a file. You also need a way to read the data back into a running script.

You can choose to store the data in a binary form or in a text file. This paper is limited to textual data file formats. We will look at a number of possible formats and how to parse them in TCL. In particular, we will show some simple techniques that make text file parsing a lot easier.

This paper assumes that you are familiar with TCL, and that you have written at least a few simple scripts in TCL.

A simple example

Suppose you have a simple drawing tool that places text and rectangle items on a canvas. To save the resulting pictures, you want a textual file format that must be easy to read. The first and simplest file format that comes to mind looks something like this:

example1/datafile.dat

    rectangle 10 10 150 50 2 blue
    rectangle 7 7 153 53 2 blue
    text 80 30 "Simple Drawing Tool" c red

The first two lines of this file represent the data for two blue, horizontally stretched rectangles with a line thickness of 2. The final line places a piece of red text, anchored at the center (hence the "c"), in the middle of the two rectangles.
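The column concentrates on reading this format back, but the writing side is short enough to sketch. The following is only an illustration under assumptions that are not part of the original column: the drawing is assumed to be held in a TCL list with one sublist per item, and the names `items` and `save_items` are hypothetical. It also assumes TCL 8.5 or later for `lassign`.

```tcl
# Hypothetical in-memory representation: one sublist per drawing item.
set items [list \
    [list rectangle 10 10 150 50 2 blue] \
    [list rectangle 7 7 153 53 2 blue] \
    [list text 80 30 "Simple Drawing Tool" c red]]

# Write every item as one line of the data file (hypothetical helper).
proc save_items {filename items} {
    set fid [open $filename w]
    foreach item $items {
        switch -- [lindex $item 0] {
            rectangle {
                lassign $item keyword x1 y1 x2 y2 thickness color
                puts $fid "rectangle $x1 $y1 $x2 $y2 $thickness $color"
            }
            text {
                lassign $item keyword x y txt anchor color
                # The text is written between double quotes, as in the
                # format shown above. Text that itself contains a double
                # quote is not handled by this sketch.
                puts $fid "text $x $y \"$txt\" $anchor $color"
            }
            default {
                close $fid
                error "unknown item type: [lindex $item 0]"
            }
        }
    }
    close $fid
}

save_items "datafile.dat" $items
```

Formatting each line explicitly, instead of printing the sublist directly, keeps the quoting of the text item under the script's control.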
Saving your data in a text file makes it easier to debug the application, because you can inspect the output to see if everything is correct. It also allows users to manually tinker with the saved data (which may be good or bad depending on your purposes).

When reading a data file in this format, you somehow need to parse the file and create data structures from it. To parse the file, you may be tempted to step through the file line by line, and use something like

example1/parser.tcl

    canvas .c
    pack .c

    set fid [open "datafile.dat" r]
    # Read the file line by line; gets returns -1 at end of file.
    while { [gets $fid line] >= 0 } {
        # Analyse the line that was just read.
        if { [regexp \
            {^rectangle +([0-9]+) +([0-9]+) +([0-9]+) +([0-9]+) +([0-9]+) +(.*)$} \
            $line dummy x1 y1 x2 y2 thickness color] } {
            .c create rectangle $x1 $y1 $x2 $y2 -width $thickness -outline $color
        } elseif { [regexp \
            {^text +([0-9]+) +([0-9]+) +"([^"]*)" +([^ ]+) +(.*)$} \
            $line dummy x y txt anchor color] } {
            # The quotes sit outside the group, so txt does not include them.
            .c create text $x $y -text $txt -anchor $anchor -fill $color
        } elseif { [regexp {^ *$} $line] } {
            # Ignore blank lines
        } else {
            puts "error: unknown keyword."
        }
    }
    close $fid

We read one line at a time, and use regular expressions to find out what kind of data the line represents. By looking at the first word, we can distinguish between data for rectangles and data for text. The first word, therefore, serves as a keyword: it tells us exactly what kind of data we are dealing with. We also parse the coordinates, color and other attributes of each item. Grouping parts of the regular expression between parentheses allows us to retrieve the parsed results in the variables 'x1', 'x2', etc.

This looks like a simple enough implementation, assuming that you understand how regular expressions work. But I find it pretty hard to maintain, and the regular expressions make the code hard to understand.

There is a more elegant solution, known as an 'active file'. It is captured in a design pattern, originally written by Nat Pryce.
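To see the capture-group mechanism of the line-by-line parser in isolation, here is a small sketch that needs no canvas. It reuses the rectangle expression from the parser; the variable names `pattern`, `whole`, `matched` and `matched_decimal` are chosen for this illustration only.

```tcl
# One line in the simple data file format.
set line {rectangle 10 10 150 50 2 blue}

# The expression used by the parser for rectangle lines.
set pattern {^rectangle +([0-9]+) +([0-9]+) +([0-9]+) +([0-9]+) +([0-9]+) +(.*)$}

# regexp returns 1 on a match. The first variable after the string receives
# the whole match; each following variable receives one parenthesised group.
set matched [regexp $pattern $line whole x1 y1 x2 y2 thickness color]
if { $matched } {
    puts "corners: ($x1,$y1) ($x2,$y2)  width: $thickness  color: $color"
}

# A small variation in the data, here a decimal coordinate, no longer
# matches, because [0-9]+ only accepts digits.
set matched_decimal [regexp $pattern {rectangle 10.5 10 150 50 2 blue}]
puts "decimal coordinate matched: $matched_decimal"
```

Each new keyword or variation in the data needs another hand-written expression of this kind, which is the maintenance burden that the Active File pattern avoids.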
The pattern is based on a very simple suggestion: instead of writing your own parser in TCL (using regexp or other means), why not let the TCL parser do all the work for you?

The Active File design pattern

To explain this design pattern, we continue the example of the simple drawing tool from the previous section. First, we write two procedures in TCL, one that draws a rectangle and one that draws text.

example2/parser.tcl

    canvas .c
    pack .c

    # Draw one rectangle item on the canvas.
    proc d_rect {x1 y1 x2 y2 thickness color} {
        .c create rectangle $x1 $y1 $x2 $y2 -width $thickness -outline $color
    }

    # Draw one text item on the canvas.
    proc d_text {x y text anchor color} {
        .c create text $x $y -text $text -anchor $anchor -fill $color
    }

To make a picture on the canvas, we can now call these two procs several times, once for each item we want to draw. To make the same picture as above, we need the following three calls:

example2/datafile.dat

    d_rect 10 10 150 50 2 blue
    d_rect 7 7 153 53 2 blue
    d_text 80 30 "Simple Drawing Tool" c red

Does this look familiar? The code for calling our two procs looks almost exactly like the data file we parsed earlier. The only difference is that the keywords have changed from 'rectangle' and 'text' to 'd_rect' and 'd_text'.

Now we come to the insight that makes this design pattern tick: to parse the data file, we treat it like a TCL script. We just put the calls to our two procedures in a file, and we use that as the data file. The fact that the data file actually contains calls to TCL procedures is the heart of this design pattern.

Parsing the data file is now extremely easy:

    source "datafile.dat"

The built-in TCL command source reads the file, parses it, and executes the commands in the file. Since we have implemented the procedures d_rect and d_text, the source command will automatically invoke the two procedures with the correct parameters. We will call d_rect and d_text the parsing procedures.
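The same idea can be tried without a canvas. In the sketch below, which is an illustration and not the column's own code, the parsing procedures only record each item in a list; the list name `items` and the file name `active_demo.dat` are hypothetical.

```tcl
# Parsing procedures: stand-ins that record the items instead of drawing them.
set items {}

proc d_rect {x1 y1 x2 y2 thickness color} {
    global items
    lappend items [list rectangle $x1 $y1 $x2 $y2 $thickness $color]
}

proc d_text {x y text anchor color} {
    global items
    lappend items [list text $x $y $text $anchor $color]
}

# Create a small data file so that the sketch is self-contained.
set fid [open "active_demo.dat" w]
puts $fid {d_rect 10 10 150 50 2 blue}
puts $fid {d_text 80 30 "Simple Drawing Tool" c red}
close $fid

# The TCL interpreter does all the parsing: each line of the data file
# is executed as a call to one of the procedures defined above.
source "active_demo.dat"

# Show what was collected.
foreach item $items {
    puts $item
}
```

Because the quoted string in the data file is handled by TCL's own quoting rules, the text argument reaches d_text as a single value without any extra work in the script.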
We do not need to do any further parsing. No regular expressions, no line-by-line loop, no opening and closing of files. Just one call to source. The data file has become a TCL script that can be executed. This is called an Active File because it contains executable commands, not just passive data. The Active File design pattern works in most scripting languages, and is excellently described by Nat Pryce on his website.

Advantages of using the Active File pattern:
Disadvantages of using the Active File pattern:
Limitations of the Active File pattern:
Next time

Next time, in part two, we will look at some easy tricks to make the persistent data easier to read by humans, without making it any harder to parse. And we will see some more advanced examples of how you can flex TCL into handling more complex data files for you.
Copyright to all articles belong to their respective authors. Everything else © 2024 LinuxMonth.com. Linux is a trademark of Linus Torvalds.