Ploticus can read tabular ASCII data from files, from commands, or from standard input; data may also be embedded in ploticus scripts. If you're using prefabs, the data parameter specifies the source of the data (either a file or standard input). If you're writing scripts, proc getdata is used to read or specify plotting data; proc trailer may be used to place larger amounts of embedded plot data at the end of the script file, to get it out of the way.

Since ploticus can read data on standard input, there are many possibilities for getting data for plotting. To get data out of an SQL database, use your database's command line tool to extract tabular ASCII data. Or, to get data across the internet using a URL, use a utility like Jeff Poskanzer's http_get. These examples illustrate:
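A minimal sketch of both cases. The database name, query, table, and URL are hypothetical, and data=stdin is assumed to be the prefab value that selects standard input:

```
# Hypothetical database "mydb" and table "readings"; --batch gives plain tabular output
mysql --batch --skip-column-names -e "SELECT x, y FROM readings" mydb | pl -prefab lines data=stdin x=1 y=2

# Hypothetical URL; http_get writes the fetched document to standard output
http_get "http://example.com/mydata.dat" | pl -prefab lines data=stdin x=1 y=2
```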
Script writers can do the same thing by setting proc getdata's file attribute to stdin, or by using proc getdata's command attribute.
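In script form this might look like the following sketch. The URL is hypothetical, and the two procs are alternatives, so a real script would use one or the other:

```
// Alternative 1: let ploticus run a command and read its output
#proc getdata
command: http_get "http://example.com/mydata.dat"

// Alternative 2: read whatever is piped to pl on standard input
#proc getdata
file: stdin
```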
Plotting from data fields

Plotting and data display operations are done using fields. Suppose we have a data set like this:

    F1 2.43 0.47 "Jane Doe" PF7955
    F2 2.79 0.28 "John Smith" PT2705
    F3 2.62 0.37 "Ken Brown" PB2702
    F4 "" "" "Bud Flippner" PX7205

We might draw a bar graph using the values in field 2, and draw error bars using the values in field 3. The bars could be labeled with the values in field 4, or perhaps field 1. Data fields may always be referenced by number, where the first is 1. For example, to produce a line plot using fields 1 and 2 of a data set you might use the prefab command:

    pl -prefab lines data=mydata x=1 y=2

Or the script equivalent:

    #proc lineplot
    xfield: 1
    yfield: 2

Naming data fields: You may be able to reference data fields by name. Sometimes data sets carry field names in the first row. This is called a field name header. If your data set has a field name header, you can reference fields using those names (if you're using prefabs, specify header=yes; if you're writing scripts, set proc getdata fieldnameheader to yes). The field name header is expected to use the same delimitation as the rest of the data. Here's an example:

    date time alevel blevel
    020402 13:11 102 392
    020402 13:28 128 402
    ...

You can also assign names explicitly in your ploticus script by using one of the proc getdata fieldname attributes. Field names (whether from a header or specified explicitly) are like variable names; they cannot contain embedded white space, comma, or quote characters.

Massaging data: If you are developing ploticus scripts and your data needs additional processing before it can be plotted, you may be able to do that processing within ploticus. To select certain fields, reformat fields, concatenate fields, etc., try using a proc getdata filter. To perform accumulation, tabulation and counting, rewriting as percents, computation of totals, reversing record order, rotation of row/column matrix, break processing, etc., proc processdata may be useful (it operates on the data after they have been read in).
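As an illustration of referencing fields by name, here is a small sketch of a complete script with embedded data and a field name header. The field name day, the plotting area, and the axis ranges are made up for this example; the alevel and blevel values are taken from the sample rows above:

```
// Embedded data whose first row is a field name header
#proc getdata
fieldnameheader: yes
data:
  day alevel blevel
  1 102 392
  2 128 402

// Plotting area and scaling (values chosen only to fit the sample data)
#proc areadef
rectangle: 1 1 5 3
xrange: 0 3
yrange: 0 500
xaxis.stubs: inc 1
yaxis.stubs: inc 100

// Fields are referenced by name rather than by number
#proc lineplot
xfield: day
yfield: alevel
```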
Recognized data formats

Data files or streams should be plain ASCII text, not binary, and should be organized as a collection of rows having one or more fields. Fields may have numeric or alphanumeric content and are separated using a single delimitation method, such as spacequoted, whitespace, or comma delimitation.
Notes regarding data input and parsing:

Empty rows and commented rows are ignored (the comment marker may be specified via proc getdata).

Data sets with a variable number of fields may be accommodated by specifying the proc getdata attribute nfields. Otherwise, the first usable row dictates the expected number of fields per record. If a row has more than the expected number of fields, the extra fields are silently ignored. If a row has fewer than the expected number of fields, blank fields are silently added until the record has the same number of fields as the other records. nfields may also be used to read only the first few fields on every row and ignore the rest.

Leading white space is allowed when using spacequoted or whitespace delimitation. It is not allowed with the other types.

Comma-delimited data files may include commented lines and empty lines, but the comment symbol must be at the beginning of the line, and empty lines may not contain any white space.

Each row, including the last one, should be terminated with the standard line terminator for your system. For unix systems this is the newline character. For Win32 it is CR/LF; these are handled properly by MingW builds but not by unix builds.

The data parser was improved for version 2.02; earlier versions did not support zero-length fields or data sets with a variable number of fields.
Data that is specified within a ploticus script is subject to script processing: leading white space is stripped off and the script interpreter will attempt to evaluate constructs that look like operators or variables.
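The field-count rules above can be sketched with a small ragged data file. The file name ragged.dat and its contents are hypothetical:

```
// ragged.dat (hypothetical contents):
//   A 1 2 3
//   B 4
//   C 5 6 7 8

#proc getdata
file: ragged.dat
nfields: 4
```

Under the rules described in the notes, row B should be padded with two blank fields and the fifth field of row C should be ignored, so every record ends up with four fields.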
Missing data

Missing data values may be represented using a code or by a zero-length field, if the specific delimitation method allows them. When plotting, missing values are generally skipped over, but exactly what occurs depends on what kind of plot operation is being done. The individual plotting proc manual pages give details.

Embedded #set statements

Data files may contain embedded #set statements for setting ploticus variables directly from the data file. The syntax is:
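Assuming embedded statements take the same form as #set in ploticus scripts, the general shape would be as follows, where VARNAME and value are placeholders:

```
#set VARNAME = value
```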
Here's an example of a data file with embedded #set statements:
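A small illustrative data file in that style. The variable names TITLE and UNITS and their values are made up for this sketch, and the #set form is assumed to match the one used in scripts; the data rows reuse sample values shown earlier:

```
#set TITLE = Tank levels
#set UNITS = cm
020402 13:11 102 392
020402 13:28 128 402
```

Once the file has been read, a script could then refer to these values as @TITLE and @UNITS.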
Examples

Gallery examples include:

    scat7.dat (white-space delimited)
    stock.csv (comma delimited)
    timeline3 (data specified within script)
    km2 (data specified within script)
data display engine. Copyright Steve Grubb