FIND

The Extract Tool automates the creation and configuration of EXTRACT, FIND and REPEAT.

For a detailed reference with examples, see Extracting Data From Files.

FIND:

  • Provides a flexible and powerful way to use data in a file as input to an EASAP
  • Always has a parent EXTRACT
  • Matches a single line in a file

If FIND is a child of a REPEAT, multiple lines are matched.

A match is the first instance on the line that matches the pattern in Value: (subsequent matches are ignored).


FIND Essential Parameters:

  • Value: Enter a combination of literal strings and/or percent-delimited new LIST names
  • Type: Select types to match and copy as elements of LISTs in Value: (Integer, Real, or Text)
  • Whitespace: Select how whitespace in Value: is handled during a match. Options: Very Stretchy (default), As Is, Stretchy

Value:

In general Value: is a combination of literal text and new LIST references.

Each percent-delimited LIST reference in Value: creates a new LIST with that name, which contains the matched values from the file.

The number of elements in this LIST corresponds to the number of times a parent REPEAT has run its child FIND (i.e. the number of lines that have matched).
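The relationship between LIST references and matched lines can be sketched in Python. This is only an analogy using regular expressions, not how EASA implements FIND; the file contents, the names id and t, and the pattern are all hypothetical.

```python
import re

# Hypothetical file contents; every line matches the pattern.
lines = [
    "node 1 temp 20.5",
    "node 2 temp 21.0",
    "node 3 temp 19.8",
]

# Rough analogue of Value: "node %id% temp %t%" with Type: Integer, Real.
pattern = re.compile(r"node\s+(?P<id>\d+)\s+temp\s+(?P<t>\d+\.\d+)")

lists = {"id": [], "t": []}  # one list per percent-delimited name
for line in lines:
    match = pattern.search(line)
    if match:
        for name in lists:
            lists[name].append(match.group(name))

print(lists)
```

Each list ends up with one element per matched line, mirroring the way a LIST grows each time a parent REPEAT runs its child FIND.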


Type:

Type: specifies whether each match is:

  • An Integer
  • A Real floating point number
  • More generally, a string of Text

Type: Real will recognize:

  • Floating point format (eg. 123.45)
  • Exponential format (eg. 1.23456e+02)
  • An integer
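As a rough illustration, the three forms above can be described by a single regular expression. The grammar below is an assumption made for this sketch; the exact rules EASA applies (for example to signs or a leading decimal point) are not stated here.

```python
import re

# Assumed grammar: optional sign, digits with an optional decimal point,
# optional exponent. Illustrative only; EASA's exact rules may differ.
REAL = re.compile(r"[+-]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][+-]?\d+)?")

samples = ["123.45", "1.23456e+02", "123"]

# Keep only the samples the pattern accepts, converted to floats.
parsed = [float(s) for s in samples if REAL.fullmatch(s)]
print(parsed)
```

All three samples are accepted, including the plain integer.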

Type: Text will accept multiple words and numbers; in effect it is a wildcard data type matching strings of characters.

FIND will only copy the value of the first match on a line.

There are two ways to obtain more than a single value from a line in a file:

  1. Explicitly include multiple LIST references in Value:
  2. Extract multiple values into one Text-type value and then use SPLIT to separate the text into elements
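The two approaches can be compared with a small Python analogy. The input line and the patterns are hypothetical, and str.split() merely stands in for SPLIT, whose actual options are not described here.

```python
import re

line = "coords: 1.0 2.5 4.0"  # hypothetical line from a file

# Approach 1: one capture group per value, like multiple LIST references.
explicit = re.search(r"coords:\s*(\S+)\s+(\S+)\s+(\S+)", line).groups()

# Approach 2: capture the rest of the line as one piece of text,
# then split it into elements afterwards.
text = re.search(r"coords:\s*(.+)", line).group(1)
elements = text.split()

print(explicit)
print(elements)
```

The first approach needs the number of values to be known in advance; the second also copes with a varying number of values per line.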

Whitespace:

The Whitespace: parameter controls how spaces and tabs between literal string(s) and/or LIST(s) in Value: are handled in match comparisons.

Whitespace: As Is - the order and number of spaces and tabs around a LIST reference in the Value: parameter must match exactly what is found in the file for a match to be made.

Whitespace: Stretchy - allow space around a LIST reference to match 1 or more spaces or tabs around an appropriate data type in the file.

Whitespace: Very Stretchy - allow space around a LIST reference to match 0 or more spaces or tabs around an appropriate data type in the file.

To highlight the difference between Stretchy and Very Stretchy, let us look at two examples.


Example One

For a FIND:

  • Set Value: %r1% | %r2% | %r3%
  • Set Type: Real, Real, Real
  • Set the parent EXTRACT's file to: 1.2 | 3.50| 12.0

In this case, Stretchy will not match the text because there are no spaces between the second number and the second pipe symbol. However, Very Stretchy would correctly match the text since it does not require any spaces.
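Example One can be reproduced with a regular-expression analogy in Python. The helper value_to_regex is hypothetical and is not part of EASA. It assumes that every run of whitespace in Value: becomes "one or more" (Stretchy) or "zero or more" (Very Stretchy) spaces or tabs, and that every LIST reference is a Real with the grammar shown.

```python
import re

# Assumed grammar for a Real; the exact rules are not spelled out here.
REAL = r"[+-]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][+-]?\d+)?"

STRETCHY = r"[ \t]+"       # 1 or more spaces or tabs
VERY_STRETCHY = r"[ \t]*"  # 0 or more spaces or tabs

def value_to_regex(value, gap):
    """Translate a Value: string into a compiled regex (illustrative only).

    Each %name% becomes a named group matching a Real, each run of
    whitespace becomes `gap`, and everything else is matched literally.
    """
    parts = []
    for token in re.split(r"(%\w+%|\s+)", value):
        if not token:
            continue
        if re.fullmatch(r"%\w+%", token):
            parts.append(f"(?P<{token[1:-1]}>{REAL})")
        elif token.isspace():
            parts.append(gap)
        else:
            parts.append(re.escape(token))
    return re.compile("".join(parts))

value = "%r1% | %r2% | %r3%"
line = "1.2 | 3.50| 12.0"

stretchy = value_to_regex(value, STRETCHY).search(line)
very_stretchy = value_to_regex(value, VERY_STRETCHY).search(line)

print("Stretchy:", stretchy.groupdict() if stretchy else None)
print("Very Stretchy:", very_stretchy.groupdict() if very_stretchy else None)
```

Under these assumptions the Stretchy pattern finds nothing, because it needs at least one space before the second pipe symbol, while the Very Stretchy pattern captures all three numbers.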


Example Two

For a FIND:

  • Set Value: %r1% %r2% %r3%
  • Set Type: Real, Real, Real
  • Set the parent EXTRACT's file to: 1.2 3.50 12.0

Now suppose the file also contains a second line with the following text: 1.2 2.3

In this case, Stretchy will correctly match only the first line of text, while Very Stretchy will incorrectly match both lines. For the second line of text, Very Stretchy would assign:

  • r1 = 1.2
  • r2 = 2.
  • r3 = 3

The second floating point number is broken into two Reals because the 'zero or more spaces' rule allows two values to match with no whitespace between them.
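The same behaviour shows up in a regular-expression analogy. The Real grammar below is an assumption chosen so that "2." and "3" both count as Reals, as in the assignment above; it is a sketch, not EASA's matcher.

```python
import re

# Assumed Real grammar; accepts "2." and "3" as well as "3.50".
REAL = r"(?:\d+\.\d*|\.\d+|\d+)"

# Value: "%r1% %r2% %r3%" under the two whitespace settings.
stretchy = re.compile(rf"({REAL})[ \t]+({REAL})[ \t]+({REAL})")
very_stretchy = re.compile(rf"({REAL})[ \t]*({REAL})[ \t]*({REAL})")

for line in ["1.2 3.50 12.0", "1.2 2.3"]:
    s = stretchy.search(line)
    v = very_stretchy.search(line)
    print(repr(line))
    print("  Stretchy:     ", s.groups() if s else None)
    print("  Very Stretchy:", v.groups() if v else None)
```

For the line 1.2 2.3, the Stretchy pattern finds no match, while the Very Stretchy pattern backs off from 2.3 and splits it into 2. and 3 to satisfy the third reference.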

In general, Very Stretchy is for numbers that are not delimited by spaces, and Stretchy is for space-delimited numbers.


A sequence of FINDs

If an EXTRACT has a sequence of child REPEAT/FINDs, the first FIND attempts a match on the first line of the file and continues until the conditions to exit the REPEAT loop are satisfied. The next FIND begins matching at the line following the one where the REPEAT was satisfied, unless configured otherwise.
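The way a sequence of FINDs moves through a file can be sketched with a line cursor in Python. The function repeat_find and its exit condition (stop after a fixed number of matches) are assumptions made for illustration; the actual exit conditions of a REPEAT are configured elsewhere and are not modelled here.

```python
import re

def repeat_find(lines, start, pattern, count):
    """Scan forward from lines[start], collecting up to `count` matches.

    Returns the matched values and the index of the line after the one
    where scanning stopped, so the next FIND can continue from there.
    """
    values = []
    i = start
    while i < len(lines) and len(values) < count:
        match = pattern.search(lines[i])
        if match:
            values.append(match.group(1))
        i += 1
    return values, i

# Hypothetical file contents.
lines = ["header", "x = 1", "x = 2", "y = 7", "x = 3"]

# First REPEAT/FIND starts at the first line of the file.
xs, cursor = repeat_find(lines, 0, re.compile(r"x = (\d+)"), 2)

# Second REPEAT/FIND resumes on the line after the first one finished.
ys, cursor = repeat_find(lines, cursor, re.compile(r"y = (\d+)"), 1)

print(xs, ys, cursor)
```

Because the second search starts where the first one stopped, it never revisits earlier lines, and the final x = 3 line is not picked up by the first search, which had already met its exit condition.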