FIND


Tip: The Extract Tool creates and configures EXTRACT, FIND and REPEAT. A detailed reference with examples is available; see Extracting Data From Files.


FIND provides a flexible and powerful way to use data in a file as input to an EASAP.

FIND always has a Parent EXTRACT.

FIND matches a single line in a file. If FIND is a Child to REPEAT, multiple lines are matched.

A 'match' is the first instance on the line that matches the pattern in the Value: parameter. Subsequent matches are ignored.


FIND
Essential Parameters:
Value: Combination of literal characters and zero or more object references
Type: Data type to match and set in Value: (Integer, Real, Text)
Whitespace: Select how white space in the Value: parameter is handled during a match ('As Is', 'Stretchy', 'Very Stretchy'←default)

Value:

In general the Value: parameter is a combination of literal text and new or existing delimited object references.

A new percent-delimited object reference in the Value: parameter of a FIND object will create a LIST with that object name that contains the matched values from the file. The number of elements in this LIST will correspond to the number of times a Parent REPEAT has run its Child FIND (i.e. the number of lines that have matched).
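The list-building behaviour described above can be approximated outside EASA with a short Python sketch. This is an illustration only, not EASA's implementation: the helper names value_to_regex and repeat_find are hypothetical, each %name% reference is assumed to match one run of non-blank characters, and the Parent REPEAT is assumed to simply try the FIND on every line.

```python
import re

def value_to_regex(value):
    """Turn a Value:-like string such as 'T = %temp%' into a regex with named groups."""
    # re.split with a capturing group yields: literal, name, literal, name, ...
    parts = re.split(r"%(\w+)%", value)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2:
            # Object reference: assumed here to match one non-blank token.
            pattern += rf"(?P<{part}>\S+)"
        else:
            # Literal text must appear in the line exactly as written.
            pattern += re.escape(part)
    return re.compile(pattern)

def repeat_find(value, lines):
    """Collect one list per object reference, one element per matching line."""
    regex = value_to_regex(value)
    lists = {name: [] for name in regex.groupindex}
    for line in lines:
        m = regex.search(line)  # only the first match on a line is used
        if m:
            for name, text in m.groupdict().items():
                lists[name].append(text)
    return lists

lines = ["T = 20.5", "a comment line", "T = 21.0"]
result = repeat_find("T = %temp%", lines)
print(result)  # {'temp': ['20.5', '21.0']}
```

Two lines match, so the list named temp ends up with two elements, mirroring the correspondence between matched lines and LIST length.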


Type:

The Type: parameter specifies whether each match is:

  • 'Integer'
  • a 'Real' floating point number
  • more generally a string of 'Text'

'Real' data type will recognize:

  • floating point format (eg. 123.45)
  • exponential format (eg. 1.23456e+02)
  • an integer
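As a rough illustration of the three 'Real' formats listed above, the following Python sketch uses a regular expression that accepts each of them. The pattern and the helper name first_real are assumptions made for this example; the exact rules EASA applies may differ.

```python
import re

# Assumed approximation of a 'Real' token: floating point, exponential, or integer.
REAL = re.compile(r"[+-]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][+-]?\d+)?")

def first_real(line):
    """Return the first Real-like number on the line as a float, or None."""
    m = REAL.search(line)
    return float(m.group()) if m else None

print(first_real("x = 123.45"))     # 123.45
print(first_real("1.23456e+02"))    # 123.456
print(first_real("count 7 items"))  # 7.0
print(first_real("no numbers"))     # None
```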

'Text' data type will accept multiple words and numbers; in effect, it is a wildcard data type matching strings of characters.

FIND will only assign an object reference the value of the first match on a line. There are two ways to obtain more than a single value from a line in a file:

  • explicitly include multiple object references in the Value: parameter
  • extract multiple values into one 'Text'-type object reference and then use SPLIT to separate the text into elements.
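The two approaches can be compared with a small Python analogy. This is only a sketch: the sample line and the label "stress:" are invented for illustration, regular expression groups stand in for object references, and str.split stands in for SPLIT.

```python
import re

line = "stress: 10.5 11.2 9.8"  # hypothetical line from a data file

# Approach 1: one reference (here, one capture group) per value.
explicit = re.search(r"stress: (\S+) (\S+) (\S+)", line).groups()

# Approach 2: capture the remainder as one piece of text, then split it.
text = re.search(r"stress: (.+)", line).group(1)
values = [float(v) for v in text.split()]

print(explicit)  # ('10.5', '11.2', '9.8')
print(values)    # [10.5, 11.2, 9.8]
```

The first approach needs the number of values to be known in advance; the second copes with a varying number of values per line.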

Whitespace:

The Whitespace: parameter controls how spaces and tabs between object references in the Value: parameter are handled in match comparisons.

Whitespace: may be set to:

  • As Is
  • Stretchy
  • Very Stretchy←default

'As Is' preserves white space exactly: the order and number of spaces and tabs around an object reference in the Value: parameter must match exactly what is found in the file for a match to be made.

'Stretchy' allows space around an object reference to match 1 or more spaces or tabs around an appropriate data type in the file.

'Very Stretchy' allows space around an object reference to match 0 or more spaces or tabs around an appropriate data type in the file.

To highlight the difference between 'Stretchy' and 'Very Stretchy', let us look at two examples.


Example One

For a FIND:

  • Set Value: %r1% | %r2% | %r3%
  • Set Type: Real, Real, Real

Set the Parent EXTRACT file to: 1.2 | 3.50| 12.0

In this case, 'Stretchy' will not match the text due to the lack of any spaces between the second number and the second pipe symbol. However, 'Very Stretchy' would correctly match the text since it does not require any spaces.


Example Two

For a FIND:

  • Set Value: %r1% %r2% %r3%
  • Set Type: Real, Real, Real

Set the Parent EXTRACT file to: 1.2 3.50 12.0

Also consider the case where the file above contains a second line with the following text: 1.2 2.3

In this case, 'Stretchy' will correctly match only the first line of text, while 'Very Stretchy' will incorrectly match both lines. For the second line of text, 'Very Stretchy' would assign:

  • r1→1.2
  • r2→2.
  • r3→3

The second floating point number is broken into two 'Real' values, as the 'zero or more spaces' requirement allows this.

In general, 'Very Stretchy' is for non-space delimited numbers and 'Stretchy' is for space delimited numbers.
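Both examples can be reproduced with a Python emulation of the three Whitespace: settings. This is a sketch under stated assumptions, not EASA's matching engine: the names REAL, MODES, compile_value and find are hypothetical, every reference is treated as Type: Real, and each run of spaces or tabs in Value: is translated to "exactly as written", "one or more" or "zero or more" blanks.

```python
import re

# Assumed approximation of a 'Real' token: float, exponential, or integer.
REAL = r"[+-]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][+-]?\d+)?"

# Assumed regex for a run of blanks in each mode; None keeps the run as written.
MODES = {"As Is": None, "Stretchy": r"[ \t]+", "Very Stretchy": r"[ \t]*"}

def compile_value(value, mode):
    """Translate a Value: string containing %name% references into a regex."""
    tokens = re.findall(r"%\w+%|[ \t]+|[^% \t]+", value)
    out = []
    for tok in tokens:
        if tok.startswith("%"):
            out.append(f"(?P<{tok[1:-1]}>{REAL})")        # object reference
        elif tok.isspace():
            out.append(MODES[mode] or re.escape(tok))      # white space run
        else:
            out.append(re.escape(tok))                     # literal text
    return re.compile("".join(out))

def find(value, mode, line):
    """Return the values of the first match on the line, or None."""
    m = compile_value(value, mode).search(line)
    return m.groupdict() if m else None

one = "%r1% | %r2% | %r3%"
two = "%r1% %r2% %r3%"

print(find(one, "Stretchy", "1.2 | 3.50| 12.0"))       # None
print(find(one, "Very Stretchy", "1.2 | 3.50| 12.0"))  # {'r1': '1.2', 'r2': '3.50', 'r3': '12.0'}
print(find(two, "Stretchy", "1.2 2.3"))                # None
print(find(two, "Very Stretchy", "1.2 2.3"))           # {'r1': '1.2', 'r2': '2.', 'r3': '3'}
```

The last call shows the pitfall from Example Two: with zero blanks allowed, the matcher backtracks and splits 2.3 into 2. and 3 in order to fill all three references.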


A Sequence of FIND's

If an EXTRACT has a sequence of Child REPEAT/FIND's, the first FIND object attempts a match on the first line of the file and will continue until the conditions to exit the REPEAT loop are satisfied. The next FIND begins matching at the line following the line where the REPEAT was satisfied, unless configured otherwise.
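The line-by-line hand-off between successive FIND's can be pictured with a short Python sketch. It is a simplification with assumptions: the function name run_finds is hypothetical, each FIND is represented by a plain regular expression with one capture group, and the REPEAT exit condition is modelled as "stop at the first matching line" (real REPEAT exit conditions are configurable).

```python
import re

def run_finds(patterns, lines):
    """Apply each pattern in order; each search resumes on the line after the previous hit."""
    cursor = 0
    hits = []
    for pattern in patterns:
        regex = re.compile(pattern)
        for i in range(cursor, len(lines)):
            m = regex.search(lines[i])
            if m:
                hits.append((i, m.group(1)))
                cursor = i + 1  # the next FIND starts on the following line
                break
        else:
            hits.append(None)  # ran out of lines without a match
    return hits

lines = ["header", "nodes 12", "skip me", "elements 40"]
in_order = run_finds([r"nodes (\d+)", r"elements (\d+)"], lines)
reversed_order = run_finds([r"elements (\d+)", r"nodes (\d+)"], lines)

print(in_order)        # [(1, '12'), (3, '40')]
print(reversed_order)  # [(3, '40'), None]
```

The second call illustrates why the order of FIND's matters: once the first FIND has consumed the file up to the last line, a later FIND cannot go back to an earlier line.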