Text File

 

 

 


 

The Text File source allows any fixed-width or delimited text file to be processed and transformed through the pasTransfer product.

 

Source File Format Page

 

The Source File Format Page allows the user to select the format of the file, encoding, delimiters, and enhanced options that describe how the file is formatted and processed.

 

          Format Tab

Auto Detect: pasTransfer will attempt to determine all applicable settings for the source document.

File Format: Whether the source file is delimited (columns are separated by predetermined character strings) or fixed-width (columns are aligned vertically such that all rows have the same length).

Encoding:  Specifies the encoding used to read source files.  If you are not sure of the file encoding, try auto-detect.  If auto-detect does not work, try Unicode (UTF-8).

Row Delimiter: The character or set of characters that end each line.  This is usually {Cr}{Lf} in Windows files and often just {Lf} in Unix files.
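As a rough illustration of why the row delimiter matters, the Python sketch below splits the same content using each delimiter.  The sample string is invented for the example, and the code is not how pasTransfer itself reads files.

```python
# Hypothetical sample: two rows written with Windows-style {Cr}{Lf} endings.
raw = "a,b\r\nc,d\r\n"

# Splitting on the matching delimiter yields clean rows.
rows_crlf = [r for r in raw.split("\r\n") if r]

# Splitting on {Lf} alone leaves a stray {Cr} at the end of every row.
rows_lf = [r for r in raw.split("\n") if r]

print(rows_crlf)  # ['a,b', 'c,d']
print(rows_lf)    # ['a,b\r', 'c,d\r']
```

A stray {Cr} at the end of the last column is a common sign that the wrong row delimiter has been chosen.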

Column Names:  Check this option to use column names found in the first row.  If column names do not exist, the product will generate names for you.

Text Qualifier: A character (usually the double quotation mark) that encloses text values, which differentiates text columns from non-text columns in a file.

Preview Text Box:  Displays a preview of the text file that has been selected as the source file.

 

          Delimited Tab

Column Delimiter: Only available when the delimited option is selected.  The character used to separate columns in a delimited text file.
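The interaction between a column delimiter and a text qualifier can be sketched with Python's standard csv module.  The sample row is hypothetical, and the csv module only stands in for the product's own parser here.

```python
import csv
import io

# Hypothetical row: the second value contains the column delimiter,
# so it is wrapped in the text qualifier (double quotation marks).
sample = 'L,"Trap, Large",12.24\r\n'

reader = csv.reader(io.StringIO(sample, newline=""), delimiter=",", quotechar='"')
row = next(reader)
print(row)  # ['L', 'Trap, Large', '12.24']
```

Without the qualifier, the comma inside the second value would be read as a column break and the row would have four columns.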

 

          Fixed Tab

Fixed File Editor: Only available when the fixed-width option is selected.  Click inside the editor to mark the end of each column in a fixed-width file.  The ruler at the top of the editor shows the character position.
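Marking column ends in the editor amounts to recording a list of character positions.  The Python sketch below shows how such a list could be used to slice a line; the function name split_fixed and the column layout are assumptions made for the example.

```python
def split_fixed(line, column_ends):
    """Slice a fixed-width line at the given end positions (exclusive)."""
    values, start = [], 0
    for end in column_ends:
        values.append(line[start:end])
        start = end
    return values


# Hypothetical layout: columns end at character positions 10, 18 and 30.
line = "RoadRunner".ljust(10) + "31.13".rjust(8) + "112-123-01".ljust(12)
values = split_fixed(line, [10, 18, 30])
print(values)  # ['RoadRunner', '   31.13', '112-123-01  ']
```

The padding that remains in the second and third values is what the Trim Whitespace option is meant to remove.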

 

          Rules Tab

                    Rules are sets of regular expressions that instruct the text file processor whether a line of data should be imported from the source text file.  Rules consist of the following properties:

Purpose: The Purpose instructs the text file processor how the rule is to be evaluated.  The available purposes are:

o Process -  Only a process rule will result in lines of data being imported.  In order for a line to be valid for import, the regular expression for a process rule must match the line of data.  Additionally, there must be no defined start rules or the process rule must be in the same group as the active group.

o Start -  The start rule is used to define the starting point for the group processing.  If the regular expression of a start rule matches a line of data, the defined group becomes active.  A start rule is always evaluated when no group is active and when no stop rule is defined for the group.

o Stop -  The stop rule is used to define the stopping point for the group processing.  If the regular expression of a stop rule matches a line of data and the stop rule is contained within the active group, the group becomes inactive, allowing more start rules to be evaluated.  Omission of a stop rule allows start rules to be evaluated for each line of data, even if a group is already active.

o Skip -  The skip rule is used to skip one or more lines in the source document.  If the regular expression of a skip rule matches a line of data and the skip rule is contained within the active group, the matched line is skipped over and not processed.

Group: The Group is used to group together rules that work to create a processing region in the text file.  Once a group begins processing, only rules within that group or not contained in a group will be evaluated until the group ends.

Skip: The Skip is the number of rows to skip ahead in the text file after the rule is matched and the purpose has been evaluated.

Regular Expression: The regular expression that, when matched to a line of source data, causes the rule to become active.

 

                    Consider the text file below:

                    H,ACME Inc.,

                    L,31.13,RoadRunner Food,112-123-01,

                    T,1.87,

                    H,ACME LLC,

                    L,97.08,Steam Roller,101-103-01,

                    T,5.82,

                    H,ACME Inc.,

                    L,12.24,RoadRunner Trap,030-123-04,

                    T,0.73,

                    H,ACME LLC,

                    L,15.15,Invisi-Rope,100-103-07,

                    T,0.91,

 

          To import all lines that begin with an L:

 

                    Purpose   Group       Skip   Regular Expression
                    Process   ACME Inc.   0      ^L,

                    

          To import all lines that begin with an L for ACME Inc. only:

 

                    Purpose   Group       Skip   Regular Expression
                    Start     ACME Inc.   1      ^H,ACME Inc.,$
                    Process   ACME Inc.   0      ^L,
                    Stop      ACME Inc.   0      ^H,ACME LLC,
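The way Start, Process, and Stop rules interact can be sketched in Python.  This is one reading of the rule descriptions on this page, not the product's implementation: the names Rule, in_scope, and select_lines are hypothetical, ungrouped rules are assumed to apply whenever a group is active, and the Skip count is left out because its exact line-counting behavior is not spelled out here.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    purpose: str  # "Process", "Start", "Stop" or "Skip"
    group: str    # "" means the rule is not in a group
    pattern: str  # regular expression tested against each line


def in_scope(rule, active):
    """A rule applies if a group is active and the rule is in it or ungrouped."""
    return active is not None and rule.group in ("", active)


def select_lines(lines, rules):
    """Return the lines a Process rule would import (Skip counts are ignored)."""
    has_start = any(r.purpose == "Start" for r in rules)
    groups_with_stop = {r.group for r in rules if r.purpose == "Stop"}
    active = None
    imported = []
    for line in lines:
        matched = [r for r in rules if re.search(r.pattern, line)]
        # A matching Stop rule in the active group deactivates the group.
        if any(r.purpose == "Stop" and in_scope(r, active) for r in matched):
            active = None
        # Start rules run when no group is active or the group has no Stop rule.
        if active is None or active not in groups_with_stop:
            starts = [r for r in matched if r.purpose == "Start"]
            if starts:
                active = starts[0].group
        # A matching Skip rule in scope drops the line.
        if any(r.purpose == "Skip" and in_scope(r, active) for r in matched):
            continue
        # Process rules import when no Start rules exist or the rule is in scope.
        if any(r.purpose == "Process" and (not has_start or in_scope(r, active))
               for r in matched):
            imported.append(line)
    return imported


lines = [
    "H,ACME Inc.,", "L,31.13,RoadRunner Food,112-123-01,", "T,1.87,",
    "H,ACME LLC,", "L,97.08,Steam Roller,101-103-01,", "T,5.82,",
    "H,ACME Inc.,", "L,12.24,RoadRunner Trap,030-123-04,", "T,0.73,",
    "H,ACME LLC,", "L,15.15,Invisi-Rope,100-103-07,", "T,0.91,",
]
acme_inc_rules = [
    Rule("Start", "ACME Inc.", r"^H,ACME Inc.,$"),
    Rule("Process", "ACME Inc.", r"^L,"),
    Rule("Stop", "ACME Inc.", r"^H,ACME LLC,"),
]
for line in select_lines(lines, acme_inc_rules):
    print(line)
# L,31.13,RoadRunner Food,112-123-01,
# L,12.24,RoadRunner Trap,030-123-04,
```

Under the same assumptions, passing a single Process rule with the expression ^L, and no Start rules returns all four L lines.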

 

          To import all lines that begin with an L for both ACME Inc. and ACME LLC, while retaining the header group:

 

                    Purpose   Group       Skip   Regular Expression
                    Start     ACME Inc.   1      ^H,ACME Inc.,$
                    Start     ACME LLC    1      ^H,ACME LLC,$
                    Process   (none)      2      ^L,

 

          Advanced Tab

Line Range: A line is a single row of text in the source file, typically separated from the next line by an operating-system-specific character sequence.  Select the line number at which processing begins (-1 marks the beginning of the file) and the line number at which processing ends (-1 marks the end of the file).

Row Range: A row is one or more lines of data in a file that together make up a single related set of data.  Typically, one line is the equivalent of one row, but some systems may be different.  Select the row number at which processing begins (-1 marks the beginning of the file) and the row number at which processing ends (-1 marks the end of the file).

Max Rows: The maximum number of rows to read.  The default value of -1 means read all rows.
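The -1 convention shared by the range and Max Rows settings can be illustrated with a short Python sketch.  The function name apply_limits is hypothetical, and the sketch assumes that row numbers are 1-based and that the end of a range is inclusive, which this page does not state.

```python
from itertools import islice


def apply_limits(rows, start=-1, end=-1, max_rows=-1):
    """Keep rows start..end (assumed 1-based, inclusive), capped at max_rows.

    A value of -1 means "no limit" for any of the three settings.
    """
    first = 0 if start == -1 else start - 1
    last = None if end == -1 else end
    selected = islice(rows, first, last)
    if max_rows != -1:
        selected = islice(selected, max_rows)
    return list(selected)


rows = ["r1", "r2", "r3", "r4", "r5"]
print(apply_limits(rows))                       # ['r1', 'r2', 'r3', 'r4', 'r5']
print(apply_limits(rows, start=2, end=4))       # ['r2', 'r3', 'r4']
print(apply_limits(rows, start=2, max_rows=2))  # ['r2', 'r3']
```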

Allow Variable Columns: Sometimes files may have fewer (or more) columns than expected.  When this is checked, you will not get an error - just empty values in place of missing columns.
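A minimal Python sketch of the idea follows.  The function name fit_columns is hypothetical, and leaving extra columns untouched is an assumption, since only the handling of missing columns is described above.

```python
def fit_columns(values, expected, allow_variable=True):
    """Pad a short row with empty values, or raise when variation is not allowed."""
    if len(values) == expected:
        return values
    if not allow_variable:
        raise ValueError(f"expected {expected} columns, found {len(values)}")
    # Missing columns become empty values; extra columns are kept (assumption).
    return values + [""] * max(0, expected - len(values))


print(fit_columns(["T", "1.87"], 4))  # ['T', '1.87', '', '']
```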

Skip Blank Rows: When checked, any blank line detected in the file is skipped.  Skipped blank lines do not count against a row range filter.

Trim Whitespace: Strips any leading or trailing spaces from the data.  This is especially useful for fixed-width files, where values are often padded with whitespace when they do not fill the width of the column.

Chunk Size:  Number of bytes to read from the source file at a time.  Tuning this setting has little effect on small files, but larger values may be useful on large files.

Comment Character: Any row that starts with the character specified here will be skipped.
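The combined effect of Skip Blank Rows, Trim Whitespace, and Comment Character can be sketched in Python as follows.  The function name read_rows, the "#" comment character, and the sample lines are all assumptions made for the example, and a line containing only whitespace is treated as blank.

```python
def read_rows(lines, skip_blank=True, trim=True, comment_char=None):
    """Split comma-delimited lines into rows, applying the three options."""
    rows = []
    for line in lines:
        if skip_blank and line.strip() == "":
            continue  # blank line
        if comment_char and line.startswith(comment_char):
            continue  # comment row
        values = line.split(",")
        if trim:
            values = [v.strip() for v in values]
        rows.append(values)
    return rows


# Hypothetical source lines with a comment row, a blank line and padded values.
sample = ["# sample export", "", "Steam Roller   ,  97.08", "Invisi-Rope    ,  15.15"]
rows = read_rows(sample, comment_char="#")
print(rows)  # [['Steam Roller', '97.08'], ['Invisi-Rope', '15.15']]
```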

Escape Character: Only available when the delimited option is selected.  When text qualifiers are used, this is the character that escapes the text qualifier itself when it appears in a value.
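Python's standard csv module offers a comparable setting, which makes for a convenient illustration.  The backslash escape character and the sample row are assumptions for the example and do not reflect a pasTransfer default.

```python
import csv
import io

# Hypothetical row: the text qualifier (") appears inside a qualified value
# and is escaped with a backslash so that it does not end the value.
sample = 'L,"12\\" Rope",15.15\n'

reader = csv.reader(
    io.StringIO(sample, newline=""),
    delimiter=",",
    quotechar='"',
    escapechar="\\",
    doublequote=False,
)
row = next(reader)
print(row)  # ['L', '12" Rope', '15.15']
```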

 

Data Record Column Macros Page

 

The Data Record Column Macros Page will display the columns detected within the text file and provide a means of mapping columns to a Vendor Line.  Clicking and dragging an item from the list of columns to a macro-enabled text box or grid will create a macro to reference the column.

Preview:  Displays sample data from the source file as a table so you can verify that the correct data has been selected.  Only a sample of the rows is shown, not the entire file.

Reset:  Clears the contents of all controls which contain column macros.

 

          General Tab

Account Code:  The identifying account code for the account.  This is a macro-enabled field.

Account Type:  The type of the account.  This is a macro-enabled field.

Description:  The description for an account.  This is a macro-enabled field.

Currency:  The currency for an account.  This is a macro-enabled field.

Active:  The active status for the account.  This is a macro-enabled field.

Site:  The site for the account.  This is a macro-enabled field.

 

          Attributes Tab

          Any values from the source file can be made into attributes and mapped later on the destination transformation.  Click and drag a column to the attribute data grid.  The default name of the attribute will be the name of the column.  To delete a row, select the row-header and press the delete button.

 


Copyright © 2024 pasUNITY, Inc.

 
