Data Transformation User Guide

PdfToTxt_3_02

The PdfToTxt_3_02 document processor converts PDF files to text.
The following table describes the properties of the PdfToTxt_3_02 document processor:
Property
    Description

enabled
    Defines the value of param2 or param4.

param1
    Defines a string or variable that contains the word spacing factor. The param1 property is named WordSpacingFactor and has only one property, value, which contains the string or variable. Default is 1.8.

param2
    Determines whether the output document is optimized for tables. The param2 property is named OptimizeForTables and has only one property, enabled, which has the following options:
      • Selected. The output document is optimized for tables.
      • Cleared. The output document is not optimized for tables.
    Default is cleared.

param3
    Defines a string or variable that contains the password. The param3 property is named Password and has only one property, value, which contains the string or variable.

param4
    Determines whether new page characters are hidden. The param4 property is named HideNewPageChar and has only one property, enabled, which has the following options:
      • Selected. New page characters are hidden.
      • Cleared. New page characters are not hidden.
    Default is cleared.

param5
    Defines a string or variable that contains advanced optimizations. The param5 property is named AdvancedOptimizations and has only one property, value, which contains the string or variable.

value
    Defines the value of param1, param3, or param5.
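
The relationship between the numbered parameters, their names, and their single sub-property can be easier to see as a data structure. The following Python sketch is only an illustration of the table above. It is not IntelliScript syntax and not part of any Data Transformation API. The class name PdfToTxtSettings, its field names, and the as_params method are hypothetical. The defaults come from the table; where the table states no default, the sketch assumes None.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PdfToTxtSettings:
    """Hypothetical model of the five parameters described in the table."""

    word_spacing_factor: str = "1.8"               # param1, sub-property "value"
    optimize_for_tables: bool = False              # param2, sub-property "enabled"
    password: Optional[str] = None                 # param3, no documented default
    hide_new_page_char: bool = False               # param4, sub-property "enabled"
    advanced_optimizations: Optional[str] = None   # param5, no documented default

    def as_params(self) -> dict:
        """Map each paramN to (documented name, sub-property, current value)."""
        return {
            "param1": ("WordSpacingFactor", "value", self.word_spacing_factor),
            "param2": ("OptimizeForTables", "enabled", self.optimize_for_tables),
            "param3": ("Password", "value", self.password),
            "param4": ("HideNewPageChar", "enabled", self.hide_new_page_char),
            "param5": ("AdvancedOptimizations", "value", self.advanced_optimizations),
        }
```

In this model, param1, param3, and param5 carry a value, while param2 and param4 carry an enabled flag, which matches the descriptions of the value and enabled properties.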
The PdfToTxt pre-processor might not support certain PDFs with embedded fonts. If the pre-processor fails, copy the text from the input PDF into Notepad to check for embedded fonts. If you cannot paste the text, or if the pasted text is corrupted, the PDF probably contains embedded fonts.
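
The manual Notepad check can be approximated in code once the copied text is available as a string. The following Python sketch is a rough heuristic, not a feature of Data Transformation. The function name looks_corrupted, the 20 percent threshold, and the choice of character categories are all assumptions made for illustration. It treats empty text as "cannot paste" and counts replacement, control, private-use, and unassigned characters as signs of corruption.

```python
import unicodedata


def looks_corrupted(text: str, threshold: float = 0.2) -> bool:
    """Return True if pasted text is empty or mostly unusable characters.

    The threshold is an assumed cutoff, not a documented value.
    """
    stripped = text.strip()
    if not stripped:
        return True  # nothing could be pasted

    bad = 0
    for ch in stripped:
        if ch in "\n\r\t\f":
            continue  # ordinary layout characters, including new page
        category = unicodedata.category(ch)
        # U+FFFD is the replacement character; Cc = control,
        # Co = private use, Cn = unassigned.
        if ch == "\ufffd" or category in ("Cc", "Co", "Cn"):
            bad += 1
    return bad / len(stripped) > threshold
```

A result of True suggests the same conclusion as the manual check, but the heuristic cannot confirm that embedded fonts are the cause.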
This component is deprecated. The IntelliScript editor displays it for legacy projects. Do not use it in new Scripts.


Updated September 26, 2018