|
Intensity
|
Data scientists, programmers
|
$5,195/user/year
|
Same day
|
Yes
|
API
Alpha
/**
* \brief Load source and processing script, write target, deallocate source and script memory, caller has target file.
*
* Buffers pSourcePath & pProcessingScript, calls Delta, writes pTargetPath,
* deallocates pSourceBuffer & pProcessingScriptBuffer, does not return pSourceBuffer.
*
* \param pSourcePath
* \param pTargetPath
* \param pProcessingScript
* \param pSizeSourceBuffer pointer to the size of the pSourceBuffer
* \return FAILURE_OKAY on success else various FAILURE_* on error
*/
int32_t Alpha( char *pSourcePath, char *pTargetPath, char *pProcessingScript, uint32_t *pSizeSourceBuffer );
Bravo
/**
* \brief Load processing script, write target, deallocate script memory, caller has target file and deallocates pSourceBuffer.
*
* Buffers pProcessingScript, calls Delta, writes pTargetPath,
* deallocates pProcessingScriptBuffer, does not return pSourceBuffer.
*
* \param pSourceBuffer
* \param pTargetPath
* \param pProcessingScript
* \param pSizeSourceBuffer pointer to the size of the pSourceBuffer
* \return FAILURE_OKAY on success else various FAILURE_* on error
*/
int32_t Bravo( uint8_t *pSourceBuffer, char *pTargetPath, char *pProcessingScript, uint32_t *pSizeSourceBuffer );
Charlie
/**
* \brief Load processing script, deallocate script, returns target (pSourceBuffer), caller writes target
* and deallocates pSourceBuffer.
*
* Buffers pProcessingScript, calls Delta, deallocates pProcessingScriptBuffer, returns altered pSourceBuffer.
*
* \param pSourceBuffer
* \param pProcessingScript
* \param pSizeSourceBuffer pointer to the size of the pSourceBuffer
* \return FAILURE_OKAY on success else various FAILURE_* on error
*/
int32_t Charlie( uint8_t *pSourceBuffer, char *pProcessingScript, uint32_t *pSizeSourceBuffer );
Delta
/**
* \brief Returns target (pSourceBuffer), caller has target (pSourceBuffer) and deallocates pSourceBuffer
* and pProcessingScriptBuffer.
*
* Delta is the function called by all others.
*
* \param pSourceBuffer
* \param pProcessingScriptBuffer
* \param pSizeSourceBuffer pointer to the size of the pSourceBuffer
* \param pMaxSourceBuffer size of pSourceBuffer when malloc'd
* \return FAILURE_OKAY on success else various FAILURE_* on error
*/
int32_t Delta( uint8_t *pSourceBuffer, uint8_t *pProcessingScriptBuffer, uint32_t *pSizeSourceBuffer, uint64_t pMaxSourceBuffer );
Echo
/**
* \brief processing script is populated, load source, write target, deallocate source memory, caller has target file.
*
* Buffers pSourcePath, calls Delta, writes pTargetPath,
* deallocates pSourceBuffer, does not return pSourceBuffer.
*
* \param pSourcePath
* \param pTargetPath
* \param pProcessingScriptBuffer
* \param pSizeSourceBuffer pointer to the size of the pSourceBuffer
* \return FAILURE_OKAY on success else various FAILURE_* on error
*/
int32_t Echo( char *pSourcePath, char *pTargetPath, uint8_t *pProcessingScriptBuffer, uint32_t *pSizeSourceBuffer );
Scripting
-
Beautify
-
BeautifyXML [1 of 2]
-
Format code, indented with tabs.
-
Syntax: BeautifyXML
-
BeautifyXML [2 of 2]
-
Format code, indented with tabs and then remove the space between <End> and </End> and <A*> and </A> tags.
-
Syntax: BeautifyXML|FIX_END|
-
Change
-
ChangeTag
-
After locating tag1 and then tag2 in sequence, replace the chosen tag (1 selects tag1; any other value selects tag2) with replacement.
-
Syntax: ChangeTag|1 for tag1|tag1|tag2|replacement
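Every command in this reference follows the same shape: a command name followed by parameters separated by '|'. The block below is a minimal sketch of how such a line could be split into fields. `split_script_line` is a hypothetical helper, not part of the documented API, and it assumes that parameters do not themselves contain '|'.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the documented API): split one
 * processing-script line on '|' in place. fields[0] receives the command
 * name and the following entries receive its parameters. A single trailing
 * '|' is treated as a terminator, not as an extra empty parameter.
 * Returns the number of fields stored (at most max_fields). */
size_t split_script_line(char *line, char *fields[], size_t max_fields)
{
    size_t count = 0;
    char *start = line;

    while (count < max_fields) {
        char *bar = strchr(start, '|');
        if (bar == NULL) {
            /* Last field: keep it unless it is the empty tail after a trailing '|'. */
            if (*start != '\0' || count == 0)
                fields[count++] = start;
            break;
        }
        *bar = '\0';            /* terminate the current field */
        fields[count++] = start;
        start = bar + 1;        /* continue after the delimiter */
    }
    return count;
}
```

Under these assumptions, `ChangeTag|1|<a>|<b>|new` yields five fields (the command plus four parameters), and `SwapStrings|from|to|` yields three.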
-
ChangeWrappedString
-
Find signature, scan for the opening and closing double quotes, and replace the text between the quotes with replacement.
-
Syntax: ChangeWrappedString|signature|replacement
-
Clean
-
CleanXML
-
Remove tabs, carriage returns, line feeds and hidden characters when writing final output. Note that CleanXML has a higher priority than BeautifyXML: CleanXML is used even if both BeautifyXML and CleanXML are declared in this file.
-
Syntax: CleanXML
-
Conceal
-
ConcealBlankTags
-
Hide passed tag sequence if there is nothing between the tags.
-
Syntax: ConcealBlankTags|tag_open|tag_close|
-
ConcealSpecialTags
-
Hide tags and the data between in a way that finds tags that contain carriage returns/line feeds between elements.
-
Syntax: ConcealSpecialTags|tag_open|tag_contains|tag_close|insert_to_start_of_buffer
-
Confirm
-
ConfirmField
-
Validates that a field exists and is set as a name => value pair in PHP array format, checking the leading and trailing characters to ensure the format is correct for loading and processing by another PHP script.
-
Syntax: ConfirmField|String|
-
Correct
-
CorrectQP
-
Repair quoted-printable 7-bit email encoding.
-
Syntax: CorrectQP
-
Eliminate
-
EliminateBinary
-
Force deletion of the passed binary data where the second string of three is one, two or three 8-bit binary values represented in hexadecimal format. Each hex value must take the form of ‘0xFF’ where ‘0x’ is the hex prefix and ‘FF’ can be any hex value from ‘00’ to ‘FF’. Up to three hex values can be passed in the format of ‘0xFF0xFF0xFF’.
-
Syntax: EliminateBinary|tag_open|hex_binary|tag_close|
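The hex_binary format described above (one to three values, each written as '0x' plus two hex digits) can be validated with a small parser. This is a sketch under stated assumptions: `parse_hex_binary` and `hex_digit` are hypothetical names, and only a lowercase '0x' prefix is accepted because that is the only form the description mentions.

```c
#include <stdint.h>

/* Convert one hex digit to its value, or return -1 if it is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: parse "0xFF", "0xFF0xFF" or "0xFF0xFF0xFF" into out.
 * Returns the number of values parsed (1 to 3), or -1 if the string is
 * empty, malformed, or holds more than three values. */
int parse_hex_binary(const char *s, uint8_t out[3])
{
    int count = 0;

    while (*s != '\0') {
        int hi, lo;

        if (count == 3)
            return -1;                      /* more than three values */
        if (s[0] != '0' || s[1] != 'x')
            return -1;                      /* missing "0x" prefix */
        hi = hex_digit(s[2]);
        lo = (hi < 0) ? -1 : hex_digit(s[3]);
        if (hi < 0 || lo < 0)
            return -1;                      /* not two hex digits */
        out[count++] = (uint8_t)((hi << 4) | lo);
        s += 4;                             /* advance past "0xHH" */
    }
    return (count > 0) ? count : -1;
}
```

For example, `0x0D0x0A` parses to the two bytes 0x0D and 0x0A, while a fourth value or a non-hex digit is rejected.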
-
EliminateBytes
-
Delete all occurrences of 8-bit binary values represented in hexadecimal format. Each hex value must take the form of ‘XX’ where ‘XX’ is any hex value from ‘00’ to ‘FF’.
-
Syntax: EliminateBytes|pairs_of_hex_values|
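As an illustration of the behavior described for EliminateBytes, the sketch below deletes every listed byte value from a buffer in place. It is not the product's implementation: `eliminate_bytes` and `hex_digit` are hypothetical names, and the buffer and size arguments simply mirror the pSourceBuffer / pSizeSourceBuffer convention of the API section.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one hex digit to its value, or return -1 if it is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Sketch only: delete every byte of buf[0..*size) whose value is listed in
 * hex_pairs (for example "0D0A") and update *size. Returns 0 on success,
 * or -1 if hex_pairs is not a sequence of two-digit hex values, in which
 * case the buffer is left untouched. */
int eliminate_bytes(uint8_t *buf, uint32_t *size, const char *hex_pairs)
{
    uint8_t drop[256] = {0};    /* drop[v] != 0 means byte value v is deleted */
    uint32_t read, write = 0;
    size_t i;

    for (i = 0; hex_pairs[i] != '\0'; i += 2) {
        int hi = hex_digit(hex_pairs[i]);
        int lo = (hi < 0) ? -1 : hex_digit(hex_pairs[i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        drop[(hi << 4) | lo] = 1;
    }
    for (read = 0; read < *size; read++) {
        if (!drop[buf[read]])
            buf[write++] = buf[read];   /* keep bytes that are not listed */
    }
    *size = write;
    return 0;
}
```

A lookup table keeps the pass over the buffer linear no matter how many byte values are listed.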
-
EliminateContent
-
Delete data that starts with from and ends with to, where to is retained if 0 is passed.
-
Syntax: EliminateContent|from|to|0|
-
EliminateContentAll
-
Delete all data that starts with from and ends with to, where to is retained if 0 is passed.
-
Syntax: EliminateContentAll|from|to|0
-
EliminateField
-
Delete FieldNumber (range 0-many) after Begin and before End by counting occurrences of FieldDelimiter.
-
Syntax: EliminateField|Begin|End|FieldNumber|FieldDelimiter|
-
EliminateFirstLine
-
Eliminate the first line if it matches ToFind.
-
Syntax: EliminateFirstLine|ToFind|
-
EliminateFirstToEnd
-
Locate Open, scan forward to First, eliminate until the first occurrence of End.
-
Syntax: EliminateFirstToEnd|Open|First|End|
-
EliminateForward
-
Eliminate all data that starts with Begin, is followed by Next, and ends with a line feed (0x0A).
-
Syntax: EliminateForward|Begin|Next|
-
EliminateFromTo
-
Eliminate all occurrences of Begin to End where End must occur after Begin.
-
Syntax: EliminateFromTo|Begin|End|
-
EliminateLFs
-
Removes line terminators.
-
Syntax: EliminateLFs
-
EliminateLines
-
Eliminate every Line-Feed (0x0A) terminated line that contains Begin.
-
Syntax: EliminateLines|Begin|
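The line-based deletion described for EliminateLines could be sketched as follows. `eliminate_lines` and `contains` are hypothetical names. One assumption goes beyond the description: trailing text with no final line feed is also treated as a line.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if needle (nlen bytes) occurs inside p[0..len), else 0. */
static int contains(const uint8_t *p, size_t len, const char *needle, size_t nlen)
{
    size_t i;

    if (nlen == 0 || nlen > len)
        return 0;
    for (i = 0; i + nlen <= len; i++) {
        if (memcmp(p + i, needle, nlen) == 0)
            return 1;
    }
    return 0;
}

/* Sketch only: remove every line of buf[0..*size) that contains `begin`,
 * including its terminating line feed (0x0A), and update *size.
 * Returns the number of lines removed. */
uint32_t eliminate_lines(uint8_t *buf, uint32_t *size, const char *begin)
{
    size_t blen = strlen(begin);
    uint32_t read = 0, write = 0, removed = 0;

    while (read < *size) {
        /* Find the end of the current line: one past the LF, or end of buffer. */
        uint32_t end = read;
        while (end < *size && buf[end] != 0x0A)
            end++;
        if (end < *size)
            end++;

        if (contains(buf + read, end - read, begin, blen)) {
            removed++;                                  /* drop the whole line */
        } else {
            memmove(buf + write, buf + read, end - read); /* keep the line */
            write += end - read;
        }
        read = end;
    }
    *size = write;
    return removed;
}
```

Because kept lines are only ever moved toward the start of the buffer, the edit can be done in place without a second buffer.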
-
EliminateOnLine
-
Eliminate all data on the same line that starts with Begin, ends with End.
-
Syntax: EliminateOnLine|Begin|End|
-
EliminatePattern
-
Eliminate all occurrences of Pattern with a trailing Signature.
-
Syntax: EliminatePattern|Pattern|Signature|
-
EliminateSpan
-
Eliminate all data that starts with Begin and ends with End.
-
Syntax: EliminateSpan|Begin|End|
-
EliminateString
-
Delete all occurrences of passed string from buffer.
-
Syntax: EliminateString|string_to_delete|
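A minimal sketch of the deletion described for EliminateString, assuming a single left-to-right pass over the buffer. `eliminate_string` is a hypothetical name, and the buffer and size arguments mirror the pSourceBuffer / pSizeSourceBuffer convention of the API section.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch only: remove every occurrence of the NUL-terminated string
 * `needle` from buf[0..*size) in place and update *size.
 * Returns the number of occurrences removed. */
uint32_t eliminate_string(uint8_t *buf, uint32_t *size, const char *needle)
{
    size_t nlen = strlen(needle);
    uint32_t read = 0, write = 0, removed = 0;

    if (nlen == 0)
        return 0;                       /* nothing sensible to delete */

    while (read < *size) {
        if (*size - read >= nlen && memcmp(buf + read, needle, nlen) == 0) {
            read += (uint32_t)nlen;     /* skip over the match */
            removed++;
        } else {
            buf[write++] = buf[read++]; /* keep this byte */
        }
    }
    *size = write;
    return removed;
}
```

The single pass does not rescan text that only becomes a match after a deletion: removing `ab` from `aabb` leaves `ab`. Whether the real command rescans is not stated in the description.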
-
EliminateTag
-
Delete data that starts with tag_open and, if not passed, ends with '>'.
-
Syntax: EliminateTag|tag_open|
-
EliminateTag2
-
Locate exact match to tag_open, scan for exact match to tag_next_to_delete, and then delete tag_next_to_delete. Note that this is potentially dangerous in that tag_open and tag_next_to_delete could be separated in context and result in invalid data deletion. Its usefulness is also limited because it can leave leftover closing tags behind after deleting opening tags.
-
Syntax: EliminateTag2|tag_open|tag_next_to_delete|
-
Extract
-
ExtractText
-
Determine if source file is a supported document type.
-
Currently supported: Microsoft Office Word (.docx), Excel (.xlsx), and PowerPoint (.pptx).
-
If it is a supported document, the text is extracted and made available to subsequent commands in the processing script.
-
Syntax: ExtractText
-
Preserve
-
PreserveMemory
-
Preserve file memory buffer to a file. This command is useful when creating a new processing script because it can write the file buffer at any stage of processing.
-
Syntax: PreserveMemory
-
Provisional
-
ProvisionalUpdate
-
If tag_find not found then insert it before tag_after.
-
Syntax: ProvisionalUpdate|tag_find|tag_after|
-
Put
-
PutBetweenTags
-
Locate exact match to tag_open, scan for exact match to tag_next, and then insert tag_to_insert_between_open_and_next between tag_open and tag_next.
-
Syntax: PutBetweenTags|tag_open|tag_next|tag_to_insert_between_open_and_next|
-
PutBinaryPostfix
-
Append passed string with Adobe InDesign-specific binary line feed data. See the notes in PutBinaryPrefix.
-
Syntax: PutBinaryPostfix|tag|
-
PutBinaryPrefix
-
Prepend passed string with Adobe InDesign-specific binary line feed data. Note that the binary data is embedded to force InDesign to drop line feeds after tag closure. This is only needed when the InDesign tag formatting does not specifically call for a line feed to be dropped after a tag closes. It is best to avoid PutBinaryPrefix and PutBinaryPostfix by handling all line feeds through tag formatting within InDesign.
-
Syntax: PutBinaryPrefix|tag|
-
PutField
-
Put Insert at FieldNumber (range 0-n) after Begin and before End counting occurrences of FieldDelimiter.
-
Syntax: PutField|Begin|End|Insert|FieldNumber|FieldDelimiter|
-
PutPostfix
-
Insert a string at the end of the file memory buffer.
-
Syntax: PutPostfix|String|
-
PutPostfixLine
-
Put Add before each Line Feed (0x0A).
-
Syntax: PutPostfixLine|Add|
-
PutPrefix
-
Insert a string at the start of the file memory buffer.
-
Syntax: PutPrefix|String|
-
PutPrefixField
-
Prefix FieldNumber (range 0-n) with Prefix that starts with DelimiterBegin and ends with DelimiterEnd, replacing delimiters with ReplaceBegin and ReplaceEnd on each line.
-
Syntax: PutPrefixField|Prefix|DelimiterBegin|DelimiterEnd|ReplaceBegin|ReplaceEnd|LineMax|
-
PutPrefixLine
-
Prefix each line in the buffer with a concatenation of Prefix + Delimiter.
-
Syntax: PutPrefixLine|Prefix|Delimiter|
-
PutString
-
Put a concatenation of Field + Delimiter + Filename + Delimiter before each instance of Tag.
-
Syntax: PutString|Tag|Field|Delimiter|Filename|
-
Reduce
-
ReduceLineTerminators
-
Reduce all extraneous line feeds.
-
Syntax: ReduceLineTerminators
-
ReduceSpaces
-
Reduce all extraneous spaces.
-
Syntax: ReduceSpaces
-
Remove
-
RemoveBetween
-
Remove data between Start and End.
-
Syntax: RemoveBetween|Start|End|
-
RemoveWithout
-
If Find is not found in the memory buffer then replace all memory buffer content with Replace.
-
Syntax: RemoveWithout|Find|Replace|
-
RemoveWrapper
-
Purge <tag_open> and </tag_open> if located on tag_level and followed by tag_after at tag_level+1.
-
Syntax: RemoveWrapper|tag_level|tag_open|tag_after|
-
Set
-
SetClosingTag
-
Locates tag_open and tag_close when they are both positioned at the same tag level, and then replaces tag_close with new_tag_close.
-
Syntax: SetClosingTag|tag_open|tag_close|new_tag_close|
-
SetFieldDelimiter
-
Set the field delimiter, passed within curly brackets.
-
Syntax: SetFieldDelimiter{delchar}
-
Swap
-
SwapAtNestedLevel
-
Substitute tag_open located at tag_level with new_tag_open and then replace the matching closing tag with new_tag_close. Note that if before is passed, it must exist before tag_open for the changes to be made.
-
Syntax: SwapAtNestedLevel|tag_level|before|tag_open|new_tag_open|new_tag_close|
-
SwapNested
-
Complex search and substitution for data nested from two to three tag levels.
-
Syntax: SwapNested|sig_tag_root|sig_nested_1_tag|sig_nested_2_tag|sig_tag_close|replace_open|replace_close|
-
SwapNext
-
Change the first occurrence of from to to, searching from the start of the file buffer.
-
Syntax: SwapNext|from|to|
-
SwapOutward
-
Search for primary opening and closing tags. If found, search backward and forward for secondary tags. If found, perform substitution. Why? Because some tags are so generic that SwapNested fails.
-
Syntax: SwapOutward|tag_open|tag_close|previous_tag_open|previous_tag_close|replace_open|replace_close|0=Do not extract text, 1=Extract text|
-
SwapStrings
-
Change all occurrences of from to to.
-
Syntax: SwapStrings|from|to|
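Because a replacement can be longer than the text it replaces, a global substitution like SwapStrings needs room to grow. The sketch below writes into a separate output buffer with an explicit capacity. `swap_strings` is a hypothetical name, and the sketch works on NUL-terminated text rather than the raw byte buffers of the API section.

```c
#include <stddef.h>
#include <string.h>

/* Sketch only: copy src to dst, replacing every occurrence of `from`
 * with `to`. dst has room for dst_max bytes including the terminating NUL.
 * Returns 0 on success, or -1 if `from` is empty or dst is too small. */
int swap_strings(const char *src, const char *from, const char *to,
                 char *dst, size_t dst_max)
{
    size_t flen = strlen(from), tlen = strlen(to), out = 0;

    if (flen == 0 || dst_max == 0)
        return -1;

    while (*src != '\0') {
        if (strncmp(src, from, flen) == 0) {
            if (dst_max - out <= tlen)
                return -1;              /* no room for `to` plus the NUL */
            memcpy(dst + out, to, tlen);
            out += tlen;
            src += flen;                /* resume after the replaced text */
        } else {
            if (dst_max - out <= 1)
                return -1;              /* no room for one byte plus the NUL */
            dst[out++] = *src++;
        }
    }
    dst[out] = '\0';
    return 0;
}
```

Scanning resumes after each replacement, so text introduced by `to` is never matched again. This avoids an endless loop when `to` contains `from`.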
-
SwapTags
-
Swap sig_tag_open and sig_tag_close with replace_open and replace_close, keeping the data between.
-
Syntax: SwapTags|sig_tag_open|sig_tag_close|replace_open|replace_close|
-
Transfer
-
TransferBlock
-
Locate <tag_open> and </tag_open> located at tag_level_from.
-
If before is populated then determine if it precedes <tag_open> one level before.
-
Do not make any changes if it does not.
-
Extract the data between <tag_open> and </tag_open>, hide <tag_open>data</tag_open>, move down until the tag level equals tag_level_to, then start a new block using the passed parameters, making sure to include the extracted data.
-
Note: tag_open does not have to have a leading '<'.
-
Syntax: TransferBlock|tag_level_from|tag_level_to|before|tag_open|replace_open|replace_close|
-
Transform
-
TransformLFs
-
Change Line Feed (0x0A) and Carriage Return (0x0D) ASCII values to "^LF^" and "^CR^" respectively.
-
Syntax: TransformLFs
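Each line terminator byte becomes a four-byte marker, so the output of TransformLFs can be up to four times the size of the input. The sketch below writes to a separate buffer and reports when it would overflow. `transform_lfs` and its parameters are hypothetical; the capacity argument plays the role that pMaxSourceBuffer plays in Delta.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch only: copy src[0..src_len) to dst, writing "^LF^" for each 0x0A
 * and "^CR^" for each 0x0D. dst has capacity dst_max bytes. On success,
 * stores the number of bytes written in *dst_len and returns 0.
 * Returns -1 if dst is too small. */
int transform_lfs(const uint8_t *src, size_t src_len,
                  uint8_t *dst, size_t dst_max, size_t *dst_len)
{
    size_t out = 0, i;

    for (i = 0; i < src_len; i++) {
        const char *token = NULL;

        if (src[i] == 0x0A)
            token = "^LF^";
        else if (src[i] == 0x0D)
            token = "^CR^";

        if (token != NULL) {
            if (dst_max - out < 4)
                return -1;              /* marker would not fit */
            memcpy(dst + out, token, 4);
            out += 4;
        } else {
            if (dst_max - out < 1)
                return -1;              /* byte would not fit */
            dst[out++] = src[i];
        }
    }
    *dst_len = out;
    return 0;
}
```

Sizing dst at four times src_len is always sufficient for this sketch.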
|
|
Alteryx Designer
|
Complex workflows, analytics
|
$5,195/user/year start
Hidden Fees: Most users pay:
|
2-4 weeks
|
Minimal
|
General
In Depth
|
Alteryx Designer Cloud (formerly Trifacta)
|
Visual data prep, AI suggestions
|
$10,000+/year
|
2-4 weeks
|
No
|
General
|
|
Apache Spark
|
Big data processing
|
Free (open source)
|
3-6 weeks
|
Yes
|
General
|
|
Dataiku
|
Enterprise ML workflows
|
$50,000+/year
Starts ~$48,000/year
- Enterprise plans well into six figures
|
4-8 weeks
|
Minimal
|
General
In Depth
|
|
Datameer
|
Cloud data platforms
|
$25,000+/year
|
2-4 weeks
|
No
|
General
|
|
Informatica Data Quality
|
Enterprise, compliance
|
$200,000+/year
Small implementations (2-5 users):
Mid-size deployments (10-20 users):
Enterprise licenses (50+ users):
- $750,000-$2,000,000+/year
|
3-6 months
|
Minimal
|
General
In Depth
|
|
KNIME
|
Data science workflows
|
KNIME Analytics Platform:
KNIME Business Hub
KNIME Server
|
2-3 weeks
|
Minimal
|
General
In Depth
|
|
Mammoth Analytics
|
Business analysts, no-code wrangling
|
$16/month
|
1-3 days
|
No
|
General
|
|
Microsoft Power Query
|
Excel/Power BI users
|
Included with Office
|
1 week
|
Minimal
|
General
|
|
OpenRefine
|
Small datasets, budget-conscious
|
Free (open source)
|
Same day
|
No
|
General
|
|
Python pandas
|
Data scientists, programmers
|
Free (open source)
The API is complex and highly granular: Python code must be written against several subpackages.
|
1-2 weeks
|
Yes
|
General
API
|
|
R tidyverse
|
Statisticians, researchers
|
Free (open source)
|
1-2 weeks
|
Yes
|
General
Packages
|
|
SQL (various platforms)
|
Database-heavy workflows
|
Varies
|
Varies
|
Yes
|
General
|
|
Tableau Prep
|
Tableau users, visual flows
|
$900/user/year
Hidden Fees: Real Cost:
5-user team pays $54,000/yr
- $504/user/year Tableau Explorer
- $180/user/year Tableau Viewer
A mid-sized analytics team pays:
- 5 Creators: $54,000/year
- 10 Explorers: $5,040/year
- 25 Viewers: $4,500/year
Total: $63,540/year
|
1-2 weeks
|
No
|
In Depth
|
|
Talend Data Fabric
|
Mid-market, integration needs
|
$50,000-200,000+/year
Open Studio:
Cloud Starter:
Cloud Premium:
Data Fabric Enterprise:
Professional services:
- $50,000-200,000 for complex implementations
Training and certification:
- $5,000-15,000 per developer
Infrastructure costs:
- Cloud computing resources for data processing
Maintenance overhead:
- Dedicated ETL developers and administrators
Integration complexity:
- Custom connector development for unique systems
|
4-6 weeks
|
Minimal
|
General
In Depth
|