Dataforge is a Doh.js module providing a powerful system for defining and executing data manipulation pipelines using chained commands. It features synchronous (Dataforge) and asynchronous (AsyncDataforge) execution contexts, branching, data conversion, and various transformation utilities.
This module bundles several sub-modules:
dataforge_core: The essential commands and patterns.
YAML_dataforge: Adds YAML conversion commands.
nodejs_fs_dataforge: Adds file system commands (Node.js only).
db_dataforge: Adds SQLite support through better-sqlite3 for Node, bun:sqlite for Bun, and Alasql for the browser.
(AsyncDataforge only)
Install the module using the Doh.js CLI:
doh install dataforge
This installs the dataforge module and its core dependencies. Specific features might require additional Node.js dependencies (like js-yaml or database drivers), which are typically handled by Doh.js installation processes for those sub-modules.
Dataforge vs AsyncDataforge Patterns
Dataforge operations are executed using instances of either the Dataforge or AsyncDataforge patterns.
Dataforge (Synchronous):
Overlapping forge() calls on the same instance are not supported (will error).
// Requires 'dataforge' module
Doh.Module('my_module', ['dataforge'], function() {
  let df = New('Dataforge');
  let result = df.forge('initial data', [
    { Append: ' - transformed' }
  ]);
  console.log(result); // Output: 'initial data - transformed'
});
AsyncDataforge (Asynchronous):
Calls to forge() are awaited.
Needed for commands that perform I/O (e.g., Fetch, Post).
Concurrent forge() calls on the same instance are queued and executed sequentially, preventing race conditions for operations relying on shared state within that instance (like local branches or sequential processing).
Use separate AsyncDataforge instances for each parallel task.
// Requires 'dataforge' module
Doh.Module('my_module', ['dataforge'], async function() {
  let adf = New('AsyncDataforge');
  let result = await adf.forge('initial data', [
    { Append: ' - asynchronously transformed' }
  ]);
  console.log(result); // Output: 'initial data - asynchronously transformed'
});
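One common way to get queued, sequential execution of concurrent calls is to chain each call onto the previous call's promise. The sketch below only illustrates that idea and is not AsyncDataforge's actual implementation; SerialQueue, enqueue, and sleep are hypothetical names.

```javascript
// Hypothetical helper: runs async tasks one at a time, in call order.
class SerialQueue {
  constructor() {
    this.tail = Promise.resolve(); // settles when the last queued task is done
  }

  enqueue(task) {
    // Start this task only after everything queued before it has settled.
    const run = this.tail.then(() => task());
    // Keep the chain usable even if this task rejects.
    this.tail = run.catch(() => {});
    return run;
  }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function demo() {
  const queue = new SerialQueue();
  const log = [];
  // The first task is slower, yet the second does not start until it finishes.
  const first = queue.enqueue(async () => { await sleep(20); log.push('first'); return 1; });
  const second = queue.enqueue(async () => { log.push('second'); return 2; });
  const results = await Promise.all([first, second]);
  return { log, results };
}
```

Because each task waits for the one before it, state owned by the queue's owner is never touched by two tasks at once. Work that should truly run in parallel needs separate queues, which mirrors the advice to use separate instances.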
Dataforge uses branches to manage data contexts. Each branch holds its own data register and mode settings.
main branch: The default starting and returning branch.
Local branches: Created with the Branch command. They are scoped to a single forge() execution and are cleared when the forge() call completes.
Global branches: Created with the Global command. They persist across forge() calls on all Dataforge instances (synchronous and asynchronous), including separate AsyncDataforge instances.
Anonymous branches: If no name is given to Branch or Global, an anonymous branch (e.g., anon0, anon1) is created.
Modes (Replace, Append, Prepend)
The mode determines how the result of a command updates the data in the current branch.
Replace (Default): The command's result overwrites the branch's current data.
Append: The command's result is appended to the branch's data (string concatenation, array concat, object merge).
Prepend: The command's result is prepended to the branch's data.
Modes can be set persistently using Mode or temporarily for the next command using Replace, Append, or Prepend.
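The three modes can be pictured as a small function that combines a branch's current data with a command's result. This is a sketch under assumptions, not Dataforge's real code: applyMode and isPlainObject are hypothetical names, and both the key-collision rule for objects and the handling of mixed types are guesses.

```javascript
// Hypothetical helpers; not Dataforge's real implementation.
const isPlainObject = (v) => v !== null && typeof v === 'object' && !Array.isArray(v);

function applyMode(mode, current, result) {
  if (mode === 'Replace') return result;
  if (mode !== 'Append' && mode !== 'Prepend') {
    throw new RangeError(`Unknown mode: ${mode}`);
  }
  // Append puts the result after the current data; Prepend puts it before.
  const [first, second] = mode === 'Append' ? [current, result] : [result, current];
  if (typeof first === 'string' && typeof second === 'string') return first + second;
  if (Array.isArray(first) && Array.isArray(second)) return first.concat(second);
  // Assumption: on key collisions the later operand wins.
  if (isPlainObject(first) && isPlainObject(second)) return { ...first, ...second };
  // Assumption: this sketch rejects mixed types instead of coercing them.
  throw new TypeError('Cannot combine values of different types');
}

console.log(applyMode('Replace', 'old data', 'new data')); // new data
console.log(applyMode('Append', 'data', ' next'));         // data next
console.log(applyMode('Prepend', [3], [1, 2]));            // [ 1, 2, 3 ]
console.log(applyMode('Append', { a: 1 }, { b: 2 }));      // { a: 1, b: 2 }
```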
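Commands are passed as an array to the forge() method. Each entry takes one of two forms:
"CommandName": a bare string, for a command called without arguments.
{ "CommandName": argOrArgs }: an object with a single key, where argOrArgs can be a single value or an array of values.
Import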
Imports the provided argument into the current branch. In the default Replace mode this overwrites the current data; in Append or Prepend mode the argument is combined with it instead.
// Default mode (Replace)
df.forge('old data', [{Import: "new data"}]); // Result: "new data"
// Append mode
df.forge('old data', ["Append", {Import: " new data"}]); // Result: "old data new data"
Alias: ImportFromValue
ConsoleLog
Logs the current branch data (if no arguments) or the provided arguments to the console.
df.forge('data to log', [
  "ConsoleLog", // Logs: 'data to log'
  { ConsoleLog: "Custom message" } // Logs: "Custom message"
]);
Empty
Sets the current branch data to an empty string ''.
df.forge('some data', ["Empty"]); // Result: ''
Debugger
Triggers the browser/Node.js debugger statement, pausing execution if developer tools are open.
df.forge('data', ["Debugger"]); // Pauses here
Mode
Sets the persistent mode (Replace, Append, Prepend) for the current branch.
df.forge('data', [
  { Mode: "Append" },
  { Import: " more" } // Appends
  // Mode remains Append for subsequent commands in this branch
]);
Replace, Append, Prepend (Temporary Mode)
Sets the mode for the next command only. Can optionally take data as an argument to perform an immediate Import with that temporary mode.
// Set temp mode for next command
df.forge('data', ["Append", { Import: " next" }]); // Result: "data next"
// Import immediately with temp mode
df.forge('data', [{ Append: " immediate" }]); // Result: "data immediate"
Branch
Executes a sub-pipeline (array of commands) in a specified or anonymous local branch. The current branch data is implicitly copied to the new branch on initialization.
df.forge('main data', [
  { Branch: ['subBranch', [
    // Now in 'subBranch', data is 'main data'
    { Append: ' - modified' },
    "Return" // Returns 'main data - modified' to main branch
  ]]}
  // Back in 'main' branch
]); // Result: "main data - modified"
Global
Like Branch, but operates on or creates a global branch that persists across forge() calls and instances.
// First call
df.forge('initial', [{ Global: ['sharedState', [{ Append: ' first' }]]}]);
// Second call (same or different instance)
df.forge('something else', [
  { From: 'sharedState' } // Retrieves 'initial first'
]);
Return
Exits the current branch and returns control to the outer (calling) branch. If called from the main branch, it exits the entire forge() execution. As the example shows, an argument passed to Return is returned in place of the branch's current data.
df.forge('outer', [
  { Branch: ['inner', [
    { Import: 'inner value' },
    { Return: 'explicit return' } // Returns 'explicit return'
  ]]}
]); // Result: 'explicit return'
From
Imports data from the specified branch into the current branch (respecting current mode). Does nothing if the source branch doesn't exist or is empty.
df.forge('main', [
  { Branch: ['source', [{Import: 'source data'}]] },
  { Append: ' - ' },
  { From: 'source' }
]); // Result: "main - source data"
Aliases: ImportFrom, ImportFromBranch
To
Exports the current branch's data to the specified local branch (respecting the current branch's mode). Initializes the target branch if it doesn't exist. Does not switch the current branch.
df.forge('export this', [
  { To: 'targetBranch' },
  // 'targetBranch' now contains 'export this'
  // Current branch remains 'main'
  { From: 'targetBranch' }
]); // Result: "export this"
Aliases: ExportTo, ExportToBranch
ToGlobal
Like To, but exports to a specified global branch.
df.forge('export global', [{ ToGlobal: 'sharedTarget' }]);
Alias: ExportToGlobal
CloneTo
Performs a deep clone of the current branch's data and replaces the data in the target branch with the clone. Initializes the target branch if needed.
df.forge({ a: 1 }, [
  { CloneTo: 'cloneTarget' },
  // 'cloneTarget' now has a distinct copy of { a: 1 }
]);
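The difference between a deep clone and a shared reference is easy to see in plain JavaScript. The snippet below uses the built-in structuredClone (available in Node.js 17+ and current browsers) purely as an illustration; whether CloneTo uses it internally is not stated here and should be treated as an assumption.

```javascript
const original = { a: 1, nested: { b: 2 } };

const alias = original;                  // same object, no copy is made
const clone = structuredClone(original); // independent deep copy

clone.nested.b = 99; // does not touch original
alias.a = 42;        // changes original, because alias is the same object

console.log(original); // { a: 42, nested: { b: 2 } }
console.log(clone);    // { a: 1, nested: { b: 99 } }
```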
Delete
Deletes the specified local or global branch. Cannot delete reserved branches (like main). Switches to main if the current branch is deleted.
df.forge(null, [
  { Branch: ['temp', [{ Import: 'data' }]] },
  { Delete: 'temp' }
  // 'temp' branch no longer exists
]);
Exit
Immediately stops the entire forge() execution, returning the current data from the branch where Exit was called.
df.forge('start', [
  { Append: ' step1' },
  "Exit",
  { Append: ' step2' } // This is skipped
]); // Result: "start step1"
ExitIfEmpty
Calls Exit if the current branch data is considered empty/lacks value (LacksValue(data)).
df.forge('', ["ExitIfEmpty", { Append: ' not empty' }]); // Result: ''
df.forge('data', ["ExitIfEmpty", { Append: ' not empty' }]); // Result: 'data not empty'
If
Conditionally executes a command block based on evaluating SeeIf conditions against the current data. Conditions can be nested. The arguments are a conditions array, an optional branch name, and the commands to run when the conditions hold.
df.forge("test value", [
  { If: [
    // Conditions array
    ['IsString', 'And', 'HasValue', 'And', { LengthIsGreaterThan: 5 }],
    // Optional branch name for commands
    'conditionalBranch',
    // Commands to run if true
    [
      { Append: " - condition was true!" }
    ]
  ]}
]); // Result: "test value - condition was true!"
ConvertToArray
Wraps the current data value in an array, optionally placing it at a specific index.
df.forge("value", [{ ConvertToArray: 0 }]); // Result: ["value"]
df.forge("value", [{ ConvertToArray: 2 }]); // Result: [undefined, undefined, "value"]
ConvertFromArray
Extracts an element from the current array data at the specified index (default 0).
df.forge(["a", "b"], [{ ConvertFromArray: 1 }]); // Result: "b"
ConvertToObject
Creates an object with a single specified key, using the current data as the value.
df.forge("value", [{ ConvertToObject: "newKey" }]); // Result: { newKey: "value" }
Aliases: ToKey, ConvertToObjectWithKey
ConvertFromObject
Extracts the value associated with the specified key from the current object data.
df.forge({ id: 123, name: "Test" }, [{ ConvertFromObject: "name" }]); // Result: "Test"
Alias: FromKey
ConvertToJSON
Serializes the current data (object/array) into a JSON string.
df.forge({ a: 1 }, ["ConvertToJSON"]); // Result: '{"a":1}'
ConvertFromJSON
Parses the current data (JSON string) into a JavaScript object/array.
df.forge('{"a":1}', ["ConvertFromJSON"]); // Result: { a: 1 }
ConvertToString
Converts the current data to its string representation.
df.forge(123, ["ConvertToString"]); // Result: "123"
ConvertToNumber
Converts the current data (string) to a number.
df.forge("42.5", [{"ConvertToNumber": null}]); // Result: 42.5
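For orientation, the conversion commands above have close equivalents in plain JavaScript. The snippet is a sketch, not the module's source: toArrayAt is a hypothetical helper, and the assumption that the commands map onto JSON.stringify, JSON.parse, String, and Number is ours.

```javascript
// Hypothetical helper mirroring ConvertToArray with an index argument.
function toArrayAt(value, index = 0) {
  const result = new Array(index).fill(undefined); // pad the leading slots
  result[index] = value;
  return result;
}

const json = JSON.stringify({ a: 1 }); // serialize, like ConvertToJSON
const parsed = JSON.parse(json);       // parse, like ConvertFromJSON

console.log(toArrayAt('value', 2)); // [ undefined, undefined, 'value' ]
console.log(json);                  // {"a":1}
console.log(parsed.a);              // 1
console.log(String(123));           // 123
console.log(Number('42.5'));        // 42.5
console.log(Number('abc'));         // NaN
```

The last line is worth noting: in plain JavaScript a non-numeric string converts to NaN instead of throwing, so pipelines that convert untrusted strings may want to validate the result.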
Dataforge provides a rich Handlebars templating engine, including dynamic protocol handlers for advanced content generation (e.g., {{file:path/to/file.ext}}).
For detailed documentation on all Handlebars features, including template commands (ApplyHandlebars, ToHandlebar, etc.) and the Handlebar Protocol System, please see the dedicated Handlebars Sub-Module README.
FromRef
Uses Doh.parse_reference to extract a value from the current data using a dot-notation path.
df.forge({ a: { b: { c: 123 }}}, [{ FromRef: "a.b.c" }]); // Result: 123
Alias: ImportFromRef
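Dot-notation lookup of this kind can be sketched in a few lines. getByPath below is a hypothetical stand-in for illustration only; it is not Doh.parse_reference, whose exact behavior (for example with missing keys or array indices) may differ.

```javascript
// Hypothetical helper: walk an object one key at a time along a dotted path.
function getByPath(data, path) {
  return path.split('.').reduce(
    // Stop descending once a step is null/undefined instead of throwing.
    (value, key) => (value == null ? undefined : value[key]),
    data
  );
}

console.log(getByPath({ a: { b: { c: 123 } } }, 'a.b.c')); // 123
console.log(getByPath({ a: { b: { c: 123 } } }, 'a.x.c')); // undefined
```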
MeldDeep
Performs a deep merge (Doh.meld_deep) of the argument object into the current branch's data object.
df.forge({ a: { b: 1 }}, [{ MeldDeep: { a: { c: 2 }}}]);
// Result: { a: { b: 1, c: 2 }}
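A deep merge differs from a shallow one in that nested objects are combined key by key instead of being replaced wholesale. The following is a minimal sketch of that idea, not Doh.meld_deep itself: deepMerge and isPlainObject are hypothetical names, and the choices to return a new object and to overwrite arrays are assumptions.

```javascript
const isPlainObject = (v) => v !== null && typeof v === 'object' && !Array.isArray(v);

// Hypothetical helper: returns a new object; the inputs are left untouched.
function deepMerge(target, source) {
  const out = { ...target };
  for (const [key, value] of Object.entries(source)) {
    out[key] = isPlainObject(value) && isPlainObject(out[key])
      ? deepMerge(out[key], value) // recurse into nested objects
      : value;                     // primitives and arrays overwrite
  }
  return out;
}

console.log(deepMerge({ a: { b: 1 } }, { a: { c: 2 } })); // { a: { b: 1, c: 2 } }
console.log({ ...{ a: { b: 1 } }, ...{ a: { c: 2 } } });  // { a: { c: 2 } } (shallow merge loses b)
```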
MeldDeepFrom
Performs a deep merge of the data from a specified local branch into the current branch's data.
df.forge({ a: 1 }, [
  { Branch: ['source', [{ Import: { b: 2 } }]] },
  { MeldDeepFrom: 'source' }
]); // Result: { a: 1, b: 2 }
MeldDeepFromGlobal
Performs a deep merge of the data from a specified global branch into the current branch's data.
df.forge({ a: 1 }, [
  { Global: ['globalSource', [{ Import: { b: 2 } }]] },
  // In a separate forge call or later in the same one:
  { MeldDeepFromGlobal: 'globalSource' }
]); // Result: { a: 1, b: 2 }
Each
Iterates over the current data (if array or object) and executes a sub-pipeline for each item. The result of the sub-pipeline replaces the original item/value.
// Array
df.forge([1, 2], [{ Each: [[{ IncrementNumber: null }]] }]);
// Result: [2, 3]
// Object
df.forge({ a: 1, b: 2 }, [{ Each: [[{ IncrementNumber: null }]] }]);
// Result: { a: 2, b: 3 }
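The behavior described for Each resembles a map over array items or object values. The sketch below shows that shape in plain JavaScript; each and increment are hypothetical names, and passing non-iterable data through unchanged is an assumption.

```javascript
// Hypothetical helper: apply transform to every array item or object value.
function each(data, transform) {
  if (Array.isArray(data)) return data.map((item) => transform(item));
  if (data !== null && typeof data === 'object') {
    return Object.fromEntries(
      Object.entries(data).map(([key, value]) => [key, transform(value)])
    );
  }
  return data; // assumption: other types pass through unchanged
}

const increment = (n) => n + 1;

console.log(each([1, 2], increment));         // [ 2, 3 ]
console.log(each({ a: 1, b: 2 }, increment)); // { a: 2, b: 3 }
```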
(These commands operate on the current branch data if it's a string)
Trim: Removes leading/trailing whitespace.
LTrim: Removes leading whitespace.
RTrim: Removes trailing whitespace.
ToTitleCase: Converts to Title Case.
ToUpperCase: Converts to UPPERCASE.
ToLowerCase: Converts to lowercase.
ToCamelCase: Converts to camelCase.
ToSnakeCase: Converts to snake_case.
ToKebabCase: Converts to kebab-case.
ToPascalCase: Converts to PascalCase.
df.forge(" some string ", ["Trim", "ToTitleCase"]); // Result: "Some String"
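Case conversions like these usually come down to splitting the input into words and rejoining them with a different separator or capitalization. The helpers below are a hypothetical sketch of that approach; Dataforge's own commands may treat punctuation, digits, or acronyms differently.

```javascript
// Hypothetical helper: split on non-alphanumerics and camelCase boundaries.
function splitWords(text) {
  return text
    .replace(/([a-z0-9])([A-Z])/g, '$1 $2') // "someValue" -> "some Value"
    .split(/[^A-Za-z0-9]+/)
    .filter(Boolean)
    .map((word) => word.toLowerCase());
}

const capitalize = (word) => word.charAt(0).toUpperCase() + word.slice(1);

const toSnakeCase = (text) => splitWords(text).join('_');
const toKebabCase = (text) => splitWords(text).join('-');
const toPascalCase = (text) => splitWords(text).map(capitalize).join('');
const toTitleCase = (text) => splitWords(text).map(capitalize).join(' ');
const toCamelCase = (text) => {
  const pascal = toPascalCase(text);
  return pascal.charAt(0).toLowerCase() + pascal.slice(1);
};

console.log(toTitleCase(' some string '.trim())); // Some String
console.log(toSnakeCase('someStringValue'));      // some_string_value
console.log(toKebabCase('Some String'));          // some-string
console.log(toCamelCase('some string'));          // someString
```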
(These commands operate on the current branch data if it's a number)
RoundNumber: Rounds to nearest integer.
FloorNumber: Rounds down.
CeilNumber: Rounds up.
TruncateNumber: Removes decimal part.
IncrementNumber: Adds 1.
DecrementNumber: Subtracts 1.
df.forge(5.7, ["FloorNumber", "IncrementNumber"]); // Result: 6
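The difference between rounding down and removing the decimal part only shows up for negative numbers. The sketch below maps each command name onto a JavaScript Math function to make that visible; the mapping itself is an assumption, and numberCommands and run are hypothetical names.

```javascript
// Assumed mapping from command names to plain JavaScript operations.
const numberCommands = {
  RoundNumber: Math.round,
  FloorNumber: Math.floor,
  CeilNumber: Math.ceil,
  TruncateNumber: Math.trunc,
  IncrementNumber: (n) => n + 1,
  DecrementNumber: (n) => n - 1,
};

// Hypothetical runner: apply the named commands left to right.
function run(value, names) {
  return names.reduce((acc, name) => numberCommands[name](acc), value);
}

console.log(run(5.7, ['FloorNumber', 'IncrementNumber'])); // 6
console.log(run(-5.7, ['FloorNumber']));                   // -6 (toward negative infinity)
console.log(run(-5.7, ['TruncateNumber']));                // -5 (toward zero)
console.log(run(-5.5, ['RoundNumber']));                   // -5 (Math.round rounds halves up)
```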
These commands clean or modify strings to make them safe for specific contexts.
SanitizeInput: General-purpose sanitization (removes control chars, zero-width chars, etc.).
SanitizeAlphaNumeric: Keeps only letters and numbers.
SanitizeNumber: Keeps only numbers.
SanitizeEmail: Removes characters invalid in email addresses.
SanitizePhone: Removes characters invalid in phone numbers.
SanitizeURL: Removes characters invalid in URLs.
SanitizePath: Removes characters invalid in file paths.
SanitizeFilename: Removes characters invalid in filenames.
SanitizeUsername: Alias for SanitizeAlphaNumeric.
SanitizePassword: Removes characters invalid in typical passwords.
SanitizeToken: Removes characters invalid in typical tokens.
SanitizeCode, SanitizeHTML, SanitizeSQL, SanitizeJSON, SanitizeXML, SanitizeCSS, SanitizeJS, SanitizeMarkdown, SanitizeYAML: Remove characters generally invalid for the respective formats.
EscapeHTML: Escapes <, >, &, ", ' for safe HTML display.
EscapeJSON: Escapes characters problematic within JSON strings.
StripHTML: Removes HTML tags.
RemoveColorCodes: Removes ANSI terminal color codes.
df.forge("<script>alert('bad')</script>", ["StripHTML", "EscapeHTML"]);
// Result: alert('bad') with the tags removed and the single quotes HTML-escaped
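To make the effect of stripping, escaping, and color-code removal concrete, here is a small plain JavaScript sketch. The function names are hypothetical, the choice of &#39; for the single quote is an assumption (other entities are equally valid), and the regex-based tag stripping is for illustration only and should not be relied on as a security boundary.

```javascript
// Hypothetical helper: replace the five HTML-significant characters.
function escapeHTML(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return text.replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Hypothetical helper: drop anything that looks like a tag.
function stripHTML(text) {
  return text.replace(/<[^>]*>/g, '');
}

// Hypothetical helper: remove ANSI SGR (color/style) escape sequences.
function removeColorCodes(text) {
  return text.replace(/\x1b\[[0-9;]*m/g, '');
}

const stripped = stripHTML("<script>alert('bad')</script>");
console.log(stripped);                                // alert('bad')
console.log(escapeHTML(stripped));                    // alert(&#39;bad&#39;)
console.log(escapeHTML('<a href="x">'));              // &lt;a href=&quot;x&quot;&gt;
console.log(removeColorCodes('\x1b[31mred\x1b[0m'));  // red
```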
Network Commands (AsyncDataforge only)
These commands perform I/O and require using an AsyncDataforge instance and await.
Fetch
Fetches data from a URL using axios (Node.js) or Doh.ajaxPromise (Browser). Uses the current data as the URL if no argument is provided.
let result = await adf.forge(null, [{ Fetch: "https://example.com/api/data" }]);
ImportFromURL
Fetches data using Doh.ajaxPromise (browser-focused, forces HTTP).
let result = await adf.forge(null, [{ ImportFromURL: "/api/data.json" }]);
Post
Sends the current branch data as a POST request body to the specified URL.
let postData = { id: 1, value: "test" };
let result = await adf.forge(postData, [{ Post: "/api/submit" }]);
ForgeOnServer
Sends the current data and an array of commands to a server-side endpoint (/dataforge/forge) for remote execution.
let result = await adf.forge('filename.txt', [
  { ForgeOnServer: ["FromFile"] } // Ask server to read file
]);
YAML Commands (YAML_dataforge sub-module)
ConvertToYAML
Serializes the current object/array data to a YAML string.
df.forge({ a: 1, b: [2, 3] }, ["ConvertToYAML"]);
// Result: "a: 1\nb:\n - 2\n - 3\n"
ConvertFromYAML
Parses the current YAML string data into a JavaScript object/array.
df.forge("a: 1\nb: [2, 3]", ["ConvertFromYAML"]);
// Result: { a: 1, b: [2, 3] }
File System Commands (nodejs_fs_dataforge sub-module) [Node.js Only]
These commands interact with the local file system and only work in a Node.js environment. They work with both Dataforge (sync) and AsyncDataforge (async).
ChangeDir: Changes the current working directory for subsequent file operations within the forge instance. Example: [{ ChangeDir: "path/to/dir" }]
FromFile: Reads content from a file path (provided as argument or from current data). Example: [{ FromFile: "file.txt" }]
ToFile: Writes the current data to a file path. Example: [{ ToFile: "output.txt" }]
CopyFile: Copies a file. Example: [{ CopyFile: ["source.txt", "dest.txt"] }]
CopyFolder: Recursively copies a folder. Example: [{ CopyFolder: ["src_dir", "dest_dir"] }]
FromFolder: Reads directory contents (returns array of filenames/subdirs). Example: [{ FromFolder: "./dir" }]
FromFolderToList: Reads contents of multiple files in a folder into an object keyed by filename. Example: [{ FromFolderToList: "./data_files" }]
FromListToFolder: Writes an object (keyed by filename) to multiple files in a directory. Example: [{ FromListToFolder: "./output_dir" }]
FromStringToPathToGlobal: Sanitizes the current data string into a path-safe string and stores it in a global branch. Example: [{ FromStringToPathToGlobal: "globalBranchName" }]
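The "object keyed by filename" shape used by FromFolderToList and FromListToFolder can be illustrated with Node's built-in fs module. This is a sketch of the data shape only, not the sub-module's implementation: folderToList and listToFolder are hypothetical names, and reading every file as UTF-8 while skipping subdirectories is an assumption.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read every regular file in dir into { filename: content }.
function folderToList(dir) {
  const list = {};
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (entry.isFile()) {
      list[entry.name] = fs.readFileSync(path.join(dir, entry.name), 'utf8');
    }
  }
  return list;
}

// Hypothetical helper: write each { filename: content } pair as a file in dir.
function listToFolder(list, dir) {
  fs.mkdirSync(dir, { recursive: true });
  for (const [name, content] of Object.entries(list)) {
    fs.writeFileSync(path.join(dir, name), content, 'utf8');
  }
}

// Round trip through a throwaway temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'dataforge-sketch-'));
listToFolder({ 'a.txt': 'alpha', 'b.txt': 'beta' }, dir);
const roundTrip = folderToList(dir);
fs.rmSync(dir, { recursive: true, force: true });

console.log(Object.keys(roundTrip).sort()); // [ 'a.txt', 'b.txt' ]
```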