Sitecore CDP Batch API Import with PowerShell scripts
To import a batch file into Sitecore CDP using the Batch API, follow these steps:
- Create a batch file in the correct format
  - "Insert" is not supported, only "Upsert"!
  - The file as a whole is not valid JSON! It contains multiple JSON objects, each on its own single line!
- "gzip" the batch file
  - On Windows, you can use 7-Zip
  - Or use the PowerShell script below
- Get the MD5 checksum and the size in bytes of the gzip file
  - Use the PowerShell script below
  - Or use this online tool: https://emn178.github.io/online-tools/md5_checksum.html
- Batch API call: create the batch and get the AWS upload URL
  - Use the PUT method
  - Generate a new UUID to identify your import and use it in the URL of the API call: https://api.boxever.com/v2/batches/[[Your UUID]]
  - Use your API key and secret as the username and password for basic authentication
  - The MD5 checksum value must be lower case!
  - Request details can be found here: https://doc.sitecore.com/cdp/en/developers/sitecore-customer-data-platform--data-model-2-1/importing-a-batch-file-into-sitecore-cdp.html
- Get the base64 string of the MD5 checksum from its hex value
  - Use the PowerShell script below
  - Or use this online tool: https://base64.guru/converter/encode/hex
- AWS API call: upload the gzip file to the AWS upload URL
  - Headers:
    - Content-Md5: use the base64 value from the previous step
    - x-amz-server-side-encryption: AES256
- Batch API call: Check the status of the batch import process
- View the error logs if the import failed
  - Open the link in a browser; it downloads as a "gz" file. Unzip it to check the individual errors
- Common errors
  - {"ref":"1e20d1e1-07f0-4863-92f1-4d2e15c86a59","code":"400","message":"Not enough identifying information"}
    - Enough fields must be provided so that the customer can be identified
  - {"ref":"null","code":"400","message":"Failed to parse import line"}
    - Ensure that each JSON object is flattened onto a single line; do not beautify the JSON!
  - If the batch status says corrupted, ensure that the MD5 checksum is correct and lower case!
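
The "one JSON object per line" rule from the first step can be sketched in PowerShell as follows. This is a minimal illustration, not the official format definition: the `ref` / `schema` / `mode` / `value` envelope and the guest field names are assumptions that should be checked against the Sitecore documentation linked above, and `input.batch` is just an example file name.

```powershell
# Hypothetical sample records; the field names are illustrative only
$guests = @(
    [ordered]@{ email = 'jane@example.com'; firstName = 'Jane' },
    [ordered]@{ email = 'john@example.com'; firstName = 'John' }
)

$lines = foreach ($guest in $guests) {
    # ASSUMPTION: envelope field names; verify them in the Sitecore CDP docs
    [ordered]@{
        ref    = (New-Guid).Guid   # unique reference for this line
        schema = 'guest'
        mode   = 'upsert'          # "insert" is not supported
        value  = $guest
    } | ConvertTo-Json -Compress -Depth 10   # -Compress keeps each object on one line
}

# Write one JSON object per line, as UTF-8 without a byte order mark
$outPath = Join-Path $PWD.Path 'input.batch'
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)
[System.IO.File]::WriteAllLines($outPath, [string[]]$lines, $utf8NoBom)
```

Using `ConvertTo-Json -Compress` avoids the "Failed to parse import line" error that a beautified, multi-line JSON object would cause.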
Below is a PowerShell script that:
- Generates a new UUID
- Gzips the batch file
- Outputs the size of the gzip file in bytes
- Outputs the lower-case MD5 checksum of the gzip file
- Outputs the base64 value computed from the hex value of the MD5 checksum
[CmdletBinding()]
param (
    # Path to the uncompressed batch file
    [Parameter(Mandatory = $true)]
    [string]
    $filePath
)

# Compresses $File to "<full path of $File>.gz", next to the original file
Function Gzip-File([ValidateScript({ Test-Path $_ })][string]$File) {
    $srcFile = Get-Item -Path $File
    $newFileName = "$($srcFile.FullName).gz"
    try {
        $srcFileStream = New-Object System.IO.FileStream($srcFile.FullName, ([IO.FileMode]::Open), ([IO.FileAccess]::Read), ([IO.FileShare]::Read))
        $dstFileStream = New-Object System.IO.FileStream($newFileName, ([IO.FileMode]::Create), ([IO.FileAccess]::Write), ([IO.FileShare]::None))
        $gzip = New-Object System.IO.Compression.GZipStream($dstFileStream, [System.IO.Compression.CompressionMode]::Compress)
        $srcFileStream.CopyTo($gzip)
    }
    catch {
        # $(...) is required to expand a property inside a double-quoted string
        Write-Host "$($_.Exception.Message)" -ForegroundColor Red
    }
    finally {
        # Dispose only what was created; close the gzip stream first so it flushes its data
        if ($gzip) { $gzip.Dispose() }
        if ($srcFileStream) { $srcFileStream.Dispose() }
        if ($dstFileStream) { $dstFileStream.Dispose() }
    }
}

Write-Host "UUID: " -ForegroundColor Green
(New-Guid).Guid

Gzip-File $filePath
$gfilePath = $filePath + ".gz"

Write-Host "File Size: " -ForegroundColor Green
(Get-Item $gfilePath).Length

# The Batch API expects the MD5 checksum in lower case
$md5 = Get-FileHash -Path $gfilePath -Algorithm MD5
$hash = $md5.Hash.ToLower()
Write-Host "MD5 Hash: " -ForegroundColor Green
$hash

# Turn each pair of hex digits into a byte ("ab" -> 0xab), then base64-encode the bytes
$bytes = [byte[]] -split ($hash -replace '..', '0x$& ')
Write-Host "Content-Md5: " -ForegroundColor Green
[System.Convert]::ToBase64String($bytes)
Save the script above as PrepBatch.ps1 and invoke it like this:
.\PrepBatch.ps1 .\input.batch
You can then use the generated values in a tool like Postman to send all the requests.
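
If you prefer to stay in PowerShell instead of Postman, the requests could be assembled roughly as below. This is a sketch under assumptions, not a verified client: the function names are hypothetical, the body field names `checksum` and `size`, the `location.href` response property, and the PUT method for the upload are assumptions to check against the Sitecore request details linked above. The functions only build argument tables for splatting, so nothing is sent until you call `Invoke-RestMethod` or `Invoke-WebRequest` yourself.

```powershell
# Converts a hex MD5 string to the base64 form used in the Content-Md5 header
function ConvertTo-Base64FromHex([string]$Hex) {
    $bytes = [byte[]] -split ($Hex -replace '..', '0x$& ')
    [System.Convert]::ToBase64String($bytes)
}

# Builds the arguments for the "create batch" call (PUT with basic authentication)
function New-CreateBatchRequest([string]$Uuid, [string]$ApiKey, [string]$ApiSecret, [string]$Md5Hex, [long]$Size) {
    $pair  = "${ApiKey}:${ApiSecret}"
    $basic = [System.Convert]::ToBase64String([System.Text.Encoding]::ASCII.GetBytes($pair))
    @{
        Method      = 'Put'
        Uri         = "https://api.boxever.com/v2/batches/$Uuid"
        Headers     = @{ Authorization = "Basic $basic" }
        ContentType = 'application/json'
        # ASSUMPTION: body field names; the checksum must be lower case
        Body        = (@{ checksum = $Md5Hex.ToLower(); size = $Size } | ConvertTo-Json -Compress)
    }
}

# Builds the arguments for uploading the gzip file to the AWS upload URL
function New-UploadRequest([string]$UploadUrl, [string]$GzipPath, [string]$Md5Hex) {
    @{
        Method  = 'Put'   # ASSUMPTION: the upload URL expects PUT
        Uri     = $UploadUrl
        InFile  = $GzipPath
        Headers = @{
            'Content-Md5'                  = (ConvertTo-Base64FromHex $Md5Hex)
            'x-amz-server-side-encryption' = 'AES256'
        }
    }
}

# Example wiring, commented out because it needs real credentials and network access:
# $create = New-CreateBatchRequest -Uuid $uuid -ApiKey $key -ApiSecret $secret -Md5Hex $hash -Size $size
# $batch  = Invoke-RestMethod @create
# $upload = New-UploadRequest -UploadUrl $batch.location.href -GzipPath $gfilePath -Md5Hex $hash
# Invoke-WebRequest @upload
# Invoke-RestMethod -Method Get -Uri $create.Uri -Headers $create.Headers   # check the batch status
```

Keeping the request construction separate from the actual calls makes it easy to inspect the URL, headers, and body before anything is sent, which helps when the batch status comes back as corrupted.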