Nidhi Gupta
3 min read · Dec 24, 2023

Compression and Decompression of files using copy activity in the ADF pipeline

Decompression:

Decompression is the unzipping of files.

Step 1: Create an Azure resource group, an Azure storage account, and an Azure Data Factory instance.

Step 2: In the storage container, create two folders: data and output. The data folder holds the files in compressed (.zip) format.

DATA FOLDER
OUTPUT FOLDER INITIAL
CONTAINER FOLDER

Step 3:

(i) Create a linked service. It acts as the connection string that connects the pipeline to the source and the target.

(ii) Create CSV datasets that point to the source and the target.

SOURCE DATASET

Note: Set the compression type on the source dataset to match the format of the compressed file (for example, ZipDeflate for a .zip file).
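To make the note above concrete, here is a minimal Python sketch that builds a source dataset definition and picks the compression codec from the file extension. It is an illustration only: the function name, the extension mapping, and the dataset, linked service, container, and file names are all hypothetical, and the location type assumes ADLS Gen2. Compare the property names against the JSON that the ADF UI generates for your own dataset before relying on them.

```python
import json
from pathlib import PurePosixPath

# Assumed mapping from file extension to the codec name set on the dataset.
# Extend it to cover the formats you actually use.
CODEC_BY_EXTENSION = {
    ".zip": "ZipDeflate",
    ".gz": "gzip",
    ".bz2": "bzip2",
}


def source_dataset(name, linked_service, container, folder, file_name):
    """Build a CSV dataset definition whose codec matches the compressed file."""
    extension = PurePosixPath(file_name).suffix.lower()
    if extension not in CODEC_BY_EXTENSION:
        raise ValueError(f"no codec known for extension {extension!r}")
    return {
        "name": name,
        "properties": {
            "type": "DelimitedText",
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            "typeProperties": {
                # Assumption: ADLS Gen2 storage; other stores use another location type.
                "location": {
                    "type": "AzureBlobFSLocation",
                    "fileSystem": container,
                    "folderPath": folder,
                    "fileName": file_name,
                },
                "columnDelimiter": ",",
                "firstRowAsHeader": True,
                "compressionCodec": CODEC_BY_EXTENSION[extension],
            },
        },
    }


if __name__ == "__main__":
    # All names below are placeholders for illustration.
    definition = source_dataset("SourceZipCsv", "MyStorageLS", "container", "data", "files.zip")
    print(json.dumps(definition, indent=2))
```

The target dataset for decompression would look the same without the compression setting, so that the sink receives plain .csv files.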

TARGET DATASET

(iii) Add a Copy activity to copy the data from the source to the sink (destination).

AT THE SOURCE COPY ACTIVITY
AT THE SINK COPY ACTIVITY
COPY ACTIVITY
COPIED DATA AT THE SINK FOLDER
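If it helps to see what the decompression run does to the data, the following is a small local analogy in Python using the standard `zipfile` module. It is not how ADF is implemented; it only mirrors the outcome (a .zip in the data folder becomes plain .csv files in the output folder). The sample file names and contents are made up.

```python
import io
import zipfile
from typing import Dict


def make_sample_zip() -> bytes:
    """Create an in-memory .zip with two made-up CSV files (stand-in for the data folder)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("sales.csv", "id,amount\n1,10\n")
        archive.writestr("customers.csv", "id,name\n1,Asha\n")
    return buffer.getvalue()


def unzip_to_dict(zip_bytes: bytes) -> Dict[str, bytes]:
    """Return {file name: decompressed content} for every file in the archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {
            info.filename: archive.read(info)
            for info in archive.infolist()
            if not info.is_dir()
        }


if __name__ == "__main__":
    for name, content in unzip_to_dict(make_sample_zip()).items():
        print(name, len(content), "bytes")
```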

Compression:

Compression is the zipping of files using a compression format such as .zip or .gz (gzip). The copy activity does not list .rar among its supported compression types.

The sample .csv files are in the source folder in ADLS (Azure Data Lake Storage).

SAMPLE FILES IN SOURCE FOLDER
OUTPUT FOLDER INITIAL
SOURCE DATASET
SINK DATASET
AT THE SOURCE COPY ACTIVITY
AT THE SINK COPY ACTIVITY
COPY ACTIVITY
SINGLE ZIP FOR ALL THE FILES
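As a local analogy for this run, the sketch below packs several CSV files into one archive with Python's `zipfile` module. It only illustrates the result (one .zip containing all the source files), not ADF's internals, and the file names are made up.

```python
import io
import zipfile
from typing import Dict, List


def zip_all(files: Dict[str, bytes]) -> bytes:
    """Pack every file into a single in-memory .zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


def archive_members(zip_bytes: bytes) -> List[str]:
    """List the file names stored in the archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.namelist()


if __name__ == "__main__":
    sample = {"file1.csv": b"a,b\n1,2\n", "file2.csv": b"a,b\n3,4\n"}
    print(archive_members(zip_all(sample)))
```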

To create a separate .zip file for each .csv file, follow the steps below:

SOURCE DATASET

PARAMETER
SOURCE DATASET

SINK DATASET

PARAMETER
SINK DATASET

MASTER PIPELINE

GETMETADATA ACTIVITY
FOREACH ACTIVITY
SOURCE AT THE COPY ACTIVITY
SINK AT THE COPY ACTIVITY
SUCCESSFUL PIPELINE EXECUTION
DATA COPIED TO THE OUTPUT FOLDER
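The control flow of the master pipeline can be sketched locally in Python: one function stands in for the Get Metadata activity (listing the files in the source folder), and a loop stands in for the ForEach activity that runs one copy per file with a parameterised sink file name. In ADF itself, the ForEach items would typically be the `childItems` output of the Get Metadata activity, with `@item().name` passed to the dataset parameters. The folder names, file names, and function names below are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import List


def list_child_items(source_dir: Path) -> List[str]:
    """Stand-in for Get Metadata: names of the .csv files in the source folder."""
    return sorted(
        p.name for p in source_dir.iterdir() if p.is_file() and p.suffix == ".csv"
    )


def zip_each_file(source_dir: Path, output_dir: Path) -> List[Path]:
    """Stand-in for ForEach + Copy: write one .zip per source .csv file."""
    output_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for file_name in list_child_items(source_dir):
        # The sink file name is derived from the current item, like a dataset parameter.
        target = output_dir / (Path(file_name).stem + ".zip")
        with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
            archive.write(source_dir / file_name, arcname=file_name)
        created.append(target)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "source"
        source.mkdir()
        (source / "file1.csv").write_text("a,b\n1,2\n")
        (source / "file2.csv").write_text("a,b\n3,4\n")
        for path in zip_each_file(source, Path(tmp) / "output"):
            print(path.name)
```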

Thanks for reading. Do clap if you find it useful 😃

“Keep learning and keep sharing knowledge”

Nidhi Gupta

Azure Data Engineer 👨‍💻. Heading towards expertise in cloud technologies ✌️.