Appendix

Scan Filters

This section provides you with information on scan filters in various data stores, syntax details and examples of their usage. ("N/S" stands for "not supported". Numbers used in the parameters section are provided as examples.)

Files

LocalStorage - Linux

Exclude location by prefix
  • root directory: / (Supported but not recommended)

  • 1 folder:
    /folder1 - Exclude all folders whose path starts with /folder1, such as /folder1, /folder12 and /folder123, and their content
    /folder1/ - Exclude everything whose path starts with /folder1/, such as /folder1/file.txt or /folder1/myfile.txt

  • more folders:
    /folder1/folder2/ - Exclude everything whose path starts with /folder1/folder2/, such as /folder1/folder2/file3.txt or /folder1/folder2/folder3/myfile.txt

  • files:
    folder1/file.txt - Exclude all the files with name file and extension txt and located in a folder with name folder1
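The difference between /folder1 and /folder1/ can be illustrated with a minimal Python sketch. It only models the prefix semantics described above with a plain string comparison; the function name and the sample paths are hypothetical, and this is not the product's implementation.

```python
def excluded_by_prefix(path: str, prefix: str) -> bool:
    """Return True when the path starts with the given prefix filter."""
    return path.startswith(prefix)


# Hypothetical sample paths used only for illustration.
paths = [
    "/folder1/file.txt",
    "/folder12/file.txt",
    "/folder123/a/b.txt",
    "/other/file.txt",
]

# "/folder1" matches every path that merely begins with those characters.
loose = [p for p in paths if excluded_by_prefix(p, "/folder1")]

# "/folder1/" matches only content below the folder named exactly folder1.
strict = [p for p in paths if excluded_by_prefix(p, "/folder1/")]

print(loose)
print(strict)
```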

Exclude location by suffix
  • root directory: N/S

  • 1 folder:
    myfolder - Exclude all the directories with that name and their content. It is equivalent to */myfolder
    myfolder/*.txt - Exclude all the txt files in the folders with name myfolder

  • more folders:
    /folder1/folder2/ - Exclude the folder with name folder2 located in parent folder named folder1

  • files:
    myfolder/*.pdf - Exclude files with extension pdf: myfolder/*.pdf
    myfolder/otherDir/myfile.txt - Exclude a specific file: myfolder/otherDir/myfile.txt
    myfolder/subfolder/*.txt - Exclude all txt files located in that folder and subfolder. Subfolders can be used to improve the filtering.

Exclude locations by expression
  • root directory: / (Supported but not recommended)

  • 1 folder:
    */folder1/* - Exclude all folders named folder1 and their content

  • more folders:
    /folder1/folder - Exclude all folders in folder1 whose name is folder (?), like /folder1/folder (1), /folder1/folder (2) or /folder1/folder (a)

  • files:
    *file.txt - Exclude the elements matching with the expression like /myfolder/file.txt, /myfolder/myfile.txt, /myfolder/otherfolder/file.txt, /myfolder/otherfolder/mysensitivefile.txt
    */myfile.txt - Exclude the elements matching with the expression like /myfolder/myfile.txt or /myfolder/otherfolder/myfile.txt (but NOT /myfolder/file.txt or /myfolder/otherfolder/mysensitivefile.txt)
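The two file expressions above can be checked with Python's standard fnmatch module, whose * also matches across / separators. This is a sketch under the assumption that the scan filter wildcard behaves like shell-style globbing; the helper name is hypothetical.

```python
from fnmatch import fnmatchcase

# Paths taken from the examples in this section.
paths = [
    "/myfolder/file.txt",
    "/myfolder/myfile.txt",
    "/myfolder/otherfolder/myfile.txt",
    "/myfolder/otherfolder/mysensitivefile.txt",
]


def matching(pattern: str) -> list[str]:
    """Return the paths that the glob-style pattern matches (case-sensitive)."""
    return [p for p in paths if fnmatchcase(p, pattern)]


# "*file.txt" matches any path ending in file.txt, whatever precedes it.
broad = matching("*file.txt")

# "*/myfile.txt" requires the file name to be exactly myfile.txt.
narrow = matching("*/myfile.txt")

print(broad)
print(narrow)
```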

Include locations within modification date

parameters:

  • toDate: "2021-05-30"
  • fromDate: "2021-08-01"

Include files whose modification date is between those dates

Include locations modified recently

parameters:

  • days: 5 - Include files modified in the previous 5 days, even if a different filter excludes them

Exclude locations greater than file size

parameters:

  • size: 20 (size in MB) - Exclude files whose size is 20 MB or greater
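The size threshold is inclusive: a file of exactly 20 MB is excluded. A minimal sketch of that comparison follows. The function name is hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption, since this section does not state which definition the product uses.

```python
# Assumption: 1 MB = 1024 * 1024 bytes. The product may use 1,000,000 instead.
BYTES_PER_MB = 1024 * 1024


def excluded_by_size(size_bytes: int, limit_mb: int) -> bool:
    """Return True when the file is at or above the size limit."""
    return size_bytes >= limit_mb * BYTES_PER_MB


exactly_20_mb = 20 * BYTES_PER_MB
print(excluded_by_size(exactly_20_mb, 20))      # at the limit
print(excluded_by_size(exactly_20_mb - 1, 20))  # one byte below the limit
```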

LocalStorage - Windows

Exclude location by prefix
  • root directory: C:\ - Exclude the volume C:\

  • 1 folder:
    *\my_folder - Exclude all the folders with name my_folder and their content

  • more folders:
    *\my_folder\my_subfolder - Exclude all subfolders with name my_subfolder located in a folder with name my_folder and their content

  • files:
    C:\mydir\ - Exclude all files located in C:\mydir\ whose names begin with file, like file1.txt, file2.pdf, etc.

Exclude location by suffix
  • root directory: C:\ (Supported but not recommended)

  • 1 folder:
    my_folder - Exclude all the folders with name my_folder and their content

  • more folders:
    my_folder\my_subfolder - Exclude all the folders with name my_subfolder located in a folder with name my_folder

  • files:
    .txt - Exclude all files with extension txt

Exclude locations by expression
  • root directory: C:\ - Exclude a whole volume C:\*

  • 1 folder:
    *\my_folder - Exclude all folders named my_folder and their content

  • more folders:
    my_folder\my_subfolder - Exclude all the folders with name my_subfolder located in a folder with name my_folder

  • files:
    *.txt - Exclude all files with extension .txt

Include locations within modification date

parameters:

  • toDate: "2021-05-30"
  • fromDate: "2021-08-01"

Include files whose modification date is between those dates

Include locations modified recently

parameters:

  • days: 5 - Include files modified in the previous 5 days, even if a different filter excludes them

Exclude locations greater than file size

parameters:

  • size: 20 (size in MB) - Exclude files whose size is 20 MB or greater

SMB

Exclude location by prefix
  • root directory: N/S

  • 1 folder:
    <sharename>\\sambafolder or *\sambafolder

  • more folders:
    *folder\subfolder

  • files:
    *\sambafolder\file.txt

Exclude location by suffix
  • root directory: N/S

  • 1 folder:
    myfolder - Exclude folder called :
    \\samba
    \\exclude_suffix

  • more folders:
    myfolder\mysubfolder

  • files:
    myfolder\mysubfolder\file.txt

Exclude locations by expression
  • root directory: N/S

  • 1 folder:
    *my?folder - Exclude folder called \samba\my_folder, also works for my-folder, etc

  • more folders:
    *my_folder\my_subfolder

  • files:
    *my_folder\my_subfolder

Include locations within modification date

parameters:

  • toDate: "2021-05-30"
  • fromDate: "2021-08-01"
Include locations modified recently

parameters:

  • days: 5
Exclude locations greater than file size

parameters:

  • size: 20 (size in MB)

NFS

Exclude location by prefix
  • root directory: N/S

  • 1 folder: /mnt/nfs/myfolder or *myfolder - Exclude the folder named myfolder and all the subfolders it contains

    The mount point for the NFS file system in this example is /mnt/nfs

  • more folders: /mnt/nfs/myfolder/mysubfolder or *myfolder/mysubfolder

  • files:
    /myfolder/myfile.txt - Exclude txt file with that name located in that folder

Exclude location by suffix
  • root directory: N/S

  • 1 folder:
    myfolder - Exclude folder named myfolder

  • more folders:
    myfolder/mysubfolder - Exclude folder myfolder/mysubfolder

  • files:
    myfolder/myfile.xlsx - Exclude that file in myfolder with that extension

Exclude locations by expression
  • root directory: *

  • 1 folder:
    *my?folder - Exclude folder called my_folder, my-folder, etc

  • more folders:
    myfolder/mysubfolder - Exclude folder myfolder/mysubfolder

  • files:
    myfolder/myfile.xlsx - Exclude that file in myfolder with that extension
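The ? in *my?folder stands for exactly one character. The sketch below uses Python's fnmatch module to show which names such an expression would cover, assuming the filter follows shell-style wildcard rules; the candidate paths are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical folder paths below the /mnt/nfs mount point.
candidates = [
    "/mnt/nfs/my_folder",
    "/mnt/nfs/my-folder",
    "/mnt/nfs/myfolder",    # no character between "my" and "folder"
    "/mnt/nfs/my__folder",  # two characters between "my" and "folder"
]

# "?" matches exactly one character, "*" matches any run of characters.
matched = [c for c in candidates if fnmatchcase(c, "*my?folder")]

print(matched)
```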

Include locations within modification date

parameters:

  • toDate: "2021-05-30"
  • fromDate: "2021-08-01"
Include locations modified recently

parameters:

  • days: 5
Exclude locations greater than file size

parameters:

  • size: 20 (size in MB)

Hadoop

Exclude location by prefix
  • root directory: N/S

  • 1 folder:
    */my_data - Exclude all the directories with name my_data and their content

  • more folders:
    */my_data/subdir - Exclude all directories with name subdir whose parent directory is named my_data

  • files:
    /my_data/file.ppt - Exclude file with that name and extension located in the directory my_data

Exclude location by suffix
  • root directory: N/S

  • 1 folder:
    mydir - Exclude all directories with name mydir, my_directory, etc, and their content
    mydir/ - Exclude all directories with name mydir and their content

  • more folders:
    mydir/subdir - Exclude all directories with name subdir, subdirectory, etc, and their content which are located in a directory with name mydir
    mydir/subdir/ - Exclude the directory with name subdir, and its content, located in a directory with name mydir

  • files:
    *.pptx - Exclude all files with extension pptx

Exclude locations by expression
  • root directory: N/S

  • 1 folder:
    */mydir - Exclude all directories with name mydir and their content

  • more folders:
    */mydir/subdir - Exclude the directory with name subdir, and its content, located in a directory with name mydir

  • files:
    */mydir/My_file.pptx - Exclude the file with that name and extension located in a directory with name mydir

Include locations within modification date

parameters:

  • toDate: "2021-05-30"
  • fromDate: "2021-08-01"

Include files whose modification date is between those dates

Include locations modified recently

parameters:

  • days: 5 - Include files modified in the previous 5 days, even if a different filter excludes them

Exclude locations greater than file size

parameters:

  • size: 20 (size in MB) - Exclude files whose size is 20 MB or greater

AWS S3

Exclude location by prefix
  • root directory: Bucket (Sample: asrm-ddcqa*)

  • 1 folder: myfolder

  • more folders: myfolder/mysubfolder

  • files: -

Exclude location by suffix
  • root directory: Bucket

  • 1 folder: Folder

  • more folders: -

  • files: -

Exclude locations by expression
  • root directory: Bucket

  • 1 folder: Folder

  • more folders: -

  • files: -

Include locations within modification date

N/S

Include locations modified recently

N/S

Exclude locations greater than file size

N/S

Databases

Oracle

Exclude location by prefix
  • Schema: N/S

  • Table: HR(SERVICE_NAME=XE):1521/my (You have to include the database, SID and port in the format <database>(SERVICE_NAME=<sid>):<port>/<table-prefix>)

  • Column: N/S
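The Oracle table filter string can be assembled from its parts as in the following sketch. The function name is hypothetical; the layout follows the <database>(SERVICE_NAME=<sid>):<port>/<table-prefix> format given above, and the values are the ones from the example.

```python
def oracle_table_filter(database: str, sid: str, port: int, table: str) -> str:
    """Build a filter in the form <database>(SERVICE_NAME=<sid>):<port>/<table>."""
    return f"{database}(SERVICE_NAME={sid}):{port}/{table}"


# Values from the example above: database HR, SID XE, port 1521, prefix "my".
prefix_filter = oracle_table_filter("HR", "XE", 1521, "my")
print(prefix_filter)
```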

Exclude location by suffix
  • Schema: N/S

  • Table: user (Just add the last characters of the name of the table. * is not needed.)

  • Column: N/S

Exclude locations by expression
  • Schema: N/S

  • Table: HR(SERVICE_NAME=XE):1521/sensitive_expression_data (Add the table at the end of the string: <database>(SERVICE_NAME=<sid>):<port>/<my-table>)

  • Column: N/S

Include locations within modification date

N/S

Include locations modified recently

N/S

Exclude locations greater than file size

N/S

IBM DB2

Depending on the level being filtered, the database, port, schema, table and column are required, in the format <database>:<port>/<schema>/<table>/<column-prefix>.

Exclude location by prefix
  • Database: testdb:50000 (Supported but not meaningful to use)

  • Schema: testdb:50000/DB2IN - Exclude schemas that begin with DB2IN located in that database

  • Table: testdb:50000/DB2INST1/SENSITIVE_DATA - Exclude tables that begin with SENSITIVE_DATA located in that database/schema

  • Column: testdb:50000/DB2INST1/SENSITIVE_DATA/first_ - Exclude columns that begin with first_ located in that database/schema/table
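Each deeper level of a DB2 prefix filter appends one more segment to the path. The sketch below builds the string level by level; the function name is hypothetical and it only models the <database>:<port>/<schema>/<table>/<column-prefix> format described above.

```python
from typing import Optional


def db2_filter(
    database: str,
    port: int,
    schema: Optional[str] = None,
    table: Optional[str] = None,
    column_prefix: Optional[str] = None,
) -> str:
    """Join the levels that were supplied, stopping at the first missing one."""
    parts = [f"{database}:{port}"]
    for level in (schema, table, column_prefix):
        if level is None:
            break
        parts.append(level)
    return "/".join(parts)


print(db2_filter("testdb", 50000, "DB2IN"))
print(db2_filter("testdb", 50000, "DB2INST1", "SENSITIVE_DATA"))
print(db2_filter("testdb", 50000, "DB2INST1", "SENSITIVE_DATA", "first_"))
```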

Exclude location by suffix
  • Database: testdb:50000 (Supported but not meaningful to use)

  • Schema: _MYSCHEMA - Exclude a schema. * not needed.

  • Table: MYTABLE - Exclude a table. * not needed.

  • Column: _name - Exclude a column. * not needed.

Exclude locations by expression
  • Database: testdb:50000* (Supported but not meaningful to use)

  • Schema: testdb:50000/OTHER_SCHEMA/* - Exclude a schema and its content

  • Table: testdb:50000/DB2INST1/*DATA_EX* - Exclude all tables that contain DATA_EX in their names

  • Column: testdb:50000/DB2INST1/DATA_EXPRESSION/*first_name* - Exclude all columns that contain first_name in their names

Include locations within modification date

N/S

Include locations modified recently

N/S

Exclude locations greater than file size

N/S

Microsoft SQL

The asterisk (*) matches anything, and it can be placed before a schema without restriction. For example, both of these expressions are effective filters:
⚫️ testddc:1433/testschema
⚫️ */testschema
The second expression omits the database and port. If there is only one schema named testschema, or the scan has only that database added as a data store, the two expressions are equivalent. If the scan has more than one MSSQL database added as data stores and more than one schema is named testschema, the filters are not equivalent: the second expression filters every testschema schema, whereas the first filters only the testschema schema in the testddc database.
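The difference between the two expressions can be sketched with Python's fnmatch module. The location strings, the second database name (otherdb) and the helper are hypothetical, and glob-style matching is assumed here only to illustrate the scoping; it is not the product's matching code.

```python
from fnmatch import fnmatchcase

# Hypothetical scan locations spread over two MSSQL data stores.
locations = [
    "testddc:1433/testschema/mytable",
    "otherdb:1433/testschema/mytable",
    "testddc:1433/dbo/mytable",
]


def excluded(schema_filter: str) -> list[str]:
    """Return the locations that sit below the schema named by the filter."""
    # "/*" is appended so that everything inside the schema is covered.
    return [loc for loc in locations if fnmatchcase(loc, schema_filter + "/*")]


# Fully qualified: only the schema in the testddc database is filtered.
qualified = excluded("testddc:1433/testschema")

# Leading "*": every schema named testschema is filtered, in any database.
wildcard = excluded("*/testschema")

print(qualified)
print(wildcard)
```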

Exclude location by prefix
  • Database: testddc:1433

    Supported but not useful (the database name is specified when the datastore is instantiated, so there is no point filtering out the whole db).

  • Schema: testddc:1433/testschema

    The schema is always relative to what is called the catalog (i.e. the database name and the port).
    The same filter can also be written as */testschema, but using testschema only is not effective as a prefix.

  • Table: testddc:1433/testschema/mytable

    The complete path should be specified, or part of it can be replaced with a *. A filter like */mytable has the side effect of filtering every table named mytable, no matter which schema it belongs to.

  • Column: N/S

Exclude location by suffix
  • Database: testddc:1433

    Supported but not useful (the database name is specified when the datastore is instantiated, so there is no point filtering out the whole db).

  • Schema: testddc:1433/testschema (As it is a suffix, testschema also works with the same effect)

  • Table: testddc:1433/testschema/mytable

    If the table name is unique in all schemas mytable is enough to filter.
    If different schemas contain the same tablename then caution should be used.

  • Column: N/S

Exclude locations by expression
  • Database: testddc:1433

    Supported but not useful (the database name is specified when the datastore is instantiated so there is no point filtering out the whole db).

  • Schema: testddc:1433/testschema

    The schema is always relative to what is called the catalog (i.e. the database name and the port). The same filter can also be written as */testschema, but using testschema only is not effective as a prefix.

  • Table: testddc:1433/testschema/mytable

    The complete path should be specified, or part of it can be replaced with a *. A filter like */mytable has the side effect of filtering every table named mytable, no matter which schema it belongs to.

  • Column: N/S

Include locations within modification date

N/S

Include locations modified recently

N/S

Exclude locations greater than file size

N/S

Azure tables

Azure tables contain only tables. There are no databases or schemas.

Exclude location by prefix
  • Database: N/S
  • Schema: N/S
  • Table: table* (Trailing '*' is mandatory to match row_count)
  • Column: N/S
Exclude location by suffix
  • Database: N/S
  • Schema: N/S
  • Table: table* (Trailing '*' is mandatory to match row_count)
  • Column: N/S
Exclude locations by expression
  • Database: N/S
  • Schema: N/S
  • Table: table* (Trailing '*' is mandatory to match row_count)
  • Column: N/S
Include locations within modification date

N/S

Include locations modified recently

N/S

Exclude locations greater than file size

N/S

MySQL

Exclude location by prefix
  • Database: database

    It is not mandatory to include * at the end of the database.
    It will ignore the database called database.

  • Schema: N/S

    MySQL only contains the database and tables.
    In some cases, MySQL databases are also referred to as schemas.
    MySQL scan path is database/table.

  • Table: database/table or *table

    It is mandatory to include * at the beginning of the table name.
    Replace the leading * with `database/` if you only want to exclude a table from a specific database.

  • Column: *testddc/sensitive_data/email (It's mandatory to include * at the beginning.)

Exclude location by suffix
  • Database: database*

    It is mandatory to include * at the end of the database.
    You can replace * with `/table` to only ignore a specific table from that database.

  • Schema: N/S

  • Table: database/table or table

    It will ignore all tables ending with the word table.
    Include the full path if you wish to ignore only a table in a specific database.

  • Column: testddc/sensitive_data/email

Exclude locations by expression
  • Database: database* (It is mandatory to include * at the end of the database.)

  • Schema: N/S

  • Table: database/table or *table (It is mandatory to include * at the beginning of the table name.)

  • Column: *testddc/sensitive_data/email

Include locations within modification date

N/S

Include locations modified recently

N/S

Exclude locations greater than file size

N/S

PostgreSQL

Exclude location by prefix
  • Database: database (e.g. hr)

  • Schema: database:port/schema (e.g. hr:5432/prod for excluding the schemas that have the prod prefix or hr:5432/* for excluding all schemas from hr db scan locations)

  • Table: database:port/schema/table (e.g. hr:5432/hr/prod)

  • Column: *database:port/schema/table/column* (e.g. *hr:5432/hr/prod/EMAIL*)

    It is mandatory to include * at the beginning and end.

Exclude location by suffix
  • Database: database* (e.g. hr*)

    It is mandatory to include * at the end of the database.

  • Schema: *schema* (e.g. *prod*)

    It is mandatory to include * at the beginning and end.
    Along with the schema, this also excludes tables that have the suffix prod.

  • Table: database:port/schema/table (e.g. hr:5432/hr/prod)

  • Column: database:port/schema/table/column (e.g. hr:5432/hr/prod/EMAIL)

Exclude locations by expression
  • Database: database* (e.g. hr*)

    It is mandatory to include * at the end.

  • Schema: database:port/schema/* (e.g. hr:5432/prod/*)

    It is mandatory to include * at the end.

  • Table: database:port/schema/table (e.g. hr:5432/hr/prod)

  • Column: *database:port/schema/table/column* (e.g. *hr:5432/prod/prod/EMAIL*)

    It is mandatory to include * at the beginning and end.

Include locations within modification date

N/S

Include locations modified recently

N/S

Exclude locations greater than file size

N/S

NoSQL

Azure blobs

Exclude location by prefix
  • Account: N/S

  • Container: ddctest* or ddctest

  • Blob: ddctest/reuters/reut2-012.sgm or *reut2-012.sgm

Exclude location by suffix
  • Account: N/S

  • Container: *my_data*

    ⚫️ */my_data - N/S.
    ⚫️ *my_data - N/S
    ⚫️ */my_data*/ - N/S.
    ⚫️ *my_data* - Supported, but along with the Container it also excludes any Blob that has the suffix my_data, and the report shows the filter type as "Exclude location by expression", not by suffix.

    Container is allowed for exclude_suffix, but you should be careful when using it.

  • Blob:

    1. my_data
    2. my_data*
    3. ddctestblob/reut2*
Exclude locations by expression
  • Account: N/S

  • Container: ddctestdata* or *test* (It is mandatory to include * at the end of container name.)

  • Blob:

    1. *reut2*
    2. *.txt
    3. ddctestblob/reut2*
    4. *dctestblob/*.txt
    5. */reut2* (It is mandatory to give a * at the start if you want to exclude a specific blob like reut2-012.sgm would be *reut2-012.sgm)
Include locations within modification date

parameters:

  • toDate: "2021-08-04"
  • fromDate: "2021-09-24"

Date format is in "YYYY-MM-DD".
toDate and fromDate are inclusive for files that are being scanned.
The files matching include_date_range do not have to match include_recent.
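Because both bounds are inclusive, a file modified on either boundary date is still scanned. A minimal sketch of that check follows. The function name and the dates are hypothetical, and the sketch assumes the earlier date is the start of the range.

```python
from datetime import date


def in_date_range(modified: date, start: date, end: date) -> bool:
    """Return True when the modification date lies in [start, end], bounds included."""
    return start <= modified <= end


# Hypothetical range in the "YYYY-MM-DD" format used by the filter.
start = date.fromisoformat("2021-08-01")
end = date.fromisoformat("2021-08-31")

print(in_date_range(date(2021, 8, 1), start, end))   # on the lower bound
print(in_date_range(date(2021, 8, 31), start, end))  # on the upper bound
print(in_date_range(date(2021, 9, 1), start, end))   # one day after the range
```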

Include locations modified recently

parameters:

  • date: 2

The files matching include_recent do not have to match include_date_range.

Exclude locations greater than file size

parameters:

  • size: 1 (Size in MB)

MongoDB

Exclude location by prefix
  • Database: database (It is not mandatory to include * at the end of the database.)

  • Collection: database/collection* or *collection*

    It is mandatory to include * at the beginning and end of the collection.
    The leading * matches the database, while the trailing * matches the _id.
    _id is the primary key in MongoDB. ER2 creates an object for each document/record/row instead of each table.

  • Fields: N/S

Exclude location by suffix
  • Database: database* (It is mandatory to include * at the end of the database.)

  • Collection: database/collection* or collection*

    It is mandatory to include * at the end of the collection.
    The trailing * matches _id.

  • Fields: N/S

Exclude locations by expression
  • Database: database* (It is mandatory to include * at the end of the database.)

  • Collection: database/collection* or *collection* (It is mandatory to include * at the beginning and end of the collection.)

  • Fields: N/S

Include locations within modification date

N/S

Include locations modified recently

N/S

Exclude locations greater than file size

N/S

SAP Hana

Exclude location by prefix
  • Database: database

  • Schema:

    1. database:port/schema_name
    2. database:port/*

    It is mandatory to include port if it is different than the default one.
    You can replace schema_name with * to exclude all schemas.

  • Table:

    1. database:port/schema_name/table
    2. database:port/*/table
    3. database:port/*/*

    You can replace schema_name with * to exclude the table from all schemas. You can also replace the table name with * to exclude all tables.

  • Column: *database:port/schema/table/column* (It is mandatory to include * at the beginning and end.)

Exclude location by suffix
  • Database: database* (It is mandatory to include * at the end of the database.)

  • Schema: -

  • Table:

    1. table
    2. database:port/schema/*table

    It will ignore all tables ending with the word table.
    Include the full path if you wish to ignore only a table in a specific schema and database.

  • Column:

    1. column
    2. table/column
    3. database:port/schema/table/column

    "Pipe" (|) can be used to exclude multiple columns.

Exclude locations by expression
  • Database: *database* (It is mandatory to add * at the beginning and end; only then is the filter type reported as "exclude_expression".)

  • Schema: database:port/schema/* (It is mandatory to include * at the end.)

  • Table: database:port/schema/table_expression (e.g. HXE:39041/*/*cop?)

  • Column:

    1. *column*
    2. *table/column*
    3. *schema/*/column*
    4. *database:port/schema/table/column*

    To exclude the column location for a specific table, you have to add the table name before the column.
    To exclude from a specific schema you have to add the schema as shown in example 3.
    It is mandatory to include * at the beginning and end.

Include locations within modification date

N/S

Include locations modified recently

N/S

Exclude locations greater than file size

N/S

Mail

G-Mail

Exclude location by prefix
  • User: datastorecicduser

  • Account: - (User and Account seem to be the same thing)

  • Folder/Label: datastorecicduser/inbox or *inbox

    Use lower case for folder names.
    Labels are not supported as label id is required to filter.
    The second option would scan the inbox of every user.
    Question: I am filtering Gmail scan by labels - why am I not getting expected results? Answer: For the default system labels, Gmail creates some folders that do not match the label name. Please look at Gmail documentation to learn the right labels to filter data objects in a Gmail scan.

  • File:

    1. datastorecicduser/inbox/mydata filter test data/Mon, 23 Aug 2021 10:54:40 +0530/50-contacts.csv or
    2. datastorecicduser/inbox/mydata filter test data/*/50-contacts.csv or
    3. datastorecicduser/inbox/test filter test data or
    4. datastorecicduser/Label_4800715244570918733/test filter test data/ or
    5. datastorecicduser/*/mydata filter test data/

    The second example is recommended, as it avoids manually checking the email's date and time and converting it to the required format.
    The third option is used if you want to scan a specific email and all its content.
    As mentioned above, label filters need the label id, as shown in the fourth example.
    You can replace the label id with '*', as shown in the fifth example, if you are scanning a specific label and want to filter a specific email or file.
    The fourth and fifth examples can be used similarly for exclude_suffix and exclude_expression.
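If the exact timestamp form (first example) is needed, the date segment can be produced with Python's email.utils.format_datetime, which emits the same RFC 2822 style as "Mon, 23 Aug 2021 10:54:40 +0530". The sketch below is an illustration only: the helper name is hypothetical, and reading the path segments as user, folder, subject, date and attachment is an assumption drawn from the examples above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def gmail_file_filter(user: str, folder: str, subject: str,
                      sent: datetime, attachment: str) -> str:
    """Join the path segments, formatting the date in RFC 2822 style."""
    return "/".join([user, folder, subject, format_datetime(sent), attachment])


# Timestamp from the first example, in the +05:30 time zone.
sent = datetime(2021, 8, 23, 10, 54, 40,
                tzinfo=timezone(timedelta(hours=5, minutes=30)))

exact = gmail_file_filter("datastorecicduser", "inbox",
                          "mydata filter test data", sent, "50-contacts.csv")

# The second example replaces the date segment with * and needs no conversion.
wildcard = "/".join(["datastorecicduser", "inbox",
                     "mydata filter test data", "*", "50-contacts.csv"])

print(exact)
print(wildcard)
```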

Exclude location by suffix
  • User: datastorecicduser* (Trailing * matches remaining path.)

  • Account: -

  • Folder/Label: datastorecicduser/inbox* or *inbox*

    Question: I am filtering Gmail scan by labels - why am I not getting the expected results? Answer: For the default system labels, Gmail creates some folders that do not match the label name. Please look at Gmail documentation to learn the right labels to filter data objects in a Gmail scan.

  • File:

    • datastorecicduser/inbox/mydata filter test data/Mon, 23 Aug 2021 10:54:40 +0530/50-contacts.csv or
    • datastorecicduser/inbox/mydata filter test data/*/50-contacts.csv or
    • datastorecicduser/inbox/mydata filter test data*
Exclude locations by expression
  • User: datastorecicduser*

  • Account: -

  • Folder/Label: datastorecicduser/inbox* or *inbox*

    Question: I am filtering Gmail scan by labels - why am I not getting the expected results? Answer: For the default system labels, Gmail creates some folders that do not match the label name. Please look at Gmail documentation to learn the right labels to filter data objects in a Gmail scan.

  • File:

    • datastorecicduser/inbox/mydata filter test data/Mon, 23 Aug 2021 10:54:40 +0530/50-contacts.csv or
    • datastorecicduser/inbox/mydata filter test data/*/50-contacts.csv or
    • datastorecicduser/inbox/mydata filter test data*
Include locations within modification date

parameters:

  • toDate: "2021-08-04"
  • fromDate: "2021-08-24"

Date format is in "YYYY-MM-DD".
toDate and fromDate are inclusive for files that are being scanned.
The files matching include_date_range do not have to match include_recent.
The date when the email was sent/received is considered.

Include locations modified recently

parameters:

  • days: 2

Days should be between 1 and 99 (both inclusive).
The files matching include_recent do not have to match include_date_range.
The date when the email was sent/received is considered.
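The days constraint and the recency check can be sketched as follows. The function names are hypothetical, and measuring the window back from the current moment is an assumption, since this section does not state whether the cutoff is aligned to midnight.

```python
from datetime import datetime, timedelta, timezone


def validate_days(days: int) -> int:
    """Reject values outside the documented range of 1 to 99 (inclusive)."""
    if not 1 <= days <= 99:
        raise ValueError("days must be between 1 and 99 (both inclusive)")
    return days


def is_recent(timestamp: datetime, days: int, now: datetime) -> bool:
    """Return True when the email was sent/received within the last `days` days."""
    # Assumption: the window is counted back from `now`, not from midnight.
    return timestamp >= now - timedelta(days=validate_days(days))


now = datetime(2021, 8, 24, 12, 0, tzinfo=timezone.utc)
print(is_recent(datetime(2021, 8, 23, 9, 0, tzinfo=timezone.utc), 2, now))
print(is_recent(datetime(2021, 8, 20, 9, 0, tzinfo=timezone.utc), 2, now))
```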

Exclude locations greater than file size

N/A

Others

Teradata

Exclude location by prefix
  • Schema: teradata (You can replace the schema with * to exclude all schemas.)
  • Table: teradata/sensitive_data or */sensitive_data (You can include * to exclude table from all schemas.)
  • Column: *teradata/sensitive_data/EMAIL (It is mandatory to include * at the beginning.)
Exclude location by suffix
  • Schema: teradata*
  • Table: teradata/sensitive*
  • Column:

    1. EMAIL
    2. EMAIL | MAC_ADDR
    3. teradata/EMAIL|MAC_ADDR
    4. teradata/sensitive_data/EMAIL
Exclude locations by expression
  • Schema: *teradata* or teradata* or teradata
  • Table:

    • teradata/sensitive_data_replica
    • */sensitive_data_replica
  • Column:

    • *EMAIL*
    • *teradata/*/EMAIL*
Include locations within modification date

N/S

Include locations modified recently

N/S

Exclude locations greater than file size

N/S

Exchange Online

Exclude location by prefix
  • Group: All Users

  • User/Account:

    1. All Users/sample@sjcpl.onmicrosoft.com or
    2. *sample@sjcpl.onmicrosoft.com

    The second option would filter out "sample@sjcpl.onmicrosoft.com" user data objects from every group.

  • Folder: All Users/sample@sjcpl.onmicrosoft.com/inbox or *inbox

    Folder name is case-sensitive.
    The second option would filter out inbox data objects of every user and group.

  • Attachment:

    1. All Users/sample@sjcpl.onmicrosoft.com/Inbox/Mail a/2021-02-22T06:40:18Z/maildir-a.zip or
    2. All Users/sample@sjcpl.onmicrosoft.com/Inbox/Mail a/*/maildir-a.zip or
    3. maildir-a.zip or
    4. All Users/sample@sjcpl.onmicrosoft.com/folder_name/subject or
    5. All Users/sample@sjcpl.onmicrosoft.com/Inbox/Mail a/2021-02-22T06:40:18Z or
    6. *subject

    The second example would be recommended to the user to avoid manually checking mail's date and time and converting it to required format.
    The third option would filter out data objects with attachment maildir-a.zip.
    The fourth option is used if you want to filter out a specific mail and all its content with a corresponding subject name.
    The fifth and sixth option would filter out data objects with given timestamp and subject name.

Exclude location by suffix
  • Group: All Users* (You have to use trailing * to exclude given location.)

  • User/Account: All Users/sample@sjcpl.onmicrosoft.com* or *sample@sjcpl.onmicrosoft.com*

  • Folder: All Users/sample@sjcpl.onmicrosoft.com/inbox* or *inbox*

  • Attachment:

    • All Users/sample@sjcpl.onmicrosoft.com/Inbox/Mail a/2021-02-22T06:40:18Z/maildir-a.zip* or
    • All Users/sample@sjcpl.onmicrosoft.com/Inbox/Mail a/*/maildir-a.zip* or
    • *maildir-a.zip*
Exclude locations by expression
  • Group: All Users* (You have to use trailing * to exclude a given location.)

  • User/Account: All Users/sample@sjcpl.onmicrosoft.com* or *sample@sjcpl.onmicrosoft.com*

  • Folder: All Users/sample@sjcpl.onmicrosoft.com/inbox* or *inbox*

  • Attachment:

    • All Users/sample@sjcpl.onmicrosoft.com/Inbox/Mail a/2021-02-22T06:40:18Z/maildir-a.zip* or
    • All Users/sample@sjcpl.onmicrosoft.com/Inbox/Mail a/*/maildir-a.zip* or
    • *maildir-a.zip*
Include locations within modification date

N/S

Include locations modified recently

N/S

Exclude locations greater than file size

N/S

Sharepoint

N/S

G-Drive

Exclude location by prefix
  • User: datastorecicduser (This is for a user with email address datastorecicduser@ddc-thalescpl.com)

  • Folder: datastorecicduser/my drive/some/folder (The path should be in lower case.)

  • File: datastorecicduser/my drive/some/folder/file.ext

Exclude location by suffix
  • User: datastorecicduser* (Trailing * matches the remaining path.)

  • Folder: datastorecicduser/my drive/some/folder* or *folder* (The second example will exclude the folder present inside every user's drive.)

  • File: datastorecicduser/my drive/some/folder/file.ext* or *file.ext*

    Trailing * is mandatory even for an absolute path.
    The second example will exclude file.ext present inside every user's drive.

Exclude locations by expression
  • User: datastorecicduser*

  • Folder: datastorecicduser/my drive/some/folder* or *folder*

  • File: datastorecicduser/my drive/some/folder/file.ext* or *file.ext*

Include locations within modification date

parameters:

  • toDate: "2021-08-18"
  • fromDate: "2021-08-19"

Date format is in "YYYY-MM-DD".
toDate and fromDate are inclusive for files that are being scanned.
The files matching include_date_range do not have to match include_recent.
This filter only works on modified date.

Include locations modified recently

parameters:

  • days: 2

Days should be between 1 and 99 (both inclusive).
The files matching include_recent do not have to match include_date_range.
This filter only works on a modified date.

Exclude locations greater than file size

parameters:

  • size: 2 (Size is in MB)

Fractions or decimal points are not allowed.

Information Types

Infotype Name | Category | Region
American Express | Financial | Global
Australian Bank Account Number | Financial | Oceania
Australian Business Number | Financial | Oceania
Australian Company Number | Financial | Oceania
Australian Driver License Number | Personal Data | Oceania
Australian Healthcare Identifier - Organisation | Medical | Oceania
Australian Individual Healthcare Identifier | Medical | Oceania
Australian Mailing Address | Personal Data | Oceania
Australian Medicare Card | Medical | Oceania
Australian Medicare Provider | Medical | Oceania
Australian Passport Number | Personal Data | Oceania
Australian Tax File Number | National ID | Oceania
Australian Telephone Number | Personal Data | Oceania
Austrian Driver License Number | Personal Data | Europe
Austrian Mailing Address | Personal Data | Europe
Austrian Passport Number | Personal Data | Europe
Austrian Personalausweis | National ID | Europe
Austrian SSN | National ID | Europe
Austrian Telephone Number | Personal Data | Europe
Belgian Driver License Number | Personal Data | Europe
Belgian eID | National ID | Europe
Belgian National Number | National ID | Europe
Belgian Passport Number | Personal Data | Europe
Belgian Telephone Number | Personal Data | Europe
Brazilian CPF | National ID | Americas
Brazilian Registro Geral | National ID | Americas
Bulgarian EGN | National ID | Europe
Canadian Bank Account Number | Financial | Americas
Canadian Health Service Number | Medical | Americas
Canadian Mailing Address | Personal Data | Americas
Canadian Passport Number | Personal Data | Americas
Canadian Personal Health Identification Number (PHIN) | Medical | Americas
Canadian Social Insurance Number | National ID | Americas
Canadian Telephone Number | Personal Data | Americas
Chilean RUN | National ID | Americas
China Union Pay | Financial | Global
Credentials username | Personal Data | Global
Credentials password | Personal Data | Global
Croatian OIB | National ID | Europe
Cypriot Passport Number | Personal Data | Europe
Czech Republic RC | National ID | Europe
Danish CPR | National ID | Europe
Danish Driver License Number | Personal Data | Europe
Danish Passport Number | Personal Data | Europe
Date Of Birth | Personal Data | Global
Date Of Birth (under 18) | Personal Data | Global
Diners Club | Financial | Global
Discover | Financial | Global
Drug Enforcement Agency Number | Medical | Americas
Dutch Burgerservicenummer | National ID | Europe
Dutch Driver License Number | Personal Data | Europe
Dutch NIK | National ID | Europe
Dutch Passport Number | Personal Data | Europe
Dutch Telephone Number | Personal Data | Europe
Email addresses | Personal Data | Global
Ethnicity (English) | Personal Data | Global
European EHIC | Medical | Europe
Finnish HETU | National ID | Europe
French Carte Vitale | National ID | Europe
French CNI | National ID | Europe
French Driver License Number | Personal Data | Europe
French INSEE | National ID | Europe
French Mailing Address | Personal Data | Europe
French Passport Number | Personal Data | Europe
French Telephone Number | Personal Data | Europe
Gambian National Identification Number | National | Africa
Gender (English) Personal Data Global
Generic Bank Account Number Financial Global
German Driver License Number Personal Data Europe
German Mailing Address Personal Data Europe
German Passport Number Personal Data Europe
German Personalausweis National ID Europe
German Telephone Number Personal Data Europe
Greek AFM National ID Europe
Greek AMKA National ID Europe
Greek Passport Number Personal Data Europe
Hong Kong ID National ID Asia
Hungarian Personal ID National ID Europe
Icelandish Kennitala National ID Europe
Indian Name Personal Data Asia
Indian Aadhaar Number National ID Asia
Indian PAN (Juridical) Number National ID Asia
Indian Voter ID National ID Asia
Indian Driving License Number Personal Data Asia
Indian Phone Number Personal Data Asia
Indian Address Personal Data Asia
Indian MGNREGA Job Card ID National ID Asia
Indian Bank Account Number Financial Data Asia
Indian Ration Card Number National ID Asia
Indian Marital Status Personal Data Asia
Indian Passport Number Personal Data Asia
International Bank Account Number (IBAN) Financial Global
IP Address Personal Data Global
Iranian National Identification Number National Asia
Irish Driver License Number Personal Data Europe
Irish Passport Card Number Personal Data Europe
Irish Passport Number Personal Data Europe
Irish Personal Public Service Number National Europe
Irish Telephone Number Personal Data Europe
ISO8583 message with PAN Financial Global
Israeli Bank Account Number Financial Asia
Israeli Identity Number National ID Asia
Italian CARTA D'IDENTITÀ National ID Europe
Italian Codice Fiscale National ID Europe
Italian Driver License Number Personal Data Europe
Italian Mailing Address Personal Data Europe
Italian Passport Personal Data Europe
Italian Telephone Number Personal Data Europe
Japanese Bank Account Number Financial Asia
Japanese Driver License Number Personal Data Asia
Japanese Passport Number Personal Data Asia
Japanese Resident Registration Number National Asia
Japanese Social Insurance Number (SIN) National Asia
JCB Financial Global
Laser Financial Global
Latvian Personas Kods National ID Europe
License Number Personal Data Global
Login credentials Personal Data Global
Luxembourg Driver License Number Personal Data Europe
Luxembourg ID National ID Europe
Luxembourg Passport Number Personal Data Europe
Luxembourg Phone Number Personal Data Europe
MAC Address Personal Data Global
Macedonian UMCN National ID Europe
Maestro Financial Global
Malaysian NRIC National ID Asia
Maltese eID National ID Europe
Mastercard Financial Global
Medicare Beneficiary Identifier (MBI) Medical North America
Mexican CURP National ID Americas
New Zealand Inland Revenue Number National ID Oceania
New Zealand Mailing Address Personal Data Oceania
New Zealand Passport Number Personal Data Oceania
New Zealand Telephone Number Personal Data Oceania
Norwegian Birth Number National ID Europe
Norwegian Driver License Number Personal Data Europe
Norwegian Passport Number Personal Data Europe
Passport Number Personal Data Global
Peoples Republic of China ID National ID Asia
Personal Names (Austrian) Personal Data Europe
Personal Names (Belgian) Personal Data Europe
Personal Names (English) Personal Data Global
Personal Names (French) Personal Data Europe
Personal Names (German) Personal Data Europe
Personal Names (Italian) Personal Data Europe
Personal Names (Netherlands) Personal Data Europe
Personal Names (Polish) Personal Data Europe
Personal Names (Portuguese) Personal Data Europe
Polish Driver License Number Personal Data Europe
Polish Identity Card National ID Europe
Polish Mailing Address Personal Data Europe
Polish Passport Number Personal Data Europe
Polish PESEL National ID Europe
Polish Telephone Number Personal Data Europe
Portuguese Citizen's Card National ID Europe
Portuguese Driver License Number Personal Data Europe
Portuguese Fiscal Number National ID Europe
Portuguese Identity Number National ID Europe
Portuguese Mailing Address Personal Data Europe
Portuguese Passport Number Personal Data Europe
Portuguese Phone Number Personal Data Europe
Private Label Card Financial Global
Profanity (English) Personal Data Global
Religion (English) Personal Data Global
Romanian Identity Card National ID Europe
Romanian Numerical Personal Code National ID Europe
Saudi Arabia National ID National ID Asia
Serbian UMCN National ID Europe
Singaporean NRIC National ID Asia
Slovakian RC National ID Europe
Slovenian EMSO National ID Europe
South African Identity Number National ID Africa
South Korean Corporation Registration Number (법인등록번호) Financial Asia
South Korean Driver License Number Personal Data Asia
South Korean Foreigner Number National ID Asia
South Korean Gwangju Bank (광주은행) Account Number Financial Asia
South Korean Jeju Bank (제주은행) Account Number Financial Asia
South Korean Jeonbuk Bank (전북은행) Account Number Financial Asia
South Korean KB Bank (국민은행) Account Number Financial Asia
South Korean KEB Hana Bank (KEB하나은행) Account Number Financial Asia
South Korean NH Bank (농협은행) Account Number Financial Asia
South Korean Passport Personal Data Asia
South Korean Phone Number Personal Data Asia
South Korean RRN National ID Asia
South Korean Shinhan Bank (신한은행) Account Number Financial Asia
South Korean Taxpayer Identification Number (사업자등록번호) Financial Asia
Spanish DNI National ID Europe
Spanish Driver License Number Personal Data Europe
Spanish NIE National ID Europe
Spanish Passport Number Personal Data Europe
Spanish Social Security Number National ID Europe
Spanish Telephone Number Personal Data Europe
Sri Lankan National Identity Card National ID Asia
Swedish Driver License Number Personal Data Europe
Swedish Nationellt ID-kort National ID Europe
Swedish Passport Number Personal Data Europe
Swedish Personnummer National ID Europe
SWIFT Code Financial Global
Swiss Social Security Number National ID Europe
Taiwanese ID National ID Asia
Thai Population Identification Code National ID Asia
Troy Financial Global
Turkish Identification Number National ID Europe
Turkish Telephone Number Personal Data Europe
United Arab Emirates ID National ID Asia
United Kingdom Community Health Index Medical Europe
United Kingdom Driver License Number Personal Data Europe
United Kingdom Electoral Roll Number Personal Data Europe
United Kingdom Health and Care Number Medical Europe
United Kingdom Mailing Address Personal Data Europe
United Kingdom National Health Service Number Medical Europe
United Kingdom NI Number National ID Europe
United Kingdom Passport Number Personal Data Europe
United Kingdom Self Assessment UTR Number National ID Europe
United Kingdom Telephone Number Personal Data Europe
United Kingdom VAT Number Financial Europe
United States Bank Account Number Financial Americas
United States Driver License Number Personal Data Americas
United States Health Insurance Claim Number Medical Americas
United States Health Plan Identifier Medical Americas
United States Individual Taxpayer Identification Number (ITIN) National ID Americas
United States Mailing Address Personal Data Americas
United States National Provider Identifier Medical Americas
United States Passport Number Personal Data North America
United States Passport Card Number Personal Data North America
United States Routing Transit Number Financial Americas
United States Social Security Number National Americas
United States Telephone Number Personal Data Americas
Visa Financial Global
Yugoslavia UMCN National ID Europe

Supported Formats

Files

Type                      Format
Compressed                bzip2, Gzip (all types), TAR, Zip (all types)
Databases                 Access, DBase, SQLite, MSSQL MDF & LDF
Images                    BMP, FAX, GIF, JPG, PDF (embedded), PNG, TIF
Microsoft Backup Archive  Microsoft Binary / BKF
Microsoft Office          v5, 6, 95, 97, 2000, XP, 2003 onwards
Open Source               Star Office / Open Office / Libre Office
Open Standards            PDF, RTF, HTML, XML, CSV, TXT

Office files

WORD

  • Legacy: Legacy filename extensions denote binary Microsoft Word formatting that became outdated with the release of Microsoft Office 2007. Although the latest version of Microsoft Word can still open them, they are no longer developed. Legacy filename extensions include:

    • .doc – Legacy Word document; Microsoft Office refers to them as "Microsoft Word 97 – 2003 Document"
    • .dot – Legacy Word templates; officially designated "Microsoft Word 97 – 2003 Template"
    • .wbk – Legacy Word document backup; referred to as "Microsoft Word Backup Document"
  • OOXML: The Office Open XML (OOXML) format was introduced with Microsoft Office 2007 and has been the default format of Microsoft Word ever since. Related file extensions include:

    • .docx – Word document
    • .docm – Word macro-enabled document; same as docx, but may contain macros and scripts
    • .dotx – Word template
    • .dotm – Word macro-enabled template; same as dotx, but may contain macros and scripts
    • .docb – Word binary document introduced in Microsoft Office 2007

EXCEL

  • Legacy: Legacy filename extensions denote binary Microsoft Excel formats that became outdated with the release of Microsoft Office 2007. Although the latest version of Microsoft Excel can still open them, they are no longer developed. Legacy filename extensions include:

    • .xls – Legacy Excel worksheets; officially designated "Microsoft Excel 97-2003 Worksheet"
    • .xlt – Legacy Excel templates; officially designated "Microsoft Excel 97-2003 Template"
    • .xlm – Legacy Excel macro
  • OOXML: The Office Open XML (OOXML) format was introduced with Microsoft Office 2007 and has been the default format of Microsoft Excel ever since. Excel-related file extensions of this format include:

    • .xlsx – Excel workbook
    • .xlsm – Excel macro-enabled workbook; same as xlsx but may contain macros and scripts
    • .xltx – Excel template
    • .xltm – Excel macro-enabled template; same as xltx but may contain macros and scripts

POWERPOINT

  • Legacy:

    • .ppt – Legacy PowerPoint presentation
    • .pot – Legacy PowerPoint template
    • .pps – Legacy PowerPoint slideshow
  • OOXML:

    • .pptx – PowerPoint presentation
    • .pptm – PowerPoint macro-enabled presentation
    • .potx – PowerPoint template
    • .potm – PowerPoint macro-enabled template
    • .ppam – PowerPoint add-in
    • .ppsx – PowerPoint slideshow
    • .ppsm – PowerPoint macro-enabled slideshow
    • .sldx – PowerPoint slide
    • .sldm – PowerPoint macro-enabled slide

ACCESS

  • Legacy:

    • .ade – Protected Access Data Project (not supported in 2013)
    • .adp – Access Data Project (not supported in 2013)
    • .mdb – Access Database (2003 and earlier)
    • .cdb – Access Database (Pocket Access for Windows CE)
    • .mda – Access Database, used for add-ins (Access 2, 95, 97), previously used for workgroups (Access 2)
    • .mdt – Access Add-in Data (2003 and earlier)
    • .mdf – Access (SQL Server) detached database (2000)
    • .mde – Protected Access Database, with compiled VBA and macros (2003 and earlier)
    • .ldb – Access lock files (associated with .mdb)
  • Available formats since Access 2007:

    • .accdb – The file extension for the new Office Access 2007 file format. This takes the place of the MDB file extension
    • .accde – The file extension for Office Access 2007 files that are in "execute only" mode. ACCDE files have all Visual Basic for Applications (VBA) source code hidden. A user of an ACCDE file can only execute VBA code, but not view or modify it. ACCDE takes the place of the MDE file extension
    • .accdt – The file extension for Access Database Templates
    • .accdr – A file extension that enables you to open a database in runtime mode. By changing a database's file extension from .accdb to .accdr, you can create a "locked-down" version of your Office Access database. You can change the file extension back to .accdb to restore full functionality

OUTLOOK

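  • .pst - Outlook data file (Personal Storage Table)
  • .ost - Outlook offline data file (Offline Storage Table)
  • .msg - Outlook message (a single saved item)
  • .dbx - Outlook Express mail folder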

OTHER

  • .pub – a Microsoft Publisher publication
  • .xps – an XML-based document format used for printing (on Windows Vista and later) and preserving documents
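
As an informal aid for working with the lists above, the following sketch maps a few of the listed extensions to their application and format generation. Only a subset of extensions is included, the names OFFICE_FORMATS and classify are hypothetical, and matching extensions without regard to case is an assumption.

```python
from pathlib import PurePosixPath

# A subset of the extensions listed above: extension -> (application, generation).
OFFICE_FORMATS = {
    ".doc": ("Word", "Legacy"),
    ".docx": ("Word", "OOXML"),
    ".xls": ("Excel", "Legacy"),
    ".xlsx": ("Excel", "OOXML"),
    ".ppt": ("PowerPoint", "Legacy"),
    ".pptx": ("PowerPoint", "OOXML"),
    ".mdb": ("Access", "Legacy"),
    ".accdb": ("Access", "Access 2007 and later"),
}


def classify(filename):
    """Return (application, generation) for a known extension, else None."""
    extension = PurePosixPath(filename).suffix.lower()  # case-insensitive lookup
    return OFFICE_FORMATS.get(extension)


word_file = classify("Quarterly Report.DOCX")
unknown = classify("notes.txt")
```

Extensions outside the table return None, so a caller can tell "not an Office format in this subset" apart from a recognized one.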

Databases

  • Microsoft SQL
  • Oracle
  • IBM DB2
  • PostgreSQL
  • SAP HANA
  • MySQL
  • MongoDB

Big Data

  • Hadoop
  • Teradata

Binary Large Objects

Database              Object Type
Oracle                BLOB, CLOB
Microsoft SQL Server  VARBINARY, Filestream
PostgreSQL            bytea, text, Large Objects (oid)
MySQL                 BLOB, TINYBLOB, MEDIUMBLOB, LONGBLOB
IBM DB2               BLOB, CLOB, DBCLOB
Teradata              BLOB
MongoDB               GridFS
SAP HANA              BLOB, NCLOB

Configuration Backup

You can back up and restore the DDC configuration by using the Backup/Restore functionality available in CipherTrust Manager. Such a backup will include the following elements:

  • Data Stores

  • Branch Locations

  • Classification Profiles

  • Infotypes

  • Report definitions

The backup does not include information about scan executions.

Creating/Restoring the Configuration Backup

To create or restore a backup of your DDC configuration:

  1. Log in to CipherTrust Manager.

  2. Click the Admin Settings link on the dashboard.

  3. Select Backups from the sidebar on the left. This will display the Backups screen.

  4. To create a backup of your DDC configuration, click the Create Backup button.

  5. To restore your DDC configuration from a backup, click the Upload Backup button.

For more details, refer to the CipherTrust Manager as a Service documentation.

Configuration Backup Limitations

  • The configuration backup references the DDC Active Node. Restoring the backup to a different CipherTrust Manager cluster leaves DDC referencing an invalid node, and therefore without any valid active node.

  • The configuration backup contains the definition of the DDC resources (such as the Scan or Data Store definitions). Restoring from a backup that does not contain a certain resource (for example, a Custom Classification Profile), or the resource version in use when a scan was completed, causes the TDP scan execution data to point to an invalid resource identifier.

    If you generate a report that points to the missing resource, the report may display incomplete data (for example, it may be unable to display the resource name) or fail.

Creating/Restoring Backup of Scan Executions

To back up or restore your Data Discovery and Classification scan execution data, you need to access the DDC data stored in Hadoop. For details, refer to the Thales Data Platform Hadoop Backup section in the Thales Data Platform Administrator Guide.

Mounting an NFS Share

To mount an NFS share on a Proxy agent, run this command as root:

sudo mount <nfs-server-hostname|nfs-server-ipaddress>:</target/directory/share-name>