Overview

Knowlix Data Quality Management maintains database integrity through automated and manual cleanup processes. Remove duplicate records, archive outdated information, standardize text formatting, and ensure consistent data quality across all modules.
Three core capabilities address common data quality challenges:
  • Deduplication: Identify and merge duplicate contacts, leads, opportunities, products, or other records. Automated similarity matching finds near-duplicates with configurable matching rules. Merge duplicates while preserving related records and communication history.
  • Record Recycling: Automatically archive or delete obsolete records based on age and inactivity. Configure rules to clean old archived leads, expired opportunities, inactive products, or historical data no longer needed for operations.
  • Field Formatting: Standardize text data across the database. Trim extra spaces, enforce capitalization rules, format phone numbers to international standards, and convert HTML content to plain text. Maintain consistent formatting without manual editing.
💡 Pro Tip: Ask Your Knowlix for data quality tasks: “Find duplicate contacts” or “Show me archived opportunities older than 1 year”

Deduplication System

Identify and merge duplicate records to eliminate data redundancy.

Viewing Duplicates

The Duplicates dashboard groups similar records detected by deduplication rules.
To access duplicates: Navigate to Data Cleaning → Deduplication. The left sidebar shows active deduplication rules with the count of duplicates detected by each rule.
Duplicate groupings display:
  • Similarity Rating: Percentage match between records (higher = more similar)
  • Created On: Original record creation date
  • Name: Record title or identifier
  • Field Values: Values from fields used for duplicate detection
  • Used In: Other models referencing this record
  • ID: Unique database identifier
  • Is Master: Designates which record becomes the merge target
Filtering by rule: Click a specific rule in the sidebar to show only duplicates detected by that rule. The “All” option shows duplicates from all active rules.

Merging Duplicate Records

To merge duplicates:
  1. Review the grouped similar records
  2. Select one record as the master (the record that will remain after merging)
  3. Click Merge at the top of the grouping
  4. Click OK to confirm
The master record absorbs all data from similar records. Related records (activities, emails, notes, linked documents) transfer to the master. Similar records are archived or deleted based on rule configuration.
Post-merge logging: A message in the master record’s activity feed describes the merge, including which records were merged and when. Some record types include links to archived records for reference.
Discarding groupings: If records aren’t actually duplicates (false positives), click DISCARD to hide the grouping. Discarded groupings don’t appear again unless you remove the discard filter.
Viewing discarded groupings: Use the “Discarded” search filter to review previously discarded groupings.
💡 Merge Strategy: Always choose the most complete record as the master. If one record has more fields filled in, more contact history, or more linked documents, make it the master to preserve maximum information.
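The "most complete record" heuristic from the merge strategy tip can be sketched in a few lines of Python. This is only an illustration of the idea, not how Knowlix selects a master: the record dictionaries, field names, and the `suggest_master` helper are all hypothetical.

```python
from typing import Any

def completeness(record: dict[str, Any]) -> int:
    """Count the fields that hold a non-empty value."""
    return sum(1 for value in record.values() if value not in (None, "", [], {}))

def suggest_master(records: list[dict[str, Any]]) -> dict[str, Any]:
    """Return the record with the most filled fields (ties go to the first one listed)."""
    return max(records, key=completeness)

# Hypothetical duplicate grouping: two contacts that share a name.
group = [
    {"id": 101, "name": "Jane Roe", "email": "", "phone": None},
    {"id": 102, "name": "Jane Roe", "email": "jane@example.com", "phone": "+1 800-555-0101"},
]
master = suggest_master(group)
print(master["id"])
```

A fuller version might also weigh contact history or linked documents, as the tip suggests; counting filled fields is just the simplest proxy.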

Deduplication Rules

Configure rules defining how duplicate detection works.
To manage deduplication rules: Navigate to Data Cleaning → Configuration → Deduplication.
Creating or editing rules:
  1. Click New or select an existing rule
  2. Configure rule settings:
    • Model: Which record type to scan (Contacts, Leads, Products, etc.)
    • Domain: (Optional) Filter which records within the model are eligible
    • Duplicate Removal: Archive or Delete merged records
    • Merge Mode: Manual (requires review) or Automatic (merges without approval)
    • Similarity Threshold: (Automatic mode only) Minimum similarity percentage for auto-merge
    • Active: Enable or disable the rule
  3. Add at least one deduplication rule line:
    • Unique ID Field: Which field to compare (Name, Email, Phone, etc.)
    • Match If: Matching criteria (Exact Match or Case/Accent Insensitive Match)
  4. Save the rule
Rule execution:
  • Rules run automatically daily via scheduled tasks
  • Manually trigger any rule by opening it and clicking Deduplicate
  • The Duplicates smart button shows how many duplicates were found
Advanced configuration: Enable developer mode for additional options:
  • Suggestion Threshold: Minimum similarity to suggest duplicates (below this threshold, no suggestion)
  • Cross-Company: (Multi-company databases) Allow duplicate detection across companies
Example rule configuration: Deduplicate contacts with matching emails:
  • Model: Contact
  • Domain: (empty - all contacts)
  • Duplicate Removal: Archive
  • Merge Mode: Manual
  • Unique ID Field: Email
  • Match If: Case/Accent Insensitive Match
💡 Your Knowlix: “Create a deduplication rule for leads with matching phone numbers”
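To make the “Case/Accent Insensitive Match” option concrete, here is a minimal Python sketch of one common way to implement such a comparison: decompose accented characters, drop the combining marks, and fold case. It is an assumption about the general technique, not a description of Knowlix internals; the `normalize` and `group_duplicates` helpers and the sample contacts are hypothetical.

```python
import unicodedata
from collections import defaultdict

def normalize(value: str) -> str:
    """Strip accents and fold case so that 'José' and 'jose' compare equal."""
    decomposed = unicodedata.normalize("NFKD", value)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_accents.casefold()

def group_duplicates(records, field):
    """Group record IDs whose `field` values match after normalization."""
    groups = defaultdict(list)
    for record in records:
        value = record.get(field)
        if value:  # empty values are never treated as duplicates of each other
            groups[normalize(value)].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]

contacts = [
    {"id": 1, "email": "José@Example.com"},
    {"id": 2, "email": "jose@example.com"},
    {"id": 3, "email": "ana@example.com"},
    {"id": 4, "email": ""},
]
duplicate_groups = group_duplicates(contacts, "email")
print(duplicate_groups)
```

With an Exact Match rule line, contacts 1 and 2 would not be grouped, because their email strings differ in case and accent.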

Record Recycling

Automatically identify and remove outdated records.

Viewing Recyclable Records

The Recycle Records dashboard lists records matching recycle rule criteria.
To access recyclable records: Navigate to Data Cleaning → Recycle Records. The left sidebar displays active recycle rules.
Record display columns:
  • Record ID: Unique identifier
  • Record Name: Record title
Filtering by rule: Click a rule in the sidebar to show only records detected by that rule. “All” shows records from all rules.

Recycling Records

To recycle a record: Click the Validate button on the record row. The record is archived or deleted based on rule configuration.
Discarding records: Click Discard if a record shouldn’t be recycled. Discarded records won’t appear in future scans by that rule.
Viewing discarded records: Use the “Discarded” search filter to see previously discarded records.

Recycle Rules

Configure rules for identifying outdated records.
To create recycle rules: Navigate to Data Cleaning → Configuration → Recycle Records. No default rules exist. Click New to create one.
Rule configuration:
  1. Select Model (record type to scan)
  2. (Optional) Set Filter to narrow eligible records
  3. Configure time criteria:
    • Time Field: Which date field determines record age (Last Updated, Created Date, etc.)
    • Delta: Time period value (e.g., 7, 30, 365)
    • Delta Unit: Time unit (Days, Weeks, Months, Years)
  4. Choose Recycle Mode:
    • Manual: Requires review before recycling
    • Automatic: Recycles without approval
  5. Select Recycle Action:
    • Archive: Move records to archived state
    • Delete: Permanently remove from database
  6. (Delete only) Include Archived: Also delete already-archived records matching criteria
  7. Save the rule
Manual rule execution: Open any rule and click Run Now to execute immediately. The Records smart button shows detected records.
Example rule: Delete archived lost opportunities not updated in 1 year with “Too Expensive” lost reason:
  • Model: Lead/Opportunity
  • Filter: Active is not set AND Lost Reason is “Too Expensive”
  • Time Field: Last Updated on
  • Delta: 1
  • Delta Unit: Years
  • Recycle Mode: Automatic
  • Recycle Action: Delete
  • Include Archived: Enabled
💡 Compliance Tip: Before creating deletion rules, verify your data retention policies and legal requirements. Some industries require maintaining records for specific periods.
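The time criteria in a recycle rule boil down to a cutoff date: a record qualifies when its Time Field is older than "now minus Delta × Delta Unit". The Python sketch below shows one plausible way to compute that cutoff and apply the example rule above. How Knowlix actually evaluates months and years is not documented here, so the calendar handling, the `cutoff` helper, and the record field names (`active`, `lost_reason`, `last_updated`) are assumptions for illustration.

```python
import calendar
from datetime import datetime, timedelta

def cutoff(now: datetime, delta: int, unit: str) -> datetime:
    """Return the moment `delta` units before `now` (unit: Days, Weeks, Months, Years)."""
    if unit == "Days":
        return now - timedelta(days=delta)
    if unit == "Weeks":
        return now - timedelta(weeks=delta)
    if unit in ("Months", "Years"):
        months = delta * (12 if unit == "Years" else 1)
        total = now.year * 12 + (now.month - 1) - months
        year, month_index = divmod(total, 12)
        # Clamp the day so that e.g. 31 March minus 1 month lands on the last day of February.
        last_day = calendar.monthrange(year, month_index + 1)[1]
        return now.replace(year=year, month=month_index + 1, day=min(now.day, last_day))
    raise ValueError(f"unknown delta unit: {unit}")

def matches_example_rule(record: dict, now: datetime) -> bool:
    """Archived, lost as 'Too Expensive', and not updated for at least 1 year."""
    return (
        not record["active"]
        and record["lost_reason"] == "Too Expensive"
        and record["last_updated"] < cutoff(now, 1, "Years")
    )

now = datetime(2025, 6, 1)
old_lost = {"active": False, "lost_reason": "Too Expensive", "last_updated": datetime(2024, 1, 15)}
recent_lost = {"active": False, "lost_reason": "Too Expensive", "last_updated": datetime(2025, 3, 1)}
print(matches_example_rule(old_lost, now), matches_example_rule(recent_lost, now))
```

Working through a cutoff like this by hand for a few sample records is a cheap way to sanity-check a rule before switching it to Automatic mode.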

Field Cleaning and Formatting

Standardize text formatting across database fields.

Viewing Field Cleaning Suggestions

The Field Cleaning Records dashboard displays formatting suggestions.
To access field cleaning: Navigate to Data Cleaning → Field Cleaning. The left sidebar shows active cleaning rules.
Display columns:
  • Record ID: Unique identifier
  • Record Name: Record title
  • Field: Which field contains the value to format
  • Current: Existing field value
  • Suggested: Proposed formatted value

Applying Formatting Changes

To apply a suggestion: Click the Validate button on the record row. The current value updates to the suggested format.
Discarding suggestions: Click Discard if the current format is correct. The record won’t appear in future scans by that rule.
Viewing discarded records: Use the “Discarded” search filter.

Field Cleaning Rules

Configure formatting standards for database fields.
To manage cleaning rules: Navigate to Data Cleaning → Configuration → Field Cleaning. A default “Contact” rule exists for formatting contact records. Edit it or create new rules.
Creating cleaning rules:
  1. Click New
  2. Select Model (record type)
  3. Click Add a line in the Rules section
  4. In the popup, configure:
    • Field To Clean: Which field to format
    • Action: Formatting operation to apply
  5. Choose Cleaning Mode:
    • Manual: Requires review before applying changes
    • Automatic: Applies changes without approval
  6. Save
Available formatting actions:
Trim Spaces: Removes excess whitespace.
  • All Spaces: Removes every space character
    • Example: “Dr. John Doe” becomes “Dr.JohnDoe”
  • Superfluous Spaces: Removes leading, trailing, and successive spaces
    • Example: “ Dr.  John  Doe ” (with leading, trailing, and doubled spaces) becomes “Dr. John Doe”
Set Type Case: Enforces capitalization rules.
  • First Letters to Uppercase: Capitalizes first letter of each word
    • Example: “lumber inc, lorraine douglas” becomes “Lumber Inc, Lorraine Douglas”
  • All Uppercase: Converts entire string to capitals
    • Example: “lumber inc” becomes “LUMBER INC”
  • All Lowercase: Converts to all lowercase
    • Example: “LUMBER INC” becomes “lumber inc”
Format Phone: Converts phone numbers to international format based on country.
  • Example (Belgium): “061928374” becomes “+32 61 92 83 74”
  • Example (US): “800 555-0101” becomes “+1 800-555-0101”
Scrap HTML: Converts HTML markup to plain text.
  • Example: “<h1>John Doe</h1><p>Lorem ipsum</p>” becomes “John Doe Lorem ipsum”
Multiple rules: Add multiple formatting rules to the same field cleaning configuration. They apply in sequence.
Manual rule execution: Open any rule and click Clean to run immediately. The Records smart button shows affected records.
💡 Formatting Consistency: Apply phone number formatting rules to all modules with phone fields (Contacts, Leads, Helpdesk, HR). Consistent formatting improves searchability and communication reliability.
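The formatting actions above are simple string transformations, and a rough Python equivalent can help predict what a rule will suggest. The sketch below mirrors Trim Spaces, Set Type Case, and Scrap HTML using only the standard library, and applies several actions in sequence. All function names are hypothetical and the behavior is an approximation of the documented examples, not the Knowlix implementation. Format Phone is left out because it depends on per-country numbering rules.

```python
import re
from html.parser import HTMLParser

def trim_all_spaces(text: str) -> str:
    return text.replace(" ", "")

def trim_superfluous_spaces(text: str) -> str:
    # Collapse runs of whitespace, then drop leading and trailing spaces.
    return re.sub(r"\s+", " ", text).strip()

def first_letters_upper(text: str) -> str:
    # Capitalize the first character of each space-separated word, leaving the rest untouched.
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))

class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment, ignoring the tags."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

def html_to_text(markup: str) -> str:
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.chunks)

def apply_rules(value: str, actions) -> str:
    """Apply formatting actions in order, as a cleaning rule with several lines would."""
    for action in actions:
        value = action(value)
    return value

print(trim_all_spaces("Dr. John Doe"))
print(first_letters_upper("lumber inc, lorraine douglas"))
print(html_to_text("<h1>John Doe</h1><p>Lorem ipsum</p>"))
print(apply_rules("  lumber   inc ", [trim_superfluous_spaces, first_letters_upper]))
```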

Merge Action Manager

Control which database models support manual merging.
To access merge settings: Enable developer mode, then navigate to Data Cleaning → Configuration → Merge Action Manager.
Display columns:
  • Model: Technical database model name
  • Model Description: User-friendly model name
  • Type: Base Object (system models) or Custom Object (user-created models)
  • Transient Model: Whether the model stores temporary data
  • Can Be Merged: Whether the merge action is enabled for this model
Enabling merge for models: Check the Can Be Merged box for models where you want users to manually merge records through the Actions menu.
Filtering enabled models: Use the search bar filter to show only models where “Can Be Merged” is enabled.
Use case: If you create custom models and want users to merge duplicate records, enable the merge action for those models here.

Best Practices

Run Deduplication Rules Regularly

Don’t wait for duplicate problems to accumulate. Enable automatic deduplication rules or manually run them weekly to maintain clean data.

Use Manual Mode for Initial Testing

When creating new deduplication or recycle rules, start with Manual mode. Review suggested actions to ensure rules work as intended before switching to Automatic mode.

Choose Appropriate Similarity Thresholds

High thresholds (90%+) reduce false positives but may miss legitimate duplicates. Lower thresholds (70-80%) catch more duplicates but require more manual review. Adjust based on your data quality needs.
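The trade-off between thresholds is easy to see with a toy similarity score. The sketch below uses Python's `difflib.SequenceMatcher` as a stand-in; the algorithm Knowlix uses to compute its Similarity Rating is not specified here, so the scores are illustrative only, and the `similarity` and `pairs_above` helpers and company names are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score between two strings, ignoring case."""
    return 100 * SequenceMatcher(None, a.casefold(), b.casefold()).ratio()

def pairs_above(names, threshold: float):
    """Return every pair of names whose similarity meets the threshold."""
    return [(a, b) for a, b in combinations(names, 2) if similarity(a, b) >= threshold]

names = ["Lumber Inc", "Lumber Inc.", "Lumbar Incorporated", "Azure Interior"]

for threshold in (90, 70):
    print(threshold, pairs_above(names, threshold))
```

Every pair returned at the higher threshold is also returned at the lower one, which is why lowering the threshold can only add groupings to review.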

Configure Time-Based Recycle Rules Conservatively

Start with longer time periods (2-3 years for deletions) and shorten as you gain confidence. Accidentally deleting recent data is worse than keeping old data slightly longer.

Always Archive Before Deleting

Create two recycle rules for the same data: the first archives records after 1 year, and the second deletes them after 3 years. This creates a safety buffer for data recovery.
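The two-rule setup can be pictured as a small decision function: active records that pass the first age limit are archived, and only records that are already archived and pass the second limit are deleted. The Python sketch below assumes both rules measure age from the same date field; the `next_action` helper and its parameters are hypothetical.

```python
from datetime import date

ARCHIVE_AFTER_YEARS = 1  # first rule: Recycle Action = Archive
DELETE_AFTER_YEARS = 3   # second rule: Recycle Action = Delete, Include Archived enabled

def years_ago(today: date, years: int) -> date:
    """Return the same calendar day `years` earlier (29 Feb falls back to 28 Feb)."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        return today.replace(year=today.year - years, day=28)

def next_action(last_updated: date, is_archived: bool, today: date) -> str:
    """Decide what the pair of rules would do with one record."""
    if is_archived and last_updated < years_ago(today, DELETE_AFTER_YEARS):
        return "delete"
    if not is_archived and last_updated < years_ago(today, ARCHIVE_AFTER_YEARS):
        return "archive"
    return "keep"

today = date(2025, 6, 1)
print(next_action(date(2024, 1, 1), False, today))
print(next_action(date(2021, 1, 1), True, today))
print(next_action(date(2025, 1, 1), False, today))
```

In this sketch a record is never deleted without having been archived first, which is the safety buffer the practice is meant to provide.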

Test Field Cleaning Rules on Small Samples

Before enabling automatic field cleaning, run rules manually and review suggestions. Ensure formatting changes match your expectations before automating.

Combine Multiple Formatting Rules

Apply trimming before capitalization formatting. For example, trim superfluous spaces first, then apply “First Letters to Uppercase” for clean, consistently formatted names.

Document Your Data Quality Policies

Maintain written policies about when duplicates are merged, how long records are retained, and what formatting standards apply. Train team members on these policies.

Review Discarded Records Periodically

Review discarded duplicate groupings and discarded recycle records every quarter. If you’re frequently discarding the same types of records, your rules may need adjustment.

Backup Before Major Cleanup Operations

Before running large-scale automatic deletions or merges, back up your database. This provides rollback capability if rules behave unexpectedly.

Use Cross-Company Deduplication Carefully

In multi-company databases, cross-company duplicate detection can be valuable but may merge records that should remain separate for business reasons. Review cross-company duplicates carefully.

Monitor Automated Cleaning Results

Even with automatic rules, periodically review what’s being cleaned. Spot-check merged records, deleted records, and formatted fields to ensure quality.

Need Help?

Ask Your Knowlix:
  • “Find duplicate contacts in the database”
  • “Show me archived opportunities older than 2 years”
  • “Create a rule to format phone numbers”
  • “Merge these duplicate leads”
  • “Which recycle rules are running automatically?”
  • “Format all contact names to title case”
Contact Support: For questions about deduplication rule configuration, safe deletion timeframes, custom field formatting patterns, or bulk merge operations, contact Knowlix support through the Help menu.