A robust and flexible data sanitization component for PHP, part of the KaririCode Framework. It utilizes configurable processors and native functions to ensure data integrity and security in your applications.
- Features
- Installation
- Usage
- Available Sanitizers
- Configuration
- Integration with Other KaririCode Components
- Development and Testing
- Contributing
- License
- Support and Community
- Flexible attribute-based sanitization for object properties
- Comprehensive set of built-in sanitizers for common use cases
- Easy integration with other KaririCode components
- Configurable processors for customized sanitization logic
- Support for fallback values in case of sanitization failures
- Extensible architecture allowing custom sanitizers
- Robust error handling and reporting
- Chainable sanitization pipelines for complex data transformations
- Built-in support for multiple character encodings
- Protection against XSS and SQL injection attacks
You can install the Sanitizer component via Composer:
composer require kariricode/sanitizer
- PHP 8.3 or higher
- Composer
- Extensions: ext-mbstring, ext-dom, ext-libxml
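The requirements above can be checked from a script before installing. The following is a minimal sketch that uses only PHP's built-in `version_compare()` and `extension_loaded()`; the helper name `missingExtensions` is hypothetical and is not part of the Sanitizer component.

```php
<?php
declare(strict_types=1);

/**
 * Returns the names of the required extensions that are not loaded.
 * Hypothetical helper for illustration only.
 *
 * @param list<string> $required
 * @return list<string>
 */
function missingExtensions(array $required): array
{
    return array_values(array_filter(
        $required,
        static fn (string $name): bool => !extension_loaded($name)
    ));
}

// Compare the running PHP version against the documented minimum.
$phpOk = version_compare(PHP_VERSION, '8.3.0', '>=');
$missing = missingExtensions(['mbstring', 'dom', 'libxml']);

echo $phpOk ? "PHP version OK\n" : "PHP 8.3 or higher is required\n";
echo $missing === []
    ? "All required extensions are loaded\n"
    : 'Missing extensions: ' . implode(', ', $missing) . "\n";
```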
- Define your data class with sanitization attributes:
use KaririCode\Sanitizer\Attribute\Sanitize;
class UserProfile
{
    #[Sanitize(processors: ['trim', 'html_special_chars'])]
    private string $name = '';

    #[Sanitize(processors: ['trim', 'email_sanitizer'])]
    private string $email = '';

    // Getters and setters...
}
- Set up the sanitizer and use it:
use KaririCode\ProcessorPipeline\ProcessorRegistry;
use KaririCode\Sanitizer\Sanitizer;
use KaririCode\Sanitizer\Processor\Input\TrimSanitizer;
use KaririCode\Sanitizer\Processor\Input\HtmlSpecialCharsSanitizer;
use KaririCode\Sanitizer\Processor\Input\EmailSanitizer;
$registry = new ProcessorRegistry();
$registry->register('sanitizer', 'trim', new TrimSanitizer());
$registry->register('sanitizer', 'html_special_chars', new HtmlSpecialCharsSanitizer());
$registry->register('sanitizer', 'email_sanitizer', new EmailSanitizer());
$sanitizer = new Sanitizer($registry);
$userProfile = new UserProfile();
$userProfile->setName(" Walmir Silva <script>alert('xss')</script> ");
$userProfile->setEmail(" walmir.silva@gmail.con ");
$result = $sanitizer->sanitize($userProfile);
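// Output: "Walmir Silva" followed by the <script> tag converted to HTML entities.
// html_special_chars escapes markup; it does not remove it.
echo $userProfile->getName();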
echo $userProfile->getEmail(); // Output: "walmir.silva@gmail.com"
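To make the effect of the `trim` and `html_special_chars` processors concrete, here is a minimal sketch using PHP's native `trim()` and `htmlspecialchars()`. It assumes the two processors behave like these native functions, and that the flags are `ENT_QUOTES | ENT_HTML5`; the function name `trimAndEscape` is hypothetical and this is not the component's own implementation.

```php
<?php
declare(strict_types=1);

/**
 * Trims surrounding whitespace, then converts HTML special characters
 * to entities. Assumed stand-in for the 'trim' + 'html_special_chars' chain.
 */
function trimAndEscape(string $value): string
{
    return htmlspecialchars(trim($value), ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

$raw = " Walmir Silva <script>alert('xss')</script> ";

// The angle brackets come out as &lt; and &gt;, so the tag is neutralized
// for HTML output but its text is still present in the string.
echo trimAndEscape($raw), "\n";
```

If the tag itself should disappear rather than be escaped, a tag-removing processor such as `strip_tags` or `xss_sanitizer` would need to be part of the chain.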
Here's an example of how to use the KaririCode Sanitizer in a real-world scenario, such as sanitizing blog post content:
use KaririCode\Sanitizer\Attribute\Sanitize;
class BlogPost
{
    #[Sanitize(
        processors: ['trim', 'html_special_chars', 'xss_sanitizer'],
        messages: [
            'trim' => 'Title was trimmed',
            'html_special_chars' => 'Special characters in title were escaped',
            'xss_sanitizer' => 'XSS attempt was removed from title',
        ]
    )]
    private string $title = '';

    #[Sanitize(
        processors: ['trim', 'markdown', 'html_purifier'],
        messages: [
            'trim' => 'Content was trimmed',
            'markdown' => 'Markdown in content was processed',
            'html_purifier' => 'HTML in content was purified',
        ]
    )]
    private string $content = '';

    // Getters and setters...
}
// Usage example
$blogPost = new BlogPost();
$blogPost->setTitle(" Exploring KaririCode: A Modern PHP Framework <script>alert('xss')</script> ");
$blogPost->setContent("# Introduction\nKaririCode is a **powerful** and _flexible_ PHP framework designed for modern web development.");
$result = $sanitizer->sanitize($blogPost);
// Access sanitized data
echo $blogPost->getTitle(); // Sanitized title
echo $blogPost->getContent(); // Sanitized content
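Because processors run in the order listed, the order can change the result. The sketch below models a chain as a list of plain callables folded with `array_reduce()`. The function `runPipeline` is hypothetical, and the native functions are assumed stand-ins for the processors, not the component's actual classes.

```php
<?php
declare(strict_types=1);

/**
 * Runs a value through a list of callables, left to right.
 * Hypothetical helper for illustration only.
 *
 * @param list<callable(string): string> $processors
 */
function runPipeline(string $value, array $processors): string
{
    return array_reduce(
        $processors,
        static fn (string $carry, callable $processor): string => $processor($carry),
        $value
    );
}

$escape = static fn (string $s): string => htmlspecialchars($s, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Same steps, different order.
$stripThenEscape = [trim(...), strip_tags(...), $escape];
$escapeThenStrip = [trim(...), $escape, strip_tags(...)];

$input = ' <b>Hello</b> & welcome ';

// Tags are removed first, then the remaining '&' is escaped.
echo runPipeline($input, $stripThenEscape), "\n";

// Escaping first leaves no literal '<' for strip_tags() to act on,
// so the escaped tags stay in the output.
echo runPipeline($input, $escapeThenStrip), "\n";
```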
- TrimSanitizer: Removes whitespace from the beginning and end of a string.
  - Configuration Options:
    - characterMask: Specifies which characters to trim. Default is whitespace.
    - trimLeft: Boolean to trim from the left side. Default is true.
    - trimRight: Boolean to trim from the right side. Default is true.
- HtmlSpecialCharsSanitizer: Converts special characters to HTML entities to prevent XSS attacks.
  - Configuration Options:
    - flags: Configurable flags like ENT_QUOTES | ENT_HTML5.
    - encoding: Character encoding, e.g., 'UTF-8'.
    - doubleEncode: Boolean to prevent double encoding. Default is true.
- NormalizeLineBreaksSanitizer: Standardizes line breaks across different operating systems.
  - Configuration Options:
    - lineEnding: Specifies line ending style. Options: 'unix', 'windows', 'mac'.
- EmailSanitizer: Validates and corrects common email typos, normalizes email format, and handles whitespace.
  - Configuration Options:
    - removeMailtoPrefix: Boolean to remove 'mailto:' prefix. Default is false.
    - typoReplacements: Associative array of common typo replacements.
    - domainReplacements: Corrects commonly misspelled domain names.
- PhoneSanitizer: Formats and validates phone numbers, including international support and custom formatting options.
  - Configuration Options:
    - applyFormat: Boolean to apply formatting. Default is false.
    - format: Custom format pattern for phone numbers.
    - placeholder: Placeholder character used in formatting.
- AlphanumericSanitizer: Removes non-alphanumeric characters, with configurable options to allow certain special characters.
  - Configuration Options:
    - allowSpace, allowUnderscore, allowDash, allowDot: Boolean options to allow specific characters.
    - preserveCase: Boolean to maintain case sensitivity.
- UrlSanitizer: Validates and normalizes URLs, ensuring proper protocol and structure.
  - Configuration Options:
    - enforceProtocol: Enforces a specific protocol, e.g., 'https://'.
    - defaultProtocol: The protocol to apply if none is present.
    - removeTrailingSlash: Boolean to remove trailing slash.
- NumericSanitizer: Ensures that the input is a numeric value, with options for decimal and negative numbers.
  - Configuration Options:
    - allowDecimal, allowNegative: Boolean options to allow decimals and negative values.
    - decimalSeparator: Specifies the character used for decimals.
- StripTagsSanitizer: Removes HTML and PHP tags from input, with configurable options for allowed tags.
  - Configuration Options:
    - allowedTags: List of HTML tags to keep.
    - keepSafeAttributes: Boolean to keep certain safe attributes.
    - safeAttributes: Array of attributes to preserve.
- HtmlPurifierSanitizer: Sanitizes HTML content by removing unsafe tags and attributes, ensuring safe HTML rendering.
  - Configuration Options:
    - allowedTags: Specifies which tags are allowed.
    - allowedAttributes: Defines allowed attributes for each tag.
    - removeEmptyTags, removeComments: Boolean to remove empty tags or HTML comments.
    - htmlEntities: Convert characters to HTML entities. Default is true.
- JsonSanitizer: Validates and prettifies JSON strings, removes invalid characters, and ensures proper JSON structure.
  - Configuration Options:
    - prettyPrint: Boolean to format JSON for readability.
    - removeInvalidCharacters: Boolean to remove invalid characters from JSON.
    - validateUnicode: Boolean to validate Unicode characters.
- MarkdownSanitizer: Processes and sanitizes Markdown content, escaping special characters and preserving the Markdown structure.
  - Configuration Options:
    - allowedElements: Specifies allowed Markdown elements (e.g., 'p', 'h1', 'a').
    - escapeSpecialCharacters: Boolean to escape special characters like '*', '_', etc.
    - preserveStructure: Boolean to maintain Markdown formatting.
- FilenameSanitizer: Ensures filenames are safe for use in file systems by removing unsafe characters and validating extensions.
  - Configuration Options:
    - replacement: Character used to replace unsafe characters. Default is '-'.
    - preserveExtension: Boolean to keep the file extension.
    - blockDangerousExtensions: Boolean to block extensions like '.exe', '.js'.
    - allowedExtensions: Array of allowed extensions.
- SqlInjectionSanitizer: Protects against SQL injection attacks by escaping special characters and removing potentially harmful content.
  - Configuration Options:
    - escapeMap: Array of characters to escape.
    - removeComments: Boolean to strip SQL comments.
    - escapeQuotes: Boolean to escape quotes in SQL queries.
- XssSanitizer: Prevents Cross-Site Scripting (XSS) attacks by removing malicious scripts, attributes, and ensuring safe HTML output.
  - Configuration Options:
    - removeScripts: Boolean to remove <script> tags.
    - removeEventHandlers: Boolean to remove 'on*' event handlers.
    - encodeHtmlEntities: Boolean to encode unsafe characters.
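To give a feel for what two of the simpler sanitizers in this list do, here is a minimal sketch built on native PHP string functions. The function names and their parameters are hypothetical; they mirror the `lineEnding`, `allowSpace`, and `allowDash` options described above, but this is not the component's own code and the real classes may behave differently.

```php
<?php
declare(strict_types=1);

/**
 * Converts all line breaks to one style: 'unix', 'windows', or 'mac'.
 * Hypothetical stand-in for NormalizeLineBreaksSanitizer.
 */
function normalizeLineBreaks(string $text, string $lineEnding = 'unix'): string
{
    $target = match ($lineEnding) {
        'unix' => "\n",
        'windows' => "\r\n",
        'mac' => "\r",
        default => throw new InvalidArgumentException("Unknown line ending: {$lineEnding}"),
    };

    // Collapse every variant to "\n" first ("\r\n" before the lone "\r"),
    // then expand to the requested style.
    $unix = str_replace(["\r\n", "\r"], "\n", $text);

    return $target === "\n" ? $unix : str_replace("\n", $target, $unix);
}

/**
 * Removes everything except ASCII letters and digits, optionally
 * keeping spaces and dashes. Hypothetical stand-in for AlphanumericSanitizer.
 */
function keepAlphanumeric(string $value, bool $allowSpace = false, bool $allowDash = false): string
{
    $extra = ($allowSpace ? ' ' : '') . ($allowDash ? '\-' : '');

    return preg_replace('/[^a-zA-Z0-9' . $extra . ']/', '', $value) ?? '';
}

echo keepAlphanumeric('Hello, World-42!', allowSpace: true, allowDash: true), "\n";
```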
The Sanitizer component can be configured globally or on a per-sanitizer basis. Here's an example of how to configure the HtmlPurifierSanitizer:
use KaririCode\Sanitizer\Processor\Domain\HtmlPurifierSanitizer;
$htmlPurifier = new HtmlPurifierSanitizer();
$htmlPurifier->configure([
    'allowedTags' => ['p', 'br', 'strong', 'em'],
    'allowedAttributes' => ['href' => ['a'], 'src' => ['img']],
]);
$registry->register('sanitizer', 'html_purifier', $htmlPurifier);
For global configuration options, refer to the Sanitizer class constructor.
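The per-sanitizer pattern can be illustrated with a small, self-contained class. This is a sketch under assumptions: `ConfigurableTrim` and its `process()` method are hypothetical names, and the option keys are borrowed from the TrimSanitizer options listed earlier. Only the `configure([...])` call shape comes from the example above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical configurable processor, for illustration only.
 * Options not passed to configure() keep their current values.
 */
final class ConfigurableTrim
{
    // Same characters that PHP's trim() strips by default.
    private string $characterMask = " \n\r\t\x0B\0";
    private bool $trimLeft = true;
    private bool $trimRight = true;

    /** @param array{characterMask?: string, trimLeft?: bool, trimRight?: bool} $options */
    public function configure(array $options): void
    {
        $this->characterMask = $options['characterMask'] ?? $this->characterMask;
        $this->trimLeft = $options['trimLeft'] ?? $this->trimLeft;
        $this->trimRight = $options['trimRight'] ?? $this->trimRight;
    }

    public function process(string $input): string
    {
        if ($this->trimLeft) {
            $input = ltrim($input, $this->characterMask);
        }
        if ($this->trimRight) {
            $input = rtrim($input, $this->characterMask);
        }

        return $input;
    }
}

$trim = new ConfigurableTrim();
$trim->configure(['characterMask' => '-', 'trimLeft' => false]);

echo $trim->process('--slug--'), "\n"; // only the right side is trimmed
```

Keeping defaults in properties and merging options with `??` means a partial `configure()` call never resets settings that were not mentioned.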
The Sanitizer component is designed to work seamlessly with other KaririCode components:
- KaririCode\Contract: Provides interfaces and contracts for consistent component integration.
- KaririCode\ProcessorPipeline: Utilized for building and executing sanitization pipelines.
- KaririCode\PropertyInspector: Used for analyzing and processing object properties with sanitization attributes.
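The general idea of attribute-driven property processing can be sketched with PHP's native Reflection API. Everything named here (`Clean`, `Contact`, `sanitizeObject`) is hypothetical and stands in for the real Sanitize attribute and the PropertyInspector component, whose internals are not shown in this document.

```php
<?php
declare(strict_types=1);

// Hypothetical attribute, standing in for the library's Sanitize attribute.
#[Attribute(Attribute::TARGET_PROPERTY)]
final class Clean
{
    /** @param list<string> $processors */
    public function __construct(public readonly array $processors = [])
    {
    }
}

final class Contact
{
    #[Clean(processors: ['trim', 'lower'])]
    public string $email = '';

    public string $note = ''; // no attribute, so it is left untouched
}

/**
 * Applies the named processors to every string property carrying #[Clean].
 *
 * @param array<string, callable(string): string> $processors
 */
function sanitizeObject(object $object, array $processors): void
{
    foreach ((new ReflectionObject($object))->getProperties() as $property) {
        foreach ($property->getAttributes(Clean::class) as $attribute) {
            $value = $property->getValue($object);
            if (!is_string($value)) {
                continue;
            }
            foreach ($attribute->newInstance()->processors as $name) {
                if (!isset($processors[$name])) {
                    throw new InvalidArgumentException("Unknown processor: {$name}");
                }
                $value = $processors[$name]($value);
            }
            $property->setValue($object, $value);
        }
    }
}

$contact = new Contact();
$contact->email = '  USER@Example.COM ';
sanitizeObject($contact, ['trim' => trim(...), 'lower' => strtolower(...)]);

echo $contact->email, "\n";
```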
The registry is a core part of how sanitizers are managed within the KaririCode Framework. It acts as a centralized location to register and configure all sanitizers you plan to use in your application.
Here's how you can create and configure the registry:
// Create and configure the registry
$registry = new ProcessorRegistry();
// Register all required processors
$registry->register('sanitizer', 'trim', new TrimSanitizer());
$registry->register('sanitizer', 'html_special_chars', new HtmlSpecialCharsSanitizer());
$registry->register('sanitizer', 'normalize_line_breaks', new NormalizeLineBreaksSanitizer());
$registry->register('sanitizer', 'html_purifier', new HtmlPurifierSanitizer());
$registry->register('sanitizer', 'markdown', new MarkdownSanitizer());
$registry->register('sanitizer', 'numeric_sanitizer', new NumericSanitizer());
$registry->register('sanitizer', 'email_sanitizer', new EmailSanitizer());
$registry->register('sanitizer', 'phone_sanitizer', new PhoneSanitizer());
$registry->register('sanitizer', 'url_sanitizer', new UrlSanitizer());
$registry->register('sanitizer', 'alphanumeric_sanitizer', new AlphanumericSanitizer());
$registry->register('sanitizer', 'filename_sanitizer', new FilenameSanitizer());
$registry->register('sanitizer', 'json_sanitizer', new JsonSanitizer());
$registry->register('sanitizer', 'xss_sanitizer', new XssSanitizer());
$registry->register('sanitizer', 'sql_injection', new SqlInjectionSanitizer());
$registry->register('sanitizer', 'strip_tags', new StripTagsSanitizer());
This code demonstrates how to register various sanitizers with the registry, allowing you to easily manage which sanitizers are available throughout your application. Each sanitizer is given a unique identifier, which can then be referenced in attributes to apply specific sanitization rules.
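Conceptually, such a registry is a two-level map from a context (here 'sanitizer') and a name to a processor. The sketch below shows that idea with plain callables. `SimpleRegistry` and its `get()` method are hypothetical and do not describe the actual ProcessorRegistry API beyond the `register(context, name, processor)` call shape used above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical two-level registry: context => name => processor.
 * For illustration only; not the KaririCode ProcessorRegistry.
 */
final class SimpleRegistry
{
    /** @var array<string, array<string, callable(string): string>> */
    private array $processors = [];

    public function register(string $context, string $name, callable $processor): void
    {
        $this->processors[$context][$name] = $processor;
    }

    public function get(string $context, string $name): callable
    {
        return $this->processors[$context][$name]
            ?? throw new OutOfBoundsException("No processor '{$name}' in context '{$context}'");
    }
}

$demoRegistry = new SimpleRegistry();
$demoRegistry->register('sanitizer', 'trim', trim(...));
$demoRegistry->register('sanitizer', 'strip_tags', strip_tags(...));

// Look a processor up by the same identifier an attribute would reference.
echo $demoRegistry->get('sanitizer', 'strip_tags')('<em>safe</em> text'), "\n";
```

Failing loudly on an unknown name surfaces a typo in an attribute's processor list early instead of silently skipping a sanitization step.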
For development and testing purposes, this package uses Docker and Docker Compose to ensure consistency across different environments. A Makefile is provided for convenience.
- Docker
- Docker Compose
- Make (optional, but recommended for easier command execution)
- Clone the repository:
  git clone https://github.com/KaririCode-Framework/kariricode-sanitizer.git
  cd kariricode-sanitizer
- Set up the environment:
  make setup-env
- Start the Docker containers:
  make up
- Install dependencies:
  make composer-install
- make up: Start all services in the background
- make down: Stop and remove all containers
- make build: Build Docker images
- make shell: Access the PHP container shell
- make test: Run tests
- make coverage: Run test coverage with visual formatting
- make cs-fix: Run PHP CS Fixer to fix code style
- make quality: Run all quality commands (cs-check, test, security-check)
For a full list of available commands, run:
make help
We welcome contributions to the KaririCode Sanitizer component! Here's how you can contribute:
- Fork the repository
- Create a new branch for your feature or bug fix
- Write tests for your changes
- Implement your changes
- Run the test suite and ensure all tests pass
- Submit a pull request with a clear description of your changes
Please read our Contributing Guide for more details on our code of conduct and development process.
This project is licensed under the MIT License - see the LICENSE file for details.
- Documentation: https://kariricode.org/docs/sanitizer
- Issue Tracker: GitHub Issues
- Community Forum: KaririCode Club Community
- Stack Overflow: Tag your questions with kariricode-sanitizer
Built with ❤️ by the KaririCode team. Empowering developers to create more secure and robust PHP applications.